Mastering Claude Code: The Anthropic Field Guide
Lecture 3

The Self-Correction Loop: Testing as Verification

Transcript

Boris Cherny uses Plan Mode for roughly 80% of his Claude Code sessions. That number should stop you cold: it means the creator of the tool spends the overwhelming majority of his time in a phase where no code is being written at all. Why? Because the real leverage in Claude Code is not generation speed; it is verification.

The four-phase workflow Cherny recommends is Explore, Plan, Code, Commit, and the Explore phase is where the self-correction loop is born. Claude reads files, images, and URLs; subagents investigate the codebase in separate context windows without touching the main session. No code yet. Just understanding. CLAUDE.md serves as the agent's persistent memory, but focus for now on how it supports execution discipline.

Plan Mode is what separates exploration from implementation. Activate it by instructing Claude to pause coding or by using its keyboard shortcut, and treat it as non-negotiable for any task that touches multiple files or an unfamiliar codebase. Once Claude surfaces a plan, you edit it directly in a text editor using Ctrl+G before a single line of implementation runs. That edit step is yours. Own it.

Here is where the self-correction loop becomes structural. During implementation, Claude does not just write and hand off; it tests. Testing verifies the reasonableness of code in real time and checks whether outputs match visual mocks inside iteration loops. Subagents run in separate context windows specifically so they can investigate without polluting the main session's reasoning. Reference files with the @ symbol so Claude reads them before responding, and provide specific file names, scenarios, and testing preferences to scope the task accurately.

Tight feedback loops matter: press Escape to stop Claude mid-action if something looks wrong, and use /clear to reset context entirely. Course-correct early. The cost of a wrong direction compounds fast.
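The implementation-phase loop described above (run the tests, read the failure output, edit, re-run) can be sketched in plain Python. This is an illustrative sketch, not Claude Code internals: `run_pytest`, `self_correction_loop`, the `attempt_fix` hook, and the attempt cap are all hypothetical names and assumptions.

```python
import subprocess

def run_pytest(cmd=("python", "-m", "pytest", "-q")):
    """Run the real test suite; return (passed, combined output)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def self_correction_loop(run_tests, attempt_fix, max_attempts=2):
    """Tight verify-fix-reverify loop.

    run_tests: () -> (passed, output)
    attempt_fix: (failure output) -> None, standing in for the
    agent's edit step. The attempt cap keeps a wrong direction from
    compounding: once it is exhausted, stop iterating and rethink
    the task rather than patching the patches.
    """
    passed, output = run_tests()
    for _ in range(max_attempts):
        if passed:
            return True
        attempt_fix(output)          # agent interprets failure, edits source
        passed, output = run_tests()
    return passed
```

The design point is that the loop terminates in one of two ways: the suite goes green, or the attempt budget runs out and control returns to you.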
Now, the trust question. Many developers resist handing verification to the agent because they fear compounding errors: Claude fixes one bug and introduces three more. That concern is legitimate, but the architecture addresses it directly. After two failed correction attempts, the protocol is explicit: use /clear and rewrite the initial prompt with the insights you have gained. You are not debugging Claude; you are refining the specification.

Boundary conditions matter here too. After implementation, manually test edge cases like missing data and limit values. The agent handles the recursive loop; you handle the cases that require human judgment about what the system should actually do.

The shift this creates in your role is the key takeaway. Stop thinking of yourself as the coder who reviews Claude's output line by line. Start thinking of yourself as the test-setter who defines what correct behavior looks like. When your test suite is robust, Claude can run it, interpret the shell output, identify the failure, edit the source, and re-run, autonomously, in a tight loop. The Explore, Plan, Code, Commit sequence is the workflow's backbone, with the agent handling verification and you defining correctness. Your job is no longer execution. It is defining what done means, then letting the loop prove it.
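The test-setter role amounts to encoding those human-judgment cases as assertions the agent can run. A minimal sketch, assuming a hypothetical `safe_average` function you want Claude to implement; the function and every test name here are invented for illustration, with the missing-data and limit-value cases the text calls out:

```python
def safe_average(values, default=0.0):
    """Hypothetical function under test: a mean that tolerates missing data."""
    cleaned = [v for v in values if v is not None]
    return sum(cleaned) / len(cleaned) if cleaned else default

# Edge cases the human test-setter specifies up front.
# The agent's job is to keep editing until these pass.
def test_missing_data():
    # None entries must be skipped, not crash the mean
    assert safe_average([1.0, None, 3.0]) == 2.0

def test_all_missing_uses_default():
    # Entirely missing data falls back to the caller's default
    assert safe_average([None, None], default=-1.0) == -1.0

def test_limit_values():
    # Empty input and a single value are the boundary cases
    assert safe_average([]) == 0.0
    assert safe_average([5.0]) == 5.0
```

Notice that the tests say nothing about how the mean is computed. They define what done means, which is exactly the contract the loop needs.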