red/green TDD
A test-driven development pattern adapted for coding agents. It emphasizes an iterative failure/success loop that can make agentic coding more reliable.
Key Highlights
- Red/green TDD adapts classic test-driven development into a structured prompt pattern for coding agents.
- The pattern asks agents to write tests first, confirm failure, implement code, and then verify success.
- It matters to AI PMs because it improves reliability, auditability, and requirement clarity in agent-generated code.
- Recent mentions tie the pattern to Simon Willison’s agentic engineering work and the rise of tools like Claude Code and GPT-5.4.
Overview
Red/green TDD is a test-driven development pattern adapted for coding agents. In this workflow, an agent is prompted to write tests first, run them to verify they fail (the “red” phase), implement the code needed to satisfy those tests, and then rerun the tests to verify success (the “green” phase). In the agentic coding context, this turns a broad coding request into a tighter execution loop with explicit checkpoints.
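As a rough illustration of the loop described above, the sketch below runs one small test suite twice: first against a stub (the red phase) and then against a minimal implementation (the green phase). The `slugify` function, its test cases, and the helper names are hypothetical and chosen only for this example; they do not come from Willison's guide or from any specific agent tool.

```python
import re
import unittest


def slugify_stub(title: str) -> str:
    """Red phase: the function exists but has no behavior yet."""
    raise NotImplementedError


def slugify(title: str) -> str:
    """Green phase: a small implementation written to satisfy the tests."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def make_suite(func) -> unittest.TestSuite:
    """Build the same test cases against whichever implementation is passed in."""

    class SlugifyTests(unittest.TestCase):
        def test_lowercases_and_hyphenates(self):
            self.assertEqual(func("Hello World"), "hello-world")

        def test_strips_punctuation(self):
            self.assertEqual(func("Red/Green TDD!"), "red-green-tdd")

    return unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)


def run_phase(func) -> bool:
    """Run the suite and report whether every test passed."""
    result = unittest.TestResult()
    make_suite(func).run(result)
    return result.wasSuccessful()


# Step 1 (red): the tests are written first and must fail before any real code exists.
red_passed = run_phase(slugify_stub)

# Step 2 (green): after implementing, the same tests are rerun and must pass.
green_passed = run_phase(slugify)

print("red phase passed:", red_passed)
print("green phase passed:", green_passed)
```

The red run is the checkpoint that matters most: if the tests pass before any implementation exists, they are not exercising the new behavior, and a later green result proves little.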
For AI Product Managers, this matters because it is a practical reliability pattern for autonomous or semi-autonomous software generation. Rather than asking an agent to produce a large feature in one shot, red/green TDD encourages incremental validation, clearer acceptance criteria, and easier debugging. It has emerged as part of broader agentic engineering practices discussed by builders using tools like Claude Code and GPT-5.4, especially as coding agents became more dependable at following instructions.
Key Developments
- 2026-02-23 — Simon Willison’s Agentic Engineering Patterns was highlighted as a guide for building and operating agentic systems, with red/green TDD cited as one of the specific coding-agent patterns it collects.
- 2026-04-03 — A newsletter mention described the operational prompt for red/green TDD: instruct agents to write tests first, run them to confirm failure, implement code, and rerun tests to confirm success. This framed the pattern as a concrete way to improve productivity and reliability with coding agents such as Claude Code and GPT-5.4.
- 2026-04-04 — Lenny Rachitsky shared Simon Willison’s view that November 2025 marked an inflection point for AI coding, with autonomous coding agents, benchmarks such as the Pelican Benchmark, and “thin templates” helping make practices like red/green TDD more usable in real workflows.
Relevance to AI PMs
- Define clearer acceptance criteria for agent workflows. PMs can translate product requirements into testable behaviors, making it easier for coding agents to work against explicit pass/fail conditions instead of vague feature requests.
- Reduce risk in agent-generated code. Red/green TDD creates a built-in validation loop that helps catch regressions, hallucinated implementations, and misunderstood requirements before code is merged or shipped.
- Design more reliable human-in-the-loop processes. PMs evaluating AI developer tools can use this pattern to structure agent tasks into auditable steps, improving trust, reviewability, and team adoption.
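One way to picture the first and third points is to write acceptance criteria as data that an agent's code is checked against, producing a pass/fail report a reviewer can read. The sketch below is a minimal example under assumed requirements: the shipping rule, the `Criterion` type, and the `audit` helper are all hypothetical and stand in for whatever a real product spec would define.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    description: str
    order_total: float
    expected_fee: float


# Hypothetical requirement: orders of 50.00 or more ship free, others pay 5.00.
CRITERIA = [
    Criterion("small order pays the flat fee", 20.00, 5.00),
    Criterion("order exactly at the threshold ships free", 50.00, 0.00),
    Criterion("large order ships free", 120.00, 0.00),
]


def shipping_fee(order_total: float) -> float:
    """Illustrative implementation an agent might write to satisfy CRITERIA."""
    return 0.00 if order_total >= 50.00 else 5.00


def audit(func: Callable[[float], float]) -> list[tuple[str, bool]]:
    """Check each criterion and return (description, passed) pairs for review."""
    return [(c.description, func(c.order_total) == c.expected_fee) for c in CRITERIA]


report = audit(shipping_fee)
for description, passed in report:
    print("PASS" if passed else "FAIL", "-", description)
```

Keeping the criteria separate from the implementation means the same list can be run before the code exists (to confirm failure) and again afterward, and the resulting report doubles as a review artifact.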
Related
- Simon Willison — A key advocate of agentic engineering patterns, including red/green TDD for coding agents.
- Lenny Rachitsky — Amplified Willison’s framing of the AI coding inflection point and the importance of patterns like this.
- Claude Code — An example of a coding agent environment where red/green TDD can be applied.
- GPT-5.4 — Referenced as a model enabling more reliable coding-agent behavior, making iterative TDD loops more effective.
- Agentic Engineering Patterns — The broader pattern library in which red/green TDD is positioned.
- Test-driven development — The traditional software practice from which this agent-adapted pattern derives.
Newsletter Mentions (3)
“#8 𝕏 Lenny Rachitsky shares Simon Willison’s insight that November 2025 was the inflection point for AI coding, unleashing autonomous coding agents benchmarked by Pelican Benchmark and driving red/green TDD with ‘thin templates.’”
GenAI PM Daily, April 04, 2026
“Invoking the prompt “red/green TDD” directs agents to write tests first, run them to confirm failure, implement the code, then rerun tests to confirm success.”
▶️ Why AI came for coders first, automation timelines, and how we’re inside the AI inflection (Lenny's Podcast)
Simon Willison details agentic engineering patterns (using coding agents like Claude Code and GPT-5.4 for red/green TDD, thin project templates, and public GitHub hoarding) to boost software productivity and reliability. GPT-5.1 and Claude Opus 4.5, released in November 2025, advanced coding agents from “mostly working” to “almost always following instructions,” enabling engineers to churn out up to 10,000 lines of code per day. Willison’s GitHub repositories include simonw/tools, with 193 HTML/JavaScript client-side utilities, and simonw/research, with 75 AI-driven research projects, where he hoards reusable code experiments.
“#3 📝 Simon Willison Agentic Engineering Patterns - A guide collecting patterns for building and operating agentic systems. It serves as a hub for specific patterns such as red/green TDD for coding agents.”
Related
- Claude Code: Anthropic's coding-focused agentic tool for building and automating software workflows. In this newsletter it is discussed as being integrated with Vercel AI Gateway and as a Chrome extension for browser automation.
- Simon Willison: Developer and writer known for hands-on AI and tooling tutorials. Here he provides a Docker-based walkthrough for running OpenClaw locally.
- Lenny Rachitsky: The author and host cited for reporting on AI agents replacing most SDR work. Relevant to AI PMs for go-to-market automation and sales workflow shifts.
- GPT-5.4: A newer OpenAI model release with improved natural dialogue, longer context, and stronger tool use. It is discussed as a model now available in Cursor and chatprd.
- Agentic Engineering Patterns: A collection of patterns for building and operating agentic systems. The newsletter highlights it as a reference hub for practical coding-agent workflows like red/green TDD.