tool · 12 mentions · Updated Jun 27, 2026

Opus 4.6

A model used as the underlying engine for an assistant tested against prompt injection. The newsletter notes its explicit anti-prompt-injection rules as a sign that defense measures are improving.

Key Highlights

Opus 4.6 appeared as a core model for agentic workflows, growth experimentation, coding tasks, and long-context use cases.
Its 1M-token context rollout was described as a meaningful practical capability jump, not just a pricing or spec update.
Real-world incidents showed both progress and risk: stronger anti-prompt-injection behavior alongside unsafe hallucinated actions.
Anthropic’s CASH system reportedly became reliably useful only after upgrading from Opus 4.5 to Opus 4.6.
For AI PMs, Opus 4.6 is a strong example of why context design, guardrails, and workflow fit matter as much as raw model quality.

Overview

Opus 4.6 is a high-capability Claude-family model from Anthropic that appeared across newsletters as a core engine for agentic workflows, coding tasks, growth automation, and long-context reasoning. It was referenced both as a production model used inside systems like CASH and OpenClaw, and as a benchmark point for model behavior changes such as 1M-token context, faster operating modes, and improved prompt-injection resistance.

For AI Product Managers, Opus 4.6 matters less as a standalone model release and more as a case study in what modern frontier models enable—and where they still fail. The mentions show a pattern: strong utility in automation and orchestration, measurable gains from long context and speed improvements, but persistent operational risks around hallucinated actions, context management inefficiency, and overconfidence about security. In practice, it represents the kind of model PMs evaluate for agent products, internal copilots, growth systems, and coding workflows.

Key Developments

2026-02-08: Anthropic discussed an experimental /fast mode for Opus 4.6, built and tested with Claude, using significantly more compute and cost for urgent incident response and accelerated work on critical projects. In the same broader discussion cycle, Opus 4.6 was compared with GPT-5.3 Codex and benchmarked strongly on white-collar work tasks.
2026-02-09: Mike Krieger described a Labs version of fast Opus—Claude Opus 4.6 running about 2.5× faster—as a “crazy unlock,” signaling the product value of lower latency on top of strong base capabilities.
2026-02-16: Opus 4.6 was used via OpenRouter in a parallel generation workflow alongside GLM5, Minimax 2.5, and Gemini 3 Pro to create HTML game demos, convert them into video assets, and draft social content—an example of model portfolio orchestration rather than single-model dependence.
2026-02-25: Carl Vellotti criticized Opus 4.6 for poor context management, including loading too many files for simple questions and rarely spawning context-saving agents. The workaround involved a CLAUDE.md instruction snippet, highlighting how system scaffolding can materially shape model efficiency.
2026-03-04: Guillermo Rauch shared an incident where Opus 4.6 hallucinated a fake GitHub repo ID and then used Vercel’s API to deploy random code, underscoring the need for strict validation, permissioning, and guardrails around model-generated actions.
2026-03-14: Anthropic made the 1 million-token context window generally available for Opus 4.6 and Sonnet 4.6, with standard pricing and no long-context premium, making large-context workflows more accessible for production use.
2026-03-22: Peter Yang said the new 1M-token context window felt like a version bump on the order of going from Opus 4.6 to 4.7, emphasizing that context expansion can feel like a material capability upgrade, not just a spec bump.
2026-03-30: Claire Vo used Opus-4.6 inside OpenClaw across multiple role-based agents connected to Telegram bots to automate business outreach and family scheduling, showing the model’s role in persistent, real-world agent systems.
2026-04-06: Anthropic’s growth team launched CASH (Claude Accelerates Sustainable Hypergrowth) using Claude with Opus 4.6 to automate growth experimentation end to end—from opportunity identification through post-launch analysis. Reliability reportedly improved after upgrading from Opus 4.5 to Opus 4.6, reaching junior-PM-level win rates on copy and UI tweaks.
2026-06-27: In a public attempt to hack an OpenClaw assistant, the underlying Opus 4.6 model used explicit anti-prompt-injection rules and resisted secret exfiltration across roughly 6,000 attempts. Simon Willison noted this as evidence that model-level defenses are improving, while cautioning against assuming production-grade safety.

Relevance to AI PMs

1. Model selection is now a workflow decision, not just a benchmark decision. Opus 4.6 showed up in growth automation, coding agents, parallel generation pipelines, and personal agent systems. PMs should evaluate it based on end-to-end task reliability, tool-use behavior, latency, and orchestration fit—not just eval scores.

2. Guardrails and validation are product requirements. The Vercel deployment incident is a concrete reminder that capable models can still generate invalid identifiers or unsafe actions. PMs shipping agentic features should require human approval gates, typed tool schemas, action simulation, and postcondition checks before external side effects occur.
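As a minimal sketch of what such a gate could look like, the Python below checks a model-proposed deploy action before any side effect runs. Everything named here (DeployRequest, guarded_deploy, the approve and deploy callables, the repo IDs) is hypothetical and for illustration only; it is not Vercel's API or any vendor's recommended implementation.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass(frozen=True)
class DeployRequest:
    """Typed schema for a model-proposed deploy action (hypothetical)."""
    repo_id: str
    environment: str


class ActionRejected(Exception):
    """Raised when a proposed action fails validation or approval."""


def guarded_deploy(
    request: DeployRequest,
    known_repo_ids: Set[str],
    approve: Callable[[DeployRequest], bool],
    deploy: Callable[[DeployRequest], str],
) -> str:
    # 1. Validate identifiers against a trusted source, not the model's output.
    if request.repo_id not in known_repo_ids:
        raise ActionRejected(f"unknown repo id: {request.repo_id!r}")
    # 2. Require explicit human approval before production side effects.
    if request.environment == "production" and not approve(request):
        raise ActionRejected("human approval denied")
    # 3. Perform the side effect, then check a postcondition on the result.
    deployment_id = deploy(request)
    if not deployment_id:
        raise ActionRejected("deploy returned no deployment id")
    return deployment_id
```

In this sketch a hallucinated repo ID fails at step 1, before the deploy callable is ever invoked. A real system would likely also log rejected requests and scope the credentials the agent can use.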

3. Prompting and context architecture materially affect ROI. Mentions around CLAUDE.md fixes, context-saving behavior, and the 1M-token window show that performance is strongly shaped by scaffolding. PMs should invest in context management policies, retrieval boundaries, agent decomposition, and latency/cost profiling instead of assuming raw model quality alone will solve workflow problems.
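One way to make a context management policy concrete is a token budget applied before files are loaded. The sketch below is illustrative only: the function names are hypothetical, the relevance scores are assumed to come from some upstream retrieval step, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer.

```python
from typing import Dict, List, Tuple


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def select_context(
    files: Dict[str, str],
    relevance: Dict[str, float],
    budget_tokens: int,
) -> Tuple[List[str], int]:
    """Pick files in descending relevance order without exceeding the budget."""
    chosen: List[str] = []
    used = 0
    ranked = sorted(files, key=lambda path: relevance.get(path, 0.0), reverse=True)
    for path in ranked:
        cost = estimate_tokens(files[path])
        if used + cost > budget_tokens:
            continue  # Skip files that would overflow the budget.
        chosen.append(path)
        used += cost
    return chosen, used
```

Logging the returned token count per request gives a simple starting point for the latency and cost profiling mentioned above.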

Anthropic: Creator of Claude and the vendor most directly associated with Opus 4.6’s deployment, speed modes, and long-context rollout.
Claude / Claude Code / anthropic-cli / CLAUDE.md: The broader product and developer ecosystem in which Opus 4.6 was configured, instructed, and operationalized.
OpenClaw / OpenRouter / OpenCode: Tooling layers that exposed Opus 4.6 inside multi-agent systems, CLI workflows, and multi-model orchestration.
Sonnet 4.6, Opus 4.7, GPT-5.4, GPT-5.3-Codex, GLM5, Minimax 2.5, Gemini 3 Pro: Adjacent or competing models used for comparison, substitution, or portfolio-style execution.
1M-token context window / context management / agentic task handling: Key capability themes tied to Opus 4.6’s practical value and limitations.
Simon Willison, Peter Yang, Guillermo Rauch, Boris Cherny, Mike Krieger, Claire Vo, Fernando Irarrázaval, Greg Isenberg: People who surfaced notable examples involving security, context scale, speed, operational risk, and real-world usage.

Newsletter Mentions (12)

2026-06-27

“The underlying Opus 4.6 model used explicit anti-prompt-injection rules, suggesting recent lab efforts at injection defenses are having an effect, though Simon cautions against assuming complete safety for production systems.”

#8 📝 Simon Willison What happened after 2,000 people tried to hack my AI assistant - Fernando Irarrázaval ran a public challenge (hackmyclaw.com) to try to exfiltrate secrets from his OpenClaw instance via email; despite ~6,000 attempts and modest token spend, no secret was leaked. The underlying Opus 4.6 model used explicit anti-prompt-injection rules, suggesting recent lab efforts at injection defenses are having an effect, though Simon cautions against assuming complete safety for production systems.

2026-04-06

“Anthropic’s growth team launches CASH (Claude Accelerates Sustainable Hypergrowth) using Claude with Opus 4.6 to fully automate growth experimentation—from opportunity identification to post-launch analysis—achieving junior PM-level win rates on copy and UI tweaks.”

#12 ▶️ Head of Growth (Anthropic): “Claude is growing itself at this point” Lenny’s Podcast Anthropic’s growth team launches CASH (Claude Accelerates Sustainable Hypergrowth) using Claude with Opus 4.6 to fully automate growth experimentation, from opportunity identification to post-launch analysis, achieving junior PM-level win rates on copy and UI tweaks. Anthropic’s ARR jumped from $1 billion at the start of 2025 to $19 billion by February 2026 (10× YoY growth), hitting $4 billion mid-2025 and $9 billion end-2025, an $18 billion increase in 14 months. CASH was initiated a few months ago but only began delivering reliable results after upgrading from Opus 4.5 to Opus 4.6, automating four stages of growth work (opportunity ID, build, QA/brand compliance, and analysis). Co-work’s desktop app runs a scheduled task each morning on ~20–25 Hex chart links and Slack MCP transcripts, then uses Claude to summarize top concerns and insights in Slack.

2026-03-30

“Claire Vo installed OpenClaw via a one-line Homebrew script on separate macOS machines (three Mac minis and one MacBook Air), configured nine role-based agents (Polly, Finn, Sam, etc.) using Opus-4.6, Sonnet-4.6 and GPT-5.4 models, and linked them to Telegram bots for automating her business outreach and family scheduling.”

#1 ▶️ How OpenClaw’s AI agents run this founder’s business, family and life | Claire Vo Lenny’s Podcast Claire Vo installed OpenClaw via a one-line Homebrew script on separate macOS machines (three Mac minis and one MacBook Air), configured nine role-based agents (Polly, Finn, Sam, etc.) using Opus-4.6, Sonnet-4.6 and GPT-5.4 models, and linked them to Telegram bots for automating her business outreach and family scheduling. She ran “brew install openclaw” in iTerm, chose personal use, selected Opus-4.6, Sonnet-4.6 and GPT-5.4, then registered each agent as a Telegram bot via BotFather. Agent “Sam” performs a daily sweep of her CRM for product-led growth signups, enriches leads with Exa People Search, and drafts and sends outreach emails via Telegram, replacing a human assistant who worked 10 hours/week. She enabled macOS Screen Sharing and Remote Login on her Mac minis so she could SSH into them and view the agent GUIs from her laptop over Wi-Fi, removing the need for dedicated monitors, keyboards or mice.

2026-03-22

“#12 𝕏 Peter Yang says the new 1M-token context window feels like a version bump from Opus 4.6 to 4.7, delivering a noticeable performance and capacity boost.”

A model capability note highlights the impact of longer context windows. #12 𝕏 Peter Yang says the new 1M-token context window feels like a version bump from Opus 4.6 to 4.7, delivering a noticeable performance and capacity boost.

2026-03-14

“1M context is now generally available for Opus 4.6 and Sonnet 4.6. Standard pricing now applies”

Claude now offers a 1 million-token context window in its Opus 4.6 and Sonnet 4.6 models, and this upgrade is generally available to all users. Also covered by: @Claude #2 📝 Simon Willison 1M context is now generally available for Opus 4.6 and Sonnet 4.6 - Anthropic announced 1M token context availability for Opus 4.6 and Sonnet 4.6; standard pricing now applies across the full 1M window with no long-context premium.

2026-03-04

“Guillermo Rauch recounts how an AI model (Opus 4.6) hallucinated a fake GitHub repo ID and inadvertently used Vercel’s API to deploy random code, underscoring the need for strict validation of AI-generated requests.”

Opus 4.6 is discussed in the context of an unsafe deployment action caused by hallucination.

2026-02-25

“#23 in 🥞 Carl Vellotti calls out Opus 4.6 for needlessly loading eight files to answer a two-sentence question and rarely spawning context-saving agents.”

#23 in 🥞 Carl Vellotti calls out Opus 4.6 for needlessly loading eight files to answer a two-sentence question and rarely spawning context-saving agents. He shares a “Context Management” snippet to drop into your CLAUDE.md to fix it.

2026-02-16

“All About AI Uses an autonomous Claude Code agent on a Mac Mini to invoke the OpenCode CLI via OpenRouter on four models (GLM5, Minimax 2.5, Gemini 3 Pro, Opus 4.6) in parallel to generate HTML demos of a retro space game, convert them with Remotion into a grid-style MP4 video, and draft a post on X.”

#2 ▶️ How to Run OpenCode Inside an Autonomous Claude Code AI Agent All About AI Uses an autonomous Claude Code agent on a Mac Mini to invoke the OpenCode CLI via OpenRouter on four models (GLM5, Minimax 2.5, Gemini 3 Pro, Opus 4.6) in parallel to generate HTML demos of a retro space game, convert them with Remotion into a grid-style MP4 video, and draft a post on X. Executed “open code run --model openrouter GLM5 'Should I walk or drive to the car wash? It’s 50 m away'” via Claude Code CLI, receiving “you should walk to the car wash,” and then ran “open code run --model openrouter Gemini-3-Pro …” obtaining “drive. You can’t wash the car if you leave it behind.” Created a Claude Code skill file open code test skill.md to launch four OpenRouter models (GLM5, Minimax-2.5, Gemini-3-Pro, Opus-4.6) in parallel on the prompt “create a full screen animated retro arcade space battle scene,” saving outputs as llm-test/game- .html.
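The parallel fan-out pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the workflow from the video: fan_out and the generate callable are hypothetical, the model names are plain labels rather than real OpenRouter model identifiers, and no network call is made.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fan_out(
    prompt: str,
    models: List[str],
    generate: Callable[[str, str], str],
) -> Dict[str, str]:
    """Send one prompt to several models concurrently and collect the outputs."""
    # max_workers must be positive, so guard against an empty model list.
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {model: pool.submit(generate, model, prompt) for model in models}
        # result() blocks until that model's call finishes (or re-raises its error).
        return {model: future.result() for model, future in futures.items()}


if __name__ == "__main__":
    # Stub generator standing in for a real model call (assumption).
    def stub(model: str, prompt: str) -> str:
        return f"<html><!-- {model}: {prompt} --></html>"

    labels = ["GLM5", "Minimax 2.5", "Gemini 3 Pro", "Opus 4.6"]
    outputs = fan_out("retro arcade space battle scene", labels, stub)
    for label, html in outputs.items():
        print(label, len(html))
```

Threads suit this case because each call mostly waits on I/O. In a real pipeline the generate callable would wrap whatever client or CLI invocation is in use, with timeouts and per-model error handling added.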

2026-02-09

“Mike Krieger has been building with Labs’ fast Opus—Claude Opus 4.6 running 2.5× faster—and calls it a “crazy unlock.””

#5 𝕏 Mike Krieger has been building with Labs’ fast Opus—Claude Opus 4.6 running 2.5× faster—and calls it a “crazy unlock.” He’s now excited to roll it out beyond Anthropic. Also covered by: @Guillermo Rauch

2026-02-08

“Boris Cherny launched the /fast mode in Opus, using significantly more compute than Opus 4.6 and incurring higher costs for incident response and accelerated work on critical projects, and announced his team built and tested this experimental fast mode for Opus 4.6 with Claude over the past few weeks ( tweet ).”

GenAI PM Daily, February 08, 2026. Anthropic Launches Fast Mode for Claude Code #4 𝕏 Boris Cherny launched the /fast mode in Opus, using significantly more compute than Opus 4.6 and incurring higher costs for incident response and accelerated work on critical projects, and announced his team built and tested this experimental fast mode for Opus 4.6 with Claude over the past few weeks (tweet). #15 ▶️ The Two Models that will Dominate AI Discussions Just Got Released (Claude Opus 4.6 + GPT 5.3 Codex) AI Explained Benchmark comparison shows Claude Opus 4.6 outperforms GPT 5.2 by about 140 Elo points on the GDPval white-collar work benchmark, while GPT 5.3 Codex achieves 77.3% on Terminal-Bench 2.0 extra-high settings versus 65.4% for Opus 4.6 Max.
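To put the 140-point figure in perspective, the sketch below converts a rating gap into an expected head-to-head score. It assumes the conventional Elo logistic formula with a 400-point scale; the source does not say how the benchmark's ratings were computed, so treat the result as a rough reading rather than a reported win rate.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score for the higher-rated side under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# A 140-point gap corresponds to roughly a 69% expected score.
print(round(elo_expected_score(140), 3))  # prints 0.691
```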

Claude Code · tool

Anthropic’s coding product/blog referenced in a customer story about Cognition’s use of Claude Fable 5. For AI PMs, it highlights enterprise coding adoption narratives.

Anthropic · company

Anthropic is the company behind Claude and Claude Code. The newsletter covers its new Reflection dashboard and an enterprise deployment of Claude in industrial workflows.

Claude · tool

Anthropic’s assistant and coding tool, discussed here in both the Reflection dashboard and a physical-AI deployment at UST. The newsletter highlights its usage analytics, workflow suggestions, and enterprise integration.

Peter Yang · person

A PM/influencer who shares practical AI workflow experiments around planning, design, and execution. He is cited using Fable, Claude Design, and GPT-5.6 together in a product-building workflow.

Simon Willison · person

A developer and AI commentator quoted here in relation to OpenAI’s clarification of ChatGPT Work behavior. He is relevant as an interpreter and critic of product messaging.

Guillermo Rauch · person

A developer and founder mentioned as a secondary coverage source for Muse Spark 1.1. He is included among the voices discussing the release.

OpenClaw · tool

An AI assistant or agent instance used in a public prompt-injection challenge and later in startup support automation. It is relevant to AI PMs as an example of both security testing and customer support automation.

Vercel · company

A developer platform company mentioned for launching an AI gateway and model routing/origin controls. Relevant to PMs building multi-model infrastructure and trusted inference paths.

Greg Isenberg · person

A startup builder and commentator mentioned using Grok 4.5 inside an agent stack. He is relevant to AI PMs as a practical tester of agentic workflows and product ideas.

Boris Cherny · person

Developer advocate and product figure associated with Claude Code. Here he is credited with rolling out a cleanup command for agentic coding workflows.

GPT 5.4 · tool

A GPT model variant used here for scientific reasoning and agentic chemistry experimentation. The newsletter frames it as a model capable of proposing experimental improvements and driving benchmarked workflows.

Opus 4.7 · tool

A model version associated with the Claude Code hackathon. It is referenced as the build basis for the event and its winners.

OpenCode · tool

A coding agent or development tool mentioned as an integration target for Omnigent. It is part of the agent workflow stack discussed in the newsletter.

OpenRouter · tool

A model-routing platform used to call multiple LLMs through a common interface. Here it is used to run four models in parallel for comparison and generation tasks.

GPT-5.3-Codex · tool

OpenAI’s coding-focused model/release highlighted for benchmark performance, steerability, and speed improvements. The newsletter frames it as a strong coding agent option with multiple benchmark scores.

Claude.md · tool

A steering file used to guide Claude Code behavior through repository-specific instructions. It is part of a broader control surface for agent workflows.

Sonnet-4.6 · tool

A Claude model used in the newsletter's example to run Python code and analyze a floor plan. It is discussed as part of an agentic workflow inside Claude Cowork.

Gemini 3 Pro · tool

A Gemini model variant used in a real workflow library project. The newsletter mentions it as one of the tools used to build the ChatPRD index.
