Simon Willison
Independent AI commentator and developer known for practical analysis of LLM products. Here he argues Anthropic and OpenAI have found product-market fit.
Key Highlights
- Simon Willison is a trusted independent voice translating fast-moving LLM changes into practical product and engineering implications.
- He combines commentary with hands-on tooling, including the llm Python library and related model plugins.
- His coverage is especially useful for AI PMs tracking model migrations, coding agents, structured outputs, and developer UX shifts.
- A notable recent thesis is that Anthropic and OpenAI have likely reached product-market fit based on usage and spending signals.
- His work consistently favors direct experimentation and operational insight over hype or benchmark-only narratives.
Overview
Simon Willison is an independent developer, writer, and AI commentator who has become one of the most consistently useful interpreters of the fast-moving LLM product ecosystem. He is especially known for practical analysis of model capabilities, API changes, coding workflows, prompt design, and the day-to-day realities of using AI tools in production. For AI Product Managers, his work matters because it translates hype into evidence: what changed, what works, what broke, and what teams should actually pay attention to.
Across newsletters and ecosystem discussion, Simon shows up as both a practitioner and a curator. He ships tools such as the `llm` Python library and related plugins, documents model releases quickly, and offers grounded commentary on topics like agentic coding, prompt injection, model migration, HTML-based outputs, and product-market fit in frontier AI labs. His perspective is valuable to AI PMs because it sits at the intersection of product usage, developer ergonomics, and real-world adoption signals.
Key Developments
- 2026-04-25 — Simon was cited alongside Cursor and Aravind Srinivas in coverage of GPT-5.5 and GPT-5.5 Pro API availability, reflecting his role as a trusted interpreter of major model launches.
- 2026-04-26 — He highlighted key takeaways from OpenAI’s GPT-5.5 prompting guide, including migration advice, a Codex-assisted project migration command, and the warning to treat GPT-5.5 as a new model family rather than a drop-in replacement.
- 2026-05-05 — Simon published “Granite 4.1 3B SVG Pelican Gallery,” an experiment comparing outputs from quantized Granite variants. His conclusion that output quality did not clearly track model size reinforced the need for empirical evaluation over assumptions.
- 2026-05-07 — He live-blogged Anthropic’s Code w/ Claude 2026 event, capturing keynote highlights and product observations in real time.
- 2026-05-07 — He also published “Vibe coding and agentic engineering are getting closer than I’d like,” signaling concern that lightweight, intuition-driven AI coding workflows are converging with more disciplined engineering practices in potentially risky ways.
- 2026-05-08 — Simon announced `llm-gemini` 0.31, noting that Gemini 3.1 Flash Lite was no longer in preview, a useful signal for teams watching model maturity and production readiness.
- 2026-05-09 — In “Using Claude Code: The Unreasonable Effectiveness of HTML,” he explored asking models for HTML instead of Markdown, showing how structured output can enable richer interfaces such as SVG diagrams and interactive widgets.
- 2026-05-12 — He highlighted Tobias Lütke’s description of Shopify’s public coding agent River, connecting it to Midjourney’s public workflow model and arguing that visible work can accelerate organizational learning.
- 2026-05-13 — Simon released `llm` 0.32a2, adding support for OpenAI’s `/v1/responses` endpoint and improving handling of reasoning-capable models that interleave reasoning and tool calls.
- 2026-05-19 — He published “The last six months in LLMs in five minutes,” annotated slides from his PyCon US 2026 lightning talk summarizing the recent pace of model and product change.
- 2026-05-28 — In “I think Anthropic and OpenAI have found product-market fit,” Simon argued that both companies show strong PMF signals, citing rising LLM usage and unexpectedly large cost impacts for customers as evidence of real demand.
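The HTML-instead-of-Markdown idea from the 2026-05-09 entry can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from Simon's post: `build_html_prompt` and `summarize_html` are hypothetical helper names, the prompt wording is invented for the example, and the sample reply is hand-written rather than real model output. It shows how a team might ask for a self-contained HTML page and then check whether the reply contains the richer elements (inline SVG, interactive controls) that motivate the pattern.

```python
from html.parser import HTMLParser


def build_html_prompt(topic: str) -> str:
    """Return a prompt asking for one self-contained HTML page (hypothetical wording)."""
    return (
        f"Explain {topic} as a single self-contained HTML page. "
        "Use inline SVG for diagrams and do not load external resources."
    )


class _TagCounter(HTMLParser):
    """Counts the start tags seen while parsing an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts: dict[str, int] = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <circle/> also arrive here by default.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def summarize_html(html_text: str) -> dict[str, int]:
    """Return a tag -> count mapping for the given HTML text."""
    parser = _TagCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


# Hand-written stand-in for a model reply (assumption: not real model output).
sample_reply = (
    "<html><body><h1>Demo</h1>"
    "<svg width='10' height='10'><circle cx='5' cy='5' r='4'/></svg>"
    "<button>Toggle</button></body></html>"
)

counts = summarize_html(sample_reply)
# Treat the reply as "rich" if it has at least one diagram or control.
has_rich_output = counts.get("svg", 0) > 0 or counts.get("button", 0) > 0
```

A check like this only confirms that the expected elements are present; whether the page is actually useful still needs a human look, which fits the emphasis on direct experimentation throughout these entries.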
Relevance to AI PMs
1. He is an early signal source for product and platform shifts. Simon often spots the practical implications of model launches, API changes, and tooling updates before official messaging is fully digested by the market. AI PMs can use his analysis to refine roadmap timing, migration plans, and vendor evaluation.
2. He models evidence-based product judgment. Rather than relying on benchmark-driven narratives, Simon frequently tests systems directly and shares concrete examples. This is especially useful for PMs evaluating coding agents, structured outputs, model quality, and new interaction patterns.
3. He surfaces workflow patterns that affect adoption. His writing on HTML outputs, public agent workflows, tool-calling, reasoning visibility, and agentic engineering helps PMs think beyond model selection and toward UX, trust, collaboration, and operational fit.
Related
- Anthropic / Claude / Claude Code — Simon closely follows Anthropic’s coding products and events, making him a strong source for understanding agentic development workflows.
- OpenAI / Codex / GPT-5.5 / GPT-5.4 / GPT-5.1 — He frequently interprets OpenAI model releases, prompting guidance, and API shifts, especially where they affect developer products.
- `llm` / `llm-gemini` / `llm-anthropic` — His open source tooling gives him hands-on credibility and makes his commentary especially relevant for teams building multi-model workflows.
- Google / Gemini / Gemma — Simon tracks Google’s model and plugin ecosystem, including production-readiness signals such as preview status changes.
- Datasette — His broader identity as a builder, not just a commentator, is reinforced by projects like Datasette, which contribute to his reputation for practical software craftsmanship.
- PyCon US 2026 — His conference talk and annotated slides show his role as a concise educator for developers trying to keep up with rapid LLM change.
- Prompt injection / agentic engineering / coding agents — These themes recur in his work and are directly relevant to PMs managing risk, UX, and system reliability in AI products.
Newsletter Mentions (63)
#23 📝 Simon Willison I think Anthropic and OpenAI have found product-market fit - Simon argues that Anthropic and OpenAI appear to have reached product-market fit, driven by rising LLM usage and surprising cost impacts for companies.
#25 📝 Simon Willison The last six months in LLMs in five minutes - Annotated slides from Simon Willison’s five-minute lightning talk at PyCon US 2026, built with his annotated presentation tool. The post links to the full slide deck and provides a brief introduction to the material.
#10 📝 Simon Willison llm 0.32a2 - llm 0.32a2 adds several useful features, with a key change being support for OpenAI models using the /v1/responses endpoint so reasoning-capable models can interleave reasoning and tool calls; the release highlights summarized reasoning tokens displayed separately and introduces flags to hide reasoning if desired.
#8 📝 Simon Willison Learning on the Shop floor - Simon highlights Tobias Lütke's description of Shopify's public, Slack-based coding agent 'River' that encourages learning by making work visible. He relates it to Midjourney's public Discord approach and suggests public workflows can accelerate learning.
#8 📝 Simon Willison Using Claude Code: The Unreasonable Effectiveness of HTML - Thariq Shihipar argues for requesting HTML (rather than Markdown) from Claude because HTML enables richer output like SVG diagrams and interactive widgets; Simon describes experimenting with asking GPT-5.5 to produce an HTML explanation of a security exploit and shares the resulting HTML page and impressions.
“#3 📝 Simon Willison llm-gemini 0.31 - Release announcement for llm-gemini 0.31 noting that gemini-3.1-flash-lite is no longer a preview.”
Simon Willison is cited in two separate items, one about llm-gemini and another about Firefox hardening with Claude Mythos preview.
#19 📝 Simon Willison Live blog: Code w/ Claude 2026 - Live notes from Anthropic’s Code w/ Claude event covering the morning keynote sessions. The post is a live blog capturing highlights and observations from the event.
#25 📝 Simon Willison Vibe coding and agentic engineering are getting closer than I’d like - A reflection on AI coding tools following a podcast conversation, highlighting concerns that vibe coding and agentic engineering are beginning to converge.
#4 𝕏 NVIDIA AI now offers end-to-end support in Megatron Core for training 30B-scale Kimi K2 and Qwen3 models with higher-order optimizers (Muon, MOP, REKLS), pushing efficiency on GB300 GPUs and NVL72 systems beyond standard data-parallel methods.
#5 📝 Simon Willison Granite 4.1 3B SVG Pelican Gallery - Simon tried prompting different quantized variants of IBM's Granite 4.1 3B model to 'Generate an SVG of a pelican riding a bicycle' and published a gallery of the results. He found no clear relationship between model size and output quality; most results were poor.
#6 📝 Anthropic News Building a new enterprise AI services company with Blackstone, Hellman & Friedman, and Goldman Sachs - Anthropic announced plans to build a new enterprise AI services company in partnership with Blackstone, Hellman & Friedman, and Goldman Sachs.
OpenAI Releases GPT-5.5 Prompting Guide
#1 📝 Simon Willison GPT-5.5 prompting guide - OpenAI published prompting guidance for GPT-5.5, including a recommendation to send a short user-visible update before tool calls in multi-step tasks to improve UX. Simon highlights migration tips, a Codex command to migrate projects, and a warning to treat GPT-5.5 as a new model family rather than a drop-in replacement.
#1 𝕏 Sam Altman announced that GPT-5.5 and GPT-5.5 Pro are now accessible via the API, enabling builders to integrate the latest model upgrades into their apps. Also covered by: @Simon Willison, @Cursor, @Aravind Srinivas
Related
Anthropic's coding assistant used for programming and automation tasks. The newsletter references it for building a custom approval device and for writing and research workflows inside AI agents.
AI company behind Claude. The newsletter references Claude usage and later notes Anthropic may have reached product-market fit.
AI company behind Codex and other products. The newsletter references its Codex-based tax agents and the OpenAI Foundation's initial commitment.
Anthropic's model family used for agent orchestration and developer workflows. In this newsletter it is highlighted as powering CodeRabbit's agent orchestration system.
An AI coding editor and automation platform. The newsletter highlights multi-repository support for automations across codebases.
A creator mentioned again as raising seed funding and choosing AI agents for onboarding and role learning. He is also the source credit on the Ryan Carson item.
An AI data infrastructure company known for building tools around retrieval and document processing. Here it is credited with launching LiteParse v2.0.
OpenAI's coding agent/tool used here for self-improving tax workflows and long-running autonomous loops. It is presented as capable of iterative task execution with plugins and goal-based runs.
A Google AI/Developer Relations figure mentioned for demonstrating Gemini Managed Agents and the Interactions API. He appears here as a presenter explaining hosted sandboxed agent execution.
A newsletter/podcast operator cited for summarizing Dan Shipper’s view on AI, work, and value creation. He connects the discussion to skill commoditization and recombination.
An AI agent workflow system used to automate founder and operator tasks with cron jobs, skills, and integrations. The newsletter cites it as part of a solo-founder operating stack alongside Codex and Devin.
Google's frontier AI lab. The newsletter references a Google Research privacy approach and Google I/O 2026 announcements, which are adjacent to DeepMind's broader ecosystem.
A Google AI product leader mentioned for announcing Lyria 3 availability via API. The newsletter credits him with a distribution update relevant to developers.
Google's AI assistant/model family mentioned as one of the systems that can answer category-level brand questions. It is presented alongside ChatGPT and Perplexity in the context of AI-driven visibility.
A major AI platform and product company shipping Gemini models, Search AI features, and developer tools. Important for AI PMs because many of the newsletter’s launches reflect Google’s evolving AI ecosystem.
A practitioner who used Claude and Cursor to generate a design system from GitHub repos. Relevant to PMs for rapid product and design-system iteration.
Well-known AI researcher and builder, mentioned here as joining Anthropic to use Claude for research acceleration. Relevant to AI PMs as a signal of AI-powered research workflows and talent movement.
An ML researcher and writer mentioned for highlighting Gated DeltaNet-2 and sharing a primer on Gated DeltaNet. Relevant for technical AI architecture discussion.
An AI platform and ecosystem company whose products are analyzed in relation to how coding assistants mention them. The newsletter includes it in the context of dataset analysis and assistant behavior.
The Perplexity founder/CEO, mentioned discussing enterprise adoption, security engineering, and Perplexity Computer. He appears here as a voice on agentic security workflows and search infrastructure.
The AI model family/company behind Qwen3.7-Max. The mention indicates a significant release aimed at agentic coding and productivity workflows.
CEO of OpenAI and a prominent AI industry leader. Here he is quoted announcing the OpenAI Foundation's initial $250M commitment.
Co-founder and CEO of Google DeepMind. He is mentioned in connection with Gemini 3.5 Flash and Google’s model launch.
Google AI leader and notable voice in model launches and research updates. Mentioned here in connection with Gemini 3.5 Flash and Google’s AI releases.
CEO of Google and Alphabet mentioned in the context of Google I/O and Gemini strategy. The newsletter cites him in a discussion about AI roadmap and product direction.
Autonomous or semi-autonomous software systems that can take actions, manage workflows, and assist with operational work. The newsletter references them in multiple founder and startup productivity contexts.
An AI development pattern where models act more like autonomous coding agents. The newsletter uses it to describe both NVIDIA Dynamo’s target workload and GPT-5.5/Codex improvements.
Anthropic Labs is mentioned as the organization where Henry Shi works with the founders. It appears as part of the credibility framing for the sponsored AI PM certification.
A model name referenced as part of a survey of recent LLM architectures. It is notable here as an example of the current pace of model iteration and architecture experimentation.
A Claude model version referenced as part of a prompt-comparison analysis. It serves as one endpoint for examining changes in Anthropic’s system prompt evolution.
Anthropic’s latest Opus-class model release with a 1 million-token context window. It is positioned for long-context planning, coding, and agentic task execution.
A newer OpenAI model release with improved natural dialogue, longer context, and stronger tool use. It is discussed as a model now available in Cursor and chatprd.
A frontier coding-capable model referenced in a benchmark comparison. The newsletter says it outperformed earlier coding models but still lagged behind human senior engineers in Every’s test.
A parsing tool used to ingest documents without a vector database in the described demo. It supports exact citation highlighting on original PDF pages.
A Claude model used in the Polymarket trading challenge. It is compared directly with Codex CLI 5.5 on the same market and prompt conditions.
OpenAI's coding assistant referenced as a runtime for NVIDIA-Verified Agent Skills. It appears alongside Claude and Cursor.ai as an interoperable platform.
A Claude model used in the newsletter's example to run Python code and analyze a floor plan. It is discussed as part of an agentic workflow inside Claude Cowork.
A model-routing platform used to call multiple LLMs through a common interface. Here it is used to run four models in parallel for comparison and generation tasks.
Consumer technology company that builds iPhone, Mac, and Apple Intelligence features. In this newsletter it is referenced as partnering with Google for future Apple Intelligence capabilities.
A Gemini model variant that was noted as moving out of preview status.
Agents that perform coding tasks and can increasingly orchestrate adjacent workflows like design. The newsletter uses them as the execution layer for Design.md scripts.
Simon Willison’s command-line LLM tool for interacting with models and APIs. This release adds support for OpenAI’s Responses endpoint and better reasoning-token handling.
A generative media company referenced as an example of a public Discord-based workflow. It is used here to support the idea that visible communities can accelerate learning and product adoption.
A Qwen model release referenced alongside Qwen3.6-Plus and integrated with opencode. It is one of the named models in the announcement.
A W3C-backed browser extension that exposes website functionality to MCP-capable agents. It lets developers register site functions as structured tools in the browser.
A Qwen model release with day-0 support for multimodal integration. The newsletter highlights its immediate compatibility with MLX-VLM for visual-language workflows.
The class of models discussed as having a blind spot with continuous, high-dimensional, noisy data. This concept is used to frame a limitation in current AI capabilities.
A browser automation protocol used here to let a Claude Code agent control Chrome programmatically.
A test-driven development pattern adapted for coding agents. It emphasizes an iterative failure/success loop that can make agentic coding more reliable.
A security risk pattern where AI agents have private data access, ingest untrusted content, and can exfiltrate data. For AI PMs, it is a key framework for designing safe agent features.
A Google AI text-to-speech model with native multi-speaker dialogue support across many languages. It is positioned as part of the Gemini product family.
Google AI Edge Gallery is a Google tool for showcasing and running on-device AI experiences at the edge, including offline use cases.
The practice of building software systems where agents plan and execute tasks with autonomy. The newsletter uses it in the context of anti-patterns and agent behavior management.
A collection of techniques and patterns for building agentic systems. The newsletter frames it as a guide page for AI builders.
An attack pattern where malicious inputs manipulate an AI agent into leaking data or taking unintended actions. The newsletter uses it in the context of Copilot-related data exfiltration.
Technology company that offers the Granite family of models. In this newsletter it appears in relation to Simon Willison's prompting experiments with Granite 4.1 3B.
An ecommerce company referenced for its public, Slack-based coding agent River. The example is used to discuss how visible workflows can accelerate learning and adoption.
A model family from Google used as the base for TranslateGemma. It matters to PMs as an example of reusing a foundation model for a specialized, deployable product.
Armin Ronacher is a developer and writer who often explores AI tooling and infrastructure. In this issue he is credited with a piece on local models, inference engines, and serving ergonomics.
OpenAI leader and product/engineering voice associated here with confirming Codex’s unification with the main model. The newsletter cites him via Simon Willison’s note.
A Codex-powered model release from OpenAI aimed at developers and product teams. The newsletter emphasizes its availability as a research preview and its high token throughput.
A Python library for working with LLM providers through an abstraction layer. The newsletter notes that API research is informing a major change to its provider abstraction.
An open-weight multimodal model in Alibaba's Qwen3.5 series, aimed at agentic and vision-capable use cases. It is relevant to PMs evaluating model capabilities, openness, and deployment options.
A product and engineering concept describing the hidden cost of AI-accelerated development when teams lose shared understanding of the system. It reframes debt from code maintenance to team cognition and system comprehension.
A repository for researching LLM providers' HTTP APIs. It supports abstraction-layer decisions for developers building against multiple model providers.
A quoted individual in a commentary about code quality incentives in AI systems. The newsletter uses him as the source of a viewpoint on maintainable code.