AI Concepts
57 entities tracked across daily AI PM newsletters
A protocol used to connect AI agents to tools and data sources. The newsletter contrasts MCP with APIs as foundational plumbing for agent actions and prompt-evaluation workflows.
MCP is emerging as a protocol layer for connecting AI agents to tools, services, and data sources in a more agent-friendly way.
Autonomous or semi-autonomous software systems that can take actions, manage workflows, and assist with operational work. The newsletter references them in multiple founder and startup productivity contexts.
AI agents are framed as autonomous systems that can plan, use tools, manage state, and execute multi-step work beyond simple chat interactions.
An AI development pattern where models act more like autonomous coding agents. The newsletter uses it to describe both NVIDIA Dynamo’s target workload and GPT-5.5/Codex improvements.
Agentic coding refers to AI systems that can plan, code, test, and iterate more like autonomous software agents than simple autocomplete tools.
A rapid, intuition-driven way of building software with AI assistance. For PMs, it represents low-friction prototyping and UI iteration.
Vibe-coding is an AI-assisted, intuition-driven way to turn ideas into working software prototypes with minimal friction.
A method for structuring prompts and surrounding artifacts across multiple layers, such as specs, wireframes, and data, to improve AI output quality. It is especially useful for PMs designing AI-assisted product workflows.
Context engineering focuses on designing the full input environment for AI systems, not just the prompt.
A pattern for answering questions by retrieving relevant context and generating responses from it. The newsletter highlights multimodal RAG for searching across audio, image, and video data.
RAG is a core pattern for grounding LLM outputs in retrieved external context instead of model memory alone.
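The retrieve-then-generate flow can be sketched in a few lines. This is a toy illustration, not the newsletter's implementation: the word-overlap retriever and the `build_prompt` helper are assumptions made for the example, and a real system would use a proper search index and pass the prompt to a model.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, k=2):
    """Return the k documents sharing the most tokens with the query (toy retriever)."""
    query_counts = Counter(tokenize(query))

    def overlap(doc):
        doc_counts = Counter(tokenize(doc))
        return sum(min(query_counts[t], doc_counts[t]) for t in query_counts)

    return sorted(documents, key=overlap, reverse=True)[:k]


def build_prompt(query, documents, k=2):
    """Ground the question in retrieved context before it is sent to a model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "The refund window is 30 days.",
    "Shipping takes 5 business days.",
    "Our office is in Berlin.",
]
prompt = build_prompt("How long is the refund window?", docs, k=1)
```

The model call itself is left out on purpose; the point is that the prompt carries retrieved text rather than relying on model memory.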
A concept for modular agent capabilities or instructions, mentioned as an emerging hint toward open standards. It is discussed alongside agents.md in the context of agent harness interoperability.
Skills describes modular, reusable agent capabilities or instruction bundles that can be installed and shared across tools.
Agents that perform coding tasks and can increasingly orchestrate adjacent workflows like design. The newsletter uses them as the execution layer for Design.md scripts.
Coding agents are evolving from code assistants into autonomous execution systems for software and adjacent workflows.
Simon Willison’s command-line LLM tool for interacting with models and APIs. This release adds support for OpenAI’s Responses endpoint and better reasoning-token handling.
LLM here refers both to large language models broadly and specifically to Simon Willison’s `llm` command-line tool in the newsletter context.
A framework for measuring whether AI agents reliably complete tasks across real inputs, edge cases, and version changes. It emphasizes step-level traces and component-level decisions, not just final output quality.
Agent evaluation measures not only final outputs but also the steps, tool calls, and component decisions behind them.
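As a rough sketch of scoring steps as well as final outputs, the example below reports two separate rates. The `Step` and `Trace` types and the metric names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    tool: str  # name of the tool the agent called
    ok: bool   # whether this step behaved as expected


@dataclass
class Trace:
    steps: list          # ordered Step records for one task run
    final_correct: bool  # whether the final output was acceptable


def evaluate(traces):
    """Report task-level and step-level success rates separately."""
    total_steps = sum(len(t.steps) for t in traces)
    ok_steps = sum(s.ok for t in traces for s in t.steps)
    return {
        "task_success_rate": sum(t.final_correct for t in traces) / len(traces),
        "step_success_rate": ok_steps / total_steps if total_steps else 0.0,
    }


traces = [
    Trace([Step("search", True), Step("write", True)], final_correct=True),
    Trace([Step("search", False), Step("write", True)], final_correct=False),
]
report = evaluate(traces)
```

Keeping the two rates apart makes it visible when a run fails because of one bad tool call rather than a weak final answer.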
A workflow/mode for using AI systems to search the web, synthesize information, and produce detailed reports. The newsletter frames it as a practical capability for research-heavy PM work.
Deep Research combines web search, source synthesis, multimodal input handling, and report generation into one AI workflow.
The class of models discussed as having a blind spot with continuous, high-dimensional, noisy data. This concept is used to frame a limitation in current AI capabilities.
LLMs are foundational to many generative AI products but are best understood as powerful language models, not universal intelligence systems.
A technique for grounding model outputs in retrieved information. It is cited here as a component of a modular agent framework.
RAG improves model outputs by retrieving relevant external information at generation time.
Benchmarking methods for evaluating AI coding agents in realistic software tasks. The newsletter notes that infrastructure variability can materially affect scores.
Agentic coding evals measure AI coding agents on realistic software tasks rather than isolated prompt-response coding tests.
An open-source agent framework associated with Harrison Chase. In the newsletter it is being optimized for open-source models as closed-model costs rise.
Deepagents emerged as an open-source counterpart to the Claude Agent SDK, tied to Harrison Chase and the LangChain ecosystem.
A workflow pattern where a main AI system delegates parts of a task to parallel helper agents. Relevant to PMs because it can improve speed, context management, and long-running task execution.
Subagents let a main AI system split complex work into smaller helper-agent tasks, often in parallel.
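A minimal sketch of parallel delegation, using a thread pool from the Python standard library. `run_subagent` is a hypothetical stand-in for a real model or agent call.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(subtask):
    """Hypothetical helper agent; a real one would call a model here."""
    return f"summary of {subtask}"


def delegate(subtasks, max_workers=4):
    """Run one helper per subtask in parallel and collect results by subtask."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        results = list(pool.map(run_subagent, subtasks))
    return dict(zip(subtasks, results))


results = delegate(["pricing page", "onboarding flow"])
```

Each helper sees only its own subtask, which is one way the main agent's context stays small.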
A security risk pattern where AI agents have private data access, ingest untrusted content, and can exfiltrate data. For AI PMs, it is a key framework for designing safe agent features.
The lethal trifecta describes the dangerous combination of private data access, untrusted content ingestion, and exfiltration capability in one AI system.
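One way to make the trifecta checkable during feature design is a simple capability flag check. This is a sketch: the `AgentCapabilities` type and its field names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    reads_private_data: bool         # e.g. inbox, files, internal docs
    ingests_untrusted_content: bool  # e.g. web pages, inbound email
    can_send_data_externally: bool   # e.g. HTTP requests, outbound email


def has_lethal_trifecta(caps):
    """True only when all three risky capabilities are present together."""
    return (
        caps.reads_private_data
        and caps.ingests_untrusted_content
        and caps.can_send_data_externally
    )


risky = AgentCapabilities(True, True, True)
safer = AgentCapabilities(True, True, False)  # exfiltration path removed
```

Removing any one of the three capabilities breaks the combination, which is why the check is a plain conjunction.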
An attack pattern where malicious inputs manipulate an AI agent into leaking data or taking unintended actions. The newsletter uses it in the context of Copilot-related data exfiltration.
Prompt injection manipulates AI systems by embedding malicious instructions in untrusted inputs the model consumes.
A paradigm that hands cloud infrastructure work to autonomous coding agents in order to automate deployment and operations. For AI PMs, it reframes infrastructure as an agentic workflow rather than a static system.
Agentic Infrastructure reframes cloud operations as autonomous agent workflows rather than static systems.
Reusable Claude-based skill modules that package agentic workflows into portable components. The newsletter frames them as a way to avoid building AI agents from scratch.
Claude skills package repeatable AI workflows into reusable modules, reducing the need to build agents from scratch.
A lexical retrieval ranking function used here to select relevant tool definitions. In PM tooling, it helps improve retrieval accuracy and reduce context-window bloat.
BM25 is a lexical ranking function that helps AI systems retrieve the most relevant documents or tool definitions.
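A compact sketch of BM25 ranking over tool definitions. Assumptions: the common defaults `k1=1.5` and `b=0.75`, the non-negative IDF variant `ln(1 + (N - n + 0.5) / (n + 0.5))`, a non-empty document list, and made-up tool descriptions. It is not the newsletter's implementation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices ordered from most to least relevant."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer documents need more matches to score equally.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


tools = [
    "create_invoice: create a new invoice for a customer",
    "send_email: send an email message",
    "get_weather: get the weather forecast for a city",
]
ranking = bm25_rank("send email to customer", tools)
```

Passing only the top-ranked tool definitions to the model is what keeps the context window small.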
An approach to AI systems where agents perform tasks autonomously with tools and browser interaction. The newsletter frames 2026 as a year focused less on novelty and more on trust in deployed agentic systems.
Agentic AI describes autonomous AI systems that can plan, use tools, and complete multi-step tasks across software and browser environments.
A test-driven development pattern adapted for coding agents. It emphasizes an iterative failure/success loop that can make agentic coding more reliable.
Red/green TDD adapts classic test-driven development into a structured workflow for coding agents.
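The failure/success loop can be sketched as follows. This is a simplified illustration, assuming the agent proposes candidate implementations one at a time; `red_green_cycle` and the example functions are hypothetical names.

```python
def red_green_cycle(test, candidates):
    """Return the first candidate that turns a failing test green.

    `test` takes an implementation and returns True when it passes.
    `candidates` starts with the current implementation, which must fail first.
    """
    baseline, *attempts = candidates
    # Red: confirm the test fails before any change is made.
    if test(baseline):
        raise ValueError("test already passes, so it proves nothing yet")
    # Green: try each proposed change until the test passes.
    for attempt in attempts:
        if test(attempt):
            return attempt
    return None


def stub(a, b):
    return 0


def wrong(a, b):
    return a * b


def right(a, b):
    return a + b


def adds_two_and_three(impl):
    return impl(2, 3) == 5


winner = red_green_cycle(adds_two_and_three, [stub, wrong, right])
```

The red check matters for agents in particular: a test that passes before any code changes gives no signal about whether the agent's work did anything.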
The practice of building software systems where agents plan and execute tasks with autonomy. The newsletter uses it in the context of anti-patterns and agent behavior management.
Agentic engineering centers on building software with agents that can plan, write, and execute code autonomously.
A collection of techniques and patterns for building agentic systems. The newsletter frames it as a guide page for AI builders.
Agentic Engineering Patterns is a guide that collects practical techniques for building and operating agentic systems.
A framework for defining, managing, and retiring capabilities that AI agents can use. The newsletter frames it as an operational way to keep agent behavior current and useful.
Agent Skills provide a modular framework for defining and maintaining what AI agents can do.
A PM framework focused on user value, tradeoffs, and outcomes rather than just technical implementation. Mentioned here as a skill engineers should develop in AI product teams.
Product-thinking emphasizes user value, tradeoffs, and outcomes over pure implementation.
A concept covering how organizations evaluate large language models consistently and meaningfully. The newsletter frames standardization of benchmarks as a major enterprise challenge.
LLM benchmarks give organizations a repeatable way to compare model performance on real product tasks.
A product and engineering concept describing the hidden cost of AI-accelerated development when teams lose shared understanding of the system. It reframes debt from code maintenance to team cognition and system comprehension.
Cognitive debt describes how AI-accelerated development can shift costs from code maintenance into lost team understanding.
A tool interface used with skill.md to reduce token usage and run MCP commands in a more efficient way.
CRI is a lightweight interface for running MCP commands with lower token overhead.
A pattern for agent-to-agent communication and collaboration. The newsletter mentions it as part of a step-by-step approach to building multi-agent systems.
A2A is both a pattern and a protocol for how multiple AI agents communicate and collaborate.
A memory architecture that mimics human memory instead of relying on RAG or vector search. For PMs, it suggests alternative approaches to long-context recall and personalization.
Large Memory Models are described as a memory architecture that mimics human memory rather than relying on RAG or vector search.
A modular layer that adds tools, guardrails, and custom instructions to AI agents. It is described as a composable harness for production agent systems.
Agent middleware is a modular layer for adding tools, guardrails, and instructions to AI agents.
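A minimal sketch of the composable idea, treating an agent as a function from message to reply and each middleware as a wrapper around it. All names here are hypothetical, and `echo_agent` stands in for a real model call.

```python
def with_guardrail(blocked_words):
    """Middleware that refuses messages containing any blocked word."""
    def middleware(next_handler):
        def handler(message):
            if any(word in message.lower() for word in blocked_words):
                return "Request blocked by guardrail."
            return next_handler(message)
        return handler
    return middleware


def with_instructions(prefix):
    """Middleware that prepends custom instructions to every message."""
    def middleware(next_handler):
        def handler(message):
            return next_handler(f"{prefix}\n{message}")
        return handler
    return middleware


def compose(agent, *middlewares):
    """Wrap the agent so the first middleware listed runs first."""
    for mw in reversed(middlewares):
        agent = mw(agent)
    return agent


def echo_agent(message):
    """Hypothetical base agent; a real one would call a model."""
    return f"agent saw: {message}"


agent = compose(echo_agent, with_guardrail(["password"]), with_instructions("Be concise."))
```

Because each layer only knows about the next handler, guardrails and instructions can be added or removed without touching the agent itself.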
A software architecture paradigm where engineers orchestrate agents instead of hard-coding decision trees. For PMs, it suggests product teams may design systems around LLM behavior rather than deterministic logic.
Agent-first software design shifts software building from hard-coded decision trees to orchestrated agent behavior.
A practice of capturing learnings from prompts and agent interactions to steadily improve system behavior over time. For PMs, it is a feedback-loop mindset for iterative AI product improvement.
Compound Engineering is the practice of capturing prompt and agent learnings so future AI outputs improve over time.
AI models whose weights or availability are open enough to encourage broad reuse and experimentation. The newsletter frames them as a driver of innovation across the ecosystem.
Open models are framed in the newsletter as a major driver of AI innovation across startups, researchers, students, and industries.
A memory architecture pattern for AI agents that separates different memory layers to improve context retention and task performance. It is presented as part of the design of autonomous coding assistants.
Layered memory separates short-term, task-level, and longer-term memory to improve AI agent performance.
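One possible shape for the three layers, as a sketch only. The `LayeredMemory` class, the window size of 3, and the method names are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LayeredMemory:
    # Short-term: a small rolling window of recent messages.
    short_term: deque = field(default_factory=lambda: deque(maxlen=3))
    # Task-level: working state for the current task only.
    task: dict = field(default_factory=dict)
    # Longer-term: summaries kept across tasks.
    long_term: list = field(default_factory=list)

    def observe(self, message):
        """Record a message; the oldest one drops out once the window is full."""
        self.short_term.append(message)

    def finish_task(self, summary):
        """Promote a task summary to long-term memory and reset task state."""
        self.long_term.append(summary)
        self.task.clear()

    def context(self):
        """Assemble what would be passed to the model on the next turn."""
        return {
            "recent": list(self.short_term),
            "task": dict(self.task),
            "long_term": list(self.long_term),
        }


memory = LayeredMemory()
for i in range(5):
    memory.observe(f"msg {i}")
memory.task["goal"] = "fix bug"
before = memory.context()
memory.finish_task("fixed bug")
after = memory.context()
```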
A lightweight skills-based pattern for packaging agent capabilities in small context-efficient files.
skill.md is a lightweight pattern for packaging agent capabilities into small, context-efficient files.
A training approach used here to teach Composer to self-summarize, reducing reliance on handcrafted prompts.
Reinforcement learning helps AI systems improve behavior based on outcome feedback rather than prompts alone.
Systems composed of multiple cooperating AI agents, often designed to divide work and collaborate through structured patterns. The newsletter references building these systems with Python and agent-to-agent communication patterns.
Multi-agent systems divide work across specialized AI agents that coordinate through structured communication patterns.
A defensive technique mentioned as part of Claude Code's strategy to deter model distillation by misleading competitors' training runs.
Anti-distillation poison pills are designed to make model outputs less useful for competitors attempting distillation.
Programmable interfaces that let AI agents and software systems access services and complete tasks. The newsletter positions APIs as one of the means for agents to act on behalf of users.
APIs let AI agents access services and take actions on behalf of users.
OpenTelemetry is an observability standard for traces, logs, and metrics. The newsletter mentions Codex exporting agent-aware telemetry through it for auditing and monitoring.
OpenTelemetry is a standard for collecting traces, logs, and metrics across software and AI systems.
A protocol for connecting AI models to external tools and servers. The newsletter references discovery of MCP servers and reducing MCP token usage.
Model Context Protocol standardizes how AI models and agents connect to external tools, servers, and data sources.
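For a concrete feel of the wire format: MCP messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method. The sketch below only builds such a request; the tool name `search_docs` and its arguments are hypothetical, and transport and session setup are omitted.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Serialize a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# `search_docs` is a made-up tool name used only for this example.
request = build_tool_call(1, "search_docs", {"query": "refund policy"})
decoded = json.loads(request)
```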
A structured-prompt framework for improving the consistency and quality of outputs from Claude Code. It is positioned as a way to turn an AI coding assistant into a more reliable development partner.
SuperClaude is a community framework that uses structured prompts to improve the consistency of Claude Code outputs.
A framework for specifying goals, context, and guardrails in multi-agent systems. It helps PMs guide autonomous agents with explicit objectives and stop rules rather than rigid control.
Intent Engineering helps PMs specify objectives, context, and guardrails for autonomous agents.
A workflow framework for building customizable agentic systems. It is highlighted as integrating with ACP.
Agent Workflows is a framework for building customizable agentic systems within the LlamaIndex ecosystem.
A legacy programming language often targeted for modernization and migration efforts. For PMs, it represents enterprise technical debt and transformation risk.
COBOL remains core to many enterprise systems despite being viewed as legacy technology.
AGI is referenced as the frontier toward which current AI development is moving. In PM terms, it frames long-term product strategy, governance, and risk discussions.
AGI is a strategic concept that shapes AI roadmaps, governance, and stakeholder expectations more than a single agreed technical milestone.
An agent design pattern where work is split into sub-tasks and assigned dynamically. In the newsletter, it is one of the core ingredients for building autonomous coding agents.
Task delegation breaks complex agent objectives into smaller sub-tasks assigned dynamically across tools or specialized components.
The practice of connecting agents to external developer tools such as linters and debuggers. It is highlighted here as a building block for effective coding agents.
Tool integration connects AI agents to external developer tools such as linters, debuggers, and test runners.
A programming language commonly used for building AI systems and agent workflows. The newsletter references it in the context of constructing multi-agent systems from scratch.
Python is the dominant implementation layer for modern AI experimentation, orchestration, and agent workflows.
The process of updating legacy COBOL systems, often for enterprise migration and maintenance. AI agents are increasingly positioned as tools to accelerate this high-friction modernization work.
COBOL modernization focuses on updating legacy mission-critical systems for maintainability, integration, and migration.
A test introduced by Andrew Ng for evaluating economic utility. It is framed as a way to assess whether AI systems provide meaningful real-world value.
The Turing-AGI Test evaluates AI progress based on economic utility rather than abstract intelligence claims.
A search tool mentioned as part of ingesting PM work into Claude Code. It appears to support retrieval over a large personal knowledge base.
QMD appears to be a search and retrieval layer used to access large personal or operational knowledge bases.
An architecture where multiple specialized agents collaborate instead of one general-purpose agent. The newsletter includes debate over whether this is necessary versus using a single tool-loaded agent.
A multi-agent system uses several specialized agents to collaborate instead of relying on one general-purpose agent.
Leading AI labs that control high-demand model APIs and compute. The newsletter uses the term to describe vendors that might restrict API access to prioritize their own products and customers.
Frontier AI labs control high-demand model APIs and may prioritize their own products or top customers when compute is scarce.