Doug Turnbull
Search and retrieval expert cited for his introduction to pseudo-relevance feedback, in which he explains how early retrieval results can be used to refine queries.
Key Highlights
- Doug Turnbull is a key voice on practical search and retrieval design for modern AI products.
- He connects classical IR concepts like BM25 and pseudo-relevance feedback to current RAG and hybrid search systems.
- His work helps AI PMs choose retrieval infrastructure with clearer trade-off thinking.
- He emphasizes rigorous evaluation, including careful LLM pairwise relevance assessment.
- His writing is useful for PMs balancing model performance, retrieval quality, and engineering complexity.
Overview
Doug Turnbull is a search and retrieval expert whose writing frequently surfaces practical ideas for improving information retrieval systems, search relevance, and retrieval architecture. In the newsletter, he is cited across topics such as BM25, hybrid search, pseudo-relevance feedback, retrieval engine selection, RAG implementation trade-offs, and evaluation methods for search quality. His work matters because it translates core IR concepts into operational guidance teams can use when building modern AI products.
For AI Product Managers, Turnbull is especially relevant at the boundary between classical search and LLM-based systems. He consistently emphasizes that retrieval quality, evaluation rigor, and system design choices matter as much as model choice. His posts are useful for PMs who need to make practical decisions about search stacks, improve RAG quality, and understand how older retrieval techniques can still unlock performance gains in newer AI applications.
Key Developments
- 2026-02-03 — Cited for "Check twice, cut once with LLM search relevance eval," highlighting the importance of checking both directions in LLM pairwise evaluation when assessing search relevance.
- 2026-03-07 — Discussed "Can BM25 be a probability?" exploring BM25 scores through odds, probabilities, and a Bayesian framing, with implications for calibrating hybrid search systems.
- 2026-03-11 — Mentioned for "The tests are the code now," arguing that AI-assisted coding makes tests the critical artifact for preserving software quality.
- 2026-03-21 — Published guidance on "How to actually choose a retrieval engine," comparing trade-offs across search engines and vector databases including Elasticsearch, OpenSearch, Solr, Vespa, Pinecone, Turbopuffer, and Weaviate.
- 2026-03-24 — Wrote "Why tiny late interaction models win," pointing to the rise of late interaction approaches and referencing a LightOn demo with Antoine Chaffin using a 150M model.
- 2026-04-07 — Argued in "Is grep all you need for RAG?" that a RAG-like system can be built with enough engineering effort using simple tools, while stressing the implementation difficulty.
- 2026-04-14 — Introduced "What is pseudo-relevance feedback?" explaining how top results from an initial BM25 or ranked retrieval can be used as implicit feedback to refine the query and improve downstream retrieval.
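The pseudo-relevance feedback idea in the last entry above can be sketched in a few lines. This is a minimal illustration, not code from Turnbull's post: the function names (`bm25_rank`, `expand_query`), the whitespace tokenizer, the toy corpus, and the choice to expand the query with the most frequent unseen terms are all assumptions made for the example. The BM25 formula is the standard Okapi form with the commonly used defaults `k1=1.2` and `b=0.75`.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; real systems use analyzers).
    return text.lower().split()


def bm25_rank(query_terms, docs, k1=1.2, b=0.75):
    """Return document indices sorted by BM25 score, best first."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Number of documents containing each term.
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


def expand_query(query, docs, top_k=2, n_new_terms=2):
    """Pseudo-relevance feedback: assume the top_k first-pass hits are
    relevant and add their most frequent unseen terms to the query."""
    query_terms = tokenize(query)
    first_pass = bm25_rank(query_terms, docs)[:top_k]
    counts = Counter()
    for i in first_pass:
        counts.update(t for t in tokenize(docs[i]) if t not in query_terms)
    new_terms = [term for term, _ in counts.most_common(n_new_terms)]
    return query_terms + new_terms


# Hypothetical toy corpus, for illustration only.
corpus = [
    "python snake reptile venom",
    "python snake habitat reptile",
    "java programming language",
    "coffee beans roast",
    "boa snake reptile constrictor",
]
expanded = expand_query("python", corpus)
print(expanded)                     # expanded query terms
print(bm25_rank(expanded, corpus))  # second-pass ranking
```

In this toy corpus the last document never mentions "python", so the first pass scores it at zero; the expanded query can still reach it through terms borrowed from the top hits. A production version would also filter stopwords and weight the added terms, which this sketch omits.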
Relevance to AI PMs
- Improve RAG quality with retrieval-first thinking. Turnbull’s work helps PMs focus on retrieval mechanics like BM25, pseudo-relevance feedback, and hybrid ranking instead of assuming LLM improvements alone will fix answer quality.
- Make better infrastructure decisions. His retrieval engine comparisons give PMs a practical framework for choosing between classic search engines and vector databases based on product needs, team capability, and relevance requirements.
- Build stronger evaluation loops. His discussions of LLM relevance evaluation and testing encourage PMs to invest in robust eval design, pairwise comparisons, and test-driven quality processes for search and AI features.
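To make the pairwise evaluation point concrete, here is a minimal sketch of a both-directions check. It assumes that "checking both directions" means asking the judge the same comparison with the two results in swapped presentation order and trusting the verdict only when both orderings agree; the source does not spell out the procedure. The `judge` callable stands in for an LLM call, and `overlap_judge` and `always_first` are hypothetical stand-ins so the example runs without any model.

```python
from typing import Callable, Optional

# judge(query, first, second) returns "first" or "second".
Judge = Callable[[str, str, str], str]


def pairwise_preference(judge: Judge, query: str, doc_a: str, doc_b: str) -> Optional[str]:
    """Ask the judge in both presentation orders. Return "a" or "b" only
    when both orderings agree; return None when they disagree."""
    forward = judge(query, doc_a, doc_b)   # doc_a shown first
    backward = judge(query, doc_b, doc_a)  # doc_b shown first
    forward_winner = "a" if forward == "first" else "b"
    backward_winner = "b" if backward == "first" else "a"
    return forward_winner if forward_winner == backward_winner else None


def overlap_judge(query: str, first: str, second: str) -> str:
    # Hypothetical stand-in for an LLM: prefers the text sharing more query words.
    q = set(query.lower().split())
    s1 = len(q & set(first.lower().split()))
    s2 = len(q & set(second.lower().split()))
    return "first" if s1 >= s2 else "second"


def always_first(query: str, first: str, second: str) -> str:
    # Hypothetical position-biased judge: always picks whatever is shown first.
    return "first"


print(pairwise_preference(overlap_judge, "red shoes", "red running shoes", "blue hat"))
print(pairwise_preference(always_first, "red shoes", "red running shoes", "blue hat"))
```

A judge that always favors the first position disagrees with itself once the order is swapped, so its verdict is discarded rather than counted as a preference.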
Related
- Pseudo-relevance feedback — One of the clearest concepts associated with Turnbull in the newsletter; it connects classic IR methods to modern retrieval optimization.
- BM25 — A recurring topic in his work, especially around score interpretation and calibration in retrieval systems.
- Hybrid search — His BM25 and retrieval engine discussions are directly relevant to combining lexical and vector signals.
- RAG and grep — His commentary on whether simple tools can support RAG highlights implementation trade-offs and engineering complexity.
- Elasticsearch, OpenSearch, Solr, Vespa, Pinecone, Turbopuffer, Weaviate — Retrieval platforms he is mentioned alongside when discussing search engine selection.
- LightOn and Antoine Chaffin — Connected through his discussion of tiny late interaction models and retrieval model design.
- LLM search relevance eval and LLM pairwise evaluation — Related to his emphasis on careful evaluation methodology.
- Tests-as-the-code — Extends his relevance beyond retrieval into how AI changes software quality practices.
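The hybrid search entry above refers to combining lexical and vector signals. One widely used way to do that is reciprocal rank fusion (RRF), sketched below. RRF is not named in the newsletter excerpts and is shown here only as an illustration; the document IDs and the two input rankings are hypothetical, and `k=60` is the conventional default constant.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by total score, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical rankings from a lexical (BM25) retriever and a vector retriever.
lexical = ["d1", "d2", "d3"]
vector = ["d3", "d1", "d4"]
print(reciprocal_rank_fusion([lexical, vector]))
```

Because RRF uses only rank positions, it sidesteps the problem that BM25 scores and vector similarities live on different scales, which is the calibration concern raised in the BM25 entry.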
Newsletter Mentions (7)
#7 📝 Doug Turnbull What is pseudo-relevance feedback? - Introduces pseudo-relevance feedback: after an initial BM25 or ranked retrieval, the returned results provide implicit information that can be used to refine queries or improve subsequent retrieval. The post outlines how to leverage those initial results as a source of feedback to boost relevance.
#11 📝 Doug Turnbull Is grep all you need for RAG? - Doug argues that with enough engineering effort you can build a RAG-style search system using only grep, but cautions that this approach is difficult and not for the faint of heart.
#12 📝 Doug Turnbull Why tiny late interaction models win - Discusses the recent prominence of late interaction models, highlighting a LightOn demonstration (with developer Antoine Chaffin) using a 150M model and the implications for retrieval and interaction approaches.
#5 📝 Doug Turnbull How to actually choose a retrieval engine - Explains how teams should choose a retrieval engine by comparing vector databases and search engines, and considering trade-offs between options like Elasticsearch, OpenSearch, Solr, Vespa, Pinecone, Turbopuffer, and Weaviate. Emphasizes practical selection criteria beyond hype.
“#15 📝 Doug Turnbull The tests are the code now - Argues that with AI-assisted coding, tests become the most important artifact for maintaining code quality.”
Doug Turnbull is cited for a piece arguing that AI-assisted coding elevates the importance of tests. The newsletter uses him to support a broader software quality point.
GenAI PM Daily, March 07, 2026: #20 📝 Doug Turnbull Can BM25 be a probability? - Explores the relationship between BM25 scores framed as odds versus probabilities and introduces a Bayesian view of BM25. Discusses implications for calibrating hybrid search systems when combining lexical and probabilistic signals.
GenAI PM Daily, February 03, 2026: 📝 Doug Turnbull Check twice, cut once with LLM search relevance eval - Highlights the importance of checking both directions in LLM pairwise evaluation of search relevance.
Related
RAG: A pattern for answering questions by retrieving relevant context and generating responses from it. The newsletter highlights multimodal RAG for searching across audio, image, and video data.
BM25: A lexical retrieval ranking function used here to select relevant tool definitions. In PM tooling, it helps improve retrieval accuracy and reduce context-window bloat.
Elasticsearch: Referenced in the context of hybrid search and kNN query behavior in practice.