Richard Chen
Instructor credited with teaching the SGLang short course. Relevant as a practitioner translating applied inference techniques into learning material.
Key Highlights
- Richard Chen is credited as the instructor of Andrew Ng’s short course on efficient inference with SGLang.
- His relevance centers on teaching practical methods for reducing redundant LLM compute and serving costs.
- The course links him to SGLang, LMSys, and RadixArk in the context of text and image generation.
- For AI PMs, his work is most useful as a signal of production-focused education around inference optimization.
Overview
Richard Chen is the instructor credited with teaching the short course “Efficient Inference with SGLang: Text and Image Generation,” a learning program unveiled by Andrew Ng and co-built with LMSys and RadixArk. In the available coverage, Chen appears as a practitioner-educator focused on translating applied inference techniques into accessible training material, particularly around SGLang’s open-source caching framework.

For AI Product Managers, Richard Chen matters less as a broadly profiled public figure and more as a signal of where practical GenAI education is heading: toward cost-efficient inference, shared-prompt optimization, and deployable system design for text and image generation. His role in teaching this course connects him to an important operational theme for AI products: reducing redundant LLM compute while improving production efficiency.
Key Developments
- 2026-04-10: Richard Chen was cited as the instructor for Andrew Ng’s short course “Efficient Inference with SGLang: Text and Image Generation.” The course was described as co-built with LMSys and RadixArk.
- 2026-04-10: The course description highlighted SGLang’s open-source caching framework, emphasizing techniques to reduce redundant LLM costs by processing shared prompt components more efficiently.
- 2026-04-10: Richard Chen’s teaching role positioned him as a practitioner helping package advanced inference and serving concepts into learning material relevant to builders of multimodal AI systems.
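The "processing shared prompt components more efficiently" idea in the course description refers to prefix caching: when many requests share a common prompt prefix (such as a system prompt), that prefix only needs to be computed once. The toy sketch below illustrates the concept only; it is not SGLang's actual API or data structures (SGLang reuses KV-cache state via a radix tree), and all names here are hypothetical.

```python
# Toy illustration of prefix caching, the idea behind shared-prompt reuse.
# NOT SGLang's implementation -- a minimal model of why shared prompt
# components cut redundant compute.

class PrefixCache:
    """Caches state for token prefixes so shared prefixes are computed once."""

    def __init__(self):
        self._cache = {}        # token-prefix tuple -> cached state
        self.compute_calls = 0  # counts simulated per-token compute

    def _compute_token(self, state, token):
        self.compute_calls += 1          # stand-in for one attention step
        return state + (token,)

    def prefill(self, tokens):
        """Return state for `tokens`, reusing the longest cached prefix."""
        # Find the longest already-processed prefix.
        best = 0
        for i in range(len(tokens), 0, -1):
            if tuple(tokens[:i]) in self._cache:
                best = i
                break
        state = self._cache.get(tuple(tokens[:best]), ())
        # Only the uncached suffix costs compute.
        for tok in tokens[best:]:
            state = self._compute_token(state, tok)
            self._cache[state] = state   # state is the prefix tuple itself
        return state


cache = PrefixCache()
system = ["<sys>", "You", "are", "helpful"]
cache.prefill(system + ["Q1"])  # 5 compute calls: nothing cached yet
cache.prefill(system + ["Q2"])  # 1 more call: the shared prefix is reused
```

The second request pays only for its unique suffix, which is the cost model behind the "slash redundant LLM costs" framing in the course description.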
Relevance to AI PMs
- Operational cost reduction: Chen is associated with instruction on SGLang-based inference optimization, which is directly relevant to PMs managing token spend, latency, and serving efficiency in LLM products.
- Practical enablement for teams: His role suggests a bridge between infrastructure techniques and usable education, helpful for PMs who need to align engineering, platform, and product teams around performance improvements.
- Multimodal product strategy: Because the course covers text and image generation, his work is relevant to PMs evaluating shared infrastructure patterns across multiple GenAI modalities rather than treating each stack in isolation.
Related
- Andrew Ng: Announced the short course taught by Richard Chen, providing the primary context in which Chen was mentioned.
- SGLang: The inference framework at the center of the course; Chen is connected through teaching its efficiency and caching concepts.
- LMSys: Co-builder of the course, linking Chen to the broader ecosystem of LLM systems research and serving infrastructure.
- RadixArk: Also co-built the course, connecting Chen to applied infrastructure and commercialization around efficient inference.
Newsletter Mentions (3)
“Andrew Ng unveiled a new short course, “Efficient Inference with SGLang: Text and Image Generation,” co-built with LMSys and RadixArk and taught by Richard Chen, teaching how to use SGLang’s open-source caching framework to slash redundant LLM costs by processing shared promp...”
Related
- Andrew Ng: Credited with the Turing-AGI Test in DeepLearning.AI’s New Year issue; remains a major figure in AI education and practical product thinking.
- LMSys: A research organization associated with language model systems and benchmarking, appearing here as a co-builder of the applied short course.
- RadixArk: A company or organization co-building the applied AI course with Andrew Ng and LMSys, relevant as an ecosystem partner in AI education and tooling.
- SGLang: An open-source caching framework used to reduce redundant LLM inference costs; for PMs, it is relevant to efficiency, latency, and scaling AI features.