What is the main product development shift in a post-LLM world?

The post-LLM shift is moving from deterministic systems to probabilistic systems. The article recommends treating prompts like code, investing in evaluation harnesses early, and designing guardrails that make non-determinism feel trustworthy.

What criteria should product teams use when evaluating new AI models?

The article’s rubric includes ground-truthed task quality, p95 latency, unit economics, privacy and compliance, and operational reliability. It also recommends shadow evaluations, feature flags, canaries, and backstops before switching production dependencies.

Why does an AI product team need an ML engineer?

The author argues that the fastest teams pair a product manager who owns problem framing with an ML engineer who owns the feasibility frontier. Together they turn ambiguous jobs into measurable tasks, datasets, and iterative model and UX improvements.

What does Sprig’s AI Squad structure look like?

The article describes an AI Squad as a cross-functional pod with a PM, ML engineer, data engineer or analyst, design, and a platform partner. The squad ships thin slices end-to-end, owns its evaluation suite, and reviews errors, edge cases, and customer feedback weekly.

How can teams upskill everyone on AI without heavyweight training?

The author recommends lightweight rituals such as weekly demo hours, prompt libraries maintained in Jira, red-team exercises, and internal brown bags. Small, frequent exposure is presented as more effective than large training programs.

What guiding principles does the article give for post-LLM product success?

The article gives three principles: reduce surface area by starting with one high-value job, treat evaluation as a product, and design for collaboration between AI and human judgment. The emphasis is on resilient systems, teams, and learning loops rather than chasing models.

Building Products in a Post-LLM World: Hard-Won Lessons, Skeptic Busters, and Team Playbooks

How should teams overcome AI skepticism when building products?

The author recommends demonstrating value instead of arguing about AI. Start with one painful workflow, build a narrow high-precision solution, expose failure modes clearly, and keep a human-in-the-loop escape hatch.

The ground rules for product development have changed in the post-LLM world. I’m sharing a practical, first-person playbook—lessons I’ve pressure-tested in my own product org—to help you build AI-native products with confidence, cut through hype, and deliver outcomes that compound.

Sprig is an AI-powered user insights platform that has raised over $88M. Today’s discussion features two key individuals in Sprig’s journey so far: Ryan Glasgow, Sprig’s CEO and founder, and Kevin Mandich, Sprig’s Head of Machine Learning. Before Sprig, Ryan was an early PM at GraphScience, Vurb, and Weeby (all of which were acquired), and Kevin was an ML engineer at Incubit and a postdoctoral researcher at UC San Diego.

In today’s episode, we discuss: Key lessons from the Sprig founding story; Product development in the pre vs. post-LLM world; How to overcome AI skepticism; How to evaluate new models and how to know when to switch; Why you need an ML engineer; Sprig’s “AI Squad” team structure; How Sprig upskills all team members on AI.

Founding story takeaways I keep returning to: conviction compounds when paired with continuous discovery. Early on, prioritize direct customer signal over elegant architectures. I’ve seen the fastest learning loops come from a tight PM–ML partnership that prototypes quickly, validates with real users, and refactors only after signal stabilizes. The Jobs to Be Done framework (https://hbr.org/2016/09/know-your-customers-jobs-to-be-done) remains my favorite lens for separating what the model can do from what the customer actually needs done.

Pre- vs. post-LLM product development requires a mindset shift. Pre-LLM, we wrote deterministic systems and pushed the edge with models like Google’s BERT (https://en.wikipedia.org/wiki/BERT_(language_model)). Post-LLM, we design probabilistic systems, treat prompts like code, and invest in evaluation harnesses from day one. I routinely prototype with ChatGPT (https://chat.openai.com) and scaffold experiments with LangChain (https://www.langchain.com/) to compress discovery cycles. The key is shipping guardrails and UX affordances that make non-determinism feel trustworthy.
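To make "treat prompts like code" and "evaluation harness" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `PromptVersion` and `EvalCase` names, the sentiment task, and especially `stub_model`, which stands in for a real LLM call so the example runs offline. It is not Sprig's harness, only one plausible shape for it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class PromptVersion:
    """A prompt kept under version control, like any other source file."""
    name: str
    version: str
    template: str

    def render(self, **fields: str) -> str:
        return self.template.format(**fields)


@dataclass
class EvalCase:
    """One ground-truthed example: template inputs plus the expected label."""
    inputs: Dict[str, str]
    expected: str


def run_eval(prompt: PromptVersion,
             model: Callable[[str], str],
             cases: List[EvalCase]) -> float:
    """Return the fraction of cases where the model output matches the label."""
    if not cases:
        raise ValueError("need at least one eval case")
    passed = 0
    for case in cases:
        output = model(prompt.render(**case.inputs))
        if output.strip().lower() == case.expected.lower():
            passed += 1
    return passed / len(cases)


def stub_model(prompt_text: str) -> str:
    # Hypothetical stand-in for an LLM call: a crude keyword rule.
    return "negative" if "crash" in prompt_text.lower() else "positive"


SENTIMENT_V1 = PromptVersion(
    name="sentiment",
    version="1.0.0",
    template="Classify this feedback as positive or negative: {feedback}",
)

CASES = [
    EvalCase({"feedback": "The app crashes on login"}, "negative"),
    EvalCase({"feedback": "Love the new dashboard"}, "positive"),
    EvalCase({"feedback": "Export is too slow"}, "negative"),
]

accuracy = run_eval(SENTIMENT_V1, stub_model, CASES)
print(f"{SENTIMENT_V1.name}@{SENTIMENT_V1.version}: {accuracy:.2f}")
```

The stub misses the third case, so the harness reports 2 of 3 correct. That is the point of having the harness on day one: a prompt or model change shows up as a number moving, not as an anecdote.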

On AI skepticism, I don’t argue; I demonstrate. I target one painful workflow, build a narrow, high-precision solution, and expose transparent failure modes with a human-in-the-loop escape hatch. This reframes AI from magic to leverage. In customer-facing settings (think customer support AI strategy), we measure deflection and satisfaction together so automation never outpaces user psychology.
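A rough sketch of the escape hatch and the paired metric, in Python. The confidence score, the 0.9 threshold, the `Ticket` shape, and the 1-5 CSAT scale are all assumptions chosen for illustration; where a confidence value comes from depends on the system.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Ticket:
    automated: bool        # True if the AI answered without a human handoff
    csat: Optional[int]    # 1-5 survey score, None if the user did not respond


def route(confidence: float, threshold: float = 0.9) -> str:
    """Escape hatch: anything below the threshold goes to a person."""
    return "auto_reply" if confidence >= threshold else "human_review"


def deflection_and_csat(tickets: List[Ticket]) -> Tuple[float, Optional[float]]:
    """Report deflection rate together with CSAT on the automated tickets."""
    if not tickets:
        raise ValueError("no tickets to measure")
    automated = [t for t in tickets if t.automated]
    deflection = len(automated) / len(tickets)
    scores = [t.csat for t in automated if t.csat is not None]
    avg_csat = sum(scores) / len(scores) if scores else None
    return deflection, avg_csat


tickets = [
    Ticket(automated=True, csat=5),
    Ticket(automated=True, csat=2),
    Ticket(automated=False, csat=4),
    Ticket(automated=True, csat=None),
]
deflection, avg_csat = deflection_and_csat(tickets)
print(deflection, avg_csat)  # 0.75 3.5
```

Returning both numbers from one function is deliberate: a deflection rate of 0.75 looks good on its own, but an average satisfaction of 3.5 on those same tickets is the signal that automation may be running ahead of users.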

Evaluating new models, and knowing when to switch, demands a clear rubric: task quality (ground-truthed), latency at p95, unit economics, privacy/compliance, and operational reliability. I run shadow evaluations before swapping production dependencies, then phase changes behind flags with canaries and backstops. Tools like Auto-GPT (https://github.com/Significant-Gravitas/Auto-GPT) are useful for ideation, but I never skip rigorous offline and online evaluation before a cutover.
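The rubric can be turned into an explicit gate. The sketch below is one way to do that in Python, assuming a shadow run has already produced the numbers for both models. The `ShadowResult` fields, the thresholds, and the sample figures are hypothetical, and privacy/compliance is left out because it is usually a review step rather than a number.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ShadowResult:
    quality: float              # fraction correct on the ground-truthed set
    latencies_ms: List[float]   # per-request latency from the shadow run
    cost_per_1k_usd: float      # unit economics: dollars per 1,000 requests
    error_rate: float           # operational reliability: failed requests


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile."""
    if not values:
        raise ValueError("no latency samples")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def switch_decision(current: ShadowResult, candidate: ShadowResult,
                    max_p95_ms: float, max_cost_per_1k_usd: float,
                    max_error_rate: float) -> Tuple[bool, List[str]]:
    """Return (ok_to_switch, blockers). Any blocker vetoes the cutover."""
    blockers = []
    if candidate.quality <= current.quality:
        blockers.append("quality does not beat the current model")
    tail = p95(candidate.latencies_ms)
    if tail > max_p95_ms:
        blockers.append(f"p95 latency {tail:.0f} ms exceeds {max_p95_ms:.0f} ms")
    if candidate.cost_per_1k_usd > max_cost_per_1k_usd:
        blockers.append("cost per 1k requests is over budget")
    if candidate.error_rate > max_error_rate:
        blockers.append("error rate is above the reliability limit")
    return (not blockers, blockers)


current = ShadowResult(0.82, [600.0] * 20, 4.0, 0.01)
candidate = ShadowResult(0.88, [400.0] * 18 + [2500.0] * 2, 3.0, 0.005)

ok, blockers = switch_decision(current, candidate, max_p95_ms=1200.0,
                               max_cost_per_1k_usd=5.0, max_error_rate=0.02)
print(ok, blockers)
```

In this made-up run the candidate is more accurate, cheaper, and more reliable, yet the gate still says no because two slow requests push p95 to 2500 ms. An average latency would have hidden that, which is why the rubric names p95 specifically.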

Why you need an ML engineer: the fastest teams pair a product manager who owns the problem framing with an ML engineer who owns the feasibility frontier. This duo translates ambiguous jobs into measurable tasks, instrumented datasets, and iterative model/UX improvements. In my experience, this partnership reduces time-to-learning more than any single tooling decision.

Sprig’s “AI Squad” team structure mirrors what I’ve seen work: a cross-functional pod with a PM, ML engineer, data engineer/analyst, design, and platform partner. The squad ships thin slices end-to-end, owns its eval suite, and meets weekly to review errors, edge cases, and customer feedback. We track outcome OKRs rather than output OKRs so that velocity serves impact, not the other way around.

Upskilling the entire team on AI is non-negotiable. I’ve had success with lightweight rituals: weekly demo hours, prompt libraries maintained in Jira (https://www.atlassian.com/software/jira), red-team exercises to uncover failure patterns, and internal brown bags where engineers and PMs teach each other. Small, frequent exposure beats heavyweight training.

For deeper exploration and hands-on experimentation, I reference: Auto-GPT: https://github.com/Significant-Gravitas/Auto-GPT; ChatGPT: https://chat.openai.com; Google’s BERT model: https://en.wikipedia.org/wiki/BERT_(language_model); Jira: https://www.atlassian.com/software/jira; Jobs to Be Done Framework: https://hbr.org/2016/09/know-your-customers-jobs-to-be-done; LangChain: https://www.langchain.com/; Sprig: https://sprig.com/.

Timestamps:
(02:50) Intro
(04:57) What attracted Kevin to Sprig
(05:53) Kevin’s background before Sprig
(07:56) How Ryan gained conviction about Kevin
(09:55) Key technical challenges and how they solved them
(18:46) How to overcome AI skepticism
(21:47) The early difficulties of building an ML-enabled product
(25:06) Evaluating new models and knowing when to switch
(35:09) Using Chat GPT
(37:23) Product development in the pre vs. post-LLM world
(39:53) The impact of AI hype on Sprig’s product development
(45:36) Balancing AI automation with user-psychology
(48:47) Do recent LLMs reduce Sprig’s competitive advantage?
(51:00) The importance of “selling the vision” to customers
(54:40) How Sprig structures teams
(57:25) How Sprig upskills all team members on AI
(60:25) 3 key tips for companies trying to navigate AI
(66:05) Major limitations with LLMs right now
(70:27) The future of AI and the future of Sprig

Three guiding principles I use daily: first, reduce surface area—start with one high-value job and earn trust with reliability. Second, treat evaluation as a product—version prompts, log failures, and continuously retrain on your own data distributions. Third, design for collaboration—pair AI with human judgment and transparent controls so users feel empowered, not replaced. Post-LLM success isn’t about chasing models; it’s about building resilient systems, teams, and learning loops.
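For the second principle, one lightweight way to "version prompts and log failures" is to derive an ID from the prompt text and attach it to every logged failure. The Python sketch below assumes JSON-lines logging and a 12-character SHA-256 prefix as the ID; both choices, and the function names, are illustrative rather than anything prescribed here.

```python
import hashlib
import json


def prompt_id(template: str) -> str:
    """Content-derived ID: any edit to the prompt text yields a new ID."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]


def failure_record(template: str, inputs: dict, output: str, expected: str) -> str:
    """Serialize one failure as a JSON line tagged with the prompt version."""
    return json.dumps(
        {
            "prompt_id": prompt_id(template),
            "inputs": inputs,
            "output": output,
            "expected": expected,
        },
        sort_keys=True,
    )


TEMPLATE = "Classify this feedback as positive or negative: {feedback}"

line = failure_record(
    TEMPLATE,
    inputs={"feedback": "Export is too slow"},
    output="positive",
    expected="negative",
)
print(line)
```

Because the ID changes whenever the template changes, failures can be grouped by prompt version, and the accumulated records double as candidate examples for the evaluation set or for later training on your own data.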
