Building Products in a Post-LLM World: Hard-Won Lessons, Skeptic Busters, and Team Playbooks


The ground rules for product development have changed in the post-LLM world. I’m sharing a practical, first-person playbook—lessons I’ve pressure-tested in my own product org—to help you build AI-native products with confidence, cut through hype, and deliver outcomes that compound.

Sprig is an AI-powered user insights platform that has raised over $88M. Today’s discussion features two key individuals in Sprig’s journey so far: Ryan Glasgow, Sprig’s CEO and founder, and Kevin Mandich, Sprig’s Head of Machine Learning. Before Sprig, Ryan was an early PM at GraphScience, Vurb, and Weeby (all of which were acquired). Kevin was an ML Engineer at Incubit and a Post-Doctoral Researcher at UC San Diego.

In today’s episode, we discuss: Key lessons from the Sprig founding story; Product development in the pre vs. post-LLM world; How to overcome AI skepticism; How to evaluate new models and how to know when to switch; Why you need an ML engineer; Sprig’s “AI Squad” team structure; How Sprig upskills all team members on AI.

Founding story takeaways I keep returning to: conviction compounds when paired with continuous discovery. Early on, prioritize direct customer signal over elegant architectures. I’ve seen the fastest learning loops come from a tight PM–ML partnership that prototypes quickly, validates with real users, and refactors only after signal stabilizes. The Jobs to Be Done framework (https://hbr.org/2016/09/know-your-customers-jobs-to-be-done) remains my favorite lens for separating what the model can do from what the customer actually needs done.

Pre- vs. post-LLM product development requires a mindset shift. Pre-LLM, we wrote deterministic systems and pushed the edge with models like Google’s BERT (https://en.wikipedia.org/wiki/BERT_(language_model)). Post-LLM, we design probabilistic systems, treat prompts like code, and invest in evaluation harnesses from day one. I routinely prototype with ChatGPT (https://chat.openai.com) and scaffold experiments with LangChain (https://www.langchain.com/) to compress discovery cycles. The key is shipping guardrails and UX affordances that make non-determinism feel trustworthy.
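To make "evaluation harness from day one" concrete, here is a minimal sketch of the idea, not Sprig’s actual tooling. Everything in it is an assumption for illustration: the `EvalCase` type, the `PROMPT_V1` template, and especially `stub_model`, a keyword rule standing in for a real LLM call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalCase:
    text: str      # input shown to the model
    expected: str  # ground-truth label


# Hypothetical prompt template, kept in version control like any other source file.
PROMPT_V1 = "Classify the sentiment of this feedback as positive or negative:\n{text}"


def stub_model(prompt: str) -> str:
    """Stand-in for a real LLM call; a keyword rule keeps the sketch runnable offline."""
    feedback = prompt.split("\n", 1)[1].lower()
    return "negative" if ("slow" in feedback or "confusing" in feedback) else "positive"


def run_eval(model: Callable[[str], str], template: str, cases: List[EvalCase]) -> float:
    """Return the fraction of cases where the model output matches the ground truth."""
    if not cases:
        raise ValueError("need at least one eval case")
    hits = sum(
        model(template.format(text=case.text)).strip().lower() == case.expected
        for case in cases
    )
    return hits / len(cases)


CASES = [
    EvalCase("The new dashboard is great", "positive"),
    EvalCase("Checkout is slow and confusing", "negative"),
    EvalCase("Not great, honestly", "negative"),  # the keyword stub misses this one
]

accuracy = run_eval(stub_model, PROMPT_V1, CASES)
print(f"accuracy={accuracy:.2f}")
```

Swapping `stub_model` for a real client call leaves the harness unchanged, which is the point: the same cases can score every prompt or model revision.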

On AI skepticism, I don’t argue; I demonstrate. I target one painful workflow, build a narrow, high-precision solution, and expose transparent failure modes with a human-in-the-loop escape hatch. This reframes AI from magic to leverage. In customer-facing settings such as customer support, we measure deflection (the share of requests resolved without a human) and satisfaction together, so automation never outpaces user psychology.
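One way to picture the human-in-the-loop escape hatch, and why deflection and satisfaction are read together, is the sketch below. It is illustrative only: the `Ticket` fields, the 0.85 threshold, and the assumption that a usable confidence score exists at all are hypothetical choices, not something the episode specifies.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff; tune it against labeled data


@dataclass
class Ticket:
    question: str
    ai_confidence: float        # assumed to come from the model or a calibrated scorer
    csat: Optional[int] = None  # 1-5 satisfaction score, if the user left one


def route(ticket: Ticket) -> str:
    """Escape hatch: low-confidence answers go to a human instead of the user."""
    return "ai" if ticket.ai_confidence >= CONFIDENCE_THRESHOLD else "human"


def summarize(tickets: List[Ticket]) -> Dict[str, Optional[float]]:
    """Report deflection and AI-handled satisfaction side by side."""
    if not tickets:
        raise ValueError("need at least one ticket")
    routes = [route(t) for t in tickets]
    ai_scores = [
        t.csat for t, r in zip(tickets, routes) if r == "ai" and t.csat is not None
    ]
    return {
        "deflection_rate": routes.count("ai") / len(tickets),
        "ai_csat": sum(ai_scores) / len(ai_scores) if ai_scores else None,
    }


tickets = [
    Ticket("How do I export results?", 0.95, csat=5),
    Ticket("Where is my invoice?", 0.90, csat=3),
    Ticket("Why was I charged twice?", 0.60, csat=4),
    Ticket("Can you delete my account data?", 0.40),
]

summary = summarize(tickets)
print(summary)
```

Lowering the threshold raises deflection, so watching `ai_csat` alongside it shows whether the extra automation is costing user trust.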

Evaluating new models, and knowing when to switch, demands a clear rubric: task quality (ground-truthed), latency at p95, unit economics, privacy/compliance, and operational reliability. I run shadow evaluations before swapping production dependencies, then phase changes behind flags with canaries and backstops. Tools like Auto-GPT (https://github.com/Significant-Gravitas/Auto-GPT) are useful for ideation, but I never skip rigorous offline and online evaluation before a cutover.
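The quantitative part of that rubric can be sketched as a simple gate over shadow-evaluation results. The thresholds, field names, and sample numbers below are hypothetical, and the sketch covers only quality, p95 latency, and cost; privacy/compliance and operational reliability still need their own review.

```python
from dataclasses import dataclass
from typing import List


def p95(samples: List[float]) -> float:
    """Nearest-rank 95th percentile; integer math avoids float rounding in the rank."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


@dataclass
class ShadowResult:
    name: str
    accuracy: float              # task quality on the ground-truthed eval set
    latencies_ms: List[float]    # per-request latencies recorded in shadow mode
    cost_per_1k_requests: float  # unit economics, in dollars


def should_switch(
    current: ShadowResult,
    candidate: ShadowResult,
    min_quality_gain: float = 0.02,  # assumed thresholds; set them per product
    max_p95_ms: float = 2000.0,
    max_cost_ratio: float = 1.25,
) -> bool:
    """The candidate must clear every gate, not just win on quality."""
    quality_ok = candidate.accuracy - current.accuracy >= min_quality_gain
    latency_ok = p95(candidate.latencies_ms) <= max_p95_ms
    cost_ok = candidate.cost_per_1k_requests <= max_cost_ratio * current.cost_per_1k_requests
    return quality_ok and latency_ok and cost_ok


current = ShadowResult("current", 0.86, [400.0] * 19 + [900.0], 4.0)
candidate = ShadowResult("candidate", 0.90, [600.0] * 19 + [2500.0], 4.5)
slow_tail = ShadowResult("slow-tail", 0.90, [600.0] * 18 + [2500.0, 2600.0], 4.5)

print(should_switch(current, candidate))  # one slow outlier stays under the p95 gate
print(should_switch(current, slow_tail))  # two slow requests in twenty push p95 over it
```

A passing gate is only the precondition for the flagged, canaried rollout described above, not a replacement for it.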

Why you need an ML engineer: the fastest teams pair a product manager who owns the problem framing with an ML engineer who owns the feasibility frontier. This duo translates ambiguous jobs into measurable tasks, instrumented datasets, and iterative model/UX improvements. In my experience, this partnership reduces time-to-learning more than any single tooling decision.

Sprig’s “AI Squad” team structure mirrors what I’ve seen work: a cross-functional pod with a PM, ML engineer, data engineer/analyst, designer, and platform partner. The squad ships thin slices end-to-end, owns its eval suite, and meets weekly to review errors, edge cases, and customer feedback. We write OKRs around outcomes rather than output, so that velocity serves impact and not the other way around.

Upskilling the entire team on AI is non-negotiable. I’ve had success with lightweight rituals: weekly demo hours, prompt libraries maintained in Jira (https://www.atlassian.com/software/jira), red-team exercises to uncover failure patterns, and internal brown bags where engineers and PMs teach each other. Small, frequent exposure beats heavyweight training.

For deeper exploration and hands-on experimentation, I reference:
Auto-GPT: https://github.com/Significant-Gravitas/Auto-GPT
ChatGPT: https://chat.openai.com
Google’s BERT model: https://en.wikipedia.org/wiki/BERT_(language_model)
Jira: https://www.atlassian.com/software/jira
Jobs to Be Done Framework: https://hbr.org/2016/09/know-your-customers-jobs-to-be-done
LangChain: https://www.langchain.com/
Sprig: https://sprig.com/

Timestamps:
(02:50) Intro
(04:57) What attracted Kevin to Sprig
(05:53) Kevin’s background before Sprig
(07:56) How Ryan gained conviction about Kevin
(09:55) Key technical challenges and how they solved them
(18:46) How to overcome AI skepticism
(21:47) The early difficulties of building an ML-enabled product
(25:06) Evaluating new models and knowing when to switch
(35:09) Using ChatGPT
(37:23) Product development in the pre vs. post-LLM world
(39:53) The impact of AI hype on Sprig’s product development
(45:36) Balancing AI automation with user psychology
(48:47) Do recent LLMs reduce Sprig’s competitive advantage?
(51:00) The importance of “selling the vision” to customers
(54:40) How Sprig structures teams
(57:25) How Sprig upskills all team members on AI
(60:25) 3 key tips for companies trying to navigate AI
(66:05) Major limitations with LLMs right now
(70:27) The future of AI and the future of Sprig

Three guiding principles I use daily: first, reduce surface area—start with one high-value job and earn trust with reliability. Second, treat evaluation as a product—version prompts, log failures, and continuously retrain on your own data distributions. Third, design for collaboration—pair AI with human judgment and transparent controls so users feel empowered, not replaced. Post-LLM success isn’t about chasing models; it’s about building resilient systems, teams, and learning loops.
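As a rough illustration of the second principle (version prompts, log failures), the sketch below derives a version ID from the prompt text and attaches it to every logged failure. The hashing scheme, the `FailureLog` class, and the record fields are all assumptions made for this example rather than a described Sprig practice.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Dict, List


def prompt_version(template: str) -> str:
    """Derive a short, stable version ID from the prompt text itself."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]


@dataclass
class FailureLog:
    records: List[Dict[str, str]] = field(default_factory=list)

    def log(self, template: str, user_input: str, output: str, expected: str) -> None:
        """Store a failure together with the exact prompt version that produced it."""
        self.records.append(
            {
                "prompt_version": prompt_version(template),
                "input": user_input,
                "output": output,
                "expected": expected,
            }
        )

    def to_jsonl(self) -> str:
        """Serialize as JSON Lines so failures can feed later evals or retraining."""
        return "\n".join(json.dumps(r, sort_keys=True) for r in self.records)


# Hypothetical prompt revisions; any edit to the text yields a new version ID.
TEMPLATE_A = "Summarize this survey response in one sentence:\n{text}"
TEMPLATE_B = TEMPLATE_A + "\nUse neutral language."

failures = FailureLog()
failures.log(TEMPLATE_A, "App crashes on login", "User likes the app", "User reports a login crash")

version_a = prompt_version(TEMPLATE_A)
version_b = prompt_version(TEMPLATE_B)
print(version_a != version_b)
print(failures.to_jsonl())
```

Because the ID is computed from the text, a failure can always be traced to the prompt wording that caused it, even if nobody remembered to bump a version number.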



What is the approach to overcoming AI skepticism?

I don’t argue—I demonstrate. I target one painful workflow, build a narrow, high-precision solution, and expose transparent failure modes with a human-in-the-loop escape hatch.

What rubric is used to evaluate models and decide when to switch?

Task quality (ground-truthed), latency at p95, unit economics, privacy/compliance, and operational reliability. I run shadow evaluations before swapping production dependencies and phase changes behind flags with canaries and backstops.

Why is an ML engineer important?

The fastest teams pair a product manager who owns the problem framing with an ML engineer who owns the feasibility frontier. This duo translates ambiguous jobs into measurable tasks, instrumented datasets, and iterative model/UX improvements.

What is Sprig’s AI Squad?

Sprig’s AI Squad is a cross-functional pod with a PM, ML engineer, data engineer/analyst, design, and platform partner. The squad ships thin slices end-to-end, owns their eval suite, and meets weekly to review errors, edge cases, and customer feedback.

How should teams upskill on AI?

Upskilling the entire team on AI is non-negotiable. I’ve had success with lightweight rituals: weekly demo hours, prompt libraries in Jira, red-team exercises to uncover failure patterns, and internal brown bags where engineers and PMs teach each other.

What are the guiding principles for post-LLM product development?

Three guiding principles I use daily are: reduce surface area—start with one high-value job and earn trust through reliability; treat evaluation as a product—version prompts, log failures, and retrain on your data distributions; and design for collaboration—pair AI with human judgment and transparent controls so users feel empowered. Post-LLM success isn’t about chasing models; it’s about building resilient systems, teams, and learning loops.
