Tag: product-market fit lessons

Just Now Possible Preview: How Real Teams Ship AI—Workflows, RAG, Agents, Evaluation

I’m excited to share a preview of Just Now Possible, a show where I sit down with the builders who are shipping meaningful AI features in the real world. My goal is simple: pull back the curtain on how AI products actually get made—messy problems, rapid prototyping, and the leadership decisions that move teams from concept to customer value.

Watch the preview on YouTube: https://www.youtube.com/embed/Kb2HbuPbfR8?feature=oembed. Prefer audio? Listen on Spotify: https://open.spotify.com/episode/5xM0pDnqR0JpKmW6aZ0pj6?ref=producttalk.org or Apple Podcasts: https://podcasts.apple.com/us/podcast/podcast-preview/id1838832993?i=1000725807029&ref=producttalk.org.

How AI products come to life—straight from the builders themselves. In each episode, we dive deep into how teams spotted a customer problem, experimented with AI, prototyped solutions, and shipped real features. We dig into everything from workflows and agents to RAG and evaluation strategies, and explore how their products keep evolving. If you’re building with AI, these are the stories for you.

From my own experience leading product teams, I’ve seen that the real unlocks come from disciplined product discovery, OKRs that target outcomes rather than output, and smart use of gen AI for product prototyping. We’ll talk about the tradeoffs between speed and safety, when to bring in forward deployed engineers, and how to validate product-market fit before scaling. Along the way, we’ll unpack practical patterns, like when to use RAG vs fine-tuning, how to evaluate agents in production workflows, and what great product management leadership looks like in AI-first environments.

The first full episode drops on Thursday, September 18th. Don't miss it!

Full transcripts are available to paid subscribers.

Inspired by this post on Product Talk.

October 20, 2025
Building AI Products That Work: My Playbook for LLM Strategy, Evals, and Orchestration

AI features don’t succeed on clever prompts alone; they demand thoughtful product strategy, rigorous evaluation, and tight cross-functional collaboration. As a VP of Product Management and someone deeply immersed in building with large language model (LLM) technology, I’m constantly refining how we turn generative capabilities into real customer value. This episode of All Things Product zeroes in on that challenge, and it captures many of the principles I rely on when shipping AI to production.

The central question resonates with every product leader I know: How do product teams learn to build AI-powered products “beyond just dabbling with ChatGPT”? I appreciate how the conversation moves past novelty and into the disciplines that make AI reliable, safe, and outcome-oriented.

One metaphor that always lands for me: building AI features is less like writing a single “killer prompt” and more like orchestrating a team of “interns.” You define roles, break down work, set guardrails, and continuously review outputs. That orchestration mindset, coupled with strong observability, evals, and ongoing maintenance practices, is what separates flashy demos from repeatable product value.

Here’s how I frame the work. First, there’s a difference between an AI-powered product manager and an AI product manager. Many of us are becoming AI-powered—using tools to accelerate discovery, ideation, or execution. But when you own AI features end-to-end, you inherit new responsibilities: modeling risks, defining evaluation strategies for non-deterministic systems, and treating prompts and data pipelines as core product surfaces.

Prompt engineering for a product is fundamentally different from prompting ChatGPT for personal use. In production, I rely on prompt decomposition and orchestration—explicitly breaking a task into steps, assigning each step to the right capability, and enforcing consistent formats. This reduces variance, improves debuggability, and enables targeted evals that catch regressions before customers do.
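To make the decomposition idea concrete, here is a minimal sketch rather than a production pattern. It assumes a hypothetical `fake_model` function standing in for whatever LLM client a team uses (stubbed so the example runs offline), and the ticket-triage task, prompts, and key names are illustrative only. Each step does one narrow job and its output format is enforced before the next step runs.

```python
import json
from typing import Callable


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client (assumption, not a real API)."""
    if prompt.startswith("EXTRACT"):
        return json.dumps({"product": "router", "issue": "drops wifi"})
    if prompt.startswith("CLASSIFY"):
        return json.dumps({"category": "connectivity"})
    return "not json"


def run_step(call_model: Callable[[str], str], prompt: str, required_keys: set) -> dict:
    """Run one narrowly scoped step and enforce its output format."""
    raw = call_model(prompt)
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"step returned non-JSON output: {raw!r}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("step output must be a JSON object")
    missing = required_keys - set(parsed)
    if missing:
        raise ValueError(f"step output missing keys: {sorted(missing)}")
    return parsed


def triage_ticket(call_model: Callable[[str], str], ticket_text: str) -> dict:
    """Two decomposed steps: extract facts first, then classify them."""
    extracted = run_step(
        call_model,
        f"EXTRACT product and issue as JSON:\n{ticket_text}",
        {"product", "issue"},
    )
    classified = run_step(
        call_model,
        f"CLASSIFY this issue as JSON:\n{json.dumps(extracted)}",
        {"category"},
    )
    return {**extracted, **classified}


result = triage_ticket(fake_model, "My router keeps dropping wifi.")
print(result)
```

Because each step fails loudly on a malformed output, a regression shows up at the step that caused it instead of surfacing as a vague downstream error.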

System design and risk mitigation become front and center. I align early with engineering, legal, security, and support on failure modes, privacy expectations (including the handling of personally identifiable information (PII)), and rollout plans. We log traces for every critical path, treat prompts as versioned assets, and use observability to connect inputs, intermediate states, and outputs. When something drifts, we need to see it fast, explain it, and fix it.
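One way to picture "prompts as versioned assets" plus trace logging is the sketch below. It is an illustration under assumptions, not a description of any particular observability tool: the `PromptVersion` and `Trace` classes, the content-hash versioning scheme, and the `kb-12` document id are all hypothetical.

```python
import hashlib
import json
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass(frozen=True)
class PromptVersion:
    name: str
    template: str

    @property
    def version(self) -> str:
        # Content hash: any edit to the template yields a new version id.
        return hashlib.sha256(self.template.encode("utf-8")).hexdigest()[:12]


@dataclass
class Trace:
    prompt_name: str
    prompt_version: str
    inputs: dict
    steps: list = field(default_factory=list)
    output: Optional[str] = None
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    started_at: float = field(default_factory=time.time)

    def record_step(self, name: str, data: dict) -> None:
        """Capture an intermediate state so it can be inspected later."""
        self.steps.append({"name": name, "data": data})

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


summarize_v1 = PromptVersion("summarize", "Summarize the ticket: {ticket}")
summarize_v2 = PromptVersion("summarize", "Summarize the ticket in one sentence: {ticket}")

trace = Trace(summarize_v1.name, summarize_v1.version, {"ticket": "Login fails on mobile"})
trace.record_step("retrieval", {"doc_ids": ["kb-12"]})
trace.output = "User cannot log in on mobile."
print(trace.to_json())
```

With the prompt version stored on every trace, a drift investigation can start by grouping traces by version and comparing outputs before and after a prompt edit.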

Evaluating non-deterministic AI features is its own craft. “Thumbs up/thumbs down” isn’t enough. I design layered evals: unit-level checks for correctness and formatting, scenario-level evals for edge cases and risk behaviors, and longitudinal evals to monitor model and data drift over time. Clear acceptance thresholds and shadow deployments help us balance velocity with reliability.
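As a rough sketch of layered evals with acceptance thresholds, the example below groups checks into layers and gates a release on each layer's pass rate. The layer names, checks, sample outputs, and threshold values are illustrative assumptions; longitudinal drift monitoring is left out because it needs data collected over time.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalLayer:
    name: str
    checks: list[Callable[[str], bool]]
    threshold: float  # minimum pass rate required for this layer


def pass_rate(layer: EvalLayer, outputs: list[str]) -> float:
    """Fraction of (output, check) pairs that pass within one layer."""
    results = [check(out) for out in outputs for check in layer.checks]
    return sum(results) / len(results)


def release_gate(layers: list[EvalLayer], outputs: list[str]) -> dict:
    """Return (pass_rate, meets_threshold) for every layer."""
    report = {}
    for layer in layers:
        rate = pass_rate(layer, outputs)
        report[layer.name] = (rate, rate >= layer.threshold)
    return report


# Unit-level: formatting and basic correctness checks.
unit = EvalLayer(
    "unit",
    [lambda out: len(out) > 0, lambda out: out.endswith(".")],
    threshold=0.85,
)
# Scenario-level: a risk behavior we never want (overpromising).
scenario = EvalLayer(
    "scenario",
    [lambda out: "guaranteed" not in out.lower()],
    threshold=0.95,
)

outputs = [
    "Your refund was issued.",
    "Reset link sent.",
    "Results are guaranteed.",
    "Please contact support",
]

report = release_gate([unit, scenario], outputs)
print(report)  # {'unit': (0.875, True), 'scenario': (0.75, False)}
```

Here the unit layer clears its bar but the scenario layer does not, so the gate would block the release even though most outputs look fine.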

Deciding when AI is the right solution starts with the customer problem, not the model. I ask: Is the task ambiguous enough to benefit from generation? Can we bound the failure modes? Do we have affordable latency and cost envelopes? And what’s the graceful fallback if the model underperforms? If a deterministic algorithm or simple rules solve it better, we choose that—no heroics.

The hidden cost of AI is maintenance. Prompts rot as upstream models change. New data skews behavior. Guardrails that worked yesterday might not hold tomorrow. That’s why ongoing evals, robust logging, and a change-management plan (for prompts, schemas, and policies) are non-negotiable. Treat AI features as living systems, not one-off launches.

If you’re exploring gen ai for product prototyping, start small. Pick a narrow, high-value workflow, instrument everything, and ship with clear success metrics. Use your first release to build your team’s muscles around observability, evals, and cross-functional collaboration. The goal is not a perfect model; it’s a reliable product outcome.

Want to go deeper? Listen to the full conversation here: Spotify | Apple Podcasts. Prefer video? Watch on YouTube: Building AI Products.

What you’ll learn in this episode:

– The difference between an AI-powered product manager and an AI product manager

– Why prompt engineering for a product is different from prompting ChatGPT for personal use

– The role of prompt decomposition and orchestration in building robust AI features

– How to think about system design, risk mitigation, and cross-functional collaboration

– Why observability and logging traces are critical for LLM products

– The challenge of evaluating non-deterministic AI features (and why “thumbs up/thumbs down” isn’t enough)

– How to decide when AI is the right solution for a customer problem

– The hidden cost of ongoing maintenance for AI features

Join the conversation: What practices have helped you ship reliable AI features? Drop your thoughts and questions in the comments—I’d love to learn from your experiences.

October 20, 2025
From Disruption to Breakthrough: How Stack Overflow’s AI Pivot Became a Product Playbook

Generative AI doesn’t knock politely—it kicks the door open and forces product teams to re-think the fundamentals. I’ve lived through my share of market shifts, and the story of Stack Overflow’s AI journey hits every note of what it takes to respond with clarity, speed, and rigor.

When ChatGPT launched, Stack Overflow faced a cataclysmic shift: developer behavior was changing overnight. That single sentence captures the urgency I felt as I studied this case: habits, traffic patterns, and value perceptions transformed almost instantly.

Consider the timing: Ellen Brandenburger stepped into Stack Overflow just two weeks before ChatGPT launched. In her shoes, I would have immediately asked the same questions she did: What new developer workflows are becoming “just now possible”? How quickly can we prototype without compromising quality or trust? And how do we avoid overcorrecting in a moment of uncertainty?

In response, the team created Overflow AI, a concentrated effort to explore “what’s just now possible” for developers. I love this framing—it anchors exploration to near-term feasibility while keeping sight of evolving user needs. It’s the kind of focused discovery effort I encourage when a platform-defining shift hits.

They moved through four disciplined iterations of conversational search, each an experiment with clear hypotheses and guardrails:

V1: a chat UI on top of keyword search

V2: semantic search to handle natural questions

V3: fallback to GPT-4 for gaps in Stack Overflow’s corpus

V4: adding RAG for attribution and transparency
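The attribution idea behind the later iterations can be sketched in a few lines. This is a toy illustration and not Stack Overflow's implementation: the corpus, the `example.com` URLs, and the keyword-overlap scoring are all assumptions, and a real system would use semantic retrieval and pass the retrieved sources to a model for generation.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    url: str
    text: str


# Hypothetical corpus with placeholder URLs.
CORPUS = [
    Doc("https://example.com/q/1",
        "Use git stash to save uncommitted changes before switching branches"),
    Doc("https://example.com/q/2",
        "A Python list comprehension builds a list from an iterable in one expression"),
]


def tokenize(text: str) -> set:
    return set(text.lower().split())


def retrieve(question: str, corpus: list, min_overlap: int = 2) -> list:
    """Rank documents by shared words and drop weak matches."""
    q_tokens = tokenize(question)
    scored = [(len(q_tokens & tokenize(doc.text)), doc) for doc in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score >= min_overlap]


def answer(question: str, corpus: list) -> dict:
    """Return an answer only when it can be attributed to a source."""
    sources = retrieve(question, corpus)
    if not sources:
        # No grounded source: say so instead of giving an unattributed answer.
        return {"answer": None, "sources": [], "grounded": False}
    return {
        "answer": sources[0].text,
        "sources": [doc.url for doc in sources],
        "grounded": True,
    }


hit = answer("how do I save uncommitted changes in git", CORPUS)
miss = answer("what is kubernetes", CORPUS)
print(hit)
print(miss)
```

The design choice worth noticing is that every answer carries its source list, and the ungrounded case is made explicit so the interface can show it differently.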

Two principles stood out as non-negotiable: attribution and transparency. For developers, trust depends on knowing where an answer came from, why it’s relevant, and whether it reflects source truth. I’ve found the same in my own teams—without provenance and clarity, even great answers feel shaky.

The team’s evaluation approach was refreshingly pragmatic: simple spreadsheets and subject-matter experts assessing accuracy, relevance, and completeness. In my org, we’ve adopted similar lightweight scorecards before scaling LLM investments; it keeps us honest about quality before we fall in love with a demo.

Here’s the moment that demonstrates real product management leadership: despite the investment, Stack decided to sunset conversational search when it couldn’t meet developer standards. That discipline—choosing not to ship what isn’t good enough—preserves brand trust and creates space for a better bet.

And that better bet was a strategic pivot: the team leaned into data licensing, leveraging its 14M+ Q&A corpus to power LLM training and benchmarks. Instead of treating AI as a threat, they turned their differentiated asset into a durable business line.

They went further, building industry benchmarks with subject-matter experts to prove Stack data improved LLM accuracy and relevance. This is exactly how I think about outcomes vs output: quantify lift against real tasks, validate with domain experts, and package value in a way decision-makers can trust.

Key lessons I’m taking forward:

Take one bite of the apple at a time—prototype, learn, iterate.

Product in the AI era means managing probabilities, not certainties.

For context, Ellen Brandenburger is a product leader and coach; former head of product at Chegg Skills and Stack Overflow’s data licensing team. Her arc through this transformation underscores what matters most right now: tight feedback loops, transparent evaluation, and the courage to pivot from feature bets to business model bets when the evidence demands it.

If you’re leading gen AI initiatives, treat this as a playbook: form a focused “just now possible” team, instrument quality with SMEs early, obsess over attribution and transparency, and be willing to sunset—even after heavy investment—when the work doesn’t clear your user’s bar. Then, zoom out: your unique data and workflows may be the moat. Build for that.

October 20, 2025
Mastering AI Evals: Real-World Discovery Tactics to Ship Quality, Safe, Reliable AI

I’ve been shipping GenAI features long enough to know that clever prompts and orchestration aren’t enough. What actually matters is evidence: Does the system work, for whom, and under what conditions? That’s where rigorous AI evals come in—the backbone of building reliable, safe, and continuously improving AI products.

In a recent conversation focused entirely on evaluation, I dug into what “evals” mean in the AI/ML world, why they’re more than just quality assurance, and how to operationalize them end to end. If you want to explore the discussion, listen on Spotify: https://open.spotify.com/episode/7mSiEGSYNO4sXeGAVTJO4V or Apple Podcasts: https://podcasts.apple.com/kh/podcast/ai-evals-discovery/id1794203808?i=1000727980774. There’s also a video version on YouTube: https://www.youtube.com/watch?v=pfSIQMrWhQE.

Here’s how I frame evals with my teams. First, define the behavior you want to see in terms real users care about. Then codify that intent as tests that run consistently. I distinguish between golden datasets, synthetic data, and real-world traces. Golden datasets capture canonical examples that represent “ground truth.” Synthetic data fills important gaps quickly and safely. Real-world traces keep you honest and reflect evolving usage.

The most durable loop I’ve found is simple: identify error modes, turn them into evals, and automate. This is where error analysis pays off. Some checks should be purely deterministic—code-based checks that evaluate structured outputs, schemas, or policies. Others benefit from LLM-as-judge when human-like judgment matters, as long as you calibrate and continuously verify those judges with spot checks and inter-rater agreement.
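For the calibration step, one common agreement measure is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below computes it between human labels and LLM-judge labels; the label lists are made-up sample data, and any cutoff a team uses for "good enough" agreement is a team-specific choice, not a standard.

```python
from collections import Counter


def cohens_kappa(labels_a: list, labels_b: list) -> float:
    """Chance-corrected agreement between two raters on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Probability both raters pick the same label if they labeled independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Illustrative spot-check sample: human reviewer vs. LLM judge.
human = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
judge = ["pass", "pass", "fail", "fail", "fail", "fail", "pass", "pass"]

kappa = cohens_kappa(human, judge)
print(round(kappa, 2))  # 0.75
```

Raw agreement here is 7 of 8, but kappa is lower because half of that agreement would be expected by chance given how often each rater says "pass".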

Discovery practices should inform every evaluation step. If you’re doing “Story-Based Customer Interviews,” you can derive realistic scenarios, acceptance criteria, and edge cases directly from user narratives. That context sharpens the evals and prevents you from overfitting to toy problems or proxy metrics that don’t reflect user value.

Evals require ongoing care and feeding. Criteria drift is real—what counted as “good” six weeks ago may not satisfy users after you ship a new capability or your audience evolves. I treat the eval suite like living product infrastructure: versioned, reviewed, and owned. When we change prompts, models, or retrieval strategies, the evals run first, then we examine deltas, regressions, and surprises before anything reaches production.
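A minimal sketch of "examine deltas, regressions, and surprises" might compare per-case results from a baseline run and a candidate run. The case ids and pass/fail values below are invented for illustration, and the rule that any regression blocks shipping is an assumption a team could relax.

```python
def compare_runs(baseline: dict, candidate: dict) -> dict:
    """Diff two eval runs keyed by case id, where each value is pass (True) or fail (False)."""
    shared = sorted(baseline.keys() & candidate.keys())
    regressions = [case for case in shared if baseline[case] and not candidate[case]]
    fixes = [case for case in shared if not baseline[case] and candidate[case]]
    new_cases = sorted(candidate.keys() - baseline.keys())
    return {
        "regressions": regressions,
        "fixes": fixes,
        "new_cases": new_cases,
        "safe_to_ship": not regressions,
    }


# Illustrative results before and after a prompt change.
baseline = {"refund-policy": True, "pii-redaction": True, "tone-formal": False}
candidate = {
    "refund-policy": True,
    "pii-redaction": False,
    "tone-formal": True,
    "multilingual": True,
}

delta = compare_runs(baseline, candidate)
print(delta)
```

In this sample the change fixed one case and broke another, which is exactly the kind of trade an aggregate pass rate would hide.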

Guardrails and human oversight work hand-in-hand with evals. Guardrails enforce non-negotiables (safety, privacy, compliance), while evals measure progress against nuanced goals (relevance, helpfulness, tone). In high-stakes workflows, I combine pre-deployment evals, runtime guardrails, and spot human review. The goal isn’t to eliminate humans; it’s to focus their attention where judgment and context matter most.

Practically, I start with a minimal eval harness that standardizes inputs and outputs, often in JSON (JavaScript Object Notation), and runs repeatable tests. I maintain a small golden dataset, add targeted synthetic data for coverage, and stream real-world traces into the suite once we have consent and redaction in place. For subjective criteria (e.g., tone, helpfulness), I layer in LLM-as-judge with calibration. For objective checks (e.g., schema validation, policy compliance), code-based checks are my default.
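A harness along these lines can be very small. The sketch below is one possible shape, assuming golden cases stored as JSON and a hypothetical `system_under_test` function standing in for the real AI feature; the case fields (`expect_keys`, `expect_intent`) are invented for this example.

```python
import json

# Golden dataset: canonical cases stored as JSON (illustrative content).
GOLDEN_CASES = json.loads("""
[
  {"id": "greeting", "input": "hi",
   "expect_keys": ["reply", "intent"], "expect_intent": "greeting"},
  {"id": "refund", "input": "I want my money back",
   "expect_keys": ["reply", "intent"], "expect_intent": "refund"}
]
""")


def system_under_test(user_input: str) -> str:
    """Hypothetical stand-in for the AI feature; returns a JSON string."""
    intent = "refund" if "money back" in user_input else "greeting"
    return json.dumps({"reply": "Thanks for reaching out.", "intent": intent})


def check_case(case: dict, raw_output: str) -> list:
    """Code-based checks: valid JSON, required keys, expected intent."""
    try:
        output = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(output, dict):
        return ["output is not a JSON object"]
    failures = [f"missing key: {key}" for key in case["expect_keys"] if key not in output]
    if output.get("intent") != case["expect_intent"]:
        failures.append(
            f"intent was {output.get('intent')!r}, expected {case['expect_intent']!r}"
        )
    return failures


def run_suite(cases: list, system) -> dict:
    """Run every case the same way and collect failures per case id."""
    return {case["id"]: check_case(case, system(case["input"])) for case in cases}


results = run_suite(GOLDEN_CASES, system_under_test)
print(results)  # an empty list per case id means every check passed
```

Synthetic cases and redacted real-world traces can be appended to the same JSON structure, so the suite grows without changing the runner.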

Tooling evolves quickly, but the principles hold. Whether you’re working with Anthropic or experimenting with V0 or Lovable in your prototyping stack, the eval loop stays the same: define success, test it the same way every time, and close the loop with learning. If you’re a product creator or leading forward deployed engineers, this discipline accelerates gen AI product prototyping without sacrificing safety or quality.

I also tie evals to OKRs framed around outcomes rather than output. Instead of “ship three prompts,” we commit to measurable outcomes like resolution rate, time-to-answer, or a target “helpfulness” score. In customer support AI work, we monitor real-world traces, CSAT, and handoff quality to ensure the AI augments agents rather than creating silent failure modes. That’s how evals produce lessons about product-market fit instead of just dashboards.

If you want to go deeper, explore these foundational concepts and tools:
ML (Machine learning)
LLM (Large language model)
“AI Evals for Engineers and PMs”: https://maven.com/parlance-labs/evals
“The Product Leadership Wheel – A Framework for Defining and Growing Product Leadership at Scale”: https://www.petra-wille.com/plwheel
“How I Designed & Implemented Evals for Product Talk’s Interview Coach”: https://www.producttalk.org/2025/09/interview-coach-evals/
“Behind the Scenes: Building the Product Talk Interview Coach”: https://www.producttalk.org/2025/08/customer-interview-coach/
V0: https://vercel.com/docs/v0
JSON (JavaScript Object Notation): https://en.wikipedia.org/wiki/JSON
Anthropic: https://www.anthropic.com/
Lovable: https://lovable.dev/
“Story-Based Customer Interviews”: https://learn.producttalk.org/course/story-based-customer-interviews

If this resonates, I’ll be sharing weekly lessons learned from building and evaluating AI features in the wild, plus conversations with cross-functional teams about real-world AI development. Have thoughts or a tactic that’s worked for you? Drop a comment and let’s compare notes.

October 20, 2025
Inside Braze’s Blitz to $500M CARR: Bold PM Lessons on Going Global and Outsmarting Rivals

I’ve long believed the best product breakthroughs happen at the intersection of market timing, technical first-principles, and relentless customer discovery. Braze’s trajectory is a compelling proof point. Bill Magnuson is the co-founder and CEO of Braze, and Kevin Wang, who joined as employee #8, serves as its CPO. The two MIT graduates have built Braze into a publicly listed customer engagement platform with a $4.4B market cap. In 2023, Braze surpassed $500M in CARR, and it serves over 2,200 customers worldwide. Before Braze, Bill spent time at Bridgewater Associates. Kevin’s academic background is in brain & cognitive sciences, and prior to joining Braze he worked at Accenture and Brewgene.
What strikes me most is how early conviction catalyzed execution. The Braze founders’ early insights into the mobile revolution weren’t abstract theses; they translated into concrete product choices that aligned with the emerging realities of push notifications, in-app messaging, and event-driven personalization. That early bet on mobile-first customer engagement created strategic leverage that compounding growth later amplified.
Origin stories matter because they encode the decision-making DNA. How a TechCrunch Hackathon sparked Braze’s creation is a reminder that speed to learning often beats speed to launch. Meeting co-founders at an NYC Hackathon stacked the deck for chemistry and complementary skills — a pattern I’ve seen repeatedly when teams form around real problems and prototype under time pressure.
Finding “terminal value” product market fit is more than PMF — it’s about enduring utility that scales with customer complexity. I appreciated how they framed the search as “fishing in every pond,” testing use cases and segments broadly while retaining a coherent platform strategy. That duality — breadth of exploration with depth of conviction — is precisely how I guide teams through product discovery when the surface area of opportunity is vast.
The early journey from 1,000 beta signups to 2,200+ paying customers underscores a disciplined funnel from interest to value to revenue. Braze’s scrappy scaling and early product development show that sometimes you must resist playbook dogma. Breaking the rules of a lean startup doesn’t mean ignoring hypotheses; it means investing ahead of the curve when platform primitives (data, messaging, orchestration) are the real unlock for long-term differentiation.
Navigating early fundraising challenges often forces sharper articulation of strategy and sequencing. I’ve found that the “why now” and “why this architecture” narratives become decisive — especially when your thesis runs counter to conventional wisdom. In Braze’s case, riding the mobile wave to success was inseparable from building the right infrastructure for real-time engagement and global scale.
Competition is inevitable; how you posture is a choice. Approaching competition strategically like a boxer resonated with me — pick your angles, conserve energy, and control the fight tempo. Translate that into product terms: choose the battles that exploit your architectural strengths, avoid the feature-by-feature brawl, and make category-defining bets where your feedback loops are fastest and most defensible.
Globalization rewards systems thinking. Building a global customer base requires architectural foresight (latency, compliance, localization), go-to-market nuance, and a repeatable model for entering new regions. When scale helps or hurts is an under-discussed reality — some processes must centralize; others need to decentralize to stay close to the customer signal. The never-ending quest for PMF is real; every new segment, channel, and geography is a fresh PMF search with its own “viable path to value.”
If I had to distill the practitioner takeaways, I’d start with this: prioritize platform primitives over shiny features; measure learning velocity, not just shipping velocity; and align resourcing to “terminal value” outcomes, not activity. That’s how you out-execute better-funded rivals and convert timing advantages into durable moats.
Referenced:
Accenture: https://www.accenture.com/
Appboy: https://www.braze.com/resources/articles/appboy-social-network-for-mobile-apps
Bipul Sinha: https://www.linkedin.com/in/bipulsinha/
Braze: https://www.braze.com/
Bridgewater Associates: https://www.bridgewater.com/
Jon Hyman: https://www.linkedin.com/in/jon-hyman/
Mark Ghermezian: https://x.com/markgher
MIT: https://www.mit.edu/
Rubrik: https://www.rubrik.com/
WeWork: https://www.wework.com/

October 20, 2025
How Guideline Rewired 401(k)s: First‑Principles Strategy, Gusto Edge, and Product Wins

“I don’t believe in stealth mode” is a product mantra I’ve long embraced, and it immediately came to mind as I dug into how Guideline modernized 401(k)s for small and medium-sized businesses. In a space dominated by incumbents and legacy processes, transparency and execution in public view can be a superpower. That ethos, paired with disciplined product discovery, comes through clearly in Guideline’s story.
Kevin Busque, the co-founder and CEO of Guideline, saw the problem up close while building Taskrabbit: traditional 401(k) plans suffered from complexity, low participation, and “confusing fee structures.” As a product leader, I’ve watched similar frictions stall adoption in other regulated categories—when fees are opaque and onboarding is arduous, engagement dies before it starts. The insight was simple but profound: remove confusion, automate compliance, and make default participation the norm.
After launching Guideline to address those problems head-on, the company rapidly validated market pull, hitting $120 million in ARR by June 2024. That milestone reflects more than growth—it’s evidence that a first-principles approach to retirement plans can outcompete legacy playbooks. It also highlights the compounding impact of product decisions that prioritize clarity, automation, and aligned incentives.
What impressed me most was the “Do the hard thing first” mindset. In practice, that meant investing early in infrastructure others avoided, like deeply integrated payroll workflows and robust compliance automation, rather than deferring them as future tech debt. It’s the opposite of chasing shiny objects: master the unglamorous backbone and everything else compounds.
On market entry, Guideline focused on nailing product-market fit by aligning with payroll ecosystems where SMBs already live. The Gusto partnership was a pivotal move—“Kevin’s insights from the Gusto integration” underscore how strategic distribution, combined with a clean UX and transparent pricing, became a durable edge. Compared to heavyweights like ADP, Fidelity, Paychex, and Intuit, Guideline reframed the buyer journey around simplicity and trust.
Pricing matters in retirement more than most founders realize. “How Guideline set their fees up” and “Lucky 8: Kevin’s unexpected pricing strategy” show how precise pricing architecture can both demystify costs and drive adoption. Clarity isn’t just a marketing claim—it’s a feature that reduces cognitive load and increases participation rates.
I also appreciated how early traction came from a surprisingly broad customer mix—“The surprising range of Guideline’s early customers” points to a product that generalized well across verticals without losing focus. “Working with Plaid as Guideline’s first customer” exemplifies how partnering with trusted fintech brands accelerates credibility and creates feedback loops that sharpen the product.
Defaults drive outcomes. “Guideline’s auto-enrollment feature” is a great example of using behavioral design to improve financial health at scale. When the right default exists and the friction is removed, participation becomes the baseline, not the exception. It’s a masterclass in aligning product and policy to deliver real retirement outcomes, not just feature checklists.
From a roadmap perspective, I was struck by the discipline in resisting premature expansion—“Will Guideline ever go multi-product?” is a nuanced question for any scaling company. “Kevin’s take on product-market fit” and “Guideline’s compounding advantage” reinforce a principle I live by: compound depth before breadth. Every integration, every compliance workflow, every support touchpoint can either compound or fragment your advantage.
Finally, leadership matters as much as strategy. “The challenges faced by introverted leaders” resonate deeply with how I build teams: create space for deep work, institutionalize written decision-making, and use clear operating principles so the product vision scales beyond the founder. It’s the quiet, consistent habits—not the loud slogans—that hold complex products together.
For product leaders working on regulated, high-stakes categories like retirement plans, healthcare, or financial services, the lessons are clear: conduct rigorous product discovery before you ship, pursue distribution advantages through strategic partnerships, architect pricing as an experience, and let default-driven features (like auto-enrollment) do the heavy lifting. That’s how you rewire entrenched markets—by doing the hard thing first, and doing it in the open.

October 20, 2025
What Makes or Breaks Executive Hires: My Lessons on Fit, Red Flags, and Measuring Success

Executive hiring is one of those rare decisions that can bend a company’s trajectory. In my role leading product management at a high-growth SaaS company, I’ve seen the difference between a leader who compounds value and one who quietly drains momentum. That’s why I was eager to examine what actually makes (or breaks) these bets, and to share a practical lens you can use to improve executive hiring outcomes.
I sat down with Eeke de Milliano for a focused conversation on the realities of executive hiring, leadership transitions, and measuring success. We dig into the “buy or build a leader” decision, how to avoid common red flags, and what it takes to set executives up to thrive in hyper-growth environments.
Eeke de Milliano is the Head of Global Product at Stripe, helping drive innovation and success in the company’s product line. Before this role, she was Head of Product at Retool and co-founded Constellate. Eeke previously spent 6 years as Product Lead at Stripe, working with the company during its hyper-growth era.
In today’s episode, we discuss how to rigorously assess executive hiring fit, including the challenges companies face when hiring new executives and the most common red flags and pitfalls I see teams miss under time pressure. We also explore practical advice for measuring success, especially when outcomes vs output get muddled in the first 90–180 days.
A recurring theme for me is that learning your own strengths is an underrated piece of the process. If you don’t understand the leadership leverage you already have on the team, you’ll over-hire for breadth or under-hire for depth. Great executive hiring clarifies the complementary edge you need—then measures it.
On the buy vs build decision: early signals matter. If you’re “buying” an external leader, pre-align on scope, authority, and what great looks like before day one. If you’re “building” from within, design a clear on-ramp and operating cadence so the leader can scale without drowning. In both cases, my mental model is to instrument leading indicators (team health, decision velocity, stakeholder trust) well before lagging business metrics fully show up.
Two red flags I always watch for: first, leaders who default to playbooks without interrogating context; second, leaders who cannot articulate how they measure success beyond activity and output. In hyper-growth, pattern-matching is useful—but uncalibrated pattern-matching is dangerous.
The human dynamics matter just as much as the strategy. What creates dysfunctional exec relationships is often misaligned interfaces: unclear decision rights, overlapping charters, or incentives that reward local maxima. High-functioning executive teams are like parents—a united front in public, with candid debate in private, anchored to shared principles and measurable outcomes.
Referenced:
ASML: https://www.asml.com/en
Claire Hughes Johnson: https://www.linkedin.com/in/claire-hughes-johnson-7058/
Constellate: https://constellate.team/
John Collison: https://www.linkedin.com/in/johnbcollison/
Mike Maples Jr.: https://www.linkedin.com/in/maples/
Patrick Collison: https://www.linkedin.com/in/patrickcollison/
Retool: https://retool.com/
Stripe: https://stripe.com/
Will Gaybrick: https://www.linkedin.com/in/william-gaybrick-5730347/
Where to find Eeke:
LinkedIn: https://www.linkedin.com/in/eeke-de-milliano-3b05a629/
Timestamps:
(00:00) Should you ‘buy or build’ a leader
(03:45) Why do executive hires fail so often?
(09:35) Why the stakes are so high for leadership hires
(12:26) The hardest document Eeke ever wrote
(14:06) Two red flags in a new hire
(17:27) An example of an outstanding leader
(21:40) What creates dysfunctional exec relationships
(22:38) The three steps towards hiring successful leaders
(30:30) What you should know about outside hires
(33:12) Eeke’s advice for easing leadership transitions
(42:06) How to notice success patterns
(47:21) Why high-functioning executive teams are like parents
(52:02) The most surprising lesson from Eeke’s first stint at Stripe
(55:11) The leadership data Eeke wishes we had

October 20, 2025
Scrappy Outbound to ‘Hyperbolic’ PMF: How a COVID Pivot Fueled Owner’s Explosive Growth

I’m drawn to origin stories that turn constraints into catalysts, and this one is a masterclass. Adam Guild is the co-founder and CEO at Owner, an online food ordering system for independent restaurants. Within a year, Owner went from being about to run out of money to having hundreds of customers. Last year, they raised a $33M Series B. Those numbers only make sense when you see the scrappy tactics and the decisive post-COVID pivot that unlocked genuine product-market fit.

Adam’s entrepreneurial journey began as a teenager when he built a successful Minecraft server, which led him to drop out of high school to become a founder. His passion for helping small businesses was sparked by his mom’s struggles running a dog grooming shop, which led him to launch the early iteration of Owner. As a product leader, I recognize that kind of founder-market fit instantly—the best ideas often surface at the intersection of lived pain and hands-on tinkering.

What struck me most was how working with a small business kickstarted Owner. Rather than “build it and they will come,” Adam embedded with real operators, learned their workflows, and shipped fast iterations that directly moved revenue and saved time. I’ve seen this forward-deployed product approach outpace traditional discovery for SMB tools—when you sit in the kitchen, the point-of-sale line, or the back office, your prioritization gets brutally clear and your product discovery becomes grounded in outcomes, not outputs.

Adam’s unusual outbound strategy was a reminder that early-stage go-to-market is a craft. Cold outreach, hands-on onboarding, and relentlessly personalized pitches carried them through the zero-to-one phase. When your ICP is time-starved and margin-conscious, “unscalable” tactics are often the most scalable path to signal: you earn trust, collect high-fidelity feedback, and create case studies that compound.

Then came the COVID pivot. The pandemic accelerated Owner’s success because it reshaped demand overnight: independent restaurants needed a direct, online ordering system to survive. The teams that won were those that eliminated adoption friction, connected the dots between channel, product, and operations, and carried the emotional weight of their customers’ reality. This is where Owner’s speed and empathy turned into a durable advantage.

The quest to find product-market fit crystallized around clear signals: urgent pull from operators, fast time-to-value, and repeatable outcomes. The throughline is how Owner’s pivot led to “hyperbolic” product-market fit: usage intensity, referrals, and condensed sales cycles all pointed to a solution that was now indispensable. Inside Owner’s explosive growth, I see a tight loop: ship, sit with customers, quantify impact, then scale only what works.

What actually worked to get new customers? Channel–product fit over channel proliferation. High-intent outreach, proof via live results, and visible social proof from recognizable restaurants created momentum. I also appreciated the pragmatic lens on partnerships and content: operators trust peers and practical playbooks more than generic marketing. The restaurants mentioned, ranging from Guisados to P.F. Chang’s, highlight how credibility compounds when you deliver consistently.

How Owner secured its crucial first round of funding reinforced a familiar truth: narrative quality rises with clarity of problem, velocity of learning, and evidence of market pull. The constellation of names referenced (Alex Bard, Dean Bloembergen, Jack Altman, Kimbal Musk, Naval Ravikant, Neil Patel, Peter Thiel, Sean Rad) underscores how operational rigor plus a resonant mission attracts heavyweight believers. Communities like the Thiel Fellowship and Y Combinator also surfaced as formative ecosystems that sharpen founders and widen networks.

The bet on going multi-product is a pivotal inflection in any SMB platform’s life. Expansion only works when each new capability deepens core value for the same buyer, not when it dilutes focus. The winning pattern: solve one painful job thoroughly, earn the right to add adjacent workflows, and measure expansion by net retention and attach rate—not by a feature checklist. This is where outcomes vs output OKRs prevent drift.
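To make those two expansion measures concrete, here is a minimal Python sketch. It assumes one common convention: net revenue retention is the recurring revenue a starting cohort pays at the end of a period divided by what it paid at the start, and attach rate is the share of still-active customers who have adopted a given add-on. The accounts, field names, and the "loyalty" add-on are hypothetical illustrations, not Owner's data.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    name: str
    start_mrr: float  # monthly recurring revenue at the start of the period
    end_mrr: float    # monthly recurring revenue at the end (0 if churned)
    products: set[str] = field(default_factory=set)


def net_revenue_retention(accounts: list[Account]) -> float:
    """End-of-period revenue from the starting cohort / start-of-period revenue."""
    start = sum(a.start_mrr for a in accounts)
    if start == 0:
        raise ValueError("cohort has no starting revenue")
    return sum(a.end_mrr for a in accounts) / start


def attach_rate(accounts: list[Account], product: str) -> float:
    """Share of still-active accounts that have adopted `product`."""
    active = [a for a in accounts if a.end_mrr > 0]
    if not active:
        return 0.0
    return sum(product in a.products for a in active) / len(active)


# Hypothetical cohort: two accounts expanded, one stayed flat, one churned.
cohort = [
    Account("A", 100.0, 150.0, {"ordering", "loyalty"}),
    Account("B", 100.0, 100.0, {"ordering"}),
    Account("C", 100.0, 0.0),
    Account("D", 100.0, 120.0, {"ordering", "loyalty"}),
]

nrr = net_revenue_retention(cohort)   # 370 / 400
rate = attach_rate(cohort, "loyalty")  # 2 of 3 active accounts
print(f"NRR: {nrr:.1%}, loyalty attach rate: {rate:.1%}")
```

Read together, the two numbers say more than a feature count: retention below 100% with a healthy attach rate suggests the add-on is landing but churn is outrunning expansion.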

I also took note of the hiring philosophy. The two qualities Adam looks for in new hires map to what I’ve seen drive early-stage slope: people who love the problem and run toward responsibility. Narrow hiring bars, clear scorecards, and hands-on working sessions outperform generic interviews—especially when your customers are small businesses who need speed, reliability, and care.

Sales-led vs. product-led growth is often framed as a binary, but in SMB, the blend matters. Early on, a sales-led motion validates willingness to pay and compresses feedback cycles; as fit tightens, product-led loops amplify reach and reduce CAC. The art is knowing when to transition emphasis, which KPI to optimize at each phase, and how to keep experience quality high as you scale onboarding and support.

For additional context and inspiration, the conversation touched on operators, thinkers, and platforms such as HubSpot, Modern Restaurant Management, and communities like Y Combinator and the Thiel Fellowship, alongside individuals including Alex Bard, Dean Bloembergen, Jack Altman, Kimbal Musk, Naval Ravikant, Neil Patel, Peter Thiel, and Sean Rad. The range of perspectives mirrors the range of skills modern product leaders need to wield—customer empathy, scrappy GTM, and disciplined execution.

My takeaway is simple: scrappiness fuels discovery, clarity fuels scale. Owner’s journey—from near-zero runway to hundreds of customers and a $33M Series B—shows how a decisive, customer-obsessed pivot can transform a fragile idea into an enduring company. If you build for independent restaurants or any SMB segment, the blueprint holds: earn trust through results, tighten feedback loops, and let product-market fit pull you forward.

October 20, 2025
From Bootstrapped to $6B: Inside 1Password’s B2B Pivot, GTM Engine, and CEO Playbook

I’m endlessly fascinated by companies that scale with discipline, humility, and a relentless focus on customer trust. 1Password’s arc checks every box. Used by over 100,000 businesses and millions of individuals worldwide, it’s a rare story of going from a small, family-run operation to a $6B company without losing the plot. As I dug into this journey, I found a masterclass in product management leadership, intentional go-to-market sequencing, and the hard choices required to balance security with usability.

Here’s the quick snapshot that framed my takeaways: Jeff Shiner joined 1Password as CEO in 2012, when the team was just under 20 people. Under Jeff’s leadership, 1Password expanded into B2B, launched a SaaS platform, and scaled from a small family-run operation into a global company. In 2019, Jeff led 1Password through its first-ever funding round – a $200M Series A from Accel – to build out its go-to-market team and accelerate product development. Before joining 1Password, Jeff held senior roles at IBM and led teams through multiple acquisitions and integrations. That resume matters; it shows up in the way the company navigated pivotal transitions without spinning out.

The first lesson that landed for me: bootstrapping isn’t always what it’s cracked up to be. Staying bootstrapped for 15 years created incredible product discipline and customer-centricity, but there was an opportunity cost in go-to-market velocity. The decision to raise a $200M Series A from Accel in 2019 wasn’t about vanity—it was a surgical call to build the commercial muscle and accelerate product development at the moment the category was tipping. I’ve seen similar inflection points in my own work: the right capital, at the right time, can turn a strong product into a dominant platform.

The consumer-to-B2B pivot is the second big lesson. The lightbulb moment was recognizing team adoption patterns and the unmet enterprise needs around provisioning, policy, and audit. That shift required more than features; it demanded a reframe of the product roadmap, with crisp outcomes over output across security, UX, and administration. This is product discovery 101 at scale—listen for systemic patterns in user behavior, then align the organization on the few bets that unlock product-market fit in the new segment.

One of my favorite strategic choices was launching the SaaS platform before billing. It sounds counterintuitive, but it worked because it prioritized trust and usage over immediate monetization. By validating real-world adoption first, the team bought itself the context to introduce pricing and metering that mapped to customer value. When we’ve run similar plays, I’ve found the keys are transparent communication, clean migration paths, and a metrics spine that ties engagement to eventual revenue.

Security is the brand. But “being too secure” can kill usability—and adoption. I appreciated how candidly the team confronted this. Over-indexing on friction (even for good reasons) can block activation, expansion, and the very outcomes security teams care about. The craft is in reducing cognitive load while preserving principled guardrails. It’s a practical reminder that the best security UX eliminates unnecessary choices and defaults to safety without making people feel punished for doing the right thing.

There’s also a leadership chapter I found especially human: becoming CEO without telling anyone. Titles aside, the work was about creating alignment—clarifying purpose, simplifying decision rights, and protecting a culture that had been forged by builders like David Teare, Sara Teare, Roustem Karimov, and Natalia Karimov. In my experience, this is where outcomes vs output OKRs pay off: they force teams to anchor on the customer result, not the feature list, which becomes crucial as the org scales across B2C and B2B motions.

On go-to-market, the sequencing was clean: invest deliberately across sales, marketing, and customer success to support the B2B motion while keeping a strong consumer brand. The thread through all of it was customer-centric focus at scale. One tactic I advocate to preserve that focus is to embed product and engineering tightly with the field—think forward deployed engineers for high-signal accounts—so the roadmap stays tethered to real-world constraints, not just internal narratives.

Competitors matter, but the posture matters more. I liked how Jeff framed it: know the market, including players like LastPass, but don’t let competition dictate the roadmap. Use it as a directional signal, not an existential script. The companies that win don’t chase parity; they compound differentiated value and limit context switching for customers.

Not every bet landed. The first B2B product failed. That failure, and the iteration that followed, is precisely how strong product cultures are built. You tighten the feedback loops, double down on product discovery, and refine the jobs-to-be-done until adoption becomes the leading indicator. What stood out is how those learnings later informed the most pivotal moments in the company’s climb.

If you want to trace the journey end to end, there are a few sections I flagged for a deeper listen and reflection: how Jeff got involved at 0:03, the consumer-to-B2B pivot at 16:13, the first B2B product failure at 30:40, the funding decision after 15 years bootstrapped at 52:45, and a candid look at the most pivotal moments at 1:02:00.

Referenced for context and further exploration: 1Password: https://1password.com, Accel: https://www.accel.com, Arun Mathew: https://www.linkedin.com/in/arun-mathew-b7186412/, David Teare: https://www.linkedin.com/in/daveteare/, Floodgate: https://floodgate.com, LastPass: https://www.lastpass.com, Mike Maples: https://www.linkedin.com/in/maples/, Natalia Karimov: https://1password.com/company/meet-the-team/natalia-karimov, Roustem Karimov: https://www.linkedin.com/in/roustem/?originalSubdomain=ca, Sara Teare: https://1password.com/company/meet-the-team/sara-teare, Shopify: https://www.shopify.com, Tobi Lütke: https://www.linkedin.com/in/tobiaslutke/.

Where to find Jeff: LinkedIn: https://www.linkedin.com/in/jshiner

My biggest takeaway: this is a blueprint for scaling trust. From “too secure” to just secure enough, from consumer to B2B, from bootstrapped to $200M Series A, the throughline is disciplined learning. For product leaders, the invitation is clear—align the roadmap to outcomes, validate value before billing, and build a go-to-market engine that amplifies customer love rather than distracting from it.

October 20, 2025
How a Weekend Hack Hit 7-Figure ARR: My Product Playbook from Reducto’s Rise

I’m often asked how to spot and scale an AI wedge quickly without over-engineering. Recently, I studied how one founder did exactly that, and it’s a masterclass in product-market fit, go-to-market speed, and customer-centric execution.

Adit Abraham is the co-founder and CEO of Reducto, which helps leading AI teams extract and structure data from complex documents and spreadsheets in their pipeline. Within 6 months of launching, Reducto went from 0→7 figures in ARR. Reducto has grown to process tens of millions of pages monthly for companies ranging from startups to Fortune 10 enterprises. They just announced a $24M Series A. Before Reducto, Adit was a Product Manager at Google, working on Ads and Search, and conducted machine learning research at MIT’s Media Lab.

Here’s what stood out to me as a product leader: the fastest path to traction wasn’t a grand platform vision. It was a weekend project that nailed one painful, universal job to be done: turn messy PDFs and spreadsheets into structured, reliable data that AI teams can trust.

Listening to customers revealed an important pivot. Instead of forcing a preconceived product roadmap, the team followed customer signal to PDF processing. The turning point wasn’t a feature bomb; it was clarity: when your users repeatedly drag you toward a narrow, high-pain workflow, follow that pull with urgency.

The weekend project that became Reducto’s breakthrough embodied a principle I push with my teams: ship a thin slice that solves one gnarly, repeatable problem end-to-end. It creates credibility, accelerates learning loops, and makes it obvious what to build next. From there, Reducto focused on “transferable features,” capabilities that compound across adjacent use cases (think normalization, validations, lineage, and auditability), so every new customer increases product surface area without bespoke reinvention.

Landing a Fortune 10 customer didn’t come from a flashy deck. It came from enterprise-grade reliability, ruthless attention to accuracy, and a willingness to be hands-on. This is where forward-deployed engineering shines: sit with users, work their real documents, and treat integrations, SLAs, and observability as first-class features. In AI document processing, precision and proof beat promises every time.

For technical founders, sales can feel unnatural. My guidance mirrors what worked here: reframe sales as active product discovery at the edge of pain. Use the customer’s language, quantify ROI in minutes saved and errors avoided, and reduce the perceived risk with quick pilots, deterministic evaluation, and transparent quality metrics. Caring beats perfect pitches: responsiveness, iteration speed, and real ownership of results build trust faster than theatrics.

The strategy behind Reducto’s horizontal expansion was pragmatic: start with a narrow ingestion problem, then generalize through connectors, schemas, and review workflows that serve multiple industries. When a wedge market behaves like infrastructure, platformize the capabilities that every adjacent use case will need. That’s how you broaden TAM without losing product sharpness.

I also appreciate the operating cadence: hire slow, go-to-market fast. Keep the bar high on IC excellence while removing friction from the path to revenue. Early-stage advantage comes from fewer handoffs, shorter feedback loops, and tighter alignment between product, engineering, and customer outcomes.

On mindset, one line resonated deeply: “You’re going to fail.” The point isn’t pessimism; it’s preparation. Design processes that surface weak signals early, celebrate invalidated hypotheses, and compress the time between insight and iteration. In my experience, the teams that win treat failure as data and speed as a cultural norm.

Fundraising-wise, momentum compounds when narrative and metrics rhyme. 0→7 figures in ARR in six months, tens of millions of pages processed monthly, and a clear enterprise motion make a compelling arc for a $24M Series A. The lesson: sequence your proof points (pain, precision, and production scale) so investors can see inevitability rather than potential.

If you’re building in document AI or adjacent data ingestion, study the tooling landscape (Anthropic, Scale AI, Stripe, Textract, Y Combinator) not as competitors but as ecosystem rails. Your goal is reliable transformation from unstructured inputs to structured outputs with measurable quality, strong governance, and smooth downstream integration.

I’ll leave you with a practical playbook I use with my teams: Listen for intense pull, not polite praise. Pivot when usage, not opinions, clusters around a painful workflow. Ship a narrow, decisive wedge that solves the full job end-to-end. Measure accuracy, speed, and reliability. Invest early in “transferable features” that travel across verticals: validation, audit trails, observability, and schema tooling. Treat sales as discovery. Quantify ROI, shorten time-to-value, and make evaluation deterministic. Scale with forward-deployed engineering until patterns stabilize. Then platformize. Grow revenue faster than headcount. Hire slow, raise the bar, and keep iteration loops tight.

If you want to explore more, start with Reducto (https://reducto.ai/) and connect with Adit on LinkedIn (https://www.linkedin.com/in/aditabraham/). Whether you’re chasing your first customer or your first Fortune 10 logo, the blueprint is the same: focus the wedge, prove precision, and move fast where it matters most.
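For readers wondering what “deterministic evaluation” can look like in a document-extraction pilot, here is a minimal Python sketch of one possible approach: score extracted fields against a hand-labeled gold record using exact match after light normalization, so the same inputs always yield the same score. This is my own illustration, not Reducto’s method; the field names, sample values, and the `normalize` rule are all hypothetical assumptions.

```python
def normalize(value: object) -> str:
    """Collapse whitespace and lowercase so trivial formatting differences don't count as errors."""
    return " ".join(str(value).split()).lower()


def field_accuracy(predicted: dict, expected: dict) -> float:
    """Share of expected fields whose predicted value matches exactly after normalization."""
    if not expected:
        raise ValueError("expected record has no fields to score")
    correct = sum(
        1
        for key, want in expected.items()
        if key in predicted and normalize(predicted[key]) == normalize(want)
    )
    return correct / len(expected)


# Hypothetical gold label and extraction output for one invoice.
gold = {"invoice_number": "INV-1042", "total": "1,250.00", "vendor": "Acme Corp"}
extracted = {"invoice_number": "inv-1042", "total": "1,250.00", "vendor": "Acme  Corporation"}

score = field_accuracy(extracted, gold)  # vendor mismatches, the other two match
print(f"field-level accuracy: {score:.1%}")
```

Exact match is deliberately strict. In a real pilot you would likely agree with the customer up front on which fields matter and what counts as equivalent, then report the score per field as well as overall.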

October 20, 2025
From Skeptic to $2B: The Hard-Won Product Playbook Behind Persona’s Platform

I’m drawn to product stories where skepticism sharpens strategy, and this one delivers. The arc from reluctant founder to a $2B valuation is more than a feel-good headline—it’s a masterclass in platform thinking, principled decision-making, and founder-led execution under pressure. As a product leader, I unpack the choices, inflection points, and habits that turned uncertainty into enduring advantage.

Here’s the snapshot: Rick Song is the co-founder and CEO of Persona, the identity verification platform used by some of the world’s largest companies. Before starting Persona, Rick worked on identity fraud and risk products at Square, which laid the groundwork for what would become Persona’s highly technical, horizontal platform. Since founding the company, Rick has scaled Persona into a category-defining leader, recently raising a $200M Series D at a $2B valuation.

What stands out most to me is how Rick’s skepticism shaped Persona’s early strategy. Rather than chasing hype, he pressure-tested assumptions, constrained scope, and let customer reality—not pitch-deck mythology—pull the roadmap forward. That mindset is foundational when you’re building a true platform company: it forces depth over decoration and compels teams to solve hard, horizontal problems that generalize beyond the first few customers.

The journey begins with “Life before Persona” and “The push from Charles,” followed by “Early reluctance and low expectations.” I’ve been in those rooms—where the idea is simultaneously promising and premature. In that moment, measured doubt is a feature, not a bug. It sharpens your discovery, clarifies hypotheses, and aligns the team around learning velocity rather than vanity milestones.

From there, the real work starts: “Winning the first $50 customer” and the discipline of “Invalidating” Persona. I love this framing; it’s the antidote to confirmation bias. Persona “found their edge” by embracing the unglamorous details of identity, investing in reliability, and resisting the urge to overfit to early signals. When you treat each small win as a constrained experiment, you naturally build antifragility into the product.

The hardest product transition came next: “Transitioning from MVP to platform.” That shift requires you to zoom out from features to primitives, from integrations to orchestration, from point solutions to reusable systems. One defining moment—“Turning down a $5K deal on principle”—shows how clear product tenets safeguard long-term leverage. Pair that with the discipline of “Generalizing bespoke solutions,” and you get a durable platform instead of a services treadmill.

As traction compounds, the focus turns to “Finding product-market fit,” “Founder-led sales and consultative approach,” and “Building a culture of reactivity.” Founder-led selling isn’t just about closing deals—it’s deep discovery at the frontier, where your customers’ edge cases become your platform’s next capabilities. That reactivity, when systematized, accelerates learning loops and helps land “the first enterprise customers” without compromising architectural integrity.

Finally, there’s a mindset layer: “Silicon Valley’s obsession with frameworks” can distract from the harder habit of “Developing first principles thinking.” I’m a fan of frameworks when they’re earned, not borrowed. The right balance is to “Stay competitor-informed” while remaining customer-anchored and principle-driven. In practice, that means monitoring the market, but letting your roadmap be pulled by real-world constraints and outcomes.

For those who want to explore the broader ecosystem referenced along the way, here are a few touchpoints that shaped the conversation and context: Accenture: accenture.com, Anthropic: https://www.anthropic.com/, Braze: https://www.braze.com/, Bridgewater Associates: https://www.bridgewater.com/, Charles Yeh: https://www.linkedin.com/in/charlesyeh/, Christie Kim: https://www.linkedin.com/in/christiekimck/, Clay: clay.com, Kareem Amin: https://www.linkedin.com/in/kareemamin/, MIT: mit.edu, Newfront: newfront.com, Palantir: https://www.palantir.com/, Persona: withpersona.com, Rippling: rippling.com, Scale AI: scale.com, Snowflake: https://www.snowflake.com/, Square: squareup.com, Y Combinator: ycombinator.com, Zachary Van Zant: https://www.linkedin.com/in/zacharyv/.

If you want to follow Rick directly, here’s where to find him: LinkedIn: https://www.linkedin.com/in/rick-song-25198b24/.

October 20, 2025
Inside Linear: How Craft, Focus, and Small Teams Build Category-Defining Products

I’ve long believed that craft and focus are the two most reliable levers in product management, and listening to Karri Saarinen articulate how those principles shaped Linear reaffirmed why they still win. Karri is the co-founder and CEO of Linear, the project management tool built for high-performance software teams. Founded in 2019, Linear reached a valuation of $1.25B as of June 10, 2025, and now counts companies like OpenAI, Ramp and Vercel as customers. Before founding Linear, Karri led design at Airbnb and Coinbase, and previously co-founded Kippt, a bookmarking tool acquired by Coinbase.

From the opening moments (1:37), his childhood love for computers and design felt familiar. Many of us who lead product today started by tinkering, not for a resume line, but out of curiosity. That early bias toward making things, paired with taste, often becomes the quiet engine behind strong product discovery and product-market fit lessons.

At (6:54), the story of founding Kippt and the lessons from a failed bookmarking startup reminded me how much scar tissue can accelerate good judgment. Failure clarified what really matters: build for a specific user, ship faster than your uncertainty, and let the market teach you. The thread continues at (13:14) with lessons from a serial entrepreneur: pattern recognition, not stubbornness, is what you carry forward.

The segment at (19:32) hits a nerve for anyone scaling: why teams shouldn’t grow too quickly. I’ve seen small, senior teams accomplish in weeks what larger groups struggle to deliver in quarters. Velocity isn’t headcount; it’s clarity, trust, and an obsession with quality. Smaller teams keep craft close to the metal and reduce coordination tax.

At (25:03), Linear’s early beginnings emphasize how tight validation loops shape a product. Early validation strategies used to shape the product weren’t about chasing breadth; they were about earning depth with a narrowly defined customer. It’s the purest form of product discovery, and it sets the foundation for everything that follows.

The conversation at (36:55) on the unexpected power of intuition resonated with how I coach teams: treat intuition as a hypothesis generator, then use data to reduce risk. It’s not intuition versus evidence; it’s intuition prioritized, evidence instrumented. That’s also how outcomes vs output OKRs stay honest: by measuring what matters without drowning product sense in dashboards.

Linear’s unusual approach to user growth (42:41) rejects growth theater in favor of signal-rich adoption. Rather than boil the ocean with generic funnels, they doubled down on the right users, in the right sequence. That ties directly to (57:30) and the power of extreme focus, and the reminder at (59:18) to design “something for someone.” It’s a crisp antidote to generic, over-configurable tools that try to be everything to everyone.

If you’re curious what shaped Linear’s early product roadmap (47:29), the answer is principled constraint: a maniacal focus on performance, reliability, and a workflow that feels frictionless to high-performance software teams. When the product coheres around a few non-negotiables, teams can ship faster and with higher quality.

The tension between flexibility vs. simplicity (1:04:29) shows up in every roadmap debate I’ve ever led. Flexibility sells demos; simplicity earns daily active usage. Picking simplicity early forces better defaults, clearer information architecture, and fewer surface areas where complexity can metastasize.

Principled leadership shows up again at (1:17:27): lead your team with strong principles. The best teams don’t need long memos to decide; they need clear tenets to align. Finally, (1:24:45) surfaces a nuanced distinction: design founders vs. engineering founders. The best outcomes emerge when design taste and engineering rigor compound, not compete, inside a product culture.

A few references that stood out and helped frame the context: Airbnb, Coinbase, Y Combinator, Brian Armstrong, Brian Chesky, Jori Lallo, and Tuomas Artman. The through-line is unmistakable: high-taste product builders who pair speed with standards.

If you want to jump to specific moments, here are the timecodes I found most actionable: (1:37), (6:54), (13:14), (19:32), (25:03), (36:55), (42:41), (47:29), (52:02), (57:30), (59:18), (1:04:29), (1:17:27), (1:24:45). Each one reinforces a simple truth: focus compounds.

If you’d like to follow Karri’s work directly, find him on LinkedIn: https://www.linkedin.com/in/karrisaarinen/ and Twitter/X: https://x.com/karrisaarinen.

My takeaway for product leaders: revisit your scope, trim the excess, and put a small, senior team on the sharpest problem in your roadmap. Start with a principled bet, instrument outcomes, and keep the bar for craft uncomfortably high. That’s how you build products that feel inevitable: because they’re intentionally, relentlessly focused.

October 20, 2025