Month: October 2025

How I Find—and Keep—Product-Market Fit: Lessons on Conviction, Distribution, and Mergers

Product-market fit isn’t a finish line; it’s a dynamic state that needs to be earned repeatedly. In my work leading product strategy, I’ve learned that the most resilient companies combine ruthless intellectual honesty with repeatable discovery habits, movement-first distribution, and a bias toward decisive action when markets shift under their feet.

One case study I return to often: Bob Moore is the co-founder and CEO at Crossbeam, a “LinkedIn for data” platform that helps companies find overlapping opportunities with their partners. Crossbeam has raised US$117M to date and acquired Reveal in 2024. Bob previously co-founded RJMetrics (now part of Adobe Commerce Cloud) and Stitch Data (acquired by Talend). He is also the author of Ecosystem-Led Growth. The arc of these companies offers a clear lens into finding founder-market fit, falling in and out of product-market fit, and rebuilding with conviction.

When I evaluate ideas, I start with founder-market fit and falsification. I look for a lived pain, an unusual insight, and unfair access—then I try to disprove my thesis fast. I’ll line up dozens of target users and adjacent stakeholders, pressure-test the problem, and map evidence to The 4 Levels of PMF. The goal isn’t to “win” early interviews; it’s to surface the constraint that will eventually break the model: data availability, switching costs, procurement friction, or a distribution bottleneck.

Market shifts can invalidate a great product overnight. The analytics stack reconfiguration around Amazon Redshift is a perfect reminder that timing, platform shifts, and ecosystem dependencies will bend your trajectory. I actively maintain a “watchlist” of platform moves (cloud data platforms, changes in ad networks, privacy policy shifts, AI infrastructure) and connect them to my product’s core assumptions. If a new platform absorbs the value we created, I’d rather be first to cannibalize our own roadmap than last to react.

On distribution, I engineer sharing, reciprocity, and compounding usage directly into the product. That means designing collaboration surfaces, data assets, or partner workflows that make every new customer a new channel. Crossbeam’s model highlights how overlap mapping and partner ecosystems can turn integration nodes into growth nodes—an ethos that aligns with Ecosystem-Led Growth. Internally, I complement this with proactive outbound motions and the “joint jam” sales tactic: co-creating a live, high-signal artifact with the prospect that proves value with their data, not my slides.
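
To make the overlap-mapping idea concrete, here is a minimal Python sketch. It is not Crossbeam’s actual method, which the source does not describe; it simply assumes that two partners each hold a list of account domains and that matching on a normalized domain is good enough for a first pass. The function names and the sample domains are hypothetical.

```python
from urllib.parse import urlparse


def normalize_domain(raw: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    text = raw.strip().lower()
    # urlparse only fills netloc when a scheme or '//' prefix is present.
    host = urlparse(text if "://" in text else f"//{text}").netloc
    return host[4:] if host.startswith("www.") else host


def account_overlap(ours: list[str], partners: list[str]) -> set[str]:
    """Return the normalized domains that appear in both account lists."""
    return {normalize_domain(a) for a in ours} & {normalize_domain(p) for p in partners}


# Hypothetical sample data: our customers and a partner's open prospects.
our_customers = ["https://www.acme.example", "globex.example", "initech.example"]
partner_prospects = ["ACME.example", "http://initech.example/pricing", "umbrella.example"]

shared = account_overlap(our_customers, partner_prospects)
print(sorted(shared))
```

Each shared account is a candidate for a warm introduction, which is the sense in which every new customer list becomes a new channel.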

Falling out of PMF is a feature of reality, not a failure of leadership—provided you move with clarity. The RJMetrics journey illustrates how you can find market fit, then lose it as the stack modernizes. My safeguard is a portfolio of leading indicators: retention by job-to-be-done, time-to-first-value, expansion drivers, sales-assist ratio, and the support “tax” on core workflows. When those turn, I default to intellectual honesty: narrow the ICP, rebuild the wedge, or sunset the thing that’s stealing oxygen from the core.
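
As a rough illustration of watching leading indicators “turn,” the sketch below compares two cohorts on just two of the signals named above: time-to-first-value and retention. The `Account` fields, the 90-day retention window, and the 10% tolerance are all assumptions chosen for the example, not thresholds from any real system.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import Optional


@dataclass
class Account:
    signed_up: date
    first_value: Optional[date]  # None if the account never reached first value
    active_after_90d: bool       # hypothetical retention definition


def time_to_first_value_days(accounts: list[Account]) -> Optional[float]:
    """Median days from signup to first value, ignoring accounts that never got there."""
    days = [(a.first_value - a.signed_up).days for a in accounts if a.first_value]
    return median(days) if days else None


def retention_rate(accounts: list[Account]) -> float:
    """Share of accounts still active after the retention window."""
    return sum(a.active_after_90d for a in accounts) / len(accounts)


def indicators_turning(prior: list[Account], recent: list[Account],
                       tolerance: float = 0.10) -> list[str]:
    """Name the indicators that worsened by more than `tolerance` (relative)."""
    flags = []
    if retention_rate(recent) < retention_rate(prior) * (1 - tolerance):
        flags.append("retention")
    old_ttfv, new_ttfv = time_to_first_value_days(prior), time_to_first_value_days(recent)
    if old_ttfv is not None and new_ttfv is not None and new_ttfv > old_ttfv * (1 + tolerance):
        flags.append("time_to_first_value")
    return flags


# Hypothetical cohorts.
prior = [
    Account(date(2025, 1, 6), date(2025, 1, 9), True),
    Account(date(2025, 1, 13), date(2025, 1, 15), True),
    Account(date(2025, 1, 20), date(2025, 1, 24), False),
]
recent = [
    Account(date(2025, 4, 7), date(2025, 4, 15), True),
    Account(date(2025, 4, 14), date(2025, 4, 20), False),
    Account(date(2025, 4, 21), None, False),
]
print(indicators_turning(prior, recent))
```

In practice I would segment these by job-to-be-done before comparing cohorts, since a blended number can hide the segment that is actually slipping.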

Building with conviction versus consensus is a critical cultural muscle. Consensus can smooth relationships, but it often averages out the insight. I anchor decisions in clear principles, write tight pre-mortems, and set owner-driven DRIs. We invite dissent early (red-team reviews, structured decision docs), then “disagree and commit” with a time-boxed checkpoint tied to specific, falsifiable milestones. This lets us move fast without romanticizing our own ideas.

Creating scalable and durable startups requires architecture, not just ambition. I push for composability across data models, feature flags for safe exploration, and an experimentation fabric that lets us test distribution hypotheses at low cost. We sequence multi-product bets only when we see strong, repeated pull from the market—ideally where network effects are latent. Unlocking network effects in software isn’t magic; it’s the disciplined design of interactions where each participant makes the system more valuable for the next.

Mergers are another lever for durability when executed with rigor. The Crossbeam/Reveal merger is a timely example of using consolidation to reduce fragmentation, standardize workflows, and accelerate network effects. Getting mergers right starts with strategic fit and cultural compatibility, but the real game is integration: aligning product architectures, pricing, packaging, and go-to-market motion within a 100-day plan that customers can feel in the product, not just read in a press release.

If you’re pressure-testing your own path to product-market fit, here’s what I’ve found most reliable: obsess over founder-market fit first, use The 4 Levels of PMF to calibrate evidence, design distribution into the product from day one, watch platform shifts like a hawk, and choose conviction over consensus—with mechanisms that keep you honest. Do that consistently, and you won’t just find PMF—you’ll keep it.

October 20, 2025
Mastering Altitude Shifts: Hard‑Won Product Leadership Lessons from Anneka Gupta’s Journey

I’m endlessly fascinated by leaders who can operate at every altitude—zooming out on strategy one minute and diving into the weeds the next. That’s why Anneka Gupta’s journey resonated so strongly with me, because it crystallizes how multi-disciplinary leadership accelerates product outcomes and go-to-market execution.

Anneka Gupta is the Chief Product Officer at Rubrik, a cloud management and data security company with a US$6B market cap. Before Rubrik, Anneka spent 11 years leading various teams at LiveRamp, including product, go-to-market, and operations.

One proof point that leapt off the page for me: LiveRamp went from $30M to $200M ARR in 3 years. That kind of growth rarely comes from product alone—it’s the compounding effect of crisp customer segmentation, tight GTM alignment, and a culture that prioritizes outcomes over output. In my own teams, anchoring OKRs to business outcomes rather than feature counts has been the most reliable way to unlock this momentum.
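
For a sense of scale, the implied annual growth rate can be worked out with the standard compound-growth formula. This small Python sketch assumes three full compounding periods between the two ARR figures; the function name is my own.

```python
def implied_annual_growth(start_arr: float, end_arr: float, years: float) -> float:
    """Compound annual growth rate implied by two ARR points."""
    return (end_arr / start_arr) ** (1 / years) - 1


rate = implied_annual_growth(30e6, 200e6, 3)
print(f"{rate:.1%}")  # about 88% per year
```

Under that assumption, the business nearly doubled every year for three years running.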

What I admire most is Anneka’s jack-of-all-trades career. Rotating through product, operations, and GTM builds a powerful intuition for how systems interact. I’ve seen the same benefit at scale: PMs who have shipped, sold, and supported the product make sharper tradeoffs because they integrate customer value, revenue mechanics, and operational feasibility in real time.

There’s a counterintuitive hiring lesson here too—why specialist hires can backfire. When the product or market is still evolving, over-optimized specialists often struggle without mature processes and stable interfaces. Early on, I bias toward adaptable builders who can define the playbook, not just run it. Specialists shine once the motion is proven and repeatable.

Altitude control matters. Knowing when leaders should get in the weeds is a differentiator. I’ve found three triggers: existential risk (security, reliability, or reputation), pivotal zero-to-one bets, and repeated cross-functional misalignment. Step in, diagnose at the system level, model the behavior you expect, and then step back out quickly so the team retains ownership.

There’s also one area every PM can improve in: customer-facing fluency. I agree with the principle that PMs should undergo the same training as sales reps. Shadow discovery calls, rehearse objection handling, and learn to speak to value drivers by persona. When PMs can authentically sell the problem and the solution narrative, product discovery gets faster and win rates improve.

Crafting products for different personas is another thread I pull on constantly. Buyers care about ROI, risk, and roadmap; users care about speed, clarity, and control. Great product discovery bridges the two by validating problem severity and adoption friction in parallel. That’s how you avoid building “the best product” that still loses because the buying committee can’t align on value.

I’m also struck by how deftly LiveRamp navigated enterprise shifts like transitioning Acxiom’s customers to LiveRamp and the broader dynamics of why Acxiom chose to buy rather than build. These moves demand rigorous change management: backwards compatibility, data governance guarantees, and clear migration value propositions. When the incentives align for customers and field teams, integrations become accelerants rather than a tax.

Rubrik’s approach to building product underscores the same fundamentals: focus on critical customer outcomes, connect roadmap to go-to-market reality, and measure what matters. In practice, that means linking product bets to explicit revenue or retention hypotheses and setting guardrails so teams can run fast without creating long-term complexity debt.

I also appreciate the humility in reflecting on mistakes and the outsized impact of mentors and peers. The best leaders I’ve worked with narrate their decision-making—what they knew, what they assumed, and what they’d do differently—which compounds organizational learning. It’s the difference between isolated wins and a repeatable operating system.

If I distill my own playbook from these themes, it looks like this: hire for adaptability early, specialize later; anchor to outcomes vs output to avoid local maxima; keep PMs close to the sales and support edges of the system; and practice altitude shifting as a daily discipline. The result is a product organization that learns faster than the market changes—arguably the only durable advantage.

October 20, 2025
From Prototype to the Pentagon: My Playbook for Winning DoD Customers and Mission Fit

I’ve spent years building dual-use products and partnering with teams navigating the Department of Defense. In this piece, I share how I move from prototype to program of record by aligning product strategy to real mission outcomes, building trust with end users, and translating commercial product rigor into the national security context.

Commercial versus military market strategies require fundamentally different assumptions. In the private sector, we obsess over product-market fit and velocity; in defense, we obsess over “mission solution fit” and survivability in procurement. The buyer is a complex web of operators, program managers, contracting officers, and Program Executive Offices (PEOs), and each needs a clear value story tied to mission impact, not just features or ARR.

When I validate ideas for defense products, I start with deep discovery at the edge: talking with operators, understanding tactics, techniques, and procedures (TTPs), and quantifying what “better” looks like in their environment. The “Mission Model Canvas” helps me capture stakeholders, beneficiaries, and constraints that don’t exist in a typical SaaS motion. “Hacking for Defense” has been invaluable for structuring this discovery and ensuring we test assumptions against mission reality, not just market appetite.

A practical guide to military sales and procurement starts by mapping decision pathways. I identify the PEOs, the Program Managers, and the acquisition timelines that govern transition. I treat each step like an enterprise sale with additional layers: requirements, testing, accreditation, and budgeting. I align demonstrations to mission milestones, ensure my roadmap accounts for integration and accreditation lead times, and keep decision-makers looped with concise, evidence-based updates.

Rethinking go-to-market strategy for defense means planning for longer cycles and multi-level consensus. Instead of a simple funnel, I build a coalition: an operator champion for pilots, a program sponsor for funding continuity, and a contracting route that matches how the customer buys. The goal is to de-risk adoption across technical, operational, and procurement dimensions in parallel.

Building a network in national security is a full-contact sport. I invest time in the field, put forward-deployed engineers next to users, and show up at the training ranges and labs where problems are real. Trust accumulates when teams see you adapt quickly, respect constraints, and demonstrate an understanding of mission risk. That trust turns into access and, ultimately, into pull from the organization.

The dual-use debate isn’t binary; it’s a portfolio decision. I’ve seen teams succeed by leading with a defense wedge when the problem is uniquely military, and others start commercially to prove traction then tailor for defense. The key is to avoid whiplash: design your architecture and compliance posture so you can serve both without fragmenting your roadmap.

Behind the rise of a new generation of “defense founders” is a shift in ambition and capability. Teams are mission-driven, technically sophisticated, and comfortable operating in complex stakeholder environments. They’re building for hard problems and measuring success in operational outcomes, not just revenue milestones.

“Mission solution fit” is my north star. I define it as measurable mission improvement with acceptable changes to TTPs, training, and integration. I seek evidence that units can and will use the solution under realistic constraints, that it interoperates with existing systems, and that program leadership can fund it at scale. When those signals align, transition becomes possible.

Breaking new ground in military tech often means navigating institutional friction. The “Frozen Middle” is real: layers that resist change even when leadership and operators are aligned. I plan for this by prototyping where adoption barriers are lowest, securing a senior sponsor, and demonstrating cost, schedule, and performance wins that the middle cannot ignore.

The hidden challenges most startups miss tend to be non-technical. Security and accreditation aren’t documentation exercises; they’re product constraints that should shape architecture early. Interoperability isn’t a feature; it’s table stakes. And your ability to explain “why now” in the language of budget cycles can matter as much as a benchmark.

Essential resources for any defense founder include “Hacking for Defense,” “The Hacking for Defense Manual,” the directory in “How to find your customer in the Dept of Defense,” the “Mission Model Canvas,” and lessons from “The lean launchpad at Stanford” and “The Secret History of Silicon Valley.” I also draw on the work of Alexander Osterwalder and Eric Ries to bridge discovery, iteration, and disciplined scale.

What’s missing from Silicon Valley in this domain is patience paired with rigor. The best teams combine world-class product discovery with respect for acquisition realities. They instrument outcomes in the field, align roadmaps to funding gates, and bring forward-deployed engineers to close the gap between prototype and operational capability.

From prototype to the Pentagon is a repeatable path when we hold ourselves to mission outcomes, build coalitions across the acquisition chain, and design for constraints from day one. If you’re committed to national security, build with empathy for the operator, clarity for the buyer, and a roadmap that survives contact with procurement.

October 20, 2025
Inside Figma’s Product Playbook: Taste, Simplicity, and Storytelling for Extraordinary PMs

I’ve long believed the best products come from a careful blend of taste, simplicity, and storytelling. Studying how Figma operationalizes these principles has sharpened my own playbook for building, launching, and scaling products. In this piece, I distill the patterns I use and teach: how to approach new products, how to prioritize without losing the plot, and how to use narrative as a force multiplier for teams and customers.

At a high level, here’s the arc I focus on: approaching new products with a strong point of view, shaping product culture that balances craft with outcomes, understanding when to change course, tying business goals to product expansion, going multi‑product deliberately, recognizing the differences between “0 to 1” and “1 to 10” talent, and elevating storytelling from launch polish to a core build-time practice. Along the way, I’ll highlight why taste and simplicity aren’t luxuries—they’re strategy.

When I explore how to build from zero, I start with a crisp customer promise and a single, testable magic moment. The early days demand ruthless focus: one job-to-be-done, one path to value, one reason to share. As teams expand scope, the risk is layering utility without coherence. The countermeasure is systematic simplicity—every addition must make the core value faster, clearer, or more extensible. If it doesn’t, it’s noise.

Product culture is the scaffolding that makes this discipline stick. Speed and operational excellence drive the right kind of urgency; experimentation at scale validates hypotheses without cargo-culting metrics; and rigor in reviews ensures we’re prioritizing outcomes over output. The best cultures pair evidence with taste—data guides, but the bar for quality, narrative, and craft is set by humans with conviction.

Knowing when to change things is both an art and a system. I look for signal in stubborn user friction, plateauing activation, a long tail of workarounds, and moments when a new platform or workflow unlocks 10x value. The framework I use: if a change can simplify the path to the promise, or unlock a whole new class of users without diluting the core, it deserves energy. Change the defaults before changing the philosophy.

Business goals should sharpen, not overshadow, product expansion. Before adding surfaces or SKUs, I insist on clarity around the ICP, the premium moment worthy of pricing, the extensibility story for developers, and the narrative that unifies everything. Multi‑product strategy works best when each product is a chapter in the same book, not a pile of features. That’s why I appreciate how the ecosystem comes together across FigJam: https://www.figma.com/figjam/, Figma: https://www.figma.com/, Figma Dev Mode: https://www.figma.com/dev-mode/, and Figma Slides: https://www.figma.com/slides/. These are distinct entry points with a shared language and compounding value.

For “0 to 1” product work, I hire for curiosity, taste, and velocity. I want builders who can reduce ambiguity quickly, prototype with whatever tools are at hand, and tell a clear story about why their version of the problem matters. My favorite interview signal is a non-obvious customer insight that changed their roadmap. Entrepreneurial talent shows up in the questions they ask about distribution, pricing, and adoption—not just the feature.

I’m often asked why there aren’t more designer founders. My take: the gap is less about capability and more about exposure to distribution, pricing, and finance. Practical fixes help—give design leaders P&L ownership, put them on customer calls that include procurement, and pair them with GTM partners early. When designers are fluent in business mechanics, their advantage in taste and narrative becomes a superpower.

New product launches work best when the story is built in from day one. I like to “slow-cook” with tight, cross-functional squads, private betas with power users, and an explicit before/after narrative that connects the dots across product, docs, community, and developer ecosystem. As teams scale, I match talent to stage: “0 to 1” thrives in uncertainty; “1 to 10” excels at repeatability, quality, and operational excellence. Both are essential; mixing them at the wrong time creates drag.

Storytelling is not veneer—it’s how we align teams, earn stakeholder trust, and help users see themselves in the product. I anchor roadmaps to a one-sentence promise, show the painful “before,” demonstrate the “after,” and name the magic mechanic that makes it possible. Then I translate that story into prioritization. I stack-rank by value, confidence, and cost, and I’m explicit about what we won’t do. Strategy is as much the boundary as the plan.
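
Here is one way the stack-ranking step could look in Python. The scoring formula (value times confidence, divided by cost) is an assumption on my part, one common way to combine the three inputs rather than the only one, and the bets, scales, and capacity figure are invented for illustration. The second list it returns is the explicit “won’t do” boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bet:
    name: str
    value: float       # expected impact on the promise, 1-10 (hypothetical scale)
    confidence: float  # 0.0-1.0, strength of the evidence
    cost: float        # rough effort in person-weeks

    @property
    def score(self) -> float:
        # Assumed formula: reward value and evidence, penalize effort.
        return self.value * self.confidence / self.cost


def stack_rank(bets: list[Bet], capacity: float) -> tuple[list[Bet], list[Bet]]:
    """Rank bets by score, fund them in order until capacity runs out."""
    doing, not_doing, remaining = [], [], capacity
    for bet in sorted(bets, key=lambda b: b.score, reverse=True):
        if bet.cost <= remaining:
            doing.append(bet)
            remaining -= bet.cost
        else:
            not_doing.append(bet)
    return doing, not_doing


# Hypothetical backlog and an 8 person-week budget.
backlog = [
    Bet("Faster first-run import", value=8, confidence=0.8, cost=4),
    Bet("Template gallery", value=6, confidence=0.5, cost=6),
    Bet("Admin audit log", value=5, confidence=0.9, cost=3),
    Bet("Offline mode", value=9, confidence=0.3, cost=10),
]
doing, not_doing = stack_rank(backlog, capacity=8)
print("Doing:", [b.name for b in doing])
print("Not doing:", [b.name for b in not_doing])
```

The numbers matter less than the habit: writing the “not doing” list down forces the conversation about boundaries that a ranked backlog alone tends to postpone.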

If you’re refining your product storytelling, a quick checklist helps: articulate the promise in plain language, show rather than tell with a demo that lands the magic moment in 30 seconds, connect to measurable outcomes, and make the first-run experience feel like the narrative come to life. Don’t bury the lead. If a user can’t explain your product to a teammate after one minute, the story isn’t ready.

The difference between “good” and “extraordinary” product managers is simple to say and hard to do. Good PMs coordinate and ship on time. Extraordinary PMs set a higher bar for taste, simplify relentlessly, and move teams from consensus to conviction. They connect craft to outcomes, use narrative to create momentum, and make decisions that age well because the logic is legible.

Simplicity is a growth strategy. It shortens time-to-value, reduces error surface, and raises retention by making products feel learnable and trustworthy. Tactics I lean on: one hard thing at a time, remove to improve, defaults are design, and compress choices until the right path is the easy path. Simplicity isn’t less—it’s the right less.

Taste, in product and design, is not innate; it’s a practiced sensitivity to what feels inevitable. I cultivate it by collecting exemplars, writing and revisiting product principles, insisting on weekly critiques, and sweating the narrative as much as the pixel. The best teams hold two truths: quality you can feel and outcomes you can measure.

If you want to explore the ecosystem I referenced, here are direct links: FigJam: https://www.figma.com/figjam/, Figma: https://www.figma.com/, Figma Dev Mode: https://www.figma.com/dev-mode/, Figma Slides: https://www.figma.com/slides/.

Whether you’re building your first product or scaling a platform, the throughline remains: lead with taste, ship with simplicity, and align everyone with a story worth rallying around. That combination turns good teams into extraordinary ones—and products into movements.

October 20, 2025
Inside dbt Labs’ $4.2B ascent: category creation, open source, and monetization playbook

As a VP of Product Management, I’m fascinated by the rare mix of strategy, timing, and execution that turns a great idea into a durable category. The arc of dbt Labs is one of those definitive product stories: a cloud-based data management platform that has raised over $400M to date and was last valued at $4.2B in 2022.

What stands out to me first is the scale and velocity. The company has grown from just three companies using its free tool in 2016 to an ecosystem of 30,000+ enterprise users. That journey captures the essence of category creation done right: lead with an opinionated product, cultivate a community around clear practices, and sequence monetization only after adoption becomes self-sustaining.

When I look at dbt’s explosive growth, I see a masterclass in product management leadership. The team focused on a precise, under-served problem in modern data workflows and built a tooling philosophy that aligned with how analysts and engineers actually work. That alignment turned a utility into a movement.

The strategic pivot from consulting to a software company is a decision I’ve navigated myself, and it’s often misunderstood. Consulting’s hidden scalability and consultancy superpowers aren’t about headcount; they’re about tight customer feedback loops, paid discovery, and rapid learning cycles that directly shape product decisions. In this case, consulting engagements shaped the roadmap and helped validate the eventual product thesis with a clarity that pure software bets rarely achieve.

Category creation is rarely a straight line. The team deployed unexpected strategies for building a tech category from scratch, most notably the anti-demo strategy. Rather than an overproduced wow moment, they optimized for real-life proof and repeatable value in the hands of practitioners. That put credibility ahead of theatrics.

Community was the flywheel. The Slack group that changed everything wasn’t just a channel; it was a living spec for the product and the practices around it. Pair that with the open source philosophy and you have a compounding effect: trust, transparency, and contribution. When growth went exponential, it was because the community could see, shape, and advocate for the standard.

Finding dbt Labs’ first customers mattered less than building a motion they could evangelize. The way consulting engagements shaped the roadmap is a reminder that early revenue can be a learning instrument. Done well, it tightens product discovery and derisks foundational bets.

Funding is another decision point I pay close attention to. The critical moment, when dbt Labs sought venture funding, came only after the system’s constraints were obvious. Fundraising only when “things started to break” signals operational discipline: capital as a force multiplier, not a crutch.

On the commercial side, the sequencing was thoughtful. Driving commercial adoption after open-sourcing is all about value layering: permissions, governance, collaboration, and scale, the capabilities that enterprises will happily pay for. That dovetails into the key monetization strategies and the eventual pivot from consulting to software, a move that codifies services learnings into scalable product value.

There are also powerful founder operating principles here. Becoming an “accidental founder” resonates with many of us who start by solving a concrete problem and wake up running a company. The idea that “begrudging” CEOs can be successful underscores that obsession with the customer often beats a desire to be a CEO. The advice for finding PMF, “It’s not a playbook,” reflects the truth I’ve seen across teams: seek signals, not templates. “Lowering your standards is a hack” is a counterintuitive push toward shipping, learning, and iterating. Navigating emotional overwhelm and the principle that every CEO needs a coach are signals of mature leadership: build inner capacity as deliberately as product capacity. Two things every founder CEO should do: set the cadence and protect the standards.

If you want a quick guide to the narrative arc and key lessons, here’s how I map it to the journey: (00:00) Introduction; (02:56) The critical oversight in data analysis; (05:41) Becoming an “accidental founder”; (07:04) Inside the unique decision to start a consultancy; (08:17) The game-changing principle behind dbt Labs’ rapid growth; (11:20) Finding dbt Labs’ first customers; (15:52) Consulting’s hidden scalability; (17:25) How dbt Labs created a new category; (21:03) The anti-demo strategy; (23:59) Community hacking: the Slack group that changed everything; (26:00) The open source philosophy; (27:39) When growth went exponential; (28:49) How consulting engagements shaped the roadmap; (30:02) Fundraising only when “things started to break”; (32:40) Consultancy superpowers: the hidden advantages; (34:04) Pivoting from consulting to software; (40:00) Key monetization strategies; (48:56) Why “begrudging” CEOs can be successful; (51:02) Advice for finding PMF: “It’s not a playbook”; (51:59) Lowering your standards is a hack; (53:30) Navigating emotional overwhelm; (54:25) Every CEO needs a coach.

Referenced:
Amazon Redshift: https://aws.amazon.com/redshift/
Bob Moore: https://www.linkedin.com/in/robertjmoore/
Crossbeam: https://www.crossbeam.com/
dbt Labs: https://www.getdbt.com/
Drew Banin: https://www.linkedin.com/in/drewbanin/
Jerry Colonna: https://www.reboot.io/team/jerry-colonna/
RJMetrics: https://en.wikipedia.org/wiki/RJMetrics
SeatGeek: https://seatgeek.com/
Steve Ritter: https://www.linkedin.com/in/steve-ritter-69495210/
Squarespace: https://www.squarespace.com/

Where to find Tristan:
LinkedIn: https://www.linkedin.com/in/tristanhandy/
Twitter/X: https://x.com/jthandy

October 20, 2025
Inside Clay’s $1.25B Playbook: Unconventional GTM, Pricing Strategy, and Enterprise Wins

Clay’s path to a $1.25B valuation isn’t conventional, and that’s exactly why it’s instructive. Through the lens of product management and go-to-market strategy, I break down how unconventional tactics, rigorous pricing decisions, and a long game on brand combined to create real upmarket momentum. If you lead product, growth, or revenue, there’s a repeatable playbook here for blending product-led growth with enterprise sales without losing speed or signal.

Varun Anand is the co-founder and Head of Operations at Clay, a GTM development environment that combines data and AI to help over 5,000 companies power everything from CRM enrichment to highly targeted outreach campaigns. Clay recently announced their Series B expansion, raising $40M at a $1.25B valuation. Before Clay, Varun was the Director of Operations at Newfront and the Head of Expansion at Candid. Varun also spent four years working on Hillary Clinton’s presidential campaign.

Turning traditional GTM on its head, Clay’s earliest traction didn’t come from glossy campaigns; it came from scrappy sales tactics: “WhatsApp groups, Reddit threads, and reverse demos.” I’ve seen this play repeatedly outperform paid channels early because it compounds social proof in the exact communities where power users congregate. When your ICP hangs out in niche threads, customer acquisition is a function of credibility, not CPM.

On pricing, “credit-based pricing” was a pivotal decision. Equally important, the team “rejected the usage-based model.” For PLG plus enterprise, this matters: credits make value legible to buyers, reduce billing anxiety for ops and finance teams, and align with predictable, budgeted workflows. In my experience, credit models also create clearer upgrade paths when your product spans multiple use cases.

Clay built a robust self-serve engine and then layered “enterprise customers on top of PLG.” This sequencing avoids the trap of hiring an enterprise team before the product is self-serve-proven. It also creates cleaner handoffs: self-serve for discovery and activation, sales for proof, procurement, and expansion.

Content and brand weren’t afterthoughts. Clay made a “big bet on content” and “invested in brand from day-one.” That’s a contrarian move many teams delay, but content accelerates learning loops, reduces sales cycle time, and scales enablement far beyond headcount. In enterprise sales, a trusted brand is an asset class.

Winning big accounts required creative proofs of value. “Reverse demos” flipped the script: show the customer’s data, in their workflow, with their outcomes. It’s one of the fastest routes to de-risking adoption and building trust with enterprise buyers. From there, they applied a pragmatic “land and expand model” that aligns with how large organizations actually buy.

Clay highlights “3 changes that unlocked Clay’s upmarket motion.” While every company’s inflection points are unique, the meta-lesson is consistent: clarify the ICP, operationalize proof (reverse demos, ROI), and meet enterprise expectations on reliability, governance, and support, without sacrificing the PLG engine.

Team construction was equally intentional. Hiring people who are “technical enough” and using a “hands-on interviewing process” raised the talent bar and reduced execution drag. I’ve found this mirrors the strength of forward-deployed mindsets: product, ops, and GTM talent who can prototype, troubleshoot, and translate customer complexity into scalable systems.

Finally, Clay’s contrarian take on compensation signals a willingness to design incentives for the business they want to build, not the one the market expects. Compensation philosophies quietly shape culture, velocity, and who opts in.

Referenced:
Anthropic: https://www.anthropic.com/
Clay: https://www.clay.com/
Clay’s Series B expansion: https://www.clay.com/blog/series-b-expansion
Eric Nowoslawski: https://www.linkedin.com/in/outboundphd/
Figma: https://www.figma.com/
Jesse Ouellette: https://www.linkedin.com/in/jesseoue/
Kareem Amin: https://www.linkedin.com/in/kareemamin/
Nick Merrill: https://www.linkedin.com/in/nick-merrill-64562310/
Notion: https://www.notion.com/
Oyster: https://www.oysterhr.com/
Pave: https://www.pave.com/
Rippling: https://www.rippling.com/
Snowflake: https://www.snowflake.com/
Verkada: https://www.verkada.com/
Webflow: https://webflow.com/
Yash Tekriwal: https://www.linkedin.com/in/yashtekriwal/

My takeaway: this is a modern GTM blueprint: prove value in the wild, price for clarity, build self-serve first, then industrialize trust for enterprise. Do that, and you can scale without losing the product signals that got you traction in the first place.
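
To illustrate why credits can feel more predictable to a buyer than a metered bill, here is a small Python sketch of a credit ledger. It reflects my own reading of the credit-based pricing discussed above, not Clay’s actual pricing or implementation: the action names, the per-action credit costs, and the plan size are all hypothetical.

```python
class InsufficientCredits(Exception):
    """Raised when an action would exceed the prepaid credit pool."""


# Hypothetical credit costs per action.
CREDIT_COSTS = {"enrich_company": 1, "find_email": 2, "ai_research": 5}


class CreditLedger:
    def __init__(self, plan_credits: int):
        self.balance = plan_credits          # fixed pool bought up front
        self.usage: dict[str, int] = {}      # credits spent per action type

    def spend(self, action: str, count: int = 1) -> int:
        """Deduct credits for `count` runs of `action`; return the new balance."""
        cost = CREDIT_COSTS[action] * count
        if cost > self.balance:
            # The buyer hits a visible ceiling instead of a surprise invoice.
            raise InsufficientCredits(f"{action} x{count} needs {cost}, have {self.balance}")
        self.balance -= cost
        self.usage[action] = self.usage.get(action, 0) + cost
        return self.balance


ledger = CreditLedger(plan_credits=1000)
ledger.spend("enrich_company", 300)   # balance is now 700
ledger.spend("find_email", 100)       # balance is now 500
try:
    ledger.spend("ai_research", 120)  # would need 600 credits
except InsufficientCredits as exc:
    print("Upgrade prompt:", exc)
```

The ceiling is the design point: running out of credits becomes a natural upgrade conversation, and the per-action usage breakdown shows which use case is driving it.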

October 20, 2025
Just Now Possible Preview: How Real Teams Ship AI—Workflows, RAG, Agents, Evaluation

I’m excited to share a preview of Just Now Possible, a show where I sit down with the builders who are shipping meaningful AI features in the real world. My goal is simple: pull back the curtain on how AI products actually get made—messy problems, rapid prototyping, and the leadership decisions that move teams from concept to customer value.

Watch the preview on YouTube: https://www.youtube.com/embed/Kb2HbuPbfR8?feature=oembed. Prefer audio? Listen on Spotify: https://open.spotify.com/episode/5xM0pDnqR0JpKmW6aZ0pj6?ref=producttalk.org or Apple Podcasts: https://podcasts.apple.com/us/podcast/podcast-preview/id1838832993?i=1000725807029&ref=producttalk.org.

How AI products come to life—straight from the builders themselves. In each episode, we dive deep into how teams spotted a customer problem, experimented with AI, prototyped solutions, and shipped real features. We dig into everything from workflows and agents to RAG and evaluation strategies, and explore how their products keep evolving. If you’re building with AI, these are the stories for you.

From my own experience leading product teams, I’ve seen that the real unlocks come from disciplined product discovery, OKRs that target outcomes rather than output, and smart use of generative AI for product prototyping. We’ll talk about the tradeoffs between speed and safety, when to bring in forward-deployed engineers, and how to validate product-market fit before scaling. Along the way, we’ll unpack practical patterns, like when to use RAG vs fine-tuning, how to evaluate agents in production workflows, and what great product management leadership looks like in AI-first environments.

The first full episode drops on Thursday, September 18th. Don't miss it!

Full transcripts are available to paid subscribers.

Inspired by this post on Product Talk.

October 20, 2025
Building AI Products That Work: My Playbook for LLM Strategy, Evals, and Orchestration

AI features don’t succeed on clever prompts alone; they demand thoughtful product strategy, rigorous evaluation, and tight cross-functional collaboration. As a VP of Product Management and someone deeply immersed in building with large language model (LLM) technology, I’m constantly refining how we turn generative capabilities into real customer value. This episode of All Things Product zeroes in on that challenge, and it captures many of the principles I rely on when shipping AI to production.

The central question resonates with every product leader I know: How do product teams learn to build AI-powered products “beyond just dabbling with ChatGPT”? I appreciate how the conversation moves past novelty and into the disciplines that make AI reliable, safe, and outcome-oriented.

One metaphor that always lands for me: building AI features is less like writing a single “killer prompt” and more like orchestrating a team of “interns.” You define roles, break down work, set guardrails, and continuously review outputs. That orchestration mindset, coupled with strong observability, evals, and ongoing maintenance practices, is what separates flashy demos from repeatable product value.

Here’s how I frame the work. First, there’s a difference between an AI-powered product manager and an AI product manager. Many of us are becoming AI-powered—using tools to accelerate discovery, ideation, or execution. But when you own AI features end-to-end, you inherit new responsibilities: modeling risks, defining evaluation strategies for non-deterministic systems, and treating prompts and data pipelines as core product surfaces.

Prompt engineering for a product is fundamentally different from prompting ChatGPT for personal use. In production, I rely on prompt decomposition and orchestration—explicitly breaking a task into steps, assigning each step to the right capability, and enforcing consistent formats. This reduces variance, improves debuggability, and enables targeted evals that catch regressions before customers do.
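
To make the decomposition idea concrete, here is a minimal Python sketch. Everything named in it is hypothetical and chosen only for illustration: the stub `fake_model`, the two step prompts, and the JSON keys `intent` and `reply`. A real system would swap the stub for an actual model client. The point is the shape: each step is narrow, and each step's output contract is enforced before the next step runs.

```python
import json
from typing import Callable

# Hypothetical stand-in for a real model client: prompt string in, reply string out.
ModelFn = Callable[[str], str]


def fake_model(prompt: str) -> str:
    """Deterministic stub so the sketch runs without any API access."""
    if prompt.startswith("CLASSIFY"):
        return json.dumps({"intent": "refund_request"})
    if prompt.startswith("DRAFT"):
        return json.dumps({"reply": "We have started your refund."})
    return "not json"


def run_step(model: ModelFn, prompt: str, required_keys: set[str]) -> dict:
    """Run one narrow step and enforce its output contract (a JSON object with known keys)."""
    raw = model(prompt)
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"step returned non-JSON output: {raw!r}") from exc
    if not isinstance(parsed, dict):
        raise ValueError(f"step returned JSON that is not an object: {raw!r}")
    missing = required_keys - set(parsed)
    if missing:
        raise ValueError(f"step output is missing keys: {sorted(missing)}")
    return parsed


def answer_ticket(model: ModelFn, ticket: str) -> dict:
    """Two-step pipeline: classify first, then draft using the classified intent."""
    intent = run_step(model, f"CLASSIFY the intent of: {ticket}", {"intent"})
    draft = run_step(
        model, f"DRAFT a reply for intent={intent['intent']}: {ticket}", {"reply"}
    )
    return {"intent": intent["intent"], "reply": draft["reply"]}


result = answer_ticket(fake_model, "I want my money back")
print(result)
```

Because each step fails loudly on a malformed output, a regression shows up at the step that caused it rather than as a vague quality drop at the end of the chain.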

System design and risk mitigation become front and center. I align early with engineering, legal, security, and support on failure modes, privacy expectations (including personally identifiable information, or PII), and rollout plans. We log traces for every critical path, treat prompts as versioned assets, and use observability to connect inputs, intermediate states, and outputs. When something drifts, we need to see it fast, explain it, and fix it.
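
One lightweight way to treat prompts as versioned assets is to derive the version from the prompt text itself and attach it to every trace. The sketch below assumes a hash-based version id and a flat trace record; the field names (`trace_id`, `prompt_version`, `intermediate`) are illustrative choices of mine, not a standard schema, and any PII would need to be redacted before a record like this is stored.

```python
import hashlib
import json
import time
import uuid


def prompt_version(template: str) -> str:
    """Derive a short, stable version id from the prompt text itself."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]


def make_trace(template: str, inputs: dict, intermediate: dict, output: str) -> dict:
    """Build one trace record linking inputs, intermediate state, and output.

    Assumption: inputs have already been redacted of PII by the caller.
    """
    return {
        "trace_id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "prompt_version": prompt_version(template),
        "inputs": inputs,
        "intermediate": intermediate,
        "output": output,
    }


TEMPLATE = "Summarize the ticket in one sentence: {ticket}"

trace = make_trace(
    TEMPLATE,
    inputs={"ticket": "Login fails on mobile"},
    intermediate={"retrieved_docs": 2},
    output="User cannot log in on mobile.",
)
print(json.dumps(trace, indent=2))
```

Any edit to the template changes the version id, so a drift in behavior can be matched to the exact prompt that was live when the trace was written.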

Evaluating non-deterministic AI features is its own craft. “Thumbs up/thumbs down” isn’t enough. I design layered evals: unit-level checks for correctness and formatting, scenario-level evals for edge cases and risk behaviors, and longitudinal evals to monitor model and data drift over time. Clear acceptance thresholds and shadow deployments help us balance velocity with reliability.
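
As a rough illustration of layered checks with acceptance thresholds, the sketch below gates a batch of outputs on per-layer pass rates. The layer names, the checks, and the thresholds are all made up for the example, and it covers only unit-level and scenario-style checks; longitudinal drift monitoring would need stored history on top of this.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalLayer:
    name: str
    check: Callable[[str], bool]  # returns True when a single output passes
    threshold: float              # minimum pass rate required to ship


def pass_rate(check: Callable[[str], bool], outputs: list[str]) -> float:
    """Fraction of outputs that pass a check."""
    if not outputs:
        raise ValueError("need at least one output to evaluate")
    return sum(1 for o in outputs if check(o)) / len(outputs)


def gate(layers: list[EvalLayer], outputs: list[str]) -> dict[str, bool]:
    """For each layer, report whether its pass rate meets the acceptance threshold."""
    return {
        layer.name: pass_rate(layer.check, outputs) >= layer.threshold
        for layer in layers
    }


# Hypothetical layers: a formatting check, a risk check, and a softer length check.
layers = [
    EvalLayer("format: ends with period", lambda o: o.endswith("."), 1.0),
    EvalLayer("risk: no email addresses", lambda o: "@" not in o, 1.0),
    EvalLayer("length: under 80 chars", lambda o: len(o) < 80, 0.9),
]

outputs = ["Your refund is on its way.", "Contact me at a@b.com."]
results = gate(layers, outputs)
print(results)
```

In this toy batch the risk layer fails because one output contains an email address, which is exactly the kind of signal a thumbs up or thumbs down count would hide.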

Deciding when AI is the right solution starts with the customer problem, not the model. I ask: Is the task ambiguous enough to benefit from generation? Can we bound the failure modes? Do we have affordable latency and cost envelopes? And what’s the graceful fallback if the model underperforms? If a deterministic algorithm or simple rules solve it better, we choose that—no heroics.

The hidden cost of AI is maintenance. Prompts rot as upstream models change. New data skews behavior. Guardrails that worked yesterday might not hold tomorrow. That’s why ongoing evals, robust logging, and a change-management plan (for prompts, schemas, and policies) are non-negotiable. Treat AI features as living systems, not one-off launches.

If you’re exploring generative AI for product prototyping, start small. Pick a narrow, high-value workflow, instrument everything, and ship with clear success metrics. Use your first release to build your team’s muscles around observability, evals, and cross-functional collaboration. The goal is not a perfect model; it’s a reliable product outcome.

Want to go deeper? Listen to the full conversation on Spotify or Apple Podcasts, or watch it on YouTube (“Building AI Products”).

What you’ll learn in this episode:

– The difference between an AI-powered product manager and an AI product manager

– Why prompt engineering for a product is different from prompting ChatGPT for personal use

– The role of prompt decomposition and orchestration in building robust AI features

– How to think about system design, risk mitigation, and cross-functional collaboration

– Why observability and logging traces are critical for LLM products

– The challenge of evaluating non-deterministic AI features (and why “thumbs up/thumbs down” isn’t enough)

– How to decide when AI is the right solution for a customer problem

– The hidden cost of ongoing maintenance for AI features

Join the conversation: What practices have helped you ship reliable AI features? Drop your thoughts and questions in the comments—I’d love to learn from your experiences.

Inspired by this post on Product Talk.

October 20, 2025
From Disruption to Breakthrough: How Stack Overflow’s AI Pivot Became a Product Playbook

Generative AI doesn’t knock politely—it kicks the door open and forces product teams to re-think the fundamentals. I’ve lived through my share of market shifts, and the story of Stack Overflow’s AI journey hits every note of what it takes to respond with clarity, speed, and rigor.

When ChatGPT launched, Stack Overflow faced a cataclysmic shift: developer behavior was changing overnight. That single sentence captures the urgency I felt as I studied this case: habits, traffic patterns, and value perceptions transformed almost instantly.

Consider the timing: Ellen Brandenburger stepped into Stack Overflow just two weeks before ChatGPT launched. In her shoes, I would have immediately asked the same questions she did: What new developer workflows are becoming “just now possible”? How quickly can we prototype without compromising quality or trust? And how do we avoid overcorrecting in a moment of uncertainty?

In response, the team created Overflow AI, a concentrated effort to explore “what’s just now possible” for developers. I love this framing—it anchors exploration to near-term feasibility while keeping sight of evolving user needs. It’s the kind of focused discovery effort I encourage when a platform-defining shift hits.

They moved through four disciplined iterations of conversational search, each an experiment with clear hypotheses and guardrails:

V1: a chat UI on top of keyword search

V2: semantic search to handle natural questions

V3: fallback to GPT-4 for gaps in Stack Overflow’s corpus

V4: adding RAG for attribution and transparency
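
To show what "answer with attribution, otherwise fall back" can look like in code, here is a toy Python sketch. It is not Stack Overflow's implementation: plain word overlap stands in for semantic search, the two-document corpus and `example.com` URLs are invented, and the fallback branch only signals where a general-purpose model would be called.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    url: str
    text: str


# Invented corpus for illustration only.
CORPUS = [
    Doc("https://example.com/q/1", "use a list comprehension to filter a python list"),
    Doc("https://example.com/q/2", "reverse a string in python with slicing"),
]


def retrieve(question: str, corpus: list[Doc], min_overlap: int = 3) -> list[Doc]:
    """Rank documents by shared words with the question (a crude stand-in for semantic search)."""
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(doc.text.split())), doc) for doc in corpus]
    hits = [(score, doc) for score, doc in scored if score >= min_overlap]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in hits]


def answer(question: str, corpus: list[Doc]) -> dict:
    """Return a grounded answer with its sources, or flag that a fallback is needed."""
    hits = retrieve(question, corpus)
    if not hits:
        # No grounded source: this is where a fallback model call would go,
        # and the response should say clearly that it is not attributed.
        return {"answer": None, "sources": [], "mode": "fallback"}
    return {
        "answer": hits[0].text,
        "sources": [doc.url for doc in hits],
        "mode": "grounded",
    }


grounded = answer("how do I reverse a string in python", CORPUS)
ungrounded = answer("what is kubernetes", CORPUS)
print(grounded)
print(ungrounded)
```

Returning the `mode` and `sources` alongside the answer keeps provenance visible to the user interface instead of burying it in the retrieval layer.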

Two principles stood out as non-negotiable: attribution and transparency. For developers, trust depends on knowing where an answer came from, why it’s relevant, and whether it reflects source truth. I’ve found the same in my own teams—without provenance and clarity, even great answers feel shaky.

The team’s evaluation approach was refreshingly pragmatic: simple spreadsheets and subject-matter experts assessing accuracy, relevance, and completeness. In my org, we’ve adopted similar lightweight scorecards before scaling LLM investments; it keeps us honest about quality before we fall in love with a demo.
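
A spreadsheet scorecard like this is easy to summarize with a few lines of Python. The sketch below assumes a CSV export with one row per question and rater and a 1 to 5 scale for each criterion; the column names, rater names, and scores are invented for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import fmean

# Invented scorecard export: one row per (question, rater), scores on an assumed 1-5 scale.
SHEET = """question_id,rater,accuracy,relevance,completeness
q1,alice,5,4,3
q1,bob,4,4,3
q2,alice,2,3,1
q2,bob,3,3,2
"""

CRITERIA = ("accuracy", "relevance", "completeness")


def summarize(sheet_text: str) -> dict[str, dict[str, float]]:
    """Average each criterion per question across all raters."""
    by_question: dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(sheet_text)):
        for criterion in CRITERIA:
            by_question[row["question_id"]][criterion].append(int(row[criterion]))
    return {
        question: {criterion: fmean(scores[criterion]) for criterion in CRITERIA}
        for question, scores in by_question.items()
    }


summary = summarize(SHEET)
for question, scores in summary.items():
    print(question, scores)
```

Even at this size the summary is useful: a question that scores well on relevance but poorly on completeness points to a different fix than one that is simply inaccurate.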

Here’s the moment that demonstrates real product management leadership: despite the investment, Stack decided to sunset conversational search when it couldn’t meet developer standards. That discipline—choosing not to ship what isn’t good enough—preserves brand trust and creates space for a better bet.

And that better bet was a strategic pivot: the team leaned into data licensing, leveraging its 14M+ Q&A corpus to power LLM training and benchmarks. Instead of treating AI as a threat, they turned their differentiated asset into a durable business line.

They went further, building industry benchmarks with subject-matter experts to prove Stack data improved LLM accuracy and relevance. This is exactly how I think about outcomes vs output: quantify lift against real tasks, validate with domain experts, and package value in a way decision-makers can trust.

Key lessons I’m taking forward:

Take one bite of the apple at a time—prototype, learn, iterate.

Product in the AI era means managing probabilities, not certainties.

For context, Ellen Brandenburger is a product leader and coach, and former head of product at Chegg Skills and Stack Overflow’s data licensing team. Her arc through this transformation underscores what matters most right now: tight feedback loops, transparent evaluation, and the courage to pivot from feature bets to business model bets when the evidence demands it.

If you’re leading gen AI initiatives, treat this as a playbook: form a focused “just now possible” team, instrument quality with SMEs early, obsess over attribution and transparency, and be willing to sunset—even after heavy investment—when the work doesn’t clear your user’s bar. Then, zoom out: your unique data and workflows may be the moat. Build for that.

Inspired by this post on Product Talk.

October 20, 2025
Mastering AI Evals: Real-World Discovery Tactics to Ship Quality, Safe, Reliable AI

I’ve been shipping GenAI features long enough to know that clever prompts and orchestration aren’t enough. What actually matters is evidence: Does the system work, for whom, and under what conditions? That’s where rigorous AI evals come in—the backbone of building reliable, safe, and continuously improving AI products.

In a recent conversation focused entirely on evaluation, I dug into what “evals” mean in the AI/ML world, why they’re more than just quality assurance, and how to operationalize them end to end. If you want to explore the discussion, listen on Spotify: https://open.spotify.com/episode/7mSiEGSYNO4sXeGAVTJO4V or Apple Podcasts: https://podcasts.apple.com/kh/podcast/ai-evals-discovery/id1794203808?i=1000727980774. There’s also a video version on YouTube: https://www.youtube.com/watch?v=pfSIQMrWhQE.

Here’s how I frame evals with my teams. First, define the behavior you want to see in terms real users care about. Then codify that intent as tests that run consistently. I distinguish between golden datasets, synthetic data, and real-world traces. Golden datasets capture canonical examples that represent “ground truth.” Synthetic data fills important gaps quickly and safely. Real-world traces keep you honest and reflect evolving usage.

The most durable loop I’ve found is simple: identify error modes, turn them into evals, and automate. This is where error analysis pays off. Some checks should be purely deterministic—code-based checks that evaluate structured outputs, schemas, or policies. Others benefit from LLM-as-judge when human-like judgment matters, as long as you calibrate and continuously verify those judges with spot checks and inter-rater agreement.
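
One common way to quantify agreement between an LLM judge and a human reviewer is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a plain-Python version under the assumption that both raters labeled the same items in the same order; the pass/fail labels are invented sample data.

```python
from collections import Counter


def cohen_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two raters labeling the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items where the two raters match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's label frequencies, summed over labels.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        freq_a[label] * freq_b[label] for label in freq_a.keys() | freq_b.keys()
    ) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented spot-check sample: human labels vs. LLM-judge labels on six outputs.
human = ["pass", "pass", "fail", "fail", "pass", "fail"]
judge = ["pass", "pass", "fail", "pass", "pass", "fail"]

kappa = cohen_kappa(human, judge)
print(f"kappa = {kappa:.3f}")
```

Here the raters agree on five of six items, but because half of that agreement would be expected by chance, kappa comes out at about 0.667 rather than 0.833. Tracking this number over time is one way to notice when a judge prompt has drifted away from human judgment.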

Discovery practices should inform every evaluation step. If you’re doing “Story-Based Customer Interviews,” you can derive realistic scenarios, acceptance criteria, and edge cases directly from user narratives. That context sharpens the evals and prevents you from overfitting to toy problems or proxy metrics that don’t reflect user value.

Evals require ongoing care and feeding. Criteria drift is real—what counted as “good” six weeks ago may not satisfy users after you ship a new capability or your audience evolves. I treat the eval suite like living product infrastructure: versioned, reviewed, and owned. When we change prompts, models, or retrieval strategies, the evals run first, then we examine deltas, regressions, and surprises before anything reaches production.
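
Examining deltas can be as simple as diffing two eval runs. In this sketch each run is a mapping from eval name to score, and a tolerance decides what counts as a real change; the eval names, scores, and the 0.02 tolerance are assumptions for illustration.

```python
def compare_runs(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.02,
) -> dict:
    """Compare a candidate eval run to a baseline and bucket the changes."""
    report: dict = {"regressions": {}, "improvements": {}, "missing": []}
    for name, base_score in baseline.items():
        if name not in candidate:
            # An eval that silently stopped running is itself a finding.
            report["missing"].append(name)
            continue
        delta = round(candidate[name] - base_score, 4)
        if delta < -tolerance:
            report["regressions"][name] = delta
        elif delta > tolerance:
            report["improvements"][name] = delta
    return report


# Invented scores: the baseline is the current production prompt, the candidate a proposed change.
baseline = {"schema_valid": 1.0, "helpfulness": 0.82, "tone": 0.90}
candidate = {"schema_valid": 0.95, "helpfulness": 0.88}

report = compare_runs(baseline, candidate)
print(report)
```

In this example the candidate improves helpfulness but regresses on schema validity and drops the tone eval entirely, which is the kind of tradeoff worth seeing before anything reaches production.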

Guardrails and human oversight work hand-in-hand with evals. Guardrails enforce non-negotiables (safety, privacy, compliance), while evals measure progress against nuanced goals (relevance, helpfulness, tone). In high-stakes workflows, I combine pre-deployment evals, runtime guardrails, and spot human review. The goal isn’t to eliminate humans; it’s to focus their attention where judgment and context matter most.

Practically, I start with a minimal eval harness that standardizes inputs and outputs—often in JSON (JavaScript Object Notation)—and writes repeatable tests. I maintain a small golden dataset, add targeted synthetic data for coverage, and stream real-world traces into the suite once we have consent and redaction in place. For subjective criteria (e.g., tone, helpfulness), I layer in LLM-as-judge with calibration. For objective checks (e.g., schema validation, policy compliance), code-based checks are my default.
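
A minimal harness along these lines might look like the sketch below. The golden cases, the `system_under_test` stub, and the output schema (`intent` plus `confidence`) are hypothetical; in practice the stub would be replaced by a call into the real feature, and subjective criteria would be scored separately by a calibrated judge.

```python
import json
from typing import Callable

# Hypothetical golden dataset: canonical inputs with the expected intent.
GOLDEN = [
    {"id": "g1", "input": "Where is my order?", "expected_intent": "order_status"},
    {"id": "g2", "input": "Cancel my plan", "expected_intent": "cancellation"},
]


def system_under_test(text: str) -> str:
    """Hypothetical stand-in for the AI feature; returns a JSON string."""
    intent = "order_status" if "order" in text.lower() else "other"
    return json.dumps({"intent": intent, "confidence": 0.9})


def check_schema(raw: str) -> bool:
    """Code-based check: output must be a JSON object with a string intent and numeric confidence."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and isinstance(data.get("intent"), str)
        and isinstance(data.get("confidence"), (int, float))
    )


def run_suite(cases: list[dict], system: Callable[[str], str]) -> list[dict]:
    """Run every case through the system and record objective pass/fail results."""
    results = []
    for case in cases:
        raw = system(case["input"])
        schema_ok = check_schema(raw)
        intent_ok = schema_ok and json.loads(raw)["intent"] == case["expected_intent"]
        results.append({"id": case["id"], "schema_ok": schema_ok, "intent_ok": intent_ok})
    return results


suite_results = run_suite(GOLDEN, system_under_test)
print(json.dumps(suite_results, indent=2))
```

Keeping inputs, outputs, and results as plain JSON makes it straightforward to add synthetic cases or redacted real-world traces later without changing the harness.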

Tooling evolves quickly, but the principles hold. Whether you’re working with Anthropic or experimenting with V0 or Lovable in your prototyping stack, the eval loop stays the same: define success, test it the same way every time, and close the loop with learning. If you’re a product creator or leading forward-deployed engineers, this discipline accelerates prototyping with generative AI without sacrificing safety or quality.

I also tie evals to OKRs that measure outcomes rather than output. Instead of “ship three prompts,” we commit to measurable outcomes like resolution rate, time-to-answer, or a target “helpfulness” score. In AI for customer support, we monitor real-world traces, CSAT, and handoff quality to ensure the AI augments agents rather than creating silent failure modes. That’s how evals feed what we learn about product-market fit instead of just feeding dashboards.

If you want to go deeper, explore these foundational concepts and tools:
ML (machine learning)
LLM (large language model)
“AI Evals for Engineers and PMs”: https://maven.com/parlance-labs/evals
“The Product Leadership Wheel – A Framework for Defining and Growing Product Leadership at Scale”: https://www.petra-wille.com/plwheel
“How I Designed & Implemented Evals for Product Talk’s Interview Coach”: https://www.producttalk.org/2025/09/interview-coach-evals/
“Behind the Scenes: Building the Product Talk Interview Coach”: https://www.producttalk.org/2025/08/customer-interview-coach/
V0: https://vercel.com/docs/v0
JSON (JavaScript Object Notation): https://en.wikipedia.org/wiki/JSON
Anthropic: https://www.anthropic.com/
Lovable: https://lovable.dev/
“Story-Based Customer Interviews”: https://learn.producttalk.org/course/story-based-customer-interviews

If this resonates, I’ll be sharing weekly lessons learned from building and evaluating AI features in the wild, plus conversations with cross-functional teams about real-world AI development. Have thoughts or a tactic that’s worked for you? Drop a comment and let’s compare notes.

Inspired by this post on Product Talk.

October 20, 2025
Inside Braze’s Blitz to $500M CARR: Bold PM Lessons on Going Global and Outsmarting Rivals

I’ve long believed the best product breakthroughs happen at the intersection of market timing, technical first-principles, and relentless customer discovery. Braze’s trajectory is a compelling proof point. Bill Magnuson is the co-founder and CEO of Braze, and Kevin Wang, who joined as employee #8, serves as CPO. The two MIT graduates have built Braze into a publicly listed customer engagement platform with a $4.4B market cap. In 2023, Braze surpassed $500M in CARR, and it serves over 2,200 customers worldwide. Before Braze, Bill spent time at Bridgewater Associates. Kevin’s academic background is in brain & cognitive sciences, and prior to joining Braze he worked at Accenture and Brewgene.
What strikes me most is how early conviction catalyzed execution. The Braze founders’ early insights into the mobile revolution weren’t abstract theses; they translated into concrete product choices that aligned with the emerging realities of push notifications, in-app messaging, and event-driven personalization. That early bet on mobile-first customer engagement created strategic leverage that compounding growth later amplified.
Origin stories matter because they encode the decision-making DNA. How a TechCrunch Hackathon sparked Braze’s creation is a reminder that speed to learning often beats speed to launch. Meeting co-founders at an NYC Hackathon stacked the deck for chemistry and complementary skills — a pattern I’ve seen repeatedly when teams form around real problems and prototype under time pressure.
Finding “terminal value” product-market fit goes beyond initial PMF: it’s about enduring utility that scales with customer complexity. I appreciated how they framed the search as “fishing in every pond,” testing use cases and segments broadly while retaining a coherent platform strategy. That duality, breadth of exploration with depth of conviction, is precisely how I guide teams through product discovery when the surface area of opportunity is vast.
The early journey from 1,000 beta signups to 2,200+ paying customers underscores a disciplined funnel from interest to value to revenue. Braze’s scrappy scaling and early product development show that sometimes you must resist playbook dogma. Breaking the rules of a lean startup doesn’t mean ignoring hypotheses; it means investing ahead of the curve when platform primitives (data, messaging, orchestration) are the real unlock for long-term differentiation.
Navigating early fundraising challenges often forces sharper articulation of strategy and sequencing. I’ve found that the “why now” and “why this architecture” narratives become decisive — especially when your thesis runs counter to conventional wisdom. In Braze’s case, riding the mobile wave to success was inseparable from building the right infrastructure for real-time engagement and global scale.
Competition is inevitable; how you posture is a choice. Approaching competition strategically like a boxer resonated with me — pick your angles, conserve energy, and control the fight tempo. Translate that into product terms: choose the battles that exploit your architectural strengths, avoid the feature-by-feature brawl, and make category-defining bets where your feedback loops are fastest and most defensible.
Globalization rewards systems thinking. Building a global customer base requires architectural foresight (latency, compliance, localization), go-to-market nuance, and a repeatable model for entering new regions. Whether scale helps or hurts is under-discussed: some processes must centralize, while others need to decentralize to stay close to the customer signal. The never-ending quest for PMF is real; every new segment, channel, and geography is a fresh PMF search with its own “viable path to value.”
If I had to distill the practitioner takeaways, I’d start with this: prioritize platform primitives over shiny features; measure learning velocity, not just shipping velocity; and align resourcing to “terminal value” outcomes, not activity. That’s how you out-execute better-funded rivals and convert timing advantages into durable moats.
Referenced:
Accenture: https://www.accenture.com/
Appboy: https://www.braze.com/resources/articles/appboy-social-network-for-mobile-apps
Bipul Sinha: https://www.linkedin.com/in/bipulsinha/
Braze: https://www.braze.com/
Bridgewater Associates: https://www.bridgewater.com/
Jon Hyman: https://www.linkedin.com/in/jon-hyman/
Mark Ghermezian: https://x.com/markgher
MIT: https://www.mit.edu/
Rubrik: https://www.rubrik.com/
WeWork: https://www.wework.com/

October 20, 2025
How Guideline Rewired 401(k)s: First‑Principles Strategy, Gusto Edge, and Product Wins

“I don’t believe in stealth mode” is a product mantra I’ve long embraced, and it immediately came to mind as I dug into how Guideline modernized 401(k)s for small and medium-sized businesses. In a space dominated by incumbents and legacy processes, transparency and execution in public view can be a superpower. That ethos, paired with disciplined product discovery, comes through clearly in Guideline’s story.
Kevin Busque, the co-founder and CEO of Guideline, saw the problem up close while building Taskrabbit: traditional 401(k) plans suffered from complexity, low participation, and “confusing fee structures.” As a product leader, I’ve watched similar frictions stall adoption in other regulated categories—when fees are opaque and onboarding is arduous, engagement dies before it starts. The insight was simple but profound: remove confusion, automate compliance, and make default participation the norm.
After launching Guideline to address those problems head-on, the company rapidly validated market pull, hitting $120 million in ARR by June 2024. That milestone reflects more than growth—it’s evidence that a first-principles approach to retirement plans can outcompete legacy playbooks. It also highlights the compounding impact of product decisions that prioritize clarity, automation, and aligned incentives.
What impressed me most was the “Do the hard thing first” mindset. In practice, that meant investing early in infrastructure others avoided, like deeply integrated payroll workflows and robust compliance automation, rather than deferring them as future tech debt. It’s the opposite of chasing shiny objects: master the unglamorous backbone and everything else compounds.
On market entry, Guideline focused on nailing product-market fit by aligning with payroll ecosystems where SMBs already live. The Gusto partnership was a pivotal move—“Kevin’s insights from the Gusto integration” underscore how strategic distribution, combined with a clean UX and transparent pricing, became a durable edge. Compared to heavyweights like ADP, Fidelity, Paychex, and Intuit, Guideline reframed the buyer journey around simplicity and trust.
Pricing matters in retirement more than most founders realize. “How Guideline set their fees up” and “Lucky 8: Kevin’s unexpected pricing strategy” show how precise pricing architecture can both demystify costs and drive adoption. Clarity isn’t just a marketing claim—it’s a feature that reduces cognitive load and increases participation rates.
I also appreciated how early traction came from a surprisingly broad customer mix—“The surprising range of Guideline’s early customers” points to a product that generalized well across verticals without losing focus. “Working with Plaid as Guideline’s first customer” exemplifies how partnering with trusted fintech brands accelerates credibility and creates feedback loops that sharpen the product.
Defaults drive outcomes. “Guideline’s auto-enrollment feature” is a great example of using behavioral design to improve financial health at scale. When the right default exists and the friction is removed, participation becomes the baseline, not the exception. It’s a masterclass in aligning product and policy to deliver real retirement outcomes, not just feature checklists.
From a roadmap perspective, I was struck by the discipline in resisting premature expansion—“Will Guideline ever go multi-product?” is a nuanced question for any scaling company. “Kevin’s take on product-market fit” and “Guideline’s compounding advantage” reinforce a principle I live by: compound depth before breadth. Every integration, every compliance workflow, every support touchpoint can either compound or fragment your advantage.
Finally, leadership matters as much as strategy. “The challenges faced by introverted leaders” resonate deeply with how I build teams: create space for deep work, institutionalize written decision-making, and use clear operating principles so the product vision scales beyond the founder. It’s the quiet, consistent habits—not the loud slogans—that hold complex products together.
For product leaders working on regulated, high-stakes categories like retirement plans, healthcare, or financial services, the lessons are clear: conduct rigorous product discovery before you ship, pursue distribution advantages through strategic partnerships, architect pricing as an experience, and let default-driven features (like auto-enrollment) do the heavy lifting. That’s how you rewire entrenched markets—by doing the hard thing first, and doing it in the open.

October 20, 2025