Tag: product management leadership

Stop Falling for Hollywood Demos: The Unfiltered Truth of Live AI Voice for Support

I’ve sat through countless AI demos, and I’ve learned there are really two kinds: the “Hollywood demo,” which is polished to perfection, and the “real-world demo,” which shows the product raw—imperfections and all. The former dazzles, but the latter is where you discover what’s actually ready for prime time.

Hollywood demos look great, but they deserve a closer look to confirm that what you see is what you’ll get. When I’m evaluating an AI Agent for customer service, I always look past the polish. I’m assessing how well it will handle real-world scenarios: the messy, complex conversations your team deals with every day. That’s especially true on voice, the toughest channel to get right.

Voice is one of the toughest tests of any AI system. It’s not just “chat with speech.” An AI Agent needs to be able to listen, respond, and adapt in real time. Timing, tone, and turn-taking are all part of the product, and they shape the experience as much as accuracy or reasoning.

An edited video might sound seamless, but it can’t show how a system behaves in a real support environment—like when a conversation takes an unexpected turn or when it pauses briefly to reason or retrieve data. Those small moments—latency, clarifications, interruptions—are when you see what the AI Agent is really capable of. A real-world demo lets you see and hear how the system actually behaves under real conditions, not in a controlled environment that’s been smoothed out with editing.

That’s why the live Fin Voice demo at Pioneer stood out. The team called Fin live on stage to show the real thing (with real latency and interruptions) so people could understand the product they’d be deploying to their own customers. As a product leader, I appreciate that level of transparency because it mirrors how customers will experience the system in production.

When Paul Adams, Chief Product Officer, demoed Fin Voice at Pioneer, the goal was to show the product exactly as customers experience it. In 90 seconds, Fin verified his identity, retrieved account data, managed an interruption, offered options, completed the workflow, and sent a follow-up email. That’s the kind of end-to-end outcome I look for—fast verification, accurate retrieval, natural pacing, and a closed loop.

Latency. You could hear brief pauses while Fin fetched subscription details and checked backend systems. That wasn’t lag—it was work happening in real time. In voice AI, thoughtful latency that signals reasoning is far better than synthetic speed that collapses under real load.

Natural conversation flow. Fin detected when Paul finished speaking, handled interruptions gracefully, and replied in short, human-like turns. That turn-taking behavior is essential for trust and comprehension in voice customer support.

Awareness and tone. Subtle changes in pacing when Paul laughed or hesitated showed sensitivity to context. Tone control is not a “nice to have” in voice—it’s a core UX capability.

Unscripted conversation design. No rigid IVR menus or fixed paths. Paul spoke naturally, and Fin adapted to resolve his query. That adaptability is what differentiates a true AI Agent from a glorified decision tree.

Those details are the real test. A voice AI Agent that performs well in a live demo is one that will perform well for you and your customers too.

Voice has been one of the most demanding, and rewarding, areas of development for Fin. Since launch, we’ve been expanding what it can do so support leaders can customize how Fin sounds, behaves, and aligns with their brand.

Voice and tone customization: Choose from multiple natural voices, set greetings, and fine-tune how Fin communicates with customers.

Escalation and conversational guidance: Teach Fin to use your terminology, ask clarifying follow-ups, and escalate when needed.

Deployment controls: Manage rollouts, test safely in internal environments, and fine-tune before going live.

Flexible integrations: Connect to any telephony system via call forwarding, and link Fin Voice to backend systems or APIs to take action.

Multilingual capability: Fin Voice now supports 28 languages natively.

Alongside these features, we’ve made big improvements to Fin’s answer quality—the foundation of a great voice experience. When people call, they’re looking for accurate, immediate answers they can trust.

So we’ve focused on three key areas: latency, which is down roughly 30–40% since launch; clarification flow, so Fin asks smart follow-up questions to reduce back and forth and improve resolution rates; and voice-specific answer structure, so Fin delivers information in shorter sentences with pacing designed for listening.

Together, these improvements mean customers get the highest-quality answers as quickly as possible, resulting in more resolutions and better experiences.

Running a live demo always carries risk because things can go wrong. But that’s also why it matters—because that’s how customers experience it too. Support leaders stake their reputation on the systems they choose, so the only way to understand what you’re putting in front of your customers is to see it under real conditions.

When you see Fin in a demo, you’re seeing the same system that runs in production. Real-world demos take more effort and don’t always go perfectly, but they show what’s real—and that’s exactly what you need to evaluate before you deploy voice AI at scale.

Inspired by this post on The Intercom Blog.

November 11, 2025
Global Invoicing Nightmares: Hard-Won Product Lessons on EU Tax, Compliance, and Customer Value

I hit play on Global Invoicing – All Things Product Podcast with Teresa Torres & Petra Wille and felt an immediate jolt of recognition. We’ve all launched a feature that looked solid—until a small, overlooked detail broke everything. Their stories about global invoicing and taxes echoed challenges I’ve faced leading product for international customers: if you don’t design for the last mile of compliance, you can accidentally block the very "moment of value creation" your product promises.

Listen to this episode on: Spotify | Apple Podcasts

The conversation starts as a candid rant about EU tax compliance and quickly becomes a precise product management lesson: when we fail to map the entire path to customer value—down to the tiniest regulatory requirement—we can ship something “done” that still doesn’t work in the real world. That gap between intention and outcome is where good product teams live or die.

In my experience, the nightmare of global invoicing for small online businesses is very real. Even big platforms (like Squarespace and Teachable) miss the mark on EU tax compliance, and when they do, customers feel it immediately. It’s the kind of edge case that doesn’t show up in a demo but absolutely shows up in revenue. Or as Teresa Torres put it, “It’s not a little detail when your client won’t pay the invoice.”

I appreciated how the episode digs into the difference between passing a regulatory checklist and actually meeting customer needs. Put plainly: the product isn’t “done” when the ticket moves to Done; it’s done when the customer completes the job—receives an acceptable invoice, pays successfully, and can reconcile it without friction. That’s why I lean hard on story mapping for regulatory work; it exposes the invisible steps where value creation can silently fail.

Here’s how the episode resonates with my own playbook: the nightmare of global invoicing for small online businesses is a systems problem; why even big platforms (like Squarespace and Teachable) miss the mark on EU tax compliance is a prioritization and discovery problem; how Petra and Teresa navigated invoicing across borders with Ablefy and LearnWorlds highlights pragmatic tool choices and trade-offs; the key difference between meeting regulations and meeting customer needs is an outcomes-over-output mindset; what product teams can learn from regulatory edge cases is how to find the seams where markets, laws, and workflows collide; how missing a single detail can block the "moment of value creation" is a reminder that value is defined by customers; and why story mapping is critical for finding gaps between "we shipped it" and "customers got value" is the method that connects all of the above.

Practically, that means I treat regulatory features like any other high-stakes product surface: do real product discovery with affected users; co-design the happy path and the ugly edge cases; write acceptance criteria that include jurisdictional and document-level specifics (e.g., VAT numbers, invoice formats, timing rules); align with finance and legal early; and instrument the journey from invoice issued to invoice paid so we can see where real customers get stuck. This is outcomes vs output OKRs in action, and it’s one of the fastest ways to earn trust with stakeholders.
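To make the instrumentation idea concrete, here is a minimal sketch of a funnel from invoice issued to invoice paid. The stage names, the event shape, and the invoice IDs are my own assumptions for illustration; a real pipeline would read these events from whatever analytics or billing system you already use.

```python
# Hypothetical stage names; adapt them to your own invoicing flow.
STAGES = ["issued", "delivered", "accepted", "paid"]

def invoice_funnel(events):
    """Count how many invoices reached each stage.

    `events` is an iterable of (invoice_id, stage) pairs in any order.
    An invoice counts toward a stage if it reached that stage or a later one.
    """
    furthest = {}
    for invoice_id, stage in events:
        index = STAGES.index(stage)  # raises ValueError on an unknown stage
        furthest[invoice_id] = max(furthest.get(invoice_id, -1), index)
    return {
        stage: sum(1 for reached in furthest.values() if reached >= position)
        for position, stage in enumerate(STAGES)
    }

# Made-up sample events: four invoices that stall at different points.
EVENTS = [
    ("inv-1", "issued"), ("inv-1", "delivered"), ("inv-1", "accepted"), ("inv-1", "paid"),
    ("inv-2", "issued"), ("inv-2", "delivered"),
    ("inv-3", "issued"), ("inv-3", "delivered"), ("inv-3", "accepted"),
    ("inv-4", "issued"),
]

if __name__ == "__main__":
    for stage, count in invoice_funnel(EVENTS).items():
        print(f"{stage}: {count}")
```

The biggest drop between two adjacent stages is usually where I would start discovery interviews, for example with customers whose invoices were delivered but never accepted.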

Key takeaways worth bookmarking: Customers define value, not your compliance checklist. Regulatory work still requires discovery—you can’t skip understanding user needs. The path to value doesn’t end when your feature works; it ends when your customer succeeds. “Sweating the details” isn’t micromanagement—it’s good product management.

Memorable quotes to bring back to your team: “If you don’t sweat the details, people choose other platforms.” — Petra Wille. “It’s not a little detail when your client won’t pay the invoice.” — Teresa Torres.

Follow Teresa Torres: https://ProductTalk.org | Follow Petra Wille: https://Petra-Wille.com

Mentioned in the episode: Squarespace | Stripe | Product at Heart | Teachable | LearnWorlds | Ablefy | Become a Better Product Leader: A 52-Week Transformation Journey | Product Talk Academy

Have thoughts on this episode? Leave a comment below.


Inspired by this post on Product Talk.

November 11, 2025
How Incident.io’s AI SRE Diagnoses, Hypothesizes, and Fixes Outages in Slack at Record Speed

When your site goes down, every second counts. I’ve lived that reality across multiple product lines, and the difference between a five-minute blip and a two-hour outage is felt by customers, engineers, and the business. That’s why I’ve been closely following how Incident.io has evolved from coordination during chaos to intelligent, proactive response.

Now, they’re building something new: an AI SRE that can actually help diagnose and respond to incidents. As someone who thinks deeply about reliability, velocity, and customer trust, that promise hits the intersection of AI Strategy, product management leadership, and operational excellence.

I recently spent time with Lawrence Jones, Founding Engineer at Incident.io, and Ed Dean, Product Lead for AI at Incident.io, digging into how their team is teaching AI to think like a site reliability engineer. They shared how they went from simple prototypes that summarized incidents to a multi-agent system that forms hypotheses, tests them, and even drafts fixes, all from within Slack.

Here’s what stood out to me first: AI’s biggest impact comes from compressing time, identifying causes in minutes instead of hours. In practice, that means fewer cycles lost to paging the wrong on-call, clearer paths to root cause, and faster recovery, without cutting humans out of the decision loop.

Equally important is deciding where automation belongs. The team’s approach aligns with how I evaluate high-risk workflows: Identify which parts of debugging can safely be automated. Combine retrieval, tagging, and re-ranking to find relevant context fast. Use post-incident “time travel” evals to measure how well the AI performed. Balance human trust and AI confidence inside high-stakes workflows. The human remains accountable; the AI accelerates context, options, and execution.

On the technical side, the retrieval choices were refreshingly pragmatic. Retrieval-augmented reasoning still benefits from simplicity: deterministic tagging and re-ranking often beat complex vector setups. I’ve seen the same in production: start with crisp, deterministic signals, then layer embeddings where they truly add value. This keeps systems debuggable and stable as you scale.
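To show what "deterministic tagging and re-ranking" can look like, here is a minimal sketch. It is not Incident.io's implementation; the `Doc` fields, the tag vocabulary, and the ranking rule (tag overlap first, then recency) are assumptions I chose for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    doc_id: str
    text: str
    tags: set[str] = field(default_factory=set)
    age_days: int = 0

def retrieve(docs, query_tags, top_k=3):
    """Keep docs that share at least one tag, then re-rank deterministically."""
    candidates = [doc for doc in docs if doc.tags & query_tags]
    # More shared tags rank first, newer docs break ties, and doc_id makes
    # the final order fully reproducible from run to run.
    candidates.sort(
        key=lambda doc: (-len(doc.tags & query_tags), doc.age_days, doc.doc_id)
    )
    return candidates[:top_k]

# Hypothetical corpus of incident context.
DOCS = [
    Doc("runbook-db", "Restart the read replica.", {"postgres", "latency"}, age_days=40),
    Doc("pr-1", "Raise connection pool size.", {"postgres", "deploy", "latency"}, age_days=1),
    Doc("postmortem-7", "CDN cache purge outage.", {"cdn"}, age_days=10),
]

if __name__ == "__main__":
    for doc in retrieve(DOCS, {"postgres", "latency"}):
        print(doc.doc_id)
```

Because every ranking decision is a plain comparison, you can explain why a document surfaced by reading the sort key, which is much harder with an opaque similarity score.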

The interface choices matter just as much as the models. Using Slack as the interface for human-AI collaboration puts the agent where incidents already live, reducing friction and increasing adoption. Under the hood, they’ve been pragmatic, running retrieval experiments on Postgres with PGVector and using RAG (Retrieval-Augmented Generation) and multi-agent orchestration to chain context gathering, hypothesis formation, and action proposals. The north star is compelling: AI as your company’s immune system.

What impressed me operationally was the rigor around evaluation. Post-incident “time travel” evals let teams score AI accuracy after they know what really happened. That’s the standard we should all adopt: test the agent against reality, not just synthetic prompts, and feed those learnings back into prompts, tools, and guardrails.
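Here is a rough sketch of how a "time travel" eval could be scored once the post-incident review is in. The record shape, the metric names, and the sample incidents are my assumptions, not the team's actual harness: each replay stores what the agent ranked at the time and what the review later confirmed.

```python
from dataclasses import dataclass

@dataclass
class IncidentReplay:
    incident_id: str
    ranked_hypotheses: list[str]  # what the agent proposed at the time, best first
    confirmed_cause: str          # what the post-incident review established

def score_replays(replays, k=3):
    """Return top-1 accuracy and the hit rate within the top k hypotheses."""
    if not replays:
        raise ValueError("need at least one replayed incident")
    total = len(replays)
    top1_hits = sum(r.ranked_hypotheses[:1] == [r.confirmed_cause] for r in replays)
    topk_hits = sum(r.confirmed_cause in r.ranked_hypotheses[:k] for r in replays)
    return {"top1_accuracy": top1_hits / total, "hit_rate_at_k": topk_hits / total}

# Made-up replays for illustration only.
REPLAYS = [
    IncidentReplay("inc-1", ["bad-deploy", "db-failover"], "bad-deploy"),
    IncidentReplay("inc-2", ["cdn-outage", "bad-deploy", "expired-cert"], "expired-cert"),
    IncidentReplay("inc-3", ["db-failover"], "dns-misconfig"),
    IncidentReplay("inc-4", ["queue-backlog", "bad-deploy"], "bad-deploy"),
]

if __name__ == "__main__":
    print(score_replays(REPLAYS))
```

Tracking both numbers matters: top-1 accuracy tells you how often the agent's first guess was right, while the top-k hit rate tells you how often the right answer was at least on the list a human would scan.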

Trust is the currency in incidents, so the product surface must reflect uncertainty with care. Building trust in AI isn’t just about precision—it’s about showing reasoning and uncertainty in ways humans understand. In other words, show the chain of thought as a structured artifact (signals considered, hypotheses rejected, evidence gathered), expose confidence bands, and always make it easy for humans to override or guide.

From a workflow standpoint, the investigation loop mirrors seasoned SRE practice: fast scoping, parallel checks and data sources, building hypotheses and refining findings, then proposing remediations paired with the context that justifies them. Human-agent collaboration here is not a handoff—it’s a tight copilot loop where the agent gathers, tests, and drafts, and the human confirms, prioritizes, and executes.

For platform and security leaders, this approach blends speed with safety. Clear permissions, auditable actions, blast-radius constraints, and CI/CD integration keep the AI inside defined guardrails while still delivering material acceleration. The payoff is higher deployment frequency without compromising reliability—because detection, triage, and rollback become faster and more repeatable.

My takeaway as a product leader: this is a blueprint for agentic AI in mission-critical workflows. Start in the tools users live in (Slack), nail retrieval with deterministic foundations, model the expert’s playbook (not just their summaries), and make evaluation a first-class part of the product. Do that well, and the AI goes from assistant to teammate—conservative when it should be, bold when the evidence supports it, and always legible to the humans in the loop.

The momentum around Incident.io’s AI SRE suggests where we’re headed next: deeper integrations, broader coverage across service catalogs, and richer automations that remain transparent and controllable. For teams investing in reliability, this is the moment to operationalize agentic AI—measured, auditable, and designed for trust—so you can move faster when it matters most.

Inspired by this post on Product Talk.

November 6, 2025
Build a Company You’ll Run Forever: Bootstrapping vs VC, PMF, and the Art of ‘Eating Glass’

I’ve spent my career building products and teams that I intend to steward for the long haul, and I’m drawn to founders who treat company-building as a craft you can practice forever. In this analysis, I break down a journey that crystallizes what it takes: going from a teenage wholesale hustle to an API-first healthcare clearinghouse, and in the process, learning why execution isn’t a moat, why venture capital is “going pro,” and how “eating glass” can become a durable advantage.
Here’s the arc that anchored my thinking: a founder who, at 16, turned $2,500 into a wholesale empire; later bootstrapped a wildly profitable auto-parts business; then sold it to tackle “the most complicated problem” he’d ever encountered: business-to-business transaction exchange. He spent years building EDI infrastructure, threw away the entire codebase eight times, and found extraordinary traction in healthcare. The company recently raised a $70M Series B co-led by Stripe and Addition. The throughline is a consistent, high-agency approach to product management and go-to-market strategy, guided by first principles decision making.
The first customer is often the trickiest—not because demand doesn’t exist, but because the product’s value proposition, points of parity, and competitive differentiation are still coalescing. I push teams to do founder-led GTM early, speak in the user’s language, and orchestrate high-signal conversations that expose real switching costs. That’s how we avoid mistaking polite interest for product-market fit.
Bootstrapping forces rigor, but it also means being “constrained by capital.” There’s a ceiling to the speed at which you can iterate, validate, and scale. Venture capital, in the right context, is like “going pro”: you trade a bit of optionality for time, talent density, and a faster feedback loop. I often see confusion between ownership vs. control; structurally, you can design for alignment while still moving with the urgency a competitive market demands.
One theme I return to with my own teams: execution is never actually a moat. Processes can be copied. Culture can be mimicked superficially. What can’t be easily replicated is the willingness to do the unglamorous, compounding work—what the founder here called “eating glass.” It’s the daily discipline of simplifying the system, instrumenting the edge cases, and standing up operational excellence that compounds into true competitive differentiation.
When product-market fit hits in enterprise infrastructure, it can feel like “the snake swallowing a deer.” Capacity, process, and architecture are stretched to their limits all at once. I’ve experienced the same pattern: everything slows down so the organization can re-architect for scale. The trick is to make those constraints visible—measure service levels, queuing, and error budgets like you would in a production system—so you’re not flying blind.
Some of the strongest product-management instincts I’ve seen borrow from discount retail and Toyota. From discount retail, we learn to obsess over unit economics, operational throughput, and ruthless simplification. From the Toyota Production System, we adopt Kanban, continuous improvement, and respect for constraints. In software terms, this becomes fast deployment frequency, small batch sizes, and defect prevention at the source, because “all software is a cascade of miracles.”
Scaling decision-making is where most teams stall. I favor clear ownership, lightweight written narratives, and a bias for first principles decision making over committee compromise. That structure lets high-agency individuals move quickly while keeping cross-functional stakeholders aligned on outcomes vs output OKRs. It’s how you build empowered product teams without sacrificing focus.
Hiring is where philosophy becomes practice. I resonate with the onboarding mantra “everything’s your fault now”—not as blame, but as an invitation to own outcomes end to end. I look for high-agency people who demonstrate systems thinking and the capacity to simplify. Manager hiring should lag role clarity; bring in managers when coordination overhead is the limiting factor, not when it merely feels uncomfortable.
Longevity comes from founder-approach fit as much as product-market fit. Build a company you don’t want to leave by aligning operating cadence, decision rights, and cultural norms with how you actually work best. Maintain conviction in unconventional practice when the evidence supports it, while remembering that “Reality has a surprising amount of detail.” The more I zoom in on the real work—interfaces, edge cases, workflows—the more the right design emerges.
In healthcare EDI, that realism matters. HIPAA sets the compliance baseline. Payer integrations with Aetna, Blue Cross Blue Shield, and Cigna demand reliability and deep domain fidelity. Cloud and back-office ecosystems, from AWS and NetSuite to Slack, Microsoft Teams, Zapier, and Clay, shape the surrounding workflow. Lessons from Amazon, Target, Walmart, and Costco inform operational rigor; supply chain analogies from Ford Motor Company and GM clarify interface contracts. Porter’s five forces helps frame market structure; perspectives from Jeff Bezos and Peter Thiel sharpen strategic posture.
If you’re building for the long run, here’s the blueprint I use with product leaders: validate painfully specific jobs-to-be-done before you scale; prefer founder-led GTM until messaging closes the intent-to-adoption gap; instrument throughput and quality like a production system; invest in people who treat ambiguity as a chance to lead; and don’t confuse speed with hurry. When the “snake swallowing a deer” moment arrives, re-architect deliberately, protect your margins, and let operational excellence carry you from product discovery to durable product-led growth.
References and resources: Aetna: https://www.aetna.com/, Amazon: https://www.amazon.com/, AWS: https://aws.amazon.com/, Blue Cross Blue Shield: https://www.bcbs.com/, Change Healthcare: https://www.changehealthcare.com/, Cigna: https://www.cigna.com/, Clay: https://www.clay.com/, Costco: https://www.costco.com/, Ford Motor Company: https://www.ford.com/, GM: https://www.gm.com/, HIPAA overview (HHS): https://www.hhs.gov/hipaa/index.html, Jeff Bezos: https://x.com/JeffBezos, Kanban / TPS (Toyota): https://global.toyota/en/company/vision-and-philosophy/production-system, Microsoft Teams: https://www.microsoft.com/microsoft-teams, NetSuite: https://www.netsuite.com/, O’Reilly Auto Parts: https://www.oreillyauto.com/, Peter Thiel: https://x.com/peterthiel, Porter’s five forces: https://www.isc.hbs.edu/strategy/pages/the-five-forces.aspx, “Reality has a surprising amount of detail”: https://johnsalvatier.org/blog/2017/reality-has-a-surprising-amount-of-detail, Slack: https://slack.com/, Stedi: https://www.stedi.com/, Summit Racing: https://www.summitracing.com/, Target: https://www.target.com/, Walmart: https://www.walmart.com/, Zapier: https://zapier.com/

November 6, 2025
AI at Home, Impact at Work: Experiments That Supercharged My Product Leadership

I recently tuned into an insightful All Things Product episode featuring Teresa Torres and Petra Wille on how experimenting with AI in everyday life sharpens how we build AI-powered products at work. The core premise resonated deeply with my AI Strategy: low-stakes, personal experiments accelerate confidence, clarify limitations, and build an AI product toolbox we can bring into the office with rigor.

If you want to dive in, you can listen on Spotify or Apple Podcasts. I found the conversation especially relevant for product trios and anyone shaping LLMs for product managers in high-stakes environments.

The idea is simple but powerful: when I prototype with AI at home—where the stakes are low—I learn faster, make safer mistakes, and internalize critical product patterns. Over time, those patterns transfer directly to work: tighter context management, sharper bias awareness, clearer human-in-the-loop guardrails, and a more nuanced view of when to use AI as a thought partner versus when to consider agentic AI.

In my own practice, I’ve mirrored many of the scenarios discussed: using ChatGPT to plan meals, analyze public data sets like school budgets, and even sanity-check real estate evaluations. These seemingly mundane tasks are fertile ground for learning about context window limits, hallucination, AI bias, and privacy-by-design trade-offs. Each experiment helps me craft better prompts, structure data for clarity, and decide when a human review step is non-negotiable. These are core habits for AI risk management.

At work, I treat AI as a thought partner for writing, research synthesis, and contract review. I also explore when and how to responsibly evolve toward agentic AI for repeatable workflows. The distinction matters: a thought partner augments judgment; an agent automates execution. Building the right scaffolding—data governance, auditability, constraints, and escalation paths—ensures we unlock speed without compromising safety.

Three lines from the episode stayed with me: “I’m trying to write things that only I can write — that’s my guiding writing light right now.” — Teresa. “The more we use AI, the more we learn what it’s good at, what it’s not good at, and where context becomes a limitation.” — Teresa. “It’s a safer playground — we can build our toolbox at home before bringing those lessons to work.” — Petra. These are practical north stars for product management leadership in the GenAI era.

For anyone getting started, here’s what worked for me: begin with “low-stakes” personal experiments, write down your prompts and outcomes, and reflect on failure modes. Treat each activity as product discovery: What problem am I solving? What outcome matters? What data and context does the model need? Which decisions must stay human-in-the-loop? This discipline builds an AI product toolbox you can confidently apply to real customer problems.

I also keep a running toolkit of references and tools that inform my practice: Context window as a concept helps me size and sequence information. Visual and video tools like Midjourney and Sora expand how I think about multimodal experiences. I rotate between Claude by Anthropic and ChatGPT by OpenAI depending on task fit, and I’ve used Claude Code when I need structured assistance with code review. For knowledge capture and workflow, Readwise and Ghost help me structure insights and ship content.

If you want more structured learning paths, I found Josh Seiden’s Learn AI With Me, A 30-Day Sprint to be a practical primer, and the broader community conversation at Product at Heart Conference is invaluable. For a deeper grounding in risk, I recommend reading up on hallucination, AI bias, and agentic AI, and revisiting the complementary episode, Context is King.

I’d love to hear how you’re experimenting: Where have you seen AI meaningfully reduce toil? Where does it still struggle? How are you balancing creativity, data safety, and compliance as you scale? Drop a comment below and let’s compare notes—especially on patterns that help product trios move faster without sacrificing trust.

Bottom line: start small at home, carry lessons into the office, and build with curiosity and intentionality. That’s how we level up our product discovery, sharpen our value proposition, and lead teams confidently through the GenAI transition.

Inspired by this post on Product Talk.

November 4, 2025
From Code to Roadmaps: My Proven Playbook for Engineers Becoming Product Managers

"From code commits to boardrooms. Here are real stories of software engineers who swapped bugs for roadmaps on the road to product manager." I’ve made that leap myself and helped many engineers do the same. In this piece, I share the playbook I use to guide high-potential ICs into impactful product management roles—without losing the engineering rigor that makes them special.

Engineers make exceptional product managers because we’re trained to decompose complex systems, debug ambiguity, and reason from first principles. The transition isn’t about abandoning code; it’s about expanding your scope from implementation details to customer outcomes, market context, and business impact.

The first shift is mental: move from shipping outputs to driving outcomes. Features are a means; value is the end. I anchor this change with outcomes vs output OKRs, ensuring every roadmap item ties to a measurable user or business result rather than a checklist of tickets.

Next, upskill deliberately in three areas: product discovery, product positioning, and stakeholder management. Learn to design unbiased customer interviews, synthesize patterns from qualitative and quantitative signals, and craft crisp value propositions that resonate with real segments. Then practice executive-ready communication—clear decisions, concise narratives, and no jargon crutches.

Here’s the practical, low-risk way to get PM experience without changing your title: form a product trio working group (design, engineering, product) around a real problem. Lead discovery with a weekly cadence, run lightweight experiments, and translate insights into a draft roadmap and sprint plan. Ship small, learn fast, and narrate the learning.

Build a simple portfolio that proves product judgment. Include one-page problem briefs, discovery notes, customer quotes, prioritized opportunity trees, and a before/after roadmap snapshot. For each artifact, quantify the impact: activation lift, support ticket reduction, conversion improvement—whatever outcome your work influenced.

If you want to pivot internally, propose a 90-day experiment. Volunteer to own a well-bounded problem, commit to an outcomes dashboard, and set a weekly stakeholder update. Keep a minimal engineering contribution during the trial to de-risk the transition for your team while you demonstrate PM leverage across the squad.

If you’re interviewing externally, prepare two deep case studies: one discovery-led (how you reduced uncertainty) and one delivery-led (how you aligned stakeholders and shipped). Be explicit about trade-offs, risks you retired, metrics you moved, and lessons learned. The best signals of product sense are clarity under constraints and an ability to say “no” for good reasons.

Once you land the role, use a 30-60-90 plan. In the first 30 days, map users, workflows, metrics, and decision rhythms; in 60, run a focused discovery sprint and align on your hypothesis-led roadmap; by 90, deliver a thin slice that proves value and establishes credibility with empowered product teams. Keep your communication tight, your dashboards honest, and your customers close.

Common pitfalls: translating directly from solution space to roadmap without validating problems; equating stakeholder satisfaction with customer value; and mistaking velocity for progress. Avoid them by running small tests early, revisiting segment-specific value propositions, and anchoring trade-offs to product-market fit lessons.

If you’re standing at the edge of this transition, start where you are: choose one user pain, one measurable outcome, and one small bet. Treat it like a product: define success, experiment thoughtfully, and learn in public. The road from engineer to product manager isn’t a title change—it’s a shift in how you create value.

Inspired by this post on Product School.

November 3, 2025
Impact Analysis Mastery: Proven Steps to Predict, Measure, and Maximize Product Outcomes

When I think about the difference between a roadmap that moves the business and one that simply ships output, impact analysis is the habit that changes everything. It gives me and my product trios a disciplined way to forecast value, align stakeholders, and de-risk bets before a single sprint starts. Over the years, I’ve seen great ideas fail not because they were bad, but because we couldn’t articulate, test, and track their true impact. That’s the problem impact analysis solves.

Impact analysis, in practice, is a structured method for predicting how a proposed change will influence user behavior and business outcomes, and then validating those predictions with data. When done well, it translates strategy into evidence-backed choices that strengthen our value proposition and accelerate product-led growth.

I use impact analysis at three key moments: during product discovery to vet opportunities, in product roadmapping and sprint planning to prioritize, and post-launch to confirm that outcomes beat expectations. It is equally useful for net-new features, UX improvements, pricing changes, and even enablement like in-app guides or product tours.

Step 1: Define the outcome with precision. I anchor every proposal to outcome-based (rather than output-based) OKRs, choose one primary success metric, and record the current baseline. If we plan to experiment, I estimate the minimum detectable effect (MDE) to ensure our A/B testing can actually validate the expected lift. This protects us from investing in ideas that are too small to measure or too broad to manage.
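To make the MDE check concrete, here is a minimal sketch of the standard sample-size approximation for comparing two conversion rates. The function name, the 5% significance level, and the 80% power default are my assumptions for illustration, not a prescribed standard; a dedicated experimentation tool may use a different formula.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed in EACH arm to detect an absolute lift of
    `mde_abs` over `baseline` with a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / mde_abs ** 2)


# Hypothetical inputs: 20% baseline activation, hoping to detect a 2-point lift.
users_needed = sample_size_per_arm(baseline=0.20, mde_abs=0.02)
print(f"Users needed per arm: {users_needed}")
```

If the number that comes back is larger than the traffic the feature will realistically see in the test window, that is the signal that the idea is too small to measure as scoped.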

Step 2: Map the causal chain. I translate the idea into a simple impact map: feature change → user behavior (activation, frequency, conversion, retention) → business outcome (revenue, cost, risk, satisfaction). This clarifies what must change in user behavior and why users would care—forcing us to revisit our value proposition if the link feels thin.

Step 3: Size the upside and reach. I estimate who will be exposed (reach), how often (frequency), and the expected behavior change (conversion delta). I complement this with RICE (reach, impact, confidence, effort) or cost of delay to compare options. The goal isn’t perfect math; it’s consistent, transparent assumptions that we can pressure test with data.
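The RICE comparison can be sketched in a few lines. The bet names and every number below are hypothetical, and the scales (a relative impact score, person-months of effort) are one common convention that I am assuming here; what matters is using the same scales across all bets.

```python
from dataclasses import dataclass


@dataclass
class Bet:
    name: str
    reach: float       # users exposed per quarter
    impact: float      # relative effect per user (assumed scale, e.g. 0.25 to 3)
    confidence: float  # 0.0 to 1.0
    effort: float      # person-months

    @property
    def rice(self) -> float:
        # RICE = (reach * impact * confidence) / effort
        return self.reach * self.impact * self.confidence / self.effort


# Hypothetical bets with illustrative numbers only.
bets = [
    Bet("Redesigned onboarding flow", reach=2000, impact=2.0, confidence=0.5, effort=6),
    Bet("Targeted in-app guides", reach=6000, impact=1.0, confidence=0.8, effort=3),
]

ranked = sorted(bets, key=lambda b: b.rice, reverse=True)
for bet in ranked:
    print(f"{bet.name}: RICE = {bet.rice:.0f}")
```

Writing the assumptions down as explicit fields is the point: anyone who disagrees with the ranking can challenge a specific input rather than the conclusion.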

Step 4: Evaluate risk, complexity, and dependencies. I assess technical effort, privacy-by-design considerations, data governance needs, and cross-team sequencing. This is where stakeholder management becomes essential—aligning Engineering, Design, GTM, and Security early so we don’t discover hidden blockers mid-sprint.

Step 5: Design the evidence plan. For changes where causality matters, I prefer A/B testing with the right MDE and guardrail metrics. I instrument events and set up dashboards in a unified analytics platform (Amplitude analytics, Pendo, or a homegrown stack) so we can monitor leading indicators quickly. If experiments aren’t feasible, I use sequential rollouts, synthetic controls, or pre-post analyses with clear caveats.

Step 6: Communicate the decision. I share a one-page impact brief that summarizes objectives, hypotheses, metric choices, expected lifts, risks, and the test plan. This reduces debate time, improves stakeholder trust, and enables empowered product teams to move faster with clarity.

Step 7: Ship, monitor, and learn. After launch, I track leading indicators within days and validate lagging outcomes over weeks. I run retention analysis and cohort reviews to confirm that behavior change sticks, and I write a short learning memo—especially when we miss—so future bets get sharper.

On a recent initiative, our team debated whether to build a new onboarding flow or invest in targeted in-app guides. The impact analysis showed the guide approach would reach 3x more users in the next quarter, require half the effort, and be easier to A/B test end-to-end. We shipped the guides, saw a measurable lift in activation, and then recycled those insights to inform the broader onboarding redesign. The analysis didn’t just pick a winner—it created a faster path to compounding outcomes.

Common pitfalls I watch for: chasing vanity metrics, assuming linear impact at scale, ignoring confidence and variance, and skipping instrumentation. Another trap is treating impact analysis as a heavyweight doc—keep it lightweight, comparable across initiatives, and tightly tied to decision-making.

My lightweight template: one sentence on the desired outcome and OKR; a causal chain with the key behavior change; a simple sizing with reach, impact, and confidence; risk and dependency notes; the experimentation plan; and the decision. If we can’t write that in one page, we probably don’t understand the bet well enough to pursue it yet.

The next time you review your roadmap, pick your top three bets and run this playbook. You’ll sharpen your prioritization, increase stakeholder confidence, and give your team a clear line of sight from product discovery to measurable outcomes. That’s how we build momentum, quarter after quarter.

Inspired by this post on Product School.

November 3, 2025
Product Tree 101: The Visual Prioritization Framework I Rely on to Align Teams Fast

When my team is drowning in requests, the Product Tree is the visual tool that brings clarity and momentum. "Learn what a product tree is, how to use the product tree framework, and why it’s a powerful tool for smarter product prioritization." That’s exactly what I aim to share here—how I use it to align stakeholders, sharpen product strategy, and translate ideas into outcomes.

A product tree is a simple yet powerful metaphor for your product. The trunk represents the core value, the roots are the technical foundations and platform capabilities, the branches are product areas or themes, and the leaves are features, experiments, or opportunities. By placing ideas as leaves on the right branches—and making sure roots can actually sustain that growth—we turn a messy backlog into a coherent product roadmap.

Why do product managers swear by it? Because it forces outcomes over outputs, exposes trade-offs visually, and reveals where strategy is thin or overgrown. In one view, you see customer value, technical debt, and strategic focus, all crucial for empowered product teams, product discovery, and stakeholder management. It’s also an excellent way to connect outcome-based OKRs to tangible delivery paths.

Here’s how I set it up. First, I define the trunk with a crisp product value proposition and the minimum set of experiences that make the product viable. This anchors everything else so we don’t mistake a shiny leaf for the core of the tree.

Next, I map branches to clearly named themes that mirror how customers perceive value—onboarding, activation, collaboration, analytics, or reliability. I keep branches aligned to outcomes to avoid feature-first thinking; this pays dividends during product roadmapping and sprint planning.

Then I add leaves: research insights, customer requests, experiments, and enabling features. I note intent (e.g., drive activation, reduce churn), expected impact, and a rough effort signal. This quickly surfaces which leaves grow the product and which are just twigs.

Finally, I draw roots—the enabling platform work and technical investments that make the branches sustainable. Performance, data governance, privacy-by-design, and scalability belong here. If the roots can’t support the canopy, the tree is at risk, and that becomes a visible, prioritizable problem rather than an invisible liability.
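If you keep the tree in a tool rather than on a whiteboard, the trunk, roots, branches, and leaves map onto a small data structure. This is a rough sketch under my own assumptions: the class names, fields, and sample entries are hypothetical, and the only logic shown is flagging leaves whose required roots are not ready yet.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    name: str
    intent: str                                            # e.g. "drive activation"
    needs_roots: list[str] = field(default_factory=list)   # enabling work it depends on


@dataclass
class ProductTree:
    trunk: str                        # core value proposition
    roots: dict[str, bool]            # root name -> is it ready?
    branches: dict[str, list[Leaf]]   # theme -> leaves on that branch

    def unsupported_leaves(self) -> list[tuple[str, str, list[str]]]:
        """Return (branch, leaf, missing roots) for leaves the roots cannot yet sustain."""
        flagged = []
        for branch, leaves in self.branches.items():
            for leaf in leaves:
                missing = [r for r in leaf.needs_roots if not self.roots.get(r, False)]
                if missing:
                    flagged.append((branch, leaf.name, missing))
        return flagged


# Hypothetical example tree.
tree = ProductTree(
    trunk="Teams resolve customer issues faster",
    roots={"event pipeline": True, "permissions model": False},
    branches={
        "analytics": [Leaf("Usage dashboard", "drive activation", ["event pipeline"])],
        "collaboration": [Leaf("Shared workspaces", "reduce churn", ["permissions model"])],
    },
)

for branch, leaf, missing in tree.unsupported_leaves():
    print(f"{branch} / {leaf} is waiting on: {', '.join(missing)}")
```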

Once the tree is sketched, I facilitate a collaborative session with product trios and cross-functional partners. We prune low-impact leaves, cluster work by outcomes, and explicitly link branches to OKRs. In QBRs and OKR reviews, the tree becomes our single source of truth for trade-offs, helping stakeholders see why some requests move up and others wait.

In practice, I use the Product Tree to shape a near-term delivery plan and a longer-horizon narrative. Near term, it informs sprint planning and sequencing by ensuring the right roots land before the heavier branches. Longer term, it clarifies the growth story for product-led growth—what we’ll grow next and why it matters for customers.

A few tips from the trenches: anchor branches to customer outcomes, not internal org charts; spotlight enabling work so platform investments aren’t deprioritized; and revisit the tree after each discovery cycle to keep it fresh. The moment the tree feels lopsided, that’s your signal to rebalance bets or revisit assumptions in product discovery.

If you’re preparing for your next planning cycle, try a 60-minute Product Tree workshop. You’ll come away with a shared mental model, sharper prioritization, and a roadmap that is easy to communicate and defend—because everyone can see the product’s future taking shape right in front of them.

Inspired by this post on Product School.

November 3, 2025
Stop Shipping for the Sake of It: Master Outputs vs. Outcomes to Build Products That Win

Too many teams still celebrate what they ship rather than what they change. I’ve learned, sometimes the hard way, that the most expensive mistake in product management is confusing outputs with outcomes. This post lays out the key differences between the two and how to keep your team focused on what really drives results.

Here’s how I draw the line: outputs are the features, tickets, and releases we produce; outcomes are the measurable changes in user behavior and business performance we create—activation rates, retention, expansion, and time-to-value. If an initiative doesn’t move a metric that matters, it’s output without impact. That’s how feature factories are born.

The confusion is costly because it distorts incentives. Teams optimize for velocity, story points, or deployment frequency and mistake motion for progress. Engineering excellence and DORA metrics matter, but they’re not substitutes for product outcomes. When OKRs drift into task lists, we ship more and learn less. I’ve seen ambitious roadmaps hit every delivery date and still miss the market because we didn’t change customer behavior.

To break that cycle, I anchor planning and reviews to outcome-based OKRs. A good objective might be: increase new-account user activation from 28% to 45% this quarter. The anti-pattern is: ship onboarding redesign v2. The former sets a clear behavioral target; the latter constrains creativity and locks us into a solution before discovery. This is the practical heart of outcome-based versus output-based OKRs.

From there, I define leading indicators that predict the desired outcome—time-to-first-value, completion of core actions, day-7 retention—and instrument them early. Tools like Amplitude analytics help us see whether an experiment is unlocking behavior change or just producing activity. I also set guardrail metrics (support volume, performance, and NPS) so we don’t “succeed” by creating a new failure mode.

The delivery model matters, too. Empowered product teams—built as product trios of product, design, and engineering—own the problem and the outcome. We invest in product discovery to validate assumptions, size opportunities, and find the minimum viable change that moves the metric. A/B testing with a clear minimum detectable effect (MDE) makes our experiments faster, cheaper, and more conclusive.

Roadmaps then become strategic bets rather than feature lists. Each bet articulates the opportunity, the hypothesized solution, the expected outcome, and the evidence that would change our mind. In sprint planning, we slice increments to learn sooner, not just to deliver sooner. CI/CD accelerates shipping; outcome instrumentation accelerates learning.

Stakeholder conversations shift as well. Instead of debating which features to build, we align on the customer problem, the value proposition, and the measures of success. QBRs showcase what changed—activation, adoption, retention—not just what shipped. This is how we move from feature requests to outcome commitments and sustain product-led growth.

I’ve found that outcomes-first execution energizes teams. Clarity of purpose invites creativity, and the autonomy to experiment fuels ownership. When we celebrate behavior change over backlog burn-down, we stop playing to the roadmap and start playing to win the market.

If your team is stuck in output mode, start small: rewrite one key objective as an outcome, instrument a leading indicator, and run a scoped experiment. When the metric moves, let that win reset the culture. Momentum follows outcomes.

Inspired by this post on Product School.

November 3, 2025
Mastering AI Evals: The Essential Product Manager Skill to Ship Safer, Smarter AI

In every AI-powered product I ship, evaluation is the difference between a compelling demo and a dependable customer experience. AI evaluation isn’t a nice-to-have; it’s a core product management competency that shapes quality, safety, and business outcomes from the first prototype to scale.

When I talk about AI evaluation, I mean a disciplined, repeatable way to measure model behavior across quality, safety, reliability, latency, and cost. Gen AI has changed the cadence of product decisions—models evolve weekly, prompts drift under real-world load, and edge cases multiply. Without rigorous evals, we risk shipping unpredictability.

My goal in this piece is simple: “Dive deep into AI evals, why they matter for PMs today, and how to master them with clear steps, examples, and best practices.” If you’re leading product strategy for LLMs, agentic AI, or applied AI features, this is the playbook I rely on.

Why this matters now: customers don’t judge AI by benchmarks; they judge it by trust. Did it help me, was it safe, was it fast? Strong AI evals let me set outcome-based OKRs, quantify risk, and make transparent trade-offs between accuracy, latency, and cost. They also give engineering and design clear guardrails to move fast without breaking user trust.

Step 1: Define the product problem and success metrics. I start by tying AI metrics to business outcomes—resolution rate, deflection rate, revenue lift, time-to-value—and include model-centric measures like hallucination rate, harmful content rate, latency, and token cost. This keeps experiments anchored to impact, not just model scores.

Step 2: Build a high-signal golden dataset. I curate real, anonymized user prompts from discovery and support channels, then add adversarial and long-tail cases. For generative tasks, I create rubric-based criteria for correctness, helpfulness, tone, and safety. This dataset becomes my regression suite as prompts, RAG pipelines, or models change.

Step 3: Choose the right evaluation methods. I combine deterministic unit tests for rules with LLM-as-judge scoring, pairwise preference tests for prompt variants, human review for critical flows, and red teaming for safety. I also apply privacy-by-design and strong data governance to ensure eval data handling meets compliance and customer expectations.

Step 4: Operationalize with CI/CD. Evals run automatically on every prompt, retrieval, or model update, with pass/fail gates and alerting. I track results in a unified analytics platform so product, engineering, and go-to-market teams see the same truth. If a change regresses key thresholds, we pause rollout or roll back.
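A pass/fail gate over a golden dataset can start very small. The sketch below is an assumption-heavy illustration: the `EvalCase` fields, the substring checks, the 95% threshold, and the stubbed assistant are all hypothetical stand-ins, and it covers only the deterministic layer, not LLM-as-judge scoring or human review.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_contain: list[str]      # phrases a correct answer should include
    must_not_contain: list[str]  # phrases that indicate an unsafe or wrong answer


def run_evals(generate: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through `generate` and return the fraction that pass."""
    if not cases:
        raise ValueError("golden dataset is empty")
    passed = 0
    for case in cases:
        answer = generate(case.prompt).lower()
        has_required = all(s.lower() in answer for s in case.must_contain)
        has_forbidden = any(s.lower() in answer for s in case.must_not_contain)
        if has_required and not has_forbidden:
            passed += 1
    return passed / len(cases)


def gate(pass_rate: float, threshold: float = 0.95) -> bool:
    """True if the change may roll out; False means pause or roll back."""
    return pass_rate >= threshold


def stub_assistant(prompt: str) -> str:
    # Hypothetical stand-in for the real model, prompt, and retrieval pipeline.
    if "refund" in prompt.lower():
        return "Refunds are processed within 5 business days."
    return "I can connect you with a human agent."


golden_set = [
    EvalCase("How long do refunds take?", ["5 business days"], ["guarantee"]),
    EvalCase("Delete my account now", ["human agent"], ["deleted"]),
]

pass_rate = run_evals(stub_assistant, golden_set)
release_ok = gate(pass_rate)
print(f"pass rate = {pass_rate:.0%}, release allowed = {release_ok}")
```

In a pipeline, the job would fail the build when `gate` returns False, which is what turns the eval suite from a report into an actual release control.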

Step 5: Optimize the cost–quality–latency triangle. Real products live within constraints. I analyze token budgets, caching strategies, model selection (e.g., small for classification, larger for complex generation), prompt structure, retrieval quality, and function-calling patterns. For agentic AI, I evaluate tool-use correctness and task completion reliability, not just text quality.
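One simple way to reason about the triangle is to treat quality and latency as constraints and cost as the thing to minimize. The configuration names and all figures below are invented for illustration; real values would come from your own eval runs and billing data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    name: str
    eval_pass_rate: float        # share of golden-set cases passed
    p95_latency_ms: float        # 95th percentile response time
    cost_per_1k_requests: float  # in dollars


def cheapest_acceptable(configs: list[Config], min_pass_rate: float,
                        max_latency_ms: float) -> Optional[Config]:
    """Pick the lowest-cost config that meets both the quality and latency bars."""
    acceptable = [
        c for c in configs
        if c.eval_pass_rate >= min_pass_rate and c.p95_latency_ms <= max_latency_ms
    ]
    return min(acceptable, key=lambda c: c.cost_per_1k_requests, default=None)


# Hypothetical candidates; numbers are illustrative only.
candidates = [
    Config("small model, no retrieval", 0.88, 400, 0.50),
    Config("small model + retrieval", 0.95, 700, 0.90),
    Config("large model + retrieval", 0.98, 1800, 6.00),
]

choice = cheapest_acceptable(candidates, min_pass_rate=0.94, max_latency_ms=1000)
print(choice.name if choice else "No config meets the bar; revisit constraints")
```

When nothing qualifies, the conversation shifts to which constraint is negotiable, which is exactly the trade-off discussion stakeholders need to have explicitly.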

Step 6: Close the loop with experimentation. Offline evals give me confidence; online A/B testing validates business impact. I design tests with a clear minimum detectable effect (MDE), guard against novelty effects, and instrument activation, retention, and satisfaction in Amplitude or Pendo. Agent analytics help me pinpoint where users succeed or get stuck.
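For a rate metric such as deflection, the readout of an A/B test can be checked with a two-proportion z-test. This is a minimal sketch using only the standard library; the function name and the sample counts are hypothetical, and an analytics platform may apply a different method (sequential testing, for instance).

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int) -> tuple[float, float]:
    """Return (absolute lift of B over A, two-sided p-value)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, p_value


# Hypothetical results: control deflects 300 of 1,000 conversations, variant 360 of 1,000.
lift, p_value = two_proportion_z_test(300, 1000, 360, 1000)
print(f"lift = {lift:.1%}, p-value = {p_value:.4f}")
```

A small p-value says the lift is unlikely to be noise; it says nothing about whether the lift persists, which is why the novelty check and retention instrumentation still matter.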

Step 7: Govern responsibly. I maintain model cards, decision logs, and incident playbooks. For customer-facing assistants, I gate risky actions, log explanations, and add human-in-the-loop escalation. AI risk management isn’t bureaucracy—it’s how we earn trust at scale.

A concrete example: building a customer support assistant. My success metrics include deflection rate, first-contact resolution, median response latency, and safe action rate. The golden dataset blends common queries, billing edge cases, account-specific retrieval checks, and adversarial prompts. Evals measure factuality against a knowledge base, tone alignment with brand guidelines, and safe tool use for CRM integration. Only after passing offline gates do we A/B test deflection and CSAT in production.

Common pitfalls I watch for: overfitting prompts to a tiny test set, relying solely on LLM-as-judge without human calibration, skipping safety tests when latency rises, and treating evaluations as a one-time launch task. The antidote is simple—regularly refresh datasets, diversify eval methods, and wire evals into the same release discipline as any core feature.

The payoff is compounding. With strong AI evals, we ship confidently, reduce incident rates, accelerate iteration, and communicate trade-offs clearly to stakeholders. More importantly, we build products customers trust—because quality isn’t a promise, it’s a practice we can measure every day.

Inspired by this post on Product School.

November 3, 2025
Your Ultimate ProductCon San Francisco 2025 Guide: Best Hotels, Eats & Drinks

Heading to ProductCon San Francisco 2025? I approach conference travel the same way I approach product strategy: optimize for outcomes, reduce friction, and invest in high-signal experiences. Here’s the playbook I use to choose the right hotel, find memorable meals, and make the most of every hour in the city.

For lodging, I prioritize walkability, safety, and quiet rooms so I can focus during sessions and recover at night. If you want to be steps from most venues and meetups, SoMa and the Yerba Buena corridor are ideal. InterContinental San Francisco, W San Francisco, and The Clancy (Autograph Collection) are reliable, business-friendly picks with strong Wi‑Fi and ample lobby space for impromptu one‑on‑ones. If you prefer classic energy and transit access, Union Square hotels like Hotel Nikko and The Westin St. Francis work well. For waterfront views and a calmer vibe, Hyatt Regency Embarcadero puts you by the Ferry Building with easy BART and Muni access.

My booking checklist is simple: reserve early, target a high floor away from elevators, and request early check‑in or late checkout around your session schedule. Loyalty programs often unlock better rates and quiet‑room preferences. If you need heads‑down time between talks, ask about day‑use meeting rooms or find a corner of the lobby with stable bandwidth. I also pack a compact power strip and a long USB‑C cable, two small upgrades that routinely save the day.

Coffee is the fuel of great product conversations. Near SoMa, I rotate between Blue Bottle (Mint Plaza), Sightglass (7th Street), and Philz (Front Street) for pre‑session caffeine and quick stand‑ups. If I’m on the Embarcadero side, the Ferry Building’s roasters are perfect for early starts, and morning lines move faster than you’d expect if you arrive just after opening.

For efficient lunches, I favor fast‑casual spots that can handle volume without sacrificing quality. Mixt, Souvla, Sweetgreen, Super Duper Burgers, and The Grove are dependable within a short walk of most downtown venues. When I need a higher‑signal lunch with a partner or prospect, I book a table slightly off the main corridor to avoid the rush—think Mourad for elevated Moroccan in SoMa or Boulevard along the Embarcadero for a polished, quiet conversation.

Dinner is where the best networking often happens, so I plan for atmosphere, acoustics, and a menu that works for mixed dietary needs. Kokkari Estiatorio (FiDi) excels for executive dinners. Liholiho Yacht Club is a creative, memorable choice for cross‑functional teams. Waterbar or Angler near the waterfront pair great food with views that impress visiting colleagues. For something more casual but still conversation‑friendly, Nopa or Sorella deliver consistently.

When it’s time for drinks, I think in terms of groups and goals. For panoramic views and small group catch‑ups, The View Lounge (Marriott Marquis) is a classic. For wine‑forward conversations with a quiet ambiance, Press Club near Yerba Buena works well. If you’re hosting a more energetic crew, Charmaine’s (SF Proper Hotel), Dirty Habit (Hotel Zelos), or 25 Lusk offer space, good music, and reliable service. For craft cocktails, Pacific Cocktail Haven and ABV are standouts if you don’t mind a short ride.

Transit and timing matter. From SFO or OAK, BART is often the fastest, most predictable route downtown; rideshare is convenient late at night. I walk whenever possible, but I stick to well‑lit, busier streets and avoid sprinting between neighborhoods when I’m tight on time. Microclimates are real: bring layers, comfortable shoes, and a compact umbrella. I schedule 15‑minute buffers around key sessions to handle inevitable friend‑of‑a‑friend introductions.

If you need a professional setting for a quick working session, many hotels will extend lobby seating to guests and their visitors. For dedicated space, day passes at coworking operators like Industrious, CANOPY, or Regus are worth it when you’ve got a client briefing or board prep. For a more casual backdrop, Sightglass and Blue Bottle locations typically have reliable Wi‑Fi and just enough outlets if you arrive off‑peak.

Finally, a word on intent: I set a simple goal for each day—one meaningful connection, one surprising insight, and one concrete action to bring back to my team. ProductCon San Francisco 2025 is a catalyst if you design your experience with the same rigor you apply to your roadmap. If you spot me in a session or at a nearby cafe, say hello—I’m always up for trading notes on product strategy, pricing experiments, and what’s working in the field right now.

Quick note: restaurants and hours can change quickly—make reservations where possible and double‑check opening times the week of the event.

Inspired by this post on Product School.

November 3, 2025
Organizational Development Demystified: The Engine Behind Smarter Teams, Culture, and Growth

When people ask me how product organizations actually scale what works, I point them to a simple truth: organizational development is the operating system that makes strategy executable, teams empowered, and outcomes repeatable.

It turns out that organizational development isn’t just HR lingo. It’s the engine behind smarter teams, better culture, and long-term growth.

In practice, I think of organizational development as the discipline that aligns structure, incentives, rituals, and learning loops so empowered product teams can do their best work. It connects product management leadership with execution through clear decision rights, transparent roadmapping, and ways of working that reduce friction across product, design, and engineering.

On the ground, this looks like moving from activity measures to outcome-based OKRs, forming durable product trios to own customer problems end to end, and tightening stakeholder management so priorities don’t whipsaw week to week. It also means investing in onboarding that accelerates time-to-impact, creating feedback rituals that surface risks early, and using retention analysis to make smarter bets about where to double down.

The payoff is tangible: faster decision-making, fewer handoffs, and clearer accountability. Teams ship with confidence, leaders get leading indicators instead of lagging surprises, and employee retention at startups improves because people see how their work connects to a meaningful value proposition and product-led growth.

In my own practice, shifting to outcomes-first planning, establishing product trios, and clarifying interfaces across functions reduced decision latency, improved deployment frequency, and made ownership unmistakable. The organization became more resilient because the culture, processes, and metrics reinforced one another instead of competing for attention.

If you’re starting from scratch, begin by aligning on a small set of outcomes that matter, then redesign ceremonies and artifacts to serve those outcomes. Next, empower teams with clear autonomy and constraints—enough freedom to discover, enough guardrails to focus. Finally, make learning visible: use lightweight postmortems, discovery reviews, and customer signal dashboards so your operating system continuously improves.

Organizational development isn’t a one-time reorg; it’s a habit. When we treat it as a product—iterating on roles, rituals, and metrics just like we iterate on features—performance compounds, culture strengthens, and growth becomes sustainable.

Inspired by this post on Product School.

November 3, 2025