I move fastest in Generative AI when I strip work down to its essential signals. At HighLevel, I rely on a single-page format, “Prototyping Requirements: The One-Pager for AI PMs,” to turn ideas into testable artifacts within hours, not weeks. This approach reinforces AI Strategy, minimizes coordination overhead, and keeps Product Management focused on learning over ceremony.
“Prototyping requirements go rogue: one page, zero bureaucracy, built for AI. Shape concepts fast, prompt tools directly, and get to the truth sooner.”
In practice, my one-pager captures only what’s required to run an immediate experiment: the user problem, the target behavior change, success signals, core constraints, intended AI workflows, and the smallest realistic path to an evaluable demo. I also include example prompts, guardrails, and evaluation criteria so the team can apply prompt engineering and LLMs for product managers without guessing.
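To make the field list above concrete, here is a minimal sketch of the one-pager as a structure that flags what is still blank before an experiment starts. The names `OnePager` and `missing_sections` are illustrative assumptions, not a fixed template.

```python
from dataclasses import dataclass, field, fields


@dataclass
class OnePager:
    """Hypothetical container for the one-pager sections listed above."""
    user_problem: str = ""
    target_behavior_change: str = ""
    success_signals: list[str] = field(default_factory=list)
    core_constraints: list[str] = field(default_factory=list)
    ai_workflows: list[str] = field(default_factory=list)
    smallest_demo_path: str = ""
    example_prompts: list[str] = field(default_factory=list)
    guardrails: list[str] = field(default_factory=list)
    evaluation_criteria: list[str] = field(default_factory=list)


def missing_sections(page: OnePager) -> list[str]:
    """Return the names of sections that are still empty."""
    return [f.name for f in fields(page) if not getattr(page, f.name)]


# Example with made-up content: two sections filled, the rest still open.
draft = OnePager(
    user_problem="New users stall during first-run setup",
    target_behavior_change="Finish setup in the first session",
)
todo = missing_sections(draft)
```

If `todo` is non-empty, the page is not yet ready to drive an experiment, which keeps the "one page, nothing extra" rule checkable rather than aspirational.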
This is eval-driven development in action. I document a minimal hypothesis, concrete inputs/outputs, and a quick plan for metrics, including qualitative signals from product discovery and continuous discovery. By prompting tools directly, we expose assumptions early, shorten feedback loops, and build an AI product toolbox that compounds learning sprint after sprint.
I run this with a product trio to ensure we balance feasibility, usability, and value. We align on risks, dependencies, and what “good” looks like, then we integrate the learnings into product roadmapping and sprint planning. The result: fewer meetings, tighter collaboration, and empowered product teams delivering sharper outcomes with less friction.
If you want speed and clarity without sacrificing rigor, adopt the one-pager. It centers the conversation on evidence, accelerates AI workflows from prompt to prototype, and makes it obvious what to try next—and what to stop doing. Most importantly, it keeps the team focused on truth over theater, which is how great AI products actually ship.
I created this practical guide to help product managers cut through the hype and apply AI where it genuinely moves the needle—faster discovery, clearer strategy, sharper execution, and measurable outcomes.
A practical guide to AI tools for product managers: tested picks, what each tool is best for, copy-paste prompts, workflows, and screenshot checklists.
Leading product management at HighLevel, I’ve pressure-tested dozens of gen AI solutions across product discovery, roadmap planning, delivery, and go-to-market. In this guide, I map an AI product toolbox to core PM jobs-to-be-done so you can move from experimentation to repeatable impact with confidence.
Expect clear recommendations on where each tool excels—LLMs for product managers, research synthesis for customer interviews, behavioral analytics for opportunity sizing, and lightweight automation for in-app guides and product tours. I connect these tools to proven practices like continuous discovery, outcomes vs output OKRs, and product roadmapping and sprint planning so you can operationalize AI inside your existing workflows.
I also share the evaluation criteria I use before rollout—AI Strategy alignment, data governance and privacy-by-design, AI risk management, observability, and total cost of ownership. This eval-driven development approach helps teams avoid technology FOMO while creating defensible, trustworthy workflows that scale.
To accelerate adoption, I’ve included copy-paste prompts (including prompt engineering patterns for both chat and voice), retrieval-first pipeline blueprints to ground your models in product docs and decision logs, and conversation design tips for support and success use cases. You’ll see step-by-step AI workflows that tie directly to journey mapping, opportunity solution trees, and Kano Model trade-offs.
Every workflow comes with screenshot checklists you can use for onboarding or stakeholder management, making it easy to align ICs and leaders on the same operating picture. Whether you’re optimizing A/B testing, retention analysis, or QBRs vs OKRs, these checklists turn good intentions into repeatable rituals.
Use this guide as your field companion to ship faster with higher confidence—reducing cycle time, improving signal in discovery, and building momentum for product-led growth. If you’re ready to translate generative AI into reliable PM leverage, start with the workflows, adapt the prompts, and make them your own.
AI headlines are everywhere, and many claim to know exactly what’s coming next. In product management, I’m often asked to make single-point predictions about gen AI and LLMs for product managers. I resist that temptation because confident forecasts are seductive and usually wrong.
Listening to Teresa Torres and Petra Wille unpack why certainty fails reinforced what I practice with my product trios: scenario planning. Instead of betting on one future, I explore several plausible ones, define the signals that would confirm or disconfirm each, and translate those insights into product strategy, roadmapping, and sprint planning that we can adapt as evidence evolves.
Their argument mirrors what I see with customers and stakeholders: people are bad at predicting the future, and overconfidence creates fragility. Early adopters don’t represent everyone, so when we extrapolate from enthusiasts to the mainstream, we waste time and erode trust by building the wrong things.
Here’s how I apply this to avoid technology FOMO and make sharper AI Strategy decisions. I treat every bold claim as one possible future, then ask, “what else could happen?” I push extremes—AI everywhere vs. AI as invisible utility; GUIs vanish vs. GUIs evolve; centralized vs. edge compute—and hunt for the needs that stay true across scenarios. Those invariants anchor empowered product teams to outcomes, not outputs, and they help us stage bets responsibly.
My key takeaways: Confident predictions are often wrong. Early adopters don’t represent everyone. Treat predictions as one possible future. Scenario planning > trying to be right. Focus on patterns, not hype.
In short: We’re in a period of change—but no one can predict exactly how it plays out. Strong predictions often ignore uncertainty.
A better approach in practice: Treat every prediction as a scenario. Ask: what else could happen? Use multiple futures to guide decisions.
As you evaluate roadmaps, watch for traps like “My experience = everyone’s future” thinking, over-indexing on early adopters, and ignoring real-world constraints like budgets, compliance, and change management.
Tactically, we run quick scenario exercises, push ideas to extremes to explore implications, and extract the underlying insight (not the exact prediction). This complements continuous discovery and helps us write outcomes vs output OKRs that are resilient to uncertainty.
00:00 – The problem with future predictions
04:00 – Why experts get it wrong
06:00 – Scenario planning explained
12:00 – Early adopters vs. reality
20:00 – AI, GUIs, and extreme takes
27:00 – Using scenarios in product work
34:00 – Final thoughts
Resources & Links:
Follow Teresa Torres: https://ProductTalk.org
Follow Petra Wille: https://Petra-Wille.com
Mentioned in this episode:
Claude Code
What did I miss—or what scenarios are you considering for your team? Leave a comment below and let’s compare notes.
Product teams rarely fail because they don’t ship enough features; they fail because they don’t learn fast enough. That’s the core tension I manage every day: when to build to learn and when to build to earn. Navigating that balance is how we protect focus, accelerate time-to-value, and ultimately deliver durable business impact.
Over the years, I’ve seen at least two major ways to develop product: build to learn and build to earn. The first is discovery-led and evidence-seeking; the second is delivery-led and value-capturing. Both are essential. The real craft is knowing which mode to be in, when to switch, and how to keep stakeholders aligned around outcomes instead of output.
The project model remains the default in many organizations—even in the age of AI—and it’s all about output. Stakeholders or executives assemble a prioritized roadmap of features and projects, and teams ship against it. This can create momentum, but without clear outcome metrics and customer validation, it’s easy to drift into a feature factory that looks productive while missing the mark on user value and business results.
When I build to learn, I emphasize continuous discovery. That means using customer interviews to surface unmet needs, running lightweight prototypes to test desirability and usability, and deploying A/B testing to quantify impact. I map assumptions, risks, and opportunities with an opportunity solution tree, and I timebox experiments so we learn fast and cheap. The standard is evidence, not opinions—especially my own. The goal is simple: reduce uncertainty before we scale.
When I build to earn, the objective shifts to capturing value with confidence. Here I align teams to outcomes vs output OKRs, commit to clear acceptance criteria, and ensure product roadmapping and sprint planning reflect the highest-leverage bets we validated in discovery. Delivery excellence matters: crisp definition, reliable release trains, observability, and a strong feedback loop to confirm we’re moving activation, conversion, or retention in the intended direction.
Deciding when to transition from learning to earning is all about thresholds of evidence. I look for leading indicators that our solution reliably solves the target problem, shows a measurable lift in key behaviors, and can be delivered with acceptable risk. If we can’t articulate the expected outcome and how we’ll measure it, we’re not ready to scale. If we can, we invest, monitor impact, and keep guardrails in place to avoid scope drift.
The operating model that makes this sustainable is simple and disciplined. I rely on empowered product teams organized as product trios (product, design, engineering) to run dual tracks of discovery and delivery. We socialize learning with stakeholders early and often to strengthen trust and stakeholder management. We elevate strategy by linking every roadmap item to a problem statement, a testable hypothesis, and a quantified outcome—no orphan features, no vanity launches.
In the AI era, speed can tempt us back into shipping-by-idea. I use gen AI for product prototyping and insight synthesis, and I lean on LLMs for product managers to accelerate discovery work—without treating AI as a shortcut to validation. Our AI Strategy clarifies where AI augments discovery, where it powers the product, and how we evaluate risk, so we move faster without compromising rigor or ethics.
My rule of thumb: spend just enough time building to learn to achieve conviction, then shift decisively to building to earn—while preserving a small discovery cadence to keep learning alive. This rhythm protects focus, compounds insight, and makes growth more predictable. It’s how we avoid the output trap, deliver meaningful outcomes, and create products that customers love and the business celebrates.
Product roadmaps should not be promises etched in stone; they are portfolios of bets made under uncertainty. When I build a roadmap, I’m not predicting the future—I’m designing a system that helps the team learn faster than the market changes, allocate capital wisely, and create alignment across engineering, design, go-to-market, and leadership.
The best roadmaps I’ve seen and shipped anchor on outcomes rather than features. “Outcomes vs output OKRs” is more than a slogan; it’s how we translate strategy into measurable impact. I start by defining a small set of outcome metrics that matter—such as activation rate, time-to-first-value, or expansion revenue—and attach clear key results and guardrails to each theme. This reframes prioritization from “what can we build?” to “what must change in customer behavior?” and gives empowered product teams real autonomy.
I organize the roadmap into time horizons—Now, Next, Later—with explicit confidence levels. Near-term items have higher confidence and more specificity; mid- and long-term bets are thematic with wider time windows. This approach reduces false precision and builds trust because stakeholders can see both the intent and the uncertainty. When dates matter, I use windows and service level expectations rather than single deadlines, and I pair each initiative with a lightweight risk scoring so we can discuss uncertainty explicitly rather than implicitly.
Continuous discovery keeps the roadmap honest. I partner in tight “product trios” across product, design, and engineering to run rapid customer interviews, opportunity sizing, and assumption tests before we commit significant delivery capacity. The opportunity solution tree is my favorite artifact here; it visualizes the path from outcomes to opportunities to experiments and solutions, making trade-offs and sequencing transparent. By the time something moves into sprint planning, we’ve already reduced key uncertainties and clarified the narrowest viable slice we can ship.
Uncertainty demands options. I plan initiatives as options with stage gates and explicit kill criteria rather than as single monolithic projects. For every significant theme, I outline base, best, and worst-case scenarios with pre-decided triggers for when we escalate, pivot, or stop. This practice prevents sunk-cost fallacy and keeps the team focused on evidence. We treat scope as a knob, not a switch, and we bias toward small, sequential bets that compound learning.
Capacity is strategy. I routinely reserve a discovery buffer—typically 10–20%—and a contingency buffer for integration, security, and performance risks that always show up late. I ruthlessly control work-in-progress to limit thrash and protect the team’s ability to respond when new information arrives. When we must navigate dependencies, I use thin vertical slices and decouple via contracts or feature flags so discovery momentum doesn’t stall while platforms evolve underneath.
Prioritization under uncertainty benefits from explicit models. I combine value, effort, and confidence with risk scoring to surface where the unknowns are hiding. Driver trees help us connect top-level outcomes to leading indicators, so we can place bets where they have the highest causal leverage. I also lean on the Kano Model and qualitative signals to avoid over-investing in performance attributes while neglecting excitement features that unlock differentiation and word-of-mouth.
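As one hypothetical way to combine value, effort, confidence, and risk into a single ranking, consider the sketch below. The formula, the scales, and the example numbers are assumptions for illustration, not a standard model; the point is that low confidence and high risk should visibly drag a bet down.

```python
from dataclasses import dataclass


@dataclass
class Bet:
    name: str
    value: float       # expected impact on an assumed 1-10 scale
    effort: float      # relative cost, e.g. person-weeks (assumed unit)
    confidence: float  # 0.0-1.0, strength of the evidence so far
    risk: float        # 0.0-1.0, share of value we expect to lose to risk


def score(bet: Bet) -> float:
    """Illustrative score: risk-adjusted value times confidence, per unit of effort."""
    if bet.effort <= 0:
        raise ValueError("effort must be positive")
    return bet.value * bet.confidence * (1.0 - bet.risk) / bet.effort


def rank(bets: list[Bet]) -> list[Bet]:
    """Sort bets from highest to lowest score."""
    return sorted(bets, key=score, reverse=True)


# Made-up inputs for two candidate bets.
bets = [
    Bet("in-app guide", value=8, effort=2, confidence=0.8, risk=0.1),
    Bet("integration wizard", value=9, effort=5, confidence=0.5, risk=0.3),
]
ranked = rank(bets)
```

A score like this is a conversation starter, not a verdict. Its main use is to expose where a high-value bet is carried by weak evidence.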
The most effective stakeholder management is narrative-first. For executives, I present a one-page outcomes roadmap that shows themes, expected shifts in key results, and the learning plan. For teams, I provide a more detailed plan that links discovery insights, assumptions-to-test, and decision points. I make room for a “what we’re not doing” section to reduce noise and prevent shadow backlogs from reappearing in every meeting. Most importantly, I socialize change before it happens, explaining the evidence and the trade-offs so adjustments feel like progress, not whiplash.
Measurement closes the loop. We instrument experiments and releases with leading indicators tied to the driver tree and review them on a predictable cadence. If movement stalls, we diagnose whether we have a targeting problem (wrong audience), a value problem (weak proposition), or a friction problem (broken journey). That discipline lets us iterate with purpose instead of chasing vanity metrics or isolated anecdotes.
Here’s a concrete example of roadmapping through uncertainty. Suppose our Q3 objective is to “Increase user activation” with key results to raise the Week-1 activation rate from 32% to 45% and cut time-to-first-value by 30%. In discovery, customer interviews reveal confusion in the first-run setup and a missing integration that advanced users expect. We map an opportunity solution tree and identify two high-leverage opportunities: simplifying the first 10 minutes and offering a guided setup for the integration. We then shape two minimal bets: an in-app guide to streamline the first three tasks and an integration wizard behind a feature flag. Each bet has an explicit decision rule and a two-sprint runway. We ship the guide first, confirm a statistically significant lift via A/B testing, then expand scope. The integration wizard underperforms initial expectations, so we pause, revisit the assumptions, and re-allocate buffer to the stronger path. The roadmap updates in real time, and everyone understands why.
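The "statistically significant lift" check in this example can be sketched with a two-proportion z-test. Only the 32% baseline comes from the scenario above; the sample sizes, the treatment count, and the 0.05 threshold are assumptions I picked for illustration.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: both arms share the same rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: control activates 320 of 1000 users (32%),
# the in-app guide arm activates 380 of 1000.
z, p_value = two_proportion_z_test(conv_a=320, n_a=1000, conv_b=380, n_b=1000)

# Assumed decision rule: expand scope only on a positive, significant lift.
expand_guide = z > 0 and p_value < 0.05
```

Writing the decision rule down before the test runs is what makes it a rule rather than a rationalization.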
When uncertainty spikes—new competitor, pricing shock, platform deprecation—I shift the roadmap cadence to rolling-wave planning. We shorten planning horizons, increase the frequency of readouts, and elevate discovery allocations temporarily. We also create thematic “containment zones” where we explore multiple options in parallel with small budgets until one path justifies scale. This allows us to stay responsive without abandoning strategy.
Good governance accelerates; it doesn’t slow. A lightweight product council that reviews outcomes, risks, and cross-functional dependencies prevents surprise escalations and ensures we keep shipping what matters. We avoid death-by-approval by agreeing in advance on decision rights and thresholds. For example, a product trio can pivot a bet within a theme up to a certain budget or timeline impact without additional approval, as long as it improves the outcome likelihood.
If you’re evolving your roadmap practice, start with three moves. First, reframe your plan in outcomes and publish a driver tree that connects those outcomes to the few leading indicators you believe move them. Second, stand up a continuous discovery cadence with a visible opportunity solution tree and an assumptions-to-test backlog. Third, implement time windows and confidence levels for all mid- and long-term items, and pair each major initiative with explicit kill criteria. You’ll feel the difference in a single quarter: clearer trade-offs, faster learning, and more predictable delivery—despite uncertainty.
In the end, a roadmap that thrives in uncertainty is an agreement about how we learn and decide together. It aligns the organization on outcomes, it funds options—not fantasies—and it gives empowered product teams room to maneuver. That’s how top product teams plan for uncertainty and still deliver with confidence.
Are you an AI product manager, or do you want to become one? This guide cuts through the noise and shows where the PM role is really heading with AI.
I’ve spent the last few years scaling AI initiatives across complex SaaS products, and I’ve learned that “AI product manager” isn’t a vanity title—it’s a capability set. The role evolves traditional product management with new responsibilities across data, model behavior, risk, and continuous learning systems. My goal here is to demystify what matters, so you can lead with clarity, build with confidence, and deliver measurable outcomes.
First, let’s separate hype from reality. An effective AI Strategy starts with the customer problem, not the model. I anchor roadmaps around clear use cases, then evaluate whether we need a retrieval-first pipeline, agentic AI, or conventional automation. “Build vs buy” is no longer a procurement question; it’s a lifecycle question about iteration speed, quality control, data governance, and long-term unit economics.
Discovery also looks different. I still run continuous discovery and customer interviews, but I augment them with behavioral analytics and targeted experiments to validate feasibility, risk, and value. I practice privacy-by-design and AI risk management from day one, and I define guardrails for acceptable model behavior alongside success metrics. When high stakes are involved, I document data provenance and align with regulatory compliance standards to protect customers and the business.
Execution shifts from shipping static features to operating learning systems. In product roadmapping and sprint planning, I account for context window management, prompt engineering, and the realities of LLMs for product managers: latency, cost, drift, and failure modes. I use feature flags, A/B testing, and eval-driven development to move from offline model evals to online impact with a minimum detectable effect (MDE) worth the release risk. Observability, anomaly detection, and incident management aren’t optional—they’re how we earn trust.
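To show how a minimum detectable effect translates into traffic, here is a rough sketch using the common normal-approximation formula for comparing two proportions. The baseline rate, the MDE, and the defaults for `alpha` and `power` are assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per arm to detect an absolute lift of `mde`."""
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


# Hypothetical numbers: 20% baseline, aiming to detect a 2-point absolute lift.
needed = sample_size_per_arm(baseline=0.20, mde=0.02)

# Halving the MDE roughly quadruples the traffic required.
needed_small_effect = sample_size_per_arm(baseline=0.20, mde=0.01)
```

This is why the MDE belongs in the release conversation: if the smallest lift worth shipping needs more traffic than the feature flag can reach, the online test cannot answer the question.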
Collaboration expands beyond engineering and design. I work closely with data science on evaluation frameworks, with solutions engineering to de-risk complex enterprise deployments, and with customer success to close the loop on model performance in the wild. Our outcomes vs output OKRs emphasize activation, time-to-value, and sustained retention over vanity accuracy metrics.
Tooling is now strategic advantage. My AI product toolbox includes prompt libraries with versioning, synthetic data generation where appropriate, and a disciplined approach to model and prompt regression tests. I standardize AI workflows—intake, evaluation, deployment, and monitoring—so teams can ship faster without cutting corners. This is how empowered product teams scale safely.
Career-wise, I look for—and coach—PMs who can frame trade-offs crisply: explain when to fine-tune vs use retrieval, when to embed agents, and when not to use AI at all. Show me driver trees that connect model metrics to business outcomes, a clear risk register, and a plan for continuous discovery. If you can tell a compelling story backed by transparent evaluation and customer value, you’re already ahead.
Here’s the bottom line: the “AI product manager” who matters in 2026 is a product leader who can turn uncertainty into systematized learning. If you focus on real customer problems, rigorous evaluation, responsible design, and iterative delivery, you won’t just carry the title; you’ll create durable competitive differentiation.
I keep a running list of product wisdom that sounds great on a slide but quietly sabotages execution. Recently, I revisited that list after a deep conversation with a seasoned CPO from a leading security and compliance platform and reflected on how these lessons show up in my own operating rhythm. What follows is my practical playbook for scaling product organizations without losing speed, quality, or the soul of the product.
Most big-tech veterans struggle when they leap into startups because the safety net of process disappears. At a startup, the buck truly stops with you—there’s no committee to shield a decision and no process to rescue a weak plan. The mindset shift is simple to say and hard to do: own outcomes end to end, reduce your reliance on institutional scaffolding, and make decisions with incomplete information while keeping standards high.
“Great product leaders stay in the details.” I sample artifacts every week—PRDs, design flows, user research notes, postmortems—and I read customer threads to calibrate my intuition. To maintain shipping velocity as headcount grows, I instrument a few critical indicators (deployment frequency, change failure rate) and favor outcomes over output. Data guides my attention; it never replaces judgment.
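For reference, the two indicators named above can be computed from a plain deploy log. This is a minimal sketch; the `Deploy` record and its `caused_failure` flag are hypothetical stand-ins for whatever your release tooling actually records.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Deploy:
    day: date
    caused_failure: bool  # hypothetical flag: needed a rollback or hotfix


def deployment_frequency(deploys: list[Deploy], days_in_window: int) -> float:
    """Average number of deploys per day over the window."""
    return len(deploys) / days_in_window


def change_failure_rate(deploys: list[Deploy]) -> float:
    """Share of deploys that caused a failure; 0.0 when there were none."""
    if not deploys:
        return 0.0
    return sum(d.caused_failure for d in deploys) / len(deploys)


# Made-up week: four deploys, one of which needed a rollback.
week = [
    Deploy(date(2025, 3, 3), False),
    Deploy(date(2025, 3, 4), True),
    Deploy(date(2025, 3, 5), False),
    Deploy(date(2025, 3, 7), False),
]
frequency = deployment_frequency(week, days_in_window=7)
failure_rate = change_failure_rate(week)
```

Read together, the pair guards against the obvious failure mode: shipping more often by shipping worse.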
As teams scale, I use a blunt rule to keep speed high: small autonomous teams, small batch sizes, short feedback loops. One clear owner, one prioritized backlog, and weekly demos to customers. We ship thin slices, not big bangs. Great CPOs should also avoid comfort metrics, the easy dashboards that rise when nothing meaningful is moving. I push for outcome-centric OKRs tied to customer value, not vanity charts.
Rigid hierarchies derail quality decision-making. They slow signal, encourage escalation theater, and suppress the truth from the edges. I shorten paths between PMs, engineers, designers, research, and go-to-market leads, and I strip out stage gates that don’t add learning. Above all, I refuse to make my team fetch rocks, meaning chase randomized executive requests that arrive without context. Instead, I frame clear problem statements, explicit constraints, and observable success criteria.
Revenue and product can feel at odds, but they don’t have to be. The key to a quality CPO and CRO relationship is a shared operating model: one customer narrative, a joint pipeline of problems worth solving, and a common scorecard. We meet weekly, review the same signals, and align on sequencing: what we solve now for impact, what we stage for scale, and what we sunset to reduce complexity. When trade-offs get tough, we anchor on customer value and long-term defensibility.
Who ultimately oversees the quality bar? I do—and I do it through clarity, exemplars, and consistent feedback loops, not micromanagement. When I leave feedback, I make it actionable and specific: name the user scenario, note the friction, propose a sharper decision frame, and suggest a smaller, testable slice. I expect narrative memos and crisp acceptance criteria; I offer rapid, detailed responses so momentum never stalls.
Open office hours are my forcing function for transparency and speed. Anyone can bring a thorny escalation, a design in progress, or a customer insight. Pair that with weekly 1:1s—non-negotiable for developing leaders and unblocking work—and the organization learns to surface issues early, make faster decisions, and self-correct without drama.
Here’s a glimpse into my working week: Mondays set priorities and confirm the few decisions that matter; midweek is for deep reviews across roadmap, research, and engineering readiness; Thursdays I’m with customers and partners; Fridays I write and synthesize. I leave space for unscripted time with individual contributors—because ICs are the unsung heroes of a company—and I celebrate excellent craft out loud.
The hardest leadership skill is knowing when to push and when to give space. I push on clarity, sequencing, and quality; I give space on solutions and implementation paths. I reject comfort metrics, reinforce outcomes vs. output, and keep the organization close to customers and details. If you’re stepping from big tech into a startup or scaling your product org through rapid growth, these practices will help you ship faster, decide better, and raise the quality bar without burning out your team.
“Outcomes over outputs” is the right mantra—and one I’ve championed across product teams—but turning it into daily practice is where most teams stumble.
It’s simple in theory: focus on the impact of what we build, not just shipping features. In reality, it’s rarely black and white because most teams are asked to do both—hit outcomes and deliver specific outputs—at the same time.
In a benchmark survey, 20% of product teams claim to be outcome-focused, nearly half describe themselves as working in a mix of outcomes and outputs, and about 30% are still primarily working with outputs. I’ve seen versions of this in my own org: we aspire to outcomes, but our rituals, roadmaps, and reporting still reward shipping.
Here’s how I draw the line clearly, coach my teams to avoid common traps, and negotiate better, more actionable outcomes that unlock genuine product discovery and business results.
Simple definitions we live by
An output is something you build or produce—a feature, a project, an initiative. It’s something your team ships.
An outcome is the impact of that output—a change in customer behavior or a business result.
Josh Seiden puts it well in his book Outcomes Over Output: “An outcome is a change in human behavior that drives business results.”
Figure: Outputs vs. outcomes. Value emerges between deliverables and impact, when features change customer behavior and move business results.
I distinguish business outcomes from product outcomes. Business outcomes are typically financial metrics that measure the health of the business (e.g., increasing revenue or reducing costs), while product outcomes measure a customer behavior in the product or a sentiment about the product.
Here’s a simple example I’ve used with platform teams. Many B2B companies support a number of integrations. Integrations are outputs. Having integrations alone doesn’t create value. Customers using and finding value in those integrations—that’s an outcome. If those customers retain their subscriptions longer because of the integrations—that’s also an outcome.
Building something isn’t the same as creating value. That’s the core of this distinction, and it’s what separates empowered product teams from feature factories.
Why this distinction matters for empowered product teams
When we task teams with delivering outputs, they’re done when the software ships. When we task teams with delivering outcomes, they aren’t done until the software ships and has the expected impact.
That small shift changes almost everything about how a team works: what we measure (impact, not just delivery), how we know we’re done (measurable behavior change, not release notes), the autonomy we grant (told what to achieve, not what to build), and the planning artifacts we use (an opportunity solution tree beats a feature roadmap when we’re exploring the best path to an outcome).
When I assign outcomes, I’m giving the team latitude—and responsibility—to figure out the best path to success. That’s what opens the door for real product discovery and continuous discovery habits.
Figure: Side-by-side comparison of shipping features vs. achieving impact: how outcome-driven teams measure success, grant more autonomy, define “done” by results, and plan with an opportunity solution tree.
Examples: spotting outputs disguised as outcomes
Clear-cut example: “Our outcome is to deliver an Android app.” An Android app is something we build and ship. It’s clearly an output.
To get to an outcome, I ask, “What’s the value of having an Android app?” or “How will we know the Android app is successful?”
We might answer: “Having an Android app will allow us to engage more users. We’ll know it’s successful when people engage with the app on a regular basis.”
This answer uncovers the hidden outcome: engage more people. Now we can set the right scope: increase the percentage of engaged users across any platform; increase the percentage of engaged mobile users; or increase the percentage of engaged Android users.
Any of these outcomes gives us more room to explore than a fixed output. Maybe we don’t need a native app at all. We could deliver the same engagement through a mobile web experience, notifications, or email. And we’re not done when we ship—we’re done when the right people are actually engaged.
Tricky example 1: measure the value creation moment (hires, not applicants)
Figure: The path from “build an Android app” to the real goal, “increase engaged users,” by asking why, defining value, and owning results.
When setting outcomes, it’s tempting to choose the easiest-to-measure metric. But a good outcome measures the customer’s value creation moment.
I worked at a company that helped new college grads find their first job. When I started working there, the primary outcome was “increase job applications.” This technically is an outcome—it measures a specific behavior in the product.
But it doesn’t measure the value creation moment. A job seeker doesn’t get value when they apply for a job. They only get value when they get the job. Similarly, employers don’t get value from just any job applicant; they get value when the right job applicant applies.
Many job boards try to measure qualified applicants—instead of counting any applicant, they compare the credentials of the applicant to the job description and only count qualified applicants. This is better. But it still doesn’t measure the value creation moment. Both the job seeker and the employer get value when an open job is successfully filled. The right metric is hires.
Yes, “hires” can be hard to instrument because hiring happens off-platform and incentives are misaligned. Measure it anyway, even with proxies. The easy metric isn’t always the right outcome.
Tricky example 2: measure impact, not user-generated output (the course reviews trap)
I worked with a team that helped students choose university courses. They set their outcome as: “Increase the number of course reviews on our platform.”
Confusing activity with impact? This visual breaks down four common outcome traps—measuring at the wrong moment, mistaking outputs, chasing adoption, and relying on sentiment—so teams focus on real value.
Sounds like an outcome, right? It’s a metric. You can measure it. It’s an action users take on the site—writing a review. But it’s actually an output in disguise.
Reviews are valuable when they help a student evaluate a course. They don’t create any value if a student never sees them. More reviews aren’t always better, especially if they’re clustered where nobody looks.
A better outcome is “Increase the number of course views that include reviews.” Now we’re measuring impact on the decision moment, not just the production of content.
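To make the difference concrete, here is a minimal Python sketch. The data and the function names are hypothetical and are not taken from the team's product. It only shows how the two metrics can diverge when reviews cluster on a course few students look at.

```python
# Hypothetical data: one entry per review, keyed by course id.
reviews = ["bio101", "bio101", "bio101", "bio101", "art200"]

# Hypothetical data: one entry per course page view.
course_views = ["chem110", "chem110", "chem110", "art200", "bio101"]


def count_reviews(reviews):
    """Output-style metric: how many reviews exist, wherever they sit."""
    return len(reviews)


def count_views_with_reviews(course_views, reviews):
    """Outcome-style metric: course views where at least one review was available."""
    reviewed_courses = set(reviews)
    return sum(1 for course in course_views if course in reviewed_courses)


total_reviews = count_reviews(reviews)
views_with_reviews = count_views_with_reviews(course_views, reviews)
print(total_reviews, views_with_reviews, len(course_views))
```

In this made-up data, most reviews sit on one course while most views go to a course with no reviews, so adding more reviews to the already-reviewed course would raise the first number without moving the second.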
If you can hit your metric without helping customers, you’re tracking an output, not an outcome.
Tricky example 3: measure success, not just adoption (the traction metric trap)
Consider a team that sets this outcome: “Increase the percentage of users who viewed the performance report.”
This looks like a good outcome. It measures a specific behavior in the product. It’s within the team’s control. But it’s what I call a traction metric—it measures adoption of a single feature, not value to the customer.
Why teams get trapped in shipping features: a vicious trust cycle fuels micromanagement, while performance-linked outcomes push safe targets. Break the loop and refocus on customer outcomes that truly move the needle.
Two problems arise. First, people can view the report and still not find what they need. Second, we might have perfectly happy customers who don’t need the report at all. Driving usage of an unneeded feature wastes time and erodes trust.
Measure the value creation moment, not just feature adoption.
Tricky example 4: pair sentiment with behavior
I define a product outcome as a metric that measures either 1. a specific behavior in the product or 2. a sentiment about the product. But sentiment metrics—like CSAT or NPS—can be tricky on their own.
Sentiment metrics are outcomes, but they aren’t directional. They don’t tell us where to explore or set guardrails for what to avoid. So I pair a behavior with a sentiment, for example: “Increase engagement without negatively impacting satisfaction.” I use sentiment as a counterweight.
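One way to express that pairing is as a check that only counts an engagement gain as a win when satisfaction holds. The sketch below is a simplified illustration; the function name, the `max_csat_drop` tolerance, and the sample numbers are all assumptions of mine, not a standard formula.

```python
def evaluate_paired_outcome(engagement_before, engagement_after,
                            csat_before, csat_after, max_csat_drop=0.0):
    """Judge a behavior metric against a sentiment guardrail.

    max_csat_drop is a hypothetical tolerance: how much satisfaction
    may fall before the guardrail counts as breached.
    """
    engagement_up = engagement_after > engagement_before
    satisfaction_held = (csat_before - csat_after) <= max_csat_drop
    if engagement_up and satisfaction_held:
        return "success"
    if engagement_up:
        return "guardrail breached"
    return "no improvement"


# Illustrative numbers only.
print(evaluate_paired_outcome(0.30, 0.36, 4.2, 4.2))
print(evaluate_paired_outcome(0.30, 0.36, 4.2, 3.9))
```

The second call shows the case the guardrail exists for: engagement rose, but satisfaction fell, so the result is not treated as a success.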
Facebook and Instagram illustrate why this matters. Meta is exceptional at driving engagement—but to a fault. Many of us don’t like these addictive products. Pairing engagement with a satisfaction guardrail prevents “engagement at all costs.”
Why getting this right is hard (and how I counter it)
Ready to move from shipping features to creating impact? This visual playbook shares five practical moves—translate metrics, partner with teams, iterate, avoid traps, and dig deeper—to turn outputs into measurable outcomes.
The trust cycle. Managers don’t trust that teams can reach outcomes on their own. So managers micromanage the outputs. Teams, in turn, don’t communicate their progress toward outcomes—they communicate their progress on features. This reinforces the manager’s belief that they need to stay involved in the details. It’s a vicious cycle.
I break it by asking teams to show their work—share assumptions, research, opportunity solution trees, and evidence behind choices—and by giving feedback on the thinking, not just the solutions.
The accountability trap. When performance reviews are tied to hitting outcomes, teams play it safe. They sandbag their targets. They disguise outputs as outcomes to guarantee “success.”
I treat outcomes as learning opportunities first. When we start on a new outcome, I set a learning goal—“learn what moves the needle on this metric”—before a performance goal—“increase X by Y%.” This creates space to explore without fear.
How I get teams started with better outcomes
Translate business outcomes to product outcomes. Business outcomes like revenue, retention, and market share are lagging indicators—by the time you see them, it’s too late to act. Product outcomes measure behavior changes within the product that lead to those business results. They’re leading indicators within the team’s control.
Negotiate outcomes with your team. Outcome-setting should be a two-way conversation. Leadership brings the cross-company context. The team brings customer insight and technical realities. Neither side dictates; we co-own the target and the constraints.
Stop celebrating shipped features and start celebrating change. This visual contrasts a feature factory mindset with a true product team, urging teams to track impact, not output, and define success by outcomes.
Expect to iterate on your metrics. Your first outcome metric probably won’t be right. That’s normal. Sonja at tails.com went through four iterations—from 90-day retention to 30-day to 5-day to behavior-based metrics—before landing on something actionable. Thomas at Bluestone Analytics iterated three or four times before finding the right metric. Iteration is the work.
Watch for common mistakes. Outputs disguised as outcomes. Traction metrics masquerading as product outcomes. Sentiment metrics without direction. Business outcomes assigned directly to product teams without translating to behavior change.
Use the right artifacts. Replace feature roadmaps with an opportunity solution tree to explore multiple paths, test assumptions, and sequence bets explicitly against a clear outcome.
Align OKRs with outcomes. If your company uses OKRs, make sure the “KR”s are true product outcomes (behavior change and value creation), not a list of features to ship.
The bottom line
When we shift from an output-first mindset to an outcome-first mindset, it doesn’t mean that outputs stop mattering. Product teams will always ship features, and the ability to do so quickly and with quality still matters. This shift simply ensures those features achieve the intended impact. We aren’t done when we ship—we’re done when what we shipped has the intended impact.
Measure success by the impact of what you ship and you’ll build a product team that learns, adapts, and creates real value. Measure success by what you ship and you’ll get a feature factory.
Quick self-check: is your “outcome” really an outcome?
Ask yourself: 1) Does it measure a behavior change or a sentiment tied to value creation? 2) Could we hit it without helping customers? 3) Is it adoption of a single feature (a traction metric) or a result that customers and the business care about? 4) Do we have a counter-metric to prevent unintended harm? If you stumble on any of these, refine it before you commit.
Can an AI agent actually run a credible content audit end to end? I put that to the test. In my role leading product at a high-growth SaaS and as a hands-on content strategist, I’m constantly balancing depth with reach. During a recent office-hours discussion, someone asked me to zoom out and explain when to use Claude Code. That prompt inspired me to launch a running series—Conversations with Claude—showing exactly how I apply it to real product management and SEO problems.
I’m a heavy user and share what works for me. I receive no compensation from Anthropic for this series; if that ever changes, I’ll disclose it. With that out of the way, let’s dive into how I had Claude conduct a full content audit—and why the results exceeded my expectations.
For the first installment, I chose a fairly complex use case: a comprehensive content audit of my site. I expected this to be a slog. Instead, it was refreshingly fast and rigorous once I set Claude up with the right scaffolding.
I kicked off with a simple directive: start by asking clarifying questions, proceed step by step, and capture notes in a shared task file. I also provided deep context—specifically, the CDH Book (15 chapters + intro) and my entire blog archive in markdown—so the model could reason with my actual corpus rather than guessing from sparse prompts.
Claude began with smart clarifying questions that framed the analysis well:
1. Scope of keywords: Should it focus strictly on concepts unique to or heavily associated with my work like "opportunity solution tree" and "continuous discovery," or also include broader product management terms such as "product outcomes," "assumption testing," and "customer interviewing"?
2. Keyword geography: Start with US-only or include UK/global?
3. Blog coverage assessment: What counts as "well covered": dedicated deep dives or credible coverage within broader posts?
4. Output format: Add findings to the task file or create a separate deliverable?
Peek inside a Notion-style page that turns content strategy into action: a content-audit task with due date and tags, plus clear steps for keyword research, blog gap analysis, and SEO improvements.
I replied: 1. both 2. us only is a good place to start 3. evaluate this based on how well we rank for the keyword, if we rank reasonably well, you might suggest content improvements to rank better, if we don't rank at all, then you might suggest a whole new article 4. add to the task file
From there, Claude read the CDH Book, extracted roughly 100 keywords, ran them through Keywords Everywhere in two batches of 50 to capture search volume, and pulled current domain rankings and traffic metrics. Within minutes, I had a high-signal view of what’s working, where we’re invisible, and how to prioritize fixes.
The good news came first: we own our branded terms—#1 for "product discovery," "opportunity solution tree," "continuous discovery," and "product trio." That brand equity is doing real work for us.
The biggest gaps were in broad topics the CDH Book covers but where there’s no targeted content:
"Outcomes vs outputs" (1,300/mo): Arguably THE central thesis of CDH, and no ranking. This is the single biggest gap.
"Product roadmap" (4,400/mo): I have a strong anti-roadmap POV but no content targeting this.
"Product strategy" (1,900/mo): Ch 7 argues strategy = opportunity selection. Strong differentiator, no ranking.
"Story mapping" (5,400/mo): I use story maps uniquely (for surfacing assumptions). Huge volume.
"Stakeholder management" (2,900/mo): Ch 13 is entirely about this. No ranking.
"Pre-mortem" (4,400/mo): I cover this as a product discovery technique. No ranking.
Inside a dark-themed writing workspace, a long-form chapter is open while a tidy folder tree catalogs pages and chapters. The scene invites readers to think like auditors—inventory content, track structure, and surface gaps with AI assistance.
The trojan horse opportunity: High-volume generic terms like story mapping, pre-mortem, and usability testing could bring in readers who don't know about CDH yet. Write about these broadly-searched topics with my specific product-discovery angle.
In just a few minutes, Claude generated an analysis of what keywords we ranked for and at what position, a ranked set of high-, medium-, and lower-volume (but strategic) keywords where we didn’t rank yet had relevant content, concrete net-new topics to close the gaps, and a list of existing articles to update to lift their SERP positions. It worked far better than I expected.
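The triage rule I gave Claude (improve content where we already rank, write something new where we don't) is simple enough to sketch in Python. This is my own illustration, not what Claude ran. The `Keyword` type and `recommend` function are hypothetical, the volumes for the first three terms come from the audit above, and the last entry is invented to show the ranked case.

```python
from typing import NamedTuple, Optional


class Keyword(NamedTuple):
    term: str
    monthly_volume: int
    rank: Optional[int]  # None means the site does not rank for the term


def recommend(keyword: Keyword) -> str:
    """Apply the audit rule: no ranking means new content, otherwise improve."""
    if keyword.rank is None:
        return "write new article"
    return "improve existing content"


keywords = [
    Keyword("outcomes vs outputs", 1300, None),
    Keyword("story mapping", 5400, None),
    Keyword("stakeholder management", 2900, None),
    # Hypothetical entry: term, volume, and rank are invented for illustration.
    Keyword("example ranked term", 2000, 7),
]

# Highest search volume first, so the biggest opportunities surface at the top.
plan = sorted(keywords, key=lambda k: k.monthly_volume, reverse=True)
for k in plan:
    print(f"{k.term}: {k.monthly_volume}/mo -> {recommend(k)}")
```

Sorting by volume alone is a simplification. In the audit, strategic fit mattered too, which is why "outcomes vs outputs" was flagged as the biggest gap despite its lower volume.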
Here’s how I set it up so the model could deliver: I didn’t simply ask Claude.ai to "audit my site" and hope for the best. I supplied rich, relevant context (my book and all blog posts as markdown) so it could anchor on my language, frameworks, and mental models. I paired that with live data via APIs like Keywords Everywhere to ground recommendations in actual search volume and competitive rankings. With the right inputs, Claude Code behaved like a capable research analyst and an SEO strategist—able to reason, prioritize, and suggest high-leverage actions.
Next, I went deeper and used the findings to draft a long-form article that addresses the biggest gap, "Outcomes vs outputs," and ties it directly to product roadmapping and sprint planning. I wove in continuous discovery practices, opportunity solution tree techniques, and product trio collaboration to make it actionable for empowered product teams. I’ll share the end-to-end workflow, including files, prompts, and the editorial QA checklist, in a follow-up.
If you’re new to Claude Code and want a practical starting point, replicate the setup above: assemble your canonical sources in markdown, define a clear evaluation rubric, and ground keyword research with reliable volume data. If you want my exact task file, clarifying-question template, and step-by-step audit rubric, tell me which content gap you’d prioritize first and why—I’ll tailor the walkthrough to the highest-interest topic.
There’s a moment in every product leader’s career when the bravest decision isn’t to build—it’s to stop. That’s why the “Kill Your Darlings” theme resonated so strongly with me. In this episode of All Things Product, Teresa Torres and Petra Wille dig into the courage and craft it takes to sunset products that look successful on the surface yet quietly block your path to meaningful growth. As someone accountable for portfolio outcomes, I’ve learned that disciplined endings are often the catalyst for exceptional beginnings.
The heart of the conversation is that uncomfortable middle ground between obvious failure and runaway success: products that are profitable, loved by customers, but fundamentally flatlining. Teresa shares candid stories from her own business, including a decision to cut 40% of revenue on purpose. I’ve been there—choosing to retire a “working… kind of” product to free up discovery capacity felt risky in the moment, but it created the focus we needed for durable growth.
Here’s the trap: some traction can be more dangerous than no traction at all. Early fans are not the same as durable product–market fit, and “stable but not growing” can lull leaders into maintaining instead of learning. Every hour of design, engineering, and go-to-market attention that props up a flatlining product is an hour not invested in the next breakthrough—an opportunity cost that rarely shows up on a dashboard, yet compounds month after month.
From a portfolio perspective, this is continuous discovery in action. If we want empowered product teams to tackle meaningful outcomes, we have to protect their capacity from zombie work. That means setting clear thresholds for when we double down, shift strategies, or sunset, before attachment and inertia take over. When I institutionalized this discipline, our throughput of high-quality bets increased, and our confidence in what not to do became a strategic advantage.
Organization design can make sunsetting harder than it needs to be. Dedicated, long-lived teams are fantastic for compounding capability, but they also create emotional and structural ties to specific products. Petra’s point lands: leaders need explicit sunsetting conversations and a portfolio decision-making cadence that sits one level above teams. In my org, we treat sunsetting as a strategic reallocation—not a verdict on a team’s talent—so people are celebrated for learning, not punished for outcomes outside their control.
Killing profitable products can be the right strategic move when the growth ceiling is clear and the opportunity cost is high. I’ve chosen to “burn the ships (on purpose)” more than once—retiring add-ons that generated reliable revenue but diluted our value proposition and spread discovery thin. Yes, it stings in the quarter you do it. But it’s astonishing how quickly focus restores momentum when you create intentional space for what’s next.
Practically speaking, I make sunsetting easier and less traumatic by operationalizing it: regular portfolio reviews focused on outcomes and opportunity cost; a visible “sunsetting” column so everyone sees what’s on the table; the Horizon (H1 / H2 / H3) model to balance core, adjacent, and transformational bets; and portfolio decisions made one level above teams to avoid local optimizations. I also add explicit exit criteria and success metrics for endings, the same way we set entry criteria for new bets.
Another theme I appreciated is designing for the right customers. Teresa highlights intentionally limiting access and pricing to work with customers who show agency and commitment. I’ve applied the same principle: when we’re clear about who we serve and who we don’t, our product–market signal sharpens, churn narratives simplify, and roadmaps get crisper. Focus is a growth strategy.
If you’re leading a product portfolio, running discovery, or wrestling with a product that “works… kind of,” this conversation is permission to act. Product–market fit isn’t binary, and mediocre success can be the most dangerous place to stay. Sunsetting is a portfolio decision, not a team failure; teams shouldn’t be punished for reaching the end of a product’s natural lifecycle. If experimentation isn’t in your DNA, killing products will always feel traumatic—so make space for it intentionally, not passively.
Key moments and themes worth bookmarking:
00:00 – Why “kill your darlings” matters
04:30 – The dangerous middle ground
09:30 – The opportunity cost of “okay” products
14:30 – Sunsetting in product organizations
19:00 – Real examples of killing revenue streams
28:00 – Designing for the right customers
33:30 – Burn the ships (on purpose)
38:00 – Making sunsetting easier with regular portfolio reviews, a visible “sunsetting” column, the Horizon (H1 / H2 / H3) model, and making portfolio decisions one level above teams
46:00 – Normalizing product lifecycles
Resources & Links:
Follow Teresa Torres: https://ProductTalk.org
Follow Petra Wille: https://Petra-Wille.com
Mentioned in this episode:
Ways to Work with Petra Wille
Product at Heart
CDH Membership by Teresa Torres
Product Talk by Teresa
Product Talk Academy by Teresa
Enduring Ideas: The three horizons of growth
Most MVPs take too long, cost too much, and still miss the mark. Over the past year, I’ve shifted my team to a prototyping prompts approach that lets us validate problem-solution fit in days, not months. The result is faster learning loops, clearer tradeoffs, and a dramatically higher hit rate on features that actually move the needle.
When I say prototyping prompts, I mean structured, layered instructions that guide generative AI systems to produce the right artifacts at the right fidelity. Instead of jumping straight to code, we generate concise problem briefs, user stories, interaction flows, low-fidelity UI descriptions, and test plans. Each pass is constrained by acceptance criteria and business outcomes, which keeps the work grounded in value rather than output.
Here’s the playbook my product trios use to go from idea to a testable MVP in 48–72 hours. First, we anchor on OKRs framed around outcomes rather than outputs, and clarify the customer job-to-be-done using evidence from customer interviews and support data. This is classic continuous discovery, but we compress it by focusing on the single riskiest assumption to de-risk this week.
Second, we build a prompt scaffold. We specify the role, constraints, target users, success metrics, and the exact output format we expect. We also define evaluation upfront, borrowing from eval-driven development. For example, before any generation, we list the acceptance tests that a good solution must pass, including edge cases and compliance considerations. This discipline keeps hallucinations in check and improves repeatability.
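A scaffold like this can be kept as a small data structure so every prompt carries the same parts in the same order. The Python sketch below is one possible shape, not our production tooling. The `PromptScaffold` class, its field names, and every example value are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


def _bullets(items: List[str]) -> str:
    """Render a list of strings as plain-text bullet lines."""
    return "\n".join(f"- {item}" for item in items)


@dataclass
class PromptScaffold:
    """Hypothetical container for the parts of a prompt scaffold."""
    role: str
    target_users: str
    constraints: List[str]
    success_metrics: List[str]
    output_format: str
    acceptance_tests: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Assemble one prompt that states the evaluation criteria before any generation."""
        sections = [
            f"Role: {self.role}",
            f"Target users: {self.target_users}",
            f"Constraints:\n{_bullets(self.constraints)}",
            f"Success metrics:\n{_bullets(self.success_metrics)}",
            f"Acceptance tests the output must pass:\n{_bullets(self.acceptance_tests)}",
            f"Output format: {self.output_format}",
        ]
        return "\n\n".join(sections)


# All values below are illustrative placeholders.
scaffold = PromptScaffold(
    role="Product manager drafting a lean product brief",
    target_users="Small-business owners booking appointments",
    constraints=["Do not include personally identifiable information"],
    success_metrics=["Task completion rate"],
    output_format="One-page brief in plain text",
    acceptance_tests=["Covers the rescheduling edge case"],
)
prompt = scaffold.render()
print(prompt)
```

Because the acceptance tests live in the same object as the prompt, the same list can later be reused to grade the generated output.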
Third, we spin up multiple prototypes in parallel. One prompt generates a lean product brief; another outlines user flows; a third proposes UI states and error handling. If we’re exploring voice, we add prompt engineering for voice to script dialogs and repair strategies. For data-heavy features, we call out retrieval-first pipeline patterns so the model references source-of-truth data rather than guessing.
Fourth, we validate with real users using the lightest-weight experiment possible. Fake-door tests, concierge workflows, and guided click-throughs let us measure intent before we invest. Where we can, we run quick A/B tests and size the sample from the minimum detectable effect (MDE) so we don’t over- or under-sample. The point isn’t perfection; it’s fast, directional signal to inform the next iteration.
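For the A/B sizing step, a rough per-arm sample size can be worked out from the MDE using the standard normal approximation for comparing two proportions. The sketch below assumes a two-sided test, an absolute (not relative) lift, and default values of 5% significance and 80% power. The baseline rate and lifts in the example are made up.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde, alpha=0.05, power=0.80):
    """Approximate users needed in each arm of a two-proportion A/B test.

    baseline_rate: conversion rate of the control, e.g. 0.10
    mde: minimum detectable effect as an absolute lift, e.g. 0.02
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / mde ** 2)


# Illustrative inputs: 10% baseline, detecting a 2-point vs a 5-point lift.
small_lift = sample_size_per_arm(baseline_rate=0.10, mde=0.02)
large_lift = sample_size_per_arm(baseline_rate=0.10, mde=0.05)
print(small_lift, large_lift)
```

The required sample grows roughly with the inverse square of the MDE, so deciding up front how small a lift is worth detecting is what keeps a quick test quick.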
Fifth, we instrument and ship behind feature flags. We track activation, task completion, and time-to-value from day one. On the delivery side, we watch DORA metrics such as deployment frequency to ensure we’re learning continuously rather than batching big bets. This bridges discovery and delivery so roadmaps reflect real-world feedback, not assumptions.
One recent example: we needed to evaluate a voice AI agent for appointment scheduling. In 72 hours, prompts produced the problem brief, dialog flows, error recovery strategies, and a sandbox to simulate inbound requests across three user personas. We exposed a thin slice to a pilot cohort, captured call outcomes, and iterated the repair prompts twice before writing any production code. The pilot converted at a higher rate than our control flow and gave us the confidence to invest in full integration.
This approach only works if we treat governance as a first-class concern. We bake in privacy-by-design, clear data governance boundaries, and AI risk management from the start. Prompts include guardrails on personally identifiable information, explicit constraints on data use, and links to approved sources. We also maintain a prompt repository with versioning and automated evaluations so changes are observable and reversible.
Practically, strong prompt scaffolds share three traits. They’re specific about context and constraints, they define success in measurable terms, and they separate concerns by artifact type. I’ll often ask for three variants with different tradeoffs, then run a quick synthesis prompt that highlights points of parity and differentiation. This gives the team structured options rather than a single, brittle path.
If you’re starting from zero, begin with one high-leverage workflow. Write a crisp outcome statement, draft your acceptance tests, and create a prompt that outputs a one-page brief, three user flows, and the top five risks with mitigations. Validate with five users in 48 hours, then decide: double down, pivot, or park. Rinse and repeat, and your product roadmapping and sprint planning will shift from speculation to evidence.
The bottom line is simple. Prototyping prompts won’t replace product judgment, but they will accelerate it. By turning ideas into testable artifacts in hours, you minimize waste, maximize learning, and ship better MVPs—fast.
"What if an AI could spot the moment two product teams start pulling in opposite directions — before it derails a quarter?" That question hooked me, because I’ve lived through the costly fallout of subtle misalignments that only surface at the end of a sprint—or worse, during quarterly business reviews.
I recently dug into an episode of Just Now Possible featuring Matthias and Charlotte Kleverud, co-founders of Momental. Their vision for "GitHub for product management" hits a nerve in the best possible way: find "merge conflicts" in strategy, not code, and do it early enough to save execution time, trust, and outcomes.
Here’s the core: Momental ingests documents, meeting transcripts, and voice recordings across an organization, then uses AI agents to map them into a structured context layer—a set of interconnected trees covering goals, decisions, learnings, and who's doing what. When it finds a conflict—say, one team betting on retention while another is prioritizing conversion—it surfaces the misalignment for humans to resolve, just like a merge conflict in code. That framing is both familiar (for anyone who’s shipped software) and powerful (for anyone who’s scaled product strategy across multiple teams).
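To make the merge-conflict analogy tangible, here is a toy Python sketch of the idea. It is emphatically not how Momental works, since the episode describes AI agents and a structured context layer, not a lookup table. The `TeamBet` type, the `find_conflicts` function, and the example teams are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class TeamBet(NamedTuple):
    team: str
    goal: str      # shared company goal the team says it serves
    priority: str  # the metric the team is prioritizing


def find_conflicts(bets):
    """Group bets by shared goal and flag goals where teams prioritize different metrics."""
    by_goal = defaultdict(list)
    for bet in bets:
        by_goal[bet.goal].append(bet)
    conflicts = []
    for goal, group in by_goal.items():
        if len({b.priority for b in group}) > 1:
            conflicts.append((goal, sorted((b.team, b.priority) for b in group)))
    return conflicts


# Hypothetical teams and goals, echoing the retention vs conversion example.
bets = [
    TeamBet("lifecycle", "grow subscriptions", "retention"),
    TeamBet("checkout", "grow subscriptions", "conversion"),
    TeamBet("platform", "reduce incidents", "uptime"),
]
conflicts = find_conflicts(bets)
print(conflicts)
```

The sketch only surfaces the disagreement; deciding whether it is a real conflict or a healthy division of labor is left to people, which matches the human-in-the-loop framing.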
Their journey tracks with what many of us have learned the hard way. "Starting in 2022 with DaVinci 002 and learning that the market wasn't ready for AI-assisted product thinking" pushed them toward experiments with agent teams. "The origin story: building a team of AI agents in 2024, only to discover agents hit the same alignment problems as humans" is exactly the kind of meta-lesson I’d expect when you scale autonomy without shared context. The breakthrough was an "OODA-loop-driven document processing agent" that continuously curates a living knowledge graph rather than relying on static prompts or brittle pipelines.
One model that stood out was "The product chain: signals → learnings → decisions → principles, and how AI maps it." That is the backbone of healthy product thinking. When this chain is explicit and inspectable, you can trace why a team chose Path A over Path B—and detect when new signals should invalidate old decisions. I’ve seen this accelerate continuous discovery and improve executive decision hygiene.
I also appreciated the organizational modeling: "Three trees that model an organization: the product tree (OKRs to epics), the wisdom tree (decisions and their reasoning), and the people/time tree." This maps cleanly to how we run quarterly planning at scale—tying outcomes to work, preserving rationale, and grounding ownership and timelines. With that structure, "How conflicts are detected, auto-resolved, or escalated to humans with merge options" becomes a pragmatic workflow, not a theoretical AI demo.
On the technical front, they’re blunt about limits: "Why traditional chunking and RAG breaks down at scale and what Momental does instead." Anyone who’s tried to stitch strategy from ad hoc notes knows that naive retrieval won’t cut it. You need durable context boundaries, rich metadata, and graph-aware reasoning. Which brings me to one of my non-negotiables: "Why metadata—who said it, when, and in what context—is critical to preventing hallucinations." In my world, we treat provenance like test coverage—you can’t ship without it.
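Treating provenance like test coverage can be as simple as refusing to surface any statement that lacks it. The Python sketch below is my own minimal illustration of that rule, not Momental's data model. The `Statement` class, its fields, and the sample records are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Statement:
    """A captured claim plus the metadata needed to trust it."""
    text: str
    speaker: Optional[str] = None   # who said it
    said_on: Optional[date] = None  # when
    context: Optional[str] = None   # in what setting, e.g. a meeting name

    def has_provenance(self) -> bool:
        """True only when who, when, and context are all recorded."""
        return all(v is not None for v in (self.speaker, self.said_on, self.context))


def citable(statements: List[Statement]) -> List[Statement]:
    """Keep only statements that can be traced back to a source."""
    return [s for s in statements if s.has_provenance()]


# Hypothetical records for illustration.
statements = [
    Statement("Retention is the Q3 priority", "VP Product", date(2024, 7, 1), "quarterly planning"),
    Statement("Conversion matters most"),  # no provenance, so it is filtered out
]
trusted = citable(statements)
print([s.text for s in trusted])
```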
Process-wise, the product philosophy resonated: "How a document processing agent uses OODA-loop thinking to extract and connect context across documents" reinforces the need for short feedback cycles, explicit hypotheses, and continuous refactoring of knowledge. Pair that with "The self-improving agent: collecting user feedback weekly and rewriting its own prompts" and you’ve got a blueprint for eval-driven development that keeps the system honest over time.
Their UI choices also mirror a pattern I’ve adopted: "Moving from chat-first to UI-first to proactive agents as an AI product design pattern." Chat can feel magical, but alignment work benefits from concrete artifacts—trees, timelines, driver trees, and opportunity solution trees—so people can reason together. Then, let proactive agents watch for drift and nudge teams before the cost of change spikes.
Two broader themes are worth calling out. First, "Specialized tools win" when the problem is deep, cross-functional context like product strategy. General-purpose chatbots struggle here; domain-specific models with strong information architecture have the edge. Second, product culture matters: "Discovery Versus Vibe Coding" is not just a catchy contrast—it’s a reminder that disciplined discovery beats intuition theater when stakes are high.
As for the roadmap, I’m encouraged by their "Design partner strategy and what's next for Momental's public launch." Early design partners are where you validate signal quality, precision of conflict detection, and the ergonomics of human-in-the-loop resolution. I’m especially curious how this intersects with LLMs for product managers, outcomes vs output OKRs, and product roadmapping and sprint planning in large portfolios.
Finally, a nod to the broader ecosystem. The conversation touched on "Claude Code" and a shift "Beyond documents and vectors" that many of us are living through—toward retrieval-first pipelines that respect context windows, stronger governance, and measurable improvements in decision quality. If you care about AI Strategy for empowered product teams, this is a space to watch—and to pilot.
Bottom line: If you’ve ever wished you could prevent strategy drift before it shows up in your dashboards, this "GitHub for product management" approach is worth your attention. Make the chain of signals, learnings, decisions, and principles explicit. Keep humans in the loop for the hard calls. And let proactive, agentic AI do what it does best: flag misalignments early, so your teams can move fast together.