Author: Shivam Tiwari

  • Mastering 30,000-Foot Vision and Ground-Level Execution: Systems That Decide Without You

    Executive function, for me, is the art and discipline of building systems that make high-quality decisions without my constant involvement. The real unlock isn’t personal heroics; it’s institutionalizing judgment. When I do my job well, teams move faster, ambiguity shrinks, and the organization compounds learning even when I’m not in the room.

    Operating simultaneously at 30,000 feet and ground level is the defining muscle of executive leadership. I deliberately switch altitudes. At 30,000 feet, I obsess over strategy, architecture, and resourcing. On the ground, I validate core assumptions with firsthand data, listen for weak signals, and spot process cracks before they widen. Altitude changes are not random; they’re triggered by variance from plan, critical customer moments, or leading indicators that deviate from expected ranges.

    The leap from frontline manager to manager of managers is where many rising leaders stall. As a manager of managers, my primary value shifts from personal execution to system design. I move from answering questions to installing mechanisms that ensure questions get answered well by others. This includes clear decision rights, shared metrics, and repeatable, lightweight rituals that scale across teams.

    What is an executive actually accountable for? Outcomes over output, talent density, and the clarity of the operating system. That means defining strategy, aligning resources, creating a cadence of review that exposes truth, and ensuring incentives reward the behaviors we want. My barometer: if I step away, do priorities hold, do metrics behave as expected, and do tradeoffs land where I would have landed?

    Knowing when to dive deep versus when to step back is a craft. I dive deep when risks are existential, when metrics have no credible owner, or when narrative and numbers diverge. I step back when leaders demonstrate consistent judgment, metrics sit inside control limits, and learnings are documented. The principle I return to again and again: context is everything. Senior leaders operate on context, not control.

    To scale judgment, I teach people how I think. I externalize my mental models: how I construct decision trees, how I stress-test assumptions, and how I weigh time horizons. I rely heavily on driver trees for metrics because they force causal clarity. If we can’t map how a top-line goal decomposes into controllable levers, we’re managing by hope, not design.

    Creating a shared language across the business is a force multiplier. I standardize definitions for our core metrics, codify what “good” looks like, and make it easy to repeat the system. We align around outcomes versus output, and we use cadences like MBRs and QBRs to unify narrative and numbers. Shared language makes decisions legible across functions and reduces rework.

    My COO playbook emphasizes owning the full customer experience end to end. When marketing rolls up under a COO in certain stages, the upside is coherence: one narrative from awareness to activation to expansion, one set of metrics, one growth engine. The point isn’t org charts; it’s removing seams customers can feel.

    Demanding and supportive is not a contradiction. I set ambitious, unambiguous bars and back them with coaching, resourcing, and fast feedback. The combination builds trust: expectations are clear, and help is immediate. I expect leaders to bring problems paired with proposed solutions and to escalate early, not perfectly.

    Inside my executive interview process, I’m assessing altitude agility, operating cadence, and taste in metrics. I use structured interviews and live case workshops to see how candidates frame ambiguous problems, build driver trees, and prioritize tradeoffs. The best prompts are simple and revealing: design the operating system for a 3x scale scenario; diagnose a broken funnel with incomplete data; align two teams with conflicting incentives. The most revealing workshop prompts surface thinking speed, humility, and the instinct to make context legible.

    The common thread in failed executive hires is a mismatch between the company’s operating system and the leader’s default mode. Some leaders can’t stop doing the work themselves. Others stay too abstract and never build mechanisms. I look for demonstrated ability to change systems, not just run them—leaders who can both author and evolve the playbook.

    On metrics, I practice the driver tree philosophy. I begin with the North Star, decompose it into controllable levers, instrument each node, and assign single-threaded owners. We design review cadences where deviations trigger targeted diagnostics, not thrash. Each tree has documented assumptions, data sources, and thresholds that prompt action. This is how teams learn to anticipate, not react.
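
    As a rough illustration, a driver tree with per-node owners and thresholds can be sketched as a small recursive data structure. The metric names, owners, and expected ranges below are hypothetical, chosen only to show how a deviation check might walk the tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DriverNode:
    """One node in a driver tree: a metric, its owner, and its expected range."""
    name: str
    owner: str
    low: float                      # lower bound of the expected range
    high: float                     # upper bound of the expected range
    value: Optional[float] = None   # latest observed value, if instrumented
    children: List["DriverNode"] = field(default_factory=list)


def find_deviations(node: DriverNode) -> List[Tuple[str, str, float]]:
    """Walk the tree depth-first and return (metric, owner, value) for every
    instrumented node whose value falls outside its expected range."""
    flagged = []
    if node.value is not None and not (node.low <= node.value <= node.high):
        flagged.append((node.name, node.owner, node.value))
    for child in node.children:
        flagged.extend(find_deviations(child))
    return flagged


# Hypothetical tree: a North Star metric decomposed into two levers.
tree = DriverNode(
    name="weekly_active_teams", owner="coo", low=900, high=1200, value=950,
    children=[
        DriverNode("signup_to_activation_rate", "growth_lead", 0.30, 0.45, 0.22),
        DriverNode("week4_retention_rate", "product_lead", 0.60, 0.80, 0.71),
    ],
)

deviations = find_deviations(tree)
```

    In this example only the activation lever is out of range, so the review would start with that node and its owner rather than with the whole tree.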

    High-functioning executive teams are visibly collaborative. We clarify decision rights, disagree and commit quickly, and run post-decision reviews to harvest learnings without blame. My favorite litmus test is simple: can 30 people operate as one team when it matters? When we get this right, information flows, execution accelerates, and customers feel consistency.

    One of the most counterintuitive leadership lessons is working yourself out of a job. If the system cannot run without you, you have a key-person risk, not a leadership strength. I aim to build successors, codify judgment, and design mechanisms that make good decisions the default state. That’s how you create durable, compounding advantage.

    And the review feedback you can’t unhear? Mine was brutally honest: my bar was high, but my mechanisms were implicit. Once I wrote them down—how I decide, what I expect, where I dive deep—the organization moved faster, and I actually became less central. If there’s a throughline to extraordinary leadership, it’s this: make your judgment teachable and your systems inevitable.


  • From Idea to Impact: My PM-Friendly Blueprint to Building Your First AI Agent Fast

    AI agents are quickly moving from novelty to necessity, and the fastest way to capture value is to approach them like any other high-stakes product initiative. In this guide, I share how I plan, build, and launch production-grade agents with a product mindset—balancing ambition with risk, speed with governance, and innovation with measurable outcomes.

    I start by getting crisp on the outcome. Who is the primary user, what job are they hiring the agent to do, and how will we know it’s working? I translate this into outcomes vs output OKRs, such as resolution rate, time-to-value, cost-to-serve, or qualified pipeline influenced—anchoring the roadmap before a single line of code or prompt is written.

    Next, I map the agent’s scope and boundaries. I write a simple capability canvas: the tasks the agent must perform, the tools it can use, the data it can access, and the constraints it must respect. Most successful builds follow a retrieval-first pipeline: connect trusted knowledge sources, enrich with metadata, and manage a lean context window to keep responses relevant and cost-efficient. From the start, I bake in privacy-by-design, data governance, and AI risk management so compliance isn’t an afterthought.
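
    To make the "lean context window" idea concrete, here is a minimal sketch of packing retrieved snippets into a fixed budget. It assumes retrieval has already produced (text, relevance score) pairs, and it approximates token counts by word counts; both the function name and the sample snippets are hypothetical.

```python
from typing import List, Tuple


def pack_context(snippets: List[Tuple[str, float]], budget_tokens: int) -> List[str]:
    """Greedily keep the highest-scoring snippets that fit the token budget.

    Token counts are approximated by whitespace word counts, which is an
    assumption; a real build would use the model's own tokenizer.
    """
    chosen = []
    used = 0
    for text, _score in sorted(snippets, key=lambda s: s[1], reverse=True):
        cost = len(text.split())
        if used + cost <= budget_tokens:
            chosen.append(text)
            used += cost
    return chosen


# Hypothetical retrieval results: (snippet, relevance score).
retrieved = [
    ("Refunds are issued within 5 business days.", 0.91),
    ("Our office dog is named Biscuit.", 0.12),
    ("Refund requests need an order ID and the original payment method.", 0.84),
]

context = pack_context(retrieved, budget_tokens=20)
```

    With a 20-word budget, the two refund snippets fit and the low-relevance one is dropped, which is the behavior that keeps responses relevant and cost per call predictable.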

    Model selection comes after the workflow is clear. I choose an LLM for the job (latency, cost, multilingual needs, and tool-use fidelity) and pair it with the right connectors and actions—think CRM integration, ticketing, search, or internal APIs. For voice experiences, I define a voice AI agent persona, turn-taking rules, and barge-in behavior. This is where agentic AI patterns shine: structured planning, tool invocation, and verification loops create a resilient, goal-directed system.

    Prompt design is product design. I write system prompts that define role, tone, constraints, data sources, and success criteria. I add few-shot examples that mirror my top use cases and edge cases, then apply prompt engineering best practices to control style, limit speculation, and encourage citations. For voice, I tune prompts for brevity, warmth, and disfluency handling without sacrificing accuracy.

    Before launch, I build an eval-driven development workflow. I curate golden datasets from real user intents, add adversarial cases, and automate evals for accuracy, safety, grounding, and tool-use success. I set a minimum detectable effect (MDE) so A/B testing can validate improvements with confidence, and I define go/no-go thresholds to prevent regression. This becomes my continuous discovery loop for the agent.
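
    One way to picture a go/no-go threshold is as a small gate that compares a candidate's eval scores with both an absolute floor and the current baseline. The sketch below is illustrative only: the metric names, floors, scores, and the 0.01 regression tolerance are all assumptions.

```python
from typing import Dict, List, Tuple


def release_gate(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    floors: Dict[str, float],
    max_regression: float = 0.01,
) -> Tuple[bool, List[str]]:
    """Return (go, reasons). A candidate passes only if every metric clears
    its absolute floor and does not regress from baseline by more than
    `max_regression`."""
    reasons = []
    for metric, floor in floors.items():
        score = candidate[metric]
        if score < floor:
            reasons.append(f"{metric}: {score:.2f} is below the floor of {floor:.2f}")
        elif score < baseline[metric] - max_regression:
            reasons.append(f"{metric}: regressed from {baseline[metric]:.2f} to {score:.2f}")
    return (not reasons, reasons)


# Hypothetical scores from a golden-dataset eval run (fraction of cases passed).
floors = {"accuracy": 0.85, "grounding": 0.90, "tool_use_success": 0.80}
baseline = {"accuracy": 0.88, "grounding": 0.93, "tool_use_success": 0.84}
candidate = {"accuracy": 0.91, "grounding": 0.89, "tool_use_success": 0.84}

go, reasons = release_gate(baseline, candidate, floors)
```

    Here the candidate improves accuracy but misses the grounding floor, so the gate returns a no-go with a single reason attached. Returning reasons alongside the verdict keeps the decision legible to whoever reviews the run.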

    Instrumentation is non-negotiable. I wire up Agent Analytics to track task success, containment/deflection rate, handoff quality, cost per task, and user satisfaction. I supplement with a unified analytics platform and session replays to observe failure patterns. These signals feed prioritization and help me decide when to expand scope versus harden reliability.

    For delivery, I rely on CI/CD with feature flags to gate risky capabilities, plus canary releases for new tools and prompts. I monitor DORA metrics to maintain deployment frequency without trading off quality. When incidents happen, I treat them like production issues: incident management playbooks, rollbacks, and clear postmortems.
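
    A percentage-based canary behind a feature flag can be sketched with a stable hash. This is a simplified stand-in for what a feature-flag service does, not any particular vendor's API; the flag name and user IDs are hypothetical.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: int) -> bool:
    """Deterministically place a user in or out of a percentage rollout.

    Hashing flag and user together gives each user a stable bucket from 0 to 99
    per flag, so raising `percent` only ever adds users to the canary group.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 100
    return bucket < percent


# Hypothetical flag name: gate a new tool behind a 5% canary.
users = [f"user-{i}" for i in range(1000)]
canary = [u for u in users if in_rollout("agent.refund_tool", u, 5)]
```

    Because the bucket depends only on the flag and the user, the same people stay in the canary between requests, and setting `percent` to 0 acts as an immediate rollback.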

    Trust is earned through safety and transparency. I enforce least-privilege access, structured logging, and red-teaming for jailbreaks, prompt injection, and data exfiltration. Threat detection and response plus clear user disclosures keep the experience responsible and compliant with regulatory requirements.

    GTM is product-led. I use in-app guides, product tours, and onboarding checklists to drive user activation and early wins. I define success moments, turn them into habit loops, and run retention analysis to find where users stall. This tight loop of messaging, measurement, and iteration accelerates product-market fit.

    Common high-ROI use cases I prioritize include customer support (automated resolution and augmented agent assist), sales and success workflows (lead qualification, QBR prep), and internal knowledge copilots (policy, process, engineering runbooks). Each starts narrow, ships fast, and scales with proven evidence from analytics and experiments.

    If you’re skimming, here’s the blueprint: clarify outcomes, design AI workflows with a retrieval-first pipeline, select the right LLM and tools, engineer robust prompts, institutionalize evals and A/B testing, instrument Agent Analytics, ship with CI/CD and feature flags, and iterate with discipline. In the walkthrough video above, I go deeper on templates, prompts, and experiments you can use to build your first agent with confidence.


    Inspired by this post on Product School.


  • Becoming AI Native: A Practical Playbook to Transform Strategy, Teams, Data, and Tech

    AI Native is more than a feature set—it’s an operating system for the entire business. In my role leading product, I’ve seen that companies win when they treat AI as a first-class citizen across strategy, architecture, workflows, and go-to-market. In this narrative, I unpack what “AI Native: What It Means and How to Get There” looks like in practice, sharing the frameworks I use to align vision, technology, and teams around measurable customer outcomes.

    When I say AI Native, I mean a company where core value creation, customer experience, and internal operations are powered by AI end-to-end. It’s not just bolting on a chatbot. It’s rethinking product strategy, data foundations, and execution so we can deliver differentiated experiences faster, at lower cost, and with higher reliability. This shift demands clarity on where AI truly creates leverage—and the courage to say no where it doesn’t.

    The starting point is strategy. I ground teams in outcomes vs output OKRs and a crisp value proposition: Which customer jobs-to-be-done benefit most from generative AI? Where can we unlock 10x improvements in speed, accuracy, or personalization? We prioritize a small number of high-signal use cases, size impact, and design Minimum Viable Experiments (MVEs) to de-risk assumptions before scaling. This is where build vs buy decisions matter—use foundation models and platforms for commodity needs, and invest your scarce engineering time where differentiation lives.

    Next comes architecture and data. AI Native products thrive on a retrieval-first pipeline, strong context window management, and model-agnostic abstraction so we can swap providers as needs evolve. I emphasize privacy-by-design, robust data governance, and observability across prompts, embeddings, latency, and cost. These guardrails let us move quickly without compromising trust, especially in regulated or enterprise settings.
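
    A model-agnostic abstraction can be as small as one interface that product code depends on. The sketch below uses Python's `typing.Protocol`; the two provider classes are hypothetical stand-ins rather than real vendor clients.

```python
from typing import List, Protocol


class TextModel(Protocol):
    """The only surface the product code depends on."""
    def complete(self, prompt: str) -> str: ...


class CannedModel:
    """Stand-in provider that returns a fixed reply (useful in tests)."""
    def __init__(self, reply: str) -> None:
        self.reply = reply

    def complete(self, prompt: str) -> str:
        return self.reply


class EchoModel:
    """Second stand-in provider that echoes the last line of the prompt."""
    def complete(self, prompt: str) -> str:
        return prompt.splitlines()[-1]


def answer(model: TextModel, question: str, context: List[str]) -> str:
    """Build a grounded prompt and delegate to whichever provider is passed in."""
    prompt = "\n".join(["Use only the context below.", *context, f"Question: {question}"])
    return model.complete(prompt)


# Swapping providers is a one-argument change at the call site.
first = answer(CannedModel("Within 5 business days."), "When are refunds issued?", ["Refund policy text"])
second = answer(EchoModel(), "When are refunds issued?", ["Refund policy text"])
```

    Keeping prompt assembly in `answer` and vendor details behind `complete` means evals and observability can wrap one call path regardless of which provider sits behind it.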

    Execution shifts as well. I organize empowered product teams and product trios around the highest-value workflows, not components. Continuous discovery pairs with CI/CD, feature flags, and telemetry so we can test safely in production. Eval-driven development is non-negotiable: we design offline and online evaluations that mirror real user success criteria—accuracy, helpfulness, safety, and business outcomes—then wire those evals into the build pipeline to prevent regressions.

    On the intelligence layer, we increasingly rely on AI workflows and agentic AI to orchestrate multi-step tasks (retrieval, reasoning, tool use, and verification), with human-in-the-loop review where appropriate. Clear system prompts, tool definitions, and fallbacks keep behavior predictable. This is where product craft meets prompt engineering: the best teams codify patterns, share prompts in a living library, and standardize on a lightweight AI product toolbox.

    Risk and reliability are part of the product, not an afterthought. I run AI risk management as a continuous program spanning red teaming, content filters, PII handling, audit trails, and incident response. We tie policies to concrete controls and create simple dashboards leaders can trust. The goal is to ship boldly with safety, maintainability, and scale in mind.

    Becoming AI Native also changes how we grow. We lean into product-led growth with clear in-app guides, product tours, and activation paths that teach users where AI shines. CRM integration ensures sales and success teams have context to coach customers. Pricing experiments—often usage- or value-based—align revenue with the impact customers feel, while retention analysis helps us double down on the use cases that drive compounding value.

    To make this real, I use a 90-day plan. Days 0–30: align on strategy, top use cases, and risk posture; stand up data pipelines and a basic retrieval-first stack; define evaluation metrics. Days 31–60: ship MVEs behind feature flags, run head-to-head evals, and instrument observability; start a cross-functional community of practice. Days 61–90: scale the winning use cases, formalize governance, and publish a roadmap tied to outcomes—not just features—with clear SLAs and success metrics.

    The destination is a durable advantage: faster iteration cycles, smarter experiences, and a product strategy that compounds with every interaction. If you’re ready to make the leap, start small, measure obsessively, and build the muscle to ship, learn, and adapt. That’s the heart of becoming AI Native—and it’s well within reach.


    Inspired by this post on Product School.


  • From Coaching to Co‑Pilots: How AI Elevates Product Owners and Feature Teams

    After two decades of coaching product teams, I’m making a deliberate shift in how I guide leaders and practitioners. The destination hasn’t changed—great products, empowered product teams, and durable outcomes—but the route has. AI is now a practical, compounding advantage, and it demands we evolve our product coaching model.

    In my day-to-day as a VP of Product Management at HighLevel, I’ve watched AI move from novelty to necessity. Large language models, agentic AI, and streamlined AI workflows now accelerate how we discover opportunities, test hypotheses, and communicate decisions. This is not about replacing product judgment; it’s about augmenting it with a disciplined AI strategy.

    For years, I’ve raised the alarm about the gap between execution and strategy among “product owners and feature team product managers.” The intent was never to pile on more process. It was to strengthen product discovery, sharpen product strategy, and clarify outcomes vs output OKRs so that teams ship what matters. AI finally gives us the leverage to make that shift unavoidable—and repeatable.

    Here’s the new coaching stance: treat AI as a co-pilot, not an answer engine. I coach teams to build an AI product toolbox they can trust—prompt engineering patterns, eval-driven development to measure model quality, and a retrieval-first pipeline for institutional knowledge. When combined with continuous discovery, this creates a tight loop between insight, iteration, and impact.

    Practically, this means elevating core rituals. In product trios, we start discovery with AI-assisted opportunity mapping, then pressure-test problem framing with user evidence. We generate multiple solution sketches with LLMs, annotate assumptions, and use A/B testing with a minimum detectable effect (MDE) to validate the riskiest bets. The result is faster learning without skipping the hard thinking.
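
    Setting an MDE is mostly a sample-size question. The sketch below uses the standard normal-approximation formula for comparing two proportions; the 10% baseline, 2-point absolute lift, 5% significance level, and 80% power are illustrative assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate: float, mde: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm for a two-sided, two-proportion z-test
    to detect an absolute lift of `mde` over `baseline_rate`."""
    p1, p2 = baseline_rate, baseline_rate + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for significance
    z_power = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / mde ** 2)


# Hypothetical numbers: 10% baseline activation, 2-point absolute MDE.
n = sample_size_per_arm(0.10, 0.02)
```

    With these inputs the formula asks for a little under 4,000 users per arm. Halving the MDE roughly quadruples that number, which is why the MDE is worth agreeing on before the test starts.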

    On the governance side, I set clear guardrails: privacy-by-design, data governance, AI risk management, and explicit criteria for acceptable model behavior. We treat prompts and evaluation datasets as versioned assets, and we pair product managers with forward deployed engineers to operationalize insights in production safely.

    Coaching also extends to measurement. We anchor product outcomes in the customer journey and watch leading indicators for activation, adoption, and retention. On the delivery side, we look at deployment frequency and the health of the feedback loop between support signals and roadmap choices—because empowered product teams win when they learn faster than the market shifts.

    The most profound cultural change is mindset. Instead of asking AI for answers, we ask it for alternatives, counterexamples, and structured ways to explain tradeoffs to stakeholders. That makes product positioning clearer, decision narratives stronger, and the path from insight to execution shorter.

    If you’re responsible for developing talent, reframe coaching as enablement plus guardrails. Build the AI muscle into everyday discovery and delivery, not as a side project. When we do this well, we transform good practitioners into strategic operators—people who pair judgment with leverage and consistently ship value.

    The bottom line: AI doesn’t replace the craft; it amplifies it. Our job as leaders is to harness that amplification responsibly and turn it into a durable competitive advantage.


    Inspired by this post on SVPG.


  • How We Built Rock-Solid AI Infrastructure: Lessons From Scaling AI Visibility and Reliability

    Scaling AI Visibility pushed me to rethink what “reliable” really means for AI infrastructure. As my team expanded usage across more datasets, models, and workflows, we uncovered unexpected sources of report failure and built the guardrails, observability, and processes that now anchor our stability strategy.

    In practice, the surprising failure modes were rarely the loud ones. We saw report failure triggered by small schema drift from non-deterministic LLM outputs, silent permission changes in upstream data sources, token-limit truncation that broke downstream parsing, third-party API rate limits that surfaced only under bursty load, and clock skew that confused idempotent writes. Individually these issues looked minor; together they created reliability debt.

    Our first move was deep observability. We instrumented the end-to-end pipeline with structured logs, distributed tracing, and high-signal metrics mapped to SLOs and error budgets. That visibility let us separate symptom from cause, quantify impact by segment, and prioritize fixes that moved business outcomes, not just vanity thresholds. It also gave product managers and SREs a shared, real-time view to make tradeoffs explicit.
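
    For readers new to error budgets, the arithmetic is simple enough to sketch. The function name and the numbers below are hypothetical; the idea is only that an SLO target implies an allowance of failures, and the budget is whatever part of that allowance remains unspent.

```python
def error_budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Fraction of the error budget still unspent in the window.

    1.0 means untouched, 0.0 means fully spent, negative means the SLO is breached.
    """
    allowed_bad = (1.0 - slo_target) * total_events
    if allowed_bad <= 0:
        raise ValueError("SLO target must be below 1.0 and total_events above 0")
    return 1.0 - bad_events / allowed_bad


# Hypothetical window: 99% of reports must succeed; 10,000 ran and 40 failed.
remaining = error_budget_remaining(0.99, 10_000, 40)
```

    Here 40 failures against an allowance of about 100 leaves roughly 60% of the budget, a single number that product and SRE can both read when weighing a risky release.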

    Next, we hardened the runtime with resilience patterns: circuit breakers on flaky dependencies, timeouts tuned to p95 behavior, retries with jittered backoff, idempotent processing for at-least-once delivery, and backpressure-aware queues. We enforced schema contracts at ingestion with JSON validation and added feature flags to decouple deploys from releases, so we could roll forward or back within minutes when signals degraded.
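
    Of the patterns above, retries with jittered backoff are the easiest to show in a few lines. This is a minimal sketch using capped exponential backoff with full jitter; the default delays and attempt count are assumptions, and the `sleep` parameter exists only so the wait can be swapped out in tests.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_jitter(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` with capped exponential backoff and full jitter.

    Catching every Exception keeps the sketch short; a real build would
    retry only errors known to be transient.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))  # full jitter: wait anywhere in [0, cap]
    raise RuntimeError("unreachable")
```

    Randomizing the wait spreads retries from many workers across time, so a dependency that is recovering from a burst does not get hit by a synchronized wave. Retries like this are only safe when the wrapped call is idempotent.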

    On the product side, we adopted eval-driven development for model and prompt changes, shifting risky modifications behind canaries and staged rollouts. CI/CD gates required evaluation baselines to hold or improve before promotion. We tracked DORA metrics to keep deployment frequency high without sacrificing change failure rate, and we used p95 latency and error budget burn as the forcing functions for prioritization.

    Culture mattered as much as code. We formalized incident management with clear ownership, lightweight runbooks, and blameless reviews that produced crisp, automatable actions. We partnered early with SRE on SLO design, integrated privacy-by-design and PII scanning into the pipeline, and treated AI risk management as an ongoing product constraint rather than a checkbox.

    The net effect: fewer flaky reports, faster recovery when things do break, and far more confidence to ship improvements to AI Visibility at pace. If you’re scaling similar capabilities, start with observability, make resilience patterns non-negotiable, and let SLOs guide your product roadmap. Reliability is not a phase—it’s the product.


    Inspired by this post on Amplitude – Best Practices.


  • Inside Amplitude’s AI Playbook: Lessons from Leo Jiang on Ask Amplitude, Agents, and Visibility

    I continually study how high-velocity teams turn AI ambition into shipped product, and Amplitude’s approach stands out. "Leo Jiang is the Head of Engineering, AI Products at Amplitude, focused on building new AI and marketing products. He has helped build Ask Amplitude, Agents, and AI Visibility." From a product management leadership lens, that portfolio signals a clear AI strategy: enable insight (Ask Amplitude), drive action (Agents), and ensure trust and observability (AI Visibility).

    What I appreciate most is the sequencing: start with user-facing value, build agentic AI capabilities where tasks repeat and outcomes can be evaluated, and layer AI workflows with robust governance. For PMs working with LLMs, the implication is to define success via eval-driven development (quantitative rubrics, offline test sets, and real-time feedback loops) before scaling automation. This also hints at an emerging discipline of Agent Analytics: instrument prompts, tool calls, and outcome quality so we can tune performance like we tune a funnel.

    Ask Amplitude gives a relatable example: natural-language questions lower the activation barrier for product and growth teams inside an Amplitude analytics environment. When agents turn answers into next-best actions, product-led growth becomes measurable—from hypothesis to change to impact—inside a unified decision loop. That tight loop is where product strategy, design, and reliability meet to create compounding value.

    Operationally, I organize a product trio around each capability and pair it with forward deployed engineers to accelerate discovery with customers. I also invest in privacy-by-design and data governance early, ensuring marketing use cases respect compliance while keeping iteration speed high. The goal is a repeatable path from prototype to scale that preserves momentum without compromising safety.

    My takeaway for peers: pick one high-frequency workflow, define clear agent boundaries, ship a narrow slice, and measure relentlessly. Use retrieval-first pipeline patterns for grounding, add human-in-the-loop checkpoints, and close the loop with qualitative insights from in-app guides. When that works, expand capabilities—not just features—and let outcomes vs output OKRs steer prioritization.


    Inspired by this post on Amplitude – Best Practices.


  • 12 MCP prompts that rally your whole company around product data and drive adoption

    I’ve seen first-hand how quickly a company aligns when product data becomes everyone’s common language. To make that happen at scale, I rely on MCP prompts inside Pendo to turn raw behavioral signals into clear, cross-functional actions. When we give people precise questions to ask of the data, engineering, product, marketing, customer success, and sales move in lockstep—and outcomes follow.

    What follows are the 12 MCP prompts I use to help teams across the business make better, faster decisions from product analytics, in-app guides, and customer feedback. They’re battle-tested, easy to adapt to your stack, and intentionally written to drive product-led growth and clearer accountability.

    Prompt 1: Show me the activation funnel by segment (SMB, MM, ENT) for the last 90 days, highlight the biggest drop-off steps, and quantify which change would yield the largest absolute lift in activated users.

    Prompt 2: Rank features by adoption velocity over the past 30 days, identify underutilized high-value features by persona, and recommend the top three in-app guide placements to increase engagement.

    Prompt 3: Plot 30/60/90-day retention curves for new users by plan type and persona, flag statistically significant gaps, and suggest two experiments to improve week-two retention.

    Prompt 4: Cluster qualitative feedback (NPS verbatims, support tickets, and in-app survey responses) by theme and feature, summarize the top friction points in one paragraph per theme, and propose fixes ordered by impact and effort.

    Prompt 5: Analyze common user paths after onboarding, surface where users stall or loop, and recommend targeted product tours or tooltips to reduce time-to-first-value.

    Prompt 6: Evaluate the impact of a specific in-app guide on activation rate using an A/B test, report lift with confidence intervals, and include the minimum detectable effect (MDE) assumptions used in the analysis.

    Prompt 7: Identify accounts at churn risk based on declining feature usage, login frequency, and support sentiment; produce a prioritized list with the top three customer success plays for each account.

    Prompt 8: Generate a weekly list of product-qualified leads (PQLs) based on usage thresholds, map them to opportunities in our CRM, and recommend the best follow-up message for sales based on feature interest.

    Prompt 9: Analyze usage distribution across pricing tiers, highlight features driving upgrades, and suggest one packaging change and one in-app nudge to improve conversion to the next plan.

    Prompt 10: Measure time-to-value by persona for a key action, compare pre/post tutorial launch, and quantify the impact of our in-app guides on reducing time-to-first-value.

    Prompt 11: For our last three releases, summarize adoption, top feedback themes, and any regressions; recommend one quick win and one strategic bet for the next sprint.

    Prompt 12: Produce a weekly executive summary with the top three product insights, the KPIs they influence, and clear owner-action pairs across Product, CS, and Marketing.

    When teams start their day with these MCP prompts, product data stops being a report and becomes a decision engine. That’s how we drive adoption, run better experiments, reduce churn, and keep everyone focused on outcomes instead of opinions. If you adapt even a few of these prompts to your context, you’ll feel the shift—more clarity, tighter cycles, and a company moving as one.


    Inspired by this post on Pendo – Best Practices.


  • Build Your Personal Operating System with Claude Code: A Playbook for Focus, Speed, Clarity

    This is the year to build your personal operating system. For me, that line isn’t a slogan; it’s a commitment to eliminate context switching, compress decision cycles, and turn fragmented information into a reliable source of truth. As a product leader, I needed a system that blends judgment, data, and automation—so I built mine around Claude Code.

    When I say “personal operating system,” I mean an integrated set of AI workflows, rituals, and tools that capture knowledge, structure decisions, and automate execution. It’s where product discovery meets delivery: a place to synthesize signals, prioritize with clarity, and move from insight to action without friction. The outcome is fewer ad hoc decisions, more deliberate strategy, and a calmer, more focused day.

    Claude Code sits at the center because it helps me translate intent into working software and repeatable processes. I use it to scaffold small utilities, write adapters for APIs, and evolve prompts into robust patterns. It accelerates everything from research synthesis and PRD drafting to backlog grooming and stakeholder updates—while keeping me in the loop for final judgment.

    Under the hood, I run a retrieval-first pipeline that connects notes, docs, tickets, research transcripts, and roadmaps into a searchable, living memory. With careful context window management, I feed only the most relevant snippets into Claude Code, preserving accuracy and speed. The result: richer answers, fewer hallucinations, and an assistant that “remembers” what matters without drowning in noise.

    My daily loop is simple: capture, synthesize, decide, and act. I capture customer signals and meeting notes into a personal knowledge management vault; synthesize patterns with prompt engineering that emphasizes evidence; decide using outcomes vs output OKRs; and act by generating drafts, creating tasks, and updating artifacts. Claude Code helps me wire this end-to-end, so the system works even on my busiest days.

    If you’re implementing this from scratch, start small. Pick one high-friction workflow—say, product feedback triage—and build a narrow agentic AI flow to classify, summarize, and route items. Use eval-driven development to test prompts against known edge cases. Add guardrails and privacy-by-design practices from day one, then expand to neighboring workflows once the first loop is reliable.

    Governance matters. I treat AI risk management, data governance, and security as first-class citizens: limited data scopes, clear audit trails, human-in-the-loop approvals, and rollback plans. Feature flags control changes; observability tracks drift and quality; and a simple playbook documents how we deploy, monitor, and improve the system.

    Measure what this personal operating system earns you. Track decision latency, cycle time from signal to action, meeting-to-output ratios, and the signal-to-noise ratio of inputs. When the system is working, you’ll feel it: fewer meetings, more momentum, and sharper product strategy supported by trustworthy AI workflows.
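
    A small sketch of how two of these measures could be computed from a log. The definitions are my assumptions for illustration: decision latency is the time from capturing a signal to deciding on it, and cycle time is the time from capture to action. The event records and field names are hypothetical.

```python
from datetime import datetime
from statistics import median

# Hypothetical log: when each signal was captured, decided on, and acted on.
EVENTS = [
    {"captured": "2026-02-02T09:00", "decided": "2026-02-03T09:00", "acted": "2026-02-05T09:00"},
    {"captured": "2026-02-04T10:00", "decided": "2026-02-04T16:00", "acted": "2026-02-05T10:00"},
    {"captured": "2026-02-09T08:00", "decided": "2026-02-11T08:00", "acted": "2026-02-12T08:00"},
]


def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def summarize(events: list[dict]) -> dict:
    """Median decision latency and signal-to-action cycle time, in hours."""
    latency = [hours_between(e["captured"], e["decided"]) for e in events]
    cycle = [hours_between(e["captured"], e["acted"]) for e in events]
    return {
        "median_decision_latency_h": median(latency),
        "median_cycle_time_h": median(cycle),
    }


summary = summarize(EVENTS)
```

    Medians are used here instead of means so that one stalled decision does not dominate the picture; tracking the trend week over week matters more than any single value.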

    The goal isn’t to automate judgment—it’s to protect it. By letting Claude Code handle the glue work and information wrangling, I preserve energy for high-leverage thinking: positioning, sequencing, and trade-offs. Build your personal operating system now, and make this the year your product practice runs with clarity and composure.


    Inspired by this post on Pendo – Best Practices.


  • Stop Groupthink in Hiring: Proven Product-Led Tactics to Make Faster, Fairer Decisions

    Stop Groupthink in Hiring: Proven Product-Led Tactics to Make Faster, Fairer Decisions

    Is hiring broken, or just badly designed? I’ve been sitting with that question after a recent conversation that crystallized what I see across product organizations: AI-fueled application overload, sprawling interview loops, and fuzzy criteria that invite groupthink at exactly the wrong moments. If you’ve ever watched a promising candidate stall out late in the process, you’re not alone.

    Here’s the reality I’m observing in the market: Layoffs and hiring freezes have flooded the funnel, while AI tools make it trivial to submit hundreds of applications. Companies are overwhelmed, so they respond by adding more interviews and more stakeholders, hoping more touchpoints equal better signal. In practice, that complexity often dilutes accountability and increases noise—especially for product management leadership roles where clarity, not consensus theater, determines success.

    I’ve seen too many offers derailed by “one last step.” A candidate clears every structured interview, then a casual lunch or unframed panel suddenly becomes the deciding factor. The team isn’t briefed on what to evaluate, one lukewarm comment lands, and group dynamics cascade into a no-hire. That’s not rigor—it’s randomness masked as prudence.

    Groupthink ≠ good hiring decisions. When everyone has veto power, risk-averse no-decisions become the default. Focus-group-style interviews create bias, not signal, and “culture fit” often becomes a proxy for stereotyping or personal preference. As product leaders, we’d never ship a feature based on vibes; we shouldn’t make high-stakes hiring calls that way either.

    There’s a better way—and it mirrors how we run great product discovery. Define who you’re hiring before writing the job description. Set clear success metrics for the role. Assign each interviewer specific criteria to evaluate. Treat hiring like product discovery: intentional, structured, and evidence-based. In my teams, that looks like tight scorecards, interviewer calibration, and a decision owner who synthesizes evidence—not a popularity contest where the loudest voice wins.

    Chemistry checks still matter, but only when we define what collaboration actually means for the role. Introversion, debate style, or lunch-table small talk are not performance indicators. I look for behaviors we value in empowered product teams—clarity of thinking, healthy dissent, co-creation under constraints—often via a real working session with the future product trio. Diverse teams outperform homogenous ones, even if not everyone “vibes,” so I optimize for complementary strengths over sameness.

    If you’re a candidate, remember: When a process feels broken, it’s often not about you. Ask how you’re being evaluated to gauge process maturity; a thoughtful team will happily walk you through their rubric and what great looks like. For structure and support, I’ve seen “Who: The A Method for Hiring” help leaders clarify requirements; “Never Search Alone” and joining a Job Search Council (JSC) can give you peer accountability and sharper narratives. For current openings, I regularly point PMs to Scott Baldwin’s PM job postings on LinkedIn.

    My challenge to fellow product leaders: Audit your hiring process the way you’d audit your roadmap. Where are decisions getting stuck? Where are you over-indexing on consensus and under-indexing on evidence? Tighten the criteria, streamline stakeholders, and instrument the funnel so you can learn and improve. The payoff is faster, fairer, more confident decisions—and teams that reflect the rigor we expect in product strategy and stakeholder management.

    What’s one change you can make this week—reworking the scorecard, calibrating interviewers, or replacing an unstructured lunch with a real collaboration exercise? Small improvements compound. Let’s build hiring systems that are worthy of the talent we’re trying to attract.


    Inspired by this post on Product Talk.


  • Stop Measuring Output, Start Driving Outcomes: My February CDH Book Club Guide

    Stop Measuring Output, Start Driving Outcomes: My February CDH Book Club Guide

    “Continuous Discovery Habits” turns five this year, and I’m celebrating by reading the book together with you. Each month, I’m releasing an in-depth reading guide designed for empowered product teams and product trios—complete with the chapters we’ll read, a preview of the key concepts, short shareable videos, individual and team discussion prompts, team exercises you can run immediately, and additional reading to go deeper.

    We’ll discuss each month’s reading in the comments, and we’ll gather quarterly for live calls. If you’re joining late, no problem—I’ll be monitoring comments throughout the year. Start with the current month or go back to January (https://www.producttalk.org/lets-read-continuous-discovery-habits-together-january-2026/). Jump in where it serves you best, ask for help, share what’s working, and connect with other readers any time.

    If you want to participate, grab a copy of the book (https://amzn.to/3hGkNYT?ref=producttalk.org)—or dust off your old one—share the “Spread the Love” videos with your colleagues, set aside time to run the team exercises, and register for the community sessions. Let’s do this.

    This Month’s Reading

    Chapters: Chapter 3: Focusing on Outcomes Over Outputs

    Estimated reading time: ~22 minutes

    This chapter zeroes in on the critical difference between business outcomes and product outcomes—and why it matters which one your team is assigned; how to translate lagging business metrics into actionable product outcomes you can actually influence; why setting outcomes should be a two-way negotiation between leaders and product trios; when to start with a learning goal versus a performance goal; and five common anti-patterns that derail outcome-focused teams. Need a copy? Grab the book (https://amzn.to/3hGkNYT?ref=producttalk.org).

    Share the Love with Friends and Colleagues

    We learn best in community. I like to seed conversations across my org with short, high-signal content—especially when I’m shifting a culture from outputs to outcomes and sharpening OKRs. Use these short videos to bring peers into the conversation and invite them to read along:

    “What’s an outcome?” (https://videos.producttalk.org/videos/ea9fdab71d1ee3c263/whats-an-outcome?ref=producttalk.org): The real value of starting with an outcome.
    “Business outcomes vs. product outcomes” (https://videos.producttalk.org/videos/069fd5b5101ee2c78f/business-outcomes-vs-product-outcomes?ref=producttalk.org): Why product teams need product outcomes, not business outcomes.
    “What’s the difference between OKRs and outcomes?” (https://videos.producttalk.org/videos/069fdab61919e4c38f/whats-the-difference-between-okrs-and-outcomes?ref=producttalk.org): Any outcome can be represented as an OKR.
    “Understanding revenue model formulas” (https://videos.producttalk.org/videos/799fd5b5101ee2c4f0/understanding-revenue-model-formulas?ref=producttalk.org): How to identify the business outcomes your company cares about.
    “Revisit your outcome every quarter” (https://videos.producttalk.org/videos/449fd5b4111ee0cfcd/revisit-your-outcome-every-quarter?ref=producttalk.org): Don’t abandon your outcome, but do revisit how you measure it.

    Reflect and Discuss What You Read

    Reflection is the conversion rate optimizer for learning. When we pause to discuss what we’re reading, we retain more and apply it faster—especially in product discovery and product strategy work. This chapter challenges us to update our definition of success: away from features shipped and toward outcomes achieved. This month, I’m examining my own relationship with outcomes—where I’ve been rigorous, where I’ve drifted, and how I can help my teams strengthen day-to-day behaviors.

    Individual Reflection

    If your team isn’t working toward an outcome, look at the features or projects on your roadmap and ask: What impact are they supposed to have? If they succeed, what customer behavior or business result would change? If your team does have an outcome, consider whether it’s a business outcome, a product outcome, or a traction metric—and how that choice shapes your daily decisions and discovery cadence. Finally, think about the last time your team’s outcome changed: Was it a deliberate strategic shift, or did it feel like ping-ponging from one priority to the next?

    Team Discussion

    As a team, classify your current outcome: Is it a business outcome, a product outcome, or a traction metric? If it’s a business outcome, identify the leading customer behaviors that would signal momentum; if it’s a traction metric, broaden it to a product outcome that gives you more room to explore. Then, name which of the five anti-patterns (pursuing too many outcomes, ping-ponging, individual outcomes, outputs as outcomes, or tunnel vision) shows up for you and pick one concrete change. Finally, assess how outcomes are set: Are they handed down, or does your product trio co-create them? What would it take to make this a true two-way negotiation?

    Put It Into Practice

    Understanding the difference between business outcomes and product outcomes is table stakes. Translating one into the other is where product management leadership shows up. These exercises will help you connect company goals to customer behavior, avoid writing OKRs that measure output instead of outcomes, and increase your span of control over meaningful change.

    Exercise: Map Your Revenue Model

    Time: 30 minutes. Do this: Solo first, then share with your team. Start with this question: How does your company make money? Write out the formula for your revenue model. For example, a subscription business might be: Revenue = Number of Customers × Average Monthly Spend × Retention. Once you have the formula, identify each variable as a potential business outcome. Then, for each business outcome, brainstorm two to three product outcomes (customer behaviors or sentiments) that might be leading indicators. Which of these product outcomes is your team best positioned to influence?
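
    The subscription formula above can be turned into a small calculator to see what a change in one variable is worth. This is only a sketch: the baseline numbers are made up for illustration, and `revenue_lift` is a hypothetical helper, not part of the exercise as written.

```python
def monthly_revenue(customers: int, avg_monthly_spend: float, retention: float) -> float:
    """Revenue = Number of Customers x Average Monthly Spend x Retention."""
    return customers * avg_monthly_spend * retention


def revenue_lift(baseline: dict, variable: str, new_value: float) -> float:
    """Change in monthly revenue when one variable moves and the rest hold."""
    if variable not in baseline:
        raise KeyError(f"unknown variable: {variable}")
    changed = {**baseline, variable: new_value}
    return monthly_revenue(**changed) - monthly_revenue(**baseline)


# Illustrative numbers only; substitute your own company's figures.
baseline = {"customers": 1000, "avg_monthly_spend": 50.0, "retention": 0.90}

retention_lift = revenue_lift(baseline, "retention", 0.92)
spend_lift = revenue_lift(baseline, "avg_monthly_spend", 52.0)
```

    With these made-up figures, moving retention from 90% to 92% is worth about 1,000 a month. Running the same comparison for each variable is a quick way to ask which business outcome, and therefore which product outcomes beneath it, deserves your team's attention.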

    Exercise: Audit Your Current Outcome

    Time: 45 minutes. Do this: With your product trio. Take your team’s current outcome and run it through a quick diagnostic: Is it a business outcome, product outcome, or traction metric? If it’s a business outcome, what product outcomes might drive it? If it’s a traction metric, how might you broaden it to a product outcome? Is it a leading indicator or a lagging indicator? Can you measure progress weekly, or do you have to wait months? Is it within your team’s span of control? Based on your answers, draft a revised outcome that offers more actionable feedback while still connecting to business value, and prepare to discuss this with your product leader.

    Go Deeper: Additional Reading

    If you prefer an audio summary of this month’s reading, including the book chapter and the resources below, I’ve included an audio version at the end of this post for paid subscribers.

    Related In-Depth Guide: Shifting from Outputs to Outcomes: Why It Matters and How to Get Started (https://www.producttalk.org/shifting-from-outputs-to-outcomes/).

    Supplementary Reading:
    Empower Product Teams with Product Outcomes, Not Business Outcomes (https://www.producttalk.org/2020/05/product-outcomes/).
    Defining Product Outcomes: The 8 Most Common Mistakes You Should Avoid (https://www.producttalk.org/2022/12/defining-product-outcomes/).
    Understanding How Product Outcomes Connect to Revenue and Costs (https://www.producttalk.org/2023/04/connecting-product-outcomes-to-revenue-and-costs/).
    Product in Practice: Iterating to an Actionable Outcome at tails.com (https://www.producttalk.org/2020/08/actionable-outcomes/).
    Product in Practice: Iterating on Outcomes with Limited Data (https://www.producttalk.org/2023/12/iterating-on-outcomes-with-limited-data/).
    Measurable Outcomes – All Things Product with Teresa Torres and Petra Wille (https://www.producttalk.org/measurable-outcomes-all-things-product-podcast-with-teresa-torres-petra-wille/).

    Other Voices:
    The Business Equation by Brett Bivens (https://venturedesktop.substack.com/p/the-business-equation?ref=producttalk.org).
    KPI Trees: How to Bridge the Gap Between Customer Behavior, Product Metrics, and Company Goals by Petra Wille and Shaun Russell (https://www.petra-wille.com/blog/kpi-trees-how-to-bridge-the-gap-between-customer-behavior-product-metrics-and-company-goals?ref=producttalk.org).
    Persistent Models vs. Point-In-Time Goals by John Cutler (https://cutlefish.substack.com/p/tbm-2553-persistent-models-vs-point?ref=producttalk.org).
    Is It Time to Ditch the Old SaaS Metrics? by Kyle Poyar (https://openviewpartners.com/blog/saas-metrics-plg/?ref=producttalk.org).
    How Engagement Metrics Can Be Misleading by Oleg Yakubenkov (https://gopractice.io/blog/how-engagement-metrics-can-be-misleading/?ref=producttalk.org).
    Subscription Churn Metrics and Benchmarks for Operators by Elena Verna (https://www.elenaverna.com/p/subscription-churn-benchmarks-and?ref=producttalk.org).

    Related Courses: Business Fundamentals: Navigate Your Business Context with Confidence (https://learn.producttalk.org/course/business-fundamentals?utm_source=Product+Talk&utm_medium=cdh-book-club-february-2026).

    Our Live Discussion Schedule

    Our live discussion sessions are for paid subscribers and will not be recorded. Invitations will go out to Supporting Members and CDH Members (http://members.producttalk.org/?ref=producttalk.org) two weeks before each event—reserve time on your calendar now so you can participate fully and bring real examples from your team.

    Wednesday, March 18, 2026: 9am–10am PDT and 4pm–5pm PDT.
    Tuesday, June 16, 2026: 9am–10am PDT and 4pm–5pm PDT.
    Thursday, September 17, 2026: 9am–10am PDT and 4pm–5pm PDT.
    Wednesday, December 16, 2026: 9am–10am PST and 4pm–5pm PST.

    Audio Summary

    Prefer to listen? I’ve included an audio summary—Stop Measuring Code Start Measuring Behavior—at the end of this post so you can review the main ideas on your commute or between meetings.

    I’m excited to dive into outcomes with you this month. As a product leader, I’ve seen teams transform their product discovery, product roadmapping and sprint planning, and OKR quality when they anchor on clear product outcomes tied to business value. Let’s build that muscle together and make this a quarter where we stop measuring output and start driving outcomes.


    Inspired by this post on Product Talk.


  • The AI Deployment Gap Is Widening—Accelerate to Mature ROI and World-Class CX in 2026

    The AI Deployment Gap Is Widening—Accelerate to Mature ROI and World-Class CX in 2026

    I’ve watched AI adoption accelerate dramatically over the last year, and the momentum is undeniable. Teams everywhere are experimenting, piloting, and operationalizing AI—but the ways they’re doing it, and the outcomes they’re seeing, vary widely.

    Our latest research shows that 82% of senior leaders invested in AI for customer service in 2025, and 87% plan to in 2026. That’s the new baseline. The differentiator now is depth—how far AI is embedded into core workflows, accountability, and measurement.

    Figure: AI benefits in customer service by deployment stage. Teams with mature AI are almost twice as likely to report higher-quality, more consistent support: 43% of mature deployments cite this benefit, compared with 24% at initial deployment. (The survey allowed multiple responses.)

    But while most teams are using AI, our 2026 “Customer Service Transformation Report” shows that this usage is not equal. A gap is opening up between teams that have deployed AI at a surface level and those that have integrated it deeply. I see this firsthand: shallow deployments answer FAQs; deep deployments redesign processes, policies, and teams.

    Figure: Customer service improvements after AI. Nearly nine in ten organizations with mature AI deployments (87%) report improved customer service metrics, compared with 62% of all respondents.

    For this year’s report, we surveyed over 2,400 global customer service professionals across a range of industries to see how they’re using AI today, where it’s paying off, and what they’re betting on as they plan for 2026. The findings mirror my experience leading AI Strategy and AI workflows at scale.

    Figure: Share of customer service teams measuring AI ROI, by deployment stage: 35% exploring, 43% initial, 60% scaling, 70% mature. ROI tracking rises as AI programs advance.

    We found that for many teams, AI is still doing narrow work like answering simple questions or handling small parts of workflows. These teams are seeing benefits, but only a fraction of what’s possible. Meanwhile, a smaller group is pulling away. They’ve put AI at the core of their service operation, integrating it into critical workflows, giving it more responsibility, and continuously improving it over time. That’s the hallmark of mature deployment.

    Figure: Customer service priorities, 2025 vs. 2026. In 2026, improving CX leads at 58%, followed by reducing costs and improving efficiency at 46%; support quality moves to third.

    The difference in results and overall support experience – for both teams and customers – is significant. Here’s how I interpret the data and what I recommend to close the gap.

    Figure: “How are existing support roles changing on your team as a result of AI?” 45% of teams updated job descriptions and 40% added AI training for agents, with other shifts at 24–27%; human agents focus more on complex escalations.

    AI adoption is the norm; depth makes the difference. According to senior leaders, 82% of organizations invested in AI in 2025, with 87% planning to invest in the year ahead. Despite this widespread investment, only 10% of teams report having reached a mature level of deployment, where AI is fully integrated into operations and working at scale. In my playbook, maturity means end-to-end ownership of well-defined workflows, robust guardrails, and clear success criteria.

    Figure: Drivers for expanding AI beyond support: success with AI in support (57%), a unified customer experience (49%), scaling without added headcount (33%), and cross-department demand (31%).

    Reaching this level of maturity is where AI’s real value lies. We found that 43% of teams with mature deployment report higher quality and consistency across support – nearly double the rate of those still in the exploration or initial deployment stages. That aligns with what I see when we move from point solutions to platform thinking and agentic AI patterns.

    Leaders are racing ahead with real AI in support. Explore the 2026 Customer Service Transformation Report to see where deployment is stalling, benchmark your team, and get practical steps to scale automation that delights.

    ROI becomes clearer with deeper integration. The economic benefits of AI tend to show up first in speed and throughput, and they show up fast. Across all respondents, 62% say their customer service metrics have improved since implementing AI. Most often, teams report their initial gains in efficiency and scale—faster responses, shorter handling times, and the ability to resolve more conversations with the same team—all driving lower cost per interaction.

    But the deeper teams go with deployment, the more the results show up in the metrics. Among teams that describe their AI deployment as mature, the share of respondents reporting improved metrics as a result of AI rises from 62% to 87%. What’s more, teams with more mature deployments are significantly more likely to say they can measure the return on their AI investment. My advice: instrument everything upfront, baseline rigorously, and use eval-driven development to iterate with confidence.

    The bar has moved from ‘does it work?’ to ‘is it actually good?’ More than ever, teams are focused on improving customer experience and satisfaction, with 58% saying it’s the top priority for 2026. That number has more than doubled since last year, when just over a quarter (28%) of respondents cited it as a top priority. As AI assumes repetitive work, your people can shift from reactive triage to proactive journey design. Now is the time to invest in quality frameworks, prompt engineering standards, and LLM fluency for product managers to close the loop between product, ops, and CX.

    Important support work now extends beyond the inbox. AI is reorganizing core customer service operations as it starts to take on a higher volume of work and more complex tasks. Even at the initial deployment stage, 16% of teams report spending less time handling support volume since implementing AI – and among teams who’ve reached maturity, that figure rises to 28%. I’ve seen new roles emerge—AI operations managers, conversation designers, and model evaluators—alongside upskilling for agents into higher-order troubleshooting and relationship building.

    Support is creating the blueprint for AI deployment across the business. Support was the proving ground for AI, and our research suggests that businesses are now planning to expand its use to other areas based on the results it’s yielded so far. Fifty-two percent of respondents said that their organizations are actively planning to scale AI to departments like customer success, marketing, and sales in 2026. The two most cited driving forces behind this decision are the success support has seen with AI to date and a desire to create a unified customer experience. Treat your support stack as a reusable platform: shared services, governance, and reusable components accelerate adoption in adjacent functions.

    Seize the opportunity to close the gap. Having or not having AI isn’t a question anymore. What you should be asking now is how close you are to mature deployment, where AI is capable of tackling nuanced, high-stakes work. Those who have reached this stage show that going deep is what unlocks real value. That’s the opportunity. Push AI to do more, bring it to more channels, use it to resolve the most complex queries, and close the gap before it becomes too wide to close.

    This might seem daunting. But trying new things always is. What we’re experiencing now is a defining moment for customer service, and the teams that are leaning in are actively building the future. As this report shows, what works in customer service now will become the blueprint for how organizations transform the full customer journey with AI. If you want the benchmarks and the playbook to accelerate from pilots to production-grade outcomes, I recommend reviewing the full “2026 Customer Service Transformation Report.”


    Inspired by this post on The Intercom Blog.


  • AI Operating Model Masterclass: How I Scale Teams, Tech, and Governance Without Chaos

    AI Operating Model Masterclass: How I Scale Teams, Tech, and Governance Without Chaos

    When I set out to operationalize AI across a product organization, I focus on one promise: repeatable outcomes without chaos. An effective AI operating model turns experiments into an engine—aligning strategy, teams, technology, and governance so we can ship value safely and at scale.

    At its core, an AI operating model is the connective tissue between vision and delivery. I anchor it on a few pillars: clear AI Strategy, empowered cross-functional teams, a modern AI platform, rigorous AI risk management and data governance, and a cadence of eval-driven development that ties everything back to outcomes.

    Strategy comes first. I translate big ambitions into a portfolio of use cases ranked by customer impact, feasibility, and risk. I use continuous discovery to validate the problem, then frame each bet with outcome-based OKRs, a crisp value proposition, and a build vs buy decision. For generative AI, I encourage PMs to treat working with LLMs as a craft: rapid prototyping, deliberate prompt engineering, and disciplined evaluation from day one.

    Team design matters as much as models. I organize around product trios—PM, design, and engineering—augmented by data, ML, and a “forward deployed” mindset when the domain is complex. I invest in empowered product teams and communities of practice to spread patterns quickly while avoiding centralized bottlenecks.

    On the platform side, I start with a retrieval-first pipeline before fancy modeling. A solid foundation (feature stores, vector search, observability, and safe integration points) beats bolt-on hacks. I rely on CI/CD with feature flags, DORA metrics such as deployment frequency, and SRE-grade reliability to keep the iteration loop tight and safe.

    Governance is non-negotiable. I implement privacy-by-design, clear data governance, audit trails, and policy controls aligned to regulatory compliance. AI risk management includes model red teaming, safety layers, and human-in-the-loop review where needed. The goal is confidence: we know what shipped, why it works, and how it fails.

    Execution rides on eval-driven development. For every AI workflow, I define offline and online test sets, target metrics, and a decision policy before launch. I size A/B tests around a pre-specified minimum detectable effect (MDE), add canary releases for protection, and monitor user experience and outcomes in production. This is how we turn “it seems smarter” into statistically confident improvements.
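
    To show what planning around an MDE involves, here is a sketch of the standard normal-approximation sample size for comparing two proportions (for example, task success rate with and without an AI feature). The 5% significance level, 80% power, and the example rates are assumptions chosen for illustration, not values from any specific experiment.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(
    baseline_rate: float,
    mde_abs: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Users needed in each arm to detect an absolute lift of `mde_abs`.

    Uses the two-sided, two-proportion normal approximation:
    n = (z_alpha + z_power)^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    if mde_abs == 0 or not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must be strictly between 0 and 1, with a nonzero MDE")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / mde_abs ** 2)


# Hypothetical example: 60% baseline success rate, 5-point absolute MDE.
needed = sample_size_per_arm(baseline_rate=0.60, mde_abs=0.05)
```

    The practical point is the inverse-square relationship: halving the MDE roughly quadruples the traffic required, which is why the MDE should be agreed before launch rather than back-fitted to whatever the test happened to show.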

    Adoption is a product in itself. I build onboarding, in-app guides, and product tours that help users form habits quickly. I monitor activation, time-to-value, and retention analysis while partnering with customer support on AI strategy to close the loop between real-world issues and roadmap priorities.

    Culture scales the system. I normalize rapid learning, shared playbooks, and personal knowledge management so insights don’t disappear into meetings or notebooks. I upskill teams on prompt engineering, context window management, and model selection, and I celebrate the humility required to refactor what “worked” yesterday.

    Operating cadence keeps it all coherent. I run an AI portfolio review tied to outcome-based OKRs, keep a single source of truth for evaluations, and align go-to-market strategy with release readiness. We review risks alongside results so speed never outruns safety.

    If you’re starting from scratch, I recommend a 30-60-90 approach: baseline your current state, choose two lighthouse use cases, stand up the retrieval-first pipeline and eval harness, define governance and data policies, then ship small, safe increments behind feature flags. Teach the system to learn before you make it run.
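
    For the "small, safe increments behind feature flags" step, one common pattern is a deterministic percentage rollout. The sketch below is an assumption about how such a flag could work, not a description of any particular flagging product; the flag name and user IDs are hypothetical.

```python
import hashlib


def bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag: str, user_id: str, rollout_percent: int) -> bool:
    """True when the user's bucket falls inside the current rollout percentage."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    return bucket(flag, user_id) < rollout_percent


# Hypothetical flag and users: start at 10%, widen once evals and metrics hold.
users = [f"user-{n}" for n in range(1000)]
early_cohort = [u for u in users if is_enabled("ai-summary", u, 10)]
wider_cohort = [u for u in users if is_enabled("ai-summary", u, 50)]
```

    Because the bucket depends only on the flag name and user ID, a user who sees the feature at 10% keeps it at 50%, and setting the percentage to 0 acts as the rollback.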

    I’ve felt the pain of brilliant prototypes that crumble in production and the thrill of AI features that compound value month after month. The difference is the operating model. Build it with intent, and you’ll scale AI with confidence—teams aligned, tech resilient, and customers seeing real outcomes.


    Inspired by this post on Product School.

