Author: Shivam Tiwari

February Fin Breakthroughs: Master complex workflows, natural voice, 2-minute Shopify, smarter ops

Every update we shipped this month removed a specific constraint on what teams can do with Fin. In my world, the demo-to-production gap shows up as complexity, control, and confidence. Can the agent handle the query that actually matters? Will it sound right on a call? Can the team deploy it without filing an engineering ticket? Can managers understand what it’s doing? That’s the bar I hold us to.

This month, we delivered answers to all four. Here’s how.

Procedures and Simulations (0:51). The hardest problem in AI-powered customer service isn’t answering FAQs—it’s executing complex queries with real business logic and real consequences if anything goes wrong. Think billing refunds, multi-step flows, and actions that must be right the first time.

We made it dramatically easier to build and manage Fin for those complex queries—without pulling in an engineer. You can author in natural language, test every step in simulation, and deploy with confidence.

The workflow starts with AI drafting the procedure from your existing source material. You edit in natural language, with structured hooks to pull in live data, apply business logic, and add code for deterministic control where you need it. That’s how you handle multi-step flows with the precision that matters when things go wrong.

Simulations are the test environment. Define a test case, pass in the data Fin would receive in a real conversation, and watch it work through each step. You see what Fin is doing, why, and whether it’s meeting the criteria you set. Full transparency at every point. I’ve run these end-to-end myself, and there’s a particular confidence that comes from watching it work before it goes anywhere near a customer.

For a deeper look at Procedures and Simulations, head to fin.ai/procedures.

Fin Voice: three major updates. When something’s off in chat, it can take a few exchanges to notice; on a call, it’s immediate. Pronunciation, noise handling, and tone all matter because they’re the customer’s first impression.

Pronunciation rules (4:18). Fin has high out-of-the-box pronunciation accuracy, but it doesn’t know your brand—your product names, your industry terminology, the way your company uses certain words. Alihan Zinna, Staff ML Scientist, showed this with an IKEA example: without pronunciation rules, Fin mispronounced both “IKEA” and a product name; after adding rules, both were corrected and sounded natural.

New natural voices (5:48). We’ve added 11 new voices tuned to a range of brand tones so you can choose one that sounds like it truly belongs to your company—not a generic AI assistant.

Background noise reduction (6:28). People call from airports, shops, and busy offices. Fin now monitors background noise continuously and increases noise reduction when the environment demands it. No configuration needed. As Alihan put it, “This is one of those things customers really notice when it’s not working. The goal was to make it invisible. That’s what we built.”

Shopify setup experience (8:21). Fin began as a Service Agent and is quickly becoming a Customer Agent—working across the whole lifecycle to support, sell, and guide, even before a customer has an issue. The revamped Shopify setup is a clear step forward.

Shopify catalogs are complex—thousands of products, variants, and dynamic inventory—and connecting all of that to an agent has historically been painful. We removed the friction.

Setup now takes three steps: first, connect your store. Second, install the Messenger directly in Shopify—no code, just a few clicks. Third, deploy Fin. Total time: under two minutes. We timed it live.

What that unlocks is real. In the demo, a first-time snowboarder asked for recommendations. Fin searched the catalog, reasoned about attributes that matter to a beginner (there’s no “beginner” tag in the catalog), personalized suggestions by height and weight, and added a board to the cart.

Even better, one customer updated their website copy to promote a sale. Fin immediately picked up the new context and began recommending sale items, nudging shoppers to add more to the cart to access a discount—no extra configuration required. It read the situation and acted.

Three steps, and you have a real-time shopping assistant that knows your store and sells on your behalf.

Helpdesk improvements (12:31). Fin works with any helpdesk, but many teams consolidate to take advantage of our native Intercom helpdesk integration. We’ve shipped 19 helpdesk improvements in 2026 so far; two from this month stand out.

11 new call metrics. Hold time, outbound dial time, missed and declined calls, call terminating party, and more. These give leaders the visibility to analyze workload distribution and call handling quality in detail.

Holiday office hours. Teams no longer need to manually update office hours for every public holiday. This was the most upvoted request in our community, and we shipped it.

Across the board, we removed the constraints that hold teams back: the complexity ceiling in automation, the quality ceiling in voice, the setup barrier in Shopify, and the operational overhead in the helpdesk.

We closed out the month with a Star Wars–style crawl of 22 additional updates. All features mentioned here are live and available now. Explore more at fin.ai/updates. More to come—see you next month.

Inspired by this post on The Intercom Blog.

March 10, 2026
Kill Your Darlings: Why I Sunset ‘Successful’ Products to Fuel Real Portfolio Growth

There’s a moment in every product leader’s career when the bravest decision isn’t to build—it’s to stop. That’s why the “Kill Your Darlings” theme resonated so strongly with me. In this episode of All Things Product, Teresa Torres and Petra Wille dig into the courage and craft it takes to sunset products that look successful on the surface yet quietly block your path to meaningful growth. As someone accountable for portfolio outcomes, I’ve learned that disciplined endings are often the catalyst for exceptional beginnings.

The heart of the conversation is that uncomfortable middle ground between obvious failure and runaway success: products that are profitable, loved by customers, but fundamentally flatlining. Teresa shares candid stories from her own business, including a decision to cut 40% of revenue on purpose. I’ve been there—choosing to retire a “working… kind of” product to free up discovery capacity felt risky in the moment, but it created the focus we needed for durable growth.

Here’s the trap: some traction can be more dangerous than no traction at all. Early fans are not the same as durable product–market fit, and “stable but not growing” can lull leaders into maintaining instead of learning. Every hour of design, engineering, and go-to-market attention that props up a flatlining product is an hour not invested in the next breakthrough—an opportunity cost that rarely shows up on a dashboard, yet compounds month after month.

From a portfolio perspective, this is continuous discovery in action. If we want empowered product teams to tackle meaningful outcomes, we have to protect their capacity from zombie work. That means setting clear thresholds for when we double down, shift strategies, or sunset—before attachment and inertia take over. When I’ve institutionalized this discipline, our throughput of high-quality bets increased, and our confidence in what not to do became a strategic advantage.

Organization design can make sunsetting harder than it needs to be. Dedicated, long-lived teams are fantastic for compounding capability, but they also create emotional and structural ties to specific products. Petra’s point lands: leaders need explicit sunsetting conversations and a portfolio decision-making cadence that sits one level above teams. In my org, we treat sunsetting as a strategic reallocation—not a verdict on a team’s talent—so people are celebrated for learning, not punished for outcomes outside their control.

Killing profitable products can be the right strategic move when the growth ceiling is clear and the opportunity cost is high. I’ve chosen to “burn the ships (on purpose)” more than once—retiring add-ons that generated reliable revenue but diluted our value proposition and spread discovery thin. Yes, it stings in the quarter you do it. But it’s astonishing how quickly focus restores momentum when you create intentional space for what’s next.

Practically speaking, I make sunsetting easier and less traumatic by operationalizing it: regular portfolio reviews focused on outcomes and opportunity cost; a visible “sunsetting” column so everyone sees what’s on the table; the Horizon (H1 / H2 / H3) model to balance core, adjacent, and transformational bets; and portfolio decisions made one level above teams to avoid local optimizations. I also add explicit exit criteria and success metrics for endings, the same way we set entry criteria for new bets.

Another theme I appreciated is designing for the right customers. Teresa highlights intentionally limiting access and pricing to work with customers who show agency and commitment. I’ve applied the same principle: when we’re clear about who we serve and who we don’t, our product–market signal sharpens, churn narratives simplify, and roadmaps get crisper. Focus is a growth strategy.

If you’re leading a product portfolio, running discovery, or wrestling with a product that “works… kind of,” this conversation is permission to act. Product–market fit isn’t binary, and mediocre success can be the most dangerous place to stay. Sunsetting is a portfolio decision, not a team failure; teams shouldn’t be punished for reaching the end of a product’s natural lifecycle. If experimentation isn’t in your DNA, killing products will always feel traumatic—so make space for it intentionally, not passively.

Key moments and themes worth bookmarking:

00:00 – Why “kill your darlings” matters
04:30 – The dangerous middle ground
09:30 – The opportunity cost of “okay” products
14:30 – Sunsetting in product organizations
19:00 – Real examples of killing revenue streams
28:00 – Designing for the right customers
33:30 – Burn the ships (on purpose)
38:00 – Making sunsetting easier with regular portfolio reviews, a visible “sunsetting” column, the Horizon (H1 / H2 / H3) model, and portfolio decisions made one level above teams
46:00 – Normalizing product lifecycles

Resources & Links:

Follow Teresa Torres: https://ProductTalk.org

Follow Petra Wille: https://Petra-Wille.com

Mentioned in this episode:

Ways to Work with Petra Wille

Product at Heart

CDH Membership by Teresa Torres

Product Talk by Teresa

Product Talk Academy by Teresa

Enduring Ideas: The three horizons of growth

Inspired by this post on Product Talk.

March 10, 2026
Ship MVPs in Days, Not Months: My Proven Prompt Prototyping Playbook for Product Teams

Most MVPs take too long, cost too much, and still miss the mark. Over the past year, I’ve shifted my team to a prototyping prompts approach that lets us validate problem-solution fit in days, not months. The result is faster learning loops, clearer tradeoffs, and a dramatically higher hit rate on features that actually move the needle.

When I say prototyping prompts, I mean structured, layered instructions that guide generative AI systems to produce the right artifacts at the right fidelity. Instead of jumping straight to code, we generate concise problem briefs, user stories, interaction flows, low-fidelity UI descriptions, and test plans. Each pass is constrained by acceptance criteria and business outcomes, which keeps the work grounded in value rather than output.

Here’s the playbook my product trios use to go from idea to a testable MVP in 48–72 hours. First, we anchor on OKRs that measure outcomes rather than output, and we clarify the customer job-to-be-done using evidence from customer interviews and support data. This is classic continuous discovery, but we compress it by focusing on the single riskiest assumption to de-risk this week.

Second, we build a prompt scaffold. We specify the role, constraints, target users, success metrics, and the exact output format we expect. We also define evaluation upfront, borrowing from eval-driven development. For example, before any generation, we list the acceptance tests that a good solution must pass, including edge cases and compliance considerations. This discipline keeps hallucinations in check and improves repeatability.
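To make the scaffold concrete, here is a minimal Python sketch of how the fields named above could be captured and rendered into a single prompt. The `PromptScaffold` class, its field names, and the sample values are all illustrative assumptions, not part of any specific tool.

```python
from dataclasses import dataclass, field


@dataclass
class PromptScaffold:
    """Hypothetical container for the scaffold fields described in the text."""
    role: str
    target_users: str
    constraints: list[str]
    success_metrics: list[str]
    output_format: str
    acceptance_tests: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Assemble the fields into one plain-text prompt."""
        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        sections = [
            f"Role: {self.role}",
            f"Target users: {self.target_users}",
            "Constraints:\n" + bullets(self.constraints),
            "Success metrics:\n" + bullets(self.success_metrics),
            f"Output format: {self.output_format}",
            # Acceptance tests are listed before any generation happens, so
            # every draft is judged against the same checklist.
            "Acceptance tests the output must pass:\n"
            + bullets(self.acceptance_tests),
        ]
        return "\n\n".join(sections)


# Sample values are invented for illustration only.
scaffold = PromptScaffold(
    role="Senior product manager drafting a one-page problem brief",
    target_users="Clinic receptionists booking appointments by phone",
    constraints=[
        "No personally identifiable information in examples",
        "Cite only the approved source documents",
    ],
    success_metrics=["Task completion rate", "Time-to-value"],
    output_format="Sections titled Problem, Users, Risks",
    acceptance_tests=[
        "Names the single riskiest assumption",
        "Covers the no-availability edge case",
    ],
)
prompt_text = scaffold.render()
print(prompt_text)
```

Keeping the acceptance tests inside the same object as the prompt means the evaluation criteria are versioned together with the instructions they apply to.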

Third, we spin up multiple prototypes in parallel. One prompt generates a lean product brief; another outlines user flows; a third proposes UI states and error handling. If we’re exploring voice, we add prompt engineering for voice to script dialogs and repair strategies. For data-heavy features, we call out retrieval-first pipeline patterns so the model references source-of-truth data rather than guessing.

Fourth, we validate with real users using the lightest-weight experiment possible. Fake-door tests, concierge workflows, and guided click-throughs let us measure intent before we invest. Where we can, we run quick A/B testing and size the effort using minimum detectable effect (MDE) so we don’t over- or under-sample. The point isn’t perfection; it’s fast, directional signal to inform the next iteration.
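For sizing a test from an MDE, one common approximation is the two-proportion z-test sample-size formula. The sketch below assumes a two-sided test, equal-sized arms, and an absolute lift; the function name and default values are my own choices for illustration, and your experimentation platform may use a different method.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    baseline_rate: control conversion rate, e.g. 0.10
    mde_abs: smallest absolute lift worth detecting, e.g. 0.02
    """
    treated_rate = baseline_rate + mde_abs
    if not 0 < baseline_rate < 1 or not 0 < treated_rate < 1:
        raise ValueError("rates must fall strictly between 0 and 1")
    if mde_abs == 0:
        raise ValueError("mde_abs must be non-zero")

    # Critical values for the chosen significance level and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)

    # Sum of the Bernoulli variances of the two arms.
    variance = (baseline_rate * (1 - baseline_rate)
                + treated_rate * (1 - treated_rate))

    n = (z_alpha + z_power) ** 2 * variance / mde_abs ** 2
    return math.ceil(n)


# Example: 10% baseline conversion, detect a 2-point absolute lift.
n_small_lift = sample_size_per_arm(0.10, 0.02)
# Doubling the MDE cuts the required sample sharply.
n_large_lift = sample_size_per_arm(0.10, 0.04)
print(n_small_lift, n_large_lift)
```

The practical takeaway is the inverse-square relationship: halving the effect you want to detect roughly quadruples the users you need, which is why picking the MDE up front matters for a 48-hour experiment.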

Fifth, we instrument and ship behind feature flags. We track activation, task completion, and time-to-value from day one. On the delivery side, we watch DORA metrics and deployment frequency to ensure we’re learning continuously rather than batching big bets. This bridges discovery and delivery so roadmaps reflect real-world feedback, not assumptions.

One recent example: we needed to evaluate a voice AI agent for appointment scheduling. In 72 hours, prompts produced the problem brief, dialog flows, error recovery strategies, and a sandbox to simulate inbound requests across three user personas. We exposed a thin slice to a pilot cohort, captured call outcomes, and iterated the repair prompts twice before writing any production code. The pilot converted at a higher rate than our control flow and gave us the confidence to invest in full integration.

This approach only works if we treat governance as a first-class concern. We bake in privacy-by-design, clear data governance boundaries, and AI risk management from the start. Prompts include guardrails on personally identifiable information, explicit constraints on data use, and links to approved sources. We also maintain a prompt repository with versioning and automated evaluations so changes are observable and reversible.

Practically, strong prompt scaffolds share three traits. They’re specific about context and constraints, they define success in measurable terms, and they separate concerns by artifact type. I’ll often ask for three variants with different tradeoffs, then run a quick synthesis prompt that highlights points of parity and differentiation. This gives the team structured options rather than a single, brittle path.

If you’re starting from zero, begin with one high-leverage workflow. Write a crisp outcome statement, draft your acceptance tests, and create a prompt that outputs a one-page brief, three user flows, and the top five risks with mitigations. Validate with five users in 48 hours, then decide: double down, pivot, or park. Rinse and repeat, and your product roadmapping and sprint planning will shift from speculation to evidence.

The bottom line is simple. Prototyping prompts won’t replace product judgment, but they will accelerate it. By turning ideas into testable artifacts in hours, you minimize waste, maximize learning, and ship better MVPs—fast.

Inspired by this post on Product School.

March 9, 2026
Behavioral Analytics That Crush Fraud: Spot Anomalies, Prioritize Risk, Act with Confidence

Fraud teams are drowning in signals—events, alerts, and edge cases that look suspicious but rarely point to what truly matters now. In my role leading product, I focus on turning that noise into clear, ranked actions the team can trust. Behavioral analytics is how we bridge the gap from “something looks off” to “here’s why it matters and what to do next.”

When I build fraud capabilities, I start by defining the outcomes that matter: find anomalies early, prioritize by impact, and respond in minutes—not days. That requires a rigorous approach to data governance, strong observability across the stack, and a mindset tuned to threat detection and response rather than passive reporting.

For me, behavioral analytics means unifying event streams across web, mobile, payments, and support in a single, trustworthy analytics platform. We then apply anomaly detection on top of baselines for user, device, and entity behavior, capturing velocity spikes, geolocation drift, account takeover signals, and unusual journey paths. The win is not more alerts; it’s clearer context per alert.
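As a rough illustration of baseline-driven anomaly detection, the sketch below flags a velocity spike by comparing a user's current event count with that user's own history using a z-score. The function names, the threshold of 3.0, and the sample data are assumptions for the example; production systems typically use richer models.

```python
from statistics import mean, stdev


def velocity_zscore(history: list[float], current: float) -> float:
    """Standard deviations between `current` and the user's own baseline mean."""
    if len(history) < 2:
        return 0.0  # not enough history to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly flat history: any deviation is infinitely unusual.
        return 0.0 if current == mu else float("inf")
    return (current - mu) / sigma


def flag_anomalies(baselines: dict[str, list[float]],
                   current_counts: dict[str, float],
                   threshold: float = 3.0) -> dict[str, float]:
    """Return users whose current count sits at least `threshold` sigmas above baseline."""
    flagged = {}
    for user, count in current_counts.items():
        z = velocity_zscore(baselines.get(user, []), count)
        if z >= threshold:
            flagged[user] = z
    return flagged


# Hypothetical hourly transaction counts per user.
baselines = {
    "user_a": [2, 3, 2, 4, 3],
    "user_b": [10, 12, 11, 9, 10],
}
current = {"user_a": 15, "user_b": 11}
flagged = flag_anomalies(baselines, current)
print(sorted(flagged))
```

Note that the same raw count can be normal for one user and alarming for another; scoring against each entity's own baseline is what supplies the per-alert context.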

Prioritization is where the value compounds. I combine deterministic signals (e.g., device fingerprint mismatches, impossible travel, repeated declines) with weighted risk scoring that adapts to emerging patterns. This helps fraud analysts triage by potential loss and customer impact, not just alert volume—so the highest-risk cases land at the top of the queue with the right context attached.
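A simple way to picture this kind of prioritization is a weighted sum of deterministic signals, multiplied by the amount at risk to order the queue. The weights, the `Case` type, and the expected-loss ordering below are hypothetical; the adaptive part, re-fitting weights as patterns emerge, is not shown.

```python
from dataclasses import dataclass

# Hypothetical weights; in practice these would be tuned and revisited.
SIGNAL_WEIGHTS = {
    "device_fingerprint_mismatch": 0.35,
    "impossible_travel": 0.45,
    "repeated_declines": 0.20,
}


@dataclass
class Case:
    case_id: str
    signals: set[str]
    amount_at_risk: float


def risk_score(case: Case, weights: dict[str, float] = SIGNAL_WEIGHTS) -> float:
    """Sum the weights of the signals present, capped at 1.0."""
    return min(1.0, sum(weights.get(signal, 0.0) for signal in case.signals))


def triage(cases: list[Case]) -> list[Case]:
    """Order cases by score times amount at risk, highest first."""
    return sorted(cases,
                  key=lambda c: risk_score(c) * c.amount_at_risk,
                  reverse=True)


cases = [
    Case("c1", {"repeated_declines"}, 5000.0),
    Case("c2", {"impossible_travel", "device_fingerprint_mismatch"}, 900.0),
    Case("c3", {"impossible_travel"}, 4000.0),
]
queue = [case.case_id for case in triage(cases)]
print(queue)
```

In this toy data the case with the most signals does not land first, because its exposure is small. That is the difference between triaging by potential loss and triaging by alert volume.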

Actionability is the final mile. I map each risk tier to a playbook—step-up authentication, temporary holds, secondary review, or immediate block—so teams can act with confidence. Real-time alerts route to the right channel; feature flags allow fast containment; and AI risk management practices ensure continuous learning while preserving precision and recall. We close the loop by measuring investigation time, false positive rates, and recovery to keep improving.
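The tier-to-playbook mapping can be expressed as a small lookup, with the false positive rate computed from reviewed outcomes to close the loop. The score boundaries and the severity ordering of the four actions below are assumptions made for the sketch, not recommended values.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    STEP_UP_AUTH = "step-up authentication"
    SECONDARY_REVIEW = "secondary review"
    TEMPORARY_HOLD = "temporary hold"
    IMMEDIATE_BLOCK = "immediate block"


# Hypothetical tier floors on a 0-1 risk score, checked from highest to lowest.
PLAYBOOK = [
    (0.90, Action.IMMEDIATE_BLOCK),
    (0.70, Action.TEMPORARY_HOLD),
    (0.50, Action.SECONDARY_REVIEW),
    (0.30, Action.STEP_UP_AUTH),
]


def action_for(score: float) -> Optional[Action]:
    """Return the first action whose floor the score meets, or None below all tiers."""
    for floor, action in PLAYBOOK:
        if score >= floor:
            return action
    return None


def false_positive_rate(outcomes: list[tuple[bool, bool]]) -> float:
    """outcomes holds (was_flagged, was_fraud) pairs from reviewed cases.

    FPR = legitimate cases that were flagged / all legitimate cases.
    """
    legit_flags = [flagged for flagged, fraud in outcomes if not fraud]
    return sum(legit_flags) / len(legit_flags) if legit_flags else 0.0


reviewed = [(True, False), (False, False), (True, True), (False, False)]
fpr = false_positive_rate(reviewed)
print(action_for(0.95), action_for(0.10), round(fpr, 3))
```

Keeping the playbook as data rather than branching logic makes tier changes reviewable and auditable, which fits the emphasis on defensible decisions.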

A few lessons keep paying off: instrument early and consistently; keep your schema stable; document risk definitions; and test changes with A/B testing to quantify impact before scaling. Treat your fraud stack like a mission-critical cybersecurity system with tight SLAs, clear ownership, and auditable decisions—because it is.

If you’re evaluating your next move, start with a narrow but high-ROI use case (account takeover or payment fraud), stand up clear dashboards for analysts, and iterate on the risk scoring model weekly. With disciplined data practices and aligned playbooks, behavioral analytics turns scattered signals into decisive, defensible action.

Inspired by this post on Amplitude – Perspectives.

March 5, 2026
Turn Support Wins into a Company-Wide AI Blueprint for Consistent, End-to-End CX

Building a great end-to-end customer experience with AI means going beyond support, and I’ve seen firsthand how transformative that shift can be when we treat every interaction as part of one cohesive journey.

Every customer touchpoint, from the first sales conversation through to post-sales support and success, is an opportunity to get it right. Other teams are now turning to AI to transform how they show up for customers, and support, which led the way, has already written the blueprint. In my role, I focus on making that blueprint actionable across the entire lifecycle.

In The 2026 Customer Service Transformation Report, it’s clear most businesses are thinking about what’s next, with more than half planning to scale AI to other departments. Interestingly, they often cite their early success with AI in support as motivation for the move. This makes support teams uniquely positioned to help lead the transition, a strategic role unimaginable just two years ago.

In this piece, I share how teams are introducing AI to other parts of the business, how to think about this expansion effort, and the new opportunities it creates for support leaders who want to drive a unified customer experience.

Support was the first proving ground for AI, and our research suggests that businesses are now planning to expand its use to other areas based on the results it’s yielded so far. Fifty-two percent of respondents said that their organizations are actively planning to scale AI to other departments in 2026.

What will this look like? Leading companies are already finding out.

Wins in support are setting the pace for company-wide AI. Survey results rank the drivers: proven success in support (57%), the push for a unified customer experience (49%), scaling other functions without more headcount (33%), and cross-department demand (31%).

My favorite example is WHOOP, the fitness wearables company. They offer a premium product, which makes their sales conversations more consultative than transactional. Customers want to know “Which membership is right for me?” or “How often do I need to charge my WHOOP?” According to Emily Shirley, Business Manager for Growth Product at WHOOP, people who chatted with the inside sales team were twice as likely to convert, but the team didn’t have enough reps to respond to incoming queries fast enough. Customers could wait more than 10 hours for a reply.

With a big product launch on the line and an anticipated spike in prospective customer conversations, their three-person team needed help. So they deployed Fin to the "Join" page, the final step before purchase.

With Fin resolving 84% of inbound questions, the sales team was able to focus on high-value leads. Together, they drove a 130% increase in attributable sales. The team is now exploring ways to expand Fin beyond FAQs, focusing on personalised conversation flows, multi-product recommendations, and richer data capture. As Emily says: “There are so many parts of the buyer journey where this applies. We’ve only scratched the surface.”

It’s clear there’s a desire to push AI to other parts of the customer lifecycle, but there is a risk hidden in this expansion. If sales, customer success, and other departments all launch their own Agent, each operating in isolation, you can end up fragmenting the very thing our research says teams want to create. The second-most cited reason for pushing AI beyond support: desire for a unified customer experience.

Without shared context, each handoff becomes a source of friction where customers could receive inconsistent answers or be asked to repeat information. I’ve watched even well-intentioned AI rollouts struggle here—great local wins, but an overall journey that feels disjointed.

The opportunity (and the challenge) is to keep the customer at the center. Instead of department-specific Agents that operate independently, we must strive for cohesion. That means shared memory, consistent governance, and connected AI workflows that respect the customer’s history and intent across channels.

This is the future I’m building toward: solutions like Fin becoming a “Customer Agent,” capable of handling the entire customer experience. This will mean Fin can function in many roles, supported by a memory that grows with the customer over time and deep knowledge of the business, creating a seamless experience for every interaction. In practice, that’s agentic AI designed to collaborate across teams, systems, and journeys—without losing context.

Pushing AI into new parts of the business requires someone to own the process. And for many organizations, that’s the support team. Nearly a third of respondents (32%) confirmed their customer service teams are leading their business' AI transformation strategy.

This presents a real opportunity for support teams to shape the future of customer experience. Instead of each function reinventing the wheel, support can act as a center of excellence, defining shared standards, guardrails, and operating practices that drive performance.

“You already manage the most complex, high-volume customer interactions; you have rich data on customer needs and behavior; and you know how Agents perform in the real world. Those insights will be invaluable as AI scales across your business.”

In my organization, when we extended AI from support into sales, we deliberately brought our conversation design expertise, Agent Analytics, and governance models along with it. One team owns the orchestration, memory strategy, and CRM integration so a customer can start with a sales question and end up with a support one—without ever feeling a seam. That continuity is where journey mapping meets product strategy and turns into measurable outcomes.

As Agents like Fin expand their capabilities and move into new areas, I expect many customer service leaders will see their roles expand to include AI implementation across the customer journey. It’s a natural progression for product management leadership in support: owning the experience, the data, and the operating model.

A perfect customer experience is AI’s biggest promise. But to get there, teams need to be smart about the solutions they deploy. A unified Customer Agent capable of handling the entire journey end-to-end will have a significant advantage, delivering consistent, context-aware experiences across every interaction.

The Customer Agent future is being built right now, and it’s starting with the team pioneering AI transformation from the very beginning: support. For leaders in these organizations, this is a rare opportunity to shape how customer relationships will be built and maintained in the AI era.

If you’d like to dig deeper into the data and benchmarks guiding these decisions, download The 2026 Customer Service Transformation Report.

Inspired by this post on The Intercom Blog.

March 5, 2026
Prevent Strategy Drift: AI that flags ‘merge conflicts’ in product plans before a quarter derails

"What if an AI could spot the moment two product teams start pulling in opposite directions — before it derails a quarter?" That question hooked me, because I’ve lived through the costly fallout of subtle misalignments that only surface at the end of a sprint—or worse, during quarterly business reviews.

I recently dug into an episode of Just Now Possible featuring Matthias and Charlotte Kleverud, co-founders of Momental. Their vision for "GitHub for product management" hits a nerve in the best possible way: find "merge conflicts" in strategy, not code, and do it early enough to save execution time, trust, and outcomes.

Here’s the core: Momental ingests documents, meeting transcripts, and voice recordings across an organization, then uses AI agents to map them into a structured context layer—a set of interconnected trees covering goals, decisions, learnings, and who's doing what. When it finds a conflict—say, one team betting on retention while another is prioritizing conversion—it surfaces the misalignment for humans to resolve, just like a merge conflict in code. That framing is both familiar (for anyone who’s shipped software) and powerful (for anyone who’s scaled product strategy across multiple teams).
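To make the merge-conflict analogy tangible, here is a toy Python sketch that flags pairs of teams working on the same product area while optimizing for different primary metrics. This is not Momental's implementation, which the episode describes only at a high level; the `Bet` type, its fields, and the conflict rule are assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Bet:
    """Hypothetical record of what a team is optimizing for in one area."""
    team: str
    area: str
    primary_metric: str


def find_conflicts(bets: list[Bet]) -> list[tuple[Bet, Bet]]:
    """Return pairs of bets from different teams in one area with different metrics."""
    by_area = defaultdict(list)
    for bet in bets:
        by_area[bet.area].append(bet)

    conflicts = []
    for area_bets in by_area.values():
        for first, second in combinations(area_bets, 2):
            if (first.team != second.team
                    and first.primary_metric != second.primary_metric):
                conflicts.append((first, second))
    return conflicts


bets = [
    Bet("growth", "onboarding", "conversion"),
    Bet("lifecycle", "onboarding", "retention"),
    Bet("platform", "billing", "reliability"),
]
conflicts = find_conflicts(bets)
for first, second in conflicts:
    # Surface the pair for humans to resolve rather than picking a winner.
    print(f"{first.area}: {first.team} ({first.primary_metric}) "
          f"vs {second.team} ({second.primary_metric})")
```

The hard part in practice is extracting structured bets like these from documents and transcripts; once they exist, detecting the disagreement is the easy step.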

Their journey tracks with what many of us have learned the hard way. "Starting in 2022 with DaVinci 002 and learning that the market wasn't ready for AI-assisted product thinking" pushed them toward experiments with agent teams. "The origin story: building a team of AI agents in 2024, only to discover agents hit the same alignment problems as humans" is exactly the kind of meta-lesson I’d expect when you scale autonomy without shared context. The breakthrough was an "OODA-loop-driven document processing agent" that continuously curates a living knowledge graph rather than relying on static prompts or brittle pipelines.

One model that stood out was "The product chain: signals → learnings → decisions → principles, and how AI maps it." That is the backbone of healthy product thinking. When this chain is explicit and inspectable, you can trace why a team chose Path A over Path B—and detect when new signals should invalidate old decisions. I’ve seen this accelerate continuous discovery and improve executive decision hygiene.

I also appreciated the organizational modeling: "Three trees that model an organization: the product tree (OKRs to epics), the wisdom tree (decisions and their reasoning), and the people/time tree." This maps cleanly to how we run quarterly planning at scale—tying outcomes to work, preserving rationale, and grounding ownership and timelines. With that structure, "How conflicts are detected, auto-resolved, or escalated to humans with merge options" becomes a pragmatic workflow, not a theoretical AI demo.

On the technical front, they’re blunt about limits: "Why traditional chunking and RAG breaks down at scale and what Momental does instead." Anyone who’s tried to stitch strategy from ad hoc notes knows that naive retrieval won’t cut it. You need durable context boundaries, rich metadata, and graph-aware reasoning. Which brings me to one of my non-negotiables: "Why metadata—who said it, when, and in what context—is critical to preventing hallucinations." In my world, we treat provenance like test coverage—you can’t ship without it.
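One way to enforce the provenance rule is to attach who, when, and context to every extracted statement and refuse to use anything that lacks them. The sketch below is my own illustration under that assumption; the `Statement` type and `citable` filter are hypothetical names, not something described in the episode.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """A claim extracted from a document or transcript, with its provenance."""
    text: str
    speaker: Optional[str]
    said_at: Optional[datetime]
    context: Optional[str]

    def has_provenance(self) -> bool:
        """True only when who, when, and context are all recorded."""
        return all(value is not None
                   for value in (self.speaker, self.said_at, self.context))


def citable(statements: list[Statement]) -> list[Statement]:
    """Keep only statements that may be passed to a model as grounded context."""
    return [s for s in statements if s.has_provenance()]


# Invented example data.
statements = [
    Statement("Q2 priority is retention",
              speaker="Head of Product",
              said_at=datetime(2026, 3, 2, tzinfo=timezone.utc),
              context="quarterly planning meeting"),
    Statement("We should focus on conversion",
              speaker=None, said_at=None, context=None),
]
grounded = citable(statements)
print([s.text for s in grounded])
```

Filtering at this boundary works like a test gate: a claim with no source never reaches the step where a model could repeat it as fact.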

Process-wise, the product philosophy resonated: "How a document processing agent uses OODA-loop thinking to extract and connect context across documents" reinforces the need for short feedback cycles, explicit hypotheses, and continuous refactoring of knowledge. Pair that with "The self-improving agent: collecting user feedback weekly and rewriting its own prompts" and you’ve got a blueprint for eval-driven development that keeps the system honest over time.

Their UI choices also mirror a pattern I’ve adopted: "Moving from chat-first to UI-first to proactive agents as an AI product design pattern." Chat can feel magical, but alignment work benefits from concrete artifacts—trees, timelines, driver trees, and opportunity solution trees—so people can reason together. Then, let proactive agents watch for drift and nudge teams before the cost of change spikes.

Two broader themes are worth calling out. First, "Specialized tools win" when the problem is deep, cross-functional context like product strategy. General-purpose chatbots struggle here; domain-specific models with strong information architecture have the edge. Second, product culture matters: "Discovery Versus Vibe Coding" is not just a catchy contrast—it’s a reminder that disciplined discovery beats intuition theater when stakes are high.

As for the roadmap, I’m encouraged by their "Design partner strategy and what's next for Momental's public launch." Early design partners are where you validate signal quality, precision of conflict detection, and the ergonomics of human-in-the-loop resolution. I’m especially curious how this will fit with the way product managers already use LLMs, with outcome-based OKRs, and with roadmapping and sprint planning in large portfolios.

Finally, a nod to the broader ecosystem. The conversation touched on "Claude Code" and a shift "Beyond documents and vectors" that many of us are living through—toward retrieval-first pipelines that respect context windows, stronger governance, and measurable improvements in decision quality. If you care about AI Strategy for empowered product teams, this is a space to watch—and to pilot.

Bottom line: If you’ve ever wished you could prevent strategy drift before it shows up in your dashboards, this "GitHub for product management" approach is worth your attention. Make the chain of signals, learnings, decisions, and principles explicit. Keep humans in the loop for the hard calls. And let proactive, agentic AI do what it does best: flag misalignments early, so your teams can move fast together.

Inspired by this post on Product Talk.

March 5, 2026
Real-Time Answers in Slack and Teams: How Amplitude’s Global Agent Elevates Product Decisions

I’ve been looking for a pragmatic way to put product analytics where my teams already work—inside Slack and Microsoft Teams. The moment insights are one message away, cycle time shrinks, debates get crisper, and experiments move faster. That’s why I’m bringing Amplitude Global Agent into our daily decision flow to deliver instant, source-backed answers with visual clarity and actionable next steps.

Connect Amplitude Global Agent to Slack or Microsoft Teams to answer questions with source-backed analytics, charts, and recommended actions like A/B tests.

What excites me most is the shift from dashboards to dialogue. Instead of digging through reports, I can ask a focused question in Slack—“How did activation change week-over-week for our self-serve cohort?”—and get a chart in-channel, complete with recommendations that point me toward the next best move. This is Agent Analytics done right: faster insight loops, reduced context switching, and more confidence in the decisions we make every day.

From a product management perspective, this integration strengthens continuous discovery and aligns product trios around the same truth. Engineers, designers, and PMs see the same chart, discuss trade-offs in the same thread, and can agree on an action—often an A/B test—within minutes. It’s a lightweight but powerful way to support product-led growth and keep our roadmap tied to measurable outcomes.

In practice, the questions I ask the most look like this: “Which onboarding step causes the biggest drop-off this month?”, “Which channels drive the highest L28 activation rate?”, and “Where did retention improve after our pricing change?” In each case, the Agent returns charts we can share instantly with stakeholders, plus recommended actions like A/B test ideas to validate hypotheses quickly. The result is a reliable rhythm: ask, see, align, act.

Governance matters just as much as speed. We’re configuring strict permissions, role-based access, and purposeful channel placement so analytics land where they should—no broader, no narrower. We’re also leaning into clear query prompts and naming conventions for events and properties to help the Agent retrieve precisely what’s needed, every time. The aim is a high-signal, low-noise system that maintains trust while accelerating decisions.

To embed this into our operating cadence, I plug the Agent into three moments: daily standups (to scan activation, conversion, and incidents), weekly product reviews (to align on experiment status and next bets), and executive QBR prep (to pull clean, shareable charts fast). Because the insights arrive in Slack or Microsoft Teams, our conversations stay focused and traceable, and decisions get documented in the same place they were discussed.

We’ll measure impact with simple, telltale indicators: fewer ad-hoc analytics requests, faster time from question to decision, increased A/B test velocity, and clearer links between recommended actions and outcome metrics like activation and retention. My bar is straightforward—if this Agent can help one team make a better decision per day, it will more than pay for itself across the org.

If you’re considering a similar move, start small: connect one high-signal channel, curate a handful of common queries, and coach your team on good prompts. Within a week, you’ll feel the difference. When analytics become conversational, momentum follows—and your product strategy benefits from sharper, faster, and more transparent decision-making.

Inspired by this post on Amplitude – Best Practices.

March 4, 2026
Stop Selling Your Roadmap: Win Stakeholder Trust by Showing Your Work, Not Conclusions

I’m seeing the same pattern in product orgs everywhere—inside HighLevel and across my network: everyone is racing to add AI to the roadmap, and every stakeholder has a strong opinion about what to build next. Delivery has never been faster, which makes it dangerously easy to confuse speed with progress.

When we chase features without grounding in continuous discovery, we drift back into a feature factory. We ship more, but we ship the wrong things faster. The antidote is simple and hard at the same time: recommit to product discovery, validate with assumption testing, and let the evidence steer our AI Strategy—not the hype.

Of course, that only works if we can bring our stakeholders along. In the AI moment, it’s deceptively easy to get to a slick prototype and painfully hard to harden it for production. Early demos make almost any idea look promising. That’s precisely why stakeholder management must evolve from pitching solutions to showing our work.

In practice, stakeholder management is about alignment with the people who influence our product decisions—executives, sales, marketing, customer success, engineering leadership, and sometimes legal or finance. Some have veto power; others have input. Knowing who can block versus who can shape is crucial for where we spend our time. Even in empowered product trios, the best discovery can derail if we reveal only conclusions at the end.

I’ve tried every mapping framework—power-interest grids, RACI matrices—and they help. But the real challenge isn’t identifying stakeholders. It’s figuring out how to bring them along so that our product roadmapping and sprint planning decisions stick.

Identify who shapes your product decisions. This visual groups stakeholders into three tiers—those with veto power, key influencers, and audiences to inform—so teams can align, communicate, and reduce delivery risk.

Here’s the most common trap I see (and have fallen into): focusing stakeholder reviews on the roadmap, release plan, or prioritized backlog. That invites an opinion battle. And stakeholders have their own conclusions—usually shaped by the last customer call, a board meeting, or a market headline.

This is how the HiPPO dynamic gets created. HiPPO stands for the “Highest Paid Person’s Opinion,” and the saying goes, “The HiPPO always wins.” When we present conclusions without the journey, we set ourselves up to lose. In the generative AI rush, the chorus of “everyone is doing AI” makes that opinion even harder to counter.

So I don’t try to win opinion battles. I bring new information—fresh customer interviews, clear opportunity mapping, and results from assumption tests. The gap between what the market hypes and what customers actually need is often enormous. Our edge is evidence.

The strategy that consistently works for me is simple: show your work. If you’re practicing continuous discovery, your opportunity solution tree isn’t just a thinking tool—it’s your strongest stakeholder management asset. It helps you build confidence in your decisions, and it can help your stakeholders build the same confidence.

Avoid the stakeholder trap of selling conclusions. This visual shows how anchoring on solutions invites HiPPO battles—and how to shift the conversation by sharing discovery evidence, insights, and data.

Step 1 — Start with the outcome. I open every conversation by restating the shared goal and asking whether anything has changed. Anchoring on outcomes vs output OKRs reframes hot-button solution debates (like “we need an AI feature”) back to what will move the needle on the outcome we agreed to pursue.

Step 2 — Share the opportunity space. I show how we mapped customer needs, pain points, and desires. Then I ask, “What did we miss?” Stakeholders often surface opportunities we haven’t seen yet—signals from the field, market shifts, or partner feedback. I capture their input and commit to validating it in upcoming customer interviews.

Step 3 — Walk through prioritization. Using the tree’s structure, I explain why we prioritized one branch over another. Then I ask where they might have chosen differently. This turns debate into collaboration and lets me leverage their expertise without ceding the discovery framework.

Step 4 — Go deep on the target opportunity. Before we talk solutions, I make the customer’s problem vivid and real. Interview snapshots help stakeholders empathize and see what matters most. Once the opportunity is crisp, solution discussions become dramatically more objective.

Show your work, not just your conclusions. This infographic guides product teams through seven steps to build stakeholder confidence—align on outcomes, map opportunities, prioritize, test assumptions, and repeat.

Step 5 — Share solutions and invite theirs. I present our solution set and explicitly ask for additional ideas. If their suggestions diversify our set, we include them. Solution ideas are cheap; the opportunity is what matters. This is where product trios can benefit from leadership’s pattern recognition and industry context.

Step 6 — Share your assumption tests and results. I walk through our story maps, high-risk assumptions, and what we’ve learned so far. I invite stakeholders to add assumptions—this is where their knowledge shines. If we have data, we share it; if we’re pre-data, we share the plan to get it and ask for feedback.

Step 7 — Repeat. I don’t batch this into a big reveal. I keep a steady cadence and tailor depth to each audience: weekly for my manager, monthly highlights for marketing, and concise updates for executives. Continuous discovery pairs with continuous stakeholder management.

Showing your work doesn’t mean drowning people in detail. It means tailoring the signal to the audience. My rule of thumb is outcome, opportunity, solution, evidence—walk the lines of the tree at the right altitude for each stakeholder.

Show your work the right way for each stakeholder. Use a smart filter to turn discovery noise into clear signals—weekly journeys for your manager, focused monthly highlights for marketing, and a 30-second CEO pitch.

In a 30-second update with a CEO, it might sound like this:

“Our goal is to reduce time-to-first-value for new users. We’ve been interviewing customers and learned that onboarding is where most people get stuck—specifically, they don’t know which features to try first. We explored a few approaches and tested them. The most promising one is a guided setup flow that adapts based on the user’s role. In early tests, new users completed onboarding 40% faster.”

That pattern works across channels—Slack updates, monthly reviews, or quarterly planning. The format flexes, the structure doesn’t: outcome, opportunity, solution, evidence.

As you adopt this approach, watch for four anti-patterns that quietly erode trust.

Avoid the traps that erode stakeholder trust. This infographic guides product teams to show their work, welcome ideas, provide frequent updates, and prioritize results over ideology to build alignment and credibility.

Anti-pattern 1 — Telling instead of showing. The curse of knowledge makes our conclusions feel obvious to us and opaque to others. The fix: slow down, start at the top of the tree, walk the decisions, and let stakeholders reach the conclusion with you.

Anti-pattern 2 — Shooting down stakeholder ideas. As you build a library of validated assumptions, it’s easy to spot flaws in a suggestion and say “no” too quickly. Instead, place their idea within your discovery framework. If it maps to a different opportunity, say, “That idea has promise—we’ll consider it when we address that opportunity.” If it rests on risky assumptions, story map the idea together, list the assumptions, and share what you’ve already learned. People accept the evidence they help generate.

Anti-pattern 3 — Saving everything for a big reveal. Infrequent, comprehensive updates invite opinion battles because stakeholders have formed their own conclusions in the dark. Short, frequent updates build alignment as the work unfolds.

Anti-pattern 4 — Fighting the ideological war. Sometimes a more senior stakeholder will overrule you. Don’t turn it into a debate about how product decisions “should” be made. Focus on the decision at hand, do the best work within constraints, and let results—not ideology—prove the value of discovery over time.

Shift from selling to showing. This co-creation guide invites stakeholders into discovery, taps their expertise, and turns relationships from obstacles into partnerships for smarter product decisions.

Here’s the mindset shift that changes everything: stakeholder management is a co-creation opportunity. When we show our work with artifacts like an opportunity solution tree, experience maps, and interview snapshots, we’re not just communicating—we’re inviting collaboration. We’re leveraging stakeholders’ expertise, context, and connections to make better product decisions.

When stakeholders have walked the path with us, they don’t need to be sold on the destination. They become allies. Engagement stops being a status ritual and starts being real partnership—the kind that moves outcomes and builds durable trust.

Try this in your next review: don’t start with your roadmap. Start at the top of the tree. Reaffirm the outcome. Share the opportunity space. Explain your prioritization. Show what you’re learning. Invite contribution. You might be surprised how quickly alignment—and confidence—follow when you stop selling conclusions and start showing your work.

Inspired by this post on Product Talk.

March 4, 2026
From Chaos to Clarity: My Proven Playbook to Scale an Analytics Taxonomy That Sticks

I’ve stepped into too many product reviews where teams argued over numbers that should have been obvious. Three names for the same “signup” event, properties scattered across tools, and no shared definitions—classic analytics chaos. As VP of Product Management at HighLevel, I’ve learned that scaling an analytics taxonomy isn’t just a data exercise; it’s a leadership mandate that unlocks decision velocity, alignment, and confident product bets.

Learn best practices our professional services team has compiled in helping customers move from scattered events to a scalable, user-friendly data structure.

Why does this matter so much? A robust taxonomy powers a unified analytics platform across Amplitude, Pendo, and our CRM stack, reduces rework, and strengthens data governance. When events are clear and consistent, product-led growth accelerates: onboarding becomes measurable, activation is trackable, and retention analysis turns into a weekly ritual rather than a quarterly scramble.

I always start with outcomes, not events. We define a North Star metric and use driver trees to map how user behaviors ladder up to that outcome. Then we ground the plan in journey mapping: what signals mark activation, aha moments, and long-term engagement? This ensures our taxonomy mirrors real user intent, not just engineering convenience.

Next comes naming conventions and structure. We standardize on a readable, durable pattern (for example, actor_action_object), apply consistent property naming, and document required vs. optional properties. We version events deliberately, so we can evolve without breaking dashboards. Most importantly, we align events to product strategy—tracking less, but better.
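
A convention like this is easy to check mechanically. Here is a minimal Python sketch, assuming lowercase snake_case names with at least three segments (read as actor_action_object) and a per-event list of required and optional properties. The EventSpec type, the validate_event helper, and the example event are hypothetical illustrations, not part of any analytics SDK.

```python
import re
from dataclasses import dataclass, field

# Assumed convention: lowercase snake_case with at least three segments,
# read as actor_action_object (e.g. "user_completed_onboarding").
EVENT_NAME = re.compile(r"^[a-z]+(?:_[a-z]+){2,}$")


@dataclass(frozen=True)
class EventSpec:
    """Hypothetical tracking-plan entry for a single event."""
    name: str
    version: int
    required: frozenset = field(default_factory=frozenset)
    optional: frozenset = field(default_factory=frozenset)


def validate_event(spec: EventSpec, properties: dict) -> list:
    """Return a list of human-readable problems; an empty list means the event passes."""
    problems = []
    if not EVENT_NAME.match(spec.name):
        problems.append(f"name '{spec.name}' does not follow actor_action_object")
    missing = set(spec.required) - set(properties)
    if missing:
        problems.append(f"missing required properties: {sorted(missing)}")
    unknown = set(properties) - set(spec.required) - set(spec.optional)
    if unknown:
        problems.append(f"undocumented properties: {sorted(unknown)}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a reviewer see every issue with an event in a single pass.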

Governance makes it scale. We establish a clear DRI for the tracking plan, a lightweight review process for changes, and a schema registry that serves as the single source of truth. Privacy-by-design is non-negotiable: we treat sensitive fields deliberately and audit access. Observability closes the loop—schema validations and alerts catch drift before it confuses teams.

Tooling and process turn good intentions into muscle memory. We keep the tracking plan “as code” in a repository, run CI/CD checks to validate events, and use feature flags to roll out new instrumentation safely. Pendo helps us annotate in-app experiences, while Amplitude provides the exploratory lens for cohorts, funnels, and retention. Together, these systems reduce guesswork and speed up discovery.
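
To picture what a CI check on a tracking plan kept as code might look like, here is a small Python sketch. It assumes the plan is stored as a mapping from event name to a version number and a list of required properties; that layout and the find_breaking_changes function are my own illustration, not a format that Amplitude or Pendo prescribes.

```python
def find_breaking_changes(old_plan: dict, new_plan: dict) -> list:
    """Compare two tracking-plan versions and list changes likely to break dashboards.

    Each plan maps an event name to {"version": int, "required": [property names]}.
    """
    errors = []
    for name, old in old_plan.items():
        new = new_plan.get(name)
        if new is None:
            # Removing an event outright is flagged; deprecation is assumed
            # to be tracked elsewhere in this sketch.
            errors.append(f"{name}: removed without a deprecation entry")
            continue
        dropped = set(old["required"]) - set(new["required"])
        if dropped and new["version"] <= old["version"]:
            errors.append(f"{name}: dropped {sorted(dropped)} without a version bump")
    return errors
```

In a pipeline, the job would load the plan from the main branch and from the pull request, call this function, and fail the build when the returned list is not empty.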

Migrations are where many teams stall, so I de-risk them with a clear, time-boxed plan. We audit the current event surface, map scattered events to the new taxonomy, and deprecate duplicates with guardrails. We communicate changes broadly, provide easy-to-scan documentation, and pair enablement sessions with hands-on examples from live dashboards. The goal is confidence, not just compliance.

We measure success like a product. Are we answering critical questions faster? Are duplicate events trending down? Are activation and retention questions easy to answer in under five minutes? When the taxonomy is working, stakeholders stop asking, “Do we trust this?” and start asking, “What should we build next?”

One of the most rewarding shifts I’ve seen: product trios moving from ad-hoc analyses to repeatable, weekly rituals. With crisp definitions, onboarding flows become testable, PLG motions are predictable, and leadership reviews focus on outcomes, not definitions. That’s the moment analytics transforms from a cost center into a growth engine.

If you’re staring at a wall of scattered events, start small: clarify outcomes, align your journey map, set conventions, and ship a minimum viable taxonomy to one critical flow. Iterate quickly. The compounding payoff—clarity, speed, and trust—will be obvious to every team you partner with.

When we do this well, analytics becomes a strategic asset. Our teams spend less time reconciling numbers and more time building what matters. That’s the real meaning of moving from chaos to clarity.

Inspired by this post on Amplitude – Best Practices.

March 3, 2026
Lost in the Woods: 5 Survival Patterns Every Product Leader Must Master Now

Ever feel like your product team is “lost in the woods”? I’ve certainly been there—when strategy gets fuzzy, outcomes drift, or constraints aren’t clear. What helped me reframe the chaos was borrowing “lost person” patterns from search-and-rescue and mapping them to product strategy, product discovery, and team behaviors. The result is a practical playbook for product management leadership that keeps empowered product teams moving toward outcomes—not just outputs.

Listen to this episode on: Spotify | Apple Podcasts

Here are the five patterns I see most often—and how I turn each one into forward motion: settle in place (freeze), chase shortcuts, follow the first visible path, use your own navigation (intuition/taste), and retrace your steps. Each of these has a smart, minimal move that helps teams reorient fast without abandoning continuous discovery or product strategy discipline.

Settle in place (freeze). Sometimes the smartest move is to stop. When my team lacks context or authority, I pause delivery work and escalate instead of improvising fixes. This prevents thrash, protects focus, and creates the air cover we need to realign our OKRs around outcomes rather than outputs.

Chase shortcuts. Shortcuts can be brilliant—or overconfident. I’ve learned to pressure-test whether the “road” is where we think it is before we commit. That means lightweight experiments, clear exit criteria, and the humility to pivot. Think about big bets like Spotify podcasts: compelling vision, but you still have to validate assumptions step by step.

Follow the first visible path. The obvious option isn’t always the best one. My job as a product leader is to make multiple paths visible before we choose. I lean on opportunity solution trees and KPI trees (or driver trees) to surface alternatives, align stakeholders, and keep empowered product teams focused on customer impact and product-market fit—not just the loudest idea.

Use your own navigation (intuition/taste). Judgment matters, especially for product trios making fast calls—but it’s not a replacement for evidence. When my “compass” conflicts with what we observe, I anchor back to customer interviews, rapid tests, and discovery loops. Intuition should guide where we look, while data validates how we proceed.

Retrace your steps. When we’re drifting, I go back to what used to work: principles, quality practices, and discovery habits as feedback loops. Returning to fundamentals—clear problem statements, crisp value propositions, and disciplined outcomes—rebuilds momentum fast.

Team prompt to try: If your team is “lost” right now, which pattern are you defaulting to—and what’s the smallest move you can make this week to get oriented (escalate, test a shortcut, map options, validate intuition with evidence, or retrace to a principle)? I use this question in weekly reviews to keep us grounded in continuous discovery and product strategy.

Resources & Links:

Follow Teresa Torres: https://ProductTalk.org

Follow Petra Wille: https://Petra-Wille.com

Mentioned in the episode:

Lost Person Behavior: A Search and Rescue Guide on Where to Look – for Land, Air and Water

Robert J. Koester

Examples referenced: Xerox, Nokia, Kodak, Volkswagen emissions scandal, Spotify podcasts, large-org tooling contexts like Oracle and SAP

Opportunity Solution Trees: Visualize Your Discovery to Stay Aligned and Drive Outcomes

KPI Trees: How to Bridge the Gap Between Customer Behavior, Product Metrics, and Company Goals

Let's Read Continuous Discovery Habits Together (January 2026) for Continuous Discovery Habits (and the idea of habits as feedback loops)

Shifting from Outputs to Outcomes: Why It Matters and How to Get Started

I’d love to hear how your team navigates these patterns. Which small move will you try this week? Leave a comment below and let’s compare notes on product discovery, stakeholder management, and product roadmapping that actually drives outcomes.

Inspired by this post on Product Talk.

March 3, 2026
March CDH Book Club: Master Experience Mapping to Align Teams and Accelerate Discovery

I’m thrilled to invite you to our March session of the CDH Book Club. Continuous Discovery Habits turns five this year, and to celebrate, we are reading the book together. Leading product trios and empowered product teams, I’ve seen firsthand that sharpening our discovery habits is the fastest way to outcome-focused OKRs, tighter team alignment, and a more confident product strategy.

Each month, I am releasing an in-depth reading guide that includes:

The chapters we will be reading

A preview of the most important concepts we'll be learning about

Short videos you can share with friends and colleagues to help spread the ideas

Individual and team discussion questions to help you absorb and engage with the reading

Team exercises to help you put the ideas into practice

Additional reading to help you go deeper on the core ideas

We’ll be discussing each month’s reading in the comment section and we’ll gather quarterly to discuss on a live call. I’ll be there to trade notes, compare experience maps, and share what’s working across product discovery practices.

Joining late? No problem. I monitor the comments on each reading guide throughout the year. Start with the current month or go back to January—whatever works for you. You can ask for help, share what’s working, and connect with other readers at any point.

If you want to participate, grab a copy of the book (or dig up your old copy), share the "Spread the Love" videos, reserve some time to do the team exercises, and register for the community sessions. Let’s do this!

This Month’s Reading

Chapters:

Chapter 4: Visualizing What You Already Know

Estimated reading time: ~14 minutes

This chapter will introduce you to:

Why starting individually—rather than as a group—is the fastest path to unlocking your team’s collective intelligence

How drawing (even badly) forces you to get specific in ways that words never will

The strategic choice of setting your experience map’s scope—too narrow and you miss opportunities, too broad and you lose focus

How diverse perspectives become your team’s secret weapon when you know how to synthesize them

Why your first experience map isn’t truth—it’s a hypothesis you’ll test and evolve with every customer conversation

Need a copy? Grab the book.

Share the Love with Friends and Colleagues

We learn best in community. Use the following short videos to share the key concepts from this chapter with friends and colleagues. Invite them to participate in the book club with you. In my teams, these quick hits help us align faster before we co-create an experience map or opportunity solution tree.

Visualize your thinking – To bring others along

Unlock team alignment – With visualizations

Reflect & Discuss What You Read

When we reflect and discuss what we read, we absorb more of the material. It helps us put what we learn into practice. Don’t skip this step. In my own practice, the real unlock came when I treated mapping as a living artifact that shapes customer interviews, not a one-off deliverable.

Most of us believe we work collaboratively, but we’ve never truly experienced what it means to build shared understanding from diverse perspectives. This chapter challenges you to get uncomfortable—to draw when you’d rather talk, to work alone before working together, and to see your maps as living documents rather than one-time deliverables.

Individual Reflection

Think about the last time your team tried to align on what you know about your customers. Did everyone start by creating their own perspective first, or did you jump straight into a group discussion? What happened as a result?

When was the last time you drew something at work? What stops you from using drawing as a thinking tool—is it discomfort with your drawing skills, lack of time, or something else?

Look at your current work. If you were to create an experience map right now, what scope would you choose? How does your desired outcome help you determine what to include and what to leave out?

Team Discussion

As a trio, each person should identify one unique perspective they bring to your team’s understanding of your customer. How might these different viewpoints create blind spots if you only relied on one person’s view?

When your team disagrees about what customers need or want, how do you typically resolve it? Do you debate until someone wins, defer to the most senior person, or test your different hypotheses?

Does your team have a current experience map? If so, when was the last time you updated it based on what you’re learning from customers? If not, what’s preventing you from creating one?

Put It Into Practice

Understanding why experience maps matter is different from actually creating one that drives your discovery work. These exercises will help you practice the discipline of starting individually, synthesizing diverse perspectives, and using your map to guide customer conversations. My suggestion: timebox, embrace imperfect drawings, and let the artifact lead your next interview script.

Exercise: Create Your Individual Experience Maps

Time: 20 minutes individually, 45–60 minutes with your team

Do this: Individually first, then share with your trio

Start by agreeing on the scope of your experience map based on your current outcome. Each member of your trio should then independently create their own experience map using pen and paper (or your favorite digital drawing tool).

Focus on drawing the customer’s experience, not your product’s features. Where do they get stuck? What goes wrong? How do they work around problems? Don’t worry about drawing well—boxes, arrows, and stick figures are perfectly fine.

Once everyone has created their individual maps, schedule time to share them with each other. As you explore each person’s perspective, ask questions to understand their thinking. Pay particular attention to the differences between maps—this is where the richest insights emerge.

Exercise: Co-Create Your Shared Experience Map

Time: 30 minutes with your team

Do this: With your product trio

Bring your individual experience maps together and work to synthesize them into a single shared map. Start by identifying all the unique nodes (distinct moments, actions, or events) across all three maps. Arrange them in a comprehensive flow.

Collapse similar nodes, but be careful not to overgeneralize. Add links to show relationships and flow between nodes—including loops, error cases, and abandonment points. Finally, add context about what customers are thinking, feeling, and doing at each step.

As you work, avoid getting bogged down in endless debate. If you disagree about details, draw out the difference rather than debating it. This often reveals you already agree or helps you pinpoint exactly where your understanding differs.

Remember: This map is your current hypothesis about your customer’s experience. Use it to guide your upcoming customer interviews and plan to evolve it based on what you learn.

Go Deeper: Additional Reading

If you prefer an audio summary of this month’s reading, including the book chapters and the following resources, I’ve included an audio version for paid subscribers at the bottom of this post.

Supplementary Reading

Why Drawing Maps Sharpens Your Thinking

Core Concept: Collaborative Decision-Making in a Product Trio

Other Voices

To Draw or Not to Draw: Is Traditional Sketching Still Relevant in the Digital Design Era? by Julia Ku

Journey-Mapping Approaches: 2 Critical Decisions to Make Before You Begin by Kate Kaplan

The Visual Language of Comic Books Can Improve Brain Health by Mary Widdicks

Mapping Your User’s Day with the User Clock Sketch by Ben Crothers

Our Live Discussion Schedule

Our live discussion sessions are for paid subscribers. Sessions are not recorded. Invitations will go out to Supporting Members and CDH Members two weeks before the scheduled event. But reserve the time on your calendar now.

Wednesday, March 18, 2026: 9am–10am PDT and 4pm–5pm PDT

Tuesday, June 16, 2026: 9am–10am PDT and 4pm–5pm PDT

Thursday, September 17, 2026: 9am–10am PDT and 4pm–5pm PDT

Wednesday, December 16, 2026: 9am–10am PST and 4pm–5pm PST

Audio Summary

This summary was produced by NotebookLM. The sources supplied were the book chapters as well as all of the additional reading.

Listen here: March — Draw the User Clock to Build Empathy (audio)

This article is part of the CDH Book Club celebrating the five-year anniversary of Continuous Discovery Habits. See all book club posts.

Inspired by this post on Product Talk.

March 2, 2026
Battle-Tested AI Agent Orchestration Patterns for Reliable, Observable, Product-Ready Systems

Shipping agentic AI into production is exhilarating, until a flaky output torpedoes trust. Over the past year, I’ve led teams at HighLevel to operationalize agents across customer-facing and internal workflows, and I’ve learned that reliability isn’t an afterthought; it’s an architecture. In this piece, I share the orchestration patterns that have consistently delivered dependable outcomes for us at scale.

When we talk about orchestration, we’re talking about more than a single prompt. The shift is from monolithic calls to coordinated agentic AI, where routers, planners, and specialists collaborate through structured workflows. In practice, I rely on a few canonical patterns: a planner–executor loop for multi-step tasks, a router–specialist setup for skill selection, and a retrieval-first pipeline that grounds generation in authoritative context before a single token is produced.
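
Here is a toy Python sketch of two of these patterns working together: retrieval happens first, then a router picks a specialist. Everything in it is a stand-in I made up for illustration. The DOCS dictionary replaces a real retriever, the keyword rule in route replaces a model-based classifier, and the specialists simply echo their context instead of calling an LLM.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory knowledge base standing in for a real retriever.
DOCS = {
    "refund": "Refunds are issued to the original payment method.",
    "invoice": "Invoices are generated on the first of each month.",
}


def retrieve(query: str) -> List[str]:
    """Retrieval step: return passages whose key appears in the query."""
    q = query.lower()
    return [text for key, text in DOCS.items() if key in q]


def billing_specialist(query: str, context: List[str]) -> str:
    return "billing: " + " ".join(context)


def general_specialist(query: str, context: List[str]) -> str:
    return "general: " + " ".join(context)


SPECIALISTS: Dict[str, Callable[[str, List[str]], str]] = {
    "billing": billing_specialist,
    "general": general_specialist,
}


def route(query: str) -> str:
    """Router: a keyword rule standing in for a model-based classifier."""
    q = query.lower()
    return "billing" if any(word in q for word in ("refund", "invoice")) else "general"


def answer(query: str) -> str:
    context = retrieve(query)  # ground first, before any generation
    if not context:
        return "abstain: no supporting context found"
    return SPECIALISTS[route(query)](query, context)
```

The ordering is the point of the sketch: if retrieval returns nothing, the agent abstains before any specialist is invoked.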

Reliability-by-design starts with typed inputs/outputs and strict validation. I standardize on JSON schemas, enforce tool/function signatures, and implement idempotency keys so retries don’t wreak havoc on downstream systems. Timeouts, circuit breakers, and backpressure protect the platform under load, while rate limiting and dead-letter queues keep failure modes contained. Most importantly, we engineer graceful degradation: agents “abstain” when uncertain, fall back to deterministic paths, and escalate to humans instead of guessing.
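A minimal sketch of strict validation plus idempotency keys, under stated assumptions: the refund payload, the `RefundRequest` type, and the in-memory `IdempotentExecutor` are hypothetical stand-ins, and a production system would keep the key store in a durable database rather than a dictionary.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RefundRequest:
    """Hypothetical typed input for an illustrative refund tool."""
    order_id: str
    amount_cents: int


def parse_refund_request(raw: dict) -> RefundRequest:
    """Strictly validate an untyped payload; reject anything unexpected."""
    allowed = {"order_id", "amount_cents"}
    if set(raw) != allowed:
        raise ValueError(f"unexpected or missing fields: {sorted(set(raw) ^ allowed)}")
    if not isinstance(raw["order_id"], str) or not raw["order_id"]:
        raise ValueError("order_id must be a non-empty string")
    amount = raw["amount_cents"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount <= 0:
        raise ValueError("amount_cents must be a positive integer")
    return RefundRequest(raw["order_id"], amount)


def idempotency_key(tool_name: str, request: RefundRequest) -> str:
    """Derive a stable key from the tool name and the canonical payload."""
    canonical = json.dumps(
        {"tool": tool_name, "order_id": request.order_id,
         "amount_cents": request.amount_cents},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotentExecutor:
    """Runs a side effect at most once per key (in-memory store, assumption)."""

    def __init__(self) -> None:
        self._results: dict[str, dict] = {}
        self.side_effect_count = 0

    def run(self, tool_name: str, raw: dict) -> dict:
        request = parse_refund_request(raw)
        key = idempotency_key(tool_name, request)
        if key in self._results:
            # A retry of the same call returns the stored result.
            return self._results[key]
        self.side_effect_count += 1  # stands in for the real downstream call
        result = {"status": "refunded", "order_id": request.order_id,
                  "amount_cents": request.amount_cents}
        self._results[key] = result
        return result
```

Retrying `run` with an identical payload returns the stored result without repeating the side effect, which is what makes retries safe for downstream systems.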

Safety is a first-class concern, not a bolt-on. Our “AI risk management” pipeline includes PII redaction, allow/deny lists for tools and data, and the principle of least privilege for every connector (yes, even the ChatGPT connector). We codify policy-as-code for repeatability and require human-in-the-loop approvals for sensitive or irreversible actions. In my experience, clear red lines and reversible defaults prevent the vast majority of regrettable outcomes.
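As a rough illustration of policy-as-code, the sketch below combines a tool allowlist, an approval gate for irreversible actions, and a simple redaction pass. The tool names and the email-only regex are assumptions made for the example; real PII redaction covers far more than email addresses.

```python
import re
from typing import Optional

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical policy: anything not listed is denied by default.
ALLOWED_TOOLS = {"lookup_order", "issue_refund"}
NEEDS_APPROVAL = {"issue_refund"}  # sensitive or irreversible actions


def redact_pii(text: str) -> str:
    """Replace email addresses before text reaches a model or a log."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


def authorize(tool_name: str, approved_by: Optional[str] = None) -> str:
    """Return 'allow', 'needs_approval', or 'deny' for a tool call."""
    if tool_name not in ALLOWED_TOOLS:
        return "deny"
    if tool_name in NEEDS_APPROVAL and approved_by is None:
        return "needs_approval"
    return "allow"
```

Keeping the policy in plain data structures means it can be reviewed, versioned, and tested like any other code.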

Without strong “observability,” you’re flying blind. I instrument agents with an “Agent Analytics” layer that captures traces, spans, tool invocations, and token usage across the entire chain. The essential metrics are outcome quality (task success rate), latency (p50/p95), tool failure rates, cost per task, and user-level satisfaction signals. Cross-agent lineage allows us to pinpoint where a plan went awry and which tool or prompt introduced drift—vital for rapid remediation.
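The metrics listed above can be rolled up from per-task trace records. This is a sketch with a hypothetical `TaskTrace` shape, not the schema of any particular tracing product, and it uses the nearest-rank method for percentiles.

```python
import math
from dataclasses import dataclass


@dataclass
class TaskTrace:
    """Hypothetical per-task record emitted by the agent runtime."""
    success: bool
    latency_ms: float
    cost_usd: float
    tool_calls: int
    tool_failures: int


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value covering pct% of the samples."""
    if not values:
        raise ValueError("percentile of empty list")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(traces: list[TaskTrace]) -> dict[str, float]:
    """Aggregate the headline metrics across a batch of traces."""
    if not traces:
        raise ValueError("no traces to summarize")
    n = len(traces)
    latencies = [t.latency_ms for t in traces]
    total_calls = sum(t.tool_calls for t in traces)
    total_failures = sum(t.tool_failures for t in traces)
    return {
        "task_success_rate": sum(t.success for t in traces) / n,
        "latency_p50_ms": percentile(latencies, 50),
        "latency_p95_ms": percentile(latencies, 95),
        "tool_failure_rate": total_failures / total_calls if total_calls else 0.0,
        "cost_per_task_usd": sum(t.cost_usd for t in traces) / n,
    }
```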

Quality improves fastest when it is measured relentlessly. I practice “eval-driven development” with golden datasets, rubric-based scoring, and risk-weighted sampling of edge cases. LLM-as-judge can help, but we always calibrate against human ratings and monitor agreement. In production, I blend online metrics with controlled “A/B testing” and size experiments to detect a realistic minimum detectable effect (MDE). The result is a virtuous loop where prompt tweaks, tool changes, and retrieval adjustments are verified before wide rollout.
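To make the MDE planning concrete, here is a sketch of the standard normal-approximation sample size for comparing two success rates. The function name and default values are illustrative assumptions; the article does not say which test its experiments use.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate: float, mde: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Tasks needed per arm to detect an absolute lift of `mde`.

    Two-sided test on two proportions, normal approximation.
    """
    if not 0 < baseline_rate < 1 or not 0 < baseline_rate + mde < 1:
        raise ValueError("rates must lie strictly between 0 and 1")
    if mde == 0:
        raise ValueError("mde must be non-zero")
    p1 = baseline_rate
    p2 = baseline_rate + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)
```

With an 80% baseline task success rate and a 5-point MDE, this works out to roughly 900 tasks per arm. Halving the MDE roughly quadruples that number, which is why a realistic MDE matters.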

Agents need the same rigor we expect from any modern system. I gate releases through “CI/CD” with linting for prompts, schema checks for tools, and simulation runs for critical paths. “Feature flags” enable shadow and canary deployments so we can throttle exposure by segment or workflow. I also track delivery performance with “DORA metrics,” including “deployment frequency,” and I partner closely with “SRE” for on-call coverage, runbooks, and incident postmortems tailored to agent failure modes.
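One common way to implement a canary behind a feature flag is deterministic hash bucketing, sketched below. The flag name and the choice of account ID as the unit are assumptions; the article does not describe its flagging system.

```python
import hashlib


def rollout_bucket(flag_name: str, unit_id: str) -> int:
    """Map a (flag, unit) pair to a stable bucket in [0, 10000)."""
    digest = hashlib.sha256(f"{flag_name}:{unit_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def is_enabled(flag_name: str, unit_id: str, rollout_percent: float) -> bool:
    """True if this unit falls inside the current rollout percentage."""
    return rollout_bucket(flag_name, unit_id) < rollout_percent * 100


# Usage: expose a hypothetical new planner prompt to 5% of accounts.
use_new_planner = is_enabled("planner_v2", "account-1234", rollout_percent=5)
```

Because the bucket depends only on the flag name and unit ID, a unit that is enabled at 5% stays enabled as the rollout widens, so users do not flip between variants.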

Context is a resource to allocate, not a bottomless pit. Thoughtful “context window management” means curating retrieval, summarizing long-running threads, setting memory time-to-live, and constraining what the agent can see at any given step. I bias hard toward retrieval over recall, keep chunks small and semantically precise, and validate that the “retrieval-first pipeline” truly returns the right evidence—not just the nearest match.
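A sketch of treating context as a budget: expire stale memory by time-to-live, then admit the most relevant items until the token budget is spent. The `ContextItem` shape is hypothetical, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    """Hypothetical retrieved chunk or memory entry."""
    text: str
    relevance: float   # higher is better, e.g. a retriever score
    created_at: float  # epoch seconds


def estimate_tokens(text: str) -> int:
    """Crude proxy (assumption): one token per whitespace-separated word."""
    return len(text.split())


def build_context(items: list[ContextItem], now: float,
                  ttl_seconds: float, token_budget: int) -> list[ContextItem]:
    """Drop expired items, then greedily pack by relevance within the budget."""
    fresh = [i for i in items if now - i.created_at <= ttl_seconds]
    selected: list[ContextItem] = []
    used = 0
    for item in sorted(fresh, key=lambda i: i.relevance, reverse=True):
        cost = estimate_tokens(item.text)
        if used + cost > token_budget:
            continue  # skip what does not fit; smaller items may still fit
        selected.append(item)
        used += cost
    return selected
```

The greedy packing is a deliberate simplification. It favors relevance first and does not verify that the selected evidence actually answers the question, which still needs its own evaluation.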

In day-to-day product work, I lean on a compact playbook: a router that selects the best specialist; a planner that decomposes tasks and allocates tools; a deterministic guard that verifies preconditions; an execution loop with explicit budgets; and a fallback policy that prefers abstaining over hallucinating. Together, these patterns create an agent that behaves like a dependable teammate rather than a creative wildcard.
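The playbook above can be sketched as a single control loop. Everything here is a stand-in: the keyword router, the split-on-"then" planner, and the fixed confidence scores are hypothetical placeholders for what would be model calls in practice.

```python
from dataclasses import dataclass
from typing import Callable

Specialist = Callable[[str], tuple[str, float]]  # returns (answer, confidence)


@dataclass
class Result:
    status: str  # "done" or "abstained"
    detail: str
    steps_used: int = 0


def billing_specialist(step: str) -> tuple[str, float]:
    return f"billing handled: {step}", 0.9  # fixed confidence, illustration only


def general_specialist(step: str) -> tuple[str, float]:
    return f"general answer: {step}", 0.4


def route(task: str) -> Specialist:
    """Router: pick a specialist (keyword match stands in for a classifier)."""
    return billing_specialist if "refund" in task.lower() else general_specialist


def plan(task: str) -> list[str]:
    """Planner: decompose the task (splitting on ' then ' is a placeholder)."""
    return [s.strip() for s in task.split(" then ") if s.strip()]


def run_task(task: str, preconditions_met: bool,
             max_steps: int = 3, min_confidence: float = 0.7) -> Result:
    # Deterministic guard: verify preconditions before any work starts.
    if not preconditions_met:
        return Result("abstained", "precondition failed; escalate to a human")
    specialist = route(task)
    outputs: list[str] = []
    for used, step in enumerate(plan(task), start=1):
        if used > max_steps:  # explicit budget on the execution loop
            return Result("abstained", "step budget exhausted; escalate", max_steps)
        answer, confidence = specialist(step)
        if confidence < min_confidence:  # fallback: abstain rather than guess
            return Result("abstained", "low confidence; escalate", used)
        outputs.append(answer)
    return Result("done", "; ".join(outputs), len(outputs))
```

Every exit path returns either a completed result or an explicit abstention, so the caller never has to infer whether the agent guessed.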

No architecture thrives without the right rituals. Product trios keep discovery continuous, while clear outcomes (not output) align teams on value instead of vanity. We map risks early, maintain a public quality dashboard, and rehearse failure recoveries so incidents never become improvisations. The cultural signal is simple: we celebrate root-cause clarity and safe iteration over heroics.

If you’re just starting, implement three patterns first: retrieval before generation, abstain-and-escalate for low confidence, and canary releases under feature flags. Instrument everything from day one, run a weekly eval review, and expand scope only when the data says you’re ready. With these habits, your agents will earn user trust—and keep it.

Inspired by this post on Product School.

March 2, 2026