Churn isn’t just a retention problem—it’s a product, go-to-market, and strategy signal that shows up everywhere in the customer journey. Over the past few years, I’ve evaluated and implemented churn prediction tools across high-growth SaaS environments, and the difference between reactive firefighting and proactive, data-driven retention is night and day.
Compare the top 8 churn prediction tools for SaaS teams. Features, use cases, and how each stacks up, so you can act before customers quietly leave.
When I assess churn prediction tools for product-led growth, I start with a simple question: will this help my team see risk early enough, and clearly enough, to intervene with precision? The best platforms combine behavioral analytics, retention analysis, and anomaly detection to surface leading indicators before Net Revenue Retention (NRR) takes a hit.
First, signal coverage matters. Strong churn models draw from product usage events, CRM integration, support tickets, billing health, and even session replay to capture real-world behavior. I look for native connectors to systems like Intercom, Pendo, and Amplitude analytics, plus flexible ingestion for custom events. Without comprehensive signals, even the smartest models will miss critical moments such as stalled onboarding, shrinking active seats, or feature disengagement.
Second, I require transparent risk scoring and clear drivers. Black-box scores erode trust with Customer Success and Product teams; explainability builds alignment. Tools that expose driver trees, cohort-based retention analysis, and segment lift help me translate insights into prioritized experiments. When possible, I tie predicted churn segments to A/B testing with a thoughtful minimum detectable effect (MDE) so we can quantify impact quickly and avoid overfitting to noise.
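To make "transparent risk scoring and clear drivers" concrete, here is a minimal sketch of an additive score that reports each signal's contribution alongside the total. The signal names and weights are hypothetical and chosen only for illustration; they are not taken from any specific tool, and a production model would likely be learned from data rather than hand-weighted.

```python
from dataclasses import dataclass

# Hypothetical weights: how much each normalized signal (0.0 to 1.0) adds to risk.
WEIGHTS = {
    "onboarding_stalled": 0.40,
    "seat_contraction": 0.35,
    "feature_disengagement": 0.25,
}


@dataclass
class RiskResult:
    score: float                      # total risk in [0, 1]
    drivers: list[tuple[str, float]]  # (signal, contribution), largest first


def score_account(signals: dict[str, float]) -> RiskResult:
    """Return an additive risk score plus the per-signal contributions."""
    contributions = {
        # Clamp each signal to [0, 1]; missing signals contribute nothing.
        name: weight * min(max(signals.get(name, 0.0), 0.0), 1.0)
        for name, weight in WEIGHTS.items()
    }
    drivers = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return RiskResult(score=sum(contributions.values()), drivers=drivers)


result = score_account({"onboarding_stalled": 1.0, "seat_contraction": 0.5})
for name, contribution in result.drivers:
    print(f"{name}: {contribution:.3f}")
print(f"total risk: {result.score:.3f}")
```

Because the score is a plain sum, a Customer Success manager can see which signal pushed an account over the line, which is the property the paragraph above asks for.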
Third, actionability is non-negotiable. Predictions must trigger targeted AI workflows, in-app guides, and product tours—not just dashboards. My ideal setup routes high-risk cohorts to tailored journeys (e.g., an onboarding rescue path) while notifying the right owner in CRM and Customer Success. Playbooks should be easy to operationalize, measurable, and reversible if the signals change.
Fourth, I evaluate platform scalability, data governance, and privacy-by-design. Enterprise readiness means clear role-based access, auditability, robust SLAs, and an architecture that can evolve into a unified analytics platform as the product and data footprint grows. I also weigh total cost of ownership, implementation time, and maintenance burden against expected gains in NRR and expansion.
In my experience, the winning tools are the ones that make it simple to connect predictions to outcomes: reduce onboarding drop-off, increase user activation, prevent seat contraction, and accelerate expansion. They align Product, Customer Success, and Growth around shared metrics, shorten time-to-value, and make proactive retention part of the operating rhythm—not a last-ditch effort at renewal.
In this 2026 comparison, I’ll outline how each tool handles data breadth, model quality, explainability, and workflow automation. I’ll also share implementation checklists and decision criteria so you can choose the right fit for your stage, stack, and motion—whether you’re primarily product-led growth, sales-led, or hybrid.
If you’ve ever felt like customers “quietly leave” despite solid top-of-funnel metrics, this guide will help you turn churn signals into concrete actions—and convert at-risk accounts into durable advocates.
Our outcome-based pricing model hinges on one principle: you pay when Fin delivers value.
As Fin takes on new roles, that principle doesn’t change, but the definition of value does.
Fin for Sales qualifies leads, engages prospects, and routes high-intent buyers to your sales team. The value it creates isn’t a resolved query, but a pipeline of qualified opportunities. So we price accordingly: $10 per qualified lead. And you, the customer, define what “qualified” means, not Fin.
This is the first outcome-based pricing model for an AI Agent for sales. Here’s why I believe it’s the right approach and how I’ve seen it change the way teams think about SaaS pricing and ROI.
Over the years, I’ve learned that the fastest way to earn trust with sales and finance leaders is to align pricing with outcomes they actually report on. The core finding from our research was unambiguous: zero buyers preferred paying for activity. They wanted to pay for results.
That insight shaped how we priced Fin for its service role, $0.99 per resolution, where a resolution means the customer’s issue is fully solved without human intervention. More recently, we evolved that model to outcomes, reflecting the broader ways Fin delivers value across complex workflows. We believe pricing should be aligned with value delivery, and the vendor should carry risk when the product doesn’t perform. In sales, the best unit of value is pipeline.
Most sales teams today are overwhelmed by leads. Early in my career, I watched reps spend hours chasing form fills that looked promising but went nowhere. That experience cemented a lesson I still use: volume is vanity; qualification is sanity.
Ensuring the right opportunities promptly reach your sales team is what makes a difference. When a prospect visits your site, engages with Fin, answers qualifying questions, and is directed to a sales rep, Fin is identifying whether the opportunity is worth your team’s time and delivering value.
Charging per conversation would penalize businesses for every curious visitor who asks a question but isn’t a buyer. And charging per token, well, that’s always been a model that protects the vendor, not the customer.
We needed a metric that captures the actual value Fin creates in a sales context: qualified leads.
The purest version of outcome-based pricing for Fin’s sales role would be a percentage of closed revenue. Fin qualifies the lead, a rep closes the deal, and we take a cut. On paper, it looks elegant; in practice, I found it breaks down for two reasons that matter to operators.
First, attribution. Between the moment Fin qualifies a lead and the moment a deal closes, dozens of things can impact the final result: the quality of human-led demos can differ, products can have outages, and prospects’ budgets can get cut. Tying Fin’s price to the final outcome holds it accountable for variables entirely outside its control.
Second, measurement. To track closed revenue, we’d need deep integration into every customer’s CRM, tracking each opportunity from qualification through to close. That’s a significant implementation burden that slows time to value, which is the opposite of what we want.
So we asked: what’s the most honest proxy for the value Fin delivers, where Fin is clearly the one creating it?
A qualified lead is that proxy. It represents the moment Fin has done its job. It has engaged the prospect, gathered the relevant information, evaluated them against your criteria, and determined they’re qualified. Everything up to that point is Fin’s work. Everything after it is the rep’s. At $10 per qualified lead, the pricing reflects this boundary.
There are two key components to how this pricing model works.
First, the customer defines success. With Fin’s sales role, the customer sets their own qualification criteria based on their business context. A company with high average contract values might set a lower bar because they can’t afford to miss anyone. A company where rep time is scarce and deal sizes are smaller might set a much higher bar, filtering aggressively to only surface the most promising prospects. The criteria flex to match the business.
Second, the economics are different by design. As a Customer Agent, Fin can switch between roles like sales and service. So if you’ve deployed Fin for Sales, it can still handle support queries, such as a prospect asking a product question. Those queries are charged at $1 per resolution, consistent with our service pricing. Disqualifications, where Fin determines a prospect doesn’t meet the criteria, are also $1. The $10 price point for qualified leads reflects the higher value of pipeline creation compared to issue resolution.
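As a rough illustration of how these price points combine, the sketch below totals a month of usage from the three unit prices quoted above ($10 per qualified lead, $1 per resolution, $1 per disqualification). The monthly volumes are made-up inputs, and the function is an arithmetic aid, not a representation of how billing is actually computed.

```python
from decimal import Decimal

# Unit prices as stated in the text above.
PRICE_QUALIFIED_LEAD = Decimal("10")
PRICE_RESOLUTION = Decimal("1")
PRICE_DISQUALIFICATION = Decimal("1")


def monthly_cost(qualified_leads: int, resolutions: int, disqualifications: int) -> Decimal:
    """Sum the outcome-based charges for one month of usage."""
    return (
        PRICE_QUALIFIED_LEAD * qualified_leads
        + PRICE_RESOLUTION * resolutions
        + PRICE_DISQUALIFICATION * disqualifications
    )


# Hypothetical volumes, for illustration only.
cost = monthly_cost(qualified_leads=120, resolutions=300, disqualifications=80)
print(f"Monthly cost: ${cost}")
```

With these example volumes the qualified leads account for most of the bill, which matches the intent that pipeline creation is priced above issue resolution.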
The ROI speaks for itself. Early customers are reporting significant returns using Fin for Sales. One shared a perspective that mirrors what I hear in executive QBRs:
“I would say it’s at least 10 times the value. You’re now giving the business exactly what it needs as opposed to just activity. We say this expression in sales leadership all the time – ‘I don’t pay my sales team for activity. I pay them for results.’ I want my AI engine to be the same way.”
When you compare the cost of a qualified lead from Fin against the fully loaded cost of an SDR—salary, benefits, tooling, ramp time—the economics are compelling. For many businesses, particularly those that never had SDRs in the first place, Fin for Sales isn’t just replacing headcount, but creating an entirely new capability that wasn’t economically viable before.
This pricing model came from extensive customer research—qualitative interviews and quantitative studies—exploring how buyers want to pay for AI in a sales context. We tested multiple concepts: per-conversation, per-token, per-seat, revenue share, and per-qualified-lead. The research consistently pointed to outcome-aligned pricing as the preferred model, with the qualified lead emerging as the metric that best balances value alignment, measurability, and practical implementation.
Outcome-based pricing is still rare in AI, but we think that will change. For Sales Agents, we’re the first to do it. Transparency is part of the model. If you understand why we price the way we do, you can evaluate whether it works for your business.
In my work with product, operations, and support leaders, I’m often asked to help make sense of Agent Analytics—what to track, how to attribute outcomes, and where to invest. After reviewing countless dashboards and running experiments across human agents and AI agents, I’ve learned that some of the most common measurement beliefs are precisely the ones that lead teams astray.
What comes up in conversation with leaders about Agent Analytics, and why not everything is what it seems.
Below, I unpack four pervasive myths I encounter and share the data-centered practices I use to replace them. My goal is simple: help you upgrade the way you measure performance so you can improve customer outcomes, accelerate learning, and scale impact with confidence.
Myth 1: “Lower average handle time (AHT) means higher performance.” AHT is useful but incomplete. When teams optimize solely for speed, they often push complexity into repeat contacts, reopens, or escalations. In the data, that shows up as a weak or negative relationship between lower AHT and durable outcomes like first contact resolution (FCR), customer effort, or revenue per conversation.
Reality and what I measure instead: I right-size speed by pairing AHT with intent-level resolution and recontact rate. For simple intents (password reset, billing address update), shorter is usually better. For complex intents (tiered troubleshooting, multi-step verification), “right-speeding” wins—slightly longer interactions that prevent rework. Practically, that means segmenting by intent complexity using behavioral analytics, tracking weighted “intent resolution rate,” and monitoring repeat-contact windows (24–168 hours) to catch downstream pain.
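One way to instrument the repeat-contact window is sketched below. It assumes contacts arrive as (customer ID, intent, timestamp) tuples, which is a hypothetical shape, and it counts a contact as recontacted when the same customer returns with the same intent inside the window. The 168-hour default mirrors the upper end of the range mentioned above.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def recontact_rate_by_intent(contacts, window_hours=168):
    """Share of contacts per intent that were followed by a repeat contact
    from the same customer, on the same intent, within the window."""
    window = timedelta(hours=window_hours)
    by_key = defaultdict(list)
    for customer_id, intent, ts in contacts:
        by_key[(customer_id, intent)].append(ts)

    totals = defaultdict(int)
    recontacted = defaultdict(int)
    for (_, intent), stamps in by_key.items():
        stamps.sort()
        for i, ts in enumerate(stamps):
            totals[intent] += 1
            # Only the next contact matters: it is the earliest possible repeat.
            if i + 1 < len(stamps) and stamps[i + 1] - ts <= window:
                recontacted[intent] += 1
    return {intent: recontacted[intent] / totals[intent] for intent in totals}


# Toy data, for illustration only.
contacts = [
    ("c1", "password_reset", datetime(2026, 1, 5, 9)),
    ("c1", "password_reset", datetime(2026, 1, 6, 9)),     # repeat after 24 hours
    ("c2", "password_reset", datetime(2026, 1, 5, 10)),
    ("c3", "troubleshooting", datetime(2026, 1, 5, 11)),
    ("c3", "troubleshooting", datetime(2026, 1, 20, 11)),  # outside the 168-hour window
]
rates = recontact_rate_by_intent(contacts)
print(rates)
```

Reading this rate next to AHT for each intent shows whether a faster interaction is simply deferring work to a second contact.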
Myth 2: “AI agent containment tells the whole story.” A high containment rate can mask failure modes such as unresolved intent, silent abandonment, or low-quality handoffs that frustrate customers and spike human workload later.
Reality and what I measure instead: I break containment into three parts for voice and chat flows: (1) intent resolution without escalation, (2) graceful handoff quality when escalation is necessary, and (3) post-handoff efficiency and satisfaction. For voice AI agent experiences, I also track escalation clarity (did the transcript summarize history and intent?), time-to-human, and customer satisfaction on the combined interaction. This provides a fuller view of customer support AI strategy effectiveness and avoids over-crediting automation for partial wins.
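The three-part breakdown can be sketched as a small aggregation. The `Conversation` fields below are hypothetical stand-ins for whatever your conversation logs record. The point is that the raw containment rate and the rate of conversations actually resolved without escalation are reported separately, so unresolved-but-contained conversations are visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    escalated: bool
    intent_resolved: bool
    handoff_had_summary: bool = False  # only meaningful when escalated
    csat: Optional[int] = None         # 1-5 rating on the combined interaction


def containment_breakdown(convs: list[Conversation]) -> dict:
    """Split headline containment into resolution, handoff quality, and satisfaction."""
    if not convs:
        raise ValueError("need at least one conversation")
    total = len(convs)
    contained = [c for c in convs if not c.escalated]
    escalated = [c for c in convs if c.escalated]
    rated = [c.csat for c in escalated if c.csat is not None]
    return {
        "containment_rate": len(contained) / total,
        "resolved_without_escalation_rate": sum(c.intent_resolved for c in contained) / total,
        "handoff_summary_rate": (
            sum(c.handoff_had_summary for c in escalated) / len(escalated) if escalated else None
        ),
        "post_handoff_csat": sum(rated) / len(rated) if rated else None,
    }


# Toy data, for illustration only.
convs = [
    Conversation(escalated=False, intent_resolved=True),
    Conversation(escalated=False, intent_resolved=False),  # contained but unresolved
    Conversation(escalated=False, intent_resolved=True),
    Conversation(escalated=True, intent_resolved=True, handoff_had_summary=True, csat=4),
]
breakdown = containment_breakdown(convs)
print(breakdown)
```

In this toy sample the containment rate is 0.75 while only half of all conversations were resolved without escalation; that gap is the over-crediting the paragraph above warns about.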
Myth 3: “Quality is subjective, so it can’t be measured at scale.” Teams often default to sporadic QA because they assume it can’t be standardized across channels or agent types. The result is noisy feedback loops and stalled coaching.
Reality and what I measure instead: Quality becomes measurable when it’s grounded in observable behaviors linked to outcomes. I use a rubric anchored in behavioral analytics (e.g., verified customer need, correct resolution path, policy compliance, empathy markers) and validate it via correlation with FCR, recontact, and retention analysis. To scale, I combine calibrated human reviews with AI-assisted scoring, check inter-rater reliability weekly, and use driver trees to connect quality levers to business results. This creates a consistent, coachable signal for both human agents and AI flows.
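For the weekly inter-rater reliability check, one common statistic is Cohen's kappa, which corrects raw agreement for the agreement two raters would reach by chance. The sketch below assumes two raters (for example, a calibrated human reviewer and an AI-assisted scorer) labeled the same conversations; choosing kappa here is my assumption, since the text does not name a specific statistic.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels, for illustration only.
human = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
ai = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass", "pass", "pass"]
kappa = cohens_kappa(human, ai)
print(f"kappa = {kappa:.3f}")
```

Here the raters agree on 8 of 10 items, but because "pass" dominates both sets of labels, chance agreement is already 0.58, so kappa lands well below the raw 0.8 agreement.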
Myth 4: “If the dashboard is green after launch, we’ve won.” Early wins can reflect novelty effects, cherry-picked routing, or short-term incentives that don’t persist. Declaring victory too soon locks in fragile gains and hides regressions across cohorts.
Reality and what I measure instead: I treat go-live as the start of learning. I use A/B testing with a clear minimum detectable effect (MDE), stagger ramps, and hold out stable control cohorts for at least one full demand cycle. I track outcomes vs output OKRs—focusing on intent resolution, customer effort, and revenue/customer health over vanity metrics. I also monitor seasonality and channel mix shifts inside a unified analytics platform to ensure improvements generalize beyond the first week.
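To show what "a clear minimum detectable effect" implies for ramp planning, here is a sketch of the standard two-proportion sample-size approximation for a two-sided test. The baseline rate and MDE in the example are hypothetical, and the formula is the textbook normal approximation rather than anything specific to a given experimentation platform.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm to detect an absolute lift of mde_abs."""
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    if not (0 < p1 < 1 and 0 < p2 < 1) or mde_abs == 0:
        raise ValueError("rates must be strictly between 0 and 1, and MDE nonzero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


# Hypothetical: 70% baseline intent resolution, detect a 3-point absolute lift.
n = sample_size_per_arm(baseline_rate=0.70, mde_abs=0.03)
print(f"users per arm: {n}")
```

Halving the MDE roughly quadruples the required sample, which is why the MDE should be fixed before the ramp schedule rather than after the first green dashboard.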
How I operationalize this day to day: (1) define intents and complexity upfront, (2) unify journey data across channels, (3) instrument resolution and recontact rigorously, (4) apply driver trees to isolate what actually moves outcomes, and (5) iterate via disciplined experiments rather than sweeping changes. This approach aligns product and operations, speeds up coaching, and ensures AI investments compound rather than decay.
If you’re rethinking your Agent Analytics stack, start by replacing each myth with a sharper metric: pair AHT with intent-level resolution, pair containment with handoff quality and satisfaction, pair QA with outcome-linked rubrics, and pair green dashboards with robust experiments. The payoff is a measurement system that earns trust, guides better decisions, and consistently improves customer and business results.
Today, I’m thrilled to share Fin’s next leap as a Customer Agent: ecommerce. When we launched Fin for Sales, Fin expanded further across the customer journey — and now we’re bringing that same intelligence to product discovery, checkout conversion, and post‑purchase support for Shopify merchants.
Fin for Ecommerce is a new role purpose-built for Shopify merchants that combines shopping assistance and ecommerce support. Fin is already the best Agent for customer service, resolving over a million queries a week for 8,000+ businesses. Now, it also guides shoppers to the right product, addresses concerns in the moment, and converts browsing into buying — all in one fluid experience.
Here’s what’s new and why it matters for conversion rate, average order value (AOV), and lifetime value:
A leading mattress retailer shares how Fin for Ecommerce acts like an expert associate—asking about sleep style and firmness, then recommending the best-fit product to boost confidence and drive conversions.
Fin helps shoppers find the right product. It asks thoughtful questions, narrows options across large catalogs, and compares products based on what the shopper actually needs — like a great in‑store assistant, at scale.
Fin helps increase order value. It recommends relevant add‑ons and higher‑value alternatives based on conversation context, keeps carts effortless to update, and guides shoppers smoothly into checkout when they’re ready.
Fin handles support without losing the sale. Returns, refunds, and order changes happen in the same conversation; once resolved, Fin brings shoppers right back to browsing so momentum isn’t lost.
Fin is integrated with Shopify. Connect your store and Fin syncs your catalog, order data, and APIs in minutes — no manual training or complex setup.
A customer spotlight from Ninja Transfers shows Fin for Ecommerce boosting sales: 10% of support chats convert, with order values 20% above average—proof that an AI customer agent can drive revenue while improving service.
In a great retail store, an attentive associate changes everything: they ask what you’re looking for, understand your preferences, answer the questions that matter, and walk you to checkout — and when you return, they remember you. That level of proactive, human‑quality assistance has never truly made it online.
Most ecommerce still looks like it did a decade ago: filters, FAQs, and self‑serve flows that assume the customer already knows what they want. Ecommerce offers scale and 24/7 convenience, but it’s passive — it can’t understand a shopper’s intent and actively guide them to a product that fits.
Fin for Ecommerce changes that by bringing high‑quality shopping assistance to Shopify stores.
"Fin doesn't just recommend products — it asks the right questions about sleep position and firmness preference, understands what the customer actually needs, and guides them to the right decision. It sells the way we sell." Anthony Navarro, Market Sales Manager at Avocado
An Avocado Green Mattress customer experience leader shares how Fin for Ecommerce unifies support and sales—answering policies, selling products, and explaining the mattress break-in period—so shoppers get instant, agent-level help.
Here’s how it works in practice. When a shopper says "I need a gift for my partner" or asks "what running shoes work for trail and road?," Fin doesn’t dump them on a search results page — it starts a conversation. It asks about preferences, incorporates live browsing context, surfaces the most relevant options, and compares them based on what the shopper cares about.
This is powered by Fin Apex 1.0, the best-performing model for customer service, combined with a retrieval engine purpose-built for ecommerce. It handles vague, exploratory shopping questions and large product catalogs, helping shoppers find the right fit, faster.
Seamlessly connect Fin to your Shopify store. With one click, sync your product catalog, pull live inventory, and import store policies so your customer agent can answer questions and resolve orders faster.
In practical terms, this is agentic AI meeting ecommerce: Fin plans, retrieves, and reasons through complex product questions and next best actions to move the shopper forward confidently.
Based on the conversation, Fin recommends complementary or higher-value options, keeps carts easy to update, and guides shoppers into checkout when they’re ready.
A customer testimonial from GroupSumi spotlights Fin for Ecommerce: rapid, high-quality support with minimal setup, powered by Shopify as the single source of truth, helping teams cut complexity and focus on growth.
"Fin for Ecommerce is already driving meaningful revenue, with 10% of conversations converting to orders averaging 20% above our store AOV." Matt Satell, Director of Ecommerce, Ninja Transfers
Fin for Ecommerce is built on the same AI platform that powers Fin for Service. Fin understands whether a conversation requires shopping assistance, support, or both, and moves between them seamlessly without the customer noticing.
This means the same Agent that helps shoppers buy also handles the hard and complex post‑purchase work including refunds, exchanges, order changes, tracking, and shipping questions. It can make changes in real time, within the same conversation, using the same context and data.
"The handoff between support and sales is so smooth I can't tell the difference without checking the filters. Fin talks policy, sells products, and references our mattress break-in period all in one conversation. It handles both the way our best agents would — but without the customer waiting to be passed between people." Kurt Dwiggins, Customer Experience Manager at Avocado
Fin for Ecommerce is purpose-built for Shopify merchants. Connect your Shopify store and Fin establishes a live connection to your entire catalog – products, variants, content, and order data – ensuring every response reflects your latest inventory and shoppers only see what’s actually available.
You can add the Messenger to your store and set Fin live in minutes without any manual training or technical expertise. When connected to Shopify’s API, Fin can handle even your most complex customer requests like tracking orders, processing returns, and updating subscriptions via Procedures. Fin automatically drafts Procedures for common ecommerce support queries based on your Shopify account and customized to your company policies.
You review, adjust, and publish, allowing Fin to start handling real queries in minutes.
"What surprised us most about Fin for Ecommerce is how quickly it delivers high-quality support with minimal, non-technical setup. Using Shopify as the single source of truth reduces operational complexity and allows us to focus on core business execution." Arnau Jiménez, Chief Technology Officer, GroupSumi
Fin is now a Customer Agent, with multiple roles that work seamlessly across the customer lifecycle. When a single Agent can guide a shopper from "I need a gift for my partner" to checkout, and handle a return weeks later without losing context, that’s a fundamentally better customer experience. It’s one Agent that deeply understands your products and your customers, and supports them throughout their entire journey with your business.
Leading ecommerce brands, including Avocado, WHOOP, Shutterstock, Flaviar, Carvana, Nuuly, MPB, Pure Electric, and Goodbuy Gear, already trust Fin to create standout experiences for their shoppers. I’m excited to continue expanding Fin’s roles as a Customer Agent and share more soon.
Ready to see it in action? Visit fin.ai/ecommerce and add Fin to your Shopify store today.
Our retention curve had flattened even as activation ticked up, and that disconnect told me we were missing a leading indicator buried in our AI agent telemetry. I set out to connect our AI evals directly to product retention, not as an academic exercise, but as the basis for focused roadmap bets and stronger product-led growth.
"Learn how we used Agent Analytics to discover an eval signal that predicts 3X higher user retention."
Connecting AI evals to retention analysis is deceptively hard. Evals often live in ad-hoc notebooks while behavioral analytics and cohort retention live elsewhere. IDs drift. Signals are noisy. Teams gravitate to fast output over outcome clarity. I leaned into eval-driven development to close that gap and make our AI workflows accountable to business results.
We began with crisp hypotheses: for example, that higher semantic accuracy and lower escalation rates would correlate with repeat usage. We enumerated a concise eval taxonomy—accuracy, containment, safety, latency, and UX friction—and used Agent Analytics to compute per-user and per-tenant features on a daily cadence. That gave us a reliable, unified analytics platform for AI-specific signal generation.
Next, we joined those features to our product telemetry in Amplitude analytics using clean user and account identifiers. With that foundation, we created weekly and monthly cohorts, ran retention analysis, and used driver trees alongside simple logistic models to control for plan type, segment, region, and acquisition channel. The goal wasn’t perfection—it was directional clarity strong enough to inform product strategy.
One eval metric separated itself from the pack. When users hit a specific threshold early in their journey, the model predicted 3X higher user retention compared to peers who didn’t. I still remember overlaying that signal on our cohort chart—the lift was impossible to unsee, and it immediately reframed our activation and onboarding priorities.
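The threshold comparison described above can be sketched as a simple ratio of retention rates between users who reached the eval threshold early and those who did not. The data below is synthetic and constructed for illustration; it is not the data behind the result in this post. The sketch is also unadjusted: it does not control for plan type, segment, region, or acquisition channel the way the logistic models mentioned earlier do.

```python
def retention_lift(users: list[tuple[float, bool]], threshold: float) -> tuple[float, float, float]:
    """users: (early_eval_score, retained) pairs.
    Returns (retention at/above threshold, retention below, ratio)."""
    above = [retained for score, retained in users if score >= threshold]
    below = [retained for score, retained in users if score < threshold]
    if not above or not below:
        raise ValueError("both groups need at least one user")
    rate_above = sum(above) / len(above)
    rate_below = sum(below) / len(below)
    if rate_below == 0:
        raise ValueError("lift is undefined when the comparison group has zero retention")
    return rate_above, rate_below, rate_above / rate_below


# Synthetic users, for illustration only: (eval score in first week, retained at week 8).
users = [
    (0.92, True), (0.88, True), (0.85, True), (0.81, False),
    (0.60, False), (0.55, True), (0.40, False), (0.35, False),
]
rate_above, rate_below, lift = retention_lift(users, threshold=0.80)
print(f"above: {rate_above:.2f}, below: {rate_below:.2f}, lift: {lift:.1f}x")
```

A ratio like this is a starting point for a hypothesis, not evidence of causation; the A/B validation described in the next paragraph is what separates a real lever from a segment-mix artifact.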
From there, we operationalized. We built in-app guides that nudged new users toward the eval threshold, added a health score to customer success workflows, and put feature flags on model changes until they improved the eval. We validated the effect size with A/B testing and set up anomaly detection to catch regressions before they touched real users.
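As one simple form of the anomaly detection mentioned above, the sketch below flags a regression when the latest daily eval value falls several standard deviations below its trailing history. The z-score rule and the threshold of 3 are assumptions on my part; the text does not say which detection method was used.

```python
from statistics import mean, stdev


def is_regression(history: list[float], latest: float, z_threshold: float = 3.0) -> bool:
    """Flag `latest` if it sits more than z_threshold standard deviations
    below the mean of the trailing history."""
    if len(history) < 2:
        raise ValueError("need at least two historical points")
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return latest < mu  # flat history: any drop is a change
    return (mu - latest) / sigma > z_threshold


# Hypothetical daily eval pass rates for the trailing week.
history = [0.91, 0.92, 0.90, 0.93, 0.91, 0.92, 0.90]
print(is_regression(history, latest=0.85))
print(is_regression(history, latest=0.90))
```

Wiring a check like this to the feature flag on a model change gives an automatic reason to hold or roll back before the change reaches the full user base.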
If you want a repeatable playbook: define your north-star retention window, shortlist 3–5 eval candidates tied to real user value, ensure rock-solid identifiers across systems, compute daily features in Agent Analytics, model uplift against retention cohorts in Amplitude analytics, then translate the winning signal into onboarding nudges, product tours, and success playbooks. Track second-order outcomes too, such as support tickets, NPS, and Net Revenue Retention (NRR), so you don’t optimize a proxy at the expense of experience.
I also learned what to avoid. Watch for sample-size traps and label leakage, and remember that segment mix can masquerade as model improvement. Use minimum detectable effect (MDE) calculations to size experiments, add risk scoring to gate launches, and keep a tight feedback loop between product, data science, and customer success.
The payoff is far more than a tidy dashboard. By grounding our AI strategy in behavioral analytics and measurable retention lift, we turned an abstract eval into a concrete growth lever—and gave our product teams the confidence to move faster with clarity.
Inspired by this post on Amplitude – Perspectives.
AI agents are getting remarkably good at scaffolding features and writing tests, yet when production issues surface, accountability still lands on me and my team. The last mile of quality—reproducing the issue, isolating the root cause, and validating a durable fix—remains a human responsibility, even in an era of agentic AI. That’s why I’ve built a repeatable debugging approach that blends behavioral analytics with agent-assisted coding to close the loop quickly and safely.
Investigate bugs directly in Claude or Cursor with Amplitude MCP. Learn two Session Replay workflows to debug faster.
The goal is simple: transform messy, anecdotal bug reports into actionable, prioritized work that my developers can resolve confidently. By pairing Session Replay with Amplitude analytics, I can quantify impact, capture precise reproduction steps, and feed rich context into Claude or Cursor. The result is a faster path from signal to solution—and fewer back-and-forth cycles with engineering, support, and product.
Here’s how I use Session Replay to tighten the feedback loop. First, I lean on behavioral analytics to detect anomalies and segment affected users, so I know whether we’re facing an edge case or a widespread degradation. Then I use the replay to see exactly what the customer experienced: the path they took, the UI state, the environment details that matter (device, browser, version), and the precise moment things went sideways. This contextual backbone lets me enter Claude or Cursor with high-signal inputs, rather than guesswork.
Workflow 1: From customer session to reproducible issue. I start with the offending Session Replay and capture the exact steps to reproduce, including state transitions and timestamps for any console errors or API failures. In Claude or Cursor, I provide those steps, reference the replay link, and ask the model to propose a minimal failing test and a hypothesis for root cause. With Amplitude MCP as the connective tissue, I can keep the model anchored to the relevant events and user path while it generates patches or targeted instrumentation. I validate the hypothesis locally, run the failing test, and then move the fix through CI/CD with feature flags so we can verify in production without overexposing risk.
Workflow 2: From code symptoms back to customer evidence. Sometimes I begin in the IDE or agent environment with a flaky test, a suspicious diff, or a performance regression. In that case, I ask Claude or Cursor to outline likely failure modes and the critical code paths. Then I pivot to Session Replay for corroboration: do real users hit these paths, under what conditions, and how often? Using Amplitude MCP to anchor the agent in actual user journeys helps separate theoretical fixes from changes that will meaningfully improve outcomes. I confirm with replays after the patch lands, monitor Web Vitals and related behavioral metrics, and only then ramp the flag.
Two practices make these workflows consistently effective. First, I frame prompts to keep the model tightly scoped: reproduction steps, expected vs. actual behavior, impacted segments, and any known constraints (e.g., rate limits, third-party dependencies). Second, I treat the agent as a proactive pair-programmer: it drafts hypotheses, tests, and diffs, while I provide ground truth from Session Replay and analytics. That division of labor keeps the LLM productive without letting it drift from the evidence.
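The first practice, keeping prompts tightly scoped, can be sketched as a small structure that renders the same fields every time. Everything here (the `BugReport` class, its field names, and the example values) is hypothetical and independent of any Amplitude, Claude, or Cursor API; it only formats text you would paste or pass into the agent.

```python
from dataclasses import dataclass, field


@dataclass
class BugReport:
    steps: list[str]
    expected: str
    actual: str
    impacted_segments: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render a scoped prompt: repro steps, expected vs. actual, segments, constraints."""
        lines = ["Reproduction steps:"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        lines.append(f"Expected: {self.expected}")
        lines.append(f"Actual: {self.actual}")
        if self.impacted_segments:
            lines.append("Impacted segments: " + ", ".join(self.impacted_segments))
        if self.constraints:
            lines.append("Known constraints: " + ", ".join(self.constraints))
        lines.append("Task: propose a minimal failing test and one root-cause hypothesis.")
        return "\n".join(lines)


# Hypothetical report assembled from a replay.
report = BugReport(
    steps=["Open the cart with two items", "Apply a promo code", "Click Checkout"],
    expected="Order summary page loads",
    actual="Spinner never resolves; console shows a 500 from the pricing call",
    impacted_segments=["Safari 17", "EU region"],
    constraints=["Pricing service is rate limited"],
)
prompt = report.to_prompt()
print(prompt)
```

Using a fixed template makes it obvious when a field such as the impacted segments is missing, which is usually a sign to go back to the replay before asking the agent for a fix.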
Operationally, I also align this approach with our incident management and observability standards. For high-severity issues, SREs and product managers share the same replay artifacts, event timelines, and roll-forward criteria. We document root causes and guardrails as docs-as-code, then socialize them via developer evangelism so similar classes of bugs get caught earlier. Over time, this tightens our DORA metrics—particularly lead time for changes and deployment frequency—without compromising stability.
Privacy-by-design is non-negotiable. We ensure Session Replay redacts sensitive fields, enforces least-privilege access, and complies with our data governance policies. When I involve an agent, I include only the minimum data necessary to reach a fix and prefer structured artifacts (event IDs, stack traces, and test cases) over raw PII. These safeguards let us move quickly without trading away trust.
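A minimal sketch of the "minimum data necessary" step is shown below: it drops fields that are sensitive by name and masks email addresses embedded in free text before an event is shared with an agent. The key list and the event shape are hypothetical, the regular expression covers common email forms only, and this is not a substitute for the redaction controls in your replay or analytics tooling.

```python
import re

# Simple pattern for common email addresses; not a full RFC 5322 matcher.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical field names treated as sensitive and dropped entirely.
SENSITIVE_KEYS = {"email", "name", "phone", "ip_address"}


def minimize(event: dict) -> dict:
    """Return a copy of the event without sensitive keys and with emails masked."""
    cleaned = {}
    for key, value in event.items():
        if key in SENSITIVE_KEYS:
            continue
        if isinstance(value, str):
            value = EMAIL.sub("[REDACTED_EMAIL]", value)
        cleaned[key] = value
    return cleaned


# Hypothetical event, for illustration only.
event = {
    "event_id": "evt_123",
    "email": "a@example.com",
    "error": "Checkout failed for a@example.com: TypeError",
}
safe_event = minimize(event)
print(safe_event)
```

Keeping structured identifiers such as the event ID while stripping personal fields preserves what the agent needs to reason about the bug without exposing who hit it.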
The takeaway is pragmatic: agents can accelerate creation, but accountability for quality still rests with us. By grounding Claude or Cursor in real user behavior via Amplitude MCP and Session Replay, I get faster reproduction, more accurate fixes, and cleaner rollouts. The combination turns “mysterious customer bug” into “verified hypothesis and passing test” in a fraction of the time—and that’s how we ship responsibly at speed.
Inspired by this post on Amplitude – Best Practices.
I’m energized by the momentum I’m seeing at the intersection of behavioral analytics and AI workflows. "Chanaka is an AI Engineer at Amplitude, where he’s building the MCP server that brings Amplitude’s behavioral context directly into your AI tools." That single sentence captures a strategic inflection point for product organizations: AI that finally understands user behavior at the moment of decision.
Why does this matter? When behavioral analytics flow natively into our AI tools, we move from generic assistants to product-savvy copilots. Instead of prompting blind, I can ground my questions in Amplitude analytics—segment performance, cohort trends, and event funnels—so AI answers reflect real customer journeys, not hypotheticals. The result is sharper prioritization, faster discovery, and tighter feedback loops that directly support product-led growth.
From a technical standpoint, an MCP server becomes a clean, secure interface for LLMs to access behavioral analytics as needed. That enables a retrieval-first pipeline that can reduce hallucinations, improve context window management, and raise prompt engineering quality. It also unlocks agentic AI patterns, where the assistant autonomously requests the right behavioral context to diagnose activation drops, spot anomalies, or recommend experiments. In short, it brings a unified analytics platform into the AI tools where product managers actually work.
In day-to-day product management, this translates into practical wins. I can ask, “Which onboarding step is blocking user activation for the SMB segment?” and get an answer grounded in behavioral analytics with relevant visualizations or funnels. I can explore retention analysis by cohort without switching tools, then iterate on hypotheses and next-best actions inside the same AI-driven workflow. These tighter loops materially improve decision quality and team velocity.
There are governance considerations, of course. I advocate clear data access policies, strong privacy-by-design controls, and well-defined scopes for what the MCP server can retrieve. Start with high-value, low-risk datasets, pilot with a focused team, and instrument eval-driven development to measure accuracy, latency, and business impact. When done right, the AI Strategy becomes an execution engine—not just a slide.
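A pilot like this needs only a very small eval loop to start. The sketch below scores an answering function against labeled questions and records latency; `stub_assistant`, the questions, and the expected answers are all made up for illustration and stand in for whatever MCP-backed assistant and ground truth your team actually uses. Business impact still has to be measured separately.

```python
import time

def run_evals(answer_fn, cases):
    """Score an answering function on labeled cases; report accuracy and mean latency."""
    correct = 0
    latencies = []
    for question, expected in cases:
        start = time.perf_counter()
        answer = answer_fn(question)
        latencies.append(time.perf_counter() - start)
        correct += int(answer == expected)
    return {
        "accuracy": correct / len(cases),
        "mean_latency_s": sum(latencies) / len(latencies),
    }

# Hypothetical stand-in for an assistant grounded in behavioral data.
def stub_assistant(question: str) -> str:
    canned = {"Which onboarding step blocks SMB activation?": "email verification"}
    return canned.get(question, "unknown")

# Made-up labeled cases; real ones would come from questions your team already answered by hand.
cases = [
    ("Which onboarding step blocks SMB activation?", "email verification"),
    ("Which cohort drives retention?", "annual plans"),
]
report = run_evals(stub_assistant, cases)
```

Exact string matching is the crudest possible grader; it is used here only to keep the sketch self-contained.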
My playbook: begin with one or two high-impact questions (e.g., activation blockers or churn drivers), wire them into the MCP-powered AI workflow, and quantify time-to-insight and decision quality improvements. As wins accumulate, expand to roadmap shaping, opportunity sizing, and experiment generation. The promise here is compelling—AI that doesn’t just talk about the product, but truly understands how customers use it, and helps us build the right things faster.
I just finished listening to "Taste – All Things Product Podcast with Teresa Torres & Petra Wille," and as a product leader shipping AI-powered capabilities at HighLevel, Inc., I wanted to pressure-test the sudden obsession with "taste."
If you're curious, you can listen to this episode on Spotify or Apple Podcasts.
The core question landed perfectly for our moment: Is "taste" the must-have skill of the AI era — or just the latest tech buzzword in a world where AI is eating through design, delivery, and discovery?
Teresa pushes back hard, highlighting how slippery the term can be. "It's just this month's flavor of founder mode." She points out that "taste" is rarely defined, can't be easily taught, and too often becomes shorthand for "my preference trumps yours." Just as importantly, "It's not about your taste. It's about your customer's taste."
Petra adds needed nuance from years in the craft: pattern-recognition is real, and some people do develop sharper product sense over time. As she put it, "I am a strong believer that you develop product sense and taste over time. It's never finished."
Both threads lead back to familiar roots in product: product sense, founder mode, and the enduring myth of the lone visionary. They even grapple with the big question on everyone’s mind—Will AI Eat Taste Too?—and where that leaves product teams navigating GenAI, LLMs for product managers, and evolving product strategy.
Here’s my take. "Taste" can be useful as a personal north star, but it is not a decision system. In my teams, we bias toward evidence: continuous discovery, customer interviews, discovery synthesis with opportunity solution trees, and tight collaboration in product trios. Opinion can start the conversation, but evidence should end it.
Practically, that means investing in the skills that compound: discovery skills (understanding customers and matching solutions to real needs), human-to-human interaction skills, learning to collaborate with AI effectively, and critical thinking and judgment grounded in evidence.
On AI collaboration specifically, we treat GenAI as a force multiplier, not a decider. We prototype with AI to explore breadth, then narrow with qualitative and quantitative signals, ablation-style experiments, and clear success criteria. The bar I hold myself to is simple: taste without evidence is just opinion.
Three lines I underlined from the conversation:
"It's just this month's flavor of founder mode." — Teresa Torres
"It's not about your taste. It's about your customer's taste." — Teresa Torres
"I am a strong believer that you develop product sense and taste over time. It's never finished." — Petra Wille
If you want to go deeper, these references are helpful for sharpening judgment without falling into the "great man" theory trap.
Follow Teresa Torres: https://ProductTalk.org
Follow Petra Wille: https://Petra-Wille.com
Founder mode
Marty Cagan: Founder-Style Leadership
Vercel/v0 CEO Guillermo Rauch on building taste: from Lenny Rachitsky’s LinkedIn post
Continuous discovery (read Teresa’s Everyone Can Do Continuous Discovery—Even You! Here’s How)
The "great man" theory
Steve Jobs and the myth of the lone product visionary
Have thoughts on this episode? Leave a comment below and share how your team balances product sense with evidence in the age of AI.
I’m excited to share that we’ve brought Amplitude Plug and Play to the Claude and Cursor marketplaces—a lightweight way to infuse your everyday prompts with serious product analytics context and speed.
"Learn more about our new AI plugin, the easiest way to turn your favorite AI client into an analytics expert with a single-install."
For years, I’ve watched teams lose momentum hopping between dashboards, docs, and spreadsheets just to answer simple questions like “What changed in activation last week?” or “Which cohort is driving retention?” With Amplitude analytics and behavioral analytics at the core, Amplitude Plug and Play collapses that friction by bringing the answers to where you already think and build—inside Claude and Cursor.
In practice, this means I can ask natural-language questions such as “Show me the funnel from signup to activation by region,” “Compare retention week over week for new users from our latest release,” or “Summarize our last A/B testing results on onboarding” and get structured, context-aware responses. The goal is to keep me in flow while still honoring the rigor of a unified analytics platform.
What I love most is how this elevates both discovery and delivery. Product managers can accelerate continuous discovery by querying cohorts, drivers, and anomalies mid-conversation. Engineers working in Cursor or with Claude Code can validate event definitions, sanity-check metrics, and spot regressions without leaving their IDE. The result is tighter feedback loops and better decision quality.
Just as importantly, the experience is designed for clarity and consistency. When I ask about activation, I expect the same canonical definition every time. When I explore a retention analysis, I want clear assumptions and transparent logic. By anchoring responses to well-defined metrics and event taxonomies, the plugin helps reinforce good data governance while keeping the interaction fast and conversational.
Getting started takes only a few minutes. Open the Claude or Cursor marketplace, search for Amplitude Plug and Play, complete the single-install flow, and connect to your Amplitude analytics workspace. From there, start prompting as you normally would—only now your AI client can reason with product context.
This launch is part of how I see GenAI reshaping AI workflows for product teams: less context switching, more signal per prompt, and a shared, accessible understanding of what’s really moving the business. If you’re ready to turn your AI assistant into a trusted partner for product insight, Amplitude Plug and Play is a powerful next step.
I’ve been closely tracking how agentic AI reshapes frontline operations, and few case studies are as instructive as AITropos. Their north star is deceptively simple: take a food order over WhatsApp — correctly, every time, fast enough that customers can’t tell it’s not a person. That’s the challenge Santi Marchiori and Juan Haedo embraced, and it’s a masterclass in product strategy, conversation design, and systems engineering.
What they’ve built is an AI order-taking agent that handles the full flow — menu recommendations, modifiers, delivery zones, payment links, and status updates — entirely inside WhatsApp. Choosing the customer’s preferred channel wasn’t just a UX decision; it set the bar for speed, reliability, and trust. In hospitality, seconds matter. Latency becomes brand.
Their path to this solution reflects disciplined continuous discovery. They spent two years exploring hundreds of startup ideas before finding the niche of AI-powered order taking in hospitality, then iterated through three product forms — hardware for waiters, a waiter app, and finally a customer-facing WhatsApp agent — before landing on the right form factor. In my experience, this is what real product-market fit lessons look like: follow the problem, not the artifact.
Under the hood, the hardest problem is translating "non-deterministic human conversation" into structured "POS-compatible order data." To hit real-time response speed requirements, they chose a "tools-based architecture" over "MCP" or pipelines. That decision minimizes orchestration overhead and keeps the agent focused on the shortest path from intent to action — a pragmatic approach I recommend when SLAs are tight and context changes fast.
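To illustrate what "conversation in, POS-compatible data out" can look like in a tools-based setup, here is a minimal sketch of a single tool that validates loosely structured arguments and returns a strict order line. The `MENU` contents, the `OrderLine` shape, and the `add_item_tool` name are my own assumptions for illustration, not AITropos's actual schema or code.

```python
from dataclasses import dataclass, field

# Hypothetical menu keyed by product ID, with the modifiers each product accepts.
MENU = {"p1": {"name": "Margherita", "modifiers": {"extra cheese", "no basil"}}}

@dataclass
class OrderLine:
    product_id: str
    quantity: int
    modifiers: list = field(default_factory=list)

def add_item_tool(args: dict) -> OrderLine:
    """Validate model-supplied arguments and return a strict, POS-ready order line."""
    product = MENU.get(args.get("product_id"))
    if product is None:
        raise ValueError("unknown product")
    quantity = int(args.get("quantity", 1))
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    modifiers = list(args.get("modifiers", []))
    unknown = [m for m in modifiers if m not in product["modifiers"]]
    if unknown:
        raise ValueError(f"unsupported modifiers: {unknown}")
    return OrderLine(args["product_id"], quantity, modifiers)

# The model may pass the quantity as a string; the tool normalizes it.
line = add_item_tool({"product_id": "p1", "quantity": "2", "modifiers": ["extra cheese"]})
```

The point of the design is that the model never writes to the POS directly: anything that fails validation raises an error the agent can turn into a clarifying question.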
They also engineered for throughput and precision. A parallelized pipeline searches for multiple products simultaneously and pre-fetches product context before the agent even calls a tool. Complementing that, smaller, fast sub-agents assemble an "immediate system prompt" that injects relevant data into each turn without extra tool calls. Think of it as a retrieval-first pipeline designed to slash latency while preserving accuracy — a pattern every team building AI workflows should study.
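The pattern is easy to sketch, even if the production version is far more involved. Below, several product lookups run concurrently with `asyncio.gather`, and the results are injected into the prompt for the current turn so the agent needs no extra tool call. The in-memory `MENU`, the simulated latency, and the function names are hypothetical stand-ins, not the team's implementation.

```python
from __future__ import annotations

import asyncio

# Hypothetical in-memory menu standing in for a real product search service.
MENU = {
    "margherita": {"id": "p1", "price": 9.5},
    "cola": {"id": "p2", "price": 2.0},
}

async def search_product(term: str) -> dict | None:
    """Simulate one product lookup with a little I/O latency."""
    await asyncio.sleep(0.01)
    return MENU.get(term.lower())

async def prefetch_context(terms: list[str]) -> dict[str, dict | None]:
    """Look up every mentioned product concurrently instead of one after another."""
    results = await asyncio.gather(*(search_product(t) for t in terms))
    return dict(zip(terms, results))

def build_turn_prompt(context: dict[str, dict | None]) -> str:
    """Inject the pre-fetched matches into the prompt for this turn."""
    lines = [f"- {term}: {match if match else 'no match'}" for term, match in context.items()]
    return "Known products for this turn:\n" + "\n".join(lines)

context = asyncio.run(prefetch_context(["Margherita", "cola", "tiramisu"]))
prompt = build_turn_prompt(context)
```

With sequential lookups the wait grows with the number of products mentioned; with `gather` it is roughly the latency of the slowest single lookup.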
Focus is evident in their KPIs. They identified order item identification accuracy as their single most important KPI. Picking one metric that truly governs customer trust is a hallmark of strong product management; it clarifies trade-offs in model selection, prompt engineering, and fallback behavior.
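The source does not say how this KPI is computed, so the following is only one plausible definition: the share of items a customer actually ordered (quantities included) that the agent captured. The function name and the sample orders are made up for illustration.

```python
from collections import Counter

def item_identification_accuracy(expected_orders, predicted_orders):
    """Share of expected items (respecting quantity) that the agent identified."""
    correct = 0
    total = 0
    for expected, predicted in zip(expected_orders, predicted_orders):
        expected_counts = Counter(expected)
        predicted_counts = Counter(predicted)
        # Multiset intersection counts matched items without exceeding ordered quantities.
        correct += sum((expected_counts & predicted_counts).values())
        total += sum(expected_counts.values())
    return correct / total if total else 0.0

# Made-up example: the agent missed one of two colas in the first order.
expected = [["margherita", "cola", "cola"], ["tiramisu"]]
predicted = [["margherita", "cola"], ["tiramisu"]]
accuracy = item_identification_accuracy(expected, predicted)  # 3 of 4 items matched
```

This version does not penalize extra items the agent adds by mistake; a stricter definition would track those as a separate error rate.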
Quality assurance is equally rigorous. Before going live in any new venue, they test with thousands of agent-simulated customer conversations overnight. This approach de-risks deployment, surfaces edge cases early, and provides the data backbone for Agent Analytics and iteration. It’s a practical blueprint for teams that need both scale and safety when putting LLMs into production.
Operationally, the payoff shows up in onboarding. They reduced new customer onboarding from three months to a few weeks — and continue to shrink it as they build domain templates. Standardizing schemas, prompts, and flows for repeatable segments is exactly how you turn bespoke wins into a scalable go-to-market engine.
Stepping back, a few lessons stand out for product leaders building agentic AI in high-velocity environments: meet customers where they already are (WhatsApp), pick an architecture that serves your latency constraints (tools over complex workflows), pre-inject context to reduce tool calls, simulate at scale before launch, and anchor teams around one trust-defining KPI. Do these consistently, and you transform AI from a novelty into an always-on employee your customers actually prefer to use.
I used to treat the roadmap like a sacred artifact. Over time, I learned the uncomfortable truth: the best product leaders stop obsessing over the roadmap and start obsessing over ambition. My number one job isn’t shipping features—it’s raising the bar for what the team believes is possible and carving out the time to think deeply. When I spend half my time thinking (not doing), the business moves faster, customers feel the lift, and outcomes finally outpace output.
The impact of a great product leader starts with context-setting. Under a founder, the role often skews toward influence without deference—pressure-testing ideas, bringing data and customer insight, and helping translate founder vision into a portfolio and product strategy. Under a hired CEO, it’s about aligning capital allocation, setting clear investment theses, and ensuring product roadmapping and sprint planning connect directly to financial and go-to-market realities.
Ambition beats activity. I push teams beyond “what we can fit this quarter” and anchor on value creation: how does this create net-new customer advantage? We write OKRs around outcomes rather than output, tie initiatives to activation, retention, and Net Revenue Retention (NRR), and celebrate learning velocity as much as shipping velocity. When the narrative moves from features to outcomes, customers notice, and so does the business.
I’m demanding without breeding fear. The trick is a high bar plus psychological safety: crisp quality standards, blameless postmortems, and an expectation of intellectual honesty. I separate people from problems, model curiosity over certainty, and use stakeholder management to align early, not late. The result is a culture where empowered product teams volunteer for the hard problems because the path to excellence is transparent.
Most “politics” is an incentives problem. When functions optimize for different scorecards, status games fill the vacuum. I fix this with a shared driver tree, clarified decision rights, and compensation aligned to company-wide outcomes. Once incentives match the strategy, alignment stops being a meeting and starts being momentum.
I use a three-bucket framework to delegate decisions. Bucket 1: I decide (irreversible, cross-company implications, or existential risk). Bucket 2: Team decides; I’m consulted (reversible or scoped risk with clear guardrails). Bucket 3: Team decides; I’m informed (local optimization and execution details). This creates speed without surrendering strategic coherence, and it’s a practical approach to building empowered product teams.
I’m militant with my calendar to protect thinking time. I block two to three mornings per week for deep work, partner with executive assistants to defend those blocks, and aggressively prune low-ROI rituals. “Thinking time” isn’t a luxury; it’s where product strategy is forged, complex bets are sequenced, and product-market signals get synthesized. I also fly at a low altitude—joining customer calls, reviewing designs and PRDs weekly—so judgment stays grounded without micromanaging.
The AI era demands more risk in our roadmaps. I place a few venture-like bets, timebox them, and instrument eval-driven development so we can kill or scale quickly. The concept of an app is changing—from static screens to adaptive workflows, assistants, and agentic AI. This shifts product roadmapping and sprint planning toward capabilities, data leverage, and safety systems (privacy-by-design, data governance, and AI risk management) rather than a linear feature list.
Innovation teams need shelter from the core. I separate their KPIs from immediate monetization, create technical sandboxes with clear guardrails, and run a parallel discovery track. Forward deployed engineers sit with customers; continuous discovery ensures we converge on problems worth solving; and when something works, we integrate it into the core without smothering it with legacy processes.
I use a barbell planning horizon: 12 weeks of executional clarity and 12–24 months of strategic theses. Anything beyond that is scenario planning, not a promise. We revisit the theses quarterly, tie them to product strategy and go-to-market strategy, and ensure each increment is measurable. This balances focus with optionality.
Excellence in 2026 looks different. It requires fluency in AI strategy, strong data governance, and the ability to move from feature leadership to system leadership. Product leaders must be bilingual: as comfortable discussing LLMs and retrieval-first pipelines as they are speaking in NRR, gross margin, and payback periods. The job is to translate technology shifts into durable customer advantage.
Being a great C-suite partner means acting enterprise-first. I co-own capital allocation with finance, sequence hiring with people and engineering, and encode our strategy into operating cadence. I treat sales-led growth and product-led growth as complementary systems, not rival religions, and I bring clarity to trade-offs with driver trees and scenario plans.
Chase impact, not titles. The fastest growth I’ve seen comes from optimizing for scope, learning rate, and mentors—not for role labels. If you want comp and career to compound, maximize the value you create: fix activation, improve retention, unlock expansion, or reduce cost-to-serve. Titles follow impact, not the other way around.
Four bottlenecks stall careers repeatedly. First, a scope ceiling—holding too much IC work and not scaling through delegation. Second, stakeholder friction—underinvesting in alignment and communication. Third, weak people leadership—not hiring, coaching, and performance-managing decisively. Fourth, fuzzy strategy—if your strategy can’t be drawn as a driver tree, your teams can’t execute it. Remove these bottlenecks and your trajectory changes fast.
In the end, the roadmap is an instrument, not the strategy. Raise the team’s ambition, align incentives, protect deep work, and take smarter AI-informed risks. Do that consistently and the roadmap stops being a crutch—it becomes a flywheel.
In my role leading product strategy at HighLevel, I’ve learned that AI search is one of the most overlooked growth levers in a modern product stack. When we treat every query as a moment to understand intent, reduce friction, and guide users to value, AI search stops being a utility and starts becoming a compounding engine for product-led growth.
"Turn AI search into a growth channel with AI visibility, sentiment analysis, revenue impact, and content recommendations in one place."
That single line has become a practical blueprint for how I operationalize AI Strategy: make what users ask visible, interpret how they feel, quantify what converts, and continually recommend better content. AI visibility tells me which intents we serve well (and where we fail). Sentiment analysis connects experience to emotion. Revenue impact closes the loop with attribution. Content recommendations ensure we don’t just diagnose gaps—we close them.
Under the hood, I anchor this on a retrieval-first pipeline that marries behavioral analytics with a unified analytics platform. This lets me trace the path from query to outcome: how users phrase needs, which results earn clicks, where drop-offs happen, and which experiences correlate with activation, retention, and expansion. With that signal, I can prioritize high-leverage content updates, tune relevance, and decide when agentic AI should step in with guided workflows rather than static results.
Measurement has to be rigorous. I rely on eval-driven development to benchmark intent coverage and answer quality, then confirm impact with A/B testing designed around a clear minimum detectable effect (MDE). We test ranking tweaks, prompt variants, and new answer types (short snippets vs. deep dives) to isolate what actually moves activation and Net Revenue Retention (NRR). If it doesn’t change behavior or dollars, it’s noise.
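Designing a test around a minimum detectable effect mostly comes down to knowing how many users each arm needs. As a rough sketch, the function below uses the standard normal-approximation formula for comparing two proportions; the 20% baseline and 2-point lift are illustrative numbers I picked, not figures from our experiments.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(baseline: float, mde: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users per arm to detect an absolute lift of `mde` over `baseline`."""
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)

# Illustrative inputs: 20% baseline activation, 2-point absolute lift.
n = sample_size_per_arm(baseline=0.20, mde=0.02)
n_smaller_effect = sample_size_per_arm(baseline=0.20, mde=0.01)
```

Halving the MDE roughly quadruples the required sample, which is why I settle on the smallest effect worth acting on before the test starts rather than after.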
The operating model matters as much as the model weights. Cross-functional product trios pair continuous discovery and journey mapping with a lightweight content audit cadence. The CRO role partners with data science to align search KPIs to revenue goals, and solutions engineering ensures CRM integration and downstream systems reflect what users discover. This keeps the system honest: every improvement is traceable from insight to impact.
Finally, governance and scale are non-negotiable. Privacy-by-design, clear data governance, and observability protect trust while feature flags and CI/CD let us iterate safely. When the fundamentals are strong, we can confidently expand into richer experiences, such as proactive recommendations, in-app guides, and voice AI agent handoffs, without sacrificing reliability or compliance.
If your AI search still feels like a black box, it’s time to turn it into a transparent, revenue-linked growth channel. Make the work visible, measure what matters, and let sentiment and behavior guide the roadmap. The payoff is real: better answers, faster activation, and a content system that learns—and sells—every day.