Tag: Agent Analytics

  • Inside 27,000 AI Sessions: What Real Users Taught Me About Designing High-Trust Agents

    Over the past quarter, I’ve been obsessed with a simple question: how do real people actually prompt AI agents when the stakes are high and the clock is ticking? We analyzed 27K sessions with Amplitude's Global Agent using our Agent Analytics tool. Here's what we found out about how real users are prompting our agent. That short summary belies months of careful instrumentation, qualitative review, and product debates, and it forever changed how I design agent experiences.

    The clearest pattern I saw: users don’t craft “perfect” prompts—they co-create with the agent. Most sessions began with a broad intent, then tightened through rapid, iterative turns. The winning structure emerged as context, command, and constraints. When our agent acknowledged context first, clarified the command, and reflected constraints back, users responded with noticeably more confidence. It reinforced what great prompt engineering already teaches, but grounded in lived behavior across thousands of journeys.

    Trust was the next breakthrough. People wanted transparency on capabilities, a concise first answer, and an easy path to deeper detail and sources. They frequently asked the agent to show its work, summarize trade-offs, or restate assumptions in plain language. Instrumenting observability into the agent’s reasoning artifacts—without overwhelming the user—proved foundational for building credibility session by session.

    On task complexity, users fared best when the agent orchestrated a few small, verifiable steps rather than one heroic leap. Retrieval-first pipeline patterns consistently reduced confusion and rework, especially when paired with strong context window management. The more the agent proactively chunked the problem, validated intermediate outputs, and offered next-best actions, the smoother the journey—and the more reusable the prompts became.

    UX nudges mattered as much as model quality. Inline examples (“Try this”), one-click refinements (“Shorter,” “Add a table,” “Cite sources”), and lightweight guardrails kept momentum high without boxing users in. When the agent made uncertainty explicit and offered safe fallbacks, abandonment dropped and users explored more ambitiously. The experience felt less like “querying a model” and more like collaborating with a capable teammate.

    From a product management lens, these insights shape how I prioritize agentic AI. I’m doubling down on: scaffolded prompts that lead with context and constraints; transparent citations and assumptions; multi-step plans that the user can edit; and evaluation loops that A/B test prompt templates, tool strategies, and response formats. I’m also investing in analytics that connect session patterns to activation, speed-to-value, and retention so we can run eval-driven development, not opinion-driven debates.

    If you’re building agents into a core product workflow, start by designing for iterative co-creation, not one-shot brilliance. Offer progressive disclosure, keep the first answer tight, and make verification effortless. Shape the model with retrieval-first strategies, manage your context window like a scarce resource, and treat observability as a feature, not a debug tool. Most of all, let real usage guide your roadmap—these 27K sessions reminded me that the best agent UX is learned alongside our users, not imagined in isolation.


    Inspired by this post on Amplitude – Perspectives.


  • How We Taught Agentic AI to Speak Product Analytics—and Unlocked Actionable Insights

    I set out to solve a deceptively simple problem: help our teams ask product questions in plain English and get trustworthy, analysis-grade answers—fast. That required more than a powerful model; it demanded agents that genuinely understand the language of product analytics, from behavioral analytics nuances to the messy reality of event taxonomies, funnels, and cohorts. In this post, I share how we engineered agentic AI that speaks our domain fluently and turns questions into decisions.

    The core challenge wasn’t data volume or dashboard sprawl; it was semantics. Different teams said “activation,” “onboarding,” or “first value” and meant overlapping but distinct things. Our PMs, analysts, and engineers navigated a maze of synonyms across Amplitude analytics, Pendo, and our unified analytics platform. Generic LLMs stumbled on these nuances, so we built a shared ontology—driver trees anchored to a clear North Star—with canonical definitions for activation, retention, and conversion, plus consistent event naming and cohort logic.

    We started with a rigorous metric catalog: every KPI linked to its drivers, exact formulas, cohorts, and time windows; every event mapped to a product taxonomy; every dashboard and SQL snippet versioned with ownership and lineage. That catalog became the ground truth for agents. We embedded data governance and privacy-by-design from the start—permissioning for fields and queries, PII redaction, and scoped access that reflected how product teams actually work.

    Next, we built a retrieval-first pipeline to ground the agents in our corpus before generation. We indexed metric definitions, dashboards, experiment readouts, runbooks, and high-signal Slack threads so the agent could cite relevant artifacts, not just predict plausible text. With careful context window management and prompt engineering, the agent retrieves definitions and prior analyses, then plans multi-step actions: run a query, compare cohorts, check “minimum detectable effect (MDE)” for an A/B test, and summarize findings with references.
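
    As a minimal sketch of that retrieve-then-generate order (not our production pipeline), the example below ranks a tiny in-memory corpus by word overlap and builds a prompt that puts the retrieved sources ahead of the question. The `Artifact` type, the ids, the sample texts, and the overlap scorer are all hypothetical stand-ins; a real system would more likely use a search or embedding index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """One indexed item, e.g. a metric definition or experiment readout."""
    artifact_id: str
    kind: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip simple punctuation."""
    return {word.strip(".,:;?!()\"'").lower() for word in text.split()} - {""}


def retrieve(question: str, corpus: list[Artifact], top_k: int = 2) -> list[Artifact]:
    """Rank artifacts by word overlap with the question (toy scorer)."""
    query = tokenize(question)
    scored = [(len(query & tokenize(a.text)), a) for a in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    # Drop artifacts that share no words with the question.
    return [a for score, a in ranked[:top_k] if score > 0]


def build_grounded_prompt(question: str, sources: list[Artifact]) -> str:
    """Place retrieved sources ahead of the question and ask for citations."""
    context = "\n".join(f"[{a.artifact_id}] ({a.kind}) {a.text}" for a in sources)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )


# Hypothetical corpus entries, invented for illustration only.
corpus = [
    Artifact("metric:activation", "metric definition",
             "Activation means a new user completes first value within 7 days."),
    Artifact("readout:exp-12", "experiment readout",
             "The onboarding checklist experiment raised activation for new users."),
    Artifact("runbook:billing", "runbook",
             "Steps to reconcile billing events in the warehouse."),
]

question = "How did activation change for new users?"
sources = retrieve(question, corpus)
prompt = build_grounded_prompt(question, sources)
print(prompt)
```

    Only the two artifacts that share words with the question reach the prompt, so the model is asked to answer from cited sources rather than from memory.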

    Architecturally, we treated this as “Agent Analytics”: an orchestrator that selects tools based on intent—querying Amplitude analytics or Pendo for behavioral paths and funnels, hitting our warehouse for cohort tables, or pulling experiment metadata and anomaly detection alerts. Tool use is permission-aware, auditable, and designed to fail safe. The agent’s outputs include citations back to the exact definitions, dashboards, and SQL used, so reviewers can validate and iterate.

    Quality came from eval-driven development, not intuition. We built a gold set of representative product questions (activation inflections, retention analysis by segment, funnel drop-offs after feature launches) and scored the agent on faithfulness to definitions, numerical accuracy, latency, and actionability. We incorporated regression checks to catch drifts after schema changes, and we tuned prompts to reduce overconfident answers and push for clarifying questions when context was missing.
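
    To make the gold-set idea concrete, here is a small sketch under stated assumptions: it scores a stubbed agent on numerical accuracy and on whether it cites the required definition, then flags any metric that drops against a baseline. The `GoldCase` and `AgentAnswer` types, the questions, the numbers, and the thresholds are hypothetical; latency and actionability scoring are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GoldCase:
    question: str
    expected_value: float    # reviewed reference answer for the metric
    required_citation: str   # definition the answer must cite


@dataclass(frozen=True)
class AgentAnswer:
    value: float
    citations: tuple[str, ...]


def evaluate(agent: Callable[[str], AgentAnswer], gold: list[GoldCase],
             tolerance: float = 0.01) -> dict[str, float]:
    """Score numerical accuracy and faithfulness to definitions on the gold set."""
    accurate = faithful = 0
    for case in gold:
        answer = agent(case.question)
        if abs(answer.value - case.expected_value) <= tolerance:
            accurate += 1
        if case.required_citation in answer.citations:
            faithful += 1
    total = len(gold)
    return {"accuracy": accurate / total, "faithfulness": faithful / total}


def regressed(current: dict[str, float], baseline: dict[str, float],
              max_drop: float = 0.05) -> list[str]:
    """Return the metrics that fell more than max_drop below the baseline."""
    return [name for name, score in current.items()
            if baseline[name] - score > max_drop]


def stub_agent(question: str) -> AgentAnswer:
    """Stand-in for the real agent; returns canned, made-up answers."""
    canned = {
        "What was week-1 retention for the March cohort?":
            AgentAnswer(0.42, ("metric:retention",)),
        "What is the signup-to-activation conversion rate?":
            AgentAnswer(0.30, ()),  # right number, but no citation
    }
    return canned[question]


gold = [
    GoldCase("What was week-1 retention for the March cohort?",
             0.42, "metric:retention"),
    GoldCase("What is the signup-to-activation conversion rate?",
             0.30, "metric:conversion"),
]

scores = evaluate(stub_agent, gold)
flagged = regressed(scores, baseline={"accuracy": 1.0, "faithfulness": 1.0})
print(scores, flagged)
```

    Running the same check after every schema or prompt change is what turns the gold set into a regression test: here the stub gets both numbers right but skips one citation, so only faithfulness is flagged.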

    Safety and reliability were non-negotiable. We layered AI risk management with role-based access, guardrails that block destructive queries, and risk scoring for unfamiliar joins or sudden spikes in metric deltas. The agent logs every step—what it retrieved, which tools it called, and why—so analysts can replay and refine the chain of thought with transparent provenance.

    The payoff: product teams now self-serve nuanced questions in minutes instead of days, and our analysts spend more time on discovery than report wrangling. Retention analysis improved as the agent standardized cohort logic; conversion investigations accelerated thanks to consistent funnel definitions; and cross-functional decisions aligned around the same driver trees and shared language. Most importantly, the agent turned ambiguous asks into structured analyses that stand up to scrutiny.

    For fellow product leaders, my lesson is simple: start with semantics, not models. A crisp ontology, disciplined taxonomy, and clear ownership will outperform a flashy stack riddled with ambiguity. Avoid technology FOMO; favor retrieval-first grounding, small sharp tools, and continuous discovery with your product trios. When your organization speaks a common analytics language, agents can finally think with you, not just for you.

    Next, we’re extending the agent’s planning skills to recommend experiment designs, estimate power and “minimum detectable effect (MDE),” and propose driver-tree-informed bet sizing. We’re also tightening feedback loops so every accepted answer, edit, or override strengthens the retrieval corpus and evaluations. The vision: a calm, reliable layer that makes rigorous product analytics feel conversational—and helps teams move from questions to confident action.


    Inspired by this post on Amplitude – Best Practices.


  • Stop Drowning in Tasks: How AI Marketing Agents Restore Focus and Maximize Impact

    Every week I meet marketers who are working harder than ever—more campaigns, more content, more dashboards—yet seeing less movement on metrics that matter. The surge of AI tooling has amplified activity, not necessarily impact. That’s the focus problem: we confuse motion with momentum, and our backlogs look great while our outcomes stall.

    Learn how AI agents for marketing can help you prioritize impact so you can do important work, instead of just more work.

    In my role leading product and growth teams, I’ve learned that AI only compounds value when it is pointed squarely at outcomes. If we don’t define what “good” looks like, agentic AI will simply scale busywork. The antidote is a disciplined operating model that connects strategy to execution and instruments agents with clear success criteria.

    First, anchor your program with outcomes vs output OKRs. Choose one or two measurable business outcomes—such as qualified pipeline, conversion rate, or activation—and make everything else subordinate. This provides the compass agents need to make effective trade-offs when speed and volume tempt you to do “one more thing.”

    Second, map a driver tree from the target outcome down to the controllable levers: audience segments, offers, channels, messaging, and experience friction. This traceability shows where agents can move the needle fastest—whether that’s accelerating research, sharpening positioning, or eliminating handoffs that slow experimentation.

    Third, design a small, agentic AI workforce aligned to those levers. For example: a Research Agent that synthesizes market insights and past performance; a Copy Agent that generates on-brief, on-brand variants; a Distribution Agent that adapts content to each channel and schedules posts; and an Analytics Agent that runs A/B tests, summarizes results, and flags anomalies. Keep human oversight where judgment matters most—strategy, brand voice, and high-stakes decisions.

    Fourth, instrument rigor from day one with Agent Analytics and eval-driven development. Define offline evals for brand consistency, factuality, safety, and response time; pair them with online experiments that quantify lift on your target outcomes. Set a minimum detectable effect (MDE) so you stop shipping changes that cannot plausibly move the metric.
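
    For a sense of scale, the MDE for a conversion-rate test can be approximated before launch. The sketch below assumes a two-sided test with two equally sized arms and uses the common normal approximation with the baseline variance in both arms; the function name, the 10% baseline, and the 5,000 users per arm are illustrative assumptions, not figures from any real campaign.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(baseline_rate: float, users_per_arm: int,
                              alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate absolute MDE for a two-sided, two-arm conversion test.

    Normal approximation with the baseline variance in both arms:
    MDE ~= (z_{1 - alpha/2} + z_{power}) * sqrt(2 * p * (1 - p) / n)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    standard_error = sqrt(2 * baseline_rate * (1 - baseline_rate) / users_per_arm)
    return (z_alpha + z_power) * standard_error


# Hypothetical inputs: 10% baseline conversion, 5,000 users in each arm.
mde = minimum_detectable_effect(baseline_rate=0.10, users_per_arm=5_000)
print(f"Absolute MDE: {mde:.4f}")
```

    With these inputs the approximation comes out near 1.7 percentage points, so a change expected to lift conversion by half a point would not be reliably detectable at this sample size.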

    Fifth, operationalize your AI workflows. Standardize prompts, inputs, and handoffs; templatize briefs and acceptance criteria; and keep a change log so improvements compound rather than reset. Use short, frequent feedback loops to prune low-impact work and double down on what demonstrably advances your objectives.

    I’ve seen teams reclaim focus and momentum when they treat agents as teammates, not toys. The magic isn’t in producing more assets—it’s in consistently choosing the next best action in service of a clear outcome. When you combine outcome clarity, a driver tree, targeted agents, and tight evals, AI becomes a force multiplier for marketing impact.

    If you’re feeling overwhelmed by AI’s possibilities, start small: commit to one outcome, one driver you believe is material, and one agent designed for that job. Prove lift, codify the workflow, then scale. Velocity is only valuable when it’s pointed in the right direction.


    Inspired by this post on Amplitude – Best Practices.


  • Unlocking AI’s Black Box: How Monitors and Scorecards Elevate CX with Confidence

    I followed the energy at Fin Labs Paris and immediately zeroed in on the announcement of Monitors. In my view, it’s the missing piece that turns Fin’s powerful automation into an observable, trustworthy system—sitting alongside Insights and Recommendations to form a complete observability suite that gives teams confidence in what Fin is doing.

    With Monitors, you define which conversations get reviewed, both Fin and human, and set evaluation criteria using Custom Scorecards. That level of control ensures you’re measuring the metrics that matter most to your business and holding support quality to your bar, not a generic one.

    Used in concert with Insights and Recommendations, you can finally see what’s happening across your support operation, evaluate every conversation against your standards, and take targeted action to continuously move toward perfect customer experiences.

    As Agents become more powerful, transparency and control become critical. I’ve seen this shift firsthand: AI is advancing fast, and the stakes are no longer theoretical—Agents are resolving real customer issues with real consequences at scale.

    Visualizing the AI development flywheel—Train, Test, Deploy, Analyze—this graphic spotlights Analyze in orange to introduce Monitors, turning opaque model behavior into measurable signals and continuous customer service insights.

    Fin has almost 8,000 customers, averages a 67% resolution rate, and resolves close to 2 million customer queries every single week, including highly complex queries in regulated industries.

    At that scale, observability isn’t a nice-to-have; it’s a necessity. Traditional CSAT and small QA samples weren’t built for Agent-led operations—they miss edge cases, don’t scale, and can’t explain drift. The result is a black box. What teams need most right now is confidence, built on data you can trust and act on.

    At Intercom, this is called the Fin Flywheel: Train, Test, Deploy, Analyze.

    See inside Intercom's Monitors: a streamlined dashboard with pass‑rate charts and review queues, alongside a panel to define a 'Vulnerable customers' monitor, test it on sample chats, and run continuous checks.

    Analyze is the step where you find out what’s actually happening, and it’s where improvement begins.

    In my experience, achieving confidence in an AI support operation requires three things: (1) a complete understanding of what Fin, your human team, and your customers are talking about; (2) a way to monitor and score conversations based on the criteria that matter most to your business; and (3) AI-powered recommendations that make it easy to act on what you find. Intercom launched Insights and Recommendations to address the first and third. Now, Monitors completes the system for full observability and opens the black box.

    Monitors: know whether every conversation met your standards. Customer sentiment is important, but it’s different from determining whether a conversation was handled correctly. With Monitors, you can do both—and do it at scale.

    Monitors is a new QA capability that delivers a structured, repeatable way to define which conversations get reviewed and evaluate them against quality criteria you set. It replaces ad-hoc sampling and spreadsheet-driven QA with a system that scales as your volume grows.

    Two components work together: Monitors define what gets reviewed and Custom Scorecards define how each conversation is evaluated. That pairing brings the rigor of Agent Analytics and the discipline of eval-driven development to everyday CX operations.

    Random sampling has always been a blunt tool. When AI is handling thousands of conversations a week, a small, arbitrary slice won’t reliably capture your highest-risk edge cases, your most complex escalations, or where quality is starting to drift. I’ve felt that pain in operations reviews—too many unknowns, not enough signal.

    Open the AI black box with Monitors: track conversations, triage unreviewed items, and build transparent scorecards with criteria like accuracy, process adherence, and efficiency to lift customer support quality.

    With Monitors, you select and evaluate conversations with intent. You can target specific signals of risk or failure, like “the customer showed signs of financial vulnerability” or “Fin looped around with the same answer without resolving the issue.” Or you can create consistent, repeatable samples to benchmark quality over time. Use the existing library of filters (customer data, channel, Fin-specific metrics) or describe nuanced scenarios in natural language. Most teams will do both: home in on the conversations that matter most and maintain a steady, structured QA sample each week.

    "When I saw Monitors, my first reaction was — this is exactly what we need. The ability to track quality continuously, instead of relying on spot checks, is a big shift for us." Ineke Oates, Head of Support, Agorapulse

    Custom Scorecards make your standards explicit and enforceable. One-size-fits-all rubrics never reflect your brand voice, industry constraints, or customer expectations. With Custom Scorecards, you define what “good” looks like for your business and turn that into a measurable, comparable quality score for every conversation.

    You define the criteria that matter, how each should be measured, and how important each one is. Some criteria can be scored automatically by AI, others reviewed by a human, or both, all within the same scorecard. This means you’re not choosing between scale and judgment; you get both in one system.

    Each conversation is then evaluated against these criteria, and the system calculates an overall quality score based on your configuration. You can weigh what matters most, or mark certain criteria as critical, so a single failure can fail the entire evaluation when needed.
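
    To illustrate the logic described here, the following is a rough sketch of a weighted score with critical criteria. It is not Intercom's actual calculation, which I have not seen; the `Criterion` type, the weights, and the 0.6 pass threshold are assumptions chosen only to show how a single critical failure can override an otherwise passing score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    weight: float
    critical: bool = False  # a failed critical criterion fails the whole review


def score_conversation(criteria: list[Criterion], results: dict[str, bool],
                       pass_threshold: float = 0.6) -> tuple[float, bool]:
    """Return (weighted score between 0 and 1, overall pass/fail)."""
    total_weight = sum(c.weight for c in criteria)
    earned = sum(c.weight for c in criteria if results[c.name])
    score = earned / total_weight
    critical_failure = any(c.critical and not results[c.name] for c in criteria)
    passed = score >= pass_threshold and not critical_failure
    return score, passed


# Hypothetical scorecard: accuracy is critical, the other criteria are weighted only.
scorecard = [
    Criterion("accuracy", weight=2, critical=True),
    Criterion("process adherence", weight=5),
    Criterion("efficiency", weight=3),
]

slow_but_correct = {"accuracy": True, "process adherence": True, "efficiency": False}
fast_but_wrong = {"accuracy": False, "process adherence": True, "efficiency": True}

score_a, passed_a = score_conversation(scorecard, slow_but_correct)
score_b, passed_b = score_conversation(scorecard, fast_but_wrong)
print(score_a, passed_a)
print(score_b, passed_b)
```

    The second conversation earns the higher weighted score yet still fails, because the inaccurate answer trips the critical criterion.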

    The result is a single, consistent quality score that reflects your standards—not a generic metric, and not a collection of disconnected checks. That’s what makes quality measurable over time and comparable across AI and human support.

    Monitors helps open the AI black box by turning model outputs into trackable reviews. This clean queue groups customers, monitor types, scores, and actions—with AI auto-review—so teams improve quality faster.

    There’s an important distinction here: CX Score tells you how customers felt about a conversation. Custom Scorecards tell you whether it met your standards. You need both.

    "We looked at dedicated QA tools, but what's compelling about Monitors is that it lives where our conversations already happen. We don't need another system — we can run QA across Fin and our human team in one place." Jared Ellis, Senior Director, Global Product Support, Culture Amp

    When a conversation meets your criteria for review, Monitors routes it into a Review Queue. Each conversation is assigned to the right reviewer with its scorecard attached and status tracked end to end: Not reviewed, Reviewed, Needs a fix, Fix complete. Reviewers work directly in Intercom, capture what went wrong, and propose concrete fixes—like updating documentation or refining a workflow—so quality loops end in action, not just scores.

    Monitors turn AI performance from opaque to measurable. The Fin quality view summarizes review score, pass rate, and review counts while a time‑series chart tracks escalation ease, clarification, and efficiency—delivering fast, actionable CX insights.

    Reporting turns QA into a continuous signal rather than a one-off audit. You can track review scores over time across Monitors and Scorecards, and compare them directly to CX Score, resolution rate, and other performance metrics. Patterns that were previously invisible become clear: a topic consistently underperforming, a quality dip correlated with a recent knowledge base change, or a team whose scores are improving week over week. This is observability applied to CX—evidence you can act on.

    Monitors for Fin conversations is live today, and the roadmap goes further. Human agent QA will bring the same structured evaluation to your human team’s conversations, creating one consistent quality system across your entire support operation.

    Real-time alerts will notify you the moment a conversation crosses a threshold you’ve defined—before the issue reaches more customers and risks compounding negative sentiment.

    Knowledge base evaluation will connect AI scoring directly to your content so conversations are assessed against your latest policies and documentation, catching inaccurate or outdated responses and providing clear rationale linked to the relevant source.

    Creating a perfect customer experience with AI requires transparency. You need to understand how the system is performing if you want to maintain and improve quality over time. With Insights, Monitors, and Recommendations, this is now possible: a complete analysis suite that lets you see what’s happening across every conversation, ensure it meets your standards, and pinpoint improvement opportunities when they matter most.

    I’ve long advocated for a retrieval-first, eval-driven approach to AI Strategy because it makes risk visible and manageable. Monitors operationalizes that philosophy for CX leaders: you get continuous signal, shared definitions of quality, and a direct path from flags to fixes. If you’re scaling AI support, this is how you replace uncertainty with control—and turn the black box into a competitive advantage.


    Inspired by this post on The Intercom Blog.


  • Stop Flying Blind with AI Agents: Put Users at the Center with Pendo Agent Analytics

    I’ve watched too many AI agent deployments celebrate velocity while overlooking the one thing that determines long-term success: whether real users are actually getting value. Dashboards tend to spotlight model upgrades, prompt tweaks, and launch counts, yet they rarely quantify task completion, trust, or time-to-value. That blind spot isn’t technical—it’s human.

    Enterprises are spending 93% of their AI budget building agents and almost none know if those agents are actually working for users. Pendo Agent Analytics closes the gap.

    In my product reviews, I look for evidence that agentic AI is improving outcomes across the customer journey, not just the demo path. Without behavioral analytics and observability, teams optimize for throughput instead of resolution, for novelty instead of reliability. This is where eval-driven development, A/B testing, and rigorous cohort analysis become non-negotiable: they translate agent performance into user impact we can measure and improve.

    Here’s the pattern that works for me: define user-centric success metrics first, then let the AI follow. I prioritize signals like successful task completion, low-friction activation, reduced escalations, and sentiment lift—tied directly to product-led growth indicators such as retention and expansion. When these metrics move in the right direction, I know the agent is creating compounding value, not just answering faster.

    Practically, I operationalize this with an analytics spine that captures end-to-end agent interactions: intents, prompts, responses, clarifying turns, handoffs, and final outcomes. I segment by persona, journey stage, and account tier to uncover where agents delight and where they degrade trust. With this foundation, I can run controlled experiments, spot anomalies early, and connect improvements in agent behavior to improvements in business performance.
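
    A stripped-down version of that spine might look like the sketch below: one record per agent session, plus a completion rate cut by persona. The `AgentSession` fields and the sample rows are hypothetical and do not reflect Pendo's data model; the point is only that segmenting outcomes requires capturing them alongside the interaction details.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSession:
    session_id: str
    persona: str
    intent: str
    clarifying_turns: int
    handed_off: bool   # escalated to a human
    completed: bool    # user finished the task they came for


def completion_by_persona(sessions: list[AgentSession]) -> dict[str, float]:
    """Share of sessions per persona that ended in a completed task."""
    totals: dict[str, int] = defaultdict(int)
    completed: dict[str, int] = defaultdict(int)
    for session in sessions:
        totals[session.persona] += 1
        completed[session.persona] += int(session.completed)
    return {persona: completed[persona] / totals[persona] for persona in totals}


# Made-up sample rows for illustration.
sessions = [
    AgentSession("s1", "admin", "configure sso", 3, True, False),
    AgentSession("s2", "admin", "invite teammate", 0, False, True),
    AgentSession("s3", "analyst", "build funnel", 1, False, True),
    AgentSession("s4", "analyst", "export report", 0, False, True),
]

rates = completion_by_persona(sessions)
print(rates)
```

    Swapping `persona` for journey stage or account tier gives the other cuts mentioned above, and the same records can feed controlled experiments later.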

    Pendo Agent Analytics closes the loop by making these user outcomes visible and actionable. Instead of guessing whether an agent helped or hindered, I can analyze where users stall, which prompts or skills drive completion, and how interventions like in-app guides or product tours change behavior. That visibility lets me tune models and experiences in days, not quarters—and gives stakeholders confidence that our AI investments are paying off for customers.

    If you’re scaling agents today, start small but instrument deeply: map top user intents, define offline and online evals, A/B test prompts and policies, monitor regressions, and tie every improvement to activation, adoption, and retention. The result is a durable feedback loop that keeps agents aligned with user value as your surface area grows.
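
    For the A/B testing step, one common way to check whether a prompt variant really moved task completion is a two-proportion z-test. The sketch below is a generic statistical illustration with made-up counts, not a description of any vendor's experimentation feature.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """Compare completion rates of variant A and variant B.

    Returns (z, two_sided_p_value) using the pooled standard error.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 400 of 1000 sessions completed with prompt A,
# 460 of 1000 with prompt B.
z, p_value = two_proportion_z(400, 1000, 460, 1000)
print(round(z, 2), round(p_value, 4))
```

    A small p-value suggests the difference is unlikely to be noise, but it says nothing about whether the lift carries through to activation or retention, so those still need to be tracked separately.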

    AI agents are not a destination—they’re a capability. When we anchor that capability to clear user outcomes and measure it with the right analytics, we stop flying blind and start compounding advantage. That’s how we turn promising demos into dependable products.


    Inspired by this post on Pendo – Best Practices.


  • Kaizen for the AI Era: Tiny Daily Wins That Build Smarter, Scalable Customer Support

    Kaizen for the AI Era: Tiny Daily Wins That Build Smarter, Scalable Customer Support

    Every day, I challenge my teams to make one small, meaningful improvement—something so lightweight it’s impossible to ignore and easy to repeat. That tiny daily motion compounds, and over time it reshapes customer experience, operational quality, and team culture.

    That’s the essence of Kaizen, the Japanese philosophy of continuous improvement. Developed in post-war Japan and popularized by companies like Toyota, Kaizen proves that small, steady changes lead to significant long-term results. In product management and customer support, this approach transforms big ambitions into daily behaviors that actually stick.

    Crucially, Kaizen isn’t passive or unstructured. It thrives on three principles I reinforce across my org. First, small changes reduce resistance—when you lower the activation energy, teams move faster. Second, improvement is continuous, not occasional; instead of waiting for quarterly reviews or major releases, you ask: “What can we improve right now?” Third, everyone participates—the people closest to the work are best positioned to improve it. That’s how momentum spreads.

    In practice, the cycle is simple: identify a small problem, test the change, measure the result, refine, and repeat. The point isn’t radical transformation in a single swing; it’s steady progress guided by data and observation—a rhythm that aligns beautifully with eval-driven development and continuous discovery.

    At Intercom, we apply this same philosophy to how we manage our Agent Fin through a process we call the “Fin Flywheel”. Here’s how this works.

    Train: Teach Fin how to handle and resolve the most complex customer queries.

    Test: Run fully simulated customer conversations from start to finish to see exactly how Fin will behave before going live.

    Deploy: Launch Fin across all channels so customers get consistent support wherever they reach out.

    Analyze: Use AI-powered insights to review and improve Fin’s performance so it can deliver better customer experiences.

    This isn’t a one-time setup; it’s a continuous loop where every interaction feeds ongoing improvement. Rather than deploying AI and assuming it will perform as expected, improvement is built into the system itself. The more Fin is used, the better it gets. That’s the hallmark of agentic AI done right—tight feedback loops, purposeful conversation design, and clear Agent Analytics that illuminate what to tune next.

    But continuous improvement doesn’t stop with AI. Within our Human Support operations, I emphasize the same mindset that drives great LLMs for product managers: you instrument the experience, learn from real usage, and close gaps fast. We operate with a simple mindset: the first time that you solve a customer issue should be the last time it happens.

    When a conversation reaches a human, we pause to diagnose and prevent recurrence. Why did this reach me? Why couldn’t Fin resolve it? How can we prevent this from happening again? Those questions anchor a culture of root-cause thinking and accelerate product-led growth by removing friction at the source.

    To make this effortless, we’ve built a lightweight, AI-powered way to log suggestions in the moment—no long explanations or heavy admin required. Ideas are reviewed quickly and implemented by subject matter experts or by the team themselves. This keeps the flywheel spinning: insights flow in, fixes go out, and measurable outcomes improve.

    The result is a frontline that evolves from reactive problem-solvers into a proactive improvement engine. The people closest to customers spot friction, suggest fixes, and see their insights shaped into meaningful change. It’s continuous discovery embedded in everyday work, not a side project.

    Kaizen demonstrates that lasting progress doesn’t come from occasional transformation; it comes from intentional, everyday refinement. The “Fin Flywheel” applies that philosophy to AI. Our Human Support continuous improvement process applies it to human insights. Together, they create a shared system where both people and AI learn continuously from customer interactions.

    When improvement is built into the mechanics of how you work, it stops being a one-off project and becomes an ingrained capability. Over time, those small daily improvements don’t just add up; they compound into a sustainable, data-driven advantage that elevates customer experience and differentiates your customer support AI strategy.


    Inspired by this post on The Intercom Blog.


  • We Built Agent Analytics After Observability Broke—Why Your AI Team Needs It Now

    We Built Agent Analytics After Observability Broke—Why Your AI Team Needs It Now

    I remember the exact moment our product crossed the threshold from scripted automation to truly agentic AI. The excitement was real—so was the pit in my stomach when our dashboards went dark. Our trusted analytics and observability stack, which had served us flawlessly for traditional software, suddenly couldn’t explain what the agent was doing, why it made certain choices, or how to reproduce outcomes across runs.

    "The moment our product became an AI agent, our entire observability stack became irrelevant, which is not something you want as an analytics company. Here's what we did."

    Why does this happen? Agentic AI doesn’t behave like conventional apps. Instead of deterministic flows and neatly tagged events, we face non-deterministic trajectories, tool-use chains, evolving prompts, context window dynamics, and policy guardrails that influence outcomes in real time. Clicks and pageviews give way to tokens, tool calls, and conversation turns. Without purpose-built observability, you can’t do credible product discovery, measure behavioral analytics, or run eval-driven development with confidence.

    That’s why we built Agent Analytics. We needed a unified lens to trace every step of an AI workflow—from user intent to model prompts, function calls, retrievals, tool outputs, and final responses—while capturing latency, cost, guardrail hits, fallbacks, and outcome tags. We instrumented runs end-to-end, added experiment support for prompt engineering and policy variants, and wired in evaluations so we could turn subjective quality into objective signals the team could act on.

    The impact on product management was immediate. We shortened iteration cycles by making failure states obvious and reproducible, turned ambiguous feedback into structured data, and gave engineers and designers a shared source of truth for conversation design and AI workflows. With visibility into containment, escalation, autonomy ratio, and step-level success, we could ship confidently, roll back safely, and align roadmap bets to measurable outcomes, not anecdotes.

    Building this capability demanded more than logging. We invested in data governance and privacy-by-design to mask sensitive content while preserving semantic context, and we separated human-identifiable data from model telemetry. We treated prompts and policies like code—versioned, diffable, and safely rolled out behind feature flags and CI/CD—so we could experiment without risking regressions in production.
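
    As a rough illustration of treating prompts like code behind a feature flag, the Python sketch below keeps prompt versions in a registry and assigns users to a version with a stable hash. The registry contents, the flag name, and both function names are hypothetical; a real setup would use your own flag service and version control.

```python
import hashlib

# Hypothetical prompt registry; in practice this would live in version control.
PROMPT_VERSIONS = {
    "v1": "You are a support agent. Answer concisely.",
    "v2": "You are a support agent. Answer concisely and cite sources.",
}


def rollout_bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def select_prompt_version(user_id: str, rollout_percent: int,
                          flag_name: str = "prompt_v2") -> str:
    """Serve v2 to roughly `rollout_percent` of users, v1 to the rest."""
    if rollout_bucket(user_id, flag_name) < rollout_percent:
        return "v2"
    return "v1"


version = select_prompt_version("user-123", rollout_percent=10)
print(version, "->", PROMPT_VERSIONS[version])
```

    Because the bucket depends only on the flag name and user ID, a user sees the same prompt version on every run, and rolling back is a matter of setting the percentage to zero.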

    What should every team measure? Start with outcome quality (task success, resolution, containment), reliability (tool success rate, guardrail triggers, fallbacks), performance (time-to-first-token, total latency, step-level latency), and efficiency (tokens and cost per successful task). Add groundedness checks for retrieval steps, regression evals for core journeys, and post-release anomaly detection to catch drift before users do. These metrics become your operating system for agent performance and your compass for product strategy.
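
    Several of these metrics fall out of a simple per-run record. The Python sketch below assumes a hypothetical `AgentRun` shape and defines containment as the share of runs that never escalated to a human; both are my assumptions for illustration, and your definitions may differ.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    """Telemetry for one agent run (hypothetical fields)."""
    task_succeeded: bool
    escalated_to_human: bool
    tool_calls: int
    tool_failures: int
    cost_usd: float


def summarize(runs):
    """Aggregate outcome, reliability, and efficiency metrics over runs."""
    n = len(runs)
    successes = sum(r.task_succeeded for r in runs)
    contained = sum(not r.escalated_to_human for r in runs)
    tool_calls = sum(r.tool_calls for r in runs)
    tool_failures = sum(r.tool_failures for r in runs)
    total_cost = sum(r.cost_usd for r in runs)
    return {
        "task_success_rate": successes / n,
        "containment_rate": contained / n,
        "tool_success_rate": (tool_calls - tool_failures) / tool_calls if tool_calls else None,
        "cost_per_successful_task": total_cost / successes if successes else None,
    }


# Made-up sample runs.
runs = [
    AgentRun(True, False, 4, 0, 0.02),
    AgentRun(True, False, 2, 1, 0.01),
    AgentRun(False, True, 3, 1, 0.03),
    AgentRun(True, False, 1, 0, 0.02),
]
summary = summarize(runs)
print(summary)
```

    Dividing total cost by successful tasks, not by all tasks, is deliberate: it makes failed runs show up as higher cost instead of hiding them in the average.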

    If you’re building or scaling AI agents, you need Agent Analytics before you hit your first incident. It’s the difference between guessing and knowing—between reactive firefighting and proactive iteration. With the right observability, your team can move faster, manage risk intelligently, and translate agent behavior into business outcomes that compound over time.


    Inspired by this post on Amplitude – Best Practices.


  • Agentic Architecture Demystified: How Modern AI Systems Plan, Learn, and Execute at Scale

    Agentic Architecture Demystified: How Modern AI Systems Plan, Learn, and Execute at Scale

    In my role leading product teams at HighLevel, I’m often asked to explain what’s really happening behind the scenes of today’s AI products. The short answer is that modern systems are built on an agentic architecture: not just a single model, but a coordinated loop of planning, tool use, memory, and evaluation. Once you see that pattern, the design decisions snap into focus and the roadmap becomes far easier to prioritize.

    At its core, agentic AI treats the model as a reasoning engine embedded within an AI workflow. The agent interprets intent, plans steps, calls the right tools and APIs, grounds itself in trusted data, and then evaluates outcomes before deciding to continue or stop. This loop creates reliability, reduces hallucinations, and enables the system to operate in real-world, multi-step scenarios.

    Here’s the practical lifecycle I rely on. A user provides intent (a goal or request). We run a retrieval-first pipeline to ground the model in accurate, current data. Prompt engineering structures the task and primes the agent with constraints and success criteria while keeping the context window under control. The agent generates a plan, executes steps by calling tools or services, evaluates intermediate results, reflects or revises as needed, and only then returns a final answer with clear citations or evidence.
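
    The lifecycle above can be sketched as a small loop. In the Python below, `retrieve`, `plan`, `execute`, and `evaluate` are toy stand-ins I made up for what would be retrieval, model, and tool calls in a real system; only the control flow is the point.

```python
def retrieve(intent, knowledge_base):
    """Ground the task: keep documents that share a word with the intent."""
    words = set(intent.lower().split())
    return [doc for doc in knowledge_base if words & set(doc.lower().split())]


def plan(intent, evidence):
    """Stand-in for a model call that would produce a list of steps."""
    return [f"summarize: {doc}" for doc in evidence]


def execute(step):
    """Stand-in for a tool call; here it just echoes the document."""
    return step.split(": ", 1)[1]


def evaluate(results):
    """Toy success check: at least one non-empty result."""
    return any(results)


def run_agent(intent, knowledge_base, max_iterations=2):
    """Retrieve, plan, act, evaluate; revise once by widening retrieval."""
    broaden = False
    for _ in range(max_iterations):
        evidence = knowledge_base if broaden else retrieve(intent, knowledge_base)
        steps = plan(intent, evidence)
        results = [execute(step) for step in steps]
        if evaluate(results):
            return {"answer": results, "citations": evidence}
        broaden = True  # revise: widen grounding before retrying
    return {"answer": None, "citations": []}


kb = ["refund policy allows returns within 30 days", "shipping takes 5 days"]
print(run_agent("refund status", kb))
```

    Note that the answer is only returned together with the evidence it was built from, and that the loop has a hard iteration cap so it terminates even when evaluation never passes.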

    For more complex work, I orchestrate multiple specialized agents—commonly a planner, a solver, and a critic—coordinated by a lightweight controller. This multi-agent pattern reduces single-agent blind spots, encourages self-checking, and mirrors how empowered product teams collaborate. Whether it’s conversation design for support flows or a voice AI agent driving hands-free tasks, orchestration is the difference between a clever demo and a dependable product.

    Memory is the second pillar. Short-term working context sits in the prompt, while long-term memory lives in vector stores or databases to track past interactions, preferences, and outcomes. Retrieval augments the model with the right facts at the right time, and tight context window management ensures the agent stays focused on signal, not noise. The result is faster responses, lower costs, and far better accuracy.

    Reliability is earned through eval-driven development and robust AI risk management. I define offline and online evaluations, guardrails, and human-in-the-loop checkpoints before scaling traffic. These evaluations become living, automated tests that protect against regressions as prompts, models, and tools evolve. The payoff is real: fewer escalations, higher trust, and measurable improvements to quality over time.
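
    One lightweight way to turn evaluations into automated regression tests is a set of golden cases with a pass-rate gate. This Python sketch uses a hypothetical `stub_agent` and a naive "must include this phrase" check; real evals would typically use richer graders.

```python
def run_evals(agent_fn, cases, min_pass_rate=0.9):
    """Run each golden case through the agent and gate on the pass rate."""
    failures = []
    for case in cases:
        output = agent_fn(case["input"]).lower()
        if not all(phrase.lower() in output for phrase in case["must_include"]):
            failures.append(case["name"])
    pass_rate = 1 - len(failures) / len(cases)
    return {
        "pass_rate": pass_rate,
        "failures": failures,
        "passed": pass_rate >= min_pass_rate,
    }


def stub_agent(question):
    """Hypothetical agent under test; replace with a real call."""
    if "refund" in question.lower():
        return "Refunds are available within 30 days of purchase."
    return "I'm not sure."


# Made-up golden cases for illustration.
cases = [
    {"name": "refund_window", "input": "What is the refund window?", "must_include": ["30 days"]},
    {"name": "shipping_time", "input": "How long is shipping?", "must_include": ["5 days"]},
]

report = run_evals(stub_agent, cases)
print(report)
```

    Running this in CI whenever a prompt, model, or tool changes is what makes the evaluations "living": a drop below the threshold blocks the rollout instead of surfacing as a customer escalation.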

    From a product strategy perspective, I resist over-engineering. Start with a simple retrieval-first pipeline and a single agent; prove value; then layer in multi-agent orchestration only where it moves key metrics. Instrument everything—latency, cost, grounding coverage, and outcome quality—and build Agent Analytics dashboards so teams can diagnose issues and iterate with confidence.

    If you’re looking for a practical playbook, here’s mine: clarify the user intent and success criteria; design the tools the agent can call; ground with authoritative data; write prompts that constrain scope and define termination conditions; add reflection and automated evaluations; and ship behind feature flags for safe, staged rollout. Each step compounds reliability without killing velocity.

    The diagram and the video above bring these patterns to life. If you watch closely, you’ll see the same loop—plan, retrieve, act, evaluate—show up in every effective implementation, regardless of domain. That repetition isn’t accidental; it’s the backbone of agentic architecture and a blueprint you can adapt to your own stack.

    Ultimately, what matters is outcomes. When we build around agentic AI, we create systems that are explainable to stakeholders, maintainable by engineers, and genuinely helpful to customers. That’s how we move past hype to durable impact—shipping AI products that plan, learn, and execute at scale.


    Inspired by this post on Product School.


  • How We Automated 81% of Customer Support with AI—While Uplifting CX, Speed, and ROI

    How We Automated 81% of Customer Support with AI—While Uplifting CX, Speed, and ROI

    Leading the Support function for a company that builds a leading Agent and AI-forward customer service platform has been, for me, unique, exciting, and yes—daunting. It’s where product ambition meets operational reality, and where every decision I make is immediately tested by customers who expect excellence.

    It’s unique because we use the same technology as our customers. We live in the product every day, which puts us in a privileged position to be the voice of the customer across the organization. That tight feedback loop has shaped how I prioritize, what I build next, and how I measure success.

    It’s exciting because we get to try all of the new features and capabilities of Fin and the Intercom helpdesk. With a relentless focus on AI innovation, I’ve had access to remarkable tools that help us deliver an incredible customer experience—and I’ve seen firsthand how the right workflows and guardrails turn those tools into outcomes.

    And it’s daunting because expectations for our own Customer Support (CS) team are sky high. If we can’t deliver incredible support using our own technology, we undermine its value proposition. That imperative has kept me honest, focused, and fast.

    In our new research, “The 2026 Customer Service Transformation Report,” we’ve been sharing how forward-looking teams use AI to transform their support models. If you’d like to get straight to the report, download it here.

    When Intercom changed its focus in late 2022 to prioritize the customer service use case, we undertook a critical review of the support experience we were delivering and committed to driving meaningful change under an AI-first framework. That was a turning point: I aligned product strategy and operations around a single north star—automate with quality, and elevate humans to higher-value work.

    Three years on, Fin now resolves over 81% of all our customer support volume, delivering immediate and high-quality resolutions. We have absorbed a 300%+ increase in customer demand since 2022 without proportional headcount growth. Without Fin, we would have needed at least 100 additional CS team members to meet that demand and our improved service levels – a net saving to Intercom of between $7.5M–$9M annually.

    Throughout this work, we drew on research from the 2026 Customer Service Transformation Report and applied the lessons directly to our own org design, knowledge management, and AI workflows. What follows is our story of transformation and how we achieved a mature deployment of Fin.

    The problems we set out to solve

    Back in 2022, our challenges looked familiar to any modern support organization, and I knew we needed a step-change—not incremental tweaks.

    We faced increased support demand from new and existing customers: Intercom was launching major features and changes at speed, driving up overall customer conversation volume and requiring additional headcount for the CS team. I could see we were scaling people faster than processes—unsustainable without automation.

    Our support policy (as defined by our service level objectives) was not based on a high bar: In most cases, we were only committed to “business hours” coverage for the majority of our customers, impacting first response times. Even with SLOs that were not considered best in class, we were struggling to meet our commitments. I wanted 24/7 coverage and faster first responses without sacrificing quality.

    We wanted to do more: As we pivoted our strategy, we wanted to open new routes to our support team, such as providing support to website visitors with technical questions and to trial customers. That meant meeting customers earlier in their journey with accurate, on-brand responses—at scale.

    What we did

    We made a very conscious decision to become our own best reference customer. As Intercom embraced the opportunity that generative AI presented to transform customer service, we intentionally moved to an AI-first strategy for our Customer Support team. I set a simple operating principle: ship value quickly, measure relentlessly, and let evidence guide the next bet.

    We started with the highest-volume, informational queries and saw our resolution rates climb quickly. With that foundation in place, we pushed Fin further, training it on deeper documentation and internal procedures, and eventually giving it the ability to take actions on behalf of customers. As Fin took on more complex work, our results started to compound—and trust in the system grew across the organization.

    Early adoption and building trust. When “AI Assist” features came to the Intercom Inbox, the CS team got early exposure to AI and were empowered to provide feedback directly to our product teams. This built awareness and trust across the team about what we were trying to achieve with AI, and helped shape the product roadmap. We were also the first beta customer for Fin, rolling it out to a subset of customers to watch sentiment and outcomes closely. With no adverse reaction and an initial resolution rate of over 25%, we deployed Fin to most customer segments within weeks. I’ll never forget the first week we put Fin in front of real customers—the silence of issues that never reached humans was the loudest signal of success.

    Knowledge management as a product. We recognized quickly that time spent tuning our help center and knowledge assets for Fin would pay dividends. We transitioned our Help Center Manager into a “Knowledge Manager,” with a dedicated remit to optimize content for Fin. We embedded knowledge creation into our “New Product Introduction” (NPI) process, with a target that Fin would resolve at least 50% of customer issues at every new product and feature launch. Over time, we added new sources, including “Developer Documents,” enabling Fin to handle increasingly complex issues. We built a culture of continuous improvement, allocating “out of the inbox” time so every teammate could close content gaps and raise the bar.

    Conversation design end-to-end. To ensure a consistent, high-quality customer experience, we created a new “Conversation Designer” role that owns the journey across automation and human handoffs. Using Intercom’s Workflows, we introduced “skills-based routing” so that when a customer asks for a human, the conversation reaches someone with the right expertise quickly. This is now handled by Fin directly using a feature called “Attributes.” The result: a seamless, on-brand experience regardless of channel or escalation path.
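
    For readers unfamiliar with skills-based routing, the general idea can be sketched in a few lines of Python. This is a generic illustration with made-up teammates and a least-loaded tie-breaker that I chose for simplicity; it does not describe how Intercom’s Workflows or Attributes are implemented.

```python
def route(required_skill, teammates):
    """Pick the available teammate with the skill and the lightest load.

    Returns the teammate's name, or None if nobody qualifies.
    """
    candidates = [
        t for t in teammates
        if t["available"] and required_skill in t["skills"]
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t["open_conversations"])["name"]


# Hypothetical roster.
teammates = [
    {"name": "Alex", "skills": {"billing", "api"}, "available": True, "open_conversations": 4},
    {"name": "Sam", "skills": {"billing"}, "available": True, "open_conversations": 1},
    {"name": "Kim", "skills": {"api"}, "available": False, "open_conversations": 0},
]

print(route("billing", teammates))
print(route("api", teammates))
```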

    Organization changes that unlocked leverage. As we scaled Fin, we stood up a dedicated AI Support team under a senior CS leader to continuously optimize automation and define our AI adoption strategy across the journey. We restructured human roles into “Technical Support Specialist” and “Technical Support Engineer” to better align with the complexity of incoming work. We also expanded Support Operations to focus on optimization—using AI to uplevel Enablement, Workforce Management, QA, Process Management, and Data Insights. Just as important, we reset expectations about the balance between time spent supporting customers directly versus improving AI. That mindset shift created compounding returns.

    Pushing Fin further with new capabilities. As capabilities matured, we were early adopters and saw measurable wins:

    Fin Guidance: Multiple Guidance rules provide additional controls and a more personalized, targeted experience for customers.

    Fin Tasks and Procedures: Enables Fin to carry out activities such as updating customers on incident status and deep troubleshooting for technical issues.

    Insights: AI-driven dashboards provide deep insight into Fin’s performance and surface recommendations for further optimization. Insights also provides a Customer Experience (CX) Score for every customer interaction, enabling more targeted improvement efforts and opening up new ways to close the loop with customers who have had a poor experience.

    What we achieved

    What started as a focused effort to improve our customer support experience became the strongest proof point for what’s possible when you fully embrace AI. Fin now resolves over 81% of all our customer support volume and has allowed us to absorb a 300%+ increase in demand without proportional headcount growth. Over 90% of our customers now benefit from improved first response performance, 24/7 coverage, and outbound phone support.

    What the numbers don’t fully capture is the shift in how our team operates. With volume absorbed by Fin, our CS teammates now deliver consultative support—guiding next best actions, deepening product adoption, and contributing directly to retention and expansion. Customers that receive these engagements adopt Fin at a much deeper level and achieve greater support success. What was once a reactive, volume-driven team is now a function that generates significant revenue.

    What’s next

    Customer expectations are always rising, so we’re building on our progress by embracing the Fin Flywheel—an actionable framework for ongoing improvement and optimization. This keeps us honest about the discipline required to sustain AI performance at scale.

    Train: Teach Fin to resolve even the most complex queries with Procedures, knowledge, and policies.

    Test: Run fully simulated customer conversations from start to finish to see exactly how Fin will behave before going live.

    Deploy: Set Fin live across every channel – voice, email, chat, and social – for consistent support wherever customers reach out.

    Analyze: Use AI-powered Insights to analyze and improve Fin’s performance and deliver better customer experiences.

    We are also investing in our support teammates so they can adjust to the new world of AI—taking on more complex work and being valued for the subject matter expertise, consultative engagement, and empathy they bring to the role. That human layer is where differentiation shines.

    We will continue to develop and share best practices for deploying an Agent, based on our own experience with Fin and the lessons learned from our most forward-looking customers. These are captured and continually evolving in The Agent Blueprint.

    Transformation takes commitment

    The most successful teams aren’t bolting AI onto old processes; they’re rebuilding support around it—investing in knowledge and people alongside technology, and treating AI as a continuous discipline rather than a one-time deployment. That’s the real change required. For support teams willing to make it, there’s a rare opportunity to redefine what customer service can deliver—higher CSAT, faster resolution, and durable ROI.


    Inspired by this post on The Intercom Blog.


  • February Fin Breakthroughs: Master complex workflows, natural voice, 2-minute Shopify, smarter ops

    February Fin Breakthroughs: Master complex workflows, natural voice, 2-minute Shopify, smarter ops

    Every update we shipped this month removed a specific constraint on what teams can do with Fin. In my world, the demo-to-production gap shows up as complexity, control, and confidence. Can the agent handle the query that actually matters? Will it sound right on a call? Can the team deploy it without filing an engineering ticket? Can managers understand what it’s doing? That’s the bar I hold us to.

    This month, we delivered answers to all four. Here’s how.

    Procedures and Simulations (0:51). The hardest problem in AI-powered customer service isn’t answering FAQs—it’s executing complex queries with real business logic and real consequences if anything goes wrong. Think billing refunds, multi-step flows, and actions that must be right the first time.

    We made it dramatically easier to build and manage Fin for those complex queries—without pulling in an engineer. You can author in natural language, test every step in simulation, and deploy with confidence.

    The workflow starts with AI drafting the procedure from your existing source material. You edit in natural language, with structured hooks to pull in live data, apply business logic, and add code for deterministic control where you need it. That’s how you handle multi-step flows with the precision that matters when things go wrong.

    Simulations are the test environment. Define a test case, pass in the data Fin would receive in a real conversation, and watch it work through each step. You see what Fin is doing, why, and whether it’s meeting the criteria you set. Full transparency at every point. I’ve run these end-to-end myself, and there’s a particular confidence that comes from watching it work before it goes anywhere near a customer.
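
    The shape of that test loop (define a case, pass in data, watch each step, check criteria) can be illustrated with a generic Python harness. Everything here is hypothetical, including the refund procedure and its 30-day rule; it is a sketch of the testing pattern, not of Fin’s Simulations feature or any Intercom API.

```python
def check_eligibility(ctx):
    """Step 1: illustrative business rule (30-day window is made up)."""
    ctx["eligible"] = ctx["days_since_purchase"] <= 30
    return f"eligible={ctx['eligible']}"


def decide_refund(ctx):
    """Step 2: refund the full order total only when eligible."""
    ctx["refund_amount"] = ctx["order_total"] if ctx["eligible"] else 0
    return f"refund_amount={ctx['refund_amount']}"


PROCEDURE = [check_eligibility, decide_refund]


def simulate(procedure, test_case):
    """Run every step on the test data, recording a trace of what happened."""
    ctx = dict(test_case["input"])  # copy so the test case stays reusable
    trace = []
    for step in procedure:
        trace.append((step.__name__, step(ctx)))
    passed = all(ctx.get(key) == value for key, value in test_case["expect"].items())
    return {"trace": trace, "passed": passed}


test_case = {
    "input": {"days_since_purchase": 10, "order_total": 49.0},
    "expect": {"eligible": True, "refund_amount": 49.0},
}

result = simulate(PROCEDURE, test_case)
for name, note in result["trace"]:
    print(name, "->", note)
print("passed:", result["passed"])
```

    The trace is what gives the step-by-step visibility: when a case fails, you can see which step produced the unexpected value instead of only seeing a wrong final answer.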

    For a deeper look at Procedures and Simulations, head to fin.ai/procedures.

    Fin Voice: three major updates. When something’s off in chat, it can take a few exchanges to notice; on a call, it’s immediate. Pronunciation, noise handling, and tone all matter because they’re the customer’s first impression.

    Pronunciation rules (4:18). Fin has high out-of-the-box pronunciation accuracy, but it doesn’t know your brand—your product names, your industry terminology, the way your company uses certain words. Alihan Zinna, Staff ML Scientist, showed this with an IKEA example: without pronunciation rules, Fin mispronounced both “IKEA” and a product name; after adding rules, both were corrected and sounded natural.

    New natural voices (5:48). We’ve added 11 new voices tuned to a range of brand tones so you can choose one that sounds like it truly belongs to your company—not a generic AI assistant.

    Background noise reduction (6:28). People call from airports, shops, and busy offices. Fin now monitors background noise continuously and increases noise reduction when the environment demands it. No configuration needed. As Alihan put it, “This is one of those things customers really notice when it’s not working. The goal was to make it invisible. That’s what we built.”

    Shopify setup experience (8:21). Fin began as a Service Agent and is quickly becoming a Customer Agent—working across the whole lifecycle to support, sell, and guide, even before a customer has an issue. The revamped Shopify setup is a clear step forward.

    Shopify catalogs are complex—thousands of products, variants, and dynamic inventory—and connecting all of that to an agent has historically been painful. We removed the friction.

    Setup now takes three steps: first, connect your store. Second, install the Messenger directly in Shopify—no code, just a few clicks. Third, deploy Fin. Total time: under two minutes. We timed it live.

    What that unlocks is real. In the demo, a first-time snowboarder asked for recommendations. Fin searched the catalog, reasoned about attributes that matter to a beginner (there’s no “beginner” tag in the catalog), personalized suggestions by height and weight, and added a board to the cart.

    Even better, one customer updated their website copy to promote a sale. Fin immediately picked up the new context and began recommending sale items, nudging shoppers to add more to the cart to access a discount—no extra configuration required. It read the situation and acted.

    Three steps, and you have a real-time shopping assistant that knows your store and sells on your behalf.

    Helpdesk improvements (12:31). Fin works with any helpdesk, but many teams consolidate to take advantage of our native Intercom helpdesk integration. We’ve shipped 19 helpdesk improvements in 2026 so far; two from this month stand out.

    11 new call metrics. Hold time, outbound dial time, missed and declined calls, call terminating party, and more. These give leaders the visibility to analyze workload distribution and call handling quality in detail.

    Holiday office hours. Teams no longer need to manually update office hours for every public holiday. This was the most upvoted request in our community, and we shipped it.

    Across the board, we removed the constraints that hold teams back: the complexity ceiling in automation, the quality ceiling in voice, the setup barrier in Shopify, and the operational overhead in the helpdesk.

    We closed out the month with a Star Wars–style crawl of 22 additional updates. All features mentioned here are live and available now. Explore more at fin.ai/updates. More to come—see you next month.


    Inspired by this post on The Intercom Blog.


  • Turn Support Wins into a Company-Wide AI Blueprint for Consistent, End-to-End CX

    Turn Support Wins into a Company-Wide AI Blueprint for Consistent, End-to-End CX

    Building a great end-to-end customer experience with AI means going beyond support, and I’ve seen firsthand how transformative that shift can be when we treat every interaction as part of one cohesive journey.

    Every customer touchpoint, from the first sales conversation through to post-sales support and success, is an opportunity to get it right. Other teams are now turning to AI to transform how they show up for customers, and support, which led the way, has already written the blueprint. In my role, I focus on making that blueprint actionable across the entire lifecycle.

    In The 2026 Customer Service Transformation Report, it’s clear most businesses are thinking about what’s next, with more than half planning to scale AI to other departments. Interestingly, they often cite their early success with AI in support as motivation for the move. This makes support teams uniquely positioned to help lead the transition, a strategic role unimaginable just two years ago.

    In this piece, I share how teams are introducing AI to other parts of the business, how to think about this expansion effort, and the new opportunities it creates for support leaders who want to drive a unified customer experience.

    Support was the first proving ground for AI, and our research suggests that businesses are now planning to expand its use to other areas based on the results it’s yielded so far. Fifty-two percent of respondents said that their organizations are actively planning to scale AI to other departments in 2026.

    What will this look like? Leading companies are already finding out.

    Wins in support are setting the pace for company-wide AI. Survey results rank the drivers: proven success in support (57%), the push for a unified customer experience (49%), scaling other functions without more headcount (33%), and cross-department demand (31%).

    My favorite example is WHOOP, the fitness wearables company. They offer a premium product, which makes their sales conversations more consultative than transactional. Customers want to know “Which membership is right for me?” or “How often do I need to charge my WHOOP?” According to Emily Shirley, Business Manager for Growth Product at WHOOP, prospects who chatted with the inside sales team were twice as likely to convert, but the team didn’t have enough reps to respond to incoming queries fast enough. Customers could wait more than 10 hours for a reply.

    With a big product launch on the line and an anticipated spike in prospective customer conversations, their three-person team needed help. So they deployed Fin to the "Join" page, the final step before purchase.

    With Fin resolving 84% of inbound questions, the sales team was able to focus on high-value leads. Together, they drove a 130% increase in attributable sales. The team is now exploring ways to expand Fin beyond FAQs, focusing on personalized conversation flows, multi-product recommendations, and richer data capture. As Emily says: “There are so many parts of the buyer journey where this applies. We’ve only scratched the surface.”

    It’s clear there’s a desire to push AI to other parts of the customer lifecycle, but there is a risk hidden in this expansion. If sales, customer success, and other departments each launch their own Agent and operate in isolation, you can end up fragmenting the very thing our research says teams want to create: a unified customer experience, the second-most cited reason (49%) for pushing AI beyond support.

    Without shared context, each handoff becomes a source of friction where customers could receive inconsistent answers or be asked to repeat information. I’ve watched even well-intentioned AI rollouts struggle here—great local wins, but an overall journey that feels disjointed.

    Diagram: a support-led AI blueprint that scales across the business. Roles (SDR, CSM, Sales, Shopping Assistant, Support Rep, Custom) sit above shared layers for Goals, Memory & User Context, Business Knowledge, and Interoperability.

    The opportunity (and the challenge) is to keep the customer at the center. Instead of department-specific Agents that operate independently, we must strive for cohesion. That means shared memory, consistent governance, and connected AI workflows that respect the customer’s history and intent across channels.

    This is the future I’m building toward: solutions like Fin becoming a “Customer Agent,” capable of handling the entire customer experience. This will mean Fin can function in many roles, supported by a memory that grows with the customer over time and deep knowledge of the business, creating a seamless experience for every interaction. In practice, that’s agentic AI designed to collaborate across teams, systems, and journeys—without losing context.
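    To make the shared-memory idea concrete, here is a minimal Python sketch. It is an illustration under my own assumptions, not Fin's actual architecture: the names `CustomerContext`, `SharedMemory`, and `RoleAgent` are hypothetical, and the store is a plain in-memory dictionary. The point it shows is that when every role reads and writes one customer context, a handoff from sales to support only asks for what is genuinely still unknown.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CustomerContext:
    """Everything known about one customer, shared by every role (hypothetical)."""
    customer_id: str
    facts: dict[str, str] = field(default_factory=dict)
    history: list[tuple[str, str]] = field(default_factory=list)  # (role, message)


class SharedMemory:
    """Hypothetical in-memory store; a real system would persist this."""

    def __init__(self) -> None:
        self._contexts: dict[str, CustomerContext] = {}

    def context_for(self, customer_id: str) -> CustomerContext:
        # Create the context on first contact, reuse it on every later one.
        if customer_id not in self._contexts:
            self._contexts[customer_id] = CustomerContext(customer_id)
        return self._contexts[customer_id]


class RoleAgent:
    """One role (sales, support, ...) working against the same shared memory."""

    def __init__(self, role: str, memory: SharedMemory) -> None:
        self.role = role
        self.memory = memory

    def handle(
        self,
        customer_id: str,
        message: str,
        required: list[str],
        learned: dict[str, str] | None = None,
    ) -> list[str]:
        """Record the turn and return the facts this role still has to ask for."""
        ctx = self.memory.context_for(customer_id)
        ctx.facts.update(learned or {})
        ctx.history.append((self.role, message))
        # Facts captured by any earlier role are never requested again.
        return [name for name in required if name not in ctx.facts]


memory = SharedMemory()
sales = RoleAgent("sales", memory)
support = RoleAgent("support", memory)

# The sales conversation captures the customer's goal.
sales.handle(
    "cust-1",
    "Which membership is right for me?",
    required=["goal"],
    learned={"goal": "sleep tracking"},
)

# Support later needs the goal and the device; only the device is missing.
still_needed = support.handle(
    "cust-1",
    "How often do I need to charge it?",
    required=["goal", "device"],
)
print(still_needed)  # ['device']
```

    If each department kept its own store instead, the support role would have asked for the goal a second time, which is exactly the repeated-information friction described above.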

    Pushing AI into new parts of the business requires someone to own the process, and for many organizations that owner is the support team. Nearly a third of respondents (32%) confirmed their customer service teams are leading their business’s AI transformation strategy.

    This presents a real opportunity for support teams to shape the future of customer experience. Instead of each function reinventing the wheel, support can act as a center of excellence, defining shared standards, guardrails, and operating practices that drive performance.

    “You already manage the most complex, high-volume customer interactions; you have rich data on customer needs and behavior; and you know how Agents perform in the real world. Those insights will be invaluable as AI scales across your business.”

    Leaders are racing ahead with real AI in support. Explore the 2026 Customer Service Transformation Report to see where deployment is stalling, benchmark your team, and get practical steps to scale automation that delights.

    In my organization, when we extended AI from support into sales, we deliberately brought our conversation design expertise, Agent Analytics, and governance models along with it. One team owns the orchestration, memory strategy, and CRM integration so a customer can start with a sales question and end up with a support one—without ever feeling a seam. That continuity is where journey mapping meets product strategy and turns into measurable outcomes.

    As Agents like Fin expand their capabilities and move into new areas, I expect many customer service leaders will see their roles expand to include AI implementation across the customer journey. It’s a natural progression for product management leadership in support: owning the experience, the data, and the operating model.

    Achieving a perfect customer experience is AI’s biggest promise, but to get there, teams need to be smart about the solutions they deploy. A unified Customer Agent capable of handling the entire journey end-to-end will have a significant advantage, delivering consistent, context-aware experiences across every interaction.

    The Customer Agent future is being built right now, and it’s starting with the team that has pioneered AI transformation from the very beginning: support. For leaders in these organizations, this is a rare opportunity to shape how customer relationships will be built and maintained in the AI era.

    If you’d like to dig deeper into the data and benchmarks guiding these decisions, download The 2026 Customer Service Transformation Report.


    Inspired by this post on The Intercom Blog.

