Tag: Agent Analytics

  • From Idea to Impact: My PM-Friendly Blueprint to Building Your First AI Agent Fast

    From Idea to Impact: My PM-Friendly Blueprint to Building Your First AI Agent Fast

    AI agents are quickly moving from novelty to necessity, and the fastest way to capture value is to approach them like any other high-stakes product initiative. In this guide, I share how I plan, build, and launch production-grade agents with a product mindset—balancing ambition with risk, speed with governance, and innovation with measurable outcomes.

    I start by getting crisp on the outcome. Who is the primary user, what job are they hiring the agent to do, and how will we know it’s working? I translate this into outcomes vs output OKRs, such as resolution rate, time-to-value, cost-to-serve, or qualified pipeline influenced—anchoring the roadmap before a single line of code or prompt is written.

    Next, I map the agent’s scope and boundaries. I write a simple capability canvas: the tasks the agent must perform, the tools it can use, the data it can access, and the constraints it must respect. Most successful builds follow a retrieval-first pipeline: connect trusted knowledge sources, enrich with metadata, and manage a lean context window to keep responses relevant and cost-efficient. From the start, I bake in privacy-by-design, data governance, and AI risk management so compliance isn’t an afterthought.

    Model selection comes after the workflow is clear. I choose an LLM for the job (latency, cost, multilingual needs, and tool-use fidelity) and pair it with the right connectors and actions—think CRM integration, ticketing, search, or internal APIs. For voice experiences, I define a voice AI agent persona, turn-taking rules, and barge-in behavior. This is where agentic AI patterns shine: structured planning, tool invocation, and verification loops create a resilient, goal-directed system.

    Prompt design is product design. I write system prompts that define role, tone, constraints, data sources, and success criteria. I add few-shot examples that mirror my top use cases and edge cases, then apply prompt engineering best practices to control style, limit speculation, and encourage citations. For voice, I include prompt engineering for voice to optimize brevity, warmth, and disfluency handling without sacrificing accuracy.

    Before launch, I build an eval-driven development workflow. I curate golden datasets from real user intents, add adversarial cases, and automate evals for accuracy, safety, grounding, and tool-use success. I set a minimum detectable effect (MDE) so A/B testing can validate improvements with confidence, and I define go/no-go thresholds to prevent regression. This becomes my continuous discovery loop for the agent.

    Instrumentation is non-negotiable. I wire up Agent Analytics to track task success, containment/deflection rate, handoff quality, cost per task, and user satisfaction. I supplement with a unified analytics platform and session replays to observe failure patterns. These signals feed prioritization and help me decide when to expand scope versus harden reliability.

    For delivery, I rely on CI/CD with feature flags to gate risky capabilities, plus canary releases for new tools and prompts. I monitor DORA metrics to maintain deployment frequency without trading off quality. When incidents happen, I treat them like production issues: incident management playbooks, rollbacks, and clear postmortems.

    Trust is earned through safety and transparency. I enforce least-privilege access, structured logging, and red-teaming for jailbreaks, prompt injection, and data exfiltration. Threat detection and response plus clear user disclosures keep the experience responsible and compliant with regulatory requirements.

    GTM is product-led. I use in-app guides, product tours, and onboarding checklists to drive user activation and early wins. I define success moments, turn them into habit loops, and run retention analysis to find where users stall. This tight loop of messaging, measurement, and iteration accelerates product-market fit.

    Common high-ROI use cases I prioritize include customer support ai strategy (automated resolution and augmented agent assist), sales and success workflows (lead qualification, QBR prep), and internal knowledge copilots (policy, process, engineering runbooks). Each starts narrow, ships fast, and scales with proven evidence from analytics and experiments.

    If you’re skimming, here’s the blueprint: clarify outcomes, design AI workflows with a retrieval-first pipeline, select the right LLM and tools, engineer robust prompts, institutionalize evals and A/B testing, instrument Agent Analytics, ship with CI/CD and feature flags, and iterate with discipline. In the walkthrough video above, I go deeper on templates, prompts, and experiments you can use to build your first agent with confidence.


    Inspired by this post on Product School.


    Book a consult png image
  • Inside Amplitude’s AI Playbook: Lessons from Leo Jiang on Ask Amplitude, Agents, and Visibility

    Inside Amplitude’s AI Playbook: Lessons from Leo Jiang on Ask Amplitude, Agents, and Visibility

    I continually study how high-velocity teams turn AI ambition into shipped product, and Amplitude’s approach stands out. "Leo Jiang is the Head of Engineering, AI Products at Amplitude, focused on building new AI and marketing products. He has helped build Ask Amplitude, Agents, and AI Visibility." From a product management leadership lens, that portfolio signals a clear AI strategy: enable insight (Ask Amplitude), drive action (Agents), and ensure trust and observability (AI Visibility).

    What I appreciate most is the sequencing: start with user-facing value, build agentic AI capabilities where tasks repeat and outcomes can be evaluated, and layer AI workflows with robust governance. For PMs and LLMs for product managers, the implication is to define success via eval-driven development—quantitative rubrics, offline test sets, and real-time feedback loops—before scaling automation. This also hints at an emerging discipline of Agent Analytics: instrument prompts, tool calls, and outcome quality so we can tune performance like we tune a funnel.

    Ask Amplitude gives a relatable example: natural-language questions lower the activation barrier for product and growth teams inside an Amplitude analytics environment. When agents turn answers into next-best actions, product-led growth becomes measurable—from hypothesis to change to impact—inside a unified decision loop. That tight loop is where product strategy, design, and reliability meet to create compounding value.

    Operationally, I organize a product trio around each capability and pair it with forward deployed engineers to accelerate discovery with customers. I also invest in privacy-by-design and data governance early, ensuring marketing use cases respect compliance while keeping iteration speed high. The goal is a repeatable path from prototype to scale that preserves momentum without compromising safety.

    My takeaway for peers: pick one high-frequency workflow, define clear agent boundaries, ship a narrow slice, and measure relentlessly. Use retrieval-first pipeline patterns for grounding, add human-in-the-loop checkpoints, and close the loop with qualitative insights from in-app guides. When that works, expand capabilities—not just features—and let outcomes vs output OKRs steer prioritization.


    Inspired by this post on Amplitude – Best Practices.


    Book a consult png image
  • Stop Drowning in Dashboards: Real-Time Digital Analytics for Finserv Contact Centers

    Stop Drowning in Dashboards: Real-Time Digital Analytics for Finserv Contact Centers

    I’ve sat in enough finserv contact center reviews to know the pattern: wall-to-wall dashboards, weekly exports, and colorful charts that still leave teams asking, “So what should we do next?” The truth is, more dashboards rarely create better decisions. What we need is digital analytics that translates signals into action—fast, precise, and privacy-safe.

    When I say digital analytics, I mean a unified analytics platform that captures real-time behavioral data across voice, chat, IVR, email, and in-app journeys, then operationalizes it for agents, supervisors, and automated workflows. See how real-time behavioral analytics helps finserv contact centers lower costs, improve resolution speed, and deliver better member experiences.

    Dashboards tend to be lagging, siloed, and optimized for reporting, not resolving. They spotlight vanity metrics, bury journey-level friction, and rarely surface the “next best action” that actually moves a member request toward resolution. By the time a trend shows up in a weekly readout, the expensive part—handle time, repeat contacts, churn risk—has already accumulated.

    Real-time digital analytics flips that script. Instead of passively describing performance, it continuously detects intent, risk, and friction as interactions unfold—then powers targeted responses. For example, it can route high-risk transactions to specialized agents, prompt dynamic guidance during an escalated call, or trigger a proactive message that deflects a repeat contact. In practice, that means fewer transfers, faster resolution speed, and measurable reductions in operating costs.

    For finserv specifically, the payoff is immediate. Agent Analytics surfaces coaching opportunities (e.g., where scripts stall or compliance steps get missed). Retention analysis identifies members at churn risk after a negative experience. Journey analytics exposes where authentication fails or balance inquiries overwhelm queues, so you can intelligently deflect to self-service. And when a potential fraud signal appears mid-session, real-time insights can prioritize routing and alerting without sacrificing compliance.

    Implementation should be iterative and outcomes-driven. Start by instrumenting the top five journeys that drive the most cost or dissatisfaction (lost card, fraud dispute, loan status, password reset, payment issue). Tie each to clear outcomes vs output OKRs—think first-contact resolution, repeat-contact reduction, containment rate, and average time-to-resolution—so every analytic signal earns its keep. Then activate insights inside the workflow: agent assist prompts, smart routing, and targeted follow-ups that close the loop.

    Governance matters just as much as speed. In a regulated environment, privacy-by-design and data governance are non-negotiable. Build data access controls, audit trails, and consent management into your operating model from day one. Align analytics with regulatory compliance requirements to ensure that what you measure and automate is defensible, explainable, and safe for members and the business.

    To accelerate learning, pair digital analytics with controlled experiments. Use A/B testing on IVR flows, authentication steps, and post-call follow-ups to quantify what truly reduces transfers and repeat contacts. Define a minimum detectable effect (MDE) upfront so tests are fast and conclusive. Run continuous discovery with cross-functional product trios (operations, data, compliance) to turn insights into shippable improvements every sprint.

    On the stack side, focus on connecting systems you already trust. CRM integration ensures that context follows the member, while tools like Amplitude analytics, Pendo, or Intercom can instrument key digital touchpoints. Whether you choose build vs buy, the principle is the same: consolidate signals into a unified analytics platform, then push decisions and guidance back into the tools agents and members already use.

    The cultural shift is from reporting to decisioning. Instead of celebrating more charts, celebrate faster resolutions and fewer escalations. Replace static executive reports with alerting and action playbooks. Make it trivial for supervisors to see what changed, why it mattered, and which play to run next. That’s how you convert data into durable operating advantage.

    The mandate is clear: stop drowning in dashboards. Move to digital analytics that captures behavior in real time, respects compliance, and powers operational decisions where they matter most—in the member journey. When you do, cost curves flatten, resolution speed climbs, and member trust compounds.


    Inspired by this post on Amplitude – Perspectives.


    Book a consult png image
  • From PDFs to Proposals: How Tendos AI’s Agent Swarm Automates Construction Quotes Fast

    From PDFs to Proposals: How Tendos AI’s Agent Swarm Automates Construction Quotes Fast

    Anyone who has lived inside construction tendering knows the grind. "When a construction company receives a bid request, someone has to open that email, parse the attached PDF (sometimes 1,800 pages describing an entire building), figure out which products are relevant, look up pricing, and draft a quote—all before the deadline. It's tedious, error-prone, and surprisingly manual." That painful reality is exactly why this conversation about Tendos AI caught my attention—and why it matters for product leaders building agentic AI in complex, document-heavy workflows.

    I listened as Daniel Kappler and Matthias Hilscher from Tendos AI walked through how they’re automating the tendering workflow for manufacturers in the construction industry. What began as a narrow prototype—matching radiator requests to product catalogs—has matured into a full agentic system that does the heavy lifting from email categorization to offer generation. The end result: a scalable AI workflow that tackles messy inputs, orchestrates specialized agents, and produces quotes that are ready for human review—or even straight-through processing.

    What impressed me most was the rigor. They validated the opportunity with a design partner, spent a week on-site observing real workflows, and then engineered a multi-agent architecture where specialized agents collaborate, including a "review agent" that checks work before anything reaches a human. They evaluate each agent independently (not just the whole chain), built custom observability when off-the-shelf tooling fell short, and use human-in-the-loop feedback to push toward a self-learning system.

    From a product management perspective, this is agentic AI done right. It blends continuous discovery with eval-driven development, thoughtful UX decisions, and pragmatic guardrails. Evaluating agents individually makes debugging tractable and change detection transparent; a dedicated "review agent" mirrors code review to reduce error propagation; and custom tracing plus Agent Analytics provide the observability needed to operate AI workflows reliably at scale.

    My key takeaway: "Start narrow to prove value: Tendos AI began with just radiators for one design partner before expanding to all building products"—a classic wedge strategy that accelerates learning while building credibility.

    Another takeaway I’ll adopt in future roadmaps: "Own the interface: building a web application (vs. integrating into legacy systems) gave them control over UX and the ability to iterate toward full automation." Controlling the surface area let them move faster than a purely backend integration ever could.

    On measurement and reliability, I loved this: "Evaluate each agent, not just the chain: per-agent evals make debugging tractable and show exactly where performance changed." That’s true eval-driven development—aligning metrics to decision points rather than only outcomes.

    Quality gates matter in automation, and they nailed it: "Use review agents: a separate agent that checks work (like code review) catches errors before they reach humans." It’s a simple pattern with outsized ROI.

    Finally, the product-market signal is unmistakable: "Let customers pull you: customers asked Tendos to replace their CPQ software—strong signals of product-market fit." When buyers invite you to displace existing systems, you’re past validation and into expansion.

    If you’re exploring agentic AI for enterprise workflows, the themes here are gold: the tendering chain in construction is ripe for automation; domain expertise accelerates opportunity discovery; robust entity extraction across PDFs ranging from 1 to 1,800+ pages is non-negotiable; planning patterns for creating and updating task plans matter; agents must reason about product fit against customer requirements; custom tracing and observability unlock debugging for complex agent chains; and human feedback loops pave the path to self-learning systems.

    Guests: Daniel Kappler — CPO (Product & Design), Tendos AI; Matthias Hilscher — CTO (Engineering), Tendos AI.

    Want to dive deeper? Listen to this episode on: Spotify | Apple Podcasts.

    Explore the team and product: Tendos AI.

    For builders of agentic AI, here’s my playbook distilled from this story: start narrow to earn trust and accuracy; own the interface to speed iteration; use per-agent evaluations to localize issues; add a "review agent" as a quality gate; invest early in tracing, observability, and Agent Analytics; keep humans in the loop until your metrics justify autonomy; and let strong pull signals guide your roadmap. That’s how you turn complex emails and massive PDFs into precise, production-grade quotes—consistently.


    Inspired by this post on Product Talk.


    Book a consult png image
  • Inside the AI Customer Service Shift: What 166 Leaders Told Me About Teams, Roles, and ROI

    Inside the AI Customer Service Shift: What 166 Leaders Told Me About Teams, Roles, and ROI

    I wanted to cut through the hype and see what’s actually changing inside customer service teams as AI agents like Fin move from pilots to production. So I analyzed 166 interviews with support leaders, managers, and frontline specialists to understand how roles, workflows, and team structures evolve once AI becomes part of everyday work.

    The anecdotes were already loud: AI tools are transforming customer support. But the scale, shape, and consistency of that transformation? Less clear. I went to the source—the practitioners living it—to quantify what’s real and what’s next for customer support AI strategy.

    Here’s what I gleaned from the data.

    TL;DR — What’s changing

    AI is reorganizing core CS operations: Nearly every team (≈95%) reported meaningful workflow changes. Triage, routing, translation, and categorization are increasingly automated. Hybrid human+AI systems are taking their place.

    Frontline work is changing to AI oversight: Humans now QA, monitor, and test AI outputs. When it comes to handling queries, they step in for nuance, rather than repetition.

    Structural change is widespread but uneven across companies: 83% reported new responsibilities or roles. Some built AI pods, while others retained traditional setups.

    Tier 1 headcount demand is falling: 28% saw hiring freezes, slowdowns, or natural attrition at Tier 1 level as AI Agents manage more requests and improve operational efficiency.

    Skill gaps are widening inside teams: Data literacy, QA, and cross-functional communication are all rising in value. For many companies, long-term role strategy is lagging behind.

    Research methodology

    The goal of this research is to understand how many customer service teams have changed their roles, responsibilities and ways of working due to adopting AI agents, as well as understanding how these changes manifest within their organizations.

    For this study, the data chosen consists of interviews conducted by the research team, either with Intercom customers or prospects. This data was chosen because the focus of the interviews revolved around the individual experience of the participant, which gives a higher chance of information related to role changes to be present.

    The data was collected using Snowflake by pulling all interviews stored in gong conducted by a member of the research team from 01-01-2025 to 14-10-2025.

    After the data was pulled, a python script was used to clean the conversation corpus for each conversation retrieved. Common English stopwords (e.g. “and”, “very”, “with”, etc.) were removed, as well as all the text associated with a speaker in the conversation that was not the interview participant(s). This was done to reduce the computational power required for the conversation coding, avoid API timeouts and reduce costs.

    After the corpus was cleaned, the OpenAI API was employed, alongside a prompt, to code each conversation using closed codes defined in a closed codebook.

    The codes used were:

    No role change mentioned: No explicit changes to roles, teams, or reporting lines are attributed to AI/Fin.

    Role responsibilities changed due to AI/Fin: Duties/ownership moved between humans and AI/Fin, or scope of a role changed because AI/Fin handles tasks.

    Team structure/reporting changed due to AI/Fin: Org/team boundaries, team charters, or reporting lines changed due to adopting AI/Fin.

    Headcount/hiring impacted due to AI/Fin: Hiring plans, headcount, staffing coverage, or shifts/rotations changed due to AI/Fin.

    Workflow/process changed due to AI/Fin: Steps, triage/escalations, routing, or playbooks changed because AI/Fin alters the process.

    Other organizational changes due to AI/Fin: Other changes inside the organization due to AI/Fin that don’t involve a change in responsibilities, team structure/reporting lines, headcount or workflow/processes changes.

    Data analysis

    166 conversations were retrieved. More than 90% of all conversations report some sort of change either in their role, team, or processes due to implementing Fin, or a similar AI product, with only 13 participants reporting no changes.

    Across these conversations, each one could have multiple types of change associated with it (M = 2.35, Med = 2, Min = 1, Max = 4, N = 166).

    More specifically, after implementing Fin or a similar AI product:

    94.58% participants reported having their processes and workflows disrupted

    82.53% participants reported seeing their role and responsibilities change

    27.71% participants reported changes in company headcount or hiring

    6.02% participants reported their team structure or reporting lines changing as a result

    Additionally, 16.27% participants reported a change for a different reason from the ones highlighted above (“Other organizational changes due to AI/Fin”).

    Sample representativeness

    The sample is representative with a confidence level of 90% and a margin of error of ±6.4% (accounting for an overall unknown population size). The individual confidence intervals for each type of change are as follows.

    Workflow/process changed due to AI/Fin: 157 (94.6%), 90% CI: 91.7% – 97.5%

    Role responsibilities changed due to AI/Fin: 137 (82.5%), 90% CI: 77.7% – 87.4%

    Headcount/hiring impacted due to AI/Fin: 46 (27.7%), 90% CI: 22.0% – 33.4%

    Other organizational changes due to AI/Fin: 27 (16.3%), 90% CI: 11.6% – 21.0%

    No role change mentioned: 13 (7.8%), 90% CI: 4.4% – 11.3%

    Team structure/reporting changed due to AI/Fin: 10 (6.0%), 90% CI: 3.0% – 9.1%

    Thematic analysis

    1) Automation and AI integration replacing manual steps (94.58%). I see AI workflows embedding into every stage of support. Manual triage, routing, translations, and repetitive responses shift to Fin or similar systems, while agents focus on human-in-the-loop oversight.

    Agents’ day-to-day work now revolves around monitoring or fine-tuning AI outputs, not replying to the same questions. In many teams, conversations enter Fin first; humans only step in when nuance or exception handling is required. Testing, QA, and rollout practices have matured too—teams track Fin’s accuracy and iterate intentionally.

    2) Humans shift to oversight, AI handles execution (82.53%). The role resets are unmistakable. Support agents and managers move from high-volume execution to optimization, configuration, and measurement. New roles emerge—AI specialists, automation managers, Fin owners—while responsibilities migrate toward strategic analysis and quality assurance.

    Duties are redistributed: Fin takes on refunds, triage, simple messaging, even parts of the sales process. I’ve watched some careers pivot toward product/ops or AI systems strategy as managers coordinate testing and monitor adoption metrics.

    3) Reductions or slower growth due to efficiency gains (27.71%). Efficiency is real. Many teams reduce Tier 1 headcount needs or slow hiring because AI absorbs simpler requests. Others reallocate people to complex work or AI management. A few still expand—adding automation engineers, implementation specialists, or technical AI leads—but not at past growth rates.

    The upshot: organizations handle more volume while stabilizing or reducing staffing, especially at the frontline tier.

    4) New AI teams, flatter orgs, fewer escalation layers (6.02%). I’m seeing organizational design catch up to the tech. Some companies form dedicated LLM or automation teams. Others flatten hierarchies, design around workflow complexity instead of region, or merge roles. Dedicated escalation layers shrink as Fin routes or resolves more autonomously.

    Team design is getting more modular and data-driven, with clearer ownership for configuration, governance, and Agent Analytics.

    5) Broader digital transformation and operational modernization (16.27%). Beyond support, companies are modernizing their operating model: automation-first, digital self-service, better data foundations, and new vendor ecosystems. Collaboration patterns between data, ops, CX, and product/engineering are tightening, with a culture of experimentation and continuous improvement taking hold.

    How have customer service roles and responsibilities changed due to Fin/AI agent implementation?

    Implementing Fin or a similar AI agent profoundly changes how an organization operates, with around 95% of participants reporting some level of change in their processes after implementation. These systems have significantly reshaped the workflows that customer service teams are used to. Tasks once performed manually, such as ticket triage, routing, repetitive responses, and translations are now handled by AI agents.

    “This marks a clear transformation in how customer service agents work: moving away from directly resolving customer queries to focusing on more analytical and procedural work”

    As a result, customer service agents’ responsibilities have shifted from performing manual tasks to monitoring and fine-tuning the AI agent whenever its output is inaccurate or incomplete. This marks a clear transformation in how customer service agents work: moving away from directly resolving customer queries to focusing on more analytical and procedural work, such as testing, QA, and performance analysis of AI outputs.

    Human agents who still handle conversations tend to do so either because the AI agent cannot yet respond adequately, or because of an organizational choice to retain human involvement for sensitive or high-value interactions. Nevertheless, the need for such roles is diminishing. Around 28% of participants reported a reduction in Tier 1 staff or a hiring slowdown or a full hiring freeze, as AI agents increasingly manage simple requests and organizational attention shifts towards improving automation efficiency.

    “In some cases, this has led to the creation of specialized AI teams, reorganizations around workflow complexity, or the merging and redefinition of existing roles”

    However, this transformation is not uniform across companies. While some roles have disappeared (particularly escalation layers), others have emerged. Many organizations are reallocating existing staff to AI management or hiring new technical profiles such as automation engineers, implementation specialists, and AI leads. In some cases, this has led to the creation of specialized AI teams, reorganizations around workflow complexity, or the merging and redefinition of existing roles.

    Around 83% of participants reported changes to their roles or responsibilities following the introduction of Fin or similar AI agents. Specifically, customer service agents who no longer handle basic queries now focus on managing AI performance, reviewing Fin tasks and improving automation outputs. Managers oversee AI evaluation and implementation, coordinate testing, and monitor AI metrics such as resolution and involvement rates. In some organizations, new dedicated roles have emerged—AI specialists, automation managers, or Fin owners—reflecting a strategic shift toward automation-first, digital self-service models.

    These structural shifts are also cultural. I’m seeing teams embrace experimentation, versioning, and eval-driven development while deepening collaboration with data, operations, and product/engineering. The move from outcomes vs output OKRs is palpable: leaders are measuring containment, deflection, CSAT, and time-to-resolution with new rigor.

    Overall, a widespread transformation is underway. Roles are broadening, responsibilities are diversifying, and cross-functional collaboration is becoming the norm. Given the pace of gen ai improvement and the rise of agentic AI patterns, I expect these shifts to intensify.

    This evolution raises two important questions

    Firstly, do customer service agents possess the skills required to succeed in these new roles? While they are experts in customer interaction and company policy, their work now demands new competencies in data analysis (e.g. reporting AI agent performance and how it changes over time), quality assurance/debugging (e.g. Fin output testing and versioning), and cross-functional communication (e.g. if help from another team is required, drafting a business case to justify the resources required could be needed).

    Secondly, what long-term strategies are companies adopting to support these evolving roles? Some are reorganizing entirely around automation, while others retain traditional structures. For those undergoing transformation, it remains unclear whether these changes are part of a deliberate strategic plan aimed at achieving specific performance outcomes, or the result of experimentation without defined goals.

    Ultimately, Fin’s success— and of AI in customer service more broadly— depends not only on the technology itself but on the people and strategies that shape its use. In my experience, the winners invest early in data literacy, robust QA, clear ownership, and governance; they align product, ops, and CX around a shared AI roadmap; and they measure what matters with disciplined Agent Analytics. That’s how you turn AI workflows into durable customer and business outcomes.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • 4 Costly Misconceptions About Building AI Agents—and How I Turn Them Into Wins

    4 Costly Misconceptions About Building AI Agents—and How I Turn Them Into Wins

    I’ve lost count of how many times I’ve been asked for a “quick AI agent” that can autonomously fix customer problems, write code, or run sales ops. The promise is intoxicating—and I get why. But in practice, sustainable impact comes from disciplined product thinking, not wishful automation. Drawing on my experience leading product for complex, agentic AI initiatives, I want to debunk four misconceptions I see repeatedly and share what actually works.

    Misconception 1: AI agents are plug-and-play. The reality is that effective agentic AI behaves more like a new product line than a feature toggle. It needs clear job stories, domain grounding, tool access, and guardrails. I start by narrowing scope to one painful job to be done, then design AI workflows that reflect real constraints (SLAs, compliance, edge cases). From day one, I instrument with Agent Analytics and set up eval-driven development so we can see failure modes early and iterate with intent.

    What consistently moves the needle is treating the agent like a teammate you onboard: define responsibilities, provide the right tools, and measure outcomes. I pair scripted validations with live evals, track containment rates and handoff quality, and balance precision/recall depending on the risk profile. This is slow to fast, not fast to broken.

    Misconception 2: Bigger models make better agents. In my experience, architecture outperforms horsepower. A retrieval-first pipeline, tight context window management, and practical prompt engineering often beat an oversized model that hallucinates. Tool use matters more than model size: give the agent reliable APIs, clear schemas, and deterministic fallbacks. For LLMs for product managers, the play is to right-size the foundation model and invest in data quality, prompts, and evaluators that reflect your true acceptance criteria.

    When I see erratic behavior, I don’t immediately swap models; I improve retrieval, prune irrelevant context, and clarify the agent’s planning loop. Most performance gains come from better state management and grounding rather than a pricier token budget.

    Misconception 3: Agents replace teams. High-performing organizations design human-in-the-loop systems. I implement human review on high-risk actions, explicit escalation paths, and simple override mechanisms. That’s not just safety theater—it’s good product design. AI risk management and data governance are part of the product backlog, not an afterthought. In customer support ai strategy, for example, the agent drafts, a specialist approves, and the system learns from deltas to tighten future responses.

    The social system matters as much as the technical one: clear role boundaries, audit trails, and feedback loops turn the agent into a force multiplier. Teams gain leverage without surrendering accountability.

    Misconception 4: Shipping the agent equals success. Adoption is earned, not announced. I treat agent launches like any product-led growth motion: define activation events, remove friction with in-app guides and product tours, and A/B test prompts, tool choices, and UI affordances. We track time-to-value, task completion rate, and user trust signals (edits, undo patterns, and escalation requests). When we get those leading indicators right, retention follows.

    Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.

    My playbook is simple and repeatable: frame the problem narrowly, ground the agent with the right tools and data, measure with eval-driven development and Agent Analytics, then grow adoption with a disciplined go-to-market inside the product. The agents that win don’t feel like magic—they feel dependable. That’s what customers trust, and that’s what scales.


    Inspired by this post on Pendo – Best Practices.


    Book a consult png image
  • Agent Analytics That Matter: How Pendo Drives Adoption, Cuts Costs, and Reduces Risk

    Agent Analytics That Matter: How Pendo Drives Adoption, Cuts Costs, and Reduces Risk

    Every quarter, I revisit the same three questions: Are we accelerating adoption, lowering cost-to-serve, and managing risk without slowing the roadmap? Tools that help me answer all three with clarity earn a place in my stack. That’s why the concept behind Pendo’s Agent Analytics resonates so strongly—it gives product leaders a way to see, in one view, how users engage with AI-powered assistants, in-app guides, and core workflows, and how those behaviors translate into product-led growth.

    Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.

    In practice, Agent Analytics functions as a unified analytics platform for the modern product team. I can observe how users interact with agents and nudges inside the product, connect those interactions to user activation and retention analysis, and prioritize improvements that deliver measurable outcomes. The result is fewer blind spots across the journey and a tighter feedback loop between discovery and delivery.

    The real value shows up when I pair analytics with targeted interventions. For example, I’ll instrument critical paths, baseline activation, then use in-app guides to remove friction at the exact moment users need help. I incorporate A/B testing and continuous discovery to validate which prompts, pathways, or workflows actually move the needle. With a clean view of adoption, engagement, and time-to-value, my team can double down on what works and retire what doesn’t—faster.

    Risk reduction is equally important. With clear behavioral signals, I can spot confusing prompts, unhelpful agent responses, or unexpected drop-offs before they scale into churn or support volume. That visibility informs our product strategy, aligns stakeholders on trade-offs, and keeps our governance tight without stifling innovation—especially critical as AI Strategy becomes part of everyday product decisions.

    If you’re weighing whether Agent Analytics deserves a place in your toolkit, consider this: better instrumentation yields better decisions. When you unify guide interactions, agent engagement, and core product usage, you can attribute uplift more precisely, forecast impact with greater confidence, and operationalize product-led growth. That’s how we increase adoption, cut unnecessary cost, and de-risk the roadmap—while building experiences customers actually love.


    Inspired by this post on Pendo – Perspectives.


    Book a consult png image
  • Unlock Peak Support Performance with Pendo Agent Analytics to Drive Adoption and ROI

    Unlock Peak Support Performance with Pendo Agent Analytics to Drive Adoption and ROI

    When agent performance improves, everything else follows: faster resolutions, happier customers, and stronger product adoption. In my role leading product management at HighLevel, I use Pendo Agent Analytics to build a shared, measurable view of how our support motions shape the entire software experience and influence product-led growth.

    Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.

    In practice, I connect Agent Analytics with our product strategy by pairing product signals (user activation, onboarding progress, feature usage depth) with operational signals (first-response time, time-to-resolution, and deflection rates). This lets me see how in-app guides, product tours, and contextual tooltips impact outcomes across segments without guesswork.

    To separate signal from noise, my team runs small, controlled experiments and targeted A/B tests. For example, we’ll instrument a guide for a complex workflow, then compare cohorts on activation, retention, and support ticket volume. If engagement improves and cost-to-serve drops, we standardize the pattern and scale it.

    The real advantage is alignment. By treating analytics as a unified analytics platform that integrates agent activity with product insights, we tie day-to-day support work to our value proposition and roadmap. That transparency sharpens prioritization, accelerates adoption, and creates a clear line of sight from agent coaching to measurable business impact.

    For teams getting started, baseline your agent performance metrics, map the key friction points in your user journey, and instrument those moments with precise, helpful in-app guides and product tours. Review outcomes weekly, double down on what reduces effort and drives engagement, and keep refining the loop until adoption and satisfaction compound.


    Inspired by this post on Pendo – Best Practices.


    Book a consult png image
  • Safeguard Customer Data with Pendo Agent Analytics: Drive Adoption, Cut Costs, Reduce Risk

    Safeguard Customer Data with Pendo Agent Analytics: Drive Adoption, Cut Costs, Reduce Risk

    Protecting customer data is non‑negotiable—and it must coexist with our need for precise product insights. In my role, I frame every analytics initiative, Pendo Agent Analytics included, around measurable outcomes and rigorous governance so we can accelerate growth without compromising trust.

    Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.

    To make that promise real, I anchor implementation in privacy-by-design. Practically, that means data minimization, purpose limitation, role-based access control, auditable workflows, and clear retention policies. These are the same standards I expect from any unified analytics platform and the operating guardrails my team applies in partnership with security and legal.

    On the product side, I focus Agent Analytics on the behaviors that move the needle: adoption, feature engagement, user activation, and time-to-value. Paired with in-app guides, product tours, and thoughtful tooltip design, insights become timely interventions that drive product-led growth—while staying within our data governance boundaries.

    Reducing organizational risk demands discipline. I pair analytics rollout with a documented data map, DPIAs where appropriate, vendor risk assessments, and clear incident management protocols. We align with regulatory compliance requirements and integrate with cybersecurity practices for continuous monitoring and threat detection and response.

    I track success through business and trust metrics: higher adoption, stronger retention analysis, fewer support tickets, and cost savings from deprecating low-value features—alongside clean audits and consistent adherence to governance standards. The outcome is a tighter feedback loop, smarter roadmap decisions, and sustained customer confidence.

    If you’re evaluating Agent Analytics, start with a controls checklist, define the minimum viable telemetry for your KPIs, validate consent flows, and pilot with a narrow audience before you scale. This approach balances velocity with vigilance, ensuring we harness analytics for impact without sacrificing privacy or compliance.


    Inspired by this post on Pendo – Perspectives.


    Book a consult png image
  • Implementing Agentforce the Right Way: A Practical Playbook with Pendo and Salesforce

    Implementing Agentforce the Right Way: A Practical Playbook with Pendo and Salesforce

    I think about Agentforce implementation the same way I think about any high-stakes product launch: start with outcomes, instrument relentlessly, and iterate in tight loops. When agentic AI touches core workflows in Salesforce, the winners are the teams that combine rigorous product strategy with thoughtful CRM integration and product-led growth tactics.

    Learn the ways in which Pendo helps companies design and iterate on their agentic strategy for Salesforce.

    My working playbook begins with clarity. Before a single agent is deployed, I align with stakeholders on the highest-value “jobs” inside Salesforce—reducing case handle time in Service Cloud, accelerating lead qualification in Sales Cloud, or improving data hygiene for revenue operations. That alignment shapes our agentic AI approach and prevents us from shipping clever agents that don’t move the metric that matters.

    From there, I treat telemetry as a first-class requirement. I instrument the end-to-end journey with Pendo so we can observe when an agent triggers, when it falls back, when it hands off to a human, and how those moments affect conversion, CSAT, and cycle time. I refer to this observability layer as Agent Analytics, and it’s the backbone of evidence-based iteration.

    Guidance is equally critical. I use Pendo’s in-app guides to onboard admins and frontline users directly inside Salesforce, deliver contextual tooltips that explain what the agent will do next, and collect feedback within the flow of work. That combination shortens time-to-value and builds trust, which is essential for customer support ai strategy and change management.

    Iteration is where the compounding returns show up. I run A/B testing on prompts, decision policies, and handoff rules; evaluate performance on real user cohorts; and promote what works. This is classic product-led growth applied to agentic AI—ship small, measure precisely, and scale winners. Prompt engineering is not a one-time task; it’s a continuous discovery loop.

    I also weave in governance from day one. Privacy-by-design, data governance, and AI risk management aren’t add-ons—they are design constraints that shape what the agent is allowed to see and do. The guardrails live alongside the experience: clear disclosures, reversible actions, and easy ways for users to override or escalate.

    Finally, I operationalize the learning loop. Weekly reviews with a product trio (PM, design, engineering) examine Pendo dashboards, qualitative feedback, and Salesforce outcomes. If an agent is underperforming, we adjust prompts, refine retrieval, or simplify the decision tree. If it’s exceeding targets, we expand the use case and systematize the pattern.

    When teams ask me for the “right way” to implement Agentforce, my answer is simple: treat your agent like a product. Measure with Pendo, guide inside Salesforce, and iterate until the business outcome moves. That’s how we turn promising agents into durable advantages.


    Inspired by this post on Pendo – Perspectives.


    Book a consult png image
  • Beyond the Support Iceberg: Gradient Labs’ Multi‑Agent Breakthrough That Actually Gets Work Done

    Beyond the Support Iceberg: Gradient Labs’ Multi‑Agent Breakthrough That Actually Gets Work Done

    When a customer reports a stolen credit card, the frontline play seems straightforward—freeze it. But that’s just the visible tip of a much larger customer support iceberg. Underneath sits the real work: dispute filings, fraud investigations, merchant communications, proactive outreach, and follow-ups that unfold over days across multiple systems. Most AI support tools only touch the surface; they don’t coordinate or close the loop. That gap is exactly where my product instincts kick in—and why this story matters.

    I recently listened to a conversation with Jack Taylor (Product Engineer) and Ibrahim Faruqi (AI Engineer) from Gradient Labs, an AI-native startup building agents that automate the full scope of customer support in fintech. Their approach resonated with the challenges I see every day in customer support automation: fragmented workflows, regulatory complexity, and the need for human-in-the-loop moments. Gradient Labs has architected a platform with three coordinating agents—"inbound, back office, and outbound"—all built on a shared foundation of "natural language procedures, modular skills, and configurable guardrails."

    What impressed me most was how they "Let non-technical subject matter experts define agent behavior through natural language procedures—no coding required." That’s a powerful way to remove engineering bottlenecks, accelerate iteration, and keep the domain experts—those closest to fraud, disputes, and compliance—directly in control. In my experience, this design choice alone can compress lead times from weeks to hours and aligns perfectly with continuous discovery and eval-driven development.

    At the heart of their platform is orchestration. They "Architected a state machine orchestrator that manages turns, triggers, and skill selection across long-running conversations." That "turn" architecture is built for the messy reality of async, multi-day support. They treat "Skills as modular agent capabilities—and how they're scoped deterministically per turn," ensuring the system stays predictable and auditable. They also confront a nuanced challenge most teams dodge: "Defining "done" for outbound agents when the customer isn't the one ending the conversation." That’s where deterministic criteria, timers, and clearly scoped outcomes matter as much as the model beneath.

    Compliance is not an afterthought—it’s baked into the core. Gradient Labs "Built guardrails as binary classifiers with eval pipelines, tuning for high recall on critical regulatory checks." In regulated domains, optimizing for recall on high-stakes checks is the right call; you can tolerate a few extra reviews, but you can’t miss a potential fraud signal. More broadly, they frame "Guardrails as classification problems: balancing recall and precision for regulatory compliance." That mindset is exactly how I like to merge AI risk management with product velocity.

    Crucially, they avoid the trap of fully autonomous optimism. "Ask a Human: a tool call that brings humans into the loop for approvals or missing APIs" gives the system a safety valve for novel or high-risk cases. I also appreciated the explicit "Ask A Human Tool" pattern, which cleanly integrates approvals, policy exceptions, or data gaps without derailing the workflow.

    Quality doesn’t happen by accident. They "Designed an auto-eval system that samples conversations for human review to catch edge cases and build labeled datasets" and built "Auto-eval pipelines that flag conversations for manual review and feed labeled datasets." That closed-loop evaluation flow is the backbone of sustainable performance in agentic AI. Combine this with targeted instrumentation—think CSAT, first contact resolution, deflection rate, time to resolution, and escalation rate—and you get a real Agent Analytics discipline, not just logs and dashboards.

    The "iceberg" metaphor is more than a catchy visual. It’s a blueprint for scoping multi-agent platforms that work across the entire customer journey. With "inbound, back office, and outbound" agents coordinating on complex tasks like fraud disputes, the system can transition cleanly from intake to investigation to resolution—without dropping context or asking customers to repeat themselves. This is what genuine customer support automation looks like when it’s grounded in real operations.

    Under the hood, the team leans into robust design choices that matter at scale: the "Complexities of Natural Language Input" are managed with explicit state and skill scoping, "Deterministic Skill Execution" reduces flakiness, and "Customer-Specific Guardrails" ensure compliance remains aligned to each client’s policies. Add their focus on "APIs and Customer Tools Integration" and the result is a platform that can actually take action—not just answer questions.

    If you’re building in this space, here’s how I’d apply these lessons. Start by mapping the iceberg: enumerate back-office steps, approvals, and SLAs that follow the initial customer touchpoint. Capture those steps as "natural language procedures" owned by SMEs. Implement a "state machine orchestrator" to manage "turns, triggers, and skill selection" across multi-day workflows. Treat "guardrails as classification problems" and tune for high recall on high-stakes checks. Introduce "Ask a Human" early to handle missing APIs or policy exceptions. Finally, operationalize learning with "auto-eval pipelines" and tight, eval-driven development loops. That’s how multi-agent platforms deliver measurable outcomes in fintech support.

    If you want to hear the full conversation, you can listen on Spotify or Apple Podcasts. You’ll also hear a nod to the "Incident.io episode – Referenced in the conversation," and a thoughtful take on the "Future of Multi-Agent Systems."

    In short: this is a shift from simple Q&A bots to agents that can coordinate, comply, and complete. It’s the kind of multi-agent platform work that moves the needle for customer support in fintech—and a compelling template for any product leader scaling agentic AI and AI workflows beyond the tip of the iceberg.


    Inspired by this post on Product Talk.


    Book a consult png image
  • 2026 Support Capacity Playbook: Bold AI Automation, Smarter Staffing, Zero‑Surprise SLAs

    Capacity planning has always been a high-stakes exercise in customer service, and when you miss, the signal shows up fast in backlogs and SLAs. I’ve lived that pressure across multiple cycles, and 2026 will reward teams that plan differently. AI fundamentally changes capacity planning because it changes the work. It resolves the bulk of your volume, speeds up execution, and elevates the complexity and value of what humans handle. The consequence is simple: planning models must evolve. This is the final installment in my 2026 customer service planning series, and I’m focusing on the tension every leader feels right now—be ambitious about automation, but avoid the trap of understaffing if your assumptions don’t hold. My goal is to share how AI changes the logic of capacity planning, what I’ve learned implementing these practices with my team and with customers, and the common traps to avoid. Traditional planning rests on relatively stable assumptions: volume grows predictably, work types stay consistent, handle times don’t swing dramatically, and productivity improves slowly with better tools and training. In an AI-first model, none of that is guaranteed, and the fundamentals flip. The mix of work changes as AI absorbs a growing share of simpler conversations, leaving humans with deeper, more time-consuming issues that demand human-to-human connection. Demand can actually increase when you remove friction, so AI can both resolve more and attract more volume. Human time splits differently as teammates solve customer problems and also review AI behavior, give feedback, improve content, and support system-level work. Performance becomes dynamic, not fixed—automation rate isn’t a one-time number; it can rise with care and fall with neglect. If you plan for 2026 using a pre-AI model—assuming similar productivity, similar work mix, and a linear relationship between volume and headcount—you will underestimate what it now takes to run a high-performing support organization. There are many metrics you can track, but the one to put at the center is automation rate (AI Agent involvement rate × AI Agent resolution rate). This single construct tells me what share of total volume AI actually resolves, how much work remains for humans, how much additional demand humans can absorb, and how ambitious I can be with headcount. Early in the journey, I prioritize raising involvement—getting the AI involved in more conversations. Once involvement is high, I shift to resolution on the hardest remaining work, where each additional 1% of automation can represent several people’s worth of capacity. In my 2026 plans, automation rate sits alongside projected inbound volume, average “output” per person for the more complex work that remains, and occupancy—how much time is allocated to customer-facing interactions versus operational and strategic work. Together, those inputs give a realistic picture of how many people you need and where they should spend their time. First, plan boldly on automation, but match it with investment. I do not cap automation assumptions at 40–50% “because AI is new.” Many teams are already modeling 60%, 70%, even 80%+ for 2026—when they invest in AI ownership and content. The investment is non-negotiable: named ownership for AI performance (AI ops, knowledge management, conversation design), clear automation targets by work type (e.g., informational vs. personalized vs. actions vs. deep troubleshooting), realistic expectations for what’s easy to automate and what’s not, and a concrete plan to raise automation over time in monthly or quarterly steps rather than a single jump. To decide where to invest first, I dig into the data. I start with the biggest volume drivers, separate content-led issues from those dependent on data or complex procedures, assume higher resolution potential for content-led topics once the knowledge base is in shape, and set more modest initial resolution expectations for system-dependent flows. Then I stair-step improvements as the systems, data contracts, and workflows mature. In short, bold automation goals only work when paired with the team structure, content, and systems required to reach them—and the discipline to iterate. Second, expect human “output” per person to go down. That’s a mindset shift. Historically, we assumed individual productivity would stay flat or tick up as tools improved. In an AI-first model, humans handle fewer conversations but more complex, cross-functional issues—and create more value despite lower case counts. I model a lower “cases closed per person” than prior-year baselines, explicitly assume the remaining work is more complex and time-consuming, and redefine productivity to include system-level work like AI Agent improvements, content updates, and policy or workflow change management. I also report “capacity created” from automation alongside human outputs, so leadership sees the full picture. Third, rethink occupancy: more time off the queues, on higher-value work. Traditional occupancy splits time between inbox and training, meetings, and breaks. Now there’s an expanding “out-of-inbox” portfolio that directly affects AI performance and overall capacity: reviewing AI-handled conversations, improving AI Agent triaging and handovers, contributing to content and procedures, feeding insights to product and engineering, and supporting system changes that reduce future volume. I set lower inbox occupancy targets than before and make the rationale explicit. People aren’t working less—they’re working differently. In planning, I assume more time spent on improvement and system work, make it visible (for example, X% in inbox and Y% on AI and system improvement), and treat this as critical, not a “nice to have.” If you don’t proactively allocate it, it won’t happen—and your automation and performance targets will suffer. Fourth, work with the finance team early, and treat your plan as a set of assumptions. Capacity planning with AI is a set of bets across automation rate, human output, demand growth, occupancy, and where surplus capacity (if any) goes. I bring finance in early, show that the plan is dynamic and directly tied to AI performance, and label every lever as an assumption with ranges. I commit to a quarterly review cadence with finance to compare assumptions versus reality and adjust headcount, targets, and investment as needed. The risks are real: if automation grows slower than expected and you stop backfilling too early, you’ll be understaffed for months. Hiring and onboarding take time, so course-correcting late creates strain. If you do produce surplus capacity, have a clear strategy to reallocate those teammates to higher-value work—improving systems, feeding insights back to product, supporting new channels, and driving proactive CX—rather than defaulting to reductions. I also set explicit guardrails—if automation rate misses by five points for two consecutive months, we pause planned reductions and revisit hiring gates. If it over-performs, we shift people into backlog eradication, content upgrades, or proactive outreach, so we bank compounding value. To set your team up for success in 2026, anchor your plan on automation rate, be honest that humans will handle fewer but harder conversations, and protect time for system improvements. Partner early and often with finance, avoid shrinking too fast, and design a plan for surplus capacity so you’re never caught flat-footed. If AI is going to handle the majority of your customer conversations, your plan has to be designed to help it do that well and to keep your team set up for meaningful, sustainable work. A 2026 plan built on adaptable assumptions—not fixed predictions—will hold up as your work, your systems, and your customers’ expectations continue to change. If you’d like future editions like this, subscribe and stay close—I’ll keep sharing what’s working, what isn’t, and how to tune your customer support AI strategy in real time.

    Inspired by this post on The Intercom Blog.


    Book a consult png image