Category: AI Strategy

  • The AI Deployment Gap Is Widening—Accelerate to Mature ROI and World-Class CX in 2026

    The AI Deployment Gap Is Widening—Accelerate to Mature ROI and World-Class CX in 2026

    I’ve watched AI adoption accelerate dramatically over the last year, and the momentum is undeniable. Teams everywhere are experimenting, piloting, and operationalizing AI—but the ways they’re doing it, and the outcomes they’re seeing, vary widely.

    Our latest research shows that 82% of senior leaders invested in AI for customer service in 2025, and 87% plan to in 2026. That’s the new baseline. The differentiator now is depth—how far AI is embedded into core workflows, accountability, and measurement.

    Infographic comparing AI benefits in customer service: 43% with mature deployment report higher quality and consistent support, versus 24% at initial deployment; survey allowed multiple responses.
    Teams with mature AI are almost twice as likely to achieve higher, more consistent support quality. Our survey shows 43% of advanced adopters citing this benefit compared with 24% of early deployments.

    But while most teams are using AI, our 2026 “Customer Service Transformation Report” shows that this usage is not equal. A gap is opening up between teams that have deployed AI at a surface level and those that have integrated it deeply. I see this firsthand: shallow deployments answer FAQs; deep deployments redesign processes, policies, and teams.

    Infographic comparing customer service improvements after AI: 87% of mature deployments report improved metrics vs 62% of all respondents, shown as pink and gray circles with legend and headline.
    Survey results highlight the AI deployment gap: nearly nine in ten organizations with mature AI see improved customer service metrics (87%), compared with 62% across all respondents, visualized with bold circles.

    For this year’s report, we surveyed over 2,400 global customer service professionals across a range of industries to see how they’re using AI today, where it’s paying off, and what they’re betting on as they plan for 2026. The findings mirror my experience leading AI Strategy and AI workflows at scale.

    Infographic of customer service teams measuring AI ROI by deployment stage: 70% mature, 60% scaling, 43% initial, 35% exploring, shown as donut charts, illustrating the deployment gap.
    As AI programs advance, measurement confidence surges. This chart shows how ROI tracking rises from 35% in exploring to 70% in mature deployments—evidence of a widening execution gap in customer service.

    We found that for many teams, AI is still doing narrow work like answering simple questions or handling small parts of workflows. These teams are seeing benefits, but only a fraction of what’s possible. Meanwhile, a smaller group is pulling away. They’ve put AI at the core of their service operation, integrating it into critical workflows, giving it more responsibility, and continuously improving it over time. That’s the hallmark of mature deployment.

    Side-by-side infographic comparing 2025 vs 2026 customer service priorities. In 2026, improving CX leads at 58%, followed by reducing costs and improving efficiency at 46%, with support quality still a key focus.
    Customer service priorities are shifting fast. By 2026, improving CX tops the list at 58%, cost and efficiency climb, and quality moves to third as teams prepare to scale operations and evolve skills.

    The difference in results and overall support experience – for both teams and customers – is significant. Here’s how I interpret the data and what I recommend to close the gap.

    Ranked customer service survey chart titled 'How are existing support roles changing on your team as a result of AI?' showing 45% updated job descriptions, 40% agent AI training, and other shifts at 27–24%.
    Survey insights from the 2026 customer service transformation report reveal how AI reshapes support roles: 45% of teams updated job descriptions and 40% ramped up AI training, while human agents focus more on complex escalations.

    AI adoption is the norm, depth makes the difference. According to senior leaders, 82% of organizations invested in AI in 2025, with 87% planning to invest in the year ahead. Despite this widespread investment, only 10% of teams report having reached a mature level of deployment, where AI is fully integrated into operations and working at scale. In my playbook, maturity means end-to-end ownership of well-defined workflows, robust guardrails, and clear success criteria.

    Survey chart showing drivers to expand AI beyond support: success with AI in support (57%), unified customer experience (49%), scaling without added headcount (33%), and cross-department demand (31%).
    Early AI wins are fueling expansion beyond support. Survey results show 57% cite proven success, 49% aim for a unified customer experience, 33% need to scale without adding headcount, and 31% see demand from other teams.

    Reaching this level of maturity is where AI’s real value lies. We found that 43% of teams with mature deployment report higher quality and consistency across support – nearly double the rate of those still in the exploration or initial deployment stages. That aligns with what I see when we move from point solutions to platform thinking and agentic AI patterns.

    Neon green hero graphic reading 'The 2026 Customer Service Transformation Report', with subhead 'The AI deployment gap is widening' and a black 'Get the report' button over a bar-chart pattern.
    Leaders are racing ahead with real AI in support. Explore the 2026 Customer Service Transformation Report to see where deployment is stalling, benchmark your team, and get practical steps to scale automation that delights.

    ROI becomes clearer with deeper integration. The economic benefits of AI tend to show up first in speed and throughput, and they show up fast. Across all respondents, 62% say their customer service metrics have improved since implementing AI. Most often, teams report their initial gains in efficiency and scale—faster responses, shorter handling times, and the ability to resolve more conversations with the same team—all driving lower cost per interaction.

    But the deeper teams go with deployment, the more the results start to show in the metrics. We found that among teams that describe their AI deployment as mature, the cohort of respondents reporting improved metrics as a result of AI rises from 62% to 87%. What’s more, teams with more mature deployments are significantly more likely to say they can measure the return on their AI investment. My advice: instrument everything upfront, baseline rigorously, and use eval-driven development to iterate with confidence.

    The bar has moved from ‘does it work?’ to ‘is it actually good?’ More than ever, teams are focused on improving customer experience and satisfaction, with 58% saying it’s the top priority for 2026. That number has more than doubled since last year, when just over a quarter (28%) of respondents cited it as a top priority. As AI assumes repetitive work, your people can shift from reactive triage to proactive journey design. Now is the time to invest in quality frameworks, prompt engineering standards, and LLMs for product managers to close the loop between product, ops, and CX.

    Important support work now extends beyond the inbox. AI is reorganizing core customer service operations as it starts to take on a higher volume of work and more complex tasks. Even at the initial deployment stage, 16% of teams report spending less time handling support volume since implementing AI – and among teams who’ve reached maturity, that figure rises to 28%. I’ve seen new roles emerge—AI operations managers, conversation designers, and model evaluators—alongside upskilling for agents into higher-order troubleshooting and relationship building.

    Support is creating the blueprint for AI deployment across the business. Support was the proving ground for AI, and our research suggests that businesses are now planning to expand its use to other areas based on the results it’s yielded so far. Fifty-two percent of respondents said that their organizations are actively planning to scale AI to departments like customer success, marketing, and sales in 2026. The two most cited driving forces behind this decision are the success support has seen with AI to date and a desire to create a unified customer experience. Treat your support stack as a reusable platform: shared services, governance, and reusable components accelerate adoption in adjacent functions.

    Seize the opportunity to close the gap. Having or not having AI isn’t a question anymore. What you should be asking now is how close you are to mature deployment, where AI is capable of tackling nuanced, high-stakes work. Those who have reached this stage show that going deep is what unlocks real value. That’s the opportunity. Push AI to do more, bring it to more channels, use it to resolve the most complex queries, and close the gap before it becomes too wide to close.

    This might seem daunting. But trying new things always is. What we’re experiencing now is a defining moment for customer service, and the teams that are leaning in are actively building the future. As this report shows, what works in customer service now will become the blueprint for how organizations transform the full customer journey with AI. If you want the benchmarks and the playbook to accelerate from pilots to production-grade outcomes, I recommend reviewing the full “2026 Customer Service Transformation Report.”


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • AI Operating Model Masterclass: How I Scale Teams, Tech, and Governance Without Chaos

    AI Operating Model Masterclass: How I Scale Teams, Tech, and Governance Without Chaos

    When I set out to operationalize AI across a product organization, I focus on one promise: repeatable outcomes without chaos. An effective AI operating model turns experiments into an engine—aligning strategy, teams, technology, and governance so we can ship value safely and at scale.

    At its core, an AI operating model is the connective tissue between vision and delivery. I anchor it on a few pillars: clear AI Strategy, empowered cross-functional teams, a modern AI platform, rigorous AI risk management and data governance, and a cadence of eval-driven development that ties everything back to outcomes.

    Strategy comes first. I translate big ambitions into a portfolio of use cases ranked by customer impact, feasibility, and risk. I use continuous discovery to validate the problem, then frame each bet with outcomes vs output OKRs, a crisp value proposition, and a build vs buy decision. For generative AI, I encourage PMs to treat LLMs for product managers as a craft—rapid prototyping, deliberate prompt engineering, and disciplined evaluation from day one.

    Team design matters as much as models. I organize around product trios—PM, design, and engineering—augmented by data, ML, and a “forward deployed” mindset when the domain is complex. I invest in empowered product teams and communities of practice to spread patterns quickly while avoiding centralized bottlenecks.

    On the platform side, I start retrieval-first pipeline before fancy modeling. A solid foundation—feature stores, vector search, observability, and safe integration points—beats bolt-on hacks. I rely on CI/CD with feature flags, strong deployment frequency, DORA metrics, and SRE-grade reliability to keep the iteration loop tight and safe.

    Governance is non-negotiable. I implement privacy-by-design, clear data governance, audit trails, and policy controls aligned to regulatory compliance. AI risk management includes model red teaming, safety layers, and human-in-the-loop review where needed. The goal is confidence: we know what shipped, why it works, and how it fails.

    Execution rides on eval-driven development. For every AI workflow, I define offline and online test sets, target metrics, and a decision policy before launch. I A/B test with proper minimum detectable effect (MDE), layer canaries for protection, and monitor user experience and outcomes in production. This is how we turn “it seems smarter” into statistically confident improvements.

    Adoption is a product in itself. I build onboarding, in-app guides, and product tours that help users form habits quickly. I monitor activation, time-to-value, and retention analysis while partnering with customer support ai strategy to close the loop between real-world issues and roadmap priorities.

    Culture scales the system. I normalize rapid learning, shared playbooks, and personal knowledge management so insights don’t disappear into meetings or notebooks. I upskill teams on prompt engineering, context window management, and model selection, and I celebrate the humility required to refactor what “worked” yesterday.

    Operating cadence keeps it all coherent. I run an AI portfolio review tied to outcomes vs output OKRs, keep a single source of truth for evaluations, and align go-to-market strategy with release readiness. We review risks alongside results so speed never outruns safety.

    If you’re starting from scratch, I recommend a 30-60-90 approach: baseline your current state, choose two lighthouse use cases, stand up the retrieval-first pipeline and eval harness, define governance and data policies, then ship small, safe increments behind feature flags. Teach the system to learn before you make it run.

    I’ve felt the pain of brilliant prototypes that crumble in production and the thrill of AI features that compound value month after month. The difference is the operating model. Build it with intent, and you’ll scale AI with confidence—teams aligned, tech resilient, and customers seeing real outcomes.


    Inspired by this post on Product School.


    Book a consult png image
  • Stop Losing Customers: Predict Churn with Digital Analytics and Act Before It’s Too Late

    Stop Losing Customers: Predict Churn with Digital Analytics and Act Before It’s Too Late

    I stopped treating churn as a postmortem and started treating it as a forecasting problem. When we instrument our product, connect the dots across journeys, and embed those signals into our daily operations, churn becomes predictable—and preventable. This shift has been one of the most impactful product strategy moves my teams have made for product-led growth and retention analysis.

    "Discover why and how CS teams can use digital analytics to take a proactive, predictive approach to churn, stopping it before it happens." That is exactly the mindset I bring to customer success and product collaboration: anticipate risk, intervene with precision, and demonstrate measurable impact.

    The practical work starts with leading indicators. I look at user activation milestones, time-to-first-value, feature adoption depth, frequency and recency of key events, account-level coverage (are multiple users active or just one champion?), usage volatility, and friction signals like repeated errors or stalled onboarding. These behavioral inputs are stronger predictors of churn than survey sentiment alone.

    From there, I create a churn risk score. Early on, a transparent rules-based model is usually enough to separate healthy from at-risk accounts. Over time, we can layer in supervised learning if the data supports it. I rely on Amplitude analytics, Pendo, or a unified analytics platform to tag events, build cohorts, and compute risk in near real time. This is where we consistently see the patterns that matter—especially around user activation and sustained adoption.

    Signals without action won’t save a customer, so I connect the model to our systems of engagement. Through CRM integration, at-risk accounts trigger clear playbooks for CSMs and lifecycle marketers. Inside the product, in-app guides address gaps exactly where they occur—guiding users to the next best action, unblocking onboarding, or showcasing the value hidden behind underused features.

    Because not every nudge works for every segment, we treat intervention design as a product problem and run A/B testing on copy, timing, channel, and offer. We test whether a contextual tooltip outperforms an email sequence, whether a short product tour beats a knowledge base link, and which incentives accelerate onboarding without cannibalizing expansion.

    Operationally, this is a team sport. Product, CS, and marketing meet in product trios to review risk cohorts, prioritize root-cause fixes, and tune playbooks. We run a weekly risk review to turn insights into decisions, and we use monthly business reviews to connect leading indicators to lagging outcomes like retention, expansion, and NRR.

    Measurement is non-negotiable. We pair retention analysis with qualitative feedback to understand whether our interventions truly change behavior. The goal is to close the loop: when a risk cluster improves, we codify the playbook; when a tactic underperforms, we learn, adjust, and try again. Over time, the organization builds a muscle for proactive, data-informed customer health management.

    If you’re getting started, begin by instrumenting events tied to value moments, define a simple health score, and stand up a basic alerting workflow. Pilot one or two interventions, measure lift, and iterate. Within a single quarter, you’ll have enough signal to prioritize product improvements and scale the practices that reliably reduce risk.

    Churn rarely surprises teams that listen to their data and respond in real time. With disciplined analytics, thoughtful in-product guidance, and tight alignment across CS and product, we can move from reacting to predicting—and keep more customers succeeding with far less effort.


    Inspired by this post on Amplitude – Perspectives.


    Book a consult png image
  • Build vs. Buy in an AI-First World: My Framework to De-Risk Decisions and Own Your Data

    Build vs. Buy in an AI-First World: My Framework to De-Risk Decisions and Own Your Data

    Build vs. buy is a decision that never truly goes away, and with AI reshaping the economics of software, I’m revisiting this question more frequently—and with more nuance—than ever. The temptation to “just build it” is real when prototypes are cheaper, shipping feels faster, and small tools can rival big platforms. But the real decision has never been about code; it’s about value, data, and long-term responsibility.

    Across product orgs at every stage, I see the same pattern: AI makes building feel easier—but it doesn’t eliminate the tradeoffs. The hard part is separating what differentiates your product from what simply supports it. That’s why I start by asking whether the capability is truly core to my value stream, and then I force myself to reason about ownership and maintenance, not just velocity.

    My rule of thumb remains simple: If something isn’t core to your value stream, don’t build it. And even when it is core, vendors may still be better positioned—especially for payments, invoicing, and infrastructure. Those domains carry deep operational complexity, continuous compliance, and reliability requirements that are easy to underestimate and painful to own.

    Here’s how this plays out for me. I would never build my own blogging platform. I moved from WordPress to Ghost, because publishing isn’t where I differentiate, and the long tail of upgrades, security, and performance is a drag on focus. The platform does the job, my audience gets a better experience, and my team avoids owning commodity maintenance work.

    On the other hand, I did build my own task management system—despite the abundance of excellent tools like Trello, Evernote, and OmniFocus. For me, tasks, notes, and workflows are deeply personal and idiosyncratic. I wanted my system to reflect how I think, plan, and communicate, with tight integration to my daily product rituals. In this case, the underlying data became the real product—and owning and controlling that data changed the equation.

    That’s the heart of the decision: When the underlying data becomes the real product, ownership matters. Task management, notes, and workflows evolve into a personalized operating system. The moment your data model represents your unique value—and your future differentiation—build vs. buy is no longer a tooling choice; it’s a strategy choice.

    AI is pushing this even further. Cheaper prototyping and “vibe coding” lower the cost of building. Tools like Claude Code and platforms from OpenAI make it viable to ship smaller, targeted tools that would have been uneconomical a few years ago. That expands the frontier of what teams can build without committing to a monolithic platform—and it puts pressure on vendors to improve data portability.

    Which brings me to vendor lock-in. Exports aren’t always enough. When I evaluate CRMs or course platforms, I look for more than CSV dumps. I want robust, well-documented APIs, webhook coverage, import/export parity, schema transparency, and a clear migration path. I’ve seen teams drown in brittle integrations with Salesforce or HubSpot, struggle to unwind course data from Teachable, or get stuck in signature workflows around DocuSign without a clean escape hatch. Portability is table stakes now.

    I treat build vs. buy as a discovery problem. Options are assumptions to test. On the build side, I run feasibility spikes: proof-of-concept integrations, latency checks, cost-to-serve models, and a sober read on maintenance. On the buy side, I trial vendors, not their marketing. I replicate a real workflow, test the edges, validate data portability, and simulate failure modes like vendor downtime or schema changes.

    A word of caution on complexity: “we can build anything” is not the same as “we should build this.” Long-lived products accumulate hidden complexity over time—security, privacy, performance, observability, SRE runbooks, QA automation, documentation, and compliance. Be honest about engineering capabilities and maintenance costs, especially when uptime and regulatory exposure are in play.

    My practical checklist looks like this: Is this core to our differentiation? Do we need to own the data model? How strong is data portability (APIs, webhooks, mapping, re-import)? What’s the true total cost of ownership over three years (people, ops, security, compliance)? Are there regulatory or reliability constraints better handled by a vendor? What’s the opportunity cost of not building something more strategic? And if we buy, what’s our exit plan?

    Ultimately, build vs. buy isn’t just about speed or cost—it’s about core value, data ownership, and long-term responsibility. AI lowers the barrier to building, but it doesn’t erase complexity. Treat build vs. buy decisions like any other discovery effort: test assumptions, prototype, and validate before committing. Ask not just can we build it, but should we own it?

    If you’re wrestling with vendor lock-in, fielding pressure to “just build it,” or rethinking your stack in an AI-first world, this lens will help you ask better questions before you commit. And if you’re exploring targeted builds alongside platforms like Stripe, Dropbox, Obsidian, or Ghost, I’d love to hear what’s working for you and where portability remains a hurdle.


    Inspired by this post on Product Talk.


    Book a consult png image
  • The Customer Feedback Playbook: AI-Powered Tactics I Use to Make Better Product Decisions

    The Customer Feedback Playbook: AI-Powered Tactics I Use to Make Better Product Decisions

    Customer feedback is the most reliable compass I have for product strategy and execution. Over the years leading product at HighLevel, I’ve built and refined a system that turns raw signals from users into clear, prioritized decisions our teams can confidently ship.

    A practical guide to collecting and using product feedback in product management (from AI tools to early-stage tactics) for better product decisions.

    My playbook starts with continuous discovery. I keep a steady flow of insights from sales calls, customer support threads, community forums, and in-product behavior so I can triangulate patterns rather than chase loud anecdotes. This mix of quantitative and qualitative data helps me separate urgent noise from strategically meaningful trends.

    On the quantitative side, I rely on product analytics to ground the conversation. Amplitude analytics gives me activation, retention cohorts, and feature engagement, while controlled experiments and A/B testing validate whether an idea actually moves a target metric. Tying these signals to specific customer segments helps me see where product-led growth is working—and where it’s stalling.

    For qualitative insight, I combine in-app guides and lightweight surveys (via tools like Pendo) with structured interviews and support escalations (often surfaced through platforms like Intercom). I map problems using the Kano Model to understand which requests are basic expectations, which are performance drivers, and which are potential delights. This keeps our roadmap focused on outcomes, not just outputs.

    AI now accelerates the synthesis step. With LLMs for product managers in my AI product toolbox, I summarize interview transcripts, cluster themes across thousands of notes, and quantify sentiment without losing nuance. I still review raw artifacts to avoid hallucinations and preserve context, but AI reduces the time from signal to insight dramatically—freeing me to spend more energy on judgment and storytelling.

    In early-stage contexts, I bias toward speed and proximity to users. I schedule founder- or PM-led discovery calls weekly, instrument product tours early, and launch scrappy in-product prompts to validate demand before over-investing. When data is sparse, I focus on high-signal channels (power users, churned customers with qualified use cases) and document crisp problem statements that connect directly to activation, retention analysis, and revenue outcomes.

    Prioritization ties everything together. I translate insights into hypotheses aligned to outcomes vs output OKRs, then pressure-test them with feasibility and strategic fit. We run small, measurable experiments, track deltas in activation and retention, and adjust the product roadmapping and sprint planning cadence based on what the data and customers teach us.

    This approach builds trust with stakeholders and creates empowered product teams. By grounding decisions in a transparent trail of feedback, analytics, and experiments, we reduce thrash, move faster, and—most importantly—ship product moments that customers value.

    If you’re refining your own feedback engine, start by instrumenting the basics, set a weekly discovery rhythm, and let AI handle the heavy lifting on aggregation and synthesis. The compounding effect is real: better insights lead to better bets, which lead to better outcomes for your users and your business.


    Inspired by this post on Product School.


    Book a consult png image
  • Stop Drowning in Dashboards: Real-Time Digital Analytics for Finserv Contact Centers

    Stop Drowning in Dashboards: Real-Time Digital Analytics for Finserv Contact Centers

    I’ve sat in enough finserv contact center reviews to know the pattern: wall-to-wall dashboards, weekly exports, and colorful charts that still leave teams asking, “So what should we do next?” The truth is, more dashboards rarely create better decisions. What we need is digital analytics that translates signals into action—fast, precise, and privacy-safe.

    When I say digital analytics, I mean a unified analytics platform that captures real-time behavioral data across voice, chat, IVR, email, and in-app journeys, then operationalizes it for agents, supervisors, and automated workflows. See how real-time behavioral analytics helps finserv contact centers lower costs, improve resolution speed, and deliver better member experiences.

    Dashboards tend to be lagging, siloed, and optimized for reporting, not resolving. They spotlight vanity metrics, bury journey-level friction, and rarely surface the “next best action” that actually moves a member request toward resolution. By the time a trend shows up in a weekly readout, the expensive part—handle time, repeat contacts, churn risk—has already accumulated.

    Real-time digital analytics flips that script. Instead of passively describing performance, it continuously detects intent, risk, and friction as interactions unfold—then powers targeted responses. For example, it can route high-risk transactions to specialized agents, prompt dynamic guidance during an escalated call, or trigger a proactive message that deflects a repeat contact. In practice, that means fewer transfers, faster resolution speed, and measurable reductions in operating costs.

    For finserv specifically, the payoff is immediate. Agent Analytics surfaces coaching opportunities (e.g., where scripts stall or compliance steps get missed). Retention analysis identifies members at churn risk after a negative experience. Journey analytics exposes where authentication fails or balance inquiries overwhelm queues, so you can intelligently deflect to self-service. And when a potential fraud signal appears mid-session, real-time insights can prioritize routing and alerting without sacrificing compliance.

    Implementation should be iterative and outcomes-driven. Start by instrumenting the top five journeys that drive the most cost or dissatisfaction (lost card, fraud dispute, loan status, password reset, payment issue). Tie each to clear outcomes vs output OKRs—think first-contact resolution, repeat-contact reduction, containment rate, and average time-to-resolution—so every analytic signal earns its keep. Then activate insights inside the workflow: agent assist prompts, smart routing, and targeted follow-ups that close the loop.

    Governance matters just as much as speed. In a regulated environment, privacy-by-design and data governance are non-negotiable. Build data access controls, audit trails, and consent management into your operating model from day one. Align analytics with regulatory compliance requirements to ensure that what you measure and automate is defensible, explainable, and safe for members and the business.

    To accelerate learning, pair digital analytics with controlled experiments. Use A/B testing on IVR flows, authentication steps, and post-call follow-ups to quantify what truly reduces transfers and repeat contacts. Define a minimum detectable effect (MDE) upfront so tests are fast and conclusive. Run continuous discovery with cross-functional product trios (operations, data, compliance) to turn insights into shippable improvements every sprint.

    On the stack side, focus on connecting systems you already trust. CRM integration ensures that context follows the member, while tools like Amplitude analytics, Pendo, or Intercom can instrument key digital touchpoints. Whether you choose build vs buy, the principle is the same: consolidate signals into a unified analytics platform, then push decisions and guidance back into the tools agents and members already use.

    The cultural shift is from reporting to decisioning. Instead of celebrating more charts, celebrate faster resolutions and fewer escalations. Replace static executive reports with alerting and action playbooks. Make it trivial for supervisors to see what changed, why it mattered, and which play to run next. That’s how you convert data into durable operating advantage.

    The mandate is clear: stop drowning in dashboards. Move to digital analytics that captures behavior in real time, respects compliance, and powers operational decisions where they matter most—in the member journey. When you do, cost curves flatten, resolution speed climbs, and member trust compounds.


    Inspired by this post on Amplitude – Perspectives.


    Book a consult png image
  • Building Physician‑Grade AI When Trust Is Everything: Inside Healio’s Proven Playbook

    Building Physician‑Grade AI When Trust Is Everything: Inside Healio’s Proven Playbook

    Trust is the currency of any high-stakes AI product, and nowhere is that more true than in healthcare. I recently dug into how Healio built an AI assistant for physicians—an audience that can’t afford to be wrong—and it’s a masterclass in balancing accuracy, transparency, and speed without compromising credibility.

    Healio, a 125-year-old medical publishing company, set out to create Healio AI to help clinicians prepare for patient care. From the outset, their guiding principle was simple: physicians won’t trust you until you prove it. That lens shaped every decision—from discovery and prototyping to architecture, evaluation, and ongoing validation.

    Discovery started with a survey of 300 healthcare professionals to understand real-world needs at the point of care. The headline insight: physicians primarily want AI for preparation, not bedside use. Even more surprising, the top ask wasn’t purely diagnostic support; it was help with patient communication and empathy—translating complex information into clear, accessible conversation.

    Momentum mattered. After beginning with Figma mockups to validate workflows, the team built a working prototype in a single weekend using Cursor. That velocity wasn’t about cutting corners; it was about proving value quickly, reducing ambiguity, and iterating with concrete feedback from physicians.

    Under the hood, the system employs RAG and hybrid search—combining lexical search, vector search, and semantic search across multiple trusted sources like PubMed. As any PM who has integrated biomedical literature knows, "just use PubMed" isn’t simple—there are five different ways to access the same data, each with trade-offs. The team made pragmatic choices to balance freshness, coverage, latency, and cost while preserving trust in source quality.

    Designing for trust extended all the way to the citation UX. The team leaned into citations that physicians actually trust: subscripts, hover states, and progressive disclosure. This gave clinicians verifiable threads back to source material without overwhelming the core interaction, aligning with how experts want to audit evidence under time pressure.

    Evaluation wasn’t left to chance. They stood up eight LLM judges for evals: safety, medical accuracy, faithfulness, relevancy, completeness, reasoning, clarity, and overall quality. Just as importantly, they treated those signals as directional, not definitive. In a high-stakes domain, physician feedback trumps LLM-as-judge feedback—so they complemented automated evals with direct reviews from practicing clinicians to calibrate quality and reduce hallucinations.

    On the safety front, the team implemented HIPAA compliance and input guardrails for masking personal health information. That choice reflects strong data governance and privacy-by-design thinking: protect PHI by default, constrain prompts to safe boundaries, and make compliance a first-class citizen in the product architecture.

    They also addressed monetization without compromising experience. Serving contextual ads while the LLM processes queries is a practical approach that preserves physician workflow efficiency and creates a clear, non-intrusive revenue model.

    Critically, the work didn’t stop at launch. The Healio Innovation Partners provide ongoing discovery and validation, ensuring the system evolves with physician needs and the medical evidence base. This is the operating cadence you want for any AI product that sits at the intersection of safety, accuracy, and fast-changing knowledge.

    My takeaways for building AI in high-stakes domains: prioritize retrieval-first pipelines over model cleverness; couple RAG with hybrid search across vetted sources; design citations that earn trust at a glance; use eval-driven development, but let domain-expert feedback be the ultimate judge; and embed regulatory compliance into your product strategy from day one. If trust is your North Star, this is a playbook worth emulating.


    Inspired by this post on Product Talk.


    Book a consult png image
  • AI-Powered Growth Loops: Transform Your PLG Product into a Self-Optimizing Engine

    AI-Powered Growth Loops: Transform Your PLG Product into a Self-Optimizing Engine

    Across my teams and portfolio, I’m watching AI fundamentally reshape product-led growth—from static funnels and one-off playbooks to adaptive, compounding growth loops that learn in real time. The shift isn’t just technological; it’s an operating model change that rewards continuous discovery, rigorous instrumentation, and outcome-driven product strategy.

    "Learn how AI is transforming PLG with a new generation of growth loops that can turn your product into a self-optimizing platform." That line captures what I’ve been building toward: systems that sense user intent, decide the next best action, act contextually, and learn to improve the loop with every interaction.

    Here’s the core pattern I rely on. First, sense: unify product analytics and behavioral signals (think Amplitude analytics, Pendo events, Intercom conversations) into a single, queryable, privacy-safe layer. Second, decide: apply AI Strategy—LLMs for product managers, rules, and retrieval—to segment users by intent and probability of success. Third, act: deliver in-app guides, product tours, tooltips, or personalized nudges that accelerate user activation and time-to-value. Finally, learn: run A/B testing with a clear minimum detectable effect (MDE), then feed outcomes back into the model for continuous optimization.

    Activation is where the gains start compounding. With gen ai, I can auto-generate tailored onboarding checklists, dynamic walkthroughs, and contextual help that adapts to the user’s role, data maturity, and current friction points. We’ve moved from generic product tours to precision guidance that updates based on real-time behavior—often lifting first-week activation and shortening time-to-first-value without adding support load.

    Experimentation is the governor that keeps speed and quality in balance. I instrument every growth loop end to end and pair eval-driven development with A/B testing to confirm incremental impact. Amplitude analytics gives me cohort views and path analysis; Pendo or Intercom can deliver in-app variants; a unified analytics platform closes the loop on retention analysis so I’m not optimizing for click-through at the expense of long-term value.

    Retention and expansion are where AI shines as a compounding engine. Retrieval-first pipeline patterns allow instant, contextual support that deflects tickets and boosts perceived product competence. Agentic AI can orchestrate next-best actions—prompting power users toward advanced features, surfacing value moments, or timing expansion prompts when success signals appear. The result is a virtuous cycle: better guidance drives deeper adoption, which improves model accuracy, which unlocks more relevant guidance.

    None of this works without guardrails. I bake in AI risk management from the start: strict data governance, privacy-by-design, human-in-the-loop review for high-impact actions, transparent user consent, and continuous drift monitoring. The goal is reliable automation that users trust—augmented by clear fail-safes when confidence drops.

    Operationally, I anchor the work in empowered product teams and product trios, focus on outcomes vs output OKRs, and practice continuous discovery to validate problems and solutions before scaling. The baseline metrics I watch: activation rate, time-to-value, week-four retention, PQL/PQA conversion, expansion revenue, and support deflection—each tied to a specific growth loop hypothesis.

    If you’re starting fresh, begin with the highest-leverage loop: user activation. Instrument your onboarding journey, define the critical path to value, ship two to three personalized interventions, and measure impact with a precommitted MDE. Scale what wins, drop what doesn’t, and iterate weekly. Once activation is compounding, extend the same approach to adoption depth, collaboration features, and expansion triggers.

    In practical terms, AI-powered PLG is less about flashy features and more about disciplined feedback loops. Build the sensing fabric, keep the decision layer auditable, ship small actions quickly, and treat learning as the product. Do that, and your product doesn’t just grow—it becomes a self-optimizing platform.


    Inspired by this post on Product School.


    Book a consult png image
  • Inside Product at Heart 2026: Bold Single-Track Vision, AI Everywhere, Deeper Connections

    Inside Product at Heart 2026: Bold Single-Track Vision, AI Everywhere, Deeper Connections

    I just tuned into the latest conversation on the upcoming Product at Heart 2026, and it hit on the exact challenges product leaders are navigating right now: curating meaningful content in a world where AI moves faster than our agendas, designing formats that create real connection, and ensuring every minute earns its place. Listening to Petra Wille and Teresa Torres map out the speaker lineup, workshops, and structural shifts, I found myself nodding along—this is the kind of thoughtful curation we need if we want product teams and product leaders to walk away with practical value, not just inspiration.

    Listen to this episode on: Spotify | Apple Podcasts

    What stood out immediately is the bold move to a single-track conference for 2026. In an era of gen ai hype and endless breakouts, this choice signals clear intent: tighter curation, a shared experience, and less FOMO. The team isn’t carving out a separate AI track—and I love that decision. Their stance is simple and sensible: No AI track—AI will show up everywhere, but not as a siloed topic. The team sees it as part of the everyday toolkit. That mirrors how high-performing, empowered product teams actually work today—AI Strategy and AI workflows are part of the operating system, not a side show.

    The keynote lineup is already compelling. Christian Idiodi (SVPG) brings storytelling that turns product principles into habits you can actually use on Monday. Elaine Kasket, cyber-psychologist, exploring digital afterlife and AI replicas, will push us to think more deeply about the human side of our systems. And Teresa Torres will be sharing what she’s learning about AI—exactly the kind of continuous discovery mindset we need as we integrate LLMs into product discovery and delivery.

    I’m also thrilled to see roundtables become what they’re calling an “alternative track.” That’s a smart way to deepen learning without fragmenting attention. The best conference ROI I’ve had often comes from targeted small-group conversations—where product trios compare approaches, swap metrics frameworks, or challenge each other’s product strategy assumptions. It’s a design choice that rewards curiosity and builds communities of practice.

    We also get a behind-the-scenes look at Teresa’s Maker Studio workshop, where participants will build personal AI workflows. That’s exactly the hands-on, practitioner-first approach teams need right now—less demo theater, more systems that stick. If your roadmap includes integrating LLMs into continuous discovery or augmenting your team’s decision velocity, this kind of guided practice is gold.

    The broader workshop slate looks deep and balanced. Expect returning favorites and practical frameworks: Rich Mironov on the realities of product leadership in complex orgs; Büşra’s metrics workshop translating outcomes into action; and an overview of additional workshops from Rich Mironov, Büşra Coşkuner, Marcus Castenfors, and Özlem Yüce. From success metrics to toolkits for product managers, the content spans IC to product management leadership—ideal if you’re stepping into new roles or scaling empowered product teams.

    One of the most exciting evolutions is the Product Leadership Event, now a 1.5-day retreat. The format blends talk sessions, mini-workshops, dinners, and small-group excursions (boat rides, improv, etc.), giving leaders time and space to exchange playbooks, stress-test decisions, and build real relationships. It’s capped at 60 attendees (all in product leadership roles) to keep it intimate and useful. As someone who believes in outcomes vs output OKRs and first principles decision making, I appreciate how this structure encourages depth over breadth—and real accountability among peers.

    Here are the core takeaways I’m carrying into my own planning: single-track means tighter curation, so every talk has to earn its place. Roundtables are growing into an “alternative track,” offering more ways to engage beyond stage talks. Workshops go deep and meet you where you are—IC, manager, or executive. And the leadership retreat expands to maximize learning from peers, not just from the stage. If you care about product discovery, product strategy, and conference networking that leads to actual business impact, this program looks thoughtfully engineered.

    If you’re planning your 2026 calendar—or just curious how conferences evolve alongside the craft—this is a thoughtful walkthrough of what to expect. Come say hi to Teresa and Petra—on stage, at a roundtable, or somewhere in the hallway conversations that make these events memorable.

    For more context and resources mentioned, explore: Product at Heart, Arne Kittler, Mind the Product, Christian Idiodi of Silicon Valley Product Group, Elaine Kasket, House of Beautiful Business, The 7 Habits of Highly Effective People by Stephen Covey, Rich Mironov, Marty Cagan, Claude Code, Codex by OpenAI, Marcus Castenfors, Büşra Coşkuner and her Success Metrics: A Playbook for Product Managers, Özlem Yüce’s Essential Toolkit for Product Managers, Petra’s Product Leadership Wheel (PLwheel), and Netlight.

    Follow Teresa Torres: https://ProductTalk.org

    Follow Petra Wille: https://Petra-Wille.com

    Full transcripts are only available for paid subscribers.


    Inspired by this post on Product Talk.


    Book a consult png image
  • How I Harness AI to Supercharge Product Discovery for Faster Research, Prototyping, and Validation

    How I Harness AI to Supercharge Product Discovery for Faster Research, Prototyping, and Validation

    I’ve led product teams through countless discovery cycles, and nothing has accelerated our learning loops like AI. By weaving AI into our continuous discovery practice at HighLevel, I cut time-to-insight, reduce risk earlier, and keep our product strategy relentlessly focused on customer outcomes.

    AI streamlines product discovery by accelerating research, prototyping, and validation, enabling teams to make faster, smarter, and user-driven decisions.

    In the research phase, I use gen ai and LLMs for product managers to synthesize interviews, cluster themes, and surface unmet needs in minutes instead of days. Pairing those qualitative insights with behavioral signals in Amplitude analytics helps me spot high-intent cohorts and friction points at scale, so our problem framing is both human-centered and data-backed.

    From there, I translate insights into crisp hypotheses and prioritize with the Kano Model and outcomes vs output OKRs. To keep experiments honest, I define a minimum detectable effect (MDE) up front and design A/B testing plans that reflect realistic traffic and seasonality, ensuring our decisions are statistically grounded rather than anecdotal.

    Prototyping is where gen ai for product prototyping really shines. I spin up multiple UX flows, UI copy variants, and edge-case scenarios using prompt engineering, then iterate with rapid feedback from product trios. When needed, I mock in-app guides and product tours to validate onboarding concepts before we commit to code, preserving velocity without sacrificing quality.

    For validation, I lean on a mix of lightweight experiments—fake-door tests, concierge pilots, and targeted A/B testing—augmented by in-product surveys via Pendo or Intercom. For AI-powered features, I apply eval-driven development to measure relevance, latency, and safety, so we can ship responsibly while maintaining the pace of learning.

    This approach only works when the team is structured to move fast. Empowered product teams and product trios own discovery end-to-end, with clear guardrails around data governance, privacy-by-design, and AI risk management. That alignment lets us shift from opinions to evidence, and from output to outcomes, without friction.

    If you’re getting started, pick one discovery loop to transform: automate research synthesis, prototype two to three variants with AI, and validate with a tightly scoped experiment. Instrument your analytics, track time-to-insight and time-to-prototype, and iterate your product roadmapping and sprint planning with what you learn. The payoff is immediate: faster cycles, stronger conviction, and a more user-driven path to product-led growth.


    Inspired by this post on Product School.


    Book a consult png image
  • From PDFs to Proposals: How Tendos AI’s Agent Swarm Automates Construction Quotes Fast

    From PDFs to Proposals: How Tendos AI’s Agent Swarm Automates Construction Quotes Fast

    Anyone who has lived inside construction tendering knows the grind. "When a construction company receives a bid request, someone has to open that email, parse the attached PDF (sometimes 1,800 pages describing an entire building), figure out which products are relevant, look up pricing, and draft a quote—all before the deadline. It's tedious, error-prone, and surprisingly manual." That painful reality is exactly why this conversation about Tendos AI caught my attention—and why it matters for product leaders building agentic AI in complex, document-heavy workflows.

    I listened as Daniel Kappler and Matthias Hilscher from Tendos AI walked through how they’re automating the tendering workflow for manufacturers in the construction industry. What began as a narrow prototype—matching radiator requests to product catalogs—has matured into a full agentic system that does the heavy lifting from email categorization to offer generation. The end result: a scalable AI workflow that tackles messy inputs, orchestrates specialized agents, and produces quotes that are ready for human review—or even straight-through processing.

    What impressed me most was the rigor. They validated the opportunity with a design partner, spent a week on-site observing real workflows, and then engineered a multi-agent architecture where specialized agents collaborate, including a "review agent" that checks work before anything reaches a human. They evaluate each agent independently (not just the whole chain), built custom observability when off-the-shelf tooling fell short, and use human-in-the-loop feedback to push toward a self-learning system.

    From a product management perspective, this is agentic AI done right. It blends continuous discovery with eval-driven development, thoughtful UX decisions, and pragmatic guardrails. Evaluating agents individually makes debugging tractable and change detection transparent; a dedicated "review agent" mirrors code review to reduce error propagation; and custom tracing plus Agent Analytics provide the observability needed to operate AI workflows reliably at scale.

    My key takeaway: "Start narrow to prove value: Tendos AI began with just radiators for one design partner before expanding to all building products"—a classic wedge strategy that accelerates learning while building credibility.

    Another takeaway I’ll adopt in future roadmaps: "Own the interface: building a web application (vs. integrating into legacy systems) gave them control over UX and the ability to iterate toward full automation." Controlling the surface area let them move faster than a purely backend integration ever could.

    On measurement and reliability, I loved this: "Evaluate each agent, not just the chain: per-agent evals make debugging tractable and show exactly where performance changed." That’s true eval-driven development—aligning metrics to decision points rather than only outcomes.

    Quality gates matter in automation, and they nailed it: "Use review agents: a separate agent that checks work (like code review) catches errors before they reach humans." It’s a simple pattern with outsized ROI.

    Finally, the product-market signal is unmistakable: "Let customers pull you: customers asked Tendos to replace their CPQ software—strong signals of product-market fit." When buyers invite you to displace existing systems, you’re past validation and into expansion.

    If you’re exploring agentic AI for enterprise workflows, the themes here are gold: the tendering chain in construction is ripe for automation; domain expertise accelerates opportunity discovery; robust entity extraction across PDFs ranging from 1 to 1,800+ pages is non-negotiable; planning patterns for creating and updating task plans matter; agents must reason about product fit against customer requirements; custom tracing and observability unlock debugging for complex agent chains; and human feedback loops pave the path to self-learning systems.

    Guests: Daniel Kappler — CPO (Product & Design), Tendos AI; Matthias Hilscher — CTO (Engineering), Tendos AI.

    Want to dive deeper? Listen to this episode on: Spotify | Apple Podcasts.

    Explore the team and product: Tendos AI.

    For builders of agentic AI, here’s my playbook distilled from this story: start narrow to earn trust and accuracy; own the interface to speed iteration; use per-agent evaluations to localize issues; add a "review agent" as a quality gate; invest early in tracing, observability, and Agent Analytics; keep humans in the loop until your metrics justify autonomy; and let strong pull signals guide your roadmap. That’s how you turn complex emails and massive PDFs into precise, production-grade quotes—consistently.


    Inspired by this post on Product Talk.


    Book a consult png image
  • AI Ethics That Win Trust: The Product Manager’s Playbook for Safe, Scalable Innovation

    AI Ethics That Win Trust: The Product Manager’s Playbook for Safe, Scalable Innovation

    I’ve learned that the fastest way to lose customers with AI is to ship something powerful but unpredictable. The fastest way to earn their loyalty is to ship something powerful and trustworthy. That’s the job.

    AI ethics in product management isn’t about theory anymore. It’s the line between trusted products and unpredictable ones. Here’s what PMs need to know.

    When I frame AI ethics for my team, I translate principles into practices that protect customers and accelerate velocity. We bake trust into product strategy, delivery, and operations—so ethics is not a separate checklist, but a core capability that compounds over time.

    First, I anchor the roadmap on explicit outcomes and guardrails. We set success metrics alongside ethical constraints, tying them to outcomes vs output OKRs, so teams know not only what to achieve but what to avoid. If a feature can’t meet our trust thresholds, it doesn’t ship—no matter how impressive the demo.

    Data is where trust starts. We enforce data governance from day one: clear data lineage, collection minimization, role-based access, and privacy-by-design defaults. We document lawful bases for processing, consent flows, and retention policies, then automate checks so they run with every change—not just at launch.

    On the model side, we use eval-driven development to turn subjective “looks good” into measurable quality. We design evaluations for safety, bias, robustness, and performance; we red-team prompts; and we test failure modes in realistic conditions. For LLMs, we lean on a retrieval-first pipeline to ground responses in authoritative data, and we apply context window management and prompt engineering patterns to reduce hallucinations.

    In the product experience, we make ethical choices visible. That means clear disclosures when AI is in the loop, user controls to review and correct outputs, and transparent UX writing that avoids overclaiming. In-app guides and thoughtful tooltip design help users understand capabilities and limits without friction.

    Shipping safely requires operational discipline. We build kill switches, human-in-the-loop overrides for high-risk actions, and incident playbooks that pair incident management with threat detection and response. SRE partnerships ensure observability covers both model behavior and customer impact, with rollback paths ready when drift or regressions appear.

    Governance is a team sport. I maintain an AI risk register, review it with security, legal, and product trios, and brief leadership on residual risks and mitigations. Regulatory compliance isn’t a final hurdle; it’s a design input that shapes technical choices long before code reaches production.

    Build vs buy decisions carry ethical implications too. Vendor due diligence covers model provenance, data handling, eval results, and incident history—not just feature checklists. Contracts codify SLAs, audit rights, and deletion commitments so our obligations to customers flow down the stack.

    Finally, we earn trust in public. We publish model facts, change logs, and limitations in a customer-facing trust center, and we invite feedback loops that turn real-world usage into better safeguards. Stakeholder management matters here: being candid about trade-offs often increases confidence more than chasing perfection.

    This is how I keep teams fast without being reckless: ethics as a product capability, not a poster. Build with intention, measure what matters, and make it easy for customers to understand, control, and benefit from your AI. That’s how we ship innovation that stays trusted—at scale.


    Inspired by this post on Product School.


    Book a consult png image