Tag: AI workflows

  • Becoming AI Native: A Practical Playbook to Transform Strategy, Teams, Data, and Tech

    AI Native is more than a feature set—it’s an operating system for the entire business. In my role leading product, I’ve seen that companies win when they treat AI as a first-class citizen across strategy, architecture, workflows, and go-to-market. In this narrative, I unpack what “AI Native: What It Means and How to Get There” looks like in practice, sharing the frameworks I use to align vision, technology, and teams around measurable customer outcomes.

    When I say AI Native, I mean a company where core value creation, customer experience, and internal operations are powered by AI end-to-end. It’s not just bolting on a chatbot. It’s rethinking product strategy, data foundations, and execution so we can deliver differentiated experiences faster, at lower cost, and with higher reliability. This shift demands clarity on where AI truly creates leverage—and the courage to say no where it doesn’t.

    The starting point is strategy. I ground teams in OKRs that measure outcomes rather than output, and in a crisp value proposition: Which customer jobs-to-be-done benefit most from generative AI? Where can we unlock 10x improvements in speed, accuracy, or personalization? We prioritize a small number of high-signal use cases, size their impact, and design Minimum Viable Experiments (MVEs) to de-risk assumptions before scaling. This is where build-vs-buy decisions matter: use foundation models and platforms for commodity needs, and invest your scarce engineering time where differentiation lives.

    Next comes architecture and data. AI Native products thrive on a retrieval-first pipeline, strong context window management, and model-agnostic abstraction so we can swap providers as needs evolve. I emphasize privacy-by-design, robust data governance, and observability across prompts, embeddings, latency, and cost. These guardrails let us move quickly without compromising trust, especially in regulated or enterprise settings.
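
    One way to picture the model-agnostic abstraction and per-call observability described above is the minimal sketch below. Every name in it (TextModel, EchoModel, UpperModel, Gateway, CallRecord) is hypothetical, and the two "providers" are toy stand-ins rather than real vendor SDKs.

```python
import time
from dataclasses import dataclass
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical interface: anything that turns a prompt into text."""
    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoModel:
    """Toy provider; a real adapter would wrap a vendor SDK here."""
    name: str = "echo-v1"

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt[:40]}"


@dataclass
class UpperModel:
    """Second toy provider, used to show that swapping needs no caller changes."""
    name: str = "upper-v1"

    def complete(self, prompt: str) -> str:
        return prompt.upper()


@dataclass
class CallRecord:
    model: str
    prompt_chars: int
    latency_s: float


class Gateway:
    """Routes calls to the configured provider and records basic telemetry."""

    def __init__(self, model: TextModel) -> None:
        self.model = model
        self.log: list[CallRecord] = []

    def ask(self, prompt: str) -> str:
        start = time.perf_counter()
        answer = self.model.complete(prompt)
        elapsed = time.perf_counter() - start
        self.log.append(CallRecord(self.model.name, len(prompt), elapsed))
        return answer
```

    Product code only ever talks to the Gateway, so changing providers is a one-line swap, and the log gives a starting point for tracking latency and prompt size per model. Cost and embedding metrics would be added the same way.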

    Execution shifts as well. I organize empowered product teams and product trios around the highest-value workflows, not components. Continuous discovery pairs with CI/CD, feature flags, and telemetry so we can test safely in production. Eval-driven development is non-negotiable: we design offline and online evaluations that mirror real user success criteria—accuracy, helpfulness, safety, and business outcomes—then wire those evals into the build pipeline to prevent regressions.
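
    To make "wire those evals into the build pipeline" concrete, here is a minimal sketch of an offline eval plus a regression gate. The substring check, the 2-point tolerance, and the names EvalCase, pass_rate, and gate are all assumptions chosen for illustration; real success criteria would be richer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    must_contain: str  # simplest possible success criterion (assumption)


def pass_rate(system: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases whose output contains the expected phrase."""
    if not cases:
        raise ValueError("eval set is empty")
    passed = sum(
        case.must_contain.lower() in system(case.prompt).lower()
        for case in cases
    )
    return passed / len(cases)


def gate(candidate_rate: float, baseline_rate: float, tolerance: float = 0.02) -> bool:
    """Allow the build only if the candidate has not regressed beyond tolerance."""
    return candidate_rate >= baseline_rate - tolerance
```

    In a CI job, the candidate's pass rate would be compared with the last released baseline, and a False from gate would fail the build.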

    On the intelligence layer, we increasingly rely on AI workflows and agentic AI to orchestrate multi-step tasks (retrieval, reasoning, tool use, and verification) with human-in-the-loop review where appropriate. Clear system prompts, tool definitions, and fallbacks keep behavior predictable. This is where product craft meets prompt engineering and LLM fluency for product managers: the best teams codify patterns, share prompts in a living library, and standardize on a lightweight AI product toolbox.

    Risk and reliability are part of the product, not an afterthought. I run AI risk management as a continuous program spanning red teaming, content filters, PII handling, audit trails, and incident response. We tie policies to concrete controls and create simple dashboards leaders can trust. The goal is to ship boldly with safety, maintainability, and scale in mind.

    Becoming AI Native also changes how we grow. We lean into product-led growth with clear in-app guides, product tours, and activation paths that teach users where AI shines. CRM integration ensures sales and success teams have context to coach customers. Pricing experiments—often usage- or value-based—align revenue with the impact customers feel, while retention analysis helps us double down on the use cases that drive compounding value.

    To make this real, I use a 90-day plan. Days 0–30: align on strategy, top use cases, and risk posture; stand up data pipelines and a basic retrieval-first stack; define evaluation metrics. Days 31–60: ship MVEs behind feature flags, run head-to-head evals, and instrument observability; start a cross-functional community of practice. Days 61–90: scale the winning use cases, formalize governance, and publish a roadmap tied to outcomes—not just features—with clear SLAs and success metrics.

    The destination is a durable advantage: faster iteration cycles, smarter experiences, and a product strategy that compounds with every interaction. If you’re ready to make the leap, start small, measure obsessively, and build the muscle to ship, learn, and adapt. That’s the heart of becoming AI Native—and it’s well within reach.


    Inspired by this post on Product School.


  • Inside Amplitude’s AI Playbook: Lessons from Leo Jiang on Ask Amplitude, Agents, and Visibility

    I continually study how high-velocity teams turn AI ambition into shipped product, and Amplitude’s approach stands out. "Leo Jiang is the Head of Engineering, AI Products at Amplitude, focused on building new AI and marketing products. He has helped build Ask Amplitude, Agents, and AI Visibility." From a product management leadership lens, that portfolio signals a clear AI strategy: enable insight (Ask Amplitude), drive action (Agents), and ensure trust and observability (AI Visibility).

    What I appreciate most is the sequencing: start with user-facing value, build agentic AI capabilities where tasks repeat and outcomes can be evaluated, and layer AI workflows with robust governance. For PMs working with LLMs, the implication is to define success via eval-driven development (quantitative rubrics, offline test sets, and real-time feedback loops) before scaling automation. This also hints at an emerging discipline of Agent Analytics: instrument prompts, tool calls, and outcome quality so we can tune performance like we tune a funnel.

    Ask Amplitude gives a relatable example: natural-language questions lower the activation barrier for product and growth teams inside an Amplitude analytics environment. When agents turn answers into next-best actions, product-led growth becomes measurable—from hypothesis to change to impact—inside a unified decision loop. That tight loop is where product strategy, design, and reliability meet to create compounding value.

    Operationally, I organize a product trio around each capability and pair it with forward deployed engineers to accelerate discovery with customers. I also invest in privacy-by-design and data governance early, ensuring marketing use cases respect compliance while keeping iteration speed high. The goal is a repeatable path from prototype to scale that preserves momentum without compromising safety.

    My takeaway for peers: pick one high-frequency workflow, define clear agent boundaries, ship a narrow slice, and measure relentlessly. Use a retrieval-first pipeline for grounding, add human-in-the-loop checkpoints, and close the loop with qualitative insights from in-app guides. When that works, expand capabilities, not just features, and let outcome-focused OKRs steer prioritization.


    Inspired by this post on Amplitude – Best Practices.


  • Build Your Personal Operating System with Claude Code: A Playbook for Focus, Speed, Clarity

    This is the year to build your personal operating system. For me, that line isn’t a slogan; it’s a commitment to eliminate context switching, compress decision cycles, and turn fragmented information into a reliable source of truth. As a product leader, I needed a system that blends judgment, data, and automation—so I built mine around Claude Code.

    When I say “personal operating system,” I mean an integrated set of AI workflows, rituals, and tools that capture knowledge, structure decisions, and automate execution. It’s where product discovery meets delivery: a place to synthesize signals, prioritize with clarity, and move from insight to action without friction. The outcome is fewer ad hoc decisions, more deliberate strategy, and a calmer, more focused day.

    Claude Code sits at the center because it helps me translate intent into working software and repeatable processes. I use it to scaffold small utilities, write adapters for APIs, and evolve prompts into robust patterns. It accelerates everything from research synthesis and PRD drafting to backlog grooming and stakeholder updates—while keeping me in the loop for final judgment.

    Under the hood, I run a retrieval-first pipeline that connects notes, docs, tickets, research transcripts, and roadmaps into a searchable, living memory. With careful context window management, I feed only the most relevant snippets into Claude Code, preserving accuracy and speed. The result: richer answers, fewer hallucinations, and an assistant that “remembers” what matters without drowning in noise.
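
    As a rough illustration of context window management, the sketch below ranks snippets by a toy relevance score and keeps only what fits a fixed budget. The word-overlap scoring and the word-count budget are stand-in assumptions; a real pipeline would more likely use embeddings and token counts.

```python
def score(query: str, snippet: str) -> int:
    """Toy relevance: number of distinct query words found in the snippet."""
    return len(set(query.lower().split()) & set(snippet.lower().split()))


def select_context(query: str, snippets: list[str], budget_words: int) -> list[str]:
    """Greedily keep the most relevant snippets that fit the word budget."""
    ranked = sorted(snippets, key=lambda s: score(query, s), reverse=True)
    chosen: list[str] = []
    used = 0
    for snippet in ranked:
        cost = len(snippet.split())
        # Skip snippets with no overlap at all, and any that would bust the budget.
        if score(query, snippet) > 0 and used + cost <= budget_words:
            chosen.append(snippet)
            used += cost
    return chosen
```

    Only the selected snippets are passed to the model, which is what keeps irrelevant notes out of the prompt.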

    My daily loop is simple: capture, synthesize, decide, and act. I capture customer signals and meeting notes into a personal knowledge management vault; synthesize patterns with prompt engineering that emphasizes evidence; decide against OKRs framed as outcomes rather than output; and act by generating drafts, creating tasks, and updating artifacts. Claude Code helps me wire this end-to-end, so the system works even on my busiest days.

    If you’re implementing this from scratch, start small. Pick one high-friction workflow—say, product feedback triage—and build a narrow agentic AI flow to classify, summarize, and route items. Use eval-driven development to test prompts against known edge cases. Add guardrails and privacy-by-design practices from day one, then expand to neighboring workflows once the first loop is reliable.
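
    A feedback-triage flow of the kind described above can start very small. In this sketch the keyword classifier is only a stand-in for a model call, and the labels, queue names, and truncation-as-summary are hypothetical; the one design choice worth keeping is that anything the classifier cannot place goes to a person.

```python
from dataclasses import dataclass

# Hypothetical labels and destination queues.
ROUTES = {"bug": "engineering", "billing": "finance", "feature": "product"}
KEYWORDS = {
    "bug": ("crash", "error", "broken"),
    "billing": ("invoice", "charge", "refund"),
    "feature": ("wish", "would like", "please add"),
}


@dataclass(frozen=True)
class Triage:
    label: str
    queue: str
    summary: str


def classify(text: str) -> str:
    """Keyword stand-in for a model call; returns 'unknown' when nothing matches."""
    lowered = text.lower()
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return label
    return "unknown"


def triage(text: str, max_summary_chars: int = 60) -> Triage:
    """Classify, summarize (here: truncate), and route one feedback item."""
    label = classify(text)
    # Unplaceable items are routed to a human instead of a guessed queue.
    queue = ROUTES.get(label, "human-review")
    return Triage(label, queue, text.strip()[:max_summary_chars])
```

    The known edge cases mentioned above become the test set: each one is a piece of feedback paired with the queue it should land in.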

    Governance matters. I treat AI risk management, data governance, and security as first-class citizens: limited data scopes, clear audit trails, human-in-the-loop approvals, and rollback plans. Feature flags control changes; observability tracks drift and quality; and a simple playbook documents how we deploy, monitor, and improve the system.

    Measure what this personal operating system earns you. Track decision latency, cycle time from signal to action, meeting-to-output ratios, and the signal-to-noise ratio of inputs. When the system is working, you’ll feel it: fewer meetings, more momentum, and sharper product strategy supported by trustworthy AI workflows.

    The goal isn’t to automate judgment—it’s to protect it. By letting Claude Code handle the glue work and information wrangling, I preserve energy for high-leverage thinking: positioning, sequencing, and trade-offs. Build your personal operating system now, and make this the year your product practice runs with clarity and composure.


    Inspired by this post on Pendo – Best Practices.


  • The AI Deployment Gap Is Widening—Accelerate to Mature ROI and World-Class CX in 2026

    I’ve watched AI adoption accelerate dramatically over the last year, and the momentum is undeniable. Teams everywhere are experimenting, piloting, and operationalizing AI—but the ways they’re doing it, and the outcomes they’re seeing, vary widely.

    Our latest research shows that 82% of senior leaders invested in AI for customer service in 2025, and 87% plan to in 2026. That’s the new baseline. The differentiator now is depth—how far AI is embedded into core workflows, accountability, and measurement.

    Figure: Teams with mature AI are almost twice as likely to report higher-quality, more consistent support: 43% of mature deployments cite this benefit versus 24% of initial deployments (the survey allowed multiple responses).

    But while most teams are using AI, our 2026 “Customer Service Transformation Report” shows that this usage is not equal. A gap is opening up between teams that have deployed AI at a surface level and those that have integrated it deeply. I see this firsthand: shallow deployments answer FAQs; deep deployments redesign processes, policies, and teams.

    Figure: The AI deployment gap in outcomes: 87% of organizations with mature AI deployments report improved customer service metrics, compared with 62% across all respondents.

    For this year’s report, we surveyed over 2,400 global customer service professionals across a range of industries to see how they’re using AI today, where it’s paying off, and what they’re betting on as they plan for 2026. The findings mirror my experience leading AI Strategy and AI workflows at scale.

    Figure: Share of customer service teams measuring AI ROI, by deployment stage: 35% exploring, 43% initial, 60% scaling, 70% mature. Measurement confidence rises as AI programs advance.

    We found that for many teams, AI is still doing narrow work like answering simple questions or handling small parts of workflows. These teams are seeing benefits, but only a fraction of what’s possible. Meanwhile, a smaller group is pulling away. They’ve put AI at the core of their service operation, integrating it into critical workflows, giving it more responsibility, and continuously improving it over time. That’s the hallmark of mature deployment.

    Figure: Customer service priorities, 2025 versus 2026. For 2026, improving CX leads at 58%, followed by reducing costs and improving efficiency at 46%, with support quality third.

    The difference in results and overall support experience – for both teams and customers – is significant. Here’s how I interpret the data and what I recommend to close the gap.

    Figure: How existing support roles are changing as a result of AI: 45% of teams updated job descriptions, 40% added AI training for agents, and other shifts were cited by 24–27%, while human agents focus more on complex escalations.

    AI adoption is the norm; depth makes the difference. According to senior leaders, 82% of organizations invested in AI in 2025, with 87% planning to invest in the year ahead. Despite this widespread investment, only 10% of teams report having reached a mature level of deployment, where AI is fully integrated into operations and working at scale. In my playbook, maturity means end-to-end ownership of well-defined workflows, robust guardrails, and clear success criteria.

    Figure: Drivers for expanding AI beyond support: success with AI in support (57%), a unified customer experience (49%), scaling without added headcount (33%), and cross-department demand (31%).

    Reaching this level of maturity is where AI’s real value lies. We found that 43% of teams with mature deployment report higher quality and consistency across support – nearly double the rate of those still in the exploration or initial deployment stages. That aligns with what I see when we move from point solutions to platform thinking and agentic AI patterns.


    ROI becomes clearer with deeper integration. The economic benefits of AI tend to show up first in speed and throughput, and they show up fast. Across all respondents, 62% say their customer service metrics have improved since implementing AI. Most often, teams report their initial gains in efficiency and scale—faster responses, shorter handling times, and the ability to resolve more conversations with the same team—all driving lower cost per interaction.

    But the deeper teams go with deployment, the more the results show up in the metrics. Among teams that describe their AI deployment as mature, the share of respondents reporting improved metrics as a result of AI rises from 62% to 87%. What’s more, teams with more mature deployments are significantly more likely to say they can measure the return on their AI investment. My advice: instrument everything upfront, baseline rigorously, and use eval-driven development to iterate with confidence.

    The bar has moved from ‘does it work?’ to ‘is it actually good?’ More than ever, teams are focused on improving customer experience and satisfaction, with 58% saying it’s the top priority for 2026. That number has more than doubled since last year, when just over a quarter (28%) of respondents cited it as a top priority. As AI assumes repetitive work, your people can shift from reactive triage to proactive journey design. Now is the time to invest in quality frameworks, prompt engineering standards, and LLM skills for product managers to close the loop between product, ops, and CX.

    Important support work now extends beyond the inbox. AI is reorganizing core customer service operations as it starts to take on a higher volume of work and more complex tasks. Even at the initial deployment stage, 16% of teams report spending less time handling support volume since implementing AI – and among teams who’ve reached maturity, that figure rises to 28%. I’ve seen new roles emerge—AI operations managers, conversation designers, and model evaluators—alongside upskilling for agents into higher-order troubleshooting and relationship building.

    Support is creating the blueprint for AI deployment across the business. Support was the proving ground for AI, and our research suggests that businesses are now planning to expand its use to other areas based on the results it’s yielded so far. Fifty-two percent of respondents said that their organizations are actively planning to scale AI to departments like customer success, marketing, and sales in 2026. The two most cited driving forces behind this decision are the success support has seen with AI to date and a desire to create a unified customer experience. Treat your support stack as a reusable platform: shared services, governance, and reusable components accelerate adoption in adjacent functions.

    Seize the opportunity to close the gap. Having or not having AI isn’t a question anymore. What you should be asking now is how close you are to mature deployment, where AI is capable of tackling nuanced, high-stakes work. Those who have reached this stage show that going deep is what unlocks real value. That’s the opportunity. Push AI to do more, bring it to more channels, use it to resolve the most complex queries, and close the gap before it becomes too wide to close.

    This might seem daunting. But trying new things always is. What we’re experiencing now is a defining moment for customer service, and the teams that are leaning in are actively building the future. As this report shows, what works in customer service now will become the blueprint for how organizations transform the full customer journey with AI. If you want the benchmarks and the playbook to accelerate from pilots to production-grade outcomes, I recommend reviewing the full “2026 Customer Service Transformation Report.”


    Inspired by this post on The Intercom Blog.


  • AI Operating Model Masterclass: How I Scale Teams, Tech, and Governance Without Chaos

    When I set out to operationalize AI across a product organization, I focus on one promise: repeatable outcomes without chaos. An effective AI operating model turns experiments into an engine—aligning strategy, teams, technology, and governance so we can ship value safely and at scale.

    At its core, an AI operating model is the connective tissue between vision and delivery. I anchor it on a few pillars: clear AI Strategy, empowered cross-functional teams, a modern AI platform, rigorous AI risk management and data governance, and a cadence of eval-driven development that ties everything back to outcomes.

    Strategy comes first. I translate big ambitions into a portfolio of use cases ranked by customer impact, feasibility, and risk. I use continuous discovery to validate the problem, then frame each bet with outcome-focused OKRs, a crisp value proposition, and a build-vs-buy decision. For generative AI, I encourage PMs to treat working with LLMs as a craft: rapid prototyping, deliberate prompt engineering, and disciplined evaluation from day one.

    Team design matters as much as models. I organize around product trios—PM, design, and engineering—augmented by data, ML, and a “forward deployed” mindset when the domain is complex. I invest in empowered product teams and communities of practice to spread patterns quickly while avoiding centralized bottlenecks.

    On the platform side, I start with a retrieval-first pipeline before fancy modeling. A solid foundation (feature stores, vector search, observability, and safe integration points) beats bolt-on hacks. I rely on CI/CD with feature flags, DORA metrics such as deployment frequency, and SRE-grade reliability to keep the iteration loop tight and safe.

    Governance is non-negotiable. I implement privacy-by-design, clear data governance, audit trails, and policy controls aligned to regulatory compliance. AI risk management includes model red teaming, safety layers, and human-in-the-loop review where needed. The goal is confidence: we know what shipped, why it works, and how it fails.

    Execution rides on eval-driven development. For every AI workflow, I define offline and online test sets, target metrics, and a decision policy before launch. I size A/B tests against a minimum detectable effect (MDE) chosen up front, layer canaries for protection, and monitor user experience and outcomes in production. This is how we turn “it seems smarter” into statistically confident improvements.
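
    Sizing a test against an MDE comes down to simple arithmetic. The sketch below uses the common normal-approximation formula for comparing two conversion rates, n per arm ≈ 2·p(1−p)·(z₁₋α/₂ + z_power)² / MDE², with an absolute MDE. The default alpha of 0.05 and power of 0.8 are conventional assumptions, not values from this post.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(
    baseline: float,
    mde: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Approximate users needed per arm to detect an absolute lift of `mde`.

    Uses the two-sided normal approximation with the baseline variance
    applied to both arms.
    """
    if not 0 < baseline < 1:
        raise ValueError("baseline must be a rate between 0 and 1")
    if mde <= 0:
        raise ValueError("mde must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for power = 0.8
    variance = 2 * baseline * (1 - baseline)
    return ceil((z_alpha + z_power) ** 2 * variance / mde ** 2)
```

    With a 20% baseline and a 2-point absolute MDE this comes to roughly 6,300 users per arm, and halving the MDE roughly quadruples that number, which is why the MDE has to be agreed before launch rather than after.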

    Adoption is a product in itself. I build onboarding, in-app guides, and product tours that help users form habits quickly. I monitor activation, time-to-value, and retention while partnering with customer support on AI strategy to close the loop between real-world issues and roadmap priorities.

    Culture scales the system. I normalize rapid learning, shared playbooks, and personal knowledge management so insights don’t disappear into meetings or notebooks. I upskill teams on prompt engineering, context window management, and model selection, and I celebrate the humility required to refactor what “worked” yesterday.

    Operating cadence keeps it all coherent. I run an AI portfolio review tied to outcome-focused OKRs, keep a single source of truth for evaluations, and align go-to-market strategy with release readiness. We review risks alongside results so speed never outruns safety.

    If you’re starting from scratch, I recommend a 30-60-90 approach: baseline your current state, choose two lighthouse use cases, stand up the retrieval-first pipeline and eval harness, define governance and data policies, then ship small, safe increments behind feature flags. Teach the system to learn before you make it run.

    I’ve felt the pain of brilliant prototypes that crumble in production and the thrill of AI features that compound value month after month. The difference is the operating model. Build it with intent, and you’ll scale AI with confidence—teams aligned, tech resilient, and customers seeing real outcomes.


    Inspired by this post on Product School.


  • Inside Product at Heart 2026: Bold Single-Track Vision, AI Everywhere, Deeper Connections

    I just tuned into the latest conversation on the upcoming Product at Heart 2026, and it hit on the exact challenges product leaders are navigating right now: curating meaningful content in a world where AI moves faster than our agendas, designing formats that create real connection, and ensuring every minute earns its place. Listening to Petra Wille and Teresa Torres map out the speaker lineup, workshops, and structural shifts, I found myself nodding along—this is the kind of thoughtful curation we need if we want product teams and product leaders to walk away with practical value, not just inspiration.

    What stood out immediately is the bold move to a single-track conference for 2026. In an era of gen AI hype and endless breakouts, this choice signals clear intent: tighter curation, a shared experience, and less FOMO. The team isn’t carving out a separate AI track, and I love that decision. Their stance is simple and sensible: no AI track, because AI will show up everywhere rather than as a siloed topic. The team sees it as part of the everyday toolkit. That mirrors how high-performing, empowered product teams actually work today: AI strategy and AI workflows are part of the operating system, not a side show.

    The keynote lineup is already compelling. Christian Idiodi (SVPG) brings storytelling that turns product principles into habits you can actually use on Monday. Elaine Kasket, cyber-psychologist, exploring digital afterlife and AI replicas, will push us to think more deeply about the human side of our systems. And Teresa Torres will be sharing what she’s learning about AI—exactly the kind of continuous discovery mindset we need as we integrate LLMs into product discovery and delivery.

    I’m also thrilled to see roundtables become what they’re calling an “alternative track.” That’s a smart way to deepen learning without fragmenting attention. The best conference ROI I’ve had often comes from targeted small-group conversations—where product trios compare approaches, swap metrics frameworks, or challenge each other’s product strategy assumptions. It’s a design choice that rewards curiosity and builds communities of practice.

    We also get a behind-the-scenes look at Teresa’s Maker Studio workshop, where participants will build personal AI workflows. That’s exactly the hands-on, practitioner-first approach teams need right now—less demo theater, more systems that stick. If your roadmap includes integrating LLMs into continuous discovery or augmenting your team’s decision velocity, this kind of guided practice is gold.

    The broader workshop slate looks deep and balanced. Expect returning favorites and practical frameworks: Rich Mironov on the realities of product leadership in complex orgs, Büşra Coşkuner’s metrics workshop translating outcomes into action, and further workshops from Marcus Castenfors and Özlem Yüce. From success metrics to toolkits for product managers, the content spans IC to product management leadership, which is ideal if you’re stepping into new roles or scaling empowered product teams.

    One of the most exciting evolutions is the Product Leadership Event, now a 1.5-day retreat. The format blends talk sessions, mini-workshops, dinners, and small-group excursions (boat rides, improv, etc.), giving leaders time and space to exchange playbooks, stress-test decisions, and build real relationships. It’s capped at 60 attendees (all in product leadership roles) to keep it intimate and useful. As someone who believes in outcome-focused OKRs and first-principles decision-making, I appreciate how this structure encourages depth over breadth, along with real accountability among peers.

    Here are the core takeaways I’m carrying into my own planning: single-track means tighter curation, so every talk has to earn its place. Roundtables are growing into an “alternative track,” offering more ways to engage beyond stage talks. Workshops go deep and meet you where you are—IC, manager, or executive. And the leadership retreat expands to maximize learning from peers, not just from the stage. If you care about product discovery, product strategy, and conference networking that leads to actual business impact, this program looks thoughtfully engineered.

    If you’re planning your 2026 calendar—or just curious how conferences evolve alongside the craft—this is a thoughtful walkthrough of what to expect. Come say hi to Teresa and Petra—on stage, at a roundtable, or somewhere in the hallway conversations that make these events memorable.

    For more context and resources mentioned, explore: Product at Heart, Arne Kittler, Mind the Product, Christian Idiodi of Silicon Valley Product Group, Elaine Kasket, House of Beautiful Business, The 7 Habits of Highly Effective People by Stephen Covey, Rich Mironov, Marty Cagan, Claude Code, Codex by OpenAI, Marcus Castenfors, Büşra Coşkuner and her Success Metrics: A Playbook for Product Managers, Özlem Yüce’s Essential Toolkit for Product Managers, Petra’s Product Leadership Wheel (PLwheel), and Netlight.

    Follow Teresa Torres: https://ProductTalk.org

    Follow Petra Wille: https://Petra-Wille.com


    Inspired by this post on Product Talk.


  • From PDFs to Proposals: How Tendos AI’s Agent Swarm Automates Construction Quotes Fast

    Anyone who has lived inside construction tendering knows the grind. "When a construction company receives a bid request, someone has to open that email, parse the attached PDF (sometimes 1,800 pages describing an entire building), figure out which products are relevant, look up pricing, and draft a quote—all before the deadline. It's tedious, error-prone, and surprisingly manual." That painful reality is exactly why this conversation about Tendos AI caught my attention—and why it matters for product leaders building agentic AI in complex, document-heavy workflows.

    I listened as Daniel Kappler and Matthias Hilscher from Tendos AI walked through how they’re automating the tendering workflow for manufacturers in the construction industry. What began as a narrow prototype—matching radiator requests to product catalogs—has matured into a full agentic system that does the heavy lifting from email categorization to offer generation. The end result: a scalable AI workflow that tackles messy inputs, orchestrates specialized agents, and produces quotes that are ready for human review—or even straight-through processing.

    What impressed me most was the rigor. They validated the opportunity with a design partner, spent a week on-site observing real workflows, and then engineered a multi-agent architecture where specialized agents collaborate, including a "review agent" that checks work before anything reaches a human. They evaluate each agent independently (not just the whole chain), built custom observability when off-the-shelf tooling fell short, and use human-in-the-loop feedback to push toward a self-learning system.

    From a product management perspective, this is agentic AI done right. It blends continuous discovery with eval-driven development, thoughtful UX decisions, and pragmatic guardrails. Evaluating agents individually makes debugging tractable and change detection transparent; a dedicated "review agent" mirrors code review to reduce error propagation; and custom tracing plus Agent Analytics provide the observability needed to operate AI workflows reliably at scale.

    My key takeaway: "Start narrow to prove value: Tendos AI began with just radiators for one design partner before expanding to all building products"—a classic wedge strategy that accelerates learning while building credibility.

    Another takeaway I’ll adopt in future roadmaps: "Own the interface: building a web application (vs. integrating into legacy systems) gave them control over UX and the ability to iterate toward full automation." Controlling the surface area let them move faster than a purely backend integration ever could.

    On measurement and reliability, I loved this: "Evaluate each agent, not just the chain: per-agent evals make debugging tractable and show exactly where performance changed." That’s true eval-driven development—aligning metrics to decision points rather than only outcomes.
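
    To make the per-agent idea concrete, here is a minimal sketch of how I would structure it. The two agents, the SKUs, and the eval cases are hypothetical stand-ins (the episode does not describe Tendos AI's actual code), and a real agent would call a model rather than match keywords.

```python
from dataclasses import dataclass
from typing import Callable


# Hypothetical stand-ins for two agents in a chain; real agents would call a model.
def categorize_email(text: str) -> str:
    return "bid_request" if "tender" in text.lower() else "other"


def match_product(request: str) -> str:
    catalog = {"radiator": "RAD-100", "valve": "VLV-200"}  # made-up SKUs
    for keyword, sku in catalog.items():
        if keyword in request.lower():
            return sku
    return "NO_MATCH"


@dataclass
class EvalCase:
    agent_input: str
    expected: str


def accuracy(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases where the agent's output equals the expected label."""
    hits = sum(agent(case.agent_input) == case.expected for case in cases)
    return hits / len(cases)


# Each agent has its own labeled cases, so a regression is localized to one step.
EVAL_SUITES = {
    "categorize_email": (categorize_email, [
        EvalCase("Tender for office building, PDF attached", "bid_request"),
        EvalCase("Lunch on Friday?", "other"),
    ]),
    "match_product": (match_product, [
        EvalCase("Panel radiator, 600 x 1000 mm", "RAD-100"),
        EvalCase("Thermostatic valve, DN15", "VLV-200"),
        EvalCase("Underfloor heating manifold", "NO_MATCH"),
    ]),
}


def run_per_agent_evals() -> dict[str, float]:
    """Score every agent in isolation and report one number per agent."""
    return {name: accuracy(agent, cases) for name, (agent, cases) in EVAL_SUITES.items()}


if __name__ == "__main__":
    for name, score in run_per_agent_evals().items():
        print(f"{name}: {score:.0%}")
```

    Because each agent is scored on its own inputs, a drop in one number points at one step instead of the whole chain.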

    Quality gates matter in automation, and they nailed it: "Use review agents: a separate agent that checks work (like code review) catches errors before they reach humans." It’s a simple pattern with outsized ROI.
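
    A review gate can be very small. The sketch below is my own illustration, not Tendos AI's implementation: the `QuoteLine` fields, the rules, and the routing labels are all assumptions, and a production review agent would likely use a model rather than fixed rules.

```python
from dataclasses import dataclass, field


@dataclass
class QuoteLine:
    sku: str
    quantity: int
    unit_price: float


@dataclass
class Review:
    approved: bool
    issues: list[str] = field(default_factory=list)


def review_quote(lines: list[QuoteLine], known_skus: set[str]) -> Review:
    """Rule-based reviewer (illustrative); collects every issue it finds."""
    issues: list[str] = []
    if not lines:
        issues.append("quote has no line items")
    for i, line in enumerate(lines):
        if line.sku not in known_skus:
            issues.append(f"line {i}: unknown SKU {line.sku}")
        if line.quantity <= 0:
            issues.append(f"line {i}: quantity must be positive")
        if line.unit_price <= 0:
            issues.append(f"line {i}: unit price must be positive")
    return Review(approved=not issues, issues=issues)


def route(review: Review) -> str:
    """Approved drafts go to a human; rejected drafts return to the drafting agent."""
    return "send_to_human" if review.approved else "return_to_drafting_agent"
```

    The useful property is that the reviewer is separate from the drafter, so its checks can be evaluated and tightened independently.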

    Finally, the product-market signal is unmistakable: "Let customers pull you: customers asked Tendos to replace their CPQ software—strong signals of product-market fit." When buyers invite you to displace existing systems, you’re past validation and into expansion.

    If you’re exploring agentic AI for enterprise workflows, the themes here are gold: the tendering chain in construction is ripe for automation; domain expertise accelerates opportunity discovery; robust entity extraction across PDFs ranging from 1 to 1,800+ pages is non-negotiable; planning patterns for creating and updating task plans matter; agents must reason about product fit against customer requirements; custom tracing and observability unlock debugging for complex agent chains; and human feedback loops pave the path to self-learning systems.

    Guests: Daniel Kappler — CPO (Product & Design), Tendos AI; Matthias Hilscher — CTO (Engineering), Tendos AI.

    Want to dive deeper? Listen to this episode on: Spotify | Apple Podcasts.

    Explore the team and product: Tendos AI.

    For builders of agentic AI, here’s my playbook distilled from this story: start narrow to earn trust and accuracy; own the interface to speed iteration; use per-agent evaluations to localize issues; add a "review agent" as a quality gate; invest early in tracing, observability, and Agent Analytics; keep humans in the loop until your metrics justify autonomy; and let strong pull signals guide your roadmap. That’s how you turn complex emails and massive PDFs into precise, production-grade quotes—consistently.


    Inspired by this post on Product Talk.


  • AI Ethics That Win Trust: The Product Manager’s Playbook for Safe, Scalable Innovation

    AI Ethics That Win Trust: The Product Manager’s Playbook for Safe, Scalable Innovation

    I’ve learned that the fastest way to lose customers with AI is to ship something powerful but unpredictable. The fastest way to earn their loyalty is to ship something powerful and trustworthy. That’s the job.

    AI ethics in product management isn’t about theory anymore. It’s the line between trusted products and unpredictable ones. Here’s what PMs need to know.

    When I frame AI ethics for my team, I translate principles into practices that protect customers and accelerate velocity. We bake trust into product strategy, delivery, and operations—so ethics is not a separate checklist, but a core capability that compounds over time.

    First, I anchor the roadmap on explicit outcomes and guardrails. We set success metrics alongside ethical constraints, tying them to outcomes vs output OKRs, so teams know not only what to achieve but what to avoid. If a feature can’t meet our trust thresholds, it doesn’t ship—no matter how impressive the demo.

    Data is where trust starts. We enforce data governance from day one: clear data lineage, collection minimization, role-based access, and privacy-by-design defaults. We document lawful bases for processing, consent flows, and retention policies, then automate checks so they run with every change—not just at launch.

    On the model side, we use eval-driven development to turn subjective “looks good” into measurable quality. We design evaluations for safety, bias, robustness, and performance; we red-team prompts; and we test failure modes in realistic conditions. For LLMs, we lean on a retrieval-first pipeline to ground responses in authoritative data, and we apply context window management and prompt engineering patterns to reduce hallucinations.

    In the product experience, we make ethical choices visible. That means clear disclosures when AI is in the loop, user controls to review and correct outputs, and transparent UX writing that avoids overclaiming. In-app guides and thoughtful tooltip design help users understand capabilities and limits without friction.

    Shipping safely requires operational discipline. We build kill switches, human-in-the-loop overrides for high-risk actions, and incident playbooks that pair incident management with threat detection and response. SRE partnerships ensure observability covers both model behavior and customer impact, with rollback paths ready when drift or regressions appear.
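
    As a rough sketch of how a kill switch and a human-in-the-loop override can sit in front of an agent's actions: the action names and the `Guardrails` shape below are hypothetical, and in practice the switch would live in a feature-flag or config service rather than in code.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_HUMAN = "needs_human"
    BLOCKED = "blocked"


@dataclass
class Guardrails:
    kill_switch_on: bool = False
    # Hypothetical examples of actions that always need a human sign-off.
    high_risk_actions: frozenset[str] = frozenset({"issue_refund", "delete_account"})


def decide(action: str, guardrails: Guardrails, human_approved: bool = False) -> Decision:
    """Kill switch wins over everything; high-risk actions wait for approval."""
    if guardrails.kill_switch_on:
        return Decision.BLOCKED
    if action in guardrails.high_risk_actions and not human_approved:
        return Decision.NEEDS_HUMAN
    return Decision.EXECUTE
```

    Checking the kill switch first means a single flag stops all AI actions, including ones a human has already approved.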

    Governance is a team sport. I maintain an AI risk register, review it with security, legal, and product trios, and brief leadership on residual risks and mitigations. Regulatory compliance isn’t a final hurdle; it’s a design input that shapes technical choices long before code reaches production.

    Build vs buy decisions carry ethical implications too. Vendor due diligence covers model provenance, data handling, eval results, and incident history—not just feature checklists. Contracts codify SLAs, audit rights, and deletion commitments so our obligations to customers flow down the stack.

    Finally, we earn trust in public. We publish model facts, change logs, and limitations in a customer-facing trust center, and we invite feedback loops that turn real-world usage into better safeguards. Stakeholder management matters here: being candid about trade-offs often increases confidence more than chasing perfection.

    This is how I keep teams fast without being reckless: ethics as a product capability, not a poster. Build with intention, measure what matters, and make it easy for customers to understand, control, and benefit from your AI. That’s how we ship innovation that stays trusted—at scale.


    Inspired by this post on Product School.


  • New Year, New Product Habits: AI Workflows, Coaching Culture, and Community in 2026

    New Year, New Product Habits: AI Workflows, Coaching Culture, and Community in 2026

    Happy New Year! I’m kicking off 2026 with a behind-the-scenes look at what’s changing in my product practice, the experiments I’m running with my teams at HighLevel, and the trends I’m most energized by—especially around continuous discovery, AI workflows, and building stronger coaching cultures.

    If you want to listen to the conversation that sparked many of these reflections, you can find it here: Spotify | Apple Podcasts.

    Why Teresa sunset the live deep-dive cohorts—and how on-demand and the new Discovery Habits Toolbox better support real behavior change. This pivot resonated with my own experience: some skills, especially discovery habits, only stick when they’re reinforced in the flow of real product work, not just in a time-boxed cohort. In my org, we’re leaning into on-demand learning paired with manager coaching to drive durable behavior change.

    What leaders actually need to coach interviewing, assumption testing, and core discovery habits inside their orgs. I’ve found that empowered product teams thrive when leaders have lightweight coaching tools, practical prompts, and clear expectations for product trios. This is less about one-off training and more about building communities of practice where deliberate practice and feedback loops become routine.

    Why training is shifting toward ongoing, leader-supported learning (and how AI will accelerate the shift). AI strategy isn’t just about tools; it’s about learning systems. For LLMs to create leverage for product managers, we need eval-driven development, privacy-by-design, and clear guardrails. I’m building AI workflows that enable managers to review interviews, spot anti-patterns, and nudge teams toward better decisions without replacing critical thinking.

    Teresa’s move into paid subscriptions and why AI content doesn’t fit the classic “design once, run for years” course model. I see the same reality in my content roadmap: the half-life of AI guidance is short. That pushes us toward subscription models, tighter feedback loops, and a more adaptive go-to-market strategy for education products.

    A sneak peek into the AI tools Teresa is building for discovery work—from interview coaching to near-ready interview snapshot generation. I’m particularly excited by tooling that scaffolds better interviews, sharpens assumption testing, and speeds up synthesis without skipping the human judgment step. These capabilities map directly to where I want my teams investing time: spending less energy on admin and more on learning from customers.

    Petra’s plans for the year: community building with Product at Heart, a new product leadership email course, her Product Leadership Wheel, and workshops launching in Cairo. As someone who believes in conferences as high-quality “energy wells,” I’m inspired by how these programs create momentum for leaders who are upgrading their coaching muscles.

    The role of conferences and retreats in staying grounded, inspired, and connected. I treat these gatherings as strategic resets—spaces to test ideas, confront blind spots, and deepen my network for future collaboration. The best outcomes often come from serendipitous hallway conversations and hands-on sessions where you can pressure test frameworks with peers.

    How Teresa is staying on top of academic research (and why “synthetic users” aren’t ready for prime time). I agree: while synthetic data can be useful for scaffolding, it’s not a substitute for direct customer contact. Combine academic rigor with real-world interviewing and strong data governance, especially when operating under the General Data Protection Regulation (GDPR).

    The shared challenge of evaluating vendors and conference speakers making questionable AI claims. My heuristic: ask for clear problem statements, reproducible evaluations, grounded benchmarks, and a path to safe deployment. If a pitch can’t show measurable uplift or ignores compliance, it’s not ready for empowered product teams.

    Key takeaways I’m carrying into 2026: delivery models matter; leaders need coaching tools, not just training; AI is reshaping how we teach and learn; experimentation is the theme of 2026; and community still energizes. That’s the blueprint I’m using to strengthen continuous discovery, refine our AI workflows, and sustain high standards in product management leadership.

    What about you? How are you integrating AI workflows into your discovery practice, and what coaching tools are helping your managers reinforce the right habits? Share your approach—I’d love to learn what’s working in your context.

    Resources & Links:

    Follow Teresa Torres: https://ProductTalk.org

    Follow Petra Wille: https://Petra-Wille.com

    Teresa’s website: Product Talk

    General Data Protection Regulation (GDPR)

    Product Talk Academy

    Deliberate Practice – ATP episode where Teresa talked about ending the live cohorts for Deep Dive classes

    Teresa’s Discovery Habits Toolbox program

    Petra’s A 52-Week Transformation Journey

    Teresa’s Product Talk subscriptions (AI workflows + discovery content)

    Claude Code

    The Interview Coach by Teresa

    Product at Heart Conference (Hamburg)

    Petra’s Coaching Packages

    Petra’s Ways We Can Work Together

    Petra’s Product Leadership Wheel (PLwheel)

    Petra’s Product Manager Wheel (PMwheel)

    Prdkt+ MENA Product Summit 2026

    World Beautiful Business Forum by House of Beautiful Business

    Melissa Suzuno

    Vistaly (Teresa’s integration partner for some upcoming AI tools)

    Teresa’s Just Now Possible podcast


    Inspired by this post on Product Talk.


  • The Modern Playbook for AI Agents: Build One‑Person Departments and Scale with Amplitude

    The Modern Playbook for AI Agents: Build One‑Person Departments and Scale with Amplitude

    I’ve spent the last few years turning AI from an intriguing demo into an operational advantage, and the clearest wins come when we treat agents as productized workflows—not toys. In practice, that means aligning agentic AI to a sharp product strategy, instrumenting everything, and scaling what works across the organization.

    Learn how companies like Replit are consolidating workflows, creating one-person departments, and building systems for scale with Amplitude.

    When I talk about agentic AI, I’m focused on outcomes: fewer handoffs, faster cycle times, and measurable uplift in activation, retention, and NPS. The most successful rollouts start with a specific job-to-be-done, translate it into clear AI workflows, and then iterate with a tight feedback loop between data, design, and engineering.

    My implementation playbook is simple and disciplined. First, choose a high-friction workflow and define success upfront. Second, make the build vs buy call on the foundation model, orchestration layer, and connectors. Third, establish AI risk management and safeguards early—before scale amplifies errors. Finally, run small, eval-driven releases and promote what performs.

    Instrumentation is where the leverage compounds. With Amplitude analytics as a unified analytics platform, I design purposeful events (agent intent, tool calls, resolution state, human handoff), map funnels from user input to agent outcome, and cohort users by context to pinpoint lift. This gives me an honest read on where agents help, where they hinder, and what to tune next.
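
    Here is a small, tool-agnostic sketch of that event design. It deliberately does not use the Amplitude SDK: the event names mirror the ones above, while the `Event` shape and the funnel logic are my own simplifications (for instance, the funnel ignores event order within a session).

```python
from collections import defaultdict
from dataclasses import dataclass

# Ordered funnel steps, from user input to agent outcome.
FUNNEL = ["agent_intent", "tool_call", "resolution"]


@dataclass(frozen=True)
class Event:
    session_id: str
    name: str  # one of FUNNEL, or "human_handoff"


def funnel_counts(events: list[Event]) -> dict[str, int]:
    """Count sessions that reached each step and every step before it."""
    seen: dict[str, set[str]] = defaultdict(set)
    for event in events:
        seen[event.session_id].add(event.name)
    counts = {step: 0 for step in FUNNEL}
    for names in seen.values():
        for step in FUNNEL:
            if step not in names:
                break  # session dropped out here; later steps do not count
            counts[step] += 1
    return counts


def handoff_rate(events: list[Event]) -> float:
    """Share of sessions that were handed off to a human at least once."""
    sessions = {event.session_id for event in events}
    handed_off = {event.session_id for event in events if event.name == "human_handoff"}
    return len(handed_off) / len(sessions) if sessions else 0.0
```

    Reading the funnel next to the handoff rate shows whether sessions drop out silently or escalate to a person, which call for different fixes.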

    The “one-person departments” concept isn’t about doing more with less at all costs; it’s about assembling a tight loop of product management leadership, data, and automation so one operator can own a business outcome end-to-end. An agent handles the repeatable work, while the human focuses on judgment, edge cases, and continuous improvement that compounds.

    As we scale, I look for platform scalability patterns: shared tools and policies, reusable prompt libraries, standardized evaluation suites, and consistent governance. That structure keeps agent performance predictable while preserving speed, and it aligns beautifully with product-led growth when agents are embedded directly in the product experience.

    If you’re starting now, begin with a single, valuable workflow. Instrument it thoroughly with Amplitude analytics, make decisions from the data you see—not the demos you remember—and expand only after you’ve proven uplift. Iteration beats ambition here: agentic AI rewards teams who measure relentlessly and scale only what truly works.


    Inspired by this post on Amplitude – Perspectives.


  • Build vs Buy in 2026: How I Make Confident, AI-Savvy Software Decisions That Scale

    Build vs Buy in 2026: How I Make Confident, AI-Savvy Software Decisions That Scale

    Every planning cycle, I’m asked the same high-stakes question: should we build or buy? In 2026, with generative AI reshaping the software landscape and budgets under scrutiny, the classic calculus needs an upgrade. The right call can accelerate time to value, protect precious engineering capacity, and sharpen competitive differentiation—while the wrong one can quietly inflate total cost of ownership for years.

    “Navigate the build vs buy software dilemma, learn how AI is changing the game, and what you should leverage (and when).” That’s been my north star for product strategy this year, and it’s how I guide teams when the pressure is on.

    My first principle is simple: build where we differentiate, buy where we need parity. If the capability is central to our value proposition or our defensibility, I’m inclined to build—often with a phased approach that de-risks scope. If it’s a non-differentiating layer (think billing, analytics plumbing, basic CRM integration), I’ll buy to accelerate, then revisit once scale and specialization justify a deeper internal investment.

    AI changes the equation on both sides. On the “buy” side, modern platforms now ship agentic AI, fine-tuning options, and robust APIs that let us compose advanced capabilities fast. On the “build” side, AI workflows and toolchains (from code copilots to eval-driven development) compress cycle time, making bespoke solutions more attainable. The trade-off has shifted from pure functionality to questions of AI risk management, model governance, data privacy, and the portability of prompts, embeddings, and training data.

    I evaluate decisions across two economic horizons: time to value versus total cost of ownership. Buying often wins the first round—faster deployment, proven reliability, and lower initial lift. But TCO can creep: integration work, per-seat or consumption SaaS pricing, training, vendor-driven roadmap gaps, and the “shadow ops” of maintaining connectors in our CI/CD. Building flips that profile: slower early velocity, higher upfront complexity, but potentially lower long-run costs and tighter fit with our platform scalability goals.
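
    A quick way to see the two horizons is to compare cumulative cost month by month. The sketch below uses a deliberately simple model (one upfront cost plus a flat monthly cost), and all figures in the usage note are made up for illustration.

```python
from typing import Optional


def cumulative_cost(upfront: float, monthly: float, months: int) -> float:
    """Total spend after a number of months under a flat-cost model."""
    return upfront + monthly * months


def break_even_month(
    buy_upfront: float,
    buy_monthly: float,
    build_upfront: float,
    build_monthly: float,
    horizon: int = 120,
) -> Optional[int]:
    """First month in which building is cumulatively cheaper than buying.

    Returns None if building never becomes cheaper within the horizon.
    """
    for month in range(1, horizon + 1):
        build_total = cumulative_cost(build_upfront, build_monthly, month)
        buy_total = cumulative_cost(buy_upfront, buy_monthly, month)
        if build_total < buy_total:
            return month
    return None
```

    With an illustrative buy option at 20,000 upfront and 10,000 per month, and a build option at 200,000 upfront and 4,000 per month, `break_even_month(20_000, 10_000, 200_000, 4_000)` returns 31: buying wins for the first two and a half years, and building wins after that. Real costs are rarely flat, so I treat the result as a conversation starter rather than a forecast.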

    Operational risk matters just as much as features. I look at incident management posture, SRE maturity, SLAs, and DORA metrics to gauge resilience. If a vendor can’t meet our uptime and recovery expectations—or if their roadmap pace mismatches our deployment frequency—we’re effectively renting risk we can’t control. Conversely, if our team can’t realistically support the operational burden, buying is the safer choice.

    Security, regulatory compliance, and data governance are non-negotiables. I assess privacy-by-design, data residency, audit logs, role-based access, SOC2/ISO coverage, and threat detection and response. For AI-heavy systems, I add model lineage, red-teaming practices, PII handling, and retention policies. If we can’t verifiably meet our obligations in a build scenario within the launch window, we buy and require clear data exit and portability clauses.

    To keep decisions objective, I use a lightweight scorecard across five dimensions: differentiation, urgency/time to value, regulatory/security risk, integration complexity, and AI leverage/portability. We weight criteria with product trios (PM, design, engineering), run discovery spikes, and validate assumptions with stakeholder management up front. A disciplined scorecard curbs recency bias and helps us communicate trade-offs to leadership.
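
    For illustration, a scorecard like this fits in a few lines. The five dimensions come from the paragraph above; the weights, the 1-to-5 scale, and the "too close" threshold are placeholder assumptions that each team would set for itself.

```python
# Illustrative weights; they must sum to 1.0. Teams should agree on their own.
WEIGHTS = {
    "differentiation": 0.30,
    "time_to_value": 0.20,
    "regulatory_security_risk": 0.20,
    "integration_complexity": 0.15,
    "ai_leverage_portability": 0.15,
}


def weighted_score(scores: dict[str, int]) -> float:
    """Weighted sum of 1-5 scores, where 5 means 'strongly favors this option'."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the five scorecard dimensions")
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be scored from 1 to 5")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def recommend(build: dict[str, int], buy: dict[str, int], min_gap: float = 0.25) -> str:
    """Pick the higher-scoring option, unless the gap is too small to trust."""
    build_total, buy_total = weighted_score(build), weighted_score(buy)
    if abs(build_total - buy_total) < min_gap:
        return "too close: run a discovery spike"
    return "build" if build_total > buy_total else "buy"
```

    The "too close" branch matters as much as the scores: when the gap is small, the honest answer is more discovery, not a coin flip.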

    In practice, I favor staged commitments. When uncertainty is high, we buy to learn—ship value quickly, instrument usage, and collect evidence. If adoption proves sticky and integration pain remains moderate, we double down with deeper vendor integration. If we uncover unique needs or cost inflection points, we pivot to a build plan that reuses learnings, data models, and UX patterns from the bought solution to reduce risk.

    AI-specific choices deserve their own pass. For example, if we need retrieval-augmented generation, I’ll often buy for the orchestration and observability layer while building our domain-specific retrieval-first pipeline and prompt engineering guardrails. That split gives us speed plus control: we retain our IP and data gravity while tapping best-in-class tooling that evolves with the ecosystem.

    Vendor strategy matters as much as technology. I negotiate clear data export, transparent API quotas, sandbox environments for continuous discovery, and price protections for growth. I pressure-test roadmaps, ask for integration references, and align on outcome-based milestones rather than feature checklists. Strong partners welcome this rigor; weak ones stall—another useful signal.

    On the build side, I right-size ambition. We target minimum lovable scope, isolate risk in early sprints, and leverage open source where it’s mature and secure. We design for modularity so we can swap components without rewriting the world, and we budget time for in-app guides and product tours to smooth adoption, because user activation is the real finish line.

    Here’s the playbook I return to: buy to validate and compress time to value; build to differentiate and reduce long-run TCO; continuously re-evaluate as the AI toolchain and our scale evolve. With a transparent scorecard, a bias for learning, and a clear view of risk, the build vs buy decision becomes less of a leap of faith and more of a repeatable product management capability.

    2026 will reward teams that move fast without mortgaging the future. Make the call deliberately, instrument the outcomes, and stay humble—because the best strategy is the one you can adapt as new evidence arrives.


    Inspired by this post on Product School.


  • Inside the AI Customer Service Shift: What 166 Leaders Told Me About Teams, Roles, and ROI

    Inside the AI Customer Service Shift: What 166 Leaders Told Me About Teams, Roles, and ROI

    I wanted to cut through the hype and see what’s actually changing inside customer service teams as AI agents like Fin move from pilots to production. So I analyzed 166 interviews with support leaders, managers, and frontline specialists to understand how roles, workflows, and team structures evolve once AI becomes part of everyday work.

    The anecdotes were already loud: AI tools are transforming customer support. But the scale, shape, and consistency of that transformation? Less clear. I went to the source—the practitioners living it—to quantify what’s real and what’s next for customer support AI strategy.

    Here’s what I gleaned from the data.

    TL;DR — What’s changing

    AI is reorganizing core CS operations: Nearly every team (≈95%) reported meaningful workflow changes. Triage, routing, translation, and categorization are increasingly automated. Hybrid human+AI systems are taking their place.

    Frontline work is shifting toward AI oversight: Humans now QA, monitor, and test AI outputs. When it comes to handling queries, they step in for nuance rather than repetition.

    Structural change is widespread but uneven across companies: 83% reported new responsibilities or roles. Some built AI pods, while others retained traditional setups.

    Tier 1 headcount demand is falling: 28% saw hiring freezes, slowdowns, or natural attrition at the Tier 1 level as AI agents manage more requests and improve operational efficiency.

    Skill gaps are widening inside teams: Data literacy, QA, and cross-functional communication are all rising in value. For many companies, long-term role strategy is lagging behind.

    Research methodology

    The goal of this research is to understand how many customer service teams have changed their roles, responsibilities, and ways of working after adopting AI agents, and how these changes manifest within their organizations.

    The data for this study consists of interviews conducted by the research team with Intercom customers or prospects. These interviews were chosen because they focus on the participant’s individual experience, which makes them more likely to contain information about role changes.

    The data was collected using Snowflake by pulling all interviews stored in Gong that were conducted by a member of the research team between 1 January 2025 and 14 October 2025.

    After the data was pulled, a Python script cleaned the corpus of each retrieved conversation. Common English stopwords (e.g. “and”, “very”, “with”) were removed, along with all text spoken by anyone other than the interview participant(s). This reduced the computational power required for coding the conversations, avoided API timeouts, and lowered costs.
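
    The cleaning script itself isn’t published, so the following is only a sketch of what such a step could look like. The stopword list is a tiny illustrative subset, and the `(speaker, text)` transcript shape is an assumption about how the Gong export was structured.

```python
import re

# Tiny illustrative stopword list; the study's actual list is not given.
STOPWORDS = {"and", "very", "with", "the", "a", "to", "of"}


def clean_transcript(turns: list[tuple[str, str]], participants: set[str]) -> str:
    """Keep only participant turns, lowercase them, and drop stopwords."""
    kept: list[str] = []
    for speaker, text in turns:
        if speaker not in participants:
            continue  # drop interviewer and any other non-participant speech
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        kept.extend(token for token in tokens if token not in STOPWORDS)
    return " ".join(kept)
```

    For example, a participant turn of "We moved to QA and monitoring with the Fin agent." is reduced to "we moved qa monitoring fin agent", while the interviewer's question is dropped entirely.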

    After the corpus was cleaned, the OpenAI API was used with a prompt to code each conversation against a closed codebook.

    The codes used were:

    No role change mentioned: No explicit changes to roles, teams, or reporting lines are attributed to AI/Fin.

    Role responsibilities changed due to AI/Fin: Duties/ownership moved between humans and AI/Fin, or scope of a role changed because AI/Fin handles tasks.

    Team structure/reporting changed due to AI/Fin: Org/team boundaries, team charters, or reporting lines changed due to adopting AI/Fin.

    Headcount/hiring impacted due to AI/Fin: Hiring plans, headcount, staffing coverage, or shifts/rotations changed due to AI/Fin.

    Workflow/process changed due to AI/Fin: Steps, triage/escalations, routing, or playbooks changed because AI/Fin alters the process.

    Other organizational changes due to AI/Fin: Other changes inside the organization due to AI/Fin that don’t involve a change in responsibilities, team structure/reporting lines, headcount, or workflows/processes.
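
    The prompt and the API call used in the study are not shown, so this sketch covers only the parts that can be reasoned about safely: building a prompt from the closed codebook and rejecting any reply that strays outside it. The prompt wording, the JSON reply format, and the `call_model` parameter (a placeholder for whatever function actually calls the model) are all assumptions.

```python
import json
from typing import Callable

# The six closed codes listed above.
CODEBOOK = [
    "No role change mentioned",
    "Role responsibilities changed due to AI/Fin",
    "Team structure/reporting changed due to AI/Fin",
    "Headcount/hiring impacted due to AI/Fin",
    "Workflow/process changed due to AI/Fin",
    "Other organizational changes due to AI/Fin",
]


def build_prompt(transcript: str) -> str:
    """Hypothetical prompt: list the codes and ask for a JSON array back."""
    codes = "\n".join(f"- {code}" for code in CODEBOOK)
    return (
        "Assign every code that applies to the interview below. "
        "Use only these codes and reply with a JSON array of strings.\n"
        f"{codes}\n\nInterview:\n{transcript}"
    )


def parse_codes(raw_reply: str) -> list[str]:
    """Reject anything outside the closed codebook instead of silently keeping it."""
    labels = json.loads(raw_reply)
    if not isinstance(labels, list) or not labels:
        raise ValueError("expected a non-empty JSON array of codes")
    unknown = [label for label in labels if label not in CODEBOOK]
    if unknown:
        raise ValueError(f"codes outside the codebook: {unknown}")
    # De-duplicate and return in codebook order for stable downstream counts.
    return sorted(set(labels), key=CODEBOOK.index)


def code_conversation(transcript: str, call_model: Callable[[str], str]) -> list[str]:
    """call_model is a placeholder for the real model call (prompt in, text out)."""
    return parse_codes(call_model(build_prompt(transcript)))
```

    Validating against the codebook is what keeps the coding "closed": a model that invents a seventh category fails loudly instead of skewing the percentages.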

    Data analysis

    166 conversations were retrieved. More than 90% of them report some change in role, team, or processes after implementing Fin or a similar AI product; only 13 were coded “No role change mentioned”.

    Each conversation could have multiple codes associated with it (M = 2.35, Med = 2, Min = 1, Max = 4, N = 166).

    More specifically, after implementing Fin or a similar AI product:

    94.58% of participants reported having their processes and workflows disrupted

    82.53% of participants reported seeing their role and responsibilities change

    27.71% of participants reported changes in company headcount or hiring

    6.02% of participants reported their team structure or reporting lines changing as a result

    Additionally, 16.27% of participants reported a change for a different reason from the ones highlighted above (“Other organizational changes due to AI/Fin”).

    Sample representativeness

    At a confidence level of 90%, the sample of 166 conversations gives a margin of error of ±6.4% (assuming an unknown overall population size). The individual confidence intervals for each type of change are as follows.

    Workflow/process changed due to AI/Fin: 157 (94.6%), 90% CI: 91.7% – 97.5%

    Role responsibilities changed due to AI/Fin: 137 (82.5%), 90% CI: 77.7% – 87.4%

    Headcount/hiring impacted due to AI/Fin: 46 (27.7%), 90% CI: 22.0% – 33.4%

    Other organizational changes due to AI/Fin: 27 (16.3%), 90% CI: 11.6% – 21.0%

    No role change mentioned: 13 (7.8%), 90% CI: 4.4% – 11.3%

    Team structure/reporting changed due to AI/Fin: 10 (6.0%), 90% CI: 3.0% – 9.1%
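
    The write-up doesn’t name the interval method, but the figures above are consistent with the standard normal-approximation (Wald) interval for a proportion, using z = 1.645 for 90% confidence and p = 0.5 for the worst-case margin of error. A minimal sketch under that assumption:

```python
import math

Z_90 = 1.645  # two-sided z-value for a 90% confidence level


def margin_of_error(p: float, n: int, z: float = Z_90) -> float:
    """Normal-approximation margin of error for a proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)


def wald_interval(count: int, n: int, z: float = Z_90) -> tuple[float, float]:
    """Wald confidence interval (low, high) for count successes out of n."""
    p = count / n
    moe = margin_of_error(p, n, z)
    return p - moe, p + moe


if __name__ == "__main__":
    n = 166
    print(f"worst-case margin of error: {margin_of_error(0.5, n):.1%}")
    for label, count in [("workflow/process", 157), ("team structure", 10)]:
        low, high = wald_interval(count, n)
        print(f"{label}: {count / n:.1%} ({low:.1%} to {high:.1%})")
```

    One caveat: the Wald interval is known to be unreliable for proportions close to 0% or 100%, so the intervals for the 94.6% and 6.0% rows should be read as rough guides.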

    Thematic analysis

    1) Automation and AI integration replacing manual steps (94.58%). I see AI workflows embedding into every stage of support. Manual triage, routing, translations, and repetitive responses shift to Fin or similar systems, while agents focus on human-in-the-loop oversight.

    Agents’ day-to-day work now revolves around monitoring or fine-tuning AI outputs, not replying to the same questions. In many teams, conversations enter Fin first; humans only step in when nuance or exception handling is required. Testing, QA, and rollout practices have matured too—teams track Fin’s accuracy and iterate intentionally.

    2) Humans shift to oversight, AI handles execution (82.53%). The role resets are unmistakable. Support agents and managers move from high-volume execution to optimization, configuration, and measurement. New roles emerge—AI specialists, automation managers, Fin owners—while responsibilities migrate toward strategic analysis and quality assurance.

    Duties are redistributed: Fin takes on refunds, triage, simple messaging, even parts of the sales process. I’ve watched some careers pivot toward product/ops or AI systems strategy as managers coordinate testing and monitor adoption metrics.

    3) Reductions or slower growth due to efficiency gains (27.71%). Efficiency is real. Many teams reduce Tier 1 headcount needs or slow hiring because AI absorbs simpler requests. Others reallocate people to complex work or AI management. A few still expand—adding automation engineers, implementation specialists, or technical AI leads—but not at past growth rates.

    The upshot: organizations handle more volume while stabilizing or reducing staffing, especially at the frontline tier.

    4) New AI teams, flatter orgs, fewer escalation layers (6.02%). I’m seeing organizational design catch up to the tech. Some companies form dedicated LLM or automation teams. Others flatten hierarchies, design around workflow complexity instead of region, or merge roles. Dedicated escalation layers shrink as Fin routes or resolves more autonomously.

    Team design is getting more modular and data-driven, with clearer ownership for configuration, governance, and Agent Analytics.

    5) Broader digital transformation and operational modernization (16.27%). Beyond support, companies are modernizing their operating model: automation-first, digital self-service, better data foundations, and new vendor ecosystems. Collaboration patterns between data, ops, CX, and product/engineering are tightening, with a culture of experimentation and continuous improvement taking hold.

    How have customer service roles and responsibilities changed due to Fin/AI agent implementation?

    Implementing Fin or a similar AI agent profoundly changes how an organization operates, with around 95% of participants reporting some level of change in their processes after implementation. These systems have significantly reshaped the workflows that customer service teams are used to. Tasks once performed manually, such as ticket triage, routing, repetitive responses, and translations, are now handled by AI agents.

    As a result, customer service agents’ responsibilities have shifted from performing manual tasks to monitoring and fine-tuning the AI agent whenever its output is inaccurate or incomplete. This marks a clear transformation in how customer service agents work: moving away from directly resolving customer queries to focusing on more analytical and procedural work, such as testing, QA, and performance analysis of AI outputs.

    Human agents who still handle conversations tend to do so either because the AI agent cannot yet respond adequately, or because of an organizational choice to retain human involvement for sensitive or high-value interactions. Nevertheless, the need for such roles is diminishing. Around 28% of participants reported a reduction in Tier 1 staff, a hiring slowdown, or a full hiring freeze, as AI agents increasingly manage simple requests and organizational attention shifts towards improving automation efficiency.

    However, this transformation is not uniform across companies. While some roles have disappeared (particularly escalation layers), others have emerged. Many organizations are reallocating existing staff to AI management or hiring new technical profiles such as automation engineers, implementation specialists, and AI leads. In some cases, this has led to the creation of specialized AI teams, reorganizations around workflow complexity, or the merging and redefinition of existing roles.

    Around 83% of participants reported changes to their roles or responsibilities following the introduction of Fin or similar AI agents. Specifically, customer service agents who no longer handle basic queries now focus on managing AI performance, reviewing Fin tasks and improving automation outputs. Managers oversee AI evaluation and implementation, coordinate testing, and monitor AI metrics such as resolution and involvement rates. In some organizations, new dedicated roles have emerged—AI specialists, automation managers, or Fin owners—reflecting a strategic shift toward automation-first, digital self-service models.

    These structural shifts are also cultural. I’m seeing teams embrace experimentation, versioning, and eval-driven development while deepening collaboration with data, operations, and product/engineering. The move from output OKRs to outcome OKRs is palpable: leaders are measuring containment, deflection, CSAT, and time-to-resolution with new rigor.
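
    To make that measurement concrete, here is a minimal sketch of how a team might roll conversation records up into a few of these metrics. Everything in it is an illustrative assumption on my part: the `Conversation` fields, the `summarize` helper, and the metric definitions (involvement rate as the share of all conversations the AI agent took part in, resolution rate as the share of those it closed without a human handoff). It is not Fin's or Intercom's reporting API, and your own definitions may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    # Hypothetical record shape; field names are assumptions for illustration.
    ai_involved: bool             # the AI agent took part in the conversation
    resolved_by_ai: bool          # closed without a human handoff
    csat: Optional[int]           # 1-5 survey score, None if the customer did not rate
    minutes_to_resolution: float  # elapsed time from open to close


def summarize(conversations):
    """Roll a list of conversations up into a small metrics dictionary."""
    total = len(conversations)
    if total == 0:
        raise ValueError("no conversations to summarize")

    involved = [c for c in conversations if c.ai_involved]
    resolved = [c for c in involved if c.resolved_by_ai]
    rated = [c.csat for c in conversations if c.csat is not None]

    return {
        # Share of all conversations the AI agent participated in.
        "involvement_rate": len(involved) / total,
        # Share of AI-involved conversations closed without a human.
        "resolution_rate": len(resolved) / len(involved) if involved else 0.0,
        # Mean CSAT over rated conversations only.
        "avg_csat": sum(rated) / len(rated) if rated else None,
        # Mean time-to-resolution across every conversation.
        "avg_minutes_to_resolution":
            sum(c.minutes_to_resolution for c in conversations) / total,
    }


# Made-up sample data, for illustration only.
sample = [
    Conversation(True, True, 5, 4.0),
    Conversation(True, False, 3, 30.0),
    Conversation(True, True, None, 6.0),
    Conversation(False, False, 4, 20.0),
]
metrics = summarize(sample)
```

    Running the same roll-up per week, or per version of the agent's configuration, is one simple way to see whether a change actually moved the outcome.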

    Overall, a widespread transformation is underway. Roles are broadening, responsibilities are diversifying, and cross-functional collaboration is becoming the norm. Given the pace of generative AI improvement and the rise of agentic AI patterns, I expect these shifts to intensify.

    This evolution raises two important questions.

    Firstly, do customer service agents possess the skills required to succeed in these new roles? While they are experts in customer interaction and company policy, their work now demands new competencies in data analysis (e.g. reporting on AI agent performance and how it changes over time), quality assurance and debugging (e.g. testing and versioning Fin outputs), and cross-functional communication (e.g. drafting a business case to justify the resources when help from another team is needed).

    Secondly, what long-term strategies are companies adopting to support these evolving roles? Some are reorganizing entirely around automation, while others retain traditional structures. For those undergoing transformation, it remains unclear whether these changes are part of a deliberate strategic plan aimed at achieving specific performance outcomes, or the result of experimentation without defined goals.

    Ultimately, the success of Fin, and of AI in customer service more broadly, depends not only on the technology itself but on the people and strategies that shape its use. In my experience, the winners invest early in data literacy, robust QA, clear ownership, and governance; they align product, ops, and CX around a shared AI roadmap; and they measure what matters with disciplined Agent Analytics. That’s how you turn AI workflows into durable customer and business outcomes.


    Inspired by this post on The Intercom Blog.

