I build products on the belief that trust is earned in every design decision and every deployment. Trust has always been a first principle at Intercom, from our early investments in security and privacy to the globally recognized certifications that shape our approach today.
As AI becomes more deeply embedded in customer-facing work, it’s essential that businesses can rely on systems that are safe, reliable, and governed to the highest standards. That’s why we’re proud to share that Intercom is now AIUC-1 certified, becoming one of the first companies to meet the world’s first standard designed specifically for AI Agents. For leaders navigating AI Strategy and AI risk management, this is more than a badge—it’s a measurable leap forward in governance and operational rigor.
AIUC-1 is the first certification tailored to the unique risks and challenges of AI Agents. It complements broader AI governance frameworks like ISO 42001 by focusing on enterprise-specific concerns like security, customer safety, system reliability, data and privacy, society, and accountability. In practice, this alignment helps us translate policy into deployable safeguards across cybersecurity, data governance, and regulatory compliance.
To achieve certification, organizations undergo independent third-party audits and quarterly adversarial testing across more than a thousand enterprise risk scenarios. This continuous technical evaluation ensures that AI systems remain robust against fast-evolving threats and that safeguards keep pace with rapid progress in the field. As a product leader, I welcome this level of scrutiny—it’s how we operationalize threat detection and response and make agentic AI dependable at scale.
AIUC-1 itself evolves every quarter, incorporating new research, threat patterns, and global best practices. The standard is shaped by the AIUC-1 Consortium, launched in November with more than 50 founding members who collectively handle tens of trillions of dollars in payments and serve over a billion people daily. Intercom is proud not only to be certified, but to be recognized as a founding technical contributor helping shape the development of the standard. That continuous, community-driven iteration mirrors how we build—measure, learn, and harden—so our customers benefit from real-world, enterprise-ready AI.
Intercom has decades of combined experience in security, compliance, and trust, and we’ve consistently demonstrated that robust governance and fast innovation can coexist. Achieving AIUC-1 certification reinforces that the same rigor we apply across our platform also extends to Fin, our AI Agent. I’ve seen first-hand how risk and procurement teams evaluate generative AI: they expect clarity, evidence, and controls. This certification delivers independent proof that our approach meets those expectations.
For our customers, this certification provides independent validation that Intercom’s AI systems are safe, resilient, and enterprise-ready. It confirms that our AI is tested regularly, built with strong safeguards, and aligned with the expectations of modern security and risk teams. It also signals our continued leadership in shaping responsible AI practices globally, ensuring our customers benefit from standards built for real-world use. In short, you can move faster with confidence—without compromising on governance.
Intercom has always approached trust as an ongoing commitment. AIUC-1 strengthens the foundation we’ve built across other frameworks and certifications, including SOC 2, ISO 27001, ISO 27701, ISO 27018, HIPAA, HDS, and ISO 42001. Together, these create a comprehensive control fabric across privacy, security, and reliability, the critical pillars for any enterprise deploying generative AI into production workflows.
As AI technology accelerates, we will continue to evolve our safeguards, deepen our governance practices, and contribute to the standards that shape responsible AI. Our promise is simple: to build AI that is not only powerful and efficient, but safe, transparent, and deserving of the trust our customers place in us. That’s how we turn innovation into durable value.
You can learn more about our certifications and access our security and compliance documentation through the Intercom Trust Center.
Get started with Fin and see how an AIUC-1 certified, enterprise-ready AI Agent can elevate your customer experience with confidence.
When I assess whether an AI product is ready for prime time, I start with trust—not model accuracy. Accuracy is table stakes; trust is what earns adoption, drives retention, and unlocks durable product-led growth.
In practice, I organize trust-driven metrics into four layers: model quality and safety, user and business outcomes, operational reliability and cost, and governance and compliance. This layered approach keeps product trios aligned on what matters now, what must be gated in CI/CD, and which signals we’ll use to prove progress against OKRs framed around outcomes rather than outputs.
On model quality and safety, I care about precision, recall, F1, calibration, and abstention behavior, but also the hard-to-fake signals: hallucination rate, grounding and faithfulness, citation coverage, toxicity, bias, and fairness. For generative systems, I instrument refusal correctness (declining unsafe requests) and evidence adequacy (did the answer rely on retrieved, trustworthy sources).
User and business outcomes must be explicit. I track adoption, activation, task success rate, time to first value, win rate uplift in assisted workflows, CSAT and NPS deltas, and retention analysis by cohort exposed to AI features. For customer support scenarios, deflection rate, average handle time change, and first-contact resolution are core; for sales or ops copilots, I monitor cycle-time reduction and error-rate reduction in critical tasks.
Experimentation is non-negotiable. I design A/B testing with a clear minimum detectable effect (MDE), pre-registered guardrails for safety and quality, and sequential tests that stop early if harm outpaces benefit. Online metrics are always paired with offline evals so we can iterate quickly without exposing users to regressions.
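To make the MDE concrete, here is a minimal sketch of a sample-size estimate for a conversion-style metric. It assumes a fixed-horizon, two-sided two-proportion z-test with equal arms; the function name and default values are illustrative, and a sequential design with early stopping would need adjusted thresholds that this sketch does not cover.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm to detect an absolute lift of mde_abs."""
    if not 0 < baseline_rate < 1 or mde_abs <= 0 or baseline_rate + mde_abs >= 1:
        raise ValueError("rates must stay strictly between 0 and 1")
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    # Critical values for the two-sided significance level and the target power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances of the control and treatment arms.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


if __name__ == "__main__":
    # Hypothetical scenario: 10% baseline task success, 2-point absolute MDE.
    print(sample_size_per_arm(0.10, 0.02))
```

Halving the MDE roughly quadruples the required sample, which is why it pays to agree on the MDE before the test starts.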
Operationally, trust shows up as speed, stability, and cost predictability. I track latency end-to-end, time to first token, throughput, rate of 5xx and timeouts, cost per request, and caching effectiveness. We also trend safety incidents per 10,000 interactions and mean time to mitigation to keep reliability visible alongside performance.
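As a rough illustration, the operational metrics above can be computed from request logs along these lines. The `RequestLog` fields and function names are assumptions made for this sketch, not a real logging schema.

```python
import math
from dataclasses import dataclass


@dataclass
class RequestLog:
    latency_ms: float
    status: int        # HTTP-style status code
    cost_usd: float
    cache_hit: bool


def percentile(values, pct):
    """Nearest-rank percentile; pct is in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(logs):
    """Aggregate latency, error rate, cost, and caching effectiveness."""
    n = len(logs)
    if n == 0:
        raise ValueError("no requests to summarize")
    return {
        "p95_latency_ms": percentile([r.latency_ms for r in logs], 95),
        "error_rate_5xx": sum(500 <= r.status <= 599 for r in logs) / n,
        "cost_per_request_usd": sum(r.cost_usd for r in logs) / n,
        "cache_hit_rate": sum(r.cache_hit for r in logs) / n,
    }


def incidents_per_10k(incidents: int, interactions: int) -> float:
    """Normalize safety incidents to a rate per 10,000 interactions."""
    return incidents * 10_000 / interactions
```

Normalizing incidents per 10,000 interactions keeps the trend comparable as traffic grows.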
Governance and compliance are part of the product, not an afterthought. Data governance and privacy-by-design metrics include PII exposure rate, data lineage coverage, access-control correctness, audit pass rate against internal policies, and model and prompt change traceability. This is the backbone of our AI risk management posture and accelerates regulatory compliance reviews instead of slowing them down.
The delivery engine for all of this is eval-driven development. We maintain golden datasets and scenario-based test suites that mirror real user intents, gate releases in CI/CD with minimum thresholds, and run canary rollouts to validate offline–online alignment. Every model or prompt update gets a comparable scorecard so product, engineering, and design can trade off quality, speed, and cost with shared facts.
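A release gate of this kind can be sketched in a few lines. The metric names and threshold values below are hypothetical placeholders; real thresholds would come from your own golden datasets and risk appetite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    limit: float
    higher_is_better: bool


# Hypothetical minimum bars for a release; tune these to your own evals.
THRESHOLDS = [
    Threshold("faithfulness", 0.90, True),
    Threshold("task_success_rate", 0.80, True),
    Threshold("hallucination_rate", 0.03, False),
    Threshold("p95_latency_ms", 2500, False),
]


def gate_release(scorecard, thresholds=THRESHOLDS):
    """Return human-readable failures; an empty list means the gate passes."""
    failures = []
    for t in thresholds:
        if t.metric not in scorecard:
            failures.append(f"{t.metric}: missing from scorecard")
            continue
        value = scorecard[t.metric]
        ok = value >= t.limit if t.higher_is_better else value <= t.limit
        if not ok:
            bound = "min" if t.higher_is_better else "max"
            failures.append(f"{t.metric}: {value} violates {bound} {t.limit}")
    return failures
```

In a CI job, a non-empty result would fail the build, so a model or prompt change cannot ship without a passing scorecard.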
For LLM-heavy features, retrieval-first pipeline metrics are mandatory. I monitor retrieval hit rate, recall at K, mean reciprocal rank, context contamination, and citation correctness. With large prompts, context window management matters: we track context utilization, truncation rate, and the contribution of each context block to final answers to avoid silently losing critical evidence.
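Three of the retrieval metrics named above have standard definitions, sketched here. The input shape is an assumption for this example: each run is a pair of a ranked list of document IDs and the set of IDs judged relevant, and every query has at least one relevant document.

```python
def hit_rate_at_k(runs, k):
    """Share of queries with at least one relevant document in the top k."""
    hits = sum(any(doc in relevant for doc in ranked[:k])
               for ranked, relevant in runs)
    return hits / len(runs)


def recall_at_k(runs, k):
    """Mean fraction of each query's relevant documents found in the top k."""
    scores = [len(set(ranked[:k]) & relevant) / len(relevant)
              for ranked, relevant in runs]
    return sum(scores) / len(scores)


def mean_reciprocal_rank(runs):
    """Mean of 1/rank of the first relevant document (0 when none is retrieved)."""
    total = 0.0
    for ranked, relevant in runs:
        for position, doc in enumerate(ranked, start=1):
            if doc in relevant:
                total += 1 / position
                break
    return total / len(runs)
```

Hit rate tells you whether any evidence arrived, recall tells you how much of it arrived, and MRR tells you how high the first useful document ranked.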
Finally, trust must be legible. I package these metrics into an executive scorecard that maps to business outcomes, risk appetite, and OKRs, with clear thresholds for ship, improve, or roll back. When teams can articulate trade-offs—say, a 20% latency reduction at a small cost increase, or a lower hallucination rate at the expense of higher abstention—they build credibility with stakeholders and confidence with customers.
Trust is not a single number; it’s a system of evidence. By instrumenting these layers and operationalizing AI Strategy with rigorous, transparent metrics, we can ship faster, reduce surprises, and earn the right to scale AI features across the product portfolio.
Most mornings start the same way for me: coffee in hand, I sit down, open Claude Code, and type /today. In a few seconds, Claude pulls fresh tasks from my Trello board, compiles a clean today.md with what matters most, and assembles a research digest of the latest academic work across my focus areas.
Scanning that today.md has become my daily ritual. My workload typically spans writing, coding, and administration. I now make a habit of asking Claude, "What's on my to-do list that you can help with?" That simple question keeps me honest about where AI can accelerate my day.
I’m experimenting with a workflow where Claude enriches every task based on what it can take on or accelerate. It’s still early, so we iterate together for a few minutes each morning to tighten the loop and improve the prompts and outputs.
Next up is my research digest. I skim, download the PDFs that look promising, and move on. Tomorrow, Claude will deliver detailed summaries of every paper I saved—so I stay current without burning hours on search and sorting.
For the first few hours, I protect deep work. Today, that means writing this article. My to-do list and draft live side-by-side in Obsidian, so I click directly from the task into the outline, pick up my running conversation with Claude, and get right back into flow. I pair-write: we outline, I draft, and then I ask, "I wrote the intro. What do you think?"
Claude gives pointed feedback—what’s working, what needs tightening—and we iterate. This is genuinely how I work now. I pair with Claude on almost everything I do. It didn’t happen overnight; over the past five months, I’ve built a personal AI-enhanced operating system that has fundamentally improved how I operate: more output, faster cycles, and frankly, more joy in the work.
Because it’s made such a difference, I’m sharing the playbook. If you’re new to Claude Code or want to get more from it, start here:
Claude Code: What It Is, How It's Different, and Why Non-Technical People Should Use It
Stop Repeating Yourself: Give Claude Code a Memory
How to Use Claude Code Safely: A Non-Technical Guide to Managing Risk
In recent office hours, one question came up again and again: Where do I start—what should I automate and what should I have AI augment? Today, I’ll walk through how I decide, share my own workflows, and show how I prioritize what to build next. Next week, we’ll get into how to design and build personal workflows.
This series was inspired by my personal usage of Claude Code. I have not received any compensation from Anthropic for writing it, and if that ever changes, I will disclose it. Disclosure is required by the FTC here in the US, and I also strongly believe it is the right thing to do.
Understanding what AI workflows can do for you
I started with ChatGPT in the browser not long after it launched and quickly began asking, “Can ChatGPT help with this?” As my use cases grew (and my patience for copy-paste vanished), I moved to Claude Code. The philosophy never changed: continuously push the envelope of what LLMs can do today while managing risk.
My default stance is to attempt everything with AI, then decide what becomes a reusable workflow versus a one-off assist. A workflow, to me, is a sequence of steps where some are automated by AI, others are AI-augmented, and some still require me.
Across my setup, clear patterns emerged. I use AI to: (1) do more of what I’m already good at, (2) eliminate friction in frequent tasks, and (3) remove what drains me. The goal is simple: multiply impact without sacrificing quality.
Take writing. I now average about 35,000 words per month—up from roughly 8,000. I’m writing more often and in more depth. I draw more from academic research and include more stories—both my own and those from others. Claude gives me detailed feedback on everything I write, which helps me maintain momentum. It’s remarkable how often a simple nudge—“Ready to write the next section?”—keeps me in the zone. I also spend more time with Claude on structure before drafting, so I discard far less.
Podcast production is another domain where AI shines. I produce two weekly shows: I love connecting with Petra Wille on All Things Product, and talking with product teams building AI-powered products on Just Now Possible. I use Descript to edit, and I rely on Claude Code shortcuts (slash commands) to draft episode titles, descriptions, show notes, chapters, and social posts. I still own the editorial bar—no “AI slop”—but I let AI handle the heavy lifting so I can focus on shaping the final story.
Then there are tasks I fully automate. I love reading across creativity, collaboration, AI efficacy, and more. I do not love searching for relevant papers. So I don’t. Every morning, my automated research workflow finds the newest, most relevant articles and populates my digest. All I do is review.
Choosing your first AI workflows
Classic delegation advice still applies: build awareness of where your time goes; identify what you can delegate; invest your time in the work you’re uniquely equipped to do. That’s a great start for AI workflow strategy, but don’t ignore what you love doing and want to do more of. Augmentation often generates the highest returns—AI helps me go deeper, faster, without diluting my craft.
To uncover opportunities, I simply ask, over and over: Can AI help with this? As you go about your work today, keep asking yourself: How can AI help with this?
Evaluating if a task is a good candidate for an AI workflow
Through trial and error, I now run new tasks through a quick filter:
• Is this a one-time task or do I do it often?
• Do I enjoy doing this task or would I give it to someone else if I could?
• How complex is the task?
• Can I articulate how I would do the task step-by-step?
• Does completing the task require my human judgment?
• Can I define what "done successfully" looks like?
• How much risk is there if the task is not done well?
This checklist takes minutes and pays off quickly. The answers tell me whether to automate, augment, or keep a task human-only for now—and they guide how much process and guardrailing to build around each workflow.
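To show how answers like these could feed a decision, here is one possible mapping written as code. The rules are purely illustrative assumptions, not my actual decision logic, and the sketch leaves out the complexity question for brevity.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    recurring: bool             # do I do this often?
    enjoyable: bool             # would I rather keep doing it myself?
    steps_are_clear: bool       # can I describe it step by step?
    needs_judgment: bool        # does it require my human judgment?
    success_is_definable: bool  # can I say what "done successfully" means?
    risk: str                   # "low", "medium", or "high"


def recommend(task: TaskProfile) -> str:
    """Hypothetical heuristic that maps checklist answers to a workflow type."""
    if not task.recurring:
        return "one-off assist"
    if task.risk == "high" and not task.success_is_definable:
        return "human-only for now"
    if task.needs_judgment or task.enjoyable or task.risk == "high":
        return "augment"
    if task.steps_are_clear and task.success_is_definable:
        return "automate"
    return "augment"
```

Under these assumed rules, a frequent, low-risk, well-specified chore such as searching for papers comes out as "automate", while writing, which I enjoy and which needs judgment, comes out as "augment".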
From here, I’ll walk through how to answer these questions in practice, how the answers map to different levels of automation or augmentation, and how I prioritize which workflows to invest in. I’ll also share 41 of my own AI workflows (noting which are automated versus augmented) plus 9 discovery-related workflows currently in development so you can steal shamelessly and ship your first one today.
Every successful AI initiative I’ve led or advised has shared the same foundation: we treat data as a product. Models will improve, infrastructure will evolve, and use cases will expand—but only high-quality, well-governed, and well-structured data compounds value over time.
“Companies that prioritize data quality, governance, and structure will accelerate their AI initiatives the fastest.” That line has become a non-negotiable principle in my playbook because it consistently separates prototypes that stall from platforms that scale.
When I say data quality, I mean trustworthy signals: clear definitions, deduplication, lineage, and timely freshness. Governance adds accountability and safety: ownership, access controls, auditability, and privacy-by-design aligned with regulatory compliance. Structure makes it all usable: consistent schemas, event taxonomies, and feature stores that let product teams ship faster without reinventing pipelines.
In practice, this looks like aligning an AI Strategy with a unified analytics platform so every team works from the same truth. It means instrumenting feedback loops, labeling outcomes, and building a retrieval-first pipeline that brings the right context to LLMs at the right time. It also means thoughtful context window management so models remain grounded, relevant, and cost-efficient.
I’ve seen the difference firsthand. Early generative AI prototypes built on messy, conflicting data looked promising in demos but failed in the wild: hallucinations spiked, confidence scores dipped, and user trust eroded. Once we tightened governance, standardized schemas, and implemented human-in-the-loop evaluation, accuracy climbed, risk dropped, and feature velocity increased without sacrificing safety.
For product managers, the mandate is clear: treat data work as core product work. Define quality SLAs, make data contracts explicit, and give empowered product teams the tools to observe, debug, and improve signals continuously. Pair AI risk management with measurable product outcomes, and you’ll turn experimentation into a durable advantage.
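One lightweight way to make a data contract and a freshness SLA explicit is to check events against them in code. The contract shape, field names, and 24-hour staleness limit below are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical contract: required fields with types, plus a freshness SLA.
CONTRACT = {
    "fields": {"user_id": str, "event_name": str, "occurred_at": datetime},
    "max_staleness": timedelta(hours=24),
}


def check_event(event, contract, now):
    """Return a list of contract violations; an empty list means the event conforms.

    Both `now` and `occurred_at` are expected to be timezone-aware datetimes.
    """
    violations = []
    for name, expected_type in contract["fields"].items():
        if name not in event:
            violations.append(f"missing field: {name}")
        elif not isinstance(event[name], expected_type):
            violations.append(
                f"wrong type for {name}: expected {expected_type.__name__}")
    occurred_at = event.get("occurred_at")
    if isinstance(occurred_at, datetime) and now - occurred_at > contract["max_staleness"]:
        violations.append("stale event: exceeds freshness SLA")
    return violations


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    event = {"user_id": "u1", "event_name": "signup", "occurred_at": now}
    print(check_event(event, CONTRACT, now))
```

Tracking the share of events with zero violations over time gives a simple, observable quality SLA.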
The payoff is more than model performance; it’s organizational clarity and speed. With the right data foundation, LLM-powered features become easier for product managers to deploy, customer experiences feel coherent, and roadmaps shift from firefighting to compounding wins. Invest in data quality, governance, and structure now, and your AI initiatives won’t just move faster; they’ll sustain momentum.
Inspired by this post on Amplitude – Best Practices.
Every time I ship a new generative AI capability with my product teams, I’m reminded that governance isn’t a compliance afterthought—it’s a strategic advantage. In today’s landscape, the way we govern data determines how quickly we can innovate, how confidently we can scale, and how credibly we can talk about risk with customers, regulators, and our own board.
My north star for AI Strategy is simple: align business outcomes with responsible practices that are auditable, repeatable, and fast. Practically, that means codifying AI risk management, privacy-by-design, and regulatory compliance into the product lifecycle—requirements, design, build, deploy, and operate. When those guardrails live inside our workflows (not just in policy docs), we accelerate delivery without increasing exposure.
Visibility breaks the “black box.” I start by establishing a unified analytics platform and a living data catalog with lineage, classification, and stewardship. When we pair that with a retrieval-first pipeline for LLMs, we can trace exactly which sources informed a response, who had access, and whether consent and retention rules were honored. Provenance, RBAC/ABAC, encryption, and deterministic masking stop sensitive data from leaking into training sets while keeping our teams productive.
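Deterministic masking can be sketched with a keyed hash: the same input always yields the same token, so joins and deduplication still work, but the raw value never leaves the boundary. This version assumes HMAC-SHA256 with a secret key held outside the dataset; the function names and the `tok_` prefix are illustrative.

```python
import hashlib
import hmac


def mask_value(value: str, key: bytes, length: int = 16) -> str:
    """Replace a sensitive value with a stable, keyed token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:length]


def mask_record(record: dict, sensitive_fields: set, key: bytes) -> dict:
    """Mask only the fields classified as sensitive; leave the rest untouched."""
    return {
        name: mask_value(str(value), key) if name in sensitive_fields else value
        for name, value in record.items()
    }
```

Because the token depends on the key, rotating or withholding the key prevents anyone with only the masked data from recomputing the mapping. Truncating the digest trades collision resistance for readability, so keep the full digest for large datasets.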
Speed with safety comes from engineering the right controls into CI/CD. Before any AI feature hits production, we run automated checks for PII exposure, policy violations, adversarial prompts, and data drift; then we add human-in-the-loop review where stakes are high. Continuous monitoring, audit logs, and playbooks for incident management and threat detection and response turn governance into an everyday habit rather than a once-a-quarter ritual.
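As a small example of an automated pre-production check, the sketch below scans model outputs for obvious PII. The two regular expressions are deliberately simple assumptions (email addresses and US SSN-shaped numbers); a production check would need broader, locale-aware detection.

```python
import re

# Illustrative patterns only; they are neither exhaustive nor locale-aware.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(text: str):
    """Return (label, matched_text) pairs for every pattern hit in the text."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        findings.extend((label, m.group()) for m in pattern.finditer(text))
    return findings


def flagged_outputs(outputs):
    """Return the indices of outputs that contain at least one PII match."""
    return [i for i, text in enumerate(outputs) if find_pii(text)]
```

Run against a golden set of prompts in CI, any flagged index can fail the build or route the sample to human review.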
In the first 30 days, I inventory systems, map data flows, and assign clear ownership. We define data quality SLAs, document lawful bases for processing, and publish a concise policy that product managers and engineers can actually use. This anchors stakeholder management and sets expectations for trade-offs.
By day 60, we implement fine-grained access controls, consent-aware tracking, and consistent metadata standards across sources. We wire dashboards for high-signal metrics—access attempts, data minimization, model input/output risk flags—so leaders can see governance health at a glance and course-correct quickly.
By day 90, we close the loop with OKRs framed around outcomes rather than outputs, tying governance to business impact: faster cycle times, fewer incidents, and higher customer trust. LLM training for product managers, together with communities of practice, ensures that empowered product teams can make judgment calls confidently rather than waiting for gatekeepers.
If you’ve felt the friction between innovation and oversight, you’re not alone. The good news is that the right framework lets us do both: move fast with confidence, demonstrate responsible AI, and earn the trust that compounds into product-led growth. That’s the real promise of modern data governance—and it’s how we make sure our AI is powerful, reliable, and never a black box.
Inspired by this post on Amplitude – Best Practices.
AI doesn’t fail because the model is bad; it fails because ownership is missing.
When someone truly owns your AI, everything changes. Resolution and automation rates climb, the system self-improves, and the customer experience transforms in ways a dashboard alone will never show you.
This is part three of our five-part series on customer service planning for 2026. We’ll be sharing all five editions on our blog and on LinkedIn.
If you’d rather have them emailed to you directly as they’re published, drop your details here.
Last week, we introduced the four roles that make AI actually work in a support organization. These roles are already showing up inside the teams who are scaling AI the fastest, and this week, we get closer to the ground.
Here’s what these roles look like in practice — what they do, how they work, and why your AI performance will inevitably drift without them.
AI operations lead — owns AI performance, every day. I think of this person as the air-traffic controller for our AI Agent. I treat the AI as a living system that needs ongoing supervision, evaluation, and tuning. This role is accountable for what leaders care about most: quality, reliability, and continuous improvement.
The AI ops lead sees the whole picture: conversation quality, missing knowledge, flawed assumptions, unexpected failures, new opportunities for automation, and the subtle signals that the system is beginning to drift. In practice, that vigilance is the difference between steady gains and slow decline.
Day-to-day, here’s what I expect from this role.
1. Reviews AI conversations and surfaces performance patterns. The AI ops lead monitors the AI Agent’s behavior — the tone shift after a product launch, a sudden dip in resolution for a specific intent, or conversation clusters revealing new customer behavior. They scan for anomalies, trends, and early warnings, with an emphasis on what’s happening right now, not last week. Without this intentional ownership, I’ve watched a 2% dip turn into a 10% drop in days.
2. Prioritizes fixes and improvements. Once patterns emerge, they triage fixes like a product team handles bugs. Missing or incorrect content? They route it to the knowledge manager. Behavioral issues? They adjust guidance and guardrails. Action or system issues? They partner with the automation specialist. This connective tissue turns individual fixes into compounding improvements.
3. Defines and maintains AI guardrails. Leaders everywhere worry about AI doing things it shouldn’t. This role answers that fear by establishing clarification logic, escalation rules, “never answer” policies, and safety boundaries. The goal is predictable behavior that protects customer trust — an essential pillar of any AI Strategy and AI risk management practice.
4. Aligns reporting with leadership. The AI ops lead reports on resolution rate, CX Score, CSAT, automation coverage, and hours saved, making the economic impact visible. That visibility is a foundational step in any credible customer support AI strategy.
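The monitoring and triage steps above can be sketched as two small functions: one flags intents whose resolution rate has dipped, the other routes an issue to the role that owns the fix. The 2-point threshold, the issue-type labels, and the data shapes are assumptions for illustration.

```python
# Routing follows the triage described above; the keys are hypothetical labels.
ROUTES = {
    "content": "knowledge manager",
    "behavior": "AI ops lead",
    "action": "support automation specialist",
}


def resolution_dips(baseline: dict, current: dict, max_drop: float = 0.02) -> dict:
    """Return {intent: drop} for intents whose rate fell by more than max_drop."""
    dips = {}
    for intent, base_rate in baseline.items():
        if intent not in current:
            continue  # no traffic for this intent in the current window
        drop = base_rate - current[intent]
        if drop > max_drop:
            dips[intent] = round(drop, 4)
    return dips


def route(issue_type: str) -> str:
    """Map an issue type to the role that owns the fix."""
    return ROUTES[issue_type]
```

Comparing today against a trailing baseline for each intent is what surfaces a small dip before it compounds into a large drop.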
Why this role exists now. AI systems are dynamic and require constant tuning. A small dip in quality quickly becomes an operational issue, and no existing role naturally owns that. When someone does, teams feel the benefit almost immediately.
Knowledge manager — builds and maintains the structured knowledge AI depends on. I hear the same thing from leaders again and again: AI is only as good as the content you give it. This role is rapidly evolving from classic knowledge management into knowledge strategy — part content designer, part systems thinker, part information architect. Their job is to build the knowledge scaffolding that lets AI answer accurately, consistently, and safely.
Here’s how the knowledge manager creates leverage.
1. Writes, maintains, and improves support knowledge — continuously. After every product change, they update articles, remove duplication, resolve contradictions, and pay down “knowledge debt” that quietly erodes accuracy. The upkeep is shaped by AI performance; when patterns expose gaps, they fix the source.
2. Structures knowledge for AI, not for browsing. Traditional help centers are for humans skimming pages. AI needs clean intent signals, crisp formatting, and clearly structured language. The knowledge manager designs that structure as intentionally as the content itself.
3. Works hand-in-hand with AI ops. Many performance issues stem from missing or unclear knowledge. When the AI ops lead surfaces recurring misunderstandings or low-resolution categories, the knowledge manager resolves the root cause at the source.
4. Ensures accuracy and compliance at scale. As AI handles more sensitive situations, the knowledge manager safeguards correctness, currency, and compliance — critical for data governance and regulatory alignment.
5. Develops a cross-functional knowledge strategy. The role creates a canonical, cross-functional source of truth that product, engineering, product marketing, go-to-market, and support (AI and human) can all rely on.
Why this role exists now. This is one of the highest-leverage positions in an AI-first support org. Teams like Rocket Money and Anthropic are hiring knowledge managers because AI accuracy depends on the quality of knowledge feeding it. Without this role, resolution rate caps out early and never climbs.
Conversation designer — designs how the AI speaks, clarifies, and interacts. AI isn’t just a tool customers use; it’s a representative they interact with. Tone, clarity, pacing, and conversational structure matter, especially in voice. Every word affects perceived expertise, trustworthiness, and brand. The conversation designer ensures the AI feels human-friendly without pretending to be human — the sweet spot that builds trust without misleading customers.
In my experience, staffing conversation design early accelerates results. It changes not only how we tune AI, but how we understand the end-to-end customer experience.
Here’s what great conversation design looks like.
1. Shapes the AI’s tone, voice, and communication style. This role refines phrasing, tunes politeness, adjusts how confusion is handled, and shapes micro-interactions that determine whether customers feel cared for or dismissed. On voice channels, natural cadence is make-or-break.
2. Designs flows for high-value conversations. They design how the AI clarifies intent, branches, communicates uncertainty, verifies details, escalates, hands off, and returns to the main thread without feeling mechanical — treating customer experience as a product with language as the interface.
3. Translates procedures and complex workflows into natural language and logic. As AI runs structured procedures and actions, this role becomes a conversational system architect, translating SOPs into conditional logic with exceptions and fallbacks. For example, in Intercom, our conversation designer uses Simulations to run test conversations, see where the AI Agent gets confused, over-confident, or awkward, and refine flows until the interaction feels effortless end-to-end.
4. Ensures transitions to humans feel smooth and respectful. Handoffs should provide clear context to the human agent and maintain continuity so customers never feel dropped.
Why this role exists now. As AI becomes the primary interface, conversation design directly influences trust, brand perception, and operational outcomes. It’s a core competency for any program that prepares product managers to work with generative AI and LLMs.
Support automation specialist — builds the backend actions that allow AI to do real work. If the conversation designer shapes expression, this role shapes capability. They transform AI from an answering machine into an outcome engine by bridging AI and the systems it must safely and deterministically act on.
Support teams increasingly expect AI to do what a human would do: refund a charge, adjust a subscription, verify an identity, update an account setting, or pull relevant data. That expectation creates a new technical role at the edge of support, ops, and engineering.
What I rely on this specialist to deliver.
1. Creates and maintains backend workflows the AI executes. This includes building and maintaining Fin Tasks, Fin Procedures with embedded steps, action flows that call internal and external APIs, and automations that span billing systems, user identity layers, CRM objects, subscription entitlements, refund tools, and more. These are the playbooks that turn intent into action, and the specialist ensures the AI can run them compliantly and predictably.
2. Owns the integrations required for advanced automation. Many problems require data elsewhere — billing platforms, internal databases, systems of record. The specialist ensures the AI can retrieve, validate, and use that information safely, often partnering closely on CRM integration and internal services.
3. Partners closely with product and engineering. Some workflows require new endpoints, permission layers, safety gates, or deterministic fallbacks. This role drives those changes across the stack.
4. Ensures reliability and safety at every step. Guardrails, validation logic, exception handling, safe execution paths — all are essential. They confirm that the AI has access to the correct data, the action matches policy, edge cases are accounted for, risky flows have deterministic constraints, and every action is auditable and reversible.
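To illustrate what "deterministic constraints" and "auditable and reversible" can mean in code, here is a toy refund runner. It is a hypothetical in-memory sketch, not how Fin or any real billing system works: the class name, the refund limit, and the ledger are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class RefundRunner:
    max_refund: float                               # deterministic policy limit
    ledger: dict = field(default_factory=dict)      # account_id -> total refunded
    applied: list = field(default_factory=list)     # stack of reversible refunds
    audit_log: list = field(default_factory=list)   # every decision, in order

    def refund(self, account_id: str, amount: float) -> bool:
        """Apply a refund only if it passes the policy check; log either way."""
        if amount <= 0 or amount > self.max_refund:
            self.audit_log.append(("refund_denied", account_id, amount))
            return False
        self.ledger[account_id] = self.ledger.get(account_id, 0.0) + amount
        self.applied.append((account_id, amount))
        self.audit_log.append(("refund_applied", account_id, amount))
        return True

    def undo_last(self) -> bool:
        """Reverse the most recent applied refund, if there is one."""
        if not self.applied:
            return False
        account_id, amount = self.applied.pop()
        self.ledger[account_id] -= amount
        self.audit_log.append(("refund_reversed", account_id, amount))
        return True
```

The point of the pattern is that the model never decides the limit: the policy check is plain code, and both the action and its reversal leave a trace.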
Why this role exists now. Customers don’t want answers; they want outcomes. AI can now deliver those outcomes, but only with the right backend scaffolding. This role modernizes operational architecture and unlocks end-to-end automation.
How these roles work together — the new operating loop. These roles aren’t silos; they’re interdependent parts of one system. The AI ops lead identifies patterns and performance gaps. The knowledge manager resolves inaccuracies or missing content. The conversation designer improves clarity, tone, and flow. The automation specialist expands the system’s ability to take action. Each improvement compounds the next, moving you from early automation to transformational resolution rates through continuous refinement.
This loop is what separates teams that plateau early from teams that scale AI into a reliable, high-performing system — the essence of a durable AI Strategy.
How to get started (even if you can’t hire all four roles today). Most teams phase into this model: assign partial ownership, formalize responsibilities, then specialize as AI volume grows. Here’s the progression I recommend.
Phase 1: Assign ownership. Give each role’s core responsibilities to someone who can devote five to ten hours weekly. Early on, support ops, enablement, senior ICs, and technically inclined teammates can anchor the work.
Phase 2: Formalize the responsibilities. As AI resolves more queries, optimization becomes core operational work. Formalizing ownership prevents performance drift and knowledge debt.
Phase 3: Specialize and hire. Once AI handles 50–70% of incoming volume, these responsibilities become full-time roles. Investing in specialization becomes essential infrastructure for the next scale stage.
The bottom line. AI changes the shape of your support team. These four roles — AI operations lead, knowledge manager, conversation designer, and support automation specialist — form the backbone of the AI-first support organization. They bring order to a constantly changing environment and enable AI to deliver the outcomes leaders and customers expect heading into 2026.
Next week, we’ll continue the 2026 planning series with a deep dive into org design models for AI-first support teams — how to structure people, workflows, and accountability in a world where AI resolves most conversations before a human ever sees them.
By 2026, the AI Product Owner will be the keystone role that turns AI strategy into measurable business outcomes. In my teams, this seat bridges market insight, model capability, data governance, and shipping velocity—so product decisions are not just clever, but compliant, reliable, and fast.
I often describe the remit simply: "Here is your clear guide to the AI product owner role (skills, responsibilities, how it differs from PM) and ways AI tools supercharge delivery." In practice, the AI Product Owner translates business goals into model-backed experiences, aligns cross-functional execution, and ensures the product’s AI behavior remains safe, lawful, and on-brand under real-world constraints.
How does this differ from a traditional PM? While Product Management sets portfolio strategy, positioning, and market narratives, the AI Product Owner owns the AI experience end-to-end: data readiness, evaluation harnesses, safety guardrails, and the iterative model improvements that move outcome-based OKRs rather than output-based ones. I anchor the role inside empowered product teams and product trios (PM/Design/ML Eng) to keep discovery continuous and delivery disciplined.
On responsibilities, I expect four pillars. First, discovery: continuous discovery with customers and internal experts to uncover use cases where generative AI or LLMs beat the status quo. Second, experience: define the right interaction patterns for AI UX, including retrieval-first pipeline choices, context window management, and feedback loops for human-in-the-loop correction. Third, governance: privacy-by-design, AI risk management, data governance, and regulatory compliance baked into the roadmap. Fourth, delivery: CI/CD for models and prompts, observable evaluation with A/B testing and minimum detectable effect (MDE), and SRE-grade incident management when AI behavior drifts.
Skills-wise, I look for product sense plus technical fluency. That includes LLM fundamentals for product managers (prompting, grounding, RAG), analytics mastery (Amplitude analytics, retention analysis, activation metrics), and comfort with DORA metrics such as deployment frequency to keep iteration high but safe. Strong stakeholder management and clear writing are non-negotiable: AI capabilities evolve fast, and leaders must see risk, cost, and ROI with no ambiguity.
AI tools truly supercharge delivery when they eliminate bottlenecks. My practical stack: an AI product toolbox with Claude Code and a ChatGPT connector for rapid prototyping; CustomGPT workflows for support triage and internal knowledge; Pendo product tours and in-app guides to validate behavior changes; Intercom for customer support AI strategy; and tight CRM integration via HubSpot to measure revenue impact. The outcome is faster idea-to-learning cycles, sharper telemetry, and far cleaner handoffs.
For roadmapping, I prioritize thin slices that prove value early—shipping narrowly scoped assistants or copilots, then expanding with product roadmapping and sprint planning that ties capability unlocks to outcomes. A unified analytics platform helps compare human-only baselines to augmented workflows, while agentic AI patterns automate routine steps under strict guardrails.
Risk is a product surface, not a side task. I require explicit policy gates (PII handling, red-teaming, bias audits), clear escalation paths, and incident playbooks. When we treat policy and reliability as features, customers reward us with deeper adoption and higher trust.
If you’re pursuing the AI Product Owner path, build a portfolio around shipped learnings: the experiment you killed with data, the safety constraint you designed, the postmortem you led, and the business metric you moved. That story—evidence of disciplined discovery, responsible delivery, and real-world results—is exactly what teams (and boards) want to see in 2026.
When AI Agents resolve the majority of customer conversations, the shape of your support team has to change. I’ve experienced this shift firsthand: the moment AI begins to carry the volume, your people must pivot from answering individual questions to engineering the system that consistently delivers quality outcomes.
The old tiered model built around queue management, handoffs, and volume-based productivity no longer fits. AI now handles the bulk of customer interactions, and that changes the role of your human team entirely. Responsibilities evolve, and success is measured differently. It goes beyond just adding automation to existing ways of working. You’re building an operating model that’s entirely new.
Most teams don’t hire a dedicated AI function from day one. They start by distributing a few critical responsibilities across existing team members, and formalize those responsibilities as AI becomes central to how support works. That’s exactly how I recommend getting momentum without over-hiring too early: prove value fast, name clear owners, and then scale.
Once you have executive support and a clear strategy in place, these are the four foundational roles we believe are key to getting AI off the ground in a meaningful way:
1. AI operations lead
Skillset/background: Often promoted from support ops. Deep understanding of workflows, systems, and tooling. Strong analytical and cross-functional coordination skills.
Why you need this: Without clear ownership, performance drifts. This role ensures the AI Agent constantly improves.
AI isn’t replacing support—it’s opening doors. This visual highlights how GenAI is spawning roles in customer success, from digital support engineers to automation success teams, and unlocking clearer, upward career paths.
In my teams, this role becomes the heartbeat of AI performance—instrumenting quality feedback loops, triaging failure modes, and aligning fixes across product, data, and support ops.
2. Knowledge manager
Responsibilities: Owns macros, snippets, and help content. Maintains structured, accurate inputs the AI Agent depends on.
Skillset/background: Often promoted from support ops. Deep understanding of workflows, systems, and tooling. Strong analytical and cross-functional coordination skills.
Why you need this: Without clear ownership, performance drifts. This role ensures the AI Agent constantly improves.
Every generative AI system is only as good as its knowledge. I’ve learned the hard way that inconsistent or stale content erodes trust—both for customers and internal stakeholders. A rigorous knowledge manager prevents that.
3. Conversation designer
Responsibilities: Designs how the AI Agent communicates by focusing on tone of voice, structure, handoff logic, and interaction flow. Tunes how responses feel.
Skillset/background: Background in content design, UX writing, or support enablement. Deep grasp of policy, CX standards, and conversational nuance.
Why you need this: This role ensures the AI Agent speaks like your brand – clearly, helpfully, and in line with customer expectations.
This is your brand’s voice in motion. A strong conversation designer sets the guardrails that keep interactions on-brand, compliant, and empathetic while still efficient.
4. Support automation specialist
Responsibilities: Builds workflows and backend actions the AI Agent can execute.
Skillset/background: Background in support engineering, systems, or tooling. Works closely with product and engineering teams.
Why you need this: Enables the AI Agent to take action – not just respond. This role translates customer intents into business systems.
In practice, this role unlocks the jump from “answering” to “resolving.” They wire up secure actions, map intents to outcomes, and partner with engineering to keep latency low and reliability high.
Introducing new AI-first roles doesn’t mean your existing functions disappear. But they do need to evolve. For AI to scale effectively, every function in your support organization must shift its focus from managing queue-level activity to improving the system’s performance:
Enablement trains human agents to work with the AI Agent: managing handoffs, tuning responses, and understanding how to give feedback that improves the system.
QA evolves from reviewing conversations to reviewing the quality of the customer experience and behavior of the AI Agent: where the AI succeeds, where it falls short, and how the system as a whole performs.
Workforce management plans capacity based on automation coverage, not just inbound volume.
You’ll also need a new kind of leadership to make this model work. The traditional support leader doesn’t map cleanly to an AI-first organization. You need a new layer: leaders who are part strategist, part operator. They roll up their sleeves to analyze the AI Agent’s performance, refine content, and debug handoffs, but they also coach the team through a new way of working.
Customer service is reorganized for the AI era: a VP of Support leads human support, ops and optimization, and a new AI support function—adding conversation design, knowledge management, and systems analysis alongside agents, insights, and WFM.
This is the “player-coach model” – leaders who actively shape both the system and the people within it.
These leaders see the AI Agent as a teammate to manage, not just a tool to monitor. They can’t be purely people leaders or purely systems thinkers. They need to be both, and they’re emerging as a critical hire in support right now.
Some teams are restructuring their organizations around the AI Agent as a core product, not just a support tool. Here are some real-world examples:
At Dotdigital, a dedicated “Fin Ops” specialist role was created to refine content and improve AI performance.
At Clay, a dedicated GTM engineer role has been established as part of the ops team with a focus on making support more efficient at scale using Fin. Additionally, a support engineering function has been embedded directly in the CX organization to help reduce volume by fixing bugs and building internal tools.
Lightspeed created a dedicated Digital Engagement team to manage Fin’s optimization, and formalized a triangular model that brings together technical teams, frontline experts, and content specialists.
In my experience, the most resilient org designs align around three pillars: Human Support, AI Support, and Support Operations and Optimization. Each pillar carries distinct ownership yet shares accountability for AI performance. That structure keeps the team focused on outcomes over output and makes continuous improvement everyone’s job.
AI shouldn’t replace your agents—it should elevate them. This Rocket Money quote highlights a modern support model where automation handles the busywork and people concentrate on high‑value, human moments.
Once AI Agents handle most conversations, your team’s work moves from “answering questions” to “designing and improving the system that answers questions.” They become the force that steers quality, rather than the one that carries the volume.
This is why new roles are important. It’s not because they’re trendy, but because the performance of your support organization now depends on the performance of AI, and no AI Agent succeeds without clear ownership of content, behavior, workflows, and improvement cycles.
That’s the pattern we’ve seen from working with so many teams:
They name owners early.
They distribute responsibilities before they formalize them.
They anchor teams around AI outcomes, not ticket outcomes.
And they hire leaders who can manage both the system and the people.
If you take one thing away from this week’s article, let it be this: if AI is going to handle the majority of your customer conversations, your team needs to be designed to help it do that well.
Your roles, responsibilities, and leadership approach are now part of the architecture of AI performance.
Next week, we’ll go deeper into how these roles actually operate day-to-day – the workflows, responsibilities, rhythms, and collaboration patterns that make an AI-first support organization run.
Every week, I’m in conversations with product leaders, engineers, and security teams who are trying to ship AI features faster without compromising trust. The tension is real: stakeholders want velocity, customers want transparency, and regulators want accountability. That’s exactly where modern data governance earns its keep.
New AI pressures are redefining what good governance takes. Learn how to build better frameworks, move fast with confidence, and keep your data from being a black box.
In my role leading product management, I’ve learned that robust data governance isn’t a compliance checkbox—it’s a strategic capability. When we treat governance as a product, we architect for clarity, safety, and speed. That means aligning AI Strategy with day-to-day delivery so teams know what they can ship, when, and why.
Here’s the practical blueprint I rely on. First, establish ownership and a shared language. Create a living data catalog, lineage maps, and clear data classifications so teams know which assets are sensitive, regulated, or eligible for training LLMs. Second, harden privacy-by-design and least-privilege access. Bake PII detection, secrets management, and role-based policies directly into your workflows. Third, bring quality and observability to the forefront: instrument data contracts, monitor drift, and track model performance across environments. Finally, implement model governance end to end—dataset cards, model cards, bias testing, human-in-the-loop review, and a repeatable evaluation harness.
To move fast with confidence, make governance invisible and automated. Treat policies as code in CI/CD, gate deployments with pre-merge checks, and fail builds that violate data contracts. Log prompts and outputs responsibly, route unsafe patterns to red-teaming, and use a retrieval-first pipeline to anchor models on verified sources rather than fragile context stuffing. This is how we scale AI product development while keeping audit trails complete and costs in check.
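As one way to picture "policies as code", here is a minimal sketch of a pre-merge check that fails the build when sample records violate a data contract. The `CONTRACT` mapping, the column names, and the `gate` function are hypothetical; a real pipeline would likely lean on a schema or validation library instead.

```python
import sys

# Hypothetical data contract: column name -> required Python type.
CONTRACT = {
    "user_id": str,
    "plan": str,
    "monthly_spend": float,
}


def contract_violations(record: dict, contract: dict) -> list[str]:
    """Return human-readable violations for one record (empty list = compliant)."""
    violations = []
    for column, expected_type in contract.items():
        if column not in record:
            violations.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            actual = type(record[column]).__name__
            violations.append(f"{column}: expected {expected_type.__name__}, got {actual}")
    # Undeclared columns are treated as violations so new fields get reviewed first.
    for column in record:
        if column not in contract:
            violations.append(f"undeclared column: {column}")
    return violations


def gate(records: list[dict], contract: dict) -> int:
    """Return a process exit code: 0 if every record passes, 1 otherwise."""
    failures = [v for record in records for v in contract_violations(record, contract)]
    for failure in failures:
        print(f"CONTRACT VIOLATION: {failure}", file=sys.stderr)
    return 1 if failures else 0
```

In a CI job, a script would end with something like `sys.exit(gate(sample_records, CONTRACT))`, where `sample_records` is whatever fixture the team maintains. A non-zero exit code is what actually blocks the merge.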
Avoiding the black-box problem starts with transparency. Document assumptions, training data sources, and known limitations—then expose explanations where it matters in the product experience. Pair this with a unified analytics platform to tie telemetry, feature flags, and user feedback to model changes. When something goes sideways, your observability, incident management playbooks, and threat detection and response processes should make root-cause analysis fast and defensible.
If you’re building your program from scratch, use a 30-60-90 approach. In the first 30 days, inventory systems, classify data, and map high-risk use cases. By day 60, formalize RACI for governance, deploy access controls, and set up your evaluation pipeline with golden datasets and measurable acceptance thresholds. By day 90, operationalize incident response, conduct tabletop exercises, and wire governance outcomes into OKRs—think time-to-approval for high-risk changes, reduction in production incidents, and model evaluation pass rates.
This playbook pays off in board conversations and with customers. You can articulate your AI risk management posture, show measurable progress on regulatory compliance, and demonstrate how governance accelerates—not hinders—delivery. Most importantly, your teams gain the confidence to experiment, knowing there’s a safety net that protects users, the brand, and the business.
If your organization is wrestling with how to balance innovation and control, start small, codify what works, and scale with intent. With the right foundations in data governance, AI becomes an engine for durable advantage—not a source of sleepless nights.
Inspired by this post on Amplitude – Perspectives.
What if your morning started with a helpful check-in from a voice AI that actually improves your sleep—using the same core principles that typically cost thousands of dollars and come with year-and-a-half waitlists? That idea energizes me as a product leader, because it blends clinical-grade outcomes with consumer-grade accessibility. Recently, I dug into how the team at Rest built an AI sleep coach inspired by Cognitive Behavioral Therapy for Insomnia (CBTI), and why their method offers a repeatable blueprint for complex, personal AI products.
The origin story is a classic product discovery moment. Rest’s team noticed that a meaningful slice of users in their podcast app were using audio to fall asleep. Although it represented only about 10% of users, that group showed a high willingness to pay. That signal pushed them to explore a dedicated sleep solution, moving from a general audio app to a targeted sleep experience—and eventually toward an AI-powered coach as LLMs matured.
Through jobs-to-be-done research, they identified a clear, underserved segment: “DIY sleep hackers.” These are motivated users who want agency, structure, and results without navigating clinical systems. Choosing CBTI (a clinically proven approach with 80% efficacy) gave the product a strong evidence-based foundation while remaining accessible as a wellness tool. It’s the kind of strategic choice I look for: credible, measurable, and aligned with user motivation.
The product evolution moved in smart, incremental steps. Rest started with a basic text chatbot before graduating to a voice-first experience—using Vapi for voice and OpenAI for reasoning. Voice changed the relationship dynamic: it increased intimacy, lowered friction for daily check-ins, and made behavioral coaching feel human without pretending to be. The team built a memory system that tracks context (like traveling or having a dog) with time-based relevance, which keeps conversations fresh, respectful, and genuinely personalized.
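I don't know how Rest implements its memory system, so treat the following as a minimal sketch of the idea of time-based relevance only. The `Memory` class and the per-fact `ttl` (time to live) are my assumptions: durable facts such as having a dog never expire, while situational ones such as traveling do.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Memory:
    fact: str
    recorded_at: datetime
    ttl: Optional[timedelta] = None  # None means the fact stays relevant indefinitely

    def is_relevant(self, now: datetime) -> bool:
        return self.ttl is None or now - self.recorded_at <= self.ttl


def relevant_facts(memories: list[Memory], now: datetime) -> list[str]:
    """Keep only the facts worth placing in today's conversation context."""
    return [m.fact for m in memories if m.is_relevant(now)]


memories = [
    Memory("has a dog", datetime(2024, 6, 1)),
    Memory("traveling this week", datetime(2025, 1, 1), ttl=timedelta(days=7)),
]
```

With these example memories, a check-in on January 3 would include both facts, while one on January 20 would only mention the dog. That is the behavior that keeps a coach from asking about a trip that ended weeks ago.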
Daily engagement is driven by dynamic agendas that adapt based on sleep data, the user’s stage in the program, and their recent compliance. I love this mechanic: it operationalizes behavior change by sequencing the right intervention at the right time. In parallel, they developed text via OpenAI Assistants while building voice with Vapi, which let them ship value while learning in two modes. They also moved from massive system prompts to RAG for general sleep knowledge, keeping personal user context in the prompt—reducing brittleness while improving scalability.
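To show the shape of a dynamic agenda, here is a small rule-based sketch that takes the three inputs mentioned above: sleep data, program stage, and recent compliance. The `UserState` fields, the thresholds, and the agenda items are all illustrative assumptions on my part, not Rest's actual logic or clinical guidance.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    program_week: int          # stage in the program
    sleep_efficiency: float    # share of time in bed spent asleep, 0.0 to 1.0
    checkins_last_7_days: int  # recent compliance


def build_agenda(state: UserState) -> list[str]:
    """Sequence today's conversation topics from the user's current state."""
    agenda = ["review last night's sleep"]
    # Assumed threshold: fewer than 4 check-ins in a week counts as low compliance.
    if state.checkins_last_7_days < 4:
        agenda.append("discuss barriers to daily check-ins")
    # Assumed threshold for illustration only.
    if state.sleep_efficiency < 0.85:
        agenda.append("adjust sleep window")
    if state.program_week == 1:
        agenda.append("introduce the sleep diary")
    else:
        agenda.append(f"week {state.program_week} lesson")
    return agenda
```

The value of keeping this deterministic is that the model handles the conversation while plain, testable rules decide what the conversation should cover.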
Because sleep sits close to healthcare, the team drew a firm line between wellness and medical positioning. They implemented clear guardrails: no diagnosis, no medication advice, and strong boundaries on scope. Weekly error analyses with domain experts (sleep therapists) tightened quality and tone, and they adopted LLM-powered evals to enforce safety boundaries. For observability and evaluations, they leveraged Langfuse, and they experimented with Hamming for voice testing to refine the experience end-to-end.
Under the hood, this is a great example of “one bite of the apple at a time” product building in AI. Start with a simple interface, anchor on an evidence-based method, layer personalization with memory, formalize program structure with dynamic agendas, and shift to RAG when general knowledge outgrows prompt engineering. As a product leader, I see strong echoes of agentic patterns here—goal-oriented orchestration, stateful memory, and adaptive planning—shipped in pragmatic increments rather than as a monolithic platform rewrite.
A few takeaways I’m applying with my teams: First, segment deeply and pick a high-intent niche (those “DIY sleep hackers” were the right beachhead). Second, let modality fit the job—voice is not a gimmick when it boosts compliance and empathy. Third, design safety and scope from day one if you’re anywhere near health. Finally, invest early in evals and observability so you can improve with confidence, not hope.
If you want to explore the full conversation and product decisions, you can listen here: Spotify | Apple Podcasts.
Resources & Links:
Rest – AI sleep coach app
Vapi – Voice agent platform Rest uses
Langfuse – Observability and evals platform
Hamming – Voice testing platform
AI Evals Maven Course by Hamel Husain and Shreya Shankar
Bottom line: Rest demonstrates how to take a clinically grounded method like CBTI, translate it into a daily voice-first experience, and ship it with rigor. If you’re building in AI, this is a model worth studying—practical, safe, and deeply user-centered.
Every breakthrough we ship in AI reinforces a simple truth I live by: "Companies that prioritize data quality, governance, and structure will accelerate their AI initiatives the fastest." That statement captures the difference between flashy demos and durable, scalable products. In my experience, the strongest AI Strategy starts with the discipline to treat data as a product, not an afterthought.
When teams rush to production with generative AI or LLMs, the first issues rarely come from the model itself; they come from the data. Poor lineage leads to hallucinations, inconsistent schemas inflate costs, and weak access controls erode trust. For product managers working with LLMs, this is the gap between a compelling prototype and a reliable system customers depend on every day.
Let me clarify what I mean by data quality, governance, and structure. Quality is completeness, accuracy, freshness, and consistency across sources. Governance is policy, ownership, and accountability—privacy-by-design, regulatory compliance, and AI risk management built in from day one. Structure is the architecture: clear data contracts, standardized schemas, metadata and lineage, and role-based access that keeps sensitive signals protected while enabling speed.
Here’s the product playbook I use to operationalize this. First, map critical sources and define data contracts at the edges so producers and consumers can move independently. Second, standardize schemas and entity resolution to eliminate ambiguous joins. Third, enforce privacy-by-design with policy-as-code and automated redaction. Fourth, converge analytics into a unified analytics platform so definitions, freshness, and observability are shared. Fifth, instrument end-to-end lineage and quality SLAs with alerting. Finally, close the loop with human feedback and labeling to continuously improve model performance.
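For the third step, a minimal sketch of policy-as-code redaction might look like the following. The two regular expressions are deliberately simplified assumptions (basic email addresses and US-style phone numbers) and would miss many real-world formats; production systems typically use a dedicated PII detection service.

```python
import re

# Simplified, illustrative patterns; not exhaustive PII detection.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

# The policy is data: an ordered list of (pattern, placeholder) pairs
# that can be reviewed and versioned like any other code.
REDACTION_POLICY = [
    (EMAIL, "[EMAIL]"),
    (US_PHONE, "[PHONE]"),
]


def redact(text: str, policy=REDACTION_POLICY) -> str:
    """Apply every redaction rule in order and return the scrubbed text."""
    for pattern, placeholder in policy:
        text = pattern.sub(placeholder, text)
    return text
```

For example, `redact("Reach me at ana@example.com or 555-123-4567.")` returns `"Reach me at [EMAIL] or [PHONE]."`. Because the policy is just a list, adding a rule is a reviewed pull request rather than a tribal-knowledge change.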
For generative AI workloads, a retrieval-first pipeline is essential. Unify trusted sources (product analytics, CRM, support, docs), embed and index them with guardrails, and focus on context window management to keep prompts lean, relevant, and cost-effective. This approach improves response quality, reduces token spend, and makes updates near-real-time—without retraining the base model every week.
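Context window management often comes down to a budgeting problem. Here is a minimal sketch that greedily packs the highest-scoring retrieved chunks into a fixed token budget. The `pack_context` function is hypothetical, the relevance scores are assumed to come from your retriever, and the whitespace word count is only a stand-in for a real tokenizer.

```python
def pack_context(chunks, budget_tokens, count_tokens=lambda text: len(text.split())):
    """Select retrieved chunks for the prompt without exceeding the token budget.

    chunks: list of (relevance_score, text) pairs from a retriever.
    Returns (selected_texts, tokens_used).
    """
    selected, used = [], 0
    # Highest relevance first; skip anything that would overflow the budget.
    for _score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = count_tokens(text)
        if used + cost > budget_tokens:
            continue
        selected.append(text)
        used += cost
    return selected, used
```

Greedy packing is not optimal in every case, but it is predictable and cheap, which matters when the same budget logic runs on every request.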
Measure what matters. Tie model outcomes to product metrics through rigorous A/B testing, and size experiments with minimum detectable effect (MDE) so you can ship confidently. Use product analytics to verify that better data actually improves activation, retention, and support deflection. When teams can trace an AI improvement back to a specific data-quality fix, they invest in governance with conviction.
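To size an experiment from an MDE, a common approximation for comparing two conversion rates is the normal-approximation formula sketched below. The function name and the defaults (two-sided alpha of 0.05, 80% power) are my assumptions; your experimentation platform may use a different method.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate users needed per variant to detect an absolute lift of mde_abs.

    Uses the two-proportion normal approximation:
    n = (z_alpha + z_beta)^2 * (p1(1-p1) + p2(1-p2)) / mde_abs^2
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)
```

With a 20% baseline activation rate and a 2-point absolute MDE, this works out to roughly 6,500 users per arm. Halving the MDE roughly quadruples the requirement, which is why agreeing on the MDE before launch saves so many arguments afterward.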
Culture closes the gap. Empowered product teams and product trios (PM, design, engineering) make crisper decisions when data stewards are embedded and accountable. Clear ownership, shared definitions, and transparent dashboards reduce friction with security and compliance while speeding up delivery. This is how product management leadership sustains velocity without trading away trust.
The bottom line: if we want faster, safer, and more scalable AI, we start with the data. Build strong foundations, treat governance as enablement, and structure every step so improvements compound. With that in place, Generative AI stops being a science experiment and becomes a durable competitive advantage.
Inspired by this post on Amplitude – Perspectives.
Will AI replace software engineers or reshape their roles? Explore risks, opportunities, and alternative career paths in tech.
I’m often asked whether AI will make software engineers obsolete. My short answer: AI is already automating tasks, not eliminating the role. The engineers who learn to orchestrate models, systems, and stakeholders will create more value—not less. The real shift is from keystrokes to judgment, from writing code to designing socio-technical systems that deliver outcomes.
Today’s gen AI assistants (think Claude Code and a ChatGPT connector) excel at unit test scaffolding, boilerplate generation, refactoring, docstrings, and code search. When integrated into CI/CD, they can open draft pull requests, annotate diffs, and propose fixes. This lifts developer productivity and frees time for higher-leverage work: problem framing, architecture decisions, and customer discovery.
What changes in the role? We spend more cycles on product discovery, privacy-by-design, and AI Strategy, and fewer on repetitive implementation. We design agentic AI workflows that combine retrieval, tools, and guardrails; we evaluate trade-offs that blend performance, cost, and safety; and we partner with empowered product teams to ship the smallest valuable slice, learn, and iterate.
Measure what matters. If AI is working, DORA metrics should improve: higher deployment frequency, shorter lead time for changes, a stable or lower change failure rate, and shorter MTTR. Pair that with outcome-based rather than output-based OKRs to avoid gaming the system: shaving seconds off a build is meaningless if it doesn’t move activation, retention, or revenue. A unified analytics platform can help connect engineering signals to business impact.
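If you don't already have a dashboard for these, the four DORA metrics are simple to compute from deployment records. This is a minimal sketch; the `Deployment` record shape and field names are assumptions, and real data would come from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False                    # did this change cause a production failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    if not deployments:
        raise ValueError("need at least one deployment")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times) if restore_times else None
        ),
    }
```

Run it on the quarter before and the quarter after an AI pilot and compare the two dictionaries; that gives a like-for-like view before anyone debates the anecdotes.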
Risk is real—and manageable. AI risk management and data governance are now core competencies, not afterthoughts. Protect IP with robust access controls, context window management, and red-teaming. In production, instrument threat detection and response to catch prompt injection, data leakage, and model drift. Treat this like any other reliability discipline alongside SRE.
If parts of coding get automated, where can great engineers thrive? Several high-impact paths are emerging: platform engineering for LLMs (tooling, evals, observability), SRE for AI-infused systems, developer evangelism and education, product management for AI-native experiences, security engineering focused on model and data threats, and forward deployed engineers who pair with customers to solve messy, real-world problems.
How to upskill fast: build an AI product toolbox and ship small. Prototype gen AI features end-to-end (retrieval, function calling, human-in-the-loop QA) and connect them to your CRM or support stack. Use A/B testing with a clear minimum detectable effect (MDE) to validate impact. Leverage CustomGPT workflows for internal enablement and in-app guides or product tours to onboard users safely.
Here’s a pragmatic 90-day plan. Week 0–2: audit your top 10 engineering tasks by time spent; identify 3 that are ripe for AI augmentation. Week 3–6: pilot inside CI/CD with explicit guardrails; track DORA metrics and developer sentiment. Week 7–10: productionize the wins; document runbooks; add incident management paths. Week 11–12: share learnings with product trios, refine your value proposition, and set next-quarter OKRs.
AI won’t replace software engineers; engineers who master AI will outpace those who don’t. If we embrace the shift—toward systems thinking, responsible governance, and customer outcomes—we’ll build better products faster and open new, rewarding career paths. The opportunity is here and compounding.