Tag: LLMs for product managers

Own Your AI: 4 Essential Roles to Supercharge Support and Prevent Performance Drift by 2026

AI doesn’t fail because the model is bad; it fails because ownership is missing.

When someone truly owns your AI, everything changes. Resolution and automation rates climb, the system self-improves, and the customer experience transforms in ways a dashboard alone will never show you.

This is part three of our five-part series on customer service planning for 2026. We’ll be sharing all five editions on our blog and on LinkedIn.

If you’d rather have them emailed to you directly as they’re published, drop your details here.

Last week, we introduced the four roles that make AI actually work in a support organization. These roles are already showing up inside the teams who are scaling AI the fastest, and this week, we get closer to the ground.

Here’s what these roles look like in practice — what they do, how they work, and why your AI performance will inevitably drift without them.

AI operations lead — owns AI performance, every day. I think of this person as the air-traffic controller for our AI Agent. I treat the AI as a living system that needs ongoing supervision, evaluation, and tuning. This role is accountable for what leaders care about most: quality, reliability, and continuous improvement.

The AI ops lead sees the whole picture: conversation quality, missing knowledge, flawed assumptions, unexpected failures, new opportunities for automation, and the subtle signals that the system is beginning to drift. In practice, that vigilance is the difference between steady gains and slow decline.

Day-to-day, here’s what I expect from this role.

1. Reviews AI conversations and surfaces performance patterns. The AI ops lead monitors the AI Agent’s behavior — the tone shift after a product launch, a sudden dip in resolution for a specific intent, or conversation clusters revealing new customer behavior. They scan for anomalies, trends, and early warnings, with an emphasis on what’s happening right now, not last week. Without this intentional ownership, I’ve watched a 2% dip turn into a 10% drop in days.

2. Prioritizes fixes and improvements. Once patterns emerge, they triage fixes like a product team handles bugs. Missing or incorrect content? They route it to the knowledge manager. Behavioral issues? They adjust guidance and guardrails. Action or system issues? They partner with the automation specialist. This connective tissue turns individual fixes into compounding improvements.

3. Defines and maintains AI guardrails. Leaders everywhere worry about AI doing things it shouldn’t. This role answers that fear by establishing clarification logic, escalation rules, “never answer” policies, and safety boundaries. The goal is predictable behavior that protects customer trust, an essential pillar of any AI strategy and AI risk management practice.

4. Aligns reporting with leadership. The AI ops lead reports on resolution rate, CX Score, CSAT, automation coverage, and hours saved, making the economic impact visible. That visibility is a foundational step in any credible customer support AI strategy.
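To make the first responsibility concrete, here is a minimal sketch of how a per-intent resolution dip could be flagged before it compounds. Everything in it is an assumption for illustration: the `DailyStats` record, the `find_drifting_intents` helper, and the two-point `max_drop` threshold are hypothetical and do not correspond to any Intercom or Fin API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DailyStats:
    """One day of AI Agent volume for a single intent (hypothetical shape)."""
    intent: str
    conversations: int
    resolved: int


def resolution_rate(days):
    """Resolved share across a list of DailyStats; 0.0 when there is no volume."""
    total = sum(d.conversations for d in days)
    resolved = sum(d.resolved for d in days)
    return resolved / total if total else 0.0


def find_drifting_intents(baseline, recent, max_drop=0.02):
    """Return {intent: drop} where the rate fell by more than max_drop.

    max_drop is an assumed tolerance (two percentage points), not a standard.
    Only intents present in both windows are compared.
    """
    alerts = {}
    shared = {d.intent for d in baseline} & {d.intent for d in recent}
    for intent in sorted(shared):
        before = resolution_rate([d for d in baseline if d.intent == intent])
        after = resolution_rate([d for d in recent if d.intent == intent])
        drop = before - after
        if drop > max_drop:
            alerts[intent] = round(drop, 4)
    return alerts
```

Comparing a short recent window against a longer baseline keeps the emphasis on what is happening right now; the flagged intents are then what the AI ops lead triages to the knowledge manager or automation specialist.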

Why this role exists now. AI systems are dynamic and require constant tuning. A small dip in quality quickly becomes an operational issue, and no existing role naturally owns that. When someone does, teams feel the benefit almost immediately.

Knowledge manager — builds and maintains the structured knowledge AI depends on. I hear the same thing from leaders again and again: AI is only as good as the content you give it. This role is rapidly evolving from classic knowledge management into knowledge strategy — part content designer, part systems thinker, part information architect. Their job is to build the knowledge scaffolding that lets AI answer accurately, consistently, and safely.

Here’s how the knowledge manager creates leverage.

1. Writes, maintains, and improves support knowledge — continuously. After every product change, they update articles, remove duplication, resolve contradictions, and pay down “knowledge debt” that quietly erodes accuracy. The upkeep is shaped by AI performance; when patterns expose gaps, they fix the source.

2. Structures knowledge for AI, not for browsing. Traditional help centers are for humans skimming pages. AI needs clean intent signals, crisp formatting, and clearly structured language. The knowledge manager designs that structure as intentionally as the content itself.

3. Works hand-in-hand with AI ops. Many performance issues stem from missing or unclear knowledge. When the AI ops lead surfaces recurring misunderstandings or low-resolution categories, the knowledge manager resolves the root cause at the source.

4. Ensures accuracy and compliance at scale. As AI handles more sensitive situations, the knowledge manager safeguards correctness, currency, and compliance — critical for data governance and regulatory alignment.

5. Develops a cross-functional knowledge strategy. The role creates a canonical, cross-functional source of truth that product, engineering, product marketing, go-to-market, and support (AI and human) can all rely on.

Why this role exists now. This is one of the highest-leverage positions in an AI-first support org. Teams like Rocket Money and Anthropic are hiring knowledge managers because AI accuracy depends on the quality of knowledge feeding it. Without this role, resolution rate caps out early and never climbs.

Conversation designer — designs how the AI speaks, clarifies, and interacts. AI isn’t just a tool customers use; it’s a representative they interact with. Tone, clarity, pacing, and conversational structure matter, especially in voice. Every word affects perceived expertise, trustworthiness, and brand. The conversation designer ensures the AI feels human-friendly without pretending to be human — the sweet spot that builds trust without misleading customers.

In my experience, staffing conversation design early accelerates results. It changes not only how we tune AI, but how we understand the end-to-end customer experience.

Here’s what great conversation design looks like.

1. Shapes the AI’s tone, voice, and communication style. This role refines phrasing, tunes politeness, adjusts how confusion is handled, and shapes micro-interactions that determine whether customers feel cared for or dismissed. On voice channels, natural cadence is make-or-break.

2. Designs flows for high-value conversations. They design how the AI clarifies intent, branches, communicates uncertainty, verifies details, escalates, hands off, and returns to the main thread without feeling mechanical — treating customer experience as a product with language as the interface.

3. Translates procedures and complex workflows into natural language and logic. As AI runs structured procedures and actions, this role becomes a conversational system architect, translating SOPs into conditional logic with exceptions and fallbacks. For example, in Intercom, our conversation designer uses Simulations to run simulated conversations to see where the AI Agent gets confused, over-confident, or awkward, and refine flows until the interaction feels effortless end-to-end.

4. Ensures transitions to humans feel smooth and respectful. Handoffs should provide clear context to the human agent and maintain continuity so customers never feel dropped.

Why this role exists now. As AI becomes the primary interface, conversation design directly influences trust, brand perception, and operational outcomes. It’s a core competency for any generative AI program, and for product managers working with LLMs.

Support automation specialist — builds the backend actions that allow AI to do real work. If the conversation designer shapes expression, this role shapes capability. They transform AI from an answering machine into an outcome engine by bridging AI and the systems it must safely and deterministically act on.

Support teams increasingly expect AI to do what a human would do: refund a charge, adjust a subscription, verify an identity, update an account setting, or pull relevant data. That expectation creates a new technical role at the edge of support, ops, and engineering.

What I rely on this specialist to deliver.

1. Creates and maintains backend workflows the AI executes. This includes building and maintaining Fin Tasks, Fin Procedures with embedded steps, action flows that call internal and external APIs, and automations that span billing systems, user identity layers, CRM objects, subscription entitlements, refund tools, and more. They ensure the AI can act compliantly and predictably; these are the playbooks that turn intent into action.

2. Owns the integrations required for advanced automation. Many problems require data elsewhere — billing platforms, internal databases, systems of record. The specialist ensures the AI can retrieve, validate, and use that information safely, often partnering closely on CRM integration and internal services.

3. Partners closely with product and engineering. Some workflows require new endpoints, permission layers, safety gates, or deterministic fallbacks. This role drives those changes across the stack.

4. Ensures reliability and safety at every step. Guardrails, validation logic, exception handling, safe execution paths — all are essential. They confirm that the AI has access to the correct data, the action matches policy, edge cases are accounted for, risky flows have deterministic constraints, and every action is auditable and reversible.

Why this role exists now. Customers don’t want answers; they want outcomes. AI can now deliver those outcomes, but only with the right backend scaffolding. This role modernizes operational architecture and unlocks end-to-end automation.

How these roles work together — the new operating loop. These roles aren’t silos; they’re interdependent parts of one system. The AI ops lead identifies patterns and performance gaps. The knowledge manager resolves inaccuracies or missing content. The conversation designer improves clarity, tone, and flow. The automation specialist expands the system’s ability to take action. Each improvement compounds the next, moving you from early automation to transformational resolution rates through continuous refinement.

This loop is what separates teams that plateau early from teams that scale AI into a reliable, high-performing system. It is the essence of a durable AI strategy.

How to get started (even if you can’t hire all four roles today). Most teams phase into this model: assign partial ownership, formalize responsibilities, then specialize as AI volume grows. Here’s the progression I recommend.

Phase 1: Assign ownership. Give each role’s core responsibilities to someone who can devote five to ten hours a week. Early on, support ops, enablement, senior ICs, and technically inclined teammates can anchor the work.

Phase 2: Formalize the responsibilities. As AI resolves more queries, optimization becomes core operational work. Formalizing ownership prevents performance drift and knowledge debt.

Phase 3: Specialize and hire. Once AI handles 50–70% of incoming volume, these responsibilities become full-time roles. Investing in specialization becomes essential infrastructure for the next scale stage.

The bottom line. AI changes the shape of your support team. These four roles — AI operations lead, knowledge manager, conversation designer, and support automation specialist — form the backbone of the AI-first support organization. They bring order to a constantly changing environment and enable AI to deliver the outcomes leaders and customers expect heading into 2026.

Next week, we’ll continue the 2026 planning series with a deep dive into org design models for AI-first support teams — how to structure people, workflows, and accountability in a world where AI resolves most conversations before a human ever sees them.

Inspired by this post on The Intercom Blog.

December 2, 2025
AI vs. Human Judgment in Customer Interviews: The Hard‑Won Lessons That Changed My Mind

I recently revisited a topic I once pushed back on: using AI to analyze (and maybe even synthesize) customer interviews. After six months of real-world experiments and countless conversations with seasoned product leaders, I’ve evolved my perspective. There is meaningful value here—but only when we’re clear about where AI helps and where it quietly erodes the hard-won customer understanding that powers great product decisions.

If you want to experience the conversation that sparked this reflection, you can listen to the episode on Spotify or Apple Podcasts, or watch the discussion on YouTube. It’s a candid, practical exploration of AI’s role in continuous discovery, and it mirrors what I’m seeing on the ground with product trios and empowered product teams.

Here’s the crux: AI raises the floor for beginners but accelerates experts even more. That matches my experience—early-career PMs get structure, momentum, and a confidence boost, while experienced interviewers can move faster without sacrificing nuance. But there’s a catch. If your interviewing skills aren’t solid yet, AI can create a veneer of insight that masks shallow understanding. In other words, it can help you go wrong more efficiently.

The conversation makes an important distinction between analysis and synthesis. Analysis is about extracting signals from the interview. Synthesis is about building meaning—connecting patterns, weighing contradictions, and deciding what to do next. AI can speed up the former with summaries and highlights. The latter—true synthesis—still demands expert judgment, context, and empathy.

One line from the episode stuck with me: your unpolished interview skills matter more than any shiny new AI workflow. I’ve felt that firsthand. When interview quality is uneven, dropping transcripts into an LLM won’t save you. You still need to synthesize every interview individually so the signals remain traceable and credible. That discipline keeps teams aligned, prevents overfitting to noise, and builds the organizational memory that fuels better bets.

We also explored the operational reality most teams face: interviews pile up. Backlogs grow. Leaders want speed. This is where “expert + AI” shines. With the right prompts, templates, and context, tools like ChatGPT and Claude can help transform raw transcripts into structured artifacts you can trust—provided a strong interviewer sets the frame and makes the calls. That balance preserves both velocity and quality.

What changed my mind most was the evidence from experiments—running sets of interviews through different LLMs and comparing outcomes. The patterns were consistent: beginner + AI is usually better than nothing, but the real performance gains come from expert + AI. When experts guide the process, AI becomes an accelerant rather than a crutch.

A favorite story in the episode takes a detour into building a gaming PC—an unexpected but perfect metaphor for AI’s limits. You can get great step-by-step guidance from a model, but when context shifts or edge cases appear, expertise is what keeps you from making expensive mistakes. Customer interviews are like that. Empathy comes from human interaction; AI can’t replace the experience of talking directly to your customers.

My practical guidance for teams integrating AI into continuous discovery: start with interviewing fundamentals, separate analysis from synthesis, and standardize how you capture single-interview learnings. If you need a tight template for this, refer to “The Interview Snapshot: How to Synthesize and Share What You Learned from a Single Customer Interview.” Use AI for summaries, clustering, and draft artifacts—but have an expert finalize the narratives, evaluate trade-offs, and document assumptions.

If you’re scaling this across an organization, invest in training first, then in workflows. Build a lightweight operating system for discovery: consistent interview guides, “story-based” techniques, and a shared library of prompts. Consider resources like “The Interview Coach,” as well as practical write-ups such as “Customer Interview Analysis: Where AI Helps and Hurts.” These help teams avoid common pitfalls and make better use of AI in high-judgment moments.

My bottom line: AI isn’t magic. It can help, but only if your interviews are strong and you provide the right context. Customer understanding is a competitive moat; outsourcing it entirely will cost you in the long run. Use AI to accelerate—not replace—the human judgment that makes product discovery work.

Resources and links worth exploring: ChatGPT, Claude, The Interview Snapshot: How to Synthesize and Share What You Learned from a Single Customer Interview, The Interview Coach, and Customer Interview Analysis: Where AI Helps and Hurts.

I’d love to hear how your team is using AI in discovery. What’s working, what’s risky, and where do you draw the line between automation and judgment? Share your experiences in the comments—our community learns faster when we compare notes.

Inspired by this post on Product Talk.

December 2, 2025
Unlock AI Product Roadmaps: Essential Tools Every PM Needs to Prioritize and Ship Faster

In my role leading product teams, the AI product roadmap isn’t just a plan—it’s the operating system for how we discover value, prioritize with rigor, and ship with confidence. The pace has changed, the stakes are higher, and the best product managers are now orchestrating AI capabilities, data, and customer insight in near-real time.

Master the evolving art of the AI product roadmap. Prioritize smarter, and turn data into direction and insight into action, only much faster.

When I say “AI product roadmap,” I’m talking about a living system that blends strategy, discovery, and delivery. It’s less about dates and more about outcomes, risk reduction, and sequencing learning. In practice, that means combining AI Strategy with product roadmapping and sprint planning, then validating each bet with real customer signals.

For prioritization, I anchor on OKRs that measure outcomes rather than output and connect them to measurable signals across the funnel. Continuous discovery keeps insights flowing, while a unified approach to analytics and retention analysis tells me where the lift is. This lets me rank initiatives not just by impact and effort, but by how quickly we can learn, iterate, and compound value.

On discovery, product trios are non-negotiable. We prototype early with generative AI and LLMs to accelerate concept validation and reduce ambiguity. When customers can co-create through in-app guides or lightweight product tours, we turn vague needs into crisp problem statements and testable hypotheses far faster.

On delivery, I pair tight feedback loops with experimentation. A deliberate cadence of A/B testing and strong instrumentation ensures we’re learning every sprint, not just launching. The goal is to de-risk decisions quickly, keep momentum high, and translate signals into roadmap movement without thrash.

Under the hood, the AI stack matters. I rely on a retrieval-first pipeline to ground models in trusted data, and I’m intentional about privacy-by-design and data governance from day one. As agentic AI patterns emerge, I put evaluation workflows in place so we can ship confidently—and safely—without slowing down innovation.

Finally, alignment is the multiplier. Clear narrative roadmaps tied to customer outcomes help stakeholders see trade-offs, while crisp interfaces with go-to-market and CRM integration close the loop from roadmap to revenue. When everyone can trace a line from AI strategy to shipped value, prioritization becomes easier and trust grows.

If you’re feeling the acceleration, you’re not alone. With the right AI product toolbox—rooted in discovery, grounded in data, and delivered through tight feedback loops—you can move faster, learn smarter, and build products your customers can’t live without.

Inspired by this post on Product School.

December 1, 2025
AI Product Owner in 2026: The High-Impact Role Every Team Needs to Win With AI

By 2026, the AI Product Owner will be the keystone role that turns AI strategy into measurable business outcomes. In my teams, this seat bridges market insight, model capability, data governance, and shipping velocity—so product decisions are not just clever, but compliant, reliable, and fast.

I often describe the remit simply: "Here is your clear guide to the AI product owner role (skills, responsibilities, how it differs from PM) and ways AI tools supercharge delivery." In practice, the AI Product Owner translates business goals into model-backed experiences, aligns cross-functional execution, and ensures the product’s AI behavior remains safe, lawful, and on-brand under real-world constraints.

How does this differ from a traditional PM? While Product Management sets portfolio strategy, positioning, and market narratives, the AI Product Owner owns the AI experience end-to-end: data readiness, evaluation harnesses, safety guardrails, and the iterative model improvements that move outcome-based (rather than output-based) OKRs. I anchor the role inside empowered product teams and product trios (PM/Design/ML Eng) to keep discovery continuous and delivery disciplined.

On responsibilities, I expect four pillars. First, discovery: continuous discovery with customers and internal experts to uncover use cases where generative AI or LLMs beat the status quo. Second, experience: define the right interaction patterns for AI UX, including retrieval-first pipeline choices, context window management, and feedback loops for human-in-the-loop correction. Third, governance: privacy-by-design, AI risk management, data governance, and regulatory compliance baked into the roadmap. Fourth, delivery: CI/CD for models and prompts, observable evaluation with A/B testing and minimum detectable effect (MDE), and SRE-grade incident management when AI behavior drifts.

Skills-wise, I look for product sense plus technical fluency. That includes working knowledge of LLMs (prompting, grounding, RAG), analytics mastery (Amplitude analytics, retention analysis, activation metrics), and comfort with DORA metrics such as deployment frequency to keep iteration fast but safe. Strong stakeholder management and clear writing are non-negotiable: AI capabilities evolve fast, and leaders must see risk, cost, and ROI with no ambiguity.

AI tools truly supercharge delivery when they eliminate bottlenecks. My practical stack: an AI product toolbox with Claude Code and a ChatGPT connector for rapid prototyping; CustomGPT workflows for support triage and internal knowledge; Pendo product tours and in-app guides to validate behavior changes; Intercom for our customer support AI strategy; and tight CRM integration via HubSpot to measure revenue impact. The outcome is faster idea-to-learning cycles, sharper telemetry, and far cleaner handoffs.

For roadmapping, I prioritize thin slices that prove value early—shipping narrowly scoped assistants or copilots, then expanding with product roadmapping and sprint planning that ties capability unlocks to outcomes. A unified analytics platform helps compare human-only baselines to augmented workflows, while agentic AI patterns automate routine steps under strict guardrails.

Risk is a product surface, not a side task. I require explicit policy gates (PII handling, red-teaming, bias audits), clear escalation paths, and incident playbooks. When we treat policy and reliability as features, customers reward us with deeper adoption and higher trust.

If you’re pursuing the AI Product Owner path, build a portfolio around shipped learnings: the experiment you killed with data, the safety constraint you designed, the postmortem you led, and the business metric you moved. That story—evidence of disciplined discovery, responsible delivery, and real-world results—is exactly what teams (and boards) want to see in 2026.

Inspired by this post on Product School.

November 26, 2025
25 High-Impact Career Paths for Software Engineers Beyond Coding: My Real-World Playbook

I’ve spent years helping talented engineers explore what’s next when pure coding no longer feels like the only—or best—path. From hiring across cross-functional teams to mentoring career pivots, I’ve seen firsthand how engineering strengths translate into high-leverage roles that shape product, strategy, and growth.

Software engineers have alternative career options leveraging their skills in roles like product manager, data scientist, business analyst, and 22 more.

When an engineer moves into product management, they’re not starting from scratch; they’re redirecting problem-solving, systems thinking, and customer empathy toward outcomes. In practice, that means mastering product discovery, strengthening stakeholder management, and getting fluent in product roadmapping and sprint planning, so decisions are guided by impact rather than by confusing outputs with outcomes. I’ve watched this transition unlock empowered product teams and clearer prioritization across complex backlogs.

Data-oriented paths are equally compelling. If you enjoy experimentation and evidence-based decisions, roles in analytics or data science reward rigor. Think A/B testing, identifying the minimum detectable effect (MDE), and using tools like Amplitude analytics to translate behavioral signals into product bets. Pair that with retention analysis and you’ll become indispensable to growth conversations.
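If MDE is new to you, a small calculation helps. The sketch below uses the standard normal-approximation formula for comparing two conversion rates to estimate how many users each arm of an A/B test needs before a given absolute lift becomes detectable. The function name, the default 5% significance level, and the 80% power are my assumptions for illustration; dedicated experimentation tools may use different formulas or corrections.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion test.

    baseline_rate: current conversion rate, e.g. 0.10
    mde_abs: smallest absolute lift worth detecting, e.g. 0.02
    alpha, power: assumed defaults, not values taken from the article
    """
    treated_rate = baseline_rate + mde_abs
    if not 0 < baseline_rate < 1 or mde_abs <= 0 or treated_rate >= 1:
        raise ValueError("rates must stay strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power

    # Sum of the Bernoulli variances of the control and treatment arms.
    variance = (baseline_rate * (1 - baseline_rate)
                + treated_rate * (1 - treated_rate))
    return math.ceil((z_alpha + z_power) ** 2 * variance / mde_abs ** 2)
```

With a 10% baseline and a two-point absolute MDE, this approximation asks for roughly 3,800 users per arm, and halving the MDE raises the requirement several times over. That trade-off is why the MDE has to be agreed before the test starts, not after.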

Business-facing roles such as business analyst or product marketing manager are ideal if you’re energized by customer problems and market narratives. Your engineering fluency sharpens value propositions, product positioning, and go-to-market strategy in a way that resonates with both buyers and builders. In my teams, the best bridges between product and revenue often came from former engineers who could articulate trade-offs with clarity.

If operational excellence is your edge, consider SRE, DevOps, or cybersecurity. The same instincts that push you toward clean CI/CD pipelines and resilient architectures translate well into incident management, threat detection and response, and privacy-by-design practices. These roles reward systems thinking and the ability to balance reliability with delivery speed.

For engineers who love community and storytelling, developer evangelism is a natural fit. You’ll translate complex concepts into actionable guidance, from in-app guides and product tours to UX writing and documentation. The best evangelists I’ve worked with turn feedback loops into product insight, strengthening activation and product-led growth without heavy sales pressure.

Customer-facing technical roles—solutions engineer, forward deployed engineer, or technical consultant—let you stay close to the product while solving real-world problems. You’ll drive onboarding quality, user activation, and adoption while surfacing insights that influence roadmaps. Done well, this work tightens the loop between customer outcomes and product decisions.

AI-centered roles are expanding rapidly. If you’re curious about AI Strategy, retrieval-first pipelines, or the practical use of LLMs for product managers, you can bring an engineer’s discernment to a noisy space. The most valuable contributors here pair pragmatic architecture choices with clear risk management and measurable business value, not hype.

Leadership tracks remain a strong option too. The IC to manager transition isn’t about title; it’s about raising the ceiling for others. You’ll coach empowered product teams, shape organizational development, and align initiatives to defensible metrics—think DORA metrics for flow, leading indicators for value, and OKRs that measure outcomes over output.

If you’re exploring a pivot, start small and intentional. Run “career A/B tests” by taking on cross-functional projects, shadowing adjacent roles, or shipping a lightweight portfolio that demonstrates the new muscle. Join a ProductCon session, practice conference networking, and refine a narrative that links your engineering foundation to the outcomes your target role owns.

Finally, map your personal unfair advantages—domain knowledge, systems thinking, customer empathy, or operational rigor—to the roles that value them most. With focus, you can reposition your engineering experience into a differentiated story that accelerates your next chapter. The breadth of options is real, and with a deliberate plan, you’ll turn curiosity into conviction—and conviction into impact.

Inspired by this post on Product School.

November 24, 2025
Mastering Data Governance in the AI Era: Move Fast, Reduce Risk, and Unlock Trusted Insights

Every week, I’m in conversations with product leaders, engineers, and security teams who are trying to ship AI features faster without compromising trust. The tension is real: stakeholders want velocity, customers want transparency, and regulators want accountability. That’s exactly where modern data governance earns its keep.

New AI pressures are redefining what good governance takes. Learn how to build better frameworks, move fast with confidence, and keep your data from being a black box.

In my role leading product management, I’ve learned that robust data governance isn’t a compliance checkbox—it’s a strategic capability. When we treat governance as a product, we architect for clarity, safety, and speed. That means aligning AI Strategy with day-to-day delivery so teams know what they can ship, when, and why.

Here’s the practical blueprint I rely on. First, establish ownership and a shared language. Create a living data catalog, lineage maps, and clear data classifications so teams know which assets are sensitive, regulated, or eligible for training LLMs. Second, harden privacy-by-design and least-privilege access. Bake PII detection, secrets management, and role-based policies directly into your workflows. Third, bring quality and observability to the forefront: instrument data contracts, monitor drift, and track model performance across environments. Finally, implement model governance end to end—dataset cards, model cards, bias testing, human-in-the-loop review, and a repeatable evaluation harness.

To move fast with confidence, make governance invisible and automated. Treat policies as code in CI/CD, gate deployments with pre-merge checks, and fail builds that violate data contracts. Log prompts and outputs responsibly, route unsafe patterns to red-teaming, and use a retrieval-first pipeline to anchor models on verified sources rather than fragile context stuffing. This is how we scale AI product development while keeping audit trails complete and costs in check.
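As one illustration of "policies as code," here is a minimal sketch of a data contract check that a CI step could run before merge. The contract format, the column names, and the `check_contract` and `ci_exit_code` helpers are all hypothetical; real teams typically lean on a schema or validation tool rather than hand-rolled checks like this.

```python
# Hypothetical contract: column name -> expected Python type.
CONTRACT = {
    "user_id": str,
    "plan": str,
    "monthly_spend": float,
}


def check_contract(rows, contract):
    """Return a list of human-readable violations (empty means compliant)."""
    violations = []
    for index, row in enumerate(rows):
        # Columns outside the contract (for example, stray PII) are rejected.
        for column in sorted(row.keys() - contract.keys()):
            violations.append(f"row {index}: unexpected column '{column}'")
        for column, expected_type in contract.items():
            if column not in row:
                violations.append(f"row {index}: missing column '{column}'")
            elif not isinstance(row[column], expected_type):
                violations.append(
                    f"row {index}: '{column}' should be {expected_type.__name__}"
                )
    return violations


def ci_exit_code(rows, contract=CONTRACT):
    """Map violations to a process exit code: 0 passes the build, 1 fails it."""
    return 1 if check_contract(rows, contract) else 0
```

A pipeline step would pass the result of `ci_exit_code` to `sys.exit`, so a sample batch carrying an unexpected `email` column fails the build instead of reaching production.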

Avoiding the black-box problem starts with transparency. Document assumptions, training data sources, and known limitations—then expose explanations where it matters in the product experience. Pair this with a unified analytics platform to tie telemetry, feature flags, and user feedback to model changes. When something goes sideways, your observability, incident management playbooks, and threat detection and response processes should make root-cause analysis fast and defensible.

If you’re building your program from scratch, use a 30-60-90 approach. In the first 30 days, inventory systems, classify data, and map high-risk use cases. By day 60, formalize RACI for governance, deploy access controls, and set up your evaluation pipeline with golden datasets and measurable acceptance thresholds. By day 90, operationalize incident response, conduct tabletop exercises, and wire governance outcomes into OKRs—think time-to-approval for high-risk changes, reduction in production incidents, and model evaluation pass rates.
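The day-60 evaluation pipeline can start very small. The sketch below scores a model against a golden dataset and compares the pass rate to an acceptance threshold. The golden cases, the substring match, the `stub_model` stand-in, and the 0.9 threshold are all assumptions for illustration; a real harness would call your actual model and use richer grading.

```python
def pass_rate(model, golden_cases):
    """Share of golden cases whose answer contains the expected phrase.

    model: any callable taking a prompt string and returning an answer string.
    golden_cases: list of (prompt, phrase_that_must_appear) pairs.
    """
    if not golden_cases:
        raise ValueError("golden dataset is empty")
    passed = 0
    for prompt, must_contain in golden_cases:
        answer = model(prompt)
        if must_contain.lower() in answer.lower():
            passed += 1
    return passed / len(golden_cases)


def release_allowed(rate, threshold=0.9):
    """Acceptance gate; the 0.9 default is an assumed threshold."""
    return rate >= threshold


# Hypothetical golden cases and a stub standing in for a real model call.
GOLDEN_CASES = [
    ("How do I reset my password?", "reset link"),
    ("What is the refund window?", "30 days"),
    ("Can I export my data?", "csv"),
]


def stub_model(prompt):
    canned = {
        "How do I reset my password?": "We will email you a reset link.",
        "What is the refund window?": "Refunds are available within 30 days.",
    }
    return canned.get(prompt, "I'm not sure.")
```

Tracking `pass_rate` per release gives you the "model evaluation pass rate" that can be wired into OKRs by day 90.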

This playbook pays off in board conversations and with customers. You can articulate your AI risk management posture, show measurable progress on regulatory compliance, and demonstrate how governance accelerates—not hinders—delivery. Most importantly, your teams gain the confidence to experiment, knowing there’s a safety net that protects users, the brand, and the business.

If your organization is wrestling with how to balance innovation and control, start small, codify what works, and scale with intent. With the right foundations in data governance, AI becomes an engine for durable advantage—not a source of sleepless nights.

Inspired by this post on Amplitude – Perspectives.

November 21, 2025
How I Use ChatGPT to Supercharge Product Management: Workflows, Prompts, and PM Playbooks

I treat ChatGPT as a force multiplier across the entire product lifecycle, from discovery and strategy to delivery and growth. Below are the workflows, prompts, and practical PM tips that show how it quietly reshapes product management behind the scenes.

My goal is pragmatic: turn generative AI into repeatable, measurable leverage for product discovery, roadmapping and sprint planning, stakeholder management, and product-led growth without sacrificing quality, privacy-by-design, or judgment. This is how I apply LLMs as a product manager in a way that strengthens customer empathy and speeds up decision cycles.

In discovery, I use ChatGPT to synthesize interviews, categorize sentiment, and surface emergent themes faster than a manual pass. I’ll feed it anonymized notes and ask for Jobs-to-be-Done statements, contradictory signals to validate, and the top three risks to our hypotheses. When the corpus gets large, I pair it with a retrieval-first pipeline and apply context window management so outputs stay grounded in real customer data.

On strategy and positioning, I draft and refine a crisp value proposition, clarify points of parity, and identify competitive differentiation. I ask ChatGPT to convert inputs into OKRs framed around outcomes rather than outputs, pressure-test assumptions, and produce a one-page narrative that even non-technical stakeholders can engage with. The result is faster alignment and fewer meetings to get to the same level of clarity.

For planning and delivery, I use ChatGPT to accelerate PRD outlines, user stories, and acceptance criteria, while explicitly requesting edge cases, failure states, and non-functional requirements. I’ll have it map risks to mitigations and suggest simple instrumentation aligned to DORA metrics and incident management readiness—useful when we’re iterating within a CI/CD cadence.

In experimentation, ChatGPT helps me frame strong A/B testing plans, calculate a minimum detectable effect (MDE), and sanity-check sample sizes. I also use it to translate metrics into plain-language updates for the team, connect learnings to the next experiment, and propose follow-up analyses of retention or activation bottlenecks.
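Because I always verify the numbers myself, I keep a small sample-size check on hand. This sketch uses the standard normal-approximation formula for a two-sided, two-proportion test; the function name and the default alpha and power values are my own choices.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users per arm to detect an absolute lift of mde_abs."""
    if mde_abs <= 0:
        raise ValueError("mde_abs must be positive")
    if not 0 < baseline < 1 or not 0 < baseline + mde_abs < 1:
        raise ValueError("rates must be strictly between 0 and 1")
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)
```

The useful intuition is in the denominator: halving the MDE roughly quadruples the required sample, which is usually the fastest way to show a stakeholder why a small effect cannot be tested in a week.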

For growth and onboarding, I prompt ChatGPT to generate hypotheses for user activation, in-app guides, and tooltip design that match personas and JTBDs. It drafts variations I can quickly test through Pendo or similar tools, supports product-led growth motions, and helps craft contextual copy that aligns with our value proposition without adding cognitive load.

Stakeholder communications get sharper and faster. I’ll ask for concise executive summaries, a version tailored for engineering leaders, and another for customer-facing teams. It’s especially effective for QBR and OKR updates, where I need crisp narratives tied to outcomes, plus a plain-English articulation of risks and trade-offs for empowered product teams.

The guardrails matter. I set clear AI risk management boundaries, prevent any sensitive data from entering prompts, and align usage with data governance and regulatory compliance requirements. I also version and review prompts just like product artifacts, so the best ones evolve into a durable AI product toolbox the whole team can use.

If you’re getting started, pick one high-friction workflow (say, interview synthesis or PRD drafting) and timebox a week to build a repeatable prompt set and review rubric. Measure cycle-time savings and quality deltas, then expand to a second workflow. Within a month, you’ll have a lightweight operating model for AI strategy that compounds across your roadmap.

Inspired by this post on Product School.

November 20, 2025
How We Built an AI Sleep Coach: CBTI, Voice AI, and a Product Playbook for Better Rest

What if your morning started with a helpful check-in from a voice AI that actually improves your sleep—using the same core principles that typically cost thousands of dollars and come with year-and-a-half waitlists? That idea energizes me as a product leader, because it blends clinical-grade outcomes with consumer-grade accessibility. Recently, I dug into how the team at Rest built an AI sleep coach inspired by Cognitive Behavioral Therapy for Insomnia (CBTI), and why their method offers a repeatable blueprint for complex, personal AI products.

The origin story is a classic product discovery moment. Rest’s team noticed that a meaningful slice of users in their podcast app were using audio to fall asleep. Although it represented only about 10% of users, that group showed a high willingness to pay. That signal pushed them to explore a dedicated sleep solution, moving from a general audio app to a targeted sleep experience—and eventually toward an AI-powered coach as LLMs matured.

Through jobs-to-be-done research, they identified a clear, underserved segment: “DIY sleep hackers.” These are motivated users who want agency, structure, and results without navigating clinical systems. Choosing CBTI (a clinically proven approach with 80% efficacy) gave the product a strong evidence-based foundation while remaining accessible as a wellness tool. It’s the kind of strategic choice I look for: credible, measurable, and aligned with user motivation.

The product evolution moved in smart, incremental steps. Rest started with a basic text chatbot before graduating to a voice-first experience—using Vapi for voice and OpenAI for reasoning. Voice changed the relationship dynamic: it increased intimacy, lowered friction for daily check-ins, and made behavioral coaching feel human without pretending to be. The team built a memory system that tracks context (like traveling or having a dog) with time-based relevance, which keeps conversations fresh, respectful, and genuinely personalized.
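I don't know how Rest implements its memory system, so treat the following as my own minimal sketch of "time-based relevance": each remembered fact carries an optional relevance window, and only facts still inside their window are passed to the conversation. All names and windows here are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Memory:
    text: str
    noted_at: datetime
    # None means the fact does not expire (for example, "has a dog").
    relevant_for: Optional[timedelta] = None


def active_memories(memories, now):
    """Return the text of memories whose relevance window has not elapsed."""
    return [
        m.text for m in memories
        if m.relevant_for is None or now - m.noted_at <= m.relevant_for
    ]


memories = [
    Memory("has a dog", datetime(2025, 1, 1)),
    Memory("traveling this week", datetime(2025, 11, 18), timedelta(days=7)),
    Memory("had a cold", datetime(2025, 10, 1), timedelta(days=14)),
]
```

With this shape, a check-in on November 20 would still mention the trip and the dog but would stop asking about a cold from seven weeks earlier, which is the "fresh and respectful" behavior described above.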

Daily engagement is driven by dynamic agendas that adapt based on sleep data, the user’s stage in the program, and their recent compliance. I love this mechanic: it operationalizes behavior change by sequencing the right intervention at the right time. In parallel, they developed text via OpenAI Assistants while building voice with Vapi, which let them ship value while learning in two modes. They also moved from massive system prompts to RAG for general sleep knowledge, keeping personal user context in the prompt—reducing brittleness while improving scalability.

Because sleep sits close to healthcare, the team drew a firm line between wellness and medical positioning. They implemented clear guardrails: no diagnosis, no medication advice, and strong boundaries on scope. Weekly error analyses with domain experts (sleep therapists) tightened quality and tone, and they adopted LLM-powered evals to enforce safety boundaries. For observability and evaluations, they leveraged Langfuse, and they experimented with Hamming for voice testing to refine the experience end-to-end.

Under the hood, this is a great example of “one bite of the apple at a time” product building in AI. Start with a simple interface, anchor on an evidence-based method, layer personalization with memory, formalize program structure with dynamic agendas, and shift to RAG when general knowledge outgrows prompt engineering. As a product leader, I see strong echoes of agentic patterns here—goal-oriented orchestration, stateful memory, and adaptive planning—shipped in pragmatic increments rather than as a monolithic platform rewrite.

A few takeaways I’m applying with my teams: First, segment deeply and pick a high-intent niche (those “DIY sleep hackers” were the right beachhead). Second, let modality fit the job—voice is not a gimmick when it boosts compliance and empathy. Third, design safety and scope from day one if you’re anywhere near health. Finally, invest early in evals and observability so you can improve with confidence, not hope.

If you want to explore the full conversation and product decisions, you can listen here: Spotify | Apple Podcasts.

Resources & Links:

Rest – AI sleep coach app

Vapi – Voice agent platform Rest uses

Langfuse – Observability and evals platform

Hamming – Voice testing platform

AI Evals Maven Course by Hamel Husain and Shreya Shankar

Bottom line: Rest demonstrates how to take a clinically grounded method like CBTI, translate it into a daily voice-first experience, and ship it with rigor. If you’re building in AI, this is a model worth studying—practical, safe, and deeply user-centered.

Inspired by this post on Product Talk.

November 20, 2025
High-Quality Data, High-Velocity AI: My Product Playbook for Governance, Trust, and Scale

Every breakthrough we ship in AI reinforces a simple truth I live by: "Companies that prioritize data quality, governance, and structure will accelerate their AI initiatives the fastest." That statement captures the difference between flashy demos and durable, scalable products. In my experience, the strongest AI Strategy starts with the discipline to treat data as a product, not an afterthought.

When teams rush to production with generative AI or LLMs, the first issues rarely come from the model itself; they come from the data. Poor lineage leads to hallucinations, inconsistent schemas inflate costs, and weak access controls erode trust. For product managers working with LLMs, this is the gap between a compelling prototype and a reliable system customers depend on every day.

Let me clarify what I mean by data quality, governance, and structure. Quality is completeness, accuracy, freshness, and consistency across sources. Governance is policy, ownership, and accountability—privacy-by-design, regulatory compliance, and AI risk management built in from day one. Structure is the architecture: clear data contracts, standardized schemas, metadata and lineage, and role-based access that keeps sensitive signals protected while enabling speed.

Here’s the product playbook I use to operationalize this. First, map critical sources and define data contracts at the edges so producers and consumers can move independently. Second, standardize schemas and entity resolution to eliminate ambiguous joins. Third, enforce privacy-by-design with policy-as-code and automated redaction. Fourth, converge analytics into a unified analytics platform so definitions, freshness, and observability are shared. Fifth, instrument end-to-end lineage and quality SLAs with alerting. Finally, close the loop with human feedback and labeling to continuously improve model performance.

For generative AI workloads, a retrieval-first pipeline is essential. Unify trusted sources (product analytics, CRM, support, docs), embed and index them with guardrails, and focus on context window management to keep prompts lean, relevant, and cost-effective. This approach improves response quality, reduces token spend, and makes updates near-real-time—without retraining the base model every week.
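To illustrate context window management, here is a deliberately simple sketch: rank candidate chunks by keyword overlap with the question, then pack them into a fixed word budget. A real pipeline would use embeddings and a tokenizer; the overlap score and the word-count budget are stand-in assumptions.

```python
def overlap(question: str, chunk: str) -> int:
    """Count distinct lowercase words shared by question and chunk."""
    return len(set(question.lower().split()) & set(chunk.lower().split()))


def pack_context(question: str, chunks: list, budget_words: int) -> list:
    """Pick the most relevant chunks that fit inside the word budget."""
    ranked = sorted(chunks, key=lambda c: overlap(question, c), reverse=True)
    picked, used = [], 0
    for chunk in ranked:
        if overlap(question, chunk) == 0:
            break  # everything after this shares no terms with the question
        size = len(chunk.split())
        if used + size <= budget_words:
            picked.append(chunk)
            used += size
    return picked
```

The point of the budget is cost and focus: irrelevant chunks never reach the prompt, and relevant ones are admitted in priority order until the budget is spent.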

Measure what matters. Tie model outcomes to product metrics through rigorous A/B testing, and size experiments with minimum detectable effect (MDE) so you can ship confidently. Use product analytics to verify that better data actually improves activation, retention, and support deflection. When teams can trace an AI improvement back to a specific data-quality fix, they invest in governance with conviction.

Culture closes the gap. Empowered product teams and product trios (PM, design, engineering) make crisper decisions when data stewards are embedded and accountable. Clear ownership, shared definitions, and transparent dashboards reduce friction with security and compliance while speeding up delivery. This is how product management leadership sustains velocity without trading away trust.

The bottom line: if we want faster, safer, and more scalable AI, we start with the data. Build strong foundations, treat governance as enablement, and structure every step so improvements compound. With that in place, generative AI stops being a science experiment and becomes a durable competitive advantage.

Inspired by this post on Amplitude – Perspectives.

November 19, 2025

Brand Visibility in AI Answer Engines: A Product Playbook

If your CEO asks why an AI answer names a competitor but leaves out your brand, the tempting response is to publish more pages or look for a ChatGPT optimization trick. That treats the symptom. The real question is whether the answer engine can confidently connect your brand to the user’s decision, verify the connection, and explain it accurately.

Treat AI visibility as a product system. You can improve its inputs, test its outputs, and assign owners to its failure modes. You cannot guarantee a mention, but you can increase the probability of an accurate inclusion by building a clear public identity, credible evidence, reliable retrieval, and useful actions.

Define the decision you want to be present for

Brand visibility is too vague to manage. Visibility for what? A category definition, a shortlist, an integration question, a troubleshooting task, and a product comparison are different jobs. Each requires different evidence.

Start with an intent map. Use the customer journey, support conversations, sales objections, onboarding friction, and product analytics to identify the decisions that matter. Then connect each decision to the artifact an answer engine would need.

User job	Typical question	Artifact to publish	Desired answer behavior
Understand the category	What problem does this category solve?	Category explainer and glossary	Recognize the brand’s category and relevant use cases
Evaluate options	Which product fits this workflow or constraint?	Use-case page, comparison, and evidence	Include the brand when it genuinely fits and state the tradeoffs
Get started	How do I reach the first useful outcome?	Quick-start documentation	Return accurate prerequisites and steps
Integrate	Does this product connect to another system?	Integration page and API documentation	Describe compatibility, setup, and limitations correctly
Resolve a problem	Why is this workflow failing?	Troubleshooting documentation	Retrieve a grounded diagnosis and resolution path
Check current status	Is this feature available, and what changed?	Changelog and release notes	Use current product facts instead of stale descriptions

For each row, define when your brand is actually eligible. A weak objective says, ‘The brand should appear.’ A useful objective says, ‘The brand is relevant when the user needs this capability, works under these constraints, and can verify these claims.’

That distinction protects the program from vanity metrics. Your product should not appear in every answer. It should appear in the answers where it can help, in the correct category, with an honest account of its strengths and limits. My rule is simple: a mention that misclassifies the product is a failure, even if the brand name is present.

Prioritize prompt families using product judgment. Start where a better answer could affect a meaningful buying, activation, integration, or support decision. Within that set, look for the largest evidence gap: an important question for which your current public material is missing, contradictory, gated, or stale. That gives you a defensible backlog rather than an open-ended demand for more content.

Build a canonical brand record before producing more content

An answer engine has a harder job when your homepage describes one category, your documentation uses another product name, a partner directory lists an old capability, and a comparison page makes a broader claim than the evidence supports. Publishing another page adds volume without resolving the identity problem.

Create an internal brand fact record that becomes the contract for every public property. It should contain:

The official organization, product, and feature names, including approved abbreviations.
The primary category and a plain-language description of what the product does.
The users, jobs, and constraints for which the product is relevant.
The capabilities and integrations that can be stated publicly.
The limitations or eligibility conditions that materially change a recommendation.
The evidence behind important claims, such as documentation, case studies, API references, or release notes.
An owner and review trigger for every fact that can change.

Use this record to audit the homepage, product pages, documentation, API references, GitHub repositories, partner listings, review profiles, and conference descriptions. Do not force identical prose everywhere. Do keep the underlying identity, category, capability, and product status consistent.

Your site architecture should make that identity easy to follow. Connect category explainers to use-case pages, use-case pages to product documentation, documentation to integrations and troubleshooting, and changing capabilities to release notes. The links should reflect a real path from understanding to evaluation to action.

Then inspect the technical path an unauthenticated visitor can use. The essentials are concrete:

Put foundational product facts in semantic HTML rather than only inside images, videos, or interfaces that require a login.
Keep robots.txt and XML sitemaps friendly to public product and documentation pages.
Use canonical tags to concentrate signals when similar pages exist.
Apply schema.org types such as Organization, Product, HowTo, and FAQPage only where the visible content supports them.
Use descriptive headings and rich alt text so page meaning is not dependent on presentation.
Keep public pages fast enough to retrieve reliably.
Leave foundational documentation open when there is no business, privacy, or security reason to gate it.
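One of these checks is easy to automate. The sketch below uses Python's standard `urllib.robotparser` to flag public URLs that a robots.txt file would block. The robots.txt content, the example.com URLs, and the `ExampleBot` user agent are placeholders; substitute the agents and pages you actually care about.

```python
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /docs/
"""


def blocked_public_urls(robots_txt: str, urls: list, agent: str) -> list:
    """Return the URLs that the given user agent may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


PUBLIC_URLS = [
    "https://example.com/docs/quick-start",
    "https://example.com/admin/settings",
]
```

Run against the list of pages you intend to be public, an empty result means robots.txt is not the obstacle; anything returned is either a misconfigured rule or a page that should not have been on the public list.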

Do not loosen access controls in the name of visibility. Public product facts, help content, and approved evidence belong in the retrievable footprint. Customer data, internal plans, private support records, and administrative documentation do not. The right fix for a gated public fact is a safe public page, not broader access to a private system.

Write pages that answer prompts without requiring guesswork

Traditional marketing pages often ask the visitor to infer the product’s category, audience, and value from slogans. An answer engine needs explicit relationships. It should be able to identify what the product is, who it is for, what task it performs, what conditions apply, and where the supporting evidence lives.

Use a predictable page contract

Write as if you are teaching a capable assistant that lacks your internal context. A useful page contract contains:

A short opening that directly answers the page’s primary question.
A clear definition of the product, feature, workflow, or integration.
Prerequisites and eligibility conditions before the instructions begin.
Steps or decision criteria in the order the user needs them.
Limitations, tradeoffs, and unsupported cases near the claim they qualify.
Links to evidence and deeper documentation.
A visible path to the next task, such as setup, troubleshooting, or an API operation.

Define acronyms where they first appear. Use descriptive headings rather than clever labels. Add concise question-and-answer sections when they match real prompts. Repeat canonical facts consistently, but do not bury the useful answer under repeated positioning language.

Match the artifact to the intent

A single generic landing page cannot cover the full journey. Build the artifact that makes the intended answer defensible:

Category explainers should define the problem, the common workflow, the relevant buyer, and the boundaries of the category.
Use-case pages should connect a specific user job to product capabilities and show the conditions under which the fit holds.
Comparison pages should state points of parity, meaningful differences, user fit, limitations, and migration considerations without turning every dimension into a victory claim.
Quick starts should identify prerequisites, the setup sequence, the first observable success, and common failure paths.
Integration pages should state supported objects or workflows, authentication requirements, data direction, limitations, and links to the relevant API or setup instructions.
Troubleshooting pages should connect symptoms to likely causes, corrective steps, and a way to verify that the fix worked.
Release notes and changelogs should make changing availability, behavior, and terminology explicit.

Comparison content deserves particular care because it directly affects product positioning. Do not hide obvious points of parity or invent distinctions that a buyer cannot verify. Explain where the alternatives differ, who benefits from each difference, and when the distinction should change the decision. Honest limits make the rest of the page more credible.

Maintain a claim ledger behind these pages. Record the exact claim, its evidence, the public locations where it appears, its owner, and the event that should trigger review. A product rename, integration change, policy update, or feature release should update the ledger and the affected pages together. This is how content operations become part of product operations.
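A claim ledger does not need special tooling. As a minimal sketch, assuming hypothetical field names and trigger labels, each entry records the fields listed above, and a small lookup returns the pages to revisit when a trigger event fires.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    evidence: str
    locations: list            # public pages where the claim appears
    owner: str
    review_triggers: set = field(default_factory=set)


def pages_to_review(ledger: list, event: str) -> list:
    """List each affected page once, in ledger order, for a trigger event."""
    pages = []
    for claim in ledger:
        if event in claim.review_triggers:
            for page in claim.locations:
                if page not in pages:
                    pages.append(page)
    return pages


# Illustrative entries; paths, owners, and triggers are made up.
LEDGER = [
    Claim("Syncs contacts with the CRM", "/docs/integrations/crm",
          ["/integrations/crm", "/compare"], "docs", {"integration_change"}),
    Claim("Available on all plans", "/pricing",
          ["/pricing", "/compare"], "product marketing", {"pricing_change"}),
]
```

The value is in the coupling: an integration change produces a concrete list of pages to update rather than a general reminder to "check the website."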

Layer authority, live retrieval, and useful actions

AI visibility can happen at different layers. Treating them as one channel makes diagnosis difficult:

Public-footprint visibility comes from a clear, consistent body of information that helps an engine recognize the brand and its category.
Retrieval visibility happens when the engine or an attached workflow fetches current material during the conversation.
Action visibility happens when a connector or tool lets the user complete a task through the assistant.

The public footprint needs distribution as well as first-party content. Keep product facts consistent across documentation, API references, GitHub repositories, partner directories, reputable media, conference material, and legitimate third-party reviews. Pursue inclusion in structured knowledge bases such as Wikidata only when the brand meets the relevant eligibility requirements.

Do not manufacture authority through fabricated claims, fake reviews, or spammy link schemes. Those tactics create contradictions and reputational risk. The durable strategy is to be verifiably useful on the surfaces where practitioners already look for answers.

Live retrieval becomes important when an answer depends on current documentation, account context, or a changing product state. A retrieval-first pipeline should fetch the relevant material before the response is generated. Its quality depends on more than adding documents to an index.

Chunk documentation around a coherent task or concept rather than breaking related instructions apart.
Carry the heading and parent context with each chunk so a retrieved paragraph retains its meaning.
Add metadata for product, feature, version or status, intent, update state, and access permissions.
Prefer canonical documentation when duplicate explanations compete.
Return citations or document identifiers that allow the answer to be checked.
Test retrieval against the same prompt families used for visibility measurement.
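The first three items can be sketched in a few lines. This example assumes documentation with Markdown-style `#` headings, which is my assumption about the source format, and attaches the heading path plus caller-supplied metadata to every chunk.

```python
def chunk_by_heading(doc: str, metadata: dict) -> list:
    """Split a document at headings, keeping each section's heading path."""
    chunks, path, body = [], [], []

    def flush():
        text = "\n".join(body).strip()
        if text:
            chunks.append({"heading_path": " > ".join(path), "text": text, **metadata})
        body.clear()

    for line in doc.splitlines():
        if line.startswith("#"):
            flush()  # close the previous section before changing the path
            level = len(line) - len(line.lstrip("#"))
            del path[level - 1:]
            path.append(line.lstrip("#").strip())
        else:
            body.append(line)
    flush()
    return chunks


DOC = """# Webhooks
Intro text.
## Setup
Create an endpoint.
Add the secret.
## Troubleshooting
Check the signature."""
```

Each section stays whole, so the two setup steps travel together, and a retrieved paragraph such as "Check the signature." still carries "Webhooks > Troubleshooting" as context.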

A ChatGPT connector or CustomGPT workflow adds the action layer. Publish a high-quality OpenAPI specification, keep each action narrowly scoped, and describe its inputs, permissions, output, and failure conditions clearly. The assistant should be able to choose the correct operation without guessing between overlapping tools.
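As an illustration of "narrowly scoped," here is a hypothetical single-operation OpenAPI document expressed as a Python dict, with a small lint that checks the properties named above. The API, path, and lint rules are my own assumptions, not a published connector requirement.

```python
# Hypothetical one-operation spec; the service and fields are illustrative.
SPEC = {
    "openapi": "3.1.0",
    "info": {"title": "Example Support API", "version": "1.0.0"},
    "paths": {
        "/tickets/{ticket_id}": {
            "get": {
                "operationId": "getTicketStatus",
                "summary": "Return the current status of one support ticket.",
                "parameters": [{
                    "name": "ticket_id", "in": "path",
                    "required": True, "schema": {"type": "string"},
                }],
                "responses": {
                    "200": {"description": "Ticket found."},
                    "404": {"description": "No ticket with that id."},
                },
            }
        }
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch"}


def lint_actions(spec: dict) -> list:
    """Flag operations an assistant would have to guess about."""
    problems, seen = [], set()
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue
            label = f"{method.upper()} {path}"
            op_id = op.get("operationId")
            if not op_id:
                problems.append(f"{label}: missing operationId")
            elif op_id in seen:
                problems.append(f"{label}: duplicate operationId {op_id}")
            else:
                seen.add(op_id)
            if not (op.get("summary") or op.get("description")):
                problems.append(f"{label}: no summary or description")
            if not any(code.startswith(("4", "5")) for code in op.get("responses", {})):
                problems.append(f"{label}: no documented failure response")
    return problems
```

A lint like this is a floor, not a quality bar, but it catches the most common reason an assistant picks the wrong tool: two operations that look the same from their descriptions.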

Privacy-by-design belongs in the architecture, not in a warning added after launch. Enforce the user’s permissions before retrieval, preserve tenant boundaries, minimize the data passed into the model context, and keep secrets out of indexed content. If an action changes data or creates an external consequence, use clear confirmation and guardrails appropriate to that action.
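"Enforce the user's permissions before retrieval" means filtering the candidate set before any ranking or prompting happens. A minimal sketch, assuming each chunk carries hypothetical `tenant` and `roles` metadata:

```python
def visible_chunks(chunks: list, user: dict) -> list:
    """Keep only chunks the user may see; run this before ranking."""
    return [
        c for c in chunks
        # tenant None marks public content; otherwise tenants must match
        if c["tenant"] in (None, user["tenant"])
        # an empty roles set means any role may read the chunk
        and (not c["roles"] or user["role"] in c["roles"])
    ]


CHUNKS = [
    {"id": "public-doc", "tenant": None, "roles": set()},
    {"id": "acme-runbook", "tenant": "acme", "roles": {"admin"}},
    {"id": "acme-faq", "tenant": "acme", "roles": set()},
    {"id": "globex-faq", "tenant": "globex", "roles": set()},
]
```

Filtering first means a chunk from another tenant can never be ranked, quoted, or summarized, which is a stronger guarantee than asking the model to ignore it.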

A connector does not replace the public footprint. It improves accuracy and task completion for users who can access it. Public explanations still establish category relevance, authority, and discoverability before the user invokes a tool.

Measure visibility as a product system, not a screenshot

A favorable answer copied into a presentation is not a measurement system. Answer behavior can vary with wording, context, model configuration, accessible material, and tool availability. Build a stable panel of priority prompts and track its outputs over time.

Each prompt in the panel should have an intent identifier, target user, task, wording, expected eligibility condition, claims that must be correct, and an artifact owner. Include natural variants across category discovery, evaluation, setup, integration, and troubleshooting. Preserve the panel long enough to compare changes instead of rewriting it after every result.

Score more than whether the name appeared:

Eligible mention rate: how often the brand appears when the predefined fit conditions are present.
Grounded citation rate: how often the answer points to appropriate first-party or credible third-party evidence.
Factual accuracy: whether the answer passes a predefined set of product facts.
Positioning accuracy: whether the brand is placed in the right category, use case, and competitive context.
Freshness: whether changing capabilities and product status match the canonical record.
Retrieval success: whether the workflow returns the document needed for the task.
Action completion: whether an enabled connector completes the intended task under the correct permissions.
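The first three metrics can be computed from a simple log of graded answers. In this sketch the record fields and the choice of denominators (citations and accuracy are measured only over eligible answers that mention the brand) are my assumptions; adjust them to match your rubric.

```python
from dataclasses import dataclass


@dataclass
class Result:
    eligible: bool       # predefined fit conditions were present
    mentioned: bool      # the brand appeared in the answer
    cited: bool          # the answer pointed to acceptable evidence
    facts_correct: bool  # the answer passed the product-fact checks


def rate(numerator: int, denominator: int):
    """Return a ratio, or None when there is nothing to measure."""
    return numerator / denominator if denominator else None


def score(results: list) -> dict:
    eligible = [r for r in results if r.eligible]
    mentioned = [r for r in eligible if r.mentioned]
    return {
        "eligible_mention_rate": rate(len(mentioned), len(eligible)),
        "grounded_citation_rate": rate(sum(r.cited for r in mentioned), len(mentioned)),
        "factual_accuracy": rate(sum(r.facts_correct for r in mentioned), len(mentioned)),
    }
```

Note that a mention on an ineligible prompt is excluded from every rate, which keeps the panel from rewarding appearances where the product does not fit.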

Share of voice can help, but only within eligible prompts. A rising mention rate paired with falling accuracy is not progress. Nor is a citation useful when it points to an outdated page.

Use the failure pattern to choose the next intervention:

If the brand is absent across an entire intent family, inspect coverage, category clarity, and external authority.
If it appears under the wrong category, reconcile names and definitions across the canonical record and public properties.
If it appears without evidence, strengthen the relevant artifact and its links to documentation or proof.
If the facts are stale, repair canonical pages, release notes, metadata, and duplicate content.
If retrieval returns the wrong page, adjust chunking, metadata, canonical preference, and evaluation queries.
If the answer is correct but the action fails, inspect the OpenAPI description, authentication, permissions, inputs, and error handling.

Test changes with the same discipline used for a product experiment. State the hypothesis before shipping. Freeze the evaluation rubric. Capture a baseline, compare the candidate under the same conditions, and use repeated samples rather than interpreting one convenient response. Use an A/B design only where exposure can be isolated; otherwise label the result as a before-and-after observation and avoid claiming causality.

Set the minimum detectable effect before reviewing the outcome. In this context, it is the smallest improvement large enough to justify a decision. That prevents a tiny movement in a noisy prompt panel from becoming a success story merely because the team wants the release to work.
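One way to hold that line is to encode the decision rule before looking at results. This sketch compares baseline and candidate mention counts with a pooled two-proportion z-test and ships only if the lift clears both the MDE and the significance level. It assumes independent samples, which repeated runs of similar prompts only approximate, so treat the p-value as a rough guide.

```python
from math import sqrt
from statistics import NormalDist


def panel_decision(base_hits: int, base_n: int, cand_hits: int, cand_n: int,
                   mde: float, alpha: float = 0.05) -> dict:
    """Decide whether a candidate beats baseline by at least the MDE."""
    if base_n <= 0 or cand_n <= 0:
        raise ValueError("both panels need at least one sample")
    lift = cand_hits / cand_n - base_hits / base_n
    pooled = (base_hits + cand_hits) / (base_n + cand_n)
    se = sqrt(pooled * (1 - pooled) * (1 / base_n + 1 / cand_n))
    # With zero variance there is nothing to test, so report no evidence.
    p_value = 1.0 if se == 0 else 2 * (1 - NormalDist().cdf(abs(lift) / se))
    return {"lift": lift, "p_value": p_value,
            "ship": lift >= mde and p_value < alpha}
```

The two conditions guard against different mistakes: the MDE check rejects real but trivial movement, and the significance check rejects large-looking movement from a panel that is too small.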

Assign ownership by failure class. Product marketing can own canonical positioning, documentation can own instructional accuracy, the web team can own crawlability and structured markup, engineering can own retrieval and connectors, and product or analytics can own the evaluation panel. A shared dashboard is useful only when each red metric has a named route to action.

Key takeaways

Optimize for eligibility in a real user decision, not for raw brand-name frequency.
Establish one canonical brand fact record before adding more public content.
Publish answer-shaped artifacts for category, comparison, setup, integration, troubleshooting, and product-change intents.
Combine a trustworthy public footprint with live retrieval and carefully scoped actions.
Measure mentions, citations, accuracy, freshness, retrieval, and task completion separately.
Tie every content or technical change to a hypothesis, a stable prompt panel, and a minimum detectable effect.

Start with the prompt family closest to a real buying, activation, integration, or support decision. Capture the baseline answer, identify the smallest missing or unreliable artifact, fix it, and rerun the same evaluation. Expand to adjacent intents only after the first one produces consistently accurate, well-grounded answers.

The goal is not to make an assistant say your name. It is to make your brand a defensible inclusion for the right question, supported by current evidence and a working next step.

References

Shivam.Consulting Blog – Crack the AI Answer Engine: How I Boost Brand Visibility in ChatGPT – Proven, Ethical Playbook

November 17, 2025

How I Use ChatGPT to Supercharge PM: Smart Workflows, Killer Prompts, and Real-World Wins

Every week, I lean on ChatGPT to cut through noise, reduce rework, and move faster with more confidence. It’s not a silver bullet, but it has become an unfair advantage in my day-to-day leadership of product strategy, discovery, and delivery. Here are the workflows, prompts, and practical PM tips behind that advantage.

Here’s my stance: ChatGPT doesn’t replace product judgment. It amplifies it. Used well, it accelerates product discovery, clarifies roadmaps, sharpens positioning, and strengthens stakeholder management. Used poorly, it creates noise and risk. What follows are the specific workflows and prompts that reliably save me hours while protecting quality and trust.

Discovery and research are where I see the biggest upside. I use ChatGPT to draft interview guides, transform raw notes into theme clusters, and generate “Jobs to Be Done” problem statements—then I validate them with customers. I anonymize inputs to protect privacy and follow privacy-by-design and data governance commitments; AI risk management matters more than ever when we’re handling real user data.

When I move from insight to definition, ChatGPT helps me spin up crisp PRDs and user stories. I provide context about our users, constraints, and success metrics and ask for structured outputs: goals, non-goals, acceptance criteria, and risks. This keeps our product trios aligned and focused on outcome-based OKRs, not just shipping features.

For competitive analysis and positioning, I feed in public information and ask for points of parity, points of differentiation, and potential messaging angles. I treat the output as a starting point for my value proposition and battlecards—not the final word. It’s a fast way to surface hypotheses and pressure-test our product-led growth narrative.

Roadmapping and sprint planning also benefit. I use ChatGPT to map dependencies, draft milestone narratives, and transform epics into well-formed backlogs. When we align quarterly plans, I ask for risk scenarios and contingency options so we can make trade-offs explicit before we commit.

On analytics and experiments, ChatGPT is my drafting partner. It helps me define A/B testing plans, clarify the minimum detectable effect (MDE), and outline instrumentation requirements. I still verify numbers in our analytics stack, but the scaffolding is done in minutes, not hours—freeing me to focus on retention analysis and activation levers.

Stakeholder communication is where the time savings compound. I use ChatGPT to produce executive summaries, QBR and OKR updates, and board-ready narratives that highlight outcomes, risks, and next steps. It’s a powerful way to stay crisp and consistent across leadership updates without losing the nuance that matters.

Prompt patterns make or break results. I keep four rules: set the role, provide rich context, define constraints, and specify the output format. For example: “You are a senior PM advisor. Context: [user, market, problem]. Constraints: [privacy, timeline, budget]. Output: PRD with goals, acceptance criteria, and risks.” With larger inputs, I manage the context window by chunking the content and asking for a summary of each chunk before asking for a synthesis.
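The four rules and the chunking habit can be captured in a small reusable template. This is a sketch under stated assumptions: `build_prompt` and `chunk_text` are hypothetical helper names, the 2,000-character limit is arbitrary, and the code only builds strings (it does not call any model API).

```python
def build_prompt(role: str, context: str, constraints: str, output_format: str) -> str:
    """Assemble a prompt from the four rules: role, context, constraints, output."""
    return (
        f"You are {role}.\n"
        f"Context: {context}\n"
        f"Constraints: {constraints}\n"
        f"Output: {output_format}"
    )


def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Group paragraphs into chunks of at most max_chars characters.

    Splits on blank lines. A single paragraph longer than max_chars is
    kept whole as its own (oversized) chunk rather than cut mid-sentence.
    """
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    prompt = build_prompt(
        role="a senior PM advisor",
        context="[user, market, problem]",
        constraints="[privacy, timeline, budget]",
        output_format="PRD with goals, acceptance criteria, and risks",
    )
    print(prompt)
    notes = "First interview notes.\n\nSecond interview notes.\n\nThird interview notes."
    for i, chunk in enumerate(chunk_text(notes, max_chars=50), start=1):
        print(f"--- chunk {i} ---\n{chunk}")
```

Each chunk can then be summarized on its own, and the summaries combined in a final synthesis prompt.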

For internal knowledge, I lean on a retrieval-first pipeline. Instead of pasting long docs, I reference curated, approved sources so answers track to current reality. CustomGPT workflows and a simple ChatGPT connector help with governance: they increase speed while reducing the chance of hallucinations and stale information.

Guardrails are non-negotiable. We never paste sensitive data into prompts; we redact PII, spot-check against source-of-truth systems, and red-team important outputs. AI risk management isn’t just a checkbox; it’s how we maintain trust while scaling productivity with generative AI.
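As an illustration of the redaction step, here is a minimal sketch that masks email addresses and phone-like numbers before text goes into a prompt. The patterns and the `redact` helper are assumptions of mine and deliberately simple: they catch only obvious formats and miss names, addresses, and account numbers, so treat this as a first pass rather than a complete PII control.

```python
import re

# Illustrative patterns only; real PII detection needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    raw = "Contact jane.doe@example.com or +1 (555) 123-4567 about the renewal."
    print(redact(raw))
```

Running redaction before the prompt is built, and spot-checking what slips through, keeps the guardrail enforceable instead of relying on memory.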

Finally, enablement turns personal productivity into team capability. I run short playbooks for empowered product teams: discovery synthesis, PRD drafting, roadmap storytelling, and stakeholder-ready updates. The result is higher-quality thinking, faster cycles, and fewer meetings to align on the essentials.

ChatGPT for product managers isn’t hype; it’s a practical edge when you apply discipline. Start with one workflow that drains your time, add a prompt template, and measure the outcome. In a week, you’ll have proof. In a quarter, you’ll have a new operating system for how your team learns, decides, and ships.

Inspired by this post on Product School.

November 17, 2025
Taming 1,000+ Vendor Emails: How Xelix’s AI Helpdesk Delivers Fast, Confident Answers

Chaos in vendor communications is a problem I see across finance operations: sprawling accounts payable inboxes, slow response times, and missed context. That’s why this build caught my attention—not just because it’s GenAI, but because it’s a disciplined product strategy that converts email overload into measurable outcomes.

Accounts payable inboxes can see 1,000+ vendor emails a day. Xelix’s new Helpdesk turns that chaos into structured tickets, enriched with ERP data, and pre-drafted replies—complete with confidence scores.

I dug into the end-to-end approach with the Xelix team: Claire Smid (AI Engineer), Emilija Gransaull (Back-End Tech Lead), and Talal A. (Product Manager). We focused on how they scoped the problem, iterated fast, and de-risked AI in production.

Their product thesis is refreshingly pragmatic. They prototyped with “daily slices” (Carpaccio-style) and built a retrieval-first pipeline that matches vendors, links invoices, and drafts accurate responses—before a human ever clicks “send.” That framing matters: enrichment and matching take center stage, with the model amplifying precision instead of improvising.

We unpacked the tricky bits that make or break an AI helpdesk at scale: vendor identity matching, Outlook threading, UX pivots from “inbox clone” to ticket-first views, and the metrics that prove real impact (handling time, stickiness, auto-closed spam). The pipeline architecture and email processing choices were grounded in operational realities, not just AI aspirations.

Several takeaways are worth pinning to any AI product roadmap. “Start narrow to win: pick high-volume, high-cost requests (invoice status & reminders).” “Enrichment > magic: accurate replies come from great retrieval/matching, not just a bigger LLM.” “Design for adoption: familiar inbox view helps onboarding, but a ticket-first UI unlocks AI features.” These are the kinds of decisions that drive adoption, trust, and ROI.

Data enrichment challenges dominated early learning curves: stitching ERP context into tickets, handling vendor identification at scale, managing email thread continuity, and calibrating response generation for accuracy. On the generation side, the team emphasized precision over verbosity, with clean responses that reflect system-of-record truth, and then instrumented the experience so they could evaluate system performance with production-grade telemetry.

Trust was treated as a product feature. “Measure outcomes, not vibes: track ‘messages sent from Helpdesk’, % auto-resolved.” And critically, “Confidence builds trust: show match quality and response confidence so humans know when to edit.” By surfacing match quality and confidence scores, they shortened coaching loops and made human-in-the-loop supervision feel natural, not burdensome.
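A small sketch can make both ideas tangible: gating human review on the weaker of two confidence signals, and computing the two outcome metrics quoted above. The `Ticket` fields, the 0.85 threshold, and the function names are hypothetical choices for illustration, not details from the Xelix product.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    match_score: float          # how well the vendor/invoice match fits (0 to 1)
    response_confidence: float  # confidence in the drafted reply (0 to 1)
    sent_from_helpdesk: bool = False
    auto_resolved: bool = False


def review_mode(ticket: Ticket, threshold: float = 0.85) -> str:
    """Suggest how much human attention a draft needs; the threshold is illustrative."""
    weakest = min(ticket.match_score, ticket.response_confidence)
    return "quick review" if weakest >= threshold else "edit before sending"


def outcome_metrics(tickets: list[Ticket]) -> dict:
    """Count messages sent from the helpdesk and the percentage auto-resolved."""
    total = len(tickets)
    if total == 0:
        return {"messages_sent_from_helpdesk": 0, "pct_auto_resolved": 0.0}
    sent = sum(t.sent_from_helpdesk for t in tickets)
    auto = sum(t.auto_resolved for t in tickets)
    return {"messages_sent_from_helpdesk": sent, "pct_auto_resolved": 100 * auto / total}


if __name__ == "__main__":
    tickets = [
        Ticket(0.95, 0.90, sent_from_helpdesk=True),
        Ticket(0.60, 0.92, sent_from_helpdesk=True),
        Ticket(0.99, 0.97, auto_resolved=True),
        Ticket(0.88, 0.70),
    ]
    for t in tickets:
        print(review_mode(t))
    print(outcome_metrics(tickets))
```

Using the minimum of the two scores means a strong draft built on a shaky vendor match still gets flagged for editing.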

What’s next is equally compelling: “targeted generation, multiple specialized responders, and more agentic routing.” That direction aligns with agentic AI patterns I recommend for operations-heavy workflows—route first, retrieve deeply, then generate with intent. It’s a scalable path from assistive AI to autonomous resolution while maintaining governance and auditability.

If you want a quick map of the journey, the conversation flowed as follows:
00:00 Meet the Team: Claire, Emilija, and Talal
00:36 Introduction to Xelix and Its Products
01:08 Understanding Accounts Payable Teams
01:37 Help Desk Product Overview
03:11 Challenges Faced by Accounts Payable Teams
04:03 AI Integration in Help Desk
05:47 Automating Reconciliation Requests
07:45 Development Methodology: Carpaccio
09:11 Prototyping and Beta Testing
12:00 Manual Tagging and Data Collection
16:39 Focusing on High-Impact Use Cases
18:55 User Experience and Interface Design
24:56 Pipeline Architecture and Email Processing
28:21 Data Enrichment Challenges
29:04 Handling Vendor Identification
33:33 Email Thread Management
36:15 Generating Accurate Responses
40:48 Evaluating System Performance
49:20 Future Developments and Goals

My takeaway for product leaders: when the domain is high-volume and rules-heavy (like AP), retrieval-first beats model-first. Start with the narrowest, costliest intents; prove lift with “messages sent from Helpdesk” and “% auto-resolved”; then graduate UX from familiar to AI-native (ticket-first) once trust is earned. That’s how you turn vendor chaos into answers—reliably, scalably, and fast.

Inspired by this post on Product Talk.

November 13, 2025