Tag: LLMs for product managers

How I Use ChatGPT to Supercharge Product Management: Workflows, Prompts, and PM Playbooks

I treat ChatGPT as a force multiplier across the entire product lifecycle—from discovery and strategy to delivery and growth. Unlock workflows, prompts, and real PM tips showing how ChatGPT quietly reshapes product management behind the scenes.

My goal is pragmatic: turn generative AI into repeatable, measurable leverage for product discovery, product roadmapping and sprint planning, stakeholder management, and product-led growth without sacrificing quality, privacy-by-design, or judgment. This is how I apply LLMs to product management in a way that strengthens customer empathy and speeds up decision cycles.

In discovery, I use ChatGPT to synthesize interviews, categorize sentiment, and surface emergent themes faster than a manual pass. I’ll feed it anonymized notes and ask for Jobs-to-be-Done statements, contradictory signals to validate, and the top three risks to our hypotheses. When the corpus gets large, I pair it with a retrieval-first pipeline and apply context window management so outputs stay grounded in real customer data.

On strategy and positioning, I draft and refine a crisp value proposition, clarify points of parity, and identify competitive differentiation. I ask ChatGPT to convert inputs into OKRs framed around outcomes rather than outputs, pressure-test assumptions, and produce a one-page narrative that even non-technical stakeholders can engage with. The result is faster alignment and fewer meetings to get to the same level of clarity.

For planning and delivery, I use ChatGPT to accelerate PRD outlines, user stories, and acceptance criteria, while explicitly requesting edge cases, failure states, and non-functional requirements. I’ll have it map risks to mitigations and suggest simple instrumentation aligned to DORA metrics and incident management readiness—useful when we’re iterating within a CI/CD cadence.

In experimentation, ChatGPT helps me frame strong A/B testing plans, calculate a minimum detectable effect (MDE), and sanity-check sample sizes. I also use it to translate metrics into plain language updates for the team, connect learnings to the next experiment, and propose follow-up analyses for retention analysis or activation bottlenecks.
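To make the sample-size sanity check concrete, here is a minimal Python sketch of the standard normal-approximation formula for a two-sided, two-proportion test. The function name, the 10% baseline, and the 2-point MDE are illustrative assumptions rather than numbers from a real experiment, and I would still confirm the result in a proper stats tool before committing traffic.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate users needed per variant to detect an absolute lift of
    `mde_abs` over `baseline_rate` with a two-sided two-proportion z-test."""
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / mde_abs ** 2)


# Hypothetical inputs: 10% baseline conversion, 2-point absolute MDE.
print(sample_size_per_variant(0.10, 0.02))
```

With those inputs the answer lands at a few thousand users per variant. Halving the MDE roughly quadruples the requirement, which is usually the trade-off worth discussing with the team.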

For growth and onboarding, I prompt ChatGPT to generate hypotheses for user activation, in-app guides, and tooltip design that match personas and JTBDs. It drafts variations I can quickly test through Pendo or similar tools, supports product-led growth motions, and helps craft contextual copy that aligns with our value proposition without adding cognitive load.

Stakeholder communications get sharper and faster. I’ll ask for concise executive summaries, a version tailored for engineering leaders, and another for customer-facing teams. It’s especially effective for QBR and OKR updates, where I need crisp narratives tied to outcomes, plus a plain-English articulation of risks and trade-offs for empowered product teams.

The guardrails matter. I set clear AI risk management boundaries, prevent any sensitive data from entering prompts, and align usage with data governance and regulatory compliance requirements. I also version and review prompts just like product artifacts, so the best ones evolve into a durable AI product toolbox the whole team can use.

If you’re getting started, pick one high-friction workflow—say, interview synthesis or PRD drafting—and timebox a week to build a repeatable prompt set and review rubric. Measure cycle-time savings and quality deltas, then expand to a second workflow. Within a month, you’ll have a lightweight operating model for AI Strategy that compounds across your roadmap.

Inspired by this post on Product School.

November 20, 2025
How We Built an AI Sleep Coach: CBTI, Voice AI, and a Product Playbook for Better Rest

What if your morning started with a helpful check-in from a voice AI that actually improves your sleep—using the same core principles that typically cost thousands of dollars and come with year-and-a-half waitlists? That idea energizes me as a product leader, because it blends clinical-grade outcomes with consumer-grade accessibility. Recently, I dug into how the team at Rest built an AI sleep coach inspired by Cognitive Behavioral Therapy for Insomnia (CBTI), and why their method offers a repeatable blueprint for complex, personal AI products.

The origin story is a classic product discovery moment. Rest’s team noticed that a meaningful slice of users in their podcast app were using audio to fall asleep. Although it represented only about 10% of users, that group showed a high willingness to pay. That signal pushed them to explore a dedicated sleep solution, moving from a general audio app to a targeted sleep experience—and eventually toward an AI-powered coach as LLMs matured.

Through jobs-to-be-done research, they identified a clear, underserved segment: “DIY sleep hackers.” These are motivated users who want agency, structure, and results without navigating clinical systems. Choosing CBTI (a clinically proven approach with 80% efficacy) gave the product a strong evidence-based foundation while remaining accessible as a wellness tool. It’s the kind of strategic choice I look for: credible, measurable, and aligned with user motivation.

The product evolution moved in smart, incremental steps. Rest started with a basic text chatbot before graduating to a voice-first experience—using Vapi for voice and OpenAI for reasoning. Voice changed the relationship dynamic: it increased intimacy, lowered friction for daily check-ins, and made behavioral coaching feel human without pretending to be. The team built a memory system that tracks context (like traveling or having a dog) with time-based relevance, which keeps conversations fresh, respectful, and genuinely personalized.
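I don't know how Rest implemented its memory system, but one plausible way to model "time-based relevance" is to let each remembered fact decay with its own half-life. The sketch below is an assumption for illustration only: the `Memory` class, the half-life values, and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Memory:
    text: str
    recorded_at: datetime
    half_life_days: float  # hypothetical: how quickly this fact goes stale


def relevance(memory, now):
    """Exponential decay: relevance halves every `half_life_days`."""
    age_days = (now - memory.recorded_at).total_seconds() / 86400
    return 0.5 ** (age_days / memory.half_life_days)


def relevant_memories(memories, now, threshold=0.25):
    """Return memory texts still above the threshold, most relevant first."""
    scored = [(relevance(m, now), m) for m in memories]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m.text for score, m in scored if score >= threshold]


now = datetime(2025, 11, 20)
memories = [
    Memory("is traveling this week", now - timedelta(days=30), half_life_days=3),
    Memory("has a dog", now - timedelta(days=30), half_life_days=365),
]
print(relevant_memories(memories, now))
```

A short half-life lets transient context such as a trip fade within days, while durable facts such as a pet stay available for months.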

Daily engagement is driven by dynamic agendas that adapt based on sleep data, the user’s stage in the program, and their recent compliance. I love this mechanic: it operationalizes behavior change by sequencing the right intervention at the right time. In parallel, they developed text via OpenAI Assistants while building voice with Vapi, which let them ship value while learning in two modes. They also moved from massive system prompts to RAG for general sleep knowledge, keeping personal user context in the prompt—reducing brittleness while improving scalability.

Because sleep sits close to healthcare, the team drew a firm line between wellness and medical positioning. They implemented clear guardrails: no diagnosis, no medication advice, and strong boundaries on scope. Weekly error analyses with domain experts (sleep therapists) tightened quality and tone, and they adopted LLM-powered evals to enforce safety boundaries. For observability and evaluations, they leveraged Langfuse, and they experimented with Hamming for voice testing to refine the experience end-to-end.

Under the hood, this is a great example of “one bite of the apple at a time” product building in AI. Start with a simple interface, anchor on an evidence-based method, layer personalization with memory, formalize program structure with dynamic agendas, and shift to RAG when general knowledge outgrows prompt engineering. As a product leader, I see strong echoes of agentic patterns here—goal-oriented orchestration, stateful memory, and adaptive planning—shipped in pragmatic increments rather than as a monolithic platform rewrite.

A few takeaways I’m applying with my teams: First, segment deeply and pick a high-intent niche (those “DIY sleep hackers” were the right beachhead). Second, let modality fit the job—voice is not a gimmick when it boosts compliance and empathy. Third, design safety and scope from day one if you’re anywhere near health. Finally, invest early in evals and observability so you can improve with confidence, not hope.

If you want to explore the full conversation and product decisions, you can listen here: Spotify | Apple Podcasts.

Resources & Links:

Rest – AI sleep coach app

Vapi – Voice agent platform Rest uses

Langfuse – Observability and evals platform

Hamming – Voice testing platform

AI Evals Maven Course by Hamel Husain and Shreya Shankar

Bottom line: Rest demonstrates how to take a clinically grounded method like CBTI, translate it into a daily voice-first experience, and ship it with rigor. If you’re building in AI, this is a model worth studying—practical, safe, and deeply user-centered.

Inspired by this post on Product Talk.

November 20, 2025
High-Quality Data, High-Velocity AI: My Product Playbook for Governance, Trust, and Scale

Every breakthrough we ship in AI reinforces a simple truth I live by: "Companies that prioritize data quality, governance, and structure will accelerate their AI initiatives the fastest." That statement captures the difference between flashy demos and durable, scalable products. In my experience, the strongest AI Strategy starts with the discipline to treat data as a product, not an afterthought.

When teams rush to production with generative AI or LLMs, the first issues rarely come from the model itself; they come from the data. Poor lineage leads to hallucinations, inconsistent schemas inflate costs, and weak access controls erode trust. For product managers working with LLMs, this is the gap between a compelling prototype and a reliable system customers depend on every day.

Let me clarify what I mean by data quality, governance, and structure. Quality is completeness, accuracy, freshness, and consistency across sources. Governance is policy, ownership, and accountability—privacy-by-design, regulatory compliance, and AI risk management built in from day one. Structure is the architecture: clear data contracts, standardized schemas, metadata and lineage, and role-based access that keeps sensitive signals protected while enabling speed.

Here’s the product playbook I use to operationalize this. First, map critical sources and define data contracts at the edges so producers and consumers can move independently. Second, standardize schemas and entity resolution to eliminate ambiguous joins. Third, enforce privacy-by-design with policy-as-code and automated redaction. Fourth, converge analytics into a unified analytics platform so definitions, freshness, and observability are shared. Fifth, instrument end-to-end lineage and quality SLAs with alerting. Finally, close the loop with human feedback and labeling to continuously improve model performance.
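As a rough illustration of the first step, a data contract can start as a small, checkable description of what a producer promises to emit. The field names and the `violations` helper below are hypothetical; a real setup would more likely use a schema registry or a validation library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractField:
    name: str
    py_type: type
    required: bool = True


# Hypothetical contract for a product-analytics event.
EVENT_CONTRACT = [
    ContractField("user_id", str),
    ContractField("event", str),
    ContractField("ts", float),
    ContractField("plan", str, required=False),
]


def violations(record, contract):
    """Return human-readable contract violations for one record."""
    problems = []
    for field in contract:
        value = record.get(field.name)
        if value is None:
            if field.required:
                problems.append(f"missing required field: {field.name}")
            continue
        if not isinstance(value, field.py_type):
            problems.append(
                f"{field.name}: expected {field.py_type.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems


print(violations({"user_id": 42, "event": "signup"}, EVENT_CONTRACT))
```

Running a check like this at the producer's edge is what lets consumers evolve independently: a breaking change fails loudly before it reaches a dashboard or a model.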

For generative AI workloads, a retrieval-first pipeline is essential. Unify trusted sources (product analytics, CRM, support, docs), embed and index them with guardrails, and focus on context window management to keep prompts lean, relevant, and cost-effective. This approach improves response quality, reduces token spend, and makes updates near-real-time—without retraining the base model every week.
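One simple way to think about context window management is as a budgeting problem: given retrieved chunks with relevance scores, keep the best ones that fit. The sketch below is a greedy packer under stated assumptions; the whitespace word count stands in for a real tokenizer, and the scores would come from whatever retriever you use.

```python
def pack_context(scored_chunks, max_tokens, count_tokens=lambda s: len(s.split())):
    """Greedily keep the highest-scoring chunks that fit in the token budget.

    `scored_chunks` is a list of (score, text) pairs. `count_tokens` is a
    crude whitespace count by default; swap in a real tokenizer in practice.
    """
    chosen, used = [], 0
    for score, chunk in sorted(scored_chunks, key=lambda pair: pair[0], reverse=True):
        cost = count_tokens(chunk)
        if used + cost <= max_tokens:
            chosen.append(chunk)
            used += cost
    return chosen


retrieved = [
    (0.9, "Refunds are processed within five days"),
    (0.5, "Our office dog is named Biscuit and enjoys long walks"),
    (0.8, "Refund requests need an order number"),
]
print(pack_context(retrieved, max_tokens=12))
```

Greedy packing is not optimal in every case, but it is easy to reason about and makes the cost of each extra chunk explicit.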

Measure what matters. Tie model outcomes to product metrics through rigorous A/B testing, and size experiments with minimum detectable effect (MDE) so you can ship confidently. Use product analytics to verify that better data actually improves activation, retention, and support deflection. When teams can trace an AI improvement back to a specific data-quality fix, they invest in governance with conviction.

Culture closes the gap. Empowered product teams and product trios (PM, design, engineering) make crisper decisions when data stewards are embedded and accountable. Clear ownership, shared definitions, and transparent dashboards reduce friction with security and compliance while speeding up delivery. This is how product management leadership sustains velocity without trading away trust.

The bottom line: if we want faster, safer, and more scalable AI, we start with the data. Build strong foundations, treat governance as enablement, and structure every step so improvements compound. With that in place, Generative AI stops being a science experiment and becomes a durable competitive advantage.

Inspired by this post on Amplitude – Perspectives.

November 19, 2025
Intercom is now a Shopify Plus Technology Partner: AI-powered support to scale ecommerce

I’m thrilled to share that Intercom is now a certified Shopify Plus Partner on the Technology Track. As someone who obsesses over product quality, speed, and measurable outcomes, this milestone reflects the rigorous standards we hold ourselves to and the trust Shopify Plus merchants can place in our solution.

The Shopify Partner Program Technology Track supports the largest Shopify merchants by helping them find the apps and solutions they need to build and scale their business. The program is available specifically for Shopify Partners who provide a level of product quality, service, performance, privacy, and support that meets the advanced requirements of Shopify Plus merchants.

In naming Intercom a Technology Partner, Shopify has recognized us as a provider trusted to help high-growth ecommerce brands scale.

“The Shopify Partner Program Technology Track is designed to meet the advanced requirements of the world’s fastest growing brands. We’re happy to welcome Intercom to the program, bringing their insight and experience in Customer Support to the Plus merchant community.”

— Jeff Kennedy, Head of Product Partnerships, Shopify

For Shopify Plus merchants, this certification means that our integration is vetted and optimized, and that our roadmap aligns with Shopify’s priorities. In practice, that translates into faster resolutions, less context switching, and more personalized conversations—without compromising privacy or performance.

Over the past year, we’ve launched a series of enhancements to our Shopify integration to give merchants more control and speed in support, including:

Data Connector templates so our AI Agent Fin can fully resolve requests from customers who want to get information about their Shopify order.

Multi-store support for merchants to manage conversations from multiple storefronts in one inbox.

Inbox order actions for merchants to take actions like editing shipping addresses, cancelling and refunding whole orders, deduplicating or creating duplicate orders based on existing ones, all without leaving the conversation.

EU workspace support to ensure merchants stay aligned with EU data residency requirements.

Updated data mapping and custom fields to keep Shopify order data and customer profiles fully in sync.

These updates make it faster and easier for merchants to resolve queries, personalize conversations, and drive loyalty, all from one platform. I’ve seen these capabilities reduce average handle time and minimize escalations—especially for complex order changes and post-purchase workflows.

We’re already seeing how our Shopify integration is helping merchants scale their support and deliver better customer experiences: teams are deflecting routine inquiries with AI while empowering agents to focus on high-value, relationship-building conversations.

Our team is continuing to invest in Shopify-specific capabilities. Here’s what we’re working on:

Expanded Fin Tasks for complex order actions with new pre-built workflows.

Enabling Model Context Protocol (MCP) support.

Smarter product search powered by Shopify data.

These additions will help merchants resolve faster, personalize at scale, and stay ahead of rising customer expectations – particularly as we approach peak season. We’ll continue to ship in tight feedback loops with Plus merchants to ensure each improvement moves the needle.

If you’re a Shopify Plus merchant, learn more about how we can help you scale your support with Fin, the best-performing AI Agent for ecommerce. Ready to move fast? Get started with Fin now.

Inspired by this post on The Intercom Blog.

November 18, 2025
Inside PendomoniumX London: AI Transformation, Real-World Wins, and Product Innovation

Walking into PendomoniumX London, I could feel the AI revolution hitting its stride. The conversations were sharper, the demos more grounded, and the outcomes more measurable: a clear signal that AI Strategy is moving from slideware to shipped value in modern product management. PendomoniumX’s sixth stop brought 350+ software leaders together for a day of AI transformation, real-world stories, and product innovation.

What stood out to me was the shift from hype to execution. Teams compared playbooks for generative AI, shared lessons on using LLMs in product management, and showed how they’re threading AI into product discovery, product roadmapping and sprint planning, and go-to-market motions. The focus was pragmatic: drive adoption, accelerate time-to-value, and make better decisions with cleaner signals.

On the product-led growth front, I saw compelling examples of using Pendo’s in-app guides and product tours to increase user activation and reduce friction in key onboarding moments. When AI-enhanced experiences are paired with clear guidance and behavioral analytics, customers don’t just try features; they build habits.

What I appreciated most were the leadership narratives: empowered product teams aligning around outcomes, candid retros on where AI prototypes missed the mark, and crisp frameworks for prioritizing the highest-leverage bets. The conference networking felt purposeful, with operators trading hard-won insights on experimentation velocity, data governance, and building trust into AI-infused experiences.

My takeaway: AI is no longer a side project; it’s a core capability in product management. If we anchor our AI Strategy in clear customer problems, instrument for learning, and iterate with discipline, we can consistently turn innovation into impact. And with the right mix of PLG mechanics, in-app education, and thoughtful design, those gains compound across the product lifecycle.

Inspired by this post on Pendo – Perspectives.

November 17, 2025
Crack the AI Answer Engine: How I Boost Brand Visibility in ChatGPT — Proven, Ethical Playbook

I hear the same question in nearly every executive review and go-to-market strategy session: how do we get our brand to show up more often inside ChatGPT? As a product leader, I treat this as an AI Strategy problem, not a mystery. The path forward looks a lot like modern SEO, adapted to how large language models (LLMs) discover, trust, and summarize information across the web and via tools.

Understand how ChatGPT works and how to make your brand appear more often. Like SEO, but for AI chats.

First, let me set expectations. We can’t force mentions, but we can systematically raise the probability that an LLM chooses our content as a trusted source. My playbook centers on three levers: strengthen your public footprint (so you’re easy to learn from), amplify trustworthy signals (so you’re chosen), and enable high-fidelity retrieval and actions (so you’re accurate and current when the model reaches out).

Public footprint: I build topical authority around the entity that is our brand. That means canonical naming, clean information architecture, and interlinked explainers, how-tos, and case studies that answer real tasks. I use schema.org (Organization, Product, HowTo, FAQPage) to make our pages machine-readable, and I back claims with credible citations. Think of this as “entity-first content design” for generative AI and LLMs.
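For readers who haven't shipped structured data before, here is a small sketch that builds schema.org Organization and FAQPage markup as JSON-LD. The types and property names come from the schema.org vocabulary; the helper functions, company name, and URLs are placeholders I made up for illustration.

```python
import json


def organization_jsonld(name, url, same_as):
    """schema.org Organization; `same_as` lists profiles for the same entity."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


def faq_jsonld(pairs):
    """schema.org FAQPage built from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Wrap JSON-LD in the script tag that goes in the page head."""
    return '<script type="application/ld+json">' + json.dumps(data, indent=2) + "</script>"


# Placeholder values, not a real company.
print(as_script_tag(organization_jsonld(
    "Example Co", "https://example.com", ["https://github.com/example"]
)))
```

The `sameAs` links matter for entity consistency: they tell a crawler that the GitHub organization, the docs site, and the marketing site describe the same brand.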

Content design for LLMs: I write like I’m teaching a capable assistant. I define acronyms in-line, structure pages with crisp headings, include concise summaries up top, and add Q&A sections that mirror natural prompts. I avoid heavy gating on foundational docs so models can ingest the essentials. I also optimize for context window management by keeping key facts succinct and repeated consistently across properties.

Authority and distribution: Models overweight high-credibility surfaces. I prioritize documentation, API references, GitHub repos, conference talks, reputable media, and third‑party reviews. Where appropriate, I pursue eligibility for knowledge bases (e.g., Wikidata) and ensure consistent facts across partner sites and directories. This isn’t about gaming; it’s about being verifiably useful wherever professionals already look.

Technical hygiene: I keep robots.txt and sitemaps friendly to docs, ensure semantic HTML, fast performance, and rich alt text, and use canonical tags to concentrate signals. Changelogs, release notes, and comparison pages help LLMs answer "what’s new" and "versus" questions with precision—core to product positioning and product-led growth.

Tools and connectors: Visibility isn’t only pre-training; it’s also in-session. I invest in a reliable ChatGPT connector and CustomGPT workflows so assistants can call our APIs via well-scoped actions. I publish a high-quality OpenAPI spec, implement a retrieval-first pipeline over our docs, and tune chunking and metadata so answers stay grounded. Good context window management, privacy-by-design, and clear guardrails are non-negotiable.

Intent coverage: I map the customer journey and write to the prompts users actually type: definitions, quick starts, integrations, troubleshooting, and “compare vs” pages with transparent points of parity. This doubles as a strong customer support AI strategy while reinforcing our go-to-market strategy.

Measurement: I maintain a prompt panel representing priority intents and track our share of voice in model outputs over time. When we ship content improvements, I use disciplined A/B testing where possible and set a minimum detectable effect to avoid overfitting to anecdotal wins. I pair qualitative spot checks with analytics to see which pages, entities, and citations correlate with improved inclusion.
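Here is one simple way to score a prompt panel, assuming you have already collected the assistant's answers as plain text. I define share of voice here as the fraction of panel responses that mention each brand at least once; that definition, the function name, and the sample brands are my assumptions, and other definitions are equally reasonable.

```python
import re


def share_of_voice(responses, brands):
    """Fraction of responses that mention each brand (case-insensitive,
    whole-word match). Returns {brand: rate between 0 and 1}."""
    counts = {brand: 0 for brand in brands}
    for text in responses:
        for brand in brands:
            if re.search(rf"\b{re.escape(brand)}\b", text, flags=re.IGNORECASE):
                counts[brand] += 1
    total = len(responses)
    return {brand: (counts[brand] / total if total else 0.0) for brand in brands}


# Made-up panel answers and brand names.
panel = [
    "Popular options include Acme and Globex.",
    "Globex is a common choice for small teams.",
    "No single vendor stands out for this use case.",
]
print(share_of_voice(panel, ["Acme", "Globex"]))
```

Tracking this rate per intent over time, rather than as a single blended number, makes it easier to see which content changes moved which kinds of questions.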

Governance and ethics: I avoid manipulative tactics, fabricated claims, or spammy link schemes. Sustainable AI visibility comes from trustworthy content, clear provenance, and user value. Treat LLMs like discerning editors: they reward clarity, credibility, and consistency.

The bottom line: you can’t control when an assistant mentions your brand, but you can earn it. Build an authoritative, structured footprint; show up on credible surfaces; enable high-quality retrieval and actions; and measure rigorously. Done well, AI visibility compounds—just like great SEO—only faster, and with outsized leverage for teams who execute with focus and integrity.

Inspired by this post on Amplitude – Perspectives.

November 17, 2025
How I Use ChatGPT to Supercharge PM: Smart Workflows, Killer Prompts, and Real-World Wins

Every week, I lean on ChatGPT to cut through noise, reduce rework, and move faster with more confidence. It’s not a silver bullet, but it has become an unfair advantage in my day-to-day leadership of product strategy, discovery, and delivery.

Here’s my stance: ChatGPT doesn’t replace product judgment. It amplifies it. Used well, it accelerates product discovery, clarifies roadmaps, sharpens positioning, and strengthens stakeholder management. Used poorly, it creates noise and risk. What follows are the specific workflows and prompts that reliably save me hours while protecting quality and trust.

Discovery and research are where I see the biggest upside. I use ChatGPT to draft interview guides, transform raw notes into theme clusters, and generate “Jobs to Be Done” problem statements—then I validate them with customers. I anonymize inputs to protect privacy and follow privacy-by-design and data governance commitments; AI risk management matters more than ever when we’re handling real user data.

When I move from insight to definition, ChatGPT helps me spin up crisp PRDs and user stories. I provide context about our users, constraints, and success metrics and ask for structured outputs: goals, non-goals, acceptance criteria, and risks. This keeps our product trios aligned and focused on outcome-based OKRs, not just shipping features.

For competitive analysis and positioning, I feed in public information and ask for points of parity, points of differentiation, and potential messaging angles. I treat the output as a starting point for my value proposition and battlecards—not the final word. It’s a fast way to surface hypotheses and pressure-test our product-led growth narrative.

Roadmapping and sprint planning also benefit. I use ChatGPT to map dependencies, draft milestone narratives, and transform epics into well-formed backlogs. When we align quarterly plans, I ask for risk scenarios and contingency options so we can make trade-offs explicit before we commit.

On analytics and experiments, ChatGPT is my drafting partner. It helps me define A/B testing plans, clarify the minimum detectable effect (MDE), and outline instrumentation requirements. I still verify numbers in our analytics stack, but the scaffolding is done in minutes, not hours—freeing me to focus on retention analysis and activation levers.

Stakeholder communication is where the time savings compound. I use ChatGPT to produce executive summaries, QBR and OKR updates, and board-ready narratives that highlight outcomes, risks, and next steps. It’s a powerful way to stay crisp and consistent across leadership updates without losing the nuance that matters.

Prompt patterns make or break results. I keep four rules: set the role, provide rich context, define constraints, and specify the output format. For example: “You are a senior PM advisor. Context: [user, market, problem]. Constraints: [privacy, timeline, budget]. Output: PRD with goals, acceptance criteria, and risks.” With larger inputs, I use context window management by chunking content and asking for summaries before synthesis.
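The four rules translate naturally into a reusable template, and the chunking step is just as easy to script. This is a minimal sketch: the function names are mine, and the 400-word chunk size is an arbitrary assumption rather than a recommended limit.

```python
def build_prompt(role, context, constraints, output_format):
    """Assemble a prompt from the four parts: role, context, constraints, output."""
    return "\n".join([
        f"You are {role}.",
        f"Context: {context}",
        f"Constraints: {constraints}",
        f"Output: {output_format}",
    ])


def chunk_text(text, max_words=400):
    """Split long input into word-bounded chunks to summarize one at a time."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


prompt = build_prompt(
    role="a senior PM advisor",
    context="[user, market, problem]",
    constraints="[privacy, timeline, budget]",
    output_format="PRD with goals, acceptance criteria, and risks",
)
print(prompt)
```

For a long transcript, I would summarize each chunk from `chunk_text` separately and then run one final synthesis prompt over the summaries.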

For internal knowledge, I lean on a retrieval-first pipeline. Instead of pasting long docs, I reference curated, approved sources so answers track to current reality. CustomGPT workflows and a simple ChatGPT connector help with governance: they increase speed while reducing the chance of hallucinations and stale information.

Guardrails are non-negotiable. We never paste sensitive data into prompts; we redact PII, spot-check against source-of-truth systems, and red-team important outputs. AI risk management isn’t just a checkbox; it’s how we maintain trust while scaling productivity with generative AI.
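As a starting point for the redaction step, a pre-prompt filter can mask the most obvious identifiers. The patterns below are deliberately simple assumptions that catch common email and phone formats only; names, addresses, and account numbers need a dedicated PII tool and a human review.

```python
import re

# Simplified patterns for illustration; they will miss unusual formats.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text):
    """Mask email addresses and phone-like numbers before text enters a prompt."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(redact("Reach Ana at ana@example.com or +1 (415) 555-0100."))
```

I treat a filter like this as a safety net, not a guarantee: the primary control is still not pasting sensitive material in the first place.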

Finally, enablement turns personal productivity into team capability. I run short playbooks for empowered product teams: discovery synthesis, PRD drafting, roadmap storytelling, and stakeholder-ready updates. The result is higher-quality thinking, faster cycles, and fewer meetings to align on the essentials.

ChatGPT for product managers isn’t hype; it’s a practical edge when you apply discipline. Start with one workflow that drains your time, add a prompt template, and measure the outcome. In a week, you’ll have proof. In a quarter, you’ll have a new operating system for how your team learns, decides, and ships.

Inspired by this post on Product School.

November 17, 2025
Taming 1,000+ Vendor Emails: How Xelix’s AI Helpdesk Delivers Fast, Confident Answers

Chaos in vendor communications is a problem I see across finance operations: sprawling accounts payable inboxes, slow response times, and missed context. That’s why this build caught my attention—not just because it’s GenAI, but because it’s a disciplined product strategy that converts email overload into measurable outcomes.

Accounts payable inboxes can see 1,000+ vendor emails a day. Xelix’s new Helpdesk turns that chaos into structured tickets, enriched with ERP data, and pre-drafted replies—complete with confidence scores.

I dug into the end-to-end approach with the team: Claire Smid (AI Engineer), Emilija Gransaull (Back-End Tech Lead), and Talal A. (Product Manager), all at Xelix. We focused on how they scoped the problem, iterated fast, and de-risked AI in production.

Their product thesis is refreshingly pragmatic. They prototyped with “daily slices” (Carpaccio-style) and built a retrieval-first pipeline that matches vendors, links invoices, and drafts accurate responses—before a human ever clicks “send.” That framing matters: enrichment and matching take center stage, with the model amplifying precision instead of improvising.

We unpacked the tricky bits that make or break an AI helpdesk at scale: vendor identity matching, Outlook threading, UX pivots from “inbox clone” to ticket-first views, and the metrics that prove real impact (handling time, stickiness, auto-closed spam). The pipeline architecture and email processing choices were grounded in operational realities, not just AI aspirations.

Several takeaways are worth pinning to any AI product roadmap. “Start narrow to win: pick high-volume, high-cost requests (invoice status & reminders).” “Enrichment > magic: accurate replies come from great retrieval/matching, not just a bigger LLM.” “Design for adoption: familiar inbox view helps onboarding, but a ticket-first UI unlocks AI features.” These are the kinds of decisions that drive adoption, trust, and ROI.

Data enrichment challenges dominated early learning curves: stitching ERP context into tickets, handling vendor identification at scale, managing email thread continuity, and calibrating response generation for accuracy. On the generation side, the team emphasized precision over verbosity, favoring clean responses that reflect system-of-record truth, and then instrumented the experience with production-grade telemetry to evaluate system performance.

Trust was treated as a product feature. “Measure outcomes, not vibes: track ‘messages sent from Helpdesk’, % auto-resolved.” And critically, “Confidence builds trust: show match quality and response confidence so humans know when to edit.” By surfacing match quality and confidence scores, they shortened coaching loops and made human-in-the-loop supervision feel natural, not burdensome.

What’s next is equally compelling: “targeted generation, multiple specialized responders, and more agentic routing.” That direction aligns with agentic AI patterns I recommend for operations-heavy workflows—route first, retrieve deeply, then generate with intent. It’s a scalable path from assistive AI to autonomous resolution while maintaining governance and auditability.

If you want a quick map of the journey, the conversation flowed as follows:

00:00 Meet the Team: Claire, Emilija, and Talal
00:36 Introduction to Xelix and Its Products
01:08 Understanding Accounts Payable Teams
01:37 Help Desk Product Overview
03:11 Challenges Faced by Accounts Payable Teams
04:03 AI Integration in Help Desk
05:47 Automating Reconciliation Requests
07:45 Development Methodology: Carpaccio
09:11 Prototyping and Beta Testing
12:00 Manual Tagging and Data Collection
16:39 Focusing on High-Impact Use Cases
18:55 User Experience and Interface Design
24:56 Pipeline Architecture and Email Processing
28:21 Data Enrichment Challenges
29:04 Handling Vendor Identification
33:33 Email Thread Management
36:15 Generating Accurate Responses
40:48 Evaluating System Performance
49:20 Future Developments and Goals

My takeaway for product leaders: when the domain is high-volume and rules-heavy (like AP), retrieval-first beats model-first. Start with the narrowest, costliest intents; prove lift with “messages sent from Helpdesk” and “% auto-resolved”; then graduate UX from familiar to AI-native (ticket-first) once trust is earned. That’s how you turn vendor chaos into answers—reliably, scalably, and fast.

Inspired by this post on Product Talk.

November 13, 2025
AI Won’t Replace Engineers—Engineers Using AI Will: A Practical Playbook for Your Next Move

Will AI replace software engineers or reshape their roles? Explore risks, opportunities, and alternative career paths in tech.

I’m often asked whether AI will make software engineers obsolete. My short answer: AI is already automating tasks, not eliminating the role. The engineers who learn to orchestrate models, systems, and stakeholders will create more value—not less. The real shift is from keystrokes to judgment, from writing code to designing socio-technical systems that deliver outcomes.

Today’s gen AI assistants (think Claude Code and a ChatGPT connector) excel at unit test scaffolding, boilerplate generation, refactoring, docstrings, and code search. When integrated into CI/CD, they can open draft pull requests, annotate diffs, and propose fixes. This lifts developer productivity and frees time for higher-leverage work: problem framing, architecture decisions, and customer discovery.

What changes in the role? We spend more cycles on product discovery, privacy-by-design, and AI Strategy, and fewer on repetitive implementation. We design agentic AI workflows that combine retrieval, tools, and guardrails; we evaluate trade-offs that blend performance, cost, and safety; and we partner with empowered product teams to ship the smallest valuable slice, learn, and iterate.

Measure what matters. If AI is working, DORA metrics should improve: higher deployment frequency, shorter lead time for changes, stable change failure rate, and faster MTTR. Pair that with outcomes vs output OKRs to avoid gaming the system—shaving seconds off a build is meaningless if it doesn’t move activation, retention, or revenue. A unified analytics platform can help connect engineering signals to business impact.
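To make those four signals concrete, here is a minimal sketch of how they could be computed from deployment records. The `Deployment` shape, the field names, and the sample data are assumptions for illustration; a real pipeline would pull these from your CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # first commit in the change
    deployed_at: datetime                   # when the change reached production
    failed: bool = False                    # caused an incident or rollback
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_summary(deploys: list[Deployment], window_days: int) -> dict:
    """Summarize the four DORA-style signals over one observation window."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    lead_secs = [(d.deployed_at - d.committed_at).total_seconds() for d in deploys]
    failures = [d for d in deploys if d.failed]
    restore_secs = [
        (d.restored_at - d.deployed_at).total_seconds()
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deploys_per_week": len(deploys) / (window_days / 7),
        "median_lead_time_hours": median(lead_secs) / 3600,
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_restore_hours": (
            sum(restore_secs) / len(restore_secs) / 3600 if restore_secs else None
        ),
    }


# Hypothetical two-week window with four deployments, one of which failed.
deploys = [
    Deployment(datetime(2025, 1, 1, 9), datetime(2025, 1, 1, 15)),
    Deployment(datetime(2025, 1, 3, 9), datetime(2025, 1, 4, 9),
               failed=True, restored_at=datetime(2025, 1, 4, 11)),
    Deployment(datetime(2025, 1, 8, 10), datetime(2025, 1, 8, 20)),
    Deployment(datetime(2025, 1, 10, 8), datetime(2025, 1, 10, 20)),
]
summary = dora_summary(deploys, window_days=14)
```

Running the same summary on a window before and after an AI pilot gives a like-for-like comparison, which is more useful than any single absolute number.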

Risk is real—and manageable. AI risk management and data governance are now core competencies, not afterthoughts. Protect IP with robust access controls, context window management, and red-teaming. In production, instrument threat detection and response to catch prompt injection, data leakage, and model drift. Treat this like any other reliability discipline alongside SRE.

If parts of coding get automated, where can great engineers thrive? Several high-impact paths are emerging: platform engineering for LLMs (tooling, evals, observability), SRE for AI-infused systems, developer evangelism and education, product management for AI-native experiences, security engineering focused on model and data threats, and forward deployed engineers who pair with customers to solve messy, real-world problems.

How to upskill fast: build an AI product toolbox and ship small. Prototype gen AI features end-to-end (retrieval, function calling, human-in-the-loop QA) and connect them to your CRM integration or support stack. Use A/B testing with a clear minimum detectable effect (MDE) to validate impact. Leverage CustomGPT workflows for internal enablement and in-app guides or product tours to onboard users safely.
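For the MDE piece, a rough sample-size check helps before committing to a test. The sketch below uses the standard normal approximation for comparing two conversion rates; the baseline rate, the absolute MDE, and the default significance and power values are assumptions to replace with your own.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per variant for a two-sided two-proportion test.

    baseline: current conversion rate (e.g. 0.10)
    mde_abs:  smallest absolute lift worth detecting (e.g. 0.02 for +2 points)
    """
    p1, p2 = baseline, baseline + mde_abs
    if not (0 < p1 < 1 and 0 < p2 < 1) or mde_abs == 0:
        raise ValueError("rates must be in (0, 1) and the MDE must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


# Hypothetical inputs: 10% baseline activation, detect a 2-point absolute lift.
n_per_arm = sample_size_per_arm(baseline=0.10, mde_abs=0.02)
```

Halving the MDE roughly quadruples the required sample, which is why agreeing on the MDE up front matters more than any tooling choice.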

Here’s a pragmatic 90-day plan. Weeks 0–2: audit your top 10 engineering tasks by time spent; identify 3 that are ripe for AI augmentation. Weeks 3–6: pilot inside CI/CD with explicit guardrails; track DORA metrics and developer sentiment. Weeks 7–10: productionize the wins; document runbooks; add incident management paths. Weeks 11–12: share learnings with product trios, refine your value proposition, and set next-quarter OKRs.

AI won’t replace software engineers; engineers who master AI will outpace those who don’t. If we embrace the shift—toward systems thinking, responsible governance, and customer outcomes—we’ll build better products faster and open new, rewarding career paths. The opportunity is here and compounding.

Inspired by this post on Product School.

November 12, 2025
5 Costly UX Research Pitfalls I See Often—and How AI + Qual Insights Prevent Them

In product reviews and roadmap debates at HighLevel, I come back to a simple truth: great products start with great user research—but even seasoned teams fall into the same traps. After leading product discovery across empowered product teams and product trios, I’ve learned that a few avoidable mistakes consistently derail speed, quality, and outcomes.

Learn how to avoid the top five UX research pitfalls. Discover how AI and qualitative insights can help teams uncover the why behind user behavior.

The “why” behind user behavior is where durable growth lives. When we pair qualitative insights with analytics and a clear AI Strategy, we don’t just validate a solution—we de-risk the roadmap, improve user activation, and increase retention. Here are the five pitfalls I watch for and how I coach teams to avoid them.

Pitfall 1: Treating opinions as insights. Early in my career, I mistook strong stakeholder opinions for customer truth. Now I insist on a clear research question, a decision we will make with the evidence, and a hypothesis we’re trying to falsify. A/B testing is great for measuring impact once you’ve defined a minimum detectable effect (MDE), but discovery research demands explicit learning goals and unbiased inputs.

How to avoid it: Write the decision statement first (“We will proceed with X if we learn Y”), then design the research. Keep a visible decision log so insights connect directly to product roadmapping and sprint planning, not to the loudest opinion in the room.

Pitfall 2: Leading questions and flawed methods. I still see interview guides that telegraph the desired answer. This corrupts the signal. Instead, I push teams to pilot guides with a product trio, remove solution language, and focus on behaviors. We complement interviews with in-app guides, targeted surveys, and session reviews using tools like Pendo and Intercom to capture moments of friction in-context.

How to avoid it: Ask neutral, behavior-first questions (“Tell me about the last time you…”) and validate with artifacts (screenshots, workflows). Pilot every guide with a colleague, then refine for clarity and neutrality.

Pitfall 3: Over-indexing on quantitative data and ignoring the why. Amplitude analytics and retention analysis tell me what happened; they rarely tell me why it happened. When teams chase dashboards without pairing them with qualitative interviews, we optimize for surface-level metrics and miss underlying jobs, anxieties, and unmet needs.

How to avoid it: Pair funnels and cohorts with a short round of qualitative interviews. Use Generative AI to summarize transcripts, cluster themes, and highlight contradictions, then validate themes against Amplitude analytics and CRM integration data. The synthesis is where insight emerges.

Pitfall 4: Recruiting bias—talking only to superfans or the most vocal detractors. If we only hear from power users, we build for edge cases; if we only hear complaints, we over-index on blockers. The result is a lopsided roadmap that misses mainstream value.

How to avoid it: Recruit across segments—new users, churned users, evaluators who never converted, and adjacent personas. Balance the sample and document who you didn’t talk to. For sensitive segments, lean on privacy-by-design practices and data governance so participants feel safe sharing candid feedback.

Pitfall 5: Weak synthesis and no path to action. Research often ends with a beautiful report that gathers dust. Insights must translate into choices: what we will do, what we will not do, and what we must learn next. Without this, research slows delivery without improving outcomes.

How to avoid it: Convert findings into atomic insights with evidence, confidence, and impact. Tie each insight to outcomes vs output OKRs, then schedule a decision review with the product trio. If you can’t articulate the decision, you haven’t finished the research.
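As a rough illustration of what an atomic insight could look like as a record, here is a small sketch. The field names, the 0 to 1 confidence scale, the 1 to 5 impact scale, and the confidence-times-impact ranking are all my assumptions, not a standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AtomicInsight:
    statement: str
    evidence: list[str]               # quotes, session links, chart references
    confidence: float                 # 0..1, how sure we are (assumed scale)
    impact: int                       # 1..5, expected effect on the outcome (assumed scale)
    linked_okr: Optional[str] = None
    decision: Optional[str] = None    # what we will do, not do, or learn next

    def priority(self) -> float:
        return self.confidence * self.impact

    def is_finished(self) -> bool:
        """Research is only 'done' when evidence, an OKR, and a decision all exist."""
        return bool(self.evidence) and self.linked_okr is not None and self.decision is not None


def decision_review_agenda(insights: list[AtomicInsight]) -> list[AtomicInsight]:
    """Finished insights only, highest priority first."""
    ready = [i for i in insights if i.is_finished()]
    return sorted(ready, key=lambda i: i.priority(), reverse=True)


# Hypothetical insights, for illustration only.
insights = [
    AtomicInsight("Setup stalls at calendar sync", ["interview 3", "interview 7"],
                  confidence=0.8, impact=4,
                  linked_okr="Raise week-1 activation", decision="Prototype guided sync"),
    AtomicInsight("Admins want audit exports", ["support ticket sample"],
                  confidence=0.9, impact=5),
    AtomicInsight("Pricing page confuses agencies", ["survey Q4"],
                  confidence=0.6, impact=3,
                  linked_okr="Improve trial conversion", decision="Run a copy test"),
]
agenda = decision_review_agenda(insights)
```

The second insight scores highest but stays off the agenda because no decision is attached, which is the point of the rule above.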

How I use AI without losing the plot: I rely on LLMs to speed up the busywork, not to replace judgment. Gen AI helps me transcribe, tag, and cluster themes; extract Jobs to Be Done; detect hesitation and sentiment; and draft UX writing variants for follow-up surveys. With a ChatGPT connector or similar tools, I can map qualitative themes to Amplitude analytics events and Pendo paths, revealing the narrative behind the numbers.

Guardrails matter: I apply AI risk management and privacy-by-design principles—no sensitive data in prompts, clear consent, and human-in-the-loop validation. AI is a force multiplier when the prompts are grounded in a solid research plan and the outputs feed a real decision.

A quick checklist I share with teams: define the decision and hypothesis; recruit a balanced sample; use neutral, behavior-first questions; triangulate quant with qual; synthesize into atomic insights; and link every insight to a concrete action or OKR. Do this, and you compress time-to-learning without sacrificing rigor.

When we respect the craft of research and thoughtfully apply AI, we consistently uncover the why behind user behavior—and build products that users adopt, love, and keep. That’s the fastest path to product-led growth and durable differentiation.

Inspired by this post on Amplitude – Perspectives.

November 11, 2025
Stop Falling for Hollywood Demos: The Unfiltered Truth of Live AI Voice for Support

I’ve sat through countless AI demos, and I’ve learned there are really two kinds: the “Hollywood demo,” which is polished to perfection, and the “real-world demo,” which shows the product raw—imperfections and all. The former dazzles, but the latter is where you discover what’s actually ready for prime time.

Hollywood demos look great, but sometimes need a closer look to make sure what you see is what you’ll get. When I’m evaluating an AI Agent for customer service, I always look past the polish. I’m assessing how well it will handle real-world scenarios—the messy, complex conversations your team deals with every day. That’s especially true on voice, the toughest channel to get right.

Voice is one of the toughest tests of any AI system. It’s not just “chat with speech.” An AI Agent needs to be able to listen, respond, and adapt in real time. Timing, tone, and turn-taking are all part of the product; they shape the experience as much as accuracy or reasoning.

An edited video might sound seamless, but it can’t show how a system behaves in a real support environment—like when a conversation takes an unexpected turn or when it pauses briefly to reason or retrieve data. Those small moments—latency, clarifications, interruptions—are when you see what the AI Agent is really capable of. A real-world demo lets you see and hear how the system actually behaves under real conditions, not in a controlled environment that’s been smoothed out with editing.

That’s why the live Fin Voice demo at Pioneer stood out. The team called Fin live on stage to show the real thing (with real latency and interruptions) so people could understand the product they’d be deploying to their own customers. As a product leader, I appreciate that level of transparency because it mirrors how customers will experience the system in production.

When Paul Adams, Chief Product Officer, demoed Fin Voice at Pioneer, the goal was to show the product exactly as customers experience it. In 90 seconds, Fin verified his identity, retrieved account data, managed an interruption, offered options, completed the workflow, and sent a follow-up email. That’s the kind of end-to-end outcome I look for—fast verification, accurate retrieval, natural pacing, and a closed loop.

Latency. You could hear brief pauses while Fin fetched subscription details and checked backend systems. That wasn’t lag—it was work happening in real time. In voice AI, thoughtful latency that signals reasoning is far better than synthetic speed that collapses under real load.

Natural conversation flow. Fin detected when Paul finished speaking, handled interruptions gracefully, and replied in short, human-like turns. That turn-taking behavior is essential for trust and comprehension in voice customer support.

Awareness and tone. Subtle changes in pacing when Paul laughed or hesitated showed sensitivity to context. Tone control is not a “nice to have” in voice—it’s a core UX capability.

Unscripted conversation design. No rigid IVR menus or fixed paths. Paul spoke naturally, and Fin adapted to resolve his query. That adaptability is what differentiates a true AI Agent from a glorified decision tree.

Those details are the real test. A voice AI Agent that performs well in a live demo is one that will perform well for you and your customers too.

Voice has been one of the most demanding, and rewarding, areas of development for Fin. Since launch, we’ve been expanding what it can do so support leaders can customize how Fin sounds, behaves, and aligns with their brand.

Voice and tone customization: Choose from multiple natural voices, set greetings, and fine-tune how Fin communicates with customers.

Escalation and conversational guidance: Teach Fin to use your terminology, ask clarifying follow-ups, and escalate when needed.

Deployment controls: Manage rollouts, test safely in internal environments, and fine-tune before going live.

Flexible integrations: Connect to any telephony system via call forwarding, and link Fin Voice to backend systems or APIs to take action.

Multilingual capability: Fin Voice now supports 28 languages natively.

Alongside these features, we’ve made big improvements to Fin’s answer quality—the foundation of a great voice experience. When people call, they’re looking for accurate, immediate answers they can trust.

So we’ve focused on three key areas: latency, which is down roughly 30–40% since launch; clarification flow, so Fin asks smart follow-up questions to reduce back and forth and improve resolution rates; and voice-specific answer structure, so Fin delivers information in shorter sentences with pacing designed for listening.

Together, these improvements mean customers get the highest-quality answers as quickly as possible, resulting in more resolutions and better experiences.

Running a live demo always carries risk because things can go wrong. But that’s also why it matters—because that’s how customers experience it too. Support leaders stake their reputation on the systems they choose, so the only way to understand what you’re putting in front of your customers is to see it under real conditions.

When you see Fin in a demo, you’re seeing the same system that runs in production. Real-world demos take more effort and don’t always go perfectly, but they show what’s real—and that’s exactly what you need to evaluate before you deploy voice AI at scale.

Inspired by this post on The Intercom Blog.

November 11, 2025
From Sketch to Clickable Demo: My AI Prototyping Playbook to Build Apps in Hours

I’ve spent much of my career compressing the distance between a napkin sketch and something real customers can touch. At HighLevel, my product teams use generative AI to validate ideas faster, reduce risk earlier, and win stakeholder trust with evidence instead of slides. The goal isn’t to be flashy—it’s to be precise, testable, and repeatable.

Today, you can build it before you pitch it. AI prototyping can turn ideas into clickable demos in hours. Here are some tools to try and steps to follow.

I start every AI prototyping sprint by sharpening the problem statement and the outcome we care about. That means being explicit about the target user, jobs-to-be-done, and the riskiest assumptions. I define a minimum detectable effect (MDE) and tie it to outcomes vs output OKRs so everyone aligns on what “good” looks like before we touch a tool.

From there, I move from sketch to interface. I capture a rough flow (whiteboard, tablet, or even paper) and generate UI variations with my AI product toolbox: tools that translate structure into components and screens. I’ll iterate on information hierarchy and copy until the narrative supports the core job, borrowing techniques from UX writing. For product managers leaning into LLMs, this phase is about speed to feedback, not perfection.

Next, I wire data and logic. I connect a lightweight backend or spreadsheet, stitch in a CRM integration if needed, and add LLM calls through a ChatGPT connector or Claude Code. If the concept benefits from multi-step autonomy, I introduce agentic AI to orchestrate tasks across APIs. CustomGPT workflows help me encapsulate business rules so the demo behaves consistently in user paths we care about.

Governance is not optional at this stage. I apply privacy-by-design defaults, document data governance decisions, and run a quick AI risk management pass: input validation, prompt safety, rate limits, and fallback responses. This keeps the prototype credible and prevents false positives from polluting stakeholder perception.
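The four checks in that pass can be sketched as a thin wrapper around whatever model call the prototype uses. Everything here is illustrative: `call_model` stands in for your own LLM client, the limits are arbitrary, and the phrase list is a naive heuristic rather than a real defense against prompt injection.

```python
import time
from collections import deque
from typing import Callable

MAX_INPUT_CHARS = 2000  # arbitrary cap for a prototype
SUSPICIOUS_PHRASES = (  # naive, illustrative heuristic only
    "ignore previous instructions",
    "reveal your system prompt",
)
FALLBACK = "Sorry, I can't help with that right now."


class RateLimiter:
    """Sliding-window limiter: at most max_calls within per_seconds."""

    def __init__(self, max_calls: int, per_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_calls = max_calls
        self.per_seconds = per_seconds
        self.clock = clock
        self.calls: deque = deque()

    def allow(self) -> bool:
        now = self.clock()
        while self.calls and now - self.calls[0] >= self.per_seconds:
            self.calls.popleft()  # forget calls that left the window
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True


def guarded_reply(user_input: str, call_model: Callable[[str], str],
                  limiter: RateLimiter) -> str:
    text = user_input.strip()
    if not text or len(text) > MAX_INPUT_CHARS:          # input validation
        return FALLBACK
    if any(p in text.lower() for p in SUSPICIOUS_PHRASES):  # crude prompt-safety check
        return FALLBACK
    if not limiter.allow():                               # rate limit
        return FALLBACK
    try:
        return call_model(text)
    except Exception:                                     # fallback on any model failure
        return FALLBACK


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM client, so the sketch runs offline."""
    return f"echo: {prompt}"


limiter = RateLimiter(max_calls=2, per_seconds=60.0)
first = guarded_reply("Summarize this ticket", fake_model, limiter)
```

Rejected inputs return before the limiter is consulted, so malformed requests do not eat into a legitimate user’s quota.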

With a click-through in hand, I instrument the experience so learning compounds. I drop in Amplitude analytics to track activation, task completion, and drop-off, and set up simple A/B testing when there’s a meaningful design or copy choice. This makes the prototype a learning vehicle, not just a demo.
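Once events are flowing, the first readout I want is a step-by-step funnel. The sketch below works on a plain list of exported `(user_id, event_name)` pairs rather than any analytics vendor’s API; the event names are hypothetical, and for simplicity it ignores event ordering and timestamps.

```python
from collections import defaultdict

# Hypothetical event names for a prototype walkthrough.
FUNNEL_STEPS = ["prototype_opened", "task_started", "task_completed"]


def funnel_report(events: list[tuple[str, str]], steps: list[str]) -> list[dict]:
    """Count unique users reaching each step and the drop-off from the prior step."""
    users_by_event = defaultdict(set)
    for user_id, name in events:
        users_by_event[name].add(user_id)

    report = []
    previous = None  # users who reached the prior step
    for step in steps:
        users = set(users_by_event.get(step, set()))
        if previous is not None:
            users &= previous  # only count users who also reached earlier steps
        entered = len(users) if previous is None else len(previous)
        drop_off = 0.0 if entered == 0 else 1 - len(users) / entered
        report.append({"step": step, "users": len(users), "drop_off": drop_off})
        previous = users
    return report


# Hypothetical export from four user sessions.
events = [
    ("u1", "prototype_opened"), ("u1", "task_started"), ("u1", "task_completed"),
    ("u2", "prototype_opened"), ("u2", "task_started"),
    ("u3", "prototype_opened"),
    ("u4", "prototype_opened"), ("u4", "task_started"), ("u4", "task_completed"),
]
report = funnel_report(events, FUNNEL_STEPS)
```

With only a handful of sessions the percentages are directional at best, so I read them alongside the interview notes rather than on their own.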

Then I get it in front of users—fast. Five targeted conversations will beat fifty internal opinions. I run structured product discovery interviews, observe time-to-value, and capture objections. This is where empowered product teams shine: we make changes in real time, re-run the flow, and document what moves the needle for product-led growth.

When speed matters, I use a four-hour cadence: Hour 1 for problem framing and MDE; Hour 2 for sketch-to-UI generation; Hour 3 for data wiring and AI logic; Hour 4 for instrumentation and user walkthroughs. By the end, we have a clickable demo, preliminary analytics, and a clear decision on whether to advance, pivot, or park.

Finally, I translate insights into a concise artifact: the hypothesis we tested, the signal we observed, the trade-offs we made, and the next sprint plan for product roadmapping and sprint planning. The point is not to be right on the first try; it’s to learn precisely, cheaply, and quickly enough to invest with conviction.

If you adopt this approach, you’ll find that stakeholder management becomes easier, team energy rises, and your roadmap earns credibility. Build it before you pitch it, and let real interactions—not wishful thinking—do the heavy lifting.

Inspired by this post on Product School.

November 10, 2025