Category: AI Strategy

  • From Concierge to AI Marketing Engine: Inside Mowie’s Document Hierarchy Playbook


    SMB owners constantly ask me some version of the same question: what if a small business could have a full marketing team (automated content calendars, customer segmentation, and channel-specific posts) without the headcount? That question is no longer hypothetical. It is precisely the promise behind Mowie, and the way the team got there is a masterclass in practical AI product development.

    I recently listened to Chris O'Connor (CEO) and Jessica Valenzuela (Co-Founder) of Mowie, an AI marketing platform built for small and medium-sized businesses in restaurants, retail, and e-commerce. Their story starts with a concierge marketing service—doing the work by hand for overwhelmed owners—and evolves into a fully automated AI product.

    They walk through their "document hierarchy" approach: how Mowie crawls the web to build a "dossier" about each business, infers customer segments and marketing pillars, and generates quarterly content calendars with channel-specific posts. As a product leader, I see this as the kind of retrieval-first pipeline that tends to outperform naive prompt chaining because it builds durable context before generation.

    They also unpack the technical challenges of structuring unstructured data and the evolution from rigid schemas to loosely structured markdown. In my experience applying LLMs to product work, markdown becomes a flexible intermediate representation that’s easy to diff, trace, and feed back into models without brittle parsing.
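    To make the "loosely structured markdown" idea concrete, here is a minimal sketch of how a dossier could be split into sections by heading without a rigid schema. The dossier text, the section names, and the `split_markdown_sections` helper are my own illustrative assumptions, not Mowie's actual format or code.

```python
import re


def split_markdown_sections(text: str) -> dict[str, str]:
    """Split loosely structured markdown into {heading: body} pairs.

    Hypothetical helper: headings are any line starting with 1-6 '#'.
    A repeated heading would overwrite the earlier section.
    """
    sections: dict[str, str] = {}
    current = "preamble"  # text that appears before the first heading
    lines: list[str] = []
    for line in text.splitlines():
        match = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if match:
            sections[current] = "\n".join(lines).strip()
            current = match.group(1)
            lines = []
        else:
            lines.append(line)
    sections[current] = "\n".join(lines).strip()
    # Drop sections that ended up empty (for example, an empty preamble).
    return {name: body for name, body in sections.items() if body}


# Invented example dossier, for illustration only.
dossier = """# Business
Family-run taqueria, two locations.

## Customer segments
- Lunch crowd from nearby offices
- Weekend families

## Marketing pillars
- Fresh ingredients
"""

sections = split_markdown_sections(dossier)
```

    Because each section stays plain text, it can be diffed between crawls and passed back to a model as-is, which is the property the paragraph above relies on.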

    Equally important, they use customer feedback—from calendar approvals to regeneration requests—as their primary evaluation signal. That’s eval-driven development in practice: close the loop with lightweight evals that reflect genuine user intent, not proxy metrics.

    The planning model is elegant: three mini-calendars (public events, business-specific events, and recommended campaigns) roll up into a coherent plan that eliminates the blank-page problem and enables steady, predictable execution.
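    The roll-up of the three mini-calendars can be sketched as a simple merge and sort. This is only an illustration under my own assumptions: the `CalendarEntry` fields, the source labels, and the sample events are hypothetical, and the real product presumably does far more than sort by date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CalendarEntry:
    day: date
    title: str
    source: str  # assumed labels: "public", "business", or "campaign"


def merge_calendars(*calendars: list[CalendarEntry]) -> list[CalendarEntry]:
    """Flatten several mini-calendars into one plan ordered by date."""
    merged = [entry for calendar in calendars for entry in calendar]
    return sorted(merged, key=lambda entry: (entry.day, entry.source))


# Invented sample data for the three mini-calendars.
public = [CalendarEntry(date(2025, 2, 14), "Valentine's Day", "public")]
business = [CalendarEntry(date(2025, 2, 1), "New menu launch", "business")]
campaigns = [CalendarEntry(date(2025, 2, 10), "Date-night promo", "campaign")]

plan = merge_calendars(public, business, campaigns)
```

    Keeping the `source` on every entry means the merged plan still shows where each item came from.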

    Crucially, they’re building traceability so customers can see which context documents influenced their content. This kind of transparency increases trust, accelerates edits, and supports governance in regulated categories where auditability matters.

    Onboarding and data collection stay pragmatic: let the system crawl first, ask humans only for deltas, and progressively profile over time. It’s a pattern I advocate in continuous discovery and AI workflows—keep humans in the loop without overwhelming them, and make the right action the easy action.

    Early on, they used Simon Sinek's Golden Circle framework to validate demand and sharpen messaging. Framing the "why" before the "what" helps teams maintain a crisp value proposition and tighten their go-to-market strategy.

    Performance measurement goes beyond vanity metrics by connecting marketing performance back to point-of-sale data for attribution. The ability to tie campaigns to revenue events is the bridge from clever content to accountable outcomes.

    What’s next is equally compelling: deeper attribution, omnichannel expansion, and digital out-of-home displays. For SMBs, that points to a unified analytics platform spanning email, social, and in-store touchpoints—exactly where modern marketing is headed.

    My takeaways for builders: invest in a retrieval-first pipeline with a resilient document hierarchy; prefer loosely structured markdown over rigid JSON when dealing with messy inputs; design human-in-the-loop controls that double as evals; and always connect activity to business outcomes. That’s how you turn an idea into a repeatable system that scales.

    If you want to explore further, start here: Mowie AI — AI marketing platform for SMBs. For early validation and storytelling, revisit Simon Sinek's Golden Circle.


    Inspired by this post on Product Talk.


  • Automated Insights for Product Teams: Uncover Causal ‘Aha’ Moments in Minutes, Not Weeks


    I’ve spent countless cycles guiding teams through the maze of dashboards, SQL pulls, and ad‑hoc analyses—only to watch truly meaningful patterns emerge far too late. Automated insights are the next frontier in product analytics: a shift from manual exploration to AI that proactively surfaces what matters most. When we let the system do the heavy lifting, we accelerate discovery, reduce bias, and give product trios the clarity to act.

    Finding causal connections in product data involves exhaustive searches and tests. We trained our AI to find “aha” moments in minutes instead of weeks.

    Here’s what that means in practice for product management: the platform continuously scans events, cohorts, and segments; prioritizes signals linked to activation, conversion, and retention; and highlights likely causes behind meaningful movements in your core KPIs. Instead of sifting through endless funnels and cohorts, I get ranked hypotheses I can validate with targeted A/B testing and minimum detectable effect (MDE) guardrails.
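    As a rough illustration of an MDE guardrail, the sketch below estimates how many users each arm of an A/B test needs before a given absolute lift is detectable. It uses the standard normal-approximation formula for comparing two proportions; the function name and the default significance and power values are my assumptions, not something taken from the platform.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(
    baseline: float, mde: float, alpha: float = 0.05, power: float = 0.8
) -> int:
    """Approximate users needed per arm to detect an absolute lift `mde`.

    Two-sided test on two proportions, normal approximation.
    """
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde**2)


# Example: 10% baseline conversion, 2-point absolute lift.
n = sample_size_per_arm(baseline=0.10, mde=0.02)
```

    The practical use is as a gate: if a ranked hypothesis targets a segment too small to reach `n` users per arm in a reasonable window, I treat the insight as a lead to investigate rather than something to A/B test.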

    This approach turns analytics into action. Automated insights reduce time-to-learning, tighten our discovery loops, and make continuous discovery tangible—especially when we’re aligning roadmaps, designing experiments, and refining onboarding. Whether you’re using tools like Amplitude analytics or instrumenting a unified analytics platform, the value is the same: faster, clearer paths to customer impact.

    I’ve seen teams unlock retention analysis breakthroughs by spotting counterintuitive patterns—like a specific feature combination or an overlooked step in onboarding—well before they would have surfaced through manual analysis. With AI workflows scanning the noise and elevating the signal, we can focus on decisions: ship or iterate, scale or sunset, double down or pivot. That’s empowered product teams in action.

    If you’re building for product-led growth, this is the leverage you’ve been waiting for. Automated insights transform how we prioritize, test, and communicate strategy—bringing us from gut feel and lagging indicators to explainable, causal narratives we can stand behind. The outcome is simple: more confident bets, less waste, and a faster path to durable product-market fit.


    Inspired by this post on Amplitude – Best Practices.


  • Unlock Real-Time Product Insights: Amplitude + OpenAI MCP in ChatGPT, Without BI Bottlenecks


    I’ve been working to remove the friction between product questions and product answers. The most impactful step so far: connecting Amplitude analytics directly into ChatGPT via OpenAI’s MCP. This turns everyday conversations into decision-grade insights—no dashboards to hunt, no SQL to write, and no analytics queue to wait on.

    Connect Amplitude data directly to the tools your team uses every day. OpenAI’s MCP connector eliminates traditional barriers to product data.

    In practice, this means I can ask ChatGPT natural-language questions like, “Where are users dropping in our activation funnel this week?” or “Which cohorts are driving retention lift post-onboarding?” and get grounded answers from Amplitude—fast. It’s a step-change for product-led growth because the insights live where we already think and plan.

    Here’s how I apply it day to day: I’ll prompt ChatGPT to compare week-over-week activation for new SMB signups across regions, diagnose drop-offs by step, and summarize A/B testing outcomes with guardrails like minimum detectable effect considerations. When we’re shaping strategy, I’ll pull a retention analysis and cohort breakdown to inform bet sizing and roadmap tradeoffs—all without pulling the team into a BI bottleneck.

    Governance remains non-negotiable. I scope the MCP tools to a least-privilege data slice, apply privacy-by-design rules to exclude PII, and log every query for auditability. Clear data governance and AI risk management policies ensure we maintain trust while accelerating discovery. Tight context window management keeps prompts focused and reduces noise.
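    The governance pattern above (exclude PII, log every query) can be sketched as a thin wrapper around whatever function actually runs the analytics query. Everything here is hypothetical: the PII field names, the `audited_query` wrapper, and the fake backend are illustrations, and none of it is the Amplitude or MCP API.

```python
import json
from datetime import datetime, timezone
from typing import Callable

# Assumed PII field names; a real deployment would define its own list.
PII_FIELDS = {"email", "name", "phone", "ip_address"}


def scrub_pii(row: dict) -> dict:
    """Return a copy of the row without any PII fields."""
    return {key: value for key, value in row.items() if key not in PII_FIELDS}


def audited_query(
    question: str,
    run_query: Callable[[str], list[dict]],
    audit_log: list[str],
) -> list[dict]:
    """Log the question, run it, and strip PII before returning rows."""
    audit_log.append(
        json.dumps(
            {"at": datetime.now(timezone.utc).isoformat(), "question": question}
        )
    )
    return [scrub_pii(row) for row in run_query(question)]


def fake_backend(question: str) -> list[dict]:
    """Stand-in for the real analytics call, for illustration only."""
    return [{"user_id": 1, "email": "a@example.com", "activated": True}]


log: list[str] = []
rows = audited_query("activation this week", fake_backend, log)
```

    Logging before the query runs means failed or rejected queries still leave an audit trail.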

    Operationally, the setup is straightforward: define the MCP tool spec for Amplitude, map canonical events and metrics (activation, retention, conversion, and product-qualified lead stages), and test with a retrieval-first pipeline so responses reliably cite the right source of truth. We standardize metric definitions across product, growth, and customer success to avoid semantic drift.

    The impact on empowered product teams is immediate. Continuous discovery becomes a daily habit rather than a quarterly ritual; questions move from “I’ll get back to you” to “Let’s check right now.” For product managers working with LLMs, this is the connective tissue that turns ChatGPT into a true analytics connector: an on-demand, unified analytics platform that supports faster iteration and sharper decision-making.

    If you’ve been waiting to make analytics truly ambient, this is the moment. Start small with a single funnel or cohort, validate governance, and expand to your core lifecycle metrics. The payoff is a shared understanding of what’s working, what’s not, and where to focus next—delivered in the flow of work.


    Inspired by this post on Amplitude – Best Practices.


  • Build Powerful AI Writing Workflows with Claude Code: A No‑Code, Step‑by‑Step Playbook


    My writing process used to be messy. Even in my role leading product strategy, I’d start strong and then stall because I hadn’t clarified what I truly wanted to say.

    I’d begin with a brain dump—everything swirling in my head. I’d try to shape it into an outline, lose patience, and just start writing. A few paragraphs later, I’d realize I didn’t know where I was going, stop, and return to the outline. It was a tortured loop between writing and structuring.

    Now I do it differently. When I get stuck, I don’t start writing. I ask Claude for help.

    Claude reviews my outline and helps me fill in gaps. It often suggests things that I don’t like. This is good. It helps me figure out the core of what I want to say. Instead of writing my way to what I think, I discuss my way to what I think.

    Claude isn’t just a sounding board. I also use it to help me brainstorm headlines, explore outline alternatives, critique each section as I write, conduct supporting research, act as my thesaurus and dictionary, make SEO recommendations, and so much more. As a result, I am writing way more.

    I didn’t design this workflow in one sitting. I built it iteratively, the same way I build products: by asking, "How can Claude help with this?" and evolving from there.

    If you haven’t been following along, I’m deep in a series about Claude Code and how it helps me work better. Here’s what we’ve covered so far: “Claude Code: What It Is, How It’s Different, and Why Non-Technical People Should Use It”; “Stop Repeating Yourself: Give Claude Code a Memory”; “How to Use Claude Code Safely: A Non-Technical Guide to Managing Risk”; and “How to Choose Which Tasks to Automate with AI (+50 Real Examples).”

    This week, I’m diving into how to design personal AI workflows. I’ll use my writing workflow to illustrate each step, and I encourage you to follow along with your own process so you end with something tangible.

    Screenshot: Claude breaks down an article on building AI workflows into sections and suggests three paywall placement options, weighing the trade-offs of each.

    Designing AI workflows looks a lot like designing product solutions. I lean on "discovery" habits—clarifying outcomes, mapping the journey, and testing assumptions—to make the work both reliable and repeatable.

    This series is inspired by my personal usage of Claude Code. I have not received any compensation from Anthropic for writing this series. And you can trust that if that ever changes, I will disclose it. This is not only required by the FTC here in the US, but I strongly believe it is the right thing to do. You can count on me to do so.

    First, I map out what I do to complete the task. Once you’ve identified the AI workflow you want to create, start by mapping exactly what you do when you do it yourself. If this feels hard, do the task a few more times and jot down each step as you go.

    Here’s what I do when I write a blog post: I choose a topic; I write down everything I can think of related to that topic; I structure it into an outline; I do some research to fill in gaps; I write each section; I edit each section; I think about SEO tactics; I brainstorm headlines; I decide what images to add; and I send it to my editor.

    If this looks a lot like story mapping, that’s because it is. Instead of mapping what a customer has to do to get value from a solution, I’m mapping what I do to complete a task. The benefit is the same: I can see what must happen and ask, "Where can AI help?"

    From here, I focus on four moves: choose one step to automate or augment with AI; decide on the right automation (or augmentation) strategy—code vs. LLMs; prototype the first workflow with detailed instructions; and test and iterate until it meets my bar for quality and speed.

    My goal is to give you enough guidance that you can follow along and end with a draft of your first AI workflow. If you apply continuous discovery to your own process, you’ll not only accelerate output—you’ll improve the clarity and quality of your thinking along the way.


    Inspired by this post on Product Talk.


  • 6 AI Strategies to Accelerate Business Growth: Unlock Revenue, Cut Costs, Scale Faster


    I’ve spent the last few years weaving AI into core product workflows, and the pattern is clear: when we pair disciplined product thinking with pragmatic AI strategy, growth compounds. The question I hear most isn’t whether AI can help, but where to begin and how to de-risk the journey while moving fast.

    AI for business growth starts with one of these six strategies. See how companies use AI to unlock revenue, cut costs, and scale smarter and faster.

    1) Revenue acceleration with unified customer intelligence. I start by connecting behavioral analytics and CRM integration to a unified analytics platform, then layer a retrieval-first pipeline so large language models can surface high-intent accounts, churn signals, and next-best actions. With Amplitude analytics and A/B testing, we validate AI-driven playbooks for upsell, cross-sell, and win-back—turning insights into measurable lift rather than novelty.

    2) Cost reduction through targeted automation. Not all automation yields the same outcome. I look for repetitive, high-volume processes where quality is easy to verify: customer support with AI-assisted deflection, accounts payable automation, and security workflows like threat detection and response. Combining agentic AI with clear guardrails reduces handle time, frees teams for higher-value work, and keeps error rates within acceptable thresholds.

    3) Faster time-to-market via eval-driven development. Speed without signal is noise. I lean on eval-driven development to instrument models, measure drift, and tighten CI/CD loops. We track DORA metrics like deployment frequency while using generative AI for product prototyping to compress discovery and delivery. Tools such as Claude Code help engineers iterate safely behind feature flags so we can ship learning, not just code.

    4) Personalization that drives activation and retention. Growth sticks when onboarding is contextual. I use in-app guides, product tours, and thoughtful tooltip design, all powered by LLMs, to tailor the first-run experience. With retention analysis and OKRs that measure outcomes rather than output, we align personalization with the moments that matter: activation, habit formation, and expansion.

    5) Trust-by-design to scale responsibly. AI risk management, privacy-by-design, and data governance are not afterthoughts; they are growth enablers. By defining policy, red-teaming prompts, and practicing context window management, we reduce rework, limit incident management, and maintain compliance across markets. Clear review gates make it easier to say yes to more AI use cases without compromising customer trust.

    6) Voice and agent experiences that feel like product, not add-ons. When prompt engineering for voice and voice AI agent patterns are integrated into the core journey—guided onboarding, smart handoffs, proactive notifications—engagement rises. Agent Analytics turns conversations into product signals we can act on in roadmapping and sprint planning, closing the loop between user intent and product improvement.

    My playbook for getting started is simple: pick one revenue and one efficiency use case, define success upfront, and ship a narrowly scoped MVP with robust analytics. Use continuous discovery with product trios to refine prompts, data sources, and experience design. Then scale what works, retire what doesn’t, and let evidence—not hype—set the roadmap.

    If you’re evaluating where to apply generative AI next, these six lanes offer fast paths to impact without sacrificing governance or customer trust. The companies I’ve seen win treat AI as a capability within the product, not a separate project, and they measure it with the same rigor they use for any critical feature.


    Inspired by this post on Product School.


  • Make Every Answer the Last: Building a Self-Improving AI Support Engine for 2026


    Once I’ve defined the right roles on my team, the next move is to design an operating model that makes progress a habit. My goal is simple: every interaction should strengthen the system so the AI Agent keeps improving over time.

    I anchor the team on a mantra that has never failed me: “The first time you answer a question should be the last.” That single statement reframes support as a compounding system rather than a one-off activity.

    The ambition is to ensure every resolution makes the next one faster and more accurate, so fewer issues repeat, quality compounds, and support scales naturally. That doesn’t happen by accident—it requires intentional design.

    In practice, this comes down to four essentials: clear ownership of performance, guardrails that make iteration fast and safe, feedback loops that turn learning into routine upgrades, and a culture that celebrates the work of improvement—not just the outcomes. Here’s how I put that into play.

    First, I start with clear ownership. Ambiguity is one of the most common reasons AI performance plateaus. When no one truly owns how the AI Agent performs, feedback gets lost, issues linger, and improvements stall.

    On high-performing teams, I assign a single owner—often an AI ops lead—responsible for making the AI Agent better. They review resolution trends to spot underperformance, make targeted updates to content, configuration, and behavior, coordinate with product and engineering on systemic blockers, and set improvement priorities, targets, and timelines. The title matters less than the mandate; what matters is clear authority to drive change across teams.

    Real-world example: At Dotdigital, AI performance plateaued after a strong start—resolving around 2,800 conversations per month for three consecutive months. To drive resolution rates up, the team created a dedicated support operations specialist role, filled by an experienced agent with deep product knowledge. This person will focus on refining snippets, improving content, and enhancing the AI’s resolution capabilities.

    Second, I make iteration fast and safe. As the AI Agent takes on more volume and complexity, change can start to feel risky—so teams hesitate, and performance stalls. Lightweight governance fixes that by making the path from insight to action predictable.

    I keep the rules simple and explicit: which changes need review (and which don’t), who the decision-makers are, how we test updates before they go live, where feedback flows so it’s seen and acted on, and when progress gets reviewed on a steady cadence. Governance isn’t bureaucracy—it’s what keeps improvement routine and safe.

    Real-world example: Anthropic ran a focused “Fin hackathon” sprint to improve their AI Agent’s resolution rate. The team audited unresolved queries, identified underperforming topics, and created or updated content to close gaps. They converted frequently used macros into AI-usable snippets, monitored Fin’s performance during live support, and continuously refined content based on real interactions. This structured approach enabled rapid improvement while maintaining quality standards.

    Third, I build a system that learns by default. AI performance isn’t static, but many organizations treat it like a one-time implementation. The most successful teams operationalize learning: they analyze where the AI Agent struggles and feed those insights directly into structured improvements.

    The signals are straightforward: review common handoffs to humans, track unresolved queries by topic or intent, measure resolution rate trends over time, and use those inputs to prioritize fixes and content upgrades. Whether you follow a formal loop like the Fin Flywheel framework or something lighter, the goal is the same—make improvement inevitable.
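    The signals above can be computed from very simple conversation records. The sketch below is a minimal illustration under my own assumptions: the `topic` and `resolved_by_ai` field names and the sample data are hypothetical, not an export format from any specific tool.

```python
from collections import defaultdict


def resolution_rate_by_topic(conversations: list[dict]) -> dict[str, float]:
    """Share of conversations per topic that the AI Agent resolved itself."""
    totals: dict[str, int] = defaultdict(int)
    resolved: dict[str, int] = defaultdict(int)
    for conversation in conversations:
        topic = conversation["topic"]
        totals[topic] += 1
        if conversation["resolved_by_ai"]:
            resolved[topic] += 1
    return {topic: resolved[topic] / totals[topic] for topic in totals}


def worst_topics(rates: dict[str, float], limit: int = 3) -> list[str]:
    """Topics with the lowest resolution rate, worst first."""
    return sorted(rates, key=rates.get)[:limit]


# Invented sample data: topic plus whether the AI resolved it without handoff.
conversations = [
    {"topic": "billing", "resolved_by_ai": True},
    {"topic": "billing", "resolved_by_ai": False},
    {"topic": "billing", "resolved_by_ai": True},
    {"topic": "billing", "resolved_by_ai": False},
    {"topic": "login", "resolved_by_ai": True},
    {"topic": "login", "resolved_by_ai": True},
    {"topic": "login", "resolved_by_ai": True},
    {"topic": "refunds", "resolved_by_ai": False},
]

rates = resolution_rate_by_topic(conversations)
priorities = worst_topics(rates, limit=2)
```

    Running this on a weekly export and comparing the output over time gives the resolution-rate trend, and the lowest-ranked topics become the content backlog.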

    Fourth, I treat content as competitive infrastructure. Your AI Agent is only as good as what it knows. As George Dilthey, Head of Support at Clay, put it: “That’s when we realized: AI doesn’t just come up with information out of nowhere, you have to feed it. We were spending all our time evaluating tools when we should’ve been focused on content.”

    I operationalize knowledge like infrastructure: every topic has a clear owner, content is structured, versioned, and ingestion-ready, new products ship with source-of-truth content by default, and changes ship on a schedule—not when someone finds time. This is the backbone that differentiates teams who scale confidently from those who stall out.

    In my organization, we’ve evolved our New Product Introduction (NPI) process by aligning early with R&D on a single, canonical source of truth that becomes the foundation for all downstream content—including what the AI Agent uses to resolve queries. By embedding content creation into launch readiness, not as an afterthought, we’ve consistently hit 50%+ resolution rates on new features from day one.

    Finally, I make belief visible. Even the best system will stagnate if people stop believing in it. Belief can fade quietly unless you reinforce it on purpose. I keep it strong by sharing specific wins regularly, highlighting improvements with metrics, and recognizing the people behind the gains—then giving them space to lead. This isn’t just about morale; it keeps everyone aligned on the bigger play.

    When you put it all together—clear ownership, safe iteration, a learning system by default, and content as infrastructure—AI performance compounds. As the AI Agent gets better, the entire support model becomes faster, more reliable, and truly scalable. That’s the foundation of a modern, AI-first support organization.

    Next, I’ll take this a level deeper and share how capacity planning changes when AI handles the majority of inbound volume and your team shifts into higher-value roles. If scaling with confidence is the goal, this is where the operating model pays off.


    Inspired by this post on The Intercom Blog.


  • Spain’s Tough New Customer Service Law: What It Signals—and How AI Keeps You Compliant, Fast, and Human


    Support teams in Spain just got the clearest signal yet that the old way of doing things won’t cut it anymore. As I look at the details, I see more than a regulatory hurdle—I see a blueprint for the modernization many of us have been pushing toward for years.

    The signal arrives in the form of one of the most ambitious customer service regulations in Europe—a law designed to strengthen consumer protections and set clear expectations for fair, transparent, and personalized customer service. Among its measures: new protections against spam calls, stronger transparency requirements, safeguards around personalized interactions, and measurable standards for speed, accessibility, and complaint handling within customer support.

    It’s a significant shift, especially for large enterprises and essential-service providers. While the initial reaction might be anxiety about audits and penalties, the larger opportunity is hard to ignore: this law compels us to build modern, resilient support operations that scale, perform, and earn trust.

    Spain is often an early mover in consumer-protection regulation, and this shift could signal what future standards across the EU might look like. For EMEA leaders, this is a moment to reevaluate operating models, invest in automation thoughtfully, and ensure customer experience improvements directly support regulatory compliance.

    Below, I break down what the law requires, what it means in practice, and how AI Agents like Fin can help teams meet regulatory expectations while delivering faster, more personal support at scale.

    The law applies in full to providers of regulated services, including water, energy, passenger transport, postal services, pay-audiovisual media, and electronic communications, and also to any company (or group) that meets certain size and turnover thresholds, even if their core business falls outside those sectors.

    Large companies (those with more than 250 employees and over €50 million in turnover) also hold additional obligations, particularly around multilingual support in Spain’s co-official language regions.

    While the law is still moving through its final approval stages, the direction is clear: a broad set of obligations will reinforce consumer rights, ensuring that customers can reach support quickly, speak to a human when needed, get clear information during outages or service disruptions, and have complaints handled on time.

    1. 95% of support calls must be answered within three minutes

    This raises the bar significantly for responsiveness, especially during spikes, outages, billing cycles, or seasonal surges. Most support systems are not built for this level of agility. In my experience, you can’t hire your way to this metric sustainably—you have to design for it.

    2. Customers must be able to speak to a human on request

    Automation is allowed, but it cannot be the only option. At any point during a call, a customer must be able to transfer to a human if they ask for one. Companies cannot trap customers in automated loops. The practical implication: every workflow needs a reliable, audited escape hatch to a person.

    3. Support lines must be free of charge

    Premium-rate numbers are prohibited. Customer service cannot generate revenue for the business, nor may it be used to upsell products. This cleanly separates service from sales and reduces consumer friction.

    4. Essential services must offer 24/7 support for continuity issues

    Electricity, water, gas, telecoms, and transport providers must be reachable at all hours when customers need to report service interruptions. That means coverage, triage, and routing must be always-on.

    5. Complaints must be resolved within 15 days – or within five days for undue charges

    This halves the previous general complaint window of 30 days and adds a much faster path for billing-error complaints. Companies must maintain records, assign tracking numbers, and ensure timely follow-up. Your case management discipline will make or break this requirement.

    6. No spam calls or unwanted commercial pressure

    Companies must identify business calls with a designated prefix, and customer-service calls with a different one. Telecom operators will be required to block calls that do not use these codes. Additionally, contracts obtained via unsolicited calls will be legally null and void, protecting consumers from being pressured into commitments they never intended to make.

    7. Companies must maintain a unified complaint-tracking system

    All complaints, claims, and incidents must be recorded in a centralized system to ensure traceability. If your data is fragmented across tools, this is a call to centralize and standardize intake.

    8. Companies must pass annual external audits

    These audits assess whether customer service processes are meeting the required standards. In practice, that means consistent processes, measurable outcomes, and reliable evidence.

    9. Better linguistic and accessibility rights

    Large companies operating in regions with co-official languages must be able to provide support in those languages. They must also ensure their customer service is accessible for vulnerable consumers, such as those with disabilities or older adults. Multilingual and accessible by design is the new default.

    10. Fairer contract renewals

    Companies must provide customers with 15 days’ notice prior to automatic renewal of online subscriptions and make cancellation simple. This is both a compliance and customer trust win.
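    The complaint windows in requirement 5 lend themselves to a small date calculation. This is a sketch only: I am assuming the 15-day and 5-day windows are counted in calendar days from the date of receipt, which should be checked against the final legal text, and the complaint type labels are my own.

```python
from datetime import date, timedelta

# Windows taken from the requirement above; calendar days are an assumption.
RESOLUTION_DAYS = {"general": 15, "undue_charge": 5}


def resolution_deadline(received: date, kind: str = "general") -> date:
    """Latest date by which a complaint of this kind must be resolved."""
    return received + timedelta(days=RESOLUTION_DAYS[kind])


def is_overdue(received: date, kind: str, today: date) -> bool:
    """True once today is past the resolution deadline."""
    return today > resolution_deadline(received, kind)


general_due = resolution_deadline(date(2025, 3, 1))
billing_due = resolution_deadline(date(2025, 3, 1), "undue_charge")
```

    Storing the computed deadline next to the tracking number makes it straightforward to surface cases that are about to breach, and to show auditors that the check exists.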

    Most support systems weren’t built for this level of speed or operational rigor. But the steps required to comply are the same ones that make service better for customers—and better for the teams delivering it. That’s why I view AI as an essential capability, not a bolt-on.

    With the regulatory expectations clear, the question becomes: what does a modern, compliant support operation look like? For me, it blends human empathy with intelligent automation, proving auditability without sacrificing experience.

    This is where AI plays a meaningful role. Not as a replacement for humans, but as a reliable front line that can handle a wide range of queries, including the most complex ones that require real depth, while keeping queues under control.

    Adopting an AI Agent like Fin helps teams build a support model that meets regulatory expectations and improves customer experience across all your channels. Here’s how.

    Many organizations will struggle to meet the three-minute standard during normal times, let alone during spikes or busy seasons, without unsustainably scaling their teams. Fin can help by reducing the number of calls that reach your phone lines, and Fin Voice can help ensure the ones that do are handled quickly.
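    Before deciding how to staff or automate, it helps to measure where you stand against the three-minute standard. The sketch below computes the share of calls answered within 180 seconds from a list of wait times. The sample data is invented, and how abandoned calls or callbacks should be counted is an assumption I have not verified against the law.

```python
def share_answered_within(
    wait_seconds: list[float], limit_seconds: float = 180.0
) -> float:
    """Fraction of calls whose wait time was at or under the limit."""
    if not wait_seconds:
        raise ValueError("no calls to evaluate")
    answered_in_time = sum(wait <= limit_seconds for wait in wait_seconds)
    return answered_in_time / len(wait_seconds)


# Invented wait times in seconds for ten calls.
waits = [20, 45, 170, 181, 60, 30, 15, 90, 120, 200]

share = share_answered_within(waits)
meets_standard = share >= 0.95  # the 95% threshold from requirement 1
```

    Running the same calculation per hour or per day, rather than per month, shows whether the shortfall comes from spikes or from baseline capacity.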

    Reducing avoidable call volume before it reaches the queue

    Many of the queries teams receive are predictable: outage updates, billing questions, account changes, and other repeatable issues. Fin can resolve these instantly across several channels, including live chat, SMS, email, and WhatsApp, using the content and processes your team already maintains. I’ve seen this alone cut peak-time pressure dramatically.

    Answering the phone immediately

    For customers who do call, Fin Voice can pick up straight away. It provides natural, conversational responses based on your existing knowledge and helps your team stay responsive during busy periods.

    Making it easier to reach a human during spikes

    When queues build up, Fin can capture the reason for the call, gather details, and prioritize the most urgent issues. If you offer callback options, Fin can help schedule them quickly so customers avoid long wait times, which is key for staying compliant during peak periods.

    The law requires customers to reach a real person whenever they request one. Fin supports this by keeping the path to a human clear and dependable: every interaction includes an option to speak to a person, and that option is accessible until the issue is resolved; when chosen, Fin hands over full context so human teams don’t start from scratch; if you show team availability or wait times, Fin can surface that information for customers; escalations can be prioritized to ensure faster pickup; alerts can notify on-call staff when urgent issues arise. On the phone, Fin Voice follows the same principle. Callers can request a transfer at any moment, and Fin routes the call to the right team with context intact.

    Essential-service providers must be reachable at any hour when customers need to report service interruptions. Fin can help you meet this requirement without building a full overnight staffing model.

    Always-on answers and triage

    Fin provides first-line support at any hour of the day or night. Fin Voice brings this capability to the phone, giving callers immediate help even when your human team is offline. Fin can also direct customers to the latest updates you’ve published, such as outage information or status pages.

    Routing urgent issues to the right people

    When an issue requires human judgment, Fin gathers the necessary details and routes it to the appropriate on-call team using your existing after-hours processes. Teams can set up notifications so urgent issues are seen quickly.

    Proactively surface what matters most

    With AI Insights, Fin can also monitor for emerging patterns in customer conversations through Trending Topics. This means that if there’s a sudden spike in reports about a specific outage or a recurring question about a new process, Fin can flag these trends in real time. Your team is alerted to what’s top-of-mind for customers, so you can prioritize updates, publish targeted FAQs, or escalate critical issues, ensuring your support stays relevant and responsive, even overnight.

    Complaints and outages often create the biggest spikes in volume, and the new law increases pressure to respond quickly, keep customers informed, and maintain complete records. This is exactly where structured AI intake adds value.

    A more structured complaint intake

    Fin can recognize when a customer is lodging a complaint, gather required information, and initiate a record in your existing system with a clear ID assigned from the outset.

    Clear ownership and deadline alignment

    Your team can then use your case-management tools to apply the 15-day resolution timeline (or five days for undue charges). Fin’s structured intake helps ensure that ownership and next steps are visible, rather than buried in unstructured notes.
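
    To make the intake-plus-deadline idea concrete, here is a minimal Python sketch. The `Complaint` record, the `open_complaint` helper, and the `CMP-` ID format are hypothetical and not part of Fin or any case-management product. It also assumes the 15-day and five-day windows are counted in calendar days, which you should confirm against the text of the law.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from uuid import uuid4

# Resolution windows taken from the article; treating them as calendar
# days is an assumption of this sketch.
RESOLUTION_DAYS = {"general": 15, "undue_charge": 5}


@dataclass
class Complaint:
    customer_id: str
    kind: str  # "general" or "undue_charge"
    opened_on: date
    # Hypothetical ID format, assigned as soon as the record is created.
    complaint_id: str = field(default_factory=lambda: f"CMP-{uuid4().hex[:8]}")

    @property
    def due_on(self) -> date:
        """Date by which the complaint should be resolved."""
        return self.opened_on + timedelta(days=RESOLUTION_DAYS[self.kind])

    def is_overdue(self, today: date) -> bool:
        """True once the resolution date has passed."""
        return today > self.due_on


def open_complaint(customer_id: str, kind: str, opened_on: date) -> Complaint:
    """Create a complaint record with an ID assigned at intake."""
    if kind not in RESOLUTION_DAYS:
        raise ValueError(f"unknown complaint kind: {kind}")
    return Complaint(customer_id=customer_id, kind=kind, opened_on=opened_on)
```

    Keeping the deadline as a derived property means the owner, the ID, and the due date always travel together in one record.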

    Faster, more consistent outage communications

    During service interruptions, Fin can share the latest published information, provide estimated fix times when available, and direct customers to live updates. On the phone, Fin Voice can triage incident-related calls quickly so callers aren’t waiting for a human agent just to receive basic information.

    While multilingual support is only mandatory for large companies operating in co-official language regions, it remains essential for meeting consumer expectations. Fin helps by supporting multilingual, natural language interactions across voice and other channels; operating within channels that support accessibility features, like channels compatible with screen readers or commonly used messaging apps; and offering “request a call” paths and collecting the necessary information up front so teams can follow up quickly for customers who prefer phone support.

    The law prohibits customer service interactions from generating additional revenue or being used to offer new products. With Guidance, you can set Fin up to stay firmly within these boundaries by shaping how it responds, which topics it should avoid, and what it should prioritize when a customer is seeking help or lodging a complaint.

    The law raises expectations around documentation and audit readiness. Fin helps by making customer interactions more structured and consistent. When a conversation involves a complaint, Fin can ensure the required information is captured and a clear ID is assigned, and that ID can follow the interaction so it remains easy to trace. Consistent intake gives you better visibility into the key metrics regulators care about, like response times, time to first human contact, escalation volume, and whether complaints are resolved within required timelines. Transcripts, summaries, and metadata can be retained until cases are resolved, supporting audit requirements. Many organizations maintain internal compliance playbooks outlining processes and owners, and Fin’s structured intake helps keep these practices reliable. Finally, you can use Insights to identify trending topics, optimize processes, and measure service quality.

    Spain’s new customer service law raises the bar on speed, access, and accountability. It’s natural to worry about how your team will cope, especially if your support operation has grown organically across tools and regions. I’ve seen how quickly burnout and chaos can set in when expectations rise faster than capacity.

    The reality is that meeting these expectations through people alone would put unsustainable pressure on already stretched support teams. The risk of burnout and operational chaos is real, which is why an AI Agent like Fin can bring welcome relief.

    By handling everything from high-volume, repetitive questions to many of the deeper, more involved issues customers raise, Fin keeps queues manageable and prevents the strain from falling entirely on your human team, helping everyone stay above water as expectations rise.

    For companies operating across the EU, adapting early to Spain’s stricter expectations can build resilience for whatever comes next—whether that ends up being driven by regulation or customer demand. Now is the time to align compliance, AI strategy, and customer experience into a single, measurable operating model.


    Inspired by this post on The Intercom Blog.


  • AIUC-1 Certified: How Intercom Raises the Bar for Trustworthy, Enterprise-Ready AI Agents

    AIUC-1 Certified: How Intercom Raises the Bar for Trustworthy, Enterprise-Ready AI Agents

    I build products on the belief that trust is earned in every design decision and every deployment. Trust has always been a first principle at Intercom, from our early investments in security and privacy to the globally recognized certifications that shape our approach today.

    As AI becomes more deeply embedded in customer-facing work, it’s essential that businesses can rely on systems that are safe, reliable, and governed to the highest standards. That’s why we’re proud to share that Intercom is now AIUC-1 certified, becoming one of the first companies to meet the world’s first standard designed specifically for AI Agents. For leaders navigating AI Strategy and AI risk management, this is more than a badge—it’s a measurable leap forward in governance and operational rigor.

    AIUC-1 is the first certification tailored to the unique risks and challenges of AI Agents. It complements broader AI governance frameworks like ISO 42001 by focusing on enterprise-specific concerns like security, customer safety, system reliability, data and privacy, society, and accountability. In practice, this alignment helps us translate policy into deployable safeguards across cybersecurity, data governance, and regulatory compliance.

    To achieve certification, organizations undergo independent third-party audits and quarterly adversarial testing across more than a thousand enterprise risk scenarios. This continuous technical evaluation ensures that AI systems remain robust against fast-evolving threats and that safeguards keep pace with rapid progress in the field. As a product leader, I welcome this level of scrutiny—it’s how we operationalize threat detection and response and make agentic AI dependable at scale.

    AIUC-1 itself evolves every quarter, incorporating new research, threat patterns, and global best practices. The standard is shaped by the AIUC-1 Consortium, launched in November with more than 50 founding members who collectively handle tens of trillions of dollars in payments and serve over a billion people daily. Intercom is proud not only to be certified, but to be recognized as a founding technical contributor helping shape the development of the standard. That continuous, community-driven iteration mirrors how we build—measure, learn, and harden—so our customers benefit from real-world, enterprise-ready AI.

    Intercom has decades of combined experience in security, compliance, and trust, and we’ve consistently demonstrated that robust governance and fast innovation can coexist. Achieving AIUC-1 certification reinforces that the same rigor we apply across our platform also extends to Fin, our AI Agent. I’ve seen first-hand how risk and procurement teams evaluate generative AI: they expect clarity, evidence, and controls. This certification delivers independent proof that our approach meets those expectations.

    For our customers, this certification provides independent validation that Intercom’s AI systems are safe, resilient, and enterprise-ready. It confirms that our AI is tested regularly, built with strong safeguards, and aligned with the expectations of modern security and risk teams. It also signals our continued leadership in shaping responsible AI practices globally, ensuring our customers benefit from standards built for real-world use. In short, you can move faster with confidence—without compromising on governance.

    Intercom has always approached trust as an ongoing commitment. AIUC-1 strengthens the foundation we’ve built across other frameworks and certifications, including SOC 2, ISO 27001, ISO 27701, ISO 27018, HIPAA, HDS, and ISO 42001. Together, these certifications create a comprehensive control fabric across privacy, security, and reliability—critical pillars for any enterprise deploying gen AI into production workflows.

    As AI technology accelerates, we will continue to evolve our safeguards, deepen our governance practices, and contribute to the standards that shape responsible AI. Our promise is simple: to build AI that is not only powerful and efficient, but safe, transparent, and deserving of the trust our customers place in us. That’s how we turn innovation into durable value.

    You can learn more about our certifications and access our security and compliance documentation through the Intercom Trust Center.

    Get started with Fin and see how an AIUC-1 certified, enterprise-ready AI Agent can elevate your customer experience with confidence.


    Inspired by this post on The Intercom Blog.


  • Beyond Accuracy: The Trust-First Evaluation Metrics I Use to Scale High-Impact AI Products

    Beyond Accuracy: The Trust-First Evaluation Metrics I Use to Scale High-Impact AI Products

    When I assess whether an AI product is ready for prime time, I start with trust—not model accuracy. Accuracy is table stakes; trust is what earns adoption, drives retention, and unlocks durable product-led growth.

    Evaluation metrics in AI products go beyond accuracy. Learn how product teams use trust-driven metrics to build reliable, growth-driving AI systems.

    In practice, I organize trust-driven metrics into four layers: model quality and safety, user and business outcomes, operational reliability and cost, and governance and compliance. This layered approach keeps product trios aligned on what matters now, what must be gated in CI/CD, and what signals we’ll use to prove progress against outcomes vs output OKRs.

    On model quality and safety, I care about precision, recall, F1, calibration, and abstention behavior, but also the hard-to-fake signals: hallucination rate, grounding and faithfulness, citation coverage, toxicity, bias, and fairness. For generative systems, I instrument refusal correctness (declining unsafe requests) and evidence adequacy (did the answer rely on retrieved, trustworthy sources).
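
    For readers who want the arithmetic behind the first three metrics, here is a small Python sketch. The function names are my own, and the hallucination rate assumes you already have per-answer review labels (from human reviewers or an automated judge), which is an assumption about your setup rather than something the source prescribes.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def hallucination_rate(flagged: list[bool]) -> float:
    """Share of reviewed answers flagged as containing unsupported claims."""
    return sum(flagged) / len(flagged) if flagged else 0.0
```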

    User and business outcomes must be explicit. I track adoption, activation, task success rate, time to first value, win rate uplift in assisted workflows, CSAT and NPS deltas, and retention analysis by cohort exposed to AI features. For customer support scenarios, deflection rate, average handle time change, and first-contact resolution are core; for sales or ops copilots, I monitor cycle-time reduction and error-rate reduction in critical tasks.

    Experimentation is non-negotiable. I design A/B testing with a clear minimum detectable effect (MDE), pre-registered guardrails for safety and quality, and sequential tests that stop early if harm outpaces benefit. Online metrics are always paired with offline evals so we can iterate quickly without exposing users to regressions.
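
    To show how an MDE turns into a traffic requirement, here is a rough Python sketch using the standard normal-approximation formula for a two-sided test on two proportions. The function name and the default alpha and power values are my assumptions, not settings taken from the source.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users per arm for a two-sided, two-proportion z-test.

    baseline: current conversion or success rate (0..1)
    mde_abs:  smallest absolute lift worth detecting (e.g. 0.02 = 2 points)
    """
    if mde_abs == 0:
        raise ValueError("mde_abs must be non-zero")
    p1, p2 = baseline, baseline + mde_abs
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("rates must stay strictly between 0 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)
```

    With a 10% baseline and a two-point absolute MDE, this gives roughly 3,800 users per arm at the default settings. Sequential designs that stop early need their own corrections, which this sketch does not cover.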

    Operationally, trust shows up as speed, stability, and cost predictability. I track latency end-to-end, time to first token, throughput, rate of 5xx and timeouts, cost per request, and caching effectiveness. We also trend safety incidents per 10,000 interactions and mean time to mitigation to keep reliability visible alongside performance.

    Governance and compliance are part of the product, not an afterthought. Data governance and privacy-by-design metrics include PII exposure rate, data lineage coverage, access-control correctness, audit pass rate against internal policies, and model and prompt change traceability. This is the backbone of our AI risk management posture and accelerates regulatory compliance reviews instead of slowing them down.

    The delivery engine for all of this is eval-driven development. We maintain golden datasets and scenario-based test suites that mirror real user intents, gate releases in CI/CD with minimum thresholds, and run canary rollouts to validate offline–online alignment. Every model or prompt update gets a comparable scorecard so product, engineering, and design can trade off quality, speed, and cost with shared facts.
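
    A release gate of this kind can be very small. The Python sketch below is illustrative only: the metric names and threshold values are placeholders I made up, and `failed_checks` is a hypothetical helper, not part of any CI product.

```python
def failed_checks(scorecard: dict[str, float],
                  minimums: dict[str, float],
                  maximums: dict[str, float]) -> list[str]:
    """Return a human-readable list of threshold violations.

    A metric missing from the scorecard counts as a failure, so a broken
    eval run cannot pass the gate silently.
    """
    failures = []
    for name, floor in minimums.items():
        value = scorecard.get(name)
        if value is None or value < floor:
            failures.append(f"{name}: {value} is below minimum {floor}")
    for name, ceiling in maximums.items():
        value = scorecard.get(name)
        if value is None or value > ceiling:
            failures.append(f"{name}: {value} is above maximum {ceiling}")
    return failures
```

    In a pipeline, the calling script would print the failures and exit with a non-zero status whenever the list is not empty, which blocks the release.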

    For LLM-heavy features, retrieval-first pipeline metrics are mandatory. I monitor retrieval hit rate, recall at K, mean reciprocal rank, context contamination, and citation correctness. With large prompts, context window management matters: we track context utilization, truncation rate, and the contribution of each context block to final answers to avoid silently losing critical evidence.
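
    Two of these retrieval metrics are easy to compute once you have labeled queries. The Python sketch below assumes each evaluation query comes with a ranked list of retrieved document IDs and a set of IDs judged relevant; the function names and data shapes are my own.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)
```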

    Finally, trust must be legible. I package these metrics into an executive scorecard that maps to business outcomes, risk appetite, and OKRs, with clear thresholds for ship, improve, or roll back. When teams can articulate trade-offs—say, a 20% latency reduction at a small cost increase, or a lower hallucination rate at the expense of higher abstention—they build credibility with stakeholders and confidence with customers.

    Trust is not a single number; it’s a system of evidence. By instrumenting these layers and operationalizing AI Strategy with rigorous, transparent metrics, we can ship faster, reduce surprises, and earn the right to scale AI features across the product portfolio.


    Inspired by this post on Product School.


  • Vibe Check Part 3: 5 Costly Vibe Marketing Mistakes—and How I Use AI to Avoid Them

    Vibe Check Part 3: 5 Costly Vibe Marketing Mistakes—and How I Use AI to Avoid Them

    Vibe marketing can electrify a brand, but it can also derail a strategy if it outruns the fundamentals. I have seen campaigns with breathtaking creative fall flat because the message had no anchor in product truth, no measurable goals, and no operational guardrails. In this installment, I share the patterns I watch for, the diagnostics I run, and the AI tools I use to keep the vibe aligned with outcomes.

    Learn how to avoid the five most common mistakes in vibe marketing to have more success with AI marketing tools.

    At its best, vibe marketing translates product positioning and value proposition into an emotional signal customers immediately recognize. At its worst, it becomes mood without meaning. The difference is disciplined product management: clear go-to-market strategy, outcomes vs output OKRs, rigorous A/B testing, and a feedback loop that connects creative choices to customer behavior.

    Mistake 1: Mistaking mood for strategy. Early drafts often lean on catchy lines or trending aesthetics that don’t map to customer jobs-to-be-done or competitive differentiation. When I feel that drift, I force the team to articulate the core product promise, restate the positioning, and tie each headline to a measurable outcome. If a message cannot be traced to a specific hypothesis, audience, and metric, we rewrite it before it ships.

    Mistake 2: Chasing trends instead of customer truth. Vibes built on whatever is viral this week rarely produce compounding learnings. I push for continuous discovery with interviews, in-product surveys, and sentiment analysis, then let gen AI generate multiple narrative variants grounded in actual quotes and objections. We evaluate with A/B testing and an explicit minimum detectable effect so we don’t declare victory on noise. That keeps our experimentation eval-driven, not anecdote-driven.

    Mistake 3: Measuring vanity, not meaning. Reach and likes can be directional, but I optimize for activation, time-to-value, retention analysis, and conversion lift across the funnel. I instrument journeys in a unified analytics platform with Amplitude analytics and CRM integration so we can connect vibe exposure to outcomes. If the creative lifts click-through but hurts downstream activation, it’s not working—no matter how cool it looks.

    Mistake 4: One vibe for every segment and channel. Audiences experience value differently, so the same creative rarely works in ads, landing pages, and in-app guides. I use LLMs for product managers and CustomGPT workflows to adapt the message by segment and stage, then validate with product tours, in-app prompts, and targeted lifecycle emails. The goal is coherence, not uniformity: a consistent story tuned to the context where decisions happen.

    Mistake 5: Unbounded AI experimentation. Without AI risk management and data governance, teams can unintentionally ship off-brand or non-compliant copy. I set privacy-by-design standards, define approval thresholds, and establish context window management so models stay on-brief and on-policy. We log generations, review outputs against brand guidelines, and use retrieval to ground messaging in approved claims.

    My practical playbook is simple: define the hypothesis tied to positioning, generate creative options with gen AI, pre-qualify with qualitative feedback, run A/B tests with clear success criteria, and iterate only on variants that move a business metric. Product trios align weekly on learnings so marketing signals and product-led growth motions reinforce each other. When the vibe matches the value and the data, momentum compounds.

    Vibe marketing is not the opposite of rigor; it is rigor expressed emotionally. With the right AI strategy, measurement discipline, and governance, the creative spark becomes a durable advantage—and your brand earns the right to keep the spotlight.


    Inspired by this post on Amplitude – Perspectives.


  • From No-Code Hack to 10,000 Weekly Calls: Inside Perk’s Voice AI That Actually Works

    From No-Code Hack to 10,000 Weekly Calls: Inside Perk’s Voice AI That Actually Works

    I love real-world AI that ships, scales, and actually solves painful customer problems. This story checks every box. As a product leader who has brought agentic AI to production environments, I was captivated by how a small, focused team at Perk took a no-code voice AI prototype and turned it into a system that reliably makes 10,000+ calls per week to prevent failed hotel payments.

    What happens when you combine a real customer problem, a no-code prototype, and a team willing to listen to every single call?

    Steven Payne (Product Manager), Gabriel Stock (Senior Engineering Manager), and Philipe Steiff (Senior Software Engineer) from Perk share how they built a voice AI agent that calls hotels to verify virtual credit card payments, preventing travelers from arriving to find their rooms unpaid. This is a textbook example of linking operational pain to a high-leverage AI solution.

    What started as a hackathon experiment in Make.com became a production system handling over 10,000 calls per week across multiple languages. Along the way, the team learned hard lessons about prompt engineering for voice (numbers, pronunciation, and a very "Karen-like" first version), how to break a single monolithic prompt into structured conversation stages, and why listening to actual calls beats any amount of theorizing.

    From a product management perspective, this approach aligns perfectly with eval-driven development and continuous discovery. Structure the problem, instrument aggressively, ship safely, then listen—deeply—to real interactions. In my own teams, I’ve seen that nothing accelerates iteration on agentic AI like closing the loop between qualitative call reviews and quantitative evals.

    They built a working prototype without writing a single line of backend code.

    They structured the call into discrete stages (IVR, booking confirmation, payment) to improve reliability.

    They created two eval systems: one for call success classification, another for conversational behavior.

    They scaled from five calls a day to more than 10,000 per week while maintaining quality.

    This is a detailed look at building AI for real-time human interaction—where the stakes are high and the feedback is immediate.

    Guests: Steven Payne, Product Manager, Perk; Gabriel Stock, Senior Engineering Manager, Perk; Philipe Steiff, Senior Software Engineer, Perk.

    What stood out to me was how Perk's team identified an AI use case by connecting prior experimentation with a real operational problem. Their choice of Make.com for prototyping, and the fact that they shipped to production without touching backend code, underscores how far no-code can take you when paired with crisp problem framing. The evolution from a single prompt to structured conversation stages (IVR handling, booking confirmation, payment request) is exactly how you harden agent behavior for production.
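
    To illustrate what "structured conversation stages" can look like, here is a minimal Python sketch of a stage machine. It is not Perk's implementation, which the episode summary does not detail. The stage names follow the three stages mentioned above, while the prompt text and the `advance` helper are hypothetical placeholders.

```python
from enum import Enum


class Stage(Enum):
    IVR = "ivr"
    BOOKING_CONFIRMATION = "booking_confirmation"
    PAYMENT = "payment"
    DONE = "done"


# Placeholder prompts: each stage gets a short, focused instruction
# instead of one monolithic prompt covering the whole call.
STAGE_PROMPTS = {
    Stage.IVR: "Navigate the phone menu until a person at the hotel answers.",
    Stage.BOOKING_CONFIRMATION: "Confirm the hotel can find the booking.",
    Stage.PAYMENT: "Ask whether the virtual card payment has been processed.",
}

NEXT_STAGE = {
    Stage.IVR: Stage.BOOKING_CONFIRMATION,
    Stage.BOOKING_CONFIRMATION: Stage.PAYMENT,
    Stage.PAYMENT: Stage.DONE,
}


def advance(stage: Stage, goal_met: bool) -> Stage:
    """Move to the next stage only when the current stage's goal is met."""
    if stage is Stage.DONE or not goal_met:
        return stage
    return NEXT_STAGE[stage]
```

    Because each stage has one goal and one prompt, a failed call can be traced to a specific stage rather than to the conversation as a whole.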

    Breaking up the agent's task dramatically improved reliability. They also built two eval systems: classification for success rates and LLM-as-judge for conversational behavior. Even with automation, the team still listens to calls manually—a practice I strongly endorse for uncovering edge cases, trust issues, and UX nuances that dashboards can’t show.

    The challenge of prompt engineering for voice (numbers, booking references, and text-to-speech markup) was non-trivial. Expanding to German revealed that prompts written in the native language improve results. And, as often happens with operations-heavy rollouts, this project uncovered other operational problems they didn't know existed, a valuable signal for the roadmap.
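
    One common way to handle the numbers and booking references problem is to normalize them into a speakable form before they reach the text-to-speech engine. The Python sketch below shows that idea under my own assumptions. The summary does not say this is what Perk did, the `speak_reference` helper is hypothetical, and the output is English only.

```python
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}


def speak_reference(reference: str) -> str:
    """Spell a booking reference one character at a time.

    Digits become words and letters are upper-cased, so the voice reads
    "1203" as four separate digits rather than as a single large number.
    Separators such as "-" or spaces are dropped.
    """
    tokens = []
    for ch in reference:
        if ch in DIGIT_WORDS:
            tokens.append(DIGIT_WORDS[ch])
        elif ch.isalpha():
            tokens.append(ch.upper())
    return ", ".join(tokens)
```

    The commas are there to encourage a short pause between characters. How a given voice actually renders them depends on the provider, so this needs checking by listening to real output.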

    Resources & Links: Perk. Make.com — No-code automation platform used for the prototype. Twilio — Voice/telephony provider. Eleven Labs — Text-to-speech provider (used in early experiments).

    Chapters: 00:00 Introduction to the Team; 01:54 Understanding PERK's Mission; 02:59 Challenges in Travel Booking; 07:27 AI Solutions for Customer Care; 09:52 Prototyping with AI and Voice; 17:00 Implementing AI in Production; 25:51 Learning Through Trial and Error; 26:40 Prompting Challenges and Solutions; 27:58 Iterating on Prompts and Evaluations; 30:08 Scaling and Production Challenges; 32:43 Advanced Evaluation Techniques; 35:32 Real-World Applications and Success; 49:07 Future Directions and Expansion; 53:53 Conclusion and Team Reflections.

    My product takeaways: Start with clear operational pain and measurable outcomes (e.g., payment verification). Use no-code to validate quickly, then progressively harden. Treat voice AI like any production system: break it into deterministic stages, add guardrails, and measure both outcome and behavior. Pair automated evals with hands-on reviews. And when going multilingual, write prompts in the native language—your accuracy will thank you.

    If you’re exploring agentic AI for operations, this is the blueprint: tight scoping, Make.com for speed, Twilio for reliability, structured prompts for control, and an eval-driven loop to scale quality with confidence.


    Inspired by this post on Product Talk.


  • Crack the AI Search Code: How Startups Win Recommendations in ChatGPT and Perplexity

    Crack the AI Search Code: How Startups Win Recommendations in ChatGPT and Perplexity

    AI search is reshaping how customers discover emerging products, and I’ve seen firsthand how this shift rewards startups that speak clearly to both humans and machines. Learn how LLMs like ChatGPT and Perplexity decide which startups to recommend and what signals help a brand get discovered in AI search.

    In practice, AI search behaves less like a list of blue links and more like a synthesis engine. These models look for credible, consensus-backed, well-structured sources they can cite with confidence. That means your brand’s discoverability hinges on technical clarity (schema, structure, speed), topical authority (depth, citations, expert bylines), and evidence of real-world adoption (reviews, case studies, third-party validation).

    I start by mapping buyer intent across the entire journey—category exploration, problem framing, solution fit, integration needs, ROI, and competitive comparisons. Then I design a page system that answers each intent with precision: clear “About” and “Use Cases” pages, integration-specific pages, objective "X vs Y" comparisons, transparent pricing, and a living FAQ that mirrors the exact questions users ask in conversational queries.

    Structure matters. I add JSON-LD schema for Organization, Product, FAQPage, HowTo, and Article where appropriate; keep canonical URLs consistent; and ensure titles, meta descriptions, and Open Graph data reinforce the same story. Clean sitemaps, a sensible robots.txt, and fast, mobile-first performance reduce friction for crawlers and increase the odds that LLMs extract accurate snippets.
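
    As an example of the FAQPage markup, here is a short Python sketch that builds the JSON-LD from question and answer pairs. The property names follow the schema.org FAQPage type. The `faq_jsonld` helper and the sample content are my own.

```python
import json


def faq_jsonld(qa_pairs: list[tuple[str, str]]) -> str:
    """Build schema.org FAQPage JSON-LD from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    print(faq_jsonld([("Does it integrate with Slack?", "Yes, via a native app.")]))
```

    The resulting string goes inside a `<script type="application/ld+json">` tag on the page, and the questions should match the visible FAQ text.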

    Authority is earned off-site as much as on-site. I prioritize third-party signals—G2/Capterra reviews, analyst mentions, reputable press, open-source repos with README clarity, academic or industry citations, and credible partner integrations. LLMs heavily weight these external proofs when recommending solutions, especially for B2B and regulated categories.

    On your site, demonstrate expertise. I include expert bylines with real credentials, cite primary sources, showcase customer outcomes with verifiable metrics, and make methodologies transparent. Shallow, keyword-stuffed posts don’t help; comprehensive, up-to-date explainers with references do.

    Make your content retrieval-friendly. LLMs favor text they can segment, anchor, and quote. I structure pages with descriptive headings, short paragraphs, and linkable anchors; offer HTML-first documentation (not just PDFs); and provide copyable code or configuration steps when relevant. This also sets you up for a retrieval-first pipeline in your own product experiences.

    From a product and platform angle, I expose trustworthy documentation and a clear trust center—security, compliance, data governance, and privacy-by-design content. When a user asks an LLM whether they can safely deploy your solution, these pages often get pulled into the answer.

    Evaluation closes the loop. I run an eval-driven development process for content: a stable prompt set that mirrors real queries, regular tests in both Perplexity and ChatGPT, and analytics to track referrals from AI-driven sources. I iterate headlines, schema, and on-page structure, then tie changes back to engagement and pipeline using A/B testing where it’s appropriate.
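
    One simple number to track from that prompt set is how often the brand shows up in the answers. The Python sketch below assumes you have already collected the answer text for each prompt, however you gather it. The `mention_rate` helper and the brand name in the usage note are hypothetical.

```python
import re


def mention_rate(brand: str, responses: list[str]) -> float:
    """Share of responses that mention the brand as a whole word.

    Matching is case-insensitive, and word boundaries stop a brand such
    as "Acme" from matching inside a longer word like "Acmeify".
    """
    if not responses:
        return 0.0
    pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    hits = sum(1 for text in responses if pattern.search(text))
    return hits / len(responses)
```

    Running `mention_rate("Acme", answers)` on the same prompt set after each content change gives a comparable trend line, though a mention alone says nothing about whether the recommendation was favorable.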

    Don’t neglect comparison and alternatives pages. Fair, well-cited pages that address trade-offs and points of parity build trust—and they give LLMs succinct, quotable language for recommendation contexts. Clarity beats hype every time.

    Finally, keep your corpus fresh. I schedule quarterly content reviews, retire outdated claims, and highlight release notes and integration updates. Freshness signals help models favor your content when they resolve time-sensitive queries.

    If you treat AI search as a product surface—one that rewards precision, provenance, and performance—you’ll dramatically increase your odds of being recommended where it matters. That’s how I operationalize AI discovery for startups: intent mapping, structured content, external authority, a retrieval-friendly corpus, and a rigorous eval loop.


    Inspired by this post on Amplitude – Perspectives.

