Category: Generative AI

  • Escape the “It’s Just an LLM” Trap: Inside Operator, a Reliable, Actionable AI Agent


    We just launched Operator, an Agent for your customer operations that helps you understand, manage, and improve your entire customer experience. I’ve spent years shipping AI-driven products at production scale, and this one reflects the lessons I’ve learned the hard way about what it really takes to go from a flashy demo to a dependable system your team trusts.

    To show what makes this Agent dependable, I want to share the technical infrastructure and engineering choices that let Operator work reliably at production scale across thousands of customer workspaces. My goal is to help you make an informed build vs. buy decision.

    If you’re a technical leader evaluating whether to build something like this yourself, or trying to understand the difference between a well-prompted LLM and a production Agent system, this is for you.

    Escaping the “it’s just an LLM” trap

    Most engineering teams in this space start the same way: a prototype. You take a foundation model, give it API access to your support data, add a system prompt with some domain context, and you’ve got something that queries your database, summarizes tickets, and generates reports that look right. It demos convincingly—and I’ve been there, impressed in the moment, only to watch it buckle under real-world complexity.

    The problem with that prototype is that it obscures the scope of what’s actually required. It demonstrates the 10% of the system that’s straightforward to build, and it’s easy to assume the rest is just as straightforward. It isn’t. The gap between a working demo and a production system your team depends on daily is where most of the engineering investment lives. That’s precisely the gap we focused on closing.

    With Operator, we’ve invested deeply in every layer: tooling, reasoning, how the Agent takes action, and the infrastructure that makes it reliable at scale. Here’s a closer look at the architecture and why it matters for agentic AI, platform scalability, and observability.

    The tooling layer

    The first thing we had to confront was that the obvious approach (giving a model access to your APIs and letting it figure things out) doesn’t hold up in production. The model makes reasonable decisions for simple queries, but when you operate across thousands of customer workspaces with different configurations, data models, and usage patterns, a “figure it out” approach isn’t nearly precise enough.

    What you need is purpose-built tooling: tools that encode decisions about what data to fetch, how to structure it, what context to include, and what to leave out. Operator has over 50 of these tools and 10 skills.

    A tool is a single action that Operator takes (search content, run a query, look up a conversation). A skill chains multiple tools together to complete a whole job, like debugging a conversation end-to-end, rolling out a content update across an entire help center, or identifying the next automation opportunity. This is where AI workflows move from abstract prompts to dependable, repeatable outcomes.
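
    To make the distinction concrete, here is a minimal sketch of a skill as an ordered chain of tools. Every name in it (`lookup_conversation`, `search_content`, `debug_conversation`, `run_skill`) and the in-memory data are hypothetical illustrations, not Operator’s actual tools or API.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-ins for workspace data.
CONVERSATIONS = {"c1": {"topic": "refunds", "escalated": True}}
ARTICLES = [
    {"title": "How refunds work", "topic": "refunds"},
    {"title": "Changing your plan", "topic": "billing"},
]

# A tool performs one action: it reads the shared context and returns new facts.
Tool = Callable[[Dict], Dict]


def lookup_conversation(ctx: Dict) -> Dict:
    """Tool: fetch a single conversation by id."""
    return {"conversation": CONVERSATIONS[ctx["conversation_id"]]}


def search_content(ctx: Dict) -> Dict:
    """Tool: find help articles on the same topic as the conversation."""
    topic = ctx["conversation"]["topic"]
    return {"articles": [a["title"] for a in ARTICLES if a["topic"] == topic]}


def run_skill(tools: List[Tool], ctx: Dict) -> Dict:
    """A skill chains tools: each tool sees everything earlier tools produced."""
    ctx = dict(ctx)  # copy so the caller's input is not mutated
    for tool in tools:
        ctx.update(tool(ctx))
    return ctx


# Hypothetical skill: debug a conversation by chaining the two tools above.
debug_conversation = [lookup_conversation, search_content]

result = run_skill(debug_conversation, {"conversation_id": "c1"})
print(result["articles"])
```

    The point of the design is that the order and the data handed from one step to the next are fixed in code, so the outcome does not depend on a model rediscovering the sequence each time.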

    The difference between using thin wrappers around API endpoints and purpose-built tooling shows up in something as seemingly simple as a performance question. When you ask “how did Fin perform last week?”, a naive implementation runs a query and hands back a table. Operator runs a reporting tool that determines which metrics are relevant for your specific workspace, which are meaningful for your particular question, and what the numbers actually mean in context, so you get a richer answer that you can act on.

    Developing that behavior took months of engineering. Not because any individual piece is conceptually hard, but because getting it right across the full range of customer workspaces, configurations, and edge cases is an iterative process. You build it, you test it against real conversations, you find the cases where it breaks, you fix those, and you repeat. There’s no shortcut—and in practice, this is where most DIY efforts stall.

    The intelligence layer

    The tooling layer solves what to do, but beneath it is a harder problem: understanding what’s worth doing, and why. This is the layer that makes Operator understand your business rather than just query it. Three components go into it, and in my experience they’re non-negotiable for a reliable Agent.

    1. Semantic search

    Unlike solutions that rely on keyword matching, Operator uses a system that understands what content is about, not just what words it contains. When it searches your help center, it’s using the same semantic search engine we’ve spent years optimizing for Fin itself. This is a retrieval system that’s been tuned against millions of real support conversations, with precision and recall characteristics we’ve measured and improved continuously. This retrieval-first pipeline is the backbone of grounding and dramatically reduces hallucinations.

    2. Attribute awareness

    Operator has access to your data and knows what is meaningful for different questions. It knows which metrics are actually in use in your workspace, which custom attributes carry signals, and which fields are populated versus effectively empty. We’ve built specific skills that give Operator this meta-knowledge, so when it’s investigating a performance question, it’s looking at the right things, not hallucinating insights from sparse data.
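
    One simple way to tell populated fields from effectively empty ones is to measure each field’s fill rate before drawing conclusions from it. The sketch below is an assumption about how such a check could look, not a description of Operator’s skills; the field names, the sample records, and the 0.5 threshold are all made up.

```python
from typing import Dict, List, Optional


def fill_rates(records: List[Dict[str, Optional[str]]]) -> Dict[str, float]:
    """Return, per field, the fraction of records holding a non-empty value."""
    if not records:
        return {}
    fields = {name for record in records for name in record}
    rates = {}
    for name in sorted(fields):
        filled = sum(1 for r in records if r.get(name) not in (None, ""))
        rates[name] = filled / len(records)
    return rates


def usable_fields(records: List[Dict[str, Optional[str]]], min_fill: float = 0.5) -> List[str]:
    """Keep only fields populated often enough to support an insight."""
    return [name for name, rate in fill_rates(records).items() if rate >= min_fill]


# Hypothetical workspace records with one sparsely populated attribute.
records = [
    {"plan": "pro", "region": "EU", "nps_segment": None},
    {"plan": "free", "region": "", "nps_segment": None},
    {"plan": "pro", "region": "US", "nps_segment": "promoter"},
    {"plan": "pro", "region": "US", "nps_segment": None},
]

print(fill_rates(records))
print(usable_fields(records))
```

    Here `nps_segment` is filled in only one record out of four, so it is excluded from analysis rather than used as the basis for a conclusion.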

    3. Intelligent reasoning

    A well-built Agent can answer your question and anticipate what you should ask next. If you ask Operator about escalations spiking, it doesn’t just say, “escalations increased 23% week-over-week.” It’ll continue on to tell you why this happened by examining the escalated conversations and identifying that a disproportionate number involved a specific product area, before moving on to check whether the relevant help content is up to date, and, if it isn’t, proposing an update. That chain of reasoning isn’t prompt engineering. It’s encoded in the skills we’ve built, refined against the patterns we see across our entire customer base.

    The action layer

    This is where the engineering complexity increases by an order of magnitude, because instead of just analyzing problems and recommending solutions, Operator takes action to solve them itself. It can update Guidance rules, draft and publish help articles, create Procedures, configure data connectors, and modify your Fin configuration. Moving from read-only insights to write-capable actions is a fundamentally different class of product and infrastructure problem, one that demands rigorous SRE practices and rock-solid safeguards.

    Every one of these actions has to be safe, reversible, and auditable. An analytics tool that occasionally returns a wrong number is frustrating, but an Agent that occasionally applies a wrong configuration change to a live support system is a different category of problem. To prevent this, we built a robust proposal system in which every change Operator suggests is presented as a reviewable diff. You see exactly what will change before anything is applied, with the option to accept, reject, or refine. Nothing goes live without your explicit approval.
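
    The propose, review, then apply pattern can be sketched with Python’s standard `difflib` module. This is an illustrative sketch under stated assumptions, not Operator’s implementation: the `Proposal` class, the `decide` function, the audit-log shape, and the article text are all hypothetical.

```python
import difflib
from dataclasses import dataclass


@dataclass
class Proposal:
    target: str       # identifier of the thing being changed
    current: str      # content at the time the proposal was made
    proposed: str     # content the agent suggests
    status: str = "pending"  # pending -> accepted | rejected

    def diff(self) -> str:
        """Render the change as a unified diff for human review."""
        lines = difflib.unified_diff(
            self.current.splitlines(),
            self.proposed.splitlines(),
            fromfile=f"{self.target} (current)",
            tofile=f"{self.target} (proposed)",
            lineterm="",
        )
        return "\n".join(lines)


def decide(proposal: Proposal, store: dict, audit_log: list, approved: bool) -> None:
    """Apply the change only on explicit approval; always record the decision."""
    proposal.status = "accepted" if approved else "rejected"
    # Keeping the previous content in the log is what makes a change reversible.
    audit_log.append((proposal.target, proposal.status, proposal.current))
    if approved:
        store[proposal.target] = proposal.proposed


store = {"refund-article": "Refunds take 5 days.\nContact support to start."}
audit_log = []

proposal = Proposal(
    target="refund-article",
    current=store["refund-article"],
    proposed="Refunds take 3 days.\nContact support to start.",
)
print(proposal.diff())
decide(proposal, store, audit_log, approved=False)  # reviewer rejects: nothing changes
```

    Because the store is written only inside the approved branch, a rejected proposal leaves live content untouched while still leaving an audit entry.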

    What else sets Operator apart

    A UI that’s both conversational and graphical, not one or the other. Operator blends conversational interaction with purpose-built graphical components. Proposal diffs show exactly what will change in an article. Inline charts visualize performance trends. Dashboards render directly inside the conversation thread. In practice, that means a knowledge manager reviews a structured diff—not a wall of LLM-generated text—and a team lead asking about weekly performance gets an accurate chart with context, not a paragraph approximating data.

    Building this hybrid experience is extremely difficult outside of a native platform integration. In a chat interface or CLI, you’re limited to text output; in a standalone dashboard, you lose conversational context. Operator does both in the same thread, so every interaction is detailed and context-rich—and importantly, actionable in the flow of work.

    It lives where your team already works. Operator is built into the same platform your team uses every day. It’s not a separate tool with a separate login, nor is it a Slack bot your engineer set up that only three people know about. It operates exactly where you are, alongside the conversations, help center articles, workflows, and data you’re working with. That tight integration closes the gap between finding a problem and fixing it: spot an outdated article while reviewing a Fin conversation, and Operator can surface the fix in the same session. Notice an escalation spike in the morning, and you can ask Operator to investigate without switching tools, waiting for a data pull, or filing a ticket.

    The compounding advantage

    Every customer using Operator teaches us something. We see which debugging approaches work across different types of support operations, learn which content structures perform better, and identify automation strategies that consistently land. Those patterns get encoded back into Operator’s skills and tools. When we discover that a particular sequence of investigation steps reliably identifies the root cause of a spike in escalations, we build that into Operator’s diagnostic skill. When we find that a specific way of structuring help articles leads to higher Fin resolution rates, we encode that into the content creation skill. Our engineering team is continuously shipping improvements based on what we observe across the entire customer base.

    A custom-built solution gives you exactly what you built, meaning it doesn’t get smarter unless you invest engineering resources into making it smarter. And that usually means taking time and talent away from your core product. I’ve watched teams underestimate the ongoing cost of eval-driven development, model upgrades, and API churn—costs that only grow as your footprint expands.

    We’re not locking the door

    Some teams want to build their own Agents. Some of our most technical customers do this. But when you do, you’re working with raw APIs and building your own tooling on top of them. When you use Operator, you’re working with a system that already knows what questions to ask, understands your data, and encodes the best practices we’ve learned from thousands of support teams. We recently launched the Fin CLI, which means you can use third-party agents like Claude Code or Cursor to interact with your Fin data and configuration. That door is open.

    What I hope this post has clarified is everything that goes into the build of Operator: over 50 tools and 10 skills, purpose-built for support operations; years of investment in semantic search; deep integration with every layer of Fin’s stack; the proposal system; the intelligence layer; and the reliability infrastructure.

    If you’d still like to move ahead with building a custom solution, here’s an honest assessment. You can build a useful read-only tool in weeks. It’ll query your data, summarize tickets, and generate reports, but turning it into a production system will take quarters. Reliability, security, edge case handling, multi-tenant data isolation, and graceful degradation all demand architectural decisions that you’ll need to get right from the start. The action layer is where you’re most likely to stall. Going from “here’s what’s wrong” to safely making changes in a production system is a fundamentally different engineering problem from analysis, and most DIY projects never get there. Finally, you’ll be maintaining it forever. Every model upgrade, API change, and new capability in your support platform means updating your custom tooling. We have a team dedicated to this. You’ll need one too.

    The economics still favor buying when a vendor has invested more in the problem than you can justify internally. What I hope this post adds is a clearer picture of what that investment actually looks like from an engineering perspective—and why it compounds into a durable advantage for your support organization.

    The investment is ongoing. The problems we’re solving at the infrastructure level today are harder than the ones we solved a year ago, and that trajectory isn’t slowing down. If you’re ready to see the difference a production-grade Agent can make, explore Operator.


    Inspired by this post on The Intercom Blog.


  • How AI-Designed Enzymes and Agentic AI Could Finally Make Plastic Truly Recyclable


    Only 10% of the plastic we manufacture gets recycled. For a century, we have relied on mechanical and chemical methods that were never designed to close the loop. As a product leader, I look for step-change technologies that break through entrenched ceilings, and biology—specifically engineered enzymes—has emerged as that missing piece.

    Recently, I dug into the work of Rhea's Factory and spoke with their founders, Arzu Sandıkçı (co-founder and CEO) and Mert Topcu (co-founder). Arzu brings deep expertise in molecular biology and enzyme engineering. Mert brings 20 years in tech, including a decade at Google as a product manager. Their combined perspective—domain science plus product rigor—shows up in every design choice.

    Rhea's Factory has built an AI platform that uses protein language models, multi-step agentic pipelines, and proprietary wet lab data to design novel enzymes that deconstruct plastic polymers into their original monomers—selectively, at low temperatures, and at industrial scale. That stack matters: it layers foundation models with domain-specific constraints and real-world data to systematically explore, evaluate, and scale candidates.

    Here’s the crux: traditional recycling mostly just chops polymer chains into shorter fragments. Enzymatic recycling, by contrast, breaks plastic all the way back to its original monomers. Think of a pearl necklace: mechanical methods snip the string into shorter pieces, while enzymes cleanly return the individual pearls. The result is true circularity: you can remake high-quality plastic without downcycling.

    Selectivity is the superpower. Enzymes can target specific plastic types even in mixed waste streams, operating at low temperatures in a controlled, low-energy reactor process. That combination of precision and energy efficiency is why this approach can be both greener and economically competitive.

    The field accelerated after the discovery of a plastic-eating bacterium in Japan, which opened the door to enzymatic recycling. Advances in protein structure prediction, notably AlphaFold, which was recognized with a Nobel Prize in Chemistry, transformed what’s possible in enzyme engineering and created space for AI-native design loops to flourish.

    On the AI side, the team evolved from a human-orchestrated pipeline to an agentic AI scientist. Problem statements serve as inputs, multi-step protein generation builds on foundation models, and guardrails at each pipeline step keep the AI pointed in the right direction without limiting exploration. It’s a textbook example of agentic AI applied to a highly constrained, safety-critical domain.

    Crucially, wet lab feedback closes the loop. Even a few hundred proprietary wet lab data points can be enough to train a powerful domain-specific prediction model, a reminder that quality and relevance can trump sheer volume when you’re operating in a narrow, high-signal domain. The team measures success in the lab first, then scales what works.

    I appreciated their take on exploration: there are moments when Mert actually wants the model to hallucinate. Sampling at high temperature settings helps explore the full enzyme design space, and the guardrails ensure those forays remain productive rather than random. In other words, controlled creativity beats blind search.
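
    For readers unfamiliar with the term, temperature is the standard knob that rescales a generative model’s scores before sampling. The sketch below shows the generic mechanism only, not Rhea’s Factory’s pipeline; the three logits are made-up scores for three hypothetical candidate choices.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert raw scores to probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores, best candidate first

low = softmax_with_temperature(logits, 0.5)   # conservative: favors the top score
high = softmax_with_temperature(logits, 2.0)  # exploratory: spreads probability out

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

    At the higher temperature the top candidate loses probability mass and the unlikely candidates gain it, which is what lets sampling wander into less obvious regions of a design space.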

    The business constraint is unambiguous: enzymatic recycling must compete economically with cheap, oil-based plastic production. That framing forces disciplined choices around energy use, throughput, and yield—factors that directly determine unit economics and the path to industrial reality and cost parity.

    What’s next is equally compelling: a process agent to optimize end-to-end system performance, a 5,000-ton demo plant in California to validate scale, and enzymes for new plastic types. I’m especially intrigued by enzyme blends for mixed plastics and the practical insight into why clamshells aren’t recyclable—precisely the messy corner cases that decide whether circularity works outside the lab.

    From a product management lens, several patterns stand out: define clear problem statements as inputs to the agentic orchestration; use eval-driven development to enforce stage-by-stage quality; build a proprietary data moat with wet lab results; and tie milestones to industrial metrics (conversion, selectivity, energy per ton) rather than vanity outputs. This is AI Strategy in action—aligning model capability, data leverage, and operational design to deliver outcomes, not just demos.

    Most of all, the ambition to explore an enzyme design space that “makes everything nature has ever evolved look like a tiny dot” captures the promise of this approach. Pairing agentic AI with rigorous lab validation doesn’t just make plastic circularity plausible—it makes it programmable.


    Inspired by this post on Product Talk.


  • No More Accidental Agents: How We Engineered Global Agent’s Helpful, Curious Personality


    Most teams ship AI agent personalities by accident—emergent quirks, brittle prompts, and uneven behavior. We refused to let that happen. From day one, we treated personality as a first-class product surface, one that should be designed, instrumented, and iterated with the same rigor as any core capability.

    Learn how we designed Global Agent’s personality and fine-tuned its inquisitiveness and helpfulness using Agent Analytics.

    In my role leading product at HighLevel, Inc., I framed our approach around agentic AI and conversation design: personality is not “flavor text”; it is the control system for how an agent interprets context, asks questions, and decides when to act. Our product strategy prioritized clarity, empathy, and consistency—so the agent would be curious enough to resolve ambiguity without becoming interrogatory, and helpful enough to move work forward without overstepping.

    We made that intent measurable. Using behavioral analytics, we defined operational signals such as clarification-question rate, resolution-path efficiency, and escalation quality. We combined eval-driven development with targeted A/B testing to compare prompt patterns and tool strategies, ensuring each change had a clear hypothesis and measurable outcome.
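
    As an illustration of what one such signal might look like, here is a minimal sketch of a clarification-question rate. The post does not define the metric, so the definition used here (clarifying agent turns divided by all agent turns) is an assumption, as are the turn format and the two made-up prompt variants.

```python
from typing import Dict, List


def clarification_question_rate(turns: List[Dict]) -> float:
    """Share of agent turns that are clarifying questions (assumed definition)."""
    agent_turns = [t for t in turns if t["speaker"] == "agent"]
    if not agent_turns:
        return 0.0
    asked = sum(1 for t in agent_turns if t.get("is_clarification", False))
    return asked / len(agent_turns)


# Hypothetical logged turns for two prompt variants in an A/B test.
variant_a = [
    {"speaker": "user"},
    {"speaker": "agent", "is_clarification": True},
    {"speaker": "user"},
    {"speaker": "agent", "is_clarification": True},
    {"speaker": "agent"},
    {"speaker": "agent"},
]
variant_b = [
    {"speaker": "user"},
    {"speaker": "agent", "is_clarification": True},
    {"speaker": "agent"},
    {"speaker": "agent"},
    {"speaker": "agent"},
]

rate_a = clarification_question_rate(variant_a)
rate_b = clarification_question_rate(variant_b)
print(f"A: {rate_a:.2f}  B: {rate_b:.2f}")
```

    A lower rate is not automatically better; it only becomes meaningful when read alongside outcome metrics such as resolution or escalation quality.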

    To calibrate inquisitiveness, we mapped decision points where the agent should ask follow-ups versus proceed autonomously. Prompt engineering codified those thresholds, while a retrieval-first pipeline reduced unnecessary questions by improving context completeness up front. When the agent did ask, we constrained tone and cadence to keep queries concise, respectful, and progress-oriented.
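
    A decision point of this kind can be expressed as a small, testable rule. The sketch below is hypothetical: the function name, the inputs (an intent confidence score and a list of missing required fields), and the 0.7 threshold are assumptions chosen for illustration, not values from the post.

```python
from typing import Sequence


def next_step(
    intent_confidence: float,
    missing_fields: Sequence[str],
    ask_threshold: float = 0.7,
) -> str:
    """Decide whether the agent should ask a follow-up or proceed on its own."""
    if missing_fields:
        return "ask"      # required context is absent, so ask for it
    if intent_confidence < ask_threshold:
        return "ask"      # intent is ambiguous, so resolve it first
    return "proceed"      # enough context and confidence to act


print(next_step(0.9, []))
print(next_step(0.9, ["order_id"]))
print(next_step(0.4, []))
```

    Keeping the threshold as an explicit parameter is what makes it possible to tune inquisitiveness through experiments instead of rewording a prompt and hoping.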

    To enhance helpfulness, we prioritized precise action-taking and unambiguous guidance. Context window management preserved relevant facts without diluting intent, and guardrails aligned with AI risk management principles ensured the agent stayed within policy, privacy, and compliance boundaries. The result was an assistant that resolved more tasks end-to-end, with fewer stalls and clearer handoffs when human help was warranted.

    Agent Analytics became our nervous system. We instrumented every dialog turn to attribute outcomes to design choices, then used driver trees to connect micro-behaviors to macro results like time-to-resolution and customer satisfaction. This closed-loop view let us ship confidently, knowing which levers improved helpfulness, which sharpened curiosity, and which merely added noise.

    Process mattered as much as tooling. Product trios ran continuous discovery with customers to surface edge cases—ambiguous intents, multi-intent turns, and sensitive scenarios—while our engineering partners operationalized experiments with clean rollback paths. We favored small, testable changes over sweeping rewrites, building momentum and trust with each iteration.

    The payoff is a personality that feels consistent across use cases: curious when clarity is missing, decisive when action is obvious, and transparent when limits are reached. Users experience fewer dead ends, faster resolutions, and a brand voice that shows up the same way every time—because it was defined, measured, and improved on purpose.

    If you’re building agentic AI, don’t leave personality to chance. Treat it like a product: set clear outcomes, instrument deeply with Agent Analytics, and iterate with eval-driven development and A/B testing. That’s how curiosity becomes a feature, helpfulness becomes a habit, and your agent becomes reliably, intentionally excellent.


    Inspired by this post on Amplitude – Best Practices.


  • The Ultimate Knowledge Management Playbook to Supercharge Your AI Sales Agent


    Revenue leaders are starting to use AI to generate better leads, capture peak buyer intent, and scale their pipeline without a linear increase in headcount. I see it every day in my own teams: when we get the foundations right, AI doesn’t just answer questions—it accelerates qualification and turns curiosity into pipeline.

    Done well, an AI-first inbound sales experience engages buyers 24/7 in any language, qualifies leads intelligently, and routes high-intent prospects to the right conversion path. But behind that experience, there’s an unsung hero: knowledge management. I’ve learned the hard way that even the smartest Agent underperforms if it’s not fed the right information.

    A Sales Agent is only as good as what you give it to work with. If you’re using an Agent, like Fin, to run inbound sales motions end to end, it needs an extensive pool of knowledge to draw from. You need to feed it accurate answers on pricing, features, and plan fit, and clear rules for how to qualify and route each prospect. Without that knowledge, your Agent can’t do its job, and your sales team is back to answering the same questions manually and triaging leads that could have been handled automatically.

    In this guide, I walk through everything you need to know about building and maintaining the knowledge base that powers your Sales Agent—what to include, how to launch, what to measure, and how to iterate so results compound over time.

    What is knowledge management and why is it so important?

    Definition: Knowledge management is the process of creating, organizing, sharing, and maintaining knowledge in your business.

    Image: Fin testimonial quote card on how knowledge management removes friction in the sales funnel and lifts conversion, pipeline, and revenue.

    Your public website and product pages are classic examples, but those are just the tip of the knowledge management iceberg. In an inbound sales motion, knowledge management involves a range of activities such as creating resources (FAQs, pricing overviews, competitive battlecards, case studies, internal sales materials), identifying gaps in documentation and qualification criteria, implementing systems that make information easy to access and use, and developing processes to keep everything current. In my experience, these elements are what allow an Agent to move from merely answering questions to recommending the right plan and explaining why it fits.

    Why knowledge management matters even more in the age of AI

    Your knowledge base is no longer just static collateral for buyers to read. It powers your Sales Agent and entire inbound motion. It’s the key to accurately answering complex prospect queries, guiding product discovery, qualifying intent in real time, and accelerating the path to pipeline. Two realities shape my approach:

    1) Your Agent is only as strong as what you “feed” it. Missing information, poorly structured sales materials, and out-of-date pricing documentation all prevent it from providing clear and correct answers to your buyers, leading to poor buying experiences that degrade trust and cost you deals. No large language model (LLM) knows your business like you do. It doesn’t understand your prospects’ specific needs, pain points, pricing tiers, or use cases. That knowledge is unique to your organization, which means you need to map it all out and explicitly feed it to your Agent: not only facts about your product, but also the context behind those facts, so it can guide buyers to the right solution rather than just answer their questions.

    2) Every investment in knowledge has compounding results. Making the switch to AI isn’t just adopting a new tool. It means adapting to a new ecosystem. Think of it as a flywheel. Every piece of knowledge you add makes your Agent more effective. It generates better conversations and data, which tells you what to add or refine next. The more you invest in it, the faster it compounds.

    Image: Fin quote card on reusing trusted service knowledge for sales instead of duplicating content.

    “You have to think about AI like a new sales rep. On day one, it needs coaching, guidance, and feedback. But over time, as you refine the inputs and learn from real conversations, it becomes more autonomous and the level of coaching required decreases significantly.” Pascaline Albin, Director of Sales Development at Fin

    Every upfront investment you make in your sales knowledge has long-term, revenue-generating impact. Whether you hire someone to do this work full time or give your sales reps time away from the inbox each week, the ROI speaks for itself. I’ve routinely seen small content improvements unlock big conversion gains.

    Think of it this way: say it takes 30 minutes to document a new competitive battlecard or update pricing information. That 30-minute investment results in hours saved for your sales team, highly engaged buyers who get instant answers, and actionable data to optimize your inbound motion.

    Calculate: Average time to compose a response × frequency of question = time saved for your team. More importantly, that’s time your SDRs and AEs can reinvest in multi-threading into accounts, running complex evaluations, and closing high-value deals that actually move pipeline.

    Calculate: Number of prospects who ask this query × average time to respond = total time saved for buyers.
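
    The two calculations above can be worked through with sample numbers. In this sketch only the 30-minute documentation cost comes from the example in this guide; the response time, question frequency, and buyer wait time are made-up assumptions, and “average time to respond” is interpreted as how long a buyer would otherwise wait for a human reply.

```python
import math


def minutes_saved(avg_minutes: float, occurrences: int) -> float:
    """Both formulas share one shape: average time multiplied by frequency."""
    return avg_minutes * occurrences


DOCUMENTATION_COST_MINUTES = 30  # from the battlecard example in this guide
AVG_COMPOSE_MINUTES = 4          # assumed: time a rep spends writing one answer
TIMES_ASKED_PER_MONTH = 120      # assumed: how often prospects ask the question
AVG_BUYER_WAIT_MINUTES = 45      # assumed: wait for a human reply

team_minutes = minutes_saved(AVG_COMPOSE_MINUTES, TIMES_ASKED_PER_MONTH)
buyer_minutes = minutes_saved(AVG_BUYER_WAIT_MINUTES, TIMES_ASKED_PER_MONTH)

# Number of answered questions needed to repay the documentation effort.
break_even_questions = math.ceil(DOCUMENTATION_COST_MINUTES / AVG_COMPOSE_MINUTES)

print(f"Team time saved: {team_minutes / 60:.1f} hours per month")
print(f"Buyer time saved: {buyer_minutes / 60:.1f} hours per month")
print(f"Break-even after {break_even_questions} questions")
```

    With these assumed inputs, the half hour of documentation pays for itself after eight answered questions; swap in figures from your own inbox to get a realistic estimate.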

    Image: Fin quote card on testing Fin for Sales before launch so the Agent has complete knowledge from Day 0.

    “For sales funnels, identifying knowledge gaps or friction can result in a huge improvement in conversion. When you optimize Fin with the right content, the incremental improvements have a big impact on our bottom line and can lead to millions of dollars in pipeline and revenue. That's why knowledge management is an integral part of our training and optimization process.” Tommy Dunton, Senior Manager of Sales Development at Fin

    The best way to start generating that data is simply to start. The sooner you begin, the sooner you can capture insights about what your buyers want and need from your inbound sales experience. I prioritize quick deployment, fast feedback loops, and continuous iteration.

    What to include in your knowledge base

    Wrangling and prioritizing all of your internal and external sales documentation can feel daunting, but with the right technology, it doesn’t have to be. The ideal platform provides data-driven insights to show what buyers actually ask and a centralized place to create, manage, and optimize your knowledge content. For example, with Fin for Sales, you get access to a leads report that gives you insight into disengaged prospects. Intercom’s Knowledge Hub enables you to create a single source of truth for your public-facing collateral and internal sales materials. Using Content Targeting, you can segment this information so your Sales Agent only uses the exact content you want.

    1) Pricing and product FAQs. What it is: answers to the most common discovery questions buyers have, from pricing and plan differences to implementation, integration, and security or trust topics. How to source: analyze your sales inbox and early discovery calls. Where to use: public website, Sales Agent, and proactive outbound messages.

    Image: illustration of a sales agent using an AI-powered knowledge management dashboard on a laptop.

    2) Competitor comparisons and battlecards. What it is: guidance for handling competitor mentions, addressing friction, and highlighting unique value propositions. How to source: talk to top-performing AEs or your product marketing team. Where to use: internal snippets for your Sales Agent and internal sales materials.

    3) Case studies and social proof. What it is: proof points that help buyers build business cases and gain confidence, speeding deal cycles. How to source: collaborate with customer success and marketing on ROI stories. Where to use: Sales Agent, website, and sales collateral.

    4) Specific use cases and buyer personas. What it is: targeted content for cohorts with similar pain points and jobs-to-be-done (e.g., engineering teams, startups). How to source: combine product marketing’s value propositions with real discovery conversations. Document the exact probing questions your best SDRs and AEs use so your Agent can uncover context in real time. Where to use: website and Sales Agent to enable contextual solution matching.

    Content formats and sources

    When sourcing knowledge, cast a wide net. You likely have more relevant content than you realize, and almost any information is useful once framed correctly. With Fin, you can use public articles (product FAQs, pricing overviews, feature benefits), internal articles (internal sales materials, internal FAQs), snippets (short-form text like promotions or battlecards), website pages (synced from your marketing site), and PDFs (whitepapers, technical specs, detailed sales materials).

    Sales Performance dashboard with KPIs—Conversation Volume 214, Contact Capture Rate 18.9%, Completion Rate 20.6%—and a Sankey-style funnel from Chat and Email to outcomes like Sales Qualified and Pro Plan.
    Turn conversations into revenue with a clear Sales Performance view. Track rising KPIs and follow leads from Chat and Email through Qualified, Disqualified, and Recovered to outcomes such as Sales Qualified, Pro Plan, or Free Plan.

    Create a knowledge management process that fuels your Agent: 5 steps

    Step 1: Audit what you have. Start by reviewing your current materials to prevent your Agent from learning outdated information and to identify gaps. If you’re already using a Customer Agent, much of that content can pull double duty for sales—no need to start from scratch. Make your existing content available for your Sales Agent and build sales-specific content on top, like pricing comparisons, competitive battlecards, customer case studies, and qualification criteria that wouldn’t apply to service conversations. If you’re starting fresh, audit pricing, product FAQs, feature details, competitor comparisons, case studies, and buyer use cases.

    Put yourself in your buyer’s shoes. Walk through the same steps your prospects take, including their first interaction with your Sales Agent. Before going live, test it yourself. If you’re using Fin, you can do this using the built-in Preview panel to validate answers, routing, and missing topics or objections. Confirm that your Agent asks the right probing questions about goals, fit, and urgency before making a routing decision.

    “We're moving incredibly fast at Fin with our Customer Agent, which means optimising our content, guidance and experience with Fin is a constant focus. Before we launch new products, we're testing Fin for Sales to ensure it's got all of the knowledge it needs to make sure the customer experience is perfect and we can convert that intent into pipeline and revenue from Day 0 of that launch.” Tommy Dunton, Senior Manager of Sales Development at Fin

    Seek input from across your GTM organization. Don’t rely solely on sales. Involve marketing, growth, revenue ops, and sales ops to align content with campaigns and routing logic, and to integrate with systems like your CRM. Your SDRs and AEs bring real-world objections, use cases, and competitor insights that win deals—and those should feed directly into your Agent’s knowledge base. Judging fit is as much art as science, and your best SDRs can teach the Agent to interpret subtle signals.

    Black-and-white headshot beside a bold quote about Fin AI for sales agents, stressing ongoing training and high‑quality knowledge bases to lift performance; clean, minimalist layout.
    Scalable selling starts with better knowledge. This graphic pairs a monochrome portrait with a bold Fin quote showing how training agents and curating a strong knowledge base compound AI performance over time.

    Step 2: Plan and prioritize. Decide where to start by focusing on questions your team still answers manually that, if documented, would help your Agent capture more qualified intent. Identify the content your reps share most (demos, explainers, case studies) and ensure the Agent can access it. Look at leads reporting to find early-stage questions, stuck points, and high-volume disengaged outcomes, then strengthen objection-handling content. Prioritize based on pipeline value—build competitive battlecards and enterprise-tier documentation before free-plan details. Use reporting to find funnel drop-offs and content that hasn’t been updated recently—refresh pricing immediately if it has changed.

    Allocate time and resources. Treat your Sales Agent like a core GTM channel, not a side project. Assemble a cross-functional project team with clear roles. The Agent owner translates sales strategy into prompts, routing logic, integrations, and rollout. The optimization owner reviews performance data, identifies drop-offs, and drives changes to content or Agent behavior. Early alignment ensures your Agent operates as a professional extension of your sales team.

    Step 3: Go live and learn. Deploy broadly across your marketing site and pricing pages to accelerate learning. Within weeks, you’ll see where the Agent guides discovery and qualifies buyers versus where it stalls. Investigate drop-offs—often these point to missing answers or weak probing questions. If your Agent and knowledge base live in the same platform, you’ll get full visibility into your qualification funnel and content performance across touchpoints.

    Track metrics to measure success. Monitor completion rate (conversations reaching a clear routing decision), pipeline created (opportunities generated through Agent-handled conversations), meetings booked (qualified prospects routed to a call), and customer satisfaction (quality of the experience). These metrics show what content is working and where to improve.

    Step 4: Iterate and improve. Expect gaps early on. That’s good—it surfaces what buyers need to convert. When the Agent gives a poor response, the root cause is usually missing, outdated, or shallow content. Close the gaps, then monitor your metrics and conversation reviews to keep compounding improvements.

    Black-and-white headshot on the left, with a large Fin-branded quote on the right stating that content powers a Sales Agent's discovery responses and keeps them current on the latest offerings.
    Your Sales Agent runs on great content. This Fin-themed graphic pairs a professional headshot with a bold statement highlighting how strong knowledge enables discovery answers and timely updates across the GTM motion.

    Build ongoing maintenance into your workflow. Knowledge management is continuous. As your product, personas, and goals evolve, so must your content. Define owners, review cadences, and working time to refresh and create content—don’t wait for launch week chaos. Encourage a “knowledge management” mindset by logging content requests from SDRs and AEs when they hear new objections or discover probing questions that uncover true pain points.

    “Training Agents to get better over time is fundamental to using AI. Fin learns from our website and help center, so the quality of those resources directly impacts its performance. The more we’ve invested in our knowledge base, the more success we’ve seen with Fin and those gains continue to compound.” Beth-Ann Sher, Senior AI Knowledge Manager at Fin

    Step 5: Build knowledge management into future launch plans. Make Agent-ready sales content part of every product or pricing launch checklist. Partner with engineering, product marketing, and revenue operations to update catalogs and your Agent’s knowledge base on day zero. Then review early discovery conversations to add resources, address new objections, and fine-tune contextual solution matching.

    “Content should no longer be an afterthought. It is one of your strongest GTM levers because your Sales Agent relies on it to handle discovery questions and stay up to date on your latest offerings.” Beth-Ann Sher, Senior AI Knowledge Manager at Fin

    Best practices for Agent-friendly knowledge management

    Fin quote graphic with a grayscale portrait next to text about unifying conversation data, lead reporting, and agent configuration to improve sales qualification, content insights, and the buyer experience.
    A pull-quote from Fin explains why one platform matters in sales: centralize conversation data, lead reporting, and agent configuration to spot funnel drop-offs, learn which content works, and elevate the buying journey.

    Use the terms your buyers use. Language varies by industry, persona, and role. Analyze discovery calls and on-site searches to capture how buyers actually speak and train your Agent accordingly. Test internally across SDRs, revenue ops, and marketing to reveal variations and content gaps.

    Simplify language and remove ambiguity. Machine-friendly language is buyer-friendly. Avoid jargon, spell out acronyms, and clearly explain key product terms so value propositions land.

    Keep the experience consistent and on-brand. Ensure product terminology, feature names, and pricing tiers are consistent everywhere. Proofread for tone, spelling, and grammar, and use standardized templates to build trust.

    Add context to your answers. If your internal FAQ is full of “yes/no” answers, expand on the why. Restate the question, provide business context, and equip the Agent with follow-ups that keep the conversation alive and uncover goals and constraints.

    Add text to images and videos. Show and tell—always include clear explanatory text so your Agent and all users, including those with accessibility needs, can benefit.

    Create a scannable structure. Use clear headers and lists in your source content so both Agents and humans can navigate quickly. Avoid dynamic elements that hide crucial details.

    Collect bite-size information in FAQ articles. Package tactical intel—seasonal promotions, short battlecards, edge cases—into concise snippets so your Agent can retrieve and deliver them instantly.

    A connected Agent turns every conversation into insight. When a Sales Agent is connected to your CRM and enrichment tools, every interaction, qualification signal, and piece of sales content flows into a connected system. “A single platform matters in sales. When your conversation data, lead reporting, and Agent configuration all live in one place, you get much better visibility into your qualification funnel. You can see where buyers are dropping off, what content is working, and can improve the buying experience.” Fred Walton, Senior AI Conversation Designer at Fin

    Every conversation makes your knowledge base sharper, showing you what’s resonating, what’s missing, and where to invest next. That’s the retrieval-first pipeline mindset I push with my teams.

    Make knowledge management a core sales function

    Behind every high-performing Sales Agent is a comprehensive, machine-friendly knowledge management process. Without it, even the most capable Agent will struggle to deliver the pipeline gains AI makes possible. This isn’t a one-time project; it’s a continuous investment. The teams treating knowledge management as a core sales function are building systems that improve with every conversation, turning inbound demand into a compounding growth engine.


    Inspired by this post on The Intercom Blog.


  • I Pointed a “Ralph Wiggum” AI Loop at My Product for a Week—The Data That Stopped Chaos

    I Pointed a “Ralph Wiggum” AI Loop at My Product for a Week—The Data That Stopped Chaos

    I spent a week pointing a "Ralph Wiggum loop" at my product to see how far an agentic AI could take pragmatic, everyday improvements without human micromanagement. It was equal parts exhilarating and nerve-wracking. The short version: the loop moved fast and broke assumptions, but Amplitude analytics kept it from going off the rails—and turned chaos into controlled acceleration.

    By "Ralph Wiggum loop," I mean a deliberately naive, endlessly curious cycle: try something small, ship it behind a flag, watch the data, then try again. It is the product equivalent of a fearless intern who experiments constantly. That energy is invaluable for discovery, but it absolutely demands strong guardrails and a clear definition of success.

    Before I started, I framed the outcomes I cared about: user activation within the first session, reduction in time-to-value, and early retention indicators. I set baselines and a minimum detectable effect (MDE) for A/B testing so the loop could distinguish noise from signal. I also documented a driver tree of behaviors we wanted to influence and ensured every event was cleanly instrumented in Amplitude analytics to support reliable behavioral analytics.
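
    To make the baseline and MDE idea concrete, here is a minimal sketch of the sample-size arithmetic behind it. It uses the standard normal-approximation formula for comparing two proportions; the function name, the 30% baseline, and the 3-point MDE are illustrative assumptions, not figures from my experiment.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    baseline: control conversion rate (for example, first-session activation).
    mde_abs:  smallest absolute lift worth detecting.
    """
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


if __name__ == "__main__":
    # Hypothetical numbers: 30% baseline activation, 3-point absolute MDE.
    print(sample_size_per_arm(baseline=0.30, mde_abs=0.03))
```

    The useful property is the inverse-square relationship: halving the MDE roughly quadruples the traffic each variant needs, which is why a fast loop shipping to small cohorts should only chase effects it can actually detect.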

    The guardrails mattered most. I put every change behind feature flags with instant rollback. I defined "off the rails" conditions upfront, including regression thresholds for activation and retention, and enabled anomaly detection to surface unexpected spikes or drops. Session replay was ready to diagnose confusion fast, and I kept a daily evaluation cadence so the loop never ran unattended for long.
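
    One way to write down "off the rails" conditions is as explicit thresholds that a script can check before the loop is allowed to continue. The sketch below is an assumption about how that could look; `Guardrail`, `breached`, and `decide` are hypothetical names, and it does not call any real feature-flag or analytics API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrail:
    metric: str
    baseline: float
    max_relative_drop: float  # 0.05 means "roll back on a 5% relative drop"


def breached(guardrails: list[Guardrail], observed: dict[str, float]) -> list[str]:
    """Return the metrics that regressed past their threshold."""
    failures = []
    for g in guardrails:
        value = observed.get(g.metric)
        if value is None:
            failures.append(g.metric)  # missing data is treated as a breach
            continue
        floor = g.baseline * (1 - g.max_relative_drop)
        if value < floor:
            failures.append(g.metric)
    return failures


def decide(guardrails: list[Guardrail], observed: dict[str, float]) -> str:
    """Map the guardrail check to an action for the flagged experiment."""
    return "rollback" if breached(guardrails, observed) else "continue"
```

    Treating a missing metric as a breach is a deliberate choice in this sketch: if instrumentation breaks, the safe default is to stop rather than keep shipping blind.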

    Day by day, the loop proposed micro-experiments: onboarding copy variants, tooltip timing, in-app guide sequencing, and subtle changes to progressive disclosure. Each iteration shipped behind a flag to a small cohort. I watched leading indicators in real time, then zoomed out to cohort views to guard against short-term gains that might erode longer-term value. When something looked promising, we expanded exposure methodically; when something looked risky, we paused immediately.

    We had a pivotal moment where the loop suggested a bolder call-to-action that spiked activation. On the surface, it looked like a win. Amplitude cohorts told a fuller story: downstream engagement softened, and anomaly detection flagged a pattern that hinted at premature conversion rather than genuine intent. A quick rollback through feature flags saved the week—and reminded me why eval-driven development should be the default for agentic AI workflows.

    The most surprising part was how quickly the loop unlocked small compounding gains once the measurement scaffolding was in place. With a unified analytics platform and crisp guardrails, the system became a safe sandbox where the AI could explore aggressively while we stayed anchored to outcomes. The combination of behavioral analytics, A/B testing discipline, and daily human review turned raw speed into durable learning.

    My takeaways are direct. Agentic AI can accelerate discovery, but only if you define stop conditions and wire strict feedback loops into your stack. Measurement is product strategy here—without it, you get noisy activity instead of progress. Invest in instrumentation first, treat feature flags as non-negotiable, and let anomaly detection and session replay be your early warning system. Most of all, tie every experiment to activation, engagement, or retention, not vanity metrics.

    If you’re considering your own week with a "Ralph Wiggum loop," start painfully small, constrain the blast radius, and insist on decision-quality data. Do that, and you’ll turn a chaotic agent into a compounding engine for product discovery—one that moves fast, learns faster, and stays on track.


    Inspired by this post on Amplitude – Perspectives.


  • From Vision to Execution: Building Agentic, Data‑Driven Products with Real‑World Rigor

    From Vision to Execution: Building Agentic, Data‑Driven Products with Real‑World Rigor

    When I consider where product development is headed, one statement captures the mandate perfectly: "Eric Carlson is a Principal AI Engineer helping to shape and build Amplitude's next generation vision of agentic and data driven product development." That vision resonates deeply with how I lead teams: anchoring strategy in behavioral analytics while enabling agentic AI to act on insights with speed, safety, and measurable impact.

    Translating that vision into execution starts with clarity of outcomes. I frame driver trees that connect customer value to leading indicators (activation, engagement depth, and retention), then instrument product telemetry in Amplitude so behavioral analytics can surface the moments that matter. From there, we operationalize learning with A/B testing and feature flags, ensuring each hypothesis gets a fair, observable run and that we can safely ramp what works.

    Agentic AI changes the operating model. Instead of static dashboards, we design autonomous workflows that observe signals, reason over context, and take action, grounded in a retrieval-first pipeline and governed by eval-driven development. For product managers, this demands fluency with LLMs and practical prompt engineering, plus a rigorous AI strategy around data governance, privacy-by-design, and risk scoring so agents remain trustworthy under real-world conditions.

    Cross-functional cadence is everything. I partner closely with Principal AI Engineers and product trios to blend continuous discovery with execution: rapid user interviews to reveal intent, opportunity solution trees to prioritize, and outcome-over-output OKRs to align incentives. The result is a system where insights are unified, decisions are explainable, and agents improve through tight feedback loops across analytics, experimentation, and production telemetry.

    If you’re building toward an agentic, data-driven future, invest in a unified analytics platform, shorten the path from signal to action, and measure learning velocity as carefully as feature delivery. With the right foundations, agentic AI becomes more than a feature—it becomes a force multiplier for product strategy, customer value, and sustainable growth.


    Inspired by this post on Amplitude – Perspectives.


  • From Prototype to Production: How I Built Reliable AI-Generated Opportunity Solution Trees

    From Prototype to Production: How I Built Reliable AI-Generated Opportunity Solution Trees

    I just wrapped an all-out engineering sprint. That still sounds odd coming from me, because while I’ve written code on and off for years, I don’t self-identify as an engineer. I’m a product manager who used to be a designer. It’s been a long time since I wrote code for a living.

    But AI has expanded what’s just now possible—for our products, and for us. It’s pushed me to do more than I imagined. In that spirit, I want to share a recent engineering story. It includes technical details, and a year ago I couldn’t have done any of it. I learned it with the help of AI, and my aim is to show what’s now within reach.

    I’ve been building two services with a partner at Vistaly: AI-generated interview snapshots and AI-generated opportunity solution trees. We put out a call for alpha partners, received over 100 applicants, and selected eight design partners to start.

    Opportunity Solution Tree diagram with a blue Desired Outcome branching to green Opportunity nodes, yellow Solution nodes, and orange Assumption Tests for product discovery and AI workflows.
    A clear, color‑coded map from desired outcome to opportunities, solutions, and assumption tests—showing how to structure discovery work and prompt AI to generate, compare, and validate product ideas.

    Each team uploaded three customer interviews. I identified the key moments and opportunities and then generated an opportunity solution tree from those snapshots. I provide the AI services; Vistaly is building the UI and workflows around them.

    Early feedback was strong. Teams immediately asked to upload more interviews—exactly the kind of demand signal you hope to see—so we got to work making that possible.

    Dark interface screenshot of an opportunity solution tree with colored cards and dotted connectors, showing merged, moved, and evidence-added Opportunity notes about onboarding, support, and bot readiness.
    Go behind the scenes as AI turns raw feedback into a clear Opportunity Solution Tree. Linked cards reveal user needs—onboarding, support offload, and bot-readiness signals—so product teams can spot priorities and next steps at a glance.

    Updating an opportunity solution tree with new interview content is far harder than generating a new tree from scratch. I initially underestimated the complexity. Our goal wasn’t to produce a tree and declare it truth. We wanted teams to engage, correct, and collaborate with the AI—scaffolding cross-interview synthesis instead of doing it for them.

    To support that, we needed a way to communicate precisely how a tree would change after new interviews were added. We took inspiration from git diff and set out to build the equivalent for opportunity solution trees—step-by-step change sets that explain each proposed modification.

    Diagram of an opportunity solution tree with an Outcome node pointing to Opportunity A and Opportunity B; B branches to child opportunities and shows source evidence, labeled “Updates Can't Result in Data Loss.”
    A clear visual of AI‑generated opportunity solution trees: outcomes feed opportunities that branch into sub‑opportunities, while evidence is preserved. The structure ensures updates stay traceable and never cause data loss.

    That decision was right, but the lift was larger than I expected. It wasn’t enough to generate an updated tree; I also had to provide a clear, ordered walkthrough of what changed and why.

    I often see the same pattern with AI: it’s easy to get to an impressive prototype, but much harder to reach a production-grade product. That was exactly my experience here. My service actually comprised two sub-services: generating a new tree from scratch and updating an existing tree with new interviews. The first worked well in alpha; the second had to be built before anyone could add a fourth interview.

    Opportunity Solution Tree diagram: teal Outcome links to Opportunities A and B; Opportunities C and D branch under B; right panel lists the change set steps for adding nodes.
    Explore how an outcome expands into an Opportunity Solution Tree: Opportunities A and B stem from the goal, with C and D nested under B, while a concise change set tracks every node added along the way.

    On the surface, these services look similar. In reality, updates must preserve existing structure unless new evidence requires a change. You have to account for compound operations—merges, splits, deletes—while guaranteeing no data loss. Every node has source opportunities (supporting evidence from interviews) and children (tree sub-opportunities), and neither can be dropped.
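
    To make the "no data loss" requirement concrete, here is a minimal sketch of a node structure and two checks an update has to pass. The `Node` fields and the helper names are illustrative assumptions, not the actual schema of my service.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One opportunity in the tree (field names are illustrative)."""
    id: str
    sources: set[str] = field(default_factory=set)   # interview evidence ids
    children: set[str] = field(default_factory=set)  # ids of child opportunities


def all_sources(tree: dict[str, Node]) -> set[str]:
    """Every piece of evidence attached anywhere in the tree."""
    return set().union(*(node.sources for node in tree.values()))


def lost_sources(before: dict[str, Node], after: dict[str, Node]) -> set[str]:
    """Evidence that existed before the update but is attached nowhere after it."""
    return all_sources(before) - all_sources(after)


def orphans(tree: dict[str, Node], root_id: str) -> set[str]:
    """Nodes other than the root that no parent points to."""
    attached = set().union(*(node.children for node in tree.values()))
    return {nid for nid in tree if nid != root_id and nid not in attached}
```

    An update is acceptable under this sketch only when both `lost_sources` and `orphans` come back empty, however many merges, splits, and deletes happened in between.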

    In classic AI fashion, I got a reasonable version working in a few days and shipped it to our design partners. One team quickly hit our beta limits and asked to convert to a paid subscription so they could keep going. They showed a willingness to pay, converted, and started uploading aggressively.

    Diagram of an Opportunity Solution Tree showing how parent 'Opportunity A' with children x, y, z is split into 'Opportunity A' and 'Opportunity B' to reassign evidence and connections.
    Watch an Opportunity Solution Tree evolve: the original parent A with x, y, z branches is split into A and B, shifting evidence while preserving links—mirroring how AI refines scope and structure in discovery.

    At the 14th, 15th, and 16th uploads, the cracks appeared. We saw odd behavior in some trees. The Vistaly team noticed that the change sets—the step-by-step instructions emitted by my service—didn’t always reconstruct the final tree my service also emitted. We needed those steps to match exactly, so teams could review and accept, modify, or reject each change with confidence.

    They flagged the issue the day I was flying to New Orleans for Jazz Fest. In hindsight, I’m glad I didn’t grasp the scope of what awaited me. I had roughly 80% of the work still to do to make tree updates rock solid. At least I got to enjoy the music first.

    Flowchart merging two opportunity solution trees: Opportunity B with children y and z, and Opportunity C with t, u, v, consolidated into one tree led by Opportunity C connected to five child opportunity nodes.
    From fragments to focus: this diagram shows how Opportunities B and C are merged into a single Opportunity Solution Tree, removing duplicates and unifying context so AI can rank and explore five related opportunities with clarity.

    Back home, I started diagnosing. My service was a pipeline: several LLM-driven steps followed by deterministic code to compare trees and produce change sets. As I dug in, I realized that approach was flawed. Tree diffs, unlike linear document diffs, are ambiguous.

    In a document, if I add a sentence, the diff shows an addition. If I delete a paragraph and rewrite it, the diff shows a removal and an addition. Simple. But trees are different. Suppose I split opportunity A into A and B, and later merge B with C. The split can disappear from the final diff.

    Diagram of an opportunity solution tree labeled 'Input Tree' showing an Outcome node branching to Opportunity A and C, each with child nodes x-z and t-v, with arrows indicating hierarchy.
    Peek inside our process: a simple opportunity solution tree maps an outcome to prioritized opportunities A and C with downstream options x-z and t-v. A clear snapshot of how AI organizes product discovery.

    When the model splits an opportunity, it must distribute A’s source opportunities and children between A and B. For instance, if A has source opportunities 1, 2, 3 and children x, y, z, after the split A might keep 1, 2, and x, while B takes 3, y, and z.

    Now suppose the model merges B into C. If C originally had source opportunities 4 and 5 and children t, u, v, then after the merge C now has source opportunities 3, 4, 5 and children t, u, v, y, z. When you compare the original and final trees, it looks like A somehow donated some evidence and children directly to C. The split and merge that explain why are invisible to a naive diff.
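
    The split-then-merge example above can be replayed in a few lines to show what a naive before/after comparison reports. This is a simplified sketch with plain dictionaries; the helper names are hypothetical and the real pipeline is more involved.

```python
def find_parent(tree, node_id):
    """Return the id of the node that lists node_id as a child, if any."""
    for parent_id, node in tree.items():
        if node_id in node["children"]:
            return parent_id
    return None


def split(tree, node_id, new_id, sources, children):
    """Move some sources and children from node_id to a new sibling."""
    parent = find_parent(tree, node_id)
    tree[new_id] = {"sources": set(sources), "children": set(children)}
    tree[node_id]["sources"] -= set(sources)
    tree[node_id]["children"] -= set(children)
    if parent is not None:
        tree[parent]["children"].add(new_id)


def merge(tree, src, dst):
    """Fold src into dst, keeping all of src's sources and children."""
    parent = find_parent(tree, src)
    if parent is not None:
        tree[parent]["children"].discard(src)
    gone = tree.pop(src)
    tree[dst]["sources"] |= gone["sources"]
    tree[dst]["children"] |= gone["children"]


def snapshot(tree):
    """Freeze a tree so it can be compared later."""
    return {k: (frozenset(v["sources"]), frozenset(v["children"])) for k, v in tree.items()}


def naive_diff(before, after):
    """Compare two snapshots node by node, ignoring how we got there."""
    changes = []
    for node_id in sorted(set(before) | set(after)):
        if node_id not in after:
            changes.append(f"removed {node_id}")
        elif node_id not in before:
            changes.append(f"added {node_id}")
        elif before[node_id] != after[node_id]:
            changes.append(f"changed {node_id}")
    return changes


tree = {
    "outcome": {"sources": set(), "children": {"A", "C"}},
    "A": {"sources": {"1", "2", "3"}, "children": {"x", "y", "z"}},
    "C": {"sources": {"4", "5"}, "children": {"t", "u", "v"}},
}
before = snapshot(tree)
split(tree, "A", "B", sources={"3"}, children={"y", "z"})
merge(tree, "B", "C")
after = snapshot(tree)
print(naive_diff(before, after))  # B never appears in the result
```

    The diff only reports that A and C changed. B existed for exactly one step, so the comparison has no way to explain why evidence 3 and children y and z ended up under C.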

    Opportunity Solution Tree diagram titled Output Tree: a blue Outcome node branches to green Opportunity A and Opportunity C, which expand to nodes x-v with arrows; Product Talk badge.
    See how an AI-generated Opportunity Solution Tree unfolds: one Outcome flows to Opportunities A and C, then into options x–v. Clean colors and arrows reveal the hierarchy from goal to opportunities at a glance.

    That was the core insight: we didn’t just need to show what changed—we needed to show why it changed. I had to reconstruct each move step-by-step. That meant getting the model to show its work, which opened a new can of worms.

    I refactored my prompts so the model produced both the final output and the exact change set it used to get there. The action language was explicit: add, delete, reframe, merge, split, and so on. Crucially, I asked the model to describe its moves in user-meaningful terms—“split A into A and B, then merge B into C”—not as opaque reassignments of sources and children.
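
    A small sketch of what an explicit action language can look like as data. The `Step` shape and the wording templates are assumptions for illustration; the five operations are the ones named above, and the real vocabulary includes more.

```python
from dataclasses import dataclass
from typing import Optional

# The operations named in the text; the production set is larger.
ALLOWED_OPS = {"add", "delete", "reframe", "merge", "split"}

TEMPLATES = {
    "add": "add {target} under {other}",
    "delete": "delete {target}",
    "reframe": "reframe {target}",
    "merge": "merge {target} into {other}",
    "split": "split {target} into {target} and {other}",
}


@dataclass(frozen=True)
class Step:
    op: str
    target: str                  # node the step acts on
    other: Optional[str] = None  # second node (merge, split) or parent (add)

    def __post_init__(self):
        # Reject moves outside the agreed vocabulary as early as possible.
        if self.op not in ALLOWED_OPS:
            raise ValueError(f"unknown op: {self.op}")


def describe(step: Step) -> str:
    """Render one step in user-meaningful terms."""
    return TEMPLATES[step.op].format(target=step.target, other=step.other)


change_set = [Step("split", "A", "B"), Step("merge", "B", "C")]
print(", then ".join(describe(s) for s in change_set))
```

    Keeping each step as a named move, instead of a list of reassigned sources and children, is what lets a reviewer accept, modify, or reject it on its own.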

    Diagram of an AI-generated Opportunity Solution Tree: blue Outcome node with children Opportunity A and Opportunity B; B branches to Opportunity C and D. A right-hand list shows the change set for each step.
    Watch an opportunity solution tree take shape: start with the outcome, add opportunities A and B, then extend B to C and D. The paired change set makes every edit transparent—ideal for AI-assisted product discovery.

    For each LLM step, the model now emitted its recommendation and the corresponding change set. This helped, but it wasn’t perfect. After extensive testing and error analysis, two classes of errors emerged: (1) the model attempted an invalid move, and (2) the change set didn’t actually generate the recommendation.

    Category 1 felt like designing a game while the model played it creatively. For example, what happens when the model tries to merge a parent with a child? If opportunity A has children B, C, and D and the model merges A with B, the merge is directional. If the instruction is “keep A, delete B,” that works—the parent absorbs the child. But if the instruction is “keep B, delete A,” then C and D become orphans. These puzzles were solvable and even fun.
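
    The parent-child merge rule can be expressed as a small validity check. This is a sketch under my own assumptions: the function names are hypothetical, and I extend the rule from direct children to all descendants, which the example above does not spell out.

```python
def descendants(children: dict[str, set[str]], node: str) -> set[str]:
    """All nodes reachable below `node` in a parent -> children mapping."""
    seen: set[str] = set()
    stack = [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def can_merge(children: dict[str, set[str]], keep: str, remove: str) -> bool:
    """Allow a merge unless it deletes an ancestor of the node being kept.

    Removing the ancestor would strand its other children.
    """
    return keep != remove and keep not in descendants(children, remove)
```

    With `children = {"A": {"B", "C", "D"}}`, `can_merge(children, keep="A", remove="B")` is allowed because the parent absorbs the child, while `can_merge(children, keep="B", remove="A")` is rejected.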

    Diagram of Opportunity Solution Tree merge rules: merging node B into parent A is allowed, while merging A into B is not because it would orphan opportunities B, C, and D.
    Visual explainer from Product Talk on AI-generated Opportunity Solution Trees. It contrasts an allowed merge (B into A) with a not-allowed merge (A into B) that leaves child opportunities orphaned, guiding safe hierarchy edits.

    Category 2 was harder. Despite prompt iterations, I could only push the discrepancy rate down to about 1 in 40 instances. With 10–20 LLM calls per run, that compounds quickly: if failures are independent, roughly 22% to 40% of runs still contain at least one bad change set. Not acceptable for production. I hit a wall. A paying customer was waiting, and more design partners were queued up.
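
    A back-of-the-envelope calculation shows how a small per-call discrepancy rate compounds across a run. It assumes failures are independent across calls, which is a simplification of mine and not something I measured; if failures cluster, or a run makes more calls, the observed rate can differ.

```python
def run_failure_rate(per_call_rate: float, calls: int) -> float:
    """Probability that at least one of `calls` independent calls goes wrong."""
    return 1 - (1 - per_call_rate) ** calls


for calls in (10, 20):
    print(f"{calls} calls: {run_failure_rate(1 / 40, calls):.0%} of runs affected")
```

    Under that assumption, 10 calls give about a 22% chance of at least one bad change set and 20 calls give about 40%. A per-call rate that sounds small is far too high once every run depends on many calls.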

    Next, I tried to correct the model’s mistakes with deterministic code. I had promised that my change sets would generate the output tree, so I wrote verifiers: detect conflicts (e.g., delete a node, then try to use it later), guard against data loss, prevent orphaned nodes, and more. Detection was straightforward; correction was not. Fixing issues required guessing the model’s intent. If the sequence said “delete A, then merge A with B,” should I remove A entirely or salvage A’s sources and children by merging into B? There were dozens of such cases with no unambiguous answer.
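
    Here is a minimal sketch of one such verifier, the "use after delete" check. The step format is a simplified assumption and the function only detects the conflict; it makes no attempt to guess what the model meant.

```python
def find_conflicts(steps: list[dict]) -> list[str]:
    """Flag steps that reference a node an earlier step already removed."""
    removed_at: dict[str, int] = {}
    errors: list[str] = []
    for i, step in enumerate(steps, start=1):
        op, node, other = step["op"], step["node"], step.get("into")
        # Nodes that must still exist for this step to make sense.
        if op == "merge":
            required = [node, other]
        elif op in {"delete", "reframe", "split"}:
            required = [node]
        else:
            required = []
        for node_id in required:
            if node_id in removed_at:
                errors.append(
                    f"step {i}: {op} uses {node_id}, removed in step {removed_at[node_id]}"
                )
        if op in {"delete", "merge"}:
            removed_at.setdefault(node, i)  # merge removes the node folded away
        elif op == "split":
            removed_at.pop(other, None)     # split (re)creates the new node
        elif op == "add":
            removed_at.pop(node, None)      # add (re)creates the node
    return errors
```

    For the sequence "delete A, then merge A with B" this returns one error that names the step and the earlier deletion. That message is the part that later becomes useful as feedback.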

    Workflow diagram titled 'My Simple Repair Loop' showing an iterative validation cycle: Generate the Change Set → Run the validation tool → Check Result, with branches to retry on failure or exit on pass.
    A step-by-step loop shows how changes are validated: generate a change set, run a validation tool, review the result, then repeat on failure and exit on pass—mirroring iterative work behind AI-built Opportunity Solution Trees.

    After 11 straight days of deep work—including weekends—I was exhausted. I dislike hustle culture; this isn’t how I design my life. But I was stuck, and then I had an insight.

    On a walk with my husband (also an engineer), I realized I could have the LLM repair its own mistakes. My data contract with Vistaly requires that the change set must generate the output tree. I had already built robust validation code. I knew exactly when a change set failed—and why. No amount of prompt tuning alone was fixing it. So I turned the validator into a tool for the model and created a simple agentic loop.

    The loop works like this: the model proposes a change set, calls the validation tool, and gets back a pass/fail plus specific feedback. If it fails, the model uses those instructions to repair the change set and calls the tool again. Iterate until success or a max number of turns.
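
    The loop described above can be sketched as follows. My prototype was in Node.js; this Python version is only an illustration, and `fake_model` and `validate` are stand-ins for the real LLM call and validation tool.

```python
from typing import Callable

Proposer = Callable[[str], list[str]]                   # feedback -> change set
Validator = Callable[[list[str]], tuple[bool, str]]     # change set -> (ok, feedback)


def repair_loop(propose: Proposer, validate: Validator,
                max_turns: int = 3) -> tuple[list[str], int]:
    """Ask for a change set, validate it, and feed errors back until it passes."""
    feedback = ""
    for turn in range(1, max_turns + 1):
        change_set = propose(feedback)
        ok, feedback = validate(change_set)
        if ok:
            return change_set, turn
    raise RuntimeError(f"no valid change set after {max_turns} turns: {feedback}")


def fake_model(feedback: str) -> list[str]:
    """Stand-in for the LLM: wrong on the first try, repaired once it sees feedback."""
    if not feedback:
        return ["delete A", "merge A into B"]
    return ["merge A into B"]


def validate(change_set: list[str]) -> tuple[bool, str]:
    """Stand-in for the validation tool: reject steps that use a deleted node."""
    deleted: set[str] = set()
    for step in change_set:
        verb, node = step.split()[:2]
        if node in deleted:
            return False, f"'{step}' uses {node} after it was deleted"
        if verb == "delete":
            deleted.add(node)
    return True, ""


result, turns = repair_loop(fake_model, validate)
print(result, turns)
```

    The turn limit matters as much as the feedback: without it, a model that cannot find a valid change set would keep spending compute without ever converging.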

    I prototyped in Node.js with a single model call, a verifier pass, and a repair attempt. At first, the loop didn’t converge—it just accumulated compute. I experimented with how to communicate errors, how much context to include, and how to sequence feedback. Eventually, it clicked: the model began fixing its own mistakes and typically returned a valid change set in one or two repairs. It was, in practice, eval-driven development applied to LLM outputs.

    I had already built an agent loop utility for another AI workflow, so I productionized quickly: model call, optional tool invocation, tool result returned to the model, repeat until the validator signals success or the loop times out. I integrated the new loop into the pipeline and shipped the revamped service to Vistaly on Monday at noon. They’re integrating now, and it will be in the hands of our design partners shortly. I’m relieved—and ready for a day off.

    Reflecting on the last two weeks, a few things stand out. First, I shed limiting beliefs about being an engineer. To make this reliable, I had to solve legitimately hard problems, and that feels good.

    Second, this was genuinely fun. Designing the action set and watching the model push those boundaries was like working through elegant puzzles. Models are incredibly creative, and harnessing that creativity with the right constraints is deeply satisfying.

    Third, I learned when I can and can’t trust Claude to write code for me. When Opus 4.6 came out, I gave Claude a much longer leash. After the past two weeks, Claude is back on a short leash. I found a lot of gaps in my implementation in areas where I had simply trusted that Claude got it right when in fact it hadn’t. If you don’t have the right infrastructure (planning, testing, code review), this can be disastrous. I’ll be investing more here and sharing what I learn.

    Finally, if this work had been spread over two months, it would have been thoroughly enjoyable. I’m discovering how much I like being an AI engineer. It feels like a new chapter where I can combine opportunity solution trees with modern AI engineering—and deliver real value to product teams doing continuous discovery.

    I’m excited to share more of what we’re building with Vistaly and to onboard more design partners soon. If you’re interested, get on the waiting list. And if you’ve been hesitant to stretch beyond your current skill set, I hope this story nudges you to take the first small step toward what’s just now possible.


    Inspired by this post on Product Talk.


  • Unlock High-Leverage PM Work: 5 Claude Cowork Playbooks to Turbocharge Your Strategy

    Unlock High-Leverage PM Work: 5 Claude Cowork Playbooks to Turbocharge Your Strategy

    In my role leading product teams, I’m relentless about freeing time for high-leverage work—clarifying strategy, sharpening positioning, and unblocking execution. Claude Cowork has become a reliable AI partner in that mission, helping me automate repeatable tasks while preserving judgment for the decisions that matter most.

    Get 5 playbooks to automate common product management tasks with Claude Cowork and free yourself for higher-leverage PM work.

    When I say “playbooks,” I mean structured, repeatable workflows that turn messy inputs into crisp outputs—without sacrificing rigor. With agentic AI, LLMs for product managers, and thoughtful prompt engineering, these playbooks plug directly into my product roadmapping and sprint planning process, accelerating discovery, analysis, and stakeholder alignment.

    Playbook 1: Continuous discovery synthesis. I route raw customer interviews, support threads, and behavioral analytics into Claude Cowork to cluster themes, extract Jobs-to-Be-Done, and propose opportunity areas. It drafts an initial opportunity solution tree with clear problem statements, target outcomes, and candidate solutions, which I then refine with the team. This shortens the loop between customer interviews and actionable insights while preserving the nuance that continuous discovery requires.

    Playbook 2: Strategy-to-roadmap alignment. Starting from our product strategy and target outcomes, I ask Claude Cowork to translate goals into a prioritized roadmap, distinguishing outcome OKRs from output OKRs and showing driver trees that connect initiatives to measurable impact. It flags dependencies and suggests stakeholder management touchpoints, making the narrative behind prioritization transparent and easier to socialize across product trios and leadership.

    Playbook 3: Experiment design and A/B testing. To move from ideas to evidence, I have Claude Cowork generate testable hypotheses, success metrics, and guardrails for A/B testing. It produces experiment briefs, checks statistical assumptions like minimum detectable effect (MDE), and suggests instrumentation plans for tools such as Amplitude analytics. I use these drafts to speed up reviews without compromising on methodological rigor.
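
    To make the MDE check concrete, here is a rough sketch of the kind of sanity check I mean. It uses the standard normal-approximation formula for a two-sided, two-proportion test. The function name `sample_size_per_arm` and the example numbers (10% baseline, 2-point absolute MDE) are illustrative assumptions, not figures from a real experiment.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(
    baseline_rate: float,
    mde_abs: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> int:
    """Approximate users needed per variant to detect an absolute lift of mde_abs.

    Uses the normal approximation for a two-sided test of two proportions:
    n = (z_{alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    if not 0 < p1 < 1 or not 0 < p2 < 1:
        raise ValueError("baseline and baseline + MDE must both be in (0, 1)")
    if mde_abs == 0:
        raise ValueError("MDE must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


# Illustrative inputs: 10% baseline conversion, detect a 2-point absolute lift.
n = sample_size_per_arm(baseline_rate=0.10, mde_abs=0.02)
print(f"users per arm: {n}")
```

    Halving the MDE roughly quadruples the required sample, which is usually the first thing worth checking in an experiment brief before anyone commits to a timeline.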

    Playbook 4: Launch communications and in-product guidance. After we ship, I leverage Claude Cowork to assemble UX writing, release notes, and in-app guides tailored to user segments. It proposes short product tours, contextual tooltips, and support macros that keep messaging consistent across Pendo or Intercom while reinforcing our value proposition. The result is faster, more cohesive go-to-market execution with fewer round-trips.

    Playbook 5: AI risk, governance, and quality checks. Before anything goes live, I use Claude Cowork to run structured reviews for data governance, privacy-by-design, and AI risk management. It helps draft acceptance criteria, red-team prompts for edge cases, and an eval-driven development checklist so the team can track model behavior and mitigate regressions over time. These safeguards maintain trust as we scale AI workflows across the product surface.

    To make these playbooks sing, I seed Claude Cowork with a retrieval-first pipeline of canonical docs—vision, strategy, OKRs, analytics dashboards, and definition-of-done checklists—plus prompt templates tuned for our voice and review standards. Tight context window management, explicit role instructions, and lightweight evaluations keep outputs accurate, auditable, and on-brand.

    The impact has been compounding: faster discovery-to-decision cycles, clearer roadmaps tied to outcomes, stronger experiments, and launch content that lands. Most importantly, the team spends more time on creative problem solving and stakeholder partnership, not manual synthesis or formatting. If you’re ready to reclaim your calendar and elevate your product strategy, start with these five Claude Cowork playbooks and iterate from there.


    Inspired by this post on Amplitude – Perspectives.


  • Intercom Rebrands to Fin: Why Shedding Brand Baggage Powers the Next AI Era

    Intercom Rebrands to Fin: Why Shedding Brand Baggage Powers the Next AI Era

    Sometimes a corporate rename lands with such obvious inevitability—and such lateness—that it feels like a quiet confession. As a product leader, I’ve wrestled with that timing question: move early and risk confusion, or wait and risk stagnation. In this case, the industry finally received the clarity it has been circling for years.

    The announcement was clear: “we’re changing the name of our company to Fin.” Crucially, the name Intercom will continue as the customer service software platform that many of the best brands rely on as their primary help desk. The team also “just launched a complete rebuild, Intercom 2,” and is doubling down investment in that product. In other words, the company brand now matches its leading customer agent platform—Fin—while Intercom remains the flagship product line.

    From a product strategy and brand architecture perspective, this move aligns the corporate identity with the growth engine. I’ve seen too many winners of a prior era cling to yesterday’s positioning while markets shift under their feet. The phrase that keeps echoing in my mind—because it’s true in practice—is that “the only path to success in the future is through destroying your past.” Culture, pricing models, product lineup, investment priorities—those can evolve. But until the company name evolves, the market’s mental model often does not.

    It’s telling that three years ago, when the team effectively created the service agent category, they led with Fin and kept Intercom in the background. That wasn’t indecision—it was smart category design. Humans don’t frequently remap old concepts; we add new ones. We don’t wake up reinterpreting what a chair is, but we do invest energy to understand a new kind of drone or an intelligent software agent. New categories deserve new names, or they’ll be dragged back into old expectations.

    This is where product positioning meets competitive differentiation. Newcomers without legacy baggage enjoy a clean slate; they never have to convince the market they’ve changed because they never had an old position to defend. Even with provably superior technology, an incumbent can find itself explaining rather than advancing. I’ve led naming and repositioning work where the hardest task wasn’t shipping new capabilities—it was unseating the entrenched narrative in customers’ heads.

    So, “baggage be gone.” Fin is clearly positioned as the future of the customer agent category and is poised to become the largest part of the business. Intercom, as a product brand, very much lives on—and with “Intercom 2” now in the world, the product roadmap and investment thesis are unambiguous. The core takeaway for product management leadership: align corporate naming with your category-creating bet, then let go. That’s how you turn momentum into market leadership.

    For leaders working through similar decisions, here’s the lesson I’m taking to my own teams: rebrands aren’t about logos; they’re about narrative clarity and execution velocity. When the corporate name and the breakout product share the same story, go-to-market motions get sharper, customer understanding improves, and AI strategy integrates more naturally into customer support workflows. Naming follows strategy, not the other way around.


    Inspired by this post on The Intercom Blog.


  • Beyond the Product Builder Hype: How AI, org design, and joy shape PM success

    Beyond the Product Builder Hype: How AI, org design, and joy shape PM success

    I recently spent time with the debate behind the "product builder" trend—asking whether it’s the future of product management or just another wave of tech FOMO. The conversation featuring Teresa Torres and Petra Wille is a useful prompt, but what matters most is how we translate these ideas into healthy product practices inside our own organizations.

    Here’s my take: the product builder movement is neither a mandate nor a fad—it’s a tool. The right question isn’t "should product managers code?" but whether leaning into building advances outcomes for our customers and our teams. In practice, that means letting interest and skill—not pressure—set the pace.

    Petra captured it perfectly: "Just because I can do it — is it something I enjoy doing? And do I have enough experience to really get into the flow?" Those two tests—joy and depth—are underrated filters. I’ve seen PMs light up when prototyping or vibe coding a thin slice, and I’ve also seen well-meaning dabbling create hidden complexity that slows everyone down later.

    Org design determines whether this works. It’s not about the tools—it’s about clarity of roles, healthy interfaces between product, design, and engineering, and explicit guardrails for where experiments stop and production begins. AI has raised the stakes: "AI can make unskilled work look polished. That’s a feature and a bug — executives see the shine, engineers inherit the mess." If you’ve ever watched a glossy demo turn into weeks of refactors, you know exactly what this looks like.

    To avoid that trap, I deliberately separate the three layers where AI is changing product work: personal productivity, team process, and product strategy. Treating these as different stacks keeps expectations clean: a prompt that accelerates personal workflows isn’t the same as an AI-enhanced process that reshapes delivery, and neither automatically produces durable product advantage. Don’t conflate them.

    Discovery remains stubbornly human. "Why discovery still requires talking to your customers (sorry)" is more than a friendly nudge. AI can broaden our search space and sharpen analysis, but it doesn’t replace qualitative conversations or the judgment that comes from pattern recognition across real customer contexts. Continuous discovery and disciplined customer interviews are still the most reliable compasses we have.

    Where does "vibe coding" fit? It’s great for roughing out concepts, de-risking slices, and communicating intent when words or static mocks won’t cut it. Tools like Claude Code make this faster than ever, and familiar stacks like Ruby on Rails lower the bar for spinning up functional prototypes. But remember the design system trap: AI can make bad decisions look good on the surface. If you don’t control for architecture, accessibility, data contracts, and handoff quality, your team pays the integration tax later.

    In well-set-up orgs, the output-oriented muscle memory gets rewired. When AI frees up time, strong teams reinvest it into better problem framing, sharper opportunity solution trees, and tighter product strategy—rather than simply chasing more output. That’s a leadership challenge, not a tooling problem, and it shows up quickly in how teams make trade-offs.

    Here’s how I operationalize this with empowered product teams: we articulate clear boundaries for prototypes versus shippable code, define decision rights for when PMs or designers "build," and align on review gates that protect quality without stifling speed. We also make the three AI layers explicit in roadmapping and retros, so improvements to personal workflows don’t get mistaken for strategic advantage.

    My distilled guidance echoes the episode’s throughline. The product builder trend isn’t a mandate — it’s a tool. Let enjoyment and skill guide who on your team leans into it. Organizational readiness determines whether AI empowers your team or creates chaos. Don’t conflate personal efficiency, process change, and product impact—they require different responses. Discovery fundamentals haven’t changed; AI helps you go deeper, not skip the work. And the real takeaway on product builders: not everyone has to build, but everyone can if they want to.

    If you want to hear the full discussion that sparked these reflections, listen on Spotify or Apple Podcasts. Then tell me: where will you apply builder energy in your team—and where will you deliberately say no?

    Resources & Links: Follow Teresa Torres: https://ProductTalk.org. Follow Petra Wille: https://Petra-Wille.com. Mentioned in this episode: Claude Code, Vibe coding, Ruby on Rails.

    One more quote I loved because it centers autonomy and craft: "It’s a tool in our toolbox. We can decide who on our team has fun with it, wants to do it, wants to contribute." That’s the mindset that sustains both momentum and morale.


    Inspired by this post on Product Talk.


  • My Playbook for a Smarter Feature Launch Slack Channel with Agents, Feature Flags, and Readouts

    My Playbook for a Smarter Feature Launch Slack Channel with Agents, Feature Flags, and Readouts

    Feature launches move fast, and the Slack channel is our command center. Recently, I leveled it up with agentic AI so every data question, feature flag decision, and post-launch readout lives in one trusted place—faster, clearer, and with less operational drag on the team.

    Learn how to set up your launch Slack channel so agents handle your data questions, feature flags, and post-launch readouts in one place.

    Here’s the strategy I use. I treat the launch Slack channel like a real-time control room: agentic AI handles the repetitive asks, experts handle the judgment calls, and stakeholders stay aligned through crisp, automated summaries. The result is tighter stakeholder management, quicker go/no-go calls, and fewer meetings—without sacrificing data quality or governance.

    First, I set clear channel rituals. I name the space #launch-[feature], declare scope and SLAs, and pin the success metrics, dashboards, and rollout plan. Product, engineering, data, support, and GTM all join. I keep threads focused: one for metrics, one for incidents, one for enablement, one for feedback. This small bit of structure makes agent responses and human follow-ups easy to find.

    Next, I add a data questions agent. The agent connects to approved sources and answers the most common queries—activation by cohort, conversion by segment, latency by region—directly in-thread with citations and timestamps. When the question requires nuance, the agent routes to an owner and posts a handoff note, preserving context. This keeps our AI workflows safe and reliable while giving the team quick visibility.

    Then I wire in a feature flags agent. It exposes read-only status by environment, shows rollout percentages, and links to change history. When a toggle is requested, the agent enforces approvals and logs who asked, who approved, and why. We can pause, ramp, or roll back in seconds—with auditability intact. Feature flags become an operational muscle instead of a bottleneck.
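
    The approval-and-audit behavior can be sketched in a few lines. This is an illustration under stated assumptions rather than any vendor's API: `FlagGate` and `AuditEntry` are hypothetical names, the state lives in memory, and the rule that a requester cannot approve their own change is my own assumption about what "enforces approvals" should mean.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class AuditEntry:
    flag: str
    environment: str
    rollout_percent: int
    requested_by: str
    approved_by: str
    reason: str
    at: datetime


@dataclass
class FlagGate:
    approvers: Set[str]
    rollout: Dict[Tuple[str, str], int] = field(default_factory=dict)
    audit_log: List[AuditEntry] = field(default_factory=list)

    def status(self, flag: str, environment: str) -> int:
        """Read-only view: current rollout percentage (0 if never set)."""
        return self.rollout.get((flag, environment), 0)

    def change(
        self,
        flag: str,
        environment: str,
        rollout_percent: int,
        requested_by: str,
        approved_by: str,
        reason: str,
    ) -> AuditEntry:
        """Apply a rollout change only if it is approved, then log who, what, and why."""
        if not 0 <= rollout_percent <= 100:
            raise ValueError("rollout_percent must be between 0 and 100")
        if approved_by not in self.approvers:
            raise PermissionError(f"{approved_by} is not an approver")
        if approved_by == requested_by:  # assumption: no self-approval
            raise PermissionError("requester cannot approve their own change")
        if not reason.strip():
            raise ValueError("a reason is required for the audit log")
        entry = AuditEntry(
            flag, environment, rollout_percent,
            requested_by, approved_by, reason,
            datetime.now(timezone.utc),
        )
        self.rollout[(flag, environment)] = rollout_percent
        self.audit_log.append(entry)
        return entry


# Hypothetical flag, people, and reasons for illustration.
gate = FlagGate(approvers={"eng-lead", "pm-lead"})
gate.change("new-checkout", "production", 10, "dana", "eng-lead", "start ramp")
gate.change("new-checkout", "production", 0, "dana", "pm-lead", "rollback after latency spike")
print(gate.status("new-checkout", "production"), len(gate.audit_log))
```

    Pausing, ramping, and rolling back are all the same operation with a different percentage, so one approval path and one log cover every case.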

    Finally, I schedule post-launch readouts. The readout agent publishes T+1 hour, T+24 hours, and T+7 days summaries: adoption, performance, anomalies, and key learnings. It highlights A/B testing results, flags outliers, and threads follow-up actions to owners. The team gets a single source of truth for post-launch readouts without scrambling across tools.
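
    The readout cadence reduces to a small scheduling helper. The sketch below assumes a hypothetical launch timestamp and hypothetical function names (`readout_schedule`, `due_readouts`). How the readout is actually posted to Slack is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Set, Tuple

# The three checkpoints described above, as offsets from launch time.
READOUT_OFFSETS = {
    "T+1h": timedelta(hours=1),
    "T+24h": timedelta(hours=24),
    "T+7d": timedelta(days=7),
}


def readout_schedule(launched_at: datetime) -> List[Tuple[str, datetime]]:
    """Return (label, due time) pairs for each post-launch readout."""
    return [(label, launched_at + offset) for label, offset in READOUT_OFFSETS.items()]


def due_readouts(launched_at: datetime, now: datetime, already_sent: Set[str]) -> List[str]:
    """Labels of readouts whose due time has passed and that have not been sent yet."""
    return [
        label
        for label, due_at in readout_schedule(launched_at)
        if due_at <= now and label not in already_sent
    ]


# Hypothetical launch time for illustration.
launched = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
for label, due_at in readout_schedule(launched):
    print(label, due_at.isoformat())
```

    Tracking which labels have already been sent keeps a scheduler that runs every few minutes from posting the same readout twice.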

    Governance matters. I apply role-based access, protect PII, and make the agent cite sources so we can trust what we see. I use Agent Analytics to monitor response accuracy, deflection, and time-to-answer, then refine prompts and permissions. This is practical AI risk management: clear boundaries, human-in-the-loop for consequential decisions, and transparent logs.

    The impact has been real: faster decisions during go-to-market, fewer pings to data and engineering, and higher confidence in our product management rituals. Centralizing “questions, flags, and readouts” in Slack doesn’t replace expertise—it frees it to focus on the hard problems.

    If you’re rolling this out, start small: define the channel, pin your metrics, launch the data agent with a handful of approved queries, add the feature flags agent with strict approvals, and automate a simple daily readout. Iterate weekly. Within one or two launches, you’ll feel the compounding benefits.


    Inspired by this post on Amplitude – Best Practices.


  • Outcome-Based Pricing That Delivers: Pay $10 Only for Qualified Leads with Fin for Sales

    Outcome-Based Pricing That Delivers: Pay $10 Only for Qualified Leads with Fin for Sales

    Our outcome-based pricing model hinges on one principle: you pay when Fin delivers value.

    As Fin takes on new roles, that principle doesn’t change, but the definition of value does.

    Fin for Sales qualifies leads, engages prospects, and routes high-intent buyers to your sales team. The value it creates isn’t a resolved query, but a pipeline of qualified opportunities. So we price accordingly: $10 per qualified lead. And you, the customer, define what “qualified” means, not Fin.

    This is the first outcome-based pricing model for an AI Agent for sales. Here’s why I believe it’s the right approach and how I’ve seen it change the way teams think about SaaS pricing and ROI.

    Over the years, I’ve learned that the fastest way to earn trust with sales and finance leaders is to align pricing with outcomes they actually report on. The core finding from our research was unambiguous: zero buyers preferred paying for activity. They wanted to pay for results.

    That insight shaped how we priced Fin for its service role: $0.99 per resolution, where a resolution means the customer’s issue is fully solved without human intervention. More recently, we evolved that model to price on outcomes, reflecting the broader ways Fin delivers value across complex workflows. We believe pricing should be aligned with value delivery, and the vendor should carry risk when the product doesn’t perform. In sales, the best unit of value is pipeline.

    Most sales teams today are overwhelmed by leads. Early in my career, I watched reps spend hours chasing form fills that looked promising but went nowhere. That experience cemented a lesson I still use: volume is vanity; qualification is sanity.

    What makes the difference is getting the right opportunities to your sales team promptly. When a prospect visits your site, engages with Fin, answers qualifying questions, and is directed to a sales rep, Fin has determined that the opportunity is worth your team’s time. That is the value it delivers.

    Charging per conversation would penalize businesses for every curious visitor who asks a question but isn’t a buyer. And charging per token, well, that’s always been a model that protects the vendor, not the customer.

    We needed a metric that captures the actual value Fin creates in a sales context: qualified leads.

    The purest version of outcome-based pricing for Fin’s sales role would be a percentage of closed revenue. Fin qualifies the lead, a rep closes the deal, and we take a cut. On paper, it looks elegant; in practice, I found it breaks down for two reasons that matter to operators.

    First, attribution. Between the moment Fin qualifies a lead and the moment a deal closes, dozens of things can impact the final result. The quality of human-led demos can differ, products can have outages, prospects’ budgets can get cut. Tying Fin’s price to the final outcome holds it accountable for variables entirely outside its control.

    Second, measurement. To track closed revenue, we’d need deep integration into every customer’s CRM, tracking each opportunity from qualification through to close. That’s a significant implementation burden that slows time to value, which is the opposite of what we want.

    So we asked: what’s the most honest proxy for the value Fin delivers, where Fin is clearly the one creating it?

    A qualified lead is that proxy. It represents the moment Fin has done its job. It has engaged the prospect, gathered the relevant information, evaluated them against your criteria, and determined they’re qualified. Everything up to that point is Fin’s work. Everything after it is the rep’s. At $10 per qualified lead, the pricing reflects this boundary.

    There are two key components to how this pricing model works.

    First, the customer defines success. With Fin’s sales role, the customer sets their own qualification criteria based on their business context. A company with high average contract values might set a lower bar because they can’t afford to miss anyone. A company where rep time is scarce and deal sizes are smaller might set a much higher bar, filtering aggressively to only surface the most promising prospects. The criteria flex to match the business.

    Second, the economics are different by design. As a Customer Agent, Fin can switch between roles like sales and service. So if you’ve deployed Fin for Sales, it can still handle support queries, such as a prospect asking a product question. Those queries are charged at $1 per resolution, consistent with our service pricing. Disqualifications, where Fin determines a prospect doesn’t meet the criteria, are also $1. The $10 price point for qualified leads reflects the higher value of pipeline creation compared to issue resolution.
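
    To see how the three price points combine, here is a small worked example. The prices are the ones quoted in this section ($10 per qualified lead, $1 per disqualification, $1 per resolution). The monthly volumes and the function name `fin_sales_cost` are made up purely for illustration.

```python
from decimal import Decimal

# Prices as quoted in this section; passed as parameters so they are easy to change.
PRICE_QUALIFIED_LEAD = Decimal("10")
PRICE_DISQUALIFICATION = Decimal("1")
PRICE_RESOLUTION = Decimal("1")


def fin_sales_cost(
    qualified_leads: int,
    disqualifications: int,
    resolutions: int,
    price_qualified: Decimal = PRICE_QUALIFIED_LEAD,
    price_disqualified: Decimal = PRICE_DISQUALIFICATION,
    price_resolution: Decimal = PRICE_RESOLUTION,
) -> Decimal:
    """Total charge for a period: each outcome type multiplied by its unit price."""
    if min(qualified_leads, disqualifications, resolutions) < 0:
        raise ValueError("counts must be non-negative")
    return (
        qualified_leads * price_qualified
        + disqualifications * price_disqualified
        + resolutions * price_resolution
    )


# Hypothetical month: 120 qualified leads, 300 disqualifications, 450 support resolutions.
total = fin_sales_cost(qualified_leads=120, disqualifications=300, resolutions=450)
print(f"total: ${total}")  # 120*10 + 300*1 + 450*1 = 1950
```

    In this made-up month, qualified leads are well under a fifth of the billable events but account for most of the bill, which is the intended shape: the charge concentrates where pipeline is created.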

    The ROI speaks for itself. Early customers are reporting significant returns using Fin for Sales. One shared a perspective that mirrors what I hear in executive QBRs:

    “I would say it’s at least 10 times the value. You’re now giving the business exactly what it needs as opposed to just activity. We say this expression in sales leadership all the time – ‘I don’t pay my sales team for activity. I pay them for results.’ I want my AI engine to be the same way.”

    When you compare the cost of a qualified lead from Fin against the fully loaded cost of an SDR (salary, benefits, tooling, ramp time), the economics are compelling. For many businesses, particularly those that never had SDRs in the first place, Fin for Sales isn’t just replacing headcount; it’s creating an entirely new capability that wasn’t economically viable before.

    This pricing model came from extensive customer research—qualitative interviews and quantitative studies—exploring how buyers want to pay for AI in a sales context. We tested multiple concepts: per-conversation, per-token, per-seat, revenue share, and per-qualified-lead. The research consistently pointed to outcome-aligned pricing as the preferred model, with the qualified lead emerging as the metric that best balances value alignment, measurability, and practical implementation.

    Outcome-based pricing is still rare in AI, but we think that will change. For Sales Agents, we’re the first to do it. Transparency is part of the model. If you understand why we price the way we do, you can evaluate whether it works for your business.


    Inspired by this post on The Intercom Blog.

