Tag: AI workflows

  • Unleashing Inbound Sales with AI: My Playbook for Launching and Scaling Sales Agents Fast

    Unleashing Inbound Sales with AI: My Playbook for Launching and Scaling Sales Agents Fast

    Inbound leads shouldn’t wait for a rep’s calendar. When we first launched The Service Agent Blueprint, support leaders finally had a clear AI path. Go-to-market and revenue teams are now facing similar uncertainty, so I’m introducing The Sales Agent Blueprint—a practical map for launching and scaling AI for sales with confidence.

    For most sales teams, inbound motions require a lot of manual work. I’ve watched leads pile up in queues, waiting for availability rather than being prioritized by buyer intent. That delay costs meetings, pipeline, and momentum—and it’s exactly where a modern AI Strategy can transform your go-to-market strategy.

    Agents can run sales conversations end to end – engaging buyers, qualifying leads, and routing high-intent opportunities to the right team to move prospective buyers forward quickly. Humans will still be involved, but will move their focus to the consultative conversations and higher-value work they did not have time to focus on before. In practice, this shift enables cleaner AI workflows, better conversation design, and a healthier balance between sales-led growth and product-led growth.

    The questions many go-to-market and revenue leaders are facing now are where do you start? What should success look like? How do you actually test and deploy these solutions? These are the right questions—and the ones I hear most often when teams weigh build vs buy decisions, evaluation frameworks, and CRM integration nuances.

    The Sales Agent Blueprint answers those questions. It’s designed to be a strategic guide for sales, revenue, and AI transformation leaders who want to deploy AI for inbound sales fast, prove value, and build momentum. If you’re aiming for eval-driven development, this will help you define success up front and operationalize it.

    What’s inside is simple by design yet deep enough to take you from zero to value. The Sales Agent Blueprint is structured around two tracks that reflect how high-performing teams adopt agentic AI: first, launch for quick wins; next, scale for durable growth.

    Minimal blue banner for Introducing the Sales Agent Blueprint with a bold 'Scale it' headline, abstract halftone device graphic, subtle crop marks, and a 'Coming Soon' badge in the upper-right corner.
    Coming soon: Sales Agent Blueprint. A sleek, blueprint-inspired teaser with the call to 'Scale it' signals tools, playbooks, and workflows to grow revenue, streamline operations, and scale teams with confidence.

    Today, I’m releasing the first part of the Blueprint: “Launch it.” It’s a practical guide for getting your Agent live and seeing real results. You’ll learn how to deploy a Sales Agent that runs inbound sales conversations end to end, engaging buyers, qualifying leads, and routing high-intent opportunities to the right outcome in real time—without disrupting your current CRM integration or pipeline processes.

    By the end of the “Launch it” track, you’ll be ready to execute with clarity. Here’s how I frame the essential steps, based on what consistently works in the field.

    Understand what a Sales Agent is: Discover why they’re different from chatbots and how they work. Build a business case: Prove the basic economics of AI, decide whether to buy or build, and get the buy-in and budget you need to move forward.

    Evaluate an Agent: Learn how to define success, choose the right evaluation criteria, and run a focused, high-impact assessment with our five-step framework.

    Deploy with confidence: Build a deployment plan that gets your Agent live quickly to engage buyers at peak intent. Learn what to expect at each stage.

    Vector-style 'Blueprint' title on a light grid with Bézier points, plus a royal-blue panel reading '1 Launch it' next to a satellite icon; footer shows FIN.AI/BLUEPRINT/SALES promoting the Sales Agent Blueprint.
    Introducing the Sales Agent Blueprint. This crisp, grid-based graphic spotlights step 1—Launch it—signaling day-one activation for an AI sales agent. Explore the framework and get started at fin.ai/blueprint/sales.

    Continuously improve performance: After launch, your Agent becomes a system to manage. We’ll show you how to implement a repeatable process to train, test, deploy, and optimize.

    The second track, “Scale it” (coming soon), focuses on the organizational and systems design work that unlocks compounding gains. Launching AI is only the beginning. To unlock its full potential, you need to rewire your inbound sales motion—redesigning the buyer journey, building AI-first systems and ownership models, and rethinking how pipeline is generated and scaled. This is where governance, measurement, and team roles evolve to support sustainable growth.

    I’ll be building this Blueprint in public as I navigate the same challenges—sharing what works, what to avoid, and how to accelerate time-to-value without sacrificing quality or trust. If you’re ready to turn intent into revenue with agentic AI, this is your head start.

    The Sales Agent Blueprint is live now. Explore the full guide at fin.ai/blueprint/sales and start your “Launch it” sprint today.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • My Essential AI Toolbox for Product Managers: Tested Picks, Prompts, Workflows + Checklists

    My Essential AI Toolbox for Product Managers: Tested Picks, Prompts, Workflows + Checklists

    I created this practical guide to help product managers cut through the hype and apply AI where it genuinely moves the needle—faster discovery, clearer strategy, sharper execution, and measurable outcomes.

    A practical guide to AI tools for product managers: tested picks, what each tool is best for, copy-paste prompts, workflows, and screenshot checklists.

    Leading product management at HighLevel, I’ve pressure-tested dozens of gen AI solutions across product discovery, roadmap planning, delivery, and go-to-market. In this guide, I map an AI product toolbox to core PM jobs-to-be-done so you can move from experimentation to repeatable impact with confidence.

    Expect clear recommendations on where each tool excels—LLMs for product managers, research synthesis for customer interviews, behavioral analytics for opportunity sizing, and lightweight automation for in-app guides and product tours. I connect these tools to proven practices like continuous discovery, outcomes vs output OKRs, and product roadmapping and sprint planning so you can operationalize AI inside your existing workflows.

    I also share the evaluation criteria I use before rollout—AI Strategy alignment, data governance and privacy-by-design, AI risk management, observability, and total cost of ownership. This eval-driven development approach helps teams avoid technology FOMO while creating defensible, trustworthy workflows that scale.

    To accelerate adoption, I’ve included copy-paste prompts (including prompt engineering patterns for both chat and voice), retrieval-first pipeline blueprints to ground your models in product docs and decision logs, and conversation design tips for support and success use cases. You’ll see step-by-step AI workflows that tie directly to journey mapping, opportunity solution trees, and Kano Model trade-offs.

    Every workflow comes with screenshot checklists you can use for onboarding or stakeholder management, making it easy to align ICs and leaders on the same operating picture. Whether you’re optimizing A/B testing, retention analysis, or QBRs vs OKRs, these checklists turn good intentions into repeatable rituals.

    Use this guide as your field companion to ship faster with higher confidence—reducing cycle time, improving signal in discovery, and building momentum for product-led growth. If you’re ready to translate generative AI into reliable PM leverage, start with the workflows, adapt the prompts, and make them your own.


    Inspired by this post on Product School.


    Book a consult png image
  • Fin for Sales: Instantly Engage, Qualify, and Close High‑Intent Leads with an AI Customer Agent

    Fin for Sales: Instantly Engage, Qualify, and Close High‑Intent Leads with an AI Customer Agent

    Today, I’m spotlighting Fin for Sales, a new role for Fin Customer Agent that runs your inbound sales motion end-to-end. From my vantage point leading product management and collaborating closely with revenue teams, this is a meaningful evolution in how we capture, qualify, and convert high-intent demand with precision and speed.

    The promise here is simple and powerful: a single Customer Agent with shared context, memory, and business goals that supports the entire journey from first touch to close. Fin for Sales brings Fin to the start of the customer journey so it can engage prospects, guide them through your funnel, and ensure the best opportunities reach your sales team without delay.

    At a high level, here’s what stands out to me in practice. Fin engages every prospect instantly at the moment intent is highest. It runs discovery like your best rep with clear pricing guidance, product education, and objection handling. It qualifies and routes in real time using your playbook and syncs full context to your CRM. And it closes deals while you sleep by booking meetings, starting trials, and steering buyers to the right next step—boosting MQLs, pipeline, and early close/win rates.

    Fin engages every prospect instantly. It starts the right conversation when interest peaks, re-engages before prospects go cold, and works on every channel, in every language, 24/7. In my experience, that immediacy is the difference between a lead that converts and a lead that disappears.

    Screenshot of a Fin for Sales chat widget on a dark abstract background, where an AI assistant compares Free vs Pro CRM plans, recommends Pro for reporting needs, and offers to book a sales call.
    Introducing Fin for Sales, a conversational assistant that qualifies prospects in real time. The chat compares Free vs Pro, spotlights reporting and Salesforce integrations, and invites users to book a call.

    Fin runs discovery like your best rep. It explains pricing, guides product discovery, handles objections, and personalizes each interaction based on who the prospect is and what they care about. This is where thoughtful conversation design and consistent playbook execution really compound.

    Fin qualifies and routes in real time. Using your playbook, it collects and enriches data about your prospects, sends qualified leads to your sales team or down self-serve paths, while syncing full context to your CRM. Your team never works the wrong lead. That’s operational rigor revenue leaders crave.

    Fin closes deals while you sleep. It can book meetings, start trials, and guide buyers to the right next step. Early customers are already seeing impressive results, increasing MQLs, growing pipeline and seeing close/win rates of nearly 50% in the first month. That’s the kind of lift that reshapes go-to-market strategy and forecasting confidence.

    Graphic showing Fin for Sales connecting a prospect insights panel to Salesforce. A dark UI card lists contact details and signals like purchase intent, opportunity, and timeline over blue shapes.
    Fin for Sales links customer agent insights with Salesforce, turning live conversations into rich profiles and lead scores. View key details, intent and opportunity signals, and guided next steps like booking a meeting.

    Why this matters: most online sales experiences still rely on forms, queues, and follow-ups—exactly when prospects want clarity and momentum. Hiring enough reps to cover every time zone, channel, and hour is unrealistic, and even the best teams burn cycles on leads that were never going to convert. I’ve watched high-intent demand slip through the cracks simply because the response wasn’t fast, consistent, or contextual enough.

    Revenue leaders need a system that meets every inbound interaction immediately, without sacrificing quality, and routes only the right opportunities to sales. Incremental automation doesn’t fix the core issue; an agentic approach does. Fin for Sales closes that gap by pairing instant engagement with disciplined qualification and crisp handoffs.

    How it works in the moment: when a prospect is actively exploring your site, any delay—a form, a queue, a “we’ll get back to you”—erodes intent. Fin engages in real time through the Spotlight Messenger, a new interface built specifically for sales conversations. It can proactively start a conversation based on context like the page someone is on or how they’re browsing, and it offers smart suggestions to kick-start engagement.

    Chat widget for Fin for Sales displaying an in-chat calendar and time-slot picker for March 2026, with Friday, March 9 highlighted and a Confirm booking button on a blue gradient background.
    Fin for Sales schedules meetings directly in chat. A sleek widget shows a March 2026 calendar with selectable time slots and a clear Confirm booking CTA, streamlining lead capture and speeding up sales follow-ups.

    Prospects who might have waited—or never reached out—now get answers immediately. Fin also works across channels including messenger and email, so buyers can engage however they prefer. Whether someone is browsing your pricing page at 2am or comparing features during a lunch break, Fin responds instantly and relevantly so no lead is left behind.

    To move prospects toward a decision, Fin guides personalized discovery conversations that clarify needs and accelerate choices. Four pillars make this consistent and trustworthy. Playbook: you brief Fin in natural language on desired outcomes and scenarios; it follows your rules, handles objections with approved guidance, and stays on track. Knowledge: it draws from your product knowledge base to answer pricing, features, and plan fit, and can reuse what you’ve already trained for customer service—no duplicate setup. Enrichment: once Fin learns a user’s email or name, it enriches that data with outside sources to improve qualification, personalization, and routing. Memory: if Fin recognizes a returning visitor, it remembers context so the buyer never starts over.

    As conversations progress, Fin surfaces the opportunities most likely to close. It qualifies like your best SDR—asking about use case, budget, fit, and timing—and applies your existing playbook to identify the strongest opportunities. Details captured in conversation, plus enrichment, produce a complete picture that’s structured and synced into your CRM for immediate sales action. And when a lead isn’t a fit, Fin gracefully disqualifies or redirects to self-serve resources, ensuring your pipeline stays focused.

    Minimalist hero graphic with the headline 'Add Fin to your sales team today,' a glossy 3D blue spiral at center, and a black 'Start free trial' button, promoting Fin for Sales as an AI customer agent.
    Introduce Fin for Sales to your team with this clean hero banner: bold headline, signature blue spiral, and a clear 'Start free trial' call to action—inviting readers to explore an AI customer agent built for revenue.

    When a lead is ready to act, Fin closes. It books meetings via tools like Chili Piper or Calendly, guides qualified buyers into trials or subscriptions, and routes opportunities to your sales team with full context. Crucially, it passes the full conversation history and an AI-generated summary so reps pick up exactly where the buyer left off—no repeated questions, no lost nuance. For self-serve motions, Fin can guide prospects from discovery to trial signup or even paid conversion, automatically assigning the right path.

    Real results underscore the model’s value. Fin is already delivering measurable results for early customers across different company sizes, sales motions, and go-to-market models. Attio, an AI CRM built for scaling go-to-market intelligently, deployed Fin to replace their traditional form-and-wait inbound flow with real-time conversational engagement. In three months, Fin handled over 1,600 conversations with website visitors, qualified more than 50 leads for sales, and routed over 30 applicants into their startup program. One returning prospect engaged with Fin, had their questions answered in real time, and converted to a paying customer at six times Attio’s average contract value.

    Fellow, an AI-powered meeting assistant and management platform, started by deploying Fin overnight, a window where no human was online and prospects waited up to 18 hours for a reply. In January alone, Fin booked 18 meetings the team would never have reached, converting at around 48%. Importantly, the human team maintained its booking rate while Fin added net-new meetings—proof that automation layered on top of strong human coverage can be additive, not cannibalistic.

    Fin for Sales is built on the same AI platform that powers the highest-performing Agent in customer service, which keeps the end-user experience consistent. If a prospect asks a support question mid-sales conversation, Fin can handle it—no handoffs to other vendors, no lost context. It shares knowledge and memory across its platform, always knows whether it’s talking to a prospect or a customer, and moves between roles as needed. Setup follows the same Fin Flywheel: Train, Test, Deploy, Analyze. Describe your sales playbook, qualification criteria, and routing rules in natural language; test in preview; deploy live; and use Analyze to understand performance and iterate quickly.

    Fin for Sales is available today, and there’s more coming. I share the conviction that the future is a single Customer Agent, vertically integrated down to the model layer, orchestrating customer experience across the entire lifecycle. If you want to see it in action, go to fin.ai/sales and talk to Fin—then imagine that instant, high-quality engagement running across your inbound sales engine, every hour of every day.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • AI Now Approves Our Pull Requests—Safely: Inside an Agentic, Auditable Review Engine

    AI Now Approves Our Pull Requests—Safely: Inside an Agentic, Auditable Review Engine

    At Intercom, shipping is our heartbeat. We push code to production hundreds of times a day, and I’ve seen firsthand how that pace sharpens our product instincts and forces clarity in our CI/CD practices.

    Engineers, engineering managers, designers, and PMs all contribute to this, safely. The average time from merging code to it running in production is 12 minutes. For me, that’s not just a vanity metric—it’s a DORA-style signal that our release pipeline and observability are aligned with the velocity our customers expect.

    I’ve long held a belief that might sound counterintuitive: speed is not the enemy of safety. It’s a prerequisite for it. Accumulating code creates risk. Shipping small batches minimizes it. The faster you ship, the smaller each change is, and the easier it is to catch problems, and roll back when something goes wrong as the context is still fresh in your head. That small-batch discipline underpins how I approach AI workflows and risk management across product teams.

    Today, over 93% of our pull requests (PRs) across our two main codebases are Agent-driven. And over 19% are auto-approved with no human reviewer in the loop. When I first saw those numbers at scale, I asked the same question you might be asking: are we trading rigor for speed? The answer lives in the data.

    I want to focus on that second number, and why I think it makes us safer. Most people hear “AI is approving our pull requests” and think that’s reckless. I thought so once, too—until I looked at the outcomes that actually matter.

    Last year, our CTO Darragh Curran set an explicit goal: double the productivity of our entire R&D organization within 12 months. Because the faster we can build and ship, the faster our customers get the capabilities they need. Ambitious? Absolutely. But the operational clarity that comes from such a target is invaluable for product leaders.

    Nine months later, we did it. The results were significant across the board, but here’s the stat that crystallized it for me: downtime from breaking code changes dropped 35%, even as our deployments doubled. Shipping faster made us safer. As we modernize how we build and ship software, we systematically surface bottlenecks and tackle them. One of the biggest we found? PR review.

    Humans simply don’t have the time or mental capacity to properly review the volume of AI-generated code we’re now producing. I’ve watched great engineers get stuck in review queues, or worse, feel pressure to rubber-stamp under time constraints—an anti-pattern I’ve battled in multiple orgs.

    When an AI Agent can produce a working implementation in minutes, waiting hours or days for a human to review it is an impedance mismatch. The production line is moving faster than the quality gate can keep up. When that happens, one of two things follows: either the queue backs up and velocity drops, or, more dangerously, humans start rubber-stamping. Glancing at a diff, skimming the description, clicking approve. Some companies are drifting into this failure mode silently. We chose to confront it head-on and built a rigorous solution.

    PR review, done properly, is complex. A good reviewer evaluates the problem statement, aligns the diff to intent, checks for safety and logical issues, applies deep product context, and scans for performance and anti-patterns. No single human can cover all of that on every PR at high deployment frequency. The truth—borne out by data—is that the human baseline we often assume is stronger than it really is.

    Bar chart showing AI-approved pull requests merge 5.2x faster than human-reviewed ones, with medians of 14.6 minutes vs 75.8 minutes, illustrating reduced PR cycle time from creation to merge.
    AI is accelerating code reviews: our data shows median merge time drops from 75.8 minutes with human review to just 14.6 minutes with AI approval—about 5.2x faster—while maintaining strong safety checks.

    So we asked ourselves: what if we could do better?

    Our PR review Agent doesn’t treat code review as a single task. It decomposes it into separate sub-jobs, each handled by an independent sub-Agent. One assesses the quality of the problem description. Another checks whether the diff actually aligns with the stated intent. Another reviews for safety concerns. Another checks for logical correctness. Another reviews against best practices and known anti-patterns. And so on. As a product leader, this is exactly the kind of agentic AI architecture I look for: specialized, auditable steps that strengthen the overall control plane.

    The result is that every PR is reviewed as if a dozen of our most tenured and knowledgeable engineers were all looking at it simultaneously, each bringing their own specialist lens. In the past, getting that breadth of review on a single PR was impossible. Now it’s the default. And unlike ad hoc human review, this system is consistent and tireless.

    A human reviewer typically focuses on the actual code changes, the diff. Our Agent goes deeper. It traces execution paths, following the implications of a change through the codebase. This is something humans rarely had time to do, even when they wanted to.

    While testing our new PR review Agent on a set of historical PRs, we found it flagging a one-line text copy change as incorrect. On the surface, it looked completely harmless, just a text update. We assumed it was a mistake, but it wasn’t. Our Agent caught that the new copy contradicted an existing validation mechanism elsewhere in the codebase. No human reviewer would have realistically found this unless they happened to have written that validation code very recently. Our Agent catches this kind of thing consistently, every time, because it’s always tracing execution.

    The review isn’t generic either. It’s grounded in Intercom-specific guidance that our engineers have built and continue to refine, encoding the same context, standards, and product knowledge they’d apply if they were reviewing the PR themselves. When the Agent reviews a PR, engineers flag whether the review comments were helpful or not, and that feedback continuously sharpens the guidance. It’s a flywheel: the more our engineers invest in teaching the system how to think about our codebase, the better every subsequent review gets. This is eval-driven development in action.

    Automated approval is also never forced. Any engineer can request a human review on any change, at any time. The system is a tool, not a mandate. At Intercom, shipping code doesn’t end at merge. The engineer who ships a change is expected to watch it go live, monitor its behaviour in production, and be ready to roll back if something isn’t right. AI approval doesn’t change that. The human who ships the code remains accountable for the outcome.

    Graph showing 19.2% of all PRs fully auto-approved by AI, 60% are evaluated by AI

    The naive take on AI-approved PRs is that it’s just a rubber-stamp LLM call so that humans don’t have to bother. A convenience feature. That misses what’s actually happening. Our Agent is strict. It won’t approve large PRs. If a change is too big, too complex, or too broad in scope, it flags it and requires it to be broken down. That design nudges engineers toward smaller, well-scoped changes—the safest way to ship, review, test, and, if needed, roll back.

    This matters enormously for safety. Small changes are easier to review, easier to test, easier to understand, and, critically, easier to roll back when something goes wrong. This is the same principle that has always underpinned our shipping culture, but now the PR review Agent actively enforces it. As someone who’s owned incident management and SRE partnerships, I can’t overstate how powerful this is.

    Bar chart of revert rates by code author type, comparing human-authored vs AI-authored code for backend and frontend; AI shows about 10x lower reverts (0.53% vs 5.39% backend, 0.22% vs 2.00% frontend).
    A snapshot of our code review results: AI-authored pull requests are reverted far less often than human-written ones—around 10x lower—across both stacks, with 0.53% vs 5.39% in backend and 0.22% vs 2.00% in frontend, signaling safer merges.

    It’s tempting to look at a goal like “>50% AI-approved PRs” and worry we’re optimizing for a metric rather than an outcome. I see it differently. The real goal is to remove a bottleneck that, if left unchecked, pushes people toward rubber-stamping. By elevating the review bar and keeping batch sizes small, we protect both speed and stability.

    We didn’t assume AI review would be good enough; we actively ran an experiment. Our hypothesis was that AI review could match or outperform human review quality, measured by outcomes: were the changes correct? Did they cause problems in production? How quickly were they reviewed and approved?

    We started with a controlled pilot of over 100 PRs through the AI approval pipeline. The results: zero reverts of AI-approved PRs, and a 6–16x improvement in time-to-approval at the 75th percentile. Since then, the system has scaled significantly. In the first four weeks of broader rollout, 497 PRs went fully autonomous, with Claude writing the code and our AI approval system reviewing, approving, and shipping to production.

    Graph showing AI approval is 5x faster than human review

    Beyond the approval pipeline itself, we also looked more broadly at how AI-authored code performs in production compared to human-authored code. AI-authored backend code had a revert rate of 0.53%, compared to 5.39% for human-authored. On the frontend, it was 0.22% versus 2.00%.

    10X lower revert rate for AI-Authored code

    AI-authored code, reviewed and approved through our automated pipeline, is being reverted at a fraction of the rate of human-authored, human-approved code. I don’t expect that to stay at zero forever, but the evidence shows the quality bar our Agent holds is at least as high as the one humans were holding, and in many cases higher. And here’s the humbling perspective: the product changes that caused outages in the past? They were all reviewed and approved by humans. Human review is not a guarantee of safety. It never was.

    Everything I’ve described—the sub-Agent architecture, the traceability, the labeling, the data—wasn’t just built for speed. It was built for auditability. Every AI-approved PR is labelled, logged, and queryable. The review comments, the approval decision, the test results, the merge event: all recorded. The evidence an auditor expects to see is the same whether a human or an AI approved the change. The “who” may change, but the “what” doesn’t. That’s how you meet SOC 2, HIPAA, ISO 27001, ISO 42001, and AIUC-1 without compromising agility.

    We engaged our auditors, Schellman, early, before we scaled. We proactively worked with them to confirm that our automated review processes and the evidence they produce meet the requirements of our compliance frameworks, including SOC 2, HIPAA, ISO 27001, ISO 42001, and AIUC-1, among others. We think AI-driven change management can meet and exceed the standards that human-driven processes set, and we want to help prove that. In my experience, when you build for safety, compliance follows—never the other way around.

    You can only go so far with PR review as a safety mechanism, no matter how good the reviewer is, human or AI. Only in production do you discover the unknown unknowns. The majority of Intercom’s largest outages weren’t even caused by changes to product code at all. They were infrastructure issues, unanticipated customer usage patterns, or third-party outages. PR review, whether human or AI, was never going to catch those. That’s why, in parallel, we’re also working on an Agent that proactively diagnoses issues in production. We’ll share more on this soon.

    Speed has always been at the core of how we build at Intercom, not in spite of safety, but because of it. And we’re getting even faster with AI. It’s easy to assume that AI-approved PRs would lead to a drop in quality and safety but our data proves otherwise. Our heartbeat is just getting stronger. For product leaders, this is the blueprint: pair agentic AI with small batches, robust observability, and clear accountability, and you make shipping both faster and safer.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • From Brain Dump to Done: How Todoist’s Ramble Captures Tasks in Real Time with AI

    From Brain Dump to Done: How Todoist’s Ramble Captures Tasks in Real Time with AI

    Turning a rambling stream of consciousness into a clean task list while someone is still talking has been a longtime product dream of mine. With Ramble, Todoist brought that dream to life by using live audio AI to capture tasks in real time—no transcription step required. The result is a voice-to-task flow that feels natural, fast, and surprisingly disciplined.

    As I listened to the Doist team—Ernesto Garcia (Front-end Product Engineer), Thomas Jost (Backend Software Engineer), and Hugo Fauquenoi (Product Manager)—walk through their approach, I heard a blueprint for building pragmatic GenAI features. What began as a two-to-three month AI exploration became one of their most technically deliberate releases: a “Gemini-powered pipeline that makes tool calls while the user is still speaking, surfacing tasks on screen in real time without any text output from the model.”

    The breakthrough started with user research. People weren’t merely dictating tasks; they were doing a “brain dump” first—often into pen and paper or even ChatGPT voice—and only then committing items to Todoist. Meeting users where they already are reframed the problem: don’t force structure upfront; capture fluid thought and translate it into actionable tasks instantly.

    That insight led to a bold architectural choice: skip transcription entirely and process raw audio directly with a Gemini live audio model. By removing the brittle middleman of text, the team reduced latency and kept the model focused on one job—turning intent into structured actions. It’s a crisp example of AI workflows designed for reliability over novelty.

    The real magic is in the real-time “tool calls.” As the user speaks, the model triggers add task, edit task, and delete task operations immediately. For high-friction contexts like driving, they paired visual task cards with subtle sound effects as confirmation cues. It’s thoughtful conversation design that respects attention and safety without sacrificing speed.

    Teaching the model to capture tasks literally—without over-interpreting or trying to complete the work—required careful prompt engineering for voice and temperature tuning. Drawing a bright line between “capture versus do” kept the experience trustworthy. In my own AI Strategy work, I’ve found that establishing explicit agentic guardrails early prevents unintended autonomy later.

    Dates were the sleeper challenge. The team had to inject the current date, normalize to days vs. months, and always output dates in English for the natural language parser—while preserving the user’s original language for everything else. If you’ve ever shipped date handling across locales, you’ll appreciate how many edge cases hide in “Taming Dates and Time.”

    Quality didn’t hinge on intuition alone. They built an LLM-judge eval system using real employee recordings from 100+ people across 35 countries in 20+ languages to catch prompt regressions. That’s eval-driven development done right: representative data, repeatable scoring, and tight feedback loops as models and prompts evolve.

    For project and label matching, they chose direct context injection over RAG. Instead of building a retrieval pipeline, they injected the full project/label list into the system prompt. With smart context window management and a sharply constrained task schema, this was both simpler and more accurate. Sometimes the fastest path to product-market fit is removing moving parts, not adding them.

    One product principle stood out: easy correction beats perfect first-time accuracy. Natural language interfaces earn trust when users can fix misfires in a tap or two. That bias toward quick recovery over false precision is how you ship AI that feels useful from day one.

    Looking ahead, the roadmap is compelling: multimodal task capture from images and text blobs, Apple Watch support, and automation integrations. As voice AI agent patterns mature, this “tool-only architecture” sets a solid foundation for going from capture to coordinated execution—without losing the simplicity that makes Ramble shine.

    If you want to hear the full conversation, you can listen on Spotify or Apple Podcasts. It’s a masterclass in building focused GenAI features that trade cleverness for clarity—and still delight.

    Resources & Links: Todoist • Doist • Google Vertex AI (Gemini)


    Inspired by this post on Product Talk.


    Book a consult png image
  • Inside 27,000 AI Sessions: What Real Users Taught Me About Designing High-Trust Agents

    Inside 27,000 AI Sessions: What Real Users Taught Me About Designing High-Trust Agents

    Over the past quarter, I’ve been obsessed with a simple question: how do real people actually prompt AI agents when the stakes are high and the clock is ticking? We analyzed 27K sessions with Amplitude's Global Agent using our Agent Analytics tool. Here's what we found out about how real users are prompting our agent. That single line belies months of careful instrumenting, qualitative review, and product debates—and it forever changed how I design agent experiences.

    The clearest pattern I saw: users don’t craft “perfect” prompts—they co-create with the agent. Most sessions began with a broad intent, then tightened through rapid, iterative turns. The winning structure emerged as context, command, and constraints. When our agent acknowledged context first, clarified the command, and reflected constraints back, users responded with noticeably more confidence. It reinforced what great prompt engineering already teaches, but grounded in lived behavior across thousands of journeys.

    Trust was the next breakthrough. People wanted transparency on capabilities, a concise first answer, and an easy path to deeper detail and sources. They frequently asked the agent to show its work, summarize trade-offs, or restate assumptions in plain language. Instrumenting observability into the agent’s reasoning artifacts—without overwhelming the user—proved foundational for building credibility session by session.

    On task complexity, users fared best when the agent orchestrated a few small, verifiable steps rather than one heroic leap. Retrieval-first pipeline patterns consistently reduced confusion and rework, especially when paired with strong context window management. The more the agent proactively chunked the problem, validated intermediate outputs, and offered next-best actions, the smoother the journey—and the more reusable the prompts became.

    UX nudges mattered as much as model quality. Inline examples (“Try this”), one-click refinements (“Shorter,” “Add a table,” “Cite sources”), and lightweight guardrails kept momentum high without boxing users in. When the agent made uncertainty explicit and offered safe fallbacks, abandonment dropped and users explored more ambitiously. The experience felt less like “querying a model” and more like collaborating with a capable teammate.

    From a product management lens, these insights shape how I prioritize agentic AI. I’m doubling down on: scaffolded prompts that lead with context and constraints; transparent citations and assumptions; multi-step plans that the user can edit; and evaluation loops that A/B test prompt templates, tool strategies, and response formats. I’m also investing in analytics that connect session patterns to activation, speed-to-value, and retention so we can run eval-driven development, not opinion-driven debates.

    If you’re building agents into a core product workflow, start by designing for iterative co-creation, not one-shot brilliance. Offer progressive disclosure, keep the first answer tight, and make verification effortless. Shape the model with retrieval-first strategies, manage your context window like a scarce resource, and treat observability as a feature, not a debug tool. Most of all, let real usage guide your roadmap—these 27K sessions reminded me that the best agent UX is learned alongside our users, not imagined in isolation.


    Inspired by this post on Amplitude – Perspectives.


    Book a consult png image
  • Never Lose Your AI Superpowers: How I Sync Context and Skills Across Every Device

    Never Lose Your AI Superpowers: How I Sync Context and Skills Across Every Device

    I spend a meaningful portion of my week helping teams operationalize AI workflows, and one theme comes up over and over: how to share context files and skills seamlessly across devices and with colleagues. Hosting Claude Code office hours has only reinforced it—sharing context and skills is the single biggest blocker to reliable, repeatable outcomes.

    I hear from leaders driving AI adoption who have built robust, high-signal context systems and carefully crafted skills. Their challenge isn’t creating value—it’s distributing it. They need a way to make the same trusted workflows available to teammates and to keep everything in sync across laptops, desktops, and phones.

    I hit the same wall myself. I work across multiple devices (a Mac Mini for day-to-day, a MacBook Air on the road, and an iPhone) and I collaborate with a full-time admin. I wanted my context and skills to be consistent everywhere, for both of us. In this piece, I’ll share my setup—what I store where, how I share it across devices and with my team, the trade-offs of each option, and how I keep everything current. We’ll cover four different syncing services: git/GitHub, Obsidian Sync, Dropbox and iCloud.

    If you’re new to this series, this is the eighth installment. Earlier pieces provide foundational context: Claude Code: What It Is, How It's Different, and Why Non-Technical People Should Use It; Stop Repeating Yourself: Give Claude Code a Memory; How to Use Claude Code Safely: A Non-Technical Guide to Managing Risk; How to Choose Which Tasks to Automate with AI (+50 Real Examples); How to Build AI Workflows with Claude Code (Even If You're Not Technical); How to Use Claude Code: A Guide to Slash Commands, Agents, Skills, and Plug-ins; and Context Rot: Why AI Gets Worse the Longer You Chat (And How to Fix It).

    The day it really hit me was right before my interview with Claire Vo on How I AI. I was staying in an AirBnB with only my laptop, and I planned to demo my /today command along with my context file structure. Minutes before the session, I realized the latest version of my /today command wasn’t on that machine. I was able to remote into my Mac Mini and grab it—crisis averted—but it was a wake-up call. I needed a more reliable, shareable approach for syncing context and skills across devices and with my admin.

    I started by testing the tools I already used—Dropbox, iCloud, and GitHub—to see what might fit. Each got me partway there, but each also introduced friction that mattered in daily use.

    First, absolute file paths don’t travel well. I began with Dropbox but quickly ran into cross-linking headaches. Good context systems rely on rich interlinking—index files point to other context files, and those context files link to each other. When Claude creates a link from one context file to another, it tends to use the full file path: /Users/ttorres/Library/CloudStorage/Dropbox. That worked on my Mac Mini and MacBook (same user name), but not on my phone—and not for my admin. I tried to force relative links (~/Dropbox), but couldn’t get Claude to do it consistently, which led to broken links. This isn’t unique to Dropbox; Claude prefers full paths because they’re reliable on a single machine, but they’re brittle across devices and useless when sharing with colleagues. Claude is trained to use relative file paths when working within a git repository, but I struggled to get it to work reliably in Dropbox.

    Second, skills live in a user directory by default. By default, skills live in ~/.claude/skills. Most sync services aren’t designed to share your ~/ folder. iCloud is the exception, but then you’re limited to Apple devices—no Windows or Android. There is a workaround: set up a claude folder in Dropbox and create a symlink from ~/.claude to your synced claude folder, so all skills, commands, and settings live in Dropbox. Then, on each device (yours or a colleague’s), you set up a symlink to that folder so Claude can find the files. This works, but I was running into another limitation that made Dropbox a poor fit.

    Third, Obsidian on iOS doesn’t sync cleanly with Dropbox. I rely on Obsidian’s file browser alongside my notes to navigate context quickly. Storing vaults in Dropbox gave me parity across my Mac Mini and MacBook Air, but I couldn’t get the iOS Obsidian app to reliably load my Dropbox vaults. That friction was a dealbreaker for on-the-go work.

    At that point, I explored git/GitHub. GitHub is cloud storage for git repositories. A git repository is a folder of shared files used so engineers can collaborate on the same code base. Each person clones a local copy, works locally, then pushes changes back to the hosted repo on GitHub; others pull to update. Git’s merge and conflict tooling is excellent. Git is the powerhouse of file syncing and version control. It easily handles syncing context and skills, Claude behaves better with relative links in a git repo, and I can open the repo in my IDE with a clean file browser. For me, that checked all the boxes—until I factored in my admin. Git has a learning curve, requires manual pull/push hygiene, and often assumes an IDE workflow. That overhead was too heavy for a non-technical collaborator.

    The turning point was Obsidian Sync. A colleague suggested it, and it ended up being the sweet spot. Obsidian is a markdown reader; files are stored locally in a normal folder you can open in Finder or File Explorer. There’s no proprietary format—you can read files with any text editor, and Claude can access them via bash commands. Obsidian Sync is simpler than git: open a note and it syncs in the background. I can access the same vaults across my Mac Mini, MacBook Air, and iPhone, and I can share a vault with my admin so we can both create and access notes.

    Because we’re in different time zones and rarely edit the same note simultaneously, limited conflict handling hasn’t been an issue. Obsidian’s internal link notation also means one note can link to another and those links just work across devices. Claude can follow these links, so the brittle file path problem disappears.

    Here’s where I landed. After a lot of trial and error, I have a setup that works across my devices and for my admin, who uses both a Windows desktop and a Mac laptop. I keep my core context in Obsidian vaults synced with Obsidian Sync, which preserves portability, link integrity, and ease of use. For skills, I avoid scattering files in machine-specific locations and instead centralize what Claude needs to reference in shared, human-readable folders. If you require advanced version control with branching and reviews, git/GitHub is excellent. If your priority is low-friction, cross-device access for non-technical teammates, Obsidian Sync is a practical, reliable choice. And if you must use Dropbox or iCloud, consider symlinks and be vigilant about relative paths—just know that absolute paths won’t travel well.


    Inspired by this post on Product Talk.


    Book a consult png image
  • Cracking the Hardest Percentages: Turn Complex Support into Scalable, Trust-Building Automation

    Cracking the Hardest Percentages: Turn Complex Support into Scalable, Trust-Building Automation

    I’ve learned that the smallest slice of your support queue often dictates the majority of your operating cost, customer memory, and automation ceiling. In product reviews and CX ops deep-dives, I see the same pattern: the “easy” tickets pad your resolution counts, but the complex, multi-step queries quietly own your handle time and your brand trust. If you care about compounding impact, your customer support AI strategy has to target that hardest percentage first.

    Complex queries are a small percentage of your queue, but they consume a disproportionate share of your team’s time.

    Take a typical queue: password resets outnumber refund disputes ten to one, but a reset takes five minutes and a dispute takes thirty. The “rare” query accounts for over a third of total handling time. The same pattern holds for account investigations, subscription changes, and billing disputes.

    How you handle complex queries is also what customers actually remember about their support experience. When someone is dealing with a damaged order or a billing dispute, the stakes are higher, and a fast, good resolution is what separates a forgettable interaction from one that builds lasting trust.

    Most AI Agents automate the easy, informational queries well. The question for your automation rate is whether they can handle the hard ones. That’s where agentic AI and robust AI workflows make or break your outcomes.

    We’ve gotten really good at informational queries – the hard part is what comes next. I’ve seen teams invest deeply here, and for good reason: it lifts containment quickly and cheaply. But to break through the plateau, you have to execute actions across systems, not just answer with text.

    We’ve invested deeply in informational Q&A. We built Apex, a specialized customer service model trained on billions of support interactions, as Fin’s core answering engine. Beneath that sits a custom retrieval model, a purpose-built reranker, and a unified RAG pipeline, all trained specifically for customer service. Fin resolves issues at a higher rate than general-purpose frontier models, with fewer hallucinations and at lower cost.

    But informational Q&A only covers queries where text is the answer. Most Agents can handle that. Far fewer let you configure complex, multi-step actions without a forward-deployed engineer setting it up for you, which creates a gap.

    Every query your team handles falls into one of three categories:

    Informational: “Can you ship transatlantic by priority next day?” Answered with text from your knowledge base.

    Personalized: “Where is my order?” Requires data unique to that user.

    Action-led: “My order arrived damaged, I need a refund.” Requires doing something: checking a return window, cross-referencing transaction data, making a judgment call – reading from multiple systems and acting across them.

    Dark-themed line chart of percentages from Jan 2026 to Apr 2026. An orange line with circular markers climbs steadily, pauses briefly mid‑period, then spikes sharply to a new high near the end of the timeline.
    From Jan to Apr 2026, the trend moves steadily upward, pausing briefly before a sharp late surge. A clear snapshot of momentum for customer service KPIs, finance results, and the impact of new procedures.

    These complex queries, the ones that require multi-step processes across systems, aren’t edge cases; they’re the reason your support team exists. This is the gap Fin Procedures was built to close.

    It works in practice, and the trajectory matters for product strategy and ops planning.

    Procedures is live, it’s scaling, and the results are clear. Since launching in managed availability, Procedures has handled over 1.5 million conversations, and volume is doubling month over month across hundreds of apps in fintech, e-commerce, gaming, healthcare, and SaaS.

    When customers hit complex, multi-step queries, the experience is dramatically better when Fin can do the work end-to-end. We tested this with a randomized 5% holdout – conversations where Procedures would normally run, but didn’t. CSAT was 28.93% higher when Procedures ran, a statistically significant result.

    A product, not a services engagement. I’ve sat through too many “automation” projects that were really solutions engineering gigs: workshops, custom scripts, then a queue of change requests when policies shift. It’s fragile and slow.

    The B2B AI industry has a consultingware problem. It’s not databases being forked anymore, it’s prompts. The economics of maintaining bespoke setups per customer don’t work. Either the application falls behind new models, or the vendor changes the model and quality degrades invisibly.

    In my view, an agentic AI platform should be a product your team owns end to end: a natural language editor – literally paste your existing SOPs – branching logic, data connectors, and AI-powered simulations for testing. Your CX ops team configures this, iterates on it, owns it. If you need help, a forward-deployed team can assist, but they’re optional, not a dependency. You always have control.

    And because it’s a unified product, improvement compounds. When the vendor optimizes a prompt, every customer’s Procedures get better. When they upgrade the model, they can A/B test across the entire customer base and know it’s better before rolling out. You can’t do that when every customer has a bespoke prompt. The consulting model isn’t just expensive, it’s structurally unable to compound.

    Today, Fin Procedures is available to every Intercom customer – no waitlist or managed rollout, ready for all 8,000+ customers.

    We’re iterating fast based on real customer feedback. Here’s what’s landed since the last major update, and why it matters for reliability and governance:

    AI-powered Procedure review: Flags broken logic, missing references, and unreachable conditions before you deploy.

    Promotional banner reading "Get started with the #1 Agent today" over a dark, aurora-like gradient background, featuring a white button labeled "Start a free trial"; marketing graphic for an AI support agent.
    Kick off your journey with the #1 Agent—an AI partner designed to turn resolutions into real outcomes. Tap “Start a free trial” to explore faster, smarter customer service and see how Fin delivers value from day one.

    Procedure failure reporting: A new reporting dimension that lets you drill into conversations where Procedures failed, so you can diagnose and fix.

    Version history with rollback: Track every change, compare versions, roll back if needed.

    Data connector health monitoring: See at a glance if your integrations are healthy, degraded, or failing.

    Optional data connector parameters: Fin only asks customers for information when it’s actually needed, instead of prompting for every field.

    Email Simulation support: Test how your Procedures behave across chat and email before going live.

    Agent in the Loop (Beta) unlocks the next tranche of automation. Even with Procedures, two things hold teams back from automating their most complex queries: missing integrations and policies that require a human sign-off on sensitive decisions.

    “Agent in the Loop” is built for both. Need Fin to check your internal admin tools but haven’t built a data connector yet? Put a human checkpoint at that step. Fin handles the conversation, gathers context, and pauses, surfacing a structured summary for a human agent to verify or act, then resumes. You get automation on the 80% that doesn’t need the integration.

    For compliance – identity verification, high-value refunds – Fin does the legwork, a human makes the final call and then hands it back to Fin. This works natively in the Intercom Inbox and via Slack. Some competitors don’t have an inbox-native variant at all, meaning humans need to leave their primary workspace to review AI actions.

    Procedures are also built to let you collaborate with all your teammates – both human agents and AI Agents. Fin can work with them directly inside a Procedure, using APIs and webhooks to loop in another teammate mid-flow, hand off context, and pick back up once they’re done.

    Making it easier, faster. Procedures is already self-serve, but the next step is making Procedure creation, testing, and maintenance significantly more streamlined and easy to do, with less manual editing and more AI-assisted building and debugging. There’s a lot coming in this space over the next few months – and it aligns perfectly with a retrieval-first pipeline and stronger governance at scale.

    The hardest percentages matter the most. The biggest unlock for your automation rate won’t be answering more FAQs, it will be handling the complex, multi-step queries that consume your team’s time and define what customers remember about their experience with you.

    That means working with an Agent that goes beyond answering questions and executes processes. A product your team owns and configures, not a service you buy and hope gets maintained. And a platform where every improvement compounds across every customer. That’s Procedures. Available now, for everyone.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • From High-Touch Swarms to Scalable Product: Turning Customer Signals into High-Impact Features

    From High-Touch Swarms to Scalable Product: Turning Customer Signals into High-Impact Features

    The best signal often comes from the least scalable work.

    I’ve learned this the hard way—and the rewarding way. When I’m closest to customers, rolling up my sleeves with the team, I uncover nuanced, high-signal insights that no dashboard or aggregate report can reveal. Those insights, when treated with rigor and discipline, become the backbone of a durable product strategy and true product management leadership.

    At Intercom, that is at the heart of how we operate on “swarms.” Swarms are cross-functional teams of Fin experts focused on ensuring customers succeed when trialing Fin. Each team consists of engineers, data scientists, and a product manager, all focused on optimizing Fin for our customers.

    Working in these teams gives us deep insights into the needs of individual customers, but they can also form the foundation of new Fin features. Let me explain.

    I frame the journey from insight to impact in three levels: “Level 1: Swarms – where the signal comes from,” “Level 2: Cockpit – where the signal starts to scale,” and “Level 3: Product – where the signal reaches maximum leverage.” This model blends continuous discovery with pragmatic solutions engineering and creates a clear path from hands-on learning to product-led growth.

    Level 1: Swarms – where the signal comes from. The goal is simple: help Fin resolve more conversations and help customers understand and use the product. Swarms partner with customers to define their goals and how Fin fits into their workflows. We map out an automation roadmap by analyzing their conversations, determining the APIs and Procedures they need, and the level of automation they can achieve. We then support them in implementing it and reaching that outcome. This involves ongoing analysis to identify optimizations to their configuration and the next best actions for increasing automation levels, such as improving knowledge base content or deploying new APIs.

    During a swarm, the feedback loop is fast. We test something, ship something, and quickly see whether the metric moves. That speed and depth is what makes swarms so valuable. It’s also what makes them hard to scale. I’ve felt the thrill of watching a key metric bend within hours—and the constraint of knowing that kind of attention doesn’t scale to every account.

    For example, we developed an automation taxonomy to predict the level of automation a customer can achieve. Initially, this analysis was manual and took more than half a day to run, with time required to prep and visualize the data. But the effort was worthwhile. For one customer, we predicted an automation rate of 70% and they achieved exactly that.

    By working closely with customers, we learn what drives success, but this work is inherently hands-on and doesn’t scale on its own. So the real challenge is figuring out how to turn what we learn in those high-touch engagements into systems, tools, and product changes that benefit far more customers. That’s the inflection point where AI workflows and product strategy meet.

    Level 2: Cockpit – where the signal starts to scale. Not every customer should need swarm-level attention. The way we bridge that gap is by making the swarm analyses repeatable and shareable. Once we can run the same analysis across customers, we can start turning bespoke swarm learnings into reusable signals. This is where Cockpit comes in.

    Analytics dashboard showing taxonomy breakdown of customer support conversations: raw volume trend, 100% stacked percentage split, and topic-level bars for account settings, billing, integration, and more.
    Transform customer signals into action: this dashboard tracks support conversation volume, taxonomy percentages by type, and topic demand across account settings, billing, integration, and more to guide scalable feature bets.

    We take patterns learned in swarms and encode them into internal tooling inside our insights web app, Cockpit. Instead of analysis being a bespoke project, it becomes a workflow. For example, we scaled the automation taxonomy and this has enabled us to quickly understand automation potential for all customers.

    Now, a customer success manager (CSM) can pick a customer, see their automation potential and current performance, understand the biggest issues, and propose next actions. This is how we scale the impact of swarm learnings through CSMs and Sales. It allows far more customers to benefit from the same patterns we see in high-touch work, without requiring direct data science involvement every time.

    Cockpit also functions as a valuable proving ground. It gives us a way to test ideas across a much broader set of customers and see what generalizes before we consider taking anything further. In other words, we transform sharp, local signal into broadly useful guidance—an essential step in any AI Strategy that aims to balance precision with scale.

    Level 3: Product – where the signal reaches maximum leverage. The real payoff comes when the patterns we have validated internally become part of the product itself. Instead of helping one customer directly, or helping many customers through internal teams, we deliver a feature directly to customers so they can improve Fin’s performance on their own. Today, the automation taxonomy is a part of Insights and accessible to customers who have this feature.

    Another example is CX Score. It started with close work alongside Intercom’s Customer Support team to understand performance with Fin, initially through predicted CSAT and resolution. Over time, this work evolved into CX Score: a scalable way to measure conversation quality across all customers.

    The product stage is fundamentally different from Cockpit because of the constraints. Cockpit provides a platform for our customer analyses/tools but it doesn’t need to scale as far as product. What moves into product has to work for every customer, without configuration, at scale, so it has to generalize. That bar is what protects long-term quality while unlocking product-led growth.

    That’s why the move from Cockpit to product isn’t automatic. We’re not just asking whether something is useful, but whether it’s broadly useful, robust, and scalable enough to run across the entire customer base. As a product leader, I push for this discipline because it’s where customer success, engineering excellence, and business outcomes converge.

    The loop. The model is simple. Swarms generate the best signal, grounded in real customer problems. Cockpit operationalizes that signal so CSMs and Sales can use it across many customers. Product takes the patterns that truly generalize and turn them into scalable features that enhance every customer’s experience.

    This loop allows a small swarm data science function to have impact beyond a small set of high-touch accounts, resulting in a stream of continuous improvements across all three levels and an ever-increasing level of automation for our customers. Practically, it’s a repeatable playbook for product management leadership: start with high-signal discovery, prove repeatability, and only then scale through product. Done well, it compounds learning, accelerates time-to-value, and aligns the entire organization around measurable outcomes.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • Stop Drowning in Tasks: How AI Marketing Agents Restore Focus and Maximize Impact

    Stop Drowning in Tasks: How AI Marketing Agents Restore Focus and Maximize Impact

    Every week I meet marketers who are working harder than ever—more campaigns, more content, more dashboards—yet seeing less movement on metrics that matter. The surge of AI tooling has amplified activity, not necessarily impact. That’s the focus problem: we confuse motion with momentum, and our backlogs look great while our outcomes stall.

    Learn how AI agents for marketing can help you prioritize impact so you can do important work, instead of just more work.

    In my role leading product and growth teams, I’ve learned that AI only compounds value when it is pointed squarely at outcomes. If we don’t define what “good” looks like, agentic AI will simply scale busywork. The antidote is a disciplined operating model that connects strategy to execution and instruments agents with clear success criteria.

    First, anchor your program with outcomes vs output OKRs. Choose one or two measurable business outcomes—such as qualified pipeline, conversion rate, or activation—and make everything else subordinate. This provides the compass agents need to make effective trade-offs when speed and volume tempt you to do “one more thing.”

    Second, map a driver tree from the target outcome down to the controllable levers: audience segments, offers, channels, messaging, and experience friction. This traceability shows where agents can move the needle fastest—whether that’s accelerating research, sharpening positioning, or eliminating handoffs that slow experimentation.

    Third, design a small, agentic AI workforce aligned to those levers. For example: a Research Agent that synthesizes market insights and past performance; a Copy Agent that generates on-brief, on-brand variants; a Distribution Agent that adapts content to each channel and schedules posts; and an Analytics Agent that runs A/B tests, summarizes results, and flags anomalies. Keep human oversight where judgment matters most—strategy, brand voice, and high-stakes decisions.

    Fourth, instrument rigor from day one with Agent Analytics and eval-driven development. Define offline evals for brand consistency, factuality, safety, and response time; pair them with online experiments that quantify lift on your target outcomes. Set a minimum detectable effect (MDE) so you stop shipping changes that cannot plausibly move the metric.

    Fifth, operationalize your AI workflows. Standardize prompts, inputs, and handoffs; templatize briefs and acceptance criteria; and keep a change log so improvements compound rather than reset. Use short, frequent feedback loops to prune low-impact work and double down on what demonstrably advances your objectives.

    I’ve seen teams reclaim focus and momentum when they treat agents as teammates, not toys. The magic isn’t in producing more assets—it’s in consistently choosing the next best action in service of a clear outcome. When you combine outcome clarity, a driver tree, targeted agents, and tight evals, AI becomes a force multiplier for marketing impact.

    If you’re feeling overwhelmed by AI’s possibilities, start small: commit to one outcome, one driver you believe is material, and one agent designed for that job. Prove lift, codify the workflow, then scale. Velocity is only valuable when it’s pointed in the right direction.


    Inspired by this post on Amplitude – Best Practices.


    Book a consult png image
  • Inside the Most Politically Dangerous C‑Suite Role: Hard Truths on Culture, Layoffs, and Leadership

    Inside the Most Politically Dangerous C‑Suite Role: Hard Truths on Culture, Layoffs, and Leadership

    I’ve long believed the people function is a strategic engine, not a support lane. That conviction was only reinforced in a recent deep dive with Katie Burke, now COO at Harvey after joining as Chief People Officer. Before Harvey, she spent 11 years in HR leadership at HubSpot, helping build one of tech’s most distinctive cultures. In this piece, I unpack what resonated most for me as a product leader: a marketing-minded approach to HR, deliberate hiring from hospitality, and the non-negotiable case for culture as a core business strategy.

    The first principle is simple and often overlooked: HR leaders should think like marketers. Employer brand is a product; your candidate and employee journeys are funnels; and your programs deserve the same rigor we bring to product—segmentation, positioning, channels, and continuous A/B testing. When we treat onboarding, performance, and manager enablement like iterative product launches—complete with activation metrics, retention curves, and NPS—we stop guessing and start compounding results.

    One line has become a north star for how I approach executive leadership: “Don’t ask for a seat at the table. Build the table.” In practice, that means codifying the operating system—decision rights, principles, cadences, and accountability—so the organization isn’t improvising strategy in every meeting. Product, People, and Finance should co-own this OS; that’s how you scale clarity faster than headcount.

    Transparency is the tax we pay for alignment, and it compounds trust. After an IPO, the impulse can be to close ranks. The better move is radical transparency with context: what changed, why it matters, and how decisions get made now. On my teams, that looks like publishing decision records, sharing tradeoffs explicitly, and using written docs to reduce rumor velocity—core muscles in stakeholder management as complexity grows.

    I also loved the counterintuitive hiring bet: prioritize hospitality backgrounds alongside traditional corporate pedigrees. People who’ve thrived in service environments bring customer empathy, operational resilience, and a bias for proactive care—traits that elevate everything from onboarding to incident response. In product terms, they’re culturally accretive hires with high signal on service quality and consistency.

    The trickiest part of the Chief People Officer role isn’t process—it’s politics. You are the executive team’s own HR business partner, which requires coaching, candor, and conflict mediation at the highest stakes. The goal is to “Be the Michael Jordan of your exec team”—the teammate who elevates standards, makes others better, and chooses the hard right over the easy familiar.

    Layoffs create a culture debt that accrues interest. Expect a “2.5-year cultural hangover after a layoff”—in many companies, an inevitable two-year layoff hangover—unless you actively repay it. That repayment plan includes narrating the why with specificity, rebuilding trust through manager enablement, and re-anchoring on performance and values. Measure leading indicators (manager effectiveness, time-to-decision, psychological safety) alongside lagging ones (regretted attrition) to track the true recovery arc.

    People leaders also need to create “graceful exits.” Doing this well preserves dignity for the person, protects the team’s morale, and safeguards the company’s brand. The bar is straightforward: clear rationale, fair process, useful feedback, generous support, and alumni pathways. A graceful exit signals that even when business realities bite, respect is non-negotiable.

    Expectation-setting matters. Two truths cut through the noise: “The workplace shouldn’t be Disneyland” and “Our job is not to make you happy every day.” The promise is not perpetual happiness; it’s meaningful work, fair standards, growth opportunities, and leaders who tell the truth. When we set that contract clearly, engagement becomes an outcome of purpose and progress—not perks.

    On feedback, I use the protein vs. sugar rule for employee feedback. Sugar feedback is pleasant and perishable; protein feedback is specific, sometimes uncomfortable, and growth-driving. Great cultures build a taste for protein—clear role expectations, crisp examples, and written follow-ups. Mechanically, that looks like structured 1:1s, decision retros, skip-levels, and manager training that demystifies “what good looks like.”

    Being a Chief People Officer isn’t for the faint of heart. The role must be demanding by design—on executive hiring quality, performance management courage, and values enforcement. Moments like “Berry-Gate” are reminders that small symbolic issues can balloon when feedback loops are unclear. Close the loop fast, publish the rationale, and ensure there’s a predictable path for concerns to be heard and resolved.

    When hiring, beware patterns that predict friction. That’s why “frequent flyers” are a new-hire red flag. Movement can signal adaptability—but weather-vein pivots and blame-shifting often repeat. Probe for ownership, learning moments, and sustained impact; you want people who compound value, not just sample it.

    Clarity on scope prevents leadership whiplash. Which company decisions fall to the Chief People Officer? Think leveling frameworks, compensation philosophy and bands, performance calibration, manager standards, ER policies, and org design guardrails—always in lockstep with Finance and the CEO. Escalate when there are values collisions or systemic risks; otherwise, push decisions to the right altitude and owner.

    Scaling exposes the same few failure modes on repeat: fuzzy decision rights, a thin manager bench, brittle processes that don’t flex, and inconsistent leveling that erodes trust. The antidote is an operating model that pairs clear principles with lightweight mechanisms—documented roles, regular calibration, and reviews that audit for both outcomes and operating behaviors.

    Comparing a scaled SaaS like HubSpot with an AI-native company like Harvey surfaces important differences. The former optimizes for durable systems, predictable cadences, and governance; the latter optimizes for rapid learning loops, emergent org design, and a higher tolerance for ambiguity. The art is porting the right controls at the right time without crushing velocity.

    AI is already changing the people function. GenAI can draft job descriptions, summarize performance notes, classify themes from engagement surveys, and power AI workflows that resolve common HR tickets. The human-in-the-loop remains essential for judgment, context, and ethics—especially around data governance and privacy-by-design. A pragmatic AI Strategy here frees HRBPs for higher-order coaching and organizational development work.

    One practice I recommend widely: share your own performance reviews. Modeling openness normalizes growth and turns feedback into a shared craft, not a secret ritual. It also builds trust when you later ask the organization to lean into sharper, protein-rich feedback.

    Finally, disagreements with the CEO are inevitable—and healthy. Handle them with pre-briefs, crisp written proposals, explicit tradeoffs, and a shared decision record. Argue like scientists, not politicians; once a call is made, disagree and commit. That combination of candor and alignment is what keeps executive teams high-trust and high-velocity.

    The people leader’s chair may be the most politically dangerous role in the C-suite—but it’s also one of the most leveraged. Build the table, tell the truth, design for standards and dignity, and treat culture like the product that powers everything else.


    Book a consult png image
  • Beat AI FOMO: A Product Leader’s Playbook to Choose Tools, Stay Focused, and Learn Deeply

    Beat AI FOMO: A Product Leader’s Playbook to Choose Tools, Stay Focused, and Learn Deeply

    Lately, it feels like every morning brings a new AI launch, a dazzling demo, or a must-try tool. I love the pace of innovation, but the constant stream can trigger counterproductive FOMO if I’m not intentional. As a product leader, I’ve learned to turn that anxiety into a disciplined learning system—one that keeps me curious without letting novelty hijack my focus.

    That’s exactly why this conversation with Petra Wille and Teresa Torres resonated with me. They explore how to stay experimental in the AI era without chasing every shiny object. Their perspective aligns closely with my own operating cadence: start with real problems, go deep on a small set of tools, and create explicit boundaries between work, learning, and play.

    Listen to this episode on: Spotify | Apple Podcasts

    Here’s the mindset I apply. I don’t start with tools—I start with problems. When I encounter concrete friction in a workflow or see a credible opportunity to improve an outcome, that’s my trigger to explore a new capability. This mirrors the continuous discovery habit of prioritizing opportunities over solutions, and it’s how I avoid performing “innovation theater.”

    To keep exploration healthy, I time-box my learning. I block recurring windows specifically for experiments, reading, and hands-on trials so they don’t overrun my core product work. During these blocks, I’ll set a clear question, run a tight test, and capture what I learned. No rabbit holes, no endless tinkering.

    I also separate “interesting” from “actionable.” Plenty of inputs are worth awareness, but very few deserve immediate action. I bookmark the rest for later. This simple filter reduces cognitive load and keeps my backlog—from ideas to proofs of concept—well-governed.

    Social media can amplify technology hype cycles, so I establish boundaries. I batch consumption, mute low-signal channels, and prioritize practitioner communities over performative threads. The goal isn’t to be first; it’s to be right for my customers, my team, and our strategy.

    When choosing what to try next, I use a practical rubric. Does the tool target a real friction I’ve seen in discovery or delivery? Can it plug cleanly into our AI workflows without unsustainable glue work? Do we have a safe, compliant way to test it? Is there a plausible path from trial to compounding value? If the answer isn’t a confident yes to most of these, I wait.

    Depth beats breadth. I’d rather take one promising tool into a real use case, instrument it, and measure outcomes than skim ten trending demos. That tighter loop produces sharper intuition, clearer product bets, and better partner decisions. A quick opportunity solution tree helps me connect user pain to outcomes before I let any solution onto the field.

    In the episode, Petra Wille and Teresa Torres talk candidly about managing FOMO, deciding which tools to explore, and designing intentional learning systems. They discuss why starting with a problem is more valuable than starting with a tool, how social media amplifies technology FOMO, and why going deeper with fewer tools can lead to better learning. If you’ve ever felt like you’re falling behind because you haven’t tried the latest AI tool yet, this conversation will help you rethink how you approach learning and experimentation.

    If you’re curious about what came up, here are some of the tools and communities mentioned: Claude Code, OpenClaw (formerly Clawdbot, Moltbot), NotebookLM, Product Talk, ElevenLabs, Lenny’s Newsletter Community, and even a nod to Bridgerton for a touch of levity.

    My takeaway is simple but powerful: curiosity doesn’t require constant experimentation. The best product managers cultivate a balanced system—grounded in product discovery, energized by focused experiments, and protected by clear boundaries—so we can learn faster while staying pointed at outcomes that matter.

    Discussion Question: How do you decide which new tools or technologies are worth exploring—and which ones you can safely ignore?

    Resources & Links: Follow Teresa Torres: https://ProductTalk.org | Follow Petra Wille: https://Petra-Wille.com

    Full transcripts are only available for paid subscribers.

    Have thoughts on this episode? Leave a comment below.


    Inspired by this post on Product Talk.


    Book a consult png image