Category: Product Management Leadership

  • From High-Touch Swarms to Scalable Product: Turning Customer Signals into High-Impact Features

    The best signal often comes from the least scalable work.

    I’ve learned this the hard way—and the rewarding way. When I’m closest to customers, rolling up my sleeves with the team, I uncover nuanced, high-signal insights that no dashboard or aggregate report can reveal. Those insights, when treated with rigor and discipline, become the backbone of a durable product strategy and true product management leadership.

    At Intercom, that is at the heart of how we operate on “swarms.” Swarms are cross-functional teams of Fin experts focused on ensuring customers succeed when trialing Fin. Each team consists of engineers, data scientists, and a product manager, all focused on optimizing Fin for our customers.

    Working in these teams gives us deep insight into the needs of individual customers, and those insights can also form the foundation of new Fin features. Let me explain.

    I frame the journey from insight to impact in three levels: “Level 1: Swarms – where the signal comes from,” “Level 2: Cockpit – where the signal starts to scale,” and “Level 3: Product – where the signal reaches maximum leverage.” This model blends continuous discovery with pragmatic solutions engineering and creates a clear path from hands-on learning to product-led growth.

    Level 1: Swarms – where the signal comes from. The goal is simple: help Fin resolve more conversations and help customers understand and use the product. Swarms partner with customers to define their goals and how Fin fits into their workflows. We map out an automation roadmap by analyzing their conversations, determining the APIs and Procedures they need, and the level of automation they can achieve. We then support them in implementing it and reaching that outcome. This involves ongoing analysis to identify optimizations to their configuration and the next best actions for increasing automation levels, such as improving knowledge base content or deploying new APIs.

    During a swarm, the feedback loop is fast. We test something, ship something, and quickly see whether the metric moves. That speed and depth is what makes swarms so valuable. It’s also what makes them hard to scale. I’ve felt the thrill of watching a key metric bend within hours—and the constraint of knowing that kind of attention doesn’t scale to every account.

    For example, we developed an automation taxonomy to predict the level of automation a customer can achieve. Initially, this analysis was manual and took more than half a day to run, with time required to prep and visualize the data. But the effort was worthwhile. For one customer, we predicted an automation rate of 70% and they achieved exactly that.

    By working closely with customers, we learn what drives success, but this work is inherently hands-on and doesn’t scale on its own. So the real challenge is figuring out how to turn what we learn in those high-touch engagements into systems, tools, and product changes that benefit far more customers. That’s the inflection point where AI workflows and product strategy meet.

    Level 2: Cockpit – where the signal starts to scale. Not every customer should need swarm-level attention. The way we bridge that gap is by making the swarm analyses repeatable and shareable. Once we can run the same analysis across customers, we can start turning bespoke swarm learnings into reusable signals. This is where Cockpit comes in.

    [Figure: analytics dashboard showing a taxonomy breakdown of customer support conversations: raw volume trend, 100% stacked percentage split, and topic-level bars for account settings, billing, integration, and more.]

    We take patterns learned in swarms and encode them into internal tooling inside our insights web app, Cockpit. Instead of analysis being a bespoke project, it becomes a workflow. For example, we scaled the automation taxonomy and this has enabled us to quickly understand automation potential for all customers.

    Now, a customer success manager (CSM) can pick a customer, see their automation potential and current performance, understand the biggest issues, and propose next actions. This is how we scale the impact of swarm learnings through CSMs and Sales. It allows far more customers to benefit from the same patterns we see in high-touch work, without requiring direct data science involvement every time.

    Cockpit also functions as a valuable proving ground. It gives us a way to test ideas across a much broader set of customers and see what generalizes before we consider taking anything further. In other words, we transform sharp, local signal into broadly useful guidance—an essential step in any AI Strategy that aims to balance precision with scale.

    Level 3: Product – where the signal reaches maximum leverage. The real payoff comes when the patterns we have validated internally become part of the product itself. Instead of helping one customer directly, or helping many customers through internal teams, we deliver a feature directly to customers so they can improve Fin’s performance on their own. Today, the automation taxonomy is a part of Insights and accessible to customers who have this feature.

    Another example is CX Score. It started with close work alongside Intercom’s Customer Support team to understand performance with Fin, initially through predicted CSAT and resolution. Over time, this work evolved into CX Score: a scalable way to measure conversation quality across all customers.

    The product stage is fundamentally different from Cockpit because of the constraints. Cockpit is a platform for our customer analyses and tools, but it doesn’t need to scale as far as the product does. What moves into product has to work for every customer, without configuration, at scale, so it has to generalize. That bar is what protects long-term quality while unlocking product-led growth.

    That’s why the move from Cockpit to product isn’t automatic. We’re not just asking whether something is useful, but whether it’s broadly useful, robust, and scalable enough to run across the entire customer base. As a product leader, I push for this discipline because it’s where customer success, engineering excellence, and business outcomes converge.

    The loop. The model is simple. Swarms generate the best signal, grounded in real customer problems. Cockpit operationalizes that signal so CSMs and Sales can use it across many customers. Product takes the patterns that truly generalize and turns them into scalable features that enhance every customer’s experience.

    This loop allows a small swarm data science function to have impact beyond a small set of high-touch accounts, resulting in a stream of continuous improvements across all three levels and an ever-increasing level of automation for our customers. Practically, it’s a repeatable playbook for product management leadership: start with high-signal discovery, prove repeatability, and only then scale through product. Done well, it compounds learning, accelerates time-to-value, and aligns the entire organization around measurable outcomes.


    Inspired by this post on The Intercom Blog.


  • Net Recurring Revenue Mastery: How Elite CS Teams Drive Expansion, Retention, and Growth

    Net Recurring Revenue (NRR), more commonly known as Net Revenue Retention, is the clearest signal of whether our product, pricing, and customer success motions are compounding value or quietly leaking it. When I review our dashboard, NRR tells me, in one number, how well we retain, expand, and engage customers. It’s the difference between linear progress and durable, compounding growth.

    At its core, NRR answers a simple question: did revenue from our existing customers grow or shrink this period? The standard way I frame it is: NRR = (Starting MRR + Expansion – Contraction – Churn) / Starting MRR. Expansion reflects upsells, cross-sells, and increased usage; contraction and churn capture downgrades and departures. Great teams don’t just watch this number—they engineer it.
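
    The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reporting implementation: the function name and the example figures are assumptions made up for the arithmetic, and real MRR data would come from your billing system.

```python
def net_revenue_retention(starting_mrr: float, expansion: float,
                          contraction: float, churn: float) -> float:
    """Return NRR as a ratio: 1.0 means existing-customer revenue was flat."""
    if starting_mrr <= 0:
        raise ValueError("starting_mrr must be positive")
    return (starting_mrr + expansion - contraction - churn) / starting_mrr


# Hypothetical figures for one period, chosen only to illustrate the arithmetic.
nrr = net_revenue_retention(
    starting_mrr=100_000, expansion=15_000, contraction=5_000, churn=3_000
)
print(f"NRR: {nrr:.0%}")  # prints "NRR: 107%"
```

    A result above 100% means expansion outpaced contraction and churn among existing customers; below 100% means the base shrank before any new sales.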

    The teams that consistently outperform treat NRR as an outcome of intentional design across the entire customer journey. They align product-led growth with customer success, weaving onboarding, user activation, in-app guides, and lifecycle messaging into one coherent system. They make adoption the star of the show, not an afterthought tucked beneath quarterly targets.

    To scale that system efficiently, I lean on platforms that streamline in-app guidance and rich behavioral analytics. The promise is crisp and concrete: “Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.” When the experience is instrumented end to end, expansion opportunities show up as patterns, not surprises.

    Retention analysis is where the signal gets sharp. I segment cohorts by plan, size, and use case; map their journey; and run driver trees that connect leading indicators (activation depth, feature breadth, time-to-value) to the lagging outcome (NRR). This turns hunches into hypotheses and gives customer success managers a prioritized playbook, not a long wish list.

    Onboarding is the first and most powerful NRR lever. The faster a customer experiences their first win, the more likely they are to adopt core features, invite teammates, and expand. I use in-app guides, product tours, and contextual tooltips to pave the path to value—always grounded in clear jobs-to-be-done, not generic walkthroughs. The goal is simple: remove friction, celebrate progress, and make the next best action obvious.

    Operating cadence matters as much as tooling. I separate the rhythms: QBRs for strategic alignment and expansion planning; OKRs for cross-functional execution and accountability. QBRs anchor the conversation in outcomes and value realized; OKRs ensure product, marketing, and CS move in lockstep to close the gaps those QBRs reveal.

    Pricing and packaging complete the loop. When the value proposition is clear and plans are aligned to outcomes customers care about, expansion feels natural—more capability for more value. Usage insights guide which features to gate, which to bundle, and where to price to maximize retention while unlocking healthy upsell paths.

    None of this works without tight product–CS collaboration. My teams practice continuous discovery—customer interviews, win/loss insights, and in-product feedback—so we improve the experience where it truly matters. Journey mapping turns those insights into experiments, and experiments turn into polished features once the data speaks.

    I build an NRR driver tree into our weekly reviews. Each branch (activation, adoption, multi-seat expansion, downgrade prevention, reactivation) has a clear owner, a measurable hypothesis, and a time-bound experiment. A/B testing guides what we ship broadly, and we define success upfront to avoid moving goalposts after the fact.
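
    One way to keep that weekly review honest is to hold each branch as a small record and flag the ones that are missing an owner or whose experiment has run past its end date. The sketch below is only an illustration of the idea: the `Branch` fields, the `needs_attention` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Branch:
    """One branch of the driver tree; all field names are illustrative."""
    name: str
    owner: str
    hypothesis: str
    experiment_ends: date


def needs_attention(branches: list[Branch], today: date) -> list[str]:
    """Return branch names with no owner or an experiment past its end date."""
    return [
        b.name
        for b in branches
        if not b.owner.strip() or b.experiment_ends < today
    ]


# Hypothetical review data, invented for illustration.
tree = [
    Branch("activation", "Ana", "Guided setup raises week-1 activation", date(2025, 3, 31)),
    Branch("downgrade prevention", "", "Early alerts reduce downgrades", date(2025, 4, 30)),
    Branch("reactivation", "Sam", "Win-back emails recover churned seats", date(2025, 2, 15)),
]
print(needs_attention(tree, today=date(2025, 3, 1)))
# prints ['downgrade prevention', 'reactivation']
```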

    I’ve seen NRR climb meaningfully in a single quarter when we pair rigorous retention analysis with targeted onboarding improvements and value-based packaging. The lift rarely comes from one big bet; it’s the compounding effect of many small, well-instrumented decisions.

    Here’s the 90-day play I return to: first, baseline NRR by segment and identify the top three drivers of expansion and the top three causes of contraction. Next, streamline onboarding with in-app guides and product tours that accelerate time-to-value and drive user activation. Then, craft expansion plays aligned to real outcomes (additional seats, advanced workflows, new use cases), and operationalize them via QBRs. Finally, preempt downgrades with early-warning alerts, targeted education, and a clear path from “stuck” to “successful.”

    NRR is a team sport. When product, customer success, and go-to-market align around adoption and outcomes, growth compounds, risk declines, and every customer interaction becomes a chance to create more value—today and in every renewal to come.


    Inspired by this post on Pendo – Perspectives.


  • Stop Drowning in Tasks: How AI Marketing Agents Restore Focus and Maximize Impact

    Every week I meet marketers who are working harder than ever—more campaigns, more content, more dashboards—yet seeing less movement on metrics that matter. The surge of AI tooling has amplified activity, not necessarily impact. That’s the focus problem: we confuse motion with momentum, and our backlogs look great while our outcomes stall.

    Learn how AI agents for marketing can help you prioritize impact so you can do important work, instead of just more work.

    In my role leading product and growth teams, I’ve learned that AI only compounds value when it is pointed squarely at outcomes. If we don’t define what “good” looks like, agentic AI will simply scale busywork. The antidote is a disciplined operating model that connects strategy to execution and instruments agents with clear success criteria.

    First, anchor your program with outcomes vs output OKRs. Choose one or two measurable business outcomes—such as qualified pipeline, conversion rate, or activation—and make everything else subordinate. This provides the compass agents need to make effective trade-offs when speed and volume tempt you to do “one more thing.”

    Second, map a driver tree from the target outcome down to the controllable levers: audience segments, offers, channels, messaging, and experience friction. This traceability shows where agents can move the needle fastest—whether that’s accelerating research, sharpening positioning, or eliminating handoffs that slow experimentation.

    Third, design a small, agentic AI workforce aligned to those levers. For example: a Research Agent that synthesizes market insights and past performance; a Copy Agent that generates on-brief, on-brand variants; a Distribution Agent that adapts content to each channel and schedules posts; and an Analytics Agent that runs A/B tests, summarizes results, and flags anomalies. Keep human oversight where judgment matters most—strategy, brand voice, and high-stakes decisions.

    Fourth, instrument rigor from day one with Agent Analytics and eval-driven development. Define offline evals for brand consistency, factuality, safety, and response time; pair them with online experiments that quantify lift on your target outcomes. Set a minimum detectable effect (MDE) so you stop shipping changes that cannot plausibly move the metric.
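
    A quick way to sanity-check an MDE is to ask what effect your available traffic could detect at all. The sketch below uses the standard normal approximation for comparing two conversion rates. The function name, the default significance level and power, and the example inputs are assumptions for illustration, not values from the original post.

```python
from math import sqrt
from statistics import NormalDist


def detectable_effect(baseline_rate: float, users_per_variant: int,
                      alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate smallest absolute lift a two-variant test can detect.

    Uses the normal approximation for comparing two proportions and assumes
    both variants have roughly the baseline variance.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance = 2 * baseline_rate * (1 - baseline_rate) / users_per_variant
    return (z_alpha + z_power) * sqrt(variance)


# Hypothetical inputs: 4% baseline conversion, 5,000 users per variant.
mde = detectable_effect(baseline_rate=0.04, users_per_variant=5_000)
print(f"Smallest detectable absolute lift: {mde:.2%}")
```

    If the lift you realistically expect from a change is well below this number, the experiment cannot confirm it with that traffic, which is a signal to pick a bigger bet or a higher-traffic surface.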

    Fifth, operationalize your AI workflows. Standardize prompts, inputs, and handoffs; templatize briefs and acceptance criteria; and keep a change log so improvements compound rather than reset. Use short, frequent feedback loops to prune low-impact work and double down on what demonstrably advances your objectives.

    I’ve seen teams reclaim focus and momentum when they treat agents as teammates, not toys. The magic isn’t in producing more assets—it’s in consistently choosing the next best action in service of a clear outcome. When you combine outcome clarity, a driver tree, targeted agents, and tight evals, AI becomes a force multiplier for marketing impact.

    If you’re feeling overwhelmed by AI’s possibilities, start small: commit to one outcome, one driver you believe is material, and one agent designed for that job. Prove lift, codify the workflow, then scale. Velocity is only valuable when it’s pointed in the right direction.


    Inspired by this post on Amplitude – Best Practices.


  • Inside the Most Politically Dangerous C‑Suite Role: Hard Truths on Culture, Layoffs, and Leadership

    I’ve long believed the people function is a strategic engine, not a support lane. That conviction was only reinforced in a recent deep dive with Katie Burke, now COO at Harvey after joining as Chief People Officer. Before Harvey, she spent 11 years in HR leadership at HubSpot, helping build one of tech’s most distinctive cultures. In this piece, I unpack what resonated most for me as a product leader: a marketing-minded approach to HR, deliberate hiring from hospitality, and the non-negotiable case for culture as a core business strategy.

    The first principle is simple and often overlooked: HR leaders should think like marketers. Employer brand is a product; your candidate and employee journeys are funnels; and your programs deserve the same rigor we bring to product—segmentation, positioning, channels, and continuous A/B testing. When we treat onboarding, performance, and manager enablement like iterative product launches—complete with activation metrics, retention curves, and NPS—we stop guessing and start compounding results.

    One line has become a north star for how I approach executive leadership: “Don’t ask for a seat at the table. Build the table.” In practice, that means codifying the operating system—decision rights, principles, cadences, and accountability—so the organization isn’t improvising strategy in every meeting. Product, People, and Finance should co-own this OS; that’s how you scale clarity faster than headcount.

    Transparency is the tax we pay for alignment, and it compounds trust. After an IPO, the impulse can be to close ranks. The better move is radical transparency with context: what changed, why it matters, and how decisions get made now. On my teams, that looks like publishing decision records, sharing tradeoffs explicitly, and using written docs to reduce rumor velocity—core muscles in stakeholder management as complexity grows.

    I also loved the counterintuitive hiring bet: prioritize hospitality backgrounds alongside traditional corporate pedigrees. People who’ve thrived in service environments bring customer empathy, operational resilience, and a bias for proactive care—traits that elevate everything from onboarding to incident response. In product terms, they’re culturally accretive hires with high signal on service quality and consistency.

    The trickiest part of the Chief People Officer role isn’t process—it’s politics. You are the executive team’s own HR business partner, which requires coaching, candor, and conflict mediation at the highest stakes. The goal is to “Be the Michael Jordan of your exec team”—the teammate who elevates standards, makes others better, and chooses the hard right over the easy familiar.

    Layoffs create a culture debt that accrues interest. Expect a “2.5-year cultural hangover after a layoff” unless you actively repay it. That repayment plan includes narrating the why with specificity, rebuilding trust through manager enablement, and re-anchoring on performance and values. Measure leading indicators (manager effectiveness, time-to-decision, psychological safety) alongside lagging ones (regretted attrition) to track the true recovery arc.

    People leaders also need to create “graceful exits.” Doing this well preserves dignity for the person, protects the team’s morale, and safeguards the company’s brand. The bar is straightforward: clear rationale, fair process, useful feedback, generous support, and alumni pathways. A graceful exit signals that even when business realities bite, respect is non-negotiable.

    Expectation-setting matters. Two truths cut through the noise: “The workplace shouldn’t be Disneyland” and “Our job is not to make you happy every day.” The promise is not perpetual happiness; it’s meaningful work, fair standards, growth opportunities, and leaders who tell the truth. When we set that contract clearly, engagement becomes an outcome of purpose and progress—not perks.

    On feedback, I use the protein vs. sugar rule for employee feedback. Sugar feedback is pleasant and perishable; protein feedback is specific, sometimes uncomfortable, and growth-driving. Great cultures build a taste for protein—clear role expectations, crisp examples, and written follow-ups. Mechanically, that looks like structured 1:1s, decision retros, skip-levels, and manager training that demystifies “what good looks like.”

    Being a Chief People Officer isn’t for the faint of heart. The role must be demanding by design—on executive hiring quality, performance management courage, and values enforcement. Moments like “Berry-Gate” are reminders that small symbolic issues can balloon when feedback loops are unclear. Close the loop fast, publish the rationale, and ensure there’s a predictable path for concerns to be heard and resolved.

    When hiring, beware patterns that predict friction. That’s why “frequent flyers” are a new-hire red flag. Movement can signal adaptability, but weather-vane pivots and blame-shifting often repeat. Probe for ownership, learning moments, and sustained impact; you want people who compound value, not just sample it.

    Clarity on scope prevents leadership whiplash. Which company decisions fall to the Chief People Officer? Think leveling frameworks, compensation philosophy and bands, performance calibration, manager standards, ER policies, and org design guardrails—always in lockstep with Finance and the CEO. Escalate when there are values collisions or systemic risks; otherwise, push decisions to the right altitude and owner.

    Scaling exposes the same few failure modes on repeat: fuzzy decision rights, a thin manager bench, brittle processes that don’t flex, and inconsistent leveling that erodes trust. The antidote is an operating model that pairs clear principles with lightweight mechanisms—documented roles, regular calibration, and reviews that audit for both outcomes and operating behaviors.

    Comparing a scaled SaaS like HubSpot with an AI-native company like Harvey surfaces important differences. The former optimizes for durable systems, predictable cadences, and governance; the latter optimizes for rapid learning loops, emergent org design, and a higher tolerance for ambiguity. The art is porting the right controls at the right time without crushing velocity.

    AI is already changing the people function. GenAI can draft job descriptions, summarize performance notes, classify themes from engagement surveys, and power AI workflows that resolve common HR tickets. The human-in-the-loop remains essential for judgment, context, and ethics—especially around data governance and privacy-by-design. A pragmatic AI Strategy here frees HRBPs for higher-order coaching and organizational development work.

    One practice I recommend widely: share your own performance reviews. Modeling openness normalizes growth and turns feedback into a shared craft, not a secret ritual. It also builds trust when you later ask the organization to lean into sharper, protein-rich feedback.

    Finally, disagreements with the CEO are inevitable—and healthy. Handle them with pre-briefs, crisp written proposals, explicit tradeoffs, and a shared decision record. Argue like scientists, not politicians; once a call is made, disagree and commit. That combination of candor and alignment is what keeps executive teams high-trust and high-velocity.

    The people leader’s chair may be the most politically dangerous role in the C-suite—but it’s also one of the most leveraged. Build the table, tell the truth, design for standards and dignity, and treat culture like the product that powers everything else.


  • Commercial vs. Internal Products: Hard Truths, High Leverage, and How I Make the Call

    Internal Products Are Hard; Commercial Products Are Harder. That line captures years of hard-won lessons from leading both internal platforms and market-facing SaaS at HighLevel. I’ve seen how the two demand different muscles—even when the tech stack, talent, and timelines look the same on paper.

    When I talk about internal products, I mean services and solutions that our own employees use to take care of customers—customer-enabling tools and services, agent consoles, fulfillment and billing workflows, operations dashboards, and the underlying platforms that keep them fast, compliant, and resilient. These tools don’t generate revenue directly, but they quietly determine customer experience, gross margin, and how quickly we can ship, resolve issues, and scale.

    Commercial products, by contrast, add a second challenge layer. Beyond discovery, usability, and reliability, we must conquer positioning, pricing and packaging, competitive differentiation, sales enablement, procurement hurdles, and ongoing customer success motion. The surface area for failure is bigger, and the time-to-signal on product-market fit is slower and noisier.

    Here’s how I decide where to invest. First, I anchor on outcomes, not output. If the business priority is net revenue retention, faster onboarding, or reduced cost-to-serve, internal products often provide the highest-leverage path. If the priority is new revenue, new market entry, or a must-have differentiator, we lean commercial. I make the trade explicit in outcomes vs output OKRs so we can defend the decision when pressure mounts.

    Second, I run a clear build vs buy calculus. For internal needs, the default is buy if a mature, configurable solution exists that meets our security, data governance, and integration requirements. I only build when the workflow is core to our differentiation, the TCO of customization is lower than vendor sprawl, or we can capture unique proprietary advantage. For commercial products, I avoid embedding third-party IP in a way that caps differentiation or compresses margins as we scale.

    Third, I insist on continuous discovery. Internal audiences are not a captive market—they’re discerning experts with real jobs to do. I treat them like customers, with structured customer interviews, journey mapping, and opportunity solution trees. I rely on empowered product teams and product trios to validate problems and reduce solution risk before we commit engineering time.

    Fourth, I frame commercial vs internal work with capacity guardrails. In most planning cycles, I reserve explicit allocation for platform scalability and internal tooling, separate from feature bets. Without this, internal products become backlog filler, which guarantees we’ll pay the interest later in churn, SLA breaches, and slower delivery.

    Execution differs too. For internal products, change management is the make-or-break. I plan enablement as a first-class deliverable: clear rollouts, in-app guides, training, and feedback loops with frontline champions. I track adoption, time-to-resolution, error rate, and satisfaction for internal users with the same rigor we apply to external users.

    For commercial products, I design the discovery-to-GTM handshake early. Pricing and packaging must reflect value drivers discovered in research, not what’s easiest to meter. Sales and solutions engineering need crisp narratives, objection handling, and proof points. Customer success needs activation plans and health signals tied directly to leading indicators of retention.

    Across both, I instrument the product and process. I lean on feature flags and progressive delivery to manage risk, and I protect SLOs with error budgets so teams balance reliability with iteration speed. CI/CD isn’t a badge—it’s how we earn the right to ship continuously without eroding trust.
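
    To make the error-budget idea concrete, here is a minimal sketch of the arithmetic. It assumes an availability SLO measured in minutes over a rolling window; the function names, the 30-day window, and the example numbers are illustrative assumptions rather than a description of any team's actual tooling.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for a given SLO."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    return (1 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, bad_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - bad_minutes) / budget


# Hypothetical service: 99.9% availability SLO, 30 bad minutes so far.
print(round(error_budget_minutes(0.999), 1))      # prints 43.2
print(round(budget_remaining(0.999, 30.0), 2))    # prints 0.31
```

    When the remaining fraction approaches zero, the team slows feature rollout and spends the next cycle on reliability; when plenty remains, it can afford riskier releases.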

    Common pitfalls recur. Teams skip UX for employee tools because “they have to use it”—which backfires as shadow workflows and rework. Leaders underfund internal platforms, then wonder why velocity stalls. On the commercial side, teams over-index on features and under-invest in positioning and onboarding, leading to poor activation and elongated sales cycles.

    What’s the payoff? When we treat internal products as products, we unlock scale: shorter handling times, fewer escalations, clearer accountability, and higher customer satisfaction. When we approach commercial products with the same discovery rigor plus smart GTM, we compress time-to-value and amplify differentiation. The craft is knowing which lever to pull when—and having the discipline to measure what matters.

    My rule of thumb is simple. If the goal is operational excellence that compounds across the entire customer journey, invest in internal products with the same intensity you reserve for revenue-generating features. If the goal is market expansion or category leadership, invest in commercial products with a tight discovery-to-GTM loop. In either case, clarity of outcomes, disciplined discovery, and empowered teams win the day.


    Inspired by this post on SVPG.


  • Stop Forcing AI to Prove ROI: A Product Leader’s Playbook to Measure Real Business Value

    Every planning cycle, I feel the drumbeat: “Show me the AI ROI—this quarter.” The pressure is real, especially when boards and CFOs expect immediate payback. Yet when I review stalled initiatives across teams and peers, the pattern is consistent: most companies treat AI like a feature to ship, not a system to manage. That mindset almost guarantees we measure the wrong things, declare victory (or failure) too early, and miss the durable value AI can create.

    Here’s the core problem I see: we leap to a solution and skip the counterfactual. Without a baseline, a clear control, or a defined “what would have happened otherwise,” we’re guessing. We also fixate on lagging, financial KPIs that move slowly (revenue, cost, risk), then use outputs, not outcomes, as OKRs. If we don’t align on outcomes vs output OKRs upfront, the best team in the world can still optimize for activity over impact.

    My AI Strategy starts from a simple truth: value shows up along three vectors—revenue, cost, and risk—on different timelines. In the near term, we must validate leading indicators (adoption, engagement, activation) that ladder to those vectors through a transparent driver tree. Over time, those drivers compound into the lagging KPIs finance cares about. When we make the driver tree explicit, everyone can see how model precision, response time, and workflow integration roll up to conversion lift, case deflection, time-to-resolution, or reduced exposure.

    To make this rigorous, I run a five-step playbook. First, define the decision and business outcome in plain terms. Second, instrument the baseline with behavioral analytics on a unified analytics platform—tools like Amplitude analytics or Pendo help expose friction points we’ll later target. Third, create a counterfactual using A/B testing and specify a minimum detectable effect (MDE) so we know how long to run and how much traffic we need. Fourth, quantify costs (training, inference, integration, change management) and include AI risk management, privacy-by-design, and data governance up front. Fifth, lock a measurement plan that connects leading indicators to lagging ROI through the driver tree.
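
    For the third step, the link between MDE, traffic, and run time can be sketched with the standard normal-approximation formula for a two-sided test on two proportions. Everything named here (`users_per_variant`, `days_to_run`, the default alpha and power, and the example inputs) is an assumption for illustration; a real plan should use your experimentation platform's own calculator.

```python
from math import ceil
from statistics import NormalDist


def users_per_variant(baseline_rate: float, mde_abs: float,
                      alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per variant to detect an absolute lift.

    Standard normal-approximation formula for a two-sided test comparing
    two proportions.
    """
    treated_rate = baseline_rate + mde_abs
    z_total = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = (baseline_rate * (1 - baseline_rate)
                + treated_rate * (1 - treated_rate))
    return ceil(z_total ** 2 * variance / mde_abs ** 2)


def days_to_run(n_per_variant: int, daily_eligible_users: int) -> int:
    """Days needed when traffic is split evenly between two variants."""
    return ceil(2 * n_per_variant / daily_eligible_users)


# Hypothetical inputs: 10% baseline, 2-point MDE, 600 eligible users per day.
n = users_per_variant(baseline_rate=0.10, mde_abs=0.02)
print(n, days_to_run(n, daily_eligible_users=600))
```

    Halving the MDE roughly quadruples the required sample, which is why agreeing on the MDE before launch matters more than any later analysis choice.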

    Most AI initiatives don’t fail on model quality—they fail on adoption. If the workflow isn’t smoother, trust isn’t earned, or value isn’t obvious, users revert. That’s why I invest early in onboarding, in-app guides, product tours, and thoughtful tooltip design to reduce the time-to-first-value. Then I watch user activation, retention analysis, and task completion to ensure the assistive experience is not just novel—it’s habit-forming.

    For generative use cases, eval-driven development is non-negotiable. I maintain offline evaluations for accuracy and safety, and online evaluations for business impact. Retrieval-first pipeline health, context window management, and prompt engineering affect reliability; so do latency and grounding quality. We ship behind feature flags, measure guardrail effectiveness, and tighten feedback loops from human-in-the-loop reviews into model updates—continuously.

    On the business side, I avoid “AI theater” by structuring benefits like a CFO. Revenue: increased conversion or expansion driven by better recommendations, faster sales cycles, or higher trial activation. Cost: case deflection, agent time saved, fewer escalations, and lower rework. Risk: reduced exposure via automated checks, anomaly detection, and consistent policy application. If any claim can’t be tied to measured deltas—via A/B testing or strong quasi-experiments—it doesn’t go in the deck.

    Build vs buy deserves the same discipline. I map platform scalability, governance requirements, and total cost of ownership against time-to-impact. Teams often underestimate integration and maintenance drag; a pragmatic mix of bought components with thin custom layers can accelerate outcomes while keeping options open. The goal isn’t to own every layer—it’s to own the learning loop and the differentiated experience.

    I also remind teams that tooling should serve the strategy, not replace it. I’ve seen concise, effective messaging that captures the point: “Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.” The words are compelling because they reflect the three-vector value model and the adoption imperative. The same standard should apply to any AI initiative we propose.

    If you’re under pressure to prove ROI, shift the conversation: lead with the driver tree, specify your counterfactual, and anchor on leading indicators you can move in weeks—not quarters. Then connect those to the lagging KPIs finance expects over time. When we manage AI like a product—grounded in evidence, experimentation, and user-centered adoption—we don’t have to force ROI. We compound it.


    Inspired by this post on Pendo – Perspectives.


  • How Top Product Teams Roadmap Through Uncertainty: Align Faster, Adapt Smarter, Deliver

    Product roadmaps should not be promises etched in stone; they are portfolios of bets made under uncertainty. When I build a roadmap, I’m not predicting the future—I’m designing a system that helps the team learn faster than the market changes, allocate capital wisely, and create alignment across engineering, design, go-to-market, and leadership.

    The best roadmaps I’ve seen and shipped anchor on outcomes rather than features. “Outcomes vs output OKRs” is more than a slogan; it’s how we translate strategy into measurable impact. I start by defining a small set of outcome metrics that matter—such as activation rate, time-to-first-value, or expansion revenue—and attach clear key results and guardrails to each theme. This reframes prioritization from “what can we build?” to “what must change in customer behavior?” and gives empowered product teams real autonomy.

    I organize the roadmap into time horizons—Now, Next, Later—with explicit confidence levels. Near-term items have higher confidence and more specificity; mid- and long-term bets are thematic with wider time windows. This approach reduces false precision and builds trust because stakeholders can see both the intent and the uncertainty. When dates matter, I use windows and service level expectations rather than single deadlines, and I pair each initiative with a lightweight risk scoring so we can discuss uncertainty explicitly rather than implicitly.

    Continuous discovery keeps the roadmap honest. I partner in tight “product trios” across product, design, and engineering to run rapid customer interviews, opportunity sizing, and assumption tests before we commit significant delivery capacity. The opportunity solution tree is my favorite artifact here; it visualizes the path from outcomes to opportunities to experiments and solutions, making trade-offs and sequencing transparent. By the time something moves into sprint planning, we’ve already reduced key uncertainties and clarified the narrowest viable slice we can ship.

    Uncertainty demands options. I plan initiatives as options with stage gates and explicit kill criteria rather than as single monolithic projects. For every significant theme, I outline base, best, and worst-case scenarios with pre-decided triggers for when we escalate, pivot, or stop. This practice prevents sunk-cost fallacy and keeps the team focused on evidence. We treat scope as a knob, not a switch, and we bias toward small, sequential bets that compound learning.

    Capacity is strategy. I routinely reserve a discovery buffer—typically 10–20%—and a contingency buffer for integration, security, and performance risks that always show up late. I ruthlessly control work-in-progress to limit thrash and protect the team’s ability to respond when new information arrives. When we must navigate dependencies, I use thin vertical slices and decouple via contracts or feature flags so discovery momentum doesn’t stall while platforms evolve underneath.

    Prioritization under uncertainty benefits from explicit models. I combine value, effort, and confidence with risk scoring to surface where the unknowns are hiding. Driver trees help us connect top-level outcomes to leading indicators, so we can place bets where they have the highest causal leverage. I also lean on the Kano Model and qualitative signals to avoid over-investing in performance attributes while neglecting excitement features that unlock differentiation and word-of-mouth.
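
    To make “combine value, effort, and confidence with risk scoring” concrete, here is one possible scoring sketch. The formula (value times confidence, discounted by risk, divided by effort) is an assumption I’m using for illustration, similar in spirit to RICE-style heuristics; the `Bet` fields, the bet names, and every number are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bet:
    name: str
    value: float       # expected outcome impact, in arbitrary relative units
    effort: float      # rough cost, e.g. team-weeks
    confidence: float  # 0..1, how much evidence supports the value estimate
    risk: float        # 0..1, chance that delivery or adoption risk erases the value


def score(bet: Bet) -> float:
    """Illustrative heuristic: risk-discounted, confidence-weighted value per unit of effort."""
    return bet.value * bet.confidence * (1 - bet.risk) / bet.effort


# Hypothetical bets and inputs.
bets = [
    Bet("Integration wizard", value=10, effort=5, confidence=0.5, risk=0.3),
    Bet("In-app guide", value=8, effort=2, confidence=0.8, risk=0.1),
]

ranked = sorted(bets, key=score, reverse=True)
for bet in ranked:
    print(f"{bet.name}: {score(bet):.2f}")
```

    The ranking matters less than the inputs: a low confidence or high risk number is a prompt for discovery work, not just a reason to deprioritize.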

    The most effective stakeholder management is narrative-first. For executives, I present a one-page outcomes roadmap that shows themes, expected shifts in key results, and the learning plan. For teams, I provide a more detailed plan that links discovery insights, assumptions-to-test, and decision points. I make room for a “what we’re not doing” section to reduce noise and prevent shadow backlogs from reappearing in every meeting. Most importantly, I socialize change before it happens, explaining the evidence and the trade-offs so adjustments feel like progress, not whiplash.

    Measurement closes the loop. We instrument experiments and releases with leading indicators tied to the driver tree and review them on a predictable cadence. If movement stalls, we diagnose whether we have a targeting problem (wrong audience), a value problem (weak proposition), or a friction problem (broken journey). That discipline lets us iterate with purpose instead of chasing vanity metrics or isolated anecdotes.

    Here’s a concrete example of roadmapping through uncertainty. Suppose our Q3 objective is to “Increase user activation” with key results to raise the Week-1 activation rate from 32% to 45% and cut time-to-first-value by 30%. In discovery, customer interviews reveal confusion in the first-run setup and a missing integration that advanced users expect. We map an opportunity solution tree and identify two high-leverage opportunities: simplifying the first 10 minutes and offering a guided setup for the integration. We then shape two minimal bets: an in-app guide to streamline the first three tasks and an integration wizard behind a feature flag. Each bet has an explicit decision rule and a two-sprint runway. We ship the guide first, confirm a statistically significant lift via A/B testing, then expand scope. The integration wizard underperforms initial expectations, so we pause, revisit the assumptions, and re-allocate buffer to the stronger path. The roadmap updates in real time, and everyone understands why.
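
    For the “confirm a statistically significant lift” step, a two-proportion z-test is one common way to read a binary metric like activation. The sketch below is a minimal version using only the standard library; the counts are hypothetical (they are not results from the example above), and `two_proportion_z_test` is a helper name I made up.

```python
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rates, using the pooled rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: control activates 640 of 2,000 users (32%),
# the in-app guide variant activates 720 of 2,000 (36%).
z, p_value = two_proportion_z_test(640, 2000, 720, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

    The decision rule for each bet should name the threshold in advance, so the test confirms or rejects the bet rather than starting a debate about what counts as a win.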

    When uncertainty spikes—new competitor, pricing shock, platform deprecation—I shift the roadmap cadence to rolling-wave planning. We shorten planning horizons, increase the frequency of readouts, and elevate discovery allocations temporarily. We also create thematic “containment zones” where we explore multiple options in parallel with small budgets until one path justifies scale. This allows us to stay responsive without abandoning strategy.

    Good governance accelerates delivery; it doesn’t slow it down. A lightweight product council that reviews outcomes, risks, and cross-functional dependencies prevents surprise escalations and ensures we keep shipping what matters. We avoid death-by-approval by agreeing in advance on decision rights and thresholds. For example, a product trio can pivot a bet within a theme up to a certain budget or timeline impact without additional approval, as long as it improves the outcome likelihood.

    If you’re evolving your roadmap practice, start with three moves. First, reframe your plan in outcomes and publish a driver tree that connects those outcomes to the few leading indicators you believe move them. Second, stand up a continuous discovery cadence with a visible opportunity solution tree and an assumptions-to-test backlog. Third, implement time windows and confidence levels for all mid- and long-term items, and pair each major initiative with explicit kill criteria. You’ll feel the difference in a single quarter: clearer trade-offs, faster learning, and more predictable delivery—despite uncertainty.

    In the end, a roadmap that thrives in uncertainty is an agreement about how we learn and decide together. It aligns the organization on outcomes, it funds options—not fantasies—and it gives empowered product teams room to maneuver. That’s how top product teams plan for uncertainty and still deliver with confidence.


    Inspired by this post on Product Talk.


  • Stop Misleading A/B Tests: Master Sample Size Assumptions for Reliable Results

    I’ve learned the hard way that sample size calculators can be both empowering and deceptive. They feel wonderfully precise, but they’re only as trustworthy as the assumptions you feed them. When I lead A/B testing at scale, I treat the calculator as a planning tool, not a verdict—then I systematically validate the assumptions behind it so our decisions stay rigorous and our roadmap stays credible.

    At a minimum, most calculators assume you know your baseline rate, your “minimum detectable effect (MDE),” your desired statistical power, and your significance level. They also quietly assume independent observations, clean randomization, stable traffic quality, and a fixed test horizon with no peeking. If any of those break, the “right” sample size can be wildly wrong—and the test conclusions can nudge teams toward the wrong product or go-to-market bet.

    Baseline and variance come first for me. I estimate the baseline conversion (and volatility) from recent behavior using behavioral analytics, sanity-check it across key segments, and look for seasonality. Tools like Amplitude analytics help me spot anomalies, bots, or instrumentation drift. If baseline is unstable or highly skewed, I either stabilize it with longer lookbacks or narrow the target segment to reduce noise.

    Setting the “minimum detectable effect (MDE)” is where product strategy meets statistics. I work backward from an outcome that actually matters: the revenue, retention, or activation uplift that justifies the opportunity cost of building and running the experiment. If that effect size is implausible given historic lift and variance, I rethink the scope or stack changes into a sequenced set of learning experiments rather than overpromising a single moonshot.

    For power and alpha, I default to 80–90% power and a 5% significance level unless the downside risk of a false positive is unusually high, in which case I tighten alpha. I choose one-tailed tests only when we would not act on a negative result and we’ve explicitly pre-registered that decision; otherwise, two-tailed is safer for real-world ambiguity.
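
    To show how baseline, MDE, power, and alpha interact, here is a minimal sketch of the normal-approximation sample size formula for comparing two proportions. Real calculators may use slightly different variants, so expect small differences; the function name and the example inputs (a 32% baseline and a 3-point absolute MDE) are assumptions for illustration.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde_abs: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm for a two-sided test of two proportions."""
    p1 = baseline
    p2 = baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance_sum / mde_abs ** 2
    return math.ceil(n)


# Hypothetical planning inputs.
n_80 = sample_size_per_arm(baseline=0.32, mde_abs=0.03)
n_90 = sample_size_per_arm(baseline=0.32, mde_abs=0.03, power=0.90)
n_half_mde = sample_size_per_arm(baseline=0.32, mde_abs=0.015)
print(n_80, n_90, n_half_mde)
```

    The MDE sits squared in the denominator, so halving the effect you want to detect roughly quadruples the traffic you need. That is usually the number worth negotiating first.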

    Randomization and independence are where many tests quietly fail. I randomize at the user level (not session or pageview), guard against cross-device contamination, and ensure consistent exposure via feature flags. If there’s shared context—say, team-based usage or geographic clustering—I account for it via cluster randomization or acknowledge the inflated variance it can introduce.
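
    One common way to get stable user-level assignment is to hash the user ID together with the experiment name, so the same user always lands in the same arm across sessions and devices (as long as the ID is the same). This is a sketch of that general idea, not the API of any particular feature flag product; `assign_variant` and the experiment name are hypothetical.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically map a user to an arm; salting by experiment decorrelates tests."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform-ish value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


# The same user gets the same arm on every call.
first = assign_variant("user-42", "onboarding-guide")
second = assign_variant("user-42", "onboarding-guide")
print(first, first == second)

# Rough balance check over hypothetical user IDs.
arms = [assign_variant(f"user-{i}", "onboarding-guide") for i in range(10_000)]
print(arms.count("treatment"), arms.count("control"))
```

    For cluster randomization, the same sketch applies with a team or account ID in place of the user ID, at the cost of fewer independent units.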

    Traffic allocation integrity is non-negotiable. I monitor for sample ratio mismatch by comparing observed group splits to the intended allocation and immediately pause if they drift. When SRM appears, the root cause is often instrumentation gaps, eligibility filters applied asymmetrically, or caching layers. Fixing that early preserves trust in every test that follows.
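
    A sample ratio mismatch check can be as simple as a chi-square goodness-of-fit test on the two group counts. The sketch below uses the closed form for one degree of freedom so it needs only the standard library. The counts are hypothetical, and the strict 0.001 alert threshold is a convention many teams use, not a rule from this post.

```python
import math


def srm_p_value(n_control: int, n_treatment: int, treatment_share: float = 0.5) -> float:
    """Chi-square goodness-of-fit p-value (1 degree of freedom) for the observed split."""
    total = n_control + n_treatment
    expected_treatment = total * treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    # For 1 degree of freedom, the chi-square survival function equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


SRM_THRESHOLD = 0.001  # assumed alerting threshold

for control, treatment in [(5000, 5200), (5000, 5400)]:
    p = srm_p_value(control, treatment)
    print(control, treatment, f"p = {p:.5f}", "PAUSE" if p < SRM_THRESHOLD else "ok")
```

    A strict threshold keeps the check from firing on ordinary sampling noise, while a genuinely broken split still shows up quickly at realistic traffic volumes.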

    Fixed-horizon math assumes no peeking. If stakeholders need continuous reads, I use sequential testing methods with alpha spending or always-valid approaches designed for ongoing monitoring. If we commit to a fixed horizon, we stay disciplined: no early looks, no midstream metric swaps, no retrofitted hypotheses.
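
    To give a feel for alpha spending, here is a sketch of the Lan-DeMets O'Brien-Fleming-type spending function, which decides how much of the total alpha may be used by each interim look. It returns only the cumulative alpha spent at a given information fraction; turning that into per-look stopping boundaries takes numerical work that dedicated group-sequential software handles. The function name and look schedule are my own assumptions.

```python
from statistics import NormalDist


def obf_alpha_spent(info_fraction: float, alpha: float = 0.05) -> float:
    """Cumulative two-sided alpha spent at a given fraction of the planned sample."""
    if not 0 < info_fraction <= 1:
        raise ValueError("info_fraction must be in (0, 1]")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / info_fraction ** 0.5))


# Hypothetical schedule: four equally spaced looks.
for fraction in (0.25, 0.5, 0.75, 1.0):
    print(f"{fraction:.2f} of sample -> cumulative alpha {obf_alpha_spent(fraction):.5f}")
```

    The shape is the point: almost no alpha is available early, so an early stop requires overwhelming evidence, and the final look keeps nearly the full budget.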

    Multiple comparisons can quietly inflate false positives. I predeclare one primary metric to decide, define guardrail metrics to protect experience and revenue, and apply appropriate corrections (for example, controlling the false discovery rate) when testing many variants or slicing results by numerous segments.
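
    For controlling the false discovery rate, the Benjamini-Hochberg procedure is the usual starting point. Here is a minimal standard-library sketch; the p-values are made up for illustration.

```python
from typing import List


def benjamini_hochberg(p_values: List[float], q: float = 0.05) -> List[bool]:
    """Return, in the original order, which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: find the largest rank whose p-value is under its threshold.
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten segment slices of one experiment.
segment_p_values = [0.041, 0.001, 0.216, 0.008, 0.039, 0.06, 0.074, 0.205, 0.212, 0.042]
print(benjamini_hochberg(segment_p_values))
```

    Note that three of these slices have p < 0.05 on their own but do not survive the correction. Those are exactly the "wins" that tend to evaporate on replication.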

    Duration and seasonality matter more than most roadmaps admit. I run through full business cycles (at least one complete week for daily patterns, longer for B2B buying rhythms), plan for novelty effects, and watch for behavior settling after initial exposure. If the intervention changes long-run behavior, I extend the measurement window or add a post-test holdout to capture durable impact.

    Not all metrics are binomial. For revenue, time-on-task, or heavy-tailed distributions, I confirm variance assumptions, use robust estimators or bootstrapping, and consider variance reduction methods like CUPED to improve power without overextending duration. The calculator’s simplicity should not mask the data’s complexity.
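
    CUPED is easier to trust once you see how little there is to it: subtract the part of the metric that a pre-experiment covariate already predicts. This sketch uses simulated data, so the numbers are not real results. In a live test the coefficient is usually estimated on pooled data from both arms and the covariate must be measured before exposure; `cuped_adjust` is a hypothetical helper name.

```python
import random
from statistics import fmean, pvariance
from typing import List


def cuped_adjust(metric: List[float], covariate: List[float]) -> List[float]:
    """Return the metric adjusted by its pre-period covariate (same mean, lower variance)."""
    mean_x = fmean(covariate)
    mean_y = fmean(metric)
    cov_xy = fmean([(x - mean_x) * (y - mean_y) for x, y in zip(covariate, metric)])
    theta = cov_xy / pvariance(covariate)
    return [y - theta * (x - mean_x) for x, y in zip(covariate, metric)]


# Simulated users: in-test spend correlates with pre-period spend.
random.seed(7)
pre_period = [random.gauss(50, 10) for _ in range(1000)]
in_test = [0.8 * x + random.gauss(0, 5) for x in pre_period]

adjusted = cuped_adjust(in_test, pre_period)
print(f"variance before: {pvariance(in_test):.1f}, after: {pvariance(adjusted):.1f}")
```

    The variance reduction depends entirely on how well the pre-period covariate predicts the metric. For brand-new users with no history, CUPED has nothing to work with.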

    Finally, I connect experimentation to product outcomes. I map hypotheses to a driver tree, ensure each test ladders to activation, retention, or monetization, and document assumptions up front so we learn even when results are null. The result is a culture that respects the math and moves faster precisely because we trust our reads.

    Here’s the practical checklist I use before pressing “Start”: validate baseline and variance from recent behavior; set an MDE tied to meaningful business impact; choose power and alpha explicitly; confirm user-level randomization and stable exposure; watch for sample ratio mismatch; align on fixed-horizon vs sequential testing; predeclare a single primary metric and guardrails; run long enough to cover seasonality; use robust methods for non-binomial metrics; and write a brief pre-read so the whole team commits to the plan.

    When we honor these assumptions, sample size calculators become sharp instruments rather than blunt ones. You’ll ship fewer misleading wins, avoid costly false negatives, and build a repeatable experimentation engine that compounds learning—and results—over time.


    Inspired by this post on Amplitude – Perspectives.


  • Join Me in April: Build a Continuous Interviewing Habit and Unlock Real Customer Insights

    “Continuous Discovery Habits” turns five this year, and I’m celebrating by reading it with our community—together, in practice, not just in theory. Each month, I’m publishing an in-depth reading guide with the chapters we’ll cover, a preview of the most important concepts, short videos you can share with your teams, individual and team discussion questions, practical exercises to apply what you read, and additional resources to go deeper.

    We’ll keep the conversation active in the comments each month and meet live once a quarter to compare notes, share what’s working, and troubleshoot what’s not. If you’re joining late, no problem—start with the current month or go back to January. You can also find all of the book club articles here.

    If you want to participate, grab a copy of the book (or dust off your old one), share the “Spread the Love” videos with colleagues, block time for the team exercises, and register for the community sessions. Let’s dive in together.

    This Month’s Reading

    Chapter 5: Continuous Interviewing. Estimated reading time: ~37 minutes.

    This chapter grounds us in why interviewing on a regular cadence is critical to the success of any product trio; how cognitive biases affect what we learn from direct questions; the difference between research questions and interview questions; how to use story-based interviewing to uncover actual customer behavior (not ideal behavior); the interview snapshot, a one-page tool for synthesizing what you learned from a single interview; how to automate the recruiting process so interviewing becomes easier than not interviewing; and why product trios should interview customers together.

    Need a copy? Grab the book.

    Share the Love with Friends and Colleagues

    We learn best in community. To help your team rally around these practices, share these concise primers and invite them to join the book club discussion with you.

    What are customer interviews? – Build a competitive advantage that compounds over time.

    What should we ask in customer interviews? – Mitigating cognitive biases.

    Research questions vs. interview questions – And why the difference matters.

    Getting reliable feedback from customer interviews – Ask the right questions.

    Who should conduct customer interviews? – My answer might surprise you.

    How do you find customers to interview? – Automate the recruiting process.

    The Interview Snapshot – How to synthesize a single customer interview.

    Reflect and Discuss What You Read

    Reflection cements learning. This month, I’m challenging you—as I challenge my own teams—to build a weekly habit of interviewing customers and to shift from direct questions (which trigger bias) to collecting specific stories about past behavior. For many teams, this is a big mindset change: from infrequent “big research projects” to lightweight, continuous conversations that fuel daily decision-making.

    Individual Reflection: Think about your last customer interview or conversation. Did you rely on direct questions, or did you excavate a specific story about what happened? How might the answers have changed if you had used the other approach?

    Consider your own behavior—buying jeans, going to the gym, choosing what to watch on Netflix. Where do your ideal intentions differ from what you actually do? How might that same gap show up in your customers’ answers to direct questions?

    Scan your calendar from the past month. How many customer interviews did you conduct? If it’s fewer than four, what got in the way? What needs to change to make weekly interviewing sustainable?

    Team Discussion: As a team, discuss your current interview cadence. If you’re not interviewing at least weekly, name the biggest obstacle—recruiting, time, or synthesis—and commit to reducing one barrier this month.

    Try this together: Ask a teammate, “How does a product idea go from concept to launch at our company?” Have them write it down. Then ask for the last specific feature or improvement that launched and capture the story. Compare the two. What’s different? What does this reveal about the gap between ideal process and actual process?

    If you already interview regularly, ask: Who participates? Is it just one person (like the designer or product manager), or does the whole trio join? What value might you be missing by not having all three perspectives in the room?

    Put It Into Practice

    Understanding the “why” is easy; building the habit is the work. The following exercises are how my teams operationalize continuous interviewing week over week.

    Exercise: Conduct a Story-Based Interview (Time: 20–30 minutes. Do this with your product trio.) Schedule a conversation with a current customer. Instead of drafting a long script, identify a handful of research questions (what you need to learn) and translate them into one story-based interview question (what you’ll ask).

    For example, research questions might include: What challenges do customers face when onboarding? Where do they get stuck? What are we asking them to do that they don’t understand? How can we make it easier for them to get to the activation moment? The corresponding interview question could be: Tell me about the first time you used our product.

    During the interview, excavate the story with temporal prompts like “What happened first?”, “What happened next?”, and “What happened before that?” If the participant drifts into generalities (“I usually…” or “In general…”), gently bring them back to the specific instance.

    After the interview, debrief as a trio. What did each of you hear? Which opportunities surfaced? What surprised you? If you want personalized, detailed feedback on your technique, consider the Interview Coach available through the Story-Based Customer Interviews course.

    Exercise: Create Your First Interview Snapshot (Time: 30 minutes. Do this with your product trio immediately after the interview.) Using the interview snapshot template, capture a photo of the participant (or a visual that represents their story), quick facts about their context, a memorable quote you’ll still recall months from now, the opportunities (needs, pain points, desires) you heard, notable insights that aren’t yet opportunities, and an experience map that illustrates the story. Over time, aim to complete each snapshot in 15–20 minutes.

    Go Deeper: Additional Reading

    If you prefer audio, I’ve included an audio summary for paid subscribers that covers this month’s chapter plus the resources below.

    Related In-Depth Guides

    Customer Interviews: How to Recruit, What to Ask, and How to Synthesize What You Learn.

    The Value of Continuous Interviewing

    Why Product Trios Should Interview Customers Together – How interviewing together ensures research is timely, actionable, and believable.

    How to Find Customers to Talk To

    Customer Recruiting: Get Easy Access to Customers Week Over Week – Practical strategies for automating your recruiting process.

    Ask Teresa: How Do You Select Customers for Customer Interviews? – Who to interview and how to recruit them.

    Tools of the Trade: Finding People to Interview Before You Have Customers – Recruiting strategies for early-stage products.

    What to Ask in Your Interviews

    Why You Are Asking the Wrong Customer Interview Questions – Understanding the gap between ideal behavior and actual behavior.

    Story-Based Customer Interviews Uncover Much-Needed Context – Why collecting specific stories is more reliable than asking direct questions.

    Ask Teresa: What Are the Best Customer Interview Questions? – Common questions and how to improve them.

    Ask About the Past Rather than the Future – Why memories about recent instances are more reliable than speculation.

    How to Take Notes and Synthesize What You Are Learning

    How to Take Notes During Customer Research Interviews – Practical tips for capturing what you hear.

    The Interview Snapshot: How to Synthesize and Share What You Learned from a Single Customer Interview – A comprehensive guide to creating and using interview snapshots.

    Customer Interview Analysis: How AI Helps and Hurts – Learn how to use AI effectively.

    Videos

    All Things Product Podcast: Customer Interview Analysis – Petra and I discuss using AI to analyze customer interviews, the risks and benefits, and why your interviewing skills matter more than any AI tool.

    Other Resources from Around the Web

    The Top 5 Mistakes Product Teams Make With Customer Interviews by Pragmatic Live.

    Continuous interviewing with Kristian Collin Berge (CEO & Co-founder at UX Signals) by Afonso Franco.

    How to Make Time for Customer Interviews & Validation by Rich Mironov.

    Brave UX: An interview with Teresa Torres by Brendan Jarvis.

    Related Courses

    Customer Recruiting for Continuous Discovery – Get easy access to customers week over week.

    Story-Based Customer Interviews – Collect reliable feedback from every customer conversation.

    Our Live Discussion Schedule

    Our live discussion sessions are for paid subscribers. Sessions are not recorded. Invitations will go out to members two weeks before each event—add these to your calendar now: Tuesday, June 16, 2026: 9am–10am PDT. Thursday, September 17, 2026: 9am–10am PDT. Wednesday, December 16, 2026: 9am–10am PST.

    Audio Summary

    This summary was produced by NotebookLM. The sources supplied were the book chapters as well as all of the additional reading.

    This article is part of the CDH Book Club celebrating the five-year anniversary of Continuous Discovery Habits.


    Inspired by this post on Product Talk.


  • Never Stop Disrupting: Why the Fin API Platform Signals a New Era for Agentic AI

    Disruption is the only sustainable strategy in product. When a platform meaningfully changes how we build and operate, I pay attention, not just as a product leader, but as someone accountable for turning AI strategy into durable competitive differentiation. That’s why the launch of the Fin API platform stands out: it’s a concrete step toward agentic AI at enterprise scale.

    Today, I’m diving into what this launch includes, why it matters for product strategy, and how I’d navigate the build vs buy decision in this new landscape. My goal is to translate the announcement into actionable guidance for product teams, CX leaders, and forward-deployed engineers who are building the next generation of customer support and product-led experiences.

    Fin is a customer agent platform that currently resolves over 2M customer issues a week, and that volume is growing rapidly. It’s relied on by leading brands, large and small, across nearly every vertical, from Atlassian and Riot Games to smaller upstarts like Mercury and Polymarket. It runs on a family of models trained by its AI group. Last week, they announced Apex, which they describe as the world’s first specialized customer service LLM. In production tests over the last 6 months, it reportedly beat every frontier model, including those from Anthropic and OpenAI, on resolution rate, latency, hallucination rate, and cost.

    With this launch, teams can access the platform’s core capabilities and underlying models directly via API, with contracts starting at $250k per year, and usage rates that are by far the cheapest in the industry for each of the model’s subcategories. For leaders evaluating total cost of ownership, this is a meaningful data point: it shifts the economics of scaled automation from experimental to operational.

    Why now? Because builders want options. I hear from teams daily that want to design their own agents, tune prompts and policies, and integrate with bespoke CRMs, data lakes, and product surfaces. The Fin announcement meets that demand with three clear build-paths, each mapping to a different operating model and maturity stage.

    First, for the vast majority of companies, the Fin Agent Platform is the pragmatic starting point. Fin reports ~8k companies on it today. It addresses 99% of customer needs out of the box—without exhausting consulting engagements—while delivering top-tier resolution rates. If your priority is time-to-value, governance, and platform scalability, this route de-risks implementation and accelerates outcomes.

    Second, for teams that need custom surfaces or channels, the Fin Agent API lets you present Fin in unique contexts. You get the Fin platform’s orchestration and controls, but you’re free to bypass the default messenger, email, voice, or any prebuilt channel and embed the agent natively in your product. I see this as the sweet spot for product-led growth motions where conversation design and UX writing are strategic levers.

    Third, for companies building hyper-specific agents—think service plus in-product actions—the new API access to Apex and the broader collection of models is the obvious move. Unlike generalized models, these are purpose-trained for customer service scenarios and operational policies. If you have strong in-house solutions engineering, a retrieval-first pipeline, and eval-driven development in place, this path maximizes control without reinventing the model layer.

    This also opens the door for vertical specialists. Fin-like businesses focused on deep domains can emerge quickly—Fin for dentists? Why not? Fin for car dealerships? Sure. I expect startups and modern CX providers (including players like Decagon and Sierra) to carve out niches where domain data, workflows, and compliance are the real moats. That’s where differentiated AI beats generic capability.

    There’s a defensive reason to pay attention here. The software landscape is shifting fast: the moat is no longer feature parity; it’s the quality of your agents and the data flywheels powering them. Building software is simply easier now, and I’ve watched engineering teams more than double measurable productivity as they adopt AI-assisted development. The implication is clear: the interface-and-features era is giving way to an agents-and-outcomes era.

    Serious software companies must evolve from being a features company to an agents company—and build those agents on differentiated AI. More value will accrue at the model and orchestration layers, where safety, latency, cost, and resolution quality are won. That puts a premium on prompt engineering discipline, policy routing, continuous discovery of edge cases, and rigorous offline/online evals to keep hallucination rates low while maintaining speed.

    How would I choose among the three build-paths? If you’re early or resource-constrained, start with the Fin Agent Platform to validate outcomes and align stakeholders. If you need branded experiences and tighter product integration, use the Fin Agent API to control surfaces without owning the heavy lifting. If you have strong ML ops and a mature customer support AI strategy, go model-level with Apex and its companion models, layering in your own guardrails, context window management, and test harnesses. In each case, balance velocity, control, and risk. Your build vs buy decision should be grounded in clear metrics and an explicit product strategy.

    Where does this lead? We’ll see more companies expose specialized model families with clearer economics and stronger governance. For now, I’m excited to see what teams build with the Fin API platform—and how they turn agentic AI into measurable improvements in resolution rate, CSAT, cost-to-serve, and ultimately, customer loyalty.


    Inspired by this post on The Intercom Blog.


  • Inside Amplitude’s ML Playbook: Practical Strategies for Smarter A/B Tests and Growth

    I’m continually asked how machine learning can make product analytics more actionable. Drawing on real-world use of Amplitude’s analytics, I’ve distilled what matters most for product teams that want faster, smarter decisions without sacrificing rigor.

    When I design experiments, I start with minimum detectable effect (MDE) to size samples correctly and avoid costly, inconclusive tests. I pair that with disciplined A/B testing hygiene—clear hypotheses, thoughtful stop rules, and guardrails for key metrics—so results translate into credible product strategy choices instead of noisy dashboards.
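
    To make the MDE step concrete, here is a minimal sketch of how I would size a test on a conversion rate. It assumes a two-sided, two-proportion z-test with equal traffic per arm; the function name `sample_size_per_arm` and its defaults (5% significance, 80% power) are my own illustrative choices, not something taken from Amplitude's product.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect an absolute lift of `mde_abs`.

    Assumes a two-sided two-proportion z-test with equal allocation.
    All names and defaults here are illustrative assumptions.
    """
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("baseline and baseline + MDE must both lie in (0, 1)")
    if mde_abs == 0:
        raise ValueError("MDE must be non-zero")

    # Critical values for the chosen significance level and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Sum of the Bernoulli variances of the two arms.
    variance = p1 * (1 - p1) + p2 * (1 - p2)

    n = (z_alpha + z_beta) ** 2 * variance / mde_abs ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline activation, 2-point absolute MDE.
    print(sample_size_per_arm(0.10, 0.02))
```

    Halving the MDE roughly quadruples the required sample, which is why I settle on the MDE before anyone argues about test duration.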

    For growth and retention, I map behavioral analytics to activation and long-term value. Driver trees help me connect feature adoption to revenue or retention, and anomaly detection keeps me from overreacting to outliers when seasonality or data quality shifts.

    I segment cohorts by user intent and lifecycle stage, measure user activation with crisp event definitions, and monitor leading indicators across a unified analytics platform. This keeps cross-functional conversations grounded, accelerates product-led growth, and reduces the risk of optimizing for vanity metrics.

    Operationally, that means building self-serve views that flag MDE-ready experiments, surface retention analysis by cohort, and trigger anomaly detection alerts only when the signal outpaces noise. The payoff is fewer meetings debating data quality and more time shipping value.
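
    One simple way to alert only when the signal outpaces noise is a robust z-score against a trailing window. The sketch below is an assumption on my part, not a description of Amplitude's anomaly detection: it uses the median and the median absolute deviation (MAD), and the names `is_anomaly`, `threshold`, and `min_history` are hypothetical.

```python
from statistics import median


def is_anomaly(history, latest, threshold=3.5, min_history=5):
    """Return True when `latest` sits far outside the noise seen in `history`.

    Uses a robust z-score: 0.6745 * (x - median) / MAD. The threshold of 3.5
    is a common rule of thumb and is an assumption, not a tuned value.
    """
    if len(history) < min_history:
        # Too little data to estimate noise, so stay quiet.
        return False

    center = median(history)
    mad = median([abs(x - center) for x in history])

    if mad == 0:
        # History shows no spread around the median; any departure is notable.
        return latest != center

    robust_z = 0.6745 * (latest - center) / mad
    return abs(robust_z) > threshold


if __name__ == "__main__":
    # Hypothetical daily activation counts.
    trailing = [100, 102, 98, 101, 99, 103, 97]
    print(is_anomaly(trailing, 101))
    print(is_anomaly(trailing, 140))
```

    This sketch ignores seasonality; in practice I would compare against the same weekday or a deseasonalized series before trusting an alert.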

    If you’re leveling up your analytics stack, start by tightening experimentation basics, instrumenting activation and retention with behavioral analytics, and wiring in anomaly detection as a safety net. You won’t just move faster—you’ll learn faster, and with the confidence to bet big when the data earns your trust.


    Inspired by this post on Amplitude – Perspectives.


  • Inside Banani: How a Canvas-First AI Designer Elevates UX and Accelerates Product Teams

    Inside Banani: How a Canvas-First AI Designer Elevates UX and Accelerates Product Teams

    I believe the future of product design isn’t about replacing designers—it’s about giving every team access to one. That’s why Banani grabbed my attention. It’s an AI product designer that doesn’t just generate code—it generates design. For solo founders, stretched design teams, and early-stage startups, that shift matters: it raises the design floor without lowering the creative ceiling.

    I spent time with Vlad Solomakha (CEO & Co-founder), Vova Kovalchuk (CTO & Co-founder), and Vlad Ostapovats (Founding Growth) to unpack how they took Banani from a Figma plugin proof-of-concept to a canvas-first AI design tool generating hundreds of thousands of designs per week. Vlad brings a decade of design experience and a precise north star: AI should produce beautiful, tasteful design rather than average, undifferentiated UI.

    The architectural choices stood out. They engineered their agent to handle parallel screen edits, manage per-screen context across canvases with hundreds of frames, and make surgical edits without regenerating entire screens. This is the kind of agentic AI work that product leaders have been waiting for: concrete advances in context window management, tool orchestration, and prompt engineering that translate into higher throughput without sacrificing quality.

    Equally important is how they addressed the "gulf of specification"—the mismatch between how designers think visually and how agents understand text. Banani’s canvas-first approach acknowledges that design is spatial, hierarchical, and iterative. Rather than forcing a chat-first UX, they center the canvas and let the agent do production work while keeping the designer firmly in control. In practice, this narrows intent ambiguity, speeds up iteration, and preserves taste.

    The team made another pivotal bet: Banani doesn’t compile running applications, only HTML/CSS mockups, and that choice shapes everything else. By decoupling the design artifact from runnable code, they optimize for velocity, taste, and exploration. In my experience, this separation is the right product strategy for early discovery and for prototyping with generative AI: move fast on aesthetics and flows, then converge on implementation once you’ve validated the direction.

    I also appreciated their pragmatic evaluation approach. Instead of traditional evals, they spin up 10 screens from one prompt to compare models. It’s hands-on, outcome-based, and aligned with eval-driven development in real product environments. They’re relentlessly discerning about when to work around model limitations versus when to wait for the models to improve—an essential discipline when building at the edge of what’s possible.

    Under the hood, context engineering and specialized agent tools do the heavy lifting. Per-screen history with shared project context enables precise, reversible changes across large canvases. The result: fewer destructive regenerations, more reliable design intent preservation, and a workflow that feels like collaborating with a strong mid-level designer who’s exceptionally fast and consistent.
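
    To picture what per-screen history with shared project context might look like, here is a small sketch. It is my own illustration, not Banani's implementation: the `Screen` and `Project` classes, the `apply_edit`, `undo`, and `context_for` methods, and the idea of storing whole HTML snapshots are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Screen:
    """One frame on the canvas, with its own edit history (hypothetical model)."""
    html: str
    history: list = field(default_factory=list)  # earlier versions, oldest first

    def apply_edit(self, old, new):
        """Surgical edit: swap one fragment instead of regenerating the screen."""
        if old not in self.html:
            raise ValueError("fragment to replace was not found on this screen")
        self.history.append(self.html)
        self.html = self.html.replace(old, new, 1)

    def undo(self):
        """Revert the most recent edit, if there is one."""
        if self.history:
            self.html = self.history.pop()


@dataclass
class Project:
    """Shared context plus independent screens (hypothetical model)."""
    shared_context: dict = field(default_factory=dict)
    screens: dict = field(default_factory=dict)

    def context_for(self, name, max_history=3):
        """Build the context an agent would see when editing one screen.

        Only this screen's recent history is included, so a canvas with
        hundreds of frames does not flood the prompt.
        """
        screen = self.screens[name]
        return {
            "shared": dict(self.shared_context),
            "current": screen.html,
            "recent_history": screen.history[-max_history:],
        }


if __name__ == "__main__":
    project = Project(shared_context={"brand_color": "#1a73e8"})
    project.screens["login"] = Screen("<button>Sign in</button>")
    project.screens["home"] = Screen("<h1>Welcome</h1>")

    project.screens["login"].apply_edit("Sign in", "Log in")
    print(project.context_for("login"))
    project.screens["login"].undo()
    print(project.screens["login"].html)
```

    Because each screen keeps its own history, an edit to one frame can be undone without touching any other, and the shared context travels with every request.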

    If you want a quick tour, I recommend jumping to a few highlights: 20:13 Product Tour Canvas First AI, 33:40 Gulf of Specification, 42:54 Agent Architecture Under Hood, 48:48 State History Context Tricks, and 56:04 Navigating Busy Canvases. Each segment reveals a different layer of the system design and product thinking behind Banani’s canvas-first UX.

    For product leaders, this is a compelling blueprint for raising the design floor while protecting the last mile of craft. It fits empowered product teams, continuous discovery, and product managers who want leverage from LLMs without giving up judgment. If you’re exploring agentic AI in design, this is a thoughtful, execution-focused model worth studying and trialing on your next product tour or redesign.

    Resources worth exploring: Banani and tldraw. To hear the full conversation, you can listen on Spotify or Apple Podcasts. Then, pressure-test the approach inside your own product development lifecycle and see how a canvas-first AI designer reshapes your team’s velocity and quality bar.


    Inspired by this post on Product Talk.

