Category: Product Management

  • AI Data Security for Product Teams: Protect Sensitive Product Data Without Slowing Innovation

    Protecting product data has never felt more urgent. Every week, my teams experiment with gen AI prototypes and LLM-powered capabilities, and I’m accountable for ensuring our innovation never compromises cybersecurity, privacy, or customer trust. The goal is not to slow down; it is to build in the right guardrails so speed and safety reinforce each other.

    Understand AI data security risks in product teams, what product data is most exposed, and how to use AI tools responsibly without slowing innovation.

    When I assess AI risk with product managers, I start with how data moves. The biggest threats usually come from prompt and context leaks, unsafe logging of sensitive inputs or outputs, permissive access controls, unmanaged third-party model usage (shadow AI), and unclear data-retention policies. When I coach product managers on LLMs, I emphasize that every step in an AI workflow, from collection to processing to storage, must assume adversarial conditions.

    In my experience, the product data most exposed includes customer PII and payment identifiers, internal strategy documents and roadmaps, analytics and behavioral telemetry tied to users, feature flags and configuration values, embeddings and vector stores that can reveal sensitive patterns, and the prompts or contexts themselves. Even “harmless” evaluation datasets can contain inferred identities. Treat all of this as high-value assets in your data governance model.

    I apply privacy-by-design from the first discovery conversation: minimize data by default, redact or tokenize before any external model call, and separate identities from content wherever possible. A retrieval-first pipeline helps keep raw customer data within our boundary while still enabling relevant context. We combine deterministic safeguards (policy-based redaction, allow/deny lists) with runtime observability to detect anomalous prompts, outputs, or access patterns.
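
    To make the "redact before any external model call" step concrete, here is a minimal sketch. The two regex patterns, the placeholder tokens, and the sample prompt are assumptions for illustration only; a real policy would cover far more identifier types and be reviewed with Security.

```python
import re

# Hypothetical patterns for illustration; a real deployment would need a
# vetted, much broader set (names, addresses, national IDs, and so on).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def redact(text: str) -> str:
    """Replace emails and card-like digit runs with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = CARD_RE.sub("[CARD]", text)
    return text


# Made-up prompt; only the redacted version would leave our boundary.
prompt = "Refund jane.doe@example.com, card 4111 1111 1111 1111, order 8841."
safe_prompt = redact(prompt)
print(safe_prompt)
```

    The short order number survives because it is not long enough to look like a card number, which is the kind of trade-off a deterministic rule forces you to make explicit.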

    To keep velocity high, we operationalize risk rather than debate it ad hoc. A lightweight risk scoring rubric classifies each capability (e.g., internal-only, customer-facing, regulated data adjacent) and dictates controls: redaction requirements, human-in-the-loop thresholds, eval-driven development gates, and incident response readiness. These controls live in CI/CD so product teams get fast, automated feedback without waiting on meetings.
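
    A rubric like this can be small enough to live next to the code. The sketch below is one possible shape, not our actual policy: the tier names follow the examples above, while the control names, the `Capability` fields, and the mapping itself are assumptions.

```python
from dataclasses import dataclass

# Hypothetical tiers and controls, named after the examples in the text.
CONTROLS_BY_TIER = {
    "internal_only": {"redaction"},
    "customer_facing": {"redaction", "eval_gate", "incident_runbook"},
    "regulated_data_adjacent": {
        "redaction", "eval_gate", "incident_runbook", "human_in_the_loop",
    },
}


@dataclass(frozen=True)
class Capability:
    name: str
    customer_facing: bool
    touches_regulated_data: bool


def classify(cap: Capability) -> str:
    """Pick the strictest tier that applies."""
    if cap.touches_regulated_data:
        return "regulated_data_adjacent"
    if cap.customer_facing:
        return "customer_facing"
    return "internal_only"


def missing_controls(cap: Capability, implemented: set[str]) -> set[str]:
    """Controls the tier requires that are not yet in place."""
    return CONTROLS_BY_TIER[classify(cap)] - implemented
```

    A CI job could call `missing_controls` for each capability and fail the build when the result is non-empty, which gives the fast, automated feedback described above.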

    Partnership is essential. I bring Security, Legal, and Data partners into the product trios early to align on regulatory compliance and threat modeling while scoping solutions that meet outcome goals. We maintain a shared catalog of approved providers and architectures, document data flows, and version our policies just like code—so everyone can see what changed and why.

    Vendor diligence is non-negotiable. I ask LLM providers about data retention and training usage, encryption at rest and in transit, key management, regional data controls, audit posture (SOC 2, ISO 27001, HIPAA where needed), and support for private networking. We restrict scopes with least-privilege access and instrument robust observability for threat detection and response across the full path, not just the API call.

    Culture makes the biggest difference. I coach teams on prompt hygiene, secret handling, and context window management; we publish redaction patterns, approved libraries, and clear do/don’t examples. When incidents happen, we treat them as learning opportunities, run blameless reviews, and update our playbooks, guardrails, and training materials accordingly.

    The outcome I aim for is confidence with speed: we ship AI features that customers love while protecting the data they entrust to us. With a clear risk model, strong data governance, and embedded controls, product teams can innovate boldly—without compromising on security or trust.


    Inspired by this post on Product School.


  • AI Experimentation Mastery: How I Test Faster, Tame Variability, and Ship with Confidence

    I’ve learned that the fastest path to durable AI impact is a disciplined experimentation engine: one that moves quickly, reduces ambiguity, and earns trust with evidence. My goal isn’t just to ship models—it’s to ship measurable outcomes with repeatable rigor.

    AI experimentation for product teams. Here’s how to test AI features, choose the right metrics, handle variability, and make data-driven decisions.

    I start every AI initiative by framing a clear decision: what must be true for this feature to be worth building, and how will we know quickly? From there, I map driver trees that connect user value to measurable signals, so every test clarifies both impact and risk, not just accuracy.

    Success criteria come next. I translate aspirations into testable thresholds, define leading and lagging indicators, and size tests with minimum detectable effect (MDE) so we don’t mistake noise for signal. This keeps us honest about sample sizes, power, and the real cost of waiting for certainty.
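
    For a conversion-style metric, sizing a test from an MDE can be sketched with the standard two-proportion approximation. This is a rough planning aid under stated assumptions (two-sided test, equal arm sizes, absolute lift); the function name and default values are mine, not a prescribed method.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline: float, mde: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Users needed per arm to detect an absolute lift of `mde` over `baseline`."""
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


# Example: 10% baseline conversion, looking for a 2-point absolute lift.
print(sample_size_per_arm(baseline=0.10, mde=0.02))
```

    Halving the MDE roughly quadruples the required sample, which is why agreeing on an actionable effect size up front matters so much.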

    Before I touch production traffic, I run eval-driven development. I curate golden datasets that reflect real user complexity, codify rubrics for correctness, safety, tone, and latency, and automate scoring so improvements are reproducible—not anecdotal. This gives the team a stable baseline to iterate prompts, tools, and policies with confidence.
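
    A golden-dataset check does not need heavy tooling to get started. The sketch below assumes a tiny made-up dataset, a single "must include these phrases" rubric, and a stub in place of a real model call; real rubrics for safety, tone, and latency would be separate scorers.

```python
from typing import Callable

# Hypothetical golden cases: an input plus the phrases a good answer must contain.
GOLDEN_SET = [
    {"input": "How do I reset my password?", "must_include": ["reset link"]},
    {"input": "Cancel my subscription", "must_include": ["billing", "cancel"]},
]


def score_case(output: str, must_include: list[str]) -> bool:
    """A case passes only if every required phrase appears in the output."""
    lowered = output.lower()
    return all(phrase.lower() in lowered for phrase in must_include)


def pass_rate(model: Callable[[str], str]) -> float:
    """Fraction of golden cases the model passes."""
    results = [score_case(model(c["input"]), c["must_include"]) for c in GOLDEN_SET]
    return sum(results) / len(results)


def stub_model(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return "We emailed you a reset link." if "password" in prompt else "Done."


print(pass_rate(stub_model))
```

    Because the score is computed the same way on every run, a prompt or policy change can be compared against the previous baseline rather than judged by anecdote.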

    Model behavior is inherently stochastic, so I deliberately control variability. I document temperature, top-p, and seed strategies; I compare deterministic settings for regression checks versus sampled settings for user-facing creativity; and I test sensitivity across content lengths and edge cases. This reduces flakiness and prevents surprise regressions during CI/CD.
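
    One way to document these settings is as named profiles, paired with a simple stability check. Everything here is illustrative: the parameter names mirror common LLM APIs but are not tied to any provider, and `stub_generate` is a stand-in that only imitates sampling behavior.

```python
import random
from collections import Counter

# Hypothetical decoding profiles: one for regression checks, one for production.
REGRESSION_PROFILE = {"temperature": 0.0, "top_p": 1.0, "seed": 1234}
SAMPLED_PROFILE = {"temperature": 0.8, "top_p": 0.95, "seed": None}


def stub_generate(prompt: str, temperature: float, top_p: float, seed) -> str:
    """Stand-in for a model call; top_p is accepted but ignored by this stub."""
    rng = random.Random(seed)
    options = ["answer A", "answer B", "answer C"]
    # At temperature 0 always return the first option; otherwise sample.
    return options[0] if temperature == 0 else rng.choice(options)


def agreement_rate(prompt: str, profile: dict, runs: int = 20) -> float:
    """Share of runs matching the most common output (1.0 means fully stable)."""
    outputs = [stub_generate(prompt, **profile) for _ in range(runs)]
    return Counter(outputs).most_common(1)[0][1] / runs


print(agreement_rate("Summarize this ticket", REGRESSION_PROFILE))
```

    With a real model, the same agreement rate can be tracked in CI under the regression profile, so a drop below 1.0 flags flakiness before it reaches users.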

    When it’s time to learn from real users, I favor A/B testing with thoughtful guardrails. I run holdouts, cap exposure with feature flags, and protect core experience metrics like retention and time-to-value. For ranking and retrieval changes, I’ll use interleaving or switchback tests to isolate effects from seasonality and traffic mix.

    To handle LLM variability online, I aggregate outcomes over multiple prompts per cohort, use stratified bucketing to balance power users and new accounts, and track confidence intervals over time instead of snapshot p-values. This approach turns noisy model outputs into stable product signals.
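
    Tracking an interval instead of a snapshot p-value can be as simple as recomputing the following at each readout. This is a sketch using the normal approximation for a difference in conversion rates; the function name and the example counts are assumptions.

```python
from statistics import NormalDist


def diff_confidence_interval(conv_a: int, n_a: int, conv_b: int, n_b: int,
                             confidence: float = 0.95) -> tuple[float, float]:
    """Normal-approximation CI for (rate_b - rate_a)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = (p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p_b - p_a
    return diff - z * se, diff + z * se


# Made-up readout: control converts 100 of 1000, variant converts 140 of 1000.
low, high = diff_confidence_interval(100, 1000, 140, 1000)
print(f"lift is between {low:.3f} and {high:.3f}")
```

    If the interval still straddles zero, the honest readout is "not yet known," and watching it narrow over successive weeks is more informative than a single significance call.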

    Instrumentation fuels everything. I rely on behavioral analytics to trace user intent, effort, and satisfaction across flows, and I wire up Amplitude analytics for event schemas, funnel drop-offs, and cohort comparisons. Clear event taxonomies and naming discipline make it trivial to separate model quality from UX friction.

    Risk is part of the work, so I bake in AI risk management early. I include toxicity and PII checks in my offline evals, monitor safety metrics in every A/B, and set rollback criteria tied to user harm and system costs. Privacy-by-design, audit logs, and runtime safeguards aren’t afterthoughts—they’re acceptance criteria.
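
    Rollback criteria are easier to honor when they are written down as data before launch. The sketch below uses invented metric names and thresholds; the one deliberate design choice is that a missing metric counts as a breach, so the check fails closed.

```python
# Hypothetical thresholds; real values would come from the team's risk review.
ROLLBACK_LIMITS = {
    "toxicity_rate": 0.01,         # share of responses flagged as toxic
    "pii_leak_rate": 0.0,          # any detected leak triggers rollback
    "cost_per_request_usd": 0.05,  # system cost ceiling
}


def breached_limits(metrics: dict[str, float]) -> list[str]:
    """Names of metrics over their limit; a missing metric counts as breached."""
    return sorted(
        name for name, limit in ROLLBACK_LIMITS.items()
        if metrics.get(name, float("inf")) > limit
    )


def should_roll_back(metrics: dict[str, float]) -> bool:
    """True when at least one rollback limit is breached."""
    return bool(breached_limits(metrics))
```

    For example, `breached_limits({"toxicity_rate": 0.02, "pii_leak_rate": 0.0, "cost_per_request_usd": 0.01})` names only the toxicity metric, which makes the reason for a rollback explicit in the audit log.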

    The operating cadence matters as much as the math. I run continuous discovery with customer interviews to keep the test queue grounded in real jobs-to-be-done, and I align product trios on hypotheses, success metrics, and stop-loss rules before launch. Weekly readouts keep decisions crisp, and post-ship learning cycles feed the next iteration.

    Finally, I invest in upskilling the team. We run internal workshops on LLMs for product managers, standardize experiment templates, and maintain a living playbook so new experiments start at 80% instead of 0%. The result: faster learning loops, safer bets, and more confident shipping.


    Inspired by this post on Product School.


  • Pretotyping vs. Prototyping: How I Validate Ideas Fast and Build Products Customers Love

    I learned early in my career that beautiful prototypes don’t save you when you’re solving the wrong problem. What does save you is separating market risk from solution risk and choosing the fastest, lowest-cost way to get evidence. That’s why I rely on pretotyping to test demand in days and prototyping to refine usability and feasibility once I see a strong signal. The result: faster cycles, fewer wasted sprints, and products customers genuinely want.

    Pretotyping vs. prototyping explained: differences, benefits, examples, and when to use each approach to validate ideas before you build.

    Here’s how I define the two in practice. Pretotyping answers, “Should we build this at all?” Its goal is to validate real user intent and behavior with the lightest-weight artifact possible—often before any code. Think painted-door (fake door) experiments, Wizard-of-Oz flows powered by humans behind the scenes, concierge tests, landing-page smoke tests with waitlists or preorders, and simple A/B testing to gauge click-through intent. It optimizes for time-to-signal and cost-to-learn.

    Prototyping answers, “Can we build this well?” and “How should it work?” Once demand is evidenced, I prototype to de-risk solution details: usability, architecture, performance, and integration. This might include interactive UI models, high-fidelity flows, technical spikes, or service stubs. Here, I optimize for learning about user experience and technical feasibility without fully committing to production.

    When should you use each? If your biggest unknown is market risk—whether customers care at all—start with pretotyping. If your biggest unknown is solution risk—how to deliver an experience that’s usable, reliable, and scalable—move to prototyping. In other words, validate the “right thing” before you perfect the “thing right.”

    My decision rule is simple: identify the dominant risk, then pick the smallest experiment that can credibly invalidate it. For market risk, I look for evidence of behavior, not opinions: clicks on a painted door, signups on a landing page, willingness to pay (deposits, preorders), or sustained repeat usage in a Wizard-of-Oz flow. For solution risk, I look for task completion, time-on-task, error rates, and qualitative friction from usability sessions with a realistic prototype.

    Concrete examples from recent work help illustrate the difference. When exploring a new analytics insight, I shipped a fake door inside our product nav; a simple tooltip explained the concept and captured interest. Click-through rate, conversion to a short explainer, and waitlist signups told me whether the value proposition resonated before building anything. For a complex AI-assisted workflow, I ran a Wizard-of-Oz experiment: users experienced the end-to-end flow while our team manually handled the “AI” behind the curtain. That gave us real engagement data and edge cases to inform the prototype and later the MVP.

    Metrics matter. I set a clear hypothesis with a guardrail on sample size and a minimum detectable effect I’d consider actionable. For pretotyping, I focus on time-to-first-signal, intent conversion (CTR to interest, interest to signup), cost-per-qualified-lead, and evidence of willingness to pay. For prototyping, I prioritize task success rates, usability severity findings, and qualitative insights that materially change the design or technical approach. Above all, I avoid vanity metrics and anchor decisions to outcomes, not output.

    My repeatable playbook looks like this: (1) Frame the problem and value proposition in one crisp sentence. (2) Choose the leanest pretotyping method that can reveal real behavior. (3) Define success metrics and a decision rule before you run the test. (4) Launch quickly, instrument well, and let the data run long enough to be credible. (5) If demand is strong, promote to a prototype to refine UX and de-risk technicals; if not, iterate the proposition or stop. This keeps product discovery continuous and ensures roadmapping and sprint planning are guided by evidence.

    There are ethical guardrails I never skip. Painted doors must set correct expectations once clicked; waitlists or learn-more pages are honest and respectful. For Wizard-of-Oz and concierge tests, I’m explicit about data handling and provide timely follow-up. Trust compounds when experiments are transparent and user time is valued.

    Tooling can accelerate the cycle without diluting rigor. I often use lightweight design systems and no-code automations to stitch together realistic flows, and I’ll leverage gen AI for product prototyping to generate copy, microinteractions, or data scaffolding. But the principle remains: don’t over-invest until evidence earns the investment. Empowered product teams thrive when they optimize for learning velocity, not feature velocity.

    If you’ve ever felt the tension between shipping fast and shipping right, this approach resolves it. Pretotype to prove the market; prototype to perfect the solution. Do that consistently and you’ll spend more time delivering outcomes customers value—and far less time debating outputs.


    Inspired by this post on Product School.


  • The AI PM One-Pager: Radical prototyping requirements for speed, clarity, and truth

    I move fastest in Generative AI when I strip work down to its essential signals. At HighLevel, I rely on a single-page format, “Prototyping Requirements: The One-Pager for AI PMs,” to turn ideas into testable artifacts within hours, not weeks. This approach reinforces AI Strategy, minimizes coordination overhead, and keeps Product Management focused on learning over ceremony.

    “Prototyping requirements go rogue: one page, zero bureaucracy, built for AI. Shape concepts fast, prompt tools directly, and get to the truth sooner.”

    In practice, my one-pager captures only what’s required to run an immediate experiment: the user problem, the target behavior change, success signals, core constraints, intended AI workflows, and the smallest realistic path to an evaluable demo. I also include example prompts, guardrails, and evaluation criteria so the team can apply prompt engineering and work with LLMs without guessing.

    This is eval-driven development in action. I document a minimal hypothesis, concrete inputs/outputs, and a quick plan for metrics, including qualitative signals from product discovery and continuous discovery. By prompting tools directly, we expose assumptions early, shorten feedback loops, and build an AI product toolbox that compounds learning sprint after sprint.

    I run this with a product trio to ensure we balance feasibility, usability, and value. We align on risks, dependencies, and what “good” looks like, then we integrate the learnings into product roadmapping and sprint planning. The result: fewer meetings, tighter collaboration, and empowered product teams delivering sharper outcomes with less friction.

    If you want speed and clarity without sacrificing rigor, adopt the one-pager. It centers the conversation on evidence, accelerates AI workflows from prompt to prototype, and makes it obvious what to try next—and what to stop doing. Most importantly, it keeps the team focused on truth over theater, which is how great AI products actually ship.


    Inspired by this post on Product School.


  • My Essential AI Toolbox for Product Managers: Tested Picks, Prompts, Workflows + Checklists

    I created this practical guide to help product managers cut through the hype and apply AI where it genuinely moves the needle—faster discovery, clearer strategy, sharper execution, and measurable outcomes.

    A practical guide to AI tools for product managers: tested picks, what each tool is best for, copy-paste prompts, workflows, and screenshot checklists.

    Leading product management at HighLevel, I’ve pressure-tested dozens of gen AI solutions across product discovery, roadmap planning, delivery, and go-to-market. In this guide, I map an AI product toolbox to core PM jobs-to-be-done so you can move from experimentation to repeatable impact with confidence.

    Expect clear recommendations on where each tool excels—LLMs for product managers, research synthesis for customer interviews, behavioral analytics for opportunity sizing, and lightweight automation for in-app guides and product tours. I connect these tools to proven practices like continuous discovery, outcomes vs output OKRs, and product roadmapping and sprint planning so you can operationalize AI inside your existing workflows.

    I also share the evaluation criteria I use before rollout—AI Strategy alignment, data governance and privacy-by-design, AI risk management, observability, and total cost of ownership. This eval-driven development approach helps teams avoid technology FOMO while creating defensible, trustworthy workflows that scale.

    To accelerate adoption, I’ve included copy-paste prompts (including prompt engineering patterns for both chat and voice), retrieval-first pipeline blueprints to ground your models in product docs and decision logs, and conversation design tips for support and success use cases. You’ll see step-by-step AI workflows that tie directly to journey mapping, opportunity solution trees, and Kano Model trade-offs.

    Every workflow comes with screenshot checklists you can use for onboarding or stakeholder management, making it easy to align ICs and leaders on the same operating picture. Whether you’re optimizing A/B testing, retention analysis, or QBRs vs OKRs, these checklists turn good intentions into repeatable rituals.

    Use this guide as your field companion to ship faster with higher confidence—reducing cycle time, improving signal in discovery, and building momentum for product-led growth. If you’re ready to translate generative AI into reliable PM leverage, start with the workflows, adapt the prompts, and make them your own.


    Inspired by this post on Product School.


  • Why Your Product Needs a Smarter Support Agent: Data-Driven, Agentic AI That Truly Helps

    Your product deserves a support experience that does more than point users to a help article. In my work leading product teams, I’ve seen how an intelligent, in-product assistant can reduce friction, accelerate user activation, and create the kind of product-led growth that traditional support channels struggle to deliver. The bar is higher now: customers expect immediate, context-aware help that feels proactive, measurable, and trustworthy.

    When I evaluate support solutions, I look for three capabilities: an assistant that truly knows the user’s context, can act on their behalf to resolve issues end-to-end, and can prove the impact with rigorous measurement. Anything less is just another interface to your knowledge base. The shift to agentic AI makes this possible—if it’s grounded in behavioral analytics and integrated with your unified analytics platform.

    Learn more about Amplitude AI Assistant. Our in-product support agent knows your users, acts on their behalf, and measures whether it actually helped.

    That promise resonates with how I design AI Strategy: start with data fidelity, not dialog. When an assistant is wired into Amplitude analytics and behavioral analytics, it can understand where a user is in the journey, the features they have (or haven’t) adopted, and which nudges or in-app guides historically drive success. This is the foundation for precise, contextual help—surfacing the right product tours at the right moments and removing guesswork.

    Knowing users isn’t enough; the assistant must act. With agentic AI, the assistant can execute safe, auditable steps on a user’s behalf (updating settings, triggering a workflow, or guiding a multi-step configuration) rather than handing a to-do back to the customer. Done well, this reduces time-to-value and support tickets while aligning with a thoughtful customer support AI strategy that respects permissions, privacy-by-design, and clear guardrails.

    Equally important is measurement. I expect every AI touchpoint to demonstrate lift: faster time-to-resolution, higher feature adoption, improved retention, and lower churn. This is where robust A/B testing, Agent Analytics, and retention analysis come in—so we can quantify the assistant’s contribution against meaningful product outcomes, not vanity metrics. If we can’t measure it, we can’t manage it.

    Operationally, I advise teams to pilot with narrowly scoped, high-impact journeys and iterate with tight feedback loops. Instrument the assistant’s actions and outcomes, set minimum detectable effect thresholds for experiments, and continually refine prompts and playbooks. Tie insights back to your unified analytics platform so learnings inform roadmap choices and reinforce a durable product-led growth motion.

    In short, the next generation of in-product support will be built on data-rich context, agentic execution, and rigorous proof of value. That’s the standard I hold my teams to—and the experience users deserve when they ask for help.


    Inspired by this post on Amplitude – Best Practices.


  • Forget Crystal Balls: How Scenario Planning Helps Me Ship Smarter in the Age of AI

    AI headlines are everywhere, and many claim to know exactly what’s coming next. In product management, I’m often asked to make single-point predictions about gen AI and LLMs. I resist that temptation because confident forecasts are seductive and usually wrong.

    Listening to Teresa Torres and Petra Wille unpack why certainty fails reinforced what I practice with my product trios: scenario planning. Instead of betting on one future, I explore several plausible ones, define the signals that would confirm or disconfirm each, and translate those insights into product strategy, roadmapping, and sprint planning that we can adapt as evidence evolves.

    Their argument mirrors what I see with customers and stakeholders: people are bad at predicting the future, and overconfidence creates fragility. Early adopters don’t represent everyone, so when we extrapolate from enthusiasts to the mainstream, we waste time and erode trust by building the wrong things.

    Here’s how I apply this to avoid technology FOMO and make sharper AI Strategy decisions. I treat every bold claim as one possible future, then ask, “what else could happen?” I push extremes—AI everywhere vs. AI as invisible utility; GUIs vanish vs. GUIs evolve; centralized vs. edge compute—and hunt for the needs that stay true across scenarios. Those invariants anchor empowered product teams to outcomes, not outputs, and they help us stage bets responsibly.

    My key takeaways: Confident predictions are often wrong. Early adopters don’t represent everyone. Treat predictions as one possible future. Scenario planning > trying to be right. Focus on patterns, not hype.

    In short: We’re in a period of change—but no one can predict exactly how it plays out. Strong predictions often ignore uncertainty.

    A better approach in practice: Treat every prediction as a scenario. Ask: what else could happen? Use multiple futures to guide decisions.

    As you evaluate roadmaps, watch for traps like “My experience = everyone’s future” thinking, over-indexing on early adopters, and ignoring real-world constraints like budgets, compliance, and change management.

    Tactically, we run quick scenario exercises, push ideas to extremes to explore implications, and extract the underlying insight (not the exact prediction). This complements continuous discovery and helps us write outcomes vs output OKRs that are resilient to uncertainty.

    00:00 – The problem with future predictions

    04:00 – Why experts get it wrong

    06:00 – Scenario planning explained

    12:00 – Early adopters vs. reality

    20:00 – AI, GUIs, and extreme takes

    27:00 – Using scenarios in product work

    34:00 – Final thoughts

    Resources & Links:

    Follow Teresa Torres: https://ProductTalk.org

    Follow Petra Wille: https://Petra-Wille.com

    Mentioned in this episode:

    Claude Code

    What did I miss—or what scenarios are you considering for your team? Leave a comment below and let’s compare notes.


    Inspired by this post on Product Talk.


  • Inside Artemis’ AI vs AI Security War: Hiring at Speed, PMF Signals, and Founder-Led Sales

    I’m fascinated by how fast truly AI-native companies can move when the problem is urgent, the founders have deep domain credibility, and the culture is built around customer obsession from day one. Artemis, an AI-native security platform, just emerged from stealth with $70M in combined seed and Series A funding, assembled a 30-person team in seven months, and made a bold promise to “stay on a texting basis with every customer, even at scale.” As a product leader, I see this as a masterclass in AI Strategy, go-to-market focus, and disciplined execution in cybersecurity.

    At its core, Artemis is operating in what I’d call an “AI vs AI” security war: increasingly, we’re defending against adversaries who leverage models just as aggressively as we do. That shifts the job from rule-writing to intelligence orchestration, threat detection and response at machine speed, and continuous evaluation. It also explains why AI-native companies are outperforming their AI-enabled counterparts—when intelligence is the product, the org must be built around model quality, data pipelines, and rapid iteration, not as a bolt-on.

    Founder-market fit is the early signal I look for, and here it’s unmistakable. Shachar Hirshberg’s “AWS and Palo Alto” playbook and Dan Shiebler’s path “From Twitter to Abnormal” create a rare combination: deep infrastructure and enterprise security know-how paired with production-grade machine learning at scale. When those experiences intersect, you get crisp problem statements, faster learning loops, and credibility with the exact ICP that feels the pain first.

    Timing the leap to build is more art than science, but I listen for three cues: customers describing the problem in quantified terms, a wedge that can deliver value within one buying cycle, and a data advantage that compounds. Artemis clearly identified a high-urgency buyer and ignored adjacent segments that would dilute focus—an underrated act of courage that accelerates product-market fit.

    Hiring for AI fluency is a different exercise than traditional software roles. I don’t just screen for model familiarity; I screen for product thinking under uncertainty, a bias for eval-driven development, and the ability to explain tradeoffs to security teams. Practical prompts help: “How would you diagnose precision/recall tradeoffs under evolving threat patterns?” or “Show me how you’d design a red/blue evaluation harness for a new detection.” The best candidates can translate model metrics into business outcomes and customer trust.

    Building a 30-person AI-native team in stealth requires ruthless clarity on the handful of roles that compound: forward deployed engineers who can ship with customers, solutions engineering that feeds learning back into the model, and product managers who treat data as the primary surface area. Culture-wise, I anchor on two rituals: weekly customer debriefs with actual artifacts (alerts, misclassifications, escalations) and a written log of hypotheses, evals, and next bets—so the entire team can reason from the same evidence.

    AI implementation reshapes the dashboard. Beyond the usual business KPIs, I watch a second layer: model precision/recall by scenario, alert fatigue reduction, time-to-first-signal on emerging threats, drift and data freshness, and latency under load. When these improve, downstream product metrics—activation, expansion, NRR—almost always follow. Observability isn’t an afterthought; it’s the control center for trust in AI-driven cybersecurity.
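
    Breaking precision and recall out by scenario is mostly bookkeeping. The sketch below assumes a made-up list of labeled alerts and invented scenario names; it is meant to show the shape of the second-layer metric, not any particular vendor's pipeline.

```python
from collections import defaultdict

# Hypothetical labeled alerts: (scenario, model_flagged, truly_malicious).
ALERTS = [
    ("phishing", True, True),
    ("phishing", True, False),
    ("phishing", False, True),
    ("account_takeover", True, True),
    ("account_takeover", True, True),
]


def precision_recall_by_scenario(alerts):
    """Return {scenario: (precision, recall)}; None where a value is undefined."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for scenario, flagged, malicious in alerts:
        if flagged and malicious:
            counts[scenario]["tp"] += 1   # correct detection
        elif flagged:
            counts[scenario]["fp"] += 1   # false alarm, drives alert fatigue
        elif malicious:
            counts[scenario]["fn"] += 1   # missed threat
    result = {}
    for scenario, c in counts.items():
        flagged_total = c["tp"] + c["fp"]
        malicious_total = c["tp"] + c["fn"]
        precision = c["tp"] / flagged_total if flagged_total else None
        recall = c["tp"] / malicious_total if malicious_total else None
        result[scenario] = (precision, recall)
    return result


print(precision_recall_by_scenario(ALERTS))
```

    Seeing the phishing scenario lag while account takeover looks clean is exactly the kind of signal an aggregate accuracy number would hide.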

    ICP discipline is non-negotiable. Artemis focused on the segment with the highest urgency-to-adopt and the clearest data pathways, and deliberately ignored a seemingly attractive adjacent ICP that would slow learning. I’ve made that trade myself: it feels painful in the short term but pays off in faster cycles, cleaner roadmap decisions, and better founder-led GTM.

    Closing the first customers is where the magic happens—and where the most surprising signals of early product-market fit emerge. It’s rarely about feature breadth. It’s about whether customers escalate, volunteer data, and invite your team into their workflows. In founder-led sales, the most valuable insights come from the objections you lose on. I document every “no,” cluster them by root cause, and turn the top two into experiments within a sprint.

    I also believe the first product should make founders a little uncomfortable—just enough to prove the thesis in the messiest, fastest path possible. In AI security, that often means prioritizing the smallest end-to-end loop that can stop or downgrade a real threat, even if the initial UX is rough. If the loop works, you’ll earn the right to harden it.

    Co-founder dynamics matter as much as the roadmap. I liked the question “Should we be arguing more?” because it reframes conflict as a system. My rule: disagree in writing with a time box, escalate only the principle in dispute (not the plan), and commit to the decision with a pre-agreed review point. This keeps speed without calcifying bad calls.

    On structure, I’m convinced AI-native beats AI-enabled for this market. Organize around data, evaluations, and deployment rather than traditional feature teams. Blend product, research, and solutions into durable, customer-facing units. Consider forward deployed engineers who can ship safely in live environments and bring back the sharpest, most actionable learning. It’s the only way to keep pace with adversaries that iterate as fast as you do.

    The broader landscape provides context and competition. I benchmark capabilities and go-to-market motions against players like Abnormal, CrowdStrike, and Palo Alto Networks, with respect for the automation lineage from Demisto (now Cortex XSOAR). Cloud scale and data gravity from Amazon Web Services (AWS) matter, while model innovations from OpenAI and Anthropic raise the offensive and defensive bar. And Artemis is staking a claim in that intersection—where security outcomes, model excellence, and frontline customer intimacy meet.

    If you care about AI risk management, threat detection and response, and building empowered product teams that can win in this “AI vs AI” environment, the lessons here are clear: hire for AI fluency, not just titles; instrument the model like a business; let founder-led GTM shape your roadmap; and keep the customer close enough that you can text them—because that’s how you outlearn the market.


  • From 70 Employees to Dominance: My Playbook for Hypergrowth, Focus, and Top-Down Goals

    Scaling a real-world marketplace from scrappy to dominant takes a different kind of product leadership. Reflecting on Christopher Payne’s decade leading DoorDash as President and COO — growing from roughly 70 employees to the dominant food delivery platform in the US — I’m struck by how much of that success hinged on mastering an atoms-based business while still operating with software-level rigor. As a VP of Product Management, I see the same patterns in my own work: relentless clarity on inputs, a bias for builder-executives, and a cadence that keeps leaders close to product details without becoming bottlenecks.

    Running an atoms-based business versus a pure software company forces you to obsess over operational physics: unit economics, quality control, on-time reliability, and dense local liquidity. It’s precisely where traditional “bits” executives can stumble. What’s worked for me is a simple “plate spinning” framework for executive attention: identify the five or six plates that must never stop — customer experience, marketplace health, quality and safety, product velocity, platform reliability, and P&L — then schedule recurring deep dives to keep those plates spinning. If a plate wobbles, I drop in, fix the root cause, re-instrument the inputs, and zoom back out.

    Hiring at hypergrowth speed only works when you bias toward a “builder mentality.” I look for executives who run toward fuzzy problems, write clearly, and can prove they’ve shipped value with incomplete information. Prior industry experience can be a liability when you’re reinventing the market; first-principles thinkers outlearn domain experts who try to port yesterday’s playbooks. In executive hiring, I’ve found structured work samples and narrative memos far more predictive than marathon interview loops — companies routinely spend too much time on job interviews and too little time evaluating how candidates think and execute.

    Great executives never outgrow the details. Staying close doesn’t mean micromanaging — it means sampling the customer journey and instrumenting the system so you can feel where it hurts. In my own practice, I rotate through frontline touchpoints weekly: support transcripts, NPS verbatims, failed checkout sessions, and reliability dashboards. Small signals often reveal systemic issues. A single ciabatta bread moment — the kind of edge-case substitution that seems trivial — can expose broken handoffs, unclear policies, and misaligned incentives across the marketplace.

    Top-down goal setting beats bottom-up when you’re aiming for category leadership. Bottom-up targets tend to regress to comfort; they calibrate to today’s constraints, not tomorrow’s possibilities. I set ambitious, top-down outcomes (not output), frame the non-negotiables, and map driver trees to clarify the input metrics that matter. Then I ask empowered product teams to pressure-test the plan, propose approaches, and own the how. This preserves ambition while unlocking creativity — a practical balance of clarity and autonomy that outcomes vs output OKRs were designed to achieve.

    One-size-fits-all management is a myth. Early-stage teams need hands-on coaching and fast decisions; later-stage teams need mechanisms that scale: crisp PRDs, pre-mortems, and operating cadences that separate strategy, planning, and execution. The mark of a high-functioning executive team is not uniform style — it’s high candor, fast escalation paths, and visible commitment after debate. In tough moments, a little charisma goes a long way; in practice, that’s not theatrics, it’s steady optimism, simple language, and consistent follow-through that keeps people moving forward.

    The hypergrowth skill stack for executives is surprisingly learnable: ruthless prioritization under uncertainty, narrative writing that aligns cross-functionally, structured delegation with clear “inspection points,” and a weekly rhythm that protects maker time. I leverage a cadence of business reviews (inputs > outputs), customer-scent checks, and decision logs so we can move fast without losing the thread. CEO and executive time management is the ultimate forcing function — if we can’t show where our attention maps to goals, the team won’t either.

    Some of my enduring lessons echo the best of Amazon and eBay: customer obsession beats competitor obsession, input metrics beat lagging vanity metrics, and simple mechanisms beat heroics. From Jeff Bezos’s playbook I borrow the insistence on written narratives, single-threaded ownership, and clarity on what will not change. Those principles remain the backbone of platform scalability and resilient product strategy, especially when markets get noisy.

    AI is about to flatten organizations. With agentic AI, retrieval-first pipelines, and AI workflows embedded into product development, managers can widen their span without losing fidelity. I see LLMs for product managers accelerating discovery, PRD drafting, and experiment analysis — while raising the bar on decision quality. The implication for leadership: fewer layers, more transparency, and even greater pressure to define sharp, top-down outcomes that teams can autonomously pursue.

    If I had to compress this into a playbook, it’s this: set audacious, top-down goals; keep your “plate spinning” calendar sacred; write more than you talk; hire builders, not resume archetypes; sample the customer journey every week; and build mechanisms that make the right thing easier than the heroic thing. That’s how you scale product management leadership from dozens to thousands — in atoms, in bits, and in the messy, exhilarating space where they meet.


  • Build to Learn vs. Build to Earn: My Proven Playbook for Outcomes Over Output in the AI Era

    Product teams rarely fail because they don’t ship enough features; they fail because they don’t learn fast enough. That’s the core tension I manage every day: when to build to learn and when to build to earn. Navigating that balance is how we protect focus, accelerate time-to-value, and ultimately deliver durable business impact.

    Over the years, I’ve seen at least two major ways to develop product: build to learn and build to earn. The first is discovery-led and evidence-seeking; the second is delivery-led and value-capturing. Both are essential. The real craft is knowing which mode to be in, when to switch, and how to keep stakeholders aligned around outcomes instead of output.

    The project model remains the default in many organizations—even in the age of AI—and it’s all about output. Stakeholders or executives assemble a prioritized roadmap of features and projects, and teams ship against it. This can create momentum, but without clear outcome metrics and customer validation, it’s easy to drift into a feature factory that looks productive while missing the mark on user value and business results.

    When I build to learn, I emphasize continuous discovery. That means using customer interviews to surface unmet needs, running lightweight prototypes to test desirability and usability, and deploying A/B testing to quantify impact. I map assumptions, risks, and opportunities with an opportunity solution tree, and I timebox experiments so we learn fast and cheap. The standard is evidence, not opinions—especially my own. The goal is simple: reduce uncertainty before we scale.

    When I build to earn, the objective shifts to capturing value with confidence. Here I align teams to outcomes vs output OKRs, commit to clear acceptance criteria, and ensure product roadmapping and sprint planning reflect the highest-leverage bets we validated in discovery. Delivery excellence matters: crisp definition, reliable release trains, observability, and a strong feedback loop to confirm we’re moving activation, conversion, or retention in the intended direction.

    Deciding when to transition from learning to earning is all about thresholds of evidence. I look for leading indicators that our solution reliably solves the target problem, shows a measurable lift in key behaviors, and can be delivered with acceptable risk. If we can’t articulate the expected outcome and how we’ll measure it, we’re not ready to scale. If we can, we invest, monitor impact, and keep guardrails in place to avoid scope drift.

    The operating model that makes this sustainable is simple and disciplined. I rely on empowered product teams organized as product trios (product, design, engineering) to run dual tracks of discovery and delivery. We socialize learning with stakeholders early and often to strengthen trust and stakeholder management. We elevate strategy by linking every roadmap item to a problem statement, a testable hypothesis, and a quantified outcome—no orphan features, no vanity launches.

    In the AI era, speed can tempt us back into shipping-by-idea. I use gen AI for product prototyping and insight synthesis, and I lean on LLMs for product managers to accelerate discovery work—without treating AI as a shortcut to validation. Our AI Strategy clarifies where AI augments discovery, where it powers the product, and how we evaluate risk, so we move faster without compromising rigor or ethics.

    My rule of thumb: spend just enough time building to learn to achieve conviction, then shift decisively to building to earn—while preserving a small discovery cadence to keep learning alive. This rhythm protects focus, compounds insight, and makes growth more predictable. It’s how we avoid the output trap, deliver meaningful outcomes, and create products that customers love and the business celebrates.


    Inspired by this post on SVPG.


  • Build vs. Buy for Churn Prediction: My Proven Playbook for Faster Retention and ROI

    Churn is the silent tax on growth, and I treat churn prediction as a core product capability—not a side project. Over the years, I’ve led teams through multiple implementations across different data maturities and go-to-market motions, and the same question keeps returning at kickoff: what’s the smartest path to impact now and defensibility later?

    “Should you build or buy your churn prediction model?” The right answer depends on time-to-value, data readiness, available talent, and whether churn prediction is a true differentiator for your product strategy or simply a must-have capability to power customer success and product-led growth.

    When speed and coverage matter most, I start by evaluating category platforms that pair behavioral analytics with activation. As one example, vendors emphasize immediate business outcomes, delivered through integrations, in-app guides, and workflow triggers that help you act on risk signals fast, without waiting months for model training or data engineering.

    Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.

    Buying makes sense when you need rapid time-to-value, opinionated best practices, and a unified analytics platform to operationalize insights through product tours, in-app guides, and CRM integration. In these cases, I’m optimizing for coverage, consistent signal quality, and ease of activation for customer success—so the team can focus on interventions, not infrastructure.

    Building is compelling when churn prediction is a source of competitive differentiation or you have proprietary signals others can’t access. If your product generates unique behavioral data, requires custom anomaly detection or explainability constraints, or must blend usage telemetry with domain-specific risk scoring, a tailored model can raise precision and unlock novel retention levers.

    My hybrid approach has become a reliable playbook: buy first to establish a strong baseline and close the activation loop, then selectively build where proprietary data and context yield outsized gains. I use retention analysis to identify high-signal behaviors, then iterate with A/B testing and a clear minimum detectable effect (MDE) to validate uplift before committing engineering capacity.
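
    To make the MDE step concrete, here is a minimal sketch of a sample-size estimate for a churn experiment. It assumes a two-sided, two-proportion z-test with equal arm sizes; the function name, the default alpha and power, and the example rates are illustrative assumptions, not figures from any real program.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect an absolute churn reduction.

    Uses the normal approximation for comparing two proportions.
    baseline_rate: current churn rate, e.g. 0.10 (assumed example value)
    mde_abs: smallest absolute reduction worth detecting, e.g. 0.02
    """
    p1 = baseline_rate
    p2 = baseline_rate - mde_abs  # churn rate we hope the intervention reaches
    if not (0 < p2 < p1 < 1):
        raise ValueError("rates must satisfy 0 < baseline - mde < baseline < 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


if __name__ == "__main__":
    # Halving the MDE roughly quadruples the required sample.
    print(sample_size_per_arm(0.10, 0.02))
    print(sample_size_per_arm(0.10, 0.01))
```

    The practical point is the trade-off it exposes: a smaller MDE demands far more users per arm, which is often what decides whether an uplift test is feasible before engineering capacity is committed.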

    Total cost of ownership is non-negotiable. I account for more than license or training costs: ongoing data engineering, feature pipeline maintenance, model monitoring for drift, and AI risk management all add up. Strong data governance, privacy-by-design, and regulatory compliance must be baked in—whether I build, buy, or blend both.

    Activation determines real ROI. Predictions that don’t flow into customer success workflows, lifecycle messaging, or in-product nudges rarely move Net Revenue Retention (NRR). I prioritize tight integrations that enable targeted experiments (journey mapping, contextual tooltips, and timely outreach) to reduce friction and increase user engagement at the moments that matter.

    My quick decision test: buy if time-to-value and adoption are the immediate goals; build if proprietary signals and explainability are core strategic assets; blend if you want fast wins now with room to differentiate later. Answering the build vs. buy question through this lens consistently improves retention, accelerates product-led growth, and keeps teams focused on the customer experience rather than plumbing.


    Inspired by this post on Pendo – Perspectives.


  • From Brain Dump to Done: How Todoist’s Ramble Captures Tasks in Real Time with AI

    Turning a rambling stream of consciousness into a clean task list while someone is still talking has been a longtime product dream of mine. With Ramble, Todoist brought that dream to life by using live audio AI to capture tasks in real time—no transcription step required. The result is a voice-to-task flow that feels natural, fast, and surprisingly disciplined.

    As I listened to the Doist team—Ernesto Garcia (Front-end Product Engineer), Thomas Jost (Backend Software Engineer), and Hugo Fauquenoi (Product Manager)—walk through their approach, I heard a blueprint for building pragmatic GenAI features. What began as a two-to-three month AI exploration became one of their most technically deliberate releases: a “Gemini-powered pipeline that makes tool calls while the user is still speaking, surfacing tasks on screen in real time without any text output from the model.”

    The breakthrough started with user research. People weren’t merely dictating tasks; they were doing a “brain dump” first—often into pen and paper or even ChatGPT voice—and only then committing items to Todoist. Meeting users where they already are reframed the problem: don’t force structure upfront; capture fluid thought and translate it into actionable tasks instantly.

    That insight led to a bold architectural choice: skip transcription entirely and process raw audio directly with a Gemini live audio model. By removing the brittle middleman of text, the team reduced latency and kept the model focused on one job—turning intent into structured actions. It’s a crisp example of AI workflows designed for reliability over novelty.

    The real magic is in the real-time “tool calls.” As the user speaks, the model triggers add task, edit task, and delete task operations immediately. For high-friction contexts like driving, they paired visual task cards with subtle sound effects as confirmation cues. It’s thoughtful conversation design that respects attention and safety without sacrificing speed.
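
    The tool-call pattern described above can be sketched roughly as follows. This is not Todoist's implementation: the `TaskStore` class, the `dispatch` function, and the shape of the call dictionaries are hypothetical, and the model is replaced by a hand-written list of calls so the example runs on its own.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class TaskStore:
    """In-memory stand-in for the task list shown on screen (hypothetical)."""
    tasks: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def add_task(self, content, due=None):
        task_id = next(self._ids)
        self.tasks[task_id] = {"content": content, "due": due}
        return task_id

    def edit_task(self, task_id, **changes):
        self.tasks[task_id].update(changes)
        return task_id

    def delete_task(self, task_id):
        del self.tasks[task_id]
        return task_id


def dispatch(store, call):
    """Route one tool call (a dict with 'name' and 'args') to its handler."""
    handlers = {
        "add_task": store.add_task,
        "edit_task": store.edit_task,
        "delete_task": store.delete_task,
    }
    handler = handlers.get(call["name"])
    if handler is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return handler(**call["args"])


if __name__ == "__main__":
    store = TaskStore()
    # Simulated stream of calls a model might emit while the user is speaking.
    stream = [
        {"name": "add_task", "args": {"content": "Buy milk"}},
        {"name": "add_task", "args": {"content": "Call dentist", "due": "tomorrow"}},
        {"name": "edit_task", "args": {"task_id": 1, "content": "Buy oat milk"}},
        {"name": "delete_task", "args": {"task_id": 2}},
    ]
    for call in stream:
        dispatch(store, call)
    print(store.tasks)
```

    Keeping the handler table explicit means an unexpected tool name fails loudly instead of silently doing nothing, which suits a capture-only design.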

    Teaching the model to capture tasks literally, without over-interpreting them or trying to complete the work, required careful prompt engineering for voice input as well as temperature tuning. Drawing a bright line between “capture versus do” kept the experience trustworthy. In my own AI Strategy work, I’ve found that establishing explicit agentic guardrails early prevents unintended autonomy later.

    Dates were the sleeper challenge. The team had to inject the current date, normalize to days vs. months, and always output dates in English for the natural language parser—while preserving the user’s original language for everything else. If you’ve ever shipped date handling across locales, you’ll appreciate how many edge cases hide in “Taming Dates and Time.”
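
    As a rough illustration of the date handling described above, the sketch below builds the date portion of a system prompt. The wording of the instructions and the `build_date_context` name are my assumptions, not the team's actual prompt.

```python
from datetime import date

# English weekday names are hard-coded so the output does not depend on the
# machine's locale settings.
WEEKDAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def build_date_context(today: date, user_language: str) -> str:
    """Return prompt text that anchors relative dates to a known 'today'.

    The instruction wording is illustrative only.
    """
    weekday = WEEKDAYS[today.weekday()]
    return (
        f"Today is {weekday}, {today.isoformat()}.\n"
        "Resolve relative dates against today's date.\n"
        "Always write due dates in English, for example 'next Friday'.\n"
        f"Write task names in the user's language: {user_language}.\n"
    )


if __name__ == "__main__":
    print(build_date_context(date.today(), "French"))
```

    Injecting the date at request time matters because a model has no reliable sense of "today" on its own, so phrases like "tomorrow" cannot be resolved without it.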

    Quality didn’t hinge on intuition alone. They built an LLM-judge eval system using real employee recordings from 100+ people across 35 countries in 20+ languages to catch prompt regressions. That’s eval-driven development done right: representative data, repeatable scoring, and tight feedback loops as models and prompts evolve.
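
    A regression check of this kind could be structured roughly as below. Everything here is a hypothetical sketch: the `EvalCase` fields, the pass-rate tolerance, and especially the judge, which is a simple exact-match stand-in where a real system would call an LLM to grade each output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    """One recorded utterance plus the tasks a reviewer expects from it."""
    recording_id: str
    language: str
    expected_tasks: tuple


def exact_match_judge(case, produced):
    """Stand-in judge: passes when task names match, ignoring case and order.

    A real LLM judge would replace this function with a model call.
    """
    expected = sorted(t.lower() for t in case.expected_tasks)
    return sorted(t.lower() for t in produced) == expected


def pass_rate(cases, outputs, judge):
    """Share of cases the judge accepts; outputs maps recording_id to tasks."""
    passed = sum(judge(case, outputs[case.recording_id]) for case in cases)
    return passed / len(cases)


def has_regressed(baseline, candidate, tolerance=0.02):
    """Flag a prompt change whose pass rate drops by more than the tolerance."""
    return candidate < baseline - tolerance


if __name__ == "__main__":
    cases = [
        EvalCase("rec-1", "en", ("Buy milk", "Call dentist")),
        EvalCase("rec-2", "fr", ("Acheter du pain",)),
    ]
    baseline_out = {"rec-1": ["buy milk", "call dentist"], "rec-2": ["Acheter du pain"]}
    candidate_out = {"rec-1": ["buy milk"], "rec-2": ["Acheter du pain"]}
    base = pass_rate(cases, baseline_out, exact_match_judge)
    cand = pass_rate(cases, candidate_out, exact_match_judge)
    print(base, cand, has_regressed(base, cand))
```

    Because the judge is passed in as a function, the same harness can start with cheap deterministic checks and move to a model-based judge later without changing how results are compared.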

    For project and label matching, they chose direct context injection over RAG. Instead of building a retrieval pipeline, they injected the full project/label list into the system prompt. With smart context window management and a sharply constrained task schema, this was both simpler and more accurate. Sometimes the fastest path to product-market fit is removing moving parts, not adding them.
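
    Direct context injection, as described above, can be as small as the sketch below. The prompt wording, the `build_system_prompt` name, and the character budget are assumptions for illustration; a production system would more likely budget in tokens.

```python
def build_system_prompt(projects, labels, max_chars=8000):
    """Inline the user's full project and label lists into the system prompt.

    max_chars is a crude, assumed guard against overflowing the context window.
    """
    lines = [
        "You capture tasks. Assign only projects and labels from these lists.",
        "Projects:",
    ]
    lines += [f"- {name}" for name in projects]
    lines.append("Labels:")
    lines += [f"- {name}" for name in labels]
    prompt = "\n".join(lines)
    if len(prompt) > max_chars:
        raise ValueError("project and label lists exceed the prompt budget")
    return prompt


if __name__ == "__main__":
    print(build_system_prompt(["Work", "Home"], ["urgent", "errand"]))
```

    The explicit size check marks the point where this approach stops being the simpler option: once lists outgrow the budget, retrieval or pruning becomes worth its extra moving parts.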

    One product principle stood out: easy correction beats perfect first-time accuracy. Natural language interfaces earn trust when users can fix misfires in a tap or two. That bias toward quick recovery over false precision is how you ship AI that feels useful from day one.

    Looking ahead, the roadmap is compelling: multimodal task capture from images and text blobs, Apple Watch support, and automation integrations. As voice AI agent patterns mature, this “tool-only architecture” sets a solid foundation for going from capture to coordinated execution—without losing the simplicity that makes Ramble shine.

    If you want to hear the full conversation, you can listen on Spotify or Apple Podcasts. It’s a masterclass in building focused GenAI features that trade cleverness for clarity—and still delight.

    Resources & Links: Todoist • Doist • Google Vertex AI (Gemini)


    Inspired by this post on Product Talk.

