Tag: AI Strategy

  • Inside AI Product Management at Amplitude: How Leaders Turn Data into Better Products

    When I think about the impact of AI on product management, one line sums it up for me: "Spencer Whittaker is a senior AI product manager at Amplitude. He focuses on using AI to advance Amplitude's mission of helping companies build better products." That focus on outcomes reflects how I frame AI Strategy—grounding every model and workflow in customer value and product-led growth.

    In practice, that means pairing Amplitude analytics and behavioral analytics with A/B testing and continuous discovery. I lean on eval-driven development to keep models honest, and I coach teams on LLM techniques for product managers so they can prototype safely while we protect signal. A unified analytics platform clarifies what to build next and how to iterate faster.

    On teams I lead, product discovery stays tightly coupled to AI workflows: we map hypotheses to metrics, design experiments, and close the loop with instrumentation before we ship. That discipline turns AI from a demo into durable value, accelerating activation, retention, and feature adoption without sacrificing quality. A pragmatic AI product toolbox keeps us focused on measurable outcomes, not just novel capabilities.

    If you’re building with AI today, take a page from leaders pushing the craft forward: start with clear outcomes, connect your data in a unified analytics platform, and let A/B testing and continuous discovery guide your roadmap. With the right foundations—Amplitude analytics, behavioral analytics, and a sharp AI Strategy—you’ll transform insight into impact and build better products, faster.


    Inspired by this post on Amplitude – Perspectives.


  • What’s New with Amplitude Agents: Faster Releases, Smarter Insights, and Must‑Try Upgrades

    I’ve been deep in the work of turning agentic AI from a promising idea into reliable, measurable outcomes. Today, I want to share a concise, practitioner’s update on what’s new with Amplitude Agents—and, more importantly, how to get real value fast using proven product management techniques.

    We launched AI Agents a few weeks ago. We’ve been shipping pretty fast since then, so we wanted to loop you in on what’s new and what’s worth trying.

    Rapid releases only matter if they translate into user value. My approach is to treat every agent improvement as a learning opportunity: instrument it, set clear success metrics, run controlled experiments, and iterate. This eval-driven development mindset keeps us honest about what’s truly working in the wild.

    If you’re trying Amplitude Agents now, start with a narrowly scoped, high-signal workflow where success is unambiguous—think a single journey with a clear “done” state. Connect the experience to your unified analytics platform so you can see the full picture across events, funnels, and cohorts. In practice, I lean on Amplitude analytics and Agent Analytics to make this visibility effortless.

    Define how you’ll measure impact before you ship. Identify activation and completion events, baseline them, and then A/B test your agentic AI flow against the status quo. Behavioral analytics will show whether users are discovering the agent, sticking with it, and returning for more. When the story in the data is clean, it’s much easier to scale the win.
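
    To make the "A/B test against the status quo" step concrete, here is a minimal sketch of a two-proportion z-test on completion rates. The function name and the example counts are hypothetical, and it assumes users are randomized independently and samples are large enough for the normal approximation.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare completion rates of control (a) and agent flow (b).

    Returns (absolute lift, z statistic, two-sided p-value).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


# Hypothetical counts: 400/2000 completions in control, 460/2000 with the agent.
lift, z, p_value = two_proportion_z_test(400, 2000, 460, 2000)
print(f"lift={lift:.3f} z={z:.2f} p={p_value:.3f}")
```

    Baseline both arms on the same activation and completion events before reading the result, or the lift will not mean what you think it means.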

    Hardening matters as much as headlines. As you expand use, apply sensible guardrails—input validation, clear prompts, and transparent handoffs to deterministic flows when confidence is low. Pair this with observability so you can spot anomalies early and recover gracefully. These practices reduce risk while preserving the speed and creativity that make AI workflows powerful.
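
    As an illustration of those guardrails, the sketch below validates input before calling the agent and hands off to a deterministic flow when confidence is low. Everything here is an assumption for the example: the `AgentResult` shape, the threshold, the length limit, and the idea that your agent reports a usable confidence score at all.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

CONFIDENCE_THRESHOLD = 0.75  # hypothetical cut-off; tune it against your evals
MAX_INPUT_CHARS = 2000       # hypothetical input limit


@dataclass
class AgentResult:
    answer: str
    confidence: float  # assumed to be in [0, 1]


def handle(user_input: str, agent: Callable[[str], AgentResult]) -> Tuple[str, str]:
    """Return (route, message) where route is 'rejected', 'handoff', or 'agent'."""
    text = user_input.strip()
    # Input validation happens before any model call.
    if not text or len(text) > MAX_INPUT_CHARS:
        return "rejected", "Please shorten or rephrase your request."
    result = agent(text)
    # Transparent handoff to the deterministic flow when confidence is low.
    if result.confidence < CONFIDENCE_THRESHOLD:
        return "handoff", "Routing you to the standard flow."
    return "agent", result.answer
```

    Logging the returned route as an event gives observability something to watch: a rising handoff rate is an early anomaly signal.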

    Once the basics are working, dig into adoption patterns: segment by cohort, study user activation paths, and run retention analysis to find where the agent is truly changing behavior. These insights shape roadmap priorities and help you invest in the moments that drive durable value.

    We’ll keep shipping quickly and sharing practical guidance. If you have feedback, experiments to showcase, or questions about instrumentation, send them our way—I use that signal to refine our next set of improvements and learning agendas. Expect more short, focused updates and deeper dives on evaluation frameworks, prompt strategies, and rollout playbooks.

    In short: keep it scoped, instrument everything, test deliberately, and let the data guide your next move. That’s how Amplitude Agents becomes not just new, but indispensable.


    Inspired by this post on Amplitude – Best Practices.


  • AI Agents That Truly Help Product Teams: A Practical Framework for When—and When Not—to Use Them

    Every week, I field the same question from product leaders and engineers: should we deploy an AI agent here, or are we overfitting the problem to a shiny solution? Learn when AI Agents actually help product teams—plus a simple framework to decide when not to use them.

    When I say “AI agents,” I’m talking about autonomous or semi-autonomous systems that can perceive context, plan steps, and take actions across tools and data sources with minimal supervision—what many now call agentic AI. In product management terms, they’re not just another feature; they’re an operating model shift. Used well, they compound team leverage. Used poorly, they add invisible complexity, new failure modes, and governance headaches.

    To make the call with confidence, I use a straightforward VITAL framework that my team can apply in minutes. It keeps us honest about where AI agents are a force multiplier—and where a simpler automation, rule, or in-product UX is the better choice.

    V is for Volume. Agents shine where there’s sustained, repetitive, high-throughput work: triaging inbound support, cleansing CRM records, orchestrating QA checks, or synthesizing weekly research summaries. If the workflow happens rarely or ad hoc, an agent is often overhead in disguise.

    I is for Instructions. Can I specify success in clear, testable terms? Strong instructions include measurable acceptance criteria and constraints. If I can’t articulate what “good” looks like without hand-waving, the task likely needs product discovery, not autonomy.

    T is for Tolerance. What is the blast radius if the agent makes a wrong call? Low-stakes, reversible actions with tight guardrails are ideal. If the tolerance for error is near zero (e.g., irreversible financial transactions or sensitive regulatory actions), favor human-in-the-loop, stronger approvals, or defer agents entirely.

    A is for Access. The agent needs the right data, tools, and permissions, with privacy-by-design and data governance in place. If telemetry is sparse, integrations are brittle, or you can’t enforce least-privilege access, you’ll fight fragility more than you’ll gain leverage.

    L is for Learning loop. Agents require eval-driven development, Agent Analytics, and continuous feedback to stay accurate as reality shifts. If you can’t measure quality, latency, and cost per outcome—or you lack a retrieval-first pipeline to ground responses—expect drift and stakeholder distrust.
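
    The five checks can be captured as a tiny checklist so the team records its answers explicitly. This is only a sketch: the class and function names are hypothetical, and the yes/no scoring is a simplification of what is really a judgment call per pillar.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class VitalCheck:
    volume: bool         # sustained, repetitive, high-throughput work
    instructions: bool   # success is specified in clear, testable terms
    tolerance: bool      # errors are low-stakes and reversible
    access: bool         # data, tools, and least-privilege permissions exist
    learning_loop: bool  # quality, latency, and cost per outcome are measured


def missing_pillars(check: VitalCheck) -> List[str]:
    """List the pillars that were answered 'no'."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


def recommend(check: VitalCheck) -> str:
    gaps = missing_pillars(check)
    if not gaps:
        return "proceed: pilot an agent"
    return "pause: address " + ", ".join(gaps)


# Hypothetical example: support triage with no evaluation pipeline yet.
triage = VitalCheck(volume=True, instructions=True, tolerance=True,
                    access=True, learning_loop=False)
print(recommend(triage))
```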

    Now, the counterweight. Don’t use agents when the problem is novel or strategically ambiguous and you still need exploratory research; when outcomes are unmeasurable or subjective without heavy context; when stakes are high and the acceptable error rate is effectively zero; when data is siloed, stale, or legally constrained; when the work is one-off or low-volume; or when your team can’t commit to instrumentation, evaluations, and ongoing maintenance. In these cases, a simpler rules engine, a clearer UX, or a well-defined workflow usually beats agentic complexity.

    Here’s how this plays out in practice. We’ve seen agents materially improve customer support triage (categorization, priority, and next-best-action suggestions), CRM hygiene (deduplication, enrichment, and routing), and release QA (regression check orchestration with human sign-off). Conversely, we avoid agents for nuanced pricing decisions, sensitive risk scoring without robust datasets, or any workflow where “explainability” and auditability trump speed.

    Operationalizing agents is a product problem before it’s an ML problem. Start narrow with a retrieval-first pipeline and rigorous prompt engineering, define success metrics upfront (quality, latency, cost per task), and run head-to-head evaluations against human baselines. Ship behind feature flags, monitor with Agent Analytics, and graduate from assisted to autonomous modes only after you’ve proven stability. Align this with product roadmapping and sprint planning so the work lands as durable capability, not a lab demo.

    Finally, be honest about build vs buy. If the workflow is a point of parity, consider buying and focusing your team on integration quality and governance. If it’s a potential source of competitive differentiation, invest in a modular architecture with clear context window management, strong observability, and a feedback loop tightly coupled to your empowered product teams.

    The bottom line: AI agents unlock leverage when there’s volume, clarity, tolerance, access, and a learning loop. If any of those pillars is missing, pause. Your best next move is likely better instrumentation, sharper problem framing, and continuous discovery—not more autonomy. That discipline is how product teams turn agentic AI from hype into habit.


    Inspired by this post on Product School.


  • AI Experimentation Mastery: How I Test Faster, Tame Variability, and Ship with Confidence

    I’ve learned that the fastest path to durable AI impact is a disciplined experimentation engine: one that moves quickly, reduces ambiguity, and earns trust with evidence. My goal isn’t just to ship models—it’s to ship measurable outcomes with repeatable rigor.

    AI experimentation for product teams. Here’s how to test AI features, choose the right metrics, handle variability, and make data-driven decisions.

    I start every AI initiative by framing a clear decision: what must be true for this feature to be worth building, and how will we know quickly? From there, I map driver trees that connect user value to measurable signals, so every test clarifies both impact and risk, not just accuracy.

    Success criteria come next. I translate aspirations into testable thresholds, define leading and lagging indicators, and size tests with minimum detectable effect (MDE) so we don’t confuse noise for signal. This keeps us honest about sample sizes, power, and the real cost of waiting for certainty.
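
    To show what sizing a test with an MDE involves, here is a rough sketch using the standard normal-approximation formula for comparing two proportions. The function name, the baseline rate, and the MDE value are hypothetical; it assumes a two-sided test with equal-sized arms, and a proper power-analysis tool should confirm any number you plan around.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.80):
    """Approximate users needed per arm to detect an absolute lift of `mde`.

    baseline: control conversion rate, e.g. 0.20
    mde: minimum detectable effect as an absolute difference, e.g. 0.02
    """
    target = baseline + mde
    if not (0 < baseline < 1 and 0 < target < 1) or mde == 0:
        raise ValueError("baseline and baseline + mde must be in (0, 1), mde nonzero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    variance_sum = baseline * (1 - baseline) + target * (1 - target)
    return ceil((z_alpha + z_power) ** 2 * variance_sum / mde ** 2)


# Hypothetical inputs: 20% baseline activation, 2-point absolute MDE.
n = sample_size_per_arm(0.20, 0.02)
print(n)
```

    Halving the MDE roughly quadruples the required sample, which is the real cost of waiting for certainty.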

    Before I touch production traffic, I run eval-driven development. I curate golden datasets that reflect real user complexity, codify rubrics for correctness, safety, tone, and latency, and automate scoring so improvements are reproducible—not anecdotal. This gives the team a stable baseline to iterate prompts, tools, and policies with confidence.

    Model behavior is inherently stochastic, so I deliberately control variability. I document temperature, top-p, and seed strategies; I compare deterministic settings for regression checks versus sampled settings for user-facing creativity; and I test sensitivity across content lengths and edge cases. This reduces flakiness and prevents surprise regressions during CI/CD.

    When it’s time to learn from real users, I favor A/B testing with thoughtful guardrails. I run holdouts, cap exposure with feature flags, and protect core experience metrics like retention and time-to-value. For ranking and retrieval changes, I’ll use interleaving or switchback tests to isolate effects from seasonality and traffic mix.
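
    One common way to cap exposure and keep a holdout is deterministic hash bucketing, sketched below. The function names, experiment key, and percentages are hypothetical, and a real feature-flag service would replace this. It only illustrates the assignment logic.

```python
import hashlib


def bucket(user_id: str, experiment: str) -> float:
    """Map a user to a stable value in [0, 1) for a given experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def assign(user_id: str, experiment: str, exposure: float = 0.10,
           holdout: float = 0.05) -> str:
    """Return 'holdout', 'treatment', 'control', or 'not_enrolled'.

    holdout: share of users who never see the feature, kept for long-run comparison
    exposure: share of users enrolled in the test, split evenly between arms
    """
    b = bucket(user_id, experiment)
    if b < holdout:
        return "holdout"
    if b < holdout + exposure / 2:
        return "treatment"
    if b < holdout + exposure:
        return "control"
    return "not_enrolled"


print(assign("user-42", "agent_v2"))
```

    Because assignment depends only on the user ID and the experiment key, a user sees the same variant on every visit, and raising `exposure` adds new users without reshuffling the holdout.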

    To handle LLM variability online, I aggregate outcomes over multiple prompts per cohort, use stratified bucketing to balance power users and new accounts, and track confidence intervals over time instead of snapshot p-values. This approach turns noisy model outputs into stable product signals.
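
    A minimal sketch of the aggregation and interval steps (stratified bucketing is left out): average several scored prompts per user first, then compute a normal-approximation confidence interval on the difference between arms. The data, names, and the 0/1 scoring are hypothetical, and with samples this small a t-interval or bootstrap would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def per_user_means(scores_by_user):
    """Collapse multiple prompt outcomes per user into one value per user."""
    return [mean(scores) for scores in scores_by_user.values() if scores]


def diff_ci(control, treatment, level=0.95):
    """Return (low, diff, high) for mean(treatment) - mean(control)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean(treatment) - mean(control)
    se = sqrt(variance(control) / len(control)
              + variance(treatment) / len(treatment))
    return diff - z * se, diff, diff + z * se


# Hypothetical 0/1 "task succeeded" scores, several prompts per user.
control_scores = {"u1": [0, 1, 1], "u2": [1, 0], "u3": [0, 0, 1, 1]}
treatment_scores = {"u4": [1, 1], "u5": [1, 0, 1], "u6": [1, 1, 1, 0]}

low, diff, high = diff_ci(per_user_means(control_scores),
                          per_user_means(treatment_scores))
print(f"diff={diff:.3f} CI=({low:.3f}, {high:.3f})")
```

    Treating the user, not the prompt, as the unit of analysis keeps one chatty account from dominating the estimate.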

    Instrumentation fuels everything. I rely on behavioral analytics to trace user intent, effort, and satisfaction across flows, and I wire up Amplitude analytics for event schemas, funnel drop-offs, and cohort comparisons. Clear event taxonomies and naming discipline make it trivial to separate model quality from UX friction.

    Risk is part of the work, so I bake in AI risk management early. I include toxicity and PII checks in my offline evals, monitor safety metrics in every A/B, and set rollback criteria tied to user harm and system costs. Privacy-by-design, audit logs, and runtime safeguards aren’t afterthoughts—they’re acceptance criteria.

    The operating cadence matters as much as the math. I run continuous discovery with customer interviews to keep the test queue grounded in real jobs-to-be-done, and I align product trios on hypotheses, success metrics, and stop-loss rules before launch. Weekly readouts keep decisions crisp, and post-ship learning cycles feed the next iteration.

    Finally, I invest in upskilling the team. We run internal workshops on LLMs for product managers, standardize experiment templates, and maintain a living playbook so new experiments start at 80% instead of 0%. The result: faster learning loops, safer bets, and more confident shipping.


    Inspired by this post on Product School.


  • Unleashing Inbound Sales with AI: My Playbook for Launching and Scaling Sales Agents Fast

    Inbound leads shouldn’t wait for a rep’s calendar. When we first launched The Service Agent Blueprint, support leaders finally had a clear AI path. Go-to-market and revenue teams are now facing similar uncertainty, so I’m introducing The Sales Agent Blueprint—a practical map for launching and scaling AI for sales with confidence.

    For most sales teams, inbound motions require a lot of manual work. I’ve watched leads pile up in queues, waiting for availability rather than being prioritized by buyer intent. That delay costs meetings, pipeline, and momentum—and it’s exactly where a modern AI Strategy can transform your go-to-market strategy.

    Agents can run sales conversations end to end – engaging buyers, qualifying leads, and routing high-intent opportunities to the right team to move prospective buyers forward quickly. Humans will still be involved, but will move their focus to the consultative conversations and higher-value work they did not have time to focus on before. In practice, this shift enables cleaner AI workflows, better conversation design, and a healthier balance between sales-led growth and product-led growth.

    The questions many go-to-market and revenue leaders are facing now: Where do you start? What should success look like? How do you actually test and deploy these solutions? These are the right questions, and the ones I hear most often when teams weigh build vs buy decisions, evaluation frameworks, and CRM integration nuances.

    The Sales Agent Blueprint answers those questions. It’s designed to be a strategic guide for sales, revenue, and AI transformation leaders who want to deploy AI for inbound sales fast, prove value, and build momentum. If you’re aiming for eval-driven development, this will help you define success up front and operationalize it.

    What’s inside is simple by design yet deep enough to take you from zero to value. The Sales Agent Blueprint is structured around two tracks that reflect how high-performing teams adopt agentic AI: first, launch for quick wins; next, scale for durable growth.

    Coming soon: Sales Agent Blueprint. A sleek, blueprint-inspired teaser with the call to 'Scale it' signals tools, playbooks, and workflows to grow revenue, streamline operations, and scale teams with confidence.

    Today, I’m releasing the first part of the Blueprint: “Launch it.” It’s a practical guide for getting your Agent live and seeing real results. You’ll learn how to deploy a Sales Agent that runs inbound sales conversations end to end, engaging buyers, qualifying leads, and routing high-intent opportunities to the right outcome in real time—without disrupting your current CRM integration or pipeline processes.

    By the end of the “Launch it” track, you’ll be ready to execute with clarity. Here’s how I frame the essential steps, based on what consistently works in the field.

    Understand what a Sales Agent is: Discover why they’re different from chatbots and how they work.

    Build a business case: Prove the basic economics of AI, decide whether to buy or build, and get the buy-in and budget you need to move forward.

    Evaluate an Agent: Learn how to define success, choose the right evaluation criteria, and run a focused, high-impact assessment with our five-step framework.

    Deploy with confidence: Build a deployment plan that gets your Agent live quickly to engage buyers at peak intent. Learn what to expect at each stage.

    Introducing the Sales Agent Blueprint. This crisp, grid-based graphic spotlights step 1—Launch it—signaling day-one activation for an AI sales agent. Explore the framework and get started at fin.ai/blueprint/sales.

    Continuously improve performance: After launch, your Agent becomes a system to manage. We’ll show you how to implement a repeatable process to train, test, deploy, and optimize.

    The second track, “Scale it” (coming soon), focuses on the organizational and systems design work that unlocks compounding gains. Launching AI is only the beginning. To unlock its full potential, you need to rewire your inbound sales motion—redesigning the buyer journey, building AI-first systems and ownership models, and rethinking how pipeline is generated and scaled. This is where governance, measurement, and team roles evolve to support sustainable growth.

    I’ll be building this Blueprint in public as I navigate the same challenges—sharing what works, what to avoid, and how to accelerate time-to-value without sacrificing quality or trust. If you’re ready to turn intent into revenue with agentic AI, this is your head start.

    The Sales Agent Blueprint is live now. Explore the full guide at fin.ai/blueprint/sales and start your “Launch it” sprint today.


    Inspired by this post on The Intercom Blog.


  • My Essential AI Toolbox for Product Managers: Tested Picks, Prompts, Workflows + Checklists

    I created this practical guide to help product managers cut through the hype and apply AI where it genuinely moves the needle—faster discovery, clearer strategy, sharper execution, and measurable outcomes.

    A practical guide to AI tools for product managers: tested picks, what each tool is best for, copy-paste prompts, workflows, and screenshot checklists.

    Leading product management at HighLevel, I’ve pressure-tested dozens of gen AI solutions across product discovery, roadmap planning, delivery, and go-to-market. In this guide, I map an AI product toolbox to core PM jobs-to-be-done so you can move from experimentation to repeatable impact with confidence.

    Expect clear recommendations on where each tool excels—LLMs for product managers, research synthesis for customer interviews, behavioral analytics for opportunity sizing, and lightweight automation for in-app guides and product tours. I connect these tools to proven practices like continuous discovery, outcomes vs output OKRs, and product roadmapping and sprint planning so you can operationalize AI inside your existing workflows.

    I also share the evaluation criteria I use before rollout—AI Strategy alignment, data governance and privacy-by-design, AI risk management, observability, and total cost of ownership. This eval-driven development approach helps teams avoid technology FOMO while creating defensible, trustworthy workflows that scale.

    To accelerate adoption, I’ve included copy-paste prompts (including prompt engineering patterns for both chat and voice), retrieval-first pipeline blueprints to ground your models in product docs and decision logs, and conversation design tips for support and success use cases. You’ll see step-by-step AI workflows that tie directly to journey mapping, opportunity solution trees, and Kano Model trade-offs.

    Every workflow comes with screenshot checklists you can use for onboarding or stakeholder management, making it easy to align ICs and leaders on the same operating picture. Whether you’re optimizing A/B testing, retention analysis, or QBRs vs OKRs, these checklists turn good intentions into repeatable rituals.

    Use this guide as your field companion to ship faster with higher confidence—reducing cycle time, improving signal in discovery, and building momentum for product-led growth. If you’re ready to translate generative AI into reliable PM leverage, start with the workflows, adapt the prompts, and make them your own.


    Inspired by this post on Product School.


  • Why Your Product Needs a Smarter Support Agent: Data-Driven, Agentic AI That Truly Helps

    Your product deserves a support experience that does more than point users to a help article. In my work leading product teams, I’ve seen how an intelligent, in-product assistant can reduce friction, accelerate user activation, and create the kind of product-led growth that traditional support channels struggle to deliver. The bar is higher now: customers expect immediate, context-aware help that feels proactive, measurable, and trustworthy.

    When I evaluate support solutions, I look for three capabilities: an assistant that truly knows the user’s context, can act on their behalf to resolve issues end-to-end, and can prove the impact with rigorous measurement. Anything less is just another interface to your knowledge base. The shift to agentic AI makes this possible—if it’s grounded in behavioral analytics and integrated with your unified analytics platform.

    Learn more about Amplitude AI Assistant. Our in-product support agent knows your users, acts on their behalf, and measures whether it actually helped.

    That promise resonates with how I design AI Strategy: start with data fidelity, not dialog. When an assistant is wired into Amplitude analytics and behavioral analytics, it can understand where a user is in the journey, the features they have (or haven’t) adopted, and which nudges or in-app guides historically drive success. This is the foundation for precise, contextual help—surfacing the right product tours at the right moments and removing guesswork.

    Knowing users isn’t enough; the assistant must act. With agentic AI, the assistant can execute safe, auditable steps on a user’s behalf (updating settings, triggering a workflow, or guiding a multi-step configuration) rather than handing a to-do back to the customer. Done well, this reduces time-to-value and support tickets while aligning with a thoughtful customer support AI strategy that respects permissions, privacy-by-design, and clear guardrails.

    Equally important is measurement. I expect every AI touchpoint to demonstrate lift: faster time-to-resolution, higher feature adoption, improved retention, and lower churn. This is where robust A/B testing, Agent Analytics, and retention analysis come in—so we can quantify the assistant’s contribution against meaningful product outcomes, not vanity metrics. If we can’t measure it, we can’t manage it.

    Operationally, I advise teams to pilot with narrowly scoped, high-impact journeys and iterate with tight feedback loops. Instrument the assistant’s actions and outcomes, set minimum detectable effect thresholds for experiments, and continually refine prompts and playbooks. Tie insights back to your unified analytics platform so learnings inform roadmap choices and reinforce a durable product-led growth motion.

    In short, the next generation of in-product support will be built on data-rich context, agentic execution, and rigorous proof of value. That’s the standard I hold my teams to—and the experience users deserve when they ask for help.


    Inspired by this post on Amplitude – Best Practices.


  • How AI Product Leaders Drive Better Products: My Take on Amplitude’s Mission and Impact

    I’m constantly studying how AI is elevating product organizations, and Amplitude offers a compelling example of how to turn data into durable, customer-centered outcomes.

    Spencer Whittaker is a senior AI product manager at Amplitude. He focuses on using AI to advance Amplitude's mission of helping companies build better products.

    From my vantage point leading product teams, that focus translates into practical AI Strategy across behavioral analytics and Amplitude analytics: turning raw event streams into decision-ready insights that accelerate product-led growth and continuous discovery.

    In my own roadmap reviews, the highest-impact patterns are consistent: pair A/B testing with eval-driven development, coach PMs on LLMs for product managers to sharpen problem framing, and amplify signal quality through thoughtful instrumentation and journey mapping. When these practices come together, empowered product teams ship with confidence and reduce time-to-learning.

    Equally important are the guardrails: clear build vs buy criteria for gen AI components, privacy-by-design and data governance from day one, and a crisp measurement model that ties experiments to activation, retention analysis, and customer success outcomes.

    Practically, this means instrumenting hypotheses with the right metrics, setting a minimum detectable effect (MDE) where relevant, and looping insights back into the opportunity solution tree so the next sprint is smarter than the last. This disciplined rhythm separates hype from durable value.

    Seeing peers push this mission forward reinforces a core belief of mine: when AI helps teams find the right problems faster, we build products people truly love—and we do it responsibly, repeatably, and at scale.


    Inspired by this post on Amplitude – Best Practices.


  • Forget Crystal Balls: How Scenario Planning Helps Me Ship Smarter in the Age of AI

    AI headlines are everywhere, and many claim to know exactly what’s coming next. In product management, I’m often asked to make single-point predictions about gen AI and LLMs for product managers. I resist that temptation because confident forecasts are seductive and usually wrong.

    Listening to Teresa Torres and Petra Wille unpack why certainty fails reinforced what I practice with my product trios: scenario planning. Instead of betting on one future, I explore several plausible ones, define the signals that would confirm or disconfirm each, and translate those insights into product strategy and product roadmapping and sprint planning we can adapt as evidence evolves.

    Their argument mirrors what I see with customers and stakeholders: people are bad at predicting the future, and overconfidence creates fragility. Early adopters don’t represent everyone, so when we extrapolate from enthusiasts to the mainstream, we waste time and erode trust by building the wrong things.

    Here’s how I apply this to avoid technology FOMO and make sharper AI Strategy decisions. I treat every bold claim as one possible future, then ask, “what else could happen?” I push extremes—AI everywhere vs. AI as invisible utility; GUIs vanish vs. GUIs evolve; centralized vs. edge compute—and hunt for the needs that stay true across scenarios. Those invariants anchor empowered product teams to outcomes, not outputs, and they help us stage bets responsibly.

    My key takeaways: Confident predictions are often wrong. Early adopters don’t represent everyone. Treat predictions as one possible future. Scenario planning > trying to be right. Focus on patterns, not hype.

    In short: We’re in a period of change—but no one can predict exactly how it plays out. Strong predictions often ignore uncertainty.

    A better approach in practice: Treat every prediction as a scenario. Ask: what else could happen? Use multiple futures to guide decisions.

    As you evaluate roadmaps, watch for traps like “My experience = everyone’s future” thinking, over-indexing on early adopters, and ignoring real-world constraints like budgets, compliance, and change management.

    Tactically, we run quick scenario exercises, push ideas to extremes to explore implications, and extract the underlying insight (not the exact prediction). This complements continuous discovery and helps us write outcomes vs output OKRs that are resilient to uncertainty.

    00:00 – The problem with future predictions

    04:00 – Why experts get it wrong

    06:00 – Scenario planning explained

    12:00 – Early adopters vs. reality

    20:00 – AI, GUIs, and extreme takes

    27:00 – Using scenarios in product work

    34:00 – Final thoughts

    Resources & Links:

    Follow Teresa Torres: https://ProductTalk.org

    Follow Petra Wille: https://Petra-Wille.com

    Mentioned in this episode:

    Claude Code

    What did I miss—or what scenarios are you considering for your team? Leave a comment below and let’s compare notes.


    Inspired by this post on Product Talk.


  • Inside Artemis’ AI vs AI Security War: Hiring at Speed, PMF Signals, and Founder-Led Sales

    I’m fascinated by how fast truly AI-native companies can move when the problem is urgent, the founders have deep domain credibility, and the culture is built around customer obsession from day one. Artemis, an AI-native security platform, just emerged from stealth with $70M in combined seed and Series A funding, assembled a 30-person team in seven months, and made a bold promise to “stay on a texting basis with every customer, even at scale.” As a product leader, I see this as a masterclass in AI Strategy, go-to-market focus, and disciplined execution in cybersecurity.

    At its core, Artemis is operating in what I’d call an “AI vs AI” security war: increasingly, we’re defending against adversaries who leverage models just as aggressively as we do. That shifts the job from rule-writing to intelligence orchestration, threat detection and response at machine speed, and continuous evaluation. It also explains why AI-native companies are outperforming their AI-enabled counterparts—when intelligence is the product, the org must be built around model quality, data pipelines, and rapid iteration, not as a bolt-on.

    Founder-market fit is the early signal I look for, and here it’s unmistakable. Shachar Hirshberg’s “AWS and Palo Alto” playbook and Dan Shiebler’s path “From Twitter to Abnormal” create a rare combination: deep infrastructure and enterprise security know-how paired with production-grade machine learning at scale. When those experiences intersect, you get crisp problem statements, faster learning loops, and credibility with the exact ICP that feels the pain first.

    Timing the leap to build is more art than science, but I listen for three cues: customers describing the problem in quantified terms, a wedge that can deliver value within one buying cycle, and a data advantage that compounds. Artemis clearly identified a high-urgency buyer and ignored adjacent segments that would dilute focus—an underrated act of courage that accelerates product-market fit.

    Hiring for AI fluency is a different exercise than traditional software roles. I don’t just screen for model familiarity; I screen for product thinking under uncertainty, a bias for eval-driven development, and the ability to explain tradeoffs to security teams. Practical prompts help: “How would you diagnose precision/recall tradeoffs under evolving threat patterns?” or “Show me how you’d design a red/blue evaluation harness for a new detection.” The best candidates can translate model metrics into business outcomes and customer trust.

    Building a 30-person AI-native team in stealth requires ruthless clarity on the handful of roles that compound: forward deployed engineers who can ship with customers, solutions engineering that feeds learning back into the model, and product managers who treat data as the primary surface area. Culture-wise, I anchor on two rituals: weekly customer debriefs with actual artifacts (alerts, misclassifications, escalations) and a written log of hypotheses, evals, and next bets—so the entire team can reason from the same evidence.

    AI implementation reshapes the dashboard. Beyond the usual business KPIs, I watch a second layer: model precision/recall by scenario, alert fatigue reduction, time-to-first-signal on emerging threats, drift and data freshness, and latency under load. When these improve, downstream product metrics—activation, expansion, NRR—almost always follow. Observability isn’t an afterthought; it’s the control center for trust in AI-driven cybersecurity.
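
    To make the "second layer" concrete, here is a minimal sketch of how precision and recall by scenario could be computed from labeled alerts. The scenario names, the record shape, and the sample data are hypothetical; this is an illustration under those assumptions, not how Artemis or any particular platform does it.

```python
from collections import defaultdict


def precision_recall_by_scenario(records):
    """Compute precision and recall per scenario.

    records: iterable of (scenario, predicted_threat, actual_threat) tuples,
    where the two flags are booleans. This record shape is an assumption.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for scenario, predicted, actual in records:
        c = counts[scenario]  # also registers scenarios with only true negatives
        if predicted and actual:
            c["tp"] += 1      # alert fired on a real threat
        elif predicted and not actual:
            c["fp"] += 1      # alert fired on benign activity (alert fatigue)
        elif actual and not predicted:
            c["fn"] += 1      # real threat that was missed

    result = {}
    for scenario, c in counts.items():
        flagged = c["tp"] + c["fp"]
        real = c["tp"] + c["fn"]
        result[scenario] = {
            # None means the metric is undefined for this scenario
            "precision": c["tp"] / flagged if flagged else None,
            "recall": c["tp"] / real if real else None,
        }
    return result


# Hypothetical labeled alerts: (scenario, predicted, actual)
alerts = [
    ("phishing", True, True),
    ("phishing", True, False),
    ("phishing", False, True),
    ("phishing", True, True),
    ("account_takeover", True, True),
    ("account_takeover", False, False),
]

metrics = precision_recall_by_scenario(alerts)
for scenario, m in metrics.items():
    print(scenario, m)
```

    Splitting by scenario matters because a healthy aggregate number can hide one threat type where the model is quietly missing or over-alerting.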

    ICP discipline is non-negotiable. Artemis focused on the segment with the highest urgency-to-adopt and the clearest data pathways, and deliberately ignored a seemingly attractive adjacent ICP that would slow learning. I’ve made that trade myself: it feels painful in the short term but pays off in faster cycles, cleaner roadmap decisions, and better founder-led GTM.

    Closing the first customers is where the magic happens—and where the most surprising signals of early product-market fit emerge. It’s rarely about feature breadth. It’s about whether customers escalate, volunteer data, and invite your team into their workflows. In founder-led sales, the most valuable insights come from the objections you lose on. I document every “no,” cluster them by root cause, and turn the top two into experiments within a sprint.
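
    The "document every no" habit can be as simple as a tagged log and a tally. A rough sketch, assuming each lost deal has already been labeled by hand with a root cause (the accounts and cause labels below are made up for illustration):

```python
from collections import Counter

# Hypothetical log of lost deals, each tagged manually with a root cause.
lost_deals = [
    {"account": "A", "root_cause": "integration gap"},
    {"account": "B", "root_cause": "pricing"},
    {"account": "C", "root_cause": "integration gap"},
    {"account": "D", "root_cause": "trust in model output"},
    {"account": "E", "root_cause": "pricing"},
    {"account": "F", "root_cause": "integration gap"},
]


def top_root_causes(deals, n=2):
    """Return the n most frequent root causes as (cause, count) pairs."""
    counts = Counter(deal["root_cause"] for deal in deals)
    return counts.most_common(n)


# The top two clusters become candidate experiments for the next sprint.
experiments = top_root_causes(lost_deals)
for cause, count in experiments:
    print(f"{cause}: lost {count} deals")
```

    The tooling is trivial on purpose; the hard part is tagging root causes honestly rather than recording the customer's stated reason.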

    I also believe the first product should make founders a little uncomfortable—just enough to prove the thesis in the messiest, fastest path possible. In AI security, that often means prioritizing the smallest end-to-end loop that can stop or downgrade a real threat, even if the initial UX is rough. If the loop works, you’ll earn the right to harden it.

    Co-founder dynamics matter as much as the roadmap. I liked the question “Should we be arguing more?” because it reframes conflict as a system. My rule: disagree in writing with a time box, escalate only the principle in dispute (not the plan), and commit to the decision with a pre-agreed review point. This keeps speed without calcifying bad calls.

    On structure, I’m convinced AI-native beats AI-enabled for this market. Organize around data, evaluations, and deployment rather than traditional feature teams. Blend product, research, and solutions into durable, customer-facing units. Consider forward deployed engineers who can ship safely in live environments and bring back the sharpest, most actionable learning. It’s the only way to keep pace with adversaries that iterate as fast as you do.

    The broader landscape provides context and competition. I benchmark capabilities and go-to-market motions against players like Abnormal, CrowdStrike, and Palo Alto Networks, with respect for the automation lineage from Demisto (now Cortex XSOAR). Cloud scale and data gravity from Amazon Web Services (AWS) matter, while model innovations from OpenAI and Anthropic raise the offensive and defensive bar. And Artemis is staking a claim in that intersection—where security outcomes, model excellence, and frontline customer intimacy meet.

    If you care about AI risk management, threat detection and response, and building empowered product teams that can win in this “AI vs AI” environment, the lessons here are clear: hire for AI fluency, not just titles; instrument the model like a business; let founder-led GTM shape your roadmap; and keep the customer close enough that you can text them—because that’s how you outlearn the market.


  • From 70 Employees to Dominance: My Playbook for Hypergrowth, Focus, and Top-Down Goals

    From 70 Employees to Dominance: My Playbook for Hypergrowth, Focus, and Top-Down Goals

    Scaling a real-world marketplace from scrappy to dominant takes a different kind of product leadership. Reflecting on Christopher Payne’s decade leading DoorDash as President and COO — growing from roughly 70 employees to the dominant food delivery platform in the US — I’m struck by how much of that success hinged on mastering an atoms-based business while still operating with software-level rigor. As a VP of Product Management, I see the same patterns in my own work: relentless clarity on inputs, a bias for builder-executives, and a cadence that keeps leaders close to product details without becoming bottlenecks.

    Running an atoms-based business versus a pure software company forces you to obsess over operational physics: unit economics, quality control, on-time reliability, and dense local liquidity. It’s precisely where traditional “bits” executives can stumble. What’s worked for me is a simple “plate spinning” framework for executive attention: identify the five or six plates that must never stop — customer experience, marketplace health, quality and safety, product velocity, platform reliability, and P&L — then schedule recurring deep dives to keep those plates spinning. If a plate wobbles, I drop in, fix the root cause, re-instrument the inputs, and zoom back out.

    Hiring at hypergrowth speed only works when you bias toward a “builder mentality.” I look for executives who run toward fuzzy problems, write clearly, and can prove they’ve shipped value with incomplete information. Prior industry experience can be a liability when you’re reinventing the market; first-principles thinkers outlearn domain experts who try to port yesterday’s playbooks. In executive hiring, I’ve found structured work samples and narrative memos far more predictive than marathon interview loops — companies routinely spend too much time on job interviews and too little time evaluating how candidates think and execute.

    Great executives never outgrow the details. Staying close doesn’t mean micromanaging — it means sampling the customer journey and instrumenting the system so you can feel where it hurts. In my own practice, I rotate through frontline touchpoints weekly: support transcripts, NPS verbatims, failed checkout sessions, and reliability dashboards. Small signals often reveal systemic issues. A single ciabatta bread moment — the kind of edge-case substitution that seems trivial — can expose broken handoffs, unclear policies, and misaligned incentives across the marketplace.

    Top-down goal setting beats bottom-up when you’re aiming for category leadership. Bottom-up targets tend to regress to comfort; they calibrate to today’s constraints, not tomorrow’s possibilities. I set ambitious, top-down outcomes (not outputs), frame the non-negotiables, and map driver trees to clarify the input metrics that matter. Then I ask empowered product teams to pressure-test the plan, propose approaches, and own the how. This preserves ambition while unlocking creativity: the practical balance of clarity and autonomy that outcome-based OKRs were designed to achieve.

    One-size-fits-all management is a myth. Early-stage teams need hands-on coaching and fast decisions; later-stage teams need mechanisms that scale: crisp PRDs, pre-mortems, and operating cadences that separate strategy, planning, and execution. The mark of a high-functioning executive team is not uniform style — it’s high candor, fast escalation paths, and visible commitment after debate. In tough moments, a little charisma goes a long way; in practice, that’s not theatrics, it’s steady optimism, simple language, and consistent follow-through that keeps people moving forward.

    The hypergrowth skill stack for executives is surprisingly learnable: ruthless prioritization under uncertainty, narrative writing that aligns cross-functionally, structured delegation with clear “inspection points,” and a weekly rhythm that protects maker time. I leverage a cadence of business reviews (inputs > outputs), customer-scent checks, and decision logs so we can move fast without losing the thread. CEO and executive time management is the ultimate forcing function — if we can’t show where our attention maps to goals, the team won’t either.

    Some of my enduring lessons echo the best of Amazon and eBay: customer obsession beats competitor obsession, input metrics beat lagging vanity metrics, and simple mechanisms beat heroics. From Jeff Bezos’s playbook I borrow the insistence on written narratives, single-threaded ownership, and clarity on what will not change. Those principles remain the backbone of platform scalability and resilient product strategy, especially when markets get noisy.

    AI is about to flatten organizations. With agentic AI, retrieval-first pipelines, and AI workflows embedded into product development, managers can widen their span without losing fidelity. I see LLMs helping product managers accelerate discovery, PRD drafting, and experiment analysis while raising the bar on decision quality. The implication for leadership: fewer layers, more transparency, and even greater pressure to define sharp, top-down outcomes that teams can autonomously pursue.

    If I had to compress this into a playbook, it’s this: set audacious, top-down goals; keep your “plate spinning” calendar sacred; write more than you talk; hire builders, not resume archetypes; sample the customer journey every week; and build mechanisms that make the right thing easier than the heroic thing. That’s how you scale product management leadership from dozens to thousands — in atoms, in bits, and in the messy, exhilarating space where they meet.


  • Build to Learn vs. Build to Earn: My Proven Playbook for Outcomes Over Output in the AI Era

    Build to Learn vs. Build to Earn: My Proven Playbook for Outcomes Over Output in the AI Era

    Product teams rarely fail because they don’t ship enough features; they fail because they don’t learn fast enough. That’s the core tension I manage every day: when to build to learn and when to build to earn. Navigating that balance is how we protect focus, accelerate time-to-value, and ultimately deliver durable business impact.

    Over the years, I’ve seen at least two major ways to develop product: build to learn and build to earn. The first is discovery-led and evidence-seeking; the second is delivery-led and value-capturing. Both are essential. The real craft is knowing which mode to be in, when to switch, and how to keep stakeholders aligned around outcomes instead of output.

    The project model remains the default in many organizations—even in the age of AI—and it’s all about output. Stakeholders or executives assemble a prioritized roadmap of features and projects, and teams ship against it. This can create momentum, but without clear outcome metrics and customer validation, it’s easy to drift into a feature factory that looks productive while missing the mark on user value and business results.

    When I build to learn, I emphasize continuous discovery. That means using customer interviews to surface unmet needs, running lightweight prototypes to test desirability and usability, and deploying A/B testing to quantify impact. I map assumptions, risks, and opportunities with an opportunity solution tree, and I timebox experiments so we learn fast and cheap. The standard is evidence, not opinions—especially my own. The goal is simple: reduce uncertainty before we scale.

    When I build to earn, the objective shifts to capturing value with confidence. Here I align teams to outcome-based OKRs, commit to clear acceptance criteria, and ensure product roadmapping and sprint planning reflect the highest-leverage bets we validated in discovery. Delivery excellence matters: crisp definition, reliable release trains, observability, and a strong feedback loop to confirm we’re moving activation, conversion, or retention in the intended direction.

    Deciding when to transition from learning to earning is all about thresholds of evidence. I look for leading indicators that our solution reliably solves the target problem, shows a measurable lift in key behaviors, and can be delivered with acceptable risk. If we can’t articulate the expected outcome and how we’ll measure it, we’re not ready to scale. If we can, we invest, monitor impact, and keep guardrails in place to avoid scope drift.
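
    One way to turn a "threshold of evidence" into something checkable is a two-proportion z-test on an A/B result, paired with a minimum lift that is agreed in advance. The sketch below uses only the Python standard library; the conversion counts, the 1-point minimum lift, and the 0.05 significance level are illustrative assumptions, not recommended values.

```python
from math import erf, sqrt


def lift_and_p_value(control_conv, control_n, variant_conv, variant_n):
    """Absolute lift and two-sided p-value from a two-proportion z-test."""
    p_control = control_conv / control_n
    p_variant = variant_conv / variant_n
    lift = p_variant - p_control

    # Pooled conversion rate under the null hypothesis of no difference
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    if se == 0:
        return lift, 1.0  # no variation in the data, so no evidence either way

    z = lift / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return lift, p_value


def ready_to_scale(lift, p_value, min_lift=0.01, alpha=0.05):
    """Evidence threshold: the lift is large enough and unlikely to be noise."""
    return lift >= min_lift and p_value < alpha


# Hypothetical experiment: 10% control conversion vs. 13% variant conversion
lift, p_value = lift_and_p_value(200, 2000, 260, 2000)
decision = ready_to_scale(lift, p_value)
print(f"lift={lift:.3f}, p={p_value:.4f}, ready={decision}")
```

    Fixing `min_lift` and `alpha` before the experiment runs is what keeps the decision from drifting toward whatever the data happens to show.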

    The operating model that makes this sustainable is simple and disciplined. I rely on empowered product teams organized as product trios (product, design, engineering) to run dual tracks of discovery and delivery. We socialize learning with stakeholders early and often to strengthen trust and stakeholder management. We elevate strategy by linking every roadmap item to a problem statement, a testable hypothesis, and a quantified outcome—no orphan features, no vanity launches.

    In the AI era, speed can tempt us back into shipping-by-idea. I use generative AI for prototyping and insight synthesis, and I lean on LLMs to accelerate discovery work, without treating AI as a shortcut to validation. Our AI Strategy clarifies where AI augments discovery, where it powers the product, and how we evaluate risk, so we move faster without compromising rigor or ethics.

    My rule of thumb: spend just enough time building to learn to achieve conviction, then shift decisively to building to earn—while preserving a small discovery cadence to keep learning alive. This rhythm protects focus, compounds insight, and makes growth more predictable. It’s how we avoid the output trap, deliver meaningful outcomes, and create products that customers love and the business celebrates.


    Inspired by this post on SVPG.

