Author: Shivam Tiwari

  • Break the Headcount Ceiling: How AI Agents Create Net-New Pipeline at Scale

    I’ve been through enough planning cycles to know the impossible math sales leaders juggle. Every year, we’re asked to deliver more pipeline, and the expectation is that the team will somehow hit the target—whether headcount follows or not. In a good year you close some of the gap, but the underlying constraint remains: your pipeline ceiling is tied to your headcount. The ask gets bigger, but the resources rarely keep pace. There’s never been a convincing answer to “how do I grow pipeline by 30% without 30% more people?”

    For the first time in my 20-year sales career, there’s a real answer, and it comes from how we’re using our Customer Agent—internally nicknamed “Fin”—for inbound sales. What changed my perspective wasn’t faster execution on the same tasks; it was recognizing that an Agent can generate its own pipeline, consistently and at scale.

    Most conversations about AI in sales focus on efficiency—do the same work, just faster. That’s helpful but incomplete. In practice, the Agent is producing net-new, attributable pipeline. It’s not simply an efficiency layer inside the SDR team; it’s a distinct source that deserves its own targets, its own owner, and clear visibility in our pipeline analytics.

    Here’s how we run it. Fin has dedicated performance metrics but is held to the same outcomes as any rep: meetings booked, pipeline created, and revenue generated. On live chat, we track qualified, disqualified, and dropped conversations, then follow those cohorts through to opportunity and close. When you fold the Agent’s numbers into the team’s aggregate, you lose the crucial signal of what the Agent is actually doing. Reframing this with explicit attribution changes the boardroom conversation from “efficiency gains” to “a new, incremental source of pipeline.” Last month was our highest pipeline month from Fin to date—stronger than when live chat was handled by humans alone.
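
    The separate attribution described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Fin's reporting actually works: the `Conversation` record, its field names, the source labels, and the sample figures are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Conversation:
    source: str      # hypothetical label: "agent" or "sdr"
    outcome: str     # "qualified", "disqualified", or "dropped"
    pipeline: float  # opportunity value created, 0.0 if none


def pipeline_by_source(conversations):
    """Summarize outcomes and pipeline per source instead of one team total."""
    report = defaultdict(lambda: {"qualified": 0, "disqualified": 0,
                                  "dropped": 0, "pipeline": 0.0})
    for c in conversations:
        report[c.source][c.outcome] += 1
        report[c.source]["pipeline"] += c.pipeline
    return dict(report)


# Made-up sample data, for illustration only.
sample = [
    Conversation("agent", "qualified", 12000.0),
    Conversation("agent", "dropped", 0.0),
    Conversation("sdr", "qualified", 8000.0),
    Conversation("agent", "disqualified", 0.0),
]
report = pipeline_by_source(sample)
```

    Keeping `source` on every record is the design choice that matters here: once the rows are summed into a single team total, the Agent's contribution can no longer be recovered.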

    The template for this transformation came from customer service. Before we operationalized AI for sales, I partnered closely with our support organization. They built the organizational architecture we’re applying today: clear ownership of the AI motion, Agents and humans running in parallel, and a continuous optimization loop that treats the Agent as a living system, not a set-and-forget tool. The workflows in support and sales are more similar than people expect—qualify the need, guide to the right solution, and move decisively toward an outcome.

    “The right benchmark is matching a high-performing rep on that channel, consistently and at scale”

    When the Agent reliably meets that benchmark, the gains compound. The team wins back time for work where relationships truly matter—multi-threading across stakeholders, tailoring value narratives, and navigating complex buying processes. That is where human judgment shines.

    The most common question I hear is what this means for SDRs. If the Agent owns the frontline, what are SDRs actually doing? The answer is: higher-leverage work. The Agent handles frontline inbound—engaging instantly, qualifying, routing high-intent prospects to the right team, and keeping lower-intent visitors warm by directing them to self-serve resources or remembering their context until they’re ready for a real conversation. It does this 24/7, across languages, without the capacity constraints that come with a human-only model.

    What changes is where SDRs’ time goes. For us, that’s phone-based qualification, where we still see the strongest conversion. It’s also deeper relationship-building across multiple stakeholders in an account—the kind of multi-threaded engagement that takes time and judgment. Trials are a great example: rather than treating a trial as a conversion mechanism, SDRs can help prospects get real value from it through guided setup and outcome-oriented check-ins.

    “That’s work they rarely have capacity for right now, because too much of their time goes to the frontline. Fin changes that”

    I want to be direct about one thing: replacing your SDR function entirely with AI is a mistake. SDRs are the talent pipeline for closing teams. The reps who become your best AEs are, more often than not, people who came up through an SDR role. That’s where they learn to qualify and build relationships at speed. Eliminating that function to reduce cost creates fragility further up the funnel that can take years to surface.

    Across the market, many sales organizations are still early in this journey. Startups and smaller teams are ahead—they’re building AI-first motions from the ground up and deliberately designing to avoid scaling headcount in the traditional way. Larger, more established sales development functions are mostly still running standard workflows. That makes sense—transforming a mature org is harder than building anew—but complexity isn’t a reason to wait. Momentum is building, and the gap is widening between teams leaning in and those holding back.

    What’s emerging now is dedicated AI ownership within sales. It requires someone with program-level responsibility for how the Agent actually performs, rather than bolting AI tools onto an existing job description. We created that role – it’s called “AI SDR program lead.” This role owns the strategy, implementation, and optimization of Fin within the inbound SDR motion, ensuring it drives pipeline growth and integrates well across our systems and workflows. It’s a new career opportunity that came directly from the AI motion, with one of our existing managers moving into it.

    The long-held assumption that pipeline growth requires proportional headcount growth is no longer a fixed law. AI-generated pipeline is real, measurable, and improvable with the same rigor we apply to any other part of the function. Treating it as its own source—with explicit targets, attribution, and dedicated ownership—is the difference between marginal efficiency gains and truly breaking the link between pipeline growth and headcount.

    The constraint hasn’t disappeared; it has moved. It’s no longer just about how many people you can hire. It’s about how well the Agent understands your product, your customers, and your qualification logic—and how quickly your team can iterate the workflows, knowledge, and guardrails around it. For the first time, the pipeline ceiling can be higher than your headcount allows.

    If you’re standing up this motion now, start with three moves: give the Agent its own KPIs and attribution, put a single owner in charge of performance and iteration, and reorient SDR time toward high-conversion conversations and multi-threaded account development. That’s how you scale pipeline without scaling headcount in lockstep.


    Inspired by this post on The Intercom Blog.


  • Inside AI Product Management at Amplitude: How Leaders Turn Data into Better Products

    When I think about the impact of AI on product management, one line sums it up for me: "Spencer Whittaker is a senior AI product manager at Amplitude. He focuses on using AI to advance Amplitude's mission of helping companies build better products." That focus on outcomes reflects how I frame AI Strategy—grounding every model and workflow in customer value and product-led growth.

    In practice, that means pairing Amplitude analytics and behavioral analytics with A/B testing and continuous discovery. I lean on eval-driven development to keep models honest, and I coach product managers on LLM techniques so teams can prototype safely while we protect signal. Using a unified analytics platform clarifies what to build next and how to iterate faster.

    On teams I lead, product discovery stays tightly coupled to AI workflows: we map hypotheses to metrics, design experiments, and close the loop with instrumentation before we ship. That discipline turns AI from a demo into durable value, accelerating activation, retention, and feature adoption without sacrificing quality. A pragmatic AI product toolbox keeps us focused on measurable outcomes, not just novel capabilities.

    If you’re building with AI today, take a page from leaders pushing the craft forward: start with clear outcomes, connect your data in a unified analytics platform, and let A/B testing and continuous discovery guide your roadmap. With the right foundations—Amplitude analytics, behavioral analytics, and a sharp AI Strategy—you’ll transform insight into impact and build better products, faster.


    Inspired by this post on Amplitude – Perspectives.


  • Scale Support with Heart: How AI Makes Every Customer Interaction Faster and More Human

    Every day at HighLevel, I talk with support leaders who are balancing two imperatives that can feel at odds: scaling service efficiently while deepening empathy in every interaction. My product lens is simple—use AI to clear the path for humans to do what only humans can do: listen, understand, and solve nuanced problems with care.

    That principle anchors our customer support AI strategy. We deploy AI workflows that handle the heavy lift—classification, intent detection, summarization, knowledge retrieval, and next-best-action—so agentic AI can triage, resolve routine issues, and hand off the right context when a human touch is needed. The result is a queue that moves faster, with more signal and less noise, and a team freed to bring empathy and judgment to the moments that matter most.

    On the front line, a voice AI agent or chat interface deflects repetitive requests, while conversation design ensures the experience feels respectful, transparent, and helpful. Inside the console, Agent Analytics surface what leaders care about: which topics spike, where customers get stuck, how sentiment and CSAT shift, and which playbooks actually shorten time to resolution. When an agent steps in, AI-assisted replies, real-time summarization, and suggested macros reduce cognitive load—so attention goes to the customer, not the keyboard.

    Shipping these capabilities responsibly requires rigor. My playbook pairs LLM techniques for product managers with a retrieval-first pipeline that grounds responses in audited knowledge, backed by privacy-by-design and data governance. We use eval-driven development to measure safety and quality, and A/B testing to quantify impact before broad rollout. This isn’t just about automation; it’s about trust, reliability, and continuous discovery with real customers.

    Context is king, so CRM integration is non-negotiable. By unifying tickets, purchase history, prior conversations, and lifecycle stage, agents walk in with empathy already loaded. Whether the channel is Intercom, HubSpot, or native chat, a unified analytics platform connects signals across journeys, enabling proactive outreach, smarter product tours, and in-app guides that prevent avoidable tickets in the first place.

    The outcome is a support organization that scales without sacrificing humanity. AI handles the repetitive; people handle the relational. Teams spend less time searching and more time solving. Leaders coach with data instead of guesswork. And customers feel heard—because they are. That’s how we make human support more human, at scale.


    Inspired by this post on Amplitude – Perspectives.


  • What’s New with Amplitude Agents: Faster Releases, Smarter Insights, and Must‑Try Upgrades

    I’ve been deep in the work of turning agentic AI from a promising idea into reliable, measurable outcomes. Today, I want to share a concise, practitioner’s update on what’s new with Amplitude Agents—and, more importantly, how to get real value fast using proven product management techniques.

    We launched AI Agents a few weeks ago. We’ve been shipping pretty fast since then, so we wanted to loop you in on what’s new and what’s worth trying.

    Rapid releases only matter if they translate into user value. My approach is to treat every agent improvement as a learning opportunity: instrument it, set clear success metrics, run controlled experiments, and iterate. This eval-driven development mindset keeps us honest about what’s truly working in the wild.

    If you’re trying Amplitude Agents now, start with a narrowly scoped, high-signal workflow where success is unambiguous—think a single journey with a clear “done” state. Connect the experience to your unified analytics platform so you can see the full picture across events, funnels, and cohorts. In practice, I lean on Amplitude analytics and Agent Analytics to make this visibility effortless.

    Define how you’ll measure impact before you ship. Identify activation and completion events, baseline them, and then A/B test your agentic AI flow against the status quo. Behavioral analytics will show whether users are discovering the agent, sticking with it, and returning for more. When the story in the data is clean, it’s much easier to scale the win.
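
    One way to run that comparison is a two-proportion z-test on completion rates. The sketch below is a minimal, standard-library version; the function name and the counts are made up for illustration, and it assumes users were randomly assigned to each arm.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two completion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Made-up counts: arm A is the status quo, arm B is the agentic flow.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
```

    A positive `z` means arm B completed at a higher rate than arm A; whether the difference is convincing depends on the significance threshold you fixed before the test started.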

    Hardening matters as much as headlines. As you expand use, apply sensible guardrails—input validation, clear prompts, and transparent handoffs to deterministic flows when confidence is low. Pair this with observability so you can spot anomalies early and recover gracefully. These practices reduce risk while preserving the speed and creativity that make AI workflows powerful.
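
    The guardrails above can be sketched as a small routing function. Everything here is an assumption for illustration: the `route` function, the 0.8 threshold, the length limit, and `fake_agent` (a stand-in for a real model call that returns an answer and a confidence score) are not part of any Amplitude API.

```python
from typing import Callable, Tuple


def route(message: str,
          agent: Callable[[str], Tuple[str, float]],
          threshold: float = 0.8,
          max_len: int = 2000) -> Tuple[str, str]:
    """Return (handler, reply) after validation and a confidence check."""
    text = message.strip()
    # Input validation: reject empty or oversized input before any model call.
    if not text or len(text) > max_len:
        return "rejected", "Please send a shorter, non-empty message."
    answer, confidence = agent(text)
    # Low confidence: hand off to a deterministic flow and say so openly.
    if confidence < threshold:
        return "deterministic_flow", "Passing you to our standard help flow."
    return "agent", answer


def fake_agent(text):
    # Stand-in for a real model call; returns (answer, confidence).
    if "password" in text:
        return "Reset it under Settings.", 0.95
    return "Not sure.", 0.3
```

    Returning the handler name alongside the reply makes it easy to log which path each message took, which is the observability the paragraph asks for.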

    Once the basics are working, dig into adoption patterns: segment by cohort, study user activation paths, and run retention analysis to find where the agent is truly changing behavior. These insights shape roadmap priorities and help you invest in the moments that drive durable value.

    We’ll keep shipping quickly and sharing practical guidance. If you have feedback, experiments to showcase, or questions about instrumentation, send them our way—I use that signal to refine our next set of improvements and learning agendas. Expect more short, focused updates and deeper dives on evaluation frameworks, prompt strategies, and rollout playbooks.

    In short: keep it scoped, instrument everything, test deliberately, and let the data guide your next move. That’s how Amplitude Agents becomes not just new, but indispensable.


    Inspired by this post on Amplitude – Best Practices.


  • Beyond Command and Control: How I Build Trust, Speed, and Autonomy in Product Teams

    When uncertainty spikes, I notice many organizations snap back to "Command and control." It feels fast, safe, and decisive—especially when the stakes are high. But in product management leadership, speed without shared context is often an illusion, and control without trust rarely scales. I’ve learned that what looks like strength from the top can quietly create bottlenecks, missed signals, and disengaged teams.

    Why do smart companies revert in tough times? Familiarity. Centralizing decisions can reduce short-term cognitive load and signal clarity. Yet the cost shows up quickly: leaders become single-threaded on context they cannot possibly hold, and teams spend cycles asking for permission rather than creating value. The result is slower learning and weaker product strategy just when continuous discovery and iteration matter most.

    Here’s the hard truth: no single leader can hold all the context required to make every decision in a modern, cross-functional environment. The hidden complexity of customer segments, technical debt, data signals, and go-to-market constraints outstrips any one person’s bandwidth. That’s why empowered product teams, staffed with domain experts, outperform command centers—provided they’re aligned on outcomes and guardrails.

    I like the burning house analogy: in a true emergency, crisp direction helps—"take the stairs, not the elevator"—because the problem is clear, the time horizon is short, and the action is obvious. But most product work is not a single burning house; it’s a city with evolving fire codes, shifting weather, and neighborhoods that look different block to block. In that environment, distributed action scales better than centralized control.

    Strong leadership is not the same as command-and-control. In practice, it means setting a compelling direction, defining guardrails, and running tight feedback loops. I aim for what I call the "Flotilla of kayaks": we’re all headed to the same lighthouse, but each kayak navigates its own currents based on local information. That’s aligned autonomy—fast, resilient, and deeply accountable.

    People often ask why some command-and-control companies still succeed. My view: beneath the surface, there’s usually more trust and unofficial autonomy than their org charts suggest. Teams earn freedom by shipping reliably, sharing decision rationales, and showing outcomes. Leaders tolerate—and even quietly endorse—those pockets of autonomy because they see the results.

    It’s a spectrum, not a binary. I flex my style based on risk, reversibility, and time horizon—what I’d call spectrum thinking. Early in a bet, or when risks are existential, I raise the altitude and tighten the cadence. As confidence builds, I widen autonomy and shift the team to outcomes over outputs. Beware "Founder mode" when it drifts from vision-setting into day-to-day decision vetoes; it’s intoxicating early and suffocating at scale.

    On decision-making, I prefer a simple principle: let the person with the most relevant expertise decide, while incorporating the right input. That’s "Consultative decision-making" in practice. In some regions, you’ll hear it called "Konsultativer Einzelentscheid." The point is to seek counsel without defaulting to consensus that bogs down speed. One person owns the call, and everyone commits to the decision once it’s made.

    Practically, here’s what works for my teams: we clarify decision rights up front, draft pre-reads with clear options and risks, involve the smallest set of stakeholders required, and document the decision and expected signals ahead of time. Product trios keep discovery tight with design and engineering, while stakeholder management focuses on context, not sign-offs. We track outcome-focused OKRs rather than output-focused ones and hold regular decision reviews so we can reverse or double down fast.

    My key takeaways are consistent: "Command and control" can feel efficient, but it doesn’t scale in complex environments. No leader can hold all the context. Strong leadership is about direction, guardrails, and feedback loops—not control. High-performing teams balance autonomy with alignment. Decision-making should sit with the person closest to the problem, supported by the right input and transparent reasoning. Trust is built and earned over time—and it changes how teams operate.

    Reflection prompts I use with my leads: Where does your team sit on the command-and-control ↔ autonomy spectrum? Are the highest-context people truly making the decisions? What would it take to increase trust and autonomy—better instrumentation, clearer guardrails, or tighter cadences? Which calls require consensus, and which deserve a decisive, single-threaded owner?

    If you’re wrestling with speed, alignment, and autonomy in your organization, start small: pilot "Consultative decision-making" on one consequential decision, set explicit guardrails, and measure the outcome. You may be surprised how quickly aligned autonomy compounds into better product discovery, sharper product strategy, and stronger execution.


    Inspired by this post on Product Talk.


  • Master Build-to-Learn: The Essential FAQ to Supercharge Product Discovery in the AI Era

    In the age of AI, I’ve come to believe we’re all builders—yet not all building is the same. There is a very meaningful difference between building to learn (known as product discovery) versus building to earn (known as product delivery). When we confuse the two, we waste precious time, budget, and team energy on output over outcomes. My goal in this FAQ-style reflection is to clarify when and how to choose each mode so we can make smarter, faster, more confident product decisions.

    Why does this distinction matter so much right now? Because as the cost of product delivery continues to drop, the scarce resource shifts from shipping capacity to clarity of problem, solution, and value. Cloud infrastructure, CI/CD, feature flags, and even gen AI code assistance have made it cheaper to launch. That’s great—but if we don’t learn the right things before we scale, we’ll efficiently deliver the wrong product. Discovery is how we de-risk that.

    What do I mean by build to learn? I use discovery to quickly validate problems, test value, and shape solutions before committing delivery teams to scale. In practice, that means continuous discovery with customer interviews, rapid prototyping, and lightweight experiments that put us in front of real users fast. I rely on product trios and empowered product teams to co-own outcomes, not just output, and I anchor decisions with OKRs framed around outcomes rather than output so we stay focused on measurable impact.

    How do I structure discovery sprints? I start with an opportunity solution tree to map customer pain points and candidate solutions, then select the smallest test that can invalidate a risky assumption. When signals are ambiguous, I refine the questions and instrument better learning loops rather than pushing harder on delivery. For experiments, I keep a bias to speed: clickable prototypes, concierge tests, or generative AI prototypes often reveal more in days than a coded MVP does in weeks. When experiments go live, I use a clear minimum detectable effect (MDE) and resist reading noise as signal.
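
    To make the MDE concrete, here is a minimal sketch of the usual normal-approximation sample size for comparing two conversion rates. The function name, the 10% baseline, and the 2-point lift are assumptions chosen for illustration; the defaults (5% significance, 80% power) are common conventions, not a recommendation from the source.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline, mde, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect an absolute lift of `mde`."""
    p1, p2 = baseline, baseline + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


# Hypothetical experiment: 10% baseline conversion, 2-point absolute MDE.
n = sample_size_per_arm(baseline=0.10, mde=0.02)
```

    Because the required sample grows with the inverse square of the MDE, halving the effect you want to detect roughly quadruples the traffic you need. That is why it pays to fix the MDE before the experiment goes live.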

    Where does AI change the calculus? LLMs are turbocharging discovery for product managers by accelerating research synthesis, persona drafts, and early concept validation. I pair that with eval-driven development to set crisp acceptance criteria for AI behaviors before any production integration. Prompt engineering and conversation design are part of the toolkit, but the same rule applies: prototype to learn, not to impress. AI can make bad ideas cheaper to build, so disciplined discovery matters more than ever.

    So when do I switch to build to earn? Once I have evidence of value and feasibility, I shift into product delivery to scale with quality, security, and reliability. This is where I bring in product roadmapping and sprint planning, DORA metrics to monitor deployment frequency and lead time, and strong SRE and observability practices to safeguard the user experience. The handoff isn’t a wall; discovery continues inside delivery to refine scope, reduce risk, and maintain momentum.
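
    As a rough illustration of the two DORA metrics named above, the sketch below computes deployment frequency and median lead time from commit and deploy timestamps. The `dora_summary` helper and the timestamps are hypothetical; real tooling would pull these from your CI/CD system.

```python
from datetime import datetime, timedelta
from statistics import median


def dora_summary(changes, window_days):
    """changes: list of (committed_at, deployed_at) datetime pairs."""
    lead_times = [deployed - committed for committed, deployed in changes]
    return {
        "deploys_per_day": len(changes) / window_days,
        "median_lead_time": median(lead_times),
    }


# Made-up changes over a one-week window.
changes = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 15)),   # 6 hours
    (datetime(2024, 1, 3, 10), datetime(2024, 1, 4, 10)),  # 24 hours
    (datetime(2024, 1, 5, 8), datetime(2024, 1, 5, 10)),   # 2 hours
]
summary = dora_summary(changes, window_days=7)
```

    The median is used rather than the mean so that one slow change does not dominate the lead-time figure.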

    What pitfalls do I watch for? The biggest is treating delivery as discovery: shipping features to “see what happens” without a clear learning thesis. Another is tech-first decisions driven by technology FOMO instead of product strategy and customer value. I also see teams set output-based commitments that crowd out learning; OKRs framed around outcomes rather than output keep us honest. And when considering build vs buy, I evaluate whether the capability differentiates us; if not, I’ll buy to preserve discovery capacity on what truly matters.

    My operating conviction is simple: invest early and deliberately in build to learn so build to earn becomes high-confidence, high-velocity, and high-impact. In practical terms, that means smaller bets, faster feedback, clearer outcomes, and tighter collaboration across product, design, and engineering. If we get discovery right, delivery feels inevitable—and customers feel understood.


    Inspired by this post on SVPG.


  • AI Agents That Truly Help Product Teams: A Practical Framework for When—and When Not—to Use Them

    Every week, I field the same question from product leaders and engineers: should we deploy an AI agent here, or are we overfitting the problem to a shiny solution?

    When I say “AI agents,” I’m talking about autonomous or semi-autonomous systems that can perceive context, plan steps, and take actions across tools and data sources with minimal supervision—what many now call agentic AI. In product management terms, they’re not just another feature; they’re an operating model shift. Used well, they compound team leverage. Used poorly, they add invisible complexity, new failure modes, and governance headaches.

    To make the call with confidence, I use a straightforward VITAL framework that my team can apply in minutes. It keeps us honest about where AI agents are a force multiplier—and where a simpler automation, rule, or in-product UX is the better choice.

    V is for Volume. Agents shine where there’s sustained, repetitive, high-throughput work: triaging inbound support, cleansing CRM records, orchestrating QA checks, or synthesizing weekly research summaries. If the workflow happens rarely or ad hoc, an agent is often overhead in disguise.

    I is for Instructions. Can I specify success in clear, testable terms? Strong instructions include measurable acceptance criteria and constraints. If I can’t articulate what “good” looks like without hand-waving, the task likely needs product discovery, not autonomy.

    T is for Tolerance. What is the blast radius if the agent makes a wrong call? Low-stakes, reversible actions with tight guardrails are ideal. If the tolerance for error is near zero (e.g., irreversible financial transactions or sensitive regulatory actions), favor human-in-the-loop, stronger approvals, or defer agents entirely.

    A is for Access. The agent needs the right data, tools, and permissions, with privacy-by-design and data governance in place. If telemetry is sparse, integrations are brittle, or you can’t enforce least-privilege access, you’ll fight fragility more than you’ll gain leverage.

    L is for Learning loop. Agents require eval-driven development, Agent Analytics, and continuous feedback to stay accurate as reality shifts. If you can’t measure quality, latency, and cost per outcome—or you lack a retrieval-first pipeline to ground responses—expect drift and stakeholder distrust.
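
    The five VITAL questions can be captured as a simple checklist. This is a sketch of one possible encoding, not a tool from the original post: the `VitalCheck` class and `missing_pillars` helper are hypothetical, and the two example assessments are illustrative judgments rather than measured results.

```python
from dataclasses import dataclass, fields


@dataclass
class VitalCheck:
    volume: bool         # sustained, repetitive, high-throughput work
    instructions: bool   # success is specified in clear, testable terms
    tolerance: bool      # errors are low-stakes and reversible
    access: bool         # data, tools, and least-privilege permissions exist
    learning_loop: bool  # quality, latency, and cost per outcome are measured


def missing_pillars(check):
    """List the pillars that are not satisfied; an empty list means proceed."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


# Illustrative assessments only.
support_triage = VitalCheck(True, True, True, True, True)
pricing_decisions = VitalCheck(volume=False, instructions=False,
                               tolerance=False, access=True,
                               learning_loop=True)
```

    Returning the names of the failed pillars, rather than a single yes or no, tells the team what to fix before revisiting the decision.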

    Now, the counterweight. Don’t use agents when the problem is novel or strategically ambiguous and you still need exploratory research; when outcomes are unmeasurable or subjective without heavy context; when stakes are high and the acceptable error rate is effectively zero; when data is siloed, stale, or legally constrained; when the work is one-off or low-volume; or when your team can’t commit to instrumentation, evaluations, and ongoing maintenance. In these cases, a simpler rules engine, a clearer UX, or a well-defined workflow usually beats agentic complexity.

    Here’s how this plays out in practice. We’ve seen agents materially improve customer support triage (categorization, priority, and next-best-action suggestions), CRM hygiene (deduplication, enrichment, and routing), and release QA (regression check orchestration with human sign-off). Conversely, we avoid agents for nuanced pricing decisions, sensitive risk scoring without robust datasets, or any workflow where “explainability” and auditability trump speed.

    Operationalizing agents is a product problem before it’s an ML problem. Start narrow with a retrieval-first pipeline and rigorous prompt engineering, define success metrics upfront (quality, latency, cost per task), and run head-to-head evaluations against human baselines. Ship behind feature flags, monitor with Agent Analytics, and graduate from assisted to autonomous modes only after you’ve proven stability. Align this with product roadmapping and sprint planning so the work lands as durable capability, not a lab demo.

    Finally, be honest about build vs buy. If the workflow is a point of parity, consider buying and focusing your team on integration quality and governance. If it’s a potential source of competitive differentiation, invest in a modular architecture with clear context window management, strong observability, and a feedback loop tightly coupled to your empowered product teams.

    The bottom line: AI agents unlock leverage when there’s volume, clarity, tolerance, access, and a learning loop. If any of those pillars is missing, pause. Your best next move is likely better instrumentation, sharper problem framing, and continuous discovery—not more autonomy. That discipline is how product teams turn agentic AI from hype into habit.


    Inspired by this post on Product School.


  • AI Data Security for Product Teams: Protect Sensitive Product Data Without Slowing Innovation

    Protecting product data has never felt more urgent. Every week, my teams experiment with generative AI prototypes and LLM-powered capabilities, and I’m accountable for ensuring our innovation never compromises cybersecurity, privacy, or customer trust. The goal is not to slow down; it is to build in the right guardrails so speed and safety reinforce each other.

    When I assess AI risk with product managers, I start with how data moves. The biggest threats usually come from prompt and context leaks, unsafe logging of sensitive inputs or outputs, permissive access controls, unmanaged third-party model usage (shadow AI), and unclear data-retention policies. When product managers work with LLMs, I emphasize that every step in AI workflows, from collection to processing to storage, must assume adversarial conditions.

    In my experience, the product data most exposed includes customer PII and payment identifiers, internal strategy documents and roadmaps, analytics and behavioral telemetry tied to users, feature flags and configuration values, embeddings and vector stores that can reveal sensitive patterns, and the prompts or contexts themselves. Even “harmless” evaluation datasets can contain inferred identities. Treat all of this as high-value assets in your data governance model.

    I apply privacy-by-design from the first discovery conversation: minimize data by default, redact or tokenize before any external model call, and separate identities from content wherever possible. A retrieval-first pipeline helps keep raw customer data within our boundary while still enabling relevant context. We combine deterministic safeguards (policy-based redaction, allow/deny lists) with runtime observability to detect anomalous prompts, outputs, or access patterns.
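
    A minimal sketch of policy-based redaction before an external model call might look like the following. The two regular expressions are deliberately simple and are assumptions for illustration; real PII detection needs far broader coverage and review by your security team.

```python
import re

# Illustrative patterns only; not a complete PII policy.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "CARD": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}


def redact(text):
    """Replace matches with typed placeholders before text leaves the boundary."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Made-up input; the redacted string is what would be sent to the model.
prompt = redact("Refund jane.doe@example.com, card 4111 1111 1111 1111.")
```

    Typed placeholders such as `[EMAIL]` keep enough structure for the model to respond sensibly while the raw values stay inside your boundary.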

    To keep velocity high, we operationalize risk rather than debate it ad hoc. A lightweight risk scoring rubric classifies each capability (e.g., internal-only, customer-facing, regulated data adjacent) and dictates controls: redaction requirements, human-in-the-loop thresholds, eval-driven development gates, and incident response readiness. These controls live in CI/CD so product teams get fast, automated feedback without waiting on meetings.
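
    One way to picture a rubric like this as an automated gate is the sketch below. The tier names follow the examples above, but the thresholds, field names, and the `missing_controls` helper are assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Controls:
    redaction_required: bool
    human_review_below: float  # send outputs scoring under this confidence to a person
    eval_gate_required: bool
    incident_runbook_required: bool


# Hypothetical tiers and thresholds; tune these to your own governance model.
RUBRIC = {
    "internal-only": Controls(False, 0.0, False, False),
    "customer-facing": Controls(True, 0.7, True, True),
    "regulated-data-adjacent": Controls(True, 0.9, True, True),
}


def classify(customer_facing: bool, near_regulated_data: bool) -> str:
    """Pick the strictest tier that applies."""
    if near_regulated_data:
        return "regulated-data-adjacent"
    return "customer-facing" if customer_facing else "internal-only"


def missing_controls(capability: dict) -> list[str]:
    """Return the controls a capability still lacks; an empty list passes the gate."""
    tier = classify(capability["customer_facing"], capability["near_regulated_data"])
    required = RUBRIC[tier]
    gaps = []
    if required.redaction_required and not capability.get("has_redaction"):
        gaps.append("redaction")
    if required.eval_gate_required and not capability.get("has_eval_suite"):
        gaps.append("eval gate")
    if required.incident_runbook_required and not capability.get("has_runbook"):
        gaps.append("incident runbook")
    return gaps


summarizer = {"customer_facing": True, "near_regulated_data": False, "has_redaction": True}
print(missing_controls(summarizer))  # ['eval gate', 'incident runbook']
```

    A CI job could fail the build whenever the returned list is non-empty, which gives teams the fast feedback without a meeting.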

    Partnership is essential. I bring Security, Legal, and Data partners into the product trios early to align on regulatory compliance and threat modeling while scoping solutions that meet outcome goals. We maintain a shared catalog of approved providers and architectures, document data flows, and version our policies just like code—so everyone can see what changed and why.

    Vendor diligence is non-negotiable. I ask LLM providers about data retention and training usage, encryption at rest and in transit, key management, regional data controls, audit posture (SOC 2, ISO 27001, HIPAA where needed), and support for private networking. We restrict scopes with least-privilege access and instrument robust observability for threat detection and response across the full path, not just the API call.

    Culture makes the biggest difference. I coach teams on prompt hygiene, secret handling, and context window management; we publish redaction patterns, approved libraries, and clear do/don’t examples. When incidents happen, we treat them as learning opportunities, run blameless reviews, and update our playbooks, guardrails, and training materials accordingly.

    The outcome I aim for is confidence with speed: we ship AI features that customers love while protecting the data they entrust to us. With a clear risk model, strong data governance, and embedded controls, product teams can innovate boldly—without compromising on security or trust.


    Inspired by this post on Product School.


  • AI Experimentation Mastery: How I Test Faster, Tame Variability, and Ship with Confidence

    I’ve learned that the fastest path to durable AI impact is a disciplined experimentation engine: one that moves quickly, reduces ambiguity, and earns trust with evidence. My goal isn’t just to ship models—it’s to ship measurable outcomes with repeatable rigor.

    AI experimentation for product teams. Here’s how to test AI features, choose the right metrics, handle variability, and make data-driven decisions.

    I start every AI initiative by framing a clear decision: what must be true for this feature to be worth building, and how will we know quickly? From there, I map driver trees that connect user value to measurable signals, so every test clarifies both impact and risk, not just accuracy.

    Success criteria come next. I translate aspirations into testable thresholds, define leading and lagging indicators, and size tests with minimum detectable effect (MDE) so we don’t confuse noise for signal. This keeps us honest about sample sizes, power, and the real cost of waiting for certainty.
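
    To make the sizing step concrete, here is a rough sketch using the standard normal approximation for comparing two conversion rates with a two-sided test. The function name and the default alpha and power values are assumptions; a dedicated power-analysis tool may give slightly different numbers:

```python
import math
from statistics import NormalDist


def sample_size_per_arm(
    baseline: float, mde_abs: float, alpha: float = 0.05, power: float = 0.8
) -> int:
    """Approximate users needed per arm to detect an absolute lift of `mde_abs`."""
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs**2)


# Example: 10% baseline conversion, looking for a 2-point absolute lift.
n_two_points = sample_size_per_arm(0.10, 0.02)
# Halving the MDE roughly quadruples the required sample.
n_one_point = sample_size_per_arm(0.10, 0.01)
```

    The second call is the useful gut check: a smaller MDE is much more expensive, which is the real cost of waiting for certainty.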

    Before I touch production traffic, I run eval-driven development. I curate golden datasets that reflect real user complexity, codify rubrics for correctness, safety, tone, and latency, and automate scoring so improvements are reproducible—not anecdotal. This gives the team a stable baseline to iterate prompts, tools, and policies with confidence.
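
    A stripped-down sketch of what such an offline eval loop can look like. The golden cases, the two rubric checks, the `candidate` stub, and the baseline numbers are all hypothetical placeholders; real rubrics for safety, tone, and latency would need richer scorers:

```python
from typing import Callable

# Hypothetical golden cases; real ones should come from reviewed user transcripts.
GOLDEN_SET = [
    {"input": "How do I reset my password?", "must_include": ["reset"], "max_chars": 200},
    {"input": "Cancel my subscription", "must_include": ["cancel", "confirm"], "max_chars": 200},
]


def score_case(case: dict, output: str) -> dict[str, bool]:
    """Apply simple rubric checks; each key is one rubric dimension."""
    text = output.lower()
    return {
        "correct": all(phrase in text for phrase in case["must_include"]),
        "concise": len(output) <= case["max_chars"],
    }


def run_evals(generate: Callable[[str], str]) -> dict[str, float]:
    """Return the pass rate per rubric dimension across the golden set."""
    totals: dict[str, int] = {}
    for case in GOLDEN_SET:
        for dimension, passed in score_case(case, generate(case["input"])).items():
            totals[dimension] = totals.get(dimension, 0) + int(passed)
    return {dimension: count / len(GOLDEN_SET) for dimension, count in totals.items()}


def candidate(prompt: str) -> str:
    """Stand-in for the prompt or model version under test."""
    return f"To proceed with '{prompt}', please confirm and we will send a reset link."


BASELINE = {"correct": 0.5, "concise": 1.0}  # assumed scores from the previous version
results = run_evals(candidate)
regressions = [d for d, rate in results.items() if rate < BASELINE[d]]
```

    Comparing each run against a stored baseline is what turns "it feels better" into a reproducible claim.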

    Model behavior is inherently stochastic, so I deliberately control variability. I document temperature, top-p, and seed strategies; I compare deterministic settings for regression checks versus sampled settings for user-facing creativity; and I test sensitivity across content lengths and edge cases. This reduces flakiness and prevents surprise regressions during CI/CD.
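
    As an illustration of the deterministic-versus-sampled split, the sketch below measures how often repeated runs agree. The two profiles use parameter names common to LLM APIs, but the values, the `fake_model` stand-in, and the `agreement_rate` helper are assumptions; no real provider is called:

```python
import random
from collections import Counter

# Hypothetical decoding profiles; names mirror common LLM settings but are assumptions.
REGRESSION_PROFILE = {"temperature": 0.0, "top_p": 1.0, "seed": 1234}
CREATIVE_PROFILE = {"temperature": 0.8, "top_p": 0.95, "seed": None}


def fake_model(prompt: str, profile: dict, rng: random.Random) -> str:
    """Stand-in for a model call: fixed output at temperature 0, varied otherwise."""
    options = ["Option A", "Option B", "Option C"]
    if profile["temperature"] == 0.0:
        return options[0]
    return rng.choice(options)


def agreement_rate(prompt: str, profile: dict, runs: int = 20) -> float:
    """Share of runs that return the most common output (1.0 means fully stable)."""
    rng = random.Random(profile["seed"])
    outputs = [fake_model(prompt, profile, rng) for _ in range(runs)]
    return Counter(outputs).most_common(1)[0][1] / runs


stable = agreement_rate("Suggest a subject line", REGRESSION_PROFILE)
varied = agreement_rate("Suggest a subject line", CREATIVE_PROFILE)
```

    A regression suite would expect `stable` to stay at 1.0, while the sampled profile is judged on rubric pass rates rather than exact-match output.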

    When it’s time to learn from real users, I favor A/B testing with thoughtful guardrails. I run holdouts, cap exposure with feature flags, and protect core experience metrics like retention and time-to-value. For ranking and retrieval changes, I’ll use interleaving or switchback tests to isolate effects from seasonality and traffic mix.

    To handle LLM variability online, I aggregate outcomes over multiple prompts per cohort, use stratified bucketing to balance power users and new accounts, and track confidence intervals over time instead of snapshot p-values. This approach turns noisy model outputs into stable product signals.
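
    A small sketch of both ideas, assuming two strata (power users and new accounts) and a normal-approximation interval for the difference in conversion rates. The function names, seed, and example counts are hypothetical:

```python
import math
import random
from collections import defaultdict
from statistics import NormalDist


def stratified_assign(users: dict[str, str], seed: int = 7) -> dict[str, str]:
    """Map user_id -> variant, splitting each stratum as evenly as possible."""
    rng = random.Random(seed)
    by_stratum: dict[str, list[str]] = defaultdict(list)
    for user_id, stratum in sorted(users.items()):
        by_stratum[stratum].append(user_id)
    assignment: dict[str, str] = {}
    for ids in by_stratum.values():
        rng.shuffle(ids)
        half = len(ids) // 2
        for index, user_id in enumerate(ids):
            assignment[user_id] = "treatment" if index < half else "control"
    return assignment


def diff_ci(
    success_t: int, n_t: int, success_c: int, n_c: int, level: float = 0.95
) -> tuple[float, float]:
    """Confidence interval for treatment rate minus control rate (normal approximation)."""
    p_t, p_c = success_t / n_t, success_c / n_c
    se = math.sqrt(p_t * (1 - p_t) / n_t + p_c * (1 - p_c) / n_c)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_t - p_c
    return diff - z * se, diff + z * se


users = {
    **{f"power-{i}": "power" for i in range(10)},
    **{f"new-{i}": "new" for i in range(10)},
}
assignment = stratified_assign(users)
low, high = diff_ci(120, 1000, 100, 1000)  # 12% vs 10% conversion
```

    In this example the interval still spans zero, so a 2-point lift on 1,000 users per arm is not yet a stable signal; tracking how the interval narrows over time is more informative than a single p-value.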

    Instrumentation fuels everything. I rely on behavioral analytics to trace user intent, effort, and satisfaction across flows, and I wire up Amplitude analytics for event schemas, funnel drop-offs, and cohort comparisons. Clear event taxonomies and naming discipline make it trivial to separate model quality from UX friction.

    Risk is part of the work, so I bake in AI risk management early. I include toxicity and PII checks in my offline evals, monitor safety metrics in every A/B, and set rollback criteria tied to user harm and system costs. Privacy-by-design, audit logs, and runtime safeguards aren’t afterthoughts—they’re acceptance criteria.

    The operating cadence matters as much as the math. I run continuous discovery with customer interviews to keep the test queue grounded in real jobs-to-be-done, and I align product trios on hypotheses, success metrics, and stop-loss rules before launch. Weekly readouts keep decisions crisp, and post-ship learning cycles feed the next iteration.

    Finally, I invest in upskilling the team. We run internal workshops on LLMs for product managers, standardize experiment templates, and maintain a living playbook so new experiments start at 80% instead of 0%. The result: faster learning loops, safer bets, and more confident shipping.


    Inspired by this post on Product School.


  • Pretotyping vs. Prototyping: How I Validate Ideas Fast and Build Products Customers Love

    I learned early in my career that beautiful prototypes don’t save you when you’re solving the wrong problem. What does save you is separating market risk from solution risk and choosing the fastest, lowest-cost way to get evidence. That’s why I rely on pretotyping to test demand in days and prototyping to refine usability and feasibility once I see a strong signal. The result: faster cycles, fewer wasted sprints, and products customers genuinely want.

    Pretotyping vs. prototyping explained: differences, benefits, examples, and when to use each approach to validate ideas before you build.

    Here’s how I define the two in practice. Pretotyping answers, “Should we build this at all?” Its goal is to validate real user intent and behavior with the lightest-weight artifact possible—often before any code. Think painted-door (fake door) experiments, Wizard-of-Oz flows powered by humans behind the scenes, concierge tests, landing-page smoke tests with waitlists or preorders, and simple A/B testing to gauge click-through intent. It optimizes for time-to-signal and cost-to-learn.

    Prototyping answers, “Can we build this well?” and “How should it work?” Once demand is evidenced, I prototype to de-risk solution details: usability, architecture, performance, and integration. This might include interactive UI models, high-fidelity flows, technical spikes, or service stubs. Here, I optimize for learning about user experience and technical feasibility without fully committing to production.

    When should you use each? If your biggest unknown is market risk—whether customers care at all—start with pretotyping. If your biggest unknown is solution risk—how to deliver an experience that’s usable, reliable, and scalable—move to prototyping. In other words, validate the “right thing” before you perfect the “thing right.”

    My decision rule is simple: identify the dominant risk, then pick the smallest experiment that can credibly invalidate it. For market risk, I look for evidence of behavior, not opinions: clicks on a painted door, signups on a landing page, willingness to pay (deposits, preorders), or sustained repeat usage in a Wizard-of-Oz flow. For solution risk, I look for task completion, time-on-task, error rates, and qualitative friction from usability sessions with a realistic prototype.

    Concrete examples from recent work help illustrate the difference. When exploring a new analytics insight, I shipped a fake door inside our product nav; a simple tooltip explained the concept and captured interest. Click-through rate, conversion to a short explainer, and waitlist signups told me whether the value proposition resonated before building anything. For a complex AI-assisted workflow, I ran a Wizard-of-Oz experiment: users experienced the end-to-end flow while our team manually handled the “AI” behind the curtain. That gave us real engagement data and edge cases to inform the prototype and later the MVP.

    Metrics matter. I set a clear hypothesis with a guardrail on sample size and a minimum detectable effect I’d consider actionable. For pretotyping, I focus on time-to-first-signal, intent conversion (CTR to interest, interest to signup), cost-per-qualified-lead, and evidence of willingness to pay. For prototyping, I prioritize task success rates, usability severity findings, and qualitative insights that materially change the design or technical approach. Above all, I avoid vanity metrics and anchor decisions to outcomes, not output.
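
    To show what a pre-registered decision rule for a fake-door test might look like, here is a minimal sketch. The thresholds, field names, and outcome labels are hypothetical and should be set before the test runs, not after:

```python
from dataclasses import dataclass


@dataclass
class FakeDoorResult:
    impressions: int
    clicks: int
    signups: int


def decide(
    result: FakeDoorResult,
    min_impressions: int = 500,
    min_ctr: float = 0.05,
    min_signup_rate: float = 0.20,
) -> str:
    """Apply the decision rule agreed before launch (all thresholds are assumptions)."""
    if result.impressions < min_impressions:
        return "keep running"  # sample-size guardrail not yet met
    ctr = result.clicks / result.impressions
    signup_rate = result.signups / result.clicks if result.clicks else 0.0
    if ctr >= min_ctr and signup_rate >= min_signup_rate:
        return "promote to prototype"
    return "iterate or stop"


print(decide(FakeDoorResult(impressions=2000, clicks=160, signups=40)))
# promote to prototype (CTR 8%, signup rate 25%)
```

    Writing the rule as code before launch removes the temptation to move the goalposts once the numbers arrive.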

    My repeatable playbook looks like this: (1) Frame the problem and value proposition in one crisp sentence. (2) Choose the leanest pretotyping method that can reveal real behavior. (3) Define success metrics and a decision rule before you run the test. (4) Launch quickly, instrument well, and let the data run long enough to be credible. (5) If demand is strong, promote to a prototype to refine UX and de-risk technicals; if not, iterate the proposition or stop. This keeps product discovery continuous and ensures roadmapping and sprint planning are guided by evidence.

    There are ethical guardrails I never skip. Painted doors must set correct expectations once clicked; waitlists or learn-more pages are honest and respectful. For Wizard-of-Oz and concierge tests, I’m explicit about data handling and provide timely follow-up. Trust compounds when experiments are transparent and user time is valued.

    Tooling can accelerate the cycle without diluting rigor. I often use lightweight design systems and no-code automations to stitch together realistic flows, and I’ll leverage gen AI for product prototyping to generate copy, microinteractions, or data scaffolding. But the principle remains: don’t over-invest until evidence earns the investment. Empowered product teams thrive when they optimize for learning velocity, not feature velocity.

    If you’ve ever felt the tension between shipping fast and shipping right, this approach resolves it. Pretotype to prove the market; prototype to perfect the solution. Do that consistently and you’ll spend more time delivering outcomes customers value—and far less time debating outputs.


    Inspired by this post on Product School.


  • The AI PM One-Pager: Radical prototyping requirements for speed, clarity, and truth

    I move fastest in Generative AI when I strip work down to its essential signals. At HighLevel, I rely on a single-page format, “Prototyping Requirements: The One-Pager for AI PMs,” to turn ideas into testable artifacts within hours, not weeks. This approach reinforces our AI strategy, minimizes coordination overhead, and keeps product management focused on learning over ceremony.

    “Prototyping requirements go rogue: one page, zero bureaucracy, built for AI. Shape concepts fast, prompt tools directly, and get to the truth sooner.”

    In practice, my one-pager captures only what’s required to run an immediate experiment: the user problem, the target behavior change, success signals, core constraints, intended AI workflows, and the smallest realistic path to an evaluable demo. I also include example prompts, guardrails, and evaluation criteria so product managers can apply prompt engineering and work with LLMs without guessing.
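
    For teams that keep the one-pager next to their code, the sections above could be captured in a small structure like the sketch below. The class, field names, and `missing_sections` helper are hypothetical; the format itself is just a single page of text:

```python
from dataclasses import dataclass, field, fields


@dataclass
class OnePager:
    user_problem: str = ""
    target_behavior_change: str = ""
    success_signals: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    ai_workflow: str = ""
    smallest_demo_path: str = ""
    example_prompts: list[str] = field(default_factory=list)
    guardrails: list[str] = field(default_factory=list)
    evaluation_criteria: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """List sections that are still empty, so gaps are visible before the experiment."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


draft = OnePager(
    user_problem="Agency owners spend hours rewriting follow-up emails.",
    target_behavior_change="Owners send an AI-drafted follow-up within five minutes.",
    success_signals=["draft accepted with minor edits"],
    example_prompts=["Draft a friendly follow-up for a missed appointment."],
)
print(draft.missing_sections())
```

    An empty result from `missing_sections()` is a reasonable "ready to prototype" check; anything still listed is a conversation to have before prompting tools.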

    This is eval-driven development in action. I document a minimal hypothesis, concrete inputs/outputs, and a quick plan for metrics, including qualitative signals from continuous product discovery. By prompting tools directly, we expose assumptions early, shorten feedback loops, and build an AI product toolbox that compounds learning sprint after sprint.

    I run this with a product trio to ensure we balance feasibility, usability, and value. We align on risks, dependencies, and what “good” looks like, then we integrate the learnings into product roadmapping and sprint planning. The result: fewer meetings, tighter collaboration, and empowered product teams delivering sharper outcomes with less friction.

    If you want speed and clarity without sacrificing rigor, adopt the one-pager. It centers the conversation on evidence, accelerates AI workflows from prompt to prototype, and makes it obvious what to try next—and what to stop doing. Most importantly, it keeps the team focused on truth over theater, which is how great AI products actually ship.


    Inspired by this post on Product School.


  • Unleashing Inbound Sales with AI: My Playbook for Launching and Scaling Sales Agents Fast

    Inbound leads shouldn’t wait for a rep’s calendar. When we first launched The Service Agent Blueprint, support leaders finally had a clear AI path. Go-to-market and revenue teams are now facing similar uncertainty, so I’m introducing The Sales Agent Blueprint—a practical map for launching and scaling AI for sales with confidence.

    For most sales teams, inbound motions require a lot of manual work. I’ve watched leads pile up in queues, waiting for availability rather than being prioritized by buyer intent. That delay costs meetings, pipeline, and momentum—and it’s exactly where a modern AI Strategy can transform your go-to-market strategy.

    Agents can run sales conversations end to end: engaging buyers, qualifying leads, and routing high-intent opportunities to the right team so prospective buyers move forward quickly. Humans will still be involved, but they will shift their focus to the consultative conversations and higher-value work they did not have time for before. In practice, this shift enables cleaner AI workflows, better conversation design, and a healthier balance between sales-led growth and product-led growth.
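
    To illustrate the qualify-and-route step in code, here is a deliberately simple sketch. The signals, scoring weights, thresholds, and destination names are hypothetical; a production Agent would infer intent from the conversation itself rather than from a handful of flags:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    employees: int
    asked_about_pricing: bool
    requested_demo: bool
    timeline_days: Optional[int] = None  # None means the buyer gave no timeline


def intent_score(c: Conversation) -> int:
    """Add up simple buying signals (weights are assumptions)."""
    score = 0
    score += 2 if c.requested_demo else 0
    score += 1 if c.asked_about_pricing else 0
    score += 1 if c.timeline_days is not None and c.timeline_days <= 90 else 0
    return score


def route(c: Conversation) -> str:
    """Send high-intent buyers to a person; keep the rest with the Agent or self-serve."""
    score = intent_score(c)
    if score >= 3:
        return "account_executive" if c.employees >= 200 else "sales_development"
    if score >= 1:
        return "agent_follow_up"
    return "self_serve_resources"


hot_lead = Conversation(employees=500, asked_about_pricing=True, requested_demo=True)
print(route(hot_lead))  # account_executive
```

    The point of the sketch is the ordering: intent decides whether a human gets involved, and firmographics only decide which human.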

    Many go-to-market and revenue leaders are now asking: Where do you start? What should success look like? How do you actually test and deploy these solutions? These are the right questions, and the ones I hear most often when teams weigh build-vs-buy decisions, evaluation frameworks, and CRM integration nuances.

    The Sales Agent Blueprint answers those questions. It’s designed to be a strategic guide for sales, revenue, and AI transformation leaders who want to deploy AI for inbound sales fast, prove value, and build momentum. If you’re aiming for eval-driven development, this will help you define success up front and operationalize it.

    What’s inside is simple by design yet deep enough to take you from zero to value. The Sales Agent Blueprint is structured around two tracks that reflect how high-performing teams adopt agentic AI: first, launch for quick wins; next, scale for durable growth.

    [Image: Sales Agent Blueprint teaser banner with a “Scale it” headline and a “Coming Soon” badge.]

    Today, I’m releasing the first part of the Blueprint: “Launch it.” It’s a practical guide for getting your Agent live and seeing real results. You’ll learn how to deploy a Sales Agent that runs inbound sales conversations end to end, engaging buyers, qualifying leads, and routing high-intent opportunities to the right outcome in real time—without disrupting your current CRM integration or pipeline processes.

    By the end of the “Launch it” track, you’ll be ready to execute with clarity. Here’s how I frame the essential steps, based on what consistently works in the field.

    Understand what a Sales Agent is: Discover why they’re different from chatbots and how they work.

    Build a business case: Prove the basic economics of AI, decide whether to buy or build, and get the buy-in and budget you need to move forward.

    Evaluate an Agent: Learn how to define success, choose the right evaluation criteria, and run a focused, high-impact assessment with our five-step framework.

    Deploy with confidence: Build a deployment plan that gets your Agent live quickly to engage buyers at peak intent. Learn what to expect at each stage.

    [Image: Sales Agent Blueprint graphic for step 1, “Launch it.” Footer: fin.ai/blueprint/sales.]

    Continuously improve performance: After launch, your Agent becomes a system to manage. We’ll show you how to implement a repeatable process to train, test, deploy, and optimize.

    The second track, “Scale it” (coming soon), focuses on the organizational and systems design work that unlocks compounding gains. Launching AI is only the beginning. To unlock its full potential, you need to rewire your inbound sales motion—redesigning the buyer journey, building AI-first systems and ownership models, and rethinking how pipeline is generated and scaled. This is where governance, measurement, and team roles evolve to support sustainable growth.

    I’ll be building this Blueprint in public as I navigate the same challenges—sharing what works, what to avoid, and how to accelerate time-to-value without sacrificing quality or trust. If you’re ready to turn intent into revenue with agentic AI, this is your head start.

    The first track of the Sales Agent Blueprint, “Launch it,” is live now. Explore the guide at fin.ai/blueprint/sales and start your “Launch it” sprint today.


    Inspired by this post on The Intercom Blog.

