Great products aren’t just shipped; they’re understood. In my product management practice, the difference between a good release and a great one often comes down to disciplined documentation that moves at the speed of delivery. That’s why the docs-as-code approach has become a cornerstone of how I build, lead, and measure product experiences across teams.
As I reflect on leaders who set a high bar in this craft, one description stands out: "With years of experience as Senior Documentation Manager, Jeff leads teams and oversees the end-to-end creation of documentation using docs-as-code methodology." That concise statement captures a model I deeply respect—one that treats documentation as a first-class citizen in the product lifecycle.
In practice, docs-as-code integrates documentation into CI/CD pipelines, version control, and peer review workflows—exactly how we ship software. This elevates quality, enforces consistency, and accelerates responsiveness to change, all while enabling rigorous content audit and UX writing standards. When documentation evolves with code, it becomes discoverable, testable, and measurable—key traits for scalable product management leadership.
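To make "testable" concrete, here is a minimal sketch of the kind of check a docs pipeline might run on every commit. It assumes the docs are Markdown pages that link to each other by file name. The `find_broken_links` helper and the sample pages are hypothetical, written for illustration rather than taken from any specific toolchain.

```python
import re
from typing import Dict, List, Tuple

# Matches Markdown links that point at another .md page, e.g. [setup](setup.md).
# Assumption: links are flat file names without anchors or directories.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)#\s]+\.md)\)")


def find_broken_links(pages: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (page, target) pairs where the target page does not exist."""
    broken = []
    for name, text in sorted(pages.items()):
        for target in LINK_PATTERN.findall(text):
            if target not in pages:
                broken.append((name, target))
    return broken


# Hypothetical docs set: index.md links to api.md, which is missing.
docs = {
    "index.md": "See [setup](setup.md) and [API](api.md).",
    "setup.md": "Back to [home](index.md).",
}

broken = find_broken_links(docs)
for page, target in broken:
    print(f"{page}: broken link to {target}")
```

A CI job could fail the build whenever `broken` is non-empty, which is how documentation ends up held to the same release gate as code.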
The downstream impact is tangible. Users ramp faster through onboarding, in-app guides, and product tours because the narrative aligns with the product’s true state at any given commit. Support tickets drop, developers work with greater clarity, and PMs gain the feedback loops needed for continuous discovery. In a product-led growth motion, this clarity compounds—reducing time-to-value and enabling teams to ship confidently.
Equally important is the leadership pattern behind the methodology: aligning product, engineering, and customer-facing teams around shared truths. I’ve seen empowered product teams operate at their best when documentation is embedded in planning, sprint reviews, and release gates. This creates a single source of truth that scales knowledge, preserves intent, and shortens the path from decision to delivery.
For me, the standard expressed above isn’t just a role description—it’s a blueprint for operational excellence. When we manage documentation with the same rigor as code, we build trust at every touchpoint and create the conditions for sustained product velocity. That’s the level of clarity and execution I strive to foster across every product line.
Inspired by this post on Amplitude – Perspectives.
Every day, I challenge my teams to make one small, meaningful improvement—something so lightweight it’s impossible to ignore and easy to repeat. That tiny daily motion compounds, and over time it reshapes customer experience, operational quality, and team culture.
That’s the essence of Kaizen, the Japanese philosophy of continuous improvement. Developed in post-war Japan and popularized by companies like Toyota, Kaizen proves that small, steady changes lead to significant long-term results. In product management and customer support, this approach transforms big ambitions into daily behaviors that actually stick.
Crucially, Kaizen isn’t passive or unstructured. It thrives on three principles I reinforce across my org. First, small changes reduce resistance—when you lower the activation energy, teams move faster. Second, improvement is continuous, not occasional; instead of waiting for quarterly reviews or major releases, you ask: “What can we improve right now?” Third, everyone participates—the people closest to the work are best positioned to improve it. That’s how momentum spreads.
In practice, the cycle is simple: identify a small problem, test the change, measure the result, refine, and repeat. The point isn’t radical transformation in a single swing; it’s steady progress guided by data and observation—a rhythm that aligns beautifully with eval-driven development and continuous discovery.
At Intercom, we apply this same philosophy to how we manage our AI agent, Fin, through a process we call the “Fin Flywheel”. Here’s how it works.
Train: Teach Fin how to handle and resolve the most complex customer queries.
Test: Run fully simulated customer conversations from start to finish to see exactly how Fin will behave before going live.
Deploy: Launch Fin across all channels so customers get consistent support wherever they reach out.
Analyze: Use AI-powered insights to review and improve Fin’s performance so it can deliver better customer experiences.
This isn’t a one-time setup; it’s a continuous loop where every interaction feeds ongoing improvement. Rather than deploying AI and assuming it will perform as expected, improvement is built into the system itself. The more Fin is used, the better it gets. That’s the hallmark of agentic AI done right—tight feedback loops, purposeful conversation design, and clear Agent Analytics that illuminate what to tune next.
But continuous improvement doesn’t stop with AI. Within our Human Support operations, I emphasize the same discipline: instrument the experience, learn from real usage, and close gaps fast. We operate on a simple principle: the first time you solve a customer issue should be the last time it happens.
When a conversation reaches a human, we pause to diagnose and prevent recurrence. Why did this reach me? Why couldn’t Fin resolve it? How can we prevent this from happening again? Those questions anchor a culture of root-cause thinking and accelerate product-led growth by removing friction at the source.
To make this effortless, we’ve built a lightweight, AI-powered way to log suggestions in the moment—no long explanations or heavy admin required. Ideas are reviewed quickly and implemented by subject matter experts or by the team themselves. This keeps the flywheel spinning: insights flow in, fixes go out, and measurable outcomes improve.
The result is a frontline that evolves from reactive problem-solvers into a proactive improvement engine. The people closest to customers spot friction, suggest fixes, and see their insights shaped into meaningful change. It’s continuous discovery embedded in everyday work, not a side project.
Kaizen demonstrates that lasting progress doesn’t come from occasional transformation; it comes from intentional, everyday refinement. The “Fin Flywheel” applies that philosophy to AI. Our Human Support continuous improvement process applies it to human insights. Together, they create a shared system where both people and AI learn continuously from customer interactions.
When improvement is built into the mechanics of how you work, it stops being a one-off project and becomes an ingrained capability. Over time, those small daily improvements don’t just add up; they compound into a sustainable, data-driven advantage that elevates customer experience and differentiates your customer support AI strategy.
I remember the exact moment our product crossed the threshold from scripted automation to truly agentic AI. The excitement was real—so was the pit in my stomach when our dashboards went dark. Our trusted analytics and observability stack, which had served us flawlessly for traditional software, suddenly couldn’t explain what the agent was doing, why it made certain choices, or how to reproduce outcomes across runs.
"The moment our product became an AI agent, our entire observability stack became irrelevant, not something you want as an analytics company. Here's what we did."
Why does this happen? Agentic AI doesn’t behave like conventional apps. Instead of deterministic flows and neatly tagged events, we face non-deterministic trajectories, tool-use chains, evolving prompts, context window dynamics, and policy guardrails that influence outcomes in real time. Clicks and pageviews give way to tokens, tool calls, and conversation turns. Without purpose-built observability, you can’t do credible product discovery, measure behavioral analytics, or run eval-driven development with confidence.
That’s why we built Agent Analytics. We needed a unified lens to trace every step of an AI workflow—from user intent to model prompts, function calls, retrievals, tool outputs, and final responses—while capturing latency, cost, guardrail hits, fallbacks, and outcome tags. We instrumented runs end-to-end, added experiment support for prompt engineering and policy variants, and wired in evaluations so we could turn subjective quality into objective signals the team could act on.
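One way to picture the trace described above is as a run record made of ordered steps. The sketch below is illustrative only: the `AgentRun` and `Step` classes, their field names, and the step kinds are assumptions made for this example, not the actual Agent Analytics schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One step in an agent run (hypothetical schema)."""
    kind: str                    # e.g. "prompt", "tool_call", "retrieval", "response"
    latency_ms: float
    cost_usd: float = 0.0
    guardrail_hit: bool = False
    fallback: bool = False


@dataclass
class AgentRun:
    """A full run from user intent to final response (hypothetical schema)."""
    run_id: str
    user_intent: str
    prompt_version: str
    steps: List[Step] = field(default_factory=list)
    outcome_tag: Optional[str] = None  # e.g. "resolved", "escalated"

    def total_latency_ms(self) -> float:
        return sum(s.latency_ms for s in self.steps)

    def total_cost_usd(self) -> float:
        return sum(s.cost_usd for s in self.steps)

    def guardrail_hits(self) -> int:
        return sum(1 for s in self.steps if s.guardrail_hit)


# Example run with made-up values.
run = AgentRun(run_id="run-001", user_intent="reset password", prompt_version="v3")
run.steps.append(Step(kind="prompt", latency_ms=120.0, cost_usd=0.002))
run.steps.append(Step(kind="tool_call", latency_ms=300.0, guardrail_hit=True))
run.steps.append(Step(kind="response", latency_ms=80.0, cost_usd=0.001))
run.outcome_tag = "resolved"
```

Keeping `prompt_version` on the run is what lets you compare prompt or policy variants later, because every outcome can be traced back to the exact configuration that produced it.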
The impact on product management was immediate. We shortened iteration cycles by making failure states obvious and reproducible, turned ambiguous feedback into structured data, and gave engineers and designers a shared source of truth for conversation design and AI workflows. With visibility into containment, escalation, autonomy ratio, and step-level success, we could ship confidently, roll back safely, and align roadmap bets to measurable outcomes, not anecdotes.
Building this capability demanded more than logging. We invested in data governance and privacy-by-design to mask sensitive content while preserving semantic context, and we separated human-identifiable data from model telemetry. We treated prompts and policies like code—versioned, diffable, and safely rolled out behind feature flags and CI/CD—so we could experiment without risking regressions in production.
What should every team measure? Start with outcome quality (task success, resolution, containment), reliability (tool success rate, guardrail triggers, fallbacks), performance (time-to-first-token, total latency, step-level latency), and efficiency (tokens and cost per successful task). Add groundedness checks for retrieval steps, regression evals for core journeys, and post-release anomaly detection to catch drift before users do. These metrics become your operating system for agent performance and your compass for product strategy.
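To show how a few of these metrics fall out of run-level data, here is a small sketch. The run summaries and their field names are invented for illustration, and the definition of cost per successful task used here (all spend divided by successful runs) is one reasonable choice among several.

```python
from typing import Dict, List

# Hypothetical run summaries; field names and values are assumptions.
runs: List[Dict] = [
    {"success": True,  "contained": True,  "tool_calls": 2, "tool_failures": 0, "cost_usd": 0.03},
    {"success": True,  "contained": True,  "tool_calls": 2, "tool_failures": 1, "cost_usd": 0.05},
    {"success": True,  "contained": False, "tool_calls": 3, "tool_failures": 0, "cost_usd": 0.04},
    {"success": False, "contained": False, "tool_calls": 1, "tool_failures": 1, "cost_usd": 0.08},
]


def summarize(runs: List[Dict]) -> Dict[str, float]:
    """Compute outcome, reliability, and efficiency metrics for a batch of runs."""
    total = len(runs)
    successes = sum(1 for r in runs if r["success"])
    contained = sum(1 for r in runs if r["contained"])
    tool_calls = sum(r["tool_calls"] for r in runs)
    tool_failures = sum(r["tool_failures"] for r in runs)
    total_cost = sum(r["cost_usd"] for r in runs)
    return {
        "task_success_rate": successes / total if total else 0.0,
        "containment_rate": contained / total if total else 0.0,
        "tool_success_rate": (tool_calls - tool_failures) / tool_calls if tool_calls else 0.0,
        # Failed runs still cost money, so all spend is divided by successes.
        "cost_per_successful_task": total_cost / successes if successes else float("inf"),
    }


summary = summarize(runs)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Dividing total spend by successful tasks, rather than by all tasks, keeps the efficiency number honest: a cheap agent that rarely succeeds looks expensive, as it should.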
If you’re building or scaling AI agents, you need Agent Analytics before you hit your first incident. It’s the difference between guessing and knowing—between reactive firefighting and proactive iteration. With the right observability, your team can move faster, manage risk intelligently, and translate agent behavior into business outcomes that compound over time.
Inspired by this post on Amplitude – Best Practices.
I’ve spent my career building and scaling product platforms, and I’ve seen firsthand how the right AI Strategy can unlock disproportionate impact. Foundational AI platforms are the engine room of modern analytics. When they’re done well, they compress time-to-insight, improve quality, and enable empowered product teams to deliver outcomes that matter.
Across leading analytics ecosystems, including Amplitude analytics, the winning pattern is consistent: invest in a unified analytics platform that abstracts complexity while enabling rapid iteration. By standardizing data governance and privacy-by-design, teams gain the freedom to experiment confidently without sacrificing compliance or security.
For me, “foundational AI platforms” means pragmatic building blocks that product and engineering can trust: evaluation harnesses for models, retrieval pipelines that surface the right context, feature stores that ensure consistency, and CI/CD with robust observability. When these AI workflows are in place, behavioral analytics, anomaly detection, and A/B testing stop being one-off projects and become repeatable capabilities.
The payoff isn’t just efficiency—it’s strategic differentiation. Internal innovation accelerates when teams can go from idea to live experiment in days, not quarters. That speed shapes the future of AI analytics: richer insights woven directly into product experiences, LLMs for product managers to prototype faster, and analytics that feel conversational, contextual, and deeply actionable.
Execution still makes or breaks the vision. I align product strategy around outcomes vs output OKRs, pair product trios with forward-deployed engineers, and use a clear build vs buy rubric for platform components. The goal is platform scalability without reinventing the wheel—own the parts that differentiate, integrate the rest, and keep your interfaces painfully simple.
If you’re leading this journey, start by mapping your critical use cases to platform capabilities, close gaps in data governance, and stand up an eval-driven development loop. Within one or two quarters, you should see a measurable lift in deployment frequency, a sharper signal on performance, and a culture that ships with confidence. That’s how foundational AI platforms empower internal innovation and help define the future of AI analytics.
Inspired by this post on Amplitude – Best Practices.
What if AI could help reduce the 10-plus years it takes to get a new drug to market? That question has shaped much of my own product strategy thinking, and it’s exactly why I was drawn to Medable’s bold move with Agent Studio. It’s a rare look inside an enterprise AI platform built for one of the most regulated industries in the world—and a team that’s still figuring it out in real time.
In this episode of Just Now Possible, Teresa Torres talks with four members of the Medable team: Luke Bates (Product Leader, Agent Studio), Jen Brown (Product Manager), Matt Schoolfield (Product Designer), and Fiachra Matthews (Principal Architect). Listening through a product management lens, I focused on how their choices reflect a modern agentic AI strategy that balances speed, safety, and scale.
Medable does something uniquely hard: enabling global clinical trials across 100+ languages and accelerating drug-to-market timelines. That scope demands more than clever prompts—it requires a durable platform approach. Their answer is Agent Studio, a no-code/low-code platform for configuring and deploying agents across the clinical trial lifecycle.
What impressed me most was how clearly the platform’s primitives map to repeatable value: models, skills, knowledge bases, MCP connectors, versioning, and trigger types. In my experience, platforms win when these building blocks are composable, governed, and observable—exactly the direction Medable is taking.
You’ll also hear about the two agents they’ve built on top of it: an eTMF agent that automates document classification across 80,000-plus documents per year, and a CRA agent that monitors patient safety and data quality across 13 different clinical systems. For a domain where errors carry real human consequences, this is the right mix of automation and oversight.
Under the hood, their architecture choices echo what I’ve seen work in other high-stakes environments. They walk through RAG approaches at scale (embeddings vs. markdown hierarchies vs. just-in-time MCP retrieval) and explain why they built custom MCPs with an authentication and credentialing wrapper. They also detail context window management with sub-agents and automatic tool filtering, which is critical for keeping agents focused and reliable as complexity grows.
Data alignment is often the unsung hero of agent reliability. I appreciated how they described building a unified ontology layer to map terminology across 13 different clinical data systems. Equally important, they show their paper trail: how they document agent intent → specification → test evidence to satisfy regulatory bodies. In a GxP context, this kind of lineage isn’t “nice to have”; it’s the price of admission.
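As a rough illustration of what an ontology layer does, the sketch below maps system-specific terms onto shared canonical concepts. The system names, terms, and the `to_canonical` helper are all hypothetical. The episode does not describe Medable's implementation at this level, so treat this as a toy model of the idea only.

```python
from typing import Dict, Tuple

# (source system, local term) -> canonical concept.
# All entries are invented examples, not Medable's actual ontology.
ONTOLOGY: Dict[Tuple[str, str], str] = {
    ("edc", "AE"): "adverse_event",
    ("safety_db", "Adverse Event"): "adverse_event",
    ("ctms", "Subject"): "participant",
    ("edc", "Patient"): "participant",
}


def to_canonical(system: str, term: str) -> str:
    """Translate a system-specific term into the shared vocabulary."""
    key = (system.lower(), term)
    if key not in ONTOLOGY:
        # Failing loudly beats guessing when terminology is unmapped.
        raise KeyError(f"No mapping for term {term!r} in system {system!r}")
    return ONTOLOGY[key]


print(to_canonical("EDC", "AE"))                    # adverse_event
print(to_canonical("safety_db", "Adverse Event"))   # adverse_event
```

The design choice worth noting is the explicit failure on unmapped terms: in a regulated setting, a silent guess is harder to audit than a visible gap.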
Figure: Medable's Agent Studio in clinical operations: no-code agents, automated eTMF document classification, unified data monitoring, and human-in-the-loop validation, aimed at shrinking drug-to-market timelines from a decade to a year.
Strategically, I love that Medable chose a platform approach to agents instead of one-off builds. They outline three deployment models: Medable-built products, services-led custom builds, and self-serve platform access. This mirrors a healthy platform business model: prove value with first-party solutions, extend via services for complex needs, and unlock scale with self-serve, all while keeping governance centralized.
Reliability is a theme throughout. They describe evaluation design in a GxP-regulated environment: golden datasets, production monitoring, and the challenge of human feedback as ground truth. We also get a concrete picture of what human-in-the-loop really looks like when clinical decisions are on the line: tight feedback cycles, auditable interventions, and clear escalation paths.
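A golden dataset evaluation can be sketched in a few lines. The version below is a generic pattern under stated assumptions: the `run_golden_eval` function, the toy classifier standing in for an agent, the document categories, and the 90% threshold are all made up for illustration and do not reflect Medable's actual harness.

```python
from typing import Callable, Dict, List


def run_golden_eval(
    agent: Callable[[str], str],
    cases: List[Dict[str, str]],
    min_pass_rate: float,
) -> Dict:
    """Run the agent on each golden case and compare against the expected label."""
    failures = [
        case["id"]
        for case in cases
        if agent(case["input"]).strip().lower() != case["expected"].strip().lower()
    ]
    pass_rate = 1 - len(failures) / len(cases) if cases else 0.0
    return {
        "pass_rate": pass_rate,
        "failures": failures,
        "passed": pass_rate >= min_pass_rate,
    }


def toy_classifier(text: str) -> str:
    """Stand-in for a real agent: a naive keyword-based document classifier."""
    lowered = text.lower()
    if "consent" in lowered:
        return "informed_consent"
    if "protocol" in lowered:
        return "protocol"
    return "unknown"


# Hypothetical golden cases with human-verified labels.
golden_cases = [
    {"id": "doc-1", "input": "Signed consent form, site 12", "expected": "informed_consent"},
    {"id": "doc-2", "input": "Protocol amendment v2", "expected": "protocol"},
    {"id": "doc-3", "input": "Investigator CV", "expected": "curriculum_vitae"},
    {"id": "doc-4", "input": "Updated protocol synopsis", "expected": "protocol"},
]

report = run_golden_eval(toy_classifier, golden_cases, min_pass_rate=0.9)
print(report)
```

Returning the failing case IDs alongside the pass rate matters for auditability: a reviewer can see exactly which documents regressed instead of only a summary number.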
Looking forward, they don’t shy away from ambition. The "full self-driving" vision for clinical trials and what it would take to get there is both provocative and grounded. My read: the path runs through stronger domain ontologies, standardized interfaces (MCP done right), eval-driven development, and relentless simplification of agent skills.
If you’re a product leader building in regulated spaces, this discussion is a masterclass in balancing innovation with compliance. The takeaways map cleanly to AI Strategy: define platform primitives, invest in retrieval-first pipeline patterns, design for context window management, lean into eval-driven development, and operationalize regulatory compliance from day one.
To dive deeper, listen to the conversation on Spotify or Apple Podcasts, and explore Medable’s broader platform work at medable.com. I left both inspired and practically equipped—an uncommon combo in today’s AI noise.
I keep a running list of product wisdom that sounds great on a slide but quietly sabotages execution. Recently, I revisited that list after a deep conversation with a seasoned CPO from a leading security and compliance platform and reflected on how these lessons show up in my own operating rhythm. What follows is my practical playbook for scaling product organizations without losing speed, quality, or the soul of the product.
Most big-tech veterans struggle when they leap into startups because the safety net of process disappears. At a startup, the buck truly stops with you—there’s no committee to shield a decision and no process to rescue a weak plan. The mindset shift is simple to say and hard to do: own outcomes end to end, reduce your reliance on institutional scaffolding, and make decisions with incomplete information while keeping standards high.
“Great product leaders stay in the details.” I sample artifacts every week—PRDs, design flows, user research notes, postmortems—and I read customer threads to calibrate my intuition. To maintain shipping velocity as headcount grows, I instrument a few critical indicators (deployment frequency, change failure rate) and favor outcomes over output. Data guides my attention; it never replaces judgment.
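For readers who want to see how the two indicators are computed, here is a minimal sketch. The deploy log is invented, and the definitions used (deploys per week over the observed span, and the share of deploys that caused a failure) are common conventions that I am assuming here, not a prescribed standard.

```python
from datetime import date
from typing import List, Tuple

# (deploy date, caused a production failure) pairs; hypothetical data.
deploys: List[Tuple[date, bool]] = [
    (date(2024, 1, 1), False),
    (date(2024, 1, 3), True),
    (date(2024, 1, 8), False),
    (date(2024, 1, 10), False),
    (date(2024, 1, 14), False),
]


def deployment_frequency_per_week(deploys: List[Tuple[date, bool]]) -> float:
    """Average deploys per week across the observed date span (inclusive)."""
    if not deploys:
        return 0.0
    days = [day for day, _ in deploys]
    span_days = (max(days) - min(days)).days + 1
    return len(deploys) / (span_days / 7)


def change_failure_rate(deploys: List[Tuple[date, bool]]) -> float:
    """Share of deploys that led to a failure in production."""
    if not deploys:
        return 0.0
    return sum(1 for _, failed in deploys if failed) / len(deploys)


print(f"Deploys per week: {deployment_frequency_per_week(deploys):.1f}")
print(f"Change failure rate: {change_failure_rate(deploys):.0%}")
```

Reading the two together is the point: rising frequency with a flat failure rate suggests healthy speed, while rising frequency with a rising failure rate suggests the team is outrunning its quality bar.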
As teams scale, I use a blunt rule to keep speed high: small autonomous teams, small batch sizes, short feedback loops. One clear owner, one prioritized backlog, and weekly demos to customers. We ship thin slices, not big bangs. I also hold to the rule that “Great CPOs should avoid comfort metrics”: the easy dashboards that rise when nothing meaningful is moving. I push for outcome-centric OKRs tied to customer value, not vanity charts.
Rigid hierarchies derail quality decision-making. They slow signal, encourage escalation theater, and suppress the truth from the edges. I shorten paths between PMs, engineers, designers, research, and go-to-market leads, and I strip out stage gates that don’t add learning. Above all, I hold myself to one rule: “Stop making your team fetch rocks.” That means no randomized executive requests without context. Instead, I frame clear problem statements, explicit constraints, and observable success criteria.
Revenue and product can feel at odds, but they don’t have to be. The key to a quality CPO and CRO relationship is a shared operating model: one customer narrative, a joint pipeline of problems worth solving, and a common scorecard. We meet weekly, review the same signals, and align on sequencing: what we solve now for impact, what we stage for scale, and what we sunset to reduce complexity. When trade-offs get tough, we anchor on customer value and long-term defensibility.
Who ultimately oversees the quality bar? I do—and I do it through clarity, exemplars, and consistent feedback loops, not micromanagement. When I leave feedback, I make it actionable and specific: name the user scenario, note the friction, propose a sharper decision frame, and suggest a smaller, testable slice. I expect narrative memos and crisp acceptance criteria; I offer rapid, detailed responses so momentum never stalls.
Open office hours are my forcing function for transparency and speed. Anyone can bring a thorny escalation, a design in progress, or a customer insight. Pair that with weekly 1:1s—non-negotiable for developing leaders and unblocking work—and the organization learns to surface issues early, make faster decisions, and self-correct without drama.
Here’s a glimpse into my working week: Mondays set priorities and confirm the few decisions that matter; midweek is for deep reviews across roadmap, research, and engineering readiness; Thursdays I’m with customers and partners; Fridays I write and synthesize. I leave space for unscripted time with individual contributors—because ICs are the unsung heroes of a company—and I celebrate excellent craft out loud.
The hardest leadership skill is knowing when to push and when to give space. I push on clarity, sequencing, and quality; I give space on solutions and implementation paths. I reject comfort metrics, reinforce outcomes vs. output, and keep the organization close to customers and details. If you’re stepping from big tech into a startup or scaling your product org through rapid growth, these practices will help you ship faster, decide better, and raise the quality bar without burning out your team.
“Outcomes over outputs” is the right mantra—and one I’ve championed across product teams—but turning it into daily practice is where most teams stumble.
It’s simple in theory: focus on the impact of what we build, not just shipping features. In reality, it’s rarely black and white because most teams are asked to do both—hit outcomes and deliver specific outputs—at the same time.
In a benchmark survey, 20% of product teams claim to be outcome-focused, nearly half describe themselves as working in a mix of outcomes and outputs, and about 30% are still primarily working with outputs. I’ve seen versions of this in my own org: we aspire to outcomes, but our rituals, roadmaps, and reporting still reward shipping.
Here’s how I draw the line clearly, coach my teams to avoid common traps, and negotiate better, more actionable outcomes that unlock genuine product discovery and business results.
Simple definitions we live by
An output is something you build or produce—a feature, a project, an initiative. It’s something your team ships.
An outcome is the impact of that output—a change in customer behavior or a business result.
Josh Seiden puts it well in his book Outcomes Over Output: “An outcome is a change in human behavior that drives business results.”
Figure: Outputs vs. outcomes. Value emerges between deliverables and impact, when features change customer behavior and move business results.
I distinguish business outcomes from product outcomes. Business outcomes are typically financial metrics that measure the health of the business (e.g. increase revenue or reduce costs) while product outcomes measure a customer behavior in the product or a sentiment about the product.
Here’s a simple example I’ve used with platform teams. Many B2B companies support a number of integrations. Integrations are outputs. Having integrations alone doesn’t create value. Customers using and finding value in those integrations—that’s an outcome. If those customers retain their subscriptions longer because of the integrations—that’s also an outcome.
Building something isn’t the same as creating value. That’s the core of this distinction, and it’s what separates empowered product teams from feature factories.
Why this distinction matters for empowered product teams
When we task teams with delivering outputs, they’re done when the software ships. When we task teams with delivering outcomes, they aren’t done until the software ships and has the expected impact.
That small shift changes almost everything about how a team works: what we measure (impact, not just delivery), how we know we’re done (measurable behavior change, not release notes), the autonomy we grant (told what to achieve, not what to build), and the planning artifacts we use (an opportunity solution tree beats a feature roadmap when we’re exploring the best path to an outcome).
When I assign outcomes, I’m giving the team latitude—and responsibility—to figure out the best path to success. That’s what opens the door for real product discovery and continuous discovery habits.
Figure: How outcome-driven teams measure success, grant more autonomy, define “done” by results, and plan with an opportunity solution tree.
Examples: spotting outputs disguised as outcomes
Clear-cut example: “Our outcome is to deliver an Android app.” An Android app is something we build and ship. It’s clearly an output.
To get to an outcome, I ask, “What’s the value of having an Android app?” or “How will we know the Android app is successful?”
We might answer: “Having an Android app will allow us to engage more users. We’ll know it’s successful when people engage with the app on a regular basis.”
This answer uncovers the hidden outcome: engage more people. Now we can set the right scope: increase the percentage of engaged users across any platform; increase the percentage of engaged mobile users; or increase the percentage of engaged Android users.
Any of these outcomes gives us more room to explore than a fixed output. Maybe we don’t need a native app at all. We could deliver the same engagement through a mobile web experience, notifications, or email. And we’re not done when we ship—we’re done when the right people are actually engaged.
Tricky example 1: measure the value creation moment (hires, not applicants)
Figure: The path from “build an Android app” to the real goal, “increase engaged users”: ask why, define value, and own results.
When setting outcomes, it’s tempting to choose the easiest-to-measure metric. But a good outcome measures the customer’s value creation moment.
I worked at a company that helped new college grads find their first job. When I started working there, the primary outcome was “increase job applications.” This technically is an outcome—it measures a specific behavior in the product.
But it doesn’t measure the value creation moment. A job seeker doesn’t get value when they apply for a job. They only get value when they get the job. Similarly, employers don’t get value from just any job applicant; they get value when the right job applicant applies.
Many job boards try to measure qualified applicants—instead of counting any applicant, they compare the credentials of the applicant to the job description and only count qualified applicants. This is better. But it still doesn’t measure the value creation moment. Both the job seeker and the employer get value when an open job is successfully filled. The right metric is hires.
Yes, “hires” can be hard to instrument because it happens off-platform and incentives misalign. Measure it anyway, even with proxies. The easy metric isn’t always the right outcome.
Tricky example 2: measure impact, not user-generated output (the course reviews trap)
I worked with a team that helped students choose university courses. They set their outcome as: “Increase the number of course reviews on our platform.”
Figure: Four common outcome traps: measuring at the wrong moment, mistaking outputs for outcomes, chasing adoption, and relying on sentiment.
Sounds like an outcome, right? It’s a metric. You can measure it. It’s an action users take on the site—writing a review. But it’s actually an output in disguise.
Reviews are valuable when they help a student evaluate a course. They don’t create any value if a student never sees them. More reviews aren’t always better, especially if they’re clustered where nobody looks.
A better outcome is “Increase the number of course views that include reviews.” Now we’re measuring impact on the decision moment, not just the production of content.
If you can hit your metric without helping customers, you’re tracking an output, not an outcome.
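The difference between the two metrics is easy to see with a small example. The course names, review counts, and view log below are invented for illustration. They are chosen so that most reviews sit on a course nobody views.

```python
from typing import Dict, List

# Hypothetical data: reviews per course, and a log of course page views.
reviews_per_course: Dict[str, int] = {"BIO101": 3, "CHEM201": 0, "HIST110": 40}
course_views: List[str] = ["BIO101", "CHEM201", "CHEM201", "BIO101"]


def total_reviews(reviews: Dict[str, int]) -> int:
    """The output-style metric: how many reviews exist on the platform."""
    return sum(reviews.values())


def views_with_reviews(views: List[str], reviews: Dict[str, int]) -> int:
    """The outcome-style metric: course views where at least one review was available."""
    return sum(1 for course in views if reviews.get(course, 0) > 0)


def share_of_views_with_reviews(views: List[str], reviews: Dict[str, int]) -> float:
    return views_with_reviews(views, reviews) / len(views) if views else 0.0


print(f"Total reviews: {total_reviews(reviews_per_course)}")
print(f"Views with reviews: {views_with_reviews(course_views, reviews_per_course)}")
print(f"Share of views with reviews: {share_of_views_with_reviews(course_views, reviews_per_course):.0%}")
```

In this toy data the platform has 43 reviews, yet only half of the course views include one, because 40 of them are attached to a course that was never viewed. Adding more reviews to that course would raise the first number and leave the second untouched.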
Tricky example 3: measure success, not just adoption (the traction metric trap)
“Increase the percentage of users who viewed the performance report.”
This looks like a good outcome. It measures a specific behavior in the product. It’s within the team’s control. But it’s what I call a traction metric—it measures adoption of a single feature, not value to the customer.
Figure: Why teams get trapped in shipping features: a vicious trust cycle fuels micromanagement, while performance-linked outcomes push safe targets.
Two problems arise. First, people can view the report and still not find what they need. Second, we might have perfectly happy customers who don’t need the report at all. Driving usage of an unneeded feature wastes time and erodes trust.
Measure the value creation moment, not just feature adoption.
Tricky example 4: pair sentiment with behavior
I define a product outcome as a metric that measures either (1) a specific behavior in the product or (2) a sentiment about the product. But sentiment metrics, like CSAT or NPS, can be tricky on their own.
Sentiment metrics are outcomes, but they aren’t directional. They don’t tell us where to explore or set guardrails for what to avoid. So I pair a behavior with a sentiment, for example: “Increase engagement without negatively impacting satisfaction.” I use sentiment as a counterweight.
Facebook and Instagram illustrate why this matters. Meta is exceptional at driving engagement—but to a fault. Many of us don’t like these addictive products. Pairing engagement with a satisfaction guardrail prevents “engagement at all costs.”
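Pairing a behavior metric with a sentiment guardrail can be expressed as a simple decision rule. This is a sketch under assumptions: the `evaluate_change` function, its return labels, and the example numbers are hypothetical, and a real analysis would also account for statistical noise before calling a difference real.

```python
def evaluate_change(
    engagement_before: float,
    engagement_after: float,
    csat_before: float,
    csat_after: float,
    max_csat_drop: float = 0.0,
) -> str:
    """Accept an engagement gain only if satisfaction stays within the guardrail."""
    engagement_improved = engagement_after > engagement_before
    satisfaction_held = (csat_before - csat_after) <= max_csat_drop
    if engagement_improved and satisfaction_held:
        return "ship"
    if engagement_improved:
        # Engagement went up, but at the expense of satisfaction.
        return "blocked_by_guardrail"
    return "no_improvement"


# Engagement up, satisfaction flat: the change passes.
print(evaluate_change(0.30, 0.35, 4.2, 4.2))
# Engagement up, satisfaction down: the guardrail blocks it.
print(evaluate_change(0.30, 0.40, 4.2, 3.9))
```

The `max_csat_drop` parameter is where the negotiation happens: setting it to zero means no measurable satisfaction loss is acceptable, while a small positive value tolerates minor dips.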
Why getting this right is hard (and how I counter it)
Figure: Five practical moves for turning outputs into measurable outcomes: translate metrics, partner with teams, iterate, avoid traps, and dig deeper.
The trust cycle. Managers don’t trust that teams can reach outcomes on their own. So managers micromanage the outputs. Teams, in turn, don’t communicate their progress toward outcomes—they communicate their progress on features. This reinforces the manager’s belief that they need to stay involved in the details. It’s a vicious cycle.
I break it by asking teams to show their work—share assumptions, research, opportunity solution trees, and evidence behind choices—and by giving feedback on the thinking, not just the solutions.
The accountability trap. When performance reviews are tied to hitting outcomes, teams play it safe. They sandbag their targets. They disguise outputs as outcomes to guarantee “success.”
I treat outcomes as learning opportunities first. When we start on a new outcome, I set a learning goal—“learn what moves the needle on this metric”—before a performance goal—“increase X by Y%.” This creates space to explore without fear.
How I get teams started with better outcomes
Translate business outcomes to product outcomes. Business outcomes like revenue, retention, and market share are lagging indicators—by the time you see them, it’s too late to act. Product outcomes measure behavior changes within the product that lead to those business results. They’re leading indicators within the team’s control.
Negotiate outcomes with your team. Outcome-setting should be a two-way conversation. Leadership brings the cross-company context. The team brings customer insight and technical realities. Neither side dictates; we co-own the target and the constraints.
Expect to iterate on your metrics. Your first outcome metric probably won’t be right. That’s normal. Sonja at tails.com went through four iterations—from 90-day retention to 30-day to 5-day to behavior-based metrics—before landing on something actionable. Thomas at Bluestone Analytics iterated three or four times before finding the right metric. Iteration is the work.
Watch for common mistakes. Outputs disguised as outcomes. Traction metrics masquerading as product outcomes. Sentiment metrics without direction. Business outcomes assigned directly to product teams without translating to behavior change.
Use the right artifacts. Replace feature roadmaps with an opportunity solution tree to explore multiple paths, test assumptions, and sequence bets explicitly against a clear outcome.
Align OKRs with outcomes. If your company uses OKRs, make sure the “KR”s are true product outcomes (behavior change and value creation), not a list of features to ship.
The bottom line
When we shift from an output-first mindset to an outcome-first mindset, it doesn’t mean that outputs stop mattering. Product teams will always ship features, and the ability to do so quickly and with quality still matters. This shift simply ensures those features achieve the intended impact. We aren’t done when we ship—we’re done when what we shipped has the intended impact.
Measure success by the impact of what you ship and you’ll build a product team that learns, adapts, and creates real value. Measure success by what you ship and you’ll get a feature factory.
Quick self-check: is your “outcome” really an outcome?
Ask yourself: 1) Does it measure a behavior change or a sentiment tied to value creation? 2) Could we hit it without helping customers? 3) Is it adoption of a single feature (a traction metric) or a result that customers and the business care about? 4) Do we have a counter-metric to prevent unintended harm? If you stumble on any of these, refine it before you commit.
The world can feel like it’s spinning, and as a product leader, I feel that pressure acutely—juggling customer needs, stakeholder expectations, and the relentless news cycle. I recently listened to a powerful conversation with Teresa Torres and Petra Wille about staying grounded when everything feels “bonkers,” and it offered a practical, human way to keep showing up without losing yourself.
What resonated most was the invitation to live my values through small, consistent actions. Rather than waiting for grand gestures or perfect solutions, I’m leaning into the mindset of “Something is better than nothing.” It’s the same spirit we bring to continuous improvement in product: make a change, evaluate impact, iterate.
“Create the world you want to live in” has become a daily prompt for me. I’m applying it to how I spend my attention, time, and platform—three scarce resources for any product management leader. I’m not going to do everything perfectly, but I can make better trade-offs this week than I did last week, and I can keep improving.
Practically, that looks like reconsidering which speaking invites I accept, especially when representation is skewed. If a stage is heavily male, I now ask organizers about their plan for balance before committing. I also question travel expectations for short talks when a high-quality virtual experience is possible—good for sustainability, budgets, and energy. These choices compound, just like product roadmapping and sprint planning decisions.
Petra’s “under-complexity” lens was a wake-up call. In product, oversimplified narratives—whether a single KPI, a vanity metric, or a forced binary—usually increase fear and bad decisions. The same is true in civic discourse. To counter that, I’m seeking more nuance on purpose: reading multiple sources on the same story, listening for who’s not in the room, and noticing how the same facts can carry different meanings depending on who’s telling it.
One simple habit helps: I’ll read The New York Times and The Wall Street Journal on a headline, then follow up with Tangle by Isaac Saul, which lays out “what the left says / what the right says / editor’s take,” sometimes including perspectives from affected communities. It’s a lightweight form of personal knowledge management that improves my product judgment and my citizenship.
Another idea that stuck with me is swapping media proxies for human connection. In product, we don’t ship based on secondhand opinions—we run customer interviews, co-create with users, and build empowered product teams. The same principle applies in community: talk to someone directly affected, ask real questions, and stay curious. When conversations get heated, I try to build bridges, reduce proxies, and look people in the eye.
I’m also reflecting on platform responsibility. Even a “small” platform can snowball through weak ties inside a company or community. I’m asking: When should I speak up? Where should I draw lines? And when is “staying in your lane” actually a way to avoid necessary leadership? These are the same stakeholder management questions we navigate in product strategy—assess impact, clarify intent, and act with integrity.
Local grounding matters, too. I’ve found energy and clarity in community-level action: voting, attending public protests when it feels right, mentoring, and supporting nonprofits like World Pulse. I love the framing of “don’t mess with my neighbors”—it keeps me focused on tangible care when the internet starts to feel like reality. I’ve also seen leaders use angel investing in agriculture-related efforts as a counterbalance to “internet reality,” channeling resources into durable, real-world outcomes.
If you want to experiment this week, pick one small lever you control: where you spend money, time, attention, or your platform. Add nuance by reading at least two different perspectives before reacting. Replace proxies with people by talking to someone with lived experience. Reduce polarization by asking, “what shaped that view?” before judging it. And go local—connect with neighbors or a community group and let small actions compound.
If you’d like to hear the full conversation that inspired these reflections, you can listen on Spotify or Apple Podcasts. Here are the direct links: Spotify: https://open.spotify.com/episode/1sxEFquu73ZB9fL9gGk6Om and Apple Podcasts: https://podcasts.apple.com/kh/podcast/staying-sane/id1794203808?i=1000755696295
Resources I’m exploring and recommend: World Pulse (https://www.worldpulse.org/), The New York Times (https://www.nytimes.com/), The Wall Street Journal (https://www.wsj.com/), and Tangle by Isaac Saul (https://www.readtangle.com/ and https://www.readtangle.com/author/isaac-saul/). For builders and writers, I also appreciate Ghost (https://ghost.org/) as an open-source publishing platform. If you work in or with the MENA ecosystem, take a look at MENA Product Summit ’26 (https://www.prdkt.plus/summit26). Colleagues like Jeff Merrell (https://jeffdmerrell.com/) and grassroots efforts such as No Kings Protest (https://www.nokings.org/) offer additional perspectives and ways to get involved.
If this resonates, share it with a teammate who’s been feeling the weight of the world. I’d love to hear one small, values-aligned action you’re taking this month—what “something” will you try next?
In my role leading product teams at HighLevel, I’m often asked to explain what’s really happening behind the scenes of today’s AI products. The short answer is that modern systems are built on an agentic architecture: not a single model, but a coordinated loop of planning, tool use, memory, and evaluation. Once you see that pattern, the design decisions snap into focus and the roadmap becomes far easier to prioritize.
At its core, agentic AI treats the model as a reasoning engine embedded within an AI workflow. The agent interprets intent, plans steps, calls the right tools and APIs, grounds itself in trusted data, and then evaluates outcomes before deciding to continue or stop. This loop creates reliability, reduces hallucinations, and enables the system to operate in real-world, multi-step scenarios.
Here’s the practical lifecycle I rely on. A user provides intent (a goal or request). We run a retrieval-first pipeline to ground the model in accurate, current data. Prompt engineering structures the task and primes the agent with constraints and success criteria while keeping the context window under control. The agent generates a plan, executes steps by calling tools or services, evaluates intermediate results, reflects or revises as needed, and only then returns a final answer with clear citations or evidence.
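To make that lifecycle concrete, here is a minimal Python sketch of the loop: retrieve, plan, act, evaluate, then answer with evidence. It is an illustration under stated assumptions, not a production design. The names run_agent, AgentResult, and the toy_* helpers are hypothetical, and simple functions stand in for the model and the real tools.

```python
from dataclasses import dataclass, field
from typing import Callable

Step = tuple[str, str]  # (tool name, tool argument); a hypothetical plan format


@dataclass
class AgentResult:
    answer: str
    evidence: list[str] = field(default_factory=list)
    steps_taken: int = 0


def run_agent(
    intent: str,
    retrieve: Callable[[str], list[str]],
    plan: Callable[[str, list[str]], list[Step]],
    tools: dict[str, Callable[[str], str]],
    evaluate: Callable[[str, list[str]], bool],
    max_steps: int = 5,
) -> AgentResult:
    """Retrieve first, plan, execute tool calls, and stop once evaluation passes."""
    evidence = retrieve(intent)          # ground the agent before planning
    steps = plan(intent, evidence)       # in a real system the model produces this
    observations: list[str] = []
    for tool_name, argument in steps[:max_steps]:  # hard cap as a termination condition
        observations.append(tools[tool_name](argument))
        if evaluate(intent, observations):         # good enough: stop early
            break
    answer = observations[-1] if observations else "No answer found."
    return AgentResult(answer, evidence, len(observations))


# Toy stand-ins so the sketch runs end to end (all assumptions).
DOCS = {
    "refund": "Refunds are available within 30 days.",
    "shipping": "Orders ship within 2 business days.",
}


def toy_retrieve(intent: str) -> list[str]:
    return [text for key, text in DOCS.items() if key in intent.lower()]


def toy_plan(intent: str, evidence: list[str]) -> list[Step]:
    return [("lookup", key) for key in DOCS if key in intent.lower()]


def toy_evaluate(intent: str, observations: list[str]) -> bool:
    return bool(observations and observations[-1])


toy_tools = {"lookup": lambda key: DOCS[key]}

result = run_agent(
    "What is the refund window?", toy_retrieve, toy_plan, toy_tools, toy_evaluate
)
print(result.answer, result.evidence, result.steps_taken)
```

The step cap and the evaluate callback are the two places where termination is decided, which is why I like to make both explicit rather than burying them in a prompt.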
For more complex work, I orchestrate multiple specialized agents—commonly a planner, a solver, and a critic—coordinated by a lightweight controller. This multi-agent pattern reduces single-agent blind spots, encourages self-checking, and mirrors how empowered product teams collaborate. Whether it’s conversation design for support flows or a voice AI agent driving hands-free tasks, orchestration is the difference between a clever demo and a dependable product.
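As a rough illustration of the planner, solver, and critic pattern, the sketch below wires three plain functions together with a small controller. Everything here is hypothetical: in practice each role would be a model call with its own prompt, and the toy functions exist only so the control flow can be run and inspected.

```python
from typing import Callable


def controller(
    task: str,
    planner: Callable[[str], list[str]],
    solver: Callable[[list[str], str], str],
    critic: Callable[[str, str], tuple[bool, str]],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Run solver and critic in rounds until the critic approves or rounds run out."""
    plan = planner(task)
    feedback = ""
    draft = ""
    for round_number in range(1, max_rounds + 1):
        draft = solver(plan, feedback)            # solver revises using the last critique
        approved, feedback = critic(task, draft)  # critic returns a verdict and notes
        if approved:
            return draft, round_number
    return draft, max_rounds  # best effort after the final round


# Toy roles (assumptions) that force exactly one revision.
def toy_planner(task: str) -> list[str]:
    return ["draft a sentence", "polish punctuation"]


def toy_solver(plan: list[str], feedback: str) -> str:
    draft = "The outage lasted 20 minutes"
    return draft + "." if "period" in feedback else draft


def toy_critic(task: str, draft: str) -> tuple[bool, str]:
    if draft.endswith("."):
        return True, ""
    return False, "Add a closing period."


final_draft, rounds = controller(
    "Summarize the outage in one sentence.", toy_planner, toy_solver, toy_critic
)
print(final_draft, rounds)
```

Keeping the controller this small is deliberate: the round limit bounds cost, and the critic's feedback is the only channel through which the solver changes its output.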
Memory is the second pillar. Short-term working context sits in the prompt, while long-term memory lives in vector stores or databases to track past interactions, preferences, and outcomes. Retrieval augments the model with the right facts at the right time, and tight context window management ensures the agent stays focused on signal, not noise. The result is faster responses, lower costs, and far better accuracy.
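Here is a small sketch of that two-tier memory, assuming a bounded short-term buffer plus a long-term store searched by similarity. To keep it self-contained, a bag-of-words cosine similarity stands in for real embeddings and a Python list stands in for a vector store; the AgentMemory class and its methods are hypothetical names.

```python
import math
from collections import Counter, deque


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class AgentMemory:
    def __init__(self, short_term_size: int = 3) -> None:
        self.short_term: deque[str] = deque(maxlen=short_term_size)  # recent turns only
        self.long_term: list[tuple[str, Counter]] = []               # everything, with vectors

    def remember(self, text: str) -> None:
        self.short_term.append(text)
        self.long_term.append((text, embed(text)))

    def recall(self, query: str, k: int = 1) -> list[str]:
        """Return the k long-term entries most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self.long_term, key=lambda item: cosine(query_vector, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]

    def build_context(self, query: str, k: int = 1) -> list[str]:
        """Recalled facts first, then recent turns, without duplicates."""
        recalled = [text for text in self.recall(query, k) if text not in self.short_term]
        return recalled + list(self.short_term)


memory = AgentMemory(short_term_size=2)
for note in [
    "customer prefers email updates",
    "billing plan is annual",
    "last ticket was about login errors",
    "customer asked about api limits",
]:
    memory.remember(note)

context = memory.build_context("billing plan")
print(context)
```

The bounded deque is the context window discipline in miniature: old turns fall out of the prompt automatically, and only the facts that match the current query are pulled back in.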
Reliability is earned through eval-driven development and robust AI risk management. I define offline and online evaluations, guardrails, and human-in-the-loop checkpoints before scaling traffic. These evaluations become living, automated tests that protect against regressions as prompts, models, and tools evolve. The payoff is real: fewer escalations, higher trust, and measurable improvements to quality over time.
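One way to picture evaluations as living tests is a small offline harness that replays fixed cases against the agent and blocks a release when the pass rate drops below a threshold. This is a simplified sketch: EvalCase, run_offline_evals, the substring check, and the 0.9 threshold are all assumptions chosen for illustration, and real evaluations usually need richer grading than a substring match.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    must_include: str  # a fact the answer has to contain to count as a pass


def run_offline_evals(
    agent: Callable[[str], str],
    cases: list[EvalCase],
    pass_threshold: float = 0.9,
) -> dict:
    """Replay every case and report the pass rate, the gate decision, and failures."""
    if not cases:
        raise ValueError("At least one eval case is required.")
    failures = [
        case.prompt
        for case in cases
        if case.must_include.lower() not in agent(case.prompt).lower()
    ]
    pass_rate = 1 - len(failures) / len(cases)
    return {
        "pass_rate": pass_rate,
        "passed_gate": pass_rate >= pass_threshold,
        "failures": failures,
    }


# Toy agent (an assumption) with one deliberately wrong answer.
CANNED_ANSWERS = {
    "refund window": "Refunds are available within 30 days.",
    "support hours": "Support is available 24/7.",
    "plan types": "We offer monthly and annual plans.",
    "data export": "Exports are not supported.",
}


def toy_agent(prompt: str) -> str:
    return CANNED_ANSWERS.get(prompt, "I don't know.")


cases = [
    EvalCase("refund window", "30 days"),
    EvalCase("support hours", "24/7"),
    EvalCase("plan types", "annual"),
    EvalCase("data export", "CSV"),
]

report = run_offline_evals(toy_agent, cases)
print(report)
```

Run in CI, a report like this turns a prompt or model change into a pass or fail signal before any traffic is scaled.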
From a product strategy perspective, I resist over-engineering. Start with a simple retrieval-first pipeline and a single agent; prove value; then layer in multi-agent orchestration only where it moves key metrics. Instrument everything—latency, cost, grounding coverage, and outcome quality—and build Agent Analytics dashboards so teams can diagnose issues and iterate with confidence.
If you’re looking for a practical playbook, here’s mine: clarify the user intent and success criteria; design the tools the agent can call; ground with authoritative data; write prompts that constrain scope and define termination conditions; add reflection and automated evaluations; and ship behind feature flags for safe, staged rollout. Each step compounds reliability without killing velocity.
The diagram and the video above bring these patterns to life. If you watch closely, you’ll see the same loop—plan, retrieve, act, evaluate—show up in every effective implementation, regardless of domain. That repetition isn’t accidental; it’s the backbone of agentic architecture and a blueprint you can adapt to your own stack.
Ultimately, what matters is outcomes. When we build around agentic AI, we create systems that are explainable to stakeholders, maintainable by engineers, and genuinely helpful to customers. That’s how we move past hype to durable impact—shipping AI products that plan, learn, and execute at scale.
Leading the Support function for a company that builds a leading Agent and AI-forward customer service platform has been, for me, unique, exciting, and yes—daunting. It’s where product ambition meets operational reality, and where every decision I make is immediately tested by customers who expect excellence.
It’s unique because we use the same technology as our customers. We live in the product every day, which puts us in a privileged position to be the voice of the customer across the organization. That tight feedback loop has shaped how I prioritize, what I build next, and how I measure success.
It’s exciting because we get to try all of the new features and capabilities of Fin and the Intercom helpdesk. With a relentless focus on AI innovation, I’ve had access to remarkable tools that help us deliver an incredible customer experience—and I’ve seen firsthand how the right workflows and guardrails turn those tools into outcomes.
And it’s daunting because expectations for our own Customer Support (CS) team are sky high. If we can’t deliver incredible support using our own technology, we undermine its value proposition. That imperative has kept me honest, focused, and fast.
In our new research, “The 2026 Customer Service Transformation Report,” we’ve been sharing how forward-looking teams use AI to transform their support models.
When Intercom changed its focus in late 2022 to prioritize the customer service use case, we undertook a critical review of the support experience we were delivering and committed to driving meaningful change under an AI-first framework. That was a turning point: I aligned product strategy and operations around a single north star—automate with quality, and elevate humans to higher-value work.
Three years on, Fin now resolves over 81% of all our customer support volume, delivering immediate and high-quality resolutions. We have absorbed a 300%+ increase in customer demand since 2022 without proportional headcount growth. Without Fin, we would have needed at least 100 additional CS team members to meet that demand and our improved service levels – a net saving to Intercom of between $7.5M–$9M annually.
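As a quick sanity check on those figures, the arithmetic below divides the stated annual saving by the stated avoided headcount. It is a back-of-the-envelope sketch only: because the saving is described as net and the headcount as “at least 100,” the result is an implied upper bound per avoided hire, not a salary or a fully loaded cost figure. The function name is hypothetical.

```python
def implied_saving_per_head(annual_saving_usd: float, avoided_headcount: int) -> float:
    """Annual net saving divided by the number of hires avoided."""
    return annual_saving_usd / avoided_headcount


low = implied_saving_per_head(7_500_000, 100)   # lower end of the stated range
high = implied_saving_per_head(9_000_000, 100)  # upper end of the stated range

print(f"${low:,.0f} to ${high:,.0f} per avoided hire per year")
# prints: $75,000 to $90,000 per avoided hire per year
```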
Throughout this work, we drew on research from the 2026 Customer Service Transformation Report and applied the lessons directly to our own org design, knowledge management, and AI workflows. What follows is our story of transformation and how we achieved a mature deployment of Fin.
The problems we set out to solve
Back in 2022, our challenges looked familiar to any modern support organization, and I knew we needed a step-change—not incremental tweaks.
We faced increased support demand from new and existing customers: Intercom was launching major features and changes at speed, driving up overall customer conversation volume and requiring additional headcount for the CS team. I could see we were scaling people faster than processes—unsustainable without automation.
Our support policy (as defined by our service level objectives) was not based on a high bar: In most cases, we were only committed to “business hours” coverage for the majority of our customers, impacting first response times. Even with SLOs that were not considered best in class, we were struggling to meet our commitments. I wanted 24/7 coverage and faster first responses without sacrificing quality.
We wanted to do more: As we pivoted our strategy, we wanted to open new routes to our support team, such as providing support to website visitors with technical questions and to trial customers. That meant meeting customers earlier in their journey with accurate, on-brand responses—at scale.
What we did
We made a very conscious decision to become our own best reference customer. As Intercom embraced the opportunity that generative AI presented to transform customer service, we intentionally moved to an AI-first strategy for our Customer Support team. I set a simple operating principle: ship value quickly, measure relentlessly, and let evidence guide the next bet.
We started with the highest-volume, informational queries and saw our resolution rates climb quickly. With that foundation in place, we pushed Fin further, training it on deeper documentation and internal procedures, and eventually giving it the ability to take actions on behalf of customers. As Fin took on more complex work, our results started to compound—and trust in the system grew across the organization.
Early adoption and building trust. When “AI Assist” features came to the Intercom Inbox, the CS team got early exposure to AI and were empowered to provide feedback directly to our product teams. This built awareness and trust across the team about what we were trying to achieve with AI, and helped shape the product roadmap. We were also the first beta customer for Fin, rolling it out to a subset of customers to watch sentiment and outcomes closely. With no adverse reaction and an initial resolution rate of over 25%, we deployed Fin to most customer segments within weeks. I’ll never forget the first week we put Fin in front of real customers—the silence of issues that never reached humans was the loudest signal of success.
Knowledge management as a product. We recognized quickly that time spent tuning our help center and knowledge assets for Fin would pay dividends. We transitioned our Help Center Manager into a “Knowledge Manager,” with a dedicated remit to optimize content for Fin. We embedded knowledge creation into our “New Product Introduction” (NPI) process, with a target that Fin would resolve at least 50% of customer issues at every new product and feature launch. Over time, we added new sources, including “Developer Documents,” enabling Fin to handle increasingly complex issues. We built a culture of continuous improvement, allocating “out of the inbox” time so every teammate could close content gaps and raise the bar.
Conversation design end-to-end. To ensure a consistent, high-quality customer experience, we created a new “Conversation Designer” role that owns the journey across automation and human handoffs. Using Intercom’s Workflows, we introduced “skills-based routing” so that when a customer asks for a human, the conversation reaches someone with the right expertise quickly. This is now handled by Fin directly using a feature called “Attributes.” The result: a seamless, on-brand experience regardless of channel or escalation path.
Organization changes that unlocked leverage. As we scaled Fin, we stood up a dedicated AI Support team under a senior CS leader to continuously optimize automation and define our AI adoption strategy across the journey. We restructured human roles into “Technical Support Specialist” and “Technical Support Engineer” to better align with the complexity of incoming work. We also expanded Support Operations to focus on optimization—using AI to uplevel Enablement, Workforce Management, QA, Process Management, and Data Insights. Just as important, we reset expectations about the balance between time spent supporting customers directly versus improving AI. That mindset shift created compounding returns.
Pushing Fin further with new capabilities. As capabilities matured, we were early adopters and saw measurable wins:
Fin Guidance: Multiple Guidance rules provide additional controls and a more personalized, targeted experience for customers.
Fin Tasks and Procedures: Enables Fin to carry out activities such as updating customers on incident status and deep troubleshooting for technical issues.
Insights: AI-driven dashboards provide deep insight into Fin’s performance and surface recommendations for further optimization. Insights also provides a Customer Experience (CX) Score for every customer interaction, enabling more targeted improvement efforts and opening up new ways to close the loop with customers who have had a poor experience.
What we achieved
What started as a focused effort to improve our customer support experience became the strongest proof point for what’s possible when you fully embrace AI. Fin now resolves over 81% of all our customer support volume and has allowed us to absorb a 300%+ increase in demand without proportional headcount growth. Over 90% of our customers now benefit from improved first response performance, 24/7 coverage, and outbound phone support.
What the numbers don’t fully capture is the shift in how our team operates. With volume absorbed by Fin, our CS teammates now deliver consultative support—guiding next best actions, deepening product adoption, and contributing directly to retention and expansion. Customers that receive these engagements adopt Fin at a much deeper level and achieve greater support success. What was once a reactive, volume-driven team is now a function that generates significant revenue.
What’s next
Customer expectations are always rising, so we’re building on our progress by embracing the Fin Flywheel—an actionable framework for ongoing improvement and optimization. This keeps us honest about the discipline required to sustain AI performance at scale.
Train: Teach Fin to resolve even the most complex queries with Procedures, knowledge, and policies.
Test: Run fully simulated customer conversations from start to finish to see exactly how Fin will behave before going live.
Deploy: Set Fin live across every channel – voice, email, chat, and social – for consistent support wherever customers reach out.
Analyze: Use AI-powered Insights to analyze and improve Fin’s performance and deliver better customer experiences.
We are also investing in our support teammates so they can adjust to the new world of AI—taking on more complex work and being valued for the subject matter expertise, consultative engagement, and empathy they bring to the role. That human layer is where differentiation shines.
We will continue to develop and share best practices for deploying an Agent, based on our own experience with Fin and the lessons learned from our most forward-looking customers. These are captured and continually evolving in The Agent Blueprint.
Transformation takes commitment
The most successful teams aren’t bolting AI onto old processes; they’re rebuilding support around it—investing in knowledge and people alongside technology, and treating AI as a continuous discipline rather than a one-time deployment. That’s the real change required. For support teams willing to make it, there’s a rare opportunity to redefine what customer service can deliver—higher CSAT, faster resolution, and durable ROI.
I’ve long believed a simple truth about AI in customer support: if AI is going to earn trust, pricing has to be aligned with value. That principle has guided my product decisions and the way I hold our teams accountable for measurable outcomes, not activity.
When we shared our perspective on pricing AI Agents in 2023, we made a simple argument: if AI is going to earn trust, pricing has to be aligned with value. At the time for Fin, that value was clear. You pay when the AI resolves a customer’s problem. If it doesn’t, you don’t. That’s fair, easy to understand, and grounded in results, not activity. We were the first to introduce this pricing model because we believed that pricing and value should be inherently linked.
That belief hasn’t changed; it has only grown stronger over time. What’s changed is what Fin can do. As we expanded capabilities and pushed deeper into complex workflows, it became clear that measuring value solely by end-to-end resolutions no longer captured the full picture of impact.
Resolutions were the right place to start. Historically, we measured value based on whether Fin fully resolved a conversation on its own. These are known as resolutions and they gave support teams a clear way to measure ROI, easily comparing the cost of AI versus human support. They also aligned our incentives with our customers, as our revenue was directly tied to Fin’s performance.
That clarity worked. Today, more than 7,000 teams use Fin. Our average resolution rate across customers has increased every month and now stands at 67%, even as Fin increasingly handles more complex queries. That progress came from building an Agent that could take on harder problems and still deliver.
But as Fin got more powerful, “success” stopped being binary. I saw this first-hand in customer design sessions where policy, risk, and compliance needs rightly demanded human-in-the-loop confirmation. We weren’t failing to deliver value; we were delivering it differently.
Over the last couple of years, we invested heavily to ensure Fin could handle the most complex parts of support. As Fin’s capabilities expanded, customers began pushing what Fin can do for them by deploying Fin deeper into their workflows to handle the toughest queries.
In some cases, this required Fin to work in tandem with a human agent because that’s what customer policies and oversight needs dictated. Subscription changes, transaction disputes, billing issues, and other multi-step support scenarios can often require Fin to gather context, read and write to external systems, and execute actions before handing off to a human agent for confirmation.
Fin is still doing what it was configured for: intentionally handing off after doing more of the heavy lifting, saving support teams valuable time and reducing overall time to serve for their customers. But our pricing metric only recognized value when the conversation ended in a full “AI resolution” (i.e. a human was never involved).
That’s why we’re evolving Fin’s pricing metric from resolutions to outcomes. This shift reflects how customers now define value: not just in full automation, but in safe, efficient progress toward the right result across complex, multi-step, and policy-constrained workflows.
An outcome represents when Fin successfully completes the action it was configured to perform, as part of a conversation. Resolutions are still one type of outcome Fin can deliver, where it handles the issue end-to-end. Another type of outcome can be a Procedure where Fin gathers context, takes action, and hands the conversation off when that’s what customers configured it to do.
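To show how the two metrics diverge, here is a small sketch that counts the same conversations under both definitions. It is not Intercom’s billing logic; the Conversation fields and the action labels are hypothetical, chosen only to mirror the definitions above: a resolution requires that no human was involved, while an outcome requires only that Fin completed the action it was configured to perform.

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    configured_action: str  # hypothetical labels: "resolve" or "procedure_handoff"
    action_completed: bool  # did Fin finish the action it was configured to perform?
    human_involved: bool    # did a human agent take part in the conversation?


def count_resolutions(conversations: list[Conversation]) -> int:
    """Old metric: Fin handled the issue end to end with no human involved."""
    return sum(
        1 for c in conversations if c.action_completed and not c.human_involved
    )


def count_outcomes(conversations: list[Conversation]) -> int:
    """New metric: Fin completed its configured action, handoff or not."""
    return sum(1 for c in conversations if c.action_completed)


conversations = [
    # Fin answers a question end to end: a resolution and an outcome.
    Conversation("resolve", action_completed=True, human_involved=False),
    # Fin gathers context, takes action, then hands off as configured: an outcome only.
    Conversation("procedure_handoff", action_completed=True, human_involved=True),
    # Fin could not complete its action and a human took over: neither.
    Conversation("resolve", action_completed=False, human_involved=True),
]

resolutions = count_resolutions(conversations)
outcomes = count_outcomes(conversations)
print(resolutions, outcomes)
```

The second conversation is the case the older metric missed: the configured work was done, but the required human confirmation meant it never counted.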
Increasing end-to-end AI resolutions is still a core component of scaling Agents, but resolutions are no longer the only measure of Fin’s success and utility, especially as Fin takes on more complex work. Moving to outcomes recognizes that solving a customer problem with full automation isn’t always appropriate. It’s about getting to the right result, safely and efficiently.
As Fin’s capabilities expand, teams should feel empowered to use it in more nuanced, collaborative work. Outcomes support that by allowing customers to design workflows that meet compliance requirements and include a human agent when necessary. From a product management standpoint, this is how we align incentives, keep risk controls intact, and still accelerate time-to-value.
Fin is becoming even more powerful at handling complex, multi-step support queries. With outcomes, we can support that growth without constantly reinventing how value is measured. And this change gives us a strong pricing foundation that can scale as Fin continues to grow and take on more roles beyond service. This aligns with our vision of Fin becoming a “Customer Agent,” capable of handling the entire customer experience.
What this means for pricing is intentionally straightforward. An outcome will be counted when Fin successfully completes an action it was configured to perform, as part of a conversation. That keeps the model predictable for finance leaders while staying transparent for operators and product teams managing AI workflows.
The pricing model stays simple and the definition of value becomes more accurate. In other words, we’re doubling down on fairness, predictability, and competitiveness—core tenets for any consumption SaaS pricing strategy tied to real business impact.
When we first wrote about outcome-based pricing, we said that trust is the currency of AI. That’s still true. Trust is earned when customers see pricing move in lockstep with utility and risk posture, especially as gen AI and agentic AI take on higher-stakes tasks.
Pricing has to feel fair, it has to be predictable, and it has to stay competitive. Evolving from resolutions to outcomes isn’t a departure from that belief. It’s the natural maturation of how we measure value as AI moves from simple Q&A into complex procedures and human-in-the-loop collaboration.
Fin has grown more powerful because customers asked more of it. Outcomes are how we reflect that progress honestly, while staying true to the same principles that guided us from the start. This is product strategy in action: align incentives, measure what matters, and scale what works.
And as Fin continues to get stronger, we’ll keep holding ourselves to the same standard: price based on the value delivered. That’s how we build durable trust, sustainable ROI, and a better customer experience at scale.
I’m consistently drawn to stories where product strategy and operational grit collide to change real lives. Zipline, the world’s largest commercial autonomous delivery system, is one of those rare cases. Serving 5,000 hospitals across multiple countries and saving an estimated 17,000 lives per year, it embodies the kind of mission-driven execution I try to model in product management. The arc—from a near-dead home robot startup to a scrappy bet on drone blood delivery in Rwanda, to 135 million autonomous miles flown—offers some of the clearest lessons I’ve seen on hiring, leadership, and product-market fit under extreme constraints.
One principle that immediately resonated with me: why Zipline doesn’t hire for experience. The idea behind “Why Zipline hires teenagers over PhDs” isn’t a dismissal of expertise; it’s a commitment to learning velocity, ownership, and unteachable hunger. The best startup employees, as described here, are “heat-seeking missiles for pain”—people who chase the hardest problems, not the shiniest projects. In my org, I look for the same signal: candidates who can move from ambiguity to action, who find the bottleneck without being asked, and who care more about outcomes than optics.
I also appreciated the unapologetic stance that “blind references are a non-negotiable.” In high-stakes builds—especially in regulated or safety-critical categories—the cost of a mis-hire compounds. I routinely validate for two traits during references: intellectual humility and accountability. “Can candidates admit when they screwed up?” is a powerful filter. If someone can’t name a hard mistake and how they specifically changed as a result, they’re unlikely to scale with the organization.
Equally important is clarity about who not to hire. The employees Zipline doesn’t want are those who optimize for status, process theater, or low-friction work. In practice, that means pressure-testing for problem-finding, not just problem-solving. I often design interviews around messy, cross-functional constraints (regulatory, operational, and financial) to see who can integrate tradeoffs, not just ideate features. That’s how we build empowered product teams that ship consequential outcomes, not outputs.
There’s a reference to “Zipline’s secret leadership playbook,” and while the specifics remain private, the spirit is unmistakable: first principles decision making, ruthless focus, and a culture that rewards radical responsibility. Translating that to my product organization, I emphasize five behaviors: orient to the mission under uncertainty, run fast but close the loop with data, communicate constraints early and often, own the long tail of consequences (especially in safety and reliability), and scale judgment by teaching the why, not just the what. That blend of clarity and autonomy is the backbone of product management leadership at any growth stage.
On the other side of the culture coin are “Why you should always fire quickly” and “The brutal firing advice that shaped Keller’s leadership.” I’ve learned (sometimes the hard way) that slow decisions erode trust and team velocity. Moving quickly doesn’t mean being harsh; it means being fair, explicit, and humane, with tight feedback loops, role clarity, and decisive action when the gap persists. If your bar is clear and your coaching is consistent, acting fast protects both the mission and the team’s energy.
Strategically, the origin story reads like a masterclass in choosing the right problem. The team moved “from toy robots to drone delivery: Zipline’s pivot,” then partnered deeply with Rwanda, where “How Rwanda’s health minister changed everything” marks a pivotal moment. It wasn’t a linear climb. “How Zipline almost died – twice” and “Why Zipline’s launch was a ‘complete disaster’” underline a tough truth: breakthrough products rarely arrive fully formed. What matters is the operating cadence that turns early chaos into repeatable reliability, especially when the stakes are measured in minutes and lives.
Scaling from one hospital to 5,000 required more than product brilliance; it demanded systems thinking across logistics, compliance, safety, and community trust. That’s stakeholder management at its highest level. The product lessons are durable: anchor on outcomes, not artifacts; build reliability as a feature; and practice founder-led GTM where your credibility is on the line with customers and regulators. This is where first principles decision making beats benchmarking, particularly in novel categories where there are no playbooks to copy.
There’s also a hard-nosed operational takeaway in “The 10x hardware cost rule every founder should know.” My read: assume total cost of ownership will balloon once you account for manufacturing variability, support, redundancy, maintenance, and compliance. In product strategy, I treat those multipliers as design inputs, not afterthoughts. If the unit economics can’t survive these realities, the idea isn’t ready—no matter how elegant the prototype looks in a lab.
Across all of this, a few product management patterns stand out for me: set OKRs around outcomes rather than outputs; hire for slope, not just intercept; make continuous discovery routine with real users (in this case, clinicians and health systems); and treat operational excellence as a product surface. When a mission is this consequential, culture becomes a safety system, and every leadership decision compounds into either speed with quality or speed with regret.
For leaders building in complex domains, this journey is a blueprint: pick problems that matter, hire “heat-seeking missiles for pain,” keep blind references non-negotiable, lead with first principles, and scale with responsibility. Do that well and even a “complete disaster” launch can become the inflection point of a category-defining company that flies 135 million autonomous miles and saves 17,000 lives per year.