Tag: build vs buy

  • Old-School Selling Beats PLG in the AI Era: My GTM Playbook for 8‑Day Enterprise Deals

    Old-school, in-person selling is having a renaissance in the AI era, and I’ve seen why up close. From leading product and go-to-market teams through hypergrowth, I keep returning to one lesson: enterprise buyers still reward the teams who show up, orchestrate change management, and own outcomes end-to-end. The tech has changed; the human dynamics haven’t.

    Has the sales playbook changed in the AI era? The tools are faster and the surface area is bigger, but the core motion remains the same: “showing up” beats letting the marketplace decide. That’s why in-person enterprise rollouts still beat product-led motions, especially when the stakes include security, governance, and cross-functional adoption. You win by reducing organizational risk, not by assuming free trials will do the heavy lifting.

    Great enterprise sellers collapse silos. They sell to engineers and executives in one motion, pairing deeply technical validation with crisp business narratives. In my org, that means every high-velocity pilot has a dual thread: hands-on, eval-driven proof for the builders and a value architecture for the budget owners. When those motions run in parallel, time-to-value plummets and procurement friction fades.

    Selling to AI-native buyers who grew up on ChatGPT changes tempo, not fundamentals. The same seller, different tempo: 8 weeks vs. 8 business days. These buyers evaluate fast, expect clear ROI, and push for automation-first workflows. How AI-native buyers handle build vs. buy decisions comes down to build for differentiation and buy for acceleration. If you make procurement feel like product—frictionless, instrumented, and transparent—you’ll meet their bar.

    Process matters, but humanity wins. Building a robust sales process that still leaves room for unscripted moments is where trust is formed. I’ll never forget the story of the rep who taught a champion’s son guitar over Zoom—an unscripted moment that cemented a partnership. The lesson: raise the floor without capping the ceiling. Equip every rep with repeatable plays, then celebrate the creative instincts that make champions out of customers.

    In early GTM, why the three highest-leverage early sales hires aren’t sellers at all resonates with my experience. I prioritize a solutions engineer who can de-risk integration, a forward-deployed operator who can run the first rollout like a product manager, and a customer success lead who designs adoption paths from day zero. Together, they compress the value journey from proof to production.

    Compensation design shapes your talent market. The case for outsized commission accelerators for star sellers — and the kind of person they attract is real: magnets for competitors who close complex, multi-threaded deals and thrive with ownership. But beware: why too much process narrows the kind of seller you attract. Over-script it and you filter out the very people who can navigate ambiguity with customers.

    Under the hood, instrumenting the funnel from stage zero to close keeps the system honest. I track intent signals before pipeline, conversion by persona and use case, proof milestones, and time-to-value in production. The three pillars of GTM excellence for me are repeatable discovery, referenceable outcomes, and relentless enablement. And inside the leadership team, building peers who are 80% aligned, not 100% preserves healthy tension while keeping execution fast.

    AI is expanding the definition of enablement—whether AI is changing what good enablement looks like isn’t a theoretical question anymore. I see world-class teams arming reps with retrieval-first knowledge bases, sandbox environments, and objection libraries that evolve weekly. Meanwhile, selling against direct and implied competitors at once is the norm: your battlecard must cover “do nothing,” internal tools, adjacent categories, and new AI entrants—while you still remember why in-person enterprise rollouts still beat product-led motions for durable adoption.

    Planning horizons tighten in AI markets. How far out should a GTM leader be planning? I work a dual cadence: a rolling 6-week operating plan that’s ruthlessly tactical and a 2–3 quarter roadmap for coverage, enablement, and category storytelling. What a normal week looks like in hypergrowth blends customer time, pipeline triage, onboarding and enablement, deal engineering, and process tuning—always with one or two high-conviction bets that could bend the curve.

    References: Ahead: https://www.ahead.com; Amazon: https://www.amazon.com; Anthropic: https://www.anthropic.com; Attio: https://www.attio.com; Augment Code: https://www.augmentcode.com/; Cognition: https://cognition.ai; Cursor: https://cursor.com; Dani McCabe: https://www.linkedin.com/in/danielle-mccabe/; Datadog: https://www.datadoghq.com; GitHub Copilot: https://github.com/features/copilot; HubSpot: https://www.hubspot.com; Jeremy Powers: https://www.linkedin.com/in/jeremypowers/; JPMorgan: https://www.jpmorgan.com; Matt McClernan: https://www.linkedin.com/in/mattmcclernan/; MongoDB: https://www.mongodb.com; Nicole Rettinger: https://www.linkedin.com/in/nicole-rettinger-23b20465/; Notion: https://www.notion.com; OpenAI: https://openai.com; Parag Agrawal: https://www.linkedin.com/in/paragagr/; Parallel: https://parallel.ai; Snowflake: https://www.snowflake.com; University of Chicago: https://www.uchicago.edu; Windsurf: https://windsurf.com

    If you’re scaling an AI product today, pair a disciplined sales-led growth engine with the best of product-led growth: fast paths to proof, hands-on validation for builders, executive-level value mapping, and human moments that turn customers into advocates. That’s how you compress an eight-week cycle into five business days—and keep the expansion flywheel spinning.


    Book a consult png image
  • Escape the “It’s Just an LLM” Trap: Inside Operator, a Reliable, Actionable AI Agent

    Escape the “It’s Just an LLM” Trap: Inside Operator, a Reliable, Actionable AI Agent

    We just launched Operator, an Agent for your customer operations that helps you understand, manage, and improve your entire customer experience. I’ve spent years shipping AI-driven products at production scale, and this one reflects the lessons I’ve learned the hard way about what it really takes to go from a flashy demo to a dependable system your team trusts.

    To give you a clear view of just how powerful this Agent is, I want to share the technical infrastructure and engineering choices that make Operator work reliably at production scale across thousands of customer workspaces. My goal is to demystify the gap between a well-prompted LLM and a true, production-grade Agent—so you can make an informed build vs. buy decision.

    If you’re a technical leader evaluating whether to build something like this yourself, or trying to understand the difference between a well-prompted LLM and a production Agent system, this is for you.

    Escaping the “it’s just an LLM” trap

    Most engineering teams in this space start the same way: a prototype. You take a foundation model, give it API access to your support data, add a system prompt with some domain context, and you’ve got something that queries your database, summarizes tickets, and generates reports that look right. It demos convincingly—and I’ve been there, impressed in the moment, only to watch it buckle under real-world complexity.

    The problem with that prototype is that it obscures the scope of what’s actually required. It demonstrates the 10% of the system that’s straightforward to build, and it’s easy to assume the rest is just as straightforward. It isn’t. The gap between a working demo and a production system your team depends on daily is where most of the engineering investment lives. That’s precisely the gap we focused on closing.

    With Operator, we’ve invested deeply in every layer: tooling, reasoning, how the Agent takes action, and the infrastructure that makes it reliable at scale. Here’s a closer look at the architecture and why it matters for agentic AI, platform scalability, and observability.

    The tooling layer

    The first thing we had to confront was that the obvious approach (giving a model access to your APIs and letting it figure things out) doesn’t hold up in production. The model makes reasonable decisions for simple queries, but operating across thousands of customer workspaces with different configurations, data models, and usage patterns, a “figure it out” approach isn’t nearly precise enough.

    What you need is purpose-built tooling: tools that encode decisions about what data to fetch, how to structure it, what context to include, and what to leave out. Operator has over 50 of these tools and 10 skills.

    A tool is a single action that Operator takes (search content, run a query, look up a conversation). A skill chains multiple tools together to complete a whole job, like debugging a conversation end-to-end, rolling out a content update across an entire help center, and identifying the next automation opportunity. This is where AI workflows move from abstract prompts to dependable, repeatable outcomes.

    The difference between using thin wrappers around API endpoints and purpose-built tooling shows up in something as seemingly simple as a performance question. When you ask “how did Fin perform last week?”, a naive implementation runs a query and hands back a table. Operator runs a reporting tool that determines which metrics are relevant for your specific workspace, which are meaningful for your particular question, and what the numbers actually mean in context, giving you a much richer answer that you can do something tangible with.

    Developing that behavior took months of engineering. Not because any individual piece is conceptually hard, but because getting it right across the full range of customer workspaces, configurations, and edge cases is an iterative process. You build it, you test it against real conversations, you find the cases where it breaks, you fix those, and you repeat. There’s no shortcut—and in practice, this is where most DIY efforts stall.

    The intelligence layer

    The tooling layer solves what to do, but beneath it is a harder problem: understanding what’s worth doing, and why. This is the layer that makes Operator understand your business rather than just query it. Three components go into it, and in my experience they’re non-negotiable for a reliable Agent.

    1. Semantic search

    Unlike solutions that rely on keyword matching, Operator uses a system that understands what content is about, not just what words it contains. When it searches your help center, it’s using the same semantic search engine we’ve spent years optimizing for Fin itself. This is a retrieval system that’s been tuned against millions of real support conversations, with precision and recall characteristics we’ve measured and improved continuously. This retrieval-first pipeline is the backbone of grounding and dramatically reduces hallucinations.

    2. Attribute awareness

    Operator has access to your data and knows what is meaningful for different questions. It knows which metrics are actually in use in your workspace, which custom attributes carry signals, and which fields are populated versus effectively empty. We’ve built specific skills that give Operator this meta-knowledge, so when it’s investigating a performance question, it’s looking at the right things, not hallucinating insights from sparse data.

    3. Intelligent reasoning

    A well-built Agent can answer your question and anticipate what you should ask next. If you ask Operator about escalations spiking, it doesn’t just say, “escalations increased 23% week-over-week.” It’ll continue on to tell you why this happened by examining the escalated conversations and identifying that a disproportionate number involved a specific product area, before moving on to check whether the relevant help content is up to date, and, if it isn’t, proposing an update. That chain of reasoning isn’t prompt engineering. It’s encoded in the skills we’ve built, refined against the patterns we see across our entire customer base.

    The action layer

    This is where the engineering complexity increases by an order of magnitude because instead of just analyzing problems and recommending solutions, Operator takes action to solve them itself. It can update Guidance rules, draft and publish help articles, create Procedures, configure data connectors, and modify your Fin configuration. Moving from read-only insights to write-capable actions is a fundamentally different class of product and infrastructure problem—one that demands rigorous SRE practices and rock-solid safeguards.

    Every one of these actions has to be safe, reversible, and auditable. An analytics tool that occasionally returns a wrong number is frustrating. but an Agent that occasionally applies a wrong configuration change to a live support system is a different category of problem. To prevent this, we built a robust proposal system, whereby every change Operator suggests is presented as a reviewable diff. You see exactly what will change before anything is applied, with the option to accept, reject, or refine. Nothing goes live without your explicit approval.

    What else sets Operator apart

    A UI that’s both conversational and graphical, not one or the other. Operator blends conversational interaction with purpose-built graphical components. Proposal diffs show exactly what will change in an article. Inline charts visualize performance trends. Dashboards render directly inside the conversation thread. In practice, that means a knowledge manager reviews a structured diff—not a wall of LLM-generated text—and a team lead asking about weekly performance gets an accurate chart with context, not a paragraph approximating data.

    Building this hybrid experience is extremely difficult outside of a native platform integration. In a chat interface or CLI, you’re limited to text output; in a standalone dashboard, you lose conversational context. Operator does both in the same thread, so every interaction is detailed and context-rich—and importantly, actionable in the flow of work.

    It lives where your team already works. Operator is built into the same platform your team uses every day. It’s not a separate tool with a separate login, nor is it a Slack bot your engineer set up that only three people know about. It operates exactly where you are, alongside the conversations, help center articles, workflows, and data you’re working with. That tight integration closes the gap between finding a problem and fixing it: spot an outdated article while reviewing a Fin conversation, and Operator can surface the fix in the same session. Notice an escalation spike in the morning, and you can ask Operator to investigate without switching tools, waiting for a data pull, or filing a ticket.

    The compounding advantage

    Every customer using Operator teaches us something. We see which debugging approaches work across different types of support operations, learn which content structures perform better, and identify automation strategies that consistently land. Those patterns get encoded back into Operator’s skills and tools. When we discover that a particular sequence of investigation steps reliably identifies the root cause of a spike in escalations, we build that into Operator’s diagnostic skill. When we find that a specific way of structuring help articles leads to higher Fin resolution rates, we encode that into the content creation skill. Our engineering team is continuously shipping improvements based on what we observe across the entire customer base.

    A custom-built solution gives you exactly what you built, meaning it doesn’t get smarter unless you invest engineering resources into making it smarter. And that usually means taking time and talent away from your core product. I’ve watched teams underestimate the ongoing cost of eval-driven development, model upgrades, and API churn—costs that only grow as your footprint expands.

    We’re not locking the door

    Some teams want to build their own Agents. Some of our most technical customers do this. But when you do, you’re working with raw APIs and building your own tooling on top of them. When you use Operator, you’re working with a system that already knows what questions to ask, understands your data, and encodes the best practices we’ve learned from thousands of support teams. We recently launched the Fin CLI, which means you can use third-party agents like Claude Code or Cursor to interact with your Fin data and configuration. That door is open. What I hope this post has clarified is everything that goes into the build of Operator: Over 50 tools and 10 skills, purpose-built for support operations. Years of investment in semantic search. Deep integration with every layer of Fin’s stack. The proposal system. The intelligence layer. The reliability infrastructure.

    If you’d still like to move ahead with building a custom solution, here’s an honest assessment. You can build a useful read-only tool in weeks. It’ll query your data, summarize tickets, and generate reports, but turning it into a production system will take quarters. Reliability, security, edge case handling, multi-tenant data isolation, and graceful degradation are all important architectural decisions that you’ll need to get right from the start. The action layer is also where you might risk stalling out. Going from “here’s what’s wrong” to safely making changes in a production system is a fundamentally different engineering problem than analysis. Most DIY projects never get there. Finally, you’ll be maintaining it forever. Every model upgrade, API change, and new capability in your support platform means updating your custom tooling. We have a team dedicated to this. You’ll need one too.

    The economics still favor buying when a vendor has invested more in the problem than you can justify internally. What I hope this post adds is a clearer picture of what that investment actually looks like from an engineering perspective—and why it compounds into a durable advantage for your support organization.

    The investment is ongoing. The problems we’re solving at the infrastructure level today are harder than the ones we solved a year ago, and that trajectory isn’t slowing down. If you’re ready to see the difference a production-grade Agent can make, explore Operator.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • Beyond the Product Builder Hype: How AI, org design, and joy shape PM success

    Beyond the Product Builder Hype: How AI, org design, and joy shape PM success

    I recently spent time with the debate behind the "product builder" trend—asking whether it’s the future of product management or just another wave of tech FOMO. The conversation featuring Teresa Torres and Petra Wille is a useful prompt, but what matters most is how we translate these ideas into healthy product practices inside our own organizations.

    Here’s my take: the product builder movement is neither a mandate nor a fad—it’s a tool. The right question isn’t "should product managers code?" but whether leaning into building advances outcomes for our customers and our teams. In practice, that means letting interest and skill—not pressure—set the pace.

    Petra captured it perfectly: "Just because I can do it — is it something I enjoy doing? And do I have enough experience to really get into the flow?" Those two tests—joy and depth—are underrated filters. I’ve seen PMs light up when prototyping or vibe coding a thin slice, and I’ve also seen well-meaning dabbling create hidden complexity that slows everyone down later.

    Org design determines whether this works. It’s not about the tools—it’s about clarity of roles, healthy interfaces between product, design, and engineering, and explicit guardrails for where experiments stop and production begins. AI has raised the stakes: "AI can make unskilled work look polished. That’s a feature and a bug — executives see the shine, engineers inherit the mess." If you’ve ever watched a glossy demo turn into weeks of refactors, you know exactly what this looks like.

    To avoid that trap, I deliberately separate the three layers where AI is changing product work: personal productivity, team process, and product strategy. Treating these as different stacks keeps expectations clean: a prompt that accelerates personal workflows isn’t the same as an AI-enhanced process that reshapes delivery, and neither automatically produces durable product advantage. Don’t conflate them.

    Discovery remains stubbornly human. "Why discovery still requires talking to your customers (sorry)" is more than a friendly nudge. AI can broaden our search space and sharpen analysis, but it doesn’t replace qualitative conversations or the judgment that comes from pattern recognition across real customer contexts. Continuous discovery and disciplined customer interviews are still the most reliable compasses we have.

    Where does "vibe coding" fit? It’s great for roughing out concepts, de-risking slices, and communicating intent when words or static mocks won’t cut it. Tools like Claude Code make this faster than ever, and familiar stacks like Ruby on Rails lower the bar for spinning up functional prototypes. But remember the design system trap: AI can make bad decisions look good on the surface. If you don’t control for architecture, accessibility, data contracts, and handoff quality, your team pays the integration tax later.

    In well-set-up orgs, the output-oriented muscle memory gets rewired. When AI frees up time, strong teams reinvest it into better problem framing, sharper opportunity solution trees, and tighter product strategy—rather than simply chasing more output. That’s a leadership challenge, not a tooling problem, and it shows up quickly in how teams make trade-offs.

    Here’s how I operationalize this with empowered product teams: we articulate clear boundaries for prototypes versus shippable code, define decision rights for when PMs or designers "build," and align on review gates that protect quality without stifling speed. We also make the three AI layers explicit in roadmapping and retros, so improvements to personal workflows don’t get mistaken for strategic advantage.

    My distilled guidance echoes the episode’s throughline. The product builder trend isn’t a mandate — it’s a tool. Let enjoyment and skill guide who on your team leans into it. Organizational readiness determines whether AI empowers your team or creates chaos. Don’t conflate personal efficiency, process change, and product impact—they require different responses. Discovery fundamentals haven’t changed; AI helps you go deeper, not skip the work. And the real takeaway on product builders: not everyone has to build, but everyone can if they want to.

    If you want to hear the full discussion that sparked these reflections, listen on Spotify or Apple Podcasts. Then tell me: where will you apply builder energy in your team—and where will you deliberately say no?

    Resources & Links: Follow Teresa Torres: https://ProductTalk.org. Follow Petra Wille: https://Petra-Wille.com. Mentioned in this episode: Claude Code, Vibe coding, Ruby on Rails.

    One more quote I loved because it centers autonomy and craft: "It’s a tool in our toolbox. We can decide who on our team has fun with it, wants to do it, wants to contribute." That’s the mindset that sustains both momentum and morale.


    Inspired by this post on Product Talk.


    Book a consult png image
  • Why We Built AI-Powered FinOps In‑House—and Beat Off‑the‑Shelf Tools in Under a Year

    Why We Built AI-Powered FinOps In‑House—and Beat Off‑the‑Shelf Tools in Under a Year

    When our cloud costs started outpacing growth, I knew we had to make a decisive call on “build vs buy.” Buying a FinOps platform would have been faster on paper, but it wouldn’t internalize our operational nuance. Building an agentic AI layer on top of our cost, telemetry, and product usage data promised not just dashboards—but compounding leverage. Less than a year later, our homegrown approach outperformed off‑the‑shelf alternatives on speed, precision, and organizational adoption.

    The aspiration was clear from the outset: See how Amplitude scaled FinOps with AI agents—cutting manual work, accelerating insights, and turning a one-person function into a cost optimization engine. We set that as a bar for both outcomes and operating cadence, then translated it into a roadmap grounded in first principles.

    Our build vs buy analysis hinged on three factors. First, cloud cost optimization is only as good as the context it carries; we needed deep hooks into our pricing, feature flags, and deployment frequency to reason about unit economics in real time. Second, we required agentic AI workflows that could detect anomalies, recommend actions, and close the loop—not just visualize waste. Third, governance mattered: privacy‑by‑design, data governance controls, and transparent decision logs were non‑negotiable under our AI Strategy and product management leadership standards.

    We architected a retrieval‑first pipeline to blend billing exports, usage telemetry, and observability signals with product and GTM metadata. Agent workflows ran on top: one agent built driver trees that explained spend shifts by service, customer cohort, and environment; another specialized in anomaly detection with confidence scoring; a third agent proposed commitment strategy, rightsizing, and schedule adjustments. Each recommendation linked back to source data for auditability.

    From a delivery standpoint, we treated the system like a product, not a tool. A product trio (PM, engineering, and FinOps) ran continuous discovery interviews with stakeholders, instrumented eval‑driven development for agent prompts, and shipped improvements via CI/CD weekly. We optimized prompt engineering for decision clarity over verbosity and codified acceptance criteria: time‑to‑insight, actionability, and measurable savings per recommendation.

    The impact was immediate and then compounding. Manual effort on month‑end analysis shrank as agents pre‑triaged drift and surfaced root causes with suggested remediations. Insights arrived continuously, not as end‑of‑month surprises, which meant engineering could fold changes into regular sprints. What started as a one‑person FinOps function evolved into a cost optimization engine embedded across teams—product, SRE, and finance—all speaking a shared language of drivers, tradeoffs, and outcomes.

    Along the way, we learned where building truly beats buying. If your architecture, pricing model, and growth loops are unique—and they usually are in consumption SaaS—agentic AI amplifies institutional knowledge in a way generic platforms can’t. Conversely, if you lack clean tagging, clear ownership, or basic observability, investing there first will raise ROI on any approach, built or bought.

    My advice if you’re at this crossroads: define success in terms of decisions changed, not reports shipped. Start with a thin slice—anomaly detection plus one high‑leverage remediation path—then iterate. Keep humans in the loop for executive sign‑off until your confidence intervals and post‑action telemetry prove reliability. With the right guardrails and focus, in‑house AI FinOps can move faster than the market and pay for itself well within a year.


    Inspired by this post on Amplitude – Perspectives.


    Book a consult png image
  • Pretotyping vs. Prototyping: How I Validate Ideas Fast and Build Products Customers Love

    Pretotyping vs. Prototyping: How I Validate Ideas Fast and Build Products Customers Love

    I learned early in my career that beautiful prototypes don’t save you when you’re solving the wrong problem. What does save you is separating market risk from solution risk and choosing the fastest, lowest-cost way to get evidence. That’s why I rely on pretotyping to test demand in days and prototyping to refine usability and feasibility once I see a strong signal. The result: faster cycles, fewer wasted sprints, and products customers genuinely want.

    Pretotyping vs. prototyping explained: differences, benefits, examples, and when to use each approach to validate ideas before you build.

    Here’s how I define the two in practice. Pretotyping answers, “Should we build this at all?” Its goal is to validate real user intent and behavior with the lightest-weight artifact possible—often before any code. Think painted-door (fake door) experiments, Wizard-of-Oz flows powered by humans behind the scenes, concierge tests, landing-page smoke tests with waitlists or preorders, and simple A/B testing to gauge click-through intent. It optimizes for time-to-signal and cost-to-learn.

    Prototyping answers, “Can we build this well?” and “How should it work?” Once demand is evidenced, I prototype to de-risk solution details: usability, architecture, performance, and integration. This might include interactive UI models, high-fidelity flows, technical spikes, or service stubs. Here, I optimize for learning about user experience and technical feasibility without fully committing to production.

    When should you use each? If your biggest unknown is market risk—whether customers care at all—start with pretotyping. If your biggest unknown is solution risk—how to deliver an experience that’s usable, reliable, and scalable—move to prototyping. In other words, validate the “right thing” before you perfect the “thing right.”

    My decision rule is simple: identify the dominant risk, then pick the smallest experiment that can credibly invalidate it. For market risk, I look for evidence of behavior, not opinions: clicks on a painted door, signups on a landing page, willingness to pay (deposits, preorders), or sustained repeat usage in a Wizard-of-Oz flow. For solution risk, I look for task completion, time-on-task, error rates, and qualitative friction from usability sessions with a realistic prototype.

    Concrete examples from recent work help illustrate the difference. When exploring a new analytics insight, I shipped a fake door inside our product nav; a simple tooltip explained the concept and captured interest. Click-through rate, conversion to a short explainer, and waitlist signups told me whether the value proposition resonated before building anything. For a complex AI-assisted workflow, I ran a Wizard-of-Oz experiment: users experienced the end-to-end flow while our team manually handled the “AI” behind the curtain. That gave us real engagement data and edge cases to inform the prototype and later the MVP.

    Metrics matter. I set a clear hypothesis with a guardrail on sample size and a minimum detectable effect I’d consider actionable. For pretotyping, I focus on time-to-first-signal, intent conversion (CTR to interest, interest to signup), cost-per-qualified-lead, and evidence of willingness to pay. For prototyping, I prioritize task success rates, usability severity findings, and qualitative insights that materially change the design or technical approach. Above all, I avoid vanity metrics and anchor decisions to outcomes, not output.

    My repeatable playbook looks like this: (1) Frame the problem and value proposition in one crisp sentence. (2) Choose the leanest pretotyping method that can reveal real behavior. (3) Define success metrics and a decision rule before you run the test. (4) Launch quickly, instrument well, and let the data run long enough to be credible. (5) If demand is strong, promote to a prototype to refine UX and de-risk technicals; if not, iterate the proposition or stop. This keeps product discovery continuous and ensures roadmapping and sprint planning are guided by evidence.

    There are ethical guardrails I never skip. Painted doors must set correct expectations once clicked; waitlists or learn-more pages are honest and respectful. For Wizard-of-Oz and concierge tests, I’m explicit about data handling and provide timely follow-up. Trust compounds when experiments are transparent and user time is valued.

    Tooling can accelerate the cycle without diluting rigor. I often use lightweight design systems and no-code automations to stitch together realistic flows, and I’ll leverage gen AI for product prototyping to generate copy, microinteractions, or data scaffolding. But the principle remains: don’t over-invest until evidence earns the investment. Empowered product teams thrive when they optimize for learning velocity, not feature velocity.

    If you’ve ever felt the tension between shipping fast and shipping right, this approach resolves it. Pretotype to prove the market; prototype to perfect the solution. Do that consistently and you’ll spend more time delivering outcomes customers value—and far less time debating outputs.


    Inspired by this post on Product School.


    Book a consult png image
  • Unleashing Inbound Sales with AI: My Playbook for Launching and Scaling Sales Agents Fast

    Unleashing Inbound Sales with AI: My Playbook for Launching and Scaling Sales Agents Fast

    Inbound leads shouldn’t wait for a rep’s calendar. When we first launched The Service Agent Blueprint, support leaders finally had a clear AI path. Go-to-market and revenue teams are now facing similar uncertainty, so I’m introducing The Sales Agent Blueprint—a practical map for launching and scaling AI for sales with confidence.

    For most sales teams, inbound motions require a lot of manual work. I’ve watched leads pile up in queues, waiting for availability rather than being prioritized by buyer intent. That delay costs meetings, pipeline, and momentum—and it’s exactly where a modern AI Strategy can transform your go-to-market strategy.

    Agents can run sales conversations end to end – engaging buyers, qualifying leads, and routing high-intent opportunities to the right team to move prospective buyers forward quickly. Humans will still be involved, but will move their focus to the consultative conversations and higher-value work they did not have time to focus on before. In practice, this shift enables cleaner AI workflows, better conversation design, and a healthier balance between sales-led growth and product-led growth.

    The questions many go-to-market and revenue leaders are facing now are where do you start? What should success look like? How do you actually test and deploy these solutions? These are the right questions—and the ones I hear most often when teams weigh build vs buy decisions, evaluation frameworks, and CRM integration nuances.

    The Sales Agent Blueprint answers those questions. It’s designed to be a strategic guide for sales, revenue, and AI transformation leaders who want to deploy AI for inbound sales fast, prove value, and build momentum. If you’re aiming for eval-driven development, this will help you define success up front and operationalize it.

    What’s inside is simple by design yet deep enough to take you from zero to value. The Sales Agent Blueprint is structured around two tracks that reflect how high-performing teams adopt agentic AI: first, launch for quick wins; next, scale for durable growth.

    Minimal blue banner for Introducing the Sales Agent Blueprint with a bold 'Scale it' headline, abstract halftone device graphic, subtle crop marks, and a 'Coming Soon' badge in the upper-right corner.
    Coming soon: Sales Agent Blueprint. A sleek, blueprint-inspired teaser with the call to 'Scale it' signals tools, playbooks, and workflows to grow revenue, streamline operations, and scale teams with confidence.

    Today, I’m releasing the first part of the Blueprint: “Launch it.” It’s a practical guide for getting your Agent live and seeing real results. You’ll learn how to deploy a Sales Agent that runs inbound sales conversations end to end, engaging buyers, qualifying leads, and routing high-intent opportunities to the right outcome in real time—without disrupting your current CRM integration or pipeline processes.

    By the end of the “Launch it” track, you’ll be ready to execute with clarity. Here’s how I frame the essential steps, based on what consistently works in the field.

    Understand what a Sales Agent is: Discover why they’re different from chatbots and how they work. Build a business case: Prove the basic economics of AI, decide whether to buy or build, and get the buy-in and budget you need to move forward.

    Evaluate an Agent: Learn how to define success, choose the right evaluation criteria, and run a focused, high-impact assessment with our five-step framework.

    Deploy with confidence: Build a deployment plan that gets your Agent live quickly to engage buyers at peak intent. Learn what to expect at each stage.

    Vector-style 'Blueprint' title on a light grid with Bézier points, plus a royal-blue panel reading '1 Launch it' next to a satellite icon; footer shows FIN.AI/BLUEPRINT/SALES promoting the Sales Agent Blueprint.
    Introducing the Sales Agent Blueprint. This crisp, grid-based graphic spotlights step 1—Launch it—signaling day-one activation for an AI sales agent. Explore the framework and get started at fin.ai/blueprint/sales.

    Continuously improve performance: After launch, your Agent becomes a system to manage. We’ll show you how to implement a repeatable process to train, test, deploy, and optimize.

    The second track, “Scale it” (coming soon), focuses on the organizational and systems design work that unlocks compounding gains. Launching AI is only the beginning. To unlock its full potential, you need to rewire your inbound sales motion—redesigning the buyer journey, building AI-first systems and ownership models, and rethinking how pipeline is generated and scaled. This is where governance, measurement, and team roles evolve to support sustainable growth.

    I’ll be building this Blueprint in public as I navigate the same challenges—sharing what works, what to avoid, and how to accelerate time-to-value without sacrificing quality or trust. If you’re ready to turn intent into revenue with agentic AI, this is your head start.

    The Sales Agent Blueprint is live now. Explore the full guide at fin.ai/blueprint/sales and start your “Launch it” sprint today.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • Build vs. Buy for Churn Prediction: My Proven Playbook for Faster Retention and ROI

    Build vs. Buy for Churn Prediction: My Proven Playbook for Faster Retention and ROI

    Churn is the silent tax on growth, and I treat churn prediction as a core product capability—not a side project. Over the years, I’ve led teams through multiple implementations across different data maturities and go-to-market motions, and the same question keeps returning at kickoff: what’s the smartest path to impact now and defensibility later?

    “Should you build or buy your churn prediction model?” The right answer depends on time-to-value, data readiness, available talent, and whether churn prediction is a true differentiator for your product strategy or simply a must-have capability to power customer success and product-led growth.

    When speed and coverage matter most, I start by evaluating category platforms that pair behavioral analytics with activation. As one example, vendors emphasize immediate business outcomes such as integrations, in-app guides, and workflow triggers that help you act on risk signals fast—without waiting months for model training or data engineering.

    Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.

    Buying makes sense when you need rapid time-to-value, opinionated best practices, and a unified analytics platform to operationalize insights through product tours, in-app guides, and CRM integration. In these cases, I’m optimizing for coverage, consistent signal quality, and ease of activation for customer success—so the team can focus on interventions, not infrastructure.

    Building is compelling when churn prediction is a source of competitive differentiation or you have proprietary signals others can’t access. If your product generates unique behavioral data, requires custom anomaly detection or explainability constraints, or must blend usage telemetry with domain-specific risk scoring, a tailored model can raise precision and unlock novel retention levers.

    My hybrid approach has become a reliable playbook: buy first to establish a strong baseline and close the activation loop, then selectively build where proprietary data and context yield outsized gains. I use retention analysis to identify high-signal behaviors, then iterate with A/B testing and a clear minimum detectable effect (MDE) to validate uplift before committing engineering capacity.

    Total cost of ownership is non-negotiable. I account for more than license or training costs: ongoing data engineering, feature pipeline maintenance, model monitoring for drift, and AI risk management all add up. Strong data governance, privacy-by-design, and regulatory compliance must be baked in—whether I build, buy, or blend both.

    Activation determines real ROI. Predictions that don’t flow into customer success workflows, lifecycle messaging, or in-product nudges rarely move Net Recurring Revenue (NRR). I prioritize tight integrations that enable targeted experiments—journey mapping, contextual tooltips, and timely outreach—to reduce friction and increase user engagement at the moments that matter.

    My quick decision test: buy if time-to-value and adoption are the immediate goals; build if proprietary signals and explainability are core strategic assets; blend if you want fast wins now with room to differentiate later. Answering the build vs. buy question through this lens consistently improves retention, accelerates product-led growth, and keeps teams focused on the customer experience rather than plumbing.


    Inspired by this post on Pendo – Perspectives.


    Book a consult png image
  • Why MCP Is Transforming Product Management: Field-Tested Lessons from Miro, Atlassian & More

    MCP is the acronym I keep hearing in every product conversation—and for good reason. When teams like Miro and Atlassian lean in, it signals a real shift in how we design, ship, and scale value. From my vantage point leading product at HighLevel, I see MCP less as a feature and more as an operating advantage: a way to align strategy, execution, and governance so product teams move faster with higher confidence.

    When I evaluate a platform like MCP, I start with three questions. First, does it advance our product strategy and sharpen competitive differentiation? Second, does it strengthen product-led growth by improving activation, onboarding, and retention? Third, does it help us drive outcomes vs output OKRs so we consistently measure what matters, not just what ships?

    Execution discipline makes or breaks any MCP investment. I design measurement upfront: instrument A/B testing, define activation milestones, and monitor retention cohorts. In parallel, I use Pendo for in-app guides and product tours to accelerate adoption and reduce time-to-value, then connect this data back to roadmap decisions so each release compounds learning instead of creating noise.

    On the operating model, I apply a rigorous build vs buy lens and stress-test platform scalability, reliability, and integration surfaces. Stakeholder management is critical—security, SRE, and solutions engineering must be partners from day one. I anchor teams in product trios and continuous discovery so we learn with customers in the loop, not after the fact.

    At Pendomonium 2026, Pendo CPO Rahul Jain brought together four product leaders who are building with MCP. Read or watch their conversation to learn more.

    My practical playbook for MCP: choose one high-signal use case, define clear success metrics, and run a tightly scoped pilot with visible executive sponsorship. Treat governance and data hygiene as first-class requirements. Close the loop weekly with qualitative insights from customer interviews and quantitative telemetry from experiments. Only then scale to adjacent workflows, keeping a steady focus on measurable customer value and repeatable delivery.

    Whether you’re an emerging startup or an established enterprise, the opportunity is the same: turn MCP curiosity into durable capability. With disciplined measurement, thoughtful stakeholder alignment, and a relentless outcomes mindset, MCP can become a lever for product management leadership—not just another acronym in the stack.


    Inspired by this post on Pendo – Best Practices.


    Book a consult png image
  • Commercial vs. Internal Products: Hard Truths, High Leverage, and How I Make the Call

    Commercial vs. Internal Products: Hard Truths, High Leverage, and How I Make the Call

    Internal Products Are Hard; Commercial Products Are Harder. That line captures years of hard-won lessons from leading both internal platforms and market-facing SaaS at HighLevel. I’ve seen how the two demand different muscles—even when the tech stack, talent, and timelines look the same on paper.

    When I talk about internal products, I mean services and solutions that our own employees use to take care of customers—customer-enabling tools and services, agent consoles, fulfillment and billing workflows, operations dashboards, and the underlying platforms that keep them fast, compliant, and resilient. These tools don’t generate revenue directly, but they quietly determine customer experience, gross margin, and how quickly we can ship, resolve issues, and scale.

    Commercial products, by contrast, add a second challenge layer. Beyond discovery, usability, and reliability, we must conquer positioning, pricing and packaging, competitive differentiation, sales enablement, procurement hurdles, and ongoing customer success motion. The surface area for failure is bigger, and the time-to-signal on product-market fit is slower and noisier.

    Here’s how I decide where to invest. First, I anchor on outcomes, not output. If the business priority is net revenue retention, faster onboarding, or reduced cost-to-serve, internal products often provide the highest-leverage path. If the priority is new revenue, new market entry, or a must-have differentiator, we lean commercial. I make the trade explicit in outcomes vs output OKRs so we can defend the decision when pressure mounts.

    Second, I run a clear build vs buy calculus. For internal needs, the default is buy if a mature, configurable solution exists that meets our security, data governance, and integration requirements. I only build when the workflow is core to our differentiation, the TCO of customization is lower than vendor sprawl, or we can capture unique proprietary advantage. For commercial products, I avoid embedding third-party IP in a way that caps differentiation or compresses margins as we scale.

    Third, I insist on continuous discovery. Internal audiences are not a captive market—they’re discerning experts with real jobs to do. I treat them like customers, with structured customer interviews, journey mapping, and opportunity solution trees. I rely on empowered product teams and product trios to validate problems and reduce solution risk before we commit engineering time.

    Fourth, I frame commercial vs internal work with capacity guardrails. In most planning cycles, I reserve explicit allocation for platform scalability and internal tooling, separate from feature bets. Without this, internal products become backlog filler, which guarantees we’ll pay the interest later in churn, SLA breaches, and slower delivery.

    Execution differs too. For internal products, change management is the make-or-break. I plan enablement as a first-class deliverable: clear rollouts, in-app guides, training, and feedback loops with frontline champions. I track adoption, time-to-resolution, error rate, and satisfaction for internal users with the same rigor we apply to external users.

    For commercial products, I design the discovery-to-GTM handshake early. Pricing and packaging must reflect value drivers discovered in research, not what’s easiest to meter. Sales and solutions engineering need crisp narratives, objection handling, and proof points. Customer success needs activation plans and health signals tied directly to leading indicators of retention.

    Across both, I instrument the product and process. I lean on feature flags and progressive delivery to manage risk, and I protect SLOs with error budgets so teams balance reliability with iteration speed. CI/CD isn’t a badge—it’s how we earn the right to ship continuously without eroding trust.

    Common pitfalls recur. Teams skip UX for employee tools because “they have to use it”—which backfires as shadow workflows and rework. Leaders underfund internal platforms, then wonder why velocity stalls. On the commercial side, teams over-index on features and under-invest in positioning and onboarding, leading to poor activation and elongated sales cycles.

    What’s the payoff? When we treat internal products as products, we unlock scale: shorter handling times, fewer escalations, clearer accountability, and higher customer satisfaction. When we approach commercial products with the same discovery rigor plus smart GTM, we compress time-to-value and amplify differentiation. The craft is knowing which lever to pull when—and having the discipline to measure what matters.

    My rule of thumb is simple. If the goal is operational excellence that compounds across the entire customer journey, invest in internal products with the same intensity you reserve for revenue-generating features. If the goal is market expansion or category leadership, invest in commercial products with a tight discovery-to-GTM loop. In either case, clarity of outcomes, disciplined discovery, and empowered teams win the day.


    Inspired by this post on SVPG.


    Book a consult png image
  • How We Built PR Review Bots In‑House for a Fraction of the Cost—and How You Can Too

    How We Built PR Review Bots In‑House for a Fraction of the Cost—and How You Can Too

    PR review bots are all the rage, but they cost a premium. We built our own for cheap that work just as well, if not better. Here's how.

    As a VP of Product Management, I care deeply about the velocity and quality of our software delivery. The decision to build our own pull request (PR) review agents came from a simple calculus: we needed tighter control over developer experience, CI/CD integration, and cost—without sacrificing accuracy or reliability. The result was a pragmatic system that accelerates reviews, improves code quality, and pays for itself through faster feedback loops.

    Before we wrote a line of code, we defined success. Our objectives were to shorten review cycles, reduce back-and-forth on style and test coverage, and surface risks earlier—measured against DORA metrics like lead time and deployment frequency. That focus aligned the team, guided our build vs buy decision, and anchored scope to the highest-impact use cases.

    We started rules-first, AI-optional. The initial release enforced guardrails that are universally valuable: linting and formatting checks, required test coverage thresholds, commit message standards, ownership validation (CODEOWNERS), and basic security scans. These automated gates eliminated predictable review friction, freeing engineers to focus on logic and architecture rather than style debates.

    Then we layered intelligence where it mattered. We added lightweight, explainable checks for common code smells and dependency risks, plus optional natural-language summaries that turn large diffs into concise context. Where appropriate, we introduced agentic AI workflows to triage PRs by risk, draft review comments, and suggest missing tests—always keeping humans in the loop. This hybrid approach kept costs low and outcomes high.

    Integration with our CI/CD pipeline was non-negotiable. We wired GitHub/GitLab webhooks to a stateless service that queued work, executed checks in containerized workers, and posted results back as status checks and review comments. Caching, parallelization, and smart diff-scoping ensured we only computed what changed, keeping the experience snappy even on large repos.

    Adoption hinged on developer experience. We made the bot’s feedback fast, specific, and actionable, with clear remediation steps and links to documentation. Feature flags allowed teams to opt into new checks gradually. ChatOps commands enabled quick overrides for emergencies, while policy-as-code kept rules visible, versioned, and auditable.

    We treated this like any product: eval-driven development for accuracy, ongoing telemetry for false-positive rates, and explicit SLAs for response times. We instrumented outcomes end-to-end—tracking PR cycle time, comment-to-merge ratios, and rework—so we could prove the ROI and tune the system without guesswork.

    The outcome: a reliable PR review companion that runs on a shoestring budget, integrates cleanly with our workflows, and measurably improves engineering throughput. If you’re weighing build vs buy, start small with rules that deliver immediate value, then layer intelligence where it earns its keep. With a clear product strategy, you can stand up capable PR review bots quickly—and scale them as your needs grow.

    If you’re ready to try this yourself, begin with your top three friction points in code reviews, wire them into your CI/CD checks, and pilot with a single team. Iterate weekly, measure relentlessly, and let your developers be your strongest signal. You’ll be surprised how far a pragmatic, product-led approach can take you.


    Inspired by this post on Amplitude – Perspectives.


    Book a consult png image
  • Apex Arrives: Vertical AI That Beats GPT-5.4 on Customer Service Speed, Accuracy, and Cost

    Apex Arrives: Vertical AI That Beats GPT-5.4 on Customer Service Speed, Accuracy, and Cost

    I just watched one of the most significant leaps in customer service AI in years. Last week, a quiet but seismic release landed in CX: Fin introduced Apex, a vertical model purpose-built for support that raises the bar on speed, accuracy, and cost. As a product leader, this is exactly the kind of breakthrough that changes roadmaps, vendor strategies, and what customers can expect from modern service operations.

    It’s a brand new model for Fin called Apex, and it’s objectively the highest performing, fastest, and cheapest model for customer service. It beats the very best models in the industry including GPT-5.4 and Opus 4.5.

    In this analysis, I’ll unpack why the launch matters for the customer service agent category, what it signals for frontier labs and open‑weight ecosystems, and how leaders should rethink their AI Strategy, build vs buy decisions, and eval-driven development roadmaps.

    Fin was already the highest performing and most sophisticated agent in the customer service space, consistently beating impressive competitors like Decagon and Sierra at an average win rate in the 70s. It operates at tremendous scale, now resolving almost 2M customer issues per week, a number that’s growing at an exponential clip. In its short life it’s grown to nearly $100M in recurring revenue.

    As of last week, ~100% of all (English language, chat and email) customer conversations are now running on Apex. Since day 1, the Fin engine has comprised a system of models, and last year the team began replacing off‑the‑shelf models with custom ones trained on proprietary data. The core answering model had been a frontier labs offering—initially versions of GPT and more recently Sonnet 4.0. Now, that core answering model is Apex 1.0.

    This model resolves customer issues at a materially higher rate than any other model available. One of their largest customers in the gaming space saw the resolution rate improve overnight from 68% to 75% (i.e. a reduction in unresolved conversations of 22%). The team notes they had never seen a jump this large from a single improvement since they started Fin.

    Just as important, it’s dramatically faster, has fewer hallucinations, and is far cheaper than other available models—exactly the attributes operations leaders weigh most when deploying agents at scale. In practice, these are the levers that unlock higher CSAT, tighter SLAs, and better unit economics.

    Achieving all three simultaneously is extraordinarily hard. Credit goes to foundational research from a 60‑person AI group run by Fergal Reid, and, crucially, to domain‑specific proprietary evals drawn from billions of human and agent interactions produced by the Fin resolution engine—already hand‑tuned to be the most effective in the category. That creates a flywheel: an eval‑driven development loop that trains models to keep improving at the edge of the system’s abilities. In other words, Apex 1.0 looks like the tip of the iceberg.

    Zooming out, service is one of the few categories where generative AI has already delivered commercial impact at scale (alongside coding, and arguably the legal industry). With TAMs measured in the hundreds of billions, competition is intense and well capitalized. The pattern I’ve seen repeatedly is clear: winners in these spaces must become full‑stack AI companies. As features become ~free to build, durable competitive differentiation shifts under the hood—to proprietary data, post‑training, inference efficiency, and the quality of the eval loop.

    Dual bar charts showcasing Fin Apex 1.0 with -65% hallucination reduction and a 3.7s time to first token, benchmarked against Sonnet 4.6, Opus 4.5, and GPT-5.4 on a clean, light background.
    Fin Apex raises the bar for finance-ready AI, highlighting a -65% cut in hallucinations and a quicker first token at 3.7s (0.6s faster), compared with Sonnet 4.6, Opus 4.5, and GPT-5.4 in side-by-side charts.

    That’s why competitors will need to release their own models. Many appear to be just starting to hire the talent to do so, which likely gives Fin at least a year of head start. For product leaders, this is a strong signal to revisit build vs buy assumptions, and to quantify when owning your post‑training pipeline and evals becomes the rational move.

    Honestly, 2–3 years ago I expected AI application differentiation to live mostly in what we built around third‑party models. The AI game humbles all of us; today it’s obvious that vertical models paired with proprietary evals create compounding moats.

    In a podcast interview last week, Andrej Karpathy said:

    "I do think we should expect more speciation in the intelligences. The animal kingdom is extremely [diverse] in the brains that exist. And there’s lots of different niches of nature… And I think we should be able to see more speciation. And you don’t need this oracle that knows everything. You kind of speciate it. And then you put it on a specific task. And we should be seeing some of that because you should be able to have much smaller models that still have the cognitive core."

    The frontier labs still have the very best models, but open‑weight models aren’t far behind—making pre‑training look increasingly like a commodity. The frontier is moving to post‑training, which is precisely what we see with Apex (and Cursor’s Composer 2), and what we should expect to dominate going forward.

    Labs now face a dual reality. On one hand, horizontal general‑purpose models can over‑serve specific verticals (e.g., customer service doesn’t need an oracle that knows everything). On the other, open‑weight models are good enough that high‑quality, domain‑specific post‑training can produce superior models for special‑purpose jobs—and in the ways that matter for those jobs. In service, soft factors like judgement, pleasantness, and attentiveness matter alongside hard factors like resolution effectiveness, speed, and cost.

    I’m still bullish on the labs. Many organizations remain heavy customers of Anthropic—whether as part of multi‑model systems or through deep usage of Claude Code in engineering teams (see this example of Claude Code adoption). Yet classic disruption (à la the late, great Clay Christensen) is now at their door. The way out is to disrupt themselves by building cheaper specialized models too, which likely requires acquiring the evals—or the companies with the evals—needed for each task. Expect creative data partnerships, M&A consolidation, and a wave of hyper‑specific model providers that compete head‑to‑head with the labs.

    In the meantime, Fin appears to be the only vendor in its space with a custom model that’s also objectively superior to everything else out there. I’m excited to see it deployed broadly for end customers, and I’m watching closely for the next announcement that will accelerate that rollout. For product leaders, the message is clear: the age of vertical models and agentic AI is here—bring your evals, or bring your checkbook.


    Inspired by this post on The Intercom Blog.


    Book a consult png image
  • Build vs. Buy in an AI-First World: My Framework to De-Risk Decisions and Own Your Data

    Build vs. Buy in an AI-First World: My Framework to De-Risk Decisions and Own Your Data

    Build vs. buy is a decision that never truly goes away, and with AI reshaping the economics of software, I’m revisiting this question more frequently—and with more nuance—than ever. The temptation to “just build it” is real when prototypes are cheaper, shipping feels faster, and small tools can rival big platforms. But the real decision has never been about code; it’s about value, data, and long-term responsibility.

    Across product orgs at every stage, I see the same pattern: AI makes building feel easier—but it doesn’t eliminate the tradeoffs. The hard part is separating what differentiates your product from what simply supports it. That’s why I start by asking whether the capability is truly core to my value stream, and then I force myself to reason about ownership and maintenance, not just velocity.

    My rule of thumb remains simple: If something isn’t core to your value stream, don’t build it. And even when it is core, vendors may still be better positioned—especially for payments, invoicing, and infrastructure. Those domains carry deep operational complexity, continuous compliance, and reliability requirements that are easy to underestimate and painful to own.

    Here’s how this plays out for me. I would never build my own blogging platform. I moved from WordPress to Ghost, because publishing isn’t where I differentiate, and the long tail of upgrades, security, and performance is a drag on focus. The platform does the job, my audience gets a better experience, and my team avoids owning commodity maintenance work.

    On the other hand, I did build my own task management system—despite the abundance of excellent tools like Trello, Evernote, and OmniFocus. For me, tasks, notes, and workflows are deeply personal and idiosyncratic. I wanted my system to reflect how I think, plan, and communicate, with tight integration to my daily product rituals. In this case, the underlying data became the real product—and owning and controlling that data changed the equation.

    That’s the heart of the decision: When the underlying data becomes the real product, ownership matters. Task management, notes, and workflows evolve into a personalized operating system. The moment your data model represents your unique value—and your future differentiation—build vs. buy is no longer a tooling choice; it’s a strategy choice.

    AI is pushing this even further. Cheaper prototyping and “vibe coding” lower the cost of building. Tools like Claude Code and platforms from OpenAI make it viable to ship smaller, targeted tools that would have been uneconomical a few years ago. That expands the frontier of what teams can build without committing to a monolithic platform—and it puts pressure on vendors to improve data portability.

    Which brings me to vendor lock-in. Exports aren’t always enough. When I evaluate CRMs or course platforms, I look for more than CSV dumps. I want robust, well-documented APIs, webhook coverage, import/export parity, schema transparency, and a clear migration path. I’ve seen teams drown in brittle integrations with Salesforce or HubSpot, struggle to unwind course data from Teachable, or get stuck in signature workflows around DocuSign without a clean escape hatch. Portability is table stakes now.

    I treat build vs. buy as a discovery problem. Options are assumptions to test. On the build side, I run feasibility spikes: proof-of-concept integrations, latency checks, cost-to-serve models, and a sober read on maintenance. On the buy side, I trial vendors, not their marketing. I replicate a real workflow, test the edges, validate data portability, and simulate failure modes like vendor downtime or schema changes.

    A word of caution on complexity: “we can build anything” is not the same as “we should build this.” Long-lived products accumulate hidden complexity over time—security, privacy, performance, observability, SRE runbooks, QA automation, documentation, and compliance. Be honest about engineering capabilities and maintenance costs, especially when uptime and regulatory exposure are in play.

    My practical checklist looks like this: Is this core to our differentiation? Do we need to own the data model? How strong is data portability (APIs, webhooks, mapping, re-import)? What’s the true total cost of ownership over three years (people, ops, security, compliance)? Are there regulatory or reliability constraints better handled by a vendor? What’s the opportunity cost of not building something more strategic? And if we buy, what’s our exit plan?

    Ultimately, build vs. buy isn’t just about speed or cost—it’s about core value, data ownership, and long-term responsibility. AI lowers the barrier to building, but it doesn’t erase complexity. Treat build vs. buy decisions like any other discovery effort: test assumptions, prototype, and validate before committing. Ask not just can we build it, but should we own it?

    If you’re wrestling with vendor lock-in, fielding pressure to “just build it,” or rethinking your stack in an AI-first world, this lens will help you ask better questions before you commit. And if you’re exploring targeted builds alongside platforms like Stripe, Dropbox, Obsidian, or Ghost, I’d love to hear what’s working for you and where portability remains a hurdle.


    Inspired by this post on Product Talk.


    Book a consult png image