Tag: stakeholder management

Implementing Agentforce the Smart Way: My Proven Playbook for Salesforce Agentic Success

Implementing Agentforce isn’t a feature rollout—it’s a strategic shift. In my role building AI-driven products, I treat Agentforce as its own product with clear outcomes, rigorous governance, and disciplined iteration. The objective is to create durable operational leverage inside Salesforce without compromising trust, data integrity, or customer experience.

Learn the ways in which Pendo helps companies design and iterate on their agentic strategy for Salesforce.

I start with product discovery. That means selecting the right use cases, defining the target user, and aligning on measurable outcomes rather than outputs. In practice, I prioritize use cases across sales, service, and marketing using an impact–effort–risk lens, then set crisp success metrics—response time, deflection rate, case resolution, win rate lift, and user adoption. This keeps everyone focused on value creation, not just model novelty.

Next, I design the agentic system with guardrails. I specify agent roles, tools, and policies; define when to escalate to humans; and embed privacy-by-design and data governance from day one. I also build an evaluation harness with offline tests and live A/B testing, ensuring we have a minimum detectable effect that’s meaningful for the business. The goal is to measure outcomes reliably and course-correct quickly.

When building the first slice, I scope narrow and ship fast. For example, start with a constrained service workflow—classify the case, propose a response, and take a safe action—with clear affordances in Salesforce so users understand what the agent did and why. I instrument the experience end-to-end and use Pendo for in-app guides, surveys, and behavioral analytics to reduce onboarding friction and capture real-time feedback at scale.

Iteration is where value compounds. I run weekly reviews of conversations, error taxonomies, and edge cases; adjust prompts and tool access; and maintain a steady experiment cadence. We track outcomes vs output to avoid vanity metrics, and we document learnings to de-risk the next use case. This steady drumbeat builds credibility with stakeholders and confidence with frontline users.

Change management is non-negotiable. I align leaders early, set expectations on what the agent can and cannot do, and define SLAs for humans-in-the-loop. I use product tours to teach new behavior, highlight quick wins, and establish transparent feedback channels. This combination of enablement and accountability accelerates adoption and creates a culture that embraces agentic AI responsibly.

Finally, I scale thoughtfully. Once the first use case demonstrates value, I standardize patterns, unify analytics, and evolve governance as usage grows. I review risk regularly, align OKRs with the roadmap, and keep a tight feedback loop between product, ops, and go-to-market teams. Treating Agentforce as an evolving product—not a one-off project—maximizes impact while protecting the customer experience.

Inspired by this post on Pendo – Perspectives.

October 24, 2025
Build vs. Buy in Experimentation: Why Embracing Vendors Accelerates Real Innovation

For much of my career, I reflexively favored building experimentation tooling in-house. Over the last few years, I’ve changed my mind. The ecosystem has matured, the bar for statistical rigor has risen, and the opportunity cost of reinventing the wheel has become too high to ignore. Read why the industry has changed to more broadly embrace vendor solutions—and why that's a good thing for innovation.

The short version: buying core experimentation capabilities increasingly lets us learn faster, reduce risk, and focus scarce engineering cycles on true differentiation. I still believe in building when it creates competitive advantage, but I’ve seen too many teams burn time on “table stakes” infrastructure instead of delivering outcomes that matter.

When I evaluate build vs. buy, I start with two questions: Is this capability a point of parity or a source of competitive differentiation? And what is the real total cost of ownership over three years, including staffing, maintenance, on-call, compliance, roadmap drag, and delayed time-to-learning? Most experimentation platforms are now points of parity; the differentiation is how quickly and responsibly we learn, not whose statistics package we forked.

Modern experimentation isn’t just a split URL test. It demands identity resolution across devices, reliable bucketing, exposure logging at scale, edge delivery for flags, guardrail metrics, and rigorous methods like minimum detectable effect (MDE), CUPED, and sequential testing. Add privacy requirements, data governance, and auditability, and the platform burden grows beyond a “quick internal tool.” This is exactly where vendors have pulled ahead, baking in best practices we’d otherwise relearn the hard way.

There are still good reasons to build. If you operate under unique latency constraints (e.g., sub-20ms decisions at the edge), have non-negotiable regulatory boundaries, or your experimentation model is deeply coupled to proprietary ML systems, bespoke tooling can be justified. I’ve supported builds in those cases—but only with a clear plan for long-term ownership, documentation, and explicit trade-offs.

More often, buying is the sane default. Vendor solutions give us hardened SDKs, consistent flagging, proven stats engines, and integrations with analytics—freeing teams to spend their energy on high-quality hypotheses and better product discovery. Connecting experiment outcomes to a unified analytics platform (and tools like Amplitude analytics) helps us align on source-of-truth metrics, tighten feedback loops, and empower product trios to make confident, outcome-driven decisions.

A hybrid approach frequently wins: buy the platform core, then extend it. Build custom decisioning services where needed, enrich telemetry, and add domain-specific metrics on top. I’ve had success pairing vendor platforms with forward deployed engineers and thoughtful developer evangelism to create the best of both worlds—speed from the vendor, nuance from our domain.

If you’re considering a shift, here’s the adoption playbook I use: – Define success upfront: decision latency targets, MDE guidance, guardrail metrics, governance needs, and privacy constraints. – Run a time-boxed pilot with an A/A test and a handful of A/B testing use cases. Validate exposure logging, bucketing stability, and metric parity against your analytics stack. – Align on outcomes vs output OKRs, so “more experiments” is never the goal; better decisions are. – Establish data governance and metric definitions before full rollout. Treat metrics as a product, not a spreadsheet. – Invest in enablement: in-app guides, product tours, and training for PMs, engineers, and analysts. Proactive stakeholder management is what separates a successful rollout from shelfware.

AI is accelerating this shift. Gen AI for product prototyping and agentic AI assistants can help generate hypotheses, auto-suggest experiment designs, and flag risky rollouts in real time. Pairing AI with a robust experimentation backbone improves both velocity and quality—without asking teams to become statisticians overnight.

My bottom line: the industry’s embrace of vendor experimentation platforms is not a retreat from craftsmanship—it’s a strategic allocation of talent. By buying where the market is excellent and building where our differentiation truly lives, we learn faster, reduce risk, and compound innovation. If you haven’t revisited your build vs. buy calculus recently, now is the time. Your customers don’t reward you for owning a stats engine; they reward you for shipping better outcomes, sooner.

Inspired by this post on Amplitude – Perspectives.

October 24, 2025
9 Proven Collaboration Practices to Unite Teams and Deliver Exceptional Digital Experiences

Every standout digital experience I’ve shipped has one thing in common: deep, consistent collaboration across product, marketing, and data. When we align on outcomes and operate from a shared truth, we move faster, reduce rework, and create value our customers actually feel.

Discover best practices to fuel cross-functional collaboration and help product, marketing, and data teams create better digital experiences.

Over the years, I’ve refined nine practices that reliably elevate team performance and customer outcomes. They’re simple to state, practical to implement, and powerful when they compound together in day-to-day execution.

1) Align on outcomes, not output. I start every initiative by clarifying the customer problem, success metrics, and “outcomes vs output OKRs.” When everyone can name the desired behavior change and the KPIs that prove it, teams earn the autonomy to solve creatively—and the discipline to say no when work doesn’t move the needle.

2) Establish a shared source of truth. A unified analytics platform gives product, marketing, and data teams the same lens on activation, engagement, conversion, and retention. I insist on event hygiene, operational definitions, and self-serve dashboards so decisions are informed by facts, not folklore—especially when running retention analysis or growth experiments.

3) Form empowered product trios. I routinely pair a product manager, a designer, and a tech lead as a decision-making nucleus. This “product trios” model accelerates discovery, balances desirability/feasibility/viability, and prevents handoff theater. Extended partners (marketing, data science, support) join early to shape solutions, not just rubber-stamp them.

4) Codify decision-making rituals. Speed comes from clarity. We document DRIs, timebox debates, and use first-principles reasoning to cut through ambiguity. Lightweight decision records (why we chose X over Y) keep context intact for future contributors and reduce unproductive re-litigation.

5) Co-create the roadmap—and keep it alive. I bring stakeholders into roadmap and sprint planning to surface dependencies, risks, and opportunities upfront. We review priorities regularly, tie bets to strategy, and maintain traceability from objectives to epics to experiments. This is stakeholder management in service of focus, not bureaucracy.

6) Make insights travel. We weave discovery into delivery: problem interviews, concept tests, instrumented prototypes, and in-product feedback loops. Marketing shapes messaging early; product refines UX writing; data validates signals. The result is tighter problem-solution fit and fewer surprises late in the game.

7) Communicate early, often, and in plain language. I favor one-page briefs, narrative memos, and short demo videos over sprawling docs. Clear artifacts make collaboration inclusive, reduce meeting load, and help new collaborators ramp quickly without losing nuance.

8) Shorten the feedback loop in production. We rely on feature flags, small batch releases, and in-app guides or product tours to educate users and capture behavioral data. This supports product-led growth by turning every release into a learn-and-iterate cycle tied to the metrics that matter.

9) Default to transparency and respect. Shared channels, open calendars, and visible roadmaps build trust. When disagreements arise, we return to customer outcomes and the evidence. Healthy friction pushes the work forward; psychological safety keeps the team together.

None of these practices are exotic. The magic is in the consistency: aligning on outcomes, measuring what matters, and giving talented people clear guardrails and room to run. When we work this way, collaboration becomes a force multiplier—and customers feel the difference in every click and interaction.

Inspired by this post on Amplitude – Perspectives.

October 24, 2025
Master Points of Parity in SaaS: Nail Table Stakes, Earn Trust, and Unlock Differentiation

Early in any market, I obsess over one thing before splashy features or clever messaging: are we meeting the table stakes that buyers expect? Points of parity (POPs) are the baseline capabilities that put us on a buyer’s shortlist and establish the credibility to compete. Without them, even the best differentiators won’t land.

Understand how points of parity are crucial to getting your foot in the door. Explore different strategies to make POPs work for your SaaS business.

Here’s how I define POPs in practice: they’re the “no-regrets” features, assurances, and experiences that customers assume you have because your competitors already do. In SaaS, that often includes security certifications (e.g., SOC 2), SSO, predictable performance (SLAs/Uptime), clear pricing, responsive support, and integrations with the rest of the customer’s stack.

POPs differ from points of difference (PODs). PODs are what make you unique; POPs are what make you viable. I’ve seen teams try to lead with innovation before building credibility, only to stall in procurement. You earn the right to showcase differentiation after you meet parity.

For SaaS, POPs frequently map to procurement checklists. Think InfoSec reviews, role-based access controls, audit logs, encryption standards, user management, and integrations with systems like Salesforce, HubSpot, or Slack. These aren’t glamorous, but they remove friction, reduce perceived risk, and accelerate time-to-value—cornerstones of product-led growth and a healthy go-to-market motion.

To identify the right POPs, I triangulate across four inputs: customer interviews focused on buying criteria, win/loss analysis to understand disqualifiers, competitor teardowns to benchmark table stakes, and support data to spot recurring gaps eroding trust. Collectively, these inputs reveal the minimum viable promises we must keep.

Prioritization matters. I translate POPs into outcomes (not output) and align them with our roadmapping and sprint planning. For example, instead of “Ship SSO,” I set an objective like “Reduce enterprise security objections by 60%” and measure RFP pass rates, security review cycle time, and sales stage conversion. This keeps us anchored to impact, not just checkboxes.

Execution should be pragmatic. With POPs, “good enough” is often the right bar—reliable, discoverable, and well-documented. Over-engineering POPs slows you down and diverts resources from differentiation. I focus on stable defaults, clear UX patterns, great docs, and in-app guides that help users activate parity features without friction.

Measuring POP health is straightforward if you wire it into your system. I monitor activation rates for parity features (e.g., SSO enabled), support volume tied to trust blockers (security, performance, billing), and the presence of POP gaps in win/loss notes. Retention and expansion are the ultimate validators: when POPs are solid, renewal conversations shift from risk mitigation to value creation.

Consider two tangible examples. For a messaging platform, POPs may include 99.9% uptime, message deliverability guarantees, two-factor authentication, and role-based permissions. For a product analytics tool, POPs could include granular event tracking, user privacy controls, standard dashboards, and self-serve onboarding. None differentiate you alone, but missing any one of them can disqualify you.

Common pitfalls I warn teams about: over-indexing on shiny features while losing deals on basics; inconsistent messaging that promises parity you can’t operationalize; ignoring pricing and packaging parity (buyers expect clear tiers and predictable billing); and underinvesting in enablement, leaving sales to “sell around” missing POPs.

Communicating POPs is as important as building them. I make sure parity shows up on our pricing page, security and reliability pages, and in crisp one-pagers for buying committees. In the product, I highlight parity features during onboarding with checklists and tooltips so customers experience trust quickly. For founder-led GTM, a tight narrative—“Yes, we meet the table stakes; here’s where we go beyond”—keeps discovery calls focused on outcomes.

My playbook is simple: meet parity fast, prove reliability visibly, and then pour fuel on your differentiators. When POPs are nailed, sales cycles shorten, support debt drops, and your unique value finally gets the stage time it deserves.

Inspired by this post on Amplitude – Best Practices.

October 23, 2025
Stop the Data Chaos: 3 Simple Steps to Structure Amplitude Analytics Without Governance Headaches

Messy analytics creates real product risk—slow decisions, confused teams, and initiatives that drift off strategy. Over the years, I’ve learned that clean data isn’t an accident; it’s the result of simple habits practiced consistently. When we apply those habits in Amplitude, we get trustworthy insights without drowning in governance.

Learn how to keep your data clean, consistent, and scalable in Amplitude with three simple steps.

Here’s the playbook I use to set teams up for fast, confident decisions while keeping overhead low. It’s practical, lightweight, and built to scale across product lines and stages of growth.

Step 1: Define a durable tracking plan and taxonomy. Start with the outcomes you need to drive and the questions you must answer, then translate them into a concise event schema. Name events with an action–object pattern (e.g., “Signed In,” “Added to Cart”) and standardize event properties and user properties. Document required properties, success criteria, and ownership in a single living tracking plan that product, engineering, and analytics maintain together. This keeps your Amplitude workspace coherent and makes your unified analytics platform far more actionable.

I also make the tracking plan discoverable in the tools people use daily. That means clear examples, do/don’t guidance, and a simple change process. A little upfront clarity prevents dozens of downstream “what does this event mean?” questions and reduces friction across empowered product teams.

Step 2: Instrument consistently and validate at the source. Treat instrumentation as product work, not an afterthought. Use consistent casing and naming, avoid reserved keywords, and send only the properties you commit to in the plan. Establish identity resolution rules (e.g., user_id vs device_id) early so cohorts and funnels stay reliable. Before shipping, QA in a staging project, sample actual sessions, and confirm events match the plan exactly. Prefer versioning events over breaking changes, and explicitly deprecate what you supersede.

Amplitude’s data governance controls help you approve “official” events, deprecate outdated ones, and block rogue data before it pollutes reports. Enabling guardrails early eliminates rework later and keeps “source of truth” dashboards trustworthy.

Step 3: Govern at scale with lightweight rituals and automation. Assign clear ownership for event families, set SLAs for changes, and keep a simple changelog so everyone understands what evolved and why. I run brief, recurring reviews with product trios to align on upcoming instrumentation, tie it back to outcomes vs output OKRs, and retire data that no longer serves a decision. Pair that with proactive monitoring—alerts for invalid events, a dashboard for unplanned properties, and a quarterly cleanup of deprecated artifacts—and governance becomes a steady heartbeat instead of a fire drill.

When you combine a crisp taxonomy, rigorous source validation, and lightweight governance, Amplitude becomes a force multiplier. Product discovery accelerates, roadmaps stay aligned to measurable outcomes, and stakeholders trust the numbers. Most importantly, your team spends less time debating definitions and more time shipping value.

Inspired by this post on Amplitude – Best Practices.

October 23, 2025
Stop the Leaky Bucket: Proven Playbook to Turn User Acquisition into Lasting Growth

I've led products through dazzling acquisition spikes only to watch churn quietly erase the gains. More users don't automatically mean more long-term growth. In our world, that disconnect is the leaky bucket problem: every new signup pours water into a bucket riddled with holes across activation, engagement, monetization, and advocacy.

Losing users as fast as you acquire them? Get exclusive insights from our 2025 Product Benchmark Report on how to fix the leaky bucket problem and drive lasting growth.

When I diagnose this problem, I start by shifting the conversation from top-of-funnel volume to full-lifecycle health. I look at cohort retention curves, time-to-value, activation rates, depth and frequency of core actions, and expansion revenue. These metrics reveal whether we have true product-market fit, whether our onboarding accelerates value discovery, and where users fall out before they experience a durable “aha.”

My playbook is rigorous and repeatable. I instrument a unified analytics platform to produce clean, decision-grade metrics. I define a single, canonical activation moment that ties to value, and segment it by ideal customer profiles to avoid averages hiding the truth. I run product trios to close the gap between discovery and delivery. I set outcomes vs output OKRs so the team aligns on retention and engagement, not just shipping features. And I connect roadmap bets to measurable behaviors that lead indicators predict—never vanity metrics.

Onboarding is where I usually find the biggest, fastest wins. I trim steps, reduce cognitive load, and default users into best-practice templates so they achieve value in minutes, not weeks. I use contextual education, empty states that teach by doing, and lifecycle messaging triggered by real behavior. Then I close the loop with customer success by aligning QBRs vs OKRs so feedback from high-value accounts translates into clear product outcomes, not feature requests.

Pricing and packaging matter more than most teams realize. If SaaS pricing doesn’t map to realized value, expansion stalls and churn rises. I align paywalls to natural milestones in the journey (usage thresholds tied to success), avoid early friction on critical adoption paths, and make upgrades an obvious outcome of growing value rather than a forced gate.

Execution discipline turns strategy into lift. I run weekly growth reviews that pair qualitative discovery with quantitative signal, keep an experiment backlog prioritized by expected impact and confidence, and insist on clean experiment design (counterfactuals, guardrails, and holdouts). Typical high-leverage tests include reducing time-to-first-value, clarifying the core job-to-be-done in the first session, and collapsing setup with smart defaults and in-product guidance.

The pattern is consistent: when we measure what matters, build with empowered product teams, and commit to outcome-driven roadmaps, the bucket stops leaking. Acquisition starts compounding because each cohort retains better than the last. If your growth feels like running on a treadmill, it’s time to refocus on activation, engagement, and retention—and use benchmarks to calibrate where you are versus where durable growth lives.

Inspired by this post on Amplitude – Best Practices.

October 23, 2025
Unify Your Analytics to Accelerate Growth: Cut Costs, Move Faster, and Decide in Real Time

I’ve learned that the fastest way to stall growth is to scatter your data across a maze of dashboards and point solutions. My guiding principle is simple: Escape fragmented tools with a unified analytics platform that accelerates growth, reduces costs, and empowers smarter, real-time decision-making. When every team can trust a single source of truth, momentum compounds.

By “unified analytics,” I mean a single platform that integrates product, marketing, sales, support, and finance data with consistent definitions, shared metrics, and strong governance. The right foundation pairs real-time instrumentation and event streaming with standardized taxonomies and role-based access. This is what transforms raw data into reliable insight that product managers and executives can act on with confidence.

Growth accelerates when hypotheses move faster from discovery to delivery. A unified analytics platform tightens the experimentation loop, informs product discovery, and aligns product roadmapping and sprint planning with measurable outcomes. It anchors outcomes vs output OKRs in trustworthy metrics, so QBRs and executive reviews focus on impact, not anecdotes. The result is clearer prioritization, sharper bets, and faster compounding wins.

Costs come down just as decisively. Consolidating analytics reduces redundant SaaS, manual reporting, and bespoke pipelines that are expensive to build and maintain. With one data model, we cut duplication, improve data quality, and negotiate smarter under consumption SaaS pricing. Teams spend less time wrangling CSVs and more time shipping value.

Real-time decision-making is where unified analytics truly pays off. Proactive alerts and cohort insights surface anomalies before they become churn. LTV, funnel, and retention forecasts inform pricing and packaging moves. Layering gen ai on top of clean, unified data speeds synthesis and narrative insight, while a thoughtful customer support AI strategy connects voice-of-customer signals directly to the roadmap.

Implementation starts with clarity. Identify the highest-impact decisions you want to improve, map KPIs to events, and instrument end-to-end tracking with quality SLAs. Establish governance early, align stakeholders across data, engineering, RevOps, and finance, and empower product trios to own their metrics. With disciplined stakeholder management and empowered product teams, the platform becomes a force multiplier rather than another tool to maintain.

The payoff is strategic agility: faster learning cycles, lower operating costs, and confident calls made in the moment, not after the fact. If you’re ready to break free from fractured dashboards and lagging reports, commit to a unified analytics platform and let your data become a competitive advantage.

Inspired by this post on Amplitude – Best Practices.

October 23, 2025
How I Decode Founder Advice: Lessons from Thumbtack CEO Marco Zappacosta on Boards, Time, and Trust

I recently dug into a conversation with Marco Zappacosta, co-founder and CEO of Thumbtack, who has spent the last 13 years building the company into a billion-dollar business — and it’s his first and only job after graduating college. As someone who lives at the intersection of product management leadership and company-building, I was struck by how deliberately he navigates the deluge of advice that comes with being a first-time founder.

What resonated most was the way he differentiates between moments that demand a return to first principles and those that benefit from a tested playbook. In my own practice, I’ve found that product strategy and organizational design often require first-principles thinking, while operational cadence and execution rituals tend to scale best with proven patterns. The key is recognizing which game you’re playing — invention versus optimization — and applying the right mental models to filter input without losing velocity.

Marco’s approach to parsing counsel as a first-time CEO is refreshingly pragmatic. Rather than treating advice as binary, he triangulates from multiple data points, looks for invariants, and pressure-tests assumptions against the company’s unique context. I use a similar lens: anchor on the problem, map potential solutions to risk/return, and calibrate decisions with base rates where possible. It’s a disciplined way to turn a mountain of opinion into actionable signal — especially when stakes are high.

He also connects this discipline to stakeholder management, particularly in how he runs Thumbtack’s board so quarterly meetings become a critical resource, not just a time suck — and why he shares the board deck with the entire company. I’ve found this level of transparency to be a force multiplier: it aligns teams on priorities, elevates product roadmapping and sprint planning, and empowers leaders to make trade-offs with clarity. When the narrative is shared, accountability scales.

Marco candidly reflects on Thumbtack’s COVID-related layoff last year, and what he specifically did as CEO to ensure the folks who remained still had confidence in the company and his leadership moving forward. In hard moments like these, consistent communication, explicit prioritization, and a clear framework for decision-making matter more than ever. Trust is built by showing your work — why choices were made, what changes now, and how success will be measured.

Finally, he opens up his playbook for choosing what to spend his time on as a busy CEO with only so many hours in the day — and perhaps more importantly, how he stays accountable for these priorities. I’ve learned to pair outcome-oriented OKRs with a ruthless weekly schedule audit: if the calendar doesn’t reflect the strategy, the strategy won’t happen. This discipline creates focus, accelerates learning loops, and keeps leaders from becoming the bottleneck.

For builders at any growth stage, there’s a powerful takeaway here: cultivate a repeatable way to distill advice, clarify when to use first principles versus a playbook, operationalize board relationships as strategic assets, and turn time into your sharpest instrument. The result is a more resilient company — and a leadership practice that compounds.

Inspired by this post on First Round.

October 23, 2025
AI Changes Everything—and Nothing: A Practical Product Playbook for LLMs, RAG, and Evals

AI has changed the tempo of product management, but not the timeless fundamentals. I’m living that paradox daily: the technology reshapes how we plan, build, and ship—yet the way we find real customer value hasn’t budged. Here’s how I reconcile both truths in practice.

At their core, large language models predict the next token. That’s it. Consider the prompt, “The cat sat on the ____.” Mat is most likely. Chair is pretty likely. Floor, roof, piano—all possible, just less probable. When you give an LLM a prompt, it runs through a neural network—billions of parameters making calculations—to predict the most likely next word, then the next, then the next. That neural net represents not just facts, but relationships, patterns, context—and some might even argue reasoning capabilities—that all influence the probabilities of different outputs. So LLMs don’t just predict the next token. They predict the next token by drawing upon a vast amount of knowledge. LLMs are transformational.

Behind the magic, LLMs don’t “know”—they rank what likely comes next. This slide’s cat sentence with weighted options underscores how generative AI reshapes work while remaining probability-driven.

And yet, AI makes silly mistakes. The kind that make you wonder how it could be so smart and so dumb at the same time. No matter how hard I tried, I could not get ChatGPT-5 to fix the closing quote on an image. And it took 12 seconds of thinking to figure out that I asked for a meeting summary but didn’t include any meeting notes. Why do LLMs make these mistakes? Well, it’s because all the LLM is doing is predicting the next token based on patterns. And sometimes those predictions are wrong. And when it’s wrong, it can be confidently wrong. So what do we do?

A side-by-side visual pits an optimistic AGI rocket against a frowning skeptic with a thumbs-down, ending with the prompt 'Which is it?'—capturing the debate over whether generative AI is transformative progress or overhyped.

To build reliable AI features, I focus my team on four core skills: prompt engineering, context engineering, orchestration, and evaluation (evals). Together, they let us harness what’s powerful about generative AI while reducing unforced errors.

Generative AI isn’t magic—it’s inputs, training data, and models. This visual invites readers to examine how predictions are produced and why transparency, bias checks, and ethics shape responsible outcomes.

First, prompt engineering. With LLMs, the quality of the input determines the quality of the output. An LLM can’t guess what you want. You have to specify what you want in a language the LLM understands. In a chat, you can iterate with multiple turns. In a product, you get one shot. When we ship a meeting summarizer, for example, I design a production-ready prompt that assigns a role (“You are a chief of staff for a busy executive…”), specifies output format (e.g., a structured list of action items), and clarifies exactly what to include and how to structure it. This is prompt engineering. It’s like writing a really good product spec—but for an AI.

A simple sentence becomes a window into generative AI: a neural network weighs options and selects 'mat,' showing how models rank likely words to produce fluent text—and why outputs feel confident yet probabilistic.

Second, context engineering. LLMs know a lot, but not everything. Imagine a takeaway reads, “Send the pipeline report to John by Friday.” When is Friday? Which John? What is the pipeline report? The model lacks today’s date, the roles of participants, and the specifics of our reporting vocabulary. If you dump every possible detail into the prompt, the model gets noisy input and the quality drops. The key is to add only the context the LLM needs to do the task at hand and nothing more. This is where techniques like RAG (Retrieval Augmented Generation) shine: retrieve just the relevant snippets—e.g., attendee roles, the precise definition of “pipeline report,” and the calendar context for “Friday”—and include only those in the prompt.

A minimalist slide demystifies how LLMs choose words: given 'The cat sat on the ____', the model ranks options by probability, showing that AI outputs are weighted predictions, not facts, which matters for ethics and product decisions.

Third, orchestration. LLMs are better at simpler tasks. If you try to do everything in one prompt—identify action items, summarize the meeting, categorize by urgency, match to owners, check for clarity—the quality goes down. But if you break it into steps, each focused on one thing, the quality goes up. I design a workflow of multiple LLM calls that work together. For example: First call: identify action items from the transcript. Second call: categorize by urgency using lightweight project context. Third call: match to owners with a minimal team directory. Fourth call: generate calendar events via an API. Each step is simple. But together, they create something sophisticated.

From building apps by chatting to auto-cleaning video and translating comments, AI compresses complex tasks into simple prompts. This scene shows why it feels transformative—while nudging us to weigh its limits and ethics.

Fourth, evaluation (evals). Otherwise known as: Is my AI product any good? At the heart of evals is this question: Given the input, did we get the expected output? I use several approaches. Datasets: I collect 20–100 real examples, define expected outputs, run the prompt, and measure the success rate. Code assertions: I enforce rules the output must follow (for example, “Every action item must have a task, owner, and deadline”) and fail the test when they’re violated. LLM-as-Judge: I use a model to verify factuality (e.g., “Is each item in this summary in the meeting transcript? Yes or no.”). Human evals: I track whether users need to edit the output and by how much. Start simple; get more sophisticated as the stakes rise.

From “AI Changes Everything (And Nothing At All),” this slide contrasts bold promises with everyday glitches: optimism about LLMs and AGI on the left, and a meeting-summarizer that can’t find content on the right—reminding us AI still errs.

Here’s the hard truth: you can master all four skills and still ship the wrong thing. Why? Because you chose the wrong problem to solve. Who needs yet another meeting summarizer? The ease of building with generative AI tempts us to jump straight to solutions. Resist it.

A keynote slide from AI Changes Everything (And Nothing At All) urges teams to master AI by harnessing its power while minimizing errors, framing a practical, ethical approach to building trustworthy products.

Discovery matters more than ever. Before writing your first prompt, be explicit about the impact you want and set a clear outcome. Talk to your customers and ensure you understand their needs; choose the right opportunity before you chase a cool solution. Explore multiple options and use assumption testing to converge on what will actually move the needle.

Clear prompts shape better AI. This slide shows how explicit, well-structured instructions lead to stronger LLM responses, reminding builders that inputs drive outcomes—and that responsible prompting begins with clarity.

Prototype, test, and build iteratively. With AI products, it’s easy to get to a great demo and hard to get to a production-grade experience. I validated feasibility before I wrote a line of production code by experimenting directly in a model’s UI, pasting real transcripts, and shaping the system instructions and output format. Only after I had a repeatable prompt and useful feedback did I worry about deployment.

Designing prompts is product design. This visual pairs common meeting requests with a rigorous summary framework, reminding teams that when prompts are embedded in products, accuracy matters because you get one shot.

From there, I tested deployment paths with the smallest viable investment. I tried a custom chat in Replit and embedded it. It wasn’t the right interaction model for targeted feedback. I switched to a submission flow using a homework-style pattern: students upload their transcript, the system processes it, they get detailed feedback. Done. That worked—until automation reliability lagged. I wired it up in Zapier, then rebuilt it on AWS Step Function for robust retries and better error handling when scale introduced edge cases.

A reminder that AI needs context: a simple instruction becomes ambiguous without dates, roles, recipients, and report details, highlighting the limits of LLM knowledge and the value of precise, shared prompts.

Each iteration tested a different assumption. Iteration 1 (Claude): Can AI even do this? What makes good feedback? Iteration 2 (Replit chat): How should people interact with it? Iteration 3 (Zapier): Can I integrate this into the existing workflow with minimal engineering? Iteration 4 (Step Functions): How do I make it reliable at scale when things inevitably go wrong? I didn’t know the answers up front; I learned by building, shipping, and observing.

From AI Changes Everything (And Nothing At All), this visual reminds builders that curated, high‑quality prompts yield better LLM responses, while overloaded, noisy inputs derail reasoning and reduce reliability.

Ethical data practices are non-negotiable. Improving AI products requires inspecting traces—the user input, system prompts, tool calls, and LLM responses. For a coaching flow, that includes the uploaded transcript, the system prompts that define evaluation criteria, and the AI’s feedback. To fix issues, I look at traces where things went wrong—but only with explicit user permission. Too many AI products collect everything by default and review traces without clear consent. Don’t do this.

A visual explainer showing how structured inputs—documents, verified code, roles, metrics, and timelines—flow into a network to shape a single, clear LLM response, underscoring the power of context in generative AI.

In my products, users must explicitly grant permission for me to review their data. The consent is clear and specific. If they say no, I don’t see their data. Full stop. That forces me to rely on synthetic data for many tests and to engineer privacy into the architecture from day one rather than bolt it on later. It’s the right thing to do, and increasingly, it’s required.

A visual guide to orchestration: break a complex meeting recap into smaller LLM calls, gather action items, bullet points, and a detailed summary, then aggregate the outputs to boost accuracy and reliability.

Everything changes—and nothing changes. Yes, there are new skills you should master: prompt engineering to get the right output the first time; context engineering to give the LLM exactly what it needs; orchestration to decompose complex tasks into simpler steps; and evals to systematically measure quality. And the fundamentals still rule: solve the right customer problems with strong discovery; prototype and iterate to de-risk; and uphold ethical data practices as a design constraint, not an afterthought. The teams that win with AI will master both the new technical craft and the timeless product fundamentals.

A presentation slide visualizes AI orchestration: user prompts generate meeting summaries while a graph of RAG and LLM calls routes tasks to the best sources, underscoring that retrieving the right context drives better outputs.

If this resonates, pick one active workflow, apply the four AI skills end-to-end, and run a short discovery and eval cycle against a clear outcome. Then ship. You’ll learn faster than any slide deck could ever teach you.

This presentation slide maps how to assess AI: link inputs to outputs with datasets, code assertions, LLM-as-judge checks, and human reviews—centering the question, "Given the input, did we get the expected output?"

Inspired by this post on Product Talk.

October 23, 2025
Why IT Must Lead Your AI Revolution: A Strategic, Cross-Functional Playbook That Wins

I’ve led and observed AI initiatives across fast-moving product organizations, and one pattern is unmistakable: “The AI revolution needs a departmental leader.” When that leader is unclear, pilots stall, risk mounts, and value gets trapped in proof-of-concept purgatory. When it’s clear, AI moves from demos to durable outcomes.

In my experience, IT is uniquely positioned to play that leadership role. IT sits at the nexus of data, identity, security, and infrastructure—exactly where scalable AI capabilities live. IT also has the vantage point to connect use cases across teams, manage risk, and operationalize change without derailing core systems.

Put simply, this is the promise: “Learn the key reasons why IT teams are uniquely positioned to be the strategic leaders of your company’s AI projects.” The reasons are pragmatic—access to systems of record, stewardship of data governance, ownership of integration patterns, and accountability for reliability and compliance—yet the impact is strategic.

Here’s how I frame the operating model. IT provides strategic leadership and platform stewardship; Product owns the outcomes; Engineering delivers services and integrations; Security and Legal codify guardrails; and Finance supports cost modeling. We establish tight collaboration through product trios (Product, Design, Engineering) that plug into an IT-led AI platform, enabling empowered product teams to ship safely and quickly.

Governance turns intent into repeatable action. I use outcomes vs output OKRs to force clarity on value, pair them with lightweight QBR cadences for course correction, and require architecture reviews that cover model/data governance, observability, privacy, and vendor risk. This ensures we can scale gen ai without surprise failures or compliance gaps.

On the delivery side, forward deployed engineers embedded with business units accelerate discovery and reduce translation loss. We leverage gen ai for product prototyping to validate desirability and feasibility early, then harden solutions on our shared AI platform. This keeps experimentation fast while maintaining an enterprise-grade backbone.

Roadmapping balances ambition with throughput. I tie product roadmapping and sprint planning to value streams, not just features, and I make stakeholder management explicit—especially with customer support, finance, and operations—so we design for adoption. For example, a customer support ai strategy isn’t a chatbot alone; it’s an outcome-driven service redesign, with training, playbooks, and measurable deflection and CSAT targets.

Success demands the right metrics. Beyond typical velocity measures, I track time-to-first-value, model quality and drift, cost-to-serve, and risk posture. These roll into OKRs that link frontline improvements (e.g., resolution time) to enterprise outcomes (e.g., gross margin, retention), giving executives confidence and teams a clear definition of done.

If you lead IT, this is your moment to step into strategic ownership and elevate AI from scattered experiments to a coherent platform. If you lead Product, partner with IT to align discovery, outcomes, and guardrails so empowered teams can move fast and responsibly. Together, we can turn AI from a buzzword into a durable advantage.

Inspired by this post on Pendo – Perspectives.

October 23, 2025
Proven Playbook: From Pilot Teams to Full Transformation with the Product Operating Model

The product operating model is getting a lot of airtime for good reason. In my experience leading product and driving change across complex organizations, it offers a pragmatic way to build technology-powered solutions that customers love while delivering measurable business results.

What is the product operating model? I anchor teams on first principles: it’s not a process checklist—it’s a way of operating. As one leading thinker puts it, “The product operating model, also referred to as the product model, is a conceptual model based on a set of first principles that leading product companies believe to be true about creating products.” It’s “about consistently creating technology-powered solutions that deliver value to customers and that also drive results for the business. It’s about achieving outcomes versus merely producing output.” That framing matches exactly what I’ve seen separate high performers from the rest.

A clean visual distills the product operating model—covering how products are built, how problems are solved, and how to choose which problems to tackle—ideal for teams moving from pilots to enterprise-wide transformation.

Why adopt it? Because too many organizations swing between two extremes—beloved products that can’t sustain the business, or hard-charging business tactics that burn customer trust. The product model forces a both/and mindset: customer value and business outcomes. The biggest triggers I see for making the shift are heightened competition, visible frustration with the status quo (lots of spend, little impact), and a clear view of the upside when you manage by outcomes instead of deliverables.

Clean, modern visual underscores a core promise of the product operating model: delight customers and deliver measurable business outcomes. A timely teaser for organizational change content on scaling from pilot teams to full transformation.

How is it different from the project model? Projects start and stop; products evolve. When you manage by projects, you optimize for scope, budget, and dates—often at the expense of learning. When you operate as a product organization, cross-functional leaders (product, design, engineering) own the problem space and decide which customer problems to prioritize, in continuous conversation with stakeholders about business context and constraints. The litmus test: are teams funded and measured by outcomes, or by a backlog of features and deadlines?

A crisp reminder that projects end while products evolve—the mindset shift at the core of a product operating model. Great for posts on moving from pilot teams to enterprise adoption and delivering continuous customer value.

Can a hybrid approach work? Early on, yes—and it’s often the smart path. I strongly recommend piloting with a small number of teams while the rest of the organization continues working in the project model. Use pilots to learn, de-risk, and build internal proof points. That said, some work will remain project-based (e.g., regulatory, compliance, or one-off migrations). The goal isn’t purity; it’s pragmatism.

Change works best in waves, not a tidal surge. This visual champions pilot teams and staged rollout, showing how a product operating model scales successfully when leaders avoid all-at-once transformation.

How does continuous discovery fit? Continuous discovery is the operating system for deciding which problems to solve and how to solve them. Practically, that means weekly customer touchpoints by the team building the product, lightweight research to learn fast, mapping opportunities, and testing assumptions to reduce risk—all in pursuit of a clearly defined outcome. If you don’t learn continuously, you default to guessing continuously.

Continuous discovery helps teams choose the right problems and solutions. Explore how the Product Operating Model moves from pilot teams to enterprise-wide adoption—and why discovery is the engine of transformation.

How long does transformation take? Longer than most leaders think. Changing funding models, planning rituals, decision rights, and team habits is measured in years, not months. Pace yourself, stage the rollout, and expect to iterate on org design, coaching, and governance.

Big change isn’t overnight. This visual from our Product Operating Model series highlights the reality: scaling from pilot teams to full transformation takes time, steady learning loops, and aligned leadership.

Is your organization ready? I look for two CEO-level signals: a genuine belief that how you operate today won’t win tomorrow, and felt pain with the current results (missed targets, margin pressure, slipping share). Transformation is hard; without executive conviction, resistance will stall progress.

Even in heavily regulated sectors, a product operating model can thrive. Learn how to adapt governance and compliance without losing speed—from pilot teams to enterprise-wide scale.

What are signs it’s working? During planning, you fund teams and outcomes—not projects. Day to day, outcomes trump outputs. Product teams are empowered to solve customer problems (not merely deliver requests). Stakeholders contribute context and constraints, not solutions. Most importantly, your outcome metrics improve over time.

A side-by-side snapshot of how product work changes from early startups to large enterprises—shifting from directional outcomes to transparent processes, stakeholder management, and accountable leadership.

Will it work in regulated industries? Yes. Constraints shape the solution space, but the underlying truth remains: products are never done. In highly regulated environments, continuous discovery, tight risk management, and fast feedback loops are even more critical.

Shift from big-bang training to smart scaling. Start with pilot teams, learn fast, and extend practices that work. This post explores how a phased product operating model reduces risk and builds lasting capability.

Startups vs. enterprises: Early-stage teams can start with a directional outcome (who you serve and what you aim to change) and evolve to measurable outcomes as signal strengthens. The anti-pattern is “idea-first” debates that devolve into whether-or-not arguments. Even with an idea in hand, interview customers, map the opportunity space, story-map solutions, and surface assumptions to test. In large enterprises, the challenge shifts to scale: more stakeholders and gatekeepers, a greater need to show your work concisely, and stronger accountability rituals so leaders can manage by outcomes without dictating outputs.

To scale a product operating model, leaders must be all-in. This graphic spotlights two CEO mindset shifts: accepting that the old operating mode fails and recognizing the real cost of the status quo.

What does failure look like? A narrow transformation that trains product managers but leaves funding, planning, decision rights, and stakeholder behaviors unchanged. If you spot this pattern, pause and reset: recommit at the executive level to outcomes-based funding, change the operating cadence, and invest in coaching for leaders and teams.

In a product operating model, managers are coaches first. This quote underscores that real transformation starts by developing people's skills, building capable teams before scaling from pilot initiatives to enterprise-wide change.

The biggest mistake I see is trying to roll out to everyone at once. Broad, simultaneous training creates confusion, fuels resistance, and overwhelms your enablement capacity. Start small, prove value, fix systemic blockers, and then scale.

Meet the Product Trio—product manager, designer, and software engineer—the nucleus for moving from pilot teams to full transformation, aligning collaboration across discovery and delivery.

How does remote or distributed work affect the model? It raises the bar on clarity and connection. Define core collaboration hours, codify working norms, and be explicit about when to switch from async to live. Build lightweight rituals—discovery reviews, outcome check-ins, assumption-test demos—that create trust and fast feedback across time zones.

Pilot teams prove the product operating model can work. This clean visual highlights the first step of transformation—start small, validate results, and build momentum toward organization-wide change.

Roles and responsibilities change in meaningful ways. The CEO must champion outcomes-based funding and empower teams to own the problem space. Product leadership’s primary job is coaching—developing skills, modeling outcome management, and making strategic context accessible. Teams operate as small, cross-functional units (often a product manager, designer, and engineer, plus data or product marketing when needed). Keep the group as small as possible while still making informed decisions; bigger groups slow decisions and dilute ownership.

Pilot teams expose the friction and blockers you’ll face on the path to a full product operating model. This clean visual underscores the message: surface resistance early so transformation can progress.

Empowerment doesn’t mean teams do whatever they want. It means they own figuring out the best way to achieve an outcome within clear business constraints. Leaders provide the why and the guardrails; teams own the how.

Launching a product operating model starts with the right pilots. This graphic spotlights five factors to weigh—relationships, learning mindset, existing skills, access to customers, and the product's lifecycle stage.

Pilot teams are your change engine. I select a handful of teams most likely to become bright spots—familiar with each other, hungry to learn, fully cross-functional, with reliable customer access, and working on products in an “explore” or “invest” lifecycle stage. Small organizations might start with one pilot; mid-size with two to four; large with up to five. The point is to learn, not to boil the ocean.

Kickstart your product operating model with pilot teams set up to win: align on business context, ensure customer access, equip the right tools, remove friction so they move fast, and foster cross-functional collaboration.

How do I set pilots up for success? First, give them a clear outcome and rich business context. Second, insist on regular customer conversations to uncover unmet needs and prioritize opportunities. Third, build the muscle for evaluating whether they built the right thing through quick, targeted assumption tests. Fourth, shorten delivery cycles by right-sizing work to accelerate learning. Fifth, teach collaboration skills—show your work, externalize your thinking, and respect each discipline’s expertise—so the team can storm and reform productively.

Kickstart your product operating model with pilot teams. This visual spotlights three musts: back their outcomes, understand outcome‑focused work, and foster transparency and collaborative decision‑making.

What should leaders do differently during pilots? Align on the team’s outcome and invite relevant stakeholders into outcome definition and context-sharing. Learn to manage by outcomes, not outputs, and clarify when feedback is a suggestion versus a decision. Create predictable rituals where teams share what they’re learning, what they’ll learn next, and what’s blocking them. Transparency builds trust; trust enables speed.

Scale your product operating model by making learning public. Encourage forums, demos, and showcases where teams share insights, turning pilot wins into organization-wide momentum.

Which rituals help pilots share progress? I like “discovery-delivery demos” where teams show the latest insights, the opportunity space they’re focusing on, the assumptions they’re testing, early signals from tests, and the decision implications. Keep it concise and outcome-linked so leaders and peers can coach effectively.

Before taking a product operating model beyond pilot teams, fix the patterns you've found. Resolve recurring issues early to reduce risk, build confidence, and enable a smoother, organization-wide transformation.

How do you scale beyond pilots? Start by fixing systemic issues that surfaced across pilots: upgrade planning and funding to outcomes, standardize access to customers, invest in capability building, and clarify decision rights. Then expand in waves, pairing newer teams with experienced coaches and reusing the same operating cadence that worked in pilots.

Leaders can boost pilot teams by setting rituals, reducing bias, and making communication explicit. This quick checklist shares four habits that keep work visible and aligned during a product operating model transformation.

Expect and manage resistance. From pilot teams, you’ll hear, “We don’t have time,” “I don’t want to be a beginner again,” or “That will never work here.” Treat these like objections to understand. Unpack delivery pressures, normalize the discomfort of learning, and make the case for change by contrasting current pain with desired outcomes. From leaders and stakeholders, expect attachment to favored solutions, discomfort with reduced control, and update bottlenecks. Address this by sharpening outcome definitions, making strategic context accessible, and committing to clear check-in points.

Pilot teams often push back with time, comfort, and context concerns. Naming these patterns sparks constructive dialogue and clears the path from experimental pilots to a durable, organization-wide product operating model.

The mindset shift that unlocks everything is outcomes over outputs. Success isn’t shipping; success is impact. Without a clear, meaningful outcome, teams stay busy but struggle to move the needle—and they can’t tell what’s working. Tie work to outcomes, instrument the experience, and let evidence guide decisions.

Leaders can stall a product operating model by clinging to old habits—prescribing solutions, holding tight to control, and demanding every decision cross their desk. Use this prompt to spark dialogue and reset expectations.

Collaboration is the other non-negotiable. When product, design, and engineering collaborate from the start, you reduce rework, make smarter trade-offs, and ship solutions that are valuable, usable, and feasible within your constraints. Silos optimize individuals; collaboration optimizes outcomes.

Move from shipping more to achieving more. This visual underscores the product truth: prioritize measurable outcomes over busywork outputs to steer teams from early pilots to scalable, organization-wide transformation.

If you’re embarking on this journey, start small, learn loudly, coach relentlessly, and scale what works. That’s how you move from pilot teams to full transformation without losing momentum—or your people.

Inspired by this post on Product Talk.

October 23, 2025