We open-sourced our AI Skills library. Here's what we built, why we built it, and how to use it. I’m sharing the approach we’ve used to move faster with more confidence across product discovery, prototyping, and production—while keeping governance, safety, and measurement front and center.
What we built is a modular, open-source library of “skills” for agentic AI and LLM-powered workflows—things like retrieval and grounding, summarization, classification, tool-use, data enrichment, safety guardrails, and evaluation harnesses. Each skill follows consistent interfaces and conventions so teams can compose them like building blocks, swap implementations without breaking flows, and standardize best practices across products.
Why we built it is simple: we kept rebuilding the same core capabilities across experiments and teams. Standardizing these skills accelerates time-to-value, reduces integration risk, and helps product trios collaborate with a common language. It also lets us scale what works—prompt patterns, eval datasets, telemetry—so every new initiative starts on third base instead of at bat.
How to use it in practice: start by running a quick-start example to see a baseline skill chain in action. Then compose your own flow by selecting skills (for example, retrieval + summarization + tool call), configure them with environment variables and guardrails, and wire in evaluation datasets. From there, instrument the pipeline with metrics so you can compare variants and promote the best-performing chain to your main app or API.
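To make the composition idea concrete, here is a minimal sketch of a skill chain. It is not the library's actual API: the `Skill` type, the `compose` helper, and the toy `CORPUS` are names I made up for illustration. It assumes each skill is a plain function that takes a context dictionary and returns an updated one.

```python
from typing import Callable, Dict, List

# Hypothetical interface (an assumption, not the library's real API):
# every skill maps a context dict to a new context dict.
Context = Dict[str, str]
Skill = Callable[[Context], Context]

# Toy corpus standing in for a real retrieval backend (illustrative data only).
CORPUS = {
    "refund": "Refunds are processed within five business days. Exceptions need manager approval.",
    "login": "Reset your password from the sign-in page. Lockouts clear after one hour.",
}


def retrieve(ctx: Context) -> Context:
    """Attach the first corpus entry whose key appears in the query."""
    query = ctx["query"].lower()
    hit = next((text for key, text in CORPUS.items() if key in query), "")
    return {**ctx, "document": hit}


def summarize(ctx: Context) -> Context:
    """Naive stand-in for an LLM summary: keep only the first sentence."""
    first = ctx["document"].split(". ")[0].rstrip(".")
    return {**ctx, "summary": first + "." if first else ""}


def compose(skills: List[Skill]) -> Skill:
    """Chain skills so that each one receives the previous skill's output."""
    def chain(ctx: Context) -> Context:
        for skill in skills:
            ctx = skill(ctx)
        return ctx
    return chain


pipeline = compose([retrieve, summarize])
result = pipeline({"query": "How do I get a refund?"})
print(result["summary"])
```

Because every step shares one signature, swapping `summarize` for another implementation does not touch the rest of the chain, which is the property the consistent interfaces are meant to provide.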
In a typical stack, the library dovetails with analytics and experimentation: ship skill variants behind feature flags, measure impact with A/B testing, and observe runtime behavior with logs and traces. CI/CD hooks let you run evals pre-merge, and production dashboards keep an eye on latency, cost, and outcome quality. This creates a virtuous loop where ideas move from prototype to production with clear evidence.
Common use cases include customer support summarization and triage, lead scoring and enrichment, anomaly detection in product telemetry, and automated content workflows. Because the skills are composable, you can try multiple retrieval-first strategies, swap prompt templates, or add tools (search, RAG, calculators, connectors) without rewriting everything from scratch.
Governance and safety are built in. Guardrails handle PII redaction, content policy checks, and rate limiting; configs make it easy to enforce privacy-by-design; and evaluation harnesses encourage an eval-driven development culture. The result is faster iteration without sacrificing data governance or reliability.
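As one illustration of a guardrail, here is a minimal PII redaction sketch. The function name `redact_pii` and both regular expressions are my own assumptions for illustration. The patterns are deliberately simple and will miss many real-world formats, so treat this as a starting point rather than a description of how the library does it.

```python
import re

# Deliberately simple patterns (assumptions for illustration, not production-grade).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace email addresses and phone-like digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


print(redact_pii("Mail jane.doe@example.com or call +1 (555) 010-9999."))
```

Running the redaction before text reaches a model or a log keeps the privacy decision in one place instead of scattering it across skills.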
If you want to contribute, add a new skill, improve prompts, share eval datasets, or open an issue with a scenario you want supported. The roadmap focuses on richer retrieval adapters, better test fixtures, and deeper observability so teams can debug and optimize complex chains with confidence.
I’m excited to see how you’ll use the library to accelerate your roadmap. Clone it, run a quick start, and compose your first workflow today—then measure, iterate, and scale what works. I’ll keep sharing patterns, learnings, and updates as we grow the skills catalog and sharpen the tooling.
Inspired by this post on Amplitude – Perspectives.
I keep meeting talented product teams who can demo impressive proof-of-concepts but can’t get durable business impact into production. The difference isn’t raw ingenuity—it’s the operating model. As I’ve scaled AI initiatives in my own organization, one sentence has proven painfully accurate: "What the top 1% of AI-native product teams are doing differently – and why most won't catch up without rebuilding the operating model."
When I say “AI operating model,” I mean the end-to-end way we set strategy, discover value, build, ship, govern, and learn—specifically adapted for AI systems. If we try to bolt AI onto a classic software cadence, we stall. If we rebuild our operating model around AI’s unique constraints and compounding advantages, we accelerate.
It starts with strategy. I anchor our portfolio to explicit outcomes, not features, tying every initiative to measurable customer and commercial impact. Driver trees and an opportunity solution tree make tradeoffs transparent, while OKRs written around outcomes rather than outputs keep us from celebrating activity over results. This is how empowered product teams earn autonomy without losing alignment on the AI strategy.
Next is discovery. Continuous discovery reframes “can we ship a model?” into “can we change a behavior or decision with acceptable risk?” I pair customer interviews with in-product telemetry and journey mapping to qualify moments of high value and high frequency. The litmus test: can we describe the target workflow in plain language and simulate success before training models? If not, we’re not ready.
Data foundations come third. A retrieval-first pipeline is now my default, not an afterthought. We invest in data governance, privacy-by-design, and observability so we can explain where answers come from, prove consent, and debug drift. Without trustworthy data and clear lineage, every downstream AI promise is fragile—and your AI readiness is mostly theater.
Then I insist on eval-driven development. Before we optimize prompts or tune models, we define offline and online evals that represent the real task, including safety and “gotcha” cases. We treat prompt engineering, context window management, and agentic AI patterns as hypotheses that must beat a baseline under repeatable tests. This moves debate from opinions to evidence.
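A minimal sketch of what "must beat a baseline under repeatable tests" can look like in code. Everything here is illustrative: the three eval cases, the two keyword classifiers, and the `beats_baseline` helper are assumptions standing in for real prompts, models, and datasets.

```python
from typing import Callable, List, Tuple

EvalCase = Tuple[str, str]  # (customer message, expected label)

# Tiny illustrative eval set; the last entry is a "gotcha" case.
CASES: List[EvalCase] = [
    ("I want my money back", "refund"),
    ("Cannot sign in to my account", "login"),
    ("Refund? No, I just cannot sign in", "login"),
]


def baseline(text: str) -> str:
    """Baseline system: checks refund keywords first."""
    lowered = text.lower()
    if "refund" in lowered or "money back" in lowered:
        return "refund"
    return "login"


def candidate(text: str) -> str:
    """Candidate system: checks sign-in phrasing before refund keywords."""
    lowered = text.lower()
    if "sign in" in lowered:
        return "login"
    if "refund" in lowered or "money back" in lowered:
        return "refund"
    return "login"


def accuracy(system: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Fraction of cases where the system returns the expected label."""
    correct = sum(1 for text, expected in cases if system(text) == expected)
    return correct / len(cases)


def beats_baseline(candidate_fn, baseline_fn, cases, min_gain: float = 0.0) -> bool:
    """Promote the candidate only if it clears the baseline by min_gain."""
    return accuracy(candidate_fn, cases) > accuracy(baseline_fn, cases) + min_gain


print(accuracy(baseline, CASES), accuracy(candidate, CASES))
print(beats_baseline(candidate, baseline, CASES))
```

The same cases run against every variant, so a change in score reflects the change in the system rather than a change in the test.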
Shipping is where most teams quietly stall. We integrate AI into our CI/CD with feature flags, shadow modes, and progressive rollouts, building MLOps into the same platform that runs our services. I watch DORA metrics to keep delivery velocity healthy, but I also watch AI-specific signals—input distribution shifts, response variance, and time-to-mitigation—so we catch regressions before customers do. Platform scalability matters more when inference costs and latency can spike overnight.
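For the progressive rollout piece, one common approach is deterministic bucketing: hash the user and flag name into a stable bucket from 0 to 99, then compare it with the rollout percentage. The sketch below assumes that approach. The function names and the `ai-summary-v2` flag are hypothetical, and a real feature-flag service would add targeting rules and kill switches on top.

```python
import hashlib


def rollout_bucket(user_id: str, flag: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def is_enabled(user_id: str, flag: str, percent: int) -> bool:
    """True if the user falls inside the first `percent` buckets."""
    return rollout_bucket(user_id, flag) < percent


# Hypothetical flag name; the same user always gets the same answer.
for pct in (0, 10, 50, 100):
    print(pct, is_enabled("user-42", "ai-summary-v2", pct))
```

Because the bucket is stable, raising the percentage only ever adds users to the rollout, which keeps before-and-after comparisons clean.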
Governance isn’t a gate at the end; it’s a runway from the start. We operationalize AI risk management with tiered reviews, model and data cards, and clear escalation paths. The goal is not to slow down, but to reduce surprise—so product managers, engineers, and legal share the same playbook for safety, fairness, and regulatory compliance.
Value capture closes the loop. We connect product metrics to commercial levers like Net Revenue Retention (NRR) and retention analysis, then shape packaging so customers pay for outcomes, not raw compute. This is where product-led growth meets sales-led growth: we demonstrate value in-product, then arm go-to-market teams with unambiguous proof.
So why are 80% of teams stuck? Three patterns recur: technology FOMO masquerading as strategy, fragmented data that can’t support high-quality retrieval, and a lack of evals that forces decisions by vibes. Add ad hoc governance and you get pilots that impress in slides but wither under real-world variance.
How do the top 1% think differently? They rebuild the operating model first. They position discovery around workflows, not models. They invest in retrieval-first architectures early. They standardize evals. They ship with guardrails. And they treat “learning per week” as a sacred metric—because compounding insight beats sporadic heroics.
If you need a 90-day plan, here’s the sequence I use:
Weeks 1–2: run a content audit of data sources and map the top five repeatable workflows ripe for AI leverage.
Weeks 3–4: define success metrics and offline evals for one beachhead use case.
Weeks 5–8: build the retrieval pipeline, implement prompt baselines, and instrument observability.
Weeks 9–12: ship behind feature flags, run A/B tests with safety thresholds, and iterate on failure cases.
By the end, you’ll have a reusable blueprint, not just a demo.
Team design matters. I staff product trios (PM, design, tech lead) with forward deployed engineers or solutions engineering partners who sit with customers. That proximity reduces spec ambiguity and accelerates learning. It also sharpens our product roadmapping and sprint planning because we plan against outcomes, not outputs.
The hardest part is emotional, not technical: letting go of familiar software rituals that don’t serve AI. Once we accept that AI demands a different operating rhythm, progress feels lighter. The top 1% don’t have secret models; they have disciplined systems. Rebuild yours, and the compounding benefits will outpace any single model upgrade.
AI in customer service is no longer experimental—it’s the standard. In my work leading product and customer experience teams, I’ve seen the shift firsthand, and the stakes have never been higher for getting the foundations right.
Fin’s 2026 Customer Service Transformation Report found that 82% of senior leaders say their teams invested in AI for customer service over the last 12 months, with 87% planning to invest in 2026. Those investments pay off with 24/7 availability, multilingual support, major time savings, and faster resolutions. But there’s an unsung hero behind every AI-first support experience: knowledge management.
A Service Agent is only as good as what we give it to work with. If we’re using an Agent, like Fin, to resolve customer queries end to end, it needs an extensive pool of knowledge to draw from. We have to feed it accurate answers on our product, features, policies, and troubleshooting. Without these, the Agent can’t do its job—and our team ends up handling repetitive queries that should be automated.
In this guide, I’ll walk you through two phases of the journey. Phase 1 is about building a high-quality knowledge base from scratch or overhauling what you have. Phase 2 is about maintaining, optimizing, and scaling that knowledge so your AI performance keeps compounding over time.
Definition: Knowledge management is the process of creating, organizing, sharing, and maintaining knowledge in your business.
Your help center is the obvious example, but it’s only the tip of the iceberg. Effective knowledge management also means creating resources like FAQs, troubleshooting guides, onboarding and best-practice docs, internal support guidance, and learning materials that cover everything from everyday how‑tos to complex billing and account questions.
It means identifying content gaps—missing troubleshooting steps, unclear policy explanations, outdated feature details, or unanswered edge cases—before your customers find them. It means implementing systems so both your Agent and your support reps can access the right information at the right time. And it means developing processes so your content stays in lockstep with product updates, policy changes, and bug fixes.
Your knowledge base now fuels your entire support experience, not just self-serve. It’s the key to accurately answering complex questions, reducing handle time, and delighting customers across channels.
Here’s the blunt truth I share with every team: your Agent is only as strong as what you feed it. A lack of information, messy structure, or stale documentation will tank accuracy and trust. No large language model (LLM) knows your business like you do. It doesn’t understand your customers’ needs, pain points, and use cases. That knowledge is unique to you and your organization, meaning you need to be the one to map it all out and make it available to your Agent.
Every investment in knowledge also has compounding results. Think of it as a flywheel: when you improve your knowledge base, your Agent solves more cases and generates better data. That data shows you what to add, update, or refine next. The sooner you plant the seeds, the sooner you’ll harvest the returns.
Consider a simple calculation. If it takes 30 minutes to write a troubleshooting article for a common issue, that half hour often saves hours for your support reps, who no longer need to handle that query. You can estimate impact by multiplying the average time to compose a response by the frequency of the query. For customers, multiply the number of customers who ask this question by their average time to resolution to quantify time saved. Then monitor Agent involvement rate, resolution rate, and automation rate to see the compounding effect.
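The arithmetic above can be sketched as a few small functions. The 30-minute article comes from the example in the text. The response time, query volume, and resolution time are assumed values I picked purely for illustration, so substitute your own numbers.

```python
import math


def rep_minutes_saved(avg_response_minutes: float, query_count: int) -> float:
    """Rep time saved: average time to compose a response x query frequency."""
    return avg_response_minutes * query_count


def customer_minutes_saved(customers_asking: int, avg_resolution_minutes: float) -> float:
    """Customer time saved: customers asking x average time to resolution."""
    return customers_asking * avg_resolution_minutes


def breakeven_queries(article_minutes: float, avg_response_minutes: float) -> int:
    """Number of deflected queries needed to repay the writing time."""
    return math.ceil(article_minutes / avg_response_minutes)


ARTICLE_MINUTES = 30       # from the example in the text
AVG_RESPONSE_MINUTES = 6   # assumed for illustration
QUERIES_PER_MONTH = 40     # assumed for illustration
AVG_RESOLUTION_MINUTES = 45  # assumed for illustration

saved = rep_minutes_saved(AVG_RESPONSE_MINUTES, QUERIES_PER_MONTH)
print(f"Rep time saved per month: {saved / 60:.1f} hours")
print(f"Break-even after {breakeven_queries(ARTICLE_MINUTES, AVG_RESPONSE_MINUTES)} queries")
print(f"Customer minutes saved: {customer_minutes_saved(QUERIES_PER_MONTH, AVG_RESOLUTION_MINUTES)}")
```

With these assumed inputs, the half hour of writing is repaid after five deflected queries and frees four hours of rep time a month.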
Phase 1: Building your knowledge base is about getting your content durable and AI-ready. I start by prioritizing what to include, where to source it, and how to audit and triage before go‑live.
Data-driven tools can surface the right starting points. For example, platforms like Fin can surface knowledge gaps from real customer conversations where help content is missing, unclear, duplicated, or contradictory. A centralized knowledge hub then becomes your single source of truth for both customer-facing and internal content, with audience controls to ensure your Agent only uses the right materials for the right users.
Here’s how I prioritize content for the first wave. Support FAQs come first—billing changes, account updates, feature usage, troubleshooting, and policy questions. I mine the inbox and historical conversations to find the highest-frequency issues and turn them into crisp help articles the Agent can quote.
Next, I build onboarding and setup guides so new customers reach value fast. I collaborate with customer success and product to document the fastest path to “first win,” and I ensure the Agent can reference those steps in chat and in‑product guidance.
Then I add troubleshooting and advanced guides for deeper issues and power-user workflows. I pull in product managers, engineering, and success managers to capture deeper diagnostics, known limitations, and recommended workarounds—exactly the details that prevent escalations.
Finally, I create content for specific use cases and customer segments. Different goals and configurations require contextual guidance, so I reflect language customers actually use and tailor examples to their jobs-to-be-done.
When sourcing knowledge, I cast a wide net and consolidate it so the Agent and my team can use it reliably. That includes public help articles and troubleshooting guides; internal runbooks, escalation steps, and policy clarifications; curated snippets for short replies and exceptions; past conversations that expose gaps; relevant website pages; and documents like PDFs and DOCX with selectable text.
Before anything goes live, I run a structured content audit. The goal is twofold: prevent the Agent from learning from outdated information, and expose gaps that will cause escalations. I divide content by product area, assign clear ownership, and set a time‑boxed review window to update, consolidate, or retire content. Shared ownership turns a daunting clean‑up into a manageable sprint.
I also walk the customer journey myself—exactly as a new user would—so I can experience the Agent’s responses firsthand and spot missing topics or keywords. Where my platform supports it, I use preview and batch testing to validate coverage across common questions, then simulate more complex workflows to ensure handoffs and steps are properly defined before launch.
After 30 days of Agent activity, I dive into the data. I look for topics driving handoffs to humans, articles correlated with low resolution rates or CSAT, and content that customers view but still escalate. Those signals tell me exactly what to write or refine next—and where to tighten conversation design or retrieval.
Prioritization is where impact accelerates. I focus first on the content my team shares most: top help articles, troubleshooting steps, onboarding flows, and policies. I study conversation analytics to identify the most common questions, the longest handle times, and the lowest CX scores, then close those gaps with targeted content. I also review high‑view articles that haven’t been updated recently and refresh anything affected by changes to product, policies, or plans.
Resourcing matters. Building a high-performing Service Agent shouldn’t be a side gig. I explicitly allocate weekly time for frontline reps, support specialists, and product partners to work on content requests and knowledge improvements. A 5–10 hour per‑person cadence is a practical baseline, and it doubles as a powerful way to upskill the team for emerging AI roles.
Writing for AI is writing for customers. I train the Agent to mirror the terms our customers use by analyzing search queries and real conversation language. I avoid internal jargon, expand acronyms, and clarify key concepts to eliminate ambiguity. When a topic invites yes/no answers, I restate the question and add the necessary context so the Agent doesn’t misinterpret shorthand. I always pair images or videos with clear explanatory text so the guidance is accessible and machine‑readable. And I structure content for scanning with crisp headings and short sections, avoiding hidden information that requires clicks to reveal.
When I have bite‑size answers—common edge cases, policy clarifications, repetitive high‑volume queries—I collect them into focused internal snippets or compact FAQs so the Agent can retrieve and deliver precise answers quickly.
Phase 2: Knowledge management is where the compounding value kicks in. Once live, I track the metrics that matter: resolution rate (the share of Agent-involved conversations that the Agent fully resolves), automation rate (the share of all conversations that the Agent handles), time saved (hours of manual work offloaded), Customer Experience (CX) Score comparisons across AI and human conversations, and CSAT parity.
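Here is a minimal sketch of how the rate metrics can be computed from conversation records. The `Conversation` fields are hypothetical, and I am assuming that "handled by the Agent" means resolved by the Agent, so that automation rate equals involvement rate times resolution rate. Check how your own platform defines these before comparing numbers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Conversation:
    # Hypothetical record shape; real exports will carry many more fields.
    agent_involved: bool
    resolved_by_agent: bool


def support_metrics(conversations: List[Conversation]) -> Dict[str, float]:
    """Compute involvement, resolution, and automation rates."""
    total = len(conversations)
    involved = sum(c.agent_involved for c in conversations)
    resolved = sum(c.agent_involved and c.resolved_by_agent for c in conversations)
    return {
        "involvement_rate": involved / total if total else 0.0,
        "resolution_rate": resolved / involved if involved else 0.0,
        "automation_rate": resolved / total if total else 0.0,
    }


# Illustrative sample: 10 conversations, Agent involved in 8, resolved 4.
sample = (
    [Conversation(True, True)] * 4
    + [Conversation(True, False)] * 4
    + [Conversation(False, False)] * 2
)
print(support_metrics(sample))
```

Keeping the denominators explicit matters: resolution rate divides by Agent-involved conversations, while automation rate divides by overall volume.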
Then I put those learnings to work. Inevitably, some problems won’t be solvable on day one. That’s a gift—it shows me where to refine workflows, add clarifying steps, and strengthen knowledge depth. The richest insights often come from where the Agent struggles or escalates; those friction points become my highest‑ROI content tickets.
Knowledge management is never one‑and‑done. As products, customers, and business goals evolve, so must the knowledge. I formalize an ongoing maintenance cadence with clear ownership, review intervals, and time blocks on the calendar. Wherever possible, I use AI‑assisted drafting to propose updates, summarize gaps, and accelerate review without sacrificing quality.
To sustain momentum, I create a simple intake for content requests—often a lightweight ticket workflow inside our support tools—so anyone in support, success, sales, marketing, engineering, or product can flag gaps and propose improvements. The teams closest to customers usually spot the patterns first; a good intake system ensures we don’t lose those insights.
I also bake knowledge work into every launch plan. New features, product updates, plans, and policies require Agent‑ready content at launch, not after. I partner with product, support, and product marketing to produce best practices and anticipated FAQs in advance, then I review early conversations post‑launch to spot recurring confusion and fast‑follow content needs.
Brand consistency builds trust across every touchpoint. I standardize terminology for products, features, plans, and policies so the Agent, the help center, and human reps all speak the same language. I proof for tone, spelling, and grammar, and I use templates so content feels cohesive. I also include clear contact options for customers who need them—what channel to use, when to use it, and what to expect—so we maintain confidence even when escalation is required.
Clarity about audience matters, too. If certain content applies only to specific roles, plans, or regions, I label it explicitly and, where my platform supports it, target content so the Agent uses the right guidance for the right segment.
Finally, I connect the dots. When conversations, customer data, and knowledge live in one place, every interaction becomes an insight loop. A connected Agent turns support into a retrieval-first pipeline, making it far easier to diagnose issues, improve accuracy, and continuously raise the bar on customer experience.
Behind every high-performing Agent is a rigorous, AI-friendly knowledge management practice. Treating knowledge as a core service function—not a project—creates systems that improve with every conversation. That’s how we transform support from a cost center into a compounding engine for customer satisfaction, operational efficiency, and growth.
I’ve learned the hard way that the fastest path to a reliable command-line agent is radical subtraction. "In the last month of developing Amplitude Wizard CLI, we cut more than we added. Learn less is more when it comes to building CLI agents." That decision was less about minimalism and more about product strategy: constraints sharpen behavior, clarify intent, and raise trust.
When I evaluate agentic AI systems, especially those that act on developer environments, I start by asking what the agent must never do. By establishing hard guardrails first, the design naturally converges on an opinionated, safe, and teachable interface. Every additional flag, tool, or permission expands the blast radius; every removal shortens the path to first success.
For CLI agents, the most valuable product choice is a narrow toolset with sane defaults. Opinionated workflows reduce cognitive load and failure modes, while clear human override points keep users in control. I prefer a bias toward idempotent actions, reversible changes, and explicit confirmation gates for anything destructive. If a feature can’t explain itself in a single, crisp sentence in the help text, it likely doesn’t belong.
Security and reliability flow from limits. Progressive permissioning, scoped credentials, and time-bounded tokens prevent the agent from wandering. Dry-run modes build confidence without side effects. When a user can reason about what the agent will and won’t do, adoption accelerates—and support tickets plummet.
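To show how a dry-run default and a confirmation gate fit together, here is a minimal `argparse` sketch. It is not the Amplitude Wizard CLI: the program name `agent-sketch`, the `--apply` and `--yes` flags, and the `run` helper are all hypothetical. The destructive step is only reported, never executed.

```python
import argparse
from typing import Callable, List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="agent-sketch",  # hypothetical tool name
        description="Delete a path, dry-run by default.",
    )
    parser.add_argument("path", help="Path the agent proposes to delete.")
    parser.add_argument("--apply", action="store_true",
                        help="Perform the action instead of printing the plan.")
    parser.add_argument("--yes", action="store_true",
                        help="Skip the confirmation prompt.")
    return parser


def run(argv: List[str], confirm: Callable[[str], str] = input) -> str:
    """Return a status line; the default path has no side effects."""
    args = build_parser().parse_args(argv)
    plan = f"delete {args.path}"
    if not args.apply:
        return f"DRY RUN: would {plan}"
    if not args.yes and confirm(f"Really {plan}? [y/N] ").strip().lower() != "y":
        return "ABORTED"
    # A real tool would perform the action here; this sketch only reports it.
    return f"APPLIED: {plan}"


print(run(["tmp/cache"]))
print(run(["tmp/cache", "--apply", "--yes"]))
```

The safe behavior is the default and the destructive one needs two explicit opt-ins, so a user can always see the plan before anything changes.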
Observability is the other half of trust. I instrument "Agent Analytics" across every run: inputs, tool choices, durations, outcomes, and error patterns. Those signals reveal where the agent gets confused, which steps users abandon, and which prompts need pruning. With that loop in place, "less is more" stops being a philosophy and becomes an evidence-backed operating model.
I anchor the roadmap in eval-driven development. Before adding a capability, I define a measurable task, a success threshold, and the smallest viable interface to reach it. If the capability can’t lift completion rate, time-to-first-success, or re-run stability, it waits. That simple discipline protects the experience from feature creep and preserves velocity in CI/CD.
Under the hood, I design for a retrieval-first pipeline and careful context window management. The agent should fetch only the minimally relevant facts, present a compact plan, and execute predictably. Thoughtful prompt engineering helps—but prompts are not a substitute for clear boundaries, deterministic tool contracts, and robust error handling.
Documentation is product. I maintain docs-as-code with runnable examples that mirror the golden paths. When the docs and the CLI disagree, the CLI changes—never the docs. This creates an internal forcing function: if we can’t document it simply, we probably shouldn’t ship it.
My litmus test for any proposed addition is simple: does this make the mental model smaller? If not, cut it, make it progressive, or hide it behind a clearly named subcommand. Defaults should be boring, safe, and fast. Advanced power should be opt-in and discoverable without overwhelming new users.
The paradox of agentic AI is that capability grows as surface area shrinks. By removing distractions, we amplify signal, increase repeatability, and earn the right to add the next carefully chosen step. The result is a CLI agent that feels sharp, dependable, and—most importantly—useful on day one.
Inspired by this post on Amplitude – Perspectives.
When I guide teams building agentic AI features, I’ve seen a single prompt turn Amplitude Global Agent into either a world-class analyst or a well-meaning rambler. The difference isn’t magic—it’s method. With the right structure and iteration, we consistently get faster, clearer insights that stand up to product and analytics scrutiny.
AI has gotten really good, but success still depends on the quality of your prompts. Explore three best practices for prompting in Amplitude Global Agent.
Tip 1: Define the role, goal, and guardrails. I begin every prompt by stating the agent’s role (for example: “You are a product analyst”), the business objective (“identify activation drop-offs by cohort”), and the boundaries (“use only Amplitude analytics events and properties provided; return JSON with metric, segment, timeframe”). This simple pattern reduces ambiguity, improves context window management, and yields outputs I can compare across runs.
Tip 2: Ground the model with concrete context and examples. Agent outputs improve dramatically when I supply the exact data it should reference: event names, properties, segments, filters, and timeframes. I often include a short example (one ideal question and one ideal answer) to anchor tone, structure, and depth. Think retrieval-first pipeline: feed the agent authoritative snippets (definitions, dashboards, prior queries) rather than hoping it guesses. That’s how I cut hallucinations and make results reproducible for product managers working with LLMs.
Tip 3: Iterate with measurement, not vibes. I version prompts, A/B test variants, and log inputs and outputs so I can score quality with lightweight evals (accuracy against known answers, clarity, and actionability). Over time, a small library of “winning” prompts emerges for common AI workflows such as activation analysis, retention cohorts, and anomaly detection, so the team can move from tinkering to repeatable performance. This is where Agent Analytics practices pay off: we inspect outcomes, not just outputs.
A practical starter structure I use: Role and Audience; Objective and Success Criteria; Data Context (events, properties, segments, timeframe); Constraints (sources, methods, privacy); Output Format (tables/JSON, fields, length); Examples (one good Q/A); and Fallbacks (what to do when data is insufficient). Even written as plain language, that scaffold reliably steers Amplitude Global Agent to precise, defensible answers.
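The scaffold above can be kept as a small template so every prompt carries the same seven sections in the same order. This is a sketch under my own assumptions: `PromptSpec` is a hypothetical helper, and the event names in the example are invented placeholders, not real Amplitude events.

```python
from dataclasses import dataclass


@dataclass
class PromptSpec:
    """Hypothetical container for the seven scaffold sections."""
    role: str
    objective: str
    data_context: str
    constraints: str
    output_format: str
    example: str
    fallback: str

    def render(self) -> str:
        """Join the sections into one plain-language prompt."""
        sections = [
            ("Role and Audience", self.role),
            ("Objective and Success Criteria", self.objective),
            ("Data Context", self.data_context),
            ("Constraints", self.constraints),
            ("Output Format", self.output_format),
            ("Examples", self.example),
            ("Fallbacks", self.fallback),
        ]
        return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


spec = PromptSpec(
    role="You are a product analyst writing for a product manager.",
    objective="Identify activation drop-offs by cohort.",
    # Event names below are invented placeholders.
    data_context="Events: signup_completed, first_project_created. Timeframe: last 30 days.",
    constraints="Use only the events and properties listed above.",
    output_format="Return JSON with the fields metric, segment, timeframe.",
    example="Q: Which cohort activates least? A: a JSON object with the three fields above.",
    fallback="If the data is insufficient, say so and list what is missing.",
)
prompt = spec.render()
print(prompt)
```

Storing the spec as data rather than free text also makes prompt versions easy to diff and score against each other.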
The emotional arc here is familiar: when the agent nails a complex funnel question in one pass, the team gets that “oh wow” moment; when it meanders, morale dips. Clear prompting turns those spikes of delight into a steady cadence of wins—less rework, faster learning loops, and cleaner handoffs from discovery to delivery. In short, invest in prompt engineering once, and you compound gains across every analysis session.
If you’re just getting started, pick one critical question (for example, activation or retention), apply the three tips above, and commit to two to three prompt iterations with scoring. Within a single sprint, you’ll have a robust template you can reuse and adapt—helping Amplitude Global Agent deliver trustworthy insights at the speed your product strategy demands.
Inspired by this post on Amplitude – Perspectives.
When teams evaluate AI Agent options for customer service, I often see the rigor aimed at the wrong subset of criteria. After leading and observing dozens of proof of concept (POC) efforts with our customers and prospects, I understand why performance—accuracy scores, resolution rates, and benchmark tests on curated datasets—soaks up most of the attention. But those indicators alone won’t guarantee success once you leave the sandbox and face real customers.
If your POC only proves that the AI “works,” you’re missing the bigger picture. Here’s what else I look for to make the best long-term decision.
How does it handle your real-world setup?
Performance is table stakes, but it has to reflect the messiness of an actual support environment. The best-performing Agents don’t just get answers right—they exhibit resilient, human-like behavior under pressure. I watch how the Agent behaves when it doesn’t know an answer: does it recover or spiral? Does it stay on track through multi-step requests, and how gracefully does it hand off to human agents? If your knowledge base depends on a retrieval-first pipeline, test cross-source retrieval and grounding—not just single-document lookups.
When I build evaluation scenarios, I put the Agent through its paces with a broad, realistic mix:
Multi-turn queries that require the Agent to carry context across a conversation, not just answer isolated questions.
Vague or fragmented inputs, like typos, grammatical errors, and incomplete questions, because that’s how customers actually write.
Edge cases and sensitive scenarios, like billing disputes, frustrated customers, and questions that sit at the boundary of what the Agent is trained on.
Different phrasings of the same question. An Agent that handles one version well but fails on a rephrasing has a knowledge problem, not a performance problem.
Queries that require pulling from multiple knowledge sources. Real issues are rarely answered by a single help article, and an Agent that can only handle single-source questions will hit a ceiling fast.
Multilingual conversations, if your customer base requires it. Performance can vary significantly across languages and it’s better to discover that in testing than in production.
This preparation is worth the effort. Any Agent can look impressive in a demo; what matters is how it holds up as part of your team, serving your customers in production.
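One way to keep that scenario mix honest is to tag each test message with its category and report a pass rate per category rather than one blended score. The sketch below assumes that approach. The scenarios are invented, and `stub_agent` is a trivial keyword matcher standing in for a real Agent call.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Scenario = Tuple[str, str, str]  # (category, customer message, expected topic)

# Invented scenarios for illustration only.
SCENARIOS: List[Scenario] = [
    ("typo", "how do i cancl my subscripton", "cancellation"),
    ("rephrasing", "I want to cancel my subscription", "cancellation"),
    ("rephrasing", "Please stop billing me every month", "cancellation"),
]


def stub_agent(message: str) -> str:
    """Stand-in for a real Agent: matches one exact keyword."""
    return "cancellation" if "cancel" in message.lower() else "unknown"


def pass_rate_by_category(
    agent: Callable[[str], str], scenarios: List[Scenario]
) -> Dict[str, float]:
    """Share of scenarios in each category where the agent hit the expected topic."""
    totals: Dict[str, int] = defaultdict(int)
    passes: Dict[str, int] = defaultdict(int)
    for category, message, expected in scenarios:
        totals[category] += 1
        passes[category] += int(agent(message) == expected)
    return {category: passes[category] / totals[category] for category in totals}


print(pass_rate_by_category(stub_agent, SCENARIOS))
```

A per-category breakdown shows where an Agent is brittle, here on typos and on one rephrasing, which a single aggregate score would hide.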
What does it feel like to interact with the Agent?
Two AI Agents can post the same quantitative scores (resolution rate, containment rate, and more) and still deliver very different customer experiences. Resolution rate tells me whether the Agent finishes conversations; it says nothing about how customers felt during them. I deliberately assess the experience, not just the outcome, because conversation design shapes trust and brand perception.
Here’s what I look for to ensure the AI Agent is enjoyable to interact with:
Is the tone natural and on-brand, or does it feel robotic and generic?
Does it build trust early in the conversation, or does it create friction that makes customers want to immediately request a human?
When it doesn’t know the answer, does it handle that gracefully?
When it hands off to a human, is that transition seamless, or does the customer feel abandoned?
As George Dilthey at Clay put it when evaluating their AI setup: “Keep what’s important to your business up front and center. For us, that was transparency and control over the customer experience.”
That framing is exactly right. The Agent represents your brand in every conversation. Customers don’t experience “accuracy,” they experience conversations. An Agent that’s technically accurate but tonally off-brand will erode customer trust over time.
I make the experience dimension explicit in my POCs. I have people on my team—and when possible, a small cohort of real customers—interact with the Agent under realistic conditions. Then I ask how it felt, not just whether it worked.
Can you keep improving it after launch?
This is the dimension most teams don’t evaluate at all, and it’s possibly the most important one. Choosing an Agent that works today, and that lets you keep improving the customer experience over time, requires more than a functional demo. You’re buying a system that must get better every week, not just during the first sprint.
The feedback loop
Can your team easily review conversations and identify where the Agent is underperforming? Can you pinpoint specific gaps (missing knowledge, incorrect tone, poor handoff decisions) and act on them quickly? The faster the loop between “something isn’t working” and “we’ve fixed it,” the more value compounds over time. In practice, that means instrumenting conversations, leveraging Agent Analytics, tagging misroutes and tone slips, and running targeted evals on known failure modes.
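As a rough illustration of the tagging step, here is a small Python sketch that counts reviewer tags to show which failure modes to target first. The record shape, the tag names, and the `top_failure_modes` helper are assumptions made for the example, not the output format of any particular analytics product.

```python
from collections import Counter

# Hypothetical review records: one per reviewed conversation, with the tags
# a reviewer applied (an empty list means no issue was found).
reviews = [
    {"id": "c1", "tags": ["missing_knowledge"]},
    {"id": "c2", "tags": []},
    {"id": "c3", "tags": ["tone_slip", "poor_handoff"]},
    {"id": "c4", "tags": ["missing_knowledge"]},
]

def top_failure_modes(reviews, n=3):
    """Count tags across reviews and return the n most frequent."""
    counts = Counter(tag for r in reviews for tag in r["tags"])
    return counts.most_common(n)

def issue_rate(reviews):
    """Share of reviewed conversations with at least one tag."""
    flagged = sum(1 for r in reviews if r["tags"])
    return flagged / len(reviews) if reviews else 0.0

print(top_failure_modes(reviews))
print(issue_rate(reviews))
```

The most frequent tag becomes the next targeted eval; the issue rate gives a single number to watch week over week.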
The speed of iteration
When you identify a gap, how quickly can you address it? This is partly a question of tooling (how easy is it to update knowledge, refine guidance, adjust behavior?) and partly a question of team capability. The teams getting the most out of AI are the ones that have changed how they operate and made continuous improvement a part of their everyday work. They’ve committed to going all-in for the long term, not just the first few weeks when launching their AI Agent. We treat this as eval-driven development: automate evaluations that mirror real tickets, tighten prompt engineering and retrieval settings, and ship small fixes daily.
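To make "automate evaluations that mirror real tickets" concrete, here is a minimal sketch of an eval loop in Python. It assumes a simple phrase-matching check, which is far cruder than most real evals; `EVAL_CASES`, `run_evals`, and `stub_agent` are hypothetical names, and the stub stands in for whatever Agent you are testing.

```python
from typing import Callable

# Hypothetical eval cases modeled on real tickets: each pairs a customer
# message with phrases a good reply must contain.
EVAL_CASES = [
    {"input": "How do I reset my password?", "must_include": ["reset link"]},
    {"input": "I want a refund for last month", "must_include": ["refund", "billing"]},
]

def run_evals(agent: Callable[[str], str], cases=EVAL_CASES) -> float:
    """Return the fraction of cases whose reply contains every required phrase."""
    passed = 0
    for case in cases:
        reply = agent(case["input"]).lower()
        if all(phrase in reply for phrase in case["must_include"]):
            passed += 1
    return passed / len(cases)

def stub_agent(message: str) -> str:
    # Stand-in for a real Agent call; replace with your own client.
    if "password" in message.lower():
        return "I've sent a reset link to your email."
    return "Let me connect you with a teammate."

pass_rate = run_evals(stub_agent)
print(pass_rate)
```

Running the same cases after every knowledge or guidance change turns "did that fix help?" into a number you can compare.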
The vendor partnership
The vendor behind the Agent matters just as much as the solution itself. You’re choosing a partner for transformation that will help you evolve how your business delivers customer experience. Ask:
How does customer feedback influence the product roadmap, and can they show you examples?
If you have feedback on limitations or weaknesses, do they engage transparently or get defensive?
What kind of support will you get post-launch?
Are they shaping where AI customer experience is going, or reacting to what others are building?
How a vendor responds to those questions tells you more about the long-term relationship than any benchmark result.
What a good POC proves
If your POC only proves “the AI works,” you haven’t done enough. A strong proof of concept tests performance in realistic conditions, evaluates the experience from the customer’s perspective, and validates the system that will support continuous improvement after launch. Done well, it sets you up for long-term operational success and builds organizational AI readiness—not just a flashy demo.
I just finished a standout conversation on AI engineering and product discovery that hit squarely at the questions I hear from product leaders every week: What does practical AI engineering actually look like for product managers, and how do we ramp without a traditional software background?
Here’s the arc that resonated with me: a product leader goes from occasional tinkerer to spending 60% of her time on real engineering work—building AI-powered tools for continuous discovery, forming a licensing partnership with Vistaly, and quietly constructing "Teresa Bot," an AI discovery coach trained on everything she’s ever written. The journey is less about mastering every framework up front and more about structuring learning, tightening feedback loops, and shipping useful outcomes.
The most energizing throughline is the myth-busting: you don’t need a deep engineering pedigree to operate in this space. Curiosity, rigorous discovery habits, and eval-driven development will take you further than brute-force coding. As one line from the episode puts it, "I know anything that I don't know how to do, Claude will teach me how to do. And Claude is infinitely patient." That captures the posture I expect modern PMs to adopt with LLMs and tools like Claude Code.
On the nuts and bolts, the discussion gets concrete about AI engineering in practice: context engineering, prompt writing, RAG, observability, and evals. This is the real stack—think retrieval-first pipeline design, prompt engineering guardrails, instrumentation for model drift, and continuous, automated evals to protect behavior as you iterate. If you’ve been dabbling with context window management but haven’t formalized your test harnesses or dashboards, this is your cue.
What I appreciated most is how directly discovery skills transfer. Framing assumptions, running tight customer interviews, mapping opportunity solution trees, and aligning stakeholders—these are precisely the muscles you need to shape problem spaces before you “vibe code” solutions. As one reflection nails it, "The moment I learned more about data science, all of my discovery work became so different." That’s the bridge from qualitative sense-making to measurable, model-centered learning.
The partnership with Vistaly is also a smart build vs buy case study. Rather than reinvent infrastructure, the choice to license purpose-built opportunity solution tree software keeps focus on the differentiated layer—learning systems and product outcomes. As it’s put plainly: "I don't want to build all that stuff. I don't really want to be a software company. I'm almost set up like an AI researcher." Product leaders should internalize this lens for platform choices across their AI roadmaps.
On "Teresa Bot," the implementation breadcrumbs are familiar and pragmatic: pair a solid retrieval-first pipeline (RAG) with clean content sources, keep prompts modular, enforce code review even for vibe coding, and stand up observability and evals early. I’ve had similar success using Claude Code for rapid iteration while treating every prompt and context change as a versioned artifact. That discipline pays dividends when you need to trace regressions or prove improvements.
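One lightweight way to treat prompts and context as versioned artifacts is to derive an ID from their content, so any change produces a new version you can log next to eval results. This Python sketch is an assumption about how that could look, not a description of how Teresa Bot is built; `prompt_version` and the sample templates are hypothetical.

```python
import hashlib
import json

def prompt_version(template: str, context_sources: list) -> str:
    """Derive a short, stable ID from the prompt text and its context sources."""
    payload = json.dumps(
        {"template": template, "sources": sorted(context_sources)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

# Made-up templates: the second adds one instruction.
v1 = prompt_version("You are a discovery coach. Answer from: {context}", ["blog", "book"])
v2 = prompt_version(
    "You are a discovery coach. Cite sources. Answer from: {context}", ["blog", "book"]
)

print(v1, v2)
```

Because the ID depends only on content, the same prompt always maps to the same version, and a regression can be traced to the exact change that introduced it.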
If you’re a PM ready to lean in, start small and systematic. Pick one high-signal discovery workflow, model the knowledge you already have, and wire up basic evals before you scale. Keep a lab notebook, use programmatic tests to gate deployments, and measure outcome movement—not just model cleverness. This is where LLMs for product managers move from novelty to execution readiness.
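A deployment gate can be as small as a function that compares a candidate's eval pass rate with the current baseline. The sketch below is illustrative only: `should_deploy` is a hypothetical helper, and both thresholds are assumptions you would tune for your own workflow.

```python
def should_deploy(
    baseline: float,
    candidate: float,
    min_pass_rate: float = 0.9,    # assumed floor, not a recommendation
    max_regression: float = 0.02,  # assumed tolerance versus the baseline
) -> bool:
    """Allow a deploy only if the candidate clears the floor and has not regressed."""
    clears_floor = candidate >= min_pass_rate
    no_regression = candidate >= baseline - max_regression
    return clears_floor and no_regression

print(should_deploy(baseline=0.95, candidate=0.96))  # improvement
print(should_deploy(baseline=0.95, candidate=0.85))  # below the floor
print(should_deploy(baseline=0.99, candidate=0.92))  # regressed too far
```

Wired into CI, a check like this blocks a merge whenever a prompt or context change makes the evals worse.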
Resources mentioned: Watch the episode on YouTube, Claude Code, Vistaly (opportunity solution tree software), Opportunity Solution Trees: Visualize Your Discovery to Stay Aligned and Drive Outcomes, Product Talk Academy, Just Now Possible Podcast, Vibe Coding Best Practices: Avoid the Doom Loop with Planning and Code Reviews, and the AI Evals for Engineers and PMs course on Maven.
What stood out to you—RAG design choices, eval frameworks, or the discovery-to-engineering mindset shift? Drop your thoughts below; I’d love to learn how you’re applying these patterns in your own product roadmaps.
We just launched Operator, an Agent for your customer operations that helps you understand, manage, and improve your entire customer experience. I’ve spent years shipping AI-driven products at production scale, and this one reflects the lessons I’ve learned the hard way about what it really takes to go from a flashy demo to a dependable system your team trusts.
To give you a clear view of just how powerful this Agent is, I want to share the technical infrastructure and engineering choices that make Operator work reliably at production scale across thousands of customer workspaces. My goal is to demystify the gap between a well-prompted LLM and a true, production-grade Agent—so you can make an informed build vs. buy decision.
If you’re a technical leader evaluating whether to build something like this yourself, or trying to understand the difference between a well-prompted LLM and a production Agent system, this is for you.
Escaping the “it’s just an LLM” trap
Most engineering teams in this space start the same way: a prototype. You take a foundation model, give it API access to your support data, add a system prompt with some domain context, and you’ve got something that queries your database, summarizes tickets, and generates reports that look right. It demos convincingly—and I’ve been there, impressed in the moment, only to watch it buckle under real-world complexity.
The problem with that prototype is that it obscures the scope of what’s actually required. It demonstrates the 10% of the system that’s straightforward to build, and it’s easy to assume the rest is just as straightforward. It isn’t. The gap between a working demo and a production system your team depends on daily is where most of the engineering investment lives. That’s precisely the gap we focused on closing.
With Operator, we’ve invested deeply in every layer: tooling, reasoning, how the Agent takes action, and the infrastructure that makes it reliable at scale. Here’s a closer look at the architecture and why it matters for agentic AI, platform scalability, and observability.
The tooling layer
The first thing we had to confront was that the obvious approach (giving a model access to your APIs and letting it figure things out) doesn’t hold up in production. The model makes reasonable decisions for simple queries, but operating across thousands of customer workspaces with different configurations, data models, and usage patterns, a “figure it out” approach isn’t nearly precise enough.
What you need is purpose-built tooling: tools that encode decisions about what data to fetch, how to structure it, what context to include, and what to leave out. Operator has over 50 of these tools and 10 skills.
A tool is a single action that Operator takes (search content, run a query, look up a conversation). A skill chains multiple tools together to complete a whole job, like debugging a conversation end-to-end, rolling out a content update across an entire help center, and identifying the next automation opportunity. This is where AI workflows move from abstract prompts to dependable, repeatable outcomes.
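The tool-versus-skill distinction can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Operator's implementation: the two tool functions return placeholder data, and `run_skill` is a hypothetical runner that passes shared state from one tool to the next.

```python
from typing import Callable, Dict, List

# A tool reads the shared state and returns new keys to merge into it.
Tool = Callable[[Dict], Dict]

def look_up_conversation(state: Dict) -> Dict:
    # Placeholder data; a real tool would call your support platform.
    return {"conversation": {"id": state["conversation_id"], "topic": "refunds"}}

def search_content(state: Dict) -> Dict:
    # Uses the previous tool's output to decide what to search for.
    topic = state["conversation"]["topic"]
    return {"articles": [f"Help article about {topic}"]}

def run_skill(tools: List[Tool], state: Dict) -> Dict:
    """Run tools in order, merging each tool's output into the state."""
    for tool in tools:
        state = {**state, **tool(state)}
    return state

# A "skill" here is just an ordered chain of tools.
debug_conversation = [look_up_conversation, search_content]
result = run_skill(debug_conversation, {"conversation_id": "c-42"})
print(result["articles"])
```

The point of the structure is that each tool stays small and testable, while the skill encodes the order and hand-offs that complete the whole job.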
The difference between using thin wrappers around API endpoints and purpose-built tooling shows up in something as seemingly simple as a performance question. When you ask “how did Fin perform last week?”, a naive implementation runs a query and hands back a table. Operator runs a reporting tool that determines which metrics are relevant for your specific workspace, which are meaningful for your particular question, and what the numbers actually mean in context, giving you a much richer answer that you can do something tangible with.
Developing that behavior took months of engineering. Not because any individual piece is conceptually hard, but because getting it right across the full range of customer workspaces, configurations, and edge cases is an iterative process. You build it, you test it against real conversations, you find the cases where it breaks, you fix those, and you repeat. There’s no shortcut—and in practice, this is where most DIY efforts stall.
The intelligence layer
The tooling layer solves what to do, but beneath it is a harder problem: understanding what’s worth doing, and why. This is the layer that makes Operator understand your business rather than just query it. Three components go into it, and in my experience they’re non-negotiable for a reliable Agent.
1. Semantic search
Unlike solutions that rely on keyword matching, Operator uses a system that understands what content is about, not just what words it contains. When it searches your help center, it’s using the same semantic search engine we’ve spent years optimizing for Fin itself. This is a retrieval system that’s been tuned against millions of real support conversations, with precision and recall characteristics we’ve measured and improved continuously. This retrieval-first pipeline is the backbone of grounding and dramatically reduces hallucinations.
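To show the general idea behind semantic versus keyword matching, here is a toy Python sketch. It makes no claim about how Operator's search engine works: the three-dimensional vectors are made up by hand to stand in for real embeddings, and `cosine` and `keyword_hits` are illustrative helpers.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def keyword_hits(query, title):
    """Number of words the query and the title share."""
    return len(set(query.lower().split()) & set(title.lower().split()))

# Made-up vectors standing in for embeddings from a real model.
docs = {
    "Resetting your password": [0.8, 0.2, 0.1],
    "Understanding billing cycles": [0.1, 0.0, 0.9],
}
query_text = "I can't sign in"
query_vector = [0.9, 0.1, 0.0]

# Keyword matching finds no overlap with either title.
print({title: keyword_hits(query_text, title) for title in docs})

# Vector similarity still ranks the password article first.
best = max(docs, key=lambda title: cosine(query_vector, docs[title]))
print(best)
```

The query shares no words with either title, yet the vector comparison still surfaces the relevant article, which is the behavior keyword search cannot provide.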
2. Attribute awareness
Operator has access to your data and knows what is meaningful for different questions. It knows which metrics are actually in use in your workspace, which custom attributes carry signals, and which fields are populated versus effectively empty. We’ve built specific skills that give Operator this meta-knowledge, so when it’s investigating a performance question, it’s looking at the right things, not hallucinating insights from sparse data.
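A simple version of "populated versus effectively empty" is a fill-rate check per field. The Python sketch below is an assumption about one way to compute it, with hypothetical helper names, sample records, and a made-up 50% threshold; it does not describe Operator's internals.

```python
def fill_rates(records, fields):
    """Share of records in which each field has a non-empty value."""
    rates = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        rates[field] = filled / len(records) if records else 0.0
    return rates

def usable_fields(records, fields, min_fill=0.5):
    """Fields populated often enough to support analysis (threshold is an assumption)."""
    return [f for f, rate in fill_rates(records, fields).items() if rate >= min_fill]

# Made-up workspace records.
records = [
    {"plan": "pro", "region": "EU", "nps": None},
    {"plan": "free", "region": "", "nps": None},
    {"plan": "pro", "region": "US", "nps": 9},
    {"plan": "pro", "region": "US", "nps": None},
]

print(fill_rates(records, ["plan", "region", "nps"]))
print(usable_fields(records, ["plan", "region", "nps"]))
```

With these records, the sparsely filled `nps` field is excluded, so an analysis would not draw conclusions from a single data point.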
3. Intelligent reasoning
A well-built Agent can answer your question and anticipate what you should ask next. If you ask Operator about escalations spiking, it doesn’t just say, “escalations increased 23% week-over-week.” It’ll continue on to tell you why this happened by examining the escalated conversations and identifying that a disproportionate number involved a specific product area, before moving on to check whether the relevant help content is up to date, and, if it isn’t, proposing an update. That chain of reasoning isn’t prompt engineering. It’s encoded in the skills we’ve built, refined against the patterns we see across our entire customer base.
The action layer
This is where the engineering complexity increases by an order of magnitude because instead of just analyzing problems and recommending solutions, Operator takes action to solve them itself. It can update Guidance rules, draft and publish help articles, create Procedures, configure data connectors, and modify your Fin configuration. Moving from read-only insights to write-capable actions is a fundamentally different class of product and infrastructure problem—one that demands rigorous SRE practices and rock-solid safeguards.
Every one of these actions has to be safe, reversible, and auditable. An analytics tool that occasionally returns a wrong number is frustrating, but an Agent that occasionally applies a wrong configuration change to a live support system is a different category of problem. To prevent this, we built a robust proposal system, whereby every change Operator suggests is presented as a reviewable diff. You see exactly what will change before anything is applied, with the option to accept, reject, or refine. Nothing goes live without your explicit approval.
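The propose-then-approve pattern can be illustrated with Python's standard `difflib` module. This is a minimal sketch of the general technique, not Operator's proposal system: `propose_change` and `apply_change` are hypothetical helpers, and the article text is made up.

```python
import difflib

def propose_change(current: str, proposed: str, name: str = "article") -> str:
    """Render a unified diff so a reviewer sees exactly what would change."""
    diff = difflib.unified_diff(
        current.splitlines(),
        proposed.splitlines(),
        fromfile=f"{name} (live)",
        tofile=f"{name} (proposed)",
        lineterm="",
    )
    return "\n".join(diff)

def apply_change(current: str, proposed: str, approved: bool) -> str:
    """Return the proposed text only with explicit approval; otherwise keep the live text."""
    return proposed if approved else current

# Made-up help article content.
live = "Refunds take 10 days.\nContact support to start one."
draft = "Refunds take 5 business days.\nContact support to start one."

diff_text = propose_change(live, draft)
print(diff_text)

# Nothing changes unless the reviewer approves.
published = apply_change(live, draft, approved=False)
```

Keeping the previous text around after an approved change is what makes the action reversible, and logging each diff with its approver is what makes it auditable.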
What else sets Operator apart
A UI that’s both conversational and graphical, not one or the other. Operator blends conversational interaction with purpose-built graphical components. Proposal diffs show exactly what will change in an article. Inline charts visualize performance trends. Dashboards render directly inside the conversation thread. In practice, that means a knowledge manager reviews a structured diff—not a wall of LLM-generated text—and a team lead asking about weekly performance gets an accurate chart with context, not a paragraph approximating data.
Building this hybrid experience is extremely difficult outside of a native platform integration. In a chat interface or CLI, you’re limited to text output; in a standalone dashboard, you lose conversational context. Operator does both in the same thread, so every interaction is detailed and context-rich—and importantly, actionable in the flow of work.
It lives where your team already works. Operator is built into the same platform your team uses every day. It’s not a separate tool with a separate login, nor is it a Slack bot your engineer set up that only three people know about. It operates exactly where you are, alongside the conversations, help center articles, workflows, and data you’re working with. That tight integration closes the gap between finding a problem and fixing it: spot an outdated article while reviewing a Fin conversation, and Operator can surface the fix in the same session. Notice an escalation spike in the morning, and you can ask Operator to investigate without switching tools, waiting for a data pull, or filing a ticket.
The compounding advantage
Every customer using Operator teaches us something. We see which debugging approaches work across different types of support operations, learn which content structures perform better, and identify automation strategies that consistently land. Those patterns get encoded back into Operator’s skills and tools. When we discover that a particular sequence of investigation steps reliably identifies the root cause of a spike in escalations, we build that into Operator’s diagnostic skill. When we find that a specific way of structuring help articles leads to higher Fin resolution rates, we encode that into the content creation skill. Our engineering team is continuously shipping improvements based on what we observe across the entire customer base.
A custom-built solution gives you exactly what you built, meaning it doesn’t get smarter unless you invest engineering resources into making it smarter. And that usually means taking time and talent away from your core product. I’ve watched teams underestimate the ongoing cost of eval-driven development, model upgrades, and API churn—costs that only grow as your footprint expands.
We’re not locking the door
Some teams want to build their own Agents. Some of our most technical customers do this. But when you do, you’re working with raw APIs and building your own tooling on top of them. When you use Operator, you’re working with a system that already knows what questions to ask, understands your data, and encodes the best practices we’ve learned from thousands of support teams. We recently launched the Fin CLI, which means you can use third-party agents like Claude Code or Cursor to interact with your Fin data and configuration. That door is open. What I hope this post has clarified is everything that goes into building Operator: over 50 tools and 10 skills purpose-built for support operations, years of investment in semantic search, deep integration with every layer of Fin’s stack, the proposal system, the intelligence layer, and the reliability infrastructure.
If you’d still like to move ahead with building a custom solution, here’s an honest assessment. You can build a useful read-only tool in weeks. It’ll query your data, summarize tickets, and generate reports, but turning it into a production system will take quarters. Reliability, security, edge case handling, multi-tenant data isolation, and graceful degradation are all important architectural decisions that you’ll need to get right from the start. The action layer is also where you might risk stalling out. Going from “here’s what’s wrong” to safely making changes in a production system is a fundamentally different engineering problem than analysis. Most DIY projects never get there. Finally, you’ll be maintaining it forever. Every model upgrade, API change, and new capability in your support platform means updating your custom tooling. We have a team dedicated to this. You’ll need one too.
The economics still favor buying when a vendor has invested more in the problem than you can justify internally. What I hope this post adds is a clearer picture of what that investment actually looks like from an engineering perspective—and why it compounds into a durable advantage for your support organization.
The investment is ongoing. The problems we’re solving at the infrastructure level today are harder than the ones we solved a year ago, and that trajectory isn’t slowing down. If you’re ready to see the difference a production-grade Agent can make, explore Operator.
Revenue leaders are starting to use AI to generate better leads, capture peak buyer intent, and scale their pipeline without a linear increase in headcount. I see it every day in my own teams: when we get the foundations right, AI doesn’t just answer questions—it accelerates qualification and turns curiosity into pipeline.
Done well, an AI-first inbound sales experience engages buyers 24/7 in any language, qualifies leads intelligently, and routes high-intent prospects to the right conversion path. But behind that experience, there’s an unsung hero: knowledge management. I’ve learned the hard way that even the smartest Agent underperforms if it’s not fed the right information.
A Sales Agent is only as good as what you give it to work with. If you’re using an Agent, like Fin, to run inbound sales motions end to end, it needs an extensive pool of knowledge to draw from. You need to feed it accurate answers on pricing, features, and plan fit, and clear rules for how to qualify and route each prospect. Without it, your Agent can’t do its job, and your sales team is back to answering the same questions manually and triaging leads that could have been handled automatically.
In this guide, I walk through everything you need to know about building and maintaining the knowledge base that powers your Sales Agent—what to include, how to launch, what to measure, and how to iterate so results compound over time.
What is knowledge management and why is it so important?
Definition: Knowledge management is the process of creating, organizing, sharing, and maintaining knowledge in your business.
Your public website and product pages are classic examples, but those are just the tip of the knowledge management iceberg. In an inbound sales motion, knowledge management involves a range of activities such as creating resources (FAQs, pricing overviews, competitive battlecards, case studies, internal sales materials), identifying gaps in documentation and qualification criteria, implementing systems that make information easy to access and use, and developing processes to keep everything current. In my experience, these elements are what allow an Agent to move from merely answering questions to recommending the right plan and explaining why it fits.
Why knowledge management matters even more in the age of AI
Your knowledge base is no longer just static collateral for buyers to read. It powers your Sales Agent and entire inbound motion. It’s the key to accurately answering complex prospect queries, guiding product discovery, qualifying intent in real time, and accelerating the path to pipeline. Two realities shape my approach:
1) Your Agent is only as strong as what you “feed” it. A lack of information, poorly structured sales materials, or out-of-date pricing documentation all prevent it from giving buyers clear and correct answers, and the resulting poor buying experiences degrade trust and cost you deals. No large language model (LLM) knows your business like you do. It doesn’t understand your prospects’ specific needs, pain points, pricing tiers, or use cases. That knowledge is unique to your organization, which means you need to map it all out and explicitly feed it to your Agent: the facts about your product, and the context behind those facts, so it can guide buyers to the right solution rather than just answering their questions.
2) Every investment of knowledge has compounding results. Making the switch to AI isn’t just adopting a new tool. It means adapting to a new ecosystem. Think of it as a flywheel. Every piece of knowledge you add makes your Agent more effective. It generates better conversations and data, which tells you what to add or refine next. The more you invest in it, the faster it compounds.
“You have to think about AI like a new sales rep. On day one, it needs coaching, guidance, and feedback. But over time, as you refine the inputs and learn from real conversations, it becomes more autonomous and the level of coaching required decreases significantly.” Pascaline Albin, Director of Sales Development at Fin
Every upfront investment you make in your sales knowledge has long-term, revenue-generating impact. Whether you hire someone to do this work full time or give your sales reps time away from the inbox each week, the ROI speaks for itself. I’ve routinely seen small content improvements unlock big conversion gains.
Think of it this way: say it takes 30 minutes to document a new competitive battlecard or update pricing information. That 30-minute investment results in hours saved for your sales team, highly engaged buyers who get instant answers, and actionable data to optimize your inbound motion.
Calculate: Average time to compose a response × frequency of question = time saved for your team. More importantly, that’s time your SDRs and AEs can reinvest in multi-threading into accounts, running complex evaluations, and closing high-value deals that actually move pipeline.
Calculate: Number of prospects who ask this query × average time to respond = total time saved for buyers.
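Here is the same arithmetic as a short Python sketch. Only the 30-minute documentation time comes from the example above; the four minutes per response and the 50 monthly questions are made-up inputs, and the function names are hypothetical.

```python
import math

def team_minutes_saved(avg_minutes_per_response: float, times_asked: int) -> float:
    """Average time to compose a response x frequency of question."""
    return avg_minutes_per_response * times_asked

def break_even_questions(doc_minutes: float, avg_minutes_per_response: float) -> int:
    """How many times a question must be asked before documenting it pays off."""
    return math.ceil(doc_minutes / avg_minutes_per_response)

doc_minutes = 30   # time to write the battlecard, from the example above
avg_response = 4   # assumed minutes a rep spends answering by hand
asked = 50         # assumed times the question comes up per month

saved = team_minutes_saved(avg_response, asked)
net = saved - doc_minutes
print(saved, net, break_even_questions(doc_minutes, avg_response))
```

Under these assumptions the 30-minute investment saves 200 minutes in the first month, and it breaks even by the eighth time the question is asked.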
“For sales funnels, identifying knowledge gaps or friction can result in a huge improvement in conversion. When you optimize Fin with the right content, the incremental improvements have a big impact on our bottom line and can lead to millions of dollars in pipeline and revenue. That's why knowledge management is an integral part of our training and optimization process.” Tommy Dunton, Senior Manager of Sales Development at Fin
The best way to start generating that data is simply to start. The sooner you begin, the sooner you can capture insights about what your buyers want and need from your inbound sales experience. I prioritize quick deployment, fast feedback loops, and continuous iteration.
What to include in your knowledge base
Wrangling and prioritizing all of your internal and external sales documentation can feel daunting, but with the right technology, it doesn’t have to. The ideal platform provides data-driven insights to show what buyers actually ask and a centralized place to create, manage, and optimize your knowledge content. For example, with Fin for Sales, you get access to a leads report that gives you insight into disengaged prospects. Intercom’s Knowledge Hub enables you to create a single source of truth for your public-facing collateral and internal sales materials. Using Content Targeting, you can segment this information so your Sales Agent only uses the exact content you want.
1) Pricing and product FAQs. What it is: answers to the most common discovery questions buyers have, from pricing and plan differences to implementation, integration, and security or trust topics. How to source: analyze your sales inbox and early discovery calls. Where to use: public website, Sales Agent, and proactive outbound messages.
2) Competitor comparisons and battlecards. What it is: guidance for handling competitor mentions, addressing friction, and highlighting unique value propositions. How to source: talk to top-performing AEs or your product marketing team. Where to use: internal snippets for your Sales Agent and internal sales materials.
3) Case studies and social proof. What it is: proof points that help buyers build business cases and gain confidence, speeding deal cycles. How to source: collaborate with customer success and marketing on ROI stories. Where to use: Sales Agent, website, and sales collateral.
4) Specific use cases and buyer personas. What it is: targeted content for cohorts with similar pain points and jobs-to-be-done (e.g., engineering teams, startups). How to source: combine product marketing’s value propositions with real discovery conversations. Document the exact probing questions your best SDRs and AEs use so your Agent can uncover context in real time. Where to use: website and Sales Agent to enable contextual solution matching.
Content formats and sources
When sourcing knowledge, cast a wide net. You likely have more relevant content than you realize, and almost any information is useful once framed correctly. With Fin, you can use public articles (product FAQs, pricing overviews, feature benefits), internal articles (internal sales materials, internal FAQs), snippets (short-form text like promotions or battlecards), website pages (synced from your marketing site), and PDFs (whitepapers, technical specs, detailed sales materials).
Create a knowledge management process that fuels your Agent: 5 steps
Step 1: Audit what you have. Start by reviewing your current materials to prevent your Agent from learning outdated information and to identify gaps. If you’re already using a Customer Agent, much of that content can pull double duty for sales—no need to start from scratch. Make your existing content available for your Sales Agent and build sales-specific content on top, like pricing comparisons, competitive battlecards, customer case studies, and qualification criteria that wouldn’t apply to service conversations. If you’re starting fresh, audit pricing, product FAQs, feature details, competitor comparisons, case studies, and buyer use cases.
Put yourself in your buyer’s shoes. Walk through the same steps your prospects take, including their first interaction with your Sales Agent. Before going live, test it yourself. If you’re using Fin, you can do this using the built-in Preview panel to validate answers, routing, and missing topics or objections. Confirm that your Agent asks the right probing questions about goals, fit, and urgency before making a routing decision.
“We're moving incredibly fast at Fin with our Customer Agent, which means optimising our content, guidance and experience with Fin is a constant focus. Before we launch new products, we're testing Fin for Sales to ensure it's got all of the knowledge it needs to make sure the customer experience is perfect and we can convert that intent into pipeline and revenue from Day 0 of that launch.” Tommy Dunton, Senior Manager of Sales Development at Fin
Seek input from across your GTM organization. Don’t rely solely on sales. Involve marketing, growth, revenue ops, and sales ops to align content with campaigns and routing logic, and to integrate with systems like your CRM. Your SDRs and AEs bring real-world objections, use cases, and competitor insights that win deals—and those should feed directly into your Agent’s knowledge base. Judging fit is as much art as science, and your best SDRs can teach the Agent to interpret subtle signals.
Step 2: Plan and prioritize. Decide where to start by focusing on questions your team still answers manually that, if documented, would help your Agent capture more qualified intent. Identify the content your reps share most (demos, explainers, case studies) and ensure the Agent can access it. Look at leads reporting to find early-stage questions, stuck points, and high-volume disengaged outcomes, then strengthen objection-handling content. Prioritize based on pipeline value—build competitive battlecards and enterprise-tier documentation before free-plan details. Use reporting to find funnel drop-offs and content that hasn’t been updated recently—refresh pricing immediately if it has changed.
Allocate time and resources. Treat your Sales Agent like a core GTM channel, not a side project. Assemble a cross-functional project team with clear roles. The Agent owner translates sales strategy into prompts, routing logic, integrations, and rollout. The optimization owner reviews performance data, identifies drop-offs, and drives changes to content or Agent behavior. Early alignment ensures your Agent operates as a professional extension of your sales team.
Step 3: Go live and learn. Deploy broadly across your marketing site and pricing pages to accelerate learning. Within weeks, you’ll see where the Agent guides discovery and qualifies buyers versus where it stalls. Investigate drop-offs—often these point to missing answers or weak probing questions. If your Agent and knowledge base live in the same platform, you’ll get full visibility into your qualification funnel and content performance across touchpoints.
Track metrics to measure success. Monitor completion rate (conversations reaching a clear routing decision), pipeline created (opportunities generated through Agent-handled conversations), meetings booked (qualified prospects routed to a call), and customer satisfaction (quality of the experience). These metrics show what content is working and where to improve.
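As a rough illustration, the four metrics above can be computed from a plain export of conversation records. This is a minimal sketch, not Fin's reporting API: the `Conversation` record, its field names, and the `summarize` helper are all hypothetical and would need to be mapped to whatever your own reporting export provides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Conversation:
    # All field names are hypothetical; map them to your own reporting export.
    routed: bool            # conversation reached a clear routing decision
    meeting_booked: bool    # qualified prospect was routed to a call
    pipeline_value: float   # opportunity value created, 0.0 if none
    csat: Optional[int]     # satisfaction rating (1-5), None if not rated


def summarize(conversations):
    """Compute completion rate, meetings booked, pipeline created, and average CSAT."""
    total = len(conversations)
    if total == 0:
        return {"completion_rate": None, "meetings_booked": 0,
                "pipeline_created": 0.0, "avg_csat": None}
    ratings = [c.csat for c in conversations if c.csat is not None]
    return {
        "completion_rate": sum(c.routed for c in conversations) / total,
        "meetings_booked": sum(c.meeting_booked for c in conversations),
        "pipeline_created": sum(c.pipeline_value for c in conversations),
        "avg_csat": sum(ratings) / len(ratings) if ratings else None,
    }


# Illustrative data only, not real results.
sample = [
    Conversation(True, True, 12000.0, 5),
    Conversation(True, False, 0.0, 4),
    Conversation(True, True, 8000.0, None),
    Conversation(False, False, 0.0, 3),
]
summary = summarize(sample)
print(summary)
```

Tracking the same dictionary week over week makes it easier to see whether a content change moved completion rate or only shifted where conversations stall.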
Step 4: Iterate and improve. Expect gaps early on. That’s good—it surfaces what buyers need to convert. When the Agent gives a poor response, the root cause is usually missing, outdated, or shallow content. Close the gaps, then monitor your metrics and conversation reviews to keep compounding improvements.
Build ongoing maintenance into your workflow. Knowledge management is continuous. As your product, personas, and goals evolve, so must your content. Define owners, review cadences, and working time to refresh and create content—don’t wait for launch week chaos. Encourage a “knowledge management” mindset by logging content requests from SDRs and AEs when they hear new objections or discover probing questions that uncover true pain points.
“Training Agents to get better over time is fundamental to using AI. Fin learns from our website and help center, so the quality of those resources directly impacts its performance. The more we’ve invested in our knowledge base, the more success we’ve seen with Fin and those gains continue to compound.” Beth-Ann Sher, Senior AI Knowledge Manager at Fin
Step 5: Build knowledge management into future launch plans. Make Agent-ready sales content part of every product or pricing launch checklist. Partner with engineering, product marketing, and revenue operations to update catalogs and your Agent’s knowledge base on day zero. Then review early discovery conversations to add resources, address new objections, and fine-tune contextual solution matching.
“Content should no longer be an afterthought. It is one of your strongest GTM levers because your Sales Agent relies on it to handle discovery questions and stay up to date on your latest offerings.” Beth-Ann Sher, Senior AI Knowledge Manager at Fin
Best practices for Agent-friendly knowledge management
Use the terms your buyers use. Language varies by industry, persona, and role. Analyze discovery calls and on-site searches to capture how buyers actually speak and train your Agent accordingly. Test internally across SDRs, revenue ops, and marketing to reveal variations and content gaps.
Simplify language and remove ambiguity. Machine-friendly language is buyer-friendly. Avoid jargon, spell out acronyms, and clearly explain key product terms so value propositions land.
Keep the experience consistent and on-brand. Ensure product terminology, feature names, and pricing tiers are consistent everywhere. Proofread for tone, spelling, and grammar, and use standardized templates to build trust.
Add context to your answers. If your internal FAQ is full of “yes/no” answers, expand on the why. Restate the question, provide business context, and equip the Agent with follow-ups that keep the conversation alive and uncover goals and constraints.
Add text to images and videos. Show and tell—always include clear explanatory text so your Agent and all users, including those with accessibility needs, can benefit.
Create a scannable structure. Use clear headers and lists in your source content so both Agents and humans can navigate quickly. Avoid dynamic elements that hide crucial details.
Collect bite-size information in FAQ articles. Package tactical intel—seasonal promotions, short battlecards, edge cases—into concise snippets so your Agent can retrieve and deliver them instantly.
A connected Agent turns every conversation into insight. When a Sales Agent is connected to your CRM and enrichment tools, every interaction, qualification signal, and piece of sales content flows into a connected system. “A single platform matters in sales. When your conversation data, lead reporting, and Agent configuration all live in one place, you get much better visibility into your qualification funnel. You can see where buyers are dropping off, what content is working, and can improve the buying experience.” Fred Walton, Senior AI Conversation Designer at Fin
Every conversation makes your knowledge base sharper, showing you what’s resonating, what’s missing, and where to invest next. That’s the retrieval-first pipeline mindset I push with my teams.
Make knowledge management a core sales function
Behind every high-performing Sales Agent is a comprehensive, machine-friendly knowledge management process. Without it, even the most capable Agent will struggle to produce the pipeline gains AI makes possible. This isn’t a one-time project; it’s a continuous investment. The teams treating knowledge management as a core sales function are building systems that improve with every conversation, turning inbound demand into a compounding growth engine.
Today, I’m thrilled to share Fin’s next leap as a Customer Agent: ecommerce. When we launched Fin for Sales, Fin expanded further across the customer journey — and now we’re bringing that same intelligence to product discovery, checkout conversion, and post‑purchase support for Shopify merchants.
Fin for Ecommerce is a new role purpose-built for Shopify merchants that combines shopping assistance and ecommerce support. Fin is already the best Agent for customer service, resolving over a million queries a week for 8,000+ businesses. Now, it also guides shoppers to the right product, addresses concerns in the moment, and converts browsing into buying — all in one fluid experience.
Here’s what’s new and why it matters for conversion rate, average order value (AOV), and lifetime value:
Fin helps shoppers find the right product. It asks thoughtful questions, narrows options across large catalogs, and compares products based on what the shopper actually needs — like a great in‑store assistant, at scale.
Fin helps increase order value. It recommends relevant add‑ons and higher‑value alternatives based on conversation context, keeps carts effortless to update, and guides shoppers smoothly into checkout when they’re ready.
Fin handles support without losing the sale. Returns, refunds, and order changes happen in the same conversation; once resolved, Fin brings shoppers right back to browsing so momentum isn’t lost.
Fin is integrated with Shopify. Connect your store and Fin syncs your catalog, order data, and APIs in minutes — no manual training or complex setup.
In a great retail store, an attentive associate changes everything: they ask what you’re looking for, understand your preferences, answer the questions that matter, and walk you to checkout — and when you return, they remember you. That level of proactive, human‑quality assistance has never truly made it online.
Most ecommerce still looks like it did a decade ago: filters, FAQs, and self‑serve flows that assume the customer already knows what they want. Ecommerce offers scale and 24/7 convenience, but it’s passive — it can’t understand a shopper’s intent and actively guide them to a product that fits.
Fin for Ecommerce acts like a customer agent—checking shipping status, surfacing in‑stock color variants, and updating the order in the same thread—turning a jacket mix‑up into a quick, seamless experience.
Fin for Ecommerce changes that by bringing high‑quality shopping assistance to Shopify stores.
"Fin doesn't just recommend products — it asks the right questions about sleep position and firmness preference, understands what the customer actually needs, and guides them to the right decision. It sells the way we sell." Anthony Navarro, Market Sales Manager at Avocado
Here’s how it works in practice. When a shopper says "I need a gift for my partner" or asks "what running shoes work for trail and road?", Fin doesn’t dump them on a search results page; it starts a conversation. It asks about preferences, incorporates live browsing context, surfaces the most relevant options, and compares them based on what the shopper cares about.
This is powered by Fin Apex 1.0, the best-performing model for customer service, combined with a retrieval engine purpose-built for ecommerce. It handles vague, exploratory shopping questions and large product catalogs, helping shoppers find the right fit, faster.
Seamlessly connect Fin to your Shopify store. With one click, sync your product catalog, pull live inventory, and import store policies so your customer agent can answer questions and resolve orders faster.
In practical terms, this is agentic AI meeting ecommerce: Fin plans, retrieves, and reasons through complex product questions and next best actions to move the shopper forward confidently.
Based on the conversation, Fin recommends complementary or higher-value options, keeps carts easy to update, and guides shoppers into checkout when they’re ready.
"Fin for Ecommerce is already driving meaningful revenue, with 10% of conversations converting to orders averaging 20% above our store AOV." Matt Satell, Director of Ecommerce, Ninja Transfers
Fin for Ecommerce is built on the same AI platform that powers Fin for Service. Fin understands whether a conversation requires shopping assistance, support, or both, and moves between them seamlessly without the customer noticing.
This means the same Agent that helps shoppers buy also handles the hard and complex post‑purchase work including refunds, exchanges, order changes, tracking, and shipping questions. It can make changes in real time, within the same conversation, using the same context and data.
"The handoff between support and sales is so smooth I can't tell the difference without checking the filters. Fin talks policy, sells products, and references our mattress break-in period all in one conversation. It handles both the way our best agents would — but without the customer waiting to be passed between people." Kurt Dwiggins, Customer Experience Manager at Avocado
Fin for Ecommerce is purpose-built for Shopify merchants. Connect your Shopify store and Fin establishes a live connection to your entire catalog – products, variants, content, and order data – ensuring every response reflects your latest inventory and shoppers only see what’s actually available.
You can add the Messenger to your store and set Fin live in minutes without any manual training or technical expertise. When connected to Shopify’s API, Fin can handle even your most complex customer requests like tracking orders, processing returns, and updating subscriptions via Procedures. Fin automatically drafts Procedures for common ecommerce support queries based on your Shopify account and customized to your company policies.
You review, adjust, and publish, allowing Fin to start handling real queries in minutes.
"What surprised us most about Fin for Ecommerce is how quickly it delivers high-quality support with minimal, non-technical setup. Using Shopify as the single source of truth reduces operational complexity and allows us to focus on core business execution." Arnau Jiménez, Chief Technology Officer, GroupSumi
Fin is now a Customer Agent, with multiple roles that work seamlessly across the customer lifecycle. When a single Agent can guide a shopper from "I need a gift for my partner" to checkout, and handle a return weeks later without losing context, that’s a fundamentally better customer experience. It’s one Agent that deeply understands your products and your customers, and supports them throughout their entire journey with your business.
Leading ecommerce brands, including Avocado, WHOOP, Shutterstock, Flaviar, Carvana, Nuuly, MPB, Pure Electric, and Goodbuy Gear, already trust Fin to create standout experiences for their shoppers. I’m excited to continue expanding Fin’s roles as a Customer Agent and share more soon.
Ready to see it in action? Visit fin.ai/ecommerce and add Fin to your Shopify store today.
I’m energized by the momentum I’m seeing at the intersection of behavioral analytics and AI workflows. "Chanaka is an AI Engineer at Amplitude, where he’s building the MCP server that brings Amplitude’s behavioral context directly into your AI tools." That single sentence captures a strategic inflection point for product organizations: AI that finally understands user behavior at the moment of decision.
Why does this matter? When behavioral analytics flow natively into our AI tools, we move from generic assistants to product-savvy copilots. Instead of prompting blind, I can ground my questions in Amplitude analytics—segment performance, cohort trends, and event funnels—so AI answers reflect real customer journeys, not hypotheticals. The result is sharper prioritization, faster discovery, and tighter feedback loops that directly support product-led growth.
From a technical standpoint, an MCP server becomes a clean, secure interface for LLMs to access behavioral analytics as needed. That enables a retrieval-first pipeline that reduces hallucinations, improves context window management, and elevates prompt engineering quality. It also unlocks agentic AI patterns, where the assistant autonomously requests the right behavioral context to diagnose activation drops, spot anomalies, or recommend experiments. In short, it’s a unified analytics platform meeting LLMs for product managers where we actually work.
In day-to-day product management, this translates into practical wins. I can ask, “Which onboarding step is blocking user activation for the SMB segment?” and get an answer grounded in behavioral analytics with relevant visualizations or funnels. I can explore retention analysis by cohort without switching tools, then iterate on hypotheses and next-best actions inside the same AI-driven workflow. These tighter loops materially improve decision quality and team velocity.
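To make the onboarding question concrete, here is a minimal sketch of the arithmetic behind "which step is blocking activation": find the funnel step with the largest step-to-step drop-off. The step names and user counts are invented for illustration, and `biggest_drop_off` is a hypothetical helper, not part of Amplitude or any MCP server.

```python
def biggest_drop_off(funnel):
    """Return (step_name, drop_rate) for the step that loses the most users.

    `funnel` is an ordered list of (step_name, users_reaching_step) tuples.
    The drop rate for a step is the share of users from the previous step
    who did not reach it.
    """
    worst_step, worst_rate = None, 0.0
    for (_, prev_users), (name, users) in zip(funnel, funnel[1:]):
        if prev_users == 0:
            continue  # nothing to lose from an empty step
        drop = 1 - users / prev_users
        if drop > worst_rate:
            worst_step, worst_rate = name, drop
    return worst_step, worst_rate


# Hypothetical SMB onboarding funnel, illustrative numbers only.
smb_funnel = [
    ("signup", 1000),
    ("verify_email", 800),
    ("connect_data", 400),
    ("first_chart", 360),
]
step, rate = biggest_drop_off(smb_funnel)
print(step, rate)
```

In this made-up funnel, half of the users who verify their email never connect data, so that step would be the first candidate for a hypothesis and an experiment.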
There are governance considerations, of course. I advocate clear data access policies, strong privacy-by-design controls, and well-defined scopes for what the MCP server can retrieve. Start with high-value, low-risk datasets, pilot with a focused team, and instrument eval-driven development to measure accuracy, latency, and business impact. When done right, the AI strategy becomes an execution engine, not just a slide.
My playbook: begin with one or two high-impact questions (e.g., activation blockers or churn drivers), wire them into the MCP-powered AI workflow, and quantify time-to-insight and decision quality improvements. As wins accumulate, expand to roadmap shaping, opportunity sizing, and experiment generation. The promise here is compelling—AI that doesn’t just talk about the product, but truly understands how customers use it, and helps us build the right things faster.
Inspired by this post on Amplitude – Best Practices.
Weekly product reviews are where strategy meets execution, and over the past year I’ve turned them into a high-signal, low-friction ritual by leaning on agentic AI. As VP of Product Management at HighLevel, Inc., I’ve standardized a set of agent skills that compress preparation time, surface the right insights, and keep PMs, engineers, and designers focused on decisions—not document wrangling.
"Learn how our teams use agent skills with claude, cursor and codex to run product reviews as PMs, engineers, and designers. Here are 5 killer use cases for builder."
Below, I walk through the five skills I rely on most in our weekly cadence—each one mapped to a clear product management outcome. They’re simple to set up, easy to govern, and aligned with core practices like continuous discovery, product roadmapping and sprint planning, and eval-driven development.
Skill 1 — Backlog triage with signal extraction: I point an agent at fresh tickets, customer notes, and experiment results to cluster themes, tag impact, and flag regressions. Using a retrieval-first pipeline and Agent Analytics, the assistant ranks items by value, effort, and risk so our meeting starts with a prioritized, explainable shortlist instead of a raw queue.
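One way to keep a value, effort, and risk ranking explainable is to reduce it to a single transparent score. The sketch below is an assumption, not the scoring the agent described above actually uses: the `BacklogItem` fields, the 0-to-1 risk scale, and the formula `value * (1 - risk) / effort` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    # Hypothetical fields; an agent or a human reviewer would supply these estimates.
    title: str
    value: float   # expected impact, higher is better
    effort: float  # estimated cost, must be > 0
    risk: float    # probability-like penalty between 0 and 1


def score(item):
    """Assumed scoring rule: risk-discounted value per unit of effort."""
    return item.value * (1 - item.risk) / item.effort


def prioritize(items):
    """Return items sorted from highest to lowest score."""
    return sorted(items, key=score, reverse=True)


# Illustrative backlog, not real data.
backlog = [
    BacklogItem("Fix checkout regression", value=8, effort=2, risk=0.5),
    BacklogItem("Add CSV export", value=5, effort=1, risk=0.2),
    BacklogItem("Redesign onboarding", value=9, effort=3, risk=0.0),
]
ranked = [item.title for item in prioritize(backlog)]
print(ranked)
```

Because the score is a one-line formula, anyone in the review can see why an item landed where it did and challenge the inputs rather than the ranking.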
Skill 2 — PRD and spec synthesizer: Ahead of the review, an agent drafts a one-page PRD update from design diffs, git history, and decision logs. With Claude Code and Cursor, it highlights interface changes, acceptance criteria, and open questions, linking back to sources. The result is a crisp, auditable brief that keeps product trios aligned without re-litigating context.
Skill 3 — Experiment and metrics analyzer: An analytics agent pulls A/B testing readouts, checks minimum detectable effect assumptions, and annotates anomalies. It turns raw telemetry into a narrative: what moved, by how much, and whether we trust it. This makes our discussion about tradeoffs, not spreadsheets, and speeds commitments on next steps.
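For readers who want to see what a minimum detectable effect check might look like, here is a small sketch using the standard normal approximation for a two-sided test on two conversion rates with equal sample sizes per arm. It is an assumption about one reasonable way to run the check, not a description of the analytics agent's internals; the function names and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(baseline_rate, n_per_arm, alpha=0.05, power=0.8):
    """Smallest absolute lift detectable at the given significance and power.

    Normal approximation for two proportions, equal arm sizes:
    MDE = (z_{1-alpha/2} + z_{power}) * sqrt(2 * p * (1 - p) / n)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    standard_error = sqrt(2 * baseline_rate * (1 - baseline_rate) / n_per_arm)
    return (z_alpha + z_power) * standard_error


def lift_exceeds_mde(observed_lift, baseline_rate, n_per_arm):
    """True if the observed absolute lift is at least the test's MDE."""
    return abs(observed_lift) >= minimum_detectable_effect(baseline_rate, n_per_arm)


# Hypothetical readout: 10% baseline conversion, 10,000 users per arm.
mde = minimum_detectable_effect(baseline_rate=0.10, n_per_arm=10_000)
print(round(mde, 4))
print(lift_exceeds_mde(0.015, 0.10, 10_000))
print(lift_exceeds_mde(0.005, 0.10, 10_000))
```

With these illustrative inputs the test can detect a lift of about 1.2 percentage points, so a reported half-point lift would be flagged as below what the experiment was sized to detect.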
Skill 4 — Voice-of-customer synthesizer: The assistant clusters interviews, support threads, and NPS verbatims into jobs-to-be-done and pain themes. It proposes opportunity solution tree updates and calls out places where our roadmap diverges from customer signal. That keeps continuous discovery alive in the room—even when time is tight.
Skill 5 — Roadmap and sprint planning co-pilot: After decisions, an agent converts outcomes into scoped backlog items, engineering tasks, and stakeholder updates. It drafts sprint goals, flags dependency risks, and aligns work to objectives. Because it’s grounded in the meeting record, it preserves intent while removing ambiguity.
Under the hood, prompt engineering patterns and guardrails keep these workflows predictable: a retrieval-first pipeline for context, eval-driven development for quality checks, and role-specific prompts for PMs, engineers, and designers. With Claude Code I generate structured diffs and test scaffolds; with Cursor I accelerate code-review summaries; and with Codex I bootstrap utility scripts that keep the loop tight between insights and implementation.
The payoff is tangible: higher decision velocity, fewer meetings to “re-clarify,” and clearer accountability across the product organization. Just as important, governance and privacy-by-design are built in—every agent logs rationale, cites sources, and respects data boundaries—so leaders can scale AI workflows confidently.
If you’re looking to level up your product reviews, start with these five skills, measure impact with Agent Analytics, and iterate. Small automations compound quickly, and the more consistently you run them, the more your team’s attention shifts from preparing content to making better product decisions.
Inspired by this post on Amplitude – Perspectives.