Happy New Year! I’m kicking off 2026 with a behind-the-scenes look at what’s changing in my product practice, the experiments I’m running with my teams at HighLevel, and the trends I’m most energized by—especially around continuous discovery, AI workflows, and building stronger coaching cultures.
If you want to listen to the conversation that sparked many of these reflections, you can find it here: Spotify | Apple Podcasts.
Why Teresa sunset the live deep-dive cohorts—and how on-demand and the new Discovery Habits Toolbox better support real behavior change. This pivot resonated with my own experience: some skills, especially discovery habits, only stick when they’re reinforced in the flow of real product work, not just in a time-boxed cohort. In my org, we’re leaning into on-demand learning paired with manager coaching to drive durable behavior change.
What leaders actually need to coach interviewing, assumption testing, and core discovery habits inside their orgs. I’ve found that empowered product teams thrive when leaders have lightweight coaching tools, practical prompts, and clear expectations for product trios. This is less about one-off training and more about building communities of practice where deliberate practice and feedback loops become routine.
Why training is shifting toward ongoing, leader-supported learning (and how AI will accelerate the shift). AI Strategy isn’t just about tools—it’s about learning systems. For LLMs for product managers to create leverage, we need eval-driven development, privacy-by-design, and clear guardrails. I’m building AI workflows that enable managers to review interviews, spot anti-patterns, and nudge teams toward better decisions—without replacing critical thinking.
Teresa’s move into paid subscriptions and why AI content doesn’t fit the classic “design once, run for years” course model. I see the same reality in my content roadmap: the half-life of AI guidance is short. That pushes us toward subscription models, tighter feedback loops, and a more adaptive go-to-market strategy for education products.
A sneak peek into the AI tools Teresa is building for discovery work—from interview coaching to near-ready interview snapshot generation. I’m particularly excited by tooling that scaffolds better interviews, sharpens assumption testing, and speeds up synthesis without skipping the human judgment step. These capabilities map directly to where I want my teams investing time: spending less energy on admin and more on learning from customers.
Petra’s plans for the year: community building with Product at Heart, a new product leadership email course, her Product Leadership Wheel, and workshops launching in Cairo. As someone who believes in conferences as high-quality “energy wells,” I’m inspired by how these programs create momentum for leaders who are upgrading their coaching muscles.
The role of conferences and retreats in staying grounded, inspired, and connected. I treat these gatherings as strategic resets—spaces to test ideas, confront blind spots, and deepen my network for future collaboration. The best outcomes often come from serendipitous hallway conversations and hands-on sessions where you can pressure test frameworks with peers.
How Teresa is staying on top of academic research (and why “synthetic users” aren’t ready for prime time). I agree: while synthetic data can be useful for scaffolding, it’s not a substitute for direct customer contact. Combine academic rigor with real-world interviewing and strong data governance—especially when operating under General Data Protection Regulation (GDPR).
The shared challenge of evaluating vendors and conference speakers making questionable AI claims. My heuristic: ask for clear problem statements, reproducible evaluations, grounded benchmarks, and a path to safe deployment. If a pitch can’t show measurable uplift or ignores compliance, it’s not ready for empowered product teams.
Key takeaways I’m carrying into 2026: delivery models matter; leaders need coaching tools, not just training; AI is reshaping how we teach and learn; experimentation is the theme of 2026; and community still energizes. That’s the blueprint I’m using to strengthen continuous discovery, refine our AI workflows, and sustain high standards in product management leadership.
What about you? How are you integrating AI workflows into your discovery practice, and what coaching tools are helping your managers reinforce the right habits? Share your approach—I’d love to learn what’s working in your context.
Resources & Links:
Follow Teresa Torres: https://ProductTalk.org
Follow Petra Wille: https://Petra-Wille.com
Teresa’s website: Product Talk
General Data Protection Regulation (GDPR)
Product Talk Academy
Deliberate Practice – ATP episode where Teresa talked about the ending live cohorts for Deep Dive classes
2026 is closer than it feels, and the signals are already clear. I’ve been synthesizing what I’m seeing across empowered product teams, boards, and cross-functional partners into a practical view of what matters next. A sharp look at product management trends for 2026. Not guesses, but signals from top product leaders shaping how PMs will actually work next.
In this analysis, I distill eleven shifts that are changing the craft—from outcomes vs output OKRs and continuous discovery to stronger product strategy and tighter product roadmapping and sprint planning. The throughline is simple: prioritize customer value, ship with focus, and measure what moves the business. These aren’t headline trends; they’re working patterns I’m seeing across high-performing organizations.
AI is no longer a side project—it’s part of the product manager’s core toolkit. Agentic AI, LLMs for product managers, and trustworthy AI workflows are accelerating discovery, sharpening problem framing, and enabling faster iteration. The best teams pair this with disciplined evaluation and experimentation, so insight compounds without sacrificing safety, privacy, or product quality.
Execution is getting crisper through product trios and stronger stakeholder management. When design, product, and engineering co-own discovery and delivery, teams reduce handoffs and increase clarity. That alignment translates into better prioritization, fewer context-switches, and a roadmap that reflects real trade-offs—not wish lists.
On growth, product-led growth remains a durable engine when it’s anchored in a compelling value proposition and instrumented end-to-end. Clear activation moments, in-app guides, and thoughtful product tours outperform brute-force acquisition. When we connect these motions back to product strategy and the roadmap, we create a repeatable loop that compounds adoption and retention.
Governance and trust are now table stakes. Privacy-by-design, data governance, and a pragmatic approach to regulatory compliance protect both users and velocity. Teams that build these practices into their operating model move faster because they avoid late-stage rework and maintain stakeholder confidence.
If you’re leading a product org—or aspiring to—this is your field guide to 2026. I’ll unpack where these shifts are strongest, how to apply them in your context, and the pitfalls to avoid. The aim is to give you clear language, concrete practices, and a sharper edge as you shape what your team builds next.
I’ve spent the last few years turning AI from an intriguing demo into an operational advantage, and the clearest wins come when we treat agents as productized workflows—not toys. In practice, that means aligning agentic AI to a sharp product strategy, instrumenting everything, and scaling what works across the organization.
Learn how companies like Replit are consolidating workflows, creating one-person departments, and building systems for scale with Amplitude
When I talk about agentic AI, I’m focused on outcomes: fewer handoffs, faster cycle times, and measurable uplift in activation, retention, and NPS. The most successful rollouts start with a specific job-to-be-done, translate it into clear AI workflows, and then iterate with a tight feedback loop between data, design, and engineering.
My implementation playbook is simple and disciplined. First, choose a high-friction workflow and define success upfront. Second, make the build vs buy call on the foundation model, orchestration layer, and connectors. Third, establish AI risk management and safeguards early—before scale amplifies errors. Finally, run small, eval-driven releases and promote what performs.
Instrumentation is where the leverage compounds. With Amplitude analytics as a unified analytics platform, I design purposeful events (agent intent, tool calls, resolution state, human handoff), map funnels from user input to agent outcome, and cohort users by context to pinpoint lift. This gives me an honest read on where agents help, where they hinder, and what to tune next.
The “one-person departments” concept isn’t about doing more with less at all costs; it’s about assembling a tight loop of product management leadership, data, and automation so one operator can own a business outcome end-to-end. An agent handles the repeatable work, while the human focuses on judgment, edge cases, and continuous improvement that compounds.
As we scale, I look for platform scalability patterns: shared tools and policies, reusable prompt libraries, standardized evaluation suites, and consistent governance. That structure keeps agent performance predictable while preserving speed, and it aligns beautifully with product-led growth when agents are embedded directly in the product experience.
If you’re starting now, begin with a single, valuable workflow. Instrument it thoroughly with Amplitude analytics, make decisions from the data you see—not the demos you remember—and expand only after you’ve proven uplift. Iteration beats ambition here: agentic AI rewards teams who measure relentlessly and scale only what truly works.
Inspired by this post on Amplitude – Perspectives.
Support tickets are the rawest signal of product truth. Leading product teams at HighLevel, I’ve learned that the fastest way to build what customers value is to transform frontline conversations into a repeatable, data-driven system for discovery, prioritization, and execution.
What if your support and product teams could unlock CX insights to turn every ticket into strategic product intelligence? Explore how.
Here’s the operating system I rely on. First, I connect our support stack (think Intercom and our CRM integration) into a unified analytics platform so every conversation, tag, and resolution is queryable. I don’t just count tickets—I segment them by product area, customer segment, lifecycle stage, and revenue impact to reveal patterns that roadmaps can act on.
Next, we standardize a shared taxonomy. Agents apply concise, high-signal labels (problem type, severity, intent), and we augment that with AI-driven auto-tagging to reduce noise and improve recall. The result is trustworthy “voice of the customer” data that product managers and support leaders can both stand behind.
Prioritization then becomes rigorous and fair. I weight themes by severity, frequency, ARR exposure, and time-to-value, and tie them directly to outcomes vs output OKRs. Amplitude analytics helps me quantify impact—what’s breaking activation, what’s dragging conversion, what drives retention analysis—so the backlog reflects business outcomes, not opinions.
Discovery is continuous by design. Product trios (PM, design, engineering) run weekly reviews of the highest-signal themes, recruit users straight from recent tickets, and prototype solutions quickly. We validate ideas with A/B testing when appropriate and ship targeted in-app guides to reduce confusion before it becomes a ticket.
Crucially, we close the loop. When we release a fix or improvement, we notify affected customers and the agents who flagged the issue. We track downstream effects—ticket deflection, CSAT, feature adoption, and time-to-resolution—so everyone sees how customer support ai strategy accelerates product-led growth.
This approach also builds culture. Empowered product teams treat support as a strategic partner, not a cost center. Agents become co-creators of the roadmap, and PMs gain a steady stream of product discovery opportunities grounded in real user outcomes.
If you’re getting started, a simple 30-60-90 can help: in 30 days, unify the data and agree on taxonomy; in 60, instrument dashboards and adopt a weekly insights ritual; in 90, align priorities to OKRs, launch targeted fixes, and measure business impact. That’s how tickets turn into product truth—and how CX insights drive compounding wins.
Inspired by this post on Amplitude – Perspectives.
How do you help disadvantaged students take action on opportunities they don't even know exist? That question has been top of mind for me as I’ve explored how AI can augment—not replace—human mentorship. Recently, I dug into the work behind Zero Gravity, a UK-based platform using mentoring, community, and learning pathways to unlock elite career opportunities for state school students. Their approach reframed a core problem I care deeply about: the "knowing-doing gap."
I sat down with Elliot Little (Product Manager) and Dan St. Paul (Software Engineer) from Zero Gravity to unpack how they’re tackling this gap with an AI career co‑pilot. They’ve intentionally positioned the system as an orchestrator, not an automation tool—bridging the space between knowing what to do and actually doing it. As a product leader, I see this as a powerful pattern for Generative AI: use AI to coordinate steps, personalize guidance, and empower action in moments where confidence and clarity are fragile.
What resonated most was the humility of their build journey. They started with grand visions of AI mentors and synthetic avatars, then scaled back to something simpler and more effective. The first prototype—a job suitability summary—didn’t deliver the "wow moment" they expected. And they discovered that hiding the "LLM magic" backfired—students needed to feel the personalization. That insight aligns with my own experience: users must perceive the value for trust and motivation to compound.
From a UX standpoint, the team chose text chat over voice input and leaned into guided prompts rather than empty text boxes. That decision lowered cognitive load and increased completion rates—classic product management tradeoffs that privilege momentum over novelty. In my view, this is what good AI product strategy looks like: invite action with structure, then expand autonomy as confidence grows.
The technical backbone is equally thoughtful. Multi‑month journeys require rigorous context window management to avoid exploding token counts and degrading quality. I appreciated their pragmatic toolkit: context management techniques like removing stale tool calls, summarizing history, exposing tools conditionally. They also used application logic rather than complex RAG architectures to manage tool availability and context freshness. This is the kind of disciplined engineering that keeps systems reliable at scale without overcomplicating the stack.
Model selection was fit‑for‑purpose, not one‑size‑fits‑all. They’re using different models for different tasks, including "GPT-5 Nano for structured outputs, lighter models for quick replies." That modularity enables speed and cost control while preserving high‑fidelity moments where structure matters most.
Safeguarding was treated as a first‑class concern—non‑negotiable when you’re building AI for 16‑year‑olds. Their safeguarding architecture pairs moderation endpoints with external verification via Unitary. They also invested in building a failure taxonomy through internal red team/green team exercises. This is AI risk management done right: define failure modes early, test ruthlessly, and wire safety into the product surface area—not just the model layer.
Evaluation was grounded in outcomes, not demos. The team focused on whether students progressed from insight to action: applying, interviewing, and engaging with mentors. That aligns with how I run eval‑driven development—ship narrowly, measure real behavior, and iterate toward a repeatable "wow moment" that students can actually feel.
Looking ahead, I’m excited by what’s next: long‑term memory management for multi‑year student journeys. It’s a hard problem—balancing privacy, provenance, and portability—but it’s precisely where an AI career co‑pilot can compound value over time. The vision is compelling: a resilient companion that remembers goals, adapts to context, and orchestrates the right next step.
If you want to dive deeper, you can listen to the full conversation on Spotify and Apple Podcasts:
Listen to this episode on: Spotify | Apple Podcasts
Blue Dot Impact AI Safety Course – free AI safety course Elliot recommended: https://bluedot.org/
My key takeaways: build AI that augments human relationships, not replaces them; don’t hide the personalization—let learners feel it; privilege application logic over unnecessary architectural complexity; and treat safety, context, and evaluation as product features, not afterthoughts. That’s how we bridge the "knowing-doing gap" with integrity and scale.
I’ve sat in countless AI measurement debates and noticed a recurring gap. One major voice has been noticeably underrepresented in the AI measurement conversation: the product manager (PM) that’s leading development. From experience, PMs and developers do need different measurement tools—and making those differences explicit is exactly what speeds up decisions and improves outcomes.
Developers optimize the model and system layer. Their toolkit centers on eval-driven development: offline evals, regression suites, red-teaming, latency and throughput monitoring, token cost tracking, and hallucination rate reduction. On the delivery side, engineering teams watch DORA metrics alongside CI/CD performance to keep iteration fast and safe. When building LLM-backed experiences, they also care deeply about retrieval-first pipeline quality and context window management because those mechanics determine grounding, relevance, and consistency.
PMs, by contrast, own outcomes. We instrument user journeys end to end and define a clear north-star tied to value: activation, time-to-value, task success rate, retention analysis, support deflection, and revenue contribution. We rely on A/B testing frameworks and minimum detectable effect (MDE) planning to separate real impact from noise, and we consolidate behavioral signals in a unified analytics platform like Amplitude analytics and Pendo to understand adoption, friction, and cohort differences. This is the heart of product-led growth and continuous discovery: evidence, not anecdotes.
The fact that these toolboxes differ is a strength, not a weakness. Specialized metrics keep responsibilities crisp: developers guarantee model quality and reliability; PMs guarantee that quality translates into customer and business outcomes. What we need is an explicit metrics ladder that connects layers—model-level quality floors and SLOs, feature-level KPIs, and company-level results—so trade-offs are transparent and prioritization is principled.
In practice, I create a shared measurement contract for every AI initiative. It links eval sets to user-facing success criteria, defines acceptance thresholds, and spells out observability across the stack. We include governance from day one—AI risk management, privacy-by-design, and data governance—so we can scale responsibly without slowing teams down.
Here’s the AI product toolbox I give my teams: start with a concise value hypothesis; define a success rubric the customer would recognize; instrument the happy path and the failure path; plan experiments with MDE up front; segment results by persona and job-to-be-done; and close the loop with qualitative feedback inside the product via in-app guides, product tours, and lightweight surveys. For AI features specifically, add Agent Analytics for agentic AI, capture grounding sources for explainability, and log model/context inputs to make debugging and iteration repeatable. That way, LLMs for product managers stop being magic and start being manageable.
When we roll out a new assistant—whether a retrieval-augmented copilot or a voice AI agent—we set two dashboards: one for developers (eval pass rates, latency, context integrity, error budgets) and one for PMs (activation, task completion, deflection, satisfaction). The dashboards read differently by design, yet they are joined at the hip by shared definitions and experiment IDs. This lets us move quickly with confidence: engineering can tighten quality loops while product steers toward the outcome that matters most.
If you’re feeling the tension between model metrics and product metrics, don’t collapse them—connect them. Start with a thin slice, agree on 3–5 measurable outcomes, and let your evals and A/B tests work together. With a clear metrics ladder and a unified analytics platform, PMs and developers can each excel at their craft and still ship AI that customers love.
Every planning cycle, I’m asked the same high-stakes question: should we build or buy? In 2026, with generative AI reshaping the software landscape and budgets under scrutiny, the classic calculus needs an upgrade. The right call can accelerate time to value, protect precious engineering capacity, and sharpen competitive differentiation—while the wrong one can quietly inflate total cost of ownership for years.
“Navigate the build vs buy software dilemma, learn how AI is changing the game, and what you should leverage (and when).” That’s been my north star for product strategy this year, and it’s how I guide teams when the pressure is on.
My first principle is simple: build where we differentiate, buy where we need parity. If the capability is central to our value proposition or our defensibility, I’m inclined to build—often with a phased approach that de-risks scope. If it’s a non-differentiating layer (think billing, analytics plumbing, basic CRM integration), I’ll buy to accelerate, then revisit once scale and specialization justify a deeper internal investment.
AI changes the equation on both sides. On the “buy” side, modern platforms now ship agentic AI, fine-tuning options, and robust APIs that let us compose advanced capabilities fast. On the “build” side, AI workflows and toolchains (from code copilots to eval-driven development) compress cycle time, making bespoke solutions more attainable. The trade-off has shifted from pure functionality to questions of AI risk management, model governance, data privacy, and the portability of prompts, embeddings, and training data.
I evaluate decisions across two economic horizons: time to value versus total cost of ownership. Buying often wins the first round—faster deployment, proven reliability, and lower initial lift. But TCO can creep: integration work, per-seat or consumption SaaS pricing, training, vendor-driven roadmap gaps, and the “shadow ops” of maintaining connectors in our CI/CD. Building flips that profile: slower early velocity, higher upfront complexity, but potentially lower long-run costs and tighter fit with our platform scalability goals.
Operational risk matters just as much as features. I look at incident management posture, SRE maturity, SLAs, and DORA metrics to gauge resilience. If a vendor can’t meet our uptime and recovery expectations—or if their roadmap pace mismatches our deployment frequency—we’re effectively renting risk we can’t control. Conversely, if our team can’t realistically support the operational burden, buying is the safer choice.
Security, regulatory compliance, and data governance are non-negotiables. I assess privacy-by-design, data residency, audit logs, role-based access, SOC2/ISO coverage, and threat detection and response. For AI-heavy systems, I add model lineage, red-teaming practices, PII handling, and retention policies. If we can’t verifiably meet our obligations in a build scenario within the launch window, we buy and require clear data exit and portability clauses.
To keep decisions objective, I use a lightweight scorecard across five dimensions: differentiation, urgency/time to value, regulatory/security risk, integration complexity, and AI leverage/portability. We weight criteria with product trios (PM, design, engineering), run discovery spikes, and validate assumptions with stakeholder management up front. A disciplined scorecard curbs recency bias and helps us communicate trade-offs to leadership.
In practice, I favor staged commitments. When uncertainty is high, we buy to learn—ship value quickly, instrument usage, and collect evidence. If adoption proves sticky and integration pain remains moderate, we double down with deeper vendor integration. If we uncover unique needs or cost inflection points, we pivot to a build plan that reuses learnings, data models, and UX patterns from the bought solution to reduce risk.
AI-specific choices deserve their own pass. For example, if we need retrieval-augmented generation, I’ll often buy for the orchestration and observability layer while building our domain-specific retrieval-first pipeline and prompt engineering guardrails. That split gives us speed plus control: we retain our IP and data gravity while tapping best-in-class tooling that evolves with the ecosystem.
Vendor strategy matters as much as technology. I negotiate clear data export, transparent API quotas, sandbox environments for continuous discovery, and price protections for growth. I pressure-test roadmaps, ask for integration references, and align on outcome-based milestones rather than feature checklists. Strong partners welcome this rigor; weak ones stall—another useful signal.
On the build side, I right-size ambition. We target minimum lovable scope, isolate risk in early sprints, and leverage open source where it’s mature and secure. We design for modularity so we can swap components without rewriting the world, and we budget time for in-app guides and product tours to smooth adoption, because user activation is the real finish line.
Here’s the playbook I return to: buy to validate and compress time to value; build to differentiate and reduce long-run TCO; continuously re-evaluate as the AI toolchain and our scale evolve. With a transparent scorecard, a bias for learning, and a clear view of risk, the build vs buy decision becomes less of a leap of faith and more of a repeatable product management capability.
2026 will reward teams that move fast without mortgaging the future. Make the call deliberately, instrument the outcomes, and stay humble—because the best strategy is the one you can adapt as new evidence arrives.
I wanted to cut through the hype and see what’s actually changing inside customer service teams as AI agents like Fin move from pilots to production. So I analyzed 166 interviews with support leaders, managers, and frontline specialists to understand how roles, workflows, and team structures evolve once AI becomes part of everyday work.
The anecdotes were already loud: AI tools are transforming customer support. But the scale, shape, and consistency of that transformation? Less clear. I went to the source—the practitioners living it—to quantify what’s real and what’s next for customer support AI strategy.
Here’s what I gleaned from the data.
TL;DR — What’s changing
AI is reorganizing core CS operations: Nearly every team (≈95%) reported meaningful workflow changes. Triage, routing, translation, and categorization are increasingly automated. Hybrid human+AI systems are taking their place.
Frontline work is changing to AI oversight: Humans now QA, monitor, and test AI outputs. When it comes to handling queries, they step in for nuance, rather than repetition.
Structural change is widespread but uneven across companies: 83% reported new responsibilities or roles. Some built AI pods, while others retained traditional setups.
Tier 1 headcount demand is falling: 28% saw hiring freezes, slowdowns, or natural attrition at Tier 1 level as AI Agents manage more requests and improve operational efficiency.
Skill gaps are widening inside teams: Data literacy, QA, and cross-functional communication are all rising in value. For many companies, long-term role strategy is lagging behind.
Research methodology
The goal of this research is to understand how many customer service teams have changed their roles, responsibilities and ways of working due to adopting AI agents, as well as understanding how these changes manifest within their organizations.
For this study, the data chosen consists of interviews conducted by the research team, either with Intercom customers or prospects. This data was chosen because the focus of the interviews revolved around the individual experience of the participant, which gives a higher chance of information related to role changes to be present.
The data was collected using Snowflake by pulling all interviews stored in gong conducted by a member of the research team from 01-01-2025 to 14-10-2025.
After the data was pulled, a python script was used to clean the conversation corpus for each conversation retrieved. Common English stopwords (e.g. “and”, “very”, “with”, etc.) were removed, as well as all the text associated with a speaker in the conversation that was not the interview participant(s). This was done to reduce the computational power required for the conversation coding, avoid API timeouts and reduce costs.
After the corpus was cleaned, the OpenAI API was employed, alongside a prompt, to code each conversation using closed codes defined in a closed codebook.
The codes used were:
No role change mentioned: No explicit changes to roles, teams, or reporting lines are attributed to AI/Fin.
Role responsibilities changed due to AI/Fin: Duties/ownership moved between humans and AI/Fin, or scope of a role changed because AI/Fin handles tasks.
Team structure/reporting changed due to AI/Fin: Org/team boundaries, team charters, or reporting lines changed due to adopting AI/Fin.
Headcount/hiring impacted due to AI/Fin: Hiring plans, headcount, staffing coverage, or shifts/rotations changed due to AI/Fin.
Workflow/process changed due to AI/Fin: Steps, triage/escalations, routing, or playbooks changed because AI/Fin alters the process.
Other organizational changes due to AI/Fin: Other changes inside the organization due to AI/Fin that don’t involve a change in responsibilities, team structure/reporting lines, headcount or workflow/processes changes.
Data analysis
166 conversations were retrieved. More than 90% of all conversations report some sort of change either in their role, team, or processes due to implementing Fin, or a similar AI product, with only 13 participants reporting no changes.
Across these conversations, each one could have multiple types of change associated with it (M = 2.35, Med = 2, Min = 1, Max = 4, N = 166).
More specifically, after implementing Fin or a similar AI product:
94.58% participants reported having their processes and workflows disrupted
82.53% participants reported seeing their role and responsibilities change
27.71% participants reported changes in company headcount or hiring
6.02% participants reported their team structure or reporting lines changing as a result
Additionally, 16.27% participants reported a change for a different reason from the ones highlighted above (“Other organizational changes due to AI/Fin”).
Sample representativeness
The sample is representative with a confidence level of 90% and a margin of error of ±6.4% (accounting for an overall unknown population size). The individual confidence intervals for each type of change are as follows.
Workflow/process changed due to AI/Fin: 157 (94.6%), 90% CI: 91.7% – 97.5%
Role responsibilities changed due to AI/Fin: 137 (82.5%), 90% CI: 77.7% – 87.4%
Headcount/hiring impacted due to AI/Fin: 46 (27.7%), 90% CI: 22.0% – 33.4%
Other organizational changes due to AI/Fin: 27 (16.3%), 90% CI: 11.6% – 21.0%
No role change mentioned: 13 (7.8%), 90% CI: 4.4% – 11.3%
Team structure/reporting changed due to AI/Fin: 10 (6.0%), 90% CI: 3.0% – 9.1%
Thematic analysis
1) Automation and AI integration replacing manual steps (94.58%). I see AI workflows embedding into every stage of support. Manual triage, routing, translations, and repetitive responses shift to Fin or similar systems, while agents focus on human-in-the-loop oversight.
Agents’ day-to-day work now revolves around monitoring or fine-tuning AI outputs, not replying to the same questions. In many teams, conversations enter Fin first; humans only step in when nuance or exception handling is required. Testing, QA, and rollout practices have matured too—teams track Fin’s accuracy and iterate intentionally.
2) Humans shift to oversight, AI handles execution (82.53%). The role resets are unmistakable. Support agents and managers move from high-volume execution to optimization, configuration, and measurement. New roles emerge—AI specialists, automation managers, Fin owners—while responsibilities migrate toward strategic analysis and quality assurance.
Duties are redistributed: Fin takes on refunds, triage, simple messaging, even parts of the sales process. I’ve watched some careers pivot toward product/ops or AI systems strategy as managers coordinate testing and monitor adoption metrics.
3) Reductions or slower growth due to efficiency gains (27.71%). Efficiency is real. Many teams reduce Tier 1 headcount needs or slow hiring because AI absorbs simpler requests. Others reallocate people to complex work or AI management. A few still expand—adding automation engineers, implementation specialists, or technical AI leads—but not at past growth rates.
The upshot: organizations handle more volume while stabilizing or reducing staffing, especially at the frontline tier.
4) New AI teams, flatter orgs, fewer escalation layers (6.02%). I’m seeing organizational design catch up to the tech. Some companies form dedicated LLM or automation teams. Others flatten hierarchies, design around workflow complexity instead of region, or merge roles. Dedicated escalation layers shrink as Fin routes or resolves more autonomously.
Team design is getting more modular and data-driven, with clearer ownership for configuration, governance, and Agent Analytics.
5) Broader digital transformation and operational modernization (16.27%). Beyond support, companies are modernizing their operating model: automation-first, digital self-service, better data foundations, and new vendor ecosystems. Collaboration patterns between data, ops, CX, and product/engineering are tightening, with a culture of experimentation and continuous improvement taking hold.
How have customer service roles and responsibilities changed due to Fin/AI agent implementation?
Implementing Fin or a similar AI agent profoundly changes how an organization operates, with around 95% of participants reporting some level of change in their processes after implementation. These systems have significantly reshaped the workflows that customer service teams are used to. Tasks once performed manually, such as ticket triage, routing, repetitive responses, and translations are now handled by AI agents.
“This marks a clear transformation in how customer service agents work: moving away from directly resolving customer queries to focusing on more analytical and procedural work”
As a result, customer service agents’ responsibilities have shifted from performing manual tasks to monitoring and fine-tuning the AI agent whenever its output is inaccurate or incomplete. This marks a clear transformation in how customer service agents work: moving away from directly resolving customer queries to focusing on more analytical and procedural work, such as testing, QA, and performance analysis of AI outputs.
Human agents who still handle conversations tend to do so either because the AI agent cannot yet respond adequately, or because of an organizational choice to retain human involvement for sensitive or high-value interactions. Nevertheless, the need for such roles is diminishing. Around 28% of participants reported a reduction in Tier 1 staff or a hiring slowdown or a full hiring freeze, as AI agents increasingly manage simple requests and organizational attention shifts towards improving automation efficiency.
“In some cases, this has led to the creation of specialized AI teams, reorganizations around workflow complexity, or the merging and redefinition of existing roles”
However, this transformation is not uniform across companies. While some roles have disappeared (particularly escalation layers), others have emerged. Many organizations are reallocating existing staff to AI management or hiring new technical profiles such as automation engineers, implementation specialists, and AI leads. In some cases, this has led to the creation of specialized AI teams, reorganizations around workflow complexity, or the merging and redefinition of existing roles.
Around 83% of participants reported changes to their roles or responsibilities following the introduction of Fin or similar AI agents. Specifically, customer service agents who no longer handle basic queries now focus on managing AI performance, reviewing Fin tasks and improving automation outputs. Managers oversee AI evaluation and implementation, coordinate testing, and monitor AI metrics such as resolution and involvement rates. In some organizations, new dedicated roles have emerged—AI specialists, automation managers, or Fin owners—reflecting a strategic shift toward automation-first, digital self-service models.
These structural shifts are also cultural. I’m seeing teams embrace experimentation, versioning, and eval-driven development while deepening collaboration with data, operations, and product/engineering. The move from outcomes vs output OKRs is palpable: leaders are measuring containment, deflection, CSAT, and time-to-resolution with new rigor.
Overall, a widespread transformation is underway. Roles are broadening, responsibilities are diversifying, and cross-functional collaboration is becoming the norm. Given the pace of gen ai improvement and the rise of agentic AI patterns, I expect these shifts to intensify.
This evolution raises two important questions
Firstly, do customer service agents possess the skills required to succeed in these new roles? While they are experts in customer interaction and company policy, their work now demands new competencies in data analysis (e.g. reporting AI agent performance and how it changes over time), quality assurance/debugging (e.g. Fin output testing and versioning), and cross-functional communication (e.g. if help from another team is required, drafting a business case to justify the resources required could be needed).
Secondly, what long-term strategies are companies adopting to support these evolving roles? Some are reorganizing entirely around automation, while others retain traditional structures. For those undergoing transformation, it remains unclear whether these changes are part of a deliberate strategic plan aimed at achieving specific performance outcomes, or the result of experimentation without defined goals.
Ultimately, Fin’s success— and of AI in customer service more broadly— depends not only on the technology itself but on the people and strategies that shape its use. In my experience, the winners invest early in data literacy, robust QA, clear ownership, and governance; they align product, ops, and CX around a shared AI roadmap; and they measure what matters with disciplined Agent Analytics. That’s how you turn AI workflows into durable customer and business outcomes.
I’ve lost count of how many times I’ve been asked for a “quick AI agent” that can autonomously fix customer problems, write code, or run sales ops. The promise is intoxicating—and I get why. But in practice, sustainable impact comes from disciplined product thinking, not wishful automation. Drawing on my experience leading product for complex, agentic AI initiatives, I want to debunk four misconceptions I see repeatedly and share what actually works.
Misconception 1: AI agents are plug-and-play. The reality is that effective agentic AI behaves more like a new product line than a feature toggle. It needs clear job stories, domain grounding, tool access, and guardrails. I start by narrowing scope to one painful job to be done, then design AI workflows that reflect real constraints (SLAs, compliance, edge cases). From day one, I instrument with Agent Analytics and set up eval-driven development so we can see failure modes early and iterate with intent.
What consistently moves the needle is treating the agent like a teammate you onboard: define responsibilities, provide the right tools, and measure outcomes. I pair scripted validations with live evals, track containment rates and handoff quality, and balance precision/recall depending on the risk profile. This is slow to fast, not fast to broken.
Misconception 2: Bigger models make better agents. In my experience, architecture outperforms horsepower. A retrieval-first pipeline, tight context window management, and practical prompt engineering often beat an oversized model that hallucinates. Tool use matters more than model size: give the agent reliable APIs, clear schemas, and deterministic fallbacks. For LLMs for product managers, the play is to right-size the foundation model and invest in data quality, prompts, and evaluators that reflect your true acceptance criteria.
When I see erratic behavior, I don’t immediately swap models; I improve retrieval, prune irrelevant context, and clarify the agent’s planning loop. Most performance gains come from better state management and grounding rather than a pricier token budget.
Misconception 3: Agents replace teams. High-performing organizations design human-in-the-loop systems. I implement human review on high-risk actions, explicit escalation paths, and simple override mechanisms. That’s not just safety theater—it’s good product design. AI risk management and data governance are part of the product backlog, not an afterthought. In customer support ai strategy, for example, the agent drafts, a specialist approves, and the system learns from deltas to tighten future responses.
The social system matters as much as the technical one: clear role boundaries, audit trails, and feedback loops turn the agent into a force multiplier. Teams gain leverage without surrendering accountability.
Misconception 4: Shipping the agent equals success. Adoption is earned, not announced. I treat agent launches like any product-led growth motion: define activation events, remove friction with in-app guides and product tours, and A/B test prompts, tool choices, and UI affordances. We track time-to-value, task completion rate, and user trust signals (edits, undo patterns, and escalation requests). When we get those leading indicators right, retention follows.
Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.
My playbook is simple and repeatable: frame the problem narrowly, ground the agent with the right tools and data, measure with eval-driven development and Agent Analytics, then grow adoption with a disciplined go-to-market inside the product. The agents that win don’t feel like magic—they feel dependable. That’s what customers trust, and that’s what scales.
Every week, I watch the cybersecurity landscape shift under our feet. As a VP of Product Management, I’m responsible for building secure, resilient products—and that means understanding how artificial intelligence is transforming the way IT teams defend, respond, and even anticipate attacks.
Learn the ways in which AI is transforming both cybersecurity offense and defense for IT teams.
First, AI supercharges threat detection and prevention. Pattern-recognition models now sift through endpoint telemetry, identity signals, and network flows to surface anomalies in near real time. In practice, that means fewer false positives, faster prioritization, and earlier containment. We’re pairing behavioral analytics with enrichment from our SIEM/EDR stack so analysts get a ranked, explainable view of risk instead of a noisy alert queue—directly improving mean time to detect and laying the groundwork for scalable threat detection and response.
Second, AI accelerates incident response. We’ve embedded LLM-powered copilots into our SOC workflows to summarize alerts, propose next-best actions, and auto-generate draft remediation steps from playbooks. Orchestration then executes routine tasks—isolating endpoints, rotating credentials, updating tickets—while keeping a human-in-the-loop for approvals. To keep this safe, we use privacy-by-design principles, a retrieval-first pipeline for authoritative playbook content, and eval-driven development to measure precision/recall on suggested actions. The result is meaningful reduction in mean time to recover and more consistent incident management.
Third, the offense is getting smarter—and we need to be honest about it. Adversaries use gen AI to craft targeted spear-phishing, deepfake executive voice notes, and polymorphic malware that evades signature-based tools. We counter by red-teaming with AI, deploying deception tech to waste attacker cycles, and hardening identity as the new perimeter (MFA, conditional access, continuous risk scoring). Education matters, too: when employees see how convincing AI-generated lures have become, phishing reports spike and successful compromise rates drop.
None of this works without strong governance. We treat AI like any high-impact capability: rigorous data governance, model access controls, and AI risk management across the lifecycle. We log model prompts and outputs, restrict sensitive data via contextual policies, and continuously test for drift and bias. This is as much an IT leadership challenge as it is a technical one—clear ownership, well-defined runbooks, and regular tabletop exercises make the difference between resilience and chaos.
If you’re getting started, I recommend a focused 90-day plan: identify one high-signal detection use case, one response playbook ripe for automation, and one employee risk area (usually phishing) for immediate uplift. Instrument everything—latency, precision/recall, MTTR—and iterate with a cross-functional group spanning security engineering, SRE, and product management leadership. With disciplined AI strategy and guardrails in place, you can move faster, reduce noise, and stay ahead of adversaries without compromising data or trust.
Every quarter, I revisit the same three questions: Are we accelerating adoption, lowering cost-to-serve, and managing risk without slowing the roadmap? Tools that help me answer all three with clarity earn a place in my stack. That’s why the concept behind Pendo’s Agent Analytics resonates so strongly—it gives product leaders a way to see, in one view, how users engage with AI-powered assistants, in-app guides, and core workflows, and how those behaviors translate into product-led growth.
Increase revenue, cut costs, and reduce risk with Pendo’s Software Experience Management platform. Optimize the entire software experience to drive adoption and improve engagement.
In practice, Agent Analytics functions as a unified analytics platform for the modern product team. I can observe how users interact with agents and nudges inside the product, connect those interactions to user activation and retention analysis, and prioritize improvements that deliver measurable outcomes. The result is fewer blind spots across the journey and a tighter feedback loop between discovery and delivery.
The real value shows up when I pair analytics with targeted interventions. For example, I’ll instrument critical paths, baseline activation, then use in-app guides to remove friction at the exact moment users need help. I incorporate A/B testing and continuous discovery to validate which prompts, pathways, or workflows actually move the needle. With a clean view of adoption, engagement, and time-to-value, my team can double down on what works and retire what doesn’t—faster.
Risk reduction is equally important. With clear behavioral signals, I can spot confusing prompts, unhelpful agent responses, or unexpected drop-offs before they scale into churn or support volume. That visibility informs our product strategy, aligns stakeholders on trade-offs, and keeps our governance tight without stifling innovation—especially critical as AI Strategy becomes part of everyday product decisions.
If you’re weighing whether Agent Analytics deserves a place in your toolkit, consider this: better instrumentation yields better decisions. When you unify guide interactions, agent engagement, and core product usage, you can attribute uplift more precisely, forecast impact with greater confidence, and operationalize product-led growth. That’s how we increase adoption, cut unnecessary cost, and de-risk the roadmap—while building experiences customers actually love.
In my role leading product, I’ve learned that the fastest path to higher-quality deliverables from large language models (LLMs) is not a clever prompt—it’s rigorous context. I call the practice AI context pulling: a repeatable way to assemble, compress, and structure the most relevant knowledge before the model ever starts generating. Done well, it turns generative AI into a dependable partner for discovery, prioritization, and execution.
AI context pulling means I proactively gather the right artifacts (customer insights, analytics, strategy, constraints), manage context windows intentionally, and shape the model’s task with clear objectives and guardrails. This reduces hallucinations, improves alignment, and creates traceability back to sources—critical for product management leadership and stakeholder trust.
Learn a new way in which product professionals can collaborate with AI to get even better results on their projects.
Here’s the simple flow I use: first, I define the intent (e.g., “synthesize discovery interviews for a positioning brief”). Next, I inventory relevant context: top customer pains from product discovery, usage patterns from Amplitude analytics, recent support trends from Intercom, and any constraints from our product strategy. Then I run a retrieval-first pipeline to select only the most pertinent slices—favoring recency, representativeness, and canonical sources.
Because context window management matters, I compress long documents into short, source-cited summaries and keep raw excerpts handy when nuance is important. My prompts follow a consistent structure: role and objective, constraints and audience, curated context, the explicit ask, preferred output format, and a brief self-check (e.g., “cite sources and flag uncertainty”). This is prompt engineering for reliability, not theatrics.
A quick example: when drafting a one-page feature brief, I attach three items—the product strategy paragraph that sets the frame, a usage cohort analysis that highlights who’s affected, and five verbatim customer quotes. I ask the LLM to propose a problem statement, success criteria, and a shortlist of solution hypotheses, each tied to a cited piece of evidence. The result is a grounded, decision-ready artifact I can share with product trios and stakeholders.
Tooling-wise, I keep it pragmatic. A lightweight retrieval-first pipeline (embeddings, metadata filters, and recency rules) ensures the LLM pulls what matters. I version prompts and contexts together so I can run quick A/B testing on output quality. And I log decisions and sources to support eval-driven development and continuous discovery.
Common pitfalls are avoidable. Too little context yields generic answers; too much overwhelms the model. Stale docs can mislead; curate aggressively. Vague asks invite fluffy prose; specify outcomes, audiences, and formats. If the task is high risk, I bias toward smaller, well-cited outputs and expand iteratively with human review in the loop.
To measure impact, I track rework rate, review time, and stakeholder alignment on first pass. Over time, teams adopting AI context pulling report clearer artifacts, faster synthesis cycles, and more confident decisions—because every recommendation traces back to evidence. That’s how humans and LLMs truly collaborate better: we provide the right context, and the model amplifies our judgment.
If you’re ready to operationalize this, start by templatizing your most common product workflows—discovery synthesis, roadmap rationale, and release notes—and attach small, high-signal context packs. With a retrieval-first mindset and disciplined prompting, AI becomes an extension of your product craft, not a gamble.