AI Product Leadership: Faster Learning, Safer Systems
AI-enabled product leadership is not primarily a contest to automate more work. The stronger opportunity is to shorten learning loops while improving the quality, traceability, and safety of product decisions.

Across the five source articles, a common operating model emerges: begin with bounded problems, connect AI to real customer evidence, define quality through domain expertise, and make safeguards proportional to the consequences of failure. This model applies both to internal product workflows and to customer-facing AI systems.

Move from an AI tool stack to an evidence system

The article on essential tools for product managers presents AI as a working layer across product intelligence, research, analytics, roadmapping, design, prioritization, and delivery. Its most useful implication is that tool selection should begin with the decision a team needs to improve, not with the number of AI features available.

A feedback summarizer, behavioral analytics platform, prototyping assistant, and requirements generator can each save time. Their strategic value appears when their outputs are connected: qualitative feedback helps explain observed behavior, behavioral evidence tests assumptions raised in interviews, and both inform prioritization. The product manager still has to reconcile customer pain, business outcomes, engineering effort, differentiation, and stakeholder expectations.

The practical guide to finding AI use cases reaches the same conclusion from a different direction. It recommends starting with a concrete item from everyday work, testing how AI might help, and studying the gap between the desired result and the output. It specifically proposes a 15-minute daily practice and treats an initially poor result as evidence about instructions, context, constraints, or model capability.

Together, these perspectives suggest two complementary levels of adoption. At the individual level, task-first experimentation builds judgment about what AI can do. At the team level, connected evidence workflows turn that judgment into a repeatable product operating system. Buying tools without the first creates shallow adoption; isolated personal experiments without the second produce scattered efficiency rather than organizational learning.

Use AI to deepen discovery, not to create distance from customers

The 2026 roadmap article frames roadmaps as portfolios of experiments involving products, learning methods, teaching models, and choices about what to stop doing. It argues that AI can reduce tedious discovery work and provide feedback on demanding skills, including interviewing, assumption testing, and opportunity mapping. At the same time, it warns against substituting agents or dashboards for human curiosity and direct customer contact.

That tension supplies an important boundary for AI-enabled discovery. Models can organize notes, identify recurring themes, critique an interview guide, expose possible confirmation bias, or compare evidence across sources. They cannot independently determine whether the team asked the right customers, understood the social context, or interpreted ambiguous language correctly. Those remain product and research judgments.

The safety-first consent coach described in the Override Labs article illustrates why context matters. According to that account, the nonprofit examined 2,000 Reddit posts per subreddit to validate demand and understand how vulnerable questions were expressed. The discovery material included uncertainty, shame, peer pressure, and the possibility that someone might be seeking permission rather than reflection. A conventional feature request or decontextualized summary could have obscured those conditions.

The cross-team review reinforces this point through other domains. It reports that former teachers at eSpark created evaluation rubrics based on how educators assess student work and enriched educational content with domain-specific metadata when generic embeddings produced weak matches. It also describes how local-government knowledge at Zencity changed the interpretation of sentiment, and how incident-response experience informed Incident.io’s investigation architecture. Across these examples, AI increased the importance of domain expertise because people still had to define what relevance, quality, and failure meant.

Let the consequence of failure determine the product architecture

Not every AI-assisted task needs the same controls. A weak draft of an internal stakeholder update can be reviewed and corrected cheaply. A response that could be interpreted as permission in a consent-related situation has a fundamentally different risk profile. Responsible product development begins by distinguishing those cases before selecting architecture or interaction patterns.

The Override Labs account offers the clearest high-stakes pattern. The team reportedly defined a "South star" around the worst outcome: a teenager using the product response as a green light for harmful action. The product therefore avoids giving a green-flag verdict. It runs deterministic risk classification before calling Claude, adjusts responses by risk tier, and uses a structure that validates, reflects, and invites further reflection. A licensed therapist contributed to the evaluation rubric, while positive masculinity coaches helped shape the tone.

The underlying principle is broader than that implementation. A generative model should operate inside a product-defined safety system rather than becoming the safety system. Product leaders can translate that principle into four design questions: what outcome must never be encouraged, which decisions require deterministic handling, when should generation be constrained or withheld, and which domain experts are qualified to judge the response?
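One way to picture a model operating inside a product-defined safety system is a rule-based gate that runs before any generation call. The sketch below is illustrative only and is not the Override Labs implementation: the tier names, the term lists, and the `generate` callable are all assumptions, and real rules would be written with qualified domain experts.

```python
from enum import Enum
from typing import Callable, Iterable


class RiskTier(Enum):
    LOW = 1
    ELEVATED = 2
    HIGH = 3


def classify_risk(message: str,
                  high_terms: Iterable[str],
                  elevated_terms: Iterable[str]) -> RiskTier:
    """Rule-based tiering: the same message always gets the same tier.

    Terms are hypothetical, lowercase phrases supplied by the product team.
    """
    text = message.lower()
    if any(term in text for term in high_terms):
        return RiskTier.HIGH
    if any(term in text for term in elevated_terms):
        return RiskTier.ELEVATED
    return RiskTier.LOW


def respond(message: str,
            generate: Callable[[str, RiskTier], str],
            high_terms: Iterable[str],
            elevated_terms: Iterable[str],
            fixed_high_risk_reply: str) -> str:
    tier = classify_risk(message, high_terms, elevated_terms)
    if tier is RiskTier.HIGH:
        # Generation is withheld: the reply is pre-written and expert-reviewed.
        return fixed_high_risk_reply
    # The model runs only inside the tier chosen by product-owned rules.
    return generate(message, tier)
```

The design choice worth noticing is the order of operations: the tier is decided by code the team can read and test, and the model never sees the highest-risk case at all.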

The review of AI product teams adds another trust boundary: deciding when a system should admit that it does not know. This is both a model-quality issue and a product behavior. Teams need to specify what insufficient evidence looks like, what the interface communicates in that state, and whether the user should retry, provide more context, consult a person, or stop the workflow.

This risk-based approach avoids two unhelpful extremes. Applying high-stakes controls to every low-consequence drafting task can make experimentation needlessly heavy. Treating sensitive decisions like ordinary content generation can leave critical failure modes to probabilistic behavior. The appropriate control set follows the plausible harm, reversibility, affected population, and the user’s ability to detect an error.

Make evaluation, privacy, and leadership part of delivery

The production-team review describes evaluation as an evolving operational capability rather than a final test. It reports that Stack Overflow ran about 50 experiments across five pods in three months, produced four versions of an AI-powered search product, and ultimately stopped that effort. Arize began building its Alyx agent before established agent frameworks were available, while eSpark’s former teachers learned to write evaluation code with LLM assistance. These are source-reported examples, not independently verified benchmarks, but they demonstrate how structured learning can support both shipping and stopping decisions.

Evaluation should therefore start when the use case is defined. Early rubrics can be simple: representative tasks, expected properties, unacceptable outputs, and a review process. As the product matures, teams can add risk tiers, regression sets, production observations, and explicit release criteria. The goal is not to claim that a model is universally good; it is to establish whether a particular system performs acceptably within a bounded workflow.
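An early rubric of this kind can be very small. The following sketch assumes that expected properties and unacceptable outputs can be approximated by substring checks, which is a deliberate simplification; `EvalCase`, `run_eval`, and the `system` callable are hypothetical names, and human review of the failures remains part of the process.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EvalCase:
    name: str
    prompt: str                                          # representative task
    required: list[str] = field(default_factory=list)    # expected properties
    forbidden: list[str] = field(default_factory=list)   # unacceptable outputs


def run_eval(cases: list[EvalCase],
             system: Callable[[str], str]) -> dict[str, list[str]]:
    """Return only the failing cases, each with the reasons it failed."""
    failures: dict[str, list[str]] = {}
    for case in cases:
        output = system(case.prompt).lower()
        problems = [f"missing: {r}" for r in case.required
                    if r.lower() not in output]
        problems += [f"forbidden: {f}" for f in case.forbidden
                     if f.lower() in output]
        if problems:
            failures[case.name] = problems
    return failures
```

Keeping the cases in version control lets the same list grow into a regression set as the product matures.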

Privacy belongs in the same product definition. The consent-coach article reports that the service uses no accounts, cookies, or cross-session tracking. That choice limits conventional retention analytics, but it also supports the trust required for a sensitive interaction. It shows that less data can be a deliberate product feature when identification or surveillance would discourage honest use.

Leadership determines whether these practices persist. The roadmap article argues that training alone does not change an organization when leaders continue to reward old behaviors. Its proposed learning model combines on-demand material, AI-generated feedback, coaching resources, and human support. The practical-use-case article similarly recommends peer demonstrations and structured practice. Both suggest that AI readiness is a management system: teams need permission to experiment, shared examples, quality standards, and leaders who reinforce evidence-based behavior.

Key takeaways
- Start with a bounded task and a defined outcome; use repeated practice to learn where AI adds leverage and where it fails.
- Connect research, feedback, behavioral data, prioritization, and delivery so that AI improves decisions rather than producing isolated artifacts.
- Keep direct customer contact and domain expertise at the center of discovery, synthesis, and quality judgment.
- Define the worst credible outcome before designing a customer-facing AI experience, then match controls to that risk.
- Build evaluation and privacy into the product operating model, including criteria for refusing, escalating, or admitting uncertainty.
- Measure AI leadership by better learning and safer outcomes, not by tool count, output volume, or automation alone.

Building the next product operating rhythm

The next step for product organizations is not a universal AI playbook. It is a disciplined rhythm in which teams choose a real problem, gather contextual evidence, define acceptable and unacceptable behavior, test a bounded intervention, and revise or stop it based on results. As AI capabilities change, that rhythm can remain stable. It gives product leaders a way to pursue faster learning without treating speed as a substitute for responsibility.

July 3, 2026
Reliable AI Coding Requires Four Kinds of Control
Reliable AI coding is not primarily a matter of finding a better prompt or a more capable model. It is a workflow-design problem: teams must control what the product should do, what the repository currently does, what the model can see, and what the agent is allowed to change.

Managing those four kinds of state turns an AI coding session from an open-ended conversation into a bounded engineering process. The payoff is faster iteration without treating plausible output, confident status messages, or large context windows as substitutes for evidence.

Reliability depends on the surrounding system

A large language model generates an answer token by token from the input available to it. That input can include more than the visible request: an application may add system instructions, conversation history, project files, enabled tools, skills, and other supporting context. As Shivam.Consulting Blog’s guide to how ChatGPT works explains, the surrounding application therefore helps shape the result even when two products use the same underlying model.

This mechanism has an important operational consequence. An agent can produce code that looks convincing without possessing a stable model of the intended product, the complete repository, or the runtime environment. Fluency indicates that the output fits learned patterns; it does not establish that the implementation satisfies the requirement.

A dependable workflow consequently controls four connected states. Product state covers requirements, constraints, permissions, edge cases, and acceptance criteria. Repository state covers the actual code, data model, dependencies, tests, and uncommitted changes. Model state covers the instructions and evidence present in the context window. Execution state covers tools, filesystem access, commands, network activity, and other permissions. A failure in any one can appear to be a coding error even when the code is not the original cause.

Tool selection should reflect that distinction. Shivam.Consulting Blog’s vibe-coding playbook recommends managed app builders when the purpose is to explore an interaction or answer an early product question, while positioning developer-oriented coding agents as more appropriate for existing repositories, multi-file changes, tests, and review workflows. The useful dividing line is not whether a tool can generate code. It is whether the environment exposes enough control and evidence for the consequence of the change.

Convert product intent into a bounded change contract

Many unreliable sessions begin before an agent edits a file. If the requested behavior, non-goals, affected users, data rules, and observable success conditions remain ambiguous, the model must fill the gaps. Each follow-up correction can then preserve a different assumption, creating a chain of locally plausible patches without a coherent final design.

A stronger starting point is a compact change contract written outside the chat. It should identify the outcome, relevant current behavior, permitted scope, important invariants, expected edge cases, and the evidence that will demonstrate completion. For a defect, that evidence begins with a reproducible failing case. For a feature, it includes examples of accepted and rejected behavior. The contract should also record explicit non-goals so that an agent does not broaden a narrow request while attempting to be helpful.
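A change contract does not need special tooling; a small structured record is enough to make missing pieces visible before an agent starts. The sketch below is one possible shape, with hypothetical field names drawn from the list above and an assumed rule that five of the fields must be filled in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeContract:
    outcome: str
    current_behavior: str
    permitted_paths: tuple[str, ...]
    invariants: tuple[str, ...] = ()
    edge_cases: tuple[str, ...] = ()
    non_goals: tuple[str, ...] = ()
    acceptance_evidence: tuple[str, ...] = ()

    # Fields this sketch treats as mandatory before implementation begins.
    REQUIRED = ("outcome", "current_behavior", "permitted_paths",
                "non_goals", "acceptance_evidence")

    def gaps(self) -> list[str]:
        """Names of required fields that are still empty."""
        return [name for name in self.REQUIRED if not getattr(self, name)]
```

A draft such as `ChangeContract(outcome="Export respects role permissions", current_behavior="Export ignores roles", permitted_paths=("src/export/",))` would report `non_goals` and `acceptance_evidence` as gaps, which is a prompt to finish the contract rather than start the session.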

Blast radius deserves separate attention. The vibe-coding playbook uses data, controller, and view as a practical three-layer model. A request involving permissions, sorting, filtering, workflow state, or reporting may cross all three even if it appears in the interface as a small change. Reviewing the planned impact across storage, logic, and presentation helps reveal missing migrations, inconsistent validation, stale queries, and user-interface states before implementation begins.

The same source proposes separate plan-review-fix and implement-review-fix loops. Combined with the change contract, these become distinct gates rather than one continuous conversation. The plan gate asks whether the proposed files, layers, and tests match the requirement. The implementation gate asks whether the resulting diff and observed behavior match the approved plan. Separating the gates makes it easier to reject a mistaken approach before it accumulates code.

This structure also clarifies the human role. The agent can explore the repository, propose a plan, implement a bounded change, and help investigate failures. Product and engineering owners remain responsible for deciding what behavior is correct, which tradeoffs are acceptable, and what evidence is sufficient to ship.

Treat context as a limited working set, not permanent memory

A long conversation can feel comprehensive while becoming less dependable. Shivam.Consulting Blog’s context-rot analysis reports research showing that model performance can deteriorate as input length grows and that information at different positions may receive unequal attention. The article’s practical conclusion is more useful than any advertised context-window maximum: available capacity should not be confused with reliable attention.

Context should therefore be curated as a task-specific working set. Durable facts belong in versioned project documents; the active session should receive only the instructions, files, decisions, and evidence needed for the current change. Old tool output, abandoned plans, duplicate explanations, and superseded requirements consume attention without improving the task.

Shivam.Consulting Blog’s guide to Claude Code workflows describes a layered memory pattern: broad preferences in global instructions, project-specific conventions in repository-level files, and reference material loaded when relevant. It also presents stored commands as a way to make recurring procedures explicit, and sub-agents as a way to isolate context or perform independent work. The transferable principle is architectural rather than product-specific: stable policy, project knowledge, task instructions, and transient evidence should not be mixed into one ever-growing transcript.

A clean session boundary can be a reliability control. When a conversation has accumulated contradictory instructions or repeated failed fixes, the next step should not automatically be another patch request. A new session can begin from a short handoff containing the approved change contract, current repository state, attempted approaches, observed failures, and unresolved questions. This preserves useful evidence without carrying the entire history forward.

Sub-agents require the same discipline. Parallelism is valuable when work can be partitioned into independent questions, such as locating relevant code, examining tests, or reviewing a proposed diff. It is less useful when several agents can modify overlapping files or make incompatible architectural assumptions. Each delegated task needs a narrow scope, an expected output, and a rule for whether it may write or only report.

Require evidence, limited authority, and a recovery path

An agent’s statement that a problem is fixed is a claim to verify, not completion evidence. Verification should return to the original reproducer or acceptance criteria, then examine the diff and run the smallest relevant checks. Broader tests can follow when the change crosses modules, alters shared behavior, or affects data. This sequence distinguishes a real correction from a patch that merely changes the visible symptom.

Review should inspect both behavior and change shape. A diff may pass a narrow test while introducing unrelated refactoring, weakening validation, swallowing errors, or duplicating logic. Unexpected file changes, new dependencies, disabled checks, and unusually broad edits are signals to pause. If the evidence is inconclusive, the workflow should return to diagnosis rather than asking the same context-saturated agent to keep editing.
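Some of these pause signals can be checked mechanically before a human reads the diff. The sketch below assumes the list of changed paths has already been obtained from version control and that the permitted prefixes come from the change contract; the function name, the manifest list, and the file-count threshold are illustrative choices, not a standard.

```python
from pathlib import PurePosixPath

# Files whose modification usually means dependencies changed (assumed list).
MANIFESTS = {"package.json", "package-lock.json",
             "pyproject.toml", "requirements.txt"}


def review_signals(changed_files: list[str],
                   permitted_prefixes: tuple[str, ...],
                   max_files: int = 10) -> list[str]:
    """Return human-readable reasons to pause before accepting a diff."""
    signals = []
    for path in changed_files:
        if not path.startswith(permitted_prefixes):
            signals.append(f"out of scope: {path}")
        if PurePosixPath(path).name in MANIFESTS:
            signals.append(f"dependency manifest changed: {path}")
    if len(changed_files) > max_files:
        signals.append(f"unusually broad edit: {len(changed_files)} files")
    return signals
```

An empty result does not mean the change is correct; it only means none of these particular shape checks fired.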

Reliability also depends on limiting what an agent can do. Shivam.Consulting Blog’s Claude Code risk guide describes escalating exposure as an agent moves from reading a project folder to reading elsewhere, fetching external material, writing files, executing generated code, and installing third-party packages or extensions. Although permission models vary by product, the general control is consistent: grant the least authority required for the current step and review the exact path or command before approval.

Folder boundaries should match the task boundary. Credentials, customer information, confidential documents, and unrelated projects should not be placed within an agent’s working scope. One-time approval is preferable when an operation is unusual or its future use would be difficult to predict. Commands that delete, overwrite, upload, install, or execute deserve more scrutiny than read-only inspection because their impact is larger or harder to reverse.
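The difference in scrutiny between read-only inspection and state-changing commands can be expressed as a simple triage rule. This is a rough sketch under stated assumptions: the program lists are examples rather than a complete policy, any shell metacharacter is treated conservatively, and actual permission models differ by product.

```python
import shlex

READ_ONLY = {"ls", "cat", "grep", "head", "tail", "wc"}
HIGH_SCRUTINY = {"rm", "mv", "cp", "chmod", "curl", "wget",
                 "pip", "npm", "bash", "sh", "python"}
# Redirection, pipes, chaining, and substitution can hide a second action.
SHELL_META = set(">|;&`$")


def approval_level(command: str) -> str:
    """Classify a proposed command as 'allow', 'review', or 'review-each-time'."""
    if SHELL_META & set(command):
        return "review-each-time"
    try:
        tokens = shlex.split(command)
    except ValueError:          # e.g. an unbalanced quote
        return "review-each-time"
    if not tokens:
        return "review-each-time"
    program = tokens[0].rsplit("/", 1)[-1]   # "/bin/rm" -> "rm"
    if program in HIGH_SCRUTINY:
        return "review-each-time"
    if program in READ_ONLY:
        return "allow"
    return "review"             # unknown programs default to human review
```

Unknown commands fall through to review rather than approval, which keeps the default on the cautious side.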

Reversibility completes the control system. The safety guide emphasizes backups and version control because an AI coding interface may not provide a dependable undo operation. A clean checkpoint before implementation, small commits, reviewable diffs, protected secrets, and a tested rollback path reduce the cost of both model errors and human approval mistakes. For higher-risk work, the agent should operate in a disposable branch, isolated environment, or similarly constrained workspace rather than directly against valuable state.

These safeguards are mutually reinforcing. A bounded contract limits scope; curated context reduces instruction drift; verification exposes incorrect claims; least privilege limits blast radius; and version control makes recovery practical. Removing any one of them shifts too much trust onto probabilistic output.

Key takeaways
- Control product state, repository state, model context, and execution authority as separate parts of one workflow.
- Write a change contract with scope, non-goals, invariants, edge cases, and acceptance evidence before implementation.
- Keep context task-specific; store durable knowledge in files and start a clean session when history becomes contradictory or noisy.
- Treat an agent’s completion report as a hypothesis until the original reproducer, relevant tests, observed behavior, and diff support it.
- Match permissions and isolation to the risk of the operation, and create a recovery point before allowing changes.

As coding agents gain more tools and autonomy, reliable teams will distinguish themselves less by how much work they delegate than by how clearly they define authority, evidence, and recovery. The durable advantage will come from workflows in which faster generation is paired with tighter control.

July 3, 2026

Secure System Access for AI Agents: A Phased Control Model

An AI agent becomes operationally valuable when it can move beyond explaining a process and complete the underlying work. That same transition gives the agent access to sensitive data and consequential actions, so integration must be designed as both a product capability and a security boundary.

The practical objective is not maximum access. It is the smallest dependable set of permissions that lets an agent resolve a well-defined workflow, supported by deterministic controls, observable outcomes, and a clear path to human intervention.

System access changes both the value and the risk

Without backend access, an agent can describe how to update an account, check a renewal, or report a damaged order. With access to a CRM, billing platform, or order-management system, it can potentially retrieve the relevant record and complete the request during the conversation. The Intercom article presents this shift from answering to acting as a central difference between basic AI adoption and mature deployment.

The article cites Intercom’s 2026 Customer Service Transformation Report, which reports that 87% of teams with mature AI deployments saw improved metrics, compared with 62% of teams overall. It also reports that 82% of senior leaders said their teams had invested in AI during the preceding year, while only 10% said they had reached mature deployment. These source-reported figures suggest an integration gap, but they do not independently establish that system access caused the reported improvements or that an integration is secure.

Security therefore cannot be added after the workflow succeeds. A customer-facing interface may remove the need to visit a separate application, but it must not remove identity and authorization checks. The agent still needs a trustworthy way to associate the request with the correct customer, determine what that customer is permitted to do, and constrain the backend operation accordingly.

Choose workflows where access justifies its complexity

Not every automated conversation benefits equally from deeper integration. Intercom reports the results of rebuilding four fixed, scripted Tasks as Procedures with system access. Over the 12 months through May 2026, the reported resolution rate for its bounce-list workflow rose from 9.3% to 79.9%, and the rate for bug reporting rose from 9.2% to 66.5%. Email forwarding moved from 44.9% to 66.5%, but Messenger installation rose only from 67% to 69.2%.

The variation is more instructive than the headline gains. According to the article, the bounce-list process required multi-step reasoning, dynamic branches, and error recovery. Bug reporting still ended in a human handoff, but the procedure improved that handoff by pre-triaging the issue, surfacing possible GitHub matches, extracting relevant URLs, and requesting impersonation access. Messenger installation was already a comparatively linear process, leaving less room for improvement.

A suitable first integration is therefore not merely a popular support topic. It should be high-volume and repeatable, have an identifiable system owner, and depend on live data or actions that materially change the outcome. Existing APIs improve feasibility, but the security review should also consider data sensitivity, reversibility, authorization complexity, and the consequences of acting on an ambiguous request.

Use an access ladder instead of a single launch

The phased approach described by Intercom can also serve as a security model. Each stage expands capability only after the workflow and its controls have produced enough evidence to justify the next step.

Stage	Agent capability	Appropriate use	Control emphasis
No integration	Guide, troubleshoot, check policy, triage, and route	Discover where explanations repeatedly lead to manual work	Evaluate answer quality, routing accuracy, and escalation behavior
Read-only access	Retrieve approved fields such as order or subscription status	Resolve information requests without changing a record	Restrict endpoints, records, and fields; verify customer authorization
Write access	Update records or initiate actions such as cancellations or refunds	Complete bounded workflows after earlier stages are dependable	Validate inputs, limit action scope, record outcomes, and require approval where consequences warrant it

Mock responses can test branching logic before an API is ready, as the Intercom article notes. It also proposes a temporary human-in-the-loop step when an integration is still several engineering sprints away. These methods can validate the workflow and expose missing requirements, but simulated success should not be treated as proof that production identity, authorization, failure recovery, and audit controls are ready.

Put deterministic controls around probabilistic decisions

Plain-language workflow instructions can guide an agent, but security-critical constraints should not depend solely on the model interpreting those instructions correctly. A safer architecture places enforceable controls between the agent and each backend system.

Control	Practical design implication
Dedicated identity	Give the agent its own service identity rather than borrowing a staff account, so permissions and activity remain attributable.
Least privilege	Allow only the endpoints, operations, records, and fields required by the selected workflow.
Read and write separation	Keep retrieval permissions distinct from mutation permissions and grant write access only when the use case requires it.
Independent policy enforcement	Validate identity, authorization, limits, and required inputs outside the model before executing an operation.
Bounded actions	Prefer narrow, purpose-built operations over unrestricted database or administrative access.
Human approval and escalation	Route ambiguous, exceptional, sensitive, or difficult-to-reverse cases to an authorized person.
Auditability and monitoring	Record the request, decision, tool call, result, and escalation so failures and unusual patterns can be investigated.
Safe failure behavior	Prevent retries, timeouts, or partial completion from producing duplicated or inconsistent changes.
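Several rows of this table can be combined in a single enforcement point that sits between the agent and the backend. The sketch below is a simplified illustration, not a description of Intercom's product: `ActionRequest`, `PolicyGate`, the per-action approval limits, and the decision labels are all assumptions, and the customer identity is presumed to have been verified upstream rather than taken from model output.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ActionRequest:
    verified_customer_id: str   # established by the auth layer, not the model
    record_owner_id: str
    action: str
    amount: float
    idempotency_key: str        # stable across retries of the same request


@dataclass
class PolicyGate:
    approval_limits: dict[str, float]   # allowed action -> max unattended amount
    seen_keys: set[str] = field(default_factory=set)
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def decide(self, req: ActionRequest) -> str:
        if req.idempotency_key in self.seen_keys:
            decision = "duplicate"      # a retry must not act a second time
        elif req.action not in self.approval_limits:
            decision = "deny"           # bounded actions only
        elif req.verified_customer_id != req.record_owner_id:
            decision = "deny"           # authorization checked outside the model
        elif req.amount > self.approval_limits[req.action]:
            decision = "escalate"       # route to an authorized person
        else:
            decision = "execute"
            self.seen_keys.add(req.idempotency_key)
        # Every decision is recorded, including denials and duplicates.
        self.audit_log.append((req.idempotency_key, req.action, decision))
        return decision
```

Because the gate is ordinary code, its rules can be unit-tested and reviewed by the system owner independently of any prompt wording.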

The integration request should document the workflow in plain language, identify every read and write point, name the system owner, and specify the minimum required fields. It should also define how success and harm will be measured: not only whether the agent completed the conversation, but whether it selected the correct record, performed the authorized action once, protected restricted data, and escalated when it lacked sufficient confidence or permission.

This framing also improves the business case. Engineering is being asked to expose a narrowly scoped capability with explicit boundaries, rather than to provide broad access to a general-purpose agent. Leadership can then compare measurable workflow value with implementation effort and residual risk.

Key takeaways

- System access creates value when it lets an agent complete work, but it simultaneously expands the security boundary.
- The best initial workflow is frequent, bounded, operationally meaningful, and owned by a team that can approve its data and actions.
- Progress from no integration to read-only retrieval and then to narrowly scoped write operations; do not treat access as an all-or-nothing decision.
- Enforce identity, authorization, field restrictions, action limits, and audit logging outside the model’s natural-language instructions.
- Evaluate correctness, unauthorized-action risk, failure recovery, and handoff quality alongside resolution rate.

The strongest long-term pattern is a portfolio of small, governed capabilities rather than one broadly privileged agent. Each successful workflow can supply the evidence needed to extend access deliberately, while keeping the consequences of error visible and contained.

References

Intercom — Win Executive Buy-In for AI Agent System Access: Unlock Actions, Boost Resolution, Cut Costs

June 11, 2026

A Layered Playbook for Package Supply Chain Security
Package supply chain security is not simply a matter of choosing reputable libraries. The practical challenge is controlling an expanding dependency graph, the code that executes during installation, the resources that installed software can reach, and the automated tools allowed to make those decisions.

A useful defensive model follows the path an attack must take: enter through a package or dependency, execute in the development environment, discover valuable information, and transmit it elsewhere. Organizing safeguards around that sequence produces a stronger posture than relying on any single scanner, sandbox, or package reputation signal.

Package risk grows through the dependency graph

Developers usually evaluate the packages they select directly. The less visible risk lies in transitive dependencies: packages installed because another dependency requires them. The source article illustrates the scale of this effect by reporting that installing Jest brought in 266 packages. That example is not evidence that those dependencies were malicious; it shows how one deliberate choice can create hundreds of additional trust relationships.

This changes the unit of review. The relevant question is not only whether a named package appears legitimate, but whether its complete dependency graph is proportionate to the job. A small utility that introduces unfamiliar native modules, unrelated capabilities, or an unexpectedly broad tree deserves more scrutiny than its simple interface might suggest.

Manifests such as package.json, pyproject.toml, and requirements.txt make dependency installation repeatable. Repeatability alone, however, does not guarantee safety. If version ranges or unresolved transitive dependencies allow later releases to enter automatically, two installations based on the same manifest can produce different risk profiles. Pinning direct and transitive versions converts an evolving external graph into a more deliberate, reviewable input.
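A quick check for loose version ranges can make this concrete. The sketch below scans the text of a requirements.txt file and reports entries that are not pinned with an exact `==` version. It is a simplified heuristic with a hypothetical function name: it skips option lines, does not handle URL or hash-continuation syntax, and only covers transitive dependencies if the file was generated from a fully resolved environment.

```python
import re

# name, optional [extras], '==', an exact version (no '*' wildcard),
# and an optional environment marker after ';'.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?\s*==\s*[^\s*;]+\s*(;.*)?$"
)


def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that allow more than one version."""
    loose = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments
        if not line or line.startswith("-"):     # skip blanks and option lines
            continue
        if not PINNED.match(line):
            loose.append(line)
    return loose
```

For example, a file containing `requests==2.31.0`, `flask>=2.0`, and `numpy` would report the last two entries, since each can resolve to a different release on a later install.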

Match defenses to the stages of a package attack

The source article says an analysis covering more than 230,000 malicious-code incidents found a recurring pattern: malicious code first needs an entry point, then searches the device for sensitive data, and finally uses a network connection to exfiltrate what it finds. This reported pattern suggests three distinct control points.

Reduce risky entry and automatic execution

A waiting period for newly published packages can reduce exposure to releases that have not yet attracted community scrutiny. The article recommends installing only packages that are at least seven days old. That is a risk filter, not a guarantee: an older malicious package can remain undetected, while a legitimate urgent fix may occasionally justify an exception.
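The waiting period itself is a one-line comparison once a release's publication time is known. The sketch below assumes that timestamp has already been obtained from registry metadata as a timezone-aware value; the function name and the way exceptions are handled are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MIN_AGE = timedelta(days=7)   # the threshold recommended in the source article


def meets_waiting_period(published_at: datetime,
                         now: Optional[datetime] = None,
                         min_age: timedelta = MIN_AGE) -> bool:
    """True if the release has been public for at least min_age.

    Both datetimes must be timezone-aware so the subtraction is valid.
    """
    current = now or datetime.now(timezone.utc)
    return current - published_at >= min_age
```

A `False` result is best treated as "needs an explicit exception" rather than "never install", which leaves room for the occasional urgent fix.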

Installation scripts require separate treatment because they may execute before a developer has inspected the installed code. Disabling automatic install hooks by default creates a decision point. A package that depends on a post-install action can still be used, but the script, its purpose, and the capabilities it invokes should be reviewed first.
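For npm packages, the review can start by listing which install-time hooks a package declares. The sketch below reads the text of a package.json and returns any `preinstall`, `install`, or `postinstall` scripts, which are npm lifecycle scripts that run during installation. The function name is hypothetical, and the check covers only what the manifest declares, not what those scripts go on to do.

```python
import json

# npm lifecycle scripts that run as part of installing a package.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_text: str) -> dict[str, str]:
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}
```

An empty result means no install hooks are declared. A non-empty one gives the reviewer the exact commands to read before deciding whether to allow them, for instance in an environment where npm's `ignore-scripts` setting is enabled by default.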

Constrain access after installation

Pre-install review cannot catch every problem. The next layer limits what package code can inspect or modify if it does execute. Sandboxed folders and isolated development environments can reduce the blast radius, but the source cautions that isolation by itself does not prevent malicious code from entering. Access boundaries therefore complement package controls rather than replace them.

Limit unnecessary network egress

Stolen information has less value to an attacker if malicious code cannot transmit it. Restricting unnecessary outbound connectivity addresses the final stage of the reported pattern. This layer matters because a package may evade provenance review and execute inside an environment despite earlier controls. Entry controls, resource boundaries, and egress restrictions together create independent opportunities to interrupt the attack.
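
The egress idea reduces to an allowlist decision. The sketch below shows only that decision logic. In practice the restriction would be enforced by a firewall, proxy, or sandbox rather than by application code, and the host names in `ALLOWED_HOSTS` are placeholder assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real one would list only the hosts a project needs.
ALLOWED_HOSTS = {"registry.npmjs.org", "pypi.org"}

def egress_allowed(url: str, allowed: set[str] = ALLOWED_HOSTS) -> bool:
    """Allow a destination only when its exact host name is on the allowlist."""
    host = (urlsplit(url).hostname or "").lower()
    return host in allowed

print(egress_allowed("https://pypi.org/simple/"))            # True
print(egress_allowed("https://pypi.org.evil.example/post"))  # False
```

Matching the exact host, rather than a substring, avoids approving look-alike domains that merely contain an allowed name.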

Provenance is a decision process, not a trust badge

No single popularity or identity signal proves that a release is safe. The source proposes evaluating maintainer history, download patterns, repository activity, signed releases, and consistency across registries. Their value comes from comparison: a sudden change in maintainership, an unusual release pattern, or a mismatch between repository and registry information may warrant investigation even when each signal looks plausible in isolation.

Context also matters. Dependency behavior should be compared with the package’s stated purpose. A capability that is normal for a database driver may be difficult to justify in a formatting utility. This purpose-to-capability test helps teams focus limited review time on anomalies rather than treating every dependency as equally suspicious.

These checks work best when they lead to a clear disposition: approve the package and lock the reviewed version, replace it with a narrower dependency, inspect it more deeply, or decline it. Provenance information without a decision rule can become documentation that does not change behavior.
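
One way to turn these checks into a decision rule is sketched below. The four outcomes mirror the dispositions named above. The `Review` fields and the order of the conditions are illustrative assumptions, not the source article's procedure.

```python
from dataclasses import dataclass, field

@dataclass
class Review:
    provenance_anomalies: list[str] = field(default_factory=list)  # e.g. "maintainer changed"
    expected_capabilities: set[str] = field(default_factory=set)   # implied by the stated purpose
    observed_capabilities: set[str] = field(default_factory=set)   # found during inspection
    narrower_alternative: bool = False  # a smaller package could do the job
    investigated: bool = False          # deeper inspection already happened

def disposition(r: Review) -> str:
    """Map review evidence to one explicit outcome."""
    unexplained = r.observed_capabilities - r.expected_capabilities
    if not unexplained and not r.provenance_anomalies:
        return "approve-and-pin"
    if r.narrower_alternative:
        return "replace"
    if not r.investigated:
        return "investigate"
    return "reject"  # concerns remain even after deeper inspection

formatter = Review(expected_capabilities={"filesystem-read"},
                   observed_capabilities={"filesystem-read", "network"})
print(disposition(formatter))  # investigate
```

The set difference is the purpose-to-capability test: anything observed but not expected for the package's stated job is what triggers further review.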

AI coding agents must inherit the same installation policy

AI-assisted development introduces a governance problem as much as a technical one. A coding agent may be able to select and install a package while pursuing a larger task, compressing several human decisions into one automated action. If it can also reach broad areas of the file system and use the network, a malicious dependency may encounter a larger potential blast radius.

The source describes workflows in which Claude searches, creates, and edits files across a broad knowledge system, including notes derived from downloaded PDFs. That breadth provides productivity value, but it also makes one-folder isolation impractical for the reported workflow. The proposed response is disciplined configuration: hooks require the agent to follow the same package-age, install-script, provenance, and dependency rules expected of a human developer.

This principle is more durable than a rule tied to one assistant. Package policy should apply consistently whether an installation is initiated by a developer, an AI agent, a local automation script, or a build process. The initiator may change; the acceptable evidence, permissions, and exceptions should not.

Key takeaways
- Review the full dependency graph, because the packages selected directly represent only part of the installed attack surface.
- Use a waiting period for new releases as one filter, while preserving a documented path for justified exceptions.
- Prevent install scripts from running automatically until their purpose and behavior have been examined.
- Combine provenance checks with a purpose-to-capability test and an explicit approve, investigate, replace, or reject decision.
- Pin direct and transitive versions, then run recurring audits to detect issues discovered after installation.
- Apply the same package rules to coding agents, automation, local development, and build environments.
- Layer installation controls, resource constraints, and network egress limits so that one missed signal does not determine the outcome.
A mature package security posture will increasingly depend on making these controls routine and machine-enforceable. As development becomes more automated, the teams best positioned to move quickly will be those that turn package trust from an informal judgment into a consistent operating policy.

References
- Shivam.Consulting Blog – Stop Package Breaches Before They Start: My Proven Playbook to Block Common Entry Points
June 10, 2026
Package Hack Wake-Up Call: My Playbook for Securing Cowork, Coding Agents, and Secrets

I love being a builder. It feels like a superpower I can’t stop using, and lately I’ve been channeling it into better workflows, faster experimentation, and sharper product thinking.

I tinker with my Claude Code workflows to make every day more effortless. I’m having a blast creating AI-generated interview snapshots and opportunity solution trees for Vistaly. I also spend time digging into traces and iterating on the AI coaches I use for our discovery courses.

Then the recent wave of malicious software spreading through the open-source community popped my bubble. It hit companies big and small—names like OpenAI, PostHog, and Zapier. As I dug in, I realized what many cybersecurity experts have long known: this is a deep rabbit hole. If I want to build responsibly, I have to get significantly better at protecting my devices, credentials, and code. And if you’re building with AI or modern tooling, you likely do, too.

Here’s why. We all rely on open-source software. Most modern applications assemble tried-and-true components—parsing a PDF, handling dates across time zones, visualizing spreadsheet data, connecting to an API—rather than reinventing them. The same is true for agent skills and MCP servers; they accelerate how we get value from models. This is overwhelmingly a good thing. But it also creates an attack surface that bad actors exploit.

We don’t need to abandon third-party code. We do need to understand the mechanisms attackers use and consistently defend against them.

When one malicious worm compromises hundreds of packages, what should dev teams do? This visual teaser maps the agenda—how it spreads, how to guard against it, AI tool risks, and concrete steps to mitigate.

On May 11th, I started seeing tweets about a TanStack hack. At that time, I didn’t know what TanStack was. But apparently, it’s a popular set of JavaScript libraries that are used by a lot of React sites. At first, I didn’t pay much attention. Then I learned the packages were compromised by a worm—malicious software that self-replicates—and it spread quickly. Within hours, dozens of packages were implicated; by day’s end, it was in the hundreds. That’s when I knew I had to lean in.

If you’ve explored safe development practices with coding agents before, you’ve seen the basics of package safety. A package is a bundle of reusable code shared through registries, and nearly every app you use depends on them. The unfortunate twist with this specific hack, known as the Mini Shai-Hulud worm, is that it shows prior “safe enough” heuristics aren’t sufficient. Popularity and trust signals don’t guarantee safety. We have to do more.

So here’s what I’ll cover today: how malicious software typically works, a practical framework for guarding against it, the specific risks of using Cowork to write and run code, and concrete steps to mitigate that risk. My goal is simple: help you keep building—despite the risks—while protecting your data and your business.

Quick disclaimer: I’m not a security expert. I’m sharing my personal journey and what I’ve learned through research and hands-on work. Please use your best judgment when applying any of this.

Package hacks share a simple playbook: get in, sweep for secrets, and phone home. This visual breaks down the 3 steps and flags new entry points—from packages to MCP servers, agent skills, and app extensions.

An agent recently scoured over 230,000 malicious software incidents and found that most malicious software follows a similar pattern. First, it needs an entry point onto your computer. Once installed, it scours your device for sensitive data, and then it uses your network connection to send that data to its own servers. The Mini Shai-Hulud worm spreads via malicious package install scripts that run at download time, then searches the device for credentials (including package publishing rights), poisons additional packages to continue replicating, and uses multiple channels—including the victim’s own GitHub public repos—to distribute secrets.

In practice, most attacks boil down to three steps: 1) It finds an entry point to your device. 2) It searches your device for sensitive data. 3) It sends that data to its own server. The good news: this pattern also tells us how to defend. We can harden entry points, minimize what code and agents can access, and constrain outgoing network traffic.

Keep in mind that install scripts aren’t the only entry vector. Any code that runs on your machine could contain malicious payloads: third-party packages, agent skills, MCP servers, browser or desktop extensions—the list is long. As coding agents and “vibe coding” tools become mainstream, more non-engineers are exposed to the same risks engineers have managed for years.

You might be at elevated risk if you do any of the following: you download and use third-party skills or MCP servers; you let Claude Code, Codex, or other coding agents write scripts that run locally and use third-party packages; you use an IDE like VS Code or Cursor with third-party extensions; or you install third-party extensions in tools like Obsidian. This isn’t an exhaustive list, but if any of these apply, it’s worth tightening your approach.

Relying on third-party code? This visual highlights four common risk zones—agent skills/MCP servers, coding agents, IDE extensions, and Obsidian plugins—and urges a review of downloads, local scripts, and add-ons.

The “safest” approach would be to avoid installing third-party software on your local device entirely. That’s not realistic. We all depend on third-party components in our stack. So I’ll start with one of the most common paths for non-engineers writing and running code today: Cowork.

Evaluating Cowork’s safety was eye-opening. Cowork offers meaningful protection—more than running code directly on your machine—but it isn’t bulletproof. There’s a notable gap you should understand.

Here’s how Cowork helps. It runs code inside a virtual machine, which isolates the execution environment from your real device—a quarantine room for code. While Cowork doesn’t fully control what comes into the room (that part is on you), if malicious code gets in, it’s contained and cannot reach the rest of your filesystem. Cowork also limits outbound network traffic from the virtual machine, which helps disrupt data exfiltration. However, it’s not foolproof.

Because Claude can install packages inside Cowork, it remains susceptible to malicious code like the Mini Shai-Hulud worm. And GitHub is on the allow list so Cowork can read and write to your repos. Since the Mini Shai-Hulud worm uses GitHub to publish secrets, this creates exposure. The crucial mitigation: if you never give Cowork access to sensitive data, there’s nothing for an attacker to steal.

A quick visual from a security deep dive on package hacks shows how Cowork handles threats: entry points are contained, data is only safe when kept outside, and network traffic is partly limited—making shared data the gap to watch.

Your responsibility is straightforward but critical: your data is only safe if it stays outside the virtual machine. When you mount folders into Cowork, those folders become accessible to any code running inside the VM. That includes malicious scripts. Before sharing, ask two questions: do the folders contain any credentials or secrets, and do they include proprietary data that would be harmful if accessed?
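
A quick filename scan is one way to start answering the first question before mounting a folder. This is a minimal sketch, not a real secret scanner: the name lists and the function `flag_sensitive` are illustrative assumptions, and it looks only at file names, not file contents.

```python
from pathlib import PurePosixPath

# Illustrative, non-exhaustive names that often hold credentials.
SENSITIVE_NAMES = {".env", ".npmrc", ".netrc", "id_rsa", "id_ed25519", "credentials.json"}
SENSITIVE_SUFFIXES = {".pem", ".key"}

def flag_sensitive(paths):
    """Return the paths whose file name suggests they may contain secrets."""
    flagged = []
    for p in paths:
        name = PurePosixPath(p).name
        if name in SENSITIVE_NAMES or PurePosixPath(name).suffix in SENSITIVE_SUFFIXES:
            flagged.append(p)
    return flagged

print(flag_sensitive(["notes/plan.md", "app/.env", "keys/deploy.pem"]))
# ['app/.env', 'keys/deploy.pem']
```

A clean result does not prove a folder is safe to share. It only catches the obvious cases, so the second question about proprietary data still needs human judgment.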

It’s common for code to need credentials. That’s why Cowork includes connectors to third-party sources like Google Drive and Slack. Credentials configured for these connectors never enter the VM—they remain outside the quarantine room—so they’re not exposed to malicious code. But if your code requires additional credentials inside the VM, scope them tightly and assume they could be compromised.

You can also use custom MCP servers you create yourself with Cowork. Those credentials stay outside the VM as well, provided the MCP servers are remote (hosted on a web server, not downloaded locally). It’s more work than dropping in a local server, but it keeps secrets out of reach from VM-executed code.

Beyond credentials, scrutinize the actual content you share with Cowork, including anything accessed through connectors. Least privilege is the rule: grant only what’s absolutely necessary for the task, and nothing more.

Amid a wave of package-supply attacks, this Product Talk visual launches a 3-part guide to safer AI building: Cowork safety today, Claude Code config next week, and off-device development coming soon.

What about skills? Cowork supports skills, and you can add third-party skills inside the quarantine room. If you’re not placing your own data in that room, you can afford more risk. The moment you add sensitive or proprietary data, be selective. Skills can include third-party code, and bad actors use skill directories to distribute malicious payloads. Personally, I never use third-party skills as-is. If one looks useful, I read through the files, then ask Claude to recreate it so I understand what it does and maintain control. If I were to use third-party skills, I’d do it in Cowork and keep their data access to the minimum necessary.

Overall, Cowork is a solid, “safe-ish” option if you’re disciplined about what you share. The challenge is that utility often requires access to real data—exactly what we’re trying to protect. In an upcoming deep dive, I’ll outline strategies to keep malicious code out in the first place. While I’ll focus on local development, the same patterns can extend to Cowork with a bit of setup.

One more important clarification: don’t confuse Cowork with the Code tab in the Claude Desktop app. Cowork runs code inside a virtual machine. The Code tab does not. If you ask Claude to write and execute code from the Code tab, that code runs on your local device and you’re fully responsible for security. There is one exception: the Code tab can run code in Anthropic’s cloud; I’ll cover that approach when we get into moving development off the local machine.

To summarize Cowork’s protections against the attacker’s three-step pattern:
- Entry: installs and scripts still run, but they’re contained inside an isolated virtual machine instead of your real device.
- Access: sensitive data is limited to the specific folders you mount, leaving the rest of your filesystem (including unrelated credentials) out of reach.
- Exfiltration: outbound traffic is partially constrained because Anthropic limits network access from the VM. That is helpful, but not absolute.
By contrast, local Code tab sessions offer no isolation, no filesystem restrictions, and no network limits, so any malicious install scripts run directly on your machine with full access and open egress.

My takeaways so far: I still love building with AI, but I’m doing it more cautiously. Cowork offers meaningful containment when used deliberately. I still prefer the flexibility of Claude Code, and I’ve reconfigured my setup to reduce risk. Even so, “safer” isn’t “safe,” which is why I’m increasingly shifting development off my local device to more controlled environments. I’ll share the practical details—tools, configs, and scripts—in the next installments.

If this perspective is useful, let me know. I want builders to move fast—and safely—through this new era of agentic AI. Until then, stay safe out there.

Inspired by this post on Product Talk.

June 3, 2026

A Product Leader’s Playbook for Humane, Sustainable Growth

Your growth dashboard can be green while your product is becoming less valuable to the people who use it. Activation rises. Engagement deepens. Revenue follows. Yet customers feel pressured, workers absorb hidden costs, or automation removes the human contact that made the experience trustworthy.

You don’t have to choose between humane technology and commercial performance. You do need an operating model that treats human outcomes as product outcomes, exposes harmful trade-offs early, and rewards durable value rather than extraction.

Start with the harm your growth model could create

Most growth models describe the path from acquisition to revenue. A humane growth model also describes who could be worse off if that path succeeds.

Map the product’s intended value first: the problem a person wants to solve, the moment they receive a useful result, and the reason they would return. Then examine the same journey from the perspective of people who may not appear in your analytics. That can include a customer’s employees, contractors who deliver the service, family members affected by the product, local businesses, or people excluded by the design.

Create an impact ledger for the growth surface you are reviewing. Keep it beside the business case, not in a separate ethics document that nobody consults during prioritization.

Impact area	Question to answer	Signal to monitor
User agency	Can people understand the choice, refuse it, reverse it, and leave?	Overrides, cancellations, reversals, and interview evidence
Well-being	Does additional use help people finish their intended task, or merely keep them present?	Successful outcomes, passive time, and expressions of regret
Economic fairness	Who captures the value, and who absorbs the labor, risk, or cost?	Complaints, payout concerns, and changes in burden across participants
Human connection	Does the experience strengthen useful relationships or replace them unnecessarily?	Human handoffs and feedback from affected communities
Trust and safety	Do people know when automation is involved and what happens to their data?	Escalations, corrections, safety reports, and trust feedback

The ledger is not an attempt to predict every consequence. It is a way to make foreseeable trade-offs visible before a team becomes committed to a launch. This matters commercially as well as ethically: extractive growth can weaken trust and retention while increasing regulatory and reputational exposure.

Pair every growth metric with a human countermetric

A metric becomes dangerous when the team can improve it while making the customer’s life worse. Engagement is the familiar example. More time in a product may indicate value, confusion, dependency, or difficulty leaving. The number alone cannot tell you which.

Give each primary growth metric a countermetric that protects the outcome you actually intend. The pair should appear in the same experiment brief and the same review meeting.

Growth metric	Human countermetric	Decision it improves
Activation	Completion of the customer’s intended outcome	Whether setup creates value or only reaches an internal milestone
Engagement	Intentional task completion	Whether additional use is productive or merely prolonged
Retention	Trust, voluntary continuation, and ease of exit	Whether customers stay because the product remains useful
Conversion	Comprehension of price, consent, and commitment	Whether revenue depends on informed choice
Automation rate	Correction, reversal, and human-escalation success	Whether efficiency survives real-world exceptions

Do not combine the pair into a single score too quickly. A blended score can conceal the exact trade-off leaders need to see. Review both trends and ask whether the business result would still be desirable if the countermetric deteriorated further.

Set the stopping condition before running an experiment. Decide which trust, safety, fairness, or agency signal would block rollout even if the primary metric improves. A guardrail invented after seeing strong conversion is rarely a real guardrail.
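
A pre-committed stopping rule can be written down as data before the experiment starts. The sketch below is one hedged way to do that. The `ExperimentBrief` fields, the thresholds, and the three outcome labels are illustrative assumptions rather than a prescribed method.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: thresholds cannot be edited after results arrive
class ExperimentBrief:
    growth_metric: str
    countermetric: str
    min_growth_lift: float    # smallest relative lift worth expanding
    max_counter_drop: float   # pre-committed tolerance for countermetric decline

def rollout_decision(brief: ExperimentBrief, growth_lift: float, counter_change: float) -> str:
    """Check the guardrail first, so a strong growth result cannot override it."""
    if counter_change < -brief.max_counter_drop:
        return "stop"
    if growth_lift >= brief.min_growth_lift:
        return "expand"
    return "hold"

brief = ExperimentBrief("conversion", "price comprehension",
                        min_growth_lift=0.02, max_counter_drop=0.05)
print(rollout_decision(brief, growth_lift=0.10, counter_change=-0.08))  # stop
print(rollout_decision(brief, growth_lift=0.10, counter_change=-0.01))  # expand
```

The pair is reported as two separate numbers and never blended, which keeps the trade-off visible in the review meeting.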

Expand discovery beyond the people who already love the product

Power users are good at explaining how to improve the experience they have accepted. They are less able to represent people who abandoned it, avoided it, could not access it, or carry costs without being the buyer.

Add an outside-in lane to continuous discovery. Include customers who reduced usage or left, people who encountered a failed automation, front-line workers affected by the workflow, and community members who experience consequences without controlling the purchase. Treat these conversations as product discovery, not public relations.

Ask questions that reveal displacement and dependency: What became easier? What became harder? What did this replace? When did you feel unable to make a meaningful choice? Who else had to change their behavior so you could receive the benefit? What would a responsible version of this experience preserve?

Bring the evidence into roadmap decisions in its original shape. A complaint about loss of control should not be translated into a generic request for better usability. A contractor describing unfair risk is not reporting a minor service defect. Name the underlying impact so the team can address the product model rather than polish its interface.

Put humane constraints inside the experiment

Principles have little effect if they enter the process after pricing, interaction design, and technical architecture are settled. Put them into the experiment before the team writes production code.

1. State the human outcome. Describe what should become better in the person’s life or work, not merely what behavior should increase.
2. Name the affected groups. Include non-users who supply labor, absorb risk, or experience downstream effects.
3. Define meaningful choice. Specify how people will understand automation, decline it, correct it, and reverse important actions.
4. Design the failure path. Decide how a person reaches human help when the system is uncertain, unsafe, or wrong.
5. Pre-commit to a stopping rule. Record which negative signal pauses expansion regardless of the growth result.

For AI products, this is where risk management becomes part of product management. Give users enough information to understand when AI is acting. Preserve review for consequential outputs. Build correction and escalation into the main workflow. Apply privacy-by-design while deciding what data the product needs, rather than after collecting everything that might be useful.

The product trio should own these decisions. Legal, security, trust, and policy partners can strengthen the work, but they cannot compensate for a roadmap whose incentives reward harm. The product leader remains accountable for the whole system being optimized.

Choose durable depth over indiscriminate scale

Scale is not proof of value. It is an amplifier. If the operating model depends on weak consent, hidden costs, unfair labor, or the removal of every human interaction, scale magnifies those weaknesses.

A narrower product can create a stronger business when the team understands a community deeply enough to solve its full problem. A locally focused mobility service, for example, could optimize for rider safety, driver economics, and neighborhood usefulness rather than treating every participant as an interchangeable unit of supply or demand. The market is smaller by design, but the value proposition can be clearer and trust can become part of the product’s advantage.

Test the durability of your strategy with a simple question: if customers become better informed and cultural expectations become stricter, does the growth model become stronger or weaker? Expectations do shift. A group of German primary-school parents, for example, collectively chose to delay smartphones until age 11 or 12. Product leaders should expect social norms to change, sometimes in direct opposition to the adoption assumptions embedded in a forecast.

At the next roadmap review, challenge any initiative that needs customers to misunderstand a choice, remain dependent, or accept worsening treatment as the company grows. If removing that mechanism destroys the economics, you have found a strategy problem, not an optimization problem.

Key takeaways

- Document who could be harmed by a successful growth initiative, including people who never appear in the customer database.
- Pair activation, engagement, retention, conversion, and automation metrics with measures of outcomes, agency, trust, and recovery.
- Include former users, affected workers, and non-buyers in continuous discovery.
- Define consent, correction, escalation, and stopping conditions before launching an experiment.
- Prefer a focused market with durable value over scale that depends on hidden human costs.

Start with the growth initiative carrying the greatest human risk. Add its impact ledger and countermetric to the next decision meeting, assign an owner, and make expansion conditional on both business value and human value holding up.

References

Shivam.Consulting Blog — Is Technology Still Net Positive? A Product Leader’s Reckoning and Playbook for Humane Growth

May 26, 2026

How to Operate Always-On AI Agents Without Losing Control
You want an AI agent to keep work moving after you close your laptop. The difficult part is not getting one successful overnight run. It is making the hundredth run predictable enough that you do not wake up to an embarrassing email, a corrupted task queue, or an unexplained usage bill.

The right operating model looks less like a clever prompt and more like a small, well-managed operations team. Give each agent a narrow job, an inspectable queue, limited tools, a clear definition of done, and an explicit place to stop. That is how you gain useful autonomy without surrendering control.

Start with a delegation contract, not a general-purpose assistant

An always-on agent should not begin with a broad instruction such as “manage my sales work.” That leaves the model to decide what managing means, which systems it may change, and when it has enough evidence to act. The ambiguity is tolerable during an interactive session because you can correct it. It becomes operational risk when the agent runs unattended.

Start by defining a job that produces a recognizable artifact. A sales-admin agent can prepare a briefing before a scheduled call and create proposed follow-up tasks afterward. A podcast-manager agent can assemble interview context, prepare a transcript-review document, and queue a reminder to share it. A coding-manager agent can review prior sessions and identify recurring mistakes. These are bounded responsibilities with visible outputs, not vague mandates to “help.” Three specialized agents handling podcast, sales, and coding workflows demonstrate how cleanly this pattern can separate unrelated work.

Write the delegation contract in an identity file that the agent reads at the beginning of every run. It should answer seven questions:
1. Who are you? Name the role, not the underlying model: sales admin, podcast manager, coding manager, or another function a person would recognize.
2. What outcome do you own? Describe the recurring deliverable and the event that makes it useful.
3. Where may you work? Name the exact task, output, and script folders the agent can use.
4. What inputs may you trust? Identify the calendar, task file, transcript, session log, or other allowed input for the job.
5. What may you change? Separate reading, drafting, creating internal files, updating tasks, and acting in external systems.
6. What counts as complete? Specify the artifact, required fields, location, and status update expected at the end.
7. When must you stop? Define what the agent should do when information conflicts, a tool fails, permission is missing, or the next step would affect another person.
The last question matters most. A useful agent does not need permission to improvise its way through every obstacle. It needs a reliable way to say, “I could not complete this safely; here is the missing decision.” Treat a well-documented block as a successful operational outcome, not as agent failure.

Keep consequential decisions outside the unattended role. The agent can prepare a customer email without sending it. It can propose changes to a deal record without changing the commercial commitment. It can summarize a coding pattern without modifying a production system. Moving from preparation to execution should be a deliberate permission decision, not an accidental side effect of adding another tool.
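
The seven questions can also be checked mechanically before an agent is scheduled. The sketch below is a minimal illustration: the `DelegationContract` fields, the sample values, and the `missing_answers` helper are hypothetical names chosen to mirror the list above, not a required file format.

```python
from dataclasses import dataclass, fields

@dataclass
class DelegationContract:
    role: str                   # 1. Who are you?
    outcome: str                # 2. What outcome do you own?
    workspaces: list[str]       # 3. Where may you work?
    trusted_inputs: list[str]   # 4. What inputs may you trust?
    allowed_changes: list[str]  # 5. What may you change?
    done_criteria: str          # 6. What counts as complete?
    stop_conditions: list[str]  # 7. When must you stop?

def missing_answers(contract: DelegationContract) -> list[str]:
    """Name every question that was left empty."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]

draft = DelegationContract(
    role="sales admin",
    outcome="briefing before each scheduled call",
    workspaces=["agents/sales/tasks", "agents/sales/output"],
    trusted_inputs=["calendar", "task files"],
    allowed_changes=["create drafts", "propose follow-up tasks"],
    done_criteria="brief saved and task marked completed",
    stop_conditions=[],  # the most important question is still unanswered
)
print(missing_answers(draft))  # ['stop_conditions']
```

Refusing to schedule an agent while this list is non-empty is one simple way to keep the stopping rule from being skipped.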

Build an inspectable operating loop around four components

The prompt is only one part of the system. Reliable agent operations need four components with distinct responsibilities: identity, scheduling, tasks, and scripts. Keeping them separate makes failures easier to locate and changes easier to review.

Identity defines responsibility

The identity file is the stable operating policy. It tells the agent what role it is playing, where its work lives, what it may do, and what completion looks like. Do not overload it with the details of one assignment. If the identity changes every time a task arrives, you no longer have a stable agent; you have an unreviewed prompt generator.

The scheduler supplies a heartbeat

The scheduler should wake the agent, point it to the correct identity and queue, and capture the result. It should not contain the business logic for podcast preparation or sales follow-up. That logic belongs in inspectable task instructions and small scripts.

A Mac that remains online can use macOS LaunchAgents as this heartbeat. LaunchAgents run with the user’s permissions, which is operationally convenient but also defines the risk boundary: the agent may be able to reach anything the scheduled process and its tools can reach. Running scheduled agents on an always-on Mac mini therefore makes permission design part of the architecture, not a setting to revisit later.

Make the schedule explicit and easy to disable. Each job should have a known trigger, whether that is a recurring interval, a calendar-related event, or a periodic review. If you cannot quickly answer why an agent ran at a particular time, the scheduler is already too opaque.
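
For a LaunchAgent, an explicit schedule is a small property list. The sketch below builds one with Python's standard `plistlib`, using the launchd keys `Label`, `ProgramArguments`, `StartInterval`, `StandardOutPath`, and `StandardErrorPath`. The label, script path, and log path are hypothetical placeholders, and the function name `launch_agent_plist` is an assumption for illustration.

```python
import plistlib

def launch_agent_plist(label: str, program_args: list[str],
                       interval_seconds: int, log_path: str) -> bytes:
    """Build a LaunchAgent definition that only wakes the agent runner."""
    job = {
        "Label": label,                     # unique job name
        "ProgramArguments": program_args,   # command and arguments to run
        "StartInterval": interval_seconds,  # recurring trigger, in seconds
        "StandardOutPath": log_path,        # capture the result of each run
        "StandardErrorPath": log_path,
    }
    return plistlib.dumps(job)  # XML property list as bytes

plist_bytes = launch_agent_plist(
    label="com.example.sales-admin",  # hypothetical label
    program_args=["/usr/bin/python3", "/Users/me/agents/run_agent.py", "sales-admin"],  # hypothetical paths
    interval_seconds=1800,
    log_path="/Users/me/agents/logs/sales-admin.log",  # hypothetical path
)
print(plist_bytes.decode("utf-8"))
```

Per-user jobs of this kind are conventionally stored under ~/Library/LaunchAgents. Note that the definition carries no business logic: it names the trigger, the command, and where output goes, which keeps the answer to "why did this run?" in one short file.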

Tasks hold durable state

Use a dedicated task folder for each agent. A Markdown file with frontmatter is enough to represent a work item while remaining readable by both a person and a tool. The frontmatter can hold machine-readable state; the body can hold the request, context, acceptance criteria, and eventual run notes.
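
As a sketch of how a tool might read such a file, the function below splits flat `key: value` frontmatter from the Markdown body. It is deliberately simplified and assumes `---` delimiters with no nested values; a real YAML parser would be needed for anything richer. The field names in the sample (`status`, `event_id`) are hypothetical.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body) for a Markdown task file with flat frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter: everything is body
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # unterminated frontmatter: treat as plain body

task_text = """---
status: queued
event_id: call-2026-06-01
---
Prepare the briefing for the scheduled call.
"""
meta, body = split_frontmatter(task_text)
print(meta)  # {'status': 'queued', 'event_id': 'call-2026-06-01'}
print(body)  # Prepare the briefing for the scheduled call.
```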

Choose a small lifecycle and apply it consistently. For example: queued, in progress, blocked, completed, and failed. The exact labels matter less than the transition rules:
- A queued task is eligible to be claimed.
- An in-progress task records which run claimed it, preventing another run from silently doing the same work.
- A blocked task names the missing input or decision and preserves all useful partial work.
- A completed task links to its output and records what changed.
- A failed task records the failed operation and whether retrying it is safe.
Give each recurring event a stable identifier. Before creating a meeting brief, transcript-review document, or follow-up task, the agent should check whether that event has already been processed. This idempotency check prevents a retry or overlapping schedule from creating duplicates.
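
The transition rules and the duplicate check can be sketched in a few lines. The status labels follow the example lifecycle above. The allowed transitions, especially returning blocked or failed tasks to the queue, and the names `transition` and `event_already_tracked` are illustrative assumptions.

```python
# Which status changes are permitted; anything else is rejected.
ALLOWED = {
    "queued": {"in_progress"},
    "in_progress": {"blocked", "completed", "failed"},
    "blocked": {"queued"},   # assumption: re-queue once the missing input arrives
    "failed": {"queued"},    # assumption: re-queue only when a retry is safe
    "completed": set(),
}

def transition(task: dict, new_status: str, run_id: str, note: str = "") -> dict:
    """Return an updated copy of the task, or raise if the change is not allowed."""
    current = task["status"]
    if new_status not in ALLOWED[current]:
        raise ValueError(f"{current} -> {new_status} is not allowed")
    updated = dict(task, status=new_status)
    if new_status == "in_progress":
        updated["claimed_by"] = run_id  # record which run claimed the work
    if note:
        updated["note"] = note          # e.g. missing input, output link, failed operation
    return updated

def event_already_tracked(event_id: str, tasks: list[dict]) -> bool:
    """Idempotency check: has any task already been created for this event?"""
    return any(t.get("event_id") == event_id for t in tasks)

task = {"status": "queued", "event_id": "call-2026-06-01"}
claimed = transition(task, "in_progress", run_id="run-42")
print(claimed["claimed_by"])                               # run-42
print(event_already_tracked("call-2026-06-01", [claimed]))  # True
```

Because a claimed task can no longer move to in progress, a second overlapping run fails loudly instead of silently repeating the work.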

Do not treat chat history as the task database. Conversations are useful working context, but durable state belongs in a file or system you can inspect independently. Saving identities, task files, and scripts in a shared knowledge workspace such as Obsidian also makes the operating model portable across devices and coding assistants. Changing the model runner should not require rebuilding the job.

Scripts expose narrow capabilities

Scripts should perform small, deterministic operations: fetch an allowed input, create a document in a known location, normalize a transcript, or update a task field. Keep the judgement in the agent and the mechanics in scripts with explicit inputs and outputs.

A small script is easier to inspect than a broad instruction to use the terminal however the model sees fit. It also gives you one place to add validation, duplicate checks, and error handling. When an agent repeatedly constructs the same command or edits the same file shape, promote that operation into a reviewed script rather than relying on the model to reproduce it perfectly on every run.
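
For instance, transcript normalization can live in one deterministic function with an explicit input and output. This is a sketch under an assumed transcript format, where lines may begin with `[hh:mm:ss]` or `[mm:ss]` stamps; the function name and format are illustrative.

```python
import re

# Assumed stamp format at the start of a line: [mm:ss] or [hh:mm:ss].
TIMESTAMP = re.compile(r"^\[\d{2}:\d{2}(?::\d{2})?\]\s*")

def normalize_transcript(raw: str) -> str:
    """Drop leading stamps, collapse runs of whitespace, and skip blank lines."""
    cleaned = []
    for line in raw.splitlines():
        line = TIMESTAMP.sub("", line.strip())
        line = re.sub(r"\s+", " ", line).strip()
        if line:
            cleaned.append(line)
    return "\n".join(cleaned)

raw = "[00:01:05]  Host:   welcome back\n\n[00:01:09] Guest: thanks   for having me\n"
normalized = normalize_transcript(raw)
print(normalized)
```

Because the function has no side effects, the same input always yields the same output, which makes it straightforward to review and test before an agent is allowed to call it.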

Design the overnight failure path before the happy path

Unattended automation changes the cost of a mistake. During an interactive session, a confusing output costs a correction. Overnight, the same confusion can trigger repeated work, alter several systems, or contact someone before you see it. Your design should limit the consequence of a wrong interpretation, not merely improve the probability of a correct one.

Use a permission ladder

Classify capabilities by consequence and grant them one level at a time:
1. Read: inspect approved calendars, task files, transcripts, logs, or documents.
2. Prepare: create drafts, summaries, reports, and proposed tasks inside a bounded workspace.
3. Update: change internal records whose history can be inspected and reversed.
4. Act externally: send messages, share files, update customer-facing systems, or invoke paid services.
5. Perform destructive or privileged work: delete data, change access, alter infrastructure, or execute an irreversible operation.
Most new agents should prove themselves at the read and prepare levels. Promotion should be capability-specific. An agent that reliably prepares a sales brief has not thereby earned permission to send customer communication. Reliability does not transfer automatically from one action class to another.
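
One way to make capability-specific promotion concrete is to key grants by agent and capability rather than by agent alone. The agent names, capability names, and the `is_allowed` helper below are hypothetical, and the sketch assumes each level includes the levels beneath it.

```python
from enum import IntEnum

class Level(IntEnum):
    READ = 1
    PREPARE = 2
    UPDATE = 3
    ACT_EXTERNALLY = 4
    DESTRUCTIVE = 5

# Grants are per (agent, capability), so trust earned on one
# capability does not carry over to another.
GRANTS = {
    ("sales-agent", "sales_brief"): Level.PREPARE,
    ("sales-agent", "customer_email"): Level.READ,
}

def is_allowed(agent: str, capability: str, needed: Level) -> bool:
    """Default-deny: unknown pairs have no permissions at all."""
    granted = GRANTS.get((agent, capability))
    return granted is not None and granted >= needed

print(is_allowed("sales-agent", "sales_brief", Level.PREPARE))
print(is_allowed("sales-agent", "customer_email", Level.ACT_EXTERNALLY))
```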

For external actions, use a pending-approval state that contains the exact proposed action. You should be able to review the recipient, content, destination, and relevant context without reopening the entire run. Destructive or privileged actions should remain outside unattended execution unless you have an explicit recovery path and have deliberately accepted the consequence of failure.

Treat external text as data, not authority

Calendar descriptions, transcripts, web pages, emails, and documents may contain instructions that conflict with the agent’s job. The identity and task contract must outrank text found inside those inputs. An interview guest’s biography can inform a briefing; it cannot expand the podcast agent’s permissions. A meeting note can identify a follow-up; it cannot authorize the agent to send one.

Keep credentials out of identity and task files. Give scripts access only to the credentials required for their operation, and avoid handing an agent a general browser, terminal, file system, and credential store merely because each tool is useful in isolation. The dangerous capability is often the combination.

Make retries selective

A retry is appropriate when the failure is plausibly temporary and repeating the operation is safe. A network timeout during a read may qualify. Ambiguous recipient identity, conflicting meeting details, missing share settings, or an unclear customer commitment do not. Retrying an ambiguity only asks the model to make the same unsupported decision again.

Before enabling automatic retries, require the operation to pass three tests: it can detect whether it already succeeded, a duplicate would not create harm, and the number of attempts is capped. Otherwise, mark the task blocked and surface it for review.
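
The three tests can be expressed as a gate that returns either a retry or a blocked state. The `Operation` fields and the `next_step` helper are illustrative names, not part of any specific tool.

```python
from dataclasses import dataclass

@dataclass
class Operation:
    name: str
    can_detect_prior_success: bool
    duplicate_is_harmless: bool
    max_attempts: int  # 0 means no cap has been set

def next_step(op: Operation, attempts_so_far: int, failure_is_transient: bool) -> str:
    """Return 'retry' only when every condition holds; otherwise 'blocked'."""
    passes_tests = (
        op.can_detect_prior_success
        and op.duplicate_is_harmless
        and op.max_attempts > 0
    )
    if failure_is_transient and passes_tests and attempts_so_far < op.max_attempts:
        return "retry"
    return "blocked"

read_calendar = Operation("read_calendar", True, True, max_attempts=3)
send_email = Operation("send_email", False, False, max_attempts=3)

print(next_step(read_calendar, attempts_so_far=1, failure_is_transient=True))
print(next_step(send_email, attempts_so_far=0, failure_is_transient=True))
```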

Put hard boundaries around usage

Always-on does not mean continuously reasoning. It means the system is available to process eligible work on a known schedule. A run should inspect the queue, process a bounded amount of work, record its result, and exit.

Set limits at several layers: eligible task types, work accepted per run, retries per task, tools available to the role, and provider-side spending or usage controls where available. Record usage beside the task outcome so you can distinguish an expensive valuable job from an agent that consumes resources while circling an ambiguity. Surprise charges are not only a pricing problem; they usually indicate that the operating loop lacks a stopping rule.
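
A bounded run might look like the sketch below: it filters by eligible task type, caps the work accepted per run, and records usage beside each outcome. The task shape, the `handle` callback, and the usage units are assumptions for illustration.

```python
def run_once(queue, handle, eligible_types, max_tasks=3):
    """Process at most max_tasks eligible tasks, record each result, then return."""
    results = []
    for task in queue:
        if len(results) >= max_tasks:
            break  # stopping rule: the run is finite by construction
        if task["type"] not in eligible_types:
            continue  # ineligible work stays in the queue untouched
        outcome, usage = handle(task)
        results.append({"task_id": task["id"], "outcome": outcome, "usage": usage})
    return results

queue = [
    {"id": "t1", "type": "meeting_brief"},
    {"id": "t2", "type": "send_email"},
    {"id": "t3", "type": "meeting_brief"},
    {"id": "t4", "type": "meeting_brief"},
]

# Stand-in handler that reports a fixed outcome and usage figure.
results = run_once(queue, lambda t: ("completed", 120), {"meeting_brief"}, max_tasks=2)
print([r["task_id"] for r in results])
```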

Finally, maintain a kill switch you can use without asking the agent to cooperate. Disabling the schedule or revoking the narrow credential should stop future work. If stopping the system requires the same model and scripts that may be malfunctioning, it is not an independent control.

Measure whether the agent is reducing work or relocating it

A completed status is not proof of value. An agent can close every task while leaving you to verify facts, repair formatting, remove duplicates, and reconstruct why it made a decision. That is work relocation, not delegation.

Evaluate the operation with measures tied to the job:
- Usable completion rate: the share of eligible tasks that produce an output meeting the acceptance criteria without substantive rework.
- Correction rate: how often you must change facts, recipients, permissions, status, or next steps before using the output.
- Duplicate or false-action rate: how often the agent repeats a job or creates an action that the triggering event did not require.
- Blocked rate by cause: which missing inputs, permissions, or unclear rules repeatedly prevent completion.
- Time to review: the human attention required to approve, repair, or understand the result.
- Usage per usable outcome: the model or service consumption attached to work you actually keep.
These measures tell you what to change. A high blocked rate caused by missing context points to an input problem. Frequent factual corrections point to retrieval or acceptance-criteria problems. Duplicate work points to task identity and idempotency. High review time with otherwise correct output often means the evidence and change log are poorly presented.
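
As a worked illustration, the measures can be computed from per-task review records. The record fields and sample values below are invented for the example, and the function assumes a non-empty list.

```python
from collections import Counter

def summarize(runs: list[dict]) -> dict:
    """Compute the job-level measures from per-task review records."""
    n = len(runs)
    usable = [r for r in runs if r["usable"]]
    return {
        "usable_completion_rate": len(usable) / n,
        "correction_rate": sum(r["corrected"] for r in runs) / n,
        "duplicate_rate": sum(r["duplicate"] for r in runs) / n,
        "blocked_by_cause": Counter(r["block_cause"] for r in runs if r["block_cause"]),
        "mean_review_minutes": sum(r["review_minutes"] for r in runs) / n,
        "usage_per_usable_outcome": sum(r["usage"] for r in runs) / max(len(usable), 1),
    }

# Invented sample records, one per eligible task.
runs = [
    {"usable": True, "corrected": False, "duplicate": False, "block_cause": None, "review_minutes": 2, "usage": 100},
    {"usable": False, "corrected": True, "duplicate": True, "block_cause": None, "review_minutes": 10, "usage": 150},
    {"usable": False, "corrected": False, "duplicate": False, "block_cause": "missing_context", "review_minutes": 1, "usage": 50},
    {"usable": True, "corrected": False, "duplicate": False, "block_cause": None, "review_minutes": 3, "usage": 100},
]

stats = summarize(runs)
print(stats)
```

Note that usage per usable outcome divides total usage by kept outputs, so work spent on unusable runs raises the figure rather than disappearing from it.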

Require every run to leave a compact receipt: the task it claimed, inputs it used, scripts it invoked, files or records it changed, output location, completion status, and reason for any block. You should not need to replay hidden reasoning. You need enough evidence to verify the operation and diagnose the next failure.
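
A receipt can be as simple as one JSON line per run. The sketch below mirrors the fields listed above; the class name, field names, and sample values are illustrative.

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class RunReceipt:
    task_id: str
    inputs: list[str]
    scripts: list[str]
    changes: list[str]
    output_location: str
    status: str
    block_reason: str = ""  # filled in only when the run ends blocked

receipt = RunReceipt(
    task_id="brief-2026-05-20-guest-a",
    inputs=["calendar event cal-evt-123", "guest-bio.md"],
    scripts=["create_brief_doc"],
    changes=["created briefs/guest-a.md"],
    output_location="briefs/guest-a.md",
    status="completed",
)

# One sorted JSON line per run is easy to append, diff, and search.
receipt_line = json.dumps(asdict(receipt), sort_keys=True)
print(receipt_line)
```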

Review early runs closely and review again after changing an identity, script, tool, model, or input source. A stable task can become unstable when any one of those dependencies changes. Plain-text identities, tasks, and scripts make that change surface inspectable and versionable.

Your agents can also improve the operating system itself. A periodic coding-manager workflow, for example, can review prior coding sessions, identify recurring dead ends, and propose changes in how future sessions are run. The important separation is that the agent proposes an improvement with evidence; the operating policy changes only after review. Self-observation is useful. Unreviewed self-modification is a different risk class.

Expand only when the current job has earned more autonomy

Adding agents is easy once the scheduler and folder structure exist. That convenience can tempt you to automate work whose boundaries are not ready. Scale based on operational evidence, not on the number of possible use cases you can imagine.

A job is a strong candidate for always-on operation when it has a recurring trigger, stable inputs, an observable deliverable, clear acceptance criteria, bounded permissions, and enough repetition to justify maintaining the workflow. Preparation, follow-up capture, document setup, and periodic retrospectives fit because a person can inspect their artifacts and correct them before higher-consequence decisions are made.

Keep work interactive when the task depends on novel judgement, unresolved organizational context, sensitive negotiation, or irreversible action. An agent may still prepare evidence and options, but the decision should remain with the person who owns the consequence.

Before expanding an existing agent’s permissions or creating another role, check five gates:
1. The current output is regularly usable without substantial reconstruction.
2. Common failure modes are visible and end in safe states.
3. Duplicate prevention and retry behavior have been exercised.
4. Usage is attributable to tasks and bounded by stopping rules.
5. The next capability has its own acceptance criteria and consequence review.
Do not create one agent per application. Create one per coherent responsibility. A podcast manager may use a calendar, a document system, and a task list while retaining one outcome. Conversely, sales administration and coding retrospectives should not share an identity merely because they use the same model. Role boundaries should follow accountability, not tooling.

Key takeaways
- Begin with one recurring job that produces an inspectable artifact, not a general instruction to manage a function.
- Give the agent a durable identity, a dedicated task queue, an explicit schedule, and small reviewed scripts.
- Use task states, stable event identifiers, and completion receipts so retries and overlapping runs do not create invisible duplication.
- Keep new agents at read-and-prepare permissions until their outputs and failure modes are consistently understandable.
- Route ambiguity and consequential external actions to approval instead of asking the model to guess.
- Cap eligible work, retries, tools, and usage; always-on availability should still produce finite runs.
- Measure usable outcomes, corrections, blocks, duplicates, review effort, and usage before granting more autonomy.
Pick one task you already repeat and write its delegation contract before choosing more tools. If you cannot define the input, output, permission boundary, completion test, and safe stopping condition on one page, the job is not ready to run while you are offline. Tighten the job first. The agent can earn broader responsibility after the operating evidence is there.

References
- Product Talk – My Always-On AI Team: How I Get Claude Agents to Tackle Work While I’m Offline
May 20, 2026
Unlocking AI Agents: The Real Barrier Is Readiness—Not Capability—Here’s How to Scale

There’s a question that runs underneath every AI Agent evaluation: what can it do?

Two years ago, that was the right question to ask because Agents were limited and capability was a genuine constraint. The gap between what organizations needed and what the technology could deliver was wide. I felt that gap acutely in early pilots—plenty of ambition, not enough dependable execution.

That gap has since narrowed considerably, and yet most organizations are running their Agents well below what’s technically possible. I see teams lean on answering and routing, but stop short of looking things up, taking actions, or resolving complex, multi-step problems—especially where data, process variance, or risk come into play.

The standard explanation is that AI isn’t good enough yet—models must improve, or vendors must ship more features. But after studying organizations across industries actively expanding their AI automation, I’ve found that this explanation holds up less often than people assume. The blockers tend to be elsewhere.

The teams I’ve observed weren’t primarily constrained by what their AI could do; they were constrained by what their organization was structured to let it do. In other words, the ceiling wasn’t the Agent’s capability—it was organizational readiness, governance, and risk tolerance.

“Readiness” for AI breaks into five distinct types, and most organizations have some but not all of them. Below is how I assess them with product, operations, and engineering leaders.

Content readiness is whether you can explain your product and policies clearly and consistently. Most companies can. In practice, that means up-to-date knowledge bases, unified policy language, and clear versions that Agents can cite and apply.

Scope readiness is whether you’ve defined the edges: when should AI engage, and when should it step aside? Edge cases multiply, intent varies by customer segment, and sensitive topics surface mid-conversation, but most teams can work through this with effort. Clear guardrails reduce ambiguity and shrink risk.

Procedural readiness is where things start to get harder. This is about whether you can articulate your processes clearly enough for something other than a human with years of tacit knowledge to follow. The happy path is rarely the problem. It’s the failure paths, decision branches, and variations that have never been written down because they’ve always lived in someone’s head.

Data readiness is the first real cliff. Can you reliably identify the right user, account, or object at the moment a decision needs to be made? Is the data trustworthy in real time? Are the APIs stable, accessible, and actually connected? For most organizations, the honest answer is “partially, but we’re not always sure when it breaks.”

Execution readiness is the highest bar. Not just technically (can the Agent make the change?) but organizationally. Who owns it when the wrong refund gets processed? Who detects it? Who recovers? Does someone with authority actually accept the risk?

Most companies have the first two, some have the third, and fewer have the fourth and fifth. When I map this with teams, we often discover that their Agent’s ceiling is really a reflection of operational maturity and data plumbing, not model quality.

We studied companies across six industries – energy, healthcare, ecommerce, gaming, financial services, property management – all trying to expand what their Agents could do. The pattern was consistent: teams set out to automate real actions—looking up account status, processing changes, handling transactions. In most cases, the AI could technically do it, but at a certain point (somewhere between guiding a user through a process and looking something up on their behalf) they hit a wall.

One team tried to automate application changes but couldn’t reliably identify which application to modify across their internal systems. Another explored billing automation but couldn’t access live account data due to regulatory constraints. A third needed to verify status across third-party vendor systems their Agent couldn’t reliably reach. I’ve seen similar constraints surface around CRM integration, data governance, and vendor SLAs—none of which are model issues.

In most cases, the team redesigned around what their infrastructure could support. They moved toward guiding: walking users through processes step by step rather than executing changes on their behalf. It worked. It resolved conversations and delivered real value, just differently than anyone planned. In customer support, this often looks like consultative flows that shorten time-to-resolution even without direct writes.

Most Agent evaluations are built around capability. Can it handle complex queries? Does it support multiple channels? Can it integrate with our systems? These are reasonable things to evaluate for, but they produce a capability score, and that doesn’t tell you whether your organization can actually use what you’re buying.

The teams that got to deeper automation, the ones executing actions early, didn’t have “better AI”; they had more standardized operations. Their actions were already well-defined, consistently applied, and exposed through stable systems with clear rules. Automation wasn’t inventing new behavior; it was triggering actions that were already tightly controlled elsewhere.

Readiness enables capability, not the other way around. That reframes the evaluation question from “can the AI do this?” to “are we actually ready for it to?”

Something that gets lost in most conversations about AI readiness is that organizations are often further along than they assume, just not for the kind of work they were planning for. A team that set out to automate refunds but can reliably guide users through complex troubleshooting has genuine capability deployed. They’re operating at the level their readiness supports, which is a starting point, not a deficit.

The more useful frame isn’t “are we ready?” – it’s “what are we ready for, and what specifically stands between here and the next level?” The gaps tend to be concrete: a missing API, data that lives in three systems that don’t agree, a process that’s never been documented, or an ownership question nobody has answered. These are solvable problems. They just require a different kind of investment than buying a more capable Agent.

What nobody has worked through seriously yet is how organizations actually build readiness. Does it develop naturally through using AI at shallower levels first? Or is it mostly a function of prior decisions, like system architecture choices made years ago, operational maturity that accumulated over time, engineering investments that have nothing to do with AI? When readiness does increase, what actually changes? Does the support team develop it? Does engineering grant it? Does it require executive sponsorship and investment in infrastructure with no obvious AI label on it?

In my experience, progress comes from a joint effort: product to define scope and guardrails, operations to codify procedures and edge cases, engineering to harden APIs and observability, and leadership to underwrite risk with clear ownership. When those pieces align, agentic AI moves from guided assistance to safe, auditable execution.

Until there are clearer answers, the pattern is likely to continue. Companies will buy capable Agents, plan ambitious rollouts, and find that the harder work is building the organizational infrastructure. The Agents can do the work. The question is what it takes to let them.

Inspired by this post on The Intercom Blog.

May 18, 2026

Governed AI Analytics in Financial Services: A Playbook

You have a credible AI analytics use case, product teams want access, and risk leaders want proof that the system will not expose sensitive data or influence the wrong decision. The mistake is to settle that tension with a broad choice between “innovation” and “control.” That choice is too vague to operate.

Start with a narrower question: what decision may this system influence, using which data, under whose authority, with what evidence afterward? Once those boundaries are explicit, you can give teams meaningful speed without asking compliance to accept an invisible risk.

Classify the decision before you assess the AI

Many AI reviews begin with the model: where it is hosted, how it was trained, or whether it can explain an answer. Those questions matter, but they do not establish the business risk. The same model can summarize an approved dashboard, flag an unusual transaction pattern, or help determine an outcome that affects a customer. Those are not equivalent uses.

Classify each use case by consequence, reversibility, and action authority. Consequence asks what happens if the output is wrong. Reversibility asks whether a person can correct the result before harm occurs. Action authority asks whether the system informs a person, recommends an action, or executes one.

| Use case pattern | Permitted role for AI | Control that matters most | Boundary to make explicit |
| --- | --- | --- | --- |
| Descriptive analysis | Summarize approved metrics or behavioral patterns | Data permissions and traceable metric definitions | The output cannot create a new customer-level action |
| Investigative signal | Surface anomalies or suspicious patterns for review | Analyst validation, evidence capture, and disposition logging | A signal is not a finding or a verdict |
| Product recommendation | Suggest an intervention, workflow, or experiment | Human approval and outcome monitoring | The recommendation cannot bypass existing approval paths |
| Customer-affecting decision | Support a formally governed decision process | Documented oversight, explainability, and accountable human authority | The final authority and escalation path must be unambiguous |

This classification prevents two common errors. The first is applying the heaviest possible review to every analytical assistant, which sends teams into unofficial tools and manual workarounds. The second is treating every output as “just an insight” even when a downstream workflow turns it into a customer action.

Trace the output one step beyond the interface. If an anomaly score enters a case-management queue, changes account handling, or triggers outreach, govern that downstream effect as part of the use case. A recommendation does not become low risk merely because a person clicks the final button.

Before development begins, write an allowed-action statement and a prohibited-action statement. For example: “The system may prioritize patterns for analyst investigation. It may not label a customer, close a case, or initiate an external action.” That pair of sentences is more operationally useful than calling the project “medium risk.”
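
That pair of statements can also be enforced in code as a default-deny check. The action identifiers below are hypothetical encodings of the example statement, and `authorize` is an illustrative helper.

```python
# Hypothetical identifiers for the allowed and prohibited actions
# in the example statement.
ALLOWED_ACTIONS = {"prioritize_for_investigation"}
PROHIBITED_ACTIONS = {"label_customer", "close_case", "initiate_external_action"}

def authorize(action: str) -> str:
    """Default-deny: anything not explicitly allowed is refused."""
    if action in PROHIBITED_ACTIONS:
        return "deny_prohibited"
    if action in ALLOWED_ACTIONS:
        return "allow"
    return "deny_unlisted"

for action in ("prioritize_for_investigation", "close_case", "export_report"):
    print(action, "->", authorize(action))
```

Distinguishing a prohibited action from an unlisted one lets the system explain the refusal: the first is a policy boundary, the second is a scope question to route for review.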

Risk and compliance leaders still need to map the use case to the organization’s actual legal and regulatory obligations. A product risk classification is an operating tool, not a legal conclusion. When a use case could affect access, eligibility, pricing, fraud treatment, or another consequential outcome, obtain the appropriate compliance and legal review before activation.

Turn governance principles into an enforceable contract

Principles such as fairness, privacy, transparency, and human oversight do not control a production workflow by themselves. Each principle needs an owner, an enforcement point, and evidence that the control operated. I treat that combination as the governance contract for the use case.

Define the data boundary

List the approved data domains, fields, purposes, environments, and user groups. Do not stop at “customer data” or “analytics data.” Those labels are too broad to enforce. State which attributes the system can retrieve, which identifiers it can display, whether results may be exported, and where generated outputs may be stored.

- Purpose: the business question the data may be used to answer.
- Permitted inputs: the approved events, attributes, aggregates, and reference data.
- Prohibited inputs: data classes that the workflow must never retrieve or infer.
- Permitted users: roles allowed to query, review, approve, or export results.
- Output handling: where results may be displayed, retained, shared, or reused.
- Failure behavior: what the system does when permission, provenance, or confidence is insufficient.

Enforce that boundary with role-based access controls and granular permissions at retrieval time. Filtering an answer after a model has received restricted data is not equivalent to preventing access. The model, retrieval layer, analytics service, export path, and destination workflow all need to respect the same user identity and policy context.
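
A minimal sketch of retrieval-time enforcement follows, assuming a hypothetical `DataBoundary` record and a `fetch` callback that stands in for the real data service. The point it illustrates is ordering: the policy check runs before any data is fetched, so a denied request never reaches the model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataBoundary:
    purpose: str
    permitted_fields: frozenset
    prohibited_fields: frozenset
    permitted_roles: frozenset

def retrieve(boundary: DataBoundary, role: str, requested_fields, fetch):
    """Check policy first; call fetch only for requests that pass."""
    if role not in boundary.permitted_roles:
        raise PermissionError(f"role not permitted: {role}")
    requested = set(requested_fields)
    blocked = requested & boundary.prohibited_fields
    unapproved = requested - boundary.permitted_fields
    if blocked or unapproved:
        raise PermissionError(f"fields outside boundary: {sorted(blocked | unapproved)}")
    return fetch(sorted(requested))

calls = []  # records every request that actually reached the data layer

def fake_fetch(fields):
    calls.append(fields)
    return {f: "..." for f in fields}

boundary = DataBoundary(
    purpose="onboarding funnel analysis",
    permitted_fields=frozenset({"event_name", "cohort", "step_index"}),
    prohibited_fields=frozenset({"account_number"}),
    permitted_roles=frozenset({"product_analyst"}),
)

retrieve(boundary, "product_analyst", ["cohort", "event_name"], fake_fetch)
try:
    retrieve(boundary, "product_analyst", ["cohort", "account_number"], fake_fetch)
    denied = False
except PermissionError:
    denied = True

print(denied, len(calls))
```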

Assign decision rights to named roles

A committee can set policy, but it cannot own every operational decision. Give each use case an accountable product owner, a data owner, a control owner, and a business reviewer. Clarify who can approve launch, who can change the data scope, who reviews exceptions, and who has authority to stop the workflow.

- The product owner defines the user problem, allowed action, prohibited action, and business outcome.
- The data owner approves the data purpose, quality expectations, permissions, and reuse limits.
- The risk or compliance owner maps policy obligations to testable controls and reviews material exceptions.
- The platform or security owner implements identity, access, isolation, logging, and change controls.
- The business reviewer accepts, rejects, or escalates outputs and records why.

Keep the decision rights close to the workflow. If a reviewer sees an unsupported conclusion, that person needs a clear way to reject it, preserve the evidence, and route the issue. If every exception disappears into a general governance inbox, the formal control will be bypassed when operational pressure rises.

Design the audit record before launch

An audit trail should reconstruct what happened without relying on someone’s memory. Capture the requesting identity and role, the approved purpose, the data and metric definitions used, the system configuration, the generated result, any human review, the resulting action, and later corrections or overrides.

Logging creates its own data risk. Prompts, retrieved context, generated explanations, and reviewer notes can contain sensitive information. Protect the audit store with appropriate access, retention, and segregation rather than treating logs as harmless operational exhaust. Where policy permits, record protected references to sensitive records instead of duplicating raw payloads.
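
One possible reading of "protected references" is a keyed digest of the record identifier, stored in place of the raw value. The sketch below uses HMAC-SHA256 from the standard library. The key, field names, and `audit_entry` helper are assumptions, and whether a keyed digest satisfies your policy is a question for the control owner.

```python
import hashlib
import hmac
import json

# Placeholder only: in practice the key would come from a secret store.
AUDIT_KEY = b"placeholder-key-load-from-a-secret-store"

def protected_ref(record_id: str) -> str:
    """Keyed digest that can be recomputed for lookup but does not expose the ID."""
    return hmac.new(AUDIT_KEY, record_id.encode("utf-8"), hashlib.sha256).hexdigest()

def audit_entry(identity, role, purpose, metric_definitions, record_ids,
                result_summary, review, action):
    """Build one audit record that stores references instead of raw identifiers."""
    return {
        "identity": identity,
        "role": role,
        "purpose": purpose,
        "metric_definitions": metric_definitions,
        "record_refs": [protected_ref(r) for r in record_ids],
        "result_summary": result_summary,
        "review": review,
        "action": action,
    }

entry = audit_entry(
    identity="analyst-17", role="product_analyst",
    purpose="onboarding funnel analysis",
    metric_definitions=["activation_rate_v2"],
    record_ids=["acct-000123"],
    result_summary="drop-off concentrated at step 3",
    review="accepted", action="none",
)
serialized = json.dumps(entry)
print("acct-000123" in serialized)
```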

A practical platform evaluation should test whether the system combines strong data governance, auditable AI behavior, secure scale, and a direct connection to product outcomes. A policy document that cannot be enforced in the workflow is not enough, and a platform control without an accountable operating process is not enough either.

Put controls inside the workflows people actually use

Governance fails when it exists as a review ceremony around the product rather than a behavior inside it. Analysts should not have to remember a separate policy every time they ask a question. The approved data scope, identity context, review step, and evidence capture should travel with the task.

Behavioral analytics: govern the meaning as well as the data

Behavioral analytics can reveal how customers move through onboarding, self-service, support, payments, and other product journeys. The danger is not limited to unauthorized access. An AI system can also combine valid events into a misleading interpretation of customer intent.

Start the workflow with curated event definitions and approved business metrics. Require the output to expose the cohort definition, time context, filters, exclusions, and comparison used. The analyst should be able to inspect the path from a narrative claim back to the underlying measure before sharing it.

Separate observation from inference in the interface. “Users in this cohort abandoned the flow after this step” is an observation tied to event data. “They abandoned because they distrusted the process” is a hypothesis. Labeling those differently prevents fluent language from turning a plausible explanation into an unsupported fact.

Anomaly detection: route a signal into investigation, not judgment

An anomaly means a pattern differs from an expected baseline. It does not establish fraud, customer intent, system abuse, or operational error. Treat anomaly detection as a prioritization mechanism unless a separately governed process establishes something more.

Give the reviewer the observed deviation, relevant context, the comparison baseline, and links to permitted evidence. Capture the reviewer’s disposition: confirmed issue, expected behavior, insufficient evidence, data-quality problem, or escalation. That disposition is both an audit artifact and a feedback signal for improving the workflow.
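
The disposition set can be fixed in code so that every reviewed signal ends in exactly one recorded outcome. The enum mirrors the five dispositions named above; the log shape and the `record_disposition` helper are illustrative.

```python
from collections import Counter
from enum import Enum

class Disposition(Enum):
    CONFIRMED_ISSUE = "confirmed issue"
    EXPECTED_BEHAVIOR = "expected behavior"
    INSUFFICIENT_EVIDENCE = "insufficient evidence"
    DATA_QUALITY_PROBLEM = "data-quality problem"
    ESCALATION = "escalation"

def record_disposition(log: list, signal_id: str, reviewer: str,
                       disposition: Disposition, note: str) -> None:
    """Append one reviewed outcome; free-text verdicts are rejected."""
    if not isinstance(disposition, Disposition):
        raise TypeError("disposition must be a Disposition member")
    if not note.strip():
        raise ValueError("a short rationale is required")
    log.append({"signal_id": signal_id, "reviewer": reviewer,
                "disposition": disposition.value, "note": note})

log = []
record_disposition(log, "sig-1", "analyst-17", Disposition.EXPECTED_BEHAVIOR, "month-end batch")
record_disposition(log, "sig-2", "analyst-17", Disposition.DATA_QUALITY_PROBLEM, "duplicate events")

# The same log doubles as feedback: counts by disposition show where signals are weak.
counts = Counter(item["disposition"] for item in log)
print(counts)
```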

Watch the operational burden as closely as the detection capability. A flood of weak signals can make the nominal control less safe because reviewers rush, defer, or stop trusting the queue. Monitor false positives, unresolved escalations, overrides, and the reasons analysts reject outputs. When those indicators deteriorate, reduce scope or pause automated routing while the cause is investigated.

Self-service analysis: give teams a governed lane

Product managers and analysts need enough freedom to explore without sending every question through a central approval queue. Create a governed workspace containing approved metrics, documented data products, role-aware access, and restricted export paths. Let people iterate freely inside that lane while changes to data scope, decision authority, or external activation trigger a new review.

Make the boundary visible. Users should know when an answer is based on incomplete data, when a metric is not approved for customer-level decisions, and when an output cannot be exported. A silent denial encourages workarounds; a clear denial that identifies the policy boundary gives the user a legitimate next step.

Do not give an analytics assistant write access to operational systems merely because the integration is convenient. Insight generation and action execution are separate privileges. Connect them only when the action, reviewer, failure mode, and rollback path have been governed explicitly.

Pilot with evidence, not a polished demonstration

A convincing demo proves that the happy path works. A governed pilot must also prove that the system refuses the wrong request, exposes enough evidence for review, and leaves a usable record when something goes wrong.

Choose a narrow workflow with an identifiable user, a bounded data set, a reviewable output, and a business outcome you already understand. Avoid beginning with an enterprise-wide assistant or an autonomous action layer. Broad scope makes it difficult to distinguish model behavior, data problems, permission failures, and process gaps.

1. Write the decision contract. Record the user, purpose, permitted inputs, allowed action, prohibited action, reviewer, and stop authority.
2. Configure the smallest useful data boundary. Include only the fields and metrics needed for the chosen workflow.
3. Test legitimate work. Confirm that authorized users can produce an insight, inspect its basis, and complete the intended review.
4. Test prohibited work. Attempt access with the wrong role, request excluded attributes, try an unauthorized export, and ask the system to take a prohibited action.
5. Test ambiguity and failure. Use incomplete context, conflicting metric definitions, missing permissions, and unavailable dependencies. Confirm that the system fails visibly and safely.
6. Reconstruct the event. Use the audit record to determine who requested the output, what information was used, what was generated, who reviewed it, and what happened next.
7. Change the system deliberately. Update a relevant configuration or model component and confirm that approval, documentation, testing, and monitoring follow the change.

Do not accept screenshots as evidence for controls that operate behind the interface. Ask the vendor or internal platform team to demonstrate a denied request, a permission change, a reviewer override, an exported audit record, and the behavior after a governed configuration change. The test should follow your use case and identities, not a generic demonstration tenant.

Measure value and control health together. If the system produces faster insights but increases unreviewed actions, weakens attribution, or creates an investigation backlog, it has not delivered a durable improvement.

| Dimension | Question | Useful signals |
| --- | --- | --- |
| Business value | Does the workflow improve a real product, growth, risk, or operational decision? | Time to a validated insight, useful investigations completed, issues resolved, and attributable product outcomes |
| Analytical quality | Can a reviewer verify the conclusion? | Accepted and rejected outputs, unsupported claims, metric-definition errors, and missing context |
| Control effectiveness | Did policy operate as designed? | Prohibited requests blocked, required reviews completed, permission exceptions, and audit-record completeness |
| Operational health | Can people sustain the workflow? | False-positive burden, unresolved escalations, overrides, rework, and reviewer backlog |
| Change safety | Do updates preserve the approved boundary? | Documented changes, completed regression checks, new failure patterns, and monitored post-change behavior |

Set release gates in binary language. The use case has a named accountable owner or it does not. Permissions have been tested with unauthorized identities or they have not. High-impact outputs receive the required review or they do not. Audit evidence can reconstruct an event or it cannot. Ambiguous gates become exceptions as soon as delivery pressure appears.
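
Binary gates translate directly into a check that accepts only true or false values. The gate names paraphrase the four examples above, and the sample values are invented.

```python
def release_decision(gates: dict) -> tuple[bool, list[str]]:
    """Return (ready, failed_gates); refuse any value that is not a plain bool."""
    for name, value in gates.items():
        if not isinstance(value, bool):
            raise TypeError(f"gate {name!r} must be True or False, got {value!r}")
    failed = [name for name, value in gates.items() if not value]
    return (not failed, failed)

# Invented sample state for one use case.
gates = {
    "named_accountable_owner": True,
    "permissions_tested_with_unauthorized_identities": True,
    "high_impact_outputs_reviewed": False,
    "audit_record_reconstructs_event": True,
}

ready, failed = release_decision(gates)
print(ready, failed)
```

Rejecting values such as "mostly" or "in progress" at the type level keeps an ambiguous gate from quietly becoming an exception.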

When the pilot is stable, reuse the control components rather than copying the entire use case. Standard identity propagation, data classification, audit schemas, reviewer workflows, and change gates can form a shared control plane. Each new use case still needs its own purpose, decision boundary, outcome measure, and risk assessment.

Key takeaways

Govern the decision the AI can influence, not just the model that produces the output.
Write both an allowed-action statement and a prohibited-action statement before development begins.
Enforce data permissions before retrieval and carry the user’s identity through analysis, export, and downstream action.
Treat human review as an operational workflow with evidence, dispositions, escalations, and stop authority.
Keep observations, hypotheses, recommendations, and customer-affecting decisions visibly distinct.
Test denial, ambiguity, change, and audit reconstruction alongside the happy path.
Track business value, analytical quality, control effectiveness, and operational burden on the same scorecard.

Your next move is not to draft an enterprise AI policy. Pick one live analytics workflow and write its decision contract on a single page. If you cannot name the allowed action, prohibited action, data boundary, reviewer, audit evidence, and stop authority, the workflow is not ready to scale. If you can, you have the foundation for AI analytics that product teams can use and risk leaders can defend.


May 15, 2026

How to Prove the ROI of an AI Product Before You Scale It

Your AI product is getting used. The demos land well, task completion is improving, and internal enthusiasm is high. Then the CFO asks a harder question: what changed in the business because this product exists?

You cannot answer that question with prompt volume, response quality, adoption, or tickets touched. You need a measurement system that separates activity from incremental value, counts the full operating cost, and makes risk visible before a rollout gets larger. Here is how to build one.

Start with the decision your ROI model must support

ROI is not a retrospective slide assembled after launch. It is a decision rule. Before development begins, decide what evidence would justify launching, scaling, redesigning, rolling back, or retiring the capability.

That distinction changes the conversation. Instead of asking whether the agent is accurate enough or popular enough, you ask whether a measurable change in customer behavior produces a measurable business result without crossing an unacceptable risk threshold.

Build a driver tree with four levels:

Company outcome: revenue growth, lower cost to serve, or reduced business risk.
Customer outcome: the user completes a valuable job, reaches value sooner, or resolves a problem without unnecessary effort.
Product behavior: the AI capability changes conversion, expansion, self-service completion, containment, handle time, or escalation.
Controllable lever: the team changes the workflow, model behavior, conversation design, human review, or product guidance.

The chain matters because a model metric is rarely a business metric. Better answer quality may improve task completion, which may improve trial-to-paid conversion. The ROI case depends on the full chain, not the first link.

Value path	Business outcome	Leading evidence	Guardrails
Revenue	Higher conversion, average order value, or expansion	Time-to-first-value and self-service completion	Errors, complaints, and policy violations
Cost	Lower cost to serve	Containment, deflection, and reduced handle time	Escalations, false resolution, and downstream customer harm
Risk	Lower frequency or impact of harmful failures	Human-review events and detected violations	False positives, false negatives, hallucinations, and security breaches

Choose one primary value path for the investment case. Revenue, cost, and risk can all appear on the scorecard, but declaring all three as primary makes it too easy to rescue a weak result with whichever metric moved after launch.

A support agent, for example, may appear successful because it contains more conversations. But containment is only valuable if customers actually resolve their problems. A conversation that never reaches a human can reduce measured support volume while increasing complaints or churn risk. This is why revenue, cost, and risk measures must be evaluated together.

Write the measurement contract before you build the dashboard

A measurement contract is a short agreement among product, data, finance, and the operational team affected by the AI workflow. It prevents the definitions, cost boundaries, and success thresholds from changing after results arrive.

Your contract should answer these questions:

Who is eligible? Define the users, accounts, tasks, channels, and exclusions. Do not mix workflows with materially different economics.
What is the intervention? Name the AI capability and the version being evaluated. A model, prompt, retrieval pipeline, policy, or escalation change can alter the result.
What is the primary outcome? Select the business metric that determines whether the hypothesis passed.
What are the leading indicators? Use measures such as time-to-first-value, containment, and self-service completion to diagnose movement before lagging results mature.
What are the guardrails? Predefine acceptable limits for errors, hallucinations, false positives, false negatives, escalations, complaints, security events, and policy violations.
What is the baseline? Freeze the comparison period or control group before exposing the eligible population to the capability.
How will incrementality be proven? Specify the experiment, holdout, assignment unit, and minimum detectable effect.
What costs count? Agree on model or API consumption, labeling, evaluation, human review, and ongoing oversight before calculating value.
What action follows each result? Record the thresholds for launch, scale, redesign, rollback, and retirement.
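
The contract can be kept as a small structured record so that blank answers are visible before any dashboard work starts. This is a sketch under assumptions: the field names are hypothetical, one per question above, and the example values are illustrative.

```python
from dataclasses import dataclass, field, fields


@dataclass
class MeasurementContract:
    """Hypothetical schema: one field per question in the contract."""
    eligibility: str = ""
    intervention_version: str = ""
    primary_outcome: str = ""
    leading_indicators: list = field(default_factory=list)
    guardrail_limits: dict = field(default_factory=dict)
    baseline: str = ""
    incrementality_design: str = ""
    counted_costs: list = field(default_factory=list)
    decision_thresholds: dict = field(default_factory=dict)


def unanswered(contract: MeasurementContract) -> list:
    """List the contract questions that are still blank."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name)]


draft = MeasurementContract(
    eligibility="trial accounts using in-app support",
    intervention_version="support-agent v3",
    primary_outcome="trial-to-paid conversion",
    baseline="account-level holdout",
)
print(unanswered(draft))
```

A draft with open items is still an exploration. The list of unanswered fields is the work remaining before an ROI claim can be made.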

The contract should distinguish an outcome OKR from an output OKR. Shipping the agent, generating responses, and increasing feature use are outputs. Improving conversion, lowering verified cost to serve, or reducing harmful failures are outcomes. Outputs can explain what happened, but they cannot establish value on their own.

Instrument the complete journey, not just the conversation

An AI log tells you what the model did. An ROI dataset must also tell you what the user did next.

Connect the journey from eligibility to business outcome:

The user or account became eligible for the capability.
The AI experience was offered, viewed, and engaged with.
A task was attempted, completed, abandoned, or repeated.
A response was accepted, corrected, regenerated, or sent for human review.
The interaction was contained, escalated, or handed to another workflow.
The downstream conversion, expansion, support, retention, or complaint event occurred.
The associated model cost, labeling work, and human-oversight cost were recorded.

Carry a stable user or account identifier, experiment assignment, agent version, and journey identifier across those events. Without that connective tissue, the team may have an impressive agent dashboard and no defensible way to attribute a business outcome to the experience.
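
A simple check at ingestion can flag events that lack those join keys. The sketch below assumes events arrive as dictionaries; the key names are hypothetical stand-ins for the four identifiers described above.

```python
# Hypothetical key names for the identifiers every journey event should carry.
JOIN_KEYS = ("account_id", "experiment_arm", "agent_version", "journey_id")


def missing_join_keys(event: dict) -> list:
    """Return the join keys that are absent or empty on an event."""
    return [key for key in JOIN_KEYS if not event.get(key)]


event = {
    "event": "response_accepted",
    "account_id": "acct-17",
    "experiment_arm": "treatment",
    "agent_version": "v3",
    "journey_id": None,  # lost somewhere between the agent and the warehouse
}
print(missing_join_keys(event))
```

Events that fail this check can still be counted as activity, but they cannot be attributed to an experiment arm or joined to a downstream outcome.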

Use behavioral analytics and session replay to understand why a metric moved. Use journey mapping and retention analysis to locate the friction worth solving in the first place. Product tours and in-app guidance can then help eligible users reach a validated workflow. This creates a closed loop from journey friction to experiment and measurable outcome, instead of a collection of disconnected AI metrics.

Calculate economic value without turning activity into savings

Start with net business value:

Net business value = incremental revenue + cost avoided – total operating cost – quantified risk loss

If finance requires an ROI percentage, divide net business value by the agreed investment base. Keep both the numerator and denominator visible. A percentage without its cost boundary is easy to inflate and hard to audit.
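
As a worked illustration, the formula and the optional percentage can be written as two small functions. The figures below are invented for the example and do not come from any real deployment.

```python
def net_business_value(incremental_revenue, cost_avoided,
                       total_operating_cost, quantified_risk_loss):
    """Net value: incremental revenue plus cost avoided, less operating cost and risk loss."""
    return (incremental_revenue + cost_avoided
            - total_operating_cost - quantified_risk_loss)


def roi_percent(net_value, investment_base):
    """ROI as a percentage of the agreed investment base."""
    if investment_base <= 0:
        raise ValueError("investment base must be positive")
    return 100 * net_value / investment_base


# Illustrative figures only.
value = net_business_value(
    incremental_revenue=120_000,
    cost_avoided=40_000,
    total_operating_cost=90_000,
    quantified_risk_loss=10_000,
)
print(value, roi_percent(value, investment_base=150_000))
```

Reporting the four inputs and the investment base next to the percentage keeps the cost boundary auditable.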

Count only incremental revenue

Do not credit the AI product with every transaction it touched. Credit it with the difference between the exposed population and the valid control or holdout.

A practical revenue calculation is:

Incremental revenue = eligible volume × measured outcome lift × value per additional outcome

The measured outcome might be trial-to-paid conversion, self-service upsell, average order value, or expansion. Use the same eligibility definition, attribution window, and revenue treatment for the intervention and control. If the AI experience merely appears somewhere in a successful journey, that is influenced revenue, not proof of incremental revenue.
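
The calculation can be sketched as follows, assuming the lift is expressed as the absolute difference between the treatment and control outcome rates. That interpretation, and every number in the example, is an assumption made for illustration.

```python
def incremental_revenue(eligible_volume, treatment_rate, control_rate,
                        value_per_outcome):
    """Point estimate: volume times absolute rate lift times value per outcome."""
    lift = treatment_rate - control_rate  # absolute lift over the control
    return eligible_volume * lift * value_per_outcome


# Illustrative: 10,000 eligible trials, 4.6% vs. 4.0% conversion, 500 per conversion.
estimate = incremental_revenue(
    eligible_volume=10_000,
    treatment_rate=0.046,
    control_rate=0.040,
    value_per_outcome=500,
)
print(round(estimate, 2))
```

This is only a point estimate. The confidence interval on the lift from the experiment should travel with the revenue figure.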

Separate capacity from cashable savings

Cost claims require more care than a deflection count. A contained interaction may create capacity without reducing expenditure. That capacity can still be valuable, but it should not be presented as cash savings unless spending actually changes.

Capacity created: employees have time available for other work, but the existing cost base remains.
Variable cost avoided: the company no longer incurs a cost that would have grown with each additional interaction.
Cashable savings: an approved budget, vendor charge, or staffing requirement is actually reduced.

Report these separately. Otherwise, the same saved minute can be counted once as employee capacity and again as reduced spend.

Validate that a deflected task was resolved, not abandoned or displaced to another channel. Then calculate avoided cost from the incremental lift in verified resolution, not the total number of conversations the agent handled.

Include the operating costs that make the agent dependable

Model or API cost is only one part of the investment. Include labeling, evaluation, human review, and operational oversight. If a safer workflow requires more review, that review is part of the product’s economics, not an external inconvenience to exclude from the model.

Segment cost by agent, workflow, and outcome. Cost per response is useful for infrastructure management, but cost per verified successful outcome is the better economic unit. A cheap response that triggers retries, escalations, or corrections may be more expensive than a higher-cost response that completes the job.
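
A short comparison shows why the unit matters. The two workflows and their costs below are hypothetical; total cost is assumed to include model consumption plus review and correction effort.

```python
def cost_per_verified_outcome(total_cost, verified_outcomes):
    """Total operating cost divided by outcomes confirmed as resolved."""
    if verified_outcomes == 0:
        return float("inf")  # cost was incurred but no verified value was produced
    return total_cost / verified_outcomes


# Illustrative: the cheaper model triggers more review and completes fewer jobs.
cheap = cost_per_verified_outcome(total_cost=320.0, verified_outcomes=400)
premium = cost_per_verified_outcome(total_cost=200.0, verified_outcomes=800)
print(cheap, premium)
```

In this example the workflow with the lower cost per response is the more expensive one per verified outcome.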

Do not bury risk inside an average ROI number

Risk adjustment should make uncertainty visible, not create false precision. Use three layers:

Hard guardrails: security and policy conditions that trigger containment or rollback regardless of financial upside.
Observed risk indicators: error, hallucination, escalation, complaint, false-positive, and false-negative rates tracked by workflow and cohort.
Financial adjustment: expected loss deducted from net value only when the probability and impact assumptions are credible enough for finance and risk owners to accept.

Do not let a low-frequency, high-consequence failure disappear inside a high average success rate. If the downside cannot be defensibly monetized, keep it as an explicit decision constraint rather than assigning it a convenient dollar value.

Prove incrementality before claiming impact

The strongest ROI calculation still fails if the attribution is weak. A before-and-after improvement may come from seasonality, pricing, traffic quality, a support policy change, or another product release. The AI capability needs a counterfactual: what would have happened to comparable eligible users without it?

Use an A/B test or holdout whenever the product and risk profile allow it. Make these choices before launch:

Assignment unit: Randomize at the level where the outcome occurs. If expansion is measured per account, account-level assignment can prevent users in the same customer organization from receiving conflicting experiences.
Primary outcome: Pick the metric that determines success and keep diagnostic metrics secondary.
Minimum detectable effect: Precompute the smallest lift worth detecting based on the baseline, available population, and business value. If the experiment cannot detect a decision-relevant change, extending the metric list will not fix it.
Guardrails: Test quality, escalation, complaints, security, and policy outcomes alongside the primary metric.
Analysis population: For a product-level ROI claim, analyze eligible users according to their assigned experience. Looking only at people who voluntarily used the agent introduces selection bias.
Measurement horizon: Keep the holdout long enough to observe the outcome named in the contract. Leading indicators can guide iteration, but they should not be substituted for retention, churn, net revenue retention (NRR), or other lagging outcomes.
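
To check whether the available population can detect a decision-relevant lift, a rough per-arm sample size can be computed in advance. The sketch below uses the standard normal-approximation formula for comparing two proportions. The default significance level and power are conventional assumptions, not values from this article.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(baseline_rate, mde_abs, alpha=0.05, power=0.8):
    """Approximate users per arm to detect an absolute lift of mde_abs."""
    if mde_abs <= 0:
        raise ValueError("minimum detectable effect must be positive")
    p1 = baseline_rate
    p2 = baseline_rate + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / mde_abs ** 2)


# Illustrative: 4% baseline conversion, 0.6 percentage point lift worth detecting.
print(sample_size_per_arm(baseline_rate=0.04, mde_abs=0.006))
```

If the result exceeds the eligible population for the planned horizon, the experiment cannot answer the question as posed. Halving the detectable lift roughly quadruples the required sample.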

If randomization is not practical, use a fixed holdout or a frozen comparison period and document the limitations. A weaker design can still inform a decision, but the ROI claim should carry less confidence. Do not quietly promote correlation to causation because the rollout has executive attention.

Interpret the result as a system. Suppose self-service completion rises but the business outcome does not. The agent may be solving a low-value task, attracting users who would have converted anyway, or shifting effort to a later step. If conversion improves while complaints or policy violations cross the guardrail, the value hypothesis may be valid but the implementation is not ready to scale.

This is eval-driven development applied to product economics: define acceptable behavior and business success, measure both under controlled conditions, diagnose the failures, and repeat the test after a meaningful change.

Turn ROI into a portfolio operating system

A one-time business case goes stale as models, prompts, traffic, user behavior, and operating costs change. Maintain an Agent Analytics view for every production capability.

Each agent scorecard should show:

The primary business outcome and current experiment result.
Leading journey metrics from eligibility through verified completion.
Revenue contribution, cost avoided, and total operating cost using the agreed definitions.
Quality and risk guardrails, including escalations and human-review events.
Performance by relevant customer, task, and journey cohort.
The agent, model, policy, and workflow version associated with the result.
The current decision status: exploring, launching, scaling, redesigning, contained, or retiring.

Use the dashboard to make portfolio decisions, not merely to report trends:

Scale when the primary outcome clears the precommitted threshold, guardrails hold, net value is positive, and the result remains credible across the cohorts that matter.
Redesign when leading indicators improve but the business outcome does not, or when human review and escalation erase the economic gain.
Contain or roll back when a hard security, policy, or customer-harm threshold is breached, even if average financial performance is positive.
Retire when controlled measurement shows no decision-relevant incrementality or when dependable operation costs more than the value created.
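
These rules can be encoded so the decision status follows from the scorecard rather than from the meeting. The sketch below is one possible reading. The precedence order and the `hold_for_review` fallback are assumptions added for illustration, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scorecard:
    hard_threshold_breached: bool
    primary_outcome_met: bool
    guardrails_hold: bool
    net_value_positive: bool
    credible_across_cohorts: bool
    leading_indicators_improved: bool


def decide(card: Scorecard) -> str:
    """Map a scorecard to a portfolio decision; hard breaches always win."""
    if card.hard_threshold_breached:
        return "contain_or_roll_back"
    if (card.primary_outcome_met and card.guardrails_hold
            and card.net_value_positive and card.credible_across_cohorts):
        return "scale"
    value_missing = not card.primary_outcome_met or not card.net_value_positive
    if value_missing and card.leading_indicators_improved:
        return "redesign"
    if value_missing:
        return "retire"
    return "hold_for_review"  # assumed fallback: value exists but a soft check failed


print(decide(Scorecard(True, True, True, True, True, True)))
```

Checking the hard threshold first means a positive average financial result can never override containment.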

Review operational signals with frontline teams because they can explain patterns hidden by aggregate metrics. Review portfolio value in QBRs with product, data, finance, and risk owners so investment follows evidence rather than novelty.

Only accelerate adoption after the workflow has demonstrated unit value. In-app guides, product tours, and lifecycle nudges can bring more eligible users into a validated flow. Measure whether those interventions increase the business outcome, not merely clicks or agent sessions. Scaling exposure to an unproven workflow scales its cost and risk as readily as its potential benefit.

Key takeaways

Treat ROI as a precommitted decision rule for launch, scale, redesign, rollback, or retirement.
Connect model behavior to customer behavior and then to revenue, cost, or risk through a driver tree.
Freeze the baseline, cost boundary, guardrails, attribution method, and success thresholds before results arrive.
Credit only incremental revenue and verified avoided cost. Keep created capacity separate from cashable savings.
Include model consumption, labeling, evaluation, human review, and oversight in the operating cost.
Use controlled experiments or holdouts, with a decision-relevant minimum detectable effect, to separate causal impact from correlation.
Keep severe risk conditions as explicit constraints when they cannot be responsibly converted into a financial estimate.
Scale adoption only after the AI workflow has shown positive unit value under acceptable risk.

Pick one high-friction customer journey and complete its measurement contract before the next roadmap review. If the team cannot name the baseline, control, primary outcome, cost boundary, guardrails, and decision thresholds, the capability is still an exploration. Label it honestly, instrument it properly, and earn the right to make an ROI claim.

May 15, 2026

How to Deploy an Operator AI Agent in Customer Operations

Your support team probably does not need another chatbot that summarizes a ticket on command. It needs help with the operational work surrounding every ticket: finding why escalations changed, keeping knowledge accurate, correcting broken automations, coordinating incident communication, and showing human reps what deserves attention next.

An operator AI agent can take on that work, but only if you design it as an operating system for customer operations rather than a conversational layer over support APIs. The useful version closes the loop from signal to diagnosis to tested change. The dangerous version produces plausible commentary and receives permission to act before it has earned trust.

Define the job as a closed loop, not a chat box

A customer-facing AI agent handles an individual customer’s request. An operator agent works on the system around those requests: conversations, help content, automation configuration, performance data, incident workflows, and the human queue.

That distinction changes the product requirement. The agent is not complete when it answers a question such as why escalations increased. It is complete when it can investigate the increase, identify a supported cause, determine which operational object needs attention, prepare a change, test that change where possible, and route it to the right person for approval.

Observe: Detect a question, anomaly, scheduled task, failed conversation, release brief, or incident.
Diagnose: Select the relevant metrics and attributes, inspect representative conversations, and separate recurring patterns from isolated cases.
Locate the control point: Determine whether the problem sits in knowledge, guidance, a procedure, a data connector, an automation rule, or a human workflow.
Propose: Produce a concrete artifact such as an article diff, configuration change, procedure, incident audience, or prioritized queue.
Verify: Run a simulation or another appropriate check and expose failures, edge cases, and remaining uncertainty.
Act and learn: Apply an approved change, record what happened, and monitor the affected outcome for regression.

Consider the prompt “Why did escalations rise last week?” A reporting copilot returns a chart. A useful operator identifies which escalation definition applies, segments the change, reads relevant conversations, finds the repeated cause, checks whether the corresponding help content or automation is deficient, and prepares the smallest defensible correction. That progression from an operational question to an actionable proposal is already possible across analysis, knowledge maintenance, automation building, and human support workflows.

Write the acceptance criteria around that complete handoff. Require the evidence used, the proposed artifact, the scope of impact, the verification result, the named reviewer, and any action the agent is forbidden to take. If the output still leaves an operations manager rebuilding the context manually, you have a chat assistant, not an operator.

Build reliability below the model and price that work honestly

A foundation model with API access can make a persuasive prototype. It can query ticket data, summarize conversations, and write a report that appears coherent. The hard part begins when different workspaces use different fields, configurations, workflows, permissions, and definitions of success.

The model should not have to rediscover your operating rules on every run. Encode those rules in purpose-built tools and reusable skills. A tool performs one bounded operation, such as retrieving a conversation, searching knowledge, or running a defined report. A skill coordinates several tools to complete a business job, such as debugging a failed resolution or rolling a policy change through the help center.

Operator’s production architecture is described as having more than 50 tools and 10 multi-step skills. Those counts are not targets to copy. They illustrate how quickly the hidden surface area grows once an agent must do dependable operational work instead of demonstrating a few API calls.

System layer	Job it must perform	Failure you should test for	Control to add
Semantic retrieval	Find content by meaning, not only exact words	Irrelevant or incomplete evidence produces a confident diagnosis	Evaluate retrieval against real support questions and known content gaps
Attribute awareness	Know which metrics, fields, and custom attributes are populated and meaningful	The agent invents a pattern from sparse or unused fields	Expose field definitions, coverage, allowed joins, and missing-data signals
Atomic tools	Perform narrow reads or writes predictably	A broad API wrapper allows an unintended query or change	Use typed inputs, constrained scopes, explicit permissions, and structured results
Domain skills	Chain tools according to a repeatable customer-operations method	The same request follows a different process on each run	Define required steps, exit conditions, evidence, and escalation paths
Review interface	Turn reasoning into charts, diffs, tests, and proposals	A reviewer approves a wall of prose without understanding the change	Render the decision in the format appropriate to the object being changed

Semantic retrieval and attribute awareness deserve particular attention. Retrieval grounds the agent in the content that can actually answer the question. Attribute awareness stops it from treating every available field as equally meaningful. A custom field that exists but is almost never populated should not become the foundation of an operational recommendation.

Give every tool a contract before the model can call it:

The business purpose and the questions it is allowed to answer.
The read and write permissions it requires.
The preconditions that must be true before it runs.
The evidence and identifiers it must return.
Its behavior when data is missing, ambiguous, stale, or inconsistent.
The audit event, approval requirement, and rollback path for a write.
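
Part of that contract can be checked mechanically before a tool is registered. The sketch below covers only a subset of the items above (purpose, permissions, and the write-specific requirements). The class, field names, and example tool are hypothetical and not taken from any vendor's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolContract:
    """Hypothetical contract record for one bounded tool."""
    name: str
    purpose: str
    permissions: frozenset
    is_write: bool = False
    approval_required: bool = False
    audit_event: Optional[str] = None
    rollback_path: Optional[str] = None


def contract_gaps(contract: ToolContract) -> list:
    """Return contract items that are missing; writes carry extra requirements."""
    gaps = []
    if not contract.purpose:
        gaps.append("purpose")
    if not contract.permissions:
        gaps.append("permissions")
    if contract.is_write:
        if not contract.approval_required:
            gaps.append("approval_required")
        if not contract.audit_event:
            gaps.append("audit_event")
        if not contract.rollback_path:
            gaps.append("rollback_path")
    return gaps


publish = ToolContract(
    name="publish_article",
    purpose="Publish an approved help-center article diff",
    permissions=frozenset({"knowledge:write"}),
    is_write=True,
    approval_required=True,
    audit_event="article.published",
)
print(contract_gaps(publish))
```

A write tool without a documented rollback path is flagged here, which is the prompt to either define one or record that the action is irreversible.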

Evaluate build versus buy beyond the demonstration

A proof of concept establishes that a model can produce a plausible answer with your data. It does not establish that the answer is grounded, that the proposed action is safe, or that the system will behave consistently as configurations change.

For a build decision, include retrieval tuning, permission design, tenant isolation, tool maintenance, skill development, evaluation data, observability, proposal interfaces, audit history, rollback behavior, and on-call ownership. Also ask who will update the agent when a support object, metric definition, product policy, or API changes. If these responsibilities do not have durable owners, the internal agent will age like any other unsupported operations system.

For a buy decision, ask the vendor to demonstrate your difficult cases rather than its preferred prompts. Use a conversation with conflicting evidence, an unused custom attribute, an outdated localized article, a misconfigured rule, and a proposed write with a wide blast radius. Inspect the evidence, tool trace, permissions, diff, test result, and audit record. The quality of the generated prose is one of the least informative parts of that evaluation.

Put a proposal boundary around every material action

Moving from analysis to live changes is a different class of production problem. A wrong summary wastes time. A wrong configuration can degrade customer outcomes across every conversation that matches it. An incorrect outbound message cannot be recalled after customers have read it.

I would give the agent autonomy according to consequence, not according to how confident its language sounds:

Read: Search content, inspect conversations, calculate approved metrics, and assemble evidence. Run these tasks autonomously within access controls and log every operation.
Recommend: Explain a root cause or rank an opportunity. Attach the underlying conversations, segments, fields, and assumptions so a person can challenge the conclusion.
Prepare: Draft an article, procedure, rule, connector configuration, customer response, or queue. Save it as a proposal with no production effect.
Change: Publish, configure, send, or otherwise alter the live operation only after the required reviewer sees the exact scope and explicitly approves it.
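
The four classes can be enforced with a small gate in front of every tool call. This is a minimal sketch with hypothetical names, assuming access-control results and the approver identity are supplied by the surrounding platform.

```python
from enum import IntEnum
from typing import Optional


class ActionClass(IntEnum):
    """Action classes ordered by consequence."""
    READ = 1
    RECOMMEND = 2
    PREPARE = 3
    CHANGE = 4


def may_execute(action: ActionClass, within_access_controls: bool,
                approved_by: Optional[str] = None) -> bool:
    """Allow lower-consequence work autonomously; require approval for live changes."""
    if not within_access_controls:
        return False
    if action is ActionClass.CHANGE:
        return approved_by is not None
    return True


print(may_execute(ActionClass.PREPARE, within_access_controls=True))
print(may_execute(ActionClass.CHANGE, within_access_controls=True))
```

The gate never reads the model's stated confidence. Only the action class, the access check, and a named approver affect the result.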

A proposal is a structured change object, not a paragraph asking for trust. Production-grade operator systems can present reviewable diffs before applying changes, allowing the reviewer to accept, reject, or refine the work. The same principle should govern any operator implementation.

Your review screen should answer six questions without forcing the approver into another tool:

What object will change?
What exact fields, passages, rules, or recipients are affected?
What evidence connects the observed problem to this change?
What test ran, and which cases failed or remained untested?
Who must approve, and which permission will execute the action?
How can the change be reversed, and what cannot be reversed?

Customer outreach needs the strictest treatment because sending is effectively irreversible. Do not approve a batch from a conversational summary that hides the audience. The safe alternative is a preview containing the resolved customer list, inclusion logic, exclusions, exact message variants, delivery channel, and approver. Start by allowing the agent to prepare that package while a person performs the send.

Simulation also needs a visible place in the proposal. If the agent modifies an automation procedure, show which representative conversations were tested, the expected outcome for each, the observed outcome, and why any mismatch occurred. An overall pass label is not enough to reveal an important edge case.

Human approval is not a permanent substitute for system quality. If reviewers routinely accept proposals without inspecting them, the control has become ceremonial. Track corrections, rejections, rollbacks, and the evidence reviewers open. Use those signals to improve the relevant retrieval rule, tool, skill, or interface.

Roll out workflows in increasing order of consequence

Choose the first workflow by its operating characteristics. A strong starting candidate recurs frequently, consumes expert attention, has accessible evidence, produces a clear artifact, and has a named reviewer. It should also allow the agent to be useful before it receives broad write permission.

A practical rollout sequence looks like this:

Recurring operations analyst. Give the agent one standing question, such as what changed in escalations or automation performance. Define the metric, comparison period, relevant segments, evidence requirements, and report destination. Require links to representative conversations and allow the conclusion that no action is warranted. Compare its reasoning with an experienced operator’s review until the failure modes are understood.
Knowledge steward. Feed it a release brief or policy change. Ask it to find affected help content, identify missing coverage, and prepare article diffs in the required voice and format. Include localized variants where they exist. The reviewer should validate product behavior, instructions, links, policy language, and whether the proposed set of pages is complete before publishing.
Automation maintainer. Start with known failed conversations. Ask the agent to distinguish a content gap from a rule, procedure, guidance, or connector problem; prepare the smallest correction; define triggers and edge cases; and simulate the result. Do not grant live configuration access until the tool trace and tests make the diagnosis reproducible.
Human-operations coordinator. Use the agent to assemble an incident audience, draft targeted responses, prepare coaching evidence, or prioritize a rep’s queue. These workflows can save substantial coordination time, but they touch customer communication and employee decisions. Begin in preparation mode, expose the selection logic, and expand autonomy only after identity, permission, review, and audit controls have been exercised.

This sequence is a risk ordering, not a universal maturity model. A read-only weekly analysis is easier to inspect and reverse than an outbound incident campaign. A knowledge proposal has a reviewable artifact. A live automation change affects future conversations, while customer communication may create an immediate and irreversible consequence. Move forward when the evidence and controls for the next class of action are ready, not merely because the previous feature launched.

Measure the completed loop, not chat activity

Prompt counts and conversation volume tell you that people opened the product. They do not tell you that customer operations improved. Build the scorecard around the operational loop:

Diagnostic quality: Whether the proposed root cause survives expert review, whether its evidence supports the conclusion, and how often factual correction is required.
Operational throughput: Time from a detected signal to a reviewed proposal and from an approved proposal to a verified change.
Artifact quality: Acceptance, revision, rejection, and rollback patterns for knowledge, automation, configuration, and communication proposals.
Customer outcome: Resolution, escalation, repeat contact, and sentiment for the affected topic after the change, interpreted alongside volume and case mix.
Safety: Permission denials, attempted out-of-scope actions, failed simulations, unauthorized writes, rollbacks, and missing audit events.
Human leverage: Expert time spent collecting evidence, recreating context, drafting the artifact, and reviewing the final proposal.

Do not make automation rate the only goal. A higher rate can coexist with poor resolutions or avoidable escalations. Treat it as one diagnostic measure and pair it with customer outcomes, correction rates, and topic-level regressions.

Create an evaluation set from real operating conditions: known content gaps, misconfigured rules, legitimate escalations, sparse attributes, conflicting evidence, localized content, and incidents with precise audience criteria. Give each case an expected outcome, required evidence, allowed tools, and forbidden action. Re-run the set when the model, retrieval system, tool, skill, permissions, or support configuration changes.
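
An evaluation case and its grading rule might be sketched as follows. The structure and all names are hypothetical; the four checks correspond to the expected outcome, required evidence, allowed tools, and forbidden action described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    case_id: str
    expected_outcome: str
    required_evidence: frozenset
    allowed_tools: frozenset
    forbidden_actions: frozenset


@dataclass(frozen=True)
class AgentRun:
    outcome: str
    evidence_cited: frozenset
    tools_called: frozenset
    actions_attempted: frozenset


def grade(case: EvalCase, run: AgentRun) -> list:
    """Return the reasons a run fails a case; an empty list is a pass."""
    failures = []
    if run.outcome != case.expected_outcome:
        failures.append("wrong outcome")
    if not case.required_evidence <= run.evidence_cited:
        failures.append("missing evidence")
    if not run.tools_called <= case.allowed_tools:
        failures.append("disallowed tool")
    if run.actions_attempted & case.forbidden_actions:
        failures.append("forbidden action")
    return failures


case = EvalCase(
    case_id="sparse-attribute-01",
    expected_outcome="no_action_warranted",
    required_evidence=frozenset({"field_coverage_report"}),
    allowed_tools=frozenset({"run_report", "get_conversation"}),
    forbidden_actions=frozenset({"publish_rule"}),
)
run = AgentRun(
    outcome="no_action_warranted",
    evidence_cited=frozenset({"field_coverage_report"}),
    tools_called=frozenset({"run_report"}),
    actions_attempted=frozenset(),
)
print(grade(case, run))
```

Grading the tool trace and attempted actions, not only the final answer, catches runs that reach the right conclusion through a path the workflow should never take.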

Scheduled work is where the leverage begins to compound. An operator can run recurring analysis and deliver the resulting report without waiting for a manager to remember the question. Keep an owner on every scheduled job, however. That owner should know where failures appear, when the task last completed, which data it used, and how to pause it.

Key takeaways

An operator agent improves the system around customer conversations; it is not simply another customer-facing bot.
The product boundary should cover observation, diagnosis, proposal, verification, approval, action, and monitoring.
Reliable behavior comes from grounded retrieval, attribute awareness, bounded tools, encoded domain skills, and structured review surfaces.
Grant autonomy by consequence: broad freedom to inspect approved data, tighter controls to prepare changes, and explicit approval for production writes.
Roll out recurring analysis first, followed by knowledge changes, automation configuration, and customer communication, unless your own risk profile clearly supports another order.
Measure supported diagnoses, accepted artifacts, customer outcomes, human time, and safety events rather than prompt volume alone.

Your next step is to choose one recurring operational question and write down the evidence it requires, the artifact a good answer should produce, the person who will review it, and the actions the agent must not take. Once that loop works reliably, add one downstream proposal. That is a much stronger foundation for an operator agent than beginning with an open-ended prompt and a broad API key.

References

May 14, 2026

AI Product Data Security: A Practical Playbook for PMs

Your AI feature is ready to move beyond the prototype, but one question can still stop the release: exactly which customer data leaves your boundary, where is it copied, and who can retrieve it later? If the answer is scattered across architecture diagrams, vendor settings, and assumptions, you do not yet have a security decision.

You can resolve that uncertainty without turning every experiment into a committee exercise. Map the data path, assign the capability a risk lane, minimize what the model receives, and automate the controls that follow from the classification. The result is a release process that is both faster and easier to defend.

Start with the data path, not the model

The first security question is not what the model knows. It is what your product sends, retrieves, transforms, stores, logs, and displays. A provider can have a strong security posture while your implementation still exposes data through an overbroad retrieval query, a debug log, or an incorrectly scoped support tool.

Draw the complete path for one user request. Do not use a generic platform diagram. Follow the actual capability from the moment a user or system creates an input until every resulting copy has expired or been deleted.

Identify the original input, including form fields, uploaded files, messages, system-generated events, and API payloads.
List the context added by your application, such as account attributes, conversation history, analytics, retrieved documents, feature configuration, or tool results.
Mark every transformation before the model call: filtering, redaction, tokenization, summarization, chunking, or schema conversion.
Name the service that receives each payload, including gateways, model providers, observability tools, evaluation systems, queues, and caches.
Trace the response through validation, tool execution, display, analytics, support access, and downstream storage.
Record when each copy expires, how deletion propagates, and who can access it while it exists.

For every step, capture six fields: data class, system owner, access scope, external recipient, retention rule, and failure consequence. If any field is unknown, label it unknown. An explicit unknown is useful discovery work; an undocumented assumption is hidden risk.
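A minimal sketch of that record, assuming you keep the map as data rather than as a diagram. The class and function names are hypothetical; the six fields are the ones listed above, and every field defaults to an explicit unknown so gaps stay visible.

```python
from dataclasses import dataclass, fields

UNKNOWN = "unknown"


@dataclass
class DataFlowStep:
    """One step in the request-to-deletion path for a single capability."""
    step: str
    data_class: str = UNKNOWN
    system_owner: str = UNKNOWN
    access_scope: str = UNKNOWN
    external_recipient: str = UNKNOWN
    retention_rule: str = UNKNOWN
    failure_consequence: str = UNKNOWN


def open_unknowns(path: list[DataFlowStep]) -> dict[str, list[str]]:
    """Map each step name to the fields that are still labeled unknown."""
    result = {}
    for step in path:
        missing = [
            f.name
            for f in fields(step)
            if f.name != "step" and getattr(step, f.name) == UNKNOWN
        ]
        if missing:
            result[step.step] = missing
    return result
```

The output of `open_unknowns` is a natural source for owned backlog items: each entry names a step and the exact question nobody has answered yet.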

Do not stop at obvious records such as customer PII and payment identifiers. Prompts, retrieved context, user-linked analytics, internal roadmaps, feature flags, configuration values, embeddings, vector stores, and evaluation datasets can also reveal confidential facts or inferred identity. Treat them as product data with owners and controls, not harmless implementation residue.

Use a completion test that exposes weak assumptions

Your map is ready for a decision when someone outside the feature team can answer these questions from it:

What is the most sensitive field the capability can receive?
Which fields cross the company boundary, and which named service receives them?
Can one customer ever retrieve another customer’s data?
Are raw prompts, completions, retrieved passages, or tool results logged?
Which identities can inspect those logs or replay a request?
What happens to derived data when the original record is deleted or its permissions change?
Which control contains the incident if the model, retrieval layer, or tool call behaves unexpectedly?

If the team can answer these questions only by asking several vendors or searching production settings, do not sign off on the release yet. The missing work is not paperwork. It is part of the product’s operating design.

Turn the risk assessment into a release lane

A risk score is useful only when it changes what the team must do. Avoid a long questionnaire that ends with an ambiguous rating. Use a small number of lanes, give each lane an observable entry condition, and attach default release controls.

Risk lane	Typical signals	Default release posture
Low	Internal capability; synthetic or public inputs; no sensitive context; no consequential external action	Approved provider, least-privilege credentials, basic access tests, and confirmation that secrets are not entering prompts or logs
Elevated	Customer-facing capability; authenticated user context; behavioral telemetry; stored prompts or outputs; retrieval from private content	Data minimization, pre-call redaction, permission-aware retrieval, explicit retention, adversarial evaluations, runtime monitoring, and a named incident owner
High	Regulated-data adjacent; payment identifiers; broad confidential retrieval; sensitive identity data; or authority to perform a consequential action	Early security, legal, privacy, and data involvement; documented threat model; human approval where an action warrants it; verified containment; and release evidence reviewed before exposure

These lanes are an operating model, not a compliance determination. Applicable controls depend on the actual data, customer contracts, geography, industry, and use case. Security and legal specialists should make those determinations when the capability creates legal, regulatory, or material customer exposure.

Classify the capability, not the entire product. A writing assistant that uses text supplied for a single request may sit in a different lane from an account assistant that searches every customer conversation and updates CRM records, even when both use the same model.

Score the capability across these dimensions:

Data sensitivity: public, internal, confidential, personal, payment-related, or regulated-data adjacent.
Audience: constrained employee group, all employees, authenticated customers, or public users.
Retrieval reach: one supplied record, an authorized account subset, or a broad internal corpus.
Action authority: produces a suggestion, drafts a change, or executes an external action.
Persistence: ephemeral processing, structured event storage, or retained raw inputs and outputs.
Third-party exposure: stays inside your controlled environment or passes through one or more providers and subprocessors.

Use the highest-risk dimension to set the initial lane. Lower it only after a design change removes the exposure. A promise to be careful is not a mitigating control; scoped retrieval, enforced redaction, disabled raw logging, and restricted tool permissions are.
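The "highest dimension wins" rule is simple enough to encode directly. In the sketch below, the value-to-lane mapping is an assumption made for illustration; your own mapping should come from the entry conditions you agree with security and legal owners. `Lane`, `DIMENSION_LANES`, and `initial_lane` are hypothetical names.

```python
from enum import IntEnum


class Lane(IntEnum):
    LOW = 1
    ELEVATED = 2
    HIGH = 3


# Illustrative mapping only; not a compliance determination.
DIMENSION_LANES = {
    "data_sensitivity": {
        "public": Lane.LOW, "internal": Lane.LOW,
        "confidential": Lane.ELEVATED, "personal": Lane.ELEVATED,
        "payment_related": Lane.HIGH, "regulated_adjacent": Lane.HIGH,
    },
    "audience": {
        "employee_group": Lane.LOW, "all_employees": Lane.LOW,
        "authenticated_customers": Lane.ELEVATED, "public_users": Lane.ELEVATED,
    },
    "retrieval_reach": {
        "supplied_record": Lane.LOW, "account_subset": Lane.ELEVATED,
        "broad_corpus": Lane.HIGH,
    },
    "action_authority": {
        "suggestion": Lane.LOW, "draft": Lane.ELEVATED, "execute": Lane.HIGH,
    },
    "persistence": {
        "ephemeral": Lane.LOW, "structured_events": Lane.LOW,
        "raw_retained": Lane.ELEVATED,
    },
    "third_party_exposure": {
        "internal_only": Lane.LOW, "external_provider": Lane.ELEVATED,
    },
}


def initial_lane(capability: dict[str, str]) -> Lane:
    """Return the highest lane implied by any single dimension."""
    missing = DIMENSION_LANES.keys() - capability.keys()
    if missing:
        # An unscored dimension is an unknown, not a low risk.
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    return max(DIMENSION_LANES[dim][capability[dim]] for dim in DIMENSION_LANES)
```

Because the function refuses to score a capability with a missing dimension, a team cannot reach a low lane by leaving a question blank.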

Reclassify when the feature changes its data, audience, retrieval reach, retention, provider, or ability to act. A seemingly small roadmap addition, such as remembering past conversations or connecting a second data source, can change the security posture more than a model upgrade does.

Design the system to disclose less data

The most reliable way to protect data is to keep unnecessary data out of the AI path. Encryption and contractual terms matter, but they do not make an irrelevant customer field necessary. Start with the user outcome and ask which minimum facts the model needs to produce it.

Minimize before you redact

Redaction is a valuable deterministic safeguard, but it should not carry the whole design. Free-form text can contain names, secrets, identifiers, and confidential business information in formats your rules do not recognize. Reduce the payload first, then redact the smaller payload that remains.

Replace a full customer object with the few fields required for the task.
Use a temporary account token when the model does not need a person’s name, email address, or payment identifier.
Convert long interaction histories into purpose-specific structured fields when the task does not require the original prose.
Exclude internal notes, disabled fields, hidden metadata, and unrelated attachments by default.
Log structured events such as policy result, model identifier, latency, and request status when raw prompt text is not required.

Separate identity from content wherever the workflow allows it. The application can retain the relationship between a temporary token and an account while the model processes only the content needed for the task. Access to the token map should remain narrower than access to routine AI telemetry.
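As a rough sketch of both ideas together, the code below builds a model payload from an allowlist of fields and swaps the account identifier for a temporary token. The field names, `TokenMap`, and `build_model_payload` are hypothetical, and the in-memory map stands in for whatever restricted store your application would actually use.

```python
import secrets

# Hypothetical per-task allowlist: only what this task needs.
ALLOWED_FIELDS = {"plan", "open_ticket_count", "last_issue_category"}


class TokenMap:
    """Keeps the token-to-account relationship inside the application."""

    def __init__(self) -> None:
        self._by_token: dict[str, str] = {}

    def issue(self, account_id: str) -> str:
        """Create a random token that carries no identity by itself."""
        token = "acct_" + secrets.token_hex(8)
        self._by_token[token] = account_id
        return token

    def resolve(self, token: str) -> str:
        """Look up the account; access to this should stay narrow."""
        return self._by_token[token]


def build_model_payload(customer: dict, token_map: TokenMap) -> dict:
    """Drop everything not on the allowlist, then attach a temporary token."""
    payload = {k: v for k, v in customer.items() if k in ALLOWED_FIELDS}
    payload["account_token"] = token_map.issue(customer["account_id"])
    return payload
```

An allowlist fails closed: a new field added to the customer object stays out of the model payload until someone decides the task needs it.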

Make retrieval permission-aware

A retrieval-first architecture can keep the raw corpus inside your controlled boundary while selecting only relevant context for a request. It is not automatically private. If an external model receives the selected passages, those passages still cross the boundary and still require minimization, redaction, approved-provider controls, and a clear retention policy.

Apply authorization when the request is made, not only when content is indexed. The retrieval layer should constrain results by tenant, user, role, and current document permissions before any text becomes model context. Do not index content that the eventual searcher could never be allowed to read unless the architecture has another enforceable isolation boundary.
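A minimal sketch of request-time filtering, assuming the retrieval layer returns candidate documents and that current permissions can be looked up from the system of record. `Doc`, `Requester`, and the `current_roles_for` lookup are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Doc:
    doc_id: str
    tenant_id: str
    text: str


@dataclass(frozen=True)
class Requester:
    user_id: str
    tenant_id: str
    roles: frozenset[str]


def authorized_context(
    candidates: Iterable[Doc],
    requester: Requester,
    current_roles_for: Callable[[str], frozenset[str]],
) -> list[Doc]:
    """Keep only documents the requester may read right now.

    Permissions are looked up at request time instead of being trusted
    from index metadata, which may be stale.
    """
    allowed = []
    for doc in candidates:
        if doc.tenant_id != requester.tenant_id:
            continue  # never cross the tenant boundary
        if requester.roles & current_roles_for(doc.doc_id):
            allowed.append(doc)
    return allowed
```

Only the returned list should ever be turned into model context; relevance ranking happens inside the authorized set, not before it.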

Treat embeddings and vector-store metadata as sensitive derived data. A vector is not a magic anonymizer, and metadata can disclose document names, account relationships, categories, or activity patterns even when full text is elsewhere. Your deletion and permission-change process must reach the index, cached results, evaluation copies, and any stored citations, not just the primary database.
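One possible shape for that process is a registry of delete functions, one per derived store, with a status report that makes partial failures visible. This is a sketch under assumptions: `propagate_deletion` and the store names are hypothetical, and real stores would need their own delete calls.

```python
from typing import Callable


def propagate_deletion(
    record_id: str,
    deleters: dict[str, Callable[[str], None]],
) -> dict[str, str]:
    """Call every registered store's delete function and report the result.

    A failure in one store does not stop the others, and it is reported
    so that someone can follow up.
    """
    status = {}
    for store_name, delete in deleters.items():
        try:
            delete(record_id)
            status[store_name] = "deleted"
        except Exception as exc:
            status[store_name] = f"failed: {exc}"
    return status
```

A failed entry in the returned status is exactly the kind of event worth monitoring, because it marks a copy that outlived its source record.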

Retrieved content is also untrusted input. A malicious or compromised document can contain instructions intended to change model behavior. Keep system instructions separate, restrict available tools, validate tool arguments, and enforce authorization in application code. The model should never be the component that decides whether a user may access a record or perform an action.
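To illustrate the last point, here is a small sketch in which application code, not the model, decides whether a proposed tool call may run. The tool name, `TOOL_SPECS`, and the `owner_of` lookup are hypothetical examples.

```python
from typing import Callable, Optional

# Hypothetical registry: permitted tools, their exact argument names,
# and which argument identifies the record being touched.
TOOL_SPECS = {
    "get_order_status": {"args": {"order_id"}, "record_arg": "order_id"},
}


def authorize_tool_call(
    tool_name: str,
    args: dict,
    requester_account_id: str,
    owner_of: Callable[[str], Optional[str]],
) -> tuple[bool, str]:
    """Decide whether a model-proposed tool call may execute."""
    spec = TOOL_SPECS.get(tool_name)
    if spec is None:
        return False, "tool not allowed"
    if set(args) != spec["args"]:
        return False, "unexpected or missing arguments"
    if owner_of(args[spec["record_arg"]]) != requester_account_id:
        return False, "record outside requester scope"
    return True, "ok"
```

Whatever a retrieved document or prompt says, the call runs only if this check passes, so an injected instruction cannot widen the requester's access.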

Place deterministic controls on both sides of the call

Before the call: validate the request schema, remove disallowed fields, redact known sensitive patterns, apply allow and deny policies, and constrain retrieval.
After the call: validate output structure, block disallowed sensitive patterns, verify any cited record belongs to the authorized scope, and check tool arguments before execution.
During operation: monitor unusual prompt, output, retrieval, and access patterns without creating a second uncontrolled store of raw content.

An output filter cannot undo data already disclosed to an external provider. Use post-call checks to protect users and downstream systems, but use pre-call minimization and access enforcement to prevent the disclosure itself.
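The pre-call and post-call checks can be sketched as two small functions. The regular expressions below are simplified assumptions that catch only email addresses and card-like digit runs; they illustrate the mechanism and are not a complete detector, which is why minimization still has to come first.

```python
import re

# Simplified patterns for illustration; real rules need broader coverage.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13 to 16 digits


def redact(text: str) -> str:
    """Pre-call: replace known sensitive patterns before the model sees them."""
    text = EMAIL.sub("[EMAIL]", text)
    return CARD_LIKE.sub("[CARD]", text)


def check_output(output_text: str, cited_ids: set, authorized_ids: set) -> list:
    """Post-call: list reasons the response must not be shown or acted on."""
    problems = []
    if EMAIL.search(output_text) or CARD_LIKE.search(output_text):
        problems.append("sensitive pattern in output")
    outside = set(cited_ids) - set(authorized_ids)
    if outside:
        problems.append(f"cited records outside scope: {sorted(outside)}")
    return problems
```

For example, `redact("Contact jo@example.com about card 4111 1111 1111 1111.")` returns `"Contact [EMAIL] about card [CARD]."`.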

Make vendor approval specific to the intended use

Do not approve an AI vendor in the abstract. Approve a defined service, account configuration, data class, region, retention posture, and use case. A provider suitable for public-content summarization may not be suitable for customer conversations or payment-related identifiers.

Ask questions that produce enforceable answers rather than broad assurances:

Training and service improvement: Can prompts, files, retrieved passages, outputs, feedback, or metadata be used to train models or improve services? Is the restriction a default, a setting, or a contractual term?
Retention: How long does each data type remain in primary systems, safety systems, failure logs, backups, and support tooling? What initiates deletion, and what exceptions apply?
Human access: Under what conditions can provider personnel inspect customer content, and how is that access authorized, logged, and reviewed?
Security controls: Is data encrypted in transit and at rest? What key-management options, private networking, scoped credentials, access logs, and administrative controls are available?
Location and subprocessors: Which regions process and store the data? Where can support access occur? Which subprocessors participate in the path?
Assurance evidence: Which services and controls are covered by SOC 2, ISO 27001, or HIPAA-related commitments where relevant to the use case?
Response: How will the provider communicate a security incident, policy change, model change, or subprocessor change that affects your approved use?

An audit or certification is useful evidence about a defined scope. It is not proof that your architecture, settings, or use case is safe. Confirm that the service named in the evidence is the service your product will actually call, and that your configuration does not bypass the controls you evaluated.

Keep a short decision record with the approved purpose, permitted and prohibited data, named endpoints or services, required account settings, retention terms, region, responsible owner, and review triggers. Reopen the decision when the purpose, data class, provider terms, model path, subprocessor chain, or architecture changes.

A shared catalog of approved providers and patterns also reduces shadow AI. Make the approved route easier to use by supplying scoped credentials, reference architectures, redaction utilities, retrieval patterns, and clear examples of prohibited inputs. Governance works better when the safe path is a usable product for internal teams.

Put the controls into delivery and incident response

A policy that depends on every engineer remembering every rule will drift. Store the capability’s classification, required controls, approved provider configuration, and decision owner alongside the delivery artifacts. Version changes so the team can see when a new data source or retention behavior altered the release posture.

Translate the release lane into automated checks wherever the control can be tested:

Scan prompts, templates, configuration, and code for exposed secrets and unapproved endpoints.
Unit-test redaction and tokenization against representative allowed and disallowed inputs.
Integration-test tenant boundaries, role permissions, retrieval filters, and deletion propagation.
Run evaluations that attempt to elicit restricted data, override instructions, retrieve unauthorized records, or trigger tools outside the allowed scope.
Validate the selected provider, model path, region, logging setting, and retention configuration against the approval record.
Block release when required evidence, monitoring, rollback controls, or an incident owner is missing.
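The last two checks lend themselves to a release gate that compares the deployed configuration with the approval record. The sketch below assumes both are available as simple data; `ApprovalRecord`, the configuration keys, and the evidence names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovalRecord:
    provider: str
    model_path: str
    region: str
    raw_logging_allowed: bool
    max_retention_days: int


REQUIRED_EVIDENCE = (
    "evaluation_results", "monitoring", "rollback_controls", "incident_owner",
)


def release_blockers(approved: ApprovalRecord, deployed: dict, evidence: dict) -> list:
    """Return the reasons to block release; an empty list means proceed."""
    blockers = []
    for key in ("provider", "model_path", "region"):
        if deployed.get(key) != getattr(approved, key):
            blockers.append(
                f"{key}: deployed {deployed.get(key)!r} "
                f"differs from approved {getattr(approved, key)!r}"
            )
    # A missing logging setting is treated as enabled, so the gate fails closed.
    if deployed.get("raw_logging", True) and not approved.raw_logging_allowed:
        blockers.append("raw logging is enabled but not approved")
    retention = deployed.get("retention_days")
    if retention is None or retention > approved.max_retention_days:
        blockers.append("retention is unset or exceeds the approved limit")
    for item in REQUIRED_EVIDENCE:
        if not evidence.get(item):
            blockers.append(f"missing {item}")
    return blockers
```

Run in the delivery pipeline, a non-empty result fails the build and tells the team which approval term the deployment has drifted from.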

Evaluation data needs the same scrutiny as production data. Remove unnecessary identities, restrict access, define retention, and avoid copying raw customer interactions merely because an evaluation system is internal. A test corpus can become a long-lived data store if nobody owns its lifecycle.

Monitor security-relevant events rather than indiscriminately recording content. Useful signals include blocked sensitive-data patterns, denied cross-scope retrieval, calls to unapproved services, unusual access behavior, unexpected changes in model or endpoint usage, and failed retention or deletion jobs. Structured metadata often provides the operational signal you need without preserving every prompt and completion.
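A small sketch of what metadata-only logging could look like. The field allowlist and the `security_event` helper are assumptions for illustration; the point is that the logger rejects anything resembling raw content instead of relying on callers to remember the rule.

```python
import json
from datetime import datetime, timezone

# Hypothetical allowlist of metadata fields; raw prompt or completion
# text has no permitted field to travel in.
ALLOWED_EVENT_FIELDS = frozenset({
    "capability", "policy_result", "model_id",
    "latency_ms", "request_status", "pattern_type",
})


def security_event(event_type: str, **fields) -> str:
    """Serialize a structured security event, refusing unknown fields."""
    unexpected = set(fields) - ALLOWED_EVENT_FIELDS
    if unexpected:
        raise ValueError(f"fields not allowed in telemetry: {sorted(unexpected)}")
    record = {
        "event_type": event_type,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **fields,
    }
    return json.dumps(record, sort_keys=True)
```

A call such as `security_event("blocked_sensitive_pattern", capability="reply_draft", pattern_type="email")` records that a control fired without storing the text that triggered it.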

Prepare containment before the first customer request

Your incident runbook should name the people and mechanisms needed to contain the feature. Depending on the incident, that can include disabling the affected path with a feature flag, revoking or rotating credentials, restricting retrieval, stopping unsafe logging, locating downstream copies, and contacting the provider.

Do not improvise evidence deletion or customer notification during an incident. Security, privacy, and legal owners should determine preservation, notification, and regulatory obligations based on the specific exposure. The product runbook should make those owners reachable and give them an accurate data-flow record, timestamps, affected systems, and containment status.

After containment, update the control that failed: the architecture, automated check, provider setting, policy, runbook, or team guidance. A review that ends with a reminder to be more careful leaves the same mechanism in place.

Key takeaways

Map every copy of the data, including retrieved passages, logs, embeddings, evaluations, caches, and tool results.
Classify individual capabilities by their highest-risk dimension, then attach mandatory controls to the lane.
Minimize fields before redaction, enforce permissions outside the model, and treat derived stores as sensitive.
Approve vendors for a named use, configuration, data class, region, and retention posture rather than issuing blanket approval.
Put redaction, access, retrieval, configuration, evaluation, and release checks into CI/CD.
Design containment and ownership before launch so an incident does not begin with a search for the right people and switches.

Pick one AI capability currently approaching release and produce its request-to-deletion data map. Assign its lane, turn every unknown into an owned backlog item, and automate the first control the team is still checking by hand. That is how security becomes part of product delivery instead of a negotiation at the end.

References

Shivam.Consulting Blog – AI Data Security for Product Teams: Protect Sensitive Product Data Without Slowing Innovation

April 27, 2026
