I spent years in incident response, some of it at Mandiant and some in government, showing up on an organization's worst day to answer one question while the clock ran: what happened, and how do we keep it from getting worse?
That question has always been hard. Enterprise AI agents are about to make it categorically harder. The industry is deep in the conversation about how to deploy these agents and barely talking about the other side. When one causes an incident, what does response actually look like? In the CISO rooms I sit in around the world, almost nobody has a real answer, and the time to build one is running out.
The IR Playbook Was Not Written for This
Traditional security IR assumes an actor who has made a decision or taken an action. With human incidents, we work backward from the action to intent, scope, and blast radius. For machine incidents, we find the tool, trace its run, and shut it down.
Agents break both models at once. An agent reasons, adapts, and makes judgment calls on context that looked legitimate in the moment. It moves across Finance, HR, Operations, and IT, mostly with no human watching, calling other agents and inheriting context from workflows nobody built it for. So the questions I relied on throughout my career no longer line up.
Who is the actor? The agent. Also, the supervisor agent that handed it the task. Also, the employee whose context it inherited three hops back. Also, the team that stood it up six weeks ago under conditions that no longer exist.
What was the intent? The agent stayed inside its permissions. Whether what it did matched the intent that authorized it is a separate question, and answering it means tracing a chain of context that may have degraded with every hop.
How do you contain it? Kill the agent? The model? The whole chain? What falls over downstream when you pull the thread?
These are the questions your team will stare at the first time an agent causes a serious incident. The only thing within your control is whether you built the means to answer them in advance.
Forensics works because actions leave evidence that rebuilds into a timeline. Agentic AI breaks that quietly. The trail is still there, but it doesn't tell you what you need to know. You can capture every action, API call, and tool invocation and still have no idea whether the step at hop three lined up with the human intent at hop one. Intent doesn't show up as a field in a log.
This is the part that keeps me and most of my fellow security leaders up at night. A chain of agents produces an outcome. The logs show every step. Nobody signed off on that outcome. So which step was the failure: how the agent was provisioned, drift in the middle, or a delegation that stayed inside the letter of the policy while blowing past its spirit?
Logs alone won't answer that. You need a running record of behavioral context: what the agent was trying to do, what it believed it was allowed to do, and how that compared to its normal behavior at each step. That is why the SANS AI Security Maturity Model calls for structured logging with trace IDs across agent steps and adds reasoning traceability and decision-audit artifacts at higher maturity levels. Skip it, and you're left holding an audit trail when the moment demands forensics.
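To make that concrete, here is a minimal sketch of what a per-step record with a shared trace ID might look like. It is an illustration, not the Maturity Model's schema: every field name (`stated_goal`, `believed_scope`, `hop`, and so on) is an assumption chosen for this example.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AgentStepRecord:
    """One step in an agent chain. All field names are hypothetical."""
    trace_id: str          # shared by every step in the same chain
    hop: int               # 1 = the step closest to the original human request
    agent_id: str
    action: str            # what the agent actually did
    stated_goal: str       # what the agent was trying to do
    believed_scope: list   # what it believed it was allowed to do
    parent_step_id: Optional[str] = None
    step_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_log_line(record):
    """Serialize one step as a single structured (JSON) log line."""
    return json.dumps(asdict(record), sort_keys=True)


def rebuild_chain(records, trace_id):
    """Return the steps that share a trace ID, ordered from hop 1 outward."""
    matching = [r for r in records if r.trace_id == trace_id]
    return sorted(matching, key=lambda r: r.hop)
```

With records like these, a responder can pull every step for one trace and compare the stated goal at hop three against the request at hop one, which a bare list of API calls would not allow.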
Containment is a special beast in the context of IR. Stopping the bleeding without doing more damage is one of the cornerstones of good IR.
A compromised endpoint comes off the network. A malicious process gets killed. We can predict what happens next before we act. An agent sewn into Finance, HR, and Operations is a different animal. Pull it mid-workflow, and dependent agents fail silently, transactions are left half-finished, and compliance obligations get tangled in automation that never completes. The responder's instinct to isolate runs straight into the reality that this thing is load-bearing for the business.
You cannot design containment for agentic AI in the middle of the incident. The organizations that get this right will already know their agent dependencies and blast radius, and have the governance to make a clean, targeted cut instead of yanking everything offline.
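Knowing the blast radius ahead of time can start as something as simple as a dependency map you can query. The sketch below assumes a hypothetical map from each agent to the agents that consume its output; the agent names are invented for illustration.

```python
from collections import deque

# Hypothetical dependency map: agent -> agents that consume its output.
DEPENDENTS = {
    "invoice-agent": ["payments-agent", "audit-agent"],
    "payments-agent": ["ledger-agent"],
    "audit-agent": [],
    "ledger-agent": [],
    "onboarding-agent": ["payroll-agent"],
    "payroll-agent": ["ledger-agent"],
}


def blast_radius(dependents, agent):
    """Return every downstream agent affected if `agent` is pulled offline."""
    affected = set()
    queue = deque([agent])
    while queue:
        current = queue.popleft()
        for downstream in dependents.get(current, []):
            # Skip agents already counted, and the starting agent itself
            # in case the map contains a cycle.
            if downstream not in affected and downstream != agent:
                affected.add(downstream)
                queue.append(downstream)
    return affected


if __name__ == "__main__":
    print(sorted(blast_radius(DEPENDENTS, "invoice-agent")))
```

A real environment would build this map from discovery and runtime telemetry instead of a hand-written dictionary, but the question it answers is the same: what falls over if I cut here?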
Before you contain anything, you have to answer a question that sounds basic and isn't. Who, or what, are you dealing with?
Human identity holds reasonably still: a role, entitlements, a baseline. Agents don't work that way. Their scope changes with every task, they absorb context from whatever called them, and their runtime behavior can drift far from how they looked at provisioning. In a multi-agent chain, identity erodes with every hop.
So, agent identity was never a governance checkbox to me. I treat it as the thing response depends on. You can't investigate what you can't identify, contain what has no clear identity boundary, or assign accountability if the record doesn't show who owned which action and when. That's why I wrote two ideas into the Maturity Model: the Principle of Least Agency (prove the autonomy is genuinely needed before you hand it over) and a distinct Non-Human Identity with a named human owner for every deployed agent. Without that, response turns into archaeology.
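One way to picture those two ideas is as a record that refuses to pass review unless it names an owner and justifies its autonomy. This is a sketch under assumptions: the `NonHumanIdentity` type, its fields, and the checks are hypothetical, not taken from the Maturity Model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NonHumanIdentity:
    """Hypothetical identity record for one deployed agent."""
    agent_id: str
    human_owner: str            # a named person, not a team alias
    justification: str          # why this autonomy is needed (Least Agency)
    allowed_actions: frozenset  # explicit scope granted at provisioning


def validate_identity(nhi):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if not nhi.human_owner.strip():
        problems.append("no named human owner")
    if not nhi.justification.strip():
        problems.append("no justification for autonomy")
    if not nhi.allowed_actions:
        problems.append("no explicit action scope")
    return problems


if __name__ == "__main__":
    agent = NonHumanIdentity(
        agent_id="invoice-agent",
        human_owner="",
        justification="Matches invoices to purchase orders overnight",
        allowed_actions=frozenset({"invoices:read"}),
    )
    print(validate_identity(agent))
```

Running a check like this at provisioning time means the ownership question is answered before an incident, not reconstructed during one.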
I served in the Marine Corps, and the military has governed autonomous systems for decades: drones, automated targeting, and command-and-control that makes consequential calls faster than a human can react. Nobody fielded those and crossed their fingers. We wrapped them in governance that settled the hard questions first. Clear chain of command. Defined rules of engagement. Accountability at every rung. Human authorization above a set risk threshold. Response protocols written for autonomous actors, not borrowed from the ones built for people.
Enterprise AI is rolling out autonomous agents at scale with almost no scaffolding. The move that matters is building the command-and-control layer alongside the deployment, before the first serious incident forces the conversation the hard way.
Most organizations won't build agentic IR capability until they're forced to. Agentic AI changes the math because of speed. An autonomous agent loose across enterprise systems, no human in the loop, in an environment that was never properly governed, can rack up an enormous blast radius before anyone notices.
The teams that come through intact will be the ones who did the work early: real visibility into every agent, continuous behavioral baselines, audit trails that capture intent and context, and playbooks built for autonomous actors with containment steps that respect how deeply these agents are wired into operations. It also means settling ownership before the incident. The Maturity Model draws a hard line: security incidents like prompt injection, data exfiltration, and model theft belong to Security, while safety and reliability incidents like bias, hallucination, and legal exposure belong to Legal and Risk. Working that out at two in the morning while something is on fire is not a plan.
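The ownership split described above can be written down as a lookup table long before anything is on fire. The sketch below is illustrative only: the key names and the `owner_for` helper are assumptions, and the categories are just the examples named in this article.

```python
# Hypothetical routing table built from the examples named above.
INCIDENT_OWNER = {
    "prompt_injection": "Security",
    "data_exfiltration": "Security",
    "model_theft": "Security",
    "bias": "Legal and Risk",
    "hallucination": "Legal and Risk",
    "legal_exposure": "Legal and Risk",
}


def owner_for(incident_type):
    """Return the owning function, failing loudly for unmapped types."""
    try:
        return INCIDENT_OWNER[incident_type]
    except KeyError:
        raise ValueError(
            f"no owner assigned for incident type: {incident_type!r}"
        ) from None
```

Failing loudly on an unmapped type is a deliberate choice here: a gap in the table is something to find in a tabletop exercise, not during a live incident.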
Let me be straightforward. Slowing AI adoption was never the play. The gains from agents deployed well are real, and the organizations that deploy them safely will pull ahead.
What I'm pushing for is the governance underneath it: visibility, behavioral context, dependency mapping, and playbooks built for autonomous actors. That foundation does more than make response possible. It lets a team deploy with confidence, because they can see what their agents are doing and step in when something breaks. It's the same reason governance sits at the floor of the Maturity Model. Overall maturity can never climb more than a single stage above governance, because without that base every other strength just piles up unmanaged risk.
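The governance floor is easy to express as arithmetic. The sketch below assumes stages are numbered with integers and that some other process has already produced an uncapped overall stage; both the function name and that numbering are assumptions for illustration.

```python
def capped_overall_stage(uncapped_stage, governance_stage):
    """Cap overall maturity at one stage above the governance stage."""
    return min(uncapped_stage, governance_stage + 1)


if __name__ == "__main__":
    # Strong everywhere else, weak on governance: overall is held to stage 2.
    print(capped_overall_stage(uncapped_stage=4, governance_stage=1))
```

Under this rule, raising governance is the only way to unlock credit for strengths that already exist elsewhere.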
We've poured our energy into how to stand these agents up. The harder, more urgent question is how we respond the day one goes sideways. That day is coming. The only thing left to decide is whether you're ready when it does.
Chris Cochran is the Field CISO & VP of AI Security at SANS Institute. A Marine Corps veteran and former leader at Netflix, Mandiant, the NSA, and Axonius, Chris has spent his career at the intersection of operational cyber defense and emerging technology risk.