Anthropic’s new report on the first documented AI-orchestrated cyber-espionage campaign is one of the most important security publications of the year – maybe ever. Not because of the tooling. Not because of the victim profile. But because it demonstrates a shift many have anticipated:

Attackers are now running large portions of an intrusion lifecycle through agentic AI systems.

GTG-1002, a nation-state threat actor, used an agentic framework built around a frontier model to drive reconnaissance, vulnerability discovery, credential harvesting, lateral movement, data collection, and documentation.

This wasn’t an experiment. It was an operational campaign — with the AI handling most tactical decision-making and execution.

Below, we unpack the technical structure of the attack and what defenders must learn from it.

1. The Attack Overview: A Quick Review of Anthropic’s Findings

Anthropic observed an attacker using a custom agentic framework integrated with a frontier model (Claude Code). This framework allowed the model to:

  • Break complex strategy into tactical steps
  • Execute tasks via expert sub-agents
  • Make iterative decisions and adapt its plan
  • Chain multi-step actions
  • Produce operational documentation

In effect, the threat actor built a multi-agent system where the model acted as the operator, and the humans acted as strategists.

This flipped the traditional offensive model.

2. How the Attacker Built Their Agentic System

The attacker leveraged the Model Context Protocol (MCP) — a standardized interface for connecting AI models to tools and environments. Their setup functioned similarly to how developers use MCP for automation:

Tooling & Environment Setup

  • Reconnaissance utilities
  • Vulnerability scanners
  • Remote command interfaces
  • Code analysis tools
  • Browser automation tools
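The attacker's actual framework isn't public, but the core idea of an MCP-style tool layer is simple: the model never executes anything directly — it emits a tool name plus arguments, and a harness dispatches the call. The sketch below illustrates that dispatch pattern with a minimal registry; the tool names and return shapes are illustrative assumptions, not details from the report.

```python
# Illustrative sketch of an MCP-style tool layer: the model proposes a tool
# call by name, and the harness looks it up and executes it. The "port_scan"
# tool here is a hypothetical stub returning canned data.
from typing import Any, Callable, Dict

class ToolRegistry:
    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Any:
        # Dispatch a model-requested tool call to the registered implementation.
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)

registry = ToolRegistry()
registry.register("port_scan", lambda host: {"host": host, "open_ports": [22, 443]})
scan = registry.call("port_scan", host="192.0.2.10")
```

The same registry can front any of the tool categories above — the model only ever sees names and structured arguments, which is exactly what makes the orchestration generic.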

AI-Orchestrated Workflow

The model was provided with:

  • The target name
  • Access to tools via MCP interfaces
  • Carefully structured prompts defining each task as routine technical objectives
  • An orchestration system to manage attack stage and phase transitions
  • Persistent memory for maintaining multi-day operational context
  • Human checkpoints for validation

Execution Pattern

  1. Plan a multi-step objective that agents can pursue in parallel, using inputs from previous steps
  2. Call a tool prescribed by the objective and agent type used
  3. Read the output and analyze findings and variables
  4. Adapt the plan and automatically make adjustments based on gathered information
  5. Repeat

This is the foundational pattern of agentic systems — and it worked.
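That five-step pattern can be sketched as a short loop. This is a generic agent-loop skeleton, not the attacker's code: the planner and tools below are stubs, and a real system would put a model behind `plan()` and real integrations behind the tool callables.

```python
# Minimal sketch of the plan -> call tool -> read output -> adapt loop.
# plan() receives the accumulated history and returns the next steps,
# so each iteration adapts to what earlier tool calls found.
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class Step:
    tool: str
    args: dict

def run_agent_loop(plan: Callable[[list], List[Step]],
                   tools: Dict[str, Callable[..., dict]],
                   max_iterations: int = 10) -> List[Tuple[str, dict]]:
    history: List[Tuple[str, dict]] = []
    for _ in range(max_iterations):
        steps = plan(history)          # 1. plan next steps from context
        if not steps:                  # planner signals completion
            break
        for step in steps:
            output = tools[step.tool](**step.args)   # 2. call the tool
            history.append((step.tool, output))      # 3. record findings
        # 4. the next plan() call adapts based on the updated history
    return history                     # 5. repeat until done

# Demo planner: issue one step, then stop once there are findings.
def demo_plan(history: list) -> List[Step]:
    return [] if history else [Step("echo", {"msg": "recon complete"})]

trace = run_agent_loop(demo_plan, {"echo": lambda msg: {"msg": msg}})
```

The `max_iterations` cap matters: without it, an error cascade (see section 4) can loop indefinitely on its own bad assumptions.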

3. Which Parts of the Kill Chain Were Automated

Anthropic found that the AI system, consisting of separate but coordinated AI agents, executed a substantial portion of each phase.

Reconnaissance

  • Domain scanning
  • Service enumeration
  • Fingerprinting
  • Identifying exposed assets

Vulnerability Discovery

  • Vulnerability analysis and research
  • Exploit development and payload generation
  • Exploitation validation and internal system enumeration

Exploitation

  • Running targeted commands
  • Validating remote access

Lateral Movement

  • Searching for reachable systems
  • Attempting system-to-system authentication
  • Using discovered credentials automatically

Data Gathering

  • Directory searches
  • Credential harvesting
  • File retrieval
  • Content summarization

Documentation

  • Summaries of access vectors
  • Cataloging what was found
  • Recommendations for next steps

This is one of the most striking parts of the report:

The AI agents not only executed the attack — they documented it.

Something SOC analysts spend hours upon hours doing manually.

And the cold reality is that there’s even more that attackers could leverage with agentic AI in terms of tradecraft for evasion and obfuscation, as well as tool development and repurposing.

4. Where the AI System Struggled — and Why It Still Succeeded

Even though the system was highly effective, Anthropic noted several limitations:

Hallucinated Credentials

The model occasionally invented credentials it thought “might work.”

Overconfidence

It presented publicly available information as if it were critical new discoveries.

Human Intervention Required

Operators had to separate real output from hallucinated output to validate claimed results. Humans also remained in the loop for strategic guidance.

Error Cascades

Bad assumptions sometimes built on one another, compounding errors across later steps.

But here’s the critical point:

Even with these failure modes, the attack succeeded because the AI operated faster than human defenders and covered more ground.

Imperfect autonomy is still dangerous autonomy.

5. How Defenders Can Mirror This Architecture

If attackers are using agentic systems, defenders must adopt their own.

Core Architectural Patterns to Adopt

  • Multi-agent workflows for triage, intel, IR, vulnerability analysis
  • Context containers for each alert or incident
  • Long-term memory structures that persist across cases
  • AI-driven task execution, not just analysis
  • Supervision loops where humans approve key steps
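The last pattern — supervision loops — is the one worth implementing first. The sketch below shows one way to gate sensitive actions behind human approval; the action names and the approval callback are illustrative assumptions, not a specific product's API.

```python
# Sketch of a supervision loop: agents may propose any action, but actions
# on a sensitive list require explicit human approval before execution.
from typing import Callable

# Hypothetical set of actions considered too risky to run unattended.
SENSITIVE_ACTIONS = {"isolate_host", "disable_account", "block_ip"}

def execute_action(action: str, params: dict,
                   approve: Callable[[str, dict], bool],
                   runner: Callable[[str, dict], str]) -> str:
    # Route sensitive actions through the human approval callback.
    if action in SENSITIVE_ACTIONS and not approve(action, params):
        return "skipped: awaiting human approval"
    return runner(action, params)

blocked = execute_action("isolate_host", {"host": "srv-01"},
                         approve=lambda a, p: False,
                         runner=lambda a, p: f"executed {a}")
allowed = execute_action("enrich_alert", {"id": 42},
                         approve=lambda a, p: False,
                         runner=lambda a, p: f"executed {a}")
```

Low-risk actions (enrichment, documentation) flow through automatically; anything that changes production state waits for a person.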

Critical Defenses That Map Directly to the Attack Lifecycle

  1. Continuous asset discovery
  2. Agentic vulnerability enrichment
  3. Automated initial triage of anomalies
  4. Rapid correlation of system-to-system activity
  5. Task-level execution in investigations
  6. Automatic incident documentation
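Item 6 is the easiest of these to stand up today, because it mirrors what the attackers' agents already did. A minimal sketch, assuming a simple timeline of (timestamp, source, finding) tuples — the field names and output format are choices for illustration:

```python
# Sketch of automatic incident documentation: render agent findings
# collected during an investigation into a markdown summary a human
# can review and attach to the case.
from typing import List, Tuple

def document_incident(incident_id: str,
                      events: List[Tuple[str, str, str]]) -> str:
    lines = [f"# Incident {incident_id}", "", "## Timeline"]
    for timestamp, source, finding in events:
        lines.append(f"- {timestamp} [{source}] {finding}")
    return "\n".join(lines)

report = document_incident("IR-0042", [
    ("10:02", "triage-agent", "impossible-travel login for user jdoe"),
    ("10:05", "intel-agent", "source IP linked to known scanning infra"),
])
```

The point is not the formatting — it's that documentation becomes a byproduct of the investigation rather than an after-hours chore.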

This is not about replacing humans.

It’s about matching the speed and scale of an agentic offense.

6. A Practical Architecture for an Agentic SOC

A modern SOC architecture will include:

1. Multi-Agent Teams

Each agent specializes — triage, intel, documentation, enrichment, IR.

2. Persistent Memory & Shared Context Engine

Agents see the same case data, learn from prior actions, and coordinate.

3. Event-Driven Execution Model

Agents take action when something happens — not only when a human asks them to.

4. Supervision & Guardrails

Humans approve sensitive actions or escalations.

5. Integration With Existing Tools

Agentic workflows ride on top of SIEM, EDR, ticketing, and cloud logs.

This architecture mirrors what attackers are doing — but in a controlled, defensive structure that amplifies human analysts.
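The five components above hinge on the event-driven model in component 3: alerts are published to a bus and routed to whichever specialist agents subscribe to that event type. A minimal sketch, assuming dictionary-shaped events with a `type` field — the agent roles and event fields are illustrative, not tied to any specific SIEM:

```python
# Sketch of an event-driven dispatch layer for an agentic SOC: publishers
# (SIEM, EDR, cloud logs) emit events, and specialist agents subscribe
# by event type. Shared case state lives outside the bus.
from collections import defaultdict
from typing import Callable, Dict, List

class EventBus:
    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: dict) -> None:
        # Fan the event out to every agent subscribed to its type.
        for handler in self._handlers[event.get("type", "")]:
            handler(event)

case_context: List[str] = []          # shared context for one case
bus = EventBus()
# Hypothetical triage agent: records a summary of each alert it sees.
bus.subscribe("alert", lambda e: case_context.append(e["summary"]))
bus.publish({"type": "alert", "summary": "impossible travel for user jdoe"})
```

In production the bus would be a queue (Kafka, SQS, or similar) and the handlers would be full agents, but the routing shape is the same.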

7. Recommendations for Organizations Today

Most SOCs can begin implementing agentic workflows gradually.

Start with:

  • Tier 1 alert triage and detection rule optimization
  • Threat intel retrieval and summarization
  • Incident documentation
  • Vulnerability enrichment

These domains benefit the most from automation and can be executed safely.

Avoid at first:

  • Full system-level autonomous actions
  • Firewall rule changes
  • Identity provisioning
  • Live-response modifications

Start with the workflows where agents add immediate value and minimal risk.

Conclusion: Offense Has Moved On — Defense Must Catch Up

Anthropic’s report isn’t a theoretical warning — it’s concrete evidence that AI has already reshaped the threat landscape.

Attackers are using agentic systems to run large portions of the intrusion lifecycle. They’re faster, more consistent, and able to scale beyond human limitations.

Defenders must adapt by building agentic defensive architectures that mirror the speed and intelligence of modern offense. Not fully autonomous SOCs — but human-led, machine-operated workflows built around specialized agents.

This shift won’t happen overnight, but it has to start now.

The organizations that take the first steps toward agentic defense will be the ones prepared for what comes next.
