Anthropic's LLM ATT&CK Navigator & ARiES

Anthropic's Frontier Red Team just published the kind of thing defenders have been asking AI companies for since ChatGPT shipped. They took a year of their own abuse data, mapped it onto MITRE ATT&CK, the framework we already use to talk about attacks, and put the whole thing behind an interactive heatmap anyone can click through. It went live on June 4, 2026, and they call it the LLM ATT&CK Navigator.

What makes it worth a look isn't the packaging. It's the vantage point. Almost everything written about "AI-enabled attacks" is inferred from the outside: a researcher finds a suspicious artifact and guesses a chatbot helped build it. Anthropic is reporting from the other side of the prompt. They saw the actual conversations, the techniques being requested, and how far each operation got. No external analyst has that view.

What the dataset actually is

The team reviewed 832 accounts banned for breaking the cyber-related parts of Anthropic's Usage Policy between March 2025 and March 2026. They logged 13,873 individual malicious actions across those accounts and mapped every one to MITRE ATT&CK version 18. That works out to 482 distinct techniques touching all 14 ATT&CK tactics. Put plainly: people are leaning on AI across basically the entire attack matrix, not just the recon and phishing steps everyone assumes.

A handful of numbers set the scene. Two-thirds of the banned accounts (560 of 832, or 67.3%) used the model for capability development, which mostly meant malware. A striking 84.4% touched defense evasion, asking the model to help their code or activity slip past detection. Over the year, AI-assisted account and credential discovery rose about 8.9% while phishing-based initial access dropped about 8.6%, a quiet sign of where the effort is shifting. The typical actor reached for somewhere between 16 and 20 distinct techniques, which is a real chunk of an end-to-end operation rather than a one-off question.

One trend matters more than the rest. The share of actors Anthropic rated medium-risk or higher went from 33% in the first half of the year to 56% in the second. That's roughly a 1.7x jump in relative terms, or 23 percentage points, within twelve months. The share of actors doing serious harm with the tool is growing, and growing quickly.
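The headline ratios above are easy to sanity-check. A quick sketch in Python, using only the figures quoted in this section (the variable names are mine, not Anthropic's):

```python
# Figures quoted in the text; variable names are illustrative.
capability_dev_accounts = 560
total_banned_accounts = 832

share = capability_dev_accounts / total_banned_accounts
print(f"capability development: {share:.1%}")  # 67.3%

medium_plus_h1 = 0.33  # share rated medium-risk or higher, first half
medium_plus_h2 = 0.56  # same share, second half

ratio = medium_plus_h2 / medium_plus_h1
point_change = (medium_plus_h2 - medium_plus_h1) * 100
print(f"relative increase: {ratio:.1f}x")                          # 1.7x
print(f"absolute increase: {point_change:.0f} percentage points")  # 23
```

Keeping the relative and absolute change side by side is a useful habit with numbers like these, since "1.7x" and "23 points" describe the same shift but land very differently in a slide deck.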

ARiES: scoring the actor, not the prompt

The part I keep coming back to is the scoring method, the AI Risk Enablement Score, or ARiES. Rather than asking whether a single question was a bad one, it rates the whole actor on a 0 to 100 scale. Three things feed in. Threat is worth up to 35 points and covers intent, sophistication, and evasion. Vulnerability is another 35 and captures how much the model could genuinely enable the requested harm, plus how risky the interface was. Impact rounds it out at 30 and weighs the real or potential consequences out in the world.
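To make the structure concrete, here is a minimal sketch of a three-part score with those caps. The 35/35/30 split comes from the description above; everything else is an assumption for illustration, including the `AriesScore` class name, the idea that the components are simply added, and the sample values. Anthropic's write-up, as summarized here, does not spell out how each component is derived.

```python
from dataclasses import dataclass

# Component caps as described in the text: 35 + 35 + 30 = 100.
THREAT_MAX = 35
VULNERABILITY_MAX = 35
IMPACT_MAX = 30


@dataclass(frozen=True)
class AriesScore:
    """Hypothetical container for one actor's component scores."""
    threat: float         # intent, sophistication, evasion
    vulnerability: float  # model enablement plus interface risk
    impact: float         # real or potential consequences

    def __post_init__(self) -> None:
        for name, value, cap in (
            ("threat", self.threat, THREAT_MAX),
            ("vulnerability", self.vulnerability, VULNERABILITY_MAX),
            ("impact", self.impact, IMPACT_MAX),
        ):
            if not 0 <= value <= cap:
                raise ValueError(f"{name} must be between 0 and {cap}, got {value}")

    @property
    def total(self) -> float:
        # Assumption: the components are added to reach the 0 to 100 scale.
        return self.threat + self.vulnerability + self.impact


# Made-up component values, for illustration only.
hobbyist = AriesScore(threat=8, vulnerability=10, impact=3)
orchestrated = AriesScore(threat=35, vulnerability=35, impact=30)
print(hobbyist.total)      # 21
print(orchestrated.total)  # 100
```

The point of the shape is that no single prompt appears anywhere in it: the unit being scored is the actor.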

That distinction does a lot of work. It pulls apart "someone typed something nasty" from "someone is running a live operation and the model is materially helping." A teenager pasting in a malware tutorial and a state team coordinating an intrusion can ask for surprisingly similar things. ARiES is built to tell them apart by weighing capability and consequence instead of keywords.

What the Navigator lets you see

The tool itself is a familiar ATT&CK matrix with every technique cell shaded by the data. You can recolor it three ways, and which one you pick changes the story. "Raw mean ARiES" shows which techniques tend to show up alongside the most dangerous actors. "Adjusted," which multiplies that mean by prevalence, surfaces the techniques that are both dangerous and common, so you're not chasing a rare outlier. "Percentage of banned accounts" is plain popularity, regardless of how risky the actor turned out to be.
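A small sketch shows how the three views can diverge on the same data. The actor scores and technique assignments below are invented, and `technique_views` is a hypothetical helper. The only thing taken from the description above is the relationship between the views: a per-technique mean, that mean multiplied by prevalence, and plain prevalence.

```python
from collections import defaultdict
from statistics import mean

# Toy data: (ARiES score, techniques used) per banned account. All values invented.
actors = [
    (20, {"T1566", "T1087"}),
    (30, {"T1566"}),
    (90, {"T1087", "T1021"}),
    (40, {"T1566", "T1087"}),
]


def technique_views(actors):
    """Return {technique: (raw_mean, adjusted, pct_accounts)}."""
    scores_by_technique = defaultdict(list)
    for score, techniques in actors:
        for technique in techniques:
            scores_by_technique[technique].append(score)

    views = {}
    for technique, scores in scores_by_technique.items():
        raw_mean = mean(scores)                  # "Raw mean ARiES"
        prevalence = len(scores) / len(actors)   # fraction of accounts
        views[technique] = (raw_mean, raw_mean * prevalence, prevalence * 100)
    return views


for technique, (raw, adjusted, pct) in sorted(technique_views(actors).items()):
    print(f"{technique}: raw={raw:.1f} adjusted={adjusted:.1f} accounts={pct:.0f}%")
```

In this toy set, T1021 tops the raw view because its single user scored 90, but it falls back once prevalence is factored in, while T1087 leads the adjusted view by being both fairly risky and common.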

Flipping between "common" and "high-risk" is really the point of the thing. Some techniques glow on prevalence but stay dark on risk: lots of people ask, few of them are serious. Others are the opposite, rare but almost always tied to the worst actors. If you tune detections or decide where controls go first, that's the distinction you actually care about.

The finding that should change how you read AI threats

Here's Anthropic's least intuitive conclusion: the highest-risk actors don't stand out by volume. They don't necessarily run more techniques or fire off more requests than a middling actor. What separates them is orchestration, how completely they hand the operation over to the model.

The example that makes it concrete is a state-linked espionage campaign Anthropic disrupted in November 2025. On paper it looked ordinary, about 30 techniques across 13 tactics, right in line with plenty of medium-risk actors. It still scored a perfect 100. The operators had wired Claude Code into the attack chain and let it run as much as 90% of the operation on its own, executing commands, exploiting vulnerabilities, stealing credentials, and making tactical calls, only pulling in a human at a few decision points. The threat wasn't the list of techniques. It was a machine stitching them together at speed.

Three takeaways fall out of that, and they're the ones worth sitting with. First, AI has moved into the later, harder stages of an attack. Post-compromise tradecraft like lateral movement, privilege escalation, and defense evasion used to keep less-skilled attackers out by sheer difficulty; the model lowers that bar, so people who couldn't have gotten there alone now reach deeper into a network. Second, attacks are getting more autonomous, and the old habit of triaging risk by technique count or alert volume quietly misses the actor who automates the whole kill chain. Third, ATT&CK has a blind spot: it has no real vocabulary for agentic, AI-driven behavior. Anthropic says it's talking to MITRE about adding one, which is a polite way of admitting the defender's map now trails the ground.

Why it matters past the AI-safety crowd

It would be easy to file this under "interesting research" and move on. I'd push back on that, for a few reasons.

For one, this is defender-grade intelligence flowing into the channels you already read. Anthropic fed these findings into the Verizon 2026 Data Breach Investigations Report, one of the most-cited documents in the field. Telemetry from inside the model is now shaping the same report your team uses to plan a year of work.

It also puts numbers on the "democratization" everyone has been hand-waving at. "AI helps unskilled attackers" has been a talking point for two years with little behind it. This is the first time a model provider has measured it from the inside: a 1.7x rise in the share of actors rated medium-risk or higher, and advanced post-compromise moves turning up in hands that couldn't have pulled them off unaided. The floor is rising, and now we can see by how much.

And it points you somewhere specific. If what marks the worst actors is autonomous orchestration rather than exotic techniques, the signal you want is behavioral. Think machine-speed action sequences, an operator that never seems to pause, tool use that runs straight from reconnaissance into exploitation with none of the usual human cadence in between. That's a detection-engineering brief, and it holds whether you're a Fortune 500 SOC or a small business relying on a managed provider. The techniques in the matrix are the same ones you already defend against, from Active Directory abuse to credential theft to defense evasion. What's changed is the speed, and how little skill it now takes to run them.
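As a starting point for that brief, here is a deliberately simple sketch of a pacing heuristic: flag a session when actions arrive at machine speed and span several tactics. Nothing here comes from Anthropic's methodology. The event format, the `looks_machine_paced` name, and all three thresholds are assumptions you would need to tune against your own telemetry.

```python
from statistics import median

# Hypothetical thresholds; tune against your own telemetry.
MAX_MEDIAN_GAP_SECONDS = 2.0
MIN_DISTINCT_TACTICS = 3
MIN_EVENTS = 5


def looks_machine_paced(events):
    """events: list of (unix_timestamp, tactic_name) for one session or host."""
    if len(events) < MIN_EVENTS:
        return False
    ordered = sorted(events)
    # Time between each action and the one that follows it.
    gaps = [later[0] - earlier[0] for earlier, later in zip(ordered, ordered[1:])]
    distinct_tactics = {tactic for _, tactic in ordered}
    return (
        median(gaps) <= MAX_MEDIAN_GAP_SECONDS
        and len(distinct_tactics) >= MIN_DISTINCT_TACTICS
    )


automated = [
    (0.0, "reconnaissance"), (0.8, "reconnaissance"), (1.5, "credential-access"),
    (2.1, "credential-access"), (3.0, "lateral-movement"), (3.6, "lateral-movement"),
]
hands_on = [
    (0, "reconnaissance"), (95, "reconnaissance"), (400, "credential-access"),
    (1260, "credential-access"), (2900, "lateral-movement"),
]
print(looks_machine_paced(automated))  # True
print(looks_machine_paced(hands_on))   # False
```

Using the median gap rather than the mean keeps one long pause, such as a human approval step, from masking an otherwise automated run. A real detection would also have to separate attacker automation from legitimate scripts and scanners, which this sketch makes no attempt to do.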

A few honest caveats

A primary source isn't an unbiased one, and it's worth reading the Navigator with that in mind. The dataset is everything that tripped Anthropic's own enforcement, so it's shaped by what they catch and ban; the actors who slipped past detection or ran their own self-hosted models are, almost by definition, underrepresented. ARiES leans on human judgment about intent and impact, which brings the usual subjectivity. And it's one company's window onto one model family, a slice of the picture rather than the whole of it. None of that takes away from the value. It just means you should treat the heatmap as a strong signal to corroborate, not a census.

Worth an hour of your team's time

Most AI-security content is projection. This is concrete, interactive, and built on real data, which makes it rare. Open it, flip through the three colorings, and watch which techniques are merely popular versus which ones travel with the dangerous actors. Then sit with the question it's clearly designed to raise: if an adversary can now run most of an operation through a model, are your detections watching for techniques, or for the machine-speed chaining of them? That's the shift this dataset captures, and it's the one defenders will be adjusting to for years.

Sources: Anthropic, LLM ATT&CK Navigator and "What we learned mapping a year's worth of AI-enabled cyber threats." Figures cited reflect Anthropic's published analysis of 832 banned accounts (March 2025–March 2026) and the Verizon 2026 DBIR.

Do your detections catch technique-chaining, or only individual techniques?

The shift this dataset captures is machine-speed orchestration: an attacker running most of an operation through a model. We help teams pressure-test whether their detection and response keep up, from Active Directory abuse to credential theft to defense evasion. Book a 1-on-1 session to talk through your environment.
