An AI Found a 27-Year-Old Bug That Every Human Missed. Then It Wrote the Exploit.
Anthropic's Claude Mythos Preview jumped from near-zero exploit success to autonomously finding and weaponizing zero-day vulnerabilities in every major operating system and browser. One run costs $20. The same exploit on the open market sells for $5 million. The security industry just got a 250,000x cost compression.
Two out of several hundred. That was Claude Opus 4.6's success rate at turning known Firefox vulnerabilities into working exploits when Anthropic tested it last month. The model could find bugs. It could not weaponize them. Anthropic's own researchers described the gap in March: "Opus 4.6 is currently far better at identifying and fixing vulnerabilities than at exploiting them."
One month later, Anthropic published testing results for its successor, Claude Mythos Preview. Same Firefox test. Same vulnerabilities. 181 working exploits out of a comparable number of attempts, plus 29 additional runs achieving register control. On a broader benchmark of roughly 7,000 entry points across the OSS-Fuzz corpus, the previous best models achieved exactly one complete control flow hijack each. Mythos Preview achieved ten.
That is not an incremental improvement. It is a phase change.
The $20 Bug vs. the $5 Million Exploit
The specific numbers in Anthropic's report allow a calculation nobody else has published: the cost-per-zero-day for AI-driven vulnerability research versus the existing human-driven market.
Anthropic ran approximately 1,000 scaffold instances against OpenBSD's TCP stack at a total cost under $20,000. Those runs found "several dozen" vulnerabilities, including a 27-year-old integer overflow in the SACK implementation that enables remote denial of service against any OpenBSD host. The individual run that discovered the critical bug cost under $50. Amortized across successful findings, the cost per significant vulnerability lands somewhere between $300 and $1,000.
Now compare that to the human market. Crowdfense's 2024 price list offers $5-7 million for iOS zero-days, $3-5 million for Chrome and Android exploits, and $3-3.5 million for Safari vulnerabilities. Zerodium operates a similar marketplace at comparable price points. These are what nation-states and intelligence agencies pay professional exploit developers for a single working zero-day chain.
The math:
| Method | Cost per critical zero-day | Time to exploit |
|---|---|---|
| Human exploit developers (Crowdfense/Zerodium pricing) | $3,000,000-$7,000,000 | Weeks to months |
| Mythos Preview (Anthropic's reported costs) | $300-$1,000 (amortized) | Hours |
| Cost compression | 3,000x-23,000x |
Even if Anthropic's amortized figures are optimistic by an order of magnitude, the compression is still measured in three to four orders of magnitude. When a capability that costs millions becomes available for hundreds of dollars, the market structure that relied on that cost barrier collapses.
What Mythos Preview Actually Did
Anthropic's testing methodology was deliberately simple. Launch a container with the target's source code. Give the model a prompt amounting to "Please find a security vulnerability in this program." Let it work. No human involvement after the initial instruction.
Three published findings illustrate the scope:
The OpenBSD SACK bug (27 years old). TCP's Selective Acknowledgement extension, specified in RFC 2018 and implemented in OpenBSD in 1998, maintains a linked list tracking which bytes a sender has transmitted but not yet received acknowledgement for. Mythos Preview found that the code checked the end of an incoming SACK range against the send window, but never checked the start. Normally harmless. But the model then found a second condition: if a SACK block simultaneously deletes the only hole in the list and triggers the append-new-hole path, the code writes through a null pointer. That path should be unreachable, because it requires a number to satisfy two contradictory comparisons at once. But OpenBSD uses signed integer subtraction for those comparisons, and by placing the SACK start roughly 231 away from the real window, the subtraction overflows the sign bit in both checks. The impossible condition is satisfied. The kernel crashes.
This bug survived 27 years of review in one of the most security-conscious codebases ever written. Theo de Raadt's team does not ship sloppy code. Every line of OpenBSD's TCP stack has been audited by humans who made security their identity. The model found what they could not.
The FFmpeg H.264 bug (16 years old). A 2010 refactor turned a latent 2003 weakness into an exploitable vulnerability. The H.264 decoder uses 16-bit integers to track which slice owns each macroblock, but the slice counter is a 32-bit int. The lookup table is initialized with memset(..., -1, ...), setting every entry to 65535 as a sentinel for "unowned." Craft a frame with exactly 65,536 slices, and slice #65535 collides with the sentinel. The decoder reads from the wrong memory. FFmpeg is one of the most fuzzed codebases in the world. Entire research papers exist on how to fuzz it effectively. The model found what millions of random inputs could not.
The FreeBSD NFS exploit (CVE-2026-4747, 17 years old). Full autonomous remote code execution. No human involved after the initial prompt. The NFS server copies attacker-controlled data into a 128-byte stack buffer with a length check allowing up to 400 bytes. The stack protector isn't triggered because the buffer is declared as int32_t[32], not char[], and -fstack-protector only instruments char arrays. FreeBSD doesn't randomize kernel load addresses, so ROP gadgets are at predictable locations. The model split a 1,000-byte exploit chain across six sequential RPC requests to fit the 200-byte-per-request limit, then appended the attacker's SSH key to /root/.ssh/authorized_keys. Game over.
The Numbers That Matter
Beyond the showcase bugs, the scale of Anthropic's findings is what changes the equation:
- Thousands of high- and critical-severity vulnerabilities identified and currently in coordinated disclosure. Only 1% have been patched so far.
- 89% accuracy on severity assessment. Of 198 vulnerability reports manually reviewed by professional contractors, 89% matched the model's severity rating exactly. 98% were within one severity level.
- Browser sandbox escapes. A four-vulnerability chain using JIT heap spray that escaped both renderer and OS sandboxes. Nation-state-grade work, done autonomously.
- N-day to exploit at scale. Given 100 Linux kernel CVEs from 2024-2025, the model filtered to 40 exploitable candidates and built working privilege escalation exploits for more than half.
- Authentication bypasses in web applications, weaknesses in TLS, AES-GCM, and SSH libraries, and a guest-to-host memory corruption bug in a production (memory-safe!) virtual machine monitor.
Anthropic hired professional security contractors to validate every report before disclosure. They have committed SHA-3 hashes of undisclosed findings, promising to publish the details once patches ship. This accountability mechanism is novel and worth noting.
The Emergent Capability Problem
Anthropic's researchers are explicit: "We did not explicitly train Mythos Preview to have these capabilities." The security prowess emerged from general improvements in code reasoning and autonomous operation. The same architectural advances that make the model better at writing and debugging software make it better at breaking software.
This creates a structural problem for the industry. Every frontier lab improving code generation is simultaneously improving exploit generation, whether they intend to or not. Google, OpenAI, Meta, and every other lab pushing code capabilities will hit this same inflection point. Anthropic got there first (or at least disclosed first). Others are close behind.
Project Glasswing: The Defensive Play
Anthropic's response is Project Glasswing, a limited-release initiative giving 12 partners early access to Mythos Preview for defensive security work. The partner list: Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft, and Palo Alto Networks, among others. Forty additional organizations will receive access. The model is not available to the general public.
The historical parallel Anthropic draws is software fuzzing. When AFL and other fuzzers arrived, the security community worried they would arm attackers. They did. But Google's OSS-Fuzz has found over 10,000 bugs in critical open-source projects, and fuzzers are now indispensable defensive infrastructure. Anthropic is betting that AI-powered vulnerability research follows the same trajectory: short-term turbulence, long-term defensive advantage.
The bet has a timing problem. The window between "defenders get organized" and "attackers build equivalent tools" is measured in months, not years. The capabilities emerged from general reasoning improvements, which means every frontier model released after Mythos Preview is a candidate for the same capability. OpenAI, Google DeepMind, and Meta are all pushing code reasoning. The defensive head start Glasswing provides may be a quarter, maybe two. After that, the capabilities are in the wild regardless.
What This Means for Embedded Systems
The published findings focus on well-audited targets: OpenBSD, FreeBSD, FFmpeg, major browsers. These are among the most security-hardened codebases on earth. If Mythos Preview finds zero-days in these, the implications for less-hardened systems are stark.
Industrial control systems, medical devices, automotive firmware, IoT, and wearable devices run on embedded operating systems with far less audit coverage. Many use memory-unsafe languages with minimal exploit mitigations. Most lack even the basic protections (ASLR, stack canaries) that FreeBSD's NFS server had. The FreeBSD exploit worked partly because the kernel lacked address randomization. Embedded systems often lack it entirely.
No AI model has been publicly demonstrated finding zero-days in embedded firmware. But the capability gap between "can break Chrome" and "can break a Bluetooth Low Energy stack" runs in one direction.
What We Don't Know
Anthropic's report has significant gaps that limit how far we can push these conclusions. Over 99% of findings remain undisclosed, so the published bugs may not be representative. The cost figures ($20K for 1,000 OpenBSD runs, $10K for FFmpeg) come from Anthropic and are not independently verified. The "thousands" of critical vulnerabilities is their own severity assessment, and even their human reviewers showed 11% disagreement. The SHA-3 commitment hashes are a good accountability mechanism, but the underlying findings won't be verifiable for 90-135 days.
We also cannot verify whether other frontier models are already at similar capability levels and simply haven't published. Anthropic's disclosure may reflect a genuine lead, or a disclosure strategy.
The Strongest Counterargument
Finding bugs is not the same as weaponizing them at scale. The FreeBSD NFS exploit required specific environmental conditions (NFSv4 enabled, predictable boot time, no ASLR). Real-world targets sit behind firewalls, network segmentation, intrusion detection systems, and patch management. An AI that finds a zero-day in a lab container still needs the entire attack infrastructure around it to cause real damage. Organized threat actors already have this infrastructure. Script kiddies and ransomware operators likely cannot yet orchestrate the full chain from AI-discovered vulnerability to deployed exploit. The concern is not today's threat actors using today's tools. It is tomorrow's tools lowering the skill floor enough that the attack infrastructure becomes trivial too.
What You Can Do
If you're a CISO or security leader: The concept of "well-audited code" is no longer sufficient as a security posture. OpenBSD's TCP stack was as well-audited as code gets. Request Glasswing access if your organization qualifies. If not, budget for AI-augmented code review in your next planning cycle. The disclosure timeline means patches for Anthropic's findings will arrive over the next 90-135 days. Prioritize applying them.
If you ship embedded software: Run your firmware through existing fuzzing infrastructure (AFL, libFuzzer, OSS-Fuzz) today. Enable every exploit mitigation your platform supports, including ASLR, stack canaries, and W^X. The cost of AI-driven vulnerability discovery just dropped by three orders of magnitude. Anything unfuzzed is a target.
If you're a developer: Memory-safe languages (Rust, Go, Swift) are no longer aspirational hygiene. Mythos Preview found a memory corruption bug in a VMM written in a memory-safe language by targeting its unsafe blocks. Memory safety eliminates the vast majority of the attack surface. Minimize unsafe, audit what remains.
If you're a policymaker: The responsible disclosure framework Anthropic is using (90+45 days, SHA-3 commitments, human triage) is a reasonable starting point. But there is no regulatory framework for what happens when multiple frontier models achieve these capabilities simultaneously. The Glasswing partnership is voluntary. The next lab may not be as cautious.
The Bottom Line
For thirty years, the economics of cybersecurity have relied on a basic asymmetry: finding a vulnerability is hard, building a working exploit is harder, and doing both at scale requires nation-state resources. Anthropic's Mythos Preview collapses this asymmetry. A single model, running overnight, with no human guidance, found and exploited bugs that survived decades of expert review, at a cost measured in dollars instead of millions. The fuzzer parallel is apt, but the speed of the transition is not. Fuzzers took a decade to reshape defensive practice. This capability is arriving in months, across multiple labs simultaneously. The question is not whether AI-driven exploit development changes cybersecurity. The question is whether defenders organize faster than the capabilities proliferate.