Anthropic Said Its AI Found 'Thousands' of Zero-Days. The Evidence Trail Shows One Confirmed CVE.

VulnCheck can trace only a single CVE to Project Glasswing. The cURL creator called the findings "primarily marketing." The UK's AI Security Institute found GPT-5.5 matches Mythos on autonomous cyberattack simulations at 71.4% success, but OpenAI shipped its model to vetted defenders while Anthropic locked Mythos behind a 12-partner gate. With Anthropic's IPO targeted for late 2026, the gap between the marketing narrative and the confirmed evidence deserves a closer read.

One. That is how many confirmed CVEs VulnCheck can trace to Anthropic's Project Glasswing, the defensive cybersecurity initiative built around the company's most powerful model, Claude Mythos. Not hundreds. Not dozens. Researchers at VulnCheck found roughly 40 CVEs across Glasswing's 50 partner companies, but most cannot be directly linked to Mythos, and only one, CVE-2026-4747, a remote code execution bug in FreeBSD reported by Nicholas Carlini using Mythos, can be publicly tied to the program.

Anthropic's own narrative paints a different picture. Mythos, according to the company, discovered "thousands" of zero-day vulnerabilities across every major operating system and browser, including a 27-year-old OpenBSD flaw that had survived decades of manual code audits. Anthropic claims that Mythos has 90 times the exploit-generation capability of its previous model, Claude Opus 4.6, that it found 181 vulnerabilities in Firefox alone, and that 99 percent of these findings remain unpatched. Too dangerous to release, Anthropic concluded, so dangerous that it created a restricted access program, Project Glasswing, and invited only 12 elite partners, including CrowdStrike and JPMorgan, to touch it, with $100 million in credits to sweeten the deal.

Those claims are extraordinary, and the evidence supporting them is remarkably thin.

The cURL Test

Daniel Stenberg has maintained cURL for 28 years. It is one of the most heavily audited open-source projects in existence, embedded in billions of devices from smartphones to spacecraft: the kind of codebase where finding a real vulnerability would genuinely matter. Mythos scanned it and reported five vulnerabilities.

Four were false positives, and the single confirmed finding was low-severity. Stenberg called the broader Mythos hype "primarily marketing," noting that other AI-assisted tools had already triggered hundreds of legitimate bugfixes in cURL over the preceding eight to ten months without anyone claiming the tools were too dangerous to release. Other tools shipped their results quietly, he observed. Anthropic shipped a press strategy.

If Mythos produces a false positive rate of 80% (four of five reported findings) on one of the best-understood codebases in open-source software, what does that imply about the "thousands" of zero-days found in less scrutinized code? We cannot know, because Anthropic has not published the full vulnerability reports, the false positive rates across different codebases, or any methodology that would allow independent verification. We have a marketing number and a 12-partner access gate, not a dataset.
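How much weight can five findings carry? A minimal Python sketch, assuming each reported finding can be treated as an independent pass/fail trial (a simplification, not something the cURL report establishes), puts a 95% Wilson score interval around the 80% figure. The function and variable names are illustrative; the only inputs taken from the reporting above are the five findings and four false positives.

```python
import math


def wilson_interval(hits: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = hits / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Figures from the cURL scan as reported above: 5 findings, 4 false positives.
reported = 5
false_positives = 4

point_estimate = false_positives / reported
# Assumption: findings behave like independent trials with a shared error rate.
low, high = wilson_interval(false_positives, reported)

print(f"point estimate: {point_estimate:.0%}")
print(f"95% Wilson interval: {low:.0%} to {high:.0%}")
```

Under those assumptions the interval runs from roughly 38% to 96%. Five data points are consistent with a scanner that is wrong most of the time and with one that is right more often than not, which is exactly why a published dataset matters more than a single anecdote.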

What the UK Actually Found

The UK's AI Security Institute ran Mythos and GPT-5.5 through identical autonomous cyberattack simulations. Both models achieved essentially identical results: a 71.4% success rate on expert-level cybersecurity tasks. GPT-5.5 completed a 32-step corporate network attack simulation in approximately 10 minutes, succeeding in two of ten attempts, and it cracked a reverse-engineering challenge that human experts typically needed 12 hours to solve.

Here is the part that matters for the "too dangerous to release" framing: GPT-5.5 failed a 7-step industrial control system scenario, and AISI researchers discovered a universal jailbreak that bypassed safety guardrails on all tested models, including Mythos.

Two conclusions follow from this data, and they pull in opposite directions.

First, frontier AI models have genuinely crossed a capability threshold in autonomous cyberattack execution. A 71.4% success rate on expert-level tasks is substantial, and a 32-step attack chain completed without human guidance represents a real advance over what models could do 18 months ago.

Second, the capability is not unique to Mythos. OpenAI's GPT-5.5 matches it, and OpenAI shipped GPT-5.5-Cyber to vetted defenders through a program called Trusted Access for Cyber, granting access to cybersecurity firms like Cisco, Palo Alto Networks, and Sophos. OpenAI's approach: build the capability, restrict access to verified professionals, and let defenders use it. Anthropic's approach: build the capability, restrict access to 12 partners, and tell everyone else it is too dangerous.

| Metric | Claude Mythos | GPT-5.5-Cyber |
| --- | --- | --- |
| Expert-level cyber success rate (UK AISI) | 71.4% | 71.4% |
| 32-step attack simulation | Completed | Completed (2 of 10 attempts) |
| Defender access model | 12 gated partners | Vetted professionals (broader) |
| Public release | Withheld entirely | Shipped with TAC restrictions |
| Confirmed CVEs attributed | 1 (VulnCheck) | Not tracked separately |

Open-source replication cost: $0.11 per million tokens (community models)

The 1,000-Pentest Reality Check

Security firm Aikido ran 1,000 AI-assisted penetration tests to measure whether newer, more powerful models actually translate into better attack outcomes. Their finding cuts against the prevailing narrative: attackers benefit more from existing knowledge of a target's context than from access to more capable models. In real-world exploitation, the bottleneck is not the attacker's AI capability but their knowledge of the target system's architecture, configuration, and dependencies.

This matters because the "too dangerous to release" framing assumes that the model itself is the scarce ingredient. If Aikido's data is representative, the scarce ingredient is the attacker's prior context, and that does not change regardless of which model they use. A nation-state attacker with deep knowledge of a target's infrastructure and access to open-source models running at $0.11 per million tokens poses more danger than a script kiddie with full Mythos access and no context.

Bruce Schneier at Harvard's Berkman Klein Center made a similar observation in his analysis: Mythos did not outperform other large language models in bug detection capability. Schneier argued that the real concern is not a single model but the category-level improvement in AI's capacity to find exploitable loopholes in complex systems, a concern that applies to every frontier model, not just the one Anthropic chose to withhold.

The IPO Calendar

Anthropic launched Mythos Preview on April 7, 2026, and announced Project Glasswing on April 22. Anthropic is preparing for an IPO targeted for late 2026, possibly as early as October, at a valuation between $400 billion and $500 billion, having raised its most recent private round at a $900 billion valuation. Annual revenue reached $19 billion by March 2026, and an IPO at that valuation could raise $60 billion.

Correlation is not causation, and the timing of a product announcement relative to an IPO does not prove that the announcement was designed to influence the offering, since companies release products on their own schedules.

But the "too dangerous to release" narrative does specific things for a company approaching a public offering. It establishes technological superiority without requiring independent benchmarking, creates scarcity around access by positioning the company as gatekeeper to a uniquely powerful capability, and generates press coverage that reinforces the perception of Anthropic as the "responsible AI" company, a brand differentiation that resonates with institutional investors who have learned to care about AI safety positioning after years of regulatory scrutiny. It is, in the language of investor relations, a moat story told through the frame of ethics.

Compare this with Anthropic's stated mission: the responsible development of AI for the benefit of humanity. Nothing about that mission requires withholding a defensive security tool from the global community of defenders. If Mythos truly finds vulnerabilities that OpenBSD missed for 27 years, the humanitarian case for broader access is stronger than the case for restriction. Every day those vulnerabilities remain unpatched because the tool is locked behind a 12-partner gate is a day that attackers who have independently discovered the same flaws can exploit them against targets that lack the information to defend themselves.

The Remediation Paradox

Even within Glasswing's restricted partner network, fewer than 1% of the vulnerabilities Mythos found have been patched, which creates a paradox. If the vulnerabilities are real and dangerous, the program is failing at its stated purpose because its partners are not remediating in volume. If they are not being remediated because most are low-severity false positives or duplicates of known issues, then the "thousands of zero-days" framing overstates the actual risk by orders of magnitude.

Neither interpretation supports the current access model. Either the bugs are real and need wider remediation, which requires wider access to the findings, or the bugs are mostly noise and the danger is overstated, which means the restriction is solving a problem that does not exist at the claimed scale.

The Strongest Case for Anthropic

VulnCheck's one confirmed CVE may not reflect the full picture. CVE assignment is notoriously slow, with backlogs that can stretch six months to a year from initial disclosure to formal identifier assignment, so many real findings could be in the pipeline awaiting formal confirmation rather than absent from it entirely. OpenBSD's 27-year-old vulnerability has been independently verified, establishing that Mythos can find bugs that decades of expert manual review missed. Anthropic's internal testing reportedly showed the model generating functional exploits for previously unknown vulnerabilities in minutes, and even if the false positive rate is high, a 20% true positive rate applied to thousands of scans still yields hundreds of real, unpatched flaws that adversaries could weaponize if the model were public.
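The arithmetic in that last point is easy to check, and easy to stress-test. The Python sketch below is illustrative only: the finding counts and true positive rates are hypothetical brackets around "thousands" and around the one-in-five cURL result, not published figures, and `expected_real_flaws` is a name invented for this example.

```python
# Hypothetical scenario grid. Neither the finding counts nor the rates below
# are published figures; "thousands" is bracketed here as 1,000 to 5,000.
claimed_findings = [1_000, 2_000, 5_000]
true_positive_rates = [0.05, 0.20, 0.50]  # 0.20 mirrors the 1-in-5 cURL result


def expected_real_flaws(findings: int, true_positive_rate: float) -> int:
    """Expected number of genuine vulnerabilities among reported findings."""
    if not 0.0 <= true_positive_rate <= 1.0:
        raise ValueError("true_positive_rate must be between 0 and 1")
    return round(findings * true_positive_rate)


for n in claimed_findings:
    row = ", ".join(
        f"{rate:.0%} -> {expected_real_flaws(n, rate):,}"
        for rate in true_positive_rates
    )
    print(f"{n:,} findings: {row}")
```

At a 20% true positive rate, even the low end of "thousands" implies about 200 real flaws, which is the core of the case for caution. The same grid also shows how sensitive that conclusion is to a rate nobody outside the program can currently measure.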

Comparing Mythos with GPT-5.5-Cyber is also imperfect. OpenAI's Trusted Access for Cyber program restricts its most capable model to vetted professionals, which is functionally closer to Anthropic's approach than the "shipped it anyway" framing suggests. OpenAI drew a wider circle, Anthropic a narrower one: a difference of degree, not kind, and reasonable people can disagree about where to draw the line.

And Anthropic may genuinely be concerned about downstream risks that the benchmarks do not capture. A model that can scan every public GitHub repository and generate exploit code for whatever it finds poses a categorically different threat from one that completes a simulated 32-step attack in a controlled environment. Scale changes the nature of the risk in ways that synthetic benchmarks cannot measure.

Limitations

We have not independently examined Mythos's vulnerability reports, and we cannot verify the "thousands" claim or the false positive rate across all claimed findings. VulnCheck's audit may not be exhaustive, and CVE assignment delays could account for the gap between claimed findings and confirmed CVEs. AISI simulations tested controlled environments, not real-world attack scenarios, and Anthropic has not published its full methodology. We cannot determine from public data what fraction of the claimed zero-days would qualify as high-severity, as exploit generation capability does not imply that every generated exploit targets a critical flaw. An 80% false positive rate on cURL may not generalize to other codebases with different levels of prior audit coverage.

What You Can Do

If you run a security team, do not wait for Glasswing access to act on AI-assisted vulnerability discovery. Multiple open-source and commercial tools already perform AI-driven code scanning at costs that are a rounding error on any enterprise security budget. Google's OSS-Fuzz has found over 10,000 bugs using AI-assisted fuzzing across 1,000 open-source projects, and it is free. Semgrep, CodeQL, and Snyk all integrate AI-assisted scanning. The capability Anthropic describes as unique is available now, in less dramatic packaging, without the press releases.

If you are an investor evaluating Anthropic's IPO, ask specific questions: how many of the "thousands" of Mythos findings have been independently confirmed, what is the false positive rate across different codebases, and how does Mythos's vulnerability discovery rate compare with commercially available alternatives on the same targets? Answers to these questions do not exist in any public document, and their absence should inform your assessment of the capability claims driving the valuation narrative.

If you work in policy, the meaningful question is not "should Mythos be released?" but "what disclosure timeline should apply when an AI system discovers vulnerabilities in critical infrastructure?" Right now, the answer is whatever Anthropic decides, which is not a governance framework but a press strategy.

The Bottom Line

Anthropic built a model that may be genuinely capable, but the verified evidence confirming that capability is remarkably thin for a company asking the world to trust its judgment about what is too dangerous to share. One confirmed CVE from VulnCheck. An 80% false positive rate on cURL. Identical performance to a competitor's model that was shipped to defenders. Fewer than 1% of findings remediated by the partners who do have access, and a $400-to-$500-billion IPO on the horizon. Defenders do not need a narrative about danger. They need vulnerability data, reproducible results, and tools they can actually use. Right now, the company that says it built the most powerful vulnerability scanner in history is holding the scanner behind a velvet rope and asking everyone to take its word for the threat level. That is not safety. That is positioning.