Criminals Used AI to Build a Zero-Day Exploit. Google Caught It by the Hallucinations.

Google's Threat Intelligence Group has documented what it calls the first confirmed case of criminals using artificial intelligence to build a working zero-day exploit. The target was a two-factor authentication bypass in a popular open-source web administration tool. Google worked with the vendor to patch the flaw before a mass-exploitation campaign could launch. What gave the AI away: polished educational docstrings, textbook code structure, and a CVSS severity score that does not exist in any database on Earth.

Seven hundred forty-five days. That was the average gap between the moment a software vulnerability became publicly known and the moment attackers started exploiting it, according to Flashpoint's vulnerability intelligence data from 2020. By 2025, that window had collapsed to 44 days, a 94 percent compression over five years driven by the rapid weaponization of researcher-published proof-of-concept code and internet-wide scanning tools like Shodan and FOFA that let even unsophisticated attackers automate mass exploitation across entire networks in hours.
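The 94 percent figure follows directly from the two numbers quoted above. A minimal sketch of the arithmetic, where the helper name `percent_compression` is ours and not Flashpoint's:

```python
def percent_compression(before_days: float, after_days: float) -> float:
    """Return the percentage reduction from before_days to after_days."""
    if before_days <= 0:
        raise ValueError("before_days must be positive")
    return (before_days - after_days) / before_days * 100.0


# Figures as quoted in the article: 745 days (2020) and 44 days (2025).
compression = percent_compression(745, 44)
print(f"{compression:.1f}% shorter")  # 94.1% shorter
```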

Forty-four days was already a crisis for defenders who needed to identify, test, and deploy patches across sprawling enterprise infrastructure before someone weaponized the disclosure and turned it against them. Now add a second compression on top of the first: an AI that can discover the vulnerability and build the exploit before any researcher publishes anything at all.

No disclosure, no patch window, and zero days in the most literal sense of the phrase.

What Google Found

On May 11, 2026, Google's Threat Intelligence Group published a report documenting a zero-day exploit it had intercepted before the planned campaign could begin. GTIG declined to name the vulnerable software, describing it only as a popular open-source, web-based administration tool. It also declined to name the cybercrime group, though John Hultquist, GTIG's chief analyst, told CyberScoop that the attackers have "a strong record of high-profile incidents and mass exploitation," language that points to a well-known and well-resourced operation.

What GTIG did disclose fills in the operational picture. Attackers found a flaw in a Python script that handles two-factor authentication for the tool. Developers had hard-coded a trust exception into the authentication flow, creating a path that allowed a user with valid primary credentials to bypass the second factor entirely. That is not a buffer overflow or a memory corruption bug, the classes of flaw that traditional automated security tools like fuzzers and static analyzers are built to catch. It is a semantic logic error: the code does something its designers did not intend because a human made a reasoning mistake about when to trust an input. That kind of high-level conceptual flaw is precisely the territory where frontier language models are becoming capable.
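GTIG has not published the vulnerable code, so what follows is only a hypothetical sketch of what a hard-coded trust exception in a 2FA check can look like. Every name here (`TRUSTED_CLIENT_TAG`, `verify_login_flawed`, `verify_login_fixed`, the `client_tag` parameter) is invented for illustration and is not taken from the affected tool.

```python
# Hypothetical illustration only; this is NOT the undisclosed vulnerable code.
TRUSTED_CLIENT_TAG = "internal-monitor"  # the hard-coded trust exception


def verify_login_flawed(password_ok: bool, otp_ok: bool, client_tag: str) -> bool:
    """Flawed flow: a caller-supplied tag silently waives the second factor."""
    if not password_ok:
        return False
    if client_tag == TRUSTED_CLIENT_TAG:
        # Logic error: client_tag comes from the request, so anyone with
        # valid primary credentials can send it and skip 2FA.
        return True
    return otp_ok


def verify_login_fixed(password_ok: bool, otp_ok: bool, client_tag: str) -> bool:
    """Fixed flow: no request-supplied value can waive the second factor."""
    return password_ok and otp_ok


# Valid password, failed OTP, and the "trusted" tag.
print(verify_login_flawed(True, False, "internal-monitor"))  # True (2FA bypassed)
print(verify_login_fixed(True, False, "internal-monitor"))   # False
```

Nothing in the flawed version crashes or corrupts memory, which is why a fuzzer would have little to report. The code runs exactly as written; the mistake is in what it was written to trust.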

Google worked with the vendor to issue a patch before the campaign launched. Hultquist told CyberScoop the operation was likely disrupted before it gained traction, though he was careful to frame the discovery as evidence of a much larger problem rather than a single contained incident. "This is probably the tip of the iceberg," he said. "The game's already begun and we expect the capability trajectory is pretty sharp."

How the AI Gave Itself Away

What makes this case different from previous suspicions about AI-assisted hacking is the evidence trail embedded in the code itself. GTIG researchers described the exploit as "suspiciously machine-made," and the artifacts they found tell a specific story about how the code was generated.

First, the Python code contained polished educational-style docstrings, the kind of well-structured documentation that appears in programming tutorials and textbook examples but almost never in the rushed, pragmatic output of a criminal exploit developer working under operational pressure. Second, the code was heavily annotated in a way consistent with the structured formatting that large language models produce when generating technical explanations. Third, and most telling: the exploit included a CVSS severity score that does not correspond to any entry in the National Vulnerability Database or any other vulnerability database. It was hallucinated. An LLM, trained on thousands of vulnerability reports that include CVSS scores, generated a plausible-looking but entirely fabricated one as part of its output, the same way these models fabricate citations, invent statistics, and produce confident-sounding references to papers that were never written.
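To make those tells concrete, here is a rough triage heuristic, offered as a sketch and not as GTIG's method, which Google has not described in detail. It measures how many functions carry docstrings and flags a CVSS mention that arrives without any CVE identifier. The function names, the 0.8 threshold, and the sample snippet are all assumptions made for this example.

```python
import ast
import re

CVSS_RE = re.compile(r"\bCVSS\b", re.IGNORECASE)
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b")


def triage(source: str) -> dict:
    """Collect simple stylistic signals from a piece of Python source."""
    tree = ast.parse(source)
    funcs = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    documented = sum(1 for func in funcs if ast.get_docstring(func))
    return {
        "functions": len(funcs),
        "docstring_ratio": documented / len(funcs) if funcs else 0.0,
        "mentions_cvss": bool(CVSS_RE.search(source)),
        "cites_cve": bool(CVE_RE.search(source)),
    }


def looks_machine_made(report: dict, min_ratio: float = 0.8) -> bool:
    """Flag tutorial-grade docstrings plus a severity score with no CVE ID.

    The 0.8 threshold is an arbitrary assumption, not a calibrated value.
    """
    orphan_score = report["mentions_cvss"] and not report["cites_cve"]
    return orphan_score and report["docstring_ratio"] >= min_ratio


# Invented, harmless sample that imitates the style described above.
SAMPLE = '''
def check_target(host):
    """Check whether the target host responds.

    Severity: CVSS 9.1 (Critical)
    """
    return host
'''

print(looks_machine_made(triage(SAMPLE)))  # True
```

A heuristic like this is only a weak signal. Plenty of legitimate code is well documented, and confirming that a score is truly fabricated still requires checking it against real vulnerability records.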

Google confirmed that neither its own Gemini model nor Anthropic's Mythos was involved, and whichever model the attackers used remains unidentified.

A Second Acceleration on a Curve Already in Free Fall

Consider the math. Flashpoint's data shows the time-to-exploit for known vulnerabilities shrinking from 745 days to 44 days between 2020 and 2025. That entire compression happened without AI-assisted exploit development. It was driven by human attackers working faster with better tools, scanning the internet for unpatched systems and adapting publicly released proof-of-concept code into weaponized exploits in days or weeks rather than months or years.

AI-built zero-days operate on a different axis altogether. There is no prior disclosure, no published proof-of-concept, and no patch window of any duration, because the vulnerability was never in any database to begin with. When AI finds the flaw and builds the exploit, the timeline collapses to zero. Traditional fuzzers and static analysis tools scan for crashes, memory corruption, and known vulnerability patterns at the code level. An LLM reading an authentication module can instead identify that a trust exception was hard-coded in a way that logically permits bypass. That kind of reasoning about developer intent is not something traditional tools are designed to flag, because the problem is not a crash or a memory-safety bug but a flawed assumption baked into the logic by a human who thought it was safe.

Google had already proved this was possible. In late 2024, Google's Big Sleep AI agent discovered a real zero-day vulnerability in production software during a controlled research exercise. "I think the watershed moment was two years ago when we proved this was possible," Hultquist said. What changed is that criminals apparently reached the same capability independently, outside any research lab, using a model that Google cannot identify. Lab to field in roughly eighteen months.

Ninety Zero-Days in 2025, and the Enterprise Is Losing

GTIG's annual zero-day review counted 90 zero-day vulnerabilities exploited in the wild in 2025, up from 78 in 2024. Nearly half, 48 percent, targeted enterprise technologies, an all-time high. Security and networking appliances alone accounted for 21 of those 43 enterprise-focused zero-days, which means the tools that companies deploy specifically to protect their networks were the single most popular entry point for attackers who already had zero-day access.

For the first time since GTIG began tracking attribution, commercial surveillance vendors were credited with more zero-day exploits than traditional state-sponsored espionage groups. PRC-nexus cyber espionage groups remained the most prolific state-sponsored users, continuing a near-decade trend, and financially motivated threat groups tied their previous high at nine attributed zero-days in a single year.

Kaspersky's Q1 2026 data corroborates the acceleration: vulnerability exploitation surged during the first three months of this year, with advanced persistent threat groups leveraging complex exploit chains and AI-assisted vulnerability discovery to target Microsoft Office, Windows, and Linux systems at a pace that outstripped any previous quarter on record.

Why the Hallucination Paradox Matters

Here is an uncomfortable irony. AI hallucinations are the primary reason companies hesitate to trust language models with critical decisions. Models fabricate facts, invent citations, generate plausible nonsense. In this case, the very tendency that makes AI unreliable in a boardroom is what exposed its use in an exploit. A human exploit developer would never include a CVSS score for a vulnerability that has no CVE identifier because no human would think to assign a severity rating to a flaw they are keeping secret. Only a model trained on vulnerability reports, where CVSS scores appear as standard metadata alongside every disclosure, would include one automatically because that is how its training data is structured.

As attackers learn to clean up these artifacts, stripping out docstrings, removing hallucinated scores, roughing up the formatting to look more human, the forensic indicators that tipped off GTIG will vanish. Future AI-built zero-days will not announce themselves. Google caught this one because the attackers did not know their tool had left fingerprints. That advantage evaporates fast.

Strongest Counterargument

Defenders have AI too. Google's Big Sleep project and Anthropic's responsible scaling commitments demonstrate that the same models capable of finding vulnerabilities can be directed at code review and defensive hardening before attackers ever see the flaws. GTIG's own 2026 forecast explicitly states that "AI will empower defenders to harness tools like agentic solutions to enhance security operations" and that "AI agents can proactively discover and help patch previously unknown security flaws." In principle, the defenders' advantage is structural: a vendor can run its own code through defensive AI continuously, while an attacker has to probe the code from the outside with limited context.

But cybersecurity's fundamental asymmetry has always favored attackers: an attacker who finds one flaw wins, and a defender who misses one flaw loses. Scaling defensive AI across every open-source project, enterprise tool, and legacy system in production is a coordination problem orders of magnitude larger than scaling offensive AI across a single exploit campaign, and nobody is solving that coordination problem today while the attackers keep moving.

What This Analysis Did Not Prove

GTIG has not determined whether the AI that built this exploit also discovered the underlying vulnerability. Maybe human researchers identified the logic flaw and then fed it to an AI that developed the weaponized exploit code. That would be a less dramatic escalation, still significant but fundamentally different. Google has not identified the specific model used, which means the capabilities required remain unknown: this could be a frontier model or a fine-tuned open-weight model running on consumer hardware. Without knowing the model, we cannot assess how widely the capability is distributed. And this is one case. Just one. GTIG believes there are others it has not yet found, but that belief rests on inference rather than additional documented evidence.

What You Can Do

If you maintain open-source software, audit your authentication flows for hard-coded trust exceptions. This particular zero-day exploited a pattern where developers bypassed 2FA under certain conditions that seemed safe at design time but were reachable by any user who met narrow, specific criteria that the developers did not anticipate an attacker would know how to satisfy. Run your authentication logic through an LLM-based code review tool, because if an attacker's AI can spot the reasoning flaw, your own AI should be able to spot it first.
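As a starting point for that audit, the most literal form of the pattern can be found with a few lines of Python's standard `ast` module. This is a crude sketch under stated assumptions: the name hints in `AUTH_HINTS`, the rule itself (an `if` that compares something to a string literal and immediately returns `True`), and the sample function are all ours. It will miss anything subtler, which is where an LLM-based review is meant to help.

```python
import ast

# Assumed naming hints; adjust to the conventions of your own codebase.
AUTH_HINTS = ("2fa", "otp", "mfa", "auth", "login", "verify")


def _compares_to_literal(test: ast.AST) -> bool:
    """True if the condition compares something to a string literal."""
    for node in ast.walk(test):
        if isinstance(node, ast.Compare):
            operands = [node.left, *node.comparators]
            if any(isinstance(o, ast.Constant) and isinstance(o.value, str)
                   for o in operands):
                return True
    return False


def _returns_true(body: list) -> bool:
    """True if the block contains a bare `return True`."""
    return any(
        isinstance(stmt, ast.Return)
        and isinstance(stmt.value, ast.Constant)
        and stmt.value.value is True
        for stmt in body
    )


def find_trust_exceptions(source: str) -> list:
    """Return (function name, line number) pairs worth a human look."""
    findings = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not any(hint in func.name.lower() for hint in AUTH_HINTS):
            continue
        for node in ast.walk(func):
            if (isinstance(node, ast.If)
                    and _compares_to_literal(node.test)
                    and _returns_true(node.body)):
                findings.append((func.name, node.lineno))
    return findings


# Invented sample; not code from the affected tool.
AUTH_SAMPLE = '''
def verify_second_factor(user, otp_ok, client_tag):
    if client_tag == "internal-monitor":
        return True
    return otp_ok
'''

print(find_trust_exceptions(AUTH_SAMPLE))  # [('verify_second_factor', 3)]
```

Treat every hit as a prompt for review, not a verdict. Some literal comparisons in authentication code are legitimate, and a trust exception expressed through a named constant or a config value would slip past this check.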

If you run enterprise security, stop treating the 44-day patch window as a planning assumption. Flashpoint's data says 44 days is the average, but averages conceal a distribution where the fast end is measured in hours and the slow end no longer matters, because organizations that patch on the slow schedule were already breached by the fast one. Prioritize edge devices and security appliances, which GTIG's data identifies as the single most exploited category of enterprise technology in 2025. If your security tool vendor has not published a defensive AI strategy, ask why.
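To see how an average can mislead, consider a toy distribution. The numbers below are invented for illustration and are not Flashpoint's data; they are chosen only so that the mean comes out to 44 days.

```python
import statistics

# HYPOTHETICAL time-to-exploit values in days, invented for this example.
HYPOTHETICAL_DAYS = [1, 1, 2, 2, 3, 5, 8, 30, 120, 268]


def share_within(days: list, limit: float) -> float:
    """Fraction of vulnerabilities exploited within `limit` days."""
    return sum(1 for d in days if d <= limit) / len(days)


mean_days = statistics.mean(HYPOTHETICAL_DAYS)
median_days = statistics.median(HYPOTHETICAL_DAYS)

print(f"mean:   {mean_days:.0f} days")                            # mean:   44 days
print(f"median: {median_days:.0f} days")                          # median: 4 days
print(f"within a week: {share_within(HYPOTHETICAL_DAYS, 7):.0%}")  # within a week: 60%
```

In this made-up sample, a team that plans around the 44-day mean would be too late for most of the list. The real distribution may look different, but the planning lesson is the same: size your patch window to the fast tail, not the average.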

If you work in AI governance, this case demands attention. A hallucinated CVSS score saved the internet this time. Next time, the code will be cleaner, and the forensic trail will not include a convenient tell that only an LLM would produce.

The Bottom Line

Between 2020 and 2025, the time attackers needed to exploit a known vulnerability fell from 745 days to 44. That compression was terrifying enough. Now an AI has demonstrated the ability to find and weaponize a vulnerability that nobody knew existed, collapsing the window from 44 days to zero for at least one target. Google caught this attempt because the AI left fingerprints in the code that only a machine would leave: a severity score for a database entry that does not exist. John Hultquist called it "the tip of the iceberg." Icebergs are measured by how much you cannot see.