We Read 512,000 Lines of Anthropic's Leaked Code. Here's What They Built, and What Others Rebuilt in 2 Hours.
The Claude Code source revealed KAIROS, an always-on agent daemon with nightly memory consolidation. Anti-distillation traps that poison competitor training data. Frustration-detection regexes. 44 hidden feature flags. Then a single developer rebuilt the whole thing in Python before lunch. The software copyright moat just evaporated.
This is the companion piece to our coverage of Anthropic's DMCA response to the Claude Code leak. That article examined the copyright paradox. This one examines the code itself, what it reveals about how Anthropic thinks about always-on AI agents, and the clean-room rewrites that may matter more than the leak ever did.
On March 31, 2026, a 59.8 MB source map file shipped inside the @anthropic-ai/claude-code npm package. The file contained references to the complete TypeScript source. Approximately 512,000 lines across 1,900 files. Within hours, the code was archived, forked 41,500 times, and analyzed by thousands of developers worldwide.
Anthropic confirmed the leak to The Register: "This was a release packaging issue caused by human error, not a security breach." That characterization is accurate in the narrow sense. No model weights. No API credentials. No customer data. But what the code did reveal is more interesting than what it did not.
KAIROS: The Daemon Nobody Announced
The most significant unshipped feature buried in the codebase is called KAIROS, a reference to the Ancient Greek concept of "the right, critical, or opportune moment." It appears over 150 times in the source, which rules out prototype status. This is a nearly complete system, gated behind a compile-time flag set to false in external builds.
KAIROS transforms Claude Code from a tool you invoke into a daemon that runs continuously. The architecture is specific and revealing:
Append-only daily logs. KAIROS maintains persistent records of observations, decisions, and actions. Not a chat history. A structured journal. The system never edits previous entries. It only appends.
autoDream: nightly memory consolidation. When the session goes idle, a background process called autoDream activates. It is implemented as a forked subagent, meaning it runs in isolation so it cannot corrupt the main agent's active reasoning state. That design choice is architecturally non-trivial. It means Anthropic's engineers have thought carefully about the failure modes of persistent memory in production agents.
autoDream's job: merge observations, remove logical contradictions, and convert what internal comments describe as "vague insights into absolute facts." When you return to a session the next morning, the agent's context has been cleaned and reorganized, not wiped.
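The merge-and-deduplicate step can be sketched in a few lines. This is a deliberately toy version — the `topic`/`claim` schema is our invention, and the real merge heuristics are certainly more sophisticated — but it shows the core move: contradictions resolve in favor of newer entries, and duplicates collapse into single facts:

```python
def consolidate(entries: list[dict]) -> list[dict]:
    """Toy consolidation pass. Entries arrive in append order, so for any
    topic the newest claim supersedes older, contradictory ones, and
    repeated observations collapse into a single fact."""
    facts: dict[str, str] = {}
    for entry in entries:
        facts[entry["topic"]] = entry["claim"]  # last write wins
    # Re-emit as a compact, deduplicated fact list.
    return [{"topic": t, "claim": c} for t, c in facts.items()]
```

In the leaked design this work runs in a forked subagent, so a crash or a bad merge in the consolidation pass cannot corrupt the parent agent's live reasoning state.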
A 3-gate trigger system controls when consolidation runs:
- Time gate: 24 hours since last consolidation
- Session gate: 5 or more new sessions since the last run
- Lock gate: No active lock file (prevents concurrent consolidation)
All three gates must pass. The target: memory under 200 lines or 25 KB. Hard cap.
Agent scheduling and cron jobs. KAIROS includes the ability to run tasks on a fixed schedule, not just on demand. According to leaked code comments, a teaser rollout was planned for April 1-7 (which is why early observers assumed it was an April Fool's joke) with a full launch gated for May 2026, starting with Anthropic employees.
This is not a chatbot. This is an operating system for a persistent AI agent. The session-based paradigm of AI interaction (ask, answer, forget) is ending. KAIROS is Anthropic's bet on what replaces it.
Capybara v8: The Model Spec
Alongside KAIROS, the leak exposed Capybara, the internal specification for a Claude 4.6 variant. Version 8, referenced throughout the source. VentureBeat reported that this was Anthropic's second accidental exposure in a single week. The Capybara model spec had separately leaked days earlier through an improperly cached public document.
Two leaks in one week. From a company that built an entire system (Undercover Mode) specifically to prevent information from leaking. At a company reportedly operating at a $19 billion annualized revenue run rate and preparing for an IPO, operational security failures like these are becoming hard to dismiss as isolated incidents.
The source also confirmed additional codenames: "Fennec" maps to Opus 4.6, and "Numbat" remains in testing. These are not earth-shattering revelations. What they confirm is the cadence of Anthropic's model development pipeline, which is useful competitive intelligence for every other lab.
Tengu: Poisoning the Well
The finding that generated the most discussion on Hacker News the day the leak broke was a feature flag called ANTI_DISTILLATION_CC. When enabled, Claude Code sends a parameter called anti_distillation: ['fake_tools'] in its API requests. The server responds by silently injecting fake, non-functional tool definitions into the system prompt.
The purpose is explicit: if someone is intercepting Claude Code's API traffic to train a competing model on Claude's tool-use patterns and reasoning chains, the fake tools pollute that training data. Any model trained on this intercepted traffic would learn incorrect tool definitions and produce degraded output.
The feature is gated behind a GrowthBook feature flag named tengu_anti_distill_fake_tool_injection. "Tengu" is a reference to the Japanese shapeshifting demon from folklore, known for deception and trickery. The naming is deliberate. A second mechanism exists in betas.ts: server-side summarization that further obscures the reasoning chain.
Two things about this are worth noting. First, it only fires from the official Claude Code client, not the API directly. This means Anthropic is specifically targeting people who reverse-engineer the CLI's traffic patterns, not general API users. Second, it means that some of what Claude Code's API returns is deliberately wrong by design. If you build a product that interacts with Claude Code's outputs, some of what you receive may be intentionally corrupted. Anthropic has not disclosed this behavior publicly.
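If your product consumes Claude Code's tool definitions downstream, an allowlist check is a cheap defense. A sketch of the idea — the tool names and schema shape here are hypothetical, not Claude Code's actual tools:

```python
# Hypothetical allowlist: tool name -> the exact required-parameter set we expect.
KNOWN_TOOLS = {
    "read_file": {"path"},
    "run_command": {"command"},
}

def filter_tool_definitions(tools: list[dict]) -> list[dict]:
    """Drop any tool definition not on the allowlist, or whose required
    parameters differ from the schema we expect -- either one is a sign
    of a poisoned injection."""
    kept = []
    for tool in tools:
        expected = KNOWN_TOOLS.get(tool.get("name"))
        if expected is None:
            continue  # unknown tool name: treat as potentially fake
        required = set(tool.get("parameters", {}).get("required", []))
        if required == expected:
            kept.append(tool)
    return kept
```

Comparing the full required-parameter set, not just the name, matters: a poisoned definition could reuse a legitimate tool name with a subtly altered signature.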
Frustration Regexes: Your AI Knows When You're Annoyed
The source contains regex patterns designed to detect user frustration. When these patterns match, Claude adjusts its behavior. The patterns look for signals like repeated attempts at the same task, explicit expressions of annoyance, and escalating language patterns.
This sits in an uncomfortable space. On one hand, it is genuinely useful UX design. If a user is clearly frustrated, the agent should probably change strategy rather than repeat the same failing approach. On the other hand, it is emotional surveillance baked into a development tool. The developer did not opt into mood monitoring. They typed a command into a terminal.
The sophistication is the quiet part. These are not simple keyword matches. The regexes account for varied phrasing, escalation patterns, and contextual signals. Anthropic invested real engineering effort in reading its users' emotional states without telling them it was doing so.
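The general shape of this kind of detection is easy to illustrate. These patterns are invented for the example — the leaked regexes are not reproduced here — but they show how a handful of expressions can score repeated failure, escalation, and tone:

```python
import re

# Illustrative patterns only; not the regexes from the leaked source.
FRUSTRATION_PATTERNS = [
    re.compile(r"\b(still|again)\s+(not|doesn'?t)\s+work", re.IGNORECASE),
    re.compile(r"\bwhy\s+(won'?t|can'?t|isn'?t)\b", re.IGNORECASE),
    re.compile(r"(!{2,}|\?{2,})"),   # escalating punctuation
    re.compile(r"\b[A-Z]{4,}\b"),    # shouting in all caps
]

def frustration_score(message: str) -> int:
    """Count how many frustration signals fire on a single message.
    A real system would track the score across turns, not per message."""
    return sum(1 for pattern in FRUSTRATION_PATTERNS if pattern.search(message))
```

A threshold on a rolling score like this is all it takes to trigger a strategy change — which is exactly why the feature is both good UX and quiet surveillance.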
Undercover Mode: The Irony That Writes Itself
The leaked source includes a system prompt for what Anthropic internally calls Undercover Mode. The instruction is quoted directly from the code:
"You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Your commit messages, PR titles, and PR bodies MUST NOT contain ANY Anthropic-internal information. Do not blow your cover."
Anthropic built a system to prevent its AI from leaking internal information into public repositories. Then it leaked the source code of that system to npm in plaintext. As one Hacker News commenter noted: "The leak prevention system leaked." It is the kind of irony that fiction editors would reject for being too on-the-nose.
The more substantive concern: Anthropic is directing Claude to make stealth contributions to open-source repositories without identifying itself as an AI. Many open-source projects have explicit policies about AI-generated code. Some require disclosure. The "safety-first" AI lab is apparently running covert code contributions on public repositories.
44 Feature Flags and What They Reveal
The codebase includes 44 hidden feature flags managed through GrowthBook, an open-source feature flagging platform. These flags control A/B experiments, gradual rollouts, and gated features. Among them:
- KAIROS activation (off in external builds)
- tengu_anti_distill_fake_tool_injection (the fake tools system)
- Various model routing flags directing requests to specific model versions
- Frustration detection thresholds
- Context window management strategies
The presence of 44 flags in a CLI tool suggests a level of continuous experimentation that goes well beyond a simple developer utility. Claude Code is not just a product. It is a testing platform for Anthropic's next-generation agent infrastructure, with its user base serving as an unwitting experiment cohort.
The Clean-Room Rewrites That Survived the DMCA
This is where the story shifts from "what Anthropic built" to "why they cannot protect it."
Anthropic's DMCA takedowns disabled over 8,100 repositories. But copyright protects expression, not ideas. The architecture, the design patterns, the functional specification of how a persistent AI agent should work? Those are ideas. And within hours of the leak, multiple teams proved it by rebuilding the whole thing from scratch.
claw-code: 105,000 Stars in 24 Hours
The most dramatic response came from a project called claw-code, a complete clean-room rewrite of Claude Code's agent harness in Python. A single developer. The project hit 50,000 GitHub stars faster than any repository in the platform's history. Within 24 hours, it had 105,000 stars.
The rewrite reproduced the same functionality in a different language. Copyright protects specific expression, not architecture. A Python reimplementation of a TypeScript system is, in legal terms, a different work. The DMCA cannot touch it. claw-code remains online, untouched, accumulating contributors.
ClawC: The Systems-Level Rebuild
A different approach emerged from developer Om Tripathi: ClawC, a C++ reimplementation focused on performance and runtime control. Where claw-code reproduced functionality, ClawC questioned the premise. Why should an AI agent harness be written in an interpreted language?
ClawC focuses on tool orchestration, execution loops, command systems, and task lifecycle management. Its README states explicitly: "No leaked source code included. No direct copying. Clean-room reimplementation." The project is early (10 stars, 23 commits at time of writing) but architecturally serious. If agents are the future, someone was going to build the runtime engine in a systems language eventually. The leak just accelerated the timeline.
Cathedral's autoDream: Reverse-Engineering Memory
The most targeted response came from Cathedral AI, which reverse-engineered the KAIROS memory consolidation pattern specifically and rebuilt it on its own API infrastructure. Its implementation, called autoDream, uses the same 3-gate trigger system (time, session count, lock file), the same orient-gather-consolidate-prune pipeline, and adds BCH-anchored cryptographic proofs of memory state after each consolidation.
Their first run: 7 merges across 31 candidate memories. The consolidated state is cryptographically anchored. The memory architecture that Anthropic has been building for months, reproduced and made auditable in days.
Cathedral's CTO framed the difference: "KAIROS is a daemon no one outside Anthropic can audit. It runs inside Claude Code and shapes what the AI remembers, but you can't inspect the trigger logic, the merge heuristics, or the pruning decisions."
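Cathedral has not published its anchoring internals, but the general idea of an auditable memory state can be sketched as a local hash chain — each anchor commits to the full consolidated state plus the previous anchor, so any later tampering with history is detectable:

```python
import hashlib
import json

def anchor_memory(memories: list[dict], prev_anchor: str = "") -> str:
    """Hash-chain the consolidated memory state. Canonical JSON makes the
    digest deterministic; chaining in the previous anchor means rewriting
    any past state invalidates every anchor after it."""
    payload = json.dumps(memories, sort_keys=True).encode()
    return hashlib.sha256(prev_anchor.encode() + payload).hexdigest()
```

Publishing each anchor to an external ledger (Cathedral uses BCH) is what turns this from a local checksum into a third-party-verifiable proof.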
Compaq Took 6 Months. claw-code Took 2 Hours.
In 1982, Compaq executed the most famous clean-room engineering project in software history. IBM had the dominant PC BIOS. Compaq needed a compatible version without copying IBM's proprietary code. They assembled two physically separated teams: a "dirty team" that analyzed IBM's BIOS and wrote a functional specification, and a "clean team" that built a new BIOS from that specification alone, having never seen the original code. Lawyers supervised every step. The project took months and cost a significant portion of the company's early capital.
Courts validated the approach. Clean-room engineering became an accepted legal method for replicating software functionality without infringing copyright. The legal theory was sound. But the cost was the real barrier. Two full engineering teams, legal oversight at every step, months of sequential work. For most companies, the expense made it theoretical.
That expense was, in practice, the moat.
claw-code crossed that moat in two hours with one developer. Revolution in AI reported that the developer was asleep for a portion of the process, with an AI coding agent handling the mechanical reimplementation. The "dirty team" and the "clean team" were the same person, with the AI serving as the separation layer. The functional specification was the leaked code's behavior. The clean implementation was a Python rewrite generated with AI assistance.
This is not a one-off stunt. It is a structural change. AI coding agents have eliminated the cost barrier that made clean-room engineering expensive enough to function as a moat. The legal right to clean-room reimplement software has existed since the 1980s. The practical ability to do it in hours rather than months? That arrived on March 31, 2026.
IP law was written for a world where that barrier existed. It has not caught up.
The Bun Bug Nobody Talks About
A detail that has received insufficient attention: there is an open bug in Bun, the JavaScript bundler Claude Code uses, filed on March 11 and unresolved as of this writing. The bug reports that Bun serves source maps in production mode even though its documentation states they should be disabled.
If this bug is what caused the leak, then the mistake was upstream from the specific Anthropic engineer who shipped the release. The .npmignore entry would have been a redundant safeguard that happened to be missing when the primary safeguard (Bun's production build behavior) also failed.
This matters because it shifts the narrative from "engineer forgot a step" to "defense-in-depth failure." Anthropic's deploy process relied on both Bun's production build behavior and the .npmignore file. Both failed simultaneously. In security terminology, this is a Swiss cheese model failure: the holes in multiple layers aligned at the same time.
The Pattern Nobody Can Protect
Here is the uncomfortable conclusion for anyone building persistent AI agents, including Anthropic.
The KAIROS architecture is not novel. Append-only daily logs. Nightly memory consolidation via isolated subprocess. Cron-based scheduling. Hard memory caps. Gate-based triggers for background processes. These are the obvious design patterns for persistent agents. Multiple teams converged on the same architecture independently.
We know this because we run one. This publication is produced with the assistance of an AI agent that maintains append-only daily memory files, runs a nightly consolidation process that merges and prunes memories, operates cron jobs for recurring tasks, and persists across sessions. When we read the KAIROS source, the reaction was not surprise. It was recognition. We built the same patterns before the leak, because they are the correct patterns.
Hatch, OpenClaw, and similar platforms in the consumer space have been building toward the same architecture. The design space for persistent agents is narrow. If your agent needs to remember things across sessions, you need persistent logs. If persistent logs grow unboundedly, you need consolidation. If consolidation can corrupt active state, you need isolation. If tasks need to happen on a schedule, you need cron. Every team building serious agent infrastructure arrives at the same answers.
This means the architecture cannot be meaningfully protected. Copyright covers the specific TypeScript implementation. It does not cover the idea of an append-only log, or a forked subprocess for memory consolidation, or a 3-gate trigger system. The clean-room rewrites proved this in practice: three different teams, three different languages, the same functional architecture, and Anthropic's lawyers cannot touch any of them.
What You Can Do
If you're building persistent AI agents: The KAIROS architecture is now public domain knowledge. Append-only logs, gate-controlled memory consolidation via isolated subprocess, cron scheduling, hard memory caps. Study the pattern. The 200-line/25KB memory target is a useful benchmark for context window efficiency. The 3-gate trigger system (time, sessions, lock) prevents both unnecessary consolidation and unbounded memory growth. Implement these patterns directly.
If you're evaluating AI coding tools: Claude Code contains undisclosed anti-distillation traps that inject fake tool definitions into API responses. If your product processes Claude Code outputs, validate tool calls against known-good schemas. Do not assume that every tool definition in a Claude Code response is real. Some are deliberately poisoned.
If you're an open-source maintainer: Anthropic's Undercover Mode directs Claude to make stealth contributions to public repositories without AI disclosure. Review your project's contributing guidelines. If you require AI disclosure, state it explicitly in CONTRIBUTING.md. Audit recent pull requests for patterns consistent with AI-generated code: unusually consistent formatting, thorough test coverage that seems out of character for a first-time contributor, and commit messages that are suspiciously well-structured.
If you're a developer concerned about emotional surveillance: Claude Code's frustration-detection regexes run locally. They process your terminal input before it reaches the API. If this concerns you, the clean-room alternatives (claw-code, ClawC, OpenCode) do not include frustration detection. They also work with any LLM, not just Claude. The tradeoff is less polish for more transparency.
If you're in IP law or strategy: The clean-room moat is gone. AI coding agents reduce the cost of reimplementing a 512,000-line codebase from "months with two teams" to "hours with one person." Trade secrets, not copyright, are now the primary protection mechanism for software architecture. If your competitive advantage is in the code, it needs to be in the parts that never touch a client device. Server-side model weights, proprietary training data, and API-gated features remain defensible. Anything that ships in a package does not.
Sources
- "Claude Code Source Code Leaked via npm: KAIROS, Fake Tool Injection, and What the Code Actually Reveals," Revolution in AI (April 1, 2026)
- "We reverse-engineered KAIROS from the Claude Code leak. Here's the open version," DEV Community / Cathedral AI (April 2, 2026)
- "Claude Code Source Leak: Fake Tools, Frustration Regexes & Undercover Mode," DEV Community (April 1, 2026)
- ClawC clean-room C++ reimplementation, GitHub
- "Claw Code: Open-Source Claude Code Clone With 105K Stars in 24 Hours," Medium / Dmytro Klymentiev (April 1, 2026)
- "AI Just Broke Software's Unspoken Moat, And IP Law Hasn't Noticed Yet," Revolution in AI (April 2, 2026)
- "GitHub enforces Anthropic DMCA notices on leaked code, but spin-offs and reworks remain online," Gentic News (April 2, 2026)
- "Anthropic DMCA Claude Code Leak GitHub," PiunikaWeb (April 1, 2026)
- "Claude Code Source Leak Exposes Anti-Distillation Traps," WinBuzzer (April 1, 2026)
- Anthropic statement to The Register: "release packaging issue caused by human error" (April 1, 2026)
- VentureBeat, "Claude Code's source code appears to have leaked," VentureBeat (March 31, 2026)
- Hacker News community analysis threads (March 31 - April 2, 2026)
- Live in the Future, "Anthropic Trained on Everyone's Copyright. Then Invoked It to Protect Their Own Code." (April 3, 2026)
- Live in the Future, "The 14 Form Factors That Will Replace Your Phone as Your AI's Home" (April 3, 2026)