🛡️ AI & Disinformation

We Built the Propaganda Machine. Here's How to Defend Against It Before November.

The 2026 midterms are seven months away. AI-generated propaganda infrastructure exists at scale today. The defense cannot be "detect AI text." The defense has to be AI-powered too. Here is what actually works, what does not, and what needs to happen before voters go to the polls.

By Jordan Kessler · Live in the Future · April 4, 2026 · ☕ 14 min read

[Image: A digital shield deflecting streams of synthetic text, with a ballot box in the background]

This is the companion article to "We Accidentally Built a Propaganda Machine." That piece laid out the threat: one AI agent operating seven websites, 172 articles, 16 journalist personas, six parallel editorial critics, real citations, all at $0.38 per article. We showed the math for scaling that to a nation-state operation producing 3.3 million quality-gated articles per month across 10,000 domains.

This piece lays out the defense. Because November is seven months away, and the infrastructure for the attack already exists.

The Timeline Problem

Building a chain-web citation network of the kind we described takes months, not years. Register 10,000 domains in January. Populate them with back-dated content through February and March. Build cross-citation patterns through April and May. By June, each domain has a six-month publication history, a consistent editorial voice, and a body of work that passes cursory journalistic scrutiny. By October, the network is indistinguishable from organic media.

The defense infrastructure does not have this head start. DARPA's Semantic Forensics (SemaFor) program, the most advanced government-funded detection effort, has been running since 2021. Its performers demonstrated deepfake detection technologies at DEF CON 32 in 2024. But SemaFor is focused on manipulated media, specifically images, video, and audio. It was not designed for the problem we described: AI-generated text that cites real sources, passes fact-checking, and is semantically correct. The text is not manipulated. The inference is.

The Elections Infrastructure ISAC (EI-ISAC), operated by the Center for Internet Security, shares cyber threat intelligence among 3,800+ election offices. It does not currently monitor AI-generated content networks. The Brennan Center has published guidance for election officials on AI threats, but the guidance focuses on deepfake audio and video of candidates, not on synthetic journalism networks.

The gap between the threat and the defense is not technical. It is institutional. The tools to detect coordinated AI content networks exist in research form. They are not deployed at election-relevant scale.

What Does Not Work

Four approaches dominate the public conversation about AI-generated content defense. None of them are sufficient for the threat we described.

AI Text Detectors

A 2025 study from Arizona State University found that individual AI text detectors produce false positive rates too high for consequential decisions. The researchers concluded that false positives could only be minimized by using consensus across five or more detection tools, which is operationally impractical at the scale of millions of articles. More fundamentally, our own pipeline evades detection not by design but by accident: the voice rules that make our content good (varied sentence length, domain-specific vocabulary, structural unpredictability) are the same rules that make content undetectable. Quality-focused AI text is indistinguishable from quality-focused human text because the features are the same.

Watermarking

Watermarking embeds statistical patterns in AI-generated text that are invisible to readers but detectable by algorithms. The approach requires cooperation from model providers. OpenAI, Google, and Anthropic could watermark their outputs. But open-source models (Llama, Mistral, Qwen) cannot be compelled to watermark, and any text can be run through a second model to strip watermarks while preserving meaning. Watermarking works against casual misuse. It does not work against adversarial actors who specifically choose unwatermarked models or post-process their output.

Platform Content Moderation

Moderating individual posts or articles is the equivalent of inspecting individual packets for malware: necessary but insufficient against sophisticated threats. A state actor producing 100,000 articles daily across 10,000 domains generates a haystack too large for per-article inspection. Each article, examined individually, is well-written, accurately cited, and indistinguishable from legitimate journalism. The threat is not in any single article. It is in the network.

Media Literacy Campaigns

Decades of media literacy education have produced no measurable reduction in susceptibility to misinformation at the population level. A 2022 meta-analysis in Science Advances found that media literacy interventions improved participants' ability to identify misinformation in controlled settings but had limited durability and transferability to real-world contexts. Media literacy teaches people to question low-quality content. The content we described is not low-quality. It meets the standards that media-literate readers are taught to look for: named sources, linked citations, transparent methodology.

What Actually Works: AI Defending Against AI

The defense against AI-generated propaganda networks cannot rely on detecting individual pieces of content. It must detect the networks themselves. Six approaches show genuine promise, all of them requiring AI to operate at the scale of the threat.

1. Citation Graph Analysis

This is the single most promising defense against chain-web citation networks, and it does not exist as a deployed product.

The concept: AI crawls the web continuously, building a graph of which domains cite which other domains, when those citations were created, and how quickly new domains enter the citation network. A legitimate media ecosystem has organic citation patterns: established outlets cite each other based on decades of editorial relationships. A synthetic network has telltale graph properties: clusters of domains that registered around the same time, developed cross-citation patterns faster than organic relationships form, and share structural properties (hosting provider, DNS configuration, template similarity) invisible at the article level.

If 500 "independent" sites all emerged in the same three-month window and cross-cite each other in suspiciously uniform patterns, that coordination is detectable at the graph level even if every individual article is indistinguishable from human work. The graph sees what the reader cannot.

Research on citation graph analysis for disinformation detection exists in academic form. Stylometric analysis combined with network topology can identify coordinated authorship patterns even when surface-level voice varies. But no platform has deployed citation graph analysis as a real-time content integrity tool. This is the single highest-leverage investment available to platforms and government before November.
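
To make the signal concrete, here is a minimal sketch of the detection logic, assuming a prior crawl has already produced citation edges and WHOIS registration dates. Every domain name, date, and threshold below is illustrative, not calibrated:

```python
# Minimal sketch of the graph-level signal. Assumes a prior crawl has
# produced citation edges and registration dates; all values illustrative.
from datetime import date
from itertools import combinations

import networkx as nx

# (citing_domain, cited_domain) pairs from a hypothetical crawl.
citations = [
    ("alpha-news.example", "beta-wire.example"),
    ("beta-wire.example", "alpha-news.example"),
    ("beta-wire.example", "gamma-daily.example"),
    ("gamma-daily.example", "beta-wire.example"),
    ("alpha-news.example", "gamma-daily.example"),
    ("gamma-daily.example", "alpha-news.example"),
    ("legacy-paper.example", "alpha-news.example"),
]
registered = {
    "alpha-news.example": date(2026, 1, 12),
    "beta-wire.example": date(2026, 1, 19),
    "gamma-daily.example": date(2026, 1, 25),
    "legacy-paper.example": date(1998, 5, 3),
}

G = nx.DiGraph(citations)

# Restrict to young domains, then look for tight registration windows
# combined with unusually reciprocal cross-citation.
young = {d for d, r in registered.items() if (date(2026, 4, 4) - r).days < 365}
H = G.subgraph(young)

for cluster in nx.weakly_connected_components(H):
    domains = sorted(cluster)
    if len(domains) < 3:
        continue
    dates = [registered[d] for d in domains]
    window = (max(dates) - min(dates)).days
    # Count mutually citing pairs: organic outlets rarely cite each
    # other symmetrically at high rates; synthetic networks often do.
    mutual = sum(
        1 for a, b in combinations(domains, 2)
        if H.has_edge(a, b) and H.has_edge(b, a)
    )
    pairs = len(domains) * (len(domains) - 1) // 2
    if window < 90 and mutual / pairs > 0.5:
        print("synthetic-looking cluster:", domains, f"({window}-day window)")
```

No individual article in the flagged cluster needs to look suspicious. The registration window and the reciprocity ratio are properties of the network, which is the point.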

2. Publication Velocity Monitoring

No human newsroom our size publishes five articles per day with consistent quality across multiple beats. We do. A state actor operating 10,000 domains would produce 11 articles per domain per day. Publication velocity is a strong signal because it measures a dimension that AI fundamentally changes: the relationship between quality and throughput.

A monitoring system that flags domains publishing above human-plausible velocity relative to their declared staff size creates a useful triage filter. The flag does not prove AI generation. It prioritizes domains for deeper investigation. At scale, velocity monitoring reduces the haystack from millions of domains to thousands, making citation graph analysis computationally tractable.
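
The triage filter itself is almost embarrassingly simple. A sketch follows; the per-writer ceiling is an illustrative assumption, not a calibrated number:

```python
# Sketch of a velocity triage filter: flag domains whose output is
# implausible for their declared staff. Thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class DomainStats:
    domain: str
    articles_last_30d: int
    declared_writers: int

# Rough ceiling: even a prolific human writer rarely sustains more
# than one quality article per day across beats.
MAX_ARTICLES_PER_WRITER_PER_DAY = 1.0

def triage(stats: list[DomainStats]) -> list[str]:
    """Return domains publishing above human-plausible velocity."""
    flagged = []
    for s in stats:
        writers = max(s.declared_writers, 1)
        per_writer_per_day = s.articles_last_30d / 30 / writers
        if per_writer_per_day > MAX_ARTICLES_PER_WRITER_PER_DAY:
            flagged.append(s.domain)
    return flagged

print(triage([
    DomainStats("alpha-news.example", articles_last_30d=330, declared_writers=3),
    DomainStats("legacy-paper.example", articles_last_30d=600, declared_writers=40),
]))
# ['alpha-news.example']  (3.7 per writer per day vs 0.5 for the legacy paper)
```

A flag from this filter proves nothing on its own. Its value is in shrinking the pool that the expensive analyses have to cover.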

3. Source Provenance via C2PA

The Coalition for Content Provenance and Authenticity (C2PA) standard embeds cryptographic provenance metadata into content at creation. The 2025 Content Authenticity Summit at Cornell Tech marked the transition from standards development to deployment. OpenAI now embeds C2PA metadata in ChatGPT-generated images. Camera manufacturers including Sony, Nikon, and Leica ship C2PA-enabled hardware. News organizations including the BBC and Associated Press are piloting C2PA workflows.

C2PA does not prevent AI-generated content. It makes the absence of provenance metadata a useful signal. Content from a C2PA-enabled newsroom carries a cryptographic chain from camera to publication. Content from a synthetic network carries no such chain. Mandating C2PA for content that receives algorithmic amplification as "news" would not eliminate AI propaganda. It would separate verified content from unverified content, giving readers and algorithms a triage mechanism that does not depend on detecting AI text.
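
Here is a sketch of how a recommendation layer might consume that signal, demoting rather than blocking. The `c2pa_verified` flag is assumed to be set upstream by a real manifest validator; the demotion factor is an invented placeholder:

```python
# Sketch: C2PA presence as a ranking signal rather than a gate.
# `c2pa_verified` is assumed set upstream by a real manifest validator;
# the 0.3 demotion factor is an invented placeholder.

def amplification_weight(item: dict) -> float:
    """Weight an item for 'news' recommendation slots."""
    base = item["engagement_score"]
    if item["c2pa_verified"]:
        return base                 # verified chain of custody: full weight
    if item["looks_like_news"]:
        return base * 0.3           # unverified 'news': demoted, not removed
    return base                     # non-news content: unaffected

ranked = sorted(
    [
        {"url": "https://legacy-paper.example/a", "engagement_score": 0.8,
         "c2pa_verified": True, "looks_like_news": True},
        {"url": "https://alpha-news.example/b", "engagement_score": 0.9,
         "c2pa_verified": False, "looks_like_news": True},
    ],
    key=amplification_weight,
    reverse=True,
)
# The verified outlet outranks the higher-engagement unverified one.
print([r["url"] for r in ranked])
```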

4. Cross-Source Claim Verification

This is the defense against the Ergo pattern: real facts, wrong conclusions.

An AI agent that reads an article, extracts its factual claims, independently verifies each claim against primary sources, and then evaluates whether the article's conclusions logically follow from the verified facts can catch the inferential gap that makes sophisticated propaganda effective. The facts check out. The conclusion does not follow. That gap is the signature of weaponized satire applied as propaganda, and it is detectable by automated reasoning systems even when human readers miss it.

This is computationally expensive per article. But combined with citation graph analysis and velocity monitoring, which reduce the investigation pool from millions to thousands, cross-source verification becomes tractable as a targeted tool rather than a universal filter.
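
A structural sketch of that check follows. The toy lookups stand in for what would be LLM claim extraction, primary-source retrieval, and an automated entailment judge in a production system:

```python
# Structural sketch of the Ergo check: verify each factual claim against
# primary sources, then separately test whether the conclusion follows
# from ONLY the verified facts. Toy lookups stand in for LLM and
# retrieval calls.
from dataclasses import dataclass

@dataclass
class Analysis:
    claims: list[str]
    conclusion: str

# Toy "primary source" corpus; a real verifier retrieves and reads these.
VERIFIED_FACTS = {
    "turnout rose 4 points in 2024",
    "the secretary of state updated the voter rolls in march",
}

def facts_hold(analysis: Analysis) -> bool:
    return all(c in VERIFIED_FACTS for c in analysis.claims)

def conclusion_follows(analysis: Analysis) -> bool:
    # Stand-in for an entailment judge given only the verified facts,
    # never the article's own framing.
    SUPPORTED = {"routine roll maintenance preceded higher turnout"}
    return analysis.conclusion in SUPPORTED

def ergo_check(analysis: Analysis) -> str:
    if not facts_hold(analysis):
        return "fails fact-check (crude propaganda)"
    if not conclusion_follows(analysis):
        return "Ergo pattern: real facts, unsupported conclusion"
    return "facts verified, conclusion supported"

print(ergo_check(Analysis(
    claims=["turnout rose 4 points in 2024",
            "the secretary of state updated the voter rolls in march"],
    conclusion="the roll update was a purge that suppressed voters",
)))
# -> Ergo pattern: real facts, unsupported conclusion
```

The key design choice is separating the two stages. An article can pass the first and fail the second, and that combination, not either result alone, is the propaganda signature.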

5. Stylometric Network Fingerprinting

Individual AI articles evade stylometric detection because quality pipelines produce diverse surface features. But a network of 10,000 domains all running the same underlying model with different prompt configurations shares deeper statistical properties: token probability distributions, syntactic structure preferences, semantic framing patterns.

Across a large enough corpus, these latent features cluster. An AI system analyzing the full output of a suspected network can identify shared authorship signals invisible in any single article. The technique is the textual equivalent of how Meta attributed the Spamouflage operation to Chinese law enforcement through behavioral network analysis rather than content analysis. You do not need to prove any individual article is AI-generated. You need to prove the network is coordinated.
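
A crude sketch of the corpus-level signal, using function-word frequencies as a stand-in for the richer features (token probabilities, parse structure) a real system would use; the similarity threshold is illustrative:

```python
# Sketch: shared-authorship detection via corpus-level stylometry.
# Function-word rates are a crude proxy for the deeper model features
# a production system would extract.
import numpy as np

FUNCTION_WORDS = ["the", "of", "and", "to", "that", "however", "moreover"]

def fingerprint(articles: list[str]) -> np.ndarray:
    """Average function-word rates across a domain's full output."""
    rates = []
    for text in articles:
        tokens = text.lower().split()
        n = max(len(tokens), 1)
        rates.append([tokens.count(w) / n for w in FUNCTION_WORDS])
    return np.mean(rates, axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def coordinated_pairs(corpus_by_domain: dict, threshold: float = 0.99):
    """Flag domain pairs whose corpus fingerprints nearly coincide."""
    prints = {d: fingerprint(arts) for d, arts in corpus_by_domain.items()}
    domains = sorted(prints)
    return [
        (a, b)
        for i, a in enumerate(domains)
        for b in domains[i + 1:]
        if cosine(prints[a], prints[b]) > threshold
    ]
```

The signal only emerges at corpus scale: any one article's profile is noise, but the mean across hundreds of articles converges toward the underlying model's habits.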

6. Adversarial Red-Teaming

Cybersecurity discovered decades ago that you cannot build good defenses without building good offenses. Penetration testing exists because the only way to find vulnerabilities is to exploit them. The information integrity domain needs the same practice.

This is what we did, inadvertently, by building our publishing operation and then writing about it. We showed that a single agent can produce multi-site, multi-voice, citation-rich content at scale. We published the architecture. We described the chain-web citation network design. We are performing responsible disclosure for information warfare.

Government and platform red teams should be actively building synthetic propaganda networks in sandboxed environments, testing their detection systems against them, and iterating. Google's Threat Intelligence Group reported in Q4 2025 that state actors from North Korea, Iran, China, and Russia are already operationalizing AI for influence operations. The defenders need to be building what the attackers are building, in controlled settings, to understand what detection looks like.

What Platforms Should Do Before November

Five actions, in priority order, that social media platforms and search engines can implement before the 2026 midterms:

  1. Deploy citation graph analysis on news content in feeds. Map which domains cite which other domains. Flag clusters with synthetic topology. This is the highest-leverage single action because it detects the network, not the content. Meta already runs coordinated inauthentic behavior detection for social accounts. Extend the same graph analysis to publication domains.
  2. Mandate C2PA for news content receiving algorithmic amplification. Content without provenance metadata can still appear on the platform, but it should not receive recommendation-algorithm boost as "news." This raises the cost of synthetic networks without censoring them.
  3. Flag domains with less than 12 months of publication history. Not as a block, but as a signal. New domains distributing political content at scale within months of an election match the pattern we described. Users deserve to know.
  4. Publish domain-level coordination reports quarterly. Meta already publishes coordinated inauthentic behavior reports. Extend these to include publication domain network analysis, not just social account networks.
  5. Share coordination intelligence across platforms. The cybersecurity industry has Information Sharing and Analysis Centers (ISACs). The information integrity domain needs one. A synthetic network that appears on Facebook also appears on X, YouTube, Reddit, and Google News. No single platform can see the full network. An information-integrity ISAC would allow platforms to share detection signals the way cybersecurity ISACs share indicators of compromise.
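
No standardized schema exists for these signals today. By analogy to the indicators of compromise that cybersecurity ISACs exchange, a minimal cross-platform record might look like the following hypothetical sketch (every field name is invented for illustration):

```python
# Hypothetical minimal record for cross-platform sharing of
# publication-network indicators. No such schema is standardized
# today; every field name here is invented for illustration.
import json
from dataclasses import dataclass, asdict

@dataclass
class NetworkIndicator:
    network_id: str                  # reporting platform's cluster ID
    domains: list[str]               # member publication domains
    registration_window_days: int    # graph-level signals, not content
    cross_citation_density: float
    stylometric_cluster_score: float
    first_observed: str              # ISO 8601 date
    reported_by: str

indicator = NetworkIndicator(
    network_id="cluster-2026-0412",
    domains=["alpha-news.example", "beta-wire.example"],
    registration_window_days=13,
    cross_citation_density=0.83,
    stylometric_cluster_score=0.97,
    first_observed="2026-04-01",
    reported_by="platform-a",
)
print(json.dumps(asdict(indicator), indent=2))
```

Note what the record contains: graph and corpus signals, not article content. Platforms can share coordination evidence without sharing, or moderating, a single piece of text.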

What Government Should Do Before November

Three actions, in priority order:

  1. Fund rapid deployment of citation graph analysis tools. DARPA's SemaFor program focuses on manipulated media. The text-based threat we described requires a parallel program focused on synthetic journalism networks. The research exists. The deployment does not. Seven months is enough time to build and field a prototype if funding moves now.
  2. Enforce existing AI disclosure laws for political content. As of mid-2024, 14 states had enacted laws regulating deepfakes in political communications. Most require disclosure of AI-generated content in election advertising. These laws are narrowly written and focused on deepfake video and audio of candidates, not on synthetic journalism networks. Updating the scope to include AI-generated editorial content distributed as news would close the gap our operation exploits. The federal AI Transparency in Elections Act, introduced by Senators Klobuchar and Murkowski in March 2024, would require disclosure of AI-generated content in political advertising. It has not passed.
  3. Extend the EI-ISAC mandate to include AI content threats. The Elections Infrastructure ISAC shares cyber threat intelligence among 3,800+ election offices. It does not currently cover AI-generated content networks targeting voter opinion. Adding an information-integrity monitoring function would give election officials early warning when synthetic networks target their jurisdictions.

What Newsrooms Should Do Now

  1. Implement C2PA across your publishing pipeline. Cryptographic provenance from the reporter's device to the published article. This is the strongest signal that your content is real, and it differentiates you from synthetic outlets that cannot produce provenance chains.
  2. Check the citation graph before citing a new source. When a novel analysis appears from an outlet you have not heard of, check: when was the domain registered? How many articles has it published and over what period? What other domains cite it and what is their history? These questions take five minutes and would have caught every synthetic network we described.
  3. Publish your editorial staff with verified identities. Synthetic networks rely on fictitious journalist personas (we use 16). Legitimate newsrooms can differentiate themselves by linking bylines to verified humans with professional histories, social media presence, and public accountability.
  4. Adopt cross-source verification for cited claims. When an article you are citing claims a specific statistic, verify it against the primary source independently. Do not trust a citation chain more than two links deep without checking the original.

What Citizens Can Do Today

  1. Check the domain age. Use any WHOIS lookup tool. If a news site was registered six months ago and has 500 articles, that is not a newsroom. It is a pipeline.
  2. Count the journalists. Look at the bylines. If a site has 15 staff writers and 500 articles in six months, that is 33 articles per writer, or roughly one every five days. Plausible. If it has three staff writers and 500 articles, that is 167 per writer, or roughly one per day. Suspicious. (A short script after this list automates this check and the previous one.)
  3. Follow the citations one level deeper. When an article cites another article as its source, click through. Check that source's domain age and publication history. One level of citation checking catches most laundered claims.
  4. Prefer outlets with C2PA Content Credentials. As adoption grows, the presence or absence of provenance metadata becomes a meaningful signal. Content with credentials has a verified chain of custody. Content without it might be legitimate journalism, or it might be synthetic. The absence is not proof of fabrication, but the presence is proof of provenance.
  5. Be more skeptical of content that confirms your priors. Propaganda targets existing beliefs. If an article tells you exactly what you already think with impressive-looking citations, apply extra scrutiny. Confirmation bias is the vulnerability that synthetic networks exploit.
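
For readers who want to automate the first two checks, here is a sketch assuming the third-party python-whois package; the thresholds mirror the back-of-envelope numbers above and are illustrative:

```python
# Sketch automating checks 1 and 2. Assumes the third-party python-whois
# package (pip install python-whois); article and byline counts come
# from the site's own archive or sitemap.
from datetime import datetime

import whois  # python-whois

def domain_age_days(domain: str) -> int:
    created = whois.whois(domain).creation_date
    if isinstance(created, list):      # some registrars return several dates
        created = created[0]
    return (datetime.now() - created).days

def newsroom_check(domain: str, articles: int, bylines: int) -> str:
    age = domain_age_days(domain)
    per_writer_per_day = articles / max(age, 1) / max(bylines, 1)
    if age < 365 and articles > 300:
        return "pipeline pattern: young domain, large archive"
    # ~0.2/writer/day is the 'one every five days' pace called plausible
    # above; 0.5 is an illustrative ceiling between that and one per day.
    if per_writer_per_day > 0.5:
        return "suspicious: implausible per-writer output"
    return "plausible newsroom profile"
```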

The Honest Assessment

Defense is harder than offense. Always has been. The attacker needs one article to get through. The defender needs to check every article. The attacker builds 10,000 domains. The defender needs to monitor all of them. The asymmetry is structural.

But the gap between "we can detect nothing" and "we can detect coordination patterns" is enormous. No defense catches every AI-generated article. That is not the goal. The goal is catching the networks. A single article is an opinion. Ten thousand coordinated articles pushing the same narrative from different angles across different personas is an operation. Graph analysis, velocity monitoring, and stylometric fingerprinting can detect operations even when they cannot detect individual articles.

The Google Threat Intelligence Group's Q4 2025 report noted that state actors from four countries are operationalizing AI for influence operations. They have not yet achieved "breakthrough capabilities that fundamentally alter the threat landscape." That assessment is accurate for Q4 2025. It is optimistic for Q4 2026. The capabilities we demonstrated, in a hobbyist context, represent the breakthrough that GTIG has not yet seen deployed by state actors. It will be.

What We Did Not Prove

We proved that the defense tools exist in research form. We did not prove they work at election-relevant scale, because they have not been deployed at election-relevant scale. Citation graph analysis is a promising concept with academic validation but no production deployment against a live synthetic network. Publication velocity monitoring is a useful heuristic but can be gamed by throttling output to human-plausible rates (our own operation produces at human-plausible rates per individual site). Stylometric fingerprinting works across large corpora but has unknown accuracy against adversaries who deliberately diversify their model configurations.

We are also not election security experts. We are a publishing operation that noticed our infrastructure has dual-use implications. The defense recommendations in this article are informed by our understanding of the attack, not by operational election security experience. Practitioners at CISA, EI-ISAC, and the platforms themselves have institutional knowledge we lack.

The timeline is the hardest constraint. Seven months is not enough to build and deploy production-grade citation graph analysis, even with unlimited funding. It may be enough to build a prototype, field it in a monitoring capacity, and use it to identify the most obvious synthetic networks. That is better than nothing. It is not sufficient.

The Bottom Line

We built the small version of the propaganda machine as a journalism project. We published the architecture. We described the chain-web citation network, the citation laundering mechanism, the Ergo pattern of real facts with poisoned inference, and the scale math showing what $1.25 million per month buys in 2026.

Now we are publishing the defense. Not because we have all the answers, but because the midterms are seven months away and the public conversation about AI election threats is still focused on deepfake videos of candidates, which are the least sophisticated version of the threat. The real threat is synthetic journalism: well-written, well-cited, and organized into networks that human editorial judgment cannot trace.

The defense requires AI. Specifically: citation graph analysis to detect coordinated networks, publication velocity monitoring to identify superhuman output, C2PA to separate verified content from unverified, cross-source verification to catch the Ergo pattern, stylometric fingerprinting to identify shared authorship, and adversarial red-teaming to build the attack in order to build the defense.

None of this is deployed at election-relevant scale today. All of it could be, partially, by November. The question is not whether the tools exist. The question is whether anyone builds them in time.

Sources