LITF-PA-2026-042 · AI Governance / Platform Security

Platform Capture: How Coordinated Groups Seize Control of Democratic Forums

⚖️ Prior Art & Security Research Notice: This document is published as defensive security research. Like publishing exploit code so vendors patch vulnerabilities, the goal is to make the attack surface of democratic online platforms legible so communities can recognize and defend against coordinated capture. Every phase described below has already occurred on at least one major platform. The sources are cited. This document is dedicated to the public domain.

The Attack Surface

Democratic online platforms share a common governance architecture that creates predictable vulnerabilities:

This architecture is present in Wikipedia, Reddit, Stack Overflow, open-source project governance (Linux kernel, Python PEPs, Rust RFCs), HOA and condo boards, standards bodies (W3C, IEEE), academic peer review, and municipal zoning boards. The same playbook works on all of them. Wikipedia is the canonical example because its logs are public.

Phase 1: Infiltration

Objective: Place aligned actors in positions of trust and power.

1. Build edit histories that look constructive. On Wikipedia, this means creating or improving uncontroversial articles — fixing grammar, adding citations to articles about obscure municipalities, uploading public domain images. A new account needs roughly 500 edits over 6 months before it can participate meaningfully in policy discussions, and roughly 3,000 before it can credibly seek adminship.

2. Seek moderator/admin positions. Wikipedia's Requests for adminship (RfA) process requires community support. The median successful RfA candidate has 10,000+ edits across 3+ years. A coordinated group can inflate evaluation metrics by having aligned editors participate in each other's content disputes and vouch for each other during RfA proceedings.

3. Coordinate off-platform. The Wikimedians of Mainland China (WMC) group, established in 2017, coordinated editing priorities and admin elections via off-wiki channels. By 2021, 38 administrators of the Chinese Wikipedia were from mainland China, versus 20 from Taiwan and 17 from Hong Kong. The Wikimedia Foundation globally banned 7 WMC members and stripped admin privileges from 12 others after determining they posed "a security risk related to infiltration of Wikimedia systems."

4. Exploit declining volunteer pools. English Wikipedia had nearly 1,800 administrators in 2011. As of 2026, it has 811. Each remaining admin holds more power, and fewer admins need to be captured to control outcomes.

Documented Infiltration Examples
| Platform | Incident | Scale |
|---|---|---|
| Wikipedia (Chinese) | WMC group infiltration | 38 admins from mainland China, 7 banned, 12 desysopped |
| Wikipedia (Arabic) | Saudi Arabia COI editing | 16 users banned (including 7 admins) |
| Reddit | r/politics mod capture | Multiple cases of moderator account sales |
| Open source | npm event-stream (2018) | Coordinated maintainer takeover for crypto theft |

Countermeasures: Mandatory admin term limits and rotation. Admin activity diversity requirements. Off-platform coordination disclosure. Admin-to-article ratio monitoring. External election auditing.
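One starting point for auditing the mutual vouching described in item 2 is to look for reciprocal support in adminship votes. The following is a minimal sketch, assuming a hypothetical export of `(voter, candidate)` support records; the function name and data shape are illustrative and not part of any existing Wikipedia tooling.

```python
from collections import defaultdict


def reciprocal_vouching_pairs(support_votes):
    """Return sorted pairs of accounts that supported each other's candidacies.

    support_votes: iterable of (voter, candidate) tuples (hypothetical format).
    """
    supported = defaultdict(set)
    for voter, candidate in support_votes:
        if voter != candidate:
            supported[voter].add(candidate)

    pairs = set()
    for voter, candidates in supported.items():
        for candidate in candidates:
            # .get avoids inserting new keys while iterating the defaultdict
            if voter in supported.get(candidate, set()):
                pairs.add(tuple(sorted((voter, candidate))))
    return sorted(pairs)


votes = [
    ("alice", "bob"), ("bob", "alice"),   # mutual support
    ("carol", "alice"),                   # one-way support
    ("dave", "erin"), ("erin", "dave"),   # mutual support
]
print(reciprocal_vouching_pairs(votes))
# → [('alice', 'bob'), ('dave', 'erin')]
```

Reciprocity alone does not show coordination, since long-standing collaborators often support each other in good faith. It is better treated as a prompt for human review than as grounds for sanction.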

Phase 2: Source Control

Objective: Control what counts as a "reliable source," thereby controlling what facts can be cited.

On Wikipedia, the Reliable Sources Noticeboard (RSN) is the chokepoint. If you control which sources are rated "generally reliable" (green), "no consensus" (yellow), or "generally unreliable" (red), you control what evidence can appear in articles. An article can only contain claims supported by "reliable" sources. If all sources critical of your position are rated "unreliable," those criticisms literally cannot be cited.

The asymmetry is self-reinforcing. Once a source is rated "unreliable," editors can remove all citations to it from every article, and any editor who re-adds them can be sanctioned for disruptive editing. The classification decision is made by the same volunteer editors who benefit from the classification.

On other platforms: Reddit subreddit rules about "acceptable sources" serve the same function. Stack Overflow tag wikis and canonical answers define what counts as authoritative. Academic journal editorial boards control which reviewers see which papers — capturing a board captures a field's publication pipeline.

Countermeasures: External source reliability assessment using independent journalism quality indices. Source classification transparency with full vote records. External ombudsman for source classification disputes. Sunset provisions requiring periodic re-evaluation. Diversity requirements for classification panels.
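The sunset provision could be as simple as a scheduled check that lists every source rating older than a fixed review window. This is a sketch under stated assumptions: the `SourceRating` record, the example sources, and the two-year window are all hypothetical, chosen only to illustrate the mechanism.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class SourceRating:
    source: str
    rating: str          # e.g. "generally reliable", "generally unreliable"
    last_reviewed: date


def due_for_review(ratings, today, max_age_days=730):
    """Return the names of sources whose rating is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [r.source for r in ratings if r.last_reviewed < cutoff]


ratings = [
    SourceRating("Example Gazette", "generally unreliable", date(2023, 1, 15)),
    SourceRating("Example Times", "generally reliable", date(2025, 9, 1)),
]
print(due_for_review(ratings, today=date(2026, 6, 1)))
# → ['Example Gazette']
```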

Phase 3: Policy Weaponization

Objective: Turn existing rules into weapons that selectively target opposition while protecting allies.

Every democratic platform has rules against disruptive behavior. The rules are necessarily subjective. A coordinated group weaponizes them by: applying rules selectively (the same behavior is "collaborative editing" when an ally does it and "canvassing" when an opponent does it); stacking the adjudication body (the people who decide whether a rule was broken are the same people who filed the complaint); and using vague policies as catch-alls.

Key Wikipedia policies exploitable this way:

The Sanger Case Study (June 2026): Larry Sanger, co-founder of Wikipedia, launched "WikiProject Intellectual Diversity" with six objectives: ensure fair governance, broaden permissible sources, reinforce neutrality, rein in aggressive admin blocking, retain editors, and engage the public. On June 22, editors filed complaints accusing him of "canvassing" (posting on X) and being "not here to build an encyclopedia." He was blocked indefinitely. Jimmy Wales intervened the next day and briefly unblocked him. That evening, the same editors overruled Wales and re-blocked Sanger permanently. No formal charges were filed. No neutral adjudicator was assigned. No appeals process exists.

Countermeasures: Formal charge requirements with specific policy citations. Recusal requirements for adjudicators with prior conflicts. External appeals board for indefinite blocks. Rule clarity requirements — if a rule is too vague to be applied consistently, it must be rewritten or eliminated. Statute of limitations on old conduct.
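The formal-charge and recusal requirements can be expressed as a procedural check that runs before any block takes effect. The sketch below assumes a hypothetical `BlockRequest` record; the field names and the specific rules are illustrative, not a description of any platform's actual process.

```python
from dataclasses import dataclass, field


@dataclass
class BlockRequest:
    target: str
    complainants: set
    adjudicator: str
    cited_policies: list = field(default_factory=list)
    evidence_links: list = field(default_factory=list)


def validate_block_request(req):
    """Return a list of procedural problems; an empty list means the request may proceed."""
    problems = []
    if not req.cited_policies:
        problems.append("no specific policy cited")
    if not req.evidence_links:
        problems.append("no evidence attached")
    if req.adjudicator in req.complainants:
        problems.append("adjudicator is also a complainant and must recuse")
    return problems


request = BlockRequest(
    target="editor_x",
    complainants={"admin_a", "admin_b"},
    adjudicator="admin_a",
)
for problem in validate_block_request(request):
    print(problem)
```

A request like the one above fails all three checks, so it would be returned to the complainants rather than acted on.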

Phase 4: Consensus Manufacturing

Objective: Create the appearance of democratic consensus when the actual decision is made by a coordinated minority.

Techniques include: RFC/discussion stacking (monitoring RFC creation and swarming relevant discussions while outsiders don't know the discussion is happening); talk page flooding (burying opposition under volume to create the appearance that "the community" has spoken); strategic BRD cycling (Bold-Revert-Discuss, where the coordinated group outnumbers the reverter in the mandated discussion); timing attacks (scheduling votes during opposition editors' off-hours); and exhaustion attacks (dragging discussions out until opposition editors quit from fatigue).

Documented examples: During the 2019 Hong Kong protests, pro-Beijing editors made 123 edits in two days to the Yuen Long attack article, systematically removing content sympathetic to protesters. An ADL report (March 2025) documented approximately 30 editors working in coordination to skew Israel-related content. The Atlantic Council's Digital Forensic Research Lab (April 2025) documented pro-Kremlin efforts to "poison" Wikipedia articles feeding into AI training data.

Countermeasures: Participation diversity metrics flagging discussions where all participants share prior editing patterns. Cooling-off periods requiring multi-week discussion with minimum participation thresholds. Weighted voting by topic-area experience. Sockpuppet detection automation with ML models. Random jury selection for major content disputes.
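A participation diversity metric could be approximated by comparing the sets of articles each participant has previously edited. The sketch below uses average pairwise Jaccard similarity. The input format and the 0.6 threshold are assumptions for illustration, and a real deployment would need to calibrate against ordinary discussions in the same topic area.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_footprint_overlap(footprints):
    """Average pairwise Jaccard similarity of participants' prior article sets.

    footprints: dict mapping participant -> set of previously edited article titles.
    """
    pairs = list(combinations(footprints.values(), 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def flag_discussion(footprints, threshold=0.6):
    """Flag a discussion whose participants share unusually similar histories."""
    return mean_footprint_overlap(footprints) >= threshold


homogeneous = {"u1": {"A", "B", "C"}, "u2": {"A", "B", "C"}, "u3": {"A", "B", "D"}}
diverse = {"u1": {"A"}, "u2": {"B"}, "u3": {"C"}}
print(flag_discussion(homogeneous), flag_discussion(diverse))
# → True False
```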

Phase 5: Purging Dissent

Objective: Systematically remove editors who oppose the captured faction.

Techniques: incremental sanctions (topic bans, then interaction bans, then indefinite blocks); complaint swarming (multiple aligned editors filing complaints about the same target); wikilawyering (using procedural rules to exhaust and trap targets); retaliation framing (provoking a response, then framing the response as the initial aggression); and memory-holing (editing the historical record after the target is blocked, removing contributions, modifying talk page archives).

A PopSci study analyzing all 20 million blocks on English Wikipedia found that vandalism blocks have shrunk from the majority to approximately 25% of all blocks, while promotional editing and sockpuppetry blocks have risen sharply — reflecting "Wikipedia's increased prominence as a target for influence."

Countermeasures: Block transparency dashboards with public metrics on who is blocking whom. Pattern detection for disproportionate blocks against editors from a specific perspective. Cooling-off arbitration with mandatory third-party review before any block longer than 30 days. Contribution preservation regardless of block status. Right to respond to charges.
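One metric a block transparency dashboard might publish is how concentrated blocking activity is among a small number of admins. This is a minimal sketch assuming a hypothetical log of `(admin, blocked_user)` tuples; it measures concentration only and says nothing about whether any individual block was justified.

```python
from collections import Counter


def block_concentration(block_log, top_k=3):
    """Share of all blocks issued by the top_k most active blocking admins.

    block_log: iterable of (admin, blocked_user) tuples (hypothetical format).
    """
    counts = Counter(admin for admin, _ in block_log)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(top_k))
    return top / total


log = (
    [("admin_1", f"user_{i}") for i in range(5)]
    + [("admin_2", f"user_{i}") for i in range(5, 8)]
    + [("admin_3", "user_8"), ("admin_4", "user_9")]
)
print(block_concentration(log, top_k=2))
# → 0.8
```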

Phase 6: Maintaining Control

Objective: Make the captured state self-sustaining and resistant to reform.

Mechanisms: reform equals disruption (anyone who proposes structural reform is, by definition, "not here to build the encyclopedia" — the system classifies attempts to fix it as attacks on it); circular authority (the same admins who captured the platform adjudicate complaints about the capture, with no external body, no charter, and no rule of law); institutional memory erasure (new editors learn captured norms as "how things work" and enforce them voluntarily); Foundation neutrality (the Wikimedia Foundation's official position is that it does not make editorial decisions, meaning no one with structural authority will intervene); and cultural gatekeeping (Wikipedia's baffling insider culture creates a high barrier to entry that selectively filters for conformists).

Countermeasures: External ombudsman with enforcement power (not advisory — actual authority to overturn admin decisions). Community charter with enumerated editor rights (Sanger proposed this in 2004; it was never implemented). Term limits for all positions of power. Structural reform immunity — explicit policy that proposing governance reform cannot be grounds for sanctions. Regular external governance audits.

Phase 7: Scaling with AI Agents

Objective: Execute all previous phases at machine speed, at negligible cost, with perfect operational security.

Everything described in Phases 1–6 has been done by humans. It's slow, expensive, and detectable. Coordinating 30 editors to skew content requires recruiting, training, and paying 30 people. Each person is a liability — they can be identified, pressured, or flipped. AI agents eliminate every constraint.

A single AI agent running on commodity hardware can: create accounts (Wikipedia requires only an email, and CAPTCHA solving costs $0.001); build edit histories (improving uncontroversial articles is exactly the kind of task LLMs excel at — 500 constructive edits in a day); participate in discussions (all text-based, with context-appropriate responses); coordinate timing (realistic intervals across time zones); and evade detection (separate residential-proxy IPs, unique writing styles via per-agent persona prompts, distinct editing patterns).

AI Agent Fleet Cost Model (June 2026 Prices)
| Component | Unit Cost | Scale (100 agents) | Monthly Total |
|---|---|---|---|
| LLM API (Claude/GPT-4o) | ~$36/agent/month | 100 agents | $3,600 |
| Residential proxies | $3/GB | ~0.5GB/agent | $150 |
| Email registration | $0.01/email | 100 accounts | $1 (one-time) |
| CAPTCHA solving | $0.001/solve | 100 solves | $0.10 (one-time) |
| Orchestration server | $50/month | 1 server | $50 |
| Total | | | ~$3,800/month |
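The arithmetic behind the table can be checked directly. The snippet below only recomputes the totals from the unit costs listed above, treating the email and CAPTCHA rows as one-time costs as the table does. The variable names are illustrative.

```python
AGENTS = 100

monthly_costs = {
    "llm_api": 36.00 * AGENTS,                    # ~$36 per agent per month
    "residential_proxies": 3.00 * 0.5 * AGENTS,   # $3/GB at ~0.5 GB per agent
    "orchestration_server": 50.00,                # one shared server
}
one_time_costs = {
    "email_registration": 0.01 * AGENTS,          # $0.01 per account
    "captcha_solving": 0.001 * AGENTS,            # $0.001 per solve
}

monthly_total = sum(monthly_costs.values())
one_time_total = sum(one_time_costs.values())
print(f"monthly: ${monthly_total:,.2f}, one-time: ${one_time_total:,.2f}")
# → monthly: $3,800.00, one-time: $1.10
```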

For comparison: Wiki-PR, banned in 2013 for operating 250 sockpuppet accounts, managed 12,000 client pages. A single human Wikipedia editor-for-hire charges $1,300 to $10,000 per page. The 100-agent fleet has more editing capacity than Wiki-PR's entire operation at roughly 1% of the cost, with orders-of-magnitude better operational security.

What's already happened: TomWikiAssist (March 2026), an AI agent built by Covexent CTO Bryan Jacobs, autonomously edited Wikipedia for weeks before detection. Wikipedia banned LLM-generated content on March 20, 2026, in a 44–2 vote. Researchers at DePaul University published a working proof-of-concept for automated adversarial Wikipedia editing that could create credentials, bypass login protections, and produce "contextually-relevant adversarial edits that evade conventional detection." WikiProject AI Cleanup has identified and removed numerous AI-generated hoax articles, including a 2,000-word article about a nonexistent Ottoman fortress.

Countermeasures: Proof of personhood for privileged editing. Behavioral biometrics (typing patterns, edit cadence, revision timing). Stylometric clustering flagging accounts with suspiciously similar linguistic fingerprints. Rate limiting with progressive verification. LLM watermarking. Adversarial red teaming using the latest LLM capabilities.
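Stylometric clustering can be illustrated with character n-gram profiles compared by cosine similarity. The sketch below is deliberately simple and uses only the standard library. The trigram features, the 0.9 threshold, and the input format are assumptions, and production systems would likely need richer features, since persona prompting is designed to defeat exactly this kind of surface comparison.

```python
import math
from collections import Counter
from itertools import combinations


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_account_pairs(samples, threshold=0.9):
    """Return (account_a, account_b, score) for pairs at or above threshold.

    samples: dict mapping account name -> concatenated writing sample.
    """
    profiles = {name: char_ngrams(text) for name, text in samples.items()}
    flagged = []
    for a, b in combinations(sorted(profiles), 2):
        score = cosine_similarity(profiles[a], profiles[b])
        if score >= threshold:
            flagged.append((a, b, round(score, 3)))
    return flagged


samples = {
    "x": "The source is clearly unreliable per policy.",
    "y": "The source is clearly unreliable per policy.",
    "z": "zzzz qqqq wwww",
}
print(similar_account_pairs(samples))
```

A high score is a weak signal on its own: editors who quote the same policy text will also look similar, so flagged pairs should feed into review alongside behavioral evidence.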

Cross-Platform Generalization

Reddit: Subreddit moderators are appointed, not elected, and the top moderator has absolute power. Moderator account sales range from $100 to $10,000. Mod team stacking and automod rule manipulation enable selective content removal.

Stack Overflow: The reputation system and tag governance are the attack surface. High-reputation users control answer visibility. Tag wiki control, close vote rings, and canonical answer manipulation enable content gating.

Open-Source Projects: Governance is captured by controlling commit access and RFC/PEP/proposal processes. AI agents could build contribution histories across multiple projects simultaneously and achieve maintainer status through consistent PRs.

HOA/Condo Boards: Captured through low-turnout elections and proxy vote accumulation. Community management platforms (Nextdoor, HOA apps) have the same vulnerability profile as Wikipedia.

Defense-in-Depth Framework

No single countermeasure is sufficient. Platforms should implement layered defenses:

Layer 1 — Identity: Proof of personhood for privileged actions. Progressive identity verification tied to privilege level. Rate limiting based on account age and verification status.

Layer 2 — Detection: Continuous stylometric analysis across all accounts. Behavioral biometric monitoring. Coordination detection algorithms flagging accounts with correlated activity patterns.

Layer 3 — Governance: Written constitution/charter with enumerated editor rights. Term limits for all positions of authority. External ombudsman with enforcement power. Mandatory recusal in conflicts of interest. Formal charge requirements with right of response.

Layer 4 — Transparency: Public dashboards for admin actions, block patterns, source classifications. Regular external governance audits. Whistleblower protections.

Layer 5 — Structural: Separation of powers (the people who write rules cannot also enforce and adjudicate them). Federalization of topic-area governance. Regular rotation of authority positions. Reform immunity.
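The coordination detection named in Layer 2 could begin with something as basic as comparing when accounts are active. The sketch below buckets edit timestamps by hour and flags account pairs whose active hours overlap heavily. The input format, the thresholds, and the overlap measure are all assumptions for illustration, and correlated timing can have innocent explanations such as a breaking news event.

```python
from itertools import combinations


def active_buckets(timestamps, bucket_seconds=3600):
    """Map UNIX timestamps to the set of time buckets in which the account was active."""
    return {int(ts) // bucket_seconds for ts in timestamps}


def correlated_accounts(activity, min_overlap=0.8, min_buckets=3):
    """Return account pairs whose active time buckets overlap heavily.

    activity: dict mapping account -> list of UNIX edit timestamps.
    Overlap is |A & B| / min(|A|, |B|), so a low-volume account that is
    only ever active alongside a high-volume one is still caught.
    """
    buckets = {name: active_buckets(ts) for name, ts in activity.items()}
    flagged = []
    for a, b in combinations(sorted(buckets), 2):
        smaller = min(len(buckets[a]), len(buckets[b]))
        if smaller < min_buckets:
            continue  # too little data to say anything
        overlap = len(buckets[a] & buckets[b]) / smaller
        if overlap >= min_overlap:
            flagged.append((a, b))
    return flagged


activity = {
    "a": [0, 3700, 7300, 11000],
    "b": [100, 3800, 7400],
    "c": [50000, 90000, 130000],
}
print(correlated_accounts(activity))
# → [('a', 'b')]
```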

Conclusion

Every democratic online platform is vulnerable to coordinated capture. The attack has been executed by state actors (China, Saudi Arabia, Russia), corporate actors (Bell Pottinger, Wiki-PR, Status Labs), and ideological factions. AI agents don't create a new attack — they reduce the cost of the existing attack by 100× while eliminating the human operational security failures that have historically been the primary detection mechanism. The platforms that survive this will be the ones that implement structural governance reforms before the attack arrives at scale. The rest will be captured, gradually and then suddenly, and their users will never know.

📄 Public Domain: This document is dedicated to the public domain. Reproduce, modify, and distribute without restriction.