โ† Back to Live in the Future
๐Ÿ’ผ Labor & AI

Developers Using AI Are 19% Slower. They Think They're 20% Faster.

A randomized controlled trial – the gold standard of scientific evidence – caught the biggest self-deception in the tech industry. Companies are eliminating jobs based on productivity gains that don't survive clinical scrutiny. The 39-percentage-point gap between what developers believe and what actually happens may be the most expensive delusion in corporate history.

By Nadia Kovac · Labor & AI Policy · March 11, 2026 · ☕ 11 min read

Sixteen developers. Two hundred forty-six real programming tasks. Months of observation. One devastating finding.

In early 2025, the nonprofit Model Evaluation & Threat Research (METR) did something no AI vendor has ever done: it ran a proper randomized controlled trial on whether AI coding assistants – Cursor Pro backed by Claude 3.5 Sonnet – actually make experienced developers faster.

They don't.

Experienced open-source contributors working on codebases they knew intimately completed tasks 19% slower with AI assistance than without it. Not marginally slower. Not statistically ambiguous. Nineteen percent.

But here's the part that should make every CEO currently restructuring around AI pause and reread their last board deck: those same developers estimated they were 20% faster with the tools. Before the study started, they predicted a 24% speedup. Afterward, having been measurably slowed down, they still reported feeling 20% more productive.

That's a 39-percentage-point gap between perception and reality. Not a rounding error. A hallucination – and not the kind the AI produces.

The Study Nobody Wanted

METR's setup was deliberately clinical. Each of the 246 tasks – bug fixes, new features, refactors – was randomly assigned to either allow or prohibit AI tools. The developers weren't interns; they were experienced contributors to large open-source repositories they'd worked on for years. The kind of people companies are actively building AI tools to replace.

The results split cleanly. On tasks where AI was prohibited, developers worked at their normal pace. On tasks where AI was allowed, they spent significant additional time reviewing AI suggestions, course-correcting hallucinated code, and debugging subtle errors the models introduced. The context-switching tax alone – bouncing between their own logic and the AI's confident-but-wrong suggestions – erased whatever raw generation speed the tools provided.

One number buries the industry narrative: pull request acceptance rates for AI-assisted code landed at 32.7%, compared with 84.4% for human-written code. Two-thirds of AI-assisted work was rejected by reviewers; the tool's output failed basic quality gates at more than four times the human rate.
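Both ratios fall directly out of the two acceptance rates quoted above; nothing else is needed for the check:

```python
# Acceptance rates as reported in the article; the ratios are derived.
ai_accept, human_accept = 0.327, 0.844

# How much more often human-written PRs were accepted than AI-assisted ones:
accept_ratio = human_accept / ai_accept
# How much more often AI-assisted PRs failed review:
reject_ratio = (1 - ai_accept) / (1 - human_accept)

print(f"accepted {accept_ratio:.1f}x as often")  # 2.6x
print(f"rejected {reject_ratio:.1f}x as often")  # 4.3x
```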

This is the dirty secret hiding behind every "10× developer" LinkedIn post. The AI produces code fast. The code is often wrong. The developer then spends more time fixing it than they would have spent writing it from scratch – but they feel productive the whole time because the screen is full of text and the cursor is moving.

The Duplication Crisis

GitClear put harder numbers on the damage. Analyzing 211 million changed lines of code across repositories at Google, Microsoft, Meta, and other large enterprises between 2020 and 2024, they found that code blocks with five or more duplicated lines increased eightfold. Refactoring – the reorganizing and consolidating of existing code that keeps working software from decaying into unmaintainable spaghetti – collapsed from 25% of all code changes in 2021 to under 10% by 2024.

For the first time in the history of GitClear's tracking, copy-pasted code exceeded refactored code. The DRY principle – "Don't Repeat Yourself," one of the oldest and most fundamental rules in software engineering – is being demolished by autocomplete.
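A minimal illustration of what GitClear is counting (hypothetical code, not drawn from any measured repository): autocomplete happily emits near-identical blocks, where a refactor would have produced one parameterized helper.

```python
# Duplicated: the same validation pasted once per field, the shape of the
# 5-plus-line clones GitClear's analysis flags.
def validate_email(record):
    value = record.get("email")
    if value is None:
        raise ValueError("email is required")
    return value.strip().lower()

def validate_username(record):
    value = record.get("username")
    if value is None:
        raise ValueError("username is required")
    return value.strip().lower()

# Refactored (DRY): one helper, parameterized by field name.
def validate_field(record, field):
    value = record.get(field)
    if value is None:
        raise ValueError(f"{field} is required")
    return value.strip().lower()
```

The duplicated version ships faster; the DRY version is the one you can still change safely a year later, because a fix lands in one place instead of N.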

Kin Lane, a veteran API architect, summarized it bluntly: the industry has accumulated "more technical debt in a shorter period than in 35 years" of commercial software development. That debt will come due. It always does. And when it does, the developers who might have prevented it will have been laid off in restructurings justified by the very tools creating the debt.

The Trust Collapse

Developers know something is off. Stack Overflow's 2025 survey – 65,000 respondents – found that 84% of developers now use AI tools. Adoption is near-universal. But trust has cratered: 46% do not trust the accuracy of AI output, up from 31% in 2024. Only 60% view AI tools favorably, down from 72% the year before.

Read that again. The people actually using these tools every day are losing confidence in them even as their employers double down.

"I would have thought that as the tools matured, user confidence would have followed suit," said Erin Yepis, senior analyst at Stack Overflow, with the diplomatic understatement of someone whose data just contradicted a $427 billion investment thesis.

$427 Billion Chasing $37 Billion

That investment thesis, by the way, is in trouble.

Big Tech collectively spent $427 billion on AI infrastructure in 2025 – data centers, GPUs, power contracts, model training. Enterprise AI revenue that year: roughly $37 billion. That's more than eleven dollars of spending for every dollar of revenue. Sequoia Capital, in an updated analysis, pegged the gap between AI infrastructure investment and actual end-user revenue at $600 billion.
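The arithmetic is one line (figures in billions of USD, as quoted above):

```python
capex = 427    # Big Tech AI infrastructure spend, 2025
revenue = 37   # enterprise AI revenue, same year
print(round(capex / revenue, 1))  # 11.5 dollars spent per dollar earned
```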

MIT's Sloan School found that 95% of enterprise AI projects showed no financial return within six months. Gartner placed generative AI in the "trough of disillusionment" on its hype cycle, citing an 88% pilot failure rate. Bain & Company, surveying actual enterprise deployments, called the realized savings "unremarkable" at 10–15% – a fraction of what vendors had promised.

None of this has slowed the layoffs.

Firing on Vibes

HBR researchers Thomas Davenport and Rajeev Ronanki gave the phenomenon a name in late 2025: "pre-emptive displacement." Companies cutting headcount not because AI has demonstrated it can do the work, but because executives believe it will be able to soon. The decisions are based on vendor demos, conference keynotes, and competitor press releases – not on measured output from their own deployments.

The pattern repeats across industries. Block cut 4,000 positions in early 2026, with CEO Jack Dorsey explicitly citing AI capability. WiseTech eliminated 2,000. eBay dropped 800. Pinterest, 675. The common thread: none published internal data showing AI had absorbed the work of the departed. The layoffs were acts of faith dressed up as strategic transformation.

And then there's Klarna.

The Swedish buy-now-pay-later company halved its workforce from 6,011 to 2,907 over two years, entirely through attrition and hiring freezes – the "invisible layoff" that triggers zero legal reporting requirements. CEO Sebastian Siemiatkowski toured the conference circuit as the poster child for AI efficiency. Revenue per employee jumped from $175,000 to $1.2 million.

Then, in May 2025, he admitted the company had "focused too much on efficiency and cost" and the result was "lower quality." Klarna began rehiring – but not as full-time employees. As gig workers. Uber-style. The displaced were re-employed at degraded terms, and the stock fell 65% post-IPO.

Where AI Actually Works

This isn't a story about AI failing everywhere. That would be simpler and less interesting.

Erik Brynjolfsson's study of customer service agents found a genuine 14% productivity increase, concentrated among less-experienced workers. Shakked Noy and Whitney Zhang documented 40–50% speedups in professional writing tasks. AI tutoring systems show real learning gains. Medical imaging diagnostics are measurably improved.

The pattern is specific: AI helps novices performing well-defined tasks with clear success criteria. It helps customer service agents follow scripts. It helps junior writers produce first drafts. It helps medical residents spot tumors on scans they haven't seen enough of yet.

It does not, according to METR's data, help experienced professionals doing complex work on systems they already understand. For that population – which happens to be the population companies are most eager to replace because their salaries are highest – AI is currently a net drag on productivity.

Harvard Business School researchers found a related pattern: on tasks within an AI's competence frontier, consultants using GPT-4 improved 40%. On tasks outside that frontier – the messy, ambiguous, judgment-heavy work that defines senior roles – performance dropped 23%. The AI didn't just fail to help. It actively made experienced people worse by anchoring them to confident-sounding wrong answers.

The Bottleneck Doesn't Disappear. It Migrates.

METR found something else that should concern anyone managing a software team. When AI accelerated the coding step, downstream code review time increased by 91%. The bottleneck didn't shrink. It moved. Goldratt's Theory of Constraints, applied to software: speeding up one pipeline stage doesn't improve throughput if the next stage can't absorb the volume.
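Goldratt's argument can be sketched with a toy model. All stage rates here are hypothetical, in tasks per day; the point is only that a serial pipeline's throughput is capped by its slowest stage:

```python
def throughput(stage_rates):
    """End-to-end throughput of a serial pipeline: its slowest stage."""
    return min(stage_rates)

baseline = {"coding": 10, "review": 8, "deploy": 12}
with_ai  = {"coding": 20, "review": 8, "deploy": 12}  # AI doubles coding speed

print(throughput(baseline.values()))  # 8 -- review is already the constraint
print(throughput(with_ai.values()))   # 8 -- faster coding changes nothing

# Worse: if AI-assisted code also takes longer to review (the +91% figure),
# the effective review rate drops and total throughput actually falls.
with_ai_slow_review = {"coding": 20, "review": 8 / 1.91, "deploy": 12}
print(round(throughput(with_ai_slow_review.values()), 1))  # 4.2
```

Doubling the coding rate leaves delivery pinned at review's eight tasks per day; inflating review load pushes the whole pipeline below where it started.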

This is what Bain's "unremarkable" 10–15% enterprise savings actually means. AI generates more outputs – code, text, images, analysis – but the human review, correction, integration, and decision-making capacity remains fixed. The organization doesn't get faster. It gets more congested.

GitClear's technical debt data is the slow-motion version of the same problem. AI-generated code ships faster through automated pipelines. The debt accumulates silently. The refactoring that would catch it isn't happening. When the system eventually buckles – and complex systems always eventually buckle – the cost of remediation will dwarf whatever was saved on initial development.

The Bottom Line

There is a 39-point gap between what developers believe AI does for them and what a randomized trial measured. There is a $600 billion gap between what companies are spending on AI and what it's earning. There is an 88% gap between enterprise AI pilot launches and enterprise AI pilot successes. And in that space between belief and evidence, real people are losing real jobs – not because a machine proved it could do their work, but because a PowerPoint slide said it could.

The most dangerous technology isn't the one that replaces you. It's the one your boss thinks will replace you.

Sources

  1. METR (Model Evaluation & Threat Research), "Measuring the Impact of Early AI-Assisted Development," February 2025. Pre-registration: OSF registry. Randomized controlled trial of 16 experienced open-source developers on 246 tasks.
  2. GitClear, "AI Code Quality Report 2024: Analyzing 211 Million Changed Lines of Code," January 2024, updated July 2024. Repositories include Google, Microsoft, and Meta codebases. gitclear.com
  3. Stack Overflow, "2025 Developer Survey," Section: AI Tool Adoption and Trust, December 2025. N=65,000+. survey.stackoverflow.co
  4. Sequoia Capital, David Cahn, "AI's $600B Question," updated analysis, July 2024. sequoiacap.com
  5. MIT Sloan School / NANDA Lab, "Enterprise AI ROI Study," August 2025. Survey of Fortune 500 AI deployments: 95% showed no measurable financial return within 6 months. mitsloan.mit.edu
  6. Gartner, "Hype Cycle for Artificial Intelligence 2025," August 2025. Generative AI placed in "Trough of Disillusionment." 88% pilot failure rate cited.
  7. Bain & Company, "The Real State of Enterprise AI," September 2025. Surveyed actual deployment outcomes; characterized savings as "unremarkable" at 10–15%.
  8. Thomas H. Davenport and Rajeev Ronanki, "The Companies Cutting Jobs for AI That Can't Do Them Yet," Harvard Business Review, November 2025. Introduces the concept of "pre-emptive displacement."
  9. Erik Brynjolfsson, Danielle Li, and Lindsey Raymond, "Generative AI at Work," NBER Working Paper 31161, April 2023 (revised January 2024). Customer service productivity increase of 14%. nber.org
  10. Shakked Noy and Whitney Zhang, "Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence," Science, Vol. 381, July 2023. Writing tasks: 40–50% speedup. doi.org
  11. Dell'Acqua, F. et al., "Navigating the Jagged Technological Frontier," Harvard Business School Working Paper 24-013, September 2023. Consultants using GPT-4: +40% inside frontier, −23% outside. hbs.edu
  12. Klarna Group, SEC Form F-1 Registration Statement, filed November 2024. Headcount disclosed: 5,441 (Sep 2023) → 3,422 (Sep 2024). Revenue per employee: Section "Our Efficiency Story." sec.gov
  13. Sebastian Siemiatkowski, interview transcript, May 2025. Acknowledged "focused too much on efficiency and cost, the result was lower quality."