EFF Warns: Blocking Internet Archive Over AI Concerns Will Erase the Web's Historical Record

2026-03-21T11:18:00.000Z
The Electronic Frontier Foundation argues that publishers blocking the Internet Archive in response to AI scraping concerns are destroying decades of historical documentation.

Publishers Blocking Internet Archive: A Misguided Response to AI

The Electronic Frontier Foundation (EFF) has published a detailed analysis warning that major publishers' decision to block the Internet Archive from crawling their websites — driven by concerns over AI companies scraping copyrighted content — threatens to destroy decades of irreplaceable historical documentation.

What's happening

In recent months, The New York Times began blocking the Internet Archive's crawlers using technical measures that go beyond the web's traditional robots.txt rules. The Guardian appears to be following suit. The Archive's Wayback Machine, which contains over one trillion archived web pages, has served as the definitive historical record of online publishing for nearly three decades.
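For context on what "traditional robots.txt rules" means: robots.txt is a plain-text file that names crawlers and the paths they are asked to avoid, and compliance is voluntary on the crawler's side. The sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved crawler would check such a file. The robots.txt content, the `example-archive-bot` user agent, and the `news.example` URL are all hypothetical and are not taken from any real publisher or from the Internet Archive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# It asks one named crawler to stay out and allows everyone else.
ROBOTS_TXT = """\
User-agent: example-archive-bot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical crawler names and URL (assumptions, not real identifiers).
archive_ok = is_allowed(ROBOTS_TXT, "example-archive-bot", "https://news.example/story")
other_ok = is_allowed(ROBOTS_TXT, "some-other-bot", "https://news.example/story")

print("archive crawler allowed:", archive_ok)
print("other crawler allowed:", other_ok)
```

Because the check happens inside the crawler, robots.txt only works against crawlers that choose to honor it. Measures that "go beyond" it are enforced by the publisher's own servers instead.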

Why publishers are blocking

Publishers say the move is driven by concerns about AI companies scraping their content for model training. Several major outlets, including the Times, are currently suing AI companies in cases that turn on whether training on copyrighted material constitutes fair use.

Why EFF says this is wrong

The EFF makes three key arguments:

  1. Nonprofit archivists are not AI companies. The Internet Archive is not building commercial AI systems; it is preserving a record of our history. Blocking it punishes the wrong party in a fight it didn't start.
  2. Archiving and search are well-established fair use. Courts have long recognized that copying material in order to make it searchable can be transformative and lawful. The same principles that protect Google's book scanning project must also protect web archives.
  3. The historical record is irreplaceable. The Archive is the only reliable record of how stories were originally published. Wikipedia alone links to over 2.6 million archived news articles across 249 languages.

The bottom line

"If publishers shut the Archive out, they aren't just limiting bots. They're erasing the historical record."

The EFF argues that whatever the outcome of AI copyright lawsuits, blocking nonprofit archivists is fundamentally the wrong response. Real disputes over AI training must be resolved in courts — not by burning down libraries.

Source: Electronic Frontier Foundation
