New York Times Reporter Sues AI Firms Over Chatbot Training

A New York Times investigative reporter has filed a sweeping copyright lawsuit against several of the world’s most powerful artificial intelligence companies, accusing them of illegally using copyrighted books to train their chatbots.

John Carreyrou, the journalist who exposed fraud at the now-defunct blood-testing startup Theranos, sued multiple AI developers in California federal court on Monday. Carreyrou alleges that the companies copied and ingested his books without consent to train large language models that power consumer-facing chatbots.

Defendants Include Leading AI and Tech Companies

The lawsuit names a wide range of prominent AI developers and technology companies, including xAI, OpenAI, Google, Meta Platforms, Anthropic, and Perplexity.

Carreyrou, who authored the bestselling book Bad Blood, filed the case alongside five other writers. The plaintiffs argue that their copyrighted works were copied wholesale and fed into large language models without permission or compensation.

This is the first known copyright lawsuit to name xAI, the artificial intelligence startup founded by Elon Musk, as a defendant.

Lawsuit Rejects Class Action Strategy

Unlike many similar copyright disputes, the plaintiffs are not seeking to form a class action. Instead, they chose to file as individual authors, arguing that class actions disproportionately benefit large technology companies.

According to the complaint, class actions allow AI firms to negotiate a single settlement that extinguishes thousands of copyright claims at a steep discount. The authors say that approach deprives writers of the full statutory damages available under US copyright law.

“LLM companies should not be able to so easily extinguish thousands upon thousands of high-value claims at bargain-basement rates,” the lawsuit states.

Settlement With Anthropic Cited as Warning Sign

The complaint points to a recent settlement involving Anthropic as an example of why authors should avoid class actions. In August, Anthropic agreed to pay $1.5 billion to resolve claims that it pirated millions of books for AI training.

However, the new lawsuit argues that authors in that settlement will receive only about 2% of the maximum statutory damages allowed under the Copyright Act, which permits awards of up to $150,000 per work for willful infringement. At that rate, 2% works out to roughly $3,000 per work.

Carreyrou previously criticized that settlement in court, calling the unauthorized use of books to train AI systems Anthropic’s “original sin.”

Broader Legal Battle Over AI Training Data

The lawsuit is part of a growing wave of copyright litigation targeting AI developers over how they source training data. Authors, artists, publishers, and media companies have increasingly accused tech firms of copying protected works without permission to build generative AI tools.

The case also highlights rising tension between journalists and AI companies, as newsrooms worry that chatbots trained on copyrighted reporting could undermine original journalism while avoiding licensing costs.

Legal Representation and Judicial Scrutiny

The case was filed by attorneys at Freedman Normand Friedland, including Kyle Roche, a lawyer previously profiled by Carreyrou in a New York Times investigation. During earlier proceedings in the Anthropic case, U.S. District Judge William Alsup criticized Roche’s former firm for encouraging authors to opt out of a class settlement in pursuit of better individual deals.

Roche declined to comment on the new lawsuit.

What Comes Next for AI Copyright Law

The outcome of this case could have far-reaching implications for how AI systems are trained. If the plaintiffs prevail, AI companies may be forced to license books and other copyrighted materials or face significant financial penalties.

As courts continue to weigh the balance between innovation and intellectual property rights, the case underscores the growing legal risks facing AI developers as generative technologies move deeper into the mainstream.