Judge Blocks Anthropic Settlement Over Pirated Books

newsdesk

A federal judge refused to preliminarily approve a proposed $1.5 billion settlement between AI developer Anthropic and a group of authors who say the company used pirated books to train its Claude chatbot, finding the deal insufficiently transparent and fair and ordering significant fixes before any agreement can move forward. The ruling preserves a scheduled trial if the parties cannot resolve the issues identified by the court and highlights broader legal and financial risks for AI firms that rely on copyrighted material obtained from piracy sites.

The lawsuit, brought by authors including Andrea Bartz, Charles Graeber and Kirk Wallace Johnson, alleges Anthropic downloaded copyrighted works from piracy sites such as LibGen and PiLiMi to train its AI models. While the judge previously indicated that training on legally acquired books might qualify as fair use, he concluded that storing and using pirated copies is not protected, leaving the alleged downloads squarely within the scope of infringement claims.

Under the proposed settlement, Anthropic had agreed to establish a $1.5 billion fund to compensate authors, an amount the parties estimated would equal roughly $3,000 per book for about 500,000 eligible titles. The deal also called for the deletion of pirated copies the company had downloaded and contained no admission of wrongdoing by Anthropic.

The judge criticized the settlement as far from adequate and pointed to several major deficiencies. The court noted the absence of a finalized list of affected books, leaving unclear which titles would qualify for compensation. The proposed claims process was described as vague, with insufficient detail on how payments would be allocated or how competing claims for the same work would be resolved. The judge also found the notice and opt-out procedures for authors to be weak and raised concerns about potential overreach by industry groups involved in shaping the settlement, including the Authors Guild and the Association of American Publishers.

To address those problems, the judge directed the parties to provide a complete list of affected titles, a sample claims form and a revised settlement presentation on an expedited schedule, warning that unresolved deficiencies would push the matter to trial. If the case proceeds to trial, legal experts note that statutory damages for copyright infringement could be substantial, reaching as much as $150,000 per work for willful infringement.

Beyond the immediate dispute, the ruling sends a clear signal to AI developers and publishers: use of copyrighted material in training datasets must be transparent, legally defensible and supported by robust compensation and notice mechanisms for creators. The outcome of this case will likely influence how courts treat the use of scraped or pirated text in AI training and could shape the obligations companies face when sourcing training data.
