
Federal Judge Rules Meta's AI Training on Copyrighted Books is Fair Use in Key Lawsuit

6:44 AM   |   26 June 2025

Federal Judge Sides with Meta in Landmark AI Copyright Lawsuit Over Book Training Data

In a significant development for the artificial intelligence industry and the ongoing debate surrounding copyright and training data, a federal judge has ruled in favor of Meta Platforms in a lawsuit brought by a group of authors, including comedian and writer Sarah Silverman. The lawsuit alleged that Meta illegally used their copyrighted books to train its AI models without permission.

On Wednesday, U.S. District Judge Vince Chhabria granted summary judgment in the case, deciding the matter without the need for a full trial or jury. Judge Chhabria found that Meta's use of the copyrighted books to train its AI models, under the specific circumstances presented in this case, qualified as "fair use" under U.S. copyright law and was therefore legal.

This ruling follows closely on the heels of a similar decision where a federal judge sided with AI company Anthropic in a related lawsuit concerning the use of copyrighted books for training. Taken together, these decisions represent early victories for technology companies that have been facing numerous legal challenges from creators and media organizations over the data used to build their powerful AI systems.

For years, tech companies have argued that training AI models on vast datasets, which often include copyrighted material scraped from the internet, is a transformative process that falls within the bounds of fair use. Creators and publishers, conversely, contend that this unauthorized use deprives them of control over their work and potential revenue streams, constituting copyright infringement.

The Nuances of the Ruling: Not a Blanket Victory

Despite the favorable outcome for Meta and Anthropic, these rulings are not the definitive, sweeping endorsements of current AI training practices that some in the tech industry might have hoped for. Both judges involved in these cases were careful to highlight the limited scope of their decisions.

Judge Chhabria explicitly stated that his decision in the Meta case does not establish a universal precedent that all AI model training on copyrighted works is automatically legal. Instead, he noted that the plaintiffs in this specific lawsuit "made the wrong arguments" and failed to provide sufficient evidence to support potentially stronger claims.

"This ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful," Judge Chhabria wrote in his decision. He further elaborated, suggesting that in cases involving uses similar to Meta's, "it seems like the plaintiffs will often win, at least where those cases have better-developed records on the market effects of the defendant’s use." This indicates that while Meta won this round, future plaintiffs with stronger evidence, particularly regarding market impact, might achieve different results.

Understanding Fair Use in the Context of AI Training

Fair use is a legal doctrine in U.S. copyright law that permits limited use of copyrighted material without requiring permission from the rights holders. It is a flexible doctrine, evaluated on a case-by-case basis, typically considering four factors:

  1. The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.
  2. The nature of the copyrighted work.
  3. The amount and substantiality of the portion used in relation to the copyrighted work as a whole.
  4. The effect of the use upon the potential market for or value of the copyrighted work.

In the Meta case, Judge Chhabria's decision hinged significantly on the first and fourth factors: the purpose and character of the use (specifically, whether it was transformative) and the effect on the market.

Transformative Use: The Core of the Argument

A key element of Judge Chhabria's ruling was his finding that Meta's use of the copyrighted books was transformative. A use is transformative when it employs copyrighted material for a different purpose or in a different context than the original, creating something new with a different character rather than substituting for the original work. In the context of AI training, the argument is that feeding copyrighted text into a large language model (LLM) to teach it patterns, grammar, facts, and styles is fundamentally different from reading the book for its narrative or informational content.

The AI model, after training, does not store or reproduce the original books verbatim (or at least, is not intended to). Instead, it learns from the vast corpus of data to generate new text, translate between languages, produce various kinds of creative content, and answer questions. The judge agreed that this process of learning and generating new outputs constitutes a transformative use, distinct from merely copying or distributing the original works.
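To make the "learning patterns, not storing books" distinction concrete, here is a minimal, purely illustrative sketch of the next-token-prediction objective that underlies language model training. It uses a toy character-level recurrent model rather than a production-scale transformer, and the corpus string, model size, and training settings are assumptions invented for this example; it is not Meta's actual training pipeline or data.

```python
# Illustrative sketch only: a toy next-token-prediction training loop.
# Not Meta's pipeline; the corpus, model, and hyperparameters are made up.
import torch
import torch.nn as nn

# Toy "training corpus" (hypothetical stand-in for real training text).
corpus = "the model learns statistical patterns in text rather than storing the text"
vocab = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in corpus])

class TinyLM(nn.Module):
    """A tiny character-level language model (embedding -> GRU -> vocab logits)."""
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        hidden, _ = self.rnn(self.embed(x))
        return self.head(hidden)

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# The objective: given each prefix of the text, predict the next character.
x = data[:-1].unsqueeze(0)  # inputs:  characters 0 .. n-2
y = data[1:].unsqueeze(0)   # targets: characters 1 .. n-1

for step in range(200):
    logits = model(x)
    loss = nn.functional.cross_entropy(logits.view(-1, len(vocab)), y.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# What persists after training is the set of learned weights, not the corpus.
print(f"final training loss: {loss.item():.3f}")
```

The point of the sketch is that what the training process produces is a set of numerical weights tuned to predict the next token, not a retrievable copy of the input text; whether a particular model can nonetheless regurgitate memorized passages is a separate, fact-specific question of the kind the outputs-focused lawsuits raise.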

This perspective aligns with the tech industry's view that training data is more akin to a student reading many books to learn how to write, rather than making unauthorized copies of those books for distribution.

The Crucial Factor of Market Harm

Perhaps even more critical to the judge's decision was the plaintiffs' failure to demonstrate that Meta's training of its AI models on their books caused harm to the market for those books. The fourth factor of fair use, the effect upon the potential market, is often considered the most important.

To succeed on this point, the authors would have needed to show concrete evidence that Meta's actions, specifically the training process, negatively impacted the sales, licensing, or other potential revenue streams for their copyrighted works. This could involve showing that the AI outputs directly compete with their books or reduce the demand for them.

Judge Chhabria was unequivocal on this point: "The plaintiffs presented no meaningful evidence on market dilution at all," he stated. Without evidence demonstrating a negative market effect caused by the *training* itself, the authors' claim was significantly weakened. It's important to note that the lawsuit focused on the act of training the model, not necessarily on the outputs the model might produce, which could potentially be a separate legal issue if those outputs infringe copyright.

Broader Implications and Other Ongoing Lawsuits

The Meta and Anthropic rulings are significant early indicators in the complex legal landscape emerging around AI and copyright. However, as Judge Chhabria noted, the fair use defense is highly fact-dependent, and the outcome can vary greatly depending on the specifics of the case, including the type of copyrighted work involved and the evidence presented regarding market effects.

The judge himself suggested that other markets may prove more exposed to this kind of harm than the book market appeared to be in this case: "It seems that markets for certain types of works (like news articles) might be even more vulnerable to indirect competition from AI outputs," Chhabria wrote.

This distinction is particularly relevant given other high-profile lawsuits currently making their way through the courts:

  • The New York Times vs. OpenAI and Microsoft: The New York Times is suing OpenAI and Microsoft, alleging that their AI models were trained on millions of the newspaper's articles and that the AI outputs sometimes reproduce NYT content verbatim or create content that directly competes with the newspaper, potentially harming its market. This case involves a different type of content (news) and potentially different arguments regarding market harm compared to the authors' lawsuit against Meta.
  • Disney and Universal vs. Midjourney: Major entertainment companies Disney and Universal are suing AI image generator Midjourney, alleging that it was trained on their copyrighted films and TV shows and that its outputs can generate images in the distinctive styles of their properties, potentially infringing on their rights and impacting markets for derivative works. The case highlights the challenges faced by generative AI models whose outputs can closely resemble copyrighted visual works.

These cases, involving news articles and visual media, may present different challenges and evidence regarding transformative use and market harm compared to the book-centric lawsuits against Meta and Anthropic. The judges' comments suggest that the legal outcomes could indeed differ.

What This Means for the Future

The rulings in the Meta and Anthropic cases provide some temporary relief and legal grounding for AI companies regarding the use of copyrighted books in their training data, particularly when plaintiffs fail to demonstrate market harm. They reinforce the idea that the *process* of training, viewed as transformative, can fall under fair use.

However, the judges' caveats are crucial. The legal battle is far from over. Future cases will likely focus more intensely on:

  • The specific nature of the copyrighted works used.
  • The amount and substantiality of the copying, particularly if evidence emerges of models memorizing and reproducing large portions of copyrighted text or images.
  • Robust evidence of market harm, demonstrating how AI outputs directly compete with or diminish the value of the original works.
  • The terms of use and licensing of the training data itself.

Creators and copyright holders will need to refine their legal strategies, focusing on demonstrating tangible harm and challenging the extent to which AI training is truly transformative in all contexts. The tech industry, meanwhile, will continue to navigate these legal waters, potentially exploring licensing agreements or developing models that can be trained on different types of data or with different methodologies to mitigate legal risks.

The path forward involves complex negotiations and legal challenges that will ultimately shape how AI is developed and deployed, balancing the need for innovation with the rights of creators. The Meta ruling is a significant chapter, but just one chapter, in this unfolding story.