What just happened? It's not just OpenAI that keeps being sued for allegedly scraping content to train its systems. Reddit is suing Anthropic over claims it scraped content generated by forum users to train its Claude chatbot without consent.
In a complaint filed in San Francisco this week, Reddit claims that Anthropic intentionally trained its LLMs on content created by Reddit users without requesting consent, thereby violating Reddit's user agreement.
The filing adds that Anthropic accessed Reddit more than 100,000 times, despite the AI firm stating that its bots had been blocked from scraping data from the site. It adds that Anthropic has been training its Claude chatbot on Reddit data since at least December 2021. The suit actually includes a screenshot appearing to show Claude acknowledging that it was trained on Reddit user data.
Reddit signed agreements with Google and OpenAI in 2024, worth $60 million and $70 million per year respectively, that allow them access to posts created by Reddit's 100+ million daily users. This allows the posts to appear in answers provided by the companies' respective chatbots. But Anthropic allegedly prefers not to pay for this sort of thing.
"Anthropic refused to engage," the complaint states. "Thus, while other AI giants have entered into licensing agreements and agreed to respect users' choices (including by deleting posts that Redditors chose to delete), Anthropic has not."
Reddit challenges Anthropic's assertion that it is "the white knight of the AI industry."
"This case is about the two faces of Anthropic: the public face that attempts to ingratiate itself into the consumer's consciousness with claims of righteousness and respect for boundaries and the law, and the private face that ignores any rules that interfere with its attempts to further line its pockets."
Reddit CEO Steve Huffman previously complained that blocking companies unwilling to pay for data harvesting has been "a real pain in the ass," which is why it changed its robots.txt file to exclude bots and crawlers that didn't have permission to access its data.
Anthropic said that it disagrees with Reddit's claims and will defend itself vigorously.
A Reddit spokesperson said, "We believe in the Open Internet – that does not give Anthropic the right to scrape Reddit content unlawfully, exploit it for billions of dollars in profit, and disregard the rights and privacy of our users."
"This isn't a misunderstanding, it's a sustained effort to extract value from Reddit while ignoring legal and ethical boundaries."
"We're filing this lawsuit in line with our Public Content Policy and as our final option to force Anthropic to stop its unlawful practices and abide by its claimed values."
Anthropic is no stranger to lawsuits. In 2024, writers and journalists Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson filed a class-action against the firm over claims it used pirated copies of their work to train Claude.
Anthropic has also been fighting a case against Universal Music that started in 2023. The AI firm is alleged to have used lyrics from at least 500 songs by musicians, including Beyoncé, the Rolling Stones and the Beach Boys, to train Claude without permission. In March, a judge rejected a preliminary bid from the music publishers to block their lyrics from being used this way.