Three more publishers sue OpenAI over ChatGPT copyright infringement claims


What just happened? OpenAI is once again being sued for allegedly stealing copyrighted articles to train ChatGPT. The New York Times has been involved in a lawsuit against the AI giant over this issue since December, and now digital publishers The Intercept, Raw Story, and AlterNet are launching their own copyright infringement suits against the Microsoft-backed firm.

The two new cases – Raw Story and AlterNet have the same owner, which filed a single suit – mirror the New York Times' arguments against OpenAI: that the company used copyrighted material to train ChatGPT.

The publications' suits say that the chatbot produces verbatim or nearly verbatim works of journalism "at least some of the time" without providing author, title, copyright or terms of use information contained in those works.

Raw Story and AlterNet say that OpenAI and Microsoft knew ChatGPT would be less popular and generate less revenue if people believed the tool's responses violated third-party copyrights.

The suits allege that the defendants are aware of their potential copyright infringements based on the fact that OpenAI offers an opt-out system for website owners to block their content from being scraped by its web crawlers. Lawyers representing the firms believe the WebText, WebText2, and Common Crawl datasets include the plaintiffs' content.

Both Microsoft and OpenAI offer to cover the legal fees of paying customers who are sued for copyright violations over their use of Copilot or ChatGPT Enterprise.

Only The Intercept's case also names Microsoft as a defendant. Raw Story and AlterNet did not include the Windows maker because of a partnership with MSN that helps fund their investigative reporting, said CEO John Byrne. Law firm Loevy & Loevy is representing all three outlets in the suits.

"Raw Story feels that news organizations must stand up to OpenAI, which is violating the Digital Millennium Copyright Act and profiting from the hard work of journalists whose jobs are under siege," Byrne said in a joint statement. "It's important to democracy that a diverse array of news sites continue to thrive. OpenAI's violations, if not checked, will further decimate the news industry, and with it, the critical news reporters who affect positive change."

"When they populated their training sets with works of journalism, Defendants had a choice: they could train ChatGPT using works of journalism with the copyright management information protected by the DMCA intact, or they could strip it away," the court documents in the Raw Story/AlterNet case state. "Defendants chose the latter."

The companies are seeking damages of at least $2,500 per violation. They also want OpenAI to remove all copyrighted articles from its training data sets.

The New York Times sued OpenAI and Microsoft in December for using millions of its articles to train their systems without permission or compensation. OpenAI recently accused the paper of paying someone to "hack" ChatGPT so it would generate misleading evidence supporting its claim.


Sadly, all of the news-related publications have fallen apart because of the replacement of humans on the editing and publication side of the business. The number of duplicate stories, headlines misassigned to stories, wrong pictures, etc. is staggering. The entire news business is at risk of failure and the number of accurate, dependable outlets can be counted on one hand ....
Well... When AI gets more accurate and reliable and big corporations start to reap even more money from AI solutions, all these lawsuits will start to end in fines or settlements, because corporate power (money) shapes how the world around us operates.
"The entire news business is at risk of failure and the number of accurate, dependable outlets can be counted on one hand ...."
Just out of curiosity, which news/analysis/commentary outlets are part of your “Final Five”?
These cases keep getting thrown out of court, which is sad to see.
Frankly, copyright trolls have been destroying people's lives for decades, so I really don't care if it is used for AI training. AI is more useful than anything copyright holders create, which is saying a lot because modern media is pretty useless.