Europe's AI code urges companies to disclose training data and avoid copyright violations

Daniel Sims

Recap: The European Union last year introduced what it calls the world's first comprehensive legal framework for regulating AI development. Although the regulations will not be enforceable for some time, recently introduced guidelines aim to ease compliance for tech giants, including OpenAI, Microsoft, and Google, which will likely attempt to resist or circumvent the new rules.

The European Commission recently introduced an AI code of practice to help developers of large language models comply with last year's AI Act. The conditions are voluntary, but the commission suggests that agreeing to them is the easiest way for tech giants to adhere to the act.

Although the new regulations take effect on August 2, new AI models have one year to comply, and existing models have a two-year grace period. The code of practice, which targets general-purpose models from companies like OpenAI, Microsoft, and Google, consists of three sections covering transparency, copyright, and safety.

The transparency section asks signatories to disclose details about how they develop and train their AI models, including energy usage, required processing power, data points, and other information. The dominant AI giants will likely bristle at the requirement to reveal where they obtained their training data.

Furthermore, the new EU rules demand that AI companies respect paywalls, refrain from circumventing crawl denials such as robots.txt, and otherwise observe copyright law, requirements the industry is already fighting. European internet publishers recently filed an antitrust complaint against Google's AI Overviews, a feature that summarizes information from various websites without users having to visit them. Denmark also recently proposed a law granting citizens copyright over their likenesses, allowing them to file claims against deepfakes made without their consent.
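
For illustration, honoring a crawl denial typically means checking a site's robots.txt file before fetching a page. Below is a minimal Python sketch using the standard urllib.robotparser module; the ExampleAIBot user agent and example.com URLs are hypothetical placeholders, not any company's actual crawler.

    import urllib.robotparser

    # Fetch and parse the site's robots.txt (hypothetical example.com).
    parser = urllib.robotparser.RobotFileParser()
    parser.set_url("https://example.com/robots.txt")
    parser.read()

    # A compliant crawler checks permission for its user agent before fetching.
    page = "https://example.com/some-article"
    if parser.can_fetch("ExampleAIBot", page):
        print("Crawling allowed for", page)
    else:
        print("Crawl denied by robots.txt, skipping", page)

A crawler that skips this check, or ignores its result, is the kind of circumvention the code of practice's copyright section is meant to rule out.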

Moreover, AI companies have admitted that the technology might be incompatible with existing copyright law. A former Meta executive warned that AI might die "overnight" if LLMs were forced to obtain permission from every copyright holder. The CEO of Getty Images also disclosed that the company cannot contest every copyright claim related to AI.

The AI code's safety section addresses potential threats to citizens' personal safety and rights. For example, language in the AI Act mentions high-risk implications associated with the technology, including surveillance, weapons development, fraud, and misinformation.

According to Bloomberg, violations of the AI Act can result in fines of up to 7% of a company's annual sales.


 
With those 'inspirational' regulations, Europe will end up with only IBM's Granite series (a very weak AI). It's virtually impossible to train good AI while respecting copyright.
 
A business model built on stealing is a bad business model.

If we let AI companies do this, it is only fair to throw out copyright law. So which is better: smarter AI or protecting intellectual property?
 
It's virtually impossible to train good AI while respecting copyright.
Why? Just buy the rights to the content you want to use for training. If I have to buy a book instead of torrenting it for my studies because otherwise I'd be prosecuted, why can a big corporation steal it and be fine with that? Why is a corporation above the law that every citizen has to abide by? They have billions and make trillions; I'm sure they can spend a bit on the data their project needs.
 