YouTube creators unaware Google uses their videos to train AI

midian182

A hot potato: When it comes to tech companies training their AI models, it seems everything is fair game. Google, for example, uses some of the billions of videos on YouTube to train Gemini and Veo 3, and many creators are unaware that it's happening.

With more than 20 billion videos on the platform, YouTube is a treasure trove of data for AI companies to exploit – and many already have.

YouTube owner Google is also using the content to train its AI models, reports CNBC. The company later confirmed that it does this, but said that it only uses a subset of videos and that it honors specific agreements with creators and media companies.

"We've always used YouTube content to make our products better, and this hasn't changed with the advent of AI," said a YouTube spokesperson in a statement.

YouTube admitted that there was a need for safeguards in this area, which is why it has invested in tools that allow creators to protect their image and likeness.

But many experts point out that most creators and companies don't know that Google is training its models on their content. There's also no way for people to opt out of having their creations used this way.

The report notes that the size of YouTube's video library means that even if just 1% of the videos are used for training purposes, that amounts to 2.3 billion minutes of content, which is more than 40 times greater than the training data used by competing AI models, according to experts.
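As a rough sanity check on those figures, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: it assumes the library holds exactly 20 billion videos (the article says "more than"), and the average video length it prints is not stated in the report but is simply what the quoted numbers imply.

```python
# Back-of-the-envelope check of the figures quoted in the report.
TOTAL_VIDEOS = 20_000_000_000    # assumption: exactly 20 billion (article: "more than 20 billion")
SAMPLE_PERCENT = 1               # the report's hypothetical 1% share
QUOTED_MINUTES = 2_300_000_000   # "2.3 billion minutes", as quoted in the article

# Number of videos in a 1% sample of the library.
sampled_videos = TOTAL_VIDEOS * SAMPLE_PERCENT // 100

# Average length per video implied by the quoted total (not stated in the report).
implied_avg_minutes = QUOTED_MINUTES / sampled_videos

# The same total expressed in hours, for comparison with other figures.
quoted_hours = QUOTED_MINUTES / 60

print(f"Videos in a {SAMPLE_PERCENT}% sample: {sampled_videos:,}")
print(f"Implied average length: {implied_avg_minutes} minutes")
print(f"Total footage: about {quoted_hours:,.0f} hours")
```

Under these assumptions, a 1% sample is 200 million videos, and the 2.3 billion-minute total works out to an average of 11.5 minutes per video.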

The situation has become more relevant since Google announced its Veo 3 video model, which can create incredibly realistic video clips. As in many industries, the irony is that the content people create is being used to train an AI that could eventually replace them, or at least impact their income in what is a competitive market.

Some creators take a different point of view; they're using or planning to use Veo 3 to create content, even if it has been trained on their own original work.

There have been cases of other companies using YouTube to train their AIs without creators' knowledge. It was reported last year that OpenAI had transcribed over a million hours of YouTube videos to train its LLMs. Nvidia did the same thing, and at one point was scraping 80 years' worth of video daily – the company argued this was in "the spirit of copyright law." Anthropic, Apple, and Salesforce also turned to YouTube for their AI training data.

Google now allows creators to opt out of third-party training by AI companies such as Amazon and Nvidia, but there's no option to stop Google from doing the same.

Image credit: Jordan González

 
AI influencers are a thing; if you take the sum of all that content, the revenue is probably in the billions. I foresee AI influencers attempting to compete with YouTubers in the near future. They want to cut out the middlehumans. They can train the AI to analyze why certain channels, videos, and content do better than others on a macro level. Put the most likable CGI figure in front of the targeted demographic and watch the ad revenue swell. Going forward, I expect they can also present this data to advertisers and have targeted ads on roids.
 
If your data is on the web, it’s available for AI training. Companies can make all sorts of claims that certain things are excepted - but they are lying.

Honestly though - why do we care? Don’t we WANT the AIs to get as good as possible as fast as possible?

Most of the complaints that people have about AI are that it’s not “real AI”… well, that’s because we’re still in the infancy stage!

The faster we get out of that phase, the better our AI will be.
 
With that exabyte of data that YT holds, it's still in the infancy stage. Any idea how long that stage lasts before we get to the toddler stage and it learns to walk by itself?
 
Well, considering it’s been only a few years… patience is key… but anyone who forecasts the “next stage” is probably just blowing steam out of their ears… wouldn’t be surprised to see it take another decade or two… but who knows?
 
They already have stormtrooper and Spartan soldier AI influencers that have hundreds of millions of views.
 