YouTube is one of the top entertainment destinations on the Internet, but for the 360 million people worldwide who are deaf or hard of hearing, enjoying videos like everyone else isn’t a given.
Google first launched video captions back in 2006, but it would be another three years before YouTube adopted automated captions. As YouTube product manager Liat Kaver explains, that was a big leap forward that has helped the company keep up with the site’s growth over the past several years.
These days, more than a billion videos have automatic captions, and users rely on them more than 15 million times per day.
Kaver said in a recent blog post that one of the ways they were able to scale the availability of captions was by combining Google’s automatic speech recognition (ASR) technology with YouTube’s caption system. That helped, but it wasn’t perfect: the technology’s limitations underscored the need to improve the captions themselves, and Kaver said creators sometimes had fun with the mistakes at YouTube’s expense.
All jokes aside, one of their major goals was to improve the accuracy of automatic captions, which is no easy task given YouTube’s size and the diversity of content it hosts. But by improving speech recognition and machine learning algorithms and by expanding training data, they’ve been able to boost automatic caption accuracy in English by 50 percent.
YouTube isn’t resting on its laurels. Kaver notes that they intend to keep growing beyond the milestone of a billion captioned videos and to extend their work to all 10 supported languages (Dutch, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish and, of course, English).
Part of getting there will depend on creators being willing to review and edit the automatic captions on their own videos, Kaver said.