Google DeepMind AI beats human experts in lip-reading tests

By midian182 · 6 replies
Nov 23, 2016
  1. Google’s DeepMind artificial intelligence program may be best known for building AlphaGo, which beat one of the world’s best Go players, but the technology has numerous applications in the field of science and could prove especially helpful to the hearing impaired.

    Researchers from Oxford University and DeepMind teamed up to create an AI system trained using 5,000 hours of BBC videos containing 118,000 sentences. It managed to outperform a professional lip-reader who provides services for UK courts.

    When shown a random sample of 200 videos from BBC broadcasts, the human lip-reader was able to decipher less than a quarter of the spoken words. But when the AI system was tested using the same data set, it deciphered almost half the words and could make out entire complex phrases.

    Additionally, the machine was able to annotate 46 percent of the words without error, whereas the professional only managed around 12 percent. Most of the AI’s mistakes were minor, like missing the ‘s’ from the end of words.

    Two weeks ago, another deep learning system that can read lips was developed at the University of Oxford. LipNet was also able to beat a human when it came to accurately reading lips, though the data set used in this instance, called GRID, contained only 51 unique words, whereas the BBC data contains nearly 17,500, according to New Scientist.

    GRID also used well-lit videos of people facing the camera and reading three seconds' worth of words. After being shown 29,000 videos, the AI had an error rate of just 6.6 percent, while humans who were tested using 300 similar videos had an average error rate of 47.7 percent.

    Researchers say the system could find use in mobile technologies, virtual assistants, and for general speech recognition tasks. It could also be invaluable in helping deaf and hearing-impaired people understand others.

    "A machine that can lip read opens up a host of applications: 'dictating' instructions or messages to a phone in a noisy environment; transcribing and redubbing archival silent films; resolving multi-talker simultaneous speech; and, improving the performance of automated speech recognition in general," wrote the researchers in their paper.


  2. Bigtruckseries

    Bigtruckseries TS Evangelist Posts: 583   +318

    I bet it can't read George H W Bush's lips when he's talking about taxes...
  3. Uncle Al

    Uncle Al TS Evangelist Posts: 3,329   +1,978

    HAHAHAHAHA ...... oh my, oh my!
  4. I think there is a zero error rate when people say READ MY LIPS!
  5. Bubbajim

    Bubbajim TS Maniac Posts: 241   +168

    ...or, when advanced enough, it could be implemented by governments through their massive CCTV networks to monitor what everyone is saying. Great.
  6. Godel

    Godel TS Booster Posts: 75   +27

    "I'm afraid I can't do that, Dave"
  7. Master Yeti

    Master Yeti TS Rookie

    Nearly any technology can be used for good and bad.
