Microsoft's speech recognition system has reached "human parity"

midian182

Staff member

One area that has struggled to keep pace with advancing technology is speech recognition. Due to the enormous variations in the way people speak, perfecting these systems has proved difficult. But Microsoft says its new experimental software is able to identify words in a conversation as well as a human can.

Microsoft announced the breakthrough in a blog post yesterday. The team from Artificial Intelligence and Research reported that its software has reached “human parity,” in that it can transcribe conversations with an error rate “about equal” to professional transcriptionists – 5.9 percent.

PC World notes that only five years ago, the most cutting-edge speech recognition technology had word error rates of 20 – 25 percent. Microsoft managed to hit 6.3 percent last month, and today’s report is the first instance of it dropping below 6 percent.
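For readers unfamiliar with the metric: word error rate is conventionally the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not Microsoft's evaluation pipeline. The function name `word_error_rate` and the sample sentences are made up for this example, and real scoring tools also normalize punctuation and casing first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; assumes whitespace-separated
    words and no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edits needed to turn the first j hypothesis words
    # into the reference words processed so far.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i reference words vs. an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Invented example: two substituted words in a seven-word reference.
    reference = "we have a plan for the day"
    hypothesis = "we is the plan for the day"
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

On this definition, a 5.9 percent error rate works out to roughly six wrong, missing, or extra words per hundred words of reference transcript.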

“Even five years ago, I wouldn’t have thought we could have achieved this. I just wouldn’t have thought it would be possible,” said Harry Shum, executive vice president for Microsoft’s artificial intelligence and research group.

The system uses neural language models that group synonyms for better efficiency. If someone says “fast,” for example, it will look for “quick.”

The researchers used the Computational Network Toolkit (CNTK) – a homegrown system that is available on GitHub under an open source license. The kit’s ability to run deep learning algorithms across multiple computers enabled the team to vastly speed up its research.

Microsoft points out that the system isn’t perfect – it doesn’t recognize every word. But neither do people. Like humans, it can mistake words like “have” for “is,” or “a” for “the,” when transcribing.

Microsoft didn’t say when the software might appear in commercial products, and it still needs more fine-tuning to work in real-world situations where there is often background noise. One of its more obvious applications would be integration with Cortana.

"This will make Cortana more powerful, making a truly intelligent assistant possible," said Shum.


 
Provoking responses doesn't mean that the comment isn't stupid.

Now 3 out of 4.

"£$^£$^"#12 !@"~£@! #'4 12#3'1#2'3 !"£ !!!!!!

4/4 Mission Complete.

I kid, this is pretty awesome. Speech recognition technology has advanced incredibly fast over the past decade. Give it another 30-50 years and a fully flesh-like AI which you can have as a **** buddy will be a thing. Just kidding but not really.
 
Soon it will be merged with an AI to make robo-calls offering air-duct cleaning.... it will be in "your area"...
 
Now I'm wondering how well it does when it comes to accents...
I'd assume it's terrible, just like Google's. If you don't speak with an American accent it's hopeless, and more so with my hybrid British/South African accent. I gave up on those voice assistants years ago, not that I was ever excited about them. The only time I could see them being useful was if you were driving and using something like Google Maps.
 
Microsoft's speech recognition system has reached "human parity"

"Disable Windows 10 Spyware and stop unwanted automatic updates"

"I'm sorry Dave, I cannot allow you to do that..."
 
Microsoft's AI-assisted voice recognition is putting the overrated Dragon NaturallySpeaking to shame. And Cortana leaves Siri -- and everyone else -- in the dust.
 
Microsoft's AI-assisted voice recognition is putting the overrated Dragon NaturallySpeaking to shame. And Cortana leaves Siri -- and everyone else -- in the dust.
That sounds great! (y) Can I download "Cortana" and install "her" on my XP machines...? *nerd*

Oh, and before I forget, "hooray for M$, and Windows 10".
 