Google's new image recognition software can describe an entire scene

Shawn Knight

Posts: 15,253   +192
Staff member

google image recognition software describe entire scenes image recognition

Image recognition software has come a long way over the past several years but the latest work from Google is really mind-blowing. The search giant has come up with a new machine-learning system that can not only recognize the subject of a photo but describe the entire scene.

In a blog post on the matter, Google Research Scientists Oriol Vinyals, Alexander Toshev, Samy Bengio and Dumitru Erhan point out that recent research has greatly improved object detection, classification and labeling. Even still, that’s not enough for a computer to accurately describe a complex scene.

The idea for the new system comes from recent advances in machine language translation between languages. In those systems, a Recurrent Neural Network (RNN) converts a sentence in a foreign language into a vector representation while a second RNN uses the vector to build a sentence in the target language.

google image recognition software describe entire scenes image recognition

What Google has done is replaced the first RNN and its input with a deep Convolutional Neural Network (CNN) that is trained to classify objects in images. They then feed the CNN data into an RNN designed to produce phrases.

Sure, it’s a little complicated but once you get the concept, it makes a lot of sense.

The scientists said their experiments on several openly published data sets have generated sentences that are quite reasonable. It’s far from perfect but we also have to remember that it’s still in an early stage of development.

The hope is that one day, the system may be able to help visually impaired people understand pictures, provide alternative text for images in regions of the world where mobile connections are slow and of course, make it easier to search for images on Google.

Permalink to story.

 
Pretty good actually. The last two made me laugh though, there is no red in the one with the moped besides the marker lights and the spongebob-esque "Limo-Bug" classified as a school bus. ROFL.
 
Skynet v1.0

Elon Musk is predicting that AI will be very bad for humanity.
 
They have money for doing experiments like this, I mean this is not the latest technology, these methods were introduced in 80's and 90's.
 
Back