Google Photo Recognition Goes Beyond Objects

Scientists at Google, in partnership with Stanford University, have created artificial intelligence software that has the most advanced image recognition skills in the world.

The software is able to describe the contents of a photograph better than any other software. In fact, its descriptions are similar to those a human would give.

"This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images," said Google in a statement.

While the new system is complex and relies on detailed algorithms, its output is plain, easy-to-understand language. It does this using two neural networks: one handles the actual image recognition, and the other handles natural language processing. The system also uses machine learning, meaning that it is fed a large number of captioned images and gradually learns how the captions relate to the images. For example, the software described one photograph as "two pizzas sitting on top of a stove top oven." According to the team behind it, the system is twice as advanced as any similar software.
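To make the two-stage idea concrete, here is a minimal toy sketch in Python. It is not Google's method: both networks are replaced with trivial stand-ins, and every name in it (`encode`, `train`, `describe`, the brightness feature, the sample captions) is a hypothetical chosen for illustration. The "vision" stage reduces an image to a single coarse feature, and the "language" stage learns from captioned examples which word tends to follow which for that feature.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

START, END = "<start>", "<end>"

def encode(image: List[float]) -> str:
    """Stand-in for the image-recognition network (assumption: an image is
    a flat list of brightness values between 0 and 1)."""
    mean = sum(image) / len(image)
    return "bright" if mean >= 0.5 else "dark"

def train(pairs: List[Tuple[List[float], str]]) -> Dict:
    """Stand-in for the language network: for each image feature, count
    how often each word follows the previous one in the captions."""
    counts = defaultdict(lambda: defaultdict(int))
    for image, caption in pairs:
        feature = encode(image)
        words = [START] + caption.split() + [END]
        for prev, nxt in zip(words, words[1:]):
            counts[(feature, prev)][nxt] += 1
    return counts

def describe(image: List[float], counts: Dict, max_words: int = 10) -> str:
    """Generate a caption one word at a time, always picking the most
    frequent follower seen during training for this image feature."""
    feature = encode(image)
    word, out = START, []
    for _ in range(max_words):
        followers = counts.get((feature, word))
        if not followers:
            break
        word = max(followers, key=followers.get)
        if word == END:
            break
        out.append(word)
    return " ".join(out)

# Made-up training data: (image, caption) pairs.
training_pairs = [
    ([0.9, 0.8, 0.7, 0.9], "a sunny beach"),
    ([0.8, 0.9, 0.9, 0.6], "a sunny field"),
    ([0.8, 0.7, 0.9, 0.9], "a sunny beach"),
    ([0.1, 0.2, 0.1, 0.0], "a street at night"),
]
model = train(training_pairs)
caption = describe([0.7, 0.9, 0.8, 0.8], model)
print(caption)  # a sunny beach
```

A real system would learn far richer image features and word probabilities, but the division of labor is the same: one component summarizes the picture, and the other turns that summary into a sentence.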

Despite this, the software is not 100 percent accurate. Occasionally its descriptions are partly or entirely wrong. The team is continuing to develop the software, although it's clear that it has come a long way since it started.

"I consider the pixel data in images and video to be the dark matter of the Internet," said Fei-Fei Li, director of the Stanford Artificial Intelligence Laboratory. "We are now starting to illuminate it."

Eventually, the technology could help blind people and robots navigate their environments more comfortably. However, it could also have a number of uses in surveillance. Over the last few decades, hundreds of surveillance cameras have been placed in both public and private places. Eventually, this software may be able to recognize individual faces and certain types of behavior.

The news comes two years after Google created different image-recognition software and fed it 10 million images from YouTube videos. The software taught itself to recognize cats. The new software is much more advanced, however.

"I was amazed that even with the small amount of training data that we were able to do so well," said Oriol Vinyals, one of the four researchers who wrote the paper, with the others being Alexander Toshev, Samy Bengio and Dumitru Erhan. "The field is just starting, and we will see a lot of increases."