Google's DeepMind Failed A High School Level Math Test: What Does This Mean?

Google's premier artificial intelligence program DeepMind nearly flunked a math exam designed to measure the knowledge of 16-year-old British students.

In a study titled "Analyzing Mathematical Reasoning Abilities of Neural Models," researchers put DeepMind's analytical skills to the test by exposing it to various mathematical subjects. They fed the AI questions typically found in UK schools, covering subjects such as algebra, arithmetic, measurement, and calculus.

Despite its cutting-edge approach to learning, DeepMind had a hard time making calculations that a regular human teenager would normally breeze through. Google's AI, as it turns out, could only muster an E in the British grading system.

DeepMind's Neural Networks

For the experiment, the researchers subjected DeepMind to a 40-item math exam. While some of its algorithms performed well on the test, others had difficulty understanding the questions. This was evident when the machines tried to translate numbers, functions, symbols, and words.

At one point, the program was asked to give the sum of "1+1+1+1+1+1+1." Instead of giving the correct answer of "7", the machine surprisingly said it was "6".

In the end, DeepMind answered only 14 of the 40 questions correctly.

What Happened To Google's AI?

DeepMind struggled with addition once the number of terms grew beyond the first few, as with the question "1+1+1+1+1+1+1". The AI was asked to solve 1 + 1 + · · · + 1, where the number "1" appears n times.

The researchers said the neural network models they used were able to calculate the value for n ≤ 6 correctly, but they failed to do the same for n = 7. They also gave the incorrect answer for n > 7.

However, when the models were given larger numbers in longer sequences to solve, they were able to do so without any problems.
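To make the probe concrete, here is a minimal Python sketch of how such repeated-ones questions could be generated along with their correct answers. The function name ones_question is hypothetical and is not taken from the study; it only illustrates the form of the question described above.

```python
def ones_question(n):
    """Build the question "1 + 1 + ... + 1" with n ones and its correct answer.

    Hypothetical helper for illustration only; not the researchers' code.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    question = " + ".join(["1"] * n)
    return question, n


# Print the questions around the point where the models reportedly broke down.
for n in range(5, 9):
    question, answer = ones_question(n)
    print(f"n = {n}: {question} = {answer}")
```

For n = 7, this produces the seven-term question the models got wrong, paired with the correct answer of 7.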

The researchers admit that they still do not have a "good explanation for this behaviour," but they believe the answer lies in how the neural networks go over each question and calculate the correct values.

One hypothesis is that the AI treats the input as "camouflaged" whenever the same number appears in a question multiple times, which would explain why it failed to answer certain questions correctly.

The neural networks performed well when they were asked to find the "place value" in long numbers. They were also able to sort number sequences based on their size, as well as round decimal numbers.

The researchers hope that their findings will inspire others to develop new neural network architectures for AI. They said the data set they have started can easily be extended, which would allow other computer scientists to expose AI to higher forms of mathematics, such as those taught at the university level.

DeepMind For Video Gaming

DeepMind might not qualify as a math tutor for now, but it has a lot of other useful applications.

Researchers have used the program to build a fully functioning AI for the popular video game StarCraft II. That AI, AlphaStar, was skilled enough to beat professional gamer Grzegorz "MaNa" Komincz in five consecutive matches.