AI learns to do SATs better than high school seniors

Ever since the 1984 blockbuster ‘The Terminator’, artificial intelligence has been a somewhat frightening prospect, mainly because any robot more intelligent than us may very well view humanity as a threat to the Earth: a “cancer of this planet”, as Agent Smith of ‘The Matrix’ would say.

However, that has not halted our endeavours to build an autonomous, hyper-intelligent and potentially self-aware cyber-species. Past creations like MarI/O, a program that learns like a human would, have brought us shockingly close to the potential dystopia of true AI.

The latest venture into the world of the machine comes from the collaborative efforts of the Allen Institute for Artificial Intelligence (AI2 for short) and the University of Washington. Their AI program, GeoS, has managed to score 500 out of 800 points on the math section of the SAT, which corresponds to roughly 49 percent accuracy on the test and is close to the high school senior average of 513 points.

This may not strike you as impressive at first. After all, you might say, ‘Well heck, my calculator has 100 percent accuracy in the right hands.’ The key difference here is that GeoS wasn’t given these problems in its native language: code. It had to ‘read’ the questions straight off the paper, making use of the keywords and diagrams to solve the problems in the same way we do. So in effect, it isn’t just plugging numbers in; it is learning to interpret the data and formulate a method to solve each question.

Under The Hood

The way GeoS thinks is as follows:

  • It looks at the diagram and text provided for each question
  • It uses this information to select a set of formulae that it ‘thinks’ will help it solve the problem
  • Since the formulae are purely mathematical and numeric, it can easily do the calculations from here
  • When it has come up with an answer, it looks at the multiple choice options provided and ‘sees’ whether any match its solution
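
The four steps above can be sketched as a toy pipeline. This is only an illustration, not the actual GeoS code: the `FORMULAE` table, the `solve` function and the `diagram_facts` dictionary are hypothetical names, and the simple keyword lookup stands in for the much deeper text and diagram interpretation that the real system performs.

```python
import math
from typing import Callable, Dict, Optional

# Hypothetical toy "formula library": each keyword maps to a function that
# computes a value from the numeric facts read off the question and diagram.
FORMULAE: Dict[str, Callable[[Dict[str, float]], float]] = {
    "area of the circle": lambda f: math.pi * f["radius"] ** 2,
    "circumference": lambda f: 2 * math.pi * f["radius"],
    "hypotenuse": lambda f: math.hypot(f["leg_a"], f["leg_b"]),
}


def solve(text: str, diagram_facts: Dict[str, float],
          choices: Dict[str, float], tol: float = 1e-2) -> Optional[str]:
    """Return the label of the matching choice, or None if nothing matches."""
    # Steps 1 and 2: look at the question and select candidate formulae.
    # (Keyword matching is an assumption made to keep the sketch short.)
    candidates = [fn for kw, fn in FORMULAE.items() if kw in text.lower()]

    for formula in candidates:
        try:
            value = formula(diagram_facts)  # Step 3: do the calculation
        except KeyError:
            continue  # the diagram lacks a quantity this formula needs

        # Step 4: check whether any multiple choice option matches.
        for label, option in choices.items():
            if math.isclose(value, option, abs_tol=tol):
                return label

    return None  # no confident answer, so the question is left unanswered


question = "In the figure, what is the hypotenuse of the right triangle?"
print(solve(question, {"leg_a": 3.0, "leg_b": 4.0},
            {"A": 4.0, "B": 5.0, "C": 7.0}))
```

Returning `None` when no option matches mirrors the idea of a solver that declines to answer rather than guessing.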

Ali Farhadi, assistant professor of computer science and engineering at the University of Washington and research manager at AI2, explains that “our biggest challenge was converting the question to a computer-understandable language” and that “One needs to go beyond standard pattern-matching approaches for problems like solving geometry questions that require in-depth understanding of text, diagram and reasoning.”


The system is far from perfect, though: in the reported tests it ‘failed to come up with a solution about half of the time.’ However, when it did have an idea for a solution, it had a 96 percent accuracy rate, which is far better than I can say of myself.
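
These figures can be cross-checked against the roughly 49 percent overall accuracy with a little arithmetic. The sketch below assumes that any question left unanswered simply counts as incorrect; that is an assumption made here for illustration rather than something the researchers state, and the variable names are hypothetical.

```python
overall_accuracy = 0.49        # "roughly 49 percent accuracy on the test"
accuracy_when_answered = 0.96  # accuracy on questions it actually answered

# If unanswered questions score zero, then
# overall accuracy = attempt rate * accuracy when answered.
implied_attempt_rate = overall_accuracy / accuracy_when_answered
print(f"{implied_attempt_rate:.0%}")  # prints 51%
```

An implied attempt rate of about 51 percent is consistent with the system failing to produce a solution about half of the time.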

Even though it is so far an imperfect system, Farhadi and his peers are adamant that this demonstration is a crucial step toward true artificial intelligence.

What counts as proof of ‘true artificial intelligence’ is under debate, however, as many software engineers disagree about the validity of current AI tests. An example is the Turing Test, in which a computer must try to pass itself off as human in a blind interaction. Last year a group of developers claimed to have created a program capable of passing the test, but the claim has been met with much scrutiny.

An ever more popular idea is to forget the notion of one single test and digitally replicate individual aspects of intelligence, bringing them together in a Frankenstein-esque manner to create true AI.

Whilst GeoS and other similar projects have been making substantial progress in the field of AI, it seems, happily, that we’re a long way off from the dawn of the Cybermen.


