30/04/2014 12:09 BST | Updated 01/05/2014 06:59 BST

'Babel' Essay Machine Can Generate Prize-Winning, But Meaningless Exams


It's an oddity of the modern examination system that, in many countries, essays aren't actually marked by humans.

For some exams, such as several standardised tests used by American schools and universities, essays are marked in bulk by a computer. Examiners might dip back into the pool to correct some of the marks or provide oversight, but in many cases the volume is too high to mark every essay by hand.

Some systems claim to be able to mark 48,000 essays a minute.

Unfortunately, though perhaps predictably, these systems aren't foolproof. And according to one new analysis, they can be easily fooled into awarding full marks even if a given essay makes no sense at all.

The US Chronicle of Higher Education reports on a four-person team at Harvard and MIT who have created the Babel Generator, a machine that generates perfect but meaningless essays, all of which score top marks in standardised tests.

Les Perelman, a retired MIT writing instructor, is leading the team to expose the weaknesses of the current system. In the Chronicle story, he points to one example essay generated by his machine:

"Privateness has not been and undoubtedly never will be lauded, precarious, and decent," reads one essay. "Humankind will always subjugate privateness."

No, it's not just you. It means nothing. But as output from the Basic Automatic B.S. Essay Language Generator (Babel), it's a winner. In fact, it scored 5.4 points out of six, with advanced points for "meaning".

"How can these people claim that they are grading human communication?" Perelman asks.


Apparently the machine needs as few as three keywords to get started, and relies on procedural rules of its own to generate the text, rather than web searches or Wikipedia.
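
To give a rough sense of what "procedural rules" could mean here, the following is a minimal Python sketch of template-based text generation from a handful of keywords. It is not Babel's actual method, which the article does not describe: the templates, word lists and the `generate_essay` function are all hypothetical, invented for illustration.

```python
import random

# Hypothetical sentence templates and filler words. Babel's real rules
# are not described in the article; these are invented for illustration.
TEMPLATES = [
    "{Kw} has not been and undoubtedly never will be {adj1}, {adj2}, and {adj3}.",
    "Humankind will always {verb} {kw}.",
    "The {adj1} nature of {kw} is what makes it {adj2}.",
]
ADJECTIVES = ["lauded", "precarious", "decent", "inexorable", "quixotic"]
VERBS = ["subjugate", "scrutinise", "extol", "repudiate"]


def generate_essay(keywords, sentences=6, seed=0):
    """Fill fixed templates with the keywords and randomly chosen filler words."""
    if not keywords:
        raise ValueError("at least one keyword is required")
    rng = random.Random(seed)  # seeded so the same inputs give the same essay
    out = []
    for i in range(sentences):
        kw = keywords[i % len(keywords)]  # cycle through the keywords in order
        adj1, adj2, adj3 = rng.sample(ADJECTIVES, 3)
        template = rng.choice(TEMPLATES)
        out.append(
            template.format(
                Kw=kw.capitalize(), kw=kw,
                adj1=adj1, adj2=adj2, adj3=adj3,
                verb=rng.choice(VERBS),
            )
        )
    return " ".join(out)


if __name__ == "__main__":
    print(generate_essay(["privacy", "technology", "society"]))
```

No web search or reference text is involved: the output is grammatical because the templates are, and meaningless because the filler words are chosen at random.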

The point is that the machines marking the essays are trained to look for grammar, punctuation, and certain elements of writing style, but not for meaning or nuance in argument. So a student can theoretically dispense with the luxury of intrinsic meaning if they are instead able to employ loquacious elegance of sufficient dexterity.
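
A toy illustration of that weakness, in Python: the scorer below looks only at surface features (average word length, average sentence length and vocabulary variety) and never at meaning. It is a sketch under loud assumptions, not how any commercial marking system works; the `surface_score` function, its features and its weights are all hypothetical.

```python
import re


def surface_score(essay):
    """Score an essay from 0 to 6 using only its form, never its meaning."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    if not words or not sentences:
        return 0.0
    avg_word_len = sum(len(w) for w in words) / len(words)
    avg_sentence_len = len(words) / len(sentences)
    variety = len({w.lower() for w in words}) / len(words)
    # Arbitrary caps and weights, chosen only for illustration:
    # up to 2 points each for long words, long sentences and varied vocabulary.
    score = (
        2.0 * min(avg_word_len / 6.0, 1.0)
        + 2.0 * min(avg_sentence_len / 15.0, 1.0)
        + 2.0 * variety
    )
    return round(score, 1)


if __name__ == "__main__":
    nonsense = (
        "Privateness has not been and undoubtedly never will be lauded, "
        "precarious, and decent. Humankind will always subjugate privateness."
    )
    plain = "I like my dog. My dog is big. I like my dog."
    print(surface_score(nonsense), surface_score(plain))
```

Under these made-up weights, the Babel sample quoted earlier scores higher than three short, perfectly sensible sentences about a dog, simply because its words are longer and more varied.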

Perelman’s view is that the producers of these essay-marking machines should use his findings to refine their systems.

Meanwhile, defenders of the system say it remains a useful tool that, though not perfect, can still provide guidance in grading.