Last updated June 7, 2018 at 2:37 pm
The first-ever comparison of facial recognition algorithms and human ability shows we should be using both.
When it comes to recognising faces, machines are now better than most people and at least as good as trained professionals, say Australian scientists.
But combining humans and algorithms will produce the most accurate results.
It’s a staple of spy films: facial recognition technology that faultlessly picks someone out of a crowd. Outside of Hollywood, it is increasingly being used by law enforcement agencies to identify people of interest.
However, the capabilities of this technology have never actually been tested against people trained to do exactly that.
This is despite the importance of minimising facial identification errors in forensic science.
Researchers from the University of New South Wales set a face-matching test for four groups: highly trained facial recognition experts, regular people, so-called super-recognisers who have a natural talent for face identification, and facial recognition computer algorithms.
The algorithms outperformed most people, including professionals, but the very best trained facial recognition experts still topped the test.
However, even they weren’t perfect. But when the best humans and algorithms worked together, the overall accuracy improved to extremely high levels.
Humans vs Computers
The study put 184 people from five continents to the test. Of those, 87 were trained professional facial examiners and forensic reviewers, while 13 were non-professional super-recognisers. The remaining 84 were people with no special training or natural ability.
They were tested by rating the likelihood that a pair of facial images showed the same person. To match a real-world scenario, the images were taken with varying illumination, expression and appearance.
The professionals were allowed to use whatever tools they would usually use in their jobs. Reassuringly, they did outperform the other groups.
The researchers then tested four computer algorithms developed between 2015 and 2017 for facial recognition. All four of the algorithms performed as well as the humans.
The most recently developed algorithm even outshone most of the forensic facial examiners, beating the median score from the professionals.
While this suggests that the algorithms are at least as good as trained professionals, the best human facial recognition experts still topped the list.

Are these the same person? Credit: J. Stoughton/NIST
“As a group, trained forensic examiners outperformed the other groups,” says David White, who was part of the research team at the University of New South Wales.
“Another important insight from the study was that the most advanced facial-recognition algorithms are now as accurate as the very best humans.”
In an effort to find the most accurate combination, the researchers then combined the results of two experts, or an expert and the best performing algorithm, and found that the human-computer team ultimately prevailed.
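As a rough illustration of how two sets of judgments can be combined, the sketch below averages the ratings two raters give each image pair and scores the result. It is not the researchers' actual procedure: the rating scale (-3 to +3), the accuracy measure (area under the ROC curve), the function names and all the numbers are assumptions made up for this example, and the algorithm's scores are assumed to have been rescaled to the same range as the examiner's.

```python
from itertools import product


def auc(ratings, same_person):
    """Area under the ROC curve: the chance that a randomly chosen
    same-person pair is rated higher than a randomly chosen
    different-person pair (ties count as half)."""
    pos = [r for r, s in zip(ratings, same_person) if s]
    neg = [r for r, s in zip(ratings, same_person) if not s]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def fuse(*raters):
    """Average the ratings that several raters gave each image pair."""
    return [sum(scores) / len(scores) for scores in zip(*raters)]


# Invented data: six image pairs, three of which show the same person.
same_person = [True, True, True, False, False, False]
examiner = [3, 2, -1, -2, 1, -3]    # hypothetical human ratings
algorithm = [1, -1, 2, 0, -2, -1]   # hypothetical rescaled algorithm scores

fused = fuse(examiner, algorithm)

print("examiner :", round(auc(examiner, same_person), 3))
print("algorithm:", round(auc(algorithm, same_person), 3))
print("fused    :", round(auc(fused, same_person), 3))
```

On this invented data each rater makes a mistake on a different image pair, so the averaged ratings separate same-person and different-person pairs better than either rater alone. Real data will not always behave this neatly.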
“Experts in face identification often play a crucial role in criminal cases. Deciding whether two images are of the same person, or two different people, can have profound consequences,” said White.
“When facial comparison evidence is presented in court, it can determine the outcome of a criminal trial. Errors on these decisions can potentially set a guilty person free, or wrongly convict an innocent person.”
Already happening in practice
While the current paper finds that the algorithms are at best equal to professionals, some experts suggest that might not be the reality.
Some facial recognition algorithms reportedly have an accuracy of 99% when dealing with large libraries of images, which humans cannot match. And while facial recognition software might be a tool to narrow down the possibilities, the experts say trained professionals currently do have the final say – which was identified as the best combination in the study.
The main failing of humans, however, is the variability between people. While a well-trained forensic expert may have been able to outperform the algorithm, most did not. And in some cases, even the experts struggled to outperform regular people.
“The results with people showed large variation in accuracy of individuals in all the groups tested. This ranged from near random guessing, with an accuracy of about 50%, to a perfect score of 100%,” said Dr White.
“This variability is a problem, because it is common practice for just one examiner to present face identification decisions in court.”
The use of algorithms takes that variability out of the equation.
However, while the algorithms are just as good as the best human judges of faces and more consistent, based on the current research it’s clear that teamwork between the two is the best solution.
The research has been published in the Proceedings of the National Academy of Sciences.