Monday 18 August 2008

AI 47: Artificial Artificial Intelligence finds a new application for old books being digitised


In a paper published recently in the journal Science, computer-science professor Luis von Ahn describes how people are being used to decipher messy text in digital scans of old books.

The system is the latest incarnation of Artificial Artificial Intelligence: a network of human brains focused on solving problems that computers still can't handle.

Luis von Ahn, an assistant professor of computer science at Carnegie Mellon University, helped develop the original twisted-word security technique, known as CAPTCHA - a slightly fractured acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. (The "Turing test" refers to mathematician Alan Turing, who in 1950 proposed a simple way to measure the success of artificial intelligence in computers.)

Since appearing on the Alta Vista search engine in 1997, the technique has become nearly ubiquitous on the Web; according to von Ahn's Science paper, people solve about 100 million captchas per day.

He and his team devised an elegant system for collecting troublesome words, turning them into captchas, and getting them solved. Books are scanned twice and the two text streams are compared; any mismatched words become captchas. Each mystery word is paired with a known word on a normal website security check, and the user is asked to solve both. If the user is right about the known word, his or her answer for the mystery word is kept and compared to solutions offered by others. Von Ahn finds that the system correctly decodes mystery words more than 99 percent of the time - results nearly identical to those of the scanning projects' human reviewers.
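
The steps described above can be sketched in a few lines of Python. This is only an illustration of the idea, not von Ahn's actual implementation: the names find_mystery_words and MysteryWord are made up for this example, the two scans are assumed to be already aligned word for word, and the min_votes threshold for accepting an answer is an assumption, since the article only says that answers are compared with those of other users.

```python
from collections import Counter


def find_mystery_words(scan_a, scan_b):
    """Return the positions where two scans of the same text disagree.

    Assumes both scans are lists of words already aligned one-to-one.
    """
    return [i for i, (a, b) in enumerate(zip(scan_a, scan_b)) if a != b]


class MysteryWord:
    """Collects human answers for one word the scans could not agree on."""

    def __init__(self, min_votes=3):
        # min_votes is a hypothetical threshold, not taken from the article.
        self.min_votes = min_votes
        self.votes = Counter()

    def record(self, known_word, known_answer, mystery_answer):
        """Keep the mystery answer only if the known word was solved correctly."""
        if known_answer.strip().lower() != known_word.strip().lower():
            return False  # user failed the control word, so discard the answer
        self.votes[mystery_answer.strip().lower()] += 1
        return True

    def consensus(self):
        """Return the most common answer once enough users agree, else None."""
        if not self.votes:
            return None
        answer, count = self.votes.most_common(1)[0]
        return answer if count >= self.min_votes else None


# Two scans of the same line; they disagree at positions 1 and 3.
scan_a = "the quick brown fox".split()
scan_b = "the qu1ck brown f0x".split()
mystery_positions = find_mystery_words(scan_a, scan_b)

# Users see the known word "morning" next to the mystery word.
word = MysteryWord(min_votes=2)
word.record("morning", "morning", "quick")   # kept
word.record("morning", "evening", "quack")   # discarded: control word wrong
word.record("morning", "Morning", "Quick")   # kept
```

The known word acts as the real security check, while the mystery word rides along for free; an answer only counts once several users who passed the check agree on it.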

According to the Science article, this system, dubbed "reCAPTCHA," is now used on some 40,000 websites, where it has solved some 44 million words in one year of operation - the equivalent of about 17,600 books in von Ahn's estimation.

Ethan Zuckerman, a fellow at Harvard's Berkman Center for Internet & Society, shares von Ahn's conviction that human computation offers a useful approach to problems that bedevil computers.

"Computer scientists - understandably - are more interested in solving problems with algorithms than by figuring out clever ways to slice them into small pieces and let humans solve them," he wrote in an e-mail last week. Among the things we do better than computers, he said, is human language itself, and translation in particular.

The above is an extract from an article called ‘Click to translate’, written by Matthew Battles.