Spam: good food for growing AIs

If you’ve been groaning in terror at the seemingly ever-growing contents of your spam folder, here’s a silver lining to the internet’s perennial plague – the ever-increasing ability of spambots to solve CAPTCHA puzzles may end up advancing the cause of artificial intelligence research. You see, it turns out that crime actually does pay:

“[von Ahn, inventor of the reCAPTCHA test] has seen bounties as high as $500,000 offered for software to break it – enough to attract people with the skills to the task and five times more than the Loebner Grand Prize offers to the programmer who designs a computer that can truly pass the Turing test.

The demise of reCAPTCHA could, however, be beneficial.

It has users decode distorted text taken from historic books and newspapers that is beyond the ability of optical character recognition (OCR) software to digitise. Humans who fill in a reCAPTCHA are helping translate those books, and spam software could do the same.

“If [the spammers] are really able to write a programme to read distorted text, great – they have solved an AI problem,” says von Ahn. The criminal underworld has created a kind of X prize for OCR.

That bonus for artificial intelligence will come at no more than a short-term cost for security groups. They can simply switch for an alternative CAPTCHA system – based on images, for example – presenting the eager spamming community with a new AI problem to crack.”

Indeed, it appears that the Google gang are doing exactly that:

“… the Google researchers were apparently able to come up with the new technique simply by looking into areas that computer scientists had identified as being problematic for computer-based solutions.

They apparently came up with image orientation. Humans can apparently properly orient a variety of images so that the vertical axis matches the real-world orientation of the photograph’s subject; computers can only handle a subset of these. […]

The basic idea behind their scheme is that any functional system will first have to eliminate any images that an automated system is likely to handle properly, as well as any that are difficult for humans to orient. So, for example, computers are good at recognizing things like faces in group shots, as well as horizons in landscape scenes, both of which provide sufficient information to orient the image. In other cases, the image doesn’t have enough information for either humans or computers to properly sort things out—the paper uses the example of a guitar on a featureless background, which could be oriented horizontally, vertically, or in the angled position from which it’s typically played.”
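
For the code-minded, the filtering step described in that quote can be sketched roughly as follows. This is a minimal illustration rather than the researchers' actual method: the `Candidate` fields, the threshold values and every score in the sample data are made-up assumptions, standing in for whatever measurements a real system would collect.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    # Hypothetical score: how often an automated detector finds the upright angle.
    machine_success: float
    # Hypothetical score: how often human raters agree on the upright angle.
    human_agreement: float


def select_captcha_images(candidates, machine_max=0.2, human_min=0.9):
    """Keep images that machines rarely orient correctly but humans agree on.

    Both thresholds are illustrative assumptions, not published values.
    """
    return [
        c for c in candidates
        if c.machine_success <= machine_max and c.human_agreement >= human_min
    ]


# Invented sample data, loosely mirroring the examples in the quote.
candidates = [
    Candidate("group_photo_with_faces", 0.95, 0.98),      # too easy for machines
    Candidate("landscape_with_horizon", 0.90, 0.97),      # too easy for machines
    Candidate("guitar_on_plain_background", 0.30, 0.40),  # ambiguous for humans
    Candidate("cluttered_desk", 0.10, 0.93),              # hard for machines, easy for humans
]

selected = [c.name for c in select_captcha_images(candidates)]
print(selected)
```

With these invented numbers, only the last image survives both filters. In a real deployment the hard part would be obtaining trustworthy scores, which this sketch simply assumes.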

I wonder if there’ll ever be an end to this particular arms race? And, if there is, will it be heralded by the arrival of the Canned Ham Singularity? [image by freezelight]