The Daleks will stop at anything to stop us! -- The Doctor, Doctor Who: The Daleks' Masterplan

If I only had a heart… (Happy 200)

In honor of my 200th post, I thought– well, okay, this has nothing to do with it being my bicentennial; I just noticed it when I clicked on “Entries” and saw the number 199 next to it. So, anyway, on with the post…
You may not know the term, but you’ve probably seen a CAPTCHA by now. The acronym expands out to the not-really-meaningful-unless-you’re-a-CS-guy “Completely Automated Public Turing Test to tell Computers and Humans Apart”. A bit of background:
Alan Turing, one of the founding bigwigs of the whole theory of computers as we know them, had this idea: Stick a human being at a terminal of some sort (This was Turing, back in the fifties, so he was thinking of a teletype, but IM would work just as well) and have him chat for a bit with two other entities, one of which is a computer and the other a second human. If the guy at the terminal can’t tell which is which, the computer has demonstrated actual human intelligence, or, at least, something close enough to it to be interesting.
So, in a nutshell, a Turing Test is when a human tries to tell whether something else is a computer or a human. This is fairly easy (The human is less likely to say “BZZT! DESTROY ALL HUMANS!” if you annoy it). A CAPTCHA, which is sometimes ambiguously called a “Reverse Turing Test”, is when a computer tries to tell if the entity it’s talking to is human or another computer.
That is to say, it’s one of those things you get when you sign up for something on the internet and they show you a picture of some distorted random letters and ask you to type them in.
This is actually a pretty hard test. It’s comparatively easy for one computer to convince another computer that it’s a computer (“Perform these six hundred hard math problems in under a second” is a pretty simple way), but how do you convince it that you’re human? The computer conducting the test can’t measure your capacity to love, or detect if you have opposable thumbs or anything like that. In fact, the reason that it’s so easy for a human to distinguish computers and humans is that humans can perceive a lot of things that computers can’t, which, of course, means that the things that distinguish a human (taking the test) from a computer (taking the test) are things that the computer (giving the test) can’t perceive.
So, the way to tell the difference is to generate the sort of problem that humans are good at solving and computers aren’t, and ask the test-taker to solve it. Fortunately, a computer can indeed generate problems it can’t solve itself. Or, a human can provide the computer (giving the test) with a crib sheet. The most common kind you see is the kind I mentioned above. Computers are pretty good at reading written words, but not if they’ve been distorted. So you print some letters in an image, mangle them a bit, and ask the test-taker to read them. This is doable, though it’s not all that easy: mangle the letters too much and a human can’t read them. Don’t mangle them enough, and a computer can. Most of the letter-based CAPTCHAs you see on the internet aren’t all that good, and throw up manglings that a very clever computer could work out, though there are some very good letter-mangling CAPTCHAs out there. Also, CAPTCHAs can often foil humans with vision problems (Like my color blindness).
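For the curious, here is a minimal Python sketch of why the computer giving the test never has to solve its own problem: it picks the letters first, keeps them as the crib sheet, and only then mangles them. The function names, the six-letter length, and the distortion settings (`rotate_deg`, `shift_px`) are all my own assumptions for illustration, and the actual drawing of the image is left out entirely.

```python
import random
import string


def make_challenge(length=6, rng=None):
    """Pick random letters plus per-letter distortion settings.

    Returns (answer, glyphs). The test-giver stores `answer` (the crib
    sheet); only a rendering of `glyphs` would be shown to the test-taker.
    """
    rng = rng or random.Random()
    answer = "".join(rng.choice(string.ascii_uppercase) for _ in range(length))
    # Hypothetical distortion parameters that an image renderer could apply.
    glyphs = [
        {
            "char": ch,
            "rotate_deg": rng.uniform(-30, 30),
            "shift_px": rng.randint(-5, 5),
        }
        for ch in answer
    ]
    return answer, glyphs


def check_response(answer, response):
    """Compare the typed response to the stored answer, ignoring case and outer spaces."""
    return response.strip().upper() == answer
```

The tuning problem from the paragraph above lives in those distortion ranges: widen them and humans start failing, narrow them and a clever computer starts passing.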
Another CAPTCHA you see sometimes shows you several images and asks, say, “Which one is a puppy”, since that’s a hard thing for a computer to deduce. This works pretty well, but, unlike the letter-mangling test, the computer giving the test can’t generate new pictures of puppies, so unless it’s got a huge stockpile, the computer taking the test could just poke at random until it got in purely by coincidence.
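To put a rough number on "poke at random": if each challenge shows a handful of images and exactly one is the puppy, the odds of blind guessing are easy to work out. This is a back-of-the-envelope sketch that assumes every attempt is an independent, uniformly random guess and that nothing limits retries. The function name is made up for illustration.

```python
def chance_of_guessing(num_images, attempts):
    """Probability that at least one of `attempts` random guesses is correct.

    Assumes one correct image out of `num_images` per challenge and
    independent, uniformly random guesses.
    """
    if num_images < 1 or attempts < 0:
        raise ValueError("need num_images >= 1 and attempts >= 0")
    miss_once = 1 - 1 / num_images
    return 1 - miss_once ** attempts
```

With four images, a single guess gets in a quarter of the time, and ten tries get in about 94% of the time, which is why this style of test needs either a lot more choices or a limit on retries.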
I read a paper about CAPTCHAs back in grad school, and there was a really neat point they made. Unlike all the rest of computer security, if a CAPTCHA is broken, it’s basically great for mankind. Let me explain: You’ve by now probably heard of the animated cursor bug in Windows. No good can come of exploiting the animated cursor bug. There aren’t really useful things you can do by hacking an animated cursor. It’s good for exactly one thing: compromising systems to the owner’s detriment. Cryptography is largely based on number theory. Until modern cryptography was invented there was no practical use for number theory. People studied it purely for love of math. Aside from its mathematically interesting properties, the only practical use for the RSA algorithm is to encrypt data. Which means that if someone discovers an efficient solution to the RSA problem, RSA encryption is broken. The solution itself has no positive use value, beyond breaking cryptosystems. This isn’t the case for a CAPTCHA: if a computer manages to foil a CAPTCHA, it means that the computer can do something which computers are historically bad at. If it can consistently find the puppy, then we have created a computer that can identify puppies, and puppy-identification is a skill with unlimited commercial application. If our computer can consistently read mangled words, then the next generation of business card scanner software will be able to tell that the business card you ran through it isn’t for “Lockheart Martini”.
But this is just a comically longwinded introduction to what I want to show you. Woe be to all of us the day a computer learns how to break the new Hotness CAPTCHA. It uses the AmIHotOrNot API to ask users to identify which of several pictures shows the hottest person. Personally, I think they missed a great opportunity by not calling it amibotornot.com.
The other CAPTCHA I’d like to show you comes to us via Defective Yeti: Internet Access CAPTCHAs. This one is designed to tell whether the testee is a human, a computer, or an idiot. What’s neat about this is that it’s much more likely to be foiled by a clever computer than a stupid human.
Welcome to the internet. Enjoy your porn
