Verifying the Presence of Humans: Three New CAPTCHAs
Neal R. Wagner
 


Introduction:

Modern Internet providers and others often want to verify that they are dealing with a human being, rather than with a software agent. For example, services such as email or e-shopping are designed for humans but can be abused by automated software. Other automated software may search the entire web for information. To thwart such automation and make sure one is dealing with a human, special automated tests are now popular: tests that are easy for a human to pass but hard for a machine. Manuel Blum and other early researchers in this area gave these tests the name CAPTCHA.

This web page presents three new prototype CAPTCHAs, written as Java applets. These are strictly prototypes, just to illustrate the approach in each case; they are not intended as final products. The web applets below use the Java class BufferedImage, so you need Java 2, Version 1.3 or later to run the applets.


First CAPTCHA -- Shapes:

The first CAPTCHA is inspired by the five symbols on cards that have long been used to test for ability with extrasensory perception (ESP): a circle, a square, a plus, three wavy lines, and a star. For the test, I turned the three wavy lines into three vertical lines, and turned the (five-pointed) star into a triangle with a vertical left side.

The applet starts with a shape in undistorted form with line thickness from 4 to 12 pixels. Then the shape is distorted both vertically and horizontally using random sine curves. All white background pixels are changed randomly to black with probability 0.5. Finally, the pixels in the lines of the shape itself are changed to white with probabilities given in the table below, and the shape drawing is black/white reversed with probability 0.5. In particular, for a shape with line thickness 12, and using the ``hard'' mode, the drawing itself is 40%/60% white/black (or vice versa).
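As a rough illustration of the distortion step, here is a minimal Java sketch of warping a black/white bitmap with two sine curves. It is not the applet's code: the class `SineWarp`, the `Wave` helper, the use of a `boolean[][]` bitmap instead of a BufferedImage, and the numeric ranges for amplitude and period are all assumptions made for the example.

```java
import java.util.Random;

/** Hypothetical sketch: warp a black/white bitmap with two sine curves. */
final class SineWarp {

    /** One sine curve: offset(t) = amplitude * sin(2*pi*t/period + phase). */
    static final class Wave {
        final double amplitude, period, phase;

        Wave(double amplitude, double period, double phase) {
            this.amplitude = amplitude;
            this.period = period;
            this.phase = phase;
        }

        double offset(double t) {
            return amplitude * Math.sin(2 * Math.PI * t / period + phase);
        }

        /** Random start, period, and amplitude; the ranges are illustrative guesses. */
        static Wave random(Random rng) {
            return new Wave(2 + 6 * rng.nextDouble(),        // amplitude: 2 to 8 pixels
                            40 + 80 * rng.nextDouble(),      // period: 40 to 120 pixels
                            2 * Math.PI * rng.nextDouble()); // start phase
        }
    }

    /**
     * Returns a warped copy of src (indexed [row][column], true = black).
     * Each output pixel is read from a source position shifted horizontally
     * by a wave in y and vertically by a wave in x; reads that fall outside
     * the bitmap are treated as white.
     */
    static boolean[][] warp(boolean[][] src, Wave horizontal, Wave vertical) {
        int h = src.length, w = src[0].length;
        boolean[][] dst = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int sx = x + (int) Math.round(horizontal.offset(y));
                int sy = y + (int) Math.round(vertical.offset(x));
                if (sx >= 0 && sx < w && sy >= 0 && sy < h) {
                    dst[y][x] = src[sy][sx];
                }
            }
        }
        return dst;
    }
}
```

A call such as `SineWarp.warp(shape, SineWarp.Wave.random(rng), SineWarp.Wave.random(rng))` would distort a drawn shape in both directions. Reading from a shifted source position, rather than pushing source pixels forward, avoids leaving holes in the output.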

Right now the orientation of the shapes is not randomized, that is, there is only one orientation. It would also be possible to insert artifacts into the field, particularly additional portions of lines to confuse software analysis. Unlike many other CAPTCHAs, this one in its ``hard'' setting is right at the edge of human ability, so that any such additions would be equally confusing to humans, and one would need to decrease the numbers in the table below for humans to guess these shapes.

  • Types: Circle, Square, Plus, Lines, Triangle.
  • Thickness of lines: From 4 to 12 pixels.
  • Size: Random from about a sixth of the field to most of it.
  • Location: Also random in the field.
  • Distortions: Shapes distorted in both directions using
    sine curves with random start, period, and amplitude.
  • Randomization of Pixels: Background pixels set randomly
    to black or white, while pixels in the lines are reversed
    according to the following table:
    Percent Pixels Randomly Reversed

    Line                  Difficulty
    Thickness     Easy     Medium     Hard
        4          20%       26%       32%
        6          22%       28%       34%
        8          24%       30%       36%
       10          26%       32%       38%
       12          28%       34%       40%
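
The pixel randomization in the list above can be sketched in Java roughly as follows. This is an illustration, not the applet's code: the class `ShapeNoise`, the `Difficulty` enum, and the `boolean[][]` bitmap are hypothetical names and choices. Two assumptions are worth flagging. Only the even thicknesses listed in the table are accepted, and the black/white reversal is applied to the whole image, which leaves the 50/50 background statistically unchanged.

```java
import java.util.Random;

/** Hypothetical sketch of the pixel randomization step (true = black). */
final class ShapeNoise {

    enum Difficulty { EASY, MEDIUM, HARD }

    /** Percent of line pixels reversed; rows are thickness 4, 6, 8, 10, 12. */
    private static final int[][] PERCENT = {
        {20, 26, 32},
        {22, 28, 34},
        {24, 30, 36},
        {26, 32, 38},
        {28, 34, 40},
    };

    /** Table lookup; assumes only the even thicknesses listed in the table. */
    static double reversalProbability(int thickness, Difficulty d) {
        if (thickness < 4 || thickness > 12 || thickness % 2 != 0) {
            throw new IllegalArgumentException("thickness must be 4, 6, 8, 10, or 12");
        }
        return PERCENT[(thickness - 4) / 2][d.ordinal()] / 100.0;
    }

    /**
     * shape[y][x] is true on the lines of the (already distorted) shape.
     * Background pixels become black with probability 0.5, line pixels become
     * white with the table probability, and the result is inverted with
     * probability 0.5.
     */
    static boolean[][] randomize(boolean[][] shape, int thickness,
                                 Difficulty d, Random rng) {
        double p = reversalProbability(thickness, d);
        boolean invert = rng.nextBoolean();
        int h = shape.length, w = shape[0].length;
        boolean[][] out = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean black = shape[y][x]
                        ? rng.nextDouble() >= p   // line pixel: white with probability p
                        : rng.nextBoolean();      // background: black with probability 0.5
                out[y][x] = invert != black;      // flip every pixel when invert is true
            }
        }
        return out;
    }
}
```

With thickness 12 in hard mode, the line pixels of the result come out about 60% black, or about 40% black when the inversion is applied, matching the 40%/60% figure quoted earlier.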


Second CAPTCHA -- Fisheye:

The next approach presents a picture with a region of distortion in it. The human user is supposed to click the mouse at the center of the distortion; he passes the test if he is fairly close to the center, perhaps in several successive pictures.

The specific approach here inserts a ``fisheye'' into a picture with enough regularity so that a human can find the region of distortion. For this application, a fisheye is a circular distortion that expands a picture at the center, contracts toward the edges, and smooths out at the radius to end up the same as the original picture, without any sharp ``kink'' at that radius. The particular fisheye used here appears at a random location (though not too close to the boundary) and with a random radius (though not too small). The expansion and contraction from the center to the edge of the circle is given by the inverse of the equation:

    g(s) = -(3/4)s^3 + (3/2)s^2 + (1/4)s,  s from 0 to 1.

For this equation, I needed g(0) = 0, g(1) = 1, g'(1) = 1, and g'(0) well below 1; in this case I chose the above simple function with g'(0) = 1/4. The implementation below is strictly a prototype. In an actual system, one would want various additional distortions and randomizations, including cropping and distorting the original image, use of a random-shaped elliptical fisheye, and a final randomization of the resulting pixels.
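
To make the radial mapping concrete, here is a minimal Java sketch built around the cubic g. It is one plausible reading of the description rather than the applet's actual code: the class `Fisheye` and its method names are hypothetical, and the sketch assumes the usual image-warping convention in which each output pixel at normalized distance s from the center is sampled from the original picture at distance g(s), so that picture content itself moves outward by the inverse of g. Since g(s) - s = -(3/4)s(s - 1)^2 is never positive on [0, 1], every sample inside the circle is taken from nearer the center, which magnifies the middle (by a factor of 4 at the center, where g'(0) = 1/4).

```java
/** Hypothetical sketch of the fisheye radial mapping. */
final class Fisheye {

    /** g(s) = -(3/4)s^3 + (3/2)s^2 + (1/4)s on [0, 1], in Horner form. */
    static double g(double s) {
        return ((-0.75 * s + 1.5) * s + 0.25) * s;
    }

    /** Inverse of g on [0, 1] by bisection; valid because g is increasing there. */
    static double gInverse(double t) {
        double lo = 0.0, hi = 1.0;
        for (int i = 0; i < 60; i++) {
            double mid = 0.5 * (lo + hi);
            if (g(mid) < t) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return 0.5 * (lo + hi);
    }

    /**
     * For an output pixel (x, y), returns the {x, y} position in the original
     * picture to sample from. Pixels on or outside the circle, and the exact
     * center, map to themselves.
     */
    static double[] sourceOf(double x, double y,
                             double cx, double cy, double radius) {
        double dx = x - cx, dy = y - cy;
        double dist = Math.hypot(dx, dy);
        if (dist == 0.0 || dist >= radius) {
            return new double[] {x, y};
        }
        double s = dist / radius;
        double scale = g(s) / s;   // at most 1 inside the circle
        return new double[] {cx + dx * scale, cy + dy * scale};
    }
}
```

The `gInverse` method is only needed to go the other way, for example to predict where a given point of the original picture lands after distortion.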


Third CAPTCHA -- What's Wrong With this Picture?

A common puzzle for children gives them a confusing picture with a number of items that ``don't belong'' in the picture. A picture of a fishbowl may have a lit candle in it, or a bedroom might have a fish on the wall. Identifying such ``mistakes'' is subjective and even dependent on cultural assumptions, and hence potentially hard for machines to analyze and (if carefully constructed) easy for humans to spot.

The particular system here starts with one of a number of generic background pictures, into which many objects would naturally fit. Pictures of quite a few of these objects are inserted, along with perhaps 3 to 5 objects that unquestionably do not belong in such a picture. More specifically, the CAPTCHA below starts with a spider web, and adds various pictures of bugs. Then several other pictures of objects are added that would be insane to find in a spider web. The user is asked to click the mouse at the rough center of any object not belonging. With 15 objects in the picture, 5 of which do not belong, random software-generated clicks on 5 objects would get the correct 5 only one time out of C(15, 5) = 15!/(5! 10!) = 3003, that is, with probability about 0.000333.
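
The odds of a blind guess can be checked with a short calculation. The Java sketch below is only an illustration: the class `GuessOdds` and its method names are hypothetical, and it assumes the attacking software clicks exactly as many distinct objects as there are wrong ones, chosen uniformly at random.

```java
import java.math.BigInteger;

/** Hypothetical sketch: odds that random clicks hit exactly the wrong objects. */
final class GuessOdds {

    /** Binomial coefficient C(n, k), computed exactly. */
    static BigInteger choose(int n, int k) {
        if (k < 0 || k > n) {
            return BigInteger.ZERO;
        }
        k = Math.min(k, n - k);
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i the value is C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Probability that a uniformly random choice of `wrong` objects is the right one. */
    static double guessProbability(int objects, int wrong) {
        return 1.0 / choose(objects, wrong).doubleValue();
    }
}
```

Calling `GuessOdds.choose(15, 5)` and `GuessOdds.guessProbability(15, 5)` gives the count and probability for 15 objects with 5 that do not belong; other picture sizes can be tried the same way.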

This system is quick and easy to use. It is also linguistically neutral, though it might be culturally dependent.

Right now this approach with errors in a picture consists of just a single background picture (a spider web), and a limited collection of foreground objects (bugs, and other completely inappropriate objects). Even though one assumes that the complete database of images is public, it is still necessary to have a number of background pictures, and a large number of possible foreground objects. Of course different objects would be appropriate for different backgrounds. Each background should be clipped from a larger background and then distorted. Similarly the foreground objects need to be distorted and randomly rotated, as well as randomly placed as shown in the example applet. It was always clear that all the images here should be simple black and white line drawings, with thick lines. I had such trouble finding appropriate drawings that I finally commissioned my 12-year-old daughter to produce some, and that is what is shown in the applet.


Paper:

Here is a copy of a paper covering the material on the website: PDF (0.23 MB), PostScript (1.29 MB).


Revision date: 2003-10-15. (Please use ISO 8601, the International Standard.)