JCaptcha and NPR

June 9, 2005

It’s funny how my mind works. I was listening to NPR’s Weekend Edition over the weekend and they were going through listeners’ comments. At the end of that segment, they mentioned that they were no longer going to accept email comments at their old address. They want listeners to go to npr.org and fill out a web form to enter their comments. The first thought I had was spam. It had to be spam. Having an open, published email address must attract a ton of spam. I know I get a ton of spam daily, so I’m guessing NPR must have been getting thousands and thousands of spam messages.

So as I was driving along, I started thinking about moving to the web-form model to solicit feedback, and I just assumed that they would take the next logical step and add Captcha to their web app. If you don’t know, Captcha (completely automated public Turing test to tell computers and humans apart) is an acronym for a type of challenge-response test that determines whether or not the user is human. Thinking about Captcha got me thinking about JCaptcha, an open-source Java framework for Captcha definition and integration. I knew about JCaptcha because I had read about it on Dion’s blog a while back, so I finally decided to download JCaptcha and give it a try.

I was impressed with how easy it was to incorporate Captcha into an existing application. Here is a simple web-app I built using the 5 minutes application integration tutorial on the JCaptcha wiki.

Here’s the JSP that acts as an entry into the application:




<title>JCaptcha Sample</title>


<h1>Sample JCaptcha</h1>
<p>A captcha (an acronym for "<b>c</b>ompletely <b>a</b>utomated <b>p</b>ublic <b>T</b>uring
test to tell <b>c</b>omputers and <b>h</b>umans <b>a</b>part") is a type of challenge-response
test used in computing to determine whether or not the user is human.</p>

<form method="post" action="/jcaptcha/validate">
<table cellspacing="5" cellpadding="0" border="0">
<tr>
<td><b>Name:</b></td>
<td></td></tr>
<tr>
<td><b>Email Address:</b></td>
<td></td></tr>
<tr>
<td><b>Comments:</b></td>
<td>
<textarea name="comments">
</textarea></td></tr>
<tr>
<td colspan="2"><b>Enter the text as it is shown below:</b></td>
</tr>
<tr>
<td><img src="jcaptcha"/></td>
<td><input type="text" name="j_captcha_response"/></td></tr>
<tr>
<td colspan="2"><b>This extra step helps prevent automated abuse of this feature.
Please enter the characters exactly as you see them.</b></td>
</tr>
</table>
</form>


To initialize the Captcha service, you create a singleton that instantiates an ImageCaptchaService, which provides the facility to cache the Captcha and create the image.

package com.j2eegeek.jcaptcha.common;

import com.octo.captcha.service.image.DefaultManageableImageCaptchaService;
import com.octo.captcha.service.image.ImageCaptchaService;

/**
 * The <code>CaptchaServiceSingleton</code> implements the Singleton pattern and returns the single shared
 * instance of the ImageCaptchaService.
 */
public class CaptchaServiceSingleton {

    // created once when the class is loaded and never reassigned
    private static final ImageCaptchaService instance = new DefaultManageableImageCaptchaService();

    public static ImageCaptchaService getInstance() {
        return instance;
    }
}

Once we’ve created an instance of the ImageCaptchaService, we can create a servlet that will allow us to create an image. The servlet asks the singleton for the ImageCaptchaService and calls its getImageChallengeForID() method.

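import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import com.j2eegeek.jcaptcha.common.CaptchaServiceSingleton;
import com.octo.captcha.service.CaptchaServiceException;
import com.sun.image.codec.jpeg.JPEGCodec;
import com.sun.image.codec.jpeg.JPEGImageEncoder;

/**
 * The <code>ImageCaptchaServlet</code> class creates the actual image that's displayed to the user for validation.
 * The servlet asks the singleton for the ImageCaptchaService and calls its
 * getImageChallengeForID method.
 */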
public class ImageCaptchaServlet extends J2EEGeekBaseServlet {

    private static final Log log = LogFactory.getLog(ImageCaptchaServlet.class);

    public void doWork(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException {

        byte[] captchaChallengeAsJpeg = null;
        // the output stream to render the captcha image as jpeg into
        ByteArrayOutputStream jpegOutputStream = new ByteArrayOutputStream();
        try {

            // get the session id that will identify the generated captcha.
            String captchaId = req.getSession().getId();

            // call the ImageCaptchaService getImageChallengeForID method
            BufferedImage challenge = CaptchaServiceSingleton.getInstance().getImageChallengeForID(captchaId, req.getLocale());

            // a jpeg encoder
            JPEGImageEncoder jpegEncoder = JPEGCodec.createJPEGEncoder(jpegOutputStream);
            jpegEncoder.encode(challenge);
        } catch (IllegalArgumentException e) {
            // log the exception itself; e.getCause() may be null
            log.error("IllegalArgumentException while creating the captcha image", e);
            res.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        } catch (CaptchaServiceException e) {
            log.error("CaptchaServiceException while creating the captcha image", e);
            res.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
            return;
        }

        captchaChallengeAsJpeg = jpegOutputStream.toByteArray();

        // flush it in the response
        res.setHeader("Cache-Control", "no-store");
        res.setHeader("Pragma", "no-cache");
        res.setDateHeader("Expires", 0);
        res.setContentType("image/jpeg");
        ServletOutputStream out = res.getOutputStream();
        out.write(captchaChallengeAsJpeg);
        out.flush();
        out.close();
    }
}
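
One caveat about the servlet above: JPEGCodec and JPEGImageEncoder live in the Sun-specific com.sun.image.codec.jpeg package, which isn’t part of the standard Java API and may be missing on other or newer JVMs. Here is a minimal sketch of encoding the challenge with the standard javax.imageio package instead. JpegEncodingSketch and toJpeg are hypothetical names used for illustration; they are not part of JCaptcha or the wiki tutorial.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical helper for illustration; not part of JCaptcha or the original sample. */
class JpegEncodingSketch {

    /** Encodes an image as JPEG bytes using the standard javax.imageio API. */
    static byte[] toJpeg(BufferedImage image) throws IOException {
        // JPEG has no alpha channel, so copy the image onto an opaque RGB canvas first.
        BufferedImage rgb = new BufferedImage(
                image.getWidth(), image.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.drawImage(image, 0, 0, null);
        } finally {
            g.dispose();
        }

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // ImageIO.write returns false when no suitable writer is registered.
        if (!ImageIO.write(rgb, "jpeg", out)) {
            throw new IOException("No JPEG writer available");
        }
        return out.toByteArray();
    }
}
```

With a helper like this, the two jpegEncoder lines in the servlet could be swapped for something like jpegOutputStream.write(JpegEncodingSketch.toJpeg(challenge)), and the rest of the method would stay the same.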

Once you’ve created the image and displayed it via the index.jsp page, you need to validate the response entered by the user.

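import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

import com.j2eegeek.jcaptcha.common.CaptchaServiceSingleton;
import com.octo.captcha.service.CaptchaServiceException;

public class ValidateServlet extends J2EEGeekBaseServlet {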

    private static final Log log = LogFactory.getLog(ValidateServlet.class);

    public void doWork(HttpServletRequest req, HttpServletResponse res) throws ServletException, IOException {

        res.setContentType("text/html");
        ServletOutputStream out = res.getOutputStream();
        out.println("<title>JCaptcha Sample</title>");
        Boolean isResponseCorrect = Boolean.FALSE;

        String captchaId = req.getSession().getId();
        String response = req.getParameter("j_captcha_response");
        try {
            isResponseCorrect = CaptchaServiceSingleton.getInstance().validateResponseForID(captchaId, response);
        } catch (CaptchaServiceException e) {
            // log the exception itself; e.getCause() may be null
            log.error("CaptchaServiceException while validating the response", e);
        }

        if (isResponseCorrect.booleanValue()) {
            out.println("<h1>Success -- <a href=\"/jcaptcha/\">Try again?</a></h1>");

        } else {
            out.println("<h1>Failure -- <a href=\"/jcaptcha/\">Try again</a></h1>");
        }
    }
}
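
To make the flow concrete: the image servlet asks for a challenge under an id (here the session id), and the validate servlet later checks the user’s answer against whatever was stored for that same id. The sketch below is a plain-Java stand-in for that idea, not JCaptcha’s implementation. ChallengeStoreSketch and its methods are hypothetical names, and discarding the answer after a single attempt is an assumption of the sketch.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for a captcha service; not JCaptcha's code. */
class ChallengeStoreSketch {

    // Characters that are easy to tell apart when rendered in an image.
    private static final String ALPHABET = "abcdefghjkmnpqrstuvwxyz23456789";
    private static final int LENGTH = 6;

    private final SecureRandom random = new SecureRandom();
    private final Map<String, String> expectedById = new ConcurrentHashMap<>();

    /** Creates a random challenge text and remembers it under the given id. */
    String newChallengeFor(String id) {
        StringBuilder text = new StringBuilder(LENGTH);
        for (int i = 0; i < LENGTH; i++) {
            text.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
        }
        expectedById.put(id, text.toString());
        return text.toString();
    }

    /** Checks the response and forgets the challenge, so each one can be tried only once. */
    boolean validate(String id, String response) {
        String expected = expectedById.remove(id);
        return expected != null && expected.equals(response);
    }
}
```

In a real service the challenge text would be rendered into a distorted image rather than returned to the caller; the lookup-by-id part is the piece this sketch is meant to show.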

Here’s all the code, essentially a rip-off of the wiki tutorial, as an IDEA project. Another great resource is the JavaWorld article by Anand Raman that Dion points to. He goes into detail about incorporating Captcha into JAAS.


4 Responses to “JCaptcha and NPR”

  1. Using HttpSession is overhead when you just want visitors to leave a comment on a web page; it won’t scale to millions of visitors per second ;-).
    Why not pass the captchaId to the client to save as a hidden field and then submit it back? I assume any Captcha implementation should include some randomness, so it’s pretty safe to expose the id to the client.

  2. Vinny said

    Hi Dmitri, you’re right. In this simple scenario, HttpSession will add some overhead, but in a real application, client-based hidden fields are subject to manipulation by the client. Unless you are validating all input, including hidden fields passed from form to form every time, you can be subject to ‘tinkering’. 🙂

    In a real application, HttpSession usually ends up playing an integral role and so it’s probably not a big deal. In the case of a simple guestbook type example, the captcha-id can probably be passed (FORM) or set as a cookie. Cheers

  3. Billy said

    CAPTCHAs are a pretty interesting technology. I remember reading about them a few years ago… after one on the Ticketmaster site mesmerized me. I was initially excited but it was tempered as I knew the dev community wouldn’t embrace it (wide-scale). IMO, the biggest reasons you don’t see them all over the place are the usability and testing issues. Essentially what you’re doing (if you implement them) is taking what is considered an enterprise problem (security) and pushing it off on the client/customer. This approach will/does negatively impact the user’s experience. I suspect the number of email submissions on NPR’s site would be greatly impacted. The goal of any good system is to keep them on the site… the inherent indecipherability/counter-intuitiveness causes damaging accessibility issues. People with visual impairment stumble on them and are discouraged according to the research. Automated and UI tests are rendered useless as CAPTCHAs are, by design, difficult to program for.

    Don’t get me wrong…. it’s a cool technology. But it’s probably best used for secure or limited-access sites where a user expects to be challenged to that degree. The reason that the Ticketmasters, Yahoos, Ebays of the world use them is because they can afford to turn some people off.

    My two cents…

    Billy

  4. sharath said

    Thank you so much for the nice description. It worked out for me too.
