An end to comment spam: reCAPTCHA

Posted by Daniel Stout on Wed 16 Jan 2008 at 7:04 PM

One of the chores of managing a blog is dealing with comment spam. Comment spam is an endless stream of computer-generated comments intended to fill blogs with links to pages full of spam content. Those links help raise the profile of the spammers’ pages in Google, so when people search for certain keywords, they end up at some spam site. Google works hard to eliminate spam from its index, but keeping individual blogs clean is up to the blogger.

The folks behind WordPress had an ingenious plug-in called Akismet that for a long time eliminated spam comments for people with WordPress blogs, or even those who ran the plug-in on Movable Type, such as myself. But a while ago, the spammers beat Akismet. Spam comments were getting the blessing of Akismet, which simply wasn’t the case before.

When I first installed Akismet, people were able to comment directly on the blog. Occasionally, Akismet would misidentify a comment, but overall it did a good job. But when Akismet started failing consistently on certain spam comments, I ended up having to moderate comments. This is a good solution in the sense that it prevents any spam from getting to the blog. But comment moderation is a time-consuming process, and it doesn’t allow people to comment instantly. A comment may take hours to show up on the site, depending on when I get a chance to moderate.

As Akismet became increasingly useless, I knew I needed to find a new solution. I looked at the plug-ins available for Movable Type and did some Google searching for other options. What I settled on is a service from Carnegie Mellon University called reCAPTCHA. I’ve been running reCAPTCHA for a couple of months, and it has succeeded on several fronts: I have turned off comment moderation and comments appear instantly, I no longer have to deal with comment spam, and I no longer run the Akismet plug-in.

reCAPTCHA is a CAPTCHA form of bot deterrence. A CAPTCHA is a form that requires you to enter some letters or words that presumably a robot program can’t read. That ensures that a person is leaving the comment. reCAPTCHA takes this idea a couple of steps further. First, it shows two words, both of which have been taken from projects to scan library books. One of the words is known, and one is unknown. That is, the OCR program reading the scan was unable to identify the unknown word. But as a human, you can usually identify it. So by entering the two words, you’re helping digitize books.
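To make the two-word idea concrete, here is a minimal Python sketch of how such a check could work. It is not reCAPTCHA’s actual code: the class name, the case-insensitive matching, and the `votes_needed` threshold are all my own assumptions for illustration. The known word decides whether you pass; the unknown word is only recorded as a vote toward a reading.

```python
from collections import Counter


def normalize(word: str) -> str:
    """Compare answers case-insensitively and ignore stray spaces (an assumption)."""
    return word.strip().lower()


class TwoWordChallenge:
    """Hypothetical model of one known (control) word paired with one unknown word."""

    def __init__(self, control_word: str, votes_needed: int = 3):
        self.control_word = normalize(control_word)
        self.votes_needed = votes_needed  # assumed threshold, not from the service
        self.votes = Counter()            # human readings of the unknown word

    def submit(self, control_answer: str, unknown_answer: str) -> bool:
        """Pass or fail on the known word only; record the other answer as a vote."""
        if normalize(control_answer) != self.control_word:
            return False  # failed the known word, so the other answer is discarded
        self.votes[normalize(unknown_answer)] += 1
        return True

    def agreed_reading(self):
        """Return the unknown word once enough people agree on it, else None."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.votes_needed else None
```

In this sketch a bot that guesses wrong on the known word never gets to vote, which is why the votes on the unknown word can be trusted a little more than raw input.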

Another nice aspect of reCAPTCHA is that it is accessible. You can click on the speaker symbol and get an audio CAPTCHA. If vision is a problem, then the audio route works well.

reCAPTCHA’s slogan is “Stop spam. Read books.” That’s something I can identify with. You can go to the reCAPTCHA website at reCAPTCHA.net and find plug-ins for most types of blogging software, or you can roll your own with some PHP code. You also need to set up a free account and register a domain for the plug-in to work. You get a public and a private key to encode your transmissions between your blog and the reCAPTCHA service. In my experience, that all works very well.
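If you do roll your own, the server side boils down to sending the visitor’s answer along with your private key and reading back a yes or no. Here is a rough Python sketch of those two steps. The form field names and the two-line reply format are assumptions based on my recollection of the service, so check the reCAPTCHA documentation before relying on them; the helper names are hypothetical, and the sketch deliberately makes no network call.

```python
from urllib.parse import urlencode


def build_verify_body(private_key: str, remote_ip: str,
                      challenge: str, response: str) -> str:
    """Build the form-encoded body for a verification request.

    Field names are assumed, not taken from official documentation.
    """
    return urlencode({
        "privatekey": private_key,  # the secret key for your registered domain
        "remoteip": remote_ip,      # address of the person leaving the comment
        "challenge": challenge,     # identifier of the word pair that was shown
        "response": response,       # what the person typed
    })


def parse_verify_reply(text: str):
    """Parse an assumed reply format: 'true' or 'false', then an error code line."""
    lines = text.strip().splitlines()
    ok = bool(lines) and lines[0].strip() == "true"
    error = lines[1].strip() if (not ok and len(lines) > 1) else None
    return ok, error
```

The comment would only be published when the first value returned by `parse_verify_reply` is true; an empty or unexpected reply is treated as a failure.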

So if you’ve left a comment recently on one of my sites, thanks for filling out the CAPTCHA. It has made my job as a blogger much easier and has completely removed one of the big headaches of blogging. Finally not having to deal with comment spam is amazing.


Comments (1)
Posted by Dr. Mike Wendell on March 31, 2008 10:20 AM | Permalink

I for one have a great deal of problems reading reCAPTCHA. In fact to make this comment, I had to reload the image three times to get one that I could read.



This weblog is licensed under a Creative Commons License.

manufactured environments

This is a blog about technology, music, vinyl, turntables and more.
