Dec 29 2005

Taking on Comment Spam - Apologies for the Mess

Published by Daniel Cody at 9:53 am under Technology

As you may have noticed, the ‘recent comments’ section on the left side of my weblog has been filled lately by two types of comments: those pushing various poker, casino, and other card type games and those from ‘real’ sounding people telling me what an “Interesting wonderful great site you have! I will be sure to come back in the future” followed by a few paragraphs of crap about whatever product they’re hawking.

It’s the commonly known problem of comment spam: robots crawl the Web and post comments on weblogs with links to their own sites to boost their page rank with Google.

For the first few months after I switched to the dancody.org domain name, I didn’t have a problem with comment spam bots at all because my own ‘page rank’ with Google was so low that there was nothing to be gained from having a link from my site. In the last month or so though, I’ve been posting a lot more, and a few of my posts have been linked to from well-known weblogs, two journals, and last week the web site of the Washington Post.

In fact, the month of December has been the busiest ever for my website (going back five years with five2one.org) with about 1,000 unique visitors a day and 5,000 - 7,500 page views a day. Nothing huge in the grand scheme of things of course, and not anything that drives what happens here, but interesting nonetheless. It’s also slightly gratifying that after years of talking about the issues I talk about - in the way that I talk about them - I’m making a small difference and being looked to for my own opinion from time to time. And as all of my good friends know, if there’s anything that I’m all about, it’s self gratification. Kidding of course ;)

All of those things have increased the page rank of my own site, which makes it more attractive for spam bots to post their links here with the hope of increasing their own page rank. To be honest, what is getting through in terms of spam on the site is a very small amount of what the spam bots are trying to get through. I block probably about 90% of the comment spam attempts every day before they make it to the site, but a few do trickle through.

There are some other ways to stop it, but I don’t want to make people ‘register’ for my site just to post a comment, nor jump through any other hoops like moderation from me. Bryan mentioned to me last week that the new version of Movable Type has some increased functionality with regard to comment spam, so I might be upgrading to that in the next week, which should take a bite out of the comment spam so you can see in an instant who is berating who in the political slugfest of the day :)

8 Responses to “Taking on Comment Spam - Apologies for the Mess”

  1. mwarden on 29 Dec 2005 at 11:08 am

    WordPress has completely killed comment spam on my blog. It has a blacklist, a whitelist, and a moderation queue. Basically, I can blacklist certain terms and comments with greater than x links (or send them to the moderation queue instead of rejecting them outright). Then, I can set it to allow comments to bypass the moderation queue if the poster has had a comment approved before. I believe this is based on email address, which is required to post comments but is never displayed.

    Requiring the email address is kind of a hassle, but I don’t think it’s so bad that people can’t stand it. They really only have to type it in once, too, if they use the ‘remember me’ feature.

  2. Miriam on 29 Dec 2005 at 11:42 am

    I just switched up to the WordPress 2.0 release candidate, and its SpamKarma has worked like a charm. And it’s open source ;)

  3. Ben on 29 Dec 2005 at 3:12 pm

    I’d recommend WordPress too. I started off with Movable Type, and had the same problems with spam. WP has much nicer filtering abilities, and has reduced spam to basically zero, without forcing commenters to beat a Turing test a la Blogger.

    Plus, WordPress is released under the GPL, which kicks ass if, like me, you’re into that kind of thing.

  4. Bruno Wolff on 31 Dec 2005 at 1:43 pm

    Note that checking Referer headers (the actual header name is not spelled the same as the English word “referrer”) isn’t going to buy much in the long run. As soon as enough sites start using them to screen comments, the bots will start supplying them.
    Even things like CAPTCHAs can be defeated, both by programs and by recruiting people to solve them. (See http://www.boingboing.net/2004/01/27/solving_and_creating.html for one creative solution.)
    A big step would be Microsoft redesigning their user interfaces so that they didn’t make it so easy for people to shoot themselves in the feet. That could make source based discrimination practical again.
    In the medium term you could let only authenticated people post without moderation and use filtering to help with the moderation.

  5. mwarden on 02 Jan 2006 at 5:27 am

    The WordPress 2 final release actually has a plugin that uses a distributed comment spam system called Akismet. I just installed it and it looks decent. It’s like Gmail: users identify a given comment as spam, and then any similar-looking comment on other blogs on the network (those that use the plugin) gets flagged as spam. Might be something to look into.

  6. Dan on 02 Jan 2006 at 11:25 pm

    WordPress looks nice, and I’d probably do it if it weren’t for the amount of work required to migrate five years of weblog posts from one system to another. It was bad enough going from my old custom built weblog app to Movable Type a few years ago. Not to mention all the templating and the nifty instant comment posting system that Bryan whipped up for MT.

    We’ll see, thanks for all the suggestions though :)

  7. mwarden on 03 Jan 2006 at 9:14 pm

    Migrating from MT to WordPress is pretty easy. They made sure they made it easy after MT changed their license scheme and people started jumping ship. As for the nifty ajax commenting, obviously that would probably break for the most part. You could either fix it, or there are WordPress plugins available for ajax commenting (although they probably don’t work with WP 2.0 yet).

  8. Spice on 15 Jan 2006 at 3:49 pm

    Another recommendation for WordPress here. Open source always gets my vote. Plus there’s an initiative gaining steam out there called OpenID (http://www.openid.net/), which requires you to verify your identity once by proving affiliation with a URL, and then lets you comment at participating sites simply by citing that URL. Something to keep an eye on (in a good way) as it develops.

    Maybe then we could do away with the “nofollow” dealy. I like the trust factor afforded by legitimate blog comments.

    Spice
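
For the curious, the filtering scheme mwarden describes in the first comment (blacklisted terms, a link limit, and a moderation queue that previously approved posters can skip) can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not WordPress's actual code: the class name, method names, verdict strings, default link limit, and the order of the checks are all made up for illustration.

```python
import re
from dataclasses import dataclass, field

# Rough link detector (assumption: counting "http://" or "https://" is enough).
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


@dataclass
class CommentFilter:
    """Hypothetical comment filter: blacklist, link limit, moderation queue."""

    blacklist: set                      # banned terms, matched case-insensitively
    max_links: int = 2                  # the "x" in "greater than x links" (made-up default)
    approved_emails: set = field(default_factory=set)

    def approve(self, email: str) -> None:
        """Record that a moderator approved a comment from this address."""
        self.approved_emails.add(email.lower())

    def classify(self, email: str, body: str) -> str:
        """Return 'reject', 'moderate', or 'publish' for one comment."""
        text = body.lower()
        # 1. Blacklisted terms are rejected outright (could be 'moderate' instead).
        if any(term.lower() in text for term in self.blacklist):
            return "reject"
        # 2. Too many links sends the comment to the moderation queue.
        if len(LINK_PATTERN.findall(body)) > self.max_links:
            return "moderate"
        # 3. Posters with a previously approved comment bypass the queue.
        if email.lower() in self.approved_emails:
            return "publish"
        # 4. Everyone else waits for a human.
        return "moderate"
```

A first-time commenter lands in the queue; once `approve()` has been called for that email address, later comments from it go straight through unless they trip the blacklist or the link limit.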
