I have a personal web-site to help computer users that’s been running for
about 6 years. I have a guest book and people have been signing it for years.
Within the past year, though, I’ve been swamped with spammers signing my book.
I get about 6 to 10 spams each day. Each morning I delete them, but it is
getting worse by the day.
I had tried to “hide” my guest book from the public and sacrifice the
ability to have people sign and my enjoyment in reading these. But even this
“hidden” page keeps getting spam.
How can I prevent spammers from signing my guest book? I’d appreciate your
comments and hopefully a solution to this annoying problem.
Oh, I have plenty of comments and opinions on this topic – it’s a problem I
face right here with Ask Leo!
But unfortunately, like spam in general, there’s no single answer – no magic bullet.
Depending on your server and other specifics, there are several approaches
you can take.
Web spam, also known as “blog spam” or “comment spam”, is definitely on the
rise. Spurred by the popularity of Weblogs or blogs which allow people to post
comments, spammers are using these forms to post links back to their own sites.
The links aren’t really intended to serve as advertising, per se, but rather,
to trick the search engines into thinking that the target site is more
important than it is, because of all the incoming links.
Regardless of why, it’s a mess.
There are two types of comment spam generation techniques: manual and
automated. Automated tools will scour the web looking for things that look like
comment or guest book forms, and automatically post their bogus content to
these forms. Manual techniques involve hiring cheap labor overseas to do exactly
the same thing by hand.
While it started as comment spam on blogs, it’s most definitely no longer
limited to that. Almost any form that accepts input on the web is getting spammed.
As I said, there are various tools and techniques to combat comment or web
spam. Which technique might help you depends on how your form is set up, and on
what type of server or publishing platform you might be running.
A very common technique is to use what’s called a “CAPTCHA” (“Completely
Automated Public Turing test to tell Computers and Humans Apart”). You’ve
probably seen them – they’re the often distorted characters that you’re asked
to re-type into the form before it will be accepted. As the name implies, it’s
a way to prevent automated tools from posting to your form. Unfortunately it
does nothing to stop actual humans.
If you’re running on a content management system like MovableType, WordPress
or others, then CAPTCHA may already be an option – either as a built-in
feature, or as a plugin for your platform. Unfortunately creating and using a
CAPTCHA test in the general sense is not all that trivial.
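To give a feel for the moving parts, here is a minimal Python sketch of a much simpler relative of a CAPTCHA: a plain-text arithmetic question rather than a distorted image. It is an illustration under stated assumptions, not a hardened implementation. The names `SECRET_KEY`, `make_challenge` and `check_challenge` are hypothetical, and the form is assumed to carry the signed token in a hidden field alongside the visitor’s reply.

```python
import hashlib
import hmac
import random

# Hypothetical server-side secret; in practice it would come from configuration.
SECRET_KEY = b"change-me"


def _sign(answer: str) -> str:
    """Return a keyed hash of the answer so the form never carries it in the clear."""
    return hmac.new(SECRET_KEY, answer.encode("utf-8"), hashlib.sha256).hexdigest()


def make_challenge(rng=random):
    """Build a question to display and a token for a hidden form field."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    question = f"What is {a} plus {b}?"
    token = _sign(str(a + b))
    return question, token


def check_challenge(token: str, reply: str) -> bool:
    """Return True when the visitor's reply matches the signed answer."""
    return hmac.compare_digest(token, _sign(reply.strip()))
```

Even this small version has gaps: a token can be replayed once it has been solved, and there are only a handful of possible answers to guess. That is part of why a real CAPTCHA is less trivial than it first appears, and like any CAPTCHA it does nothing to stop a human.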
However, if you’re using a standard HTML <form> to get your input, I recommend
a technique I use here on Ask Leo! with great success. It’s developed and
described for the MovableType publishing platform, but the technique is in fact
valid for any <form> based input. You can read more about it on my
MovableType Tips site: Dealing with Comment
enabled in order for people to post to your form. While most people do have it
turned on, there’s a percentage that do not, and you’ll have to decide if that
is important enough to you.
If you’re running an Apache-based web server and you have access to its
configuration, the mod_security module might be an
option. This module can be configured to monitor for terms and take action when
those terms are posted to your form. It’s something else I run on Ask Leo!’s
server, and as a result, attempts to post a comment with certain
four-letter words or certain spam-related phrases will simply be rejected.
Another technique I find myself using is for forms where I control the
script that processes the form input. Most notably, my ask a question page has been getting hammered of late with
various attempts at web spam. What I’ve done is simply make note of common
strings (typically the websites that are being linked to) and update the code
to disallow posts containing those strings. (Apparently, being PHP based, it
Both techniques that scan for strings require a certain amount of
maintenance. As spammers arrive attempting to promote new things, those things
need to get added to the disallowed list. However, if you’re willing
to completely disallow links in the content posted from valid users, then
disallowing the string “http:” would stop 99% of this type of spam.
Unfortunately that’s not something I can do, as many of the questions I get do
need to refer to specific web pages.
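The string-scanning approach can be sketched in a few lines. This is a minimal Python illustration, not the script that actually runs on Ask Leo!. The `is_spam` function, the `DISALLOWED` list and the example entries in it are all hypothetical. A real list would be built up from the spam you actually receive.

```python
# Hypothetical disallowed strings; a real list grows as new spam arrives.
DISALLOWED = ["cheap-pills.example", "casino-bonus.example"]


def is_spam(text, disallowed=DISALLOWED, block_links=False):
    """Return True when the posted text should be rejected.

    With block_links=True, any post containing "http:" is rejected,
    which also turns away legitimate posts that include a link.
    """
    lowered = text.lower()
    if block_links and "http:" in lowered:
        return True
    # Case-insensitive substring match against every disallowed string.
    return any(term.lower() in lowered for term in disallowed)
```

The form handler would call `is_spam(posted_text)` before saving anything and simply refuse the post when it returns True. Leaving `block_links` off keeps links usable for real visitors, at the cost of having to maintain the list by hand.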
If you don’t have access to the levels of scripting or server configuration
that I’ve described here, then your next best bet is to investigate the
specific publishing platform you’re using. The spam problem is widespread, and
many of the popular platforms are implementing solutions of various types.