control he calls “greylisting”. He tried to describe it to me, but just got me
more confused. What is greylisting, is greylisting any good, and if it is, how
do I get it?
In a truly ironic turn of events, it seems that spam has sparked a number of
technological innovations. Traditional spam filters, for example, are technically
very complex pieces of software that try to analyze the content of email to
determine whether or not it is legitimate.
No less innovative, greylisting is conceptually very simple, and makes use
of the mail system’s own protocols to try to ferret out the legitimate from the
spam.
Yes, conceptually very simple.
Greylisting relies on a characteristic of the protocol used to move mail
from one mail server to another (SMTP, or Simple Mail Transfer Protocol). That
protocol says that it’s OK for a mail server to be “too busy” to accept mail.
When computer A tries to send mail through mail server B, and B responds by
saying “I’m too busy right now”, machine A is supposed to hold the message it’s
attempting to send, and then try to send it again later.
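In SMTP terms, “I’m too busy” is a temporary-failure reply: a three-digit code that starts with 4 (451 is a common choice, though the exact code varies from server to server). Here’s a minimal sketch of how a sending machine might decide what to do with a reply. The function name `next_action` and the labels it returns are made up for illustration; real mail software is considerably more involved.

```python
def next_action(reply_code: int) -> str:
    """Map an SMTP reply code to what a well-behaved sender does next."""
    if 200 <= reply_code < 300:
        return "delivered"      # 2xx: the receiving server accepted the mail
    if 400 <= reply_code < 500:
        return "retry-later"    # 4xx: temporary failure, keep the message queued
    if 500 <= reply_code < 600:
        return "bounce"         # 5xx: permanent failure, report back to the sender
    return "unexpected"         # anything else is not a final answer for delivery


print(next_action(451))  # the "I'm too busy" case
print(next_action(250))  # the normal "OK" case
```

The important line is the 4xx case: a mailer that follows the protocol holds on to the message and tries again rather than giving up.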
Greylisting also relies on the observation that most spam zombies don’t try
again later; they simply fail to deliver their spam. Legitimate mailers sending
primarily legitimate mail will follow the protocol, and after a few seconds,
minutes or hours, will re-send the delayed mail.
So now that we have this technique that will allow us to tell the difference
between some likely spam and other more likely legitimate mail, how does a
greylisting implementation actually use it?
The greylist processor tracks three pieces of information for each email
that it receives:
- the IP address of the machine sending the email to the mail server
- the “From” address of the email
- the “To” address of the email
Call that a “triplet” of information about each email.
The basics of greylisting then work like this:
- If the triplet has never been seen before, respond with “I’m too
busy”.
- If the triplet has been seen before, process the mail normally.
So the first time a message shows up, it gets delayed. The second time it
shows up, it is processed and that triplet is then considered “ok” from then
on, and no longer subject to greylisting delays.
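The two rules above can be sketched in a few lines. This is only an illustration of the idea, not how any particular mail server does it: the names `seen_triplets` and `greylist` are hypothetical, the addresses are placeholders, and the “database” is just an in-memory set.

```python
# Every triplet that has shown up at least once.
seen_triplets = set()


def greylist(sender_ip: str, mail_from: str, rcpt_to: str) -> str:
    """Decide what to do with one delivery attempt."""
    triplet = (sender_ip, mail_from, rcpt_to)
    if triplet in seen_triplets:
        return "accept"          # seen before: process the mail normally
    seen_triplets.add(triplet)   # remember it for the next attempt
    return "too busy"            # never seen: ask the sender to try again later


first = greylist("192.0.2.10", "alice@example.com", "bob@example.org")
second = greylist("192.0.2.10", "alice@example.com", "bob@example.org")
print(first, "/", second)
```

The first attempt is delayed and the retry is accepted. A sender that never retries never gets past the first answer.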
Now there are some details that I’ve glossed over. For example, two
different emails that happen to share the same triplet should not
cause greylisting to think that they’re ok. Similarly, that “resend” needs to
be seen within a certain amount of time for it to be considered valid. So in
practice the greylisting processors will keep and manage a database of triplets
that are in various states. (Implementations differ on how persistent that
database is: some maintain it semi-permanently, while others start from an empty
database each time the mail server is restarted. Ultimately that’s only an
optimization, and doesn’t affect the effectiveness of the technique.)
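One way to handle those details is to record when each triplet was first seen, and to count a retry only if it arrives neither too soon nor too late. The sketch below assumes two made-up settings, `MIN_DELAY` and `RETRY_WINDOW`, with values picked purely for illustration; actual implementations choose their own states and timings.

```python
MIN_DELAY = 5 * 60           # assumed: a retry sooner than this does not count
RETRY_WINDOW = 4 * 60 * 60   # assumed: a retry later than this starts over

first_seen = {}   # triplet -> time (in seconds) of the first delayed attempt
approved = set()  # triplets that have retried properly


def check(triplet, now):
    """Return "accept" or "too busy" for an attempt arriving at time `now`."""
    if triplet in approved:
        return "accept"                  # already proven itself
    started = first_seen.get(triplet)
    if started is None or now - started > RETRY_WINDOW:
        first_seen[triplet] = now        # new, or too late: start the clock
        return "too busy"
    if now - started < MIN_DELAY:
        return "too busy"                # too soon to count as a real retry
    del first_seen[triplet]              # valid retry: move to the approved state
    approved.add(triplet)
    return "accept"


t = ("192.0.2.10", "alice@example.com", "bob@example.org")
results = [check(t, 0), check(t, 10), check(t, 600), check(t, 601)]
print(results)
```

Here the attempt ten seconds after the first is still delayed, the one ten minutes later is accepted, and everything after that goes straight through.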
Note that this process is totally based on experience – greylisting “learns”
who the legitimate senders are by their behavior over time. There’s no need for
a “whitelist” of known good senders. A good sender will simply prove itself
over time by behaving properly.
Risks of Greylisting
Not everyone is in complete agreement on greylisting. There are at least a
couple of risks associated with using it.
Bad, yet legitimate, mailers: it turns out that not all
mailers know what to do when the server they’re sending to says “I’m too busy”.
The result is that mail from these mailers, which should be queued and tried
again later, is instead bounced back to the sender. There’s no easy way to get
the mail through at that point. Fortunately, these misbehaving mailers are few,
and in decline.
Processor load: greylisting is yet another task that the
mail server has to do to process each incoming mail. Depending on the
implementation, the server’s load, and the database technology it implements,
greylisting can have an impact on server performance. The good news is that by
implementing greylisting “in front of” any spam filtering, total load can often
be reduced as a result, since spam filtering is typically an even more
computing-intensive task.
How to get Greylisting
Ask your system administrator. Period. Greylisting is most definitely
not something for end-users to implement. It requires tight
integration with the mail software running on your mail server. Depending on
your mail server, and who provides it for you, getting it may not even be an
option.
Finally, greylisting is but one tool in the war on spam. Even with it in
place, you’ll still need your other anti-spam solutions, whatever they might
be, and your anti-virus and anti-spyware solution as well. Greylisting might
prevent you from getting some of the spam headed your way, but it does
not protect you from viruses or spyware that might turn your machine
into a spam-sending zombie.
Hi Leo.
Yours was one of the articles I read when I wanted to know a little more about greylisting.
Since then I’ve implemented it myself and documented the steps involved. So, if your readers decide they don’t want to ask their system administrator and they want to dive in themselves, you could always point them to:
http://geekery.etherknet.com/?p=75
Hope that helps.
C’ya.
David.