Email can often use complex methods to encode special or even not-so-special characters. Occasionally those methods become accidentally visible.
I’m one of the moderators on a large email discussion list. Quite often when we receive a message for approval it might be full of what I can only call “funny characters” or character sequences. They always begin with an equals sign, though. For example things like =0D=0A and =3D appear throughout the message.
But wait, this gets even more odd. If we allow such a message to go through to our list, most members who receive the messages individually don’t see this oddness; messages look just fine to them. And yet, members who receive these messages in a periodic digest see the same funny characters as we moderators do.
What’s up with that?
You’d think that with plain-text email having been around for as long as it has, issues like this would have been resolved by now.
The problem is that there’s “plain text” email, and then there’s “plain text” email. That’s correct – not all “plain text” is created equal.
When you see something like =3D, what you’re seeing is a single character in what’s called “quoted-printable” encoding. “=3D” is, in fact, an equal sign. =0D is a Carriage Return (CR), =0A is a Line Feed (LF), and =0D=0A is a CRLF combination. CR, LF and CRLF are all used to indicate the end of a line of text in plain text emails. In fact any character can be represented as a three-character “=” sequence in quoted-printable: an equal sign followed by the character’s two-digit hexadecimal code. “=41=73=6B=20=4C=65=6F=21” for example is “Ask Leo!” in full quoted-printable encoding.
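If you'd like to see the decoding for yourself, here's a minimal sketch using the quopri module from Python's standard library. The sample strings are the ones from the paragraph above; the variable names are mine and only for illustration.

```python
import quopri

# The fully encoded example from the text above.
encoded = b"=41=73=6B=20=4C=65=6F=21"
decoded = quopri.decodestring(encoded).decode("ascii")
print(decoded)

# =3D is an equal sign, and =0D=0A is a CR followed by an LF.
equals_and_crlf = quopri.decodestring(b"1+1=3D2=0D=0A")
print(repr(equals_and_crlf))
```

A mail program does essentially this behind the scenes before it shows you a quoted-printable message.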
Why is “quoted-printable” used?
It’s one of several encodings that get around the fact that once upon a time not all mail servers and network transports could handle what are called “non-printable” characters, like CR and LF, or certain types of special non-alphanumeric characters. CR and LF don’t cause anything to be displayed or printed; they just “mean something” (the end of a line), and that’s why they’re called “non-printable”. When such characters are part of a message, they can confuse older email software. So these special characters are represented using something else that doesn’t confuse the old mailers. A CR is represented as =0D, which is three printable characters. And since the equal sign is part of that encoding scheme, even though it’s “printable”, it also needs to be encoded, hence the =3D.
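Going the other direction, the rule is simple enough to sketch by hand. The two helper functions below are hypothetical, written only to illustrate the "equal sign plus two hex digits" pattern. Real mail software normally escapes only the characters that need it, not every character as `qp_escape_all` does.

```python
import quopri

def qp_escape(byte: int) -> str:
    """Return the three-character quoted-printable form of one byte."""
    return f"={byte:02X}"

def qp_escape_all(text: str) -> str:
    """Escape every byte of ASCII text (illustration only)."""
    return "".join(qp_escape(b) for b in text.encode("ascii"))

print(qp_escape(0x0D))       # the carriage return
print(qp_escape(ord("=")))   # the equal sign itself

full = qp_escape_all("Ask Leo!")
print(full)

# Round trip through the standard library decoder.
round_trip = quopri.decodestring(full.encode("ascii"))
print(round_trip)
```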
So what’s happening with my mailing list approvals?
The approvals are coming to you in “raw” form. Your mailing list software has most likely removed or overridden the mail header information that says “this is quoted-printable” and hence your mail program doesn’t know that it should decode the encoded characters. It simply believes that it’s unencoded plain text email, and it should just be displayed as-is.
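In MIME email the header in question is Content-Transfer-Encoding. Here's a small sketch, using Python's standard email package, of how the same body is treated with and without that header. The two sample messages and the `body_text` helper are made up for illustration; I'm not claiming this is what any particular mailing list software does internally.

```python
from email import message_from_string

# Same body in both messages; only the second header line differs.
raw_with_header = (
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "1+1=3D2\n"
)
raw_without_header = (
    "Content-Type: text/plain; charset=us-ascii\n"
    "\n"
    "1+1=3D2\n"
)

def body_text(raw: str) -> str:
    """Parse a raw message and return its body, decoded per its headers."""
    msg = message_from_string(raw)
    return msg.get_payload(decode=True).decode("ascii")

with_header = body_text(raw_with_header)
without_header = body_text(raw_without_header)
print(repr(with_header))
print(repr(without_header))
```

With the header present the body comes out decoded; without it, the =3D is left exactly as it arrived, which matches what you're seeing in your approval messages.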
I can live with that for approvals, but what about digests?
A digest is a collection of emails bundled up into a single message. The “problem” is that some of those messages could be encoded using “quoted-printable”, others could be encoded using something else, and others could just be unencoded plain text. In theory the mailing list software could decode each message before adding it to the digest, but obviously it doesn’t. (I would assume that it most likely has valid reasons for not trying.) As a result, like the moderation messages you’re seeing, the messages are appended to each digest in raw form. And the digest itself can’t say “this is quoted-printable”, because most likely not all the messages it contains are.
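To give a feel for what "decoding in theory" would involve, here's a rough sketch in Python. Everything in it is an assumption for illustration: the sample messages, the `decoded_body` and `build_digest` helpers, and the idea of joining bodies with a blank line. Real list software has much more to worry about, such as attachments, multipart messages, and mixed character sets.

```python
from email import message_from_string

def decoded_body(raw: str) -> str:
    """Decode one message body according to that message's own headers."""
    msg = message_from_string(raw)
    charset = msg.get_content_charset() or "ascii"
    return msg.get_payload(decode=True).decode(charset, errors="replace")

def build_digest(raw_messages: list[str]) -> str:
    """Join the decoded bodies into one plain-text digest (hypothetical)."""
    return "\n".join(decoded_body(raw) for raw in raw_messages)

messages = [
    # Quoted-printable body.
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: quoted-printable\n\na=3Db\n",
    # Base64 body ("Hello" plus a newline).
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n\nSGVsbG8K\n",
    # Unencoded plain text.
    "Content-Type: text/plain; charset=us-ascii\n\nplain\n",
]

digest = build_digest(messages)
print(digest)
```

The point is that each message has to be decoded on its own terms, using its own headers, before the pieces can be safely combined.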
Great. So what do I do?
As a list owner or moderator, there’s not much you can do to the list itself. The “solution” is to ask your list members to change their message encoding. It’s much like asking them not to send HTML formatted emails, but simply a different setting.
From what I’ve seen, Outlook Express is the most common culprit. In Outlook Express go to the Tools menu, Options item, Send tab. Now, under Mail Sending Format click on Plain Text Settings…. In the resulting dialog:
Make sure that MIME is selected (it probably is), and Encode text using is set to None.
In Outlook the default for English installations is no encoding, but that can be overridden by a registry setting as outlined in this Microsoft Knowledgebase article: How Outlook applies encoding to plain text e-mail messages in Outlook 2003 and Outlook 2002.
In Thunderbird, in Tools, Options, in the Composition section:
You can make sure that “For messages that contain 8-bit characters, use ‘quoted-printable’ MIME encoding” is unchecked. Most email you compose will not trigger this, but if you ever edit in another application such as a word processor, and then paste the results into your mail program, it can happen.
Naturally, various other mailers will have (or hide) this option in other locations. Check the settings related to mail composition and format.