I frequently get questions that boil down to “How can I trace where this email came from?” or “Can I determine the IP address of the sender of an email?”
The answer is both yes and maybe, and it may not do you any good. However, there is a lot of interesting information in your email that you normally don’t see, and the trail of mail servers is part of that.
So let’s interpret some email headers.
First, there’s the challenge of even getting to the real email headers. In Hotmail they’re apparently always visible. In Outlook, they’re hidden by default, so with the message open, click on View, and then Options, and you’ll see a box labeled Internet Headers. In Thunderbird, you can expand or collapse the headers by clicking on a simple control next to the subject line.
In any case, headers typically look something like this:
Received: (qmail 13384 invoked by uid 110); 13 May 2005 21:33:53
Received: (qmail 13380 invoked from network); 13 May 2005 21:33:53
Received: from bay107-f18.bay107.hotmail.com (HELO hotmail.com)
by pugetsoundsoftware.com with SMTP; 13 May 2005 21:33:53 -0000
Received: from mail pickup service by hotmail.com with Microsoft
Fri, 13 May 2005 14:33:53 -0700
Received: from 18.104.22.168 by by107fd.bay107.hotmail.msn.com with
Fri, 13 May 2005 21:33:52 GMT
From: “Leo Notenboom” <email@example.com>
Subject: Example Email
Date: Fri, 13 May 2005 14:33:52 -0700
Content-Type: text/plain; format=flowed
X-OriginalArrivalTime: 13 May 2005 21:33:53.0097 (UTC)
Now yours may look a lot different. It may be longer or shorter, or have additional information, or less. But the basic idea is that there’s a lot of information in the headers that has to do with the administration of getting the email from the sender to the receiver.
A detailed reference is more than I can present here, and quite honestly, probably more than you need. But let’s examine the headers above a little more closely, since they’re a good example of a “normal” email message. They are from a message I sent to my regular email account from my Hotmail account.
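If you would rather pick headers apart with a program than by eye, here is a minimal sketch using Python's standard `email` module. The sample text is a trimmed copy of the example above. The indentation on the continuation line is my assumption about how the raw message was folded, and `list_headers` is just an illustrative helper name, not part of any mailer.

```python
from email import message_from_string

# A trimmed copy of the example headers. The continuation line is indented
# here on the assumption that the raw message folded it that way.
RAW_HEADERS = """\
Received: from bay107-f18.bay107.hotmail.com (HELO hotmail.com)
  by pugetsoundsoftware.com with SMTP; 13 May 2005 21:33:53 -0000
From: "Leo Notenboom" <email@example.com>
Subject: Example Email
Date: Fri, 13 May 2005 14:33:52 -0700
Content-Type: text/plain; format=flowed
X-OriginalArrivalTime: 13 May 2005 21:33:53.0097 (UTC)

"""


def list_headers(raw):
    """Return (name, value) pairs in the order they appear in the message."""
    msg = message_from_string(raw)
    # Collapse folded (multi-line) values onto a single line.
    return [(name, " ".join(str(value).split())) for name, value in msg.items()]


headers = list_headers(RAW_HEADERS)
for name, value in headers:
    print(f"{name}: {value}")
```

Each header comes back as a name and a value, with the two-line Received: entry joined into one string, which makes the individual fields easier to compare.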
A good rule of thumb is to begin at the bottom and work your way up in the headers. That’ll make more sense in just a minute. Working from the bottom:
- X-OriginalArrivalTime: is the time the message was submitted to Hotmail … in other words, the time I pressed “Send”. Headers that begin with “X-” are “non-standard”, and may not be used by all mailers. They’re often just informational. Note also the date and time: 13 May 2005 21:33:53.0097 (UTC). The “(UTC)” means that the time is recorded as “Coordinated Universal Time”, sometimes thought of as Greenwich Mean Time or GMT. Since I’m in the Pacific time zone, and daylight saving time is in effect, that means I sent it at roughly 2:33 PM PDT.
- Content-Type: is how the mailers tell each other what the format of the mail is: plain text, as this example is, or HTML, or something else.
- Mime-Version: “Mime” stands for Multipurpose Internet Mail Extensions, and is the formatting protocol most often used to encode attachments and alternate representations in a single email.
- Date: This is the more common place you’ll find the date and time that the message was sent. This is added by the sending mailer, and is commonly used by your email client as the “Sent Date”. Note that the time zone is specified as local time (2:33 PM) and an offset (-7 hours) from UTC. PDT is 7 hours behind UTC as I write this. Subtract the offset (and remember that subtracting a negative offset means to add it), and you’ll get the equivalent 21:33 UTC.
- Subject: As you’d expect, the subject of the email as you typed it.
- Bcc: To be honest, I’m not sure why Hotmail includes this here, as they strip out any BCC’d recipients. BCC is supposed to be stripped from email completely before it is sent.
- To: Again, as you’d expect, the list of recipient email addresses that this message is addressed to. What most people don’t realize is that the To: line doesn’t define who the email actually goes to, but rather simply lists who the mailer claims it’s to go to. A virus, for example, can easily create a mail message that has bogus addresses in the To: line, and then send the mail to someone else entirely. That’s known as “spoofing”.
- From: Just like To:, the “From:” address shows you from whom the mail was supposedly sent. And also like “To:”, it’s very easy for the spammers and virus writers to spoof the From: address to be pretty much anything they want.
- X-Sender: is another representation of the address the email originated from, but like all “X-” headers, is optional and not universally used or recognized. “X-Sender”, and the similar “Sender:” are supposed to indicate the sender of the email, which might be an intermediary. For example, if you send mail to a mailing list, the mail might be “From:” you, but the mailing list software might be the “Sender:” to everyone else who receives it.
- X-Originating-Email: another representation of the sender of the email. Some mailers add this as a precaution against those who spoof the “From:” line.
- X-Originating-IP: The IP address of the computer on which the email originated. Once again, an optional and informational “X-” header. In this case, the IP address is one of Hotmail’s servers.
- Received: Herein lies the gold. I’ll get into more detail on that below.
- Delivered-To: is added by the receiving mail server when it finally delivers the email to a specific email alias or mailbox. In my case, I have my mailer configured to deliver my mail to two separate mailboxes: one with, and one without, spam filtering.
- Return-Path: is the address that the email, if it fails to be delivered, should be bounced back to.
- (HELO hotmail.com) – this is part of the SMTP mail protocol where the server identifies itself while connecting. Basically, it’s saying “Hello, I’m Hotmail.com” when it initiates the transfer of mail to the next server to receive it. The receiving server logs this information as part of the “Received” header it adds.
- (22.214.171.124) – this is the IP address of the server making the connection.
As part of spam prevention and server authentication, a mail server may elect to ensure that all three of these pieces of information match: the IP address reported matches the server name reported, which in turn should match the end of the HELO string. In practice, the internet is a little too fast and loose for that to be a reliable gauge of authenticity … too many legitimate servers are not configured to report the right information for that check to always be valid.
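To make that check a little more concrete, here is a rough sketch in Python. It only compares the reported server name against the HELO string. Confirming that the IP address really belongs to that name would take a reverse DNS lookup, which I have left out. The regular expression assumes the qmail-style layout seen in my example, the sample line is stitched together from the pieces discussed above, and the function names are mine.

```python
import re

# qmail-style "from <name> (HELO <helo>) (<ip>)" prefix. Other mail servers
# format this differently, so the pattern is an assumption, not a standard.
RECEIVED_FROM = re.compile(
    r"from\s+(?P<name>\S+)\s+\(HELO\s+(?P<helo>[^)\s]+)\)\s+\((?P<ip>[0-9.]+)\)"
)


def parse_received(value):
    """Return the name, HELO string and IP from a Received value, or None."""
    match = RECEIVED_FROM.search(value)
    return match.groupdict() if match else None


def helo_matches_name(fields):
    """True if the reported server name is the HELO domain or ends with it."""
    name = fields["name"].lower().rstrip(".")
    helo = fields["helo"].lower().rstrip(".")
    return name == helo or name.endswith("." + helo)


# Sample value assembled from the pieces discussed above (an assumption
# about how the full line looked).
sample = ("from bay107-f18.bay107.hotmail.com (HELO hotmail.com) "
          "(22.214.171.124) by pugetsoundsoftware.com with SMTP; "
          "13 May 2005 21:33:53 -0000")
fields = parse_received(sample)
print(fields)
print("HELO consistent with name:", helo_matches_name(fields))
```

A mismatch here is a hint rather than proof of anything, for exactly the reason given above: plenty of legitimate servers report sloppy information.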
Another interesting use of the Received headers is to determine where a delay may have occurred in transferring the mail. Since each is time-stamped, it’s quickly apparent where a message may have been held up.
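Both the Date: offset arithmetic described earlier and this hop-by-hop delay check come down to converting every timestamp to UTC and subtracting. Here is a small sketch using Python's standard library and the timestamps from my example. The short server labels and the helper names `to_utc` and `hop_delays` are my own, and treating a “-0000” zone as UTC is an assumption that fits these headers.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def to_utc(stamp):
    """Parse an email-style date string and express it in UTC."""
    when = parsedate_to_datetime(stamp)
    if when.tzinfo is None:
        # parsedate_to_datetime returns a naive datetime for a "-0000" zone;
        # this sketch assumes such a time is already UTC.
        return when.replace(tzinfo=timezone.utc)
    return when.astimezone(timezone.utc)


def hop_delays(hops):
    """Return (server, seconds since the previous hop) for each later hop."""
    times = [(name, to_utc(stamp)) for name, stamp in hops]
    return [
        (name, (when - prev_when).total_seconds())
        for (_, prev_when), (name, when) in zip(times, times[1:])
    ]


# The Date: header from the example: 2:33 PM local time, 7 hours behind UTC.
sent = to_utc("Fri, 13 May 2005 14:33:52 -0700")
print(sent.isoformat())

# Received: timestamps from the example, listed bottom-up (oldest hop first).
hops = [
    ("hotmail.msn.com", "Fri, 13 May 2005 21:33:52 GMT"),
    ("hotmail.com", "Fri, 13 May 2005 14:33:53 -0700"),
    ("pugetsoundsoftware.com", "13 May 2005 21:33:53 -0000"),
]
for server, seconds in hop_delays(hops):
    print(f"{server}: {seconds:.0f}s after the previous hop")
```

In my example every hop lands within a second of the one before it. A gap of minutes or hours between two adjacent hops would point at the server where the message sat waiting.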
Now let’s look at the headers of some SPAM I recently received:
Received: (qmail 19652 invoked by uid 110); 14 May 2005 20:03:05
Received: (qmail 19649 invoked from network); 14 May 2005 20:03:05
Received: from fake.pittpa.adelphia.net (**.**.198.208)
by pugetsoundsoftware.com with SMTP; 14 May 2005 20:03:05 -0000
Received: from desk.fakecompany.com
by qdam.eiynwr.com with SMTP; Sat, 14 May 2005 13:03:09 -0800
From: “Fake Name” <firstname.lastname@example.org>
Subject: Fast solution to your problems in a bed!
Date: Sat, 14 May 2005 13:03:09 -0800
[Note: everything that says “fake” is something I changed to anonymize this example. Someone’s real email address and real company domains were used in the original.]
There are several interesting things about these headers:
- The “Message-ID:” references an account at a domain in Italy.
- The first “Received:” header references “desk.fakemailer.com” – fakemailer appears to be a legitimate business involved in bulk email technologies based in New York state.
- That header also references “qdam.eiynwr.com” – a domain that doesn’t appear to exist.
- The next header appears to receive the message from “fake.pittpa.adelphia.net”, which from the name would indicate a Pittsburgh, PA node of adelphia.net.
- The “From:” line indicates yet a third party, fakecompany.com. On the surface this company, in New York City, appears to be unrelated to any aspect of the message, though I could be wrong.
The kicker is that the links for the products being sold by this email all go to a domain registered in Bulgaria.
So what to make of it all? It is possible that the originating computer, desk.fakemailer.com, is, in fact, sending out spam on purpose. It’s also possible that this machine has been infected with a virus, and is sending out spam without its owner realizing it. And yet another scenario is that the machine is not involved at all, and that spammers in Bulgaria have spoofed the headers of the originating machine (using the company’s role in the bulk email business to confuse and obfuscate the issue).
And therein lies the problem with SPAM and why there’s no simple solution. Email headers cannot be trusted, and not all email can be traced or authenticated. Legitimate mail typically can be traced, but for SPAM and virus-generated email it’s difficult to say that the headers are absolutely trustworthy.
But it’s interesting information, nonetheless.