Some e-merchants have a clause saying they “receive and store certain
information whenever you download web pages” – do they mean any web pages or
just their own and what can you do to prevent it?
If they’re legitimate, it’d be just their own. Obviously if they’re
malicious and you’re not careful they could install spyware, and all bets are
off. But by now I’m certain you’re already doing all the right things to
stay safe, so I’ll assume you’re only visiting legitimate e-commerce
sites.
And it’s not just e-commerce. Guess what? When you visit Ask Leo!, I also “receive and store certain
information whenever you download web pages”.
It’s really just a part of how the web works.
Now the statement “certain information” is vague and could mean
anything.
There’s the obvious stuff. For example, when you order from an e-commerce site, you’re
giving them your information to process the order. They “receive and store”
that “certain information” as a part of processing and fulfilling your
transaction. That shouldn’t come as any surprise; it’s information you
explicitly gave them.
The only way to avoid it is to not do business on the internet. To me that
seems exceptionally extreme. Personally I’m happy letting a number of reputable
e-commerce sites “receive and store certain information” about me as part of
the process of my doing business with them. I’ve certainly never been harmed by
it, and in fact have only benefited from their services.
But you should know that even visiting a web site – any web site – provides
that site with “certain information”.
When you visit a site the web server receives the following information:
- The URL of the page on the site you’ve requested to see. Hopefully this is
  obvious; the server needs to know what it is you want to look at.
- Your internet IP address. The web server needs to know where to send the
  information that you’ve requested. (Note that if you’re behind a NAT router,
  this is the internet IP address of the router, not your computer’s IP
  address.)
- If you clicked on a link to get to a page, then the URL of the page
  containing that link may be included as what’s called the “referrer”. Basically
  this tells the web server what page you were on that had the link that got you
  to the page you requested. If there is no “referring page” then nothing is sent.
- The contents of any cookies that this web site placed on your machine from
  any prior visits. Not all sites use cookies, so there may be nothing to
  send.
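To make that list concrete, here is a minimal Python sketch of how a server might pull those four items out of a single page request. The function name `describe_request`, the sample request, and the address 203.0.113.7 are all made up for illustration; a real web server does this parsing for you, and the IP address comes from the network connection rather than from the request text.

```python
from http.cookies import SimpleCookie


def describe_request(raw_request: str, client_ip: str) -> dict:
    """Collect the four items a server sees on a page request (illustrative)."""
    lines = raw_request.split("\r\n")
    # The first line looks like: GET /some/page.html HTTP/1.1
    _method, path, _version = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line marks the end of the headers
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    cookies = SimpleCookie()
    cookies.load(headers.get("cookie", ""))

    return {
        "url": path,
        "ip": client_ip,  # supplied by the connection, not the request text
        "referrer": headers.get("referer"),  # HTTP spells the header "Referer"
        "cookies": {key: morsel.value for key, morsel in cookies.items()},
    }


# Hypothetical request, as a browser might send it after clicking a link.
sample = (
    "GET /articles/example.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Referer: https://search.example.org/results?q=cookies\r\n"
    "Cookie: remember_me=yes; visits=3\r\n"
    "\r\n"
)
info = describe_request(sample, "203.0.113.7")
print(info)
```

If the visitor typed the address directly, the `Referer` line would simply be absent and `referrer` would come back as `None`; likewise `cookies` would be empty on a first visit.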
It’s important to note that all of this is how the web works. There’s no
avoiding it. You can obfuscate the information presented by using an
anonymization service if you’re particularly paranoid, but the information is
presented to the web server regardless.
Now what happens to all that information is up to the web site owner.
On Ask Leo! my server logs most of that information, and I keep those logs.
I use them, for example, to see which pages on Ask Leo! get the most
traffic (from the URLs requested), which countries send me the most visitors
(from the IP addresses), and which sites link to me (from the Referrer).
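As a rough sketch of that kind of log analysis, the Python below counts requested pages and referring pages from a few log lines. It assumes the common "combined" log format used by many web servers; the article doesn’t say which format the Ask Leo! logs use, and the sample lines, `LOG_PATTERN`, and `summarize` are invented for illustration. Mapping IP addresses to countries needs a separate geolocation database, so that step is left out.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip ident user [time] "METHOD url protocol" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'\d{3} \S+ "(?P<referrer>[^"]*)"'
)


def summarize(log_lines):
    """Count requests per page and per referring page."""
    pages, referrers = Counter(), Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that don't fit the assumed format
        pages[match["url"]] += 1
        # A "-" means the browser sent no referrer at all.
        if match["referrer"] not in ("", "-"):
            referrers[match["referrer"]] += 1
    return pages, referrers


# Made-up log lines for the example.
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Oct/2008:13:55:36 -0700] "GET /a.html HTTP/1.1" '
    '200 2326 "https://search.example.org/" "ExampleBrowser/1.0"',
    '198.51.100.4 - - [10/Oct/2008:13:56:01 -0700] "GET /a.html HTTP/1.1" '
    '200 2326 "-" "ExampleBrowser/1.0"',
    '203.0.113.7 - - [10/Oct/2008:13:57:12 -0700] "GET /b.html HTTP/1.1" '
    '200 1780 "https://www.example.com/a.html" "ExampleBrowser/1.0"',
]

pages, referrers = summarize(SAMPLE_LOG)
print(pages.most_common(5))
print(referrers.most_common(5))
```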
Cookies are used in two ways on Ask Leo! Most articles have a “remember me” option
when placing a comment; you are “remembered” by using cookies to store your information
on your machine. Cookies are also used by a third-party add-on package, Google Analytics,
which gives me some advanced traffic analysis – very similar to the reports I get
based only on my server logs. As another example, on my wife’s
retail site I use cookies as a convenience to fill in sales forms with
information customers may have entered before. That way they don’t have to
retype everything each time.
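Here is a minimal sketch of how a “remember me” cookie could work, using Python’s standard `http.cookies` module. This is not the code Ask Leo! runs; the cookie name `commenter_name`, the one-year lifetime, and both function names are assumptions chosen for the example.

```python
from http.cookies import SimpleCookie
from typing import Optional

ONE_DAY_SECONDS = 24 * 60 * 60


def remember_me_header(name: str, days: int = 365) -> str:
    """Build the header a server would send to store the commenter's name."""
    cookie = SimpleCookie()
    cookie["commenter_name"] = name  # hypothetical cookie name
    cookie["commenter_name"]["max-age"] = days * ONE_DAY_SECONDS
    cookie["commenter_name"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def read_remembered_name(cookie_header: str) -> Optional[str]:
    """Recover the name from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("commenter_name")
    return morsel.value if morsel is not None else None


header = remember_me_header("Leo")
print(header)
print(read_remembered_name("commenter_name=Leo; other=1"))
print(read_remembered_name("other=1"))
```

The browser stores the value on the visitor’s machine and sends it back on later visits, which is what lets the site pre-fill the form. If the visitor clears or blocks cookies, the second function just returns `None` and the form starts out empty.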
I can track individual visits by IP address, but as I’ve discussed many
times before, there’s no way to tell who an IP address represents or specifically where they are.
The data presented to the web server on each page request is all information
that a site owner can use to more properly target his audience, analyze his
performance, and/or provide additional functionality to his site visitors.
So the bottom line is that for legitimate websites what might be referred
to as “certain information” comes from two places: the information that’s
provided to servers as part of how the web works, and the information that you
explicitly give to a web site.
Should you be concerned?
No. Not in my opinion.
I keep using the phrase “legitimate web sites” for a reason: just as in the real
world, as long as you are dealing with reputable sites and vendors, by and large
you have nothing to be concerned about. There are certainly malicious vendors
in the real world, and of course malicious sites on the internet – those are
not what I’m talking about. Major retailers and reputable e-commerce sites from
companies you recognize are typically quite legitimate and above board. They
succeed by providing the services they advertise, not by trying to be
underhanded or stealing your information and spying on you.
FYI – your comment “I don’t use cookies so there’s nothing to track there” isn’t entirely accurate.
While you may not be using them for anything, your site is generating four cookie requests:
__utma (expires in 2036)
__utmb (expires in 5-1/2 hours)
__utmc (session only)
__utmz (expires in 6 months)
(Also, the comment posting obviously generates some cookies if you select “remember me”.)
I stand corrected. I do use cookies to allow you to hit “remember me”. Obviously I’m not paying attention to them myself :-).
The _u… cookies are Google Analytics, which they use to present me much of the same data that I’d see in my web server logs, only with a little more analysis.
I’ll update the article. Thanks for the catch.
Why is the expiration date for many cookies so far in the future? Forty or fifty years seems like nothing to these guys….what’s the point?