Other than using spyware and cookies which can be deleted from our
PC (hopefully), how can websites or search engines continuously track
and monitor our internet activities from our home PC? I read from one
of your earlier articles that most people probably have a “dynamic” IP
address. Assuming that is true for me, and my IP address constantly
changes, how can an IP address be used to identify me for any
significant length of time? (My IP address today could be yours
tomorrow.) And even if the website/search engine knew my IP address at
a single point in time, how can they connect that IP address to my name
(if I don’t register it) and physical location? I’m guessing that my
ISP can make this connection, but I assume they won’t provide that
information to just anyone, right?
That question covers a lot of ground, from cookies to IP tracking.
It also misses a couple of areas that are worth thinking about.
But I do have to point out one important thing for most people: you,
as an individual, just aren’t that interesting. Sorry to burst
your bubble, but it’s pretty likely no one really cares where you go or
what you do.
Let’s see what they might care about, and the ways that they can
collect it.
Cookies are, by far, the most common way for web sites to preserve information about your usage. While cookies ostensibly only store information on your machine, that information can of course be used to access other information stored elsewhere.
Cookies typically just store a small bit of information on your machine, and then each time you go back to a page on the same site that originally stored the cookie, that information is sent along. That bit of information might be a login ID, so that as you access an online email account you don’t have to login for each and every page.
That’s why completely disabling cookies can be such a pain. Many web sites simply rely on cookies to keep you logged in, and keep the experience of using them somewhat manageable. It would really suck if you actually had to login each time you wanted to see the next email message in your inbox. Cookies solve that.
But yes, cookies are one way – the most obvious way – that sites, and particularly advertising services, can collect information about the sites you might visit on the web. They may not know it’s you, but they can know that it’s your machine that’s visited these sites, this many times, over this period of time.
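As a rough sketch of the mechanism described above, here is how a site might use a cookie to recognize a returning browser. It uses Python's standard `http.cookies` module; the cookie name `visitor_id`, the `visits` store and the `handle_request` function are made up for illustration, and a real site would do considerably more.

```python
from http.cookies import SimpleCookie
import uuid

# Hypothetical in-memory store: visitor ID -> pages requested with that ID.
visits = {}

def handle_request(cookie_header, page):
    """Return (visitor_id, set_cookie_header_or_None) for one page request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if "visitor_id" in cookie:
        # Returning browser: the ID it sends back ties this request to earlier ones.
        visitor_id = cookie["visitor_id"].value
        set_cookie = None
    else:
        # First visit, or cookies were cleared: hand out a fresh random ID.
        visitor_id = uuid.uuid4().hex
        out = SimpleCookie()
        out["visitor_id"] = visitor_id
        set_cookie = out.output(header="Set-Cookie:")
    visits.setdefault(visitor_id, []).append(page)
    return visitor_id, set_cookie

# The first request carries no cookie; the second sends the ID back.
first_id, header = handle_request(None, "/inbox")
second_id, _ = handle_request(f"visitor_id={first_id}", "/message/1")
```

Note that the ID is just a random number. It says nothing about who you are, only that the same browser came back.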
•
Logging In is probably the least thought-of way that web sites collect information. When you login to a service, by definition you’ve identified yourself (and your IP address, but more on that below). The service then “knows” who you are – to the extent that you’ve provided that information – for as long as you’re logged in.
The ‘catch’ is that logging in to one service could identify you with all services from the same provider. And we provide a lot of information to the various services we interact with.
Consider Google. Logging into GMail also identifies you for iGoogle, Google Calendar, Google News, and all Google services, including Google Web History, which keeps a history of all the sites you visit while logged in.
It’s not uncommon. Login to Hotmail, and you’ve actually logged in to all Windows Live Services. Login to Yahoo mail, and all Yahoo services may follow.
•
Flash and Javascript can also be used to collect information about how they’re being used. Flash even has its own version of cookies that are not the same as browser cookies, and are not cleared by a browser’s cookie management functions.
Javascript, when enabled, can also be used to send off some additional information to the sites you’re visiting in ways that bypass cookies.
•
IP addresses are what typically get everyone all excited and concerned, and for no real good reason. As I’ve said over and over and over again here and elsewhere: IP addresses cannot be traced to your physical location without legal intervention.
They can, however, occasionally be used as a tracing mechanism. As you say, IP addresses can change, but unless you’re on dial-up they actually don’t change that often. While they’re set they are a unique identifier – though not of your machine, since you may have any number of machines sharing an IP address behind a router. All the machines behind the router “look like” they all have the same IP address on the internet.
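A minimal sketch of why machines behind a router "look like" one address, using Python's standard `ipaddress` module. The public address below is a made-up value from a range reserved for documentation, and `address_seen_by_website` is a hypothetical helper, not how a router is actually implemented.

```python
import ipaddress

# Hypothetical public address the ISP assigned to the router.
ROUTER_PUBLIC_IP = ipaddress.ip_address("203.0.113.7")

def address_seen_by_website(local_ip):
    """Return the source address a web site would log for this machine."""
    addr = ipaddress.ip_address(local_ip)
    if addr.is_private:
        # The router replaces the private source address with its own public one.
        return ROUTER_PUBLIC_IP
    return addr

# Three different machines on the same home network.
home_machines = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
seen = {str(address_seen_by_website(ip)) for ip in home_machines}
```

Here `seen` ends up holding a single address, which is all the web site has to go on.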
•
Combinations of everything above are, I believe, how the transient nature of each of those means of identifying you can often be overcome.
- When you login, the service now knows your IP address.
- When a cookie is uploaded as you visit a site, it might now be associated with your login, and/or your IP address.
- If your IP address changes, but the same cookie is delivered, the service could know that it’s still the same machine.
- If your cookies are cleared, your IP address changes and you logout, but a flash cookie or some Javascript happens to be used, the site you’re visiting might still be able to determine that it’s the same user or machine as before.
I’m not saying that any of this is happening on any particular site or set of sites. But as you can see, if sites are sufficiently motivated and technically astute they can collect a lot of information.
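The linking described in the list above could be sketched roughly as follows. This is an illustration only, not a description of what any real site does: it assumes that identifiers arriving on the same request belong to the same visitor, which is a simplification, since an IP address can be shared or reassigned. The names `observe` and `same_visitor`, the login and the addresses are all made up.

```python
# Each identifier, e.g. ("cookie", "abc123"), points at a representative
# identifier for the group it has been linked into (a small union-find).
parent = {}

def find(x):
    """Return the representative identifier for x's group."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # shorten the chain as we walk it
        x = parent[x]
    return x

def observe(**identifiers):
    """Record one request; identifiers seen together are linked together."""
    keys = [(kind, value) for kind, value in identifiers.items() if value is not None]
    for key in keys[1:]:
        parent[find(key)] = find(keys[0])

def same_visitor(a, b):
    return find(a) == find(b)

observe(login="reader@example.com", ip="203.0.113.7")   # you login
observe(cookie="abc123", login="reader@example.com")    # a cookie arrives while logged in
observe(cookie="abc123", ip="198.51.100.9")             # new IP address, same cookie

linked = same_visitor(("ip", "198.51.100.9"), ("login", "reader@example.com"))
```

In this sketch `linked` is true: the new IP address was never seen with the login, but the cookie they share connects the two.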
•
DON’T PANIC
I’m always reluctant to write about this kind of topic, about what kinds of things are possible, because it so often simply feeds people’s paranoia. Many folks will read the above and get very scared, thinking that their every move is somehow being tracked on line.
Folks, you’re just not that interesting.
By far the vast majority of data collection that’s happening is in aggregate – meaning that the habits of thousands if not millions of web users are collected en masse with all individual information being lost in the aggregation. Data like “people who visit Ask Leo! are 20% likely to shop at Amazon.com” is the level of information that’s being used. Individual activity, like “Leo Notenboom shops at Amazon and Fry’s and also visits CNN.com and somerandomservice.com and looked at these web pages and clicked on these links …” – even if it is being collected – isn’t being looked at by anyone. By and large, it can be used in either of two ways:
- as input, without the individual identification, for the aggregate “people are likely to” kinds of calculations I mentioned
- for you. For example, if you’re a customer of Amazon (or any retailer) you can login and see what you purchased. Perhaps you choose to use Google’s Web History so you can see what sites you’ve visited.
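To illustrate the first kind of use, here is a small sketch of an aggregate “people who visit X are N% likely to visit Y” calculation. The log entries are invented for the example and `co_visit_rate` is a hypothetical function; the point is only that the result is a single number with no individual left in it.

```python
def co_visit_rate(log, site_a, site_b):
    """Fraction of visitors to site_a who also visited site_b."""
    sites_by_visitor = {}
    for visitor_id, site in log:
        sites_by_visitor.setdefault(visitor_id, set()).add(site)
    visited_a = [sites for sites in sites_by_visitor.values() if site_a in sites]
    if not visited_a:
        return 0.0
    # Only the counts survive; the visitor IDs are dropped here.
    return sum(site_b in sites for sites in visited_a) / len(visited_a)

# Made-up log of (anonymous visitor ID, site visited) pairs.
log = [
    ("v1", "ask-leo.com"), ("v1", "amazon.com"),
    ("v2", "ask-leo.com"),
    ("v3", "ask-leo.com"),
    ("v4", "ask-leo.com"),
    ("v5", "ask-leo.com"), ("v5", "cnn.com"),
    ("v6", "amazon.com"),
]
rate = co_visit_rate(log, "ask-leo.com", "amazon.com")
```

With this made-up log, one of the five visitors to the first site also visited the second, so `rate` comes out as 0.2.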
The Caveat
There are two scenarios where your paranoia might be somewhat justified.
You actually are a criminal, a suspected criminal, are under surveillance by law enforcement, or live in a country where law enforcement has been compromised. Depending on how big a fish you really are, “they” could be watching you. Most people just aren’t that interesting, but I’m sure that there are a few that are.
Your account’s been stolen or compromised. In this case, all the information normally available to you would be available to the person with access to your account. This is perhaps the most likely scenario, and the one you would want to protect against by keeping your account secure.
Leo great job. Love your articles, and read them when I receive them. I’m not a novice so appreciate your candid solutions or suggestions.
Furthermore, I save them so I can go back when needed.
I’m a senior in a senior and disabled building and constantly doing my volunteering, helping others with their computers.
Keep them coming, and great job.
Roland
Leo,
Great article! I own a medium size ecommerce company (www.tylertool.com) and you’re dead on. At best I can see which people used google, yahoo or MSN versus came directly to my site by typing the url. Without the cookie that tells me it comes from google I would have no idea how much money I should invest in paid advertising. Without this I don’t know my return on ad dollars spent.
Amazon and the bigger companies may have elaborate data mining resources but the smaller companies really don’t!
I really enjoy your articles, they are very informative. I have often wondered how companies track my information or how I use my online operations. Thanks
Firefox has an add-on to delete Flash cookies! And for the truly paranoid, don’t forget the index.dat files!
What about the MAC on the NIC card/device? All MAC addresses are unique in this world. Couldn’t that be used to “know you” even if you had cookies turned off completely?
02-Jun-2009
@Suzy: Leo has an article clearing it up: http://ask-leo.com/can_a_mac_address_be_traced.html
Leo, sorry to contradict you (in terms), but although everything you said was true, there are scores of studies showing that with modern data mining techniques, one *can* trace individual information using aggregate data. Data mining is a technology that is so advanced now that it escapes comprehension to even most seasoned IT professionals.
Now, it is true that most corporations aren’t interested in particular individuals. But a particularly aggressive mass-mailer might, totalitarian governments (or branches thereof) might – remember the Internet is worldwide, it’s not only used in the U.S., and even there your government’s record of late is not exactly flawless in that respect… And identity theft gangs might be VERY, VERY interested in that. Remember they are huge now, extremely rich and well-organized, and difficult to trace and frame because they are internationally based and spread through many countries and continents. The world has no borders, and if Big Brother isn’t (yet) watching you, someone else might be…
@Tom: It’s actually not a cookie that tells you where people came from when they visit your site; it’s a ‘referer’ record. This record is always present in a request for a web page, unless the user has gone to great lengths to disable it.
So even in a cookie-free world you would still get those stats you need.
Not being tracked huh? Well how about this? I go to a website and then a pop up window of a sexy girl comes up and says that she is available. She is in the same town as me or there are some that are 25 miles from me. Ok, I clear out my Internet cookies via CCleaner and restart my computer. All my history is cleared as well. I go back to the same website and again sexy girls in my particular area are showing up. My area not anything over 25 or 50 miles but only my area. How can this be?
12-Jun-2009
@Jon B: Geotargeting http://en.wikipedia.org/wiki/Geo_targeting
Hi..one more important thing here is that ..the user is also tracked based on the mac id of your computer.
19-Mar-2010
So what I get from this article is that if you delete all the cookies and change your IP address then a website will think you are a completely different person on a completely different computer even if you just visited it 5 minutes ago?
I have a question for you. I did a google search for a guy that I like and his website on the internet came up. I then clicked on it to visit and went to a couple of pages on his website. Is there any way that a) he could know the IP address of my computer and b) guess that I visited. If so, is there any way for me to prevent this? By deleting cookies? I don’t want him to guess that I googled him!!
23-Jul-2011
My searched question on Yahoo.com: Can a website track which of its pages you have viewed?
Very few pages even address this question. So Leo, I am glad you have at least hit somewhere in the ‘ballpark’. I have read your responses to other comments on this page … The other questions just aren’t asking the right way …
Scenario:
I went to an adult website that has live models. I became friends with one of them. One day a few months ago, she accused me of ‘cheating’! [How does a woman, who sells it to anybody with money, accuse her guy friend of cheating? But, that isn’t the main point here.] She told me she was viewing another model’s page and saw me there. Admittedly, I did go there. But was only there long enough to say “Hi, (name).” Then the other model logged off. That was months ago.
Today, my ‘Cam girl friend’ said another model came to her room, saw my screen name on her screen and told her I was a regular guest and actually paid for her adult services a number of times. I have never even had a conversation with this other girl. And, since I have an outstanding balance due for previous purchases from using my friend’s ‘services’, the website restricts me, or any other user that has not paid in full, from making any more purchases. So, it is impossible for me to buy or have bought any service on that site since my unpaid purchase with my friend.
But, her suspicions beg some heavy IT questions:
1. I may have viewed that other model’s page without having signed in. So, could that be a way that someone like this other model or their boss could be tracking my visits to ‘rooms’ on their website?
2. I have noticed that this family of websites has many clones – same website with various incarnations under different names. So, I tried another site without signing in … Even though I NEVER logged in and never was able to chat with anyone on that website. I noticed that the site still keeps track of which models’ pages I visited. How is that?
3. How can someone track which pages I (my computer) have visited on their website?
4. Is this legal? Because in my situation, it has created undue tension between me and a girl friend. And, honestly that gives a person in her position unfair advantages over me: like she can accuse me of things based on her website’s tracking abilities. Even though she, by the very nature of her work, is unfaithful. And, that infuriates me!
@Anthony
As explained in this article, a website can determine the IP numbers of the computers which have accessed them. They can also place cookies on your computer which they can retrieve when you visit them again. As for the legality of doing this, I can’t give you any legal advice, but suffice it to say that most websites in the world do this kind of tracking.