- My computer name (the name I assigned to my computer)?
- Profile information???
- My browsing history (any/all sites I’ve visited and when) or can they just tell the number of items in my history?
- Email addresses associated with my computer?
I’ve reviewed similar questions but I’m not sure I truly understand what information a web server can collect from my connection/browser.
This turns into a fairly complex answer pretty quickly. It’s both more and less than you might think.
I’ll start by covering what every website sees.
Almost every web server on the planet keeps a record of the pages that have been requested of it. Each time a webpage is requested, the server adds a line of information to that log.
Here’s an example log entry from my own server of someone accessing https://askleo.com/someones-sending-email-address-stop/ (my article “Someone’s Sending from My Email Address! How Do I Stop Them?!”).
18.104.22.168 - - [24/Jul/2018:09:26:44 -0700] "GET /someones-sending-email-address-stop/ HTTP/1.1" 200 47906 "https://www.google.com/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/601.7.8 (KHTML, like Gecko) Version/9.1.3 Safari/537.86.7"
There are several interesting bits of information there:
- 18.104.22.168: the IP address of the internet connection of the computer requesting the page.
- [24/Jul/2018:09:26:44 -0700]: the date, time, and time zone offset at which the page was requested.
- GET /someones-sending-email-address-stop/ HTTP/1.1: the operation (GET), the page requested, and the HTTP protocol version to be used.
- 200: the return code. In this case, 200 means success.
- 47906: the size of the response in bytes. In this case, this would be the size of only “/someones-sending-email-address-stop/”, without any additional files (like images or support files) it might reference.
- https://www.google.com/: the site that had the link to https://askleo.com/someones-sending-email-address-stop/ that this person clicked. This person arrived at Ask Leo! after performing a Google search and clicking on a link in the results.
- “Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/601.7.8 (KHTML, like Gecko) Version/9.1.3 Safari/537.86.7”: this fairly long and obscure string, known as the “User Agent” string, identifies the browser used (Safari), the operating system (Mac OS X), and occasionally some other things the browser chooses to put in there.
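To make that breakdown concrete, here is a minimal Python sketch that splits the example entry into the same fields. It assumes the server writes the widely used "combined" log layout, with plain hyphens and straight quotes as they would appear in the raw log file. The names `LOG_PATTERN` and `parse_log_line` are made up for this illustration and are not part of any server software.

```python
import re
from datetime import datetime

# Assumed layout: the common "combined" access-log format.
# LOG_PATTERN and parse_log_line are hypothetical names for this sketch.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of the fields in one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the timestamp, status code, and size into useful types.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

sample = (
    '18.104.22.168 - - [24/Jul/2018:09:26:44 -0700] '
    '"GET /someones-sending-email-address-stop/ HTTP/1.1" 200 47906 '
    '"https://www.google.com/" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/601.7.8 '
    '(KHTML, like Gecko) Version/9.1.3 Safari/537.86.7"'
)

entry = parse_log_line(sample)
for name, value in entry.items():
    print(f"{name}: {value}")
```

Notice what the resulting dictionary contains: an IP address, a timestamp, the request, the result, the referring page, and the browser's description of itself. There is no field for a computer name, a profile, or an email address.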
As you can see, it’s both a lot, and not so much.
Your IP address doesn’t convey as much as most folks think. I’ve written about this repeatedly, but unless you’re a law enforcement agency with a court order, the best a website owner can tell is your ISP and roughly where on the planet you are — sometimes as accurate as your neighborhood, and sometimes only as close as your continent. In our example, the IP address is owned by British Telecom, and is somewhere in the UK, possibly York, as determined by a quick “whois” lookup on the IP address.
But that’s about it. Note the things you’re worried about that aren’t on the list: your computer name, your profile, your history, and your email address are not made available to a web server by a simple website visit.
Websites remember what you tell them
Sites that allow you to sign in know who you are because you told them.
They generally log this information as well, either in the server logs I described above or in other logs maintained by whatever software on the server is processing the login. If the website profile associated with that login includes things like your email address, then the server knows that too.
Because you told it.
That’s something most people fail to remember: the vast majority of data collection and tracking happens because you explicitly provided the information somehow.
Websites remember what their associated sites remember
In that same vein, whether it’s information you provided or was collected from simple visitor logs, websites that are associated with one another can certainly share data. There are several different examples.
- Sites from the same provider naturally share data. Gmail, Google Search, YouTube, and other Google properties all likely realize that you are you by virtue of your having logged in to check email.
- Sites from related properties may do the same. We don’t think of sites like Instagram and Facebook as being related, but they are (Facebook owns Instagram). I don’t know that they share data, but they certainly could. There are many such relationships out there.
- Unrelated sites could enter into agreements to share data. In theory, this would be disclosed on a privacy page, but the disclosure is typically limited to wording about sharing information with “third parties”.
Advertising is a special case.
Advertisers remember you, sort of
We’re all aware of ads that seem to follow you around from site to site. The sites that have those ads don’t need to know who you are at all; they just need to use the same or related advertising services.
All the advertising service needs to know is that “this computer looked at external hard drives” in order to then show you ads for external hard drives as you surf other sites. No personal information whatsoever was necessary to make that happen. It feels like you’re being tracked personally, but you’re not.
Of course, it could be more than that. The web servers of the advertisers have that same log information we started with, except they also know on what sites their ads were placed. So in a sense, they can see what sites you’re visiting.
And, of course, if you’re logged in to one of those sites, or a related site, or a site related to the advertising service, they might know who you are — because you told them.
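The advertising scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AdService` class, its methods, the site names, and the opaque `visitor_id` (imagined here as a random value the ad service stores in your browser, for example in a cookie) are all hypothetical and do not describe any real advertiser's system.

```python
from collections import defaultdict

class AdService:
    """Hypothetical ad service, invented for this sketch."""

    def __init__(self):
        self.interests = defaultdict(list)   # visitor_id -> topics viewed
        self.sites_seen = defaultdict(set)   # visitor_id -> sites where its ads loaded

    def record_view(self, visitor_id, site, topic=None):
        """Note that this visitor loaded a page on `site`, optionally about `topic`."""
        self.sites_seen[visitor_id].add(site)
        if topic is not None:
            self.interests[visitor_id].append(topic)

    def pick_ad(self, visitor_id, site):
        """Choose an ad using only what this same identifier did earlier."""
        self.sites_seen[visitor_id].add(site)
        topics = self.interests.get(visitor_id)
        if topics:
            return f"Ad for {topics[-1]}"
        return "Generic ad"

ads = AdService()
visitor = "a1b2c3"  # assumed opaque identifier; no name or email involved

# The visitor looks at external hard drives on one site...
ads.record_view(visitor, "shop.example", topic="external hard drives")

# ...and later sees a matching ad on an unrelated site using the same service.
ad = ads.pick_ad(visitor, "news.example")
print(ad)
print(sorted(ads.sites_seen[visitor]))
```

The service never learns a name, yet it ends up with a list of sites where its ads were loaded for that identifier. The identifier only becomes "you" if it is ever linked to an account you signed in to.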