Does visiting a search-engine cached page prevent the original site from noting my visit?


There are times when one might choose to search a company’s web page as cached by Google or Bing in the hope of not broadcasting one’s IP address to the company by searching its active web page. Does visiting the cached version of a page provide anonymity at least from the company being searched? If not, is there a way to modify the search to achieve this anonymity short of using a proxy address?

The answer depends a lot on the specific sites that you’re looking at. In many cases, yes: the original site will never know that you were looking at a copy of its content cached somewhere else. However, in many other cases, perhaps even most, the answer might be very different.

Avoiding detection

Unfortunately, it’s difficult, perhaps even impossible, to know before you visit which sites are going to limit your exposure and which are not. Ultimately, I would never visit a cached page and count on not being seen by the original site.

If avoiding being seen is important, then I’d either not visit the site at all, or I’d use a proxy or something like Tor: The Onion Router. Even then, cookies and the like could still identify you, so caution is absolutely warranted if this is that important to you.

The search engine cache

The search engine cache is really just a copy of a web page that the search engine keeps on its own servers and can serve in place of the original. It can be handy to access the cache if the site is down, or sometimes to detect changes (if the cache hasn’t been updated since the website itself was changed).

The problem is that a webpage can be very simple or it can be incredibly complex. A plain old HTML page with no pictures can be entirely self-contained: the HTML is the only thing required, and it could come entirely from the cache. But let’s be honest here: how many plain old web pages do we visit these days? When you start adding images, CSS formatting, web analytics tools, and even ads and other dynamic content, web pages get incredibly complex.

While the original page might come from the cache, the things referenced by that page may not. In fact, they often don’t: images, style sheets, and scripts are frequently fetched by your browser directly from the original site’s servers. As a result, a request for any of those items carries your IP address and could be a direct sign to the original site that you visited. Add to that things like cookies that might be saved or sent from prior visits, and it all gets pretty risky.
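To make the point about referenced items concrete, here is a minimal Python sketch that scans the HTML of a cached page and lists the resources a browser would likely request from the original site's own host. The function name requests_reaching_origin and the ResourceCollector class are hypothetical names made up for this illustration, and the sketch assumes that relative links in the cached copy resolve against the original page's address (as they would if the cache added a base reference pointing there). It only covers a few common tags, so treat it as a rough check rather than a guarantee.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tag/attribute pairs that browsers commonly fetch on their own while rendering.
# This list is deliberately short and not exhaustive.
AUTO_FETCHED = {
    ("img", "src"),
    ("script", "src"),
    ("iframe", "src"),
    ("link", "href"),
}


class ResourceCollector(HTMLParser):
    """Collects absolute URLs of resources the page asks the browser to load."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value and (tag, name) in AUTO_FETCHED:
                # Assumption: relative links resolve against the original page URL.
                self.resources.append(urljoin(self.base_url, value))


def requests_reaching_origin(cached_html, original_url):
    """Return resource URLs whose host is the same as the original site's host."""
    collector = ResourceCollector(original_url)
    collector.feed(cached_html)
    origin_host = urlparse(original_url).hostname
    return [u for u in collector.resources if urlparse(u).hostname == origin_host]
```

If the returned list is not empty, simply viewing the cached copy would probably cause your browser to contact the original site. An empty list is not proof of safety: third-party analytics scripts, which this sketch ignores, can also report visits back to the site's owner.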

I’m not saying that will happen. It depends on too many things that we don’t have any control over or knowledge about. But bottom line: accessing webpages through a search engine’s cache is not a way to guarantee anonymity.

