The internet is forever, except when it’s not.
Um… no. There’s no magical “scramble code” to recover anything.
But that does raise a very interesting conundrum. We often say “the internet is forever”, while at the same time saying, “Be sure to back up, because once you delete it, it’s gone.”
The ways of both the internet and deletion are more complex than most people realize. While these two statements appear to be diametrically opposed, they’re both very true — often at exactly the wrong time.
Deleting or retrieving from the internet
- Trying to hide things online often only brings more attention. It’s called the Streisand Effect.
- Anything you post can be copied, and likely is being copied within moments, by search engines, archives, and other third parties.
- Sharing information online, even with privacy restrictions, puts you at the mercy of those you share with.
- Services that back up their data make copies when they do, which includes your data on that service.
- Once posted, it’s pragmatically impossible to delete all copies of your data.
- You can’t access the remaining copies, but they could still come back to haunt you.
The public internet & Barbra Streisand
As many have discovered, one of the fastest ways to spread a rumor is to call it a secret. That’s true in life, but nowhere is it more true than on the internet.
Just ask Barbra Streisand. There’s a reason there’s something now called “the Streisand effect”.
In 2003, she attempted to prevent photographs of her home from being published on the internet. This drew more attention, rather than less, to the images. It prompted people to copy and re-post the photos elsewhere, again and again and again. Indeed, the Wikipedia article that describes “the Streisand effect” includes the picture she originally attempted to suppress. There’s simply no way she or anyone else could find and delete all the copies of that photo that were made then and that continue to be made to this day. Indeed, any attempt to do so would probably just spur another round of copying and re-posting.
We see this all the time with information posted publicly that is later removed or altered. Be it a tweet, a photo, a database, or something else, usually someone, somewhere, has a copy of the original. On discovering the attempts to rewrite or delete history, they’re quick to call shenanigans on the effort by posting the original as proof. Depending on the perceived relevance of the information, the original may be re-posted once or many times in many places, making its removal from the public internet as impossible as removing the photo of Ms. Streisand’s home.[1]
This is one reason we say “the internet is forever”. Anything you post publicly can be copied. There’s no easy way to know by whom, or by how many, but you can never assume that the number of copies is zero. Never. Once it’s on the public internet, you lose all control over it the instant that someone makes a copy.
Your posts are being copied
OK, so you’re not Babs[2]. To the best of your knowledge, no one cares about your tweets, photos, or whatever you care to post online. No one’s making copies of what you post publicly.
While it’s likely that you as an individual aren’t that interesting, that doesn’t mean what you’re posting online isn’t being copied, and probably quite quickly. You’re “interesting” in the sense that you’re a user of Twitter, Google Photos, Flickr, or whatever service you use. Those sites are mirrored regularly. Why?
- Search engines like to keep local, cached copies in case the site goes offline.
- Understanding how to “spider the web,” as it’s called, is something computer science students learn by writing spiders that pick a target and mirror it. Others just do it for fun.
- The Internet Archive attempts to keep copies of all public websites to preserve our digital history. (Pictured to the right, my company’s website as of 2003.)
- Many sites exist specifically to mirror other sites (or portions of other sites). Pick a prominent politician on Twitter, for example, and I can pretty much guarantee you there’s a site keeping copies of that person’s tweets.
And that’s before we even consider corporations, malicious agents, and governments copying public information for their records, analysis, and uses unknown.
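To make the mirroring idea concrete, here is a toy sketch in Python. It is not how any real search engine or archive works; the `site` and `mirror` dictionaries and the `crawl` function are hypothetical stand-ins I made up for illustration. All it shows is that once a crawler has passed by, deleting the original does nothing to the copy.

```python
# Toy model: a "site" is a dict of URL -> content; a mirror copies what it sees.
# All names here are hypothetical and exist only for illustration.

def crawl(site, mirror):
    """Copy every page currently on the site into the mirror (never deletes)."""
    for url, content in site.items():
        mirror[url] = content

# A pretend public site with a single post.
site = {"/posts/1": "Something I might regret later"}
mirror = {}

crawl(site, mirror)        # a search engine or archive passes by
del site["/posts/1"]       # the author deletes the post
crawl(site, mirror)        # later crawls do not remove the old copy

print("/posts/1" in site)  # the original is gone
print(mirror["/posts/1"])  # the mirrored copy is still there
```

A real mirror is far more elaborate, but the order of events is the point: copy first, delete second, and the copy survives.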
Is something you’ve posted in there? Probably. Will it matter someday? There’s no way to know. I personally publish a lot, but I don’t expect it to be a problem.
I hope I’m right.
Sharing is copying
I keep using the phrase “public internet” because it’s an important distinction many people fail to keep in mind. Public is public, and as we’ve seen, public must be considered to be “forever”.
So we ratchet up our privacy settings, restricting who is allowed to access our stuff, or perhaps only emailing certain things to certain people. We keep it “private” — or so we think.
Still, we remain at the mercy of everyone with whom we choose to share our data. Each could be copying what we give them access to, intentionally or otherwise. On top of that, they could have really bad security; should their accounts or computers be compromised, whatever we share with them could be in the hands of a hacker in moments.
While that last scenario is not very likely (unless you’re a “high-value target” in the hacker’s eyes, and he’s used your friends to get access to you), it underscores something that is vital to understand: every time you share information with someone, you’re giving them a copy, and you’re giving them the ability to make more copies, and perhaps even post one of those copies publicly.
Sharing and exchanging data over the “private” internet might not be quite as private as it seems, since there really is no “private” internet at all.
Backing up is copying
The internet is nothing more than a collection of computers that store data and know how to talk to each other. When you use a service like Twitter, send email, upload a photo, or even post a comment on a website like Ask Leo!, that information is stored on a computer not unlike your own[3]. Those services all take steps to back up the data they contain (as you hopefully do).
Backing up makes a copy of all their data — including all of your data.
Even if you’re the only one using an internet-hosted service — perhaps your email, cloud storage, online password vault, or who-knows-what — there’s a good chance the service provider is regularly backing up their servers in case something goes wrong. In fact, we hope that’s exactly what they do.
How long do they keep the backups? They’re not saying. It could be moments or years. But it’s possible that whatever you’ve shared online or stored online for yourself has been backed up somewhere, somehow, in some way. That’s yet another copy of your data that’s effectively impossible for you to erase, which brings us to the reason for all this “internet is forever” kind of talk.
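Here is a small sketch of why a provider’s backup is one more copy you can’t erase. The `Service` class below is entirely hypothetical and is not modeled on any particular company; it simply assumes a service that takes full snapshots of its live data and that a user’s delete only touches the live copy.

```python
import copy

class Service:
    """Hypothetical online service holding user data plus its own backups."""

    def __init__(self):
        self.live = {}     # what users can see and delete
        self.backups = []  # snapshots only the operator can reach

    def backup(self):
        # A backup is a full copy of the live data at this moment.
        self.backups.append(copy.deepcopy(self.live))

    def delete(self, user, key):
        # Deleting touches the live copy only; earlier snapshots are unchanged.
        self.live[user].pop(key, None)

svc = Service()
svc.live["alice"] = {"photo.jpg": b"..."}
svc.backup()                      # the provider's routine backup runs
svc.delete("alice", "photo.jpg")  # alice deletes her photo

print("photo.jpg" in svc.live["alice"])        # gone from her account
print("photo.jpg" in svc.backups[0]["alice"])  # still in the snapshot
```

How long a real service keeps such snapshots is up to the service, which is exactly the problem.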
Deleting doesn’t delete all
You delete an email. You delete a file from your cloud storage. You delete a photo from your social media account. You delete a tweet. As you can see by now, regardless of exactly what that looks like to you, it’s very likely you’ve deleted only one of many copies of your data.
Yet you can’t get it back. Once you delete it, it’s gone.
The “catch” is, you’ve deleted the copy under your control. Perhaps it’s the copy most obviously visible to everyone, but it’s probably not the only copy.
Unless you have access to those other copies, or you’ve kept a copy on your own machine, you’ve lost your data. The online services generally will not restore from their backups (the backups are to recover from their issues, not yours). Hackers certainly aren’t going to share with you, even if you can track them down (they’re probably overseas anyway). And the NSA isn’t going to respond to your request to restore your data from their backups (assuming they’ve been watching you, of course).
This is why we say “Once you delete it, it’s gone.” There may be other copies, but there is likely no way to access them.
If it was public, maybe you’ll get lucky and find a copy on The Internet Archive; I’ve recovered an occasional website or web page from there. If it was private, perhaps someone with whom you shared it still has a copy. If it was yours and yours alone, and it was stored in only one place, then you weren’t backed up. It’s likely gone forever, regardless of how many actual copies there might be out there somewhere.
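If you’d rather check The Internet Archive from a script than by hand, the sketch below may help. It assumes the Wayback Machine’s public “availability” endpoint at `archive.org/wayback/available` and its JSON reply format as I understand them; check the Archive’s own documentation before relying on either. The function names are mine, and the sample reply is made up purely to show the shape of the data. Nothing here touches the network.

```python
import json
from urllib.parse import urlencode

def availability_url(page_url, timestamp=None):
    """Build a lookup URL for the (assumed) Wayback availability endpoint."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp  # YYYYMMDD, asks for the closest capture
    return "https://archive.org/wayback/available?" + urlencode(params)

def closest_snapshot(response_text):
    """Return the archived copy's URL from a JSON reply, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None

lookup = availability_url("example.com", "20030101")
print(lookup)

# Illustrative reply only; a real one comes from fetching the URL above.
sample_reply = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20030101000000/http://example.com/",
            "timestamp": "20030101000000",
            "status": "200",
        }
    }
})
print(closest_snapshot(sample_reply))
print(closest_snapshot('{"archived_snapshots": {}}'))  # nothing archived
```

To use it for real, fetch the lookup URL with your tool of choice and pass the body of the reply to `closest_snapshot`.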
Unless you have sufficient resources (read: money), a compelling reason, an attorney, and a court order to force an online service to retrieve it, whatever you deleted is gone.
And then it gets weird.
Deleted isn’t deleted, except when it is
Whatever you deleted is gone from your grasp. You deleted it, and you can’t recover it — unless you had a backup, of course.
But it’s not really gone, now, is it? As frustrating as it is, copies continue to exist: system backups, at a minimum, and possibly archive/mirror copies, research copies, malicious copies, and more.
All out of your reach and out of your control.
There are only two things you can count on, really:
- You can’t get it. (“Once you delete it, it’s gone.”)
- It could still come back to haunt you. (“The internet is forever.”)
The solutions are equally simple:
- Back up everything you keep online.
- Don’t put anything online that might “haunt” you, for whatever definition of “haunt” you care to assume.
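For the first of those, here is one minimal way to keep your own copy, assuming you’ve already downloaded or exported your online data to a folder on your computer. The folder names and the `back_up` helper are hypothetical; the sketch copies the folder into a new timestamped backup folder and checks each copied file against the original using a SHA-256 checksum.

```python
import hashlib
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

def sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def back_up(source_dir, backup_root):
    """Copy source_dir into a new timestamped folder under backup_root and verify it."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"backup-{stamp}"
    shutil.copytree(source_dir, target)
    # Verify that every copied file matches its original.
    for src in Path(source_dir).rglob("*"):
        if src.is_file():
            dst = target / src.relative_to(source_dir)
            if sha256(src) != sha256(dst):
                raise IOError(f"Backup mismatch: {src}")
    return target

# Demonstration with throwaway folders standing in for exported data.
with tempfile.TemporaryDirectory() as tmp:
    downloads = Path(tmp) / "exported-from-cloud"
    downloads.mkdir()
    (downloads / "notes.txt").write_text("my only copy, until now")
    copy_dir = back_up(downloads, Path(tmp) / "backups")
    restored_text = (copy_dir / "notes.txt").read_text()
    print(restored_text)
```

In real use you would point `backup_root` at a different drive than the one holding the originals, since a backup on the same disk fails along with it.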
These are exciting times, to be sure, but they’re complex and often frustrating times, as well.
Footnotes & References
1: In an exceptionally interesting and geeky note, the technology underlying Bitcoin, called “blockchain”, was used to “Irrevocably Mirror” 20,000 lectures before a legal issue forced the university hosting them to remove them from its site.
2: If English is not your native language, “Babs” is one of the many short forms of “Barbara”. It’s not common, and I’m guessing Ms. Streisand’s not a fan of it. But once published, it’s out there. Forever.
3: Seriously. They might have more cores, more RAM, more disk space, or more whatever, and they might run different operating systems (or not), but the majority of the internet runs on computers that aren’t that different from the desktop computer nearest you.