Are we at risk of losing our digital information over time?

Today, lots of information is being stored electronically. Years ago, it was in books that lasted hundreds of years, if you wanted. Try reading a 1984 3-1/2″ floppy disk or worse, a 5-1/4″. The 8-inchers are before my computing time (I saw them in WarGames). The ones with the reels that stopped and started are really unreadable. I also used a tape drive on a Commodore VIC-20. If I did not hoard this stuff, it would all be unreadable, as I have not used it in 34 years. So are humans in danger of losing knowledge because of this? Magnetic media degrades faster than paper books, doesn’t it?

I think you raise a very good point.

While I still feel that digital information is vastly superior to analog counterparts like paper in most respects, that doesn’t mean that there aren’t drawbacks – often serious drawbacks – with storing information digitally.

And one of those drawbacks is progress.

Formats: physical

When it comes to digital information, what we’re concerned about over time is how that information is stored.

The most obvious are the issues you raise: physical formats. I do, indeed, have a couple of backup tapes that I no longer have a drive for. My first floppy disk was an 8″ drive. My first “storage medium” was punch cards.

Getting data off of those formats today would represent a challenge.

There’s no reason to believe that any of the successors to those formats will last forever either. 8″ disks became 5-1/4″ which became 3-1/2″, which is today almost unheard of. While the USB interface is backwards-compatible so you can still connect that USB 1.0 drive from a decade or more ago, that’s unlikely to last forever. Even internal hard drives have gone through some transitions. I expect that older IDE/PATA drives may simply not work any more in newer machines, giving way to the faster SATA interface – which itself has gone through a few revisions.

As I often say, change is constant. Evolution is continuous. It’s unlikely you’ll see any of today’s hardware on computers of the next century.

Formats: digital

But there’s actually another level to the issue of future compatibility, and that’s the digital format of the file.

Even today, proprietary file formats can fall by the wayside. A file saved in a custom format by an application that stopped working with the advent of, say, Windows XP, is for all practical purposes inaccessible today on a newer system.[1]

There are certain file formats that we might assume will last longer than others, PDF and JPEG being good examples. But ultimately, will today’s “.docx” files be able to be opened by Word 2076? Maybe. Hopefully. How about the latest CAD program? Other graphics files? Databases? Games?

Maybe. Maybe not.

Aging media

Another problem we face is that physical media degrades over time. Magnetic material slowly demagnetizes. Optical media like writable CDs and DVDs oxidize. Even paper can suffer from mold and decay if not stored properly.

The big question is “how long do they last?” The answer is: it depends.

It depends on the quality of the media, the quality of the writing instrument, and the environment in which it’s stored. A sealed hard drive, for example, I would expect to last for many decades if not used. Floppy disks, not nearly as long. Archival CDs? I’ve yet to encounter a hard failure as I move data stored on CDs that are upwards of 20 years old.

And yet some experts say we should expect as little as 5 years from optical media.

And yes, being reminded of this has spurred me to take action. I’ve resumed a project where I’m copying hundreds of archive and backup CDs to hard drive storage (which in turn is backed up).

While I can.

That’s where at least part of the solution lies.

Migrating Data

Copy those floppies. Duplicate those discs. Transfer those tapes.

While you can.

The problem, of course, is that we can’t predict the storage medium of the future. But we can copy what we have today. In fact that’s one of the things that makes digital storage so appealing – assuming the media is intact it can be easily copied an infinite number of times with zero loss in quality or fidelity.

So, that’s what we need to do.

Migrate the data on those floppies, discs and other older storage media to newer media.

The good news is that besides being faster, newer media – like hard drives – are often significantly larger than the media we’re copying from. In my case those hundreds of CDs will in all probability take up only a fraction of the hard disk space I have available.

And then, of course, back everything up. Putting everything on a single hard drive means you can lose it all at once. But, as I’ve said, data is easily copied – so copy it. In my case I copy to another hard drive on another machine.

Depending on the importance of the data you might also consider an archive copy in the cloud.
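The migration step described above can be sketched in a few lines of Python. This is only a minimal illustration, not a finished tool: the names `migrate` and `sha256_of` and the folder names in the usage note are made up for the example, and it assumes the old media is already readable as an ordinary folder. Each file is copied and then compared against the original by SHA-256 hash, which is one way to confirm that the copy really did happen with zero loss.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source_dir, dest_dir):
    """Copy every file under source_dir into dest_dir and verify each copy.

    Returns the list of source files whose copies did NOT match.
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    mismatched = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        target = dest_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also tries to keep timestamps
        if sha256_of(source) != sha256_of(target):
            mismatched.append(source)
    return mismatched
```

Calling something like `migrate("D:/", "E:/archive/cd_0042")` (both paths hypothetical) would return an empty list when every file was copied intact. Anything in the returned list deserves a second attempt while the old media can still be read.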

It’s conceptually very simple

Honestly, all we’re talking about here is another form of backing up.

If the data is in only one place, then it’s not backed up.

If the data is in only one place, and that one place is aging or soon-to-be obsolete media then absolutely you’re at extremely high risk of losing it all, forever.

The solution is simple.

Back it up.
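To make "only one place" concrete, here is a rough Python sketch that looks through several storage locations and reports the files whose contents appear in just one of them. The function names are invented for this example, and it assumes each location is a folder on a genuinely separate drive or machine; two folders on the same disk would not count as two places.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def single_copy_files(locations):
    """Return paths whose contents exist in only one of the given locations."""
    holders = defaultdict(set)  # digest -> indexes of the locations holding it
    example = {}                # digest -> first path seen with that content
    for index, root in enumerate(locations):
        for path in Path(root).rglob("*"):
            if path.is_file():
                digest = file_digest(path)
                holders[digest].add(index)
                example.setdefault(digest, path)
    return sorted(example[d] for d, where in holders.items() if len(where) < 2)
```

Files are matched by content rather than by name, so a renamed copy still counts as a backup. Everything the function returns is data that is not yet backed up.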

It’s not really a new problem

Paper can last a long time, but it’s notoriously difficult to back up. Imagine the manuscripts and documents that have been lost to the ages forever because of a fire, water, or even intentional destruction. I, for one, wish that the Library at Alexandria had been backed up, and mourn what was lost when it was destroyed.

The movie and film industry is fighting something that more closely parallels our digital issues as they scramble to save and transfer movies on degrading cellulose film before they’re also lost forever.

In most of these cases digital storage is, in fact, the solution. Once digitized, movies or just about anything can be quickly and easily replicated so data loss need never happen again. In particular, as physical storage formats change over time, that replication can be from old to new media as well. Just as I copy my archival CDs to my hard disk today, presumably that hard disk could someday be copied to its future technological replacement.

The future of file formats

File formats remain a problem, but only to a certain degree.

For example, I fully expect that computers of the 22nd century and beyond will be able to read PDF files. Why? Pragmatically it’s become such a ubiquitous file format today that I can’t imagine that at least future historians won’t be able to readily access it. Will the average computer user? That’s unclear, but I’m confident that someone will be able to convert them to the 22nd century’s equivalent.

Less popular digital formats have a less certain future.

But here’s the good news: as long as the digital bits have been preserved, the data is not lost.

Yes, some future historian or hobbyist might also be charged with re-implementing or reverse-engineering 22nd century code to interpret an obscure file format, but almost by definition it can and could be done.

It’s just a matter of time, cost and priority.
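Preserved bits are most useful when you know what format they are in. Many formats announce themselves in their first few bytes, so a quick inventory of an archive is possible even when file names and extensions have been lost. The sketch below is illustrative only: the `sniff_format` name and the short signature table are choices made for this example, and a real archive would need a far longer list.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of a few common formats.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"PK\x03\x04", "ZIP container (includes .docx)"),
]


def sniff_format(path):
    """Guess a file's format from its leading bytes, ignoring its name."""
    with Path(path).open("rb") as handle:
        head = handle.read(16)
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return "unknown"
```

Anything reported as "unknown" is a candidate for conversion to a more common format while the software that reads it still runs.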

Footnotes & references

1: Case in point: the character-mode help file format that I designed while at Microsoft. It was supported for a while in Windows Help, but Windows Help itself has fallen by the wayside.

14 comments on “Are we at risk of losing our digital information over time?”

  1. Leo, the Library of Congress and National Archives face this problem in spades. For example, when a new president comes in, all the PC disks in the White House are yanked and sent to the Archives. How will they be read 100 years from now? They are actively researching this problem.

  2. I work on audio recorders and have had some bad things happen with DVD-RAM disks. Besides the drives being outdated, the media becomes unreadable, and I have had to spend over $1,500.00 to have a disk recovered. I am recommending that my customers convert everything to hard drives.

    • And hope the drives do not crash. And hard drives do crash. I had a new HP notebook computer whose hard drive crashed after only 6 months. It was under warranty and I had recent image backups for the drive, so it did not cost me anything and nothing was lost.
      Also, if the hard drive sits unused for years, what will that do to the moving parts of the drive, such as the ball bearings, which can go flat if left in one position for long periods of time?

      • No need to hope the drives don’t crash. Assume they eventually WILL crash and make regular backups. Several are better than one.

  3. Minidiscs? (What is the longevity of?) They never quite took off – for reasons I can only speculate – and now that they’ve been around nearly 20 years and obviously are not going to continue into the future, it’s safe to say they (the contents of) will need backing up. One problem (aside from cost) is, they can only be copied in ”real time” so if the playing time is 2.6 hours it will take that long to copy each one.

    Sony – in a desperate bid to make MD take off – got as far as creating a Net-MD, which system enabled one to copy computer files to a mini-disc. A pity that it didn’t work in reverse, to get music files onto the computer – and from thence to mp3, etc.! But it still didn’t take off. (I picked one up about 10 years ago when they were selling them off cheaply. So already becoming obsolete even then!)

    Bigger problem – my 800 or so VHS video tapes. How can I get them (selectively of course) onto the computer? (What sort of size of hard drive would I need, per hour, and what sort of software would do the job? Any ideas?) That was quite a digression, but perhaps a productive one after all!! Cheers, Leo!

    • From what I’ve read a converted VHS video can range from about 400 GB to 1,500 GB per hour. You might have to experiment to find the ideal resolution for your needs.

      • I suspect you’re over-estimating by about three orders of magnitude there. Just look at “DVD rips” that routinely come in around 700 MB for 1.5-2 hours of video. And VHS has a lower useful resolution than even plain old DVD media. I wouldn’t be surprised if one could store a complete VHS tape in about half of a gigabyte while still preserving virtually all of the original fidelity.

        • I’ve started to digitize mine, but it’s a slow process because it is copied in real time. A 2 hour movie/tape takes 2 hours and because of the speed of the processor and the 1 GB of RAM in my Windows XP machine (the USB device I use only runs on Windows XP), I can’t do anything else on the computer or risk having the audio and video out of synch.

          A 97-minute tape that I recently did, recorded as MPEG4 high quality (528 KB/sec) and saved as a high quality NTSC DVD format (29.97 frame rate, 7200 Kbits/sec), turned out to be 5.56 GB.

  4. I’m not sure I would even bank on PDF being around in the 22nd century.

    WordPerfect and Lotus 1-2-3 were the standard back in the day. Everyone probably thought those formats would last. Then Microsoft came along. WordPerfect and Lotus were still the dominant players for a couple more years, and then everyone had to have Microsoft Office. Nowadays, some people still use WordPerfect and Lotus, and even if you don’t, those formats are still readable by a few software packages, but by and large, they are dying.

    I get what you’re saying, PDF has been around for so many years and is used by so many people, and I think that it is the best candidate for longevity; however, in the tech world, nothing is guaranteed. Blackberry used to be the darling of smartphones. Today it’s Apple. In the 22nd century? Well, I probably won’t be here to find out.

    • PDF will very likely pass from popularity in 100 years, but that and probably some other currently ubiquitous formats such as .doc and .docx will probably be accessible as legacy formats 100 years from now. At the very least, there should be programs to convert from those formats to something more current.

  5. Another related issue that’s seldom addressed: What happens to all the saved items you want others to access, such as photographs or important documents, after you lose the ability to access them yourself? Unless others know where they are and how to get to them, they are lost with you. Maybe they are backed up to the cloud. Even with the access information passed on, unless someone continues to pay for the service, it will go away. There are similar issues with local storage: people may toss grandma’s old computer without realizing what is on it, CDs may end up in a box in the attic, etc. The physical photo albums and scrapbooks at least were recognizable for what they were.
    One of my relatives is producing CDs of photos of interest to other family and sending them to the interested folks. This is far superior to my method: publish some pictures on the Web, and provide access information to the several places where thousands of pictures are stored, of which maybe 1% is of interest to anyone (including me).
    My prediction is that her CDs will be enjoyed by folks for generations, but the entire photographic record of my existence will be gone within 5 years of my death.

  6. Hi Leo,
    I beg to differ on one issue: that you may not see today’s computers by the 22nd century. Are you sure they’ll last beyond 2050? I doubt it very much. These days, models change so fast that one can’t get spares for a 5-year-old machine!
    As regards the permanent loss of data, it is inevitable. As much as a data medium dies, so does the value of its data. In India, many old films have been lost forever. I’m sure it is so everywhere. I’ve lost my marriage VHS tape forever due to a fungus attack. But who wants my marriage show? Maybe not even my daughter!

  7. Leo,
    If you have some 8-inch floppy disks you want to read, just go to your local Minuteman missile silo. A segment on CBS’s 60 Minutes showed a Minuteman missile command still using 8-inch floppy disks. Apparently the system is old but still working. And since it is so old, it is not connected to the internet and is safe from hacking.


