Tips to avoid losing data in your lifetime
Probably none of the above.
Technology keeps changing, of course, so the best long-term storage media will also continue to change.
What we really need here is more than a choice; we need a strategy.
There are varying opinions, but traditional magnetic hard drives seem the most likely to last the longest for archiving data. The best approach is to refresh the data by periodically copying to more current media. Data formats also come into play. Saving in open and/or ubiquitous file formats like pdf will help ensure the data can be read years from now.
You’ll get a lot of conflicting answers as to the best long-term storage media. Many people feel strongly that “X” is the way to go, and others feel just as strongly that, no, it needs to be “Y”.
They can both be right if you approach it properly. For example, the right solution might be both “X” and “Y”.
I’ll review the options and describe what I do.
Once upon a time, CDs and DVDs were the go-to media for archival. They had oodles of capacity and didn’t take much room.
We quickly discovered that quality matters. In fact, it matters a lot. Many cheap writable CDs that were written just 5 or 10 years ago are no longer readable. That’s exactly the scenario we’re trying to avoid.
Archival-quality CDs and DVDs (and perhaps Blu-ray) are probably worth the money if you’re thinking of storing data for many, many years. Even as recently as a few years ago, there were experts who would tell you this is the way to go. I suspect it’s a very safe bet for the most important data.
The real problem is that what was once big is now small. That 4.7GB DVD might be small given some of the things we might want to archive these days, like video or lots and lots and lots of photos.
An “oodle” just isn’t what it used to be.
I don’t have a lot of faith in memory cards and thumb drives.
Theoretically, they should last for a long time. But, again, there is such a variation in quality, it’s just not something I would put a lot of faith in. I know many people use them successfully. But whether or not they’re going to be readable 10 years from now, for example, I really can’t say.
I will say that if media starts to go bad, a simple one-bit error has the potential to make the entire drive unrecoverable — unlike optical or magnetic media, where data-recovery techniques stand a better chance of success.
Traditional hard drives
Traditional magnetic spinning-platter hard drives — HDDs — are probably the most practical long-term storage if they’re stored properly. What that means is keeping them away from moisture and not storing them around strong magnetic fields. I’d feel confident in that data being accessible for decades.
You also need to manage them properly, perhaps updating to newer technology as it becomes available. More on that below.
HDDs are big; they store a lot of data. Even “older” drives — just a few years old — might be considered small when used in a computer, but if converted to an external drive, they make for excellent long-term storage.
Solid state drives are best thought of as a cross between flash memory and traditional hard drives.
My take on them is that the jury’s still out. They’re certainly of better quality than your average thumb drive, but it’s still not clear if they’ll hold their data for decades. Since they are both still smaller and more costly than traditional hard drives, to me they don’t seem like a good choice for long-term storage right now.
That could change.
When it comes to compatibility between today’s technology and that of years from now, there are two issues: physical and logical.
Will computers 10 or 20 years from now be able to read the media we write things to today?
For example, if you stored something on floppy disks 20 or 30 years ago, you are now dealing with the fact that computers no longer have floppy drives. You can find an external floppy drive for 3.5-inch floppies, but if your disk is much older (say, a 5.25-inch disk common at the dawn of the PC era), you’ll have a difficult time finding a way to read it. Optical drives are beginning to disappear as well.
I’m fairly confident that the USB interface, and thus USB external drives, are going to be supported for a very long time. As I write this, USB 3 is common, and USB 4 is on the horizon; yet even old USB 1 devices still work, albeit more slowly. I’m confident that 20 or 30 years from now, there will still be a USB interface into which I can plug one of today’s external drives.
Will our computers 10 or 20 years from now have logical compatibility? By “logical”, I mean the format of the information we store, and our ability to run programs to read or interpret it.
A great example is the impending death of Adobe Flash, after which software that plays Flash-based games will no longer be generally available. People wanting those programs to continue running will need to “do something” (although it’s currently unclear what that is).1
Compatibility falls into two categories:
- The format of data on disk. Will the NTFS filesystem still be readable 30 years from now? How about FAT or FAT32? One would hope both will — and indeed, I do expect they will. But historically, there are definitely storage formats that lasted for only a brief time and you’d be hard pressed to recover today.
- The format of the data. Will jpg files still be a thing 30 years from now? Will there be programs that can play mp3 files? Again, one would hope that based on the current ubiquity of those formats, there will be compatible readers for decades. But, again, digital archives are littered with file formats that are understood by no current programs at all. While recovery would theoretically be possible by re-inventing a compatible reader, it’s not a simple task.
Left unaddressed, both of these are barriers to the viability of long-term digital archives.
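One way to keep an eye on the file-format side of this is to take stock of what formats an archive actually contains. The following is only a minimal Python sketch, not something described in the article: the function names and the `COMMON_EXTENSIONS` set are assumptions made up for illustration, and which formats count as "common" is a judgment call you would adjust yourself.

```python
from collections import Counter
from pathlib import Path

# Assumption: an illustrative starting list, not an authoritative one.
COMMON_EXTENSIONS = {".jpg", ".jpeg", ".mp3", ".pdf", ".txt"}


def extension_counts(root: Path) -> Counter:
    """Count the files under root by lowercase file extension."""
    counts = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            # Files with no extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def uncommon_formats(root: Path) -> dict[str, int]:
    """Return extensions (with counts) that are not in COMMON_EXTENSIONS."""
    return {
        extension: count
        for extension, count in extension_counts(root).items()
        if extension not in COMMON_EXTENSIONS
    }
```

Calling `uncommon_formats(Path("/path/to/archive"))` (a hypothetical path) gives a short list of formats that may deserve a second look, for example by saving an additional copy in a more widely used format.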
What I do
Clearly, technology is constantly changing. Long-term archiving might not be best thought of as a “set it and forget it” kind of thing. Every so often, it’s worth a re-visit.
And that’s pretty much what I do.
I have a strategy.
On the physical/hardware side of things, what I once had on floppies, I eventually copied to CD. Then years later, what I once had on CDs (and a handful of DVDs), I copied to external hard disks. As newer, larger hard disks become available, I occasionally consolidate data from older, smaller drives onto newer, larger ones. The 512GB drives I once used for archival have all now been replaced by at least 1TB drives, and my most recent addition to the mix was an 8TB drive.
This is the management I referred to earlier. By periodically “upgrading” the storage used by your archives to newer technology — copying the old disks to new, say, every 10 years or so — you also sidestep issues with older hardware failing due to age or lack of availability.
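When copying an old disk to a new one, it can be reassuring to confirm that everything arrived intact before retiring the old drive. Here is a minimal Python sketch of one way to do that, by comparing SHA-256 checksums of every file. The function names and paths are hypothetical, and this is an illustration rather than the process described in the article.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(old_root: Path, new_root: Path) -> list[str]:
    """List files from old_root that are missing or different in new_root."""
    problems = []
    for old_file in sorted(old_root.rglob("*")):
        if not old_file.is_file():
            continue
        relative = old_file.relative_to(old_root)
        new_file = new_root / relative
        if not new_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_sha256(old_file) != file_sha256(new_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list from `verify_copy(Path("E:/archive"), Path("F:/archive"))` (example drive letters only) suggests every file on the old drive has an identical counterpart on the new one. The check is one-directional: extra files on the new drive are not reported.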
It does take a little bit of forethought and effort to organize and copy the data. (The floppies were the worst.)
When it comes to things like the file formats of my data, I have less of a plan and more of an expectation. I expect that file formats ubiquitous today will still be readable in 50 years.
That means I save things in common file formats like .jpg, .mp3, and .pdf when I can. I would hope there would be better alternatives in the future, but I expect that because there are so many files in these formats today, they’ll always be readable, or convert-able somehow, in my lifetime and beyond.
Much like ASCII text documents created 50 years ago remain readable today.
A word about backing up
“If it’s in only one place, it’s not backed up.”
The other way you protect yourself from old hardware failing is the same way you protect yourself from any hardware failing: you back up.
Make sure you have multiple copies of any data you want to preserve — ideally on different media. Don’t put all your eggs in one kind of basket.
In my case, that 8TB drive I added to my system is a backup drive. Any data added to my archives on older disks is automatically copied to the new, larger drive, and thus lives in at least two places.
Whatever strategy you choose and whatever media you use, make absolutely certain to include backing up or some kind of redundancy in your plan. That approach significantly reduces the risk of choosing the wrong long-term media.
There’s one more thing, though.
So far, I haven’t mentioned cloud storage.
It’s something you should consider.
I consider my photos my most precious data. In years past, it’d be the photo album I’d reach for on the way out of a burning house.2 Today that translates into redundancy — lots of redundancy.
I have over a terabyte of photos, including scans of photo albums pre-dating my birth, in Dropbox.3 Any time I add a photo, it’s immediately replicated — backed up — to the cloud and to several other of my machines. I could lose all of my hardware — every computer, every hard disk, every everything — and my photos would be waiting for me online.
But I’m not done. I also make a copy of my Dropbox folder outside of Dropbox4. That way, in the unlikely event that my Dropbox folder gets hacked or lost and all my files deleted, I’d still have a copy here at home.
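The copy outside of Dropbox is described in the footnotes as a simple batch file that copies changed files nightly. The batch file itself isn't shown, so what follows is only a rough Python sketch of the same idea, with a hypothetical function name and paths. It assumes a file counts as "changed" when it is missing from the destination, has a different size, or has a newer modification time.

```python
import shutil
from pathlib import Path


def copy_changed_files(source: Path, destination: Path) -> list[Path]:
    """Copy new or changed files from source to destination; return the copies."""
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        dst_file = destination / src_file.relative_to(source)
        if dst_file.exists():
            src_stat, dst_stat = src_file.stat(), dst_file.stat()
            unchanged = (
                src_stat.st_size == dst_stat.st_size
                and src_stat.st_mtime <= dst_stat.st_mtime
            )
            if unchanged:
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves timestamps
        copied.append(dst_file)
    return copied
```

Something like `copy_changed_files(Path.home() / "Dropbox", Path("D:/DropboxCopy"))` (example paths only) could be run nightly from a scheduler. One design choice in this sketch: it never deletes anything from the destination, so files removed from the source folder, whether by accident or by an attacker, remain in the second copy.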
The cloud can absolutely be a part of a very effective archival strategy, particularly for your most important information.
Bottom line: think about this
Honestly, long-term archival is much like backing up: the best approach is whatever approach you’ll actually take.
The difference, however, is time. When it comes to expecting to keep something for decades or longer, you’ll want to put some thought into exactly how, where, and when you store things.
Your children, your grandchildren, and perhaps even more future generations will thank you.
Footnotes & References
1: Sites like archive.org have a vested interest in the answer, and may even develop one themselves. In the interim, older versions of software might be supported in the form of emulators or virtual machines.
2: OK, ok, after making sure my wife and pets were safe.
3: Any of the major providers would do.
4: Just a simple batch file that copies all changed files from Dropbox to another location nightly.