Tips to avoid losing data in your lifetime
Probably none of the above.
Technology keeps changing, of course, so the best long-term storage media will also continue to change.
What we really need here is more than a choice; we need a strategy.
There are varying opinions, but traditional magnetic hard drives seem the most likely to last the longest for archiving data. The best approach is to refresh the data by periodically copying it to more current media. Data formats also come into play: saving in open and/or ubiquitous file formats like PDF will help ensure the data can be read years from now.
Storage media
You’ll get a lot of conflicting answers as to the best long-term storage media. Many people feel strongly that “X” is the way to go, and others feel just as strongly that, no, it needs to be “Y”.
They can both be right if you approach it properly. For example, the right solution might be both “X” and “Y”.
I’ll review the options and describe what I do.
Optical media
Once upon a time, CDs and DVDs were the go-to media for archival. They had oodles of capacity and didn’t take much room.
We quickly discovered that quality matters. In fact, it matters a lot. Many cheap writable CDs that were written just 5 or 10 years ago are no longer readable. That’s exactly the scenario we’re trying to avoid.
Archival-quality CDs and DVDs (and perhaps Blu-ray) are probably worth the money if you’re thinking of storing for many, many years. Some experts, even as recently as a few years ago, would tell you this is the way to go. I suspect it’s a very safe bet for the most important data.
The real problem is that what was once big is now small. That 4.7GB DVD might be small given some of the things we might want to archive these days, like video or lots and lots and lots of photos.
An “oodle” just isn’t what it used to be.
Flash memory
I don’t have a lot of faith in memory cards and thumb drives.
Theoretically, they should last for a long time. But, again, there is such a variation in quality, it’s just not something I would put a lot of faith in. I know many people use them successfully. But whether or not they’re going to be readable 10 years from now, for example, I really can’t say.
I will say that if media starts to go bad, a simple one-bit error has the potential to make the entire drive unrecoverable — unlike optical or magnetic media, where data-recovery techniques stand a better chance of success.
Traditional hard drives
Traditional magnetic spinning-platter hard drives — HDDs — are probably the most practical long-term storage if they’re stored properly. What that means is keeping them away from moisture and not storing them around strong magnetic fields. I’d feel confident in that data being accessible for decades.
You also need to manage them properly, perhaps updating to newer technology as it becomes available. More on that below.
HDDs are big; they store a lot of data. Even “older” drives — just a few years old — might be considered small when used in a computer, but if converted to an external drive, they make for excellent long-term storage.
SSDs
Solid state drives are best thought of as a cross between flash memory and traditional hard drives.
My take on them is that the jury’s still out. They’re certainly of better quality than your average thumb drive, but it’s still not clear if they’ll hold their data for decades. Since they are both still smaller and more costly than traditional hard drives, to me they don’t seem like a good choice for long-term storage right now.
That could change.
Future compatibility
When it comes to compatibility between today’s technology and that of years from now, there are two issues: physical and logical.
Physical compatibility
Will computers 10 or 20 years from now be able to read the media we write things to today?
For example, if you stored something on floppy disks 20 or 30 years ago, you are now dealing with the fact that computers no longer have floppy drives. You can find an external floppy drive for 3.5-inch floppies, but if it’s much older — say, a 5.25-inch disk common at the dawn of the PC era — you’ll have a difficult time finding a way to read it. Optical drives are beginning to disappear as well.
I’m fairly confident that the USB interface, and thus USB external drives, are going to be supported for a very long time. As I write this, USB 3 is common, and USB 4 is on the horizon; yet even old USB 1 devices still work, albeit more slowly. I’m confident that 20 or 30 years from now, there will still be a USB interface into which I can plug one of today’s external drives.
Logical compatibility
Will our computers 10 or 20 years from now have logical compatibility? By “logical”, I mean the format of the information we store, and our ability to run programs to read or interpret it.
A great example is the impending death of Adobe Flash, after which software that plays Flash-based games will no longer be generally available. People wanting those programs to continue running will need to “do something” (although it’s currently unclear what that is).1
Compatibility falls into two categories:
- The format of data on disk. Will the NTFS filesystem still be readable 30 years from now? How about FAT or FAT32? One would hope both will — and indeed, I do expect they will. But historically, there are definitely storage formats that lasted for only a brief time and you’d be hard pressed to recover today.
- The format of the data. Will jpg files still be a thing 30 years from now? Will there be programs that can play mp3 files? Again, one would hope that based on the current ubiquity of those formats, there will be compatible readers for decades. But, again, digital archives are littered with file formats that are understood by no current programs at all. While recovery would theoretically be possible by re-inventing a compatible reader, it’s not a simple task.
Left unaddressed, both of these are barriers to the viability of long-term digital archives.
What I do
Clearly, technology is constantly changing. Long-term archiving might not be best thought of as a “set it and forget it” kind of thing. Every so often, it’s worth a re-visit.
And that’s pretty much what I do.
I have a strategy.
On the physical/hardware side of things, what I once had on floppies, I eventually copied to CD. Then years later, what I once had on CDs (and a handful of DVDs), I copied to external hard disks. As newer, larger hard disks become available, I occasionally combine data from older, smaller ones to newer, larger drives. The 512GB drives I once used for archival have all now been replaced by at least 1TB drives, and my most recent addition to the mix was an 8TB drive.
This is the management I referred to earlier. By periodically “upgrading” the storage used by your archives to newer technology — copying the old disks to new, say, every 10 years or so — you also sidestep issues with older hardware failing due to age or lack of availability.
It does take a little forethought and effort to organize and copy the data. (The floppies were the worst.)
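If you script the migration, you can also verify the copy rather than just trust it. Here’s a minimal sketch of what such a script might look like in Python (mine are ad-hoc scripts and batch files, not this; the drive paths are placeholders). It mirrors the old drive’s folder tree onto the new drive, then re-reads both sides and compares SHA-256 hashes.

```python
# A minimal sketch, not my actual script: copy an old archive drive to a
# new one and verify each file by re-reading both copies and comparing
# SHA-256 hashes. OLD_DRIVE and NEW_DRIVE are placeholder paths.
import hashlib
import shutil
from pathlib import Path

OLD_DRIVE = Path("E:/archive")   # the aging disk (placeholder)
NEW_DRIVE = Path("F:/archive")   # its replacement (placeholder)

def sha256(path: Path) -> str:
    """Hash a file in 1MB chunks so huge videos don't exhaust memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

for src in OLD_DRIVE.rglob("*"):
    if not src.is_file():
        continue
    dst = NEW_DRIVE / src.relative_to(OLD_DRIVE)
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)          # copy2 preserves timestamps
    if sha256(src) != sha256(dst):  # read both back to confirm the copy
        print(f"VERIFY FAILED: {src}")
```

The verification pass doubles the reading, but for a once-a-decade migration that’s cheap insurance.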
When it comes to things like the file formats of my data, I have less of a plan and more of an expectation. I expect that file formats ubiquitous today will still be readable in 50 years.
That means I save things in common file formats like .jpg, .mp3, and .pdf when I can. I would hope there will be better alternatives in the future, but I expect that because there are so many files in these formats today, they’ll always be readable, or convertible somehow, in my lifetime and beyond.
Much like ASCII text documents created 50 years ago remain readable today.
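Converting stragglers into one of those ubiquitous formats is easy to automate. A minimal sketch in Python, assuming the third-party Pillow library and a placeholder folder name (and note that JPEG is lossy, so keep the originals of anything you treat as a master):

```python
# A minimal sketch, assuming Pillow (pip install Pillow): batch-convert
# images in less common formats to the far more ubiquitous JPEG.
# The folder and the list of extensions are placeholders to adjust.
from pathlib import Path
from PIL import Image

SOURCE = Path("~/photos-to-archive").expanduser()  # placeholder folder

for src in SOURCE.rglob("*"):
    if src.suffix.lower() not in {".tif", ".tiff", ".bmp", ".webp"}:
        continue  # leave already-common formats alone
    dst = src.with_suffix(".jpg")
    # JPEG has no transparency, so flatten to RGB before saving.
    Image.open(src).convert("RGB").save(dst, "JPEG", quality=95)
    print(f"converted {src.name} -> {dst.name}")
```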
A word about backing up
“If it’s in only one place, it’s not backed up.”
The other way you protect yourself from old hardware failing is the same way you protect yourself from any hardware failing: you back up.
Make sure you have multiple copies of any data you want to preserve — ideally on different media. Don’t put all your eggs in one kind of basket.
In my case, that 8TB drive I added to my system is a backup drive. Any data added to my archives on older disks is automatically copied to the new, larger drive, and thus lives in at least two places.
Whatever strategy you choose and whatever media you use, make absolutely certain to include backups, or some other kind of redundancy, in your plan. That approach significantly minimizes the risk of choosing the wrong long-term media.
There’s one more thing, though.
The cloud
So far, I haven’t mentioned cloud storage.
It’s something you should consider.
I consider my photos my most precious data. In years past, it’d be the photo album I’d reach for on the way out of a burning house.2 Today that translates into redundancy — lots of redundancy.
I have over a terabyte of photos, including scans of photo albums pre-dating my birth, in Dropbox.3 Any time I add a photo, it’s immediately replicated — backed up — to the cloud and to several of my other machines. I could lose all of my hardware — every computer, every hard disk, every everything — and my photos would be waiting for me online.
But I’m not done. I also make a copy of my Dropbox folder outside of Dropbox4. That way, in the unlikely event that my Dropbox folder gets hacked or lost and all my files deleted, I’d still have a copy here at home.
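That copy job is nothing fancy. Mine is a simple batch file (see footnote 4); here’s a minimal sketch of the same idea in Python, with placeholder paths, that copies files only when they’re missing from the mirror or newer in Dropbox:

```python
# A minimal sketch of the nightly "copy changed files out of Dropbox"
# job from footnote 4 (mine is a batch file; this is the same idea in
# Python). DROPBOX and MIRROR are placeholder paths.
import shutil
from pathlib import Path

DROPBOX = Path("~/Dropbox").expanduser()
MIRROR = Path("/mnt/archive/dropbox-mirror")

for src in DROPBOX.rglob("*"):
    if not src.is_file():
        continue
    dst = MIRROR / src.relative_to(DROPBOX)
    # Copy only when the mirror copy is missing or out of date.
    if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
```

Note that it never deletes anything from the mirror, so a file deleted from Dropbox (accidentally or otherwise) survives here.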
The cloud can absolutely be a part of a very effective archival strategy, particularly for your most important information.
Bottom line: think about this
Honestly, long-term archival is much like backing up: the best approach is whatever approach you’ll actually take.
The difference, however, is time. When it comes to expecting to keep something for decades or longer, you’ll want to put some thought into exactly how, where, and when you store things.
Your children, your grandchildren, and perhaps even more future generations will thank you.
Footnotes & References
1: Sites like archive.org have a vested interest in the answer, and may even develop one themselves. In the interim, older versions of software might be supported in the form of emulators or virtual machines.
2: OK, ok, after making sure my wife and pets were safe.
3: Any of the major providers would do.
4: Just a simple batch file that copies all changed files from Dropbox to another location nightly.
Wow!! Leo… thanks for the awesome advice. I only back up my data on DVDs, but from now on I am going to follow your advice and back everything up twice, on two different media. Thanks
“Best Long Term Storage Media”
If you are recommending the pursuit of high-quality (archival) DVD media, you might want to mention Millennium Disc (M-Disc) media. While the blank media are single layer and not inexpensive, it has NO dye layer; no disintegration over time. Also, the burner drives manufactured by LG retail for approx. $80 USD.
You can find an internal LG-brand combo Blu-ray/M-Disc burner. I also have an LG-brand external M-Disc burner.
I have a LOT of floppies with good stuff on them that I spent over a week transferring (all data that was usable) onto DVDs. I’m glad that’s over. I’m watching and waiting for the next generation of storage media and wondering what it will be like.
There was a study done at NIST some years ago on CD quality and archival properties. Leo is right in that there is a big difference in the quality of CD/DVDs. And price is not always a good indicator (but cheap is…).
20 years????? In twenty years we will have chips implanted and simply think our internet searches or messages. Who knows what ‘backup’ will mean. Probably nothing you can put your hands on. I shouldn’t say WE. I’m an old man who’s been a professional in computers since they reached the street. I will not participate in that future.
That reminds me of the show where everybody in the world had a chip implanted in their brain at birth.
If they wanted to know something, all they had to do was access the chip. There was no need for schools or any kind of learning activities.
Then one day the central computer that provided the info blew out. No more chips worked.
Everybody who had a chip implanted was like a child who had no knowledge of anything.
I hope that’s not where we’re headed.
You point out that TEXT files will probably be readable for a very long time, which is reasonable.
But what about WORDSTAR files? They are much like text files, but with the addition (or change) of the “hi-bit” on the end of every word, to enable correct spacing, as I’ve always understood it. At any rate, it is there, so I have to reckon with it whenever searching for a certain word in WordStar files, as I am currently facing having to do, because many of the files have apparently been “corrupted” by outright sabotage. I want to try to send them to a data recovery facility, but before I even pay their deposit (non-refundable if they can’t get any data back), I want to see for myself if I can find certain key words I know should be there, if any of the muddled files still “exist”.
Of course I know I need to make CLONES of the 3-4 hard disks (all under 1TB), but a programme that can search for certain words in TEXT (and there are a few) should, I hope, allow me to see which hard disks are possibly worth sending, simply by choosing a frequent word (a name, in this case) — MINUS the very last letter. I am hoping that this method will save me several deposits if there is nothing able to be regained from one, or indeed the lot of them. (Some of the files go back to 2005 or earlier.)
In the background to all this is that the WordStar programme used to write all those vital files is v3.4, of course a DOS-based programme, which I could access and use in Win7 x32, till I was forced to upgrade to Win10. I’ve heard that Win10 is based on, or “on top of”, DOS, but other than at the Command Prompt you can’t really SEE into those files, because I’m using 64-bit Windows, which won’t run a 16-bit DOS programme. Certainly not WordStar v3.4, even if a few programmes can search for “most” of a key word, like “Notenboo” (instead of Notenboom).
As far as I am aware, there remain two options — one rather complicated (to me, at 76) and the other even more so. I believe there was a system of “Virtual DOS” which allowed one to read WordStar files, championed by a well-known sci-fi writer. I don’t recall the name of the programme or know if he still champions that, but I never actually tried it, as even then, it seemed a bit complicated.
The other is to run a “Virtual Machine” inside Win10, like Oracle’s free virtual PC application, if it still exists. But for me, operating a dual-boot system seems even more complicated in operation, because this is not a “simple” backing-up operation, but a data CHECKING operation, BEFORE backing up.
It is because it seems quite complicated to do the whole lot (cloning, then data checking, then data recovery, THEN data backing up) that I have left it, hardly getting involved, for 3-4 years. Yet what was on the original hard disks represented the fruits of several years work, which I had always HOPED to use for a book (when I had time — before the HDDs were sabotaged). And I certainly cannot now spare several thousand dollars on a possibly vain attempt to get the data back….
So I am left wondering what would be my best (and cost-effective) approach to this unique problem….
In your shoes I would, indeed, set up a virtual machine using Oracle’s VirtualBox software (I do so already for Ask Leo!). I’d probably install an older version of Windows, perhaps XP, and see what can then be run, checked, accessed, and recovered from within it. What you’re doing sounds very doable. You might also look into tools like Everything, which I believe can be used to search all the files on your (Win10) machine. Also know that Word or the OpenOffice tools should still be able to import WordStar files.
The last version of Windows to be “on top of” DOS was Windows Me. XP, Vista, 7, 8, 10 … DOS itself is no longer under there.
There are a few websites which offer to convert Wordstar files. Google “convert Wordstar files”.
I once analyzed a WordPerfect for DOS file to determine the codes to create a mail merge in a database program I wrote. I’m sure I could write a program to strip off the Wordstar formatting characters if there was enough money to make it worth my while.
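For the searching problem specifically, the high bit is easy to strip in a few lines. A minimal sketch in Python, relying on WordStar’s convention (described above) of setting bit 7 on the last letter of each word; the filename is a placeholder. Clear that bit on every byte, drop the remaining control characters, and ordinary whole-word searches work again:

```python
# A minimal sketch: approximate the plain text of a WordStar document by
# clearing the high bit of every byte (WordStar sets bit 7 on the final
# letter of each word) and keeping only printable characters.
from pathlib import Path

def wordstar_to_text(path: Path) -> str:
    chars = []
    for b in path.read_bytes():
        b &= 0x7F  # clear the "hi-bit"
        if b in (0x09, 0x0A, 0x0D) or 0x20 <= b < 0x7F:
            chars.append(chr(b))  # keep tabs, newlines, printable text
    return "".join(chars)

text = wordstar_to_text(Path("chapter1.ws"))  # placeholder filename
print("Notenboom" in text)  # whole-word searches now work
```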
Archaeologists today reconstruct documents in unknown languages. Archaeologists in the future will do similarly for the documents of our day. Websites like Archive.org are working to preserve documents from the past so they will be available in the present and in the future.
In short… I tend to default to standard hard drives as the best all-around option, as I think the easiest/least time-consuming way is to have two copies of your data on two different hard drives. This is what I consider a bare minimum for data backup, and it gives one a reasonable level of protection against data loss. Someone who fails to follow that basic advice (or thereabouts) is just asking for trouble.
But with that said, I do prefer to have some DVD recordable media for a limited amount of higher-importance data backup, like family videos/photos. If you’re using good media, it should last at least 10+ years in most cases, I would imagine, and probably more in the 10-20+ year range (I have some around 10 years old or older now, and they still worked the last I checked). I suggest Verbatim DVD recordable media, as it’s probably the sweet spot of the price/quality ratio (Verbatim has been a safe bet for recordable DVDs in general over the years). Taiyo Yuden is good media too, but it’s a bit pricier and probably a bit harder to find. I have heard of M-Disc media, and it’s claimed to be noticeably better than standard DVD recordable media (the claim is 1,000 years, which personally I doubt; but even if it’s 30-40 years or so, that will be plenty for most people, I suspect). It’s hard to say how it fares, though, since it’s not been around as long, it seems to be noticeably more expensive, and you need special DVD burners to record it. So, all in all, I just stick to my more standard Verbatim (or Taiyo Yuden) DVD recordable stuff. M-Disc might be an option for some people who don’t mind paying a bit more on the chance that it’s noticeably better than typical DVD recordable media, but personally I would rather invest in additional hard drives, as I feel standard Verbatim media is reliable enough, especially when paired with a couple of HDDs for additional backup. For icing on the cake, have an additional copy of the data on another set of DVD media (say, Taiyo Yuden), as the odds of all four of those failing at the same time seem slim.
As for SSDs… while they are nice for speed, I have a feeling that when they fail, it will be sudden, whereas with a standard hard drive I suspect one is more likely to get warning signs before outright failure. Personally, I would not trust them for long-term data storage.
Like SSDs, I would not trust backing up anything too important to SD cards, USB thumb drives, and the like. They might last a rather long time, but if they fail, I suspect it might take out everything. I think they are probably reasonably reliable for short-term storage, but for anything too important, I definitely would not rely on them as my sole backup option.
I think storing some data online can be a nice alternative, but it’s definitely not something I would rely on as my only backup copy. I see it more as a bonus on top of more proper methods of data backup, like two HDDs, etc.
But I guess even if, say, DVD recordable media is reliable… I figure the biggest unknown is whether drives that read DVD media will still be around 20+ years from now. Either way, it’s probably still a viable option for the next 10+ years at least, as I can’t see DVD readers being too hard to find 10 years from now, especially if the SATA connection on computer motherboards does not go away for the foreseeable future. In 20 years, though, who knows whether DVD readers will be harder to find. As for Blu-ray recordable media, I never got into it. While it offers quite a bit more storage space than regular DVD, I suspect it’s probably more susceptible to failure, since it’s cramming more data into a smaller space. I would imagine CD/DVD is a bit more proven too, assuming you’ve got decent-quality media, and Blu-ray recordable is a bit pricey too; I would probably rather invest that money in additional hard drive storage.
I also like Leo’s thoughts on the USB interface, which doesn’t seem to be going away for the foreseeable future. Hopefully basic SATA connections follow the same path for at least another 10-20 years, because while I get that some things change in computer tech, they should try to keep some level of standard so we don’t have to alter our backup methods TOO much. So far, at least, SATA has been hanging around a while; it might not be USB-level common, but it’s probably not far behind. Computers have been mainstream since around the year 2000, SATA has been around for a large portion of that time, and it probably won’t be going away for the foreseeable future, at least not on desktop computers.
Bottom line… my general ‘go-to’ backup tends to be a minimum of two copies of my data on two different hard drives (it’s also the most convenient/time-efficient). On top of that, I usually try to back up the higher-importance data (of which I have a more limited amount) to Verbatim and Taiyo Yuden DVD recordable media. I do admit I slack a bit on backing up to DVD, since it’s more time-consuming (I don’t have all of my family vids/pics on DVD, but a large portion of them currently). Backing up to two different HDDs is simple enough, gives one reasonable protection against data loss, and takes a minimal amount of one’s time. So, short of a house fire and the like, it’s unlikely I would lose data with that method of backup, because quite a few things would all have to fail at once, which seems unlikely.
Hi Leo,
Good article. As you say, there’s no one technology based solution, but a strategy that addresses the storage aspects and longevity as well as compatibility of the data and the device.
What process did you use for digitising all your old photo albums?
I’ve been contemplating this daunting task for a while, having my own collection as well as those of my parents and a couple of albums that pre-date them. My experience with scanning photos is that it is a time consuming process, both in the physical handling of the media and the process to get the best settings for each photo.
I got a scanner with a negative scanning attachment. I scanned all my negatives and slides. A scanned negative yields a much higher quality digital image than scanning a photo. Yes, it was time consuming but I did them a little at a time over the span of a couple of months. It would have been faster if I’d gotten a dedicated negative scanner, but this was a one time thing as by the time I scanned them, I no longer used film.
I have an Epson Perfection V500 Photo flatbed scanner. It’s slow, but slow and steady absolutely wins this race. I’ve scanned all my photo albums but one (its pages are too big for the scanner). It also does slides, so I’ve been slowly scanning a couple thousand slides, four at a time. It also does negatives, and as I think Mark points out elsewhere, the result is actually pretty darned good. They’re harder to stage for the scan, but the results are often worth it.
Ah, yes — archival woes. :)
Years ago, I wanted to access some files from our old Win95 computer that had been stored onto a DVD. Popped said DVD into our (then) WinXP system, and… nothing. The disc was unreadable. My jaw dropped: that was a contingency I’d never even considered; I thought optical discs were forever! And especially so, since the DVD had lain in its case, undisturbed, in a fireproof lockbox under my bed for years.
The guru at our local computer store managed to recover all but one file (yay!), so there was inconvenience, and small expense, but no loss.
Today, I still keep that data in that same lockbox — but on a gold-colored, archival-grade DVD, an SD card, and a flashdrive!
What you say about “logical compatibility” rings true, too: I got hit by that once when I found that one of my Iomega Zip disks had been compressed using something called “DriveSpace,” which Win95 understood, but which no operating system out now does. (The Zip disks themselves were almost a physical-incompatibility problem all on their own!)
Perhaps the ultimate and worst compatibility mismatch I’ve ever run into was getting some text files originally written on my Commodore 128 (and saved onto a 5.25″ floppy disk via a Commodore 1571 in the 1980s) onto our 64-bit Win7 system! It wasn’t just the problem of transfer; the files needed converting, too, since PETSCII (the ASCII variant used by Commodore computers) is very different from regular ASCII. Now, that was an adventure (but, just so you know, although it required some finagling and some ingenuity, I did eventually succeed)! :)
I hope you have copies of those files in (a) place(s) far away from your fireproof safe. As Leo says, “If your files are only in one place, they’re not backed up.” This applies to a lesser degree to backups stored in the same place. You can encrypt them and keep them on a cloud server such as OneDrive or Dropbox.
Hi Leo,
Thanks for the article. One of the things I find that helps me is to write down a plan like a backup and archival strategy and then review it periodically with regards to content, location, media, format etc. This tends to show up things that you may have forgotten about or seems crazy now and just the process of documenting it makes you think about it seriously.
There is one topic that is making me think, however. These archives should be protected for future generations, especially photos, for example, but there may also be sensitive data, like password vaults, that you would like to make available to your children without giving out master passwords or encryption keys now, since those may be superseded (or lost) by the time they are needed. It would be neat to be able to set up some kind of fingerprint or retina-scan access for the long term. I don’t know what type of technology exists in this area, but a fingerprint sounds pretty long-term. Do you have any suggestions about this… a kind of IT inheritance planning!
One way to give out the master password is to break the password into a few pieces and give one portion to different relatives or friends only to be shared after you pass on. For example divide your password into two halves or three thirds and give each portion to two or three people (backups are important).
I don’t know how you could implement your biometric solution as even if it would be possible to set a date for it to be activated, you’d have no idea which date.
I believe encryption remains the way to go. If there’s a concern, you put a long password on it and then give two people each half the password. Only when they agree it’s the right thing to do would they then gain entry.
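A slightly stronger variant of the split-password idea, sketched below in Python as an illustration (not a recommendation of specific tooling): instead of literal halves, where each person learns half the secret, XOR the password with random bytes and hand out the two results. Neither share reveals anything alone, and combining them reconstructs the password exactly.

```python
# A minimal sketch of two-way secret splitting with XOR: neither share
# reveals anything about the password by itself; XOR-ing the two shares
# back together recovers it exactly.
import secrets

def split(password: str) -> tuple[bytes, bytes]:
    data = password.encode("utf-8")
    pad = secrets.token_bytes(len(data))  # one share is pure randomness
    return pad, bytes(a ^ b for a, b in zip(pad, data))

def join(share1: bytes, share2: bytes) -> str:
    return bytes(a ^ b for a, b in zip(share1, share2)).decode("utf-8")

s1, s2 = split("correct horse battery staple")
print("share 1:", s1.hex())  # give this hex string to one person
print("share 2:", s2.hex())  # and this one to the other
assert join(s1, s2) == "correct horse battery staple"
```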
LastPass (& possibly other password management plans) give the option of nominating a 2nd party who can be allowed to access your account.
You nominate a trusted relative/ friend.
If they apply to LP for access, LP sends you an e-mail.
If you don’t respond within a preset period, LP gives access to the nominated person.
Protection for you if you die or suffer a stroke. Be sure you trust the nominee if you decide to take a month long hiking expedition in Tibet.
What do you think about a hot swappable HDD bay? Is this a good way to back files to a hard drive?
You’d need to be able to hook that second drive up via USB and clone your system to it. A drawback to that is if you change any files and/or accidentally erase a file and clone your system drive, the older version of the changed files and the deleted files would be lost. System image backups are the most reliable.
That’s more about convenience — sure, if it works, fine, but an external hard drive is just as convenient.
Your excellent advice, “If it’s in only one place, it’s not backed up” should always be interpreted to define the LOCATION of your backups as a “place.” So, no matter how many different backup copies you have, on various media, if all of them are physically in your home or office they’re in one place and vulnerable to fire, theft, or lightning (if always “on”). An off-site copy is essential and I’m glad you set a good example, even backing up your Dropbox files! Offsite access should use 2-factor authentication. And, for all who are truly focused on “data security” (broadly), “lock” your file at all three credit bureaus to prevent credit fraud. You can unlock temporarily to allow a single entity access if you apply for a mortgage or vehicle loan.
In Windows 10, I use the built-in backup utility to back up my data to an external hard drive (2TB, attached via USB 3 for better performance). I also have all my data files synced to OneDrive (mostly my Documents and my Pictures folders), and for extra measure, I have an internal (encrypted) SSHD drive onto which I manually copy any files I want to keep. On the first of the month, I review the content of the internal drive and add files from my Documents or Pictures folders, or remove files I determine I no longer need readily available.
Both drives contain 2 partitions each, one NTFS (for Windows) and the other ext4 (for GNU/Linux). In my GNU/Linux OSes, I use Timeshift to back up my system to the external drive with 6 daily images, 3 weekly images, and 1 monthly image. I also sync my user account’s Documents and Pictures directories to Google Drive using rclone (a bit of a learning curve, but once it is set up, syncing is automatic). I also installed a OneDrive client I found on GitHub so I can access my Windows files from GNU/Linux. If Windows ever goes away, I will at least have my files synced to my GNU/Linux installation.
Recently I learned that Google provides GDrive for GNU/Linux. I will research this and perhaps switch to it.
These are what I do to keep my data safe and available to me. I don’t know if this information will help anyone else, but I hope it will,
Ernie
I would suggest people also print out a selection of their best and most memorable pictures. You can make beautiful photo books at relatively low cost using services like Picaboo. I can hold in my hands pictures of my grandparents as children. No digital media has lasted that long.
Getting beyond the technical discussion, a question is what happens to all the data stored for perpetuity? Companies have legal requirements for long term storage and government and universities archive information for historic legacy. But what about individuals? With our capability to store thousands, perhaps millions, of images, videos, and music files, how many people really organize them for easy retrieval and ever go back and look at them? Yes, maybe you can save them for your children and grandchildren, but are they going to look at them? I suppose we’re saving stuff for some archeologist in the distant future.
1000 years from now, archeologists are going to determine that the major religion in the late 20th and early 21st centuries was cat worship.
Choosing the right long-term storage media is part of protecting your data from degradation or other forms of loss. But there’s more to it than just media. What about us? Will we really be around 50 years from now? I seriously doubt that I will, since I am past 80. Maybe it’s not so important to save data. It just may be more important to read and enjoy data, discard it, and then get on with living today, and not be so concerned with saving for the future.
It all depends on who you leave behind and if it will matter to them. I know I still feel the loss of photographs taken by my grandfather that were lost in a fire 60 years ago.
“Any data added to my archives on older disks is automatically copied to the new, larger drive, and thus lives in at least two places.”
This sounds to me like the drives are permanently connected to the system. I don’t think that is a good idea as viruses become more sophisticated. IMO, the backup media should be disconnected when not in the process of backing up or retrieving data.
I would also like to know what software you use to implement your copy and distribute strategy?
Two things: that machine happens to be running Linux, so it’s less likely to suffer from malware. Particularly since I don’t “use” the machine in any traditional sense other than have disks connected to it. The tools I use are simply scripts/batch files that I’ve written.
Hi, thanks for the article. However, I do not agree that automatic immediate backup (file mirroring to whatever media) is a good option. Or maybe… it may be good, but it depends on what you’re trying to protect against. If you want to protect against corruption of your primary disk, then yes, an immediate copy (best option: RAID) is the way to go. But if you want to be able to “go back in time” and restore accidentally deleted files, or recover files encrypted by ransomware, then your backup is worthless, because once you log in and try to restore your data, your backup will also contain the unwanted changes.
So either use an automatic backup with a “snapshot” option, which would let you see previous versions of all changed files, or use simple file copying, but only from time to time.
And if you really want a complete backup strategy, then you should move one copy of your data to some remote location, to protect against fire, flood, robbery, or whatever else might physically affect your home.
If you want to be able to “go back in time” and restore accidentally deleted files, or recover files encrypted with ransomware, you can go back to a backup taken before that file was deleted. It’s good to keep something like the past 12 months of system image backups and the last full month of incremental backups. In addition, I copy all of my user created files to a backup which is added to and never deleted. I keep getting bigger drives as they come out and copy everything to those and keep the older ones as archives. I’ve only once had to restore from my archives, but I’m glad I have them.
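The “snapshot” option doesn’t require special software; dated folders plus hard links get you most of the way. A minimal sketch in Python, with placeholder paths, assuming the backup drive’s filesystem supports hard links (NTFS and ext4 do; FAT does not): each run creates a folder named for today’s date, hard-linking files unchanged since the previous snapshot and copying only what changed.

```python
# A minimal sketch of snapshot-style backups: one dated folder per run,
# where unchanged files are hard links into the previous snapshot (almost
# free) and only new or changed files are actually copied.
import os
import shutil
from datetime import date
from pathlib import Path

SOURCE = Path("~/Documents").expanduser()  # placeholder paths
SNAPS = Path("/mnt/backup/snapshots")

# ISO dates sort lexicographically, so max() finds the newest snapshot.
prev = max(SNAPS.iterdir(), default=None) if SNAPS.exists() else None
today = SNAPS / date.today().isoformat()

for src in SOURCE.rglob("*"):
    if not src.is_file():
        continue
    rel = src.relative_to(SOURCE)
    dst = today / rel
    if dst.exists():
        continue  # already handled by an earlier run today
    dst.parent.mkdir(parents=True, exist_ok=True)
    old = prev / rel if prev else None
    if old and old.exists() and old.stat().st_mtime == src.stat().st_mtime:
        os.link(old, dst)  # unchanged: hard link costs almost no space
    else:
        shutil.copy2(src, dst)  # new or changed: a real copy
```

Deleting a file from the source simply means it stops appearing in future snapshots; every older snapshot still has it.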
How I hate these easy-reading articles when I’m looking for actual information about safer filesystems for long-term storage.
For file formats, if you want your data readable all the time, you’d better store it in raw, uncompressed, unformatted file formats like WAV, BMP, plain TXT, and raw video. These formats are mostly plain ones and zeros; even if no program recognizes them, they’re still accessible as long as you can dump their contents to a terminal, and if a terminal can’t do that, you’re not running a functional OS. On the other hand, the compressed file formats popular today may be transparent on current media but can be completely obscure in the future once the format becomes deprecated, because compressed formats use complex mechanisms to store the data in order to save disk space. So raw formats take more disk space, but their simplicity is a guarantee of future readability.
Compressed formats like .zip, .mp3, .mp4 and .jpeg, for example, are so ubiquitous that they will be readable for many years to come even if they are replaced as standard by newer formats.
KOOO-AAAL,
Thanks a lot; now I know. I can tell you know what you are talking about from reading your article. I read a lot about this, and everything I saw was M-Disc, M-Disc, M-Disc; they say it lasts for a millennium. Right, but 4.9GB isn’t much. For super-important information, will an HDD take sterling care of the data, like an M-Disc, for at least 15 years without the chance of corruption? If not, I guess I have to use M-Discs.
By the way, I have a Dell OptiPlex 3060; I can’t find out whether its drive is an M-Disc burner or not. Do you think this DVD burner will burn M-Discs perfectly? Thanks a lot.
Having read through your article, it turns out that over the past 25 yrs my backups have evolved almost exactly like yours. Floppy > TR3 tape > CD/DVD > dedicated internal HDD (1st line backup) with external USB drives (2nd line backup).
The relatively new optical disc type ‘M-Disc’ is promoted as an ideal lifetime storage medium, and may be ideal for long term storage of vital data off-site – or, at least, a location away from the computer.
Info: SyncBackSE has proven ideal for my needs, including ‘grouping’, automation, etc., and totally reliable. I use Macrium Reflect to image drive C:. It’s automated to update images on the two separate internal physical drives D: and E: (each has 1 full plus 2 ‘incrementals forever’) on Wednesdays 1 & 3 / 2 & 4 respectively.
Amazon AWS has servers allocated by geographical regions. So in case California falls into the ocean, the East Coast servers still have a copy. Or use Google and Amazon for backups, in case one vendor makes a mistake and deletes your account in every region, then the other vendor will still have copies.
pjkPA
Lots of good info on back up here.
One option I did not see for photos is to just print them out on good photo paper.
I have thousands of photos that are over 40 years old and look as good as they did 40 years ago.
I have also converted many of these to digital… but have found that I look at these physical pics much more often than going through pics on a computer. I also have many in my safe…
I do not have confidence that any digital tech today will be readily available in 50 years… but think these pics will still be around.
This is a great topic, and it was also covered by Jim Salter at Ars Technica. He describes bit flipping, or bit rot, where a 0 gets turned into a 1 or vice versa on a magnetic drive or SSD. This may not be a problem for archival CDs/DVDs; I am not sure about that. One incorrect bit in a photo can ruin the entire photo, as Jim shows with a picture of his son. His solution is to use the ZFS file system, which has active error correction. I didn’t follow this tech 100%, but I think the way it works is that you have 3 or more drives with mirrored and parity data that the computer and file system compare, and when there is an error (caused by radiation from the sun or from the ground), ZFS’s active, automatic error correction fixes it. This is, I think, the best way to archive digital media. ZFS was first, but Btrfs is now working, and bcachefs will be working reliably at some point.
I am concerned that some of the tech described here may be a little behind what we have available now, and I’d hate for people who don’t look at the details of this newer tech to lose their files, like folks did with Polaroid photos back in the day. I think we should have 3 drives actively correcting errors, but that RAID-type ZFS system of 3 or more drives should be considered only ONE copy of the data; under the 3-2-1 rule, you need 2 more copies, with one or more off-site, hopefully each in its own ZFS zpool with mirrors and auto error correction. The problem with this is that you have 9 drives for 3 ZFS zpools, if I am using the correct terminology. Thinking some more, the drives in different locations, such as off-site, could be part of the same setup using ZFS send, or WireGuard or some sort of VPN. So: lots of drives, and a setup most people will not take the time to learn, which leaves The Cloud, which I assume also uses ZFS-style error correction, because their reputation depends on storing data without corrupting it. People who are not going to set up a robust and expensive system would benefit from Google Photos, Google Drive, iCloud, and Dropbox (Linode, AWS, DigitalOcean, SpiderOak, etc.), but they should not count on just one cloud service for their important data, files, and photos. My other concern is privacy, so I am not going to use The Cloud myself, even if I encrypt my own files and use their encryption in addition, as it eventually becomes crackable with quantum computing, etc. I feel sorry for the people who will not see these posts and who will eventually lose all their photos. It will be a great loss to them and their families, but I am glad the big tech companies are doing cloud services, which I believe will save lots of families from losing their photos, more so than saving photos on CDs and external drives. Family photos, videos, and creative endeavors in digital media are in many cases extremely valuable. Thanks for this topic; it will help people realize this and preserve their family history and creative projects.
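For those without ZFS, a checksum manifest provides the detection half of that story on any filesystem. A minimal sketch in Python, with placeholder paths: the first run records a SHA-256 hash for every file in the archive; later runs re-hash everything and flag files whose contents have silently changed.

```python
# A minimal sketch of a poor-man's "scrub" for filesystems without
# ZFS-style checksumming: record a SHA-256 manifest once, then re-run
# periodically to detect files that have silently changed (bit rot).
import hashlib
import json
from pathlib import Path

ARCHIVE = Path("/mnt/archive")  # placeholder paths
MANIFEST = ARCHIVE / "manifest.json"

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

current = {str(p.relative_to(ARCHIVE)): sha256(p)
           for p in ARCHIVE.rglob("*") if p.is_file() and p != MANIFEST}

if MANIFEST.exists():
    recorded = json.loads(MANIFEST.read_text())
    for name, digest in recorded.items():
        if name in current and current[name] != digest:
            print(f"CORRUPTION SUSPECTED: {name}")
else:
    MANIFEST.write_text(json.dumps(current, indent=2))
```

Unlike ZFS, this only detects damage; the repair comes from your second copy. Files added after the manifest was written aren’t covered until you regenerate it.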
I’m archiving five generations of black-and-white photos from the earliest days, color photos, 35mm slides, and digital photos for family sharing. Two years, multiple scanners, and maybe a quarter of the way into the project, I am already approaching 8TB of non-RAID storage, as I decided to use the highest quality and resolution for a ‘master’ copy. We have a closet stacked more than halfway to the ceiling with moving boxes of every variety: albums, film-processor booklets, slide-processor boxes, slide storage boxes, rotary trays, and loaded USB drives. I’ve been reviewing information on the recommended means of long-term storage while distributing copies to various family entities. Then the additional problem is what to do with the originals.
My ultimate solution will likely be distributing the copies to individual families with encouragement to maintain two copies and refresh one from the other every couple years, while regularly upgrading to current technology. I may be purchasing drives by the dozen before long.
A secondary drive which automatically backs up everything on the main drive is not really a backup. It’s a copy at best. Any malware on the first drive will auto-replicate to the second.
One very important piece of advice is to keep your originals. If you have scanned all your old photo albums and transferred all your old 8mm movies to digital, be sure to keep all your originals in a safe place. That way, if all your digital media and copies get corrupted or lost, you have the ability to re-scan.
Be sure to protect your precious backups. Keep your off-line hard drives, dvds, M-disks, and thumb drives in plastic tubs or metal tins stored in a cool place. Degradation of media is dependent on temperature, so protect it from temperature extremes. Sealed containers will also protect your media from flood waters, dust and other environmental hazards.
Also , be sure to keep a copy of your data off-site. If your house gets cleaned out by burglars, it’s not going to help having multiple backups if they are all in your closet. Ask a friend or family member to keep a copy for you. A small plastic tub with a hard drive in it takes up very little space.
Hi Leo, it seems to me that data storage development has always been based on packing more and more data into smaller and cheaper spaces rather than on developing a specific long-term (hundreds of years at least) medium. A really long-lived resource might not store as much data but we could at least be confident that our most prized data (literature, knowledge) would be safely stored and accessible for generations to come. What is the current leading technology purely focussed on longevity of data storage?
Survivorship bias. How many millions of pots have been destroyed, with the information they may have contained being lost?
That’s a problem with how the media is stored. A buried CD or HDD wouldn’t have lasted as long as pottery. Pots have lasted much longer than I expect any electronic media to last under similar conditions.
I find backing up your stuff incredibly arduous, and the ideal solution for many less tech-savvy people is to go back to the beginning of the Civil War for the answer. It seems ironic that in many cases the longest-lasting copy of what most families want to save, as one person already suggested, is to just have it put on paper. Yes, paper. It lasts longer than any of the modern or so-called advanced media if stored properly, which, by the way, is very simple. It’s absolutely ridiculous that there is not a simple, elegant solution to this ubiquitous problem with all our modern technology. Something tells me the elegant, practical solution doesn’t make the tech companies enough money, so for them the problem is the solution, at least for their bottom line.
Paper works as long-term storage when there are several copies of it; for example, books and newspapers. Leo’s constant harping on backups goes back to the beginning of writing. We have a lot of ancient texts preserved because monks and scribes worked diligently making copies. Then the Renaissance came, with the invention of the printing press, and that led to better preservation of texts. I see the best long-term storage as continual copying of existing media. Sites like the Internet Archive and Google are doing that with everything on the Web.
With recent political developments, I am getting concerned about the possibility of a high-altitude nuclear electromagnetic pulse device (EMP bomb) being used, either deliberately or by accident. I heard that Cold War Russian warplanes used (vacuum-tube) valve-based avionics because semiconductors were considered too susceptible to EMP.
Does anybody know if an EMP bomb would affect a hard drive or semiconductor-based memory (if not connected to the grid)? Would optical media be relatively resistant to EMP damage?
Adding to my comment of December 29, 2020: I also addressed the “logical compatibility” problem by saving to the same DVD as the data files the Windows programs (both 32- and 64-bit) that can read them, including those programs’ installation files and license keys. Hopefully, it’ll be enough to still be able to read the data, even if the programs are no longer around.
In other words, guys, it ain’t enough to store just your vinyl record albums — you need to store a record player, too. :o
Or be prepared to re-create or re-invent a record player from scratch. (This is what I suspect will happen with obscure file formats without current programs to understand them. If there’s enough of a need, new software will be created. If not enough of a need, then… not.)
I recorded a lot of video on Sony U-matic in the ’80s, then transferred most of that media to Sony DVCAM while I still had a U-matic machine that worked, then transferred to Sony XDCAM. Now I’m at a loss as to what to use next, as the DVCAM machines fail to power up. Just today I managed to find a DVCAM engineer still alive, but he was reluctant to take on my DVCAM 50 machine due to the lack of parts availability for these machines. At the end of the day, the only reliable way things will keep for 100 years is a photograph and a book. Hey ho.