Will you be able to read them two decades from now? Will you want to?
Long-term viability of proprietary formats is an issue that transcends the subjects of backing up and even file formats. As hardware evolves, we run into it frequently. As one simple example, I have floppy disks in my basement that I currently have no way to read.
The concern is that the same thing will happen in software, and specifically backup software. Will there be software a decade or two from now that will be able to access and read the backup images you've created today?
Proprietary backup formats
Most good backup tools use proprietary data formats to efficiently back up while simultaneously providing the features of the tool. That's fine for backing up, where the need to access the file decreases rapidly over time. For longer-term archival, using different, simpler tools is a hedge against proprietary formats going away.
Proprietary
Proprietary simply means that the format of data stored within a file is not public or openly available, and perhaps even considered corporate intellectual property. A backup image made with tool X can be read only by tool X, or, presumably, its subsequent versions. If the company goes out of business or stops producing software that understands that particular backup format, you may be left with no current software capable of understanding and/or accessing it.
Most of the time, this is fine. The companies whose software we choose to use typically outlast our personal use of their tools[1] or our personal need for the information created using their tools.
Backups are an interesting conundrum, since they are often a convenient way to not only back up something short term, but keep things longer as well. That puts us at interesting risk.
Open
One common solution is to avoid proprietary formats completely. This can be difficult, as not all the tools we might want to use support open formats.
And sometimes things change. ZIP used to be a proprietary format. PDF was as well. Now they're present in a wide variety of tools from any number of vendors and even in the operating system itself. They're formats that, for a variety of reasons, will be around for a very, very long time. I have no doubt that a PDF of a document I create today will be readable 50 years from now.
Other common formats are likely to either be open, become open, or become so ubiquitous and important that one way or another, there will always be a way to understand them. Microsoft Excel's XLSX format is one such candidate. Regardless of where it is on the spectrum, I'm equally convinced that the spreadsheet I create today will be accessible long after I, Windows, and even Excel itself have passed on.
Backup formats may not fall into that category. I'd be surprised if the backup image I create today will be easily accessible 50 years from now.
If that's a goal, then a slight change in thinking is called for.
Backups: short term
Backups -- specifically image backups -- are most useful for decreasing short-term risk. By short term, I mean days, weeks, or months. (See How Long Should I Keep Backups? for specific advice on how long to keep 'em.)
Backups reduce the risk from malware, accidental data loss, hardware failure, data corruption, failed updates, and more. These all fall into the category of things you'll notice and/or fix pretty quickly.
Image backups are also comprehensive in that they contain everything: the operating system, installed programs, settings, and, of course, your data.
Longer term, however, you may not need or even want all of that data. That's where I take a slightly different approach.
Archives: for the long term
I don't expect to need my current version of Windows, my settings, or even my installed programs several years from now. What I do want to have on hand, however, is my data: things like my documents, my photos, my videos, and whatever else makes sense for my world. Perhaps the installation programs for some of the software I use. That's what matters, long term.
So I take a different approach. I don't use a backup program and its proprietary format for long-term archives at all. I just copy files. In some cases, I might collect them into a .zip file, but as I said above, that's a format I expect to be around a lot longer than I will.
Note this is for data I want to keep in readable form for years -- think decades. I'm not suggesting this method for backing up, because you won't have all the information you need to recover from the risks listed above.
I happen to have the process automated with scripts running nightly, but honestly, all it really takes for a more normal person would be to periodically copy all files of import to a different location, such as an external drive. Even putting them online in the cloud might suffice. That's where my terabyte of photos lives, and as a side effect, it is also replicated to several other machines.
The late Karen Kenworthy also had a popular program -- Karen's Replicator -- that could perhaps more easily automate the process. There are also many alternatives.
The key here is that archiving uses simpler tools and techniques than backing up, thus protecting you from possible loss of access due to a proprietary tool.
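The nightly scripts mentioned above aren't shown, so what follows is only a minimal sketch of the "just copy files" idea, assuming Python 3.9 or later is available. The function name `archive_copy` and the example folders are hypothetical. It treats a file as unchanged when the copy already at the destination has the same size and a modification time at least as new, which is a simple heuristic rather than a guarantee.

```python
import shutil
from pathlib import Path


def archive_copy(source: Path, destination: Path) -> list[Path]:
    """Copy files from source to destination, skipping files that look unchanged.

    A file is skipped when the destination copy exists, has the same size,
    and has a modification time at least as new as the source file.
    Returns the destination paths that were actually written.
    """
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue  # directories are created on demand below
        dst = destination / src.relative_to(source)
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if s.st_size == d.st_size and d.st_mtime >= s.st_mtime:
                continue  # looks unchanged; leave the existing copy alone
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also carries over timestamps
        copied.append(dst)
    return copied
```

A call such as `archive_copy(Path("C:/Users/me/Documents"), Path("E:/Archive/Documents"))` (both paths are placeholders) would leave plain, directly readable files on the external drive. Nothing is ever deleted from the destination, which suits an archive better than a mirror would.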
Hardware longevity
Whenever I talk about archiving or keeping data long term, the issue of hardware longevity and eventual incompatibility comes up. Indeed, What’s the Best Long-Term Storage Media? Tips to Avoid Losing Data in Your Lifetime is one of my most popular YouTube videos. (Its companion article is here.)
As I said earlier, hardware we might have used for what we expected to be long-term storage may no longer be available in the years and decades ahead.
My approach and my advice is to not assume anything. Put another way, assume that whatever you're using today will be inaccessible 20 years from now. Instead, routinely migrate forward.
- I had all my data on CDs. I "migrated forward" by copying them to hard disks.
- As I got newer, larger, hard disks, I "migrated forward" by copying from the old disks to new.
- As I get new technology in the future, I'll "migrate forward" to whatever the current ubiquitous standard happens to be.
- I'll repeat the cycle as long as I can. Usually every few years.
Interestingly, this is exactly what cloud service providers are doing transparently behind the scenes. You can bet that Dropbox, OneDrive, and others, are constantly, slowly, rolling their hardware forward with new technology. The hard disks they used even just 10 years ago are likely nowhere to be found.
This will make sure the files you care about will be available long into the future.
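When migrating forward, it can be reassuring to confirm that the new copy really matches the old one before retiring the old media. The article doesn't describe any verification step, so this is purely an illustrative sketch, assuming Python 3.9 or later. The names `file_digest` and `verify_migration` are hypothetical. It compares SHA-256 hashes of every file under the old location against the file at the same relative path under the new one.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_migration(old_root: Path, new_root: Path) -> list[str]:
    """List files under old_root that are missing or different under new_root."""
    problems = []
    for old in sorted(old_root.rglob("*")):
        if not old.is_file():
            continue
        rel = old.relative_to(old_root)
        new = new_root / rel
        if not new.is_file():
            problems.append(f"missing: {rel}")
        elif file_digest(old) != file_digest(new):
            problems.append(f"differs: {rel}")
    return problems
```

An empty list means every file on the old media has an identical counterpart on the new media. Extra files that exist only on the new side are deliberately ignored.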
Do this
There is no perfect solution, but understanding the issues ahead of time can dramatically reduce the risk of long-term loss.
The key takeaway here is to think about long-term storage differently. Don't think of it as backups, but instead as archives with a different set of requirements, risks, and approaches.
And yes, regardless of where your archives are stored, make sure they're backed up. If it's in only one location, then it's not backed up.
Footnotes & References
1: Admittedly, not always.
“I have floppy disks in my basement that I currently have no way to read.”
Actually, you can. Amazon has floppy disk readers for as low as $10. It’s more likely, there’s nothing useful on those floppies that you don’t already have backed up somewhere else.
I say that mainly for people who might have a stack of floppies. Of course, there’s no guarantee those floppies will still be readable.
Ahem. Leo didn’t say what kind of disks he had. What if he’s referring to 5.25″ floppies? From a Commodore-64, even? (Although IBM 5.25″ floppies would be quite bad enough!)
“Think of ALL the possibilities, before settling down to enjoy yourselves.” –Eeyore to Piglet. :)
Current backup software won’t be available some time in the future, but if you retain a copy of the backup’s rescue disc along with the drive with the backup, you can run it and access your data. In a far enough future, the BIOS or whatever is used to load the OS might not be able to load the OS by then. Solution: back your data up in its original format and hope those disks will still be readable. Chances are they will as there are hundreds of billions of drives around today.
Ran into a designed-in problem a while back. A backup software program (company no longer in business) wrote an image backup for me. At the time I had a paid subscription, which allowed me to use their “high-compression” method. Fast forward a year, and I am no longer using their software, and my subscription has lapsed. Went to load the old backup, and surprise – the “free” version will not READ the image backup. I had to buy a subscription, just in order to retrieve a copy of some data off that backup.
I was hoping the article would discuss using a generic format like ISO to store backups.
Ideally there would be a software with a scheduling feature that could create an ISO of an entire drive.
There are plenty of programs that can burn the ISO to a drive.
ISO is a horribly inefficient format for backing up. There’s no compression, I believe, among other issues, so you might as well just copy the files rather than bundle them into an ISO.
Anyway, I’m not aware of any backup software that writes to ISOs.
I use a program called “FreeFileSync” As the name implies, it’s free and synchronizes files. Available for Windows, MacOS and Linux. It solves the problem of copying large amounts of data from one media to another by only replacing those files that have changed since the last backup, resulting in a rapid backup. As a true copy, the backed up files are the same as the source so they can be read by the current OS.
Here’s the Wikipedia article:
https://en.wikipedia.org/wiki/FreeFileSync
and the website:
https://freefilesync.org/
Am I incorrect in thinking ToDo backup uses standard formatting? I can read the backup directly, using Windows File Explorer….(Or am I deluding myself?)
ToDo also uses a proprietary format. Just like Macrium, if you have the program installed they include tools that allow you to mount a backup and view its contents via File Explorer.
Hello,
I fully agree with your vision on backups.
I therefore use the program Second Copy 2k since.
Regards from the Netherlands
but the data stored on the floppy disks may be in a proprietary format. I had a document scanning program decades ago that did a great job but the format of the documents can no longer be read (and I haven’t had a real strong need for them to really hunt that down).
Yes floppy drives are available. I use a portable one to copy MIDI files from my wife’s Clavinova. The newer ones use USB sticks but Yamaha says that they cannot be retrofitted. There is at least one company that makes an SD card drive that looks to the equipment like a floppy. it is designed for old machine tools that are too expensive to scrap but want the ability to have a better transfer format.
I am still using SyncToy Version 2.1.0.0 Built 10/19/2009 4:04:38 AM.
As you can see from the build date it appeared on the scene some long time ago.
It works like Karen's Replicator. It is easy to set up and use. It has three copying methods, Synchronise, Echo and Contribute.
The first run may take a while, but after that it only looks for changes, additions and deletions.
I also use Reflect 8 for Images. Works nicely in the background too.
I have my desktop PC sync with OneDrive and that’s about all I need for now. I edit, add, and remove files as needed so what’s important is always safely stored there. I back up my desktop PC with the free version of Macrium Reflect (yes, I still have the installer for the last free release). I don’t backup my laptop PCs because I use them as ‘satellites’ to my desktop. All my PCs are logged into Windows using the same Microsoft account, so, even though my laptops do not sync with OneDrive, they still have access to it, so I can access my files on OneDrive and save any changes to there as well.
The story for my GNU/Linux installations is a bit different. My desktop and both laptops dual boot Windows with one of two GNU/Linux distributions. The one on my desktop PC is backed up by Macrium along with Windows. On both my laptops, I use the default utility for the installed distribution for backup purposes. I use similar strategies on my desktop PC and the GNU/Linux distributions on my laptops. I keep enough images to be able to revert my system (or recover file versions) as far back as a month (long enough for me). I’ve found a OneDrive client for GNU/Linux on GitHub. Using it I can access files on OneDrive and upload files there too. I can run the client as a systemd service to automate synchronization or manage it manually. I’m considering something of a hybrid approach with it so I can automate synchronization and still be able to interrupt the service to sync manually, restarting the service when the sync is finished, but that’s another topic.
The bottom line here for me is that I have my computers safely backed up and my files are synchronized to OneDrive so they’re stored offsite while OneDrive also acts as something of a long term storage solution (an archive?). The files and records that I need to keep long term are never changed so they’ll remain until I need to remove or replace them. I’ve checked, and I can mount a partition (for my test, I chose C:) in a Macrium image and access all the files in the OneDrive folder in it with my Wi-Fi adapter disabled so I have no Internet connection, so as far as I’m concerned, the files from my desktop PC are properly backed up because they’re stored locally (in my backup image) and offsite (on OneDrive). If I’m wrong about this, please explain how/why.
Ernie (Oldster)
I have Ubuntu Linux installed on an old computer that could no longer run Windows efficiently. For a long time, I avoided using Linux full-time on a computer, because I couldn’t sync it with OneDrive which I use as my file server. This time, when I installed it, I found a program called InSync that syncs OneDrive to a Linux machine. Now, I’ve finally found a way to use Linux fully; the lack of OneDrive compatibility had been a deal breaker.
It’s good to hear that Leo finally admitted that he (also) copies files, as opposed to a disk image with some tool that’ll be obsolete soon. I’ve been preaching that approach (just copy files) on these pages for a long time.
Always keep the installation media or downloaded installation software for whatever tool you use. Preferably get those as a zip file or what’s called “offline download” of the entire application (not just the installation stub). Reason is you’re not going to be able to get or use those applications in the near future. I say “near future” because the vendor will change formats, the vendor will update the tool and render your backups unusable, the vendor will see the older version and block its use, it’ll no longer be free, or the next Windows update may force obsolescence on anything it doesn’t like. Also, if you have an older Windows XP or 7 computer, keep it as is if you really want access to your ancient backups (i.e. don’t update to Windows 10/11). You can always dual boot your computer to have access to the older OS – just in case. Final thought: keeping your backups in the cloud (OneDrive, etc.) doesn’t guarantee a solution for the problems raised in the article, especially if you’re entirely dependent on the Microsoft food chain.
“as opposed to”?? Hardly. In addition to is more like it. The article includes the specifics.
You may be able to read them but the jpeg files in them may not be readable. I had this happen. Found out the jpeg format has changed over the years.
I’d be shocked if most readers weren’t backwards compatible. If you run into trouble with one, absolutely try others. (IrfanView has a good track record for format support.)
My computer has two physical drives with one devoted to the OS and applications. That is cloned to a separate HDD periodically. Data is stored on a separate HDD and all data is copied to an external HDD. That drive is powered off when I am not copying to it. Whenever I change a data file or add data, that is copied to the external HDD. That process allows me to add a date or version number to the new data file when appropriate.
Also, any really important information is stored on paper.
For important business data, archiving onto microfilm is still available and good for 100 years. All it takes is a light and magnifying glass to read.
Leo’s article reminds me of Flash, made by Adobe. They made an announcement a few YEARS before Flash was discontinued. A desktop weather application I used was created with Flash technology. I dare say they had tens of thousands of users. The response from the software writers of the weather app regarding Adobe Flash was “We are sorry. Unfortunately, the situation is beyond our control.” My personal belief was Hogwash! They had YEARS to re-write their program. Their weather app is available for Android and iOS but they keep saying they are working on a new PC version. I doubt it. I’m staying away from them for fear they will not keep up with technology.
Leo,
I would not be complacent about the longevity of either MS-Office formats or Adobe .pdf.
I have Excel spreadsheets from the 90s that can no longer be opened in Office365.
Just as you advocate refreshing media and migrating to newer forms of storage, so you need to take old application data files & move them to later formats (& then deal with all the attendant issues of migration). Unfortunately that requires a lot more time and attention than moving from floppies -> CDs -> DVDs -> hard drives / cloud.
I’m surprised about the old spreadsheets. Do they open in Office alternatives?
I’d suspect corruption of the media to be the problem. I can open older .doc and .xls files in MS 365 versions of Word and Excel.