Dealing with accumulated data.
There’s no single answer to this. It depends on why you created the backups and your own data “hygiene”, for lack of a better term. If you’re running a business, you may have additional considerations.
Data retention is the formal term, and different situations call for different policies.
How long to keep backups
- Most people should keep daily incremental backups plus a full monthly backup for three months.
- Keep backups long enough to recover from various problems, bearing in mind how quickly those problems might be detected.
- Backups made for different reasons may have different retention requirements.
- Businesses may be required to keep more backups for longer periods of time, depending on their industry.
- Special-purpose backups may have shorter or longer retention plans.
When in doubt
If you don’t want to put a lot of thought into it, I would fall back to a three-month recommendation. In slightly more detail, that means:
- Perform monthly full image backups of your entire machine.
- Perform daily incremental images building on those full image backups.
- Keep the backups for three months.
Without knowing more about your situation, this represents a balance between recoverability — anything in the last three months can be recovered — and disk space — only three months’ worth need be kept.
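To make the three-month plan concrete, here is a minimal Python sketch of which backup sets such a policy would retain. It is an illustration, not a feature of any particular backup program: the function name `fulls_to_keep`, the 90-day figure standing in for "three months", and the idea that each full image plus its daily incrementals forms one "set" are all assumptions of mine.

```python
from datetime import date, timedelta


def fulls_to_keep(full_dates: list[date], today: date,
                  retention_days: int = 90) -> list[date]:
    """Return the full-backup dates whose sets must be retained.

    Assumption: each full image plus the daily incrementals that follow it
    (up to the day before the next full image) forms one set, and a set is
    only safe to discard when every day it covers is older than the window.
    """
    cutoff = today - timedelta(days=retention_days)
    fulls = sorted(full_dates)
    keep = []
    for i, full in enumerate(fulls):
        # The last day this set covers: the day before the next full image,
        # or today for the most recent set.
        if i + 1 < len(fulls):
            last_day = fulls[i + 1] - timedelta(days=1)
        else:
            last_day = today
        if last_day >= cutoff:
            keep.append(full)
    return keep


# Hypothetical schedule: a full image on the first of each month.
fulls = [date(2024, month, 1) for month in range(1, 7)]
kept = fulls_to_keep(fulls, today=date(2024, 6, 15))
print(kept)  # the March, April, May, and June sets
```

Note that the March full image is kept even though it is itself more than 90 days old on June 15. The incrementals that build on it still fall inside the window, and they are useless without it.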
Implications of how long you keep backups
Think of each backup as a representation of your computer as it was when the backup was taken. As a result:
- Yesterday’s backup: your machine and everything on it as it was yesterday.
- The day before yesterday’s backup: your machine as it was two days ago.
- The day before that: your machine as it was three days ago.
- And so on…
Let’s look at some examples of what that implies.
Let’s say your machine becomes infected with malware. As I’ve stated many times, restoring to a recent backup taken prior to the malware’s arrival is probably the fastest and most reliable way to completely remove it.
Ideally, you would notice the infection quickly and restore the previous day’s backup. But what happens if you fail to notice for, say, a week? Perhaps you don’t use your computer for a while. Maybe it takes a week to figure out that the odd behavior you’re experiencing is, indeed, malware.
If you keep only a few days of backups, say three, the oldest backup you have shows your machine as it was three days ago, which is after the malware arrived. That backup, and all backups since, are infected. You no longer have a clean backup you can restore to.
Accidents happen, and sometimes we change our minds.
Let’s say on Monday you delete a file you believe you no longer need. You’re done with it, or so you think.
Then, later that week — perhaps Friday — you suddenly realize not only were you not done with it, but it turns out to be crucial.
Once again, if you only have three backups, you have backups of your machine as it was on Thursday, Wednesday, and Tuesday. But not on Monday. As a result, you no longer have a backup copy of the file you deleted: it’s gone.
Either software or hardware can fail in such a way that a perfectly good file gets damaged to the point that it can no longer be opened or used. The file may be present, but its contents are so much garbage.
As above, let’s say on Monday your computer experiences an unexpected power loss and shuts down without warning.
Come Friday, you realize that a file you rely on to perform some end-of-week processing every Friday can no longer be opened — the application that tries to open it reports it as being broken, or of the wrong format. It looks like that power problem earlier in the week caused your hard disk to damage the file beyond repair.
Once again, with only three days of backups, you have your machine as it was on Thursday, Wednesday, and Tuesday — all after the damage happened. You no longer have a backup copy of the undamaged file.
Malware, deletion, corruption, and more — and how quickly they become apparent — are important to consider when planning your backup strategy.
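The Monday-to-Friday scenarios above come down to simple date arithmetic, which the following Python sketch spells out. The function names are hypothetical, and the model is a simplifying assumption: one backup per day, each dated for the day it represents, and a backup dated on or before the day of the problem still holds the good copy.

```python
from datetime import date, timedelta


def retained_backup_dates(today: date, retention_days: int) -> list[date]:
    """Dates of the daily backups still on hand, newest first."""
    return [today - timedelta(days=n) for n in range(1, retention_days + 1)]


def has_usable_backup(event_day: date, today: date, retention_days: int) -> bool:
    """True if any retained backup is dated on or before the event day."""
    return any(d <= event_day
               for d in retained_backup_dates(today, retention_days))


monday = date(2024, 1, 1)   # the file is deleted or damaged
friday = date(2024, 1, 5)   # the problem is finally noticed

print(has_usable_backup(monday, friday, retention_days=3))  # False
print(has_usable_backup(monday, friday, retention_days=7))  # True
```

With three days of retention, the backups on hand are Thursday's, Wednesday's, and Tuesday's, so nothing reaches back to Monday. Stretching retention to a week covers it.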
Types of backups
For the average user, I think about three types of backups.
- Safety net: backups taken to protect yourself when taking a risky action.
- Regular: backups taken automatically on a schedule.
- Archive: backups intended for a long-term archive.
Let’s look at each.
Some backups are just a safety net created prior to a possibly risky event. If something goes wrong, you can restore your system to its pre-event status.
For example, you might back up your registry prior to installing software you don’t completely trust. Another example might be a system-image backup of your entire computer taken just before upgrading the operating system.
These types of backups are often temporary. Once the risk has passed and you’re certain you’ll never have to revert to that backup, there’s no need to keep it. You could keep it for a few moments, hours, days, or weeks, depending on what it takes to feel confident that you’ll never need it.
I consider regularly scheduled backups to be the single most important way to protect yourself from data loss and many other difficulties.
My recommendation is to automate a monthly full-image backup of your machine with daily incremental backups. While I’ve made a suggestion above, the specifics — monthly and daily — are less important than having something happen automatically, with no need for you to remember and take action.
There’s no set answer as to how long you should keep these, as it really depends on your own configuration, needs, and storage capacity. You might discard backups older than a month, or perhaps a year. You might decide to keep specific snapshots for longer, “just in case”, but discard the majority.
As just one example, here’s my retention schedule for backing up the PC I use as my primary work machine. I keep:
- Daily incremental backups for a month, until the next full backup occurs.
- Monthly full-image backups for at least three months.
- The full backup images of each quarter for an additional year.
- The first full backup image of each year pretty much forever. [1]
As I said, that’s just an example, and my needs might well be considered extraordinary compared to yours. (I use something from my backups perhaps once a year or so. Totally worth it.)
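For anyone who wants to script something similar, the tiers in that schedule can be written as a small decision function. This is only a sketch under assumptions I am adding for illustration: full images are taken on the first of each month, "three months" is treated as 92 days, the quarterly images are the ones taken in January, April, July, and October, and "an additional year" means 365 days beyond the three-month tier. The name `keep_full_backup` is hypothetical.

```python
from datetime import date

QUARTER_MONTHS = {1, 4, 7, 10}   # assumption: quarters start in these months
THREE_MONTHS = 92                # assumption: "three months" in days
EXTRA_YEAR = 365                 # assumption: "an additional year" in days


def keep_full_backup(taken: date, today: date) -> bool:
    """Decide whether a monthly full image is still retained."""
    age_days = (today - taken).days
    if taken.month == 1:
        return True                      # first image of the year: kept
    if age_days <= THREE_MONTHS:
        return True                      # recent monthly image
    if taken.month in QUARTER_MONTHS and age_days <= THREE_MONTHS + EXTRA_YEAR:
        return True                      # quarterly image, kept a year longer
    return False


today = date(2024, 9, 1)
for taken in [date(2024, 7, 1), date(2024, 5, 1), date(2024, 4, 1),
              date(2023, 4, 1), date(2022, 1, 1)]:
    print(taken, keep_full_backup(taken, today))
```

The daily incrementals are left out here because they simply follow the full image they build on for that first month.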
I want to mention one additional type of backup that many might not consider to be a backup at all: what I call an archive.
An archive, to me, is a collection of data intended to be kept forever, even though it’s not necessarily needed now or needed daily. For example, those backups that I keep “pretty much forever” (mentioned above) might be considered archive copies of the long-defunct machines they represent. Similarly, the fact that I copy my photographs to cloud storage in addition to backing them up locally might also be considered archival.
The concept of archiving is truly data-dependent. There’s no need to archive your operating system updates for posterity, but your correspondence, photographs, and other more personal items might be appropriate for archival.
A rule of thumb
A good rule of thumb is to think long and hard about two questions: “Will I, or anyone, ever need anything from this backup again? And for how long might that need exist?”
Then keep it a while longer.
Before answering that, we also need to look at what’s been backed up, and what the implications of “needing it again” might be.
Needing a backup
A complete system restore to a backup image resets everything on that machine to the condition it was in on the day the backup was taken.
Everything since the time that backup was made is lost.
For example, perhaps on September 1, I restore my computer to a full image backup that was taken on June 1. All changes between June 1 and September 1 (that haven’t been saved elsewhere) are lost.
As you can imagine, then, while I might very well restore to an image of a few days ago because of a system failure or other catastrophic event, I certainly won’t be restoring my system to the image taken on January 1 two years ago.
Those backups are valuable because of the files they contain. While I might never completely restore my entire machine to their contents, I can still use my backup software to explore and restore specific files from the backups taken on those earlier dates. And because they’re image backups — backups of absolutely everything — I know that anything on the machine at that time can be recovered.
So, how long should I keep backups?
There’s no general rule I can apply that would make sense for everyone.
Clearly, the first few days are important. Things like lost files, malware, and the like are often discovered quickly, and typically you’ll need to go back only a day or two when that’s the case. Of course, a sudden and total hard disk failure makes itself known quite quickly.
The questions I’d have you consider are:
- How confident are you that you’ll discover whatever you might want from your backup within the amount of time you keep your backups?
- What would be the cost — be it money, emotion, or just time to re-create it — should you be unable to recover something because you didn’t discover you needed it before your retention period passed?
- Is there any reason you can’t just throw more disk space at it and increase the number of backups you keep?
These questions apply for any time period you might choose to keep backups, be it three days, three months, or three years. For various reasons and in various situations, the proper retention period could be any of those, or even longer.
Footnotes & References
1: In reality, “forever” often turns out to be about five or ten years. Every so often, I go into a clean-up frenzy and delete individual backups I’m confident will never, ever be needed.