First, let me say this: I strongly advise against blindly deleting duplicate files. Done incorrectly, it can quickly render your computer unbootable.
Duplicate files happen for a number of reasons, and surprisingly, what you and I do isn’t at the top of the list of the most common causes.
As I mentioned in a previous article, it’s not at all uncommon to have multiple copies – often identical copies – of files that are used by the software installed on your machine. This happens quite legitimately for a variety of reasons, which mostly boil down to programs not wanting to put themselves at risk due to the bad behaviors of other programs.
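For example, program “A” installs a copy of a library of shared code. If program “B” also uses this exact same library, there are two approaches that could be taken:
- Program “B” could use the copy that program “A” already installed.
- Program “B” could install its own copy of the library.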
The problem with the first approach is that program “A” could be uninstalled, or updated, and somehow change or remove the shared library. The net result is that program “B” could be broken.
As a result, most programs now carry and install their own copy of many libraries and support files, so as to remain in control of their own destiny.
The result: duplicate files on your hard drive.
Windows protects you
Another common source of duplicate files on your computer is Windows itself. Many of the files that make up Windows have duplicate copies.
- System File Protection is a technology that monitors system files for unexpected changes, often due to malware. When such changes are detected, the files are restored from duplicate copies.
- In older versions of Windows, the original installation source was duplicated in a folder called “I386”. While that’s no longer the case with recent Windows versions (a hidden partition is now more commonly used), manufacturers and others still occasionally keep a duplicate copy of the operating system installation media on the hard disk.
- If you’ve updated Windows and have a Windows.old folder, the previous operating system copy will be there. These are sometimes considered duplicate files – at least in name – though there are often many files that actually don’t change from version to version and may qualify as true identical duplicates.
Applications do it too
I did a quick duplicate file scan on one of my Windows 8 machines, and found duplicates for an assortment of additional reasons:
- I have multiple versions of some programs installed. Not all files changed between versions. This is often due to things like sample or other support files that simply don’t need to change version-to-version.
- Some applications update themselves by saving the previous version of the file being updated. That can often show up as a duplicate, depending on the type of duplicate search run.
- Some applications are organized such that their various sub-components include duplicate files.
- The system I use to keep track of all the files that make up my websites, videos, and more maintains a duplicate copy of every file under its control so that it can quickly detect changes locally.
The list goes on.
The takeaway here is simply that many applications install, manage, or maintain duplicate files for a variety of what turn out to be legitimate reasons.
Duplicate files my scan didn’t find
I didn’t find any files that were duplicate due to my own actions.
That kind of surprised me, but not terribly much. You and I are typically pretty good at managing the files we control.
As the questioner said, when saving a file in a program like a word processor, the original is overwritten, and no true duplicate would result. The same is true for most files we manage.
One big and easy exception that does come to mind is downloading the same file twice. We’ve all done it, and the result is two copies of the same file in our Downloads folder with the same base name, possibly followed by a number in parentheses.
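To make that concrete, here is a minimal Python sketch that looks for this one pattern: a file whose name ends in a number in parentheses, sitting next to a file with the same base name and identical contents. The function name `find_numbered_copies` and the exact naming pattern are my assumptions for illustration; browsers differ in how they name repeat downloads. The sketch only reports what it finds and deletes nothing.

```python
import filecmp
import re
from pathlib import Path

# Matches a name such as "report (1)" or "report (12)" (extension excluded).
# This pattern is an assumption; not every browser names repeats this way.
NUMBERED = re.compile(r"^(?P<base>.+) \((?P<n>\d+)\)$")


def find_numbered_copies(folder):
    """Return (original, copy) pairs whose contents are byte-for-byte identical."""
    folder = Path(folder)
    pairs = []
    for candidate in sorted(folder.iterdir()):
        if not candidate.is_file():
            continue
        match = NUMBERED.match(candidate.stem)
        if match is None:
            continue
        original = candidate.with_name(match.group("base") + candidate.suffix)
        # shallow=False compares file contents, not just size and timestamps.
        if original.is_file() and filecmp.cmp(original, candidate, shallow=False):
            pairs.append((original, candidate))
    return pairs


if __name__ == "__main__":
    downloads = Path.home() / "Downloads"  # assumed location; adjust as needed
    if downloads.is_dir():
        for original, copy in find_numbered_copies(downloads):
            print(f"{copy.name} appears to duplicate {original.name}")
```

Because it only looks at one folder and only at this one naming pattern, it stays well away from anything Windows or your applications depend on.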
Unless you really know what you’re doing, don’t delete duplicate files. It’s simply not worth the risk.
As we’ve seen, there are many legitimate reasons for duplicate files. Deleting those duplicate files at best will cause an application or two to misbehave, and at worst will delete a critical file out from underneath Windows itself, rendering it unbootable.
If you must delete duplicates, limit yourself to those files you are 100% certain you recognize. For example, if you’ve been managing photographs or music on your PC, and suspect that as part of that you have lots of duplicates, then a duplicate file scan might be in order.
Even then, I’d still think twice about turning over control to that program to perform the deletion. I’d be more tempted to use the information gleaned from the scan to guide my own, more deliberate, manual actions.
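If you want a report to guide that kind of manual cleanup, a scan limited to a single folder you manage yourself can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how any particular duplicate-finder works: the names `sha256_of` and `find_duplicates` are hypothetical, and "duplicate" here means identical contents as judged by a SHA-256 hash. It lists groups of matching files and deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large photos or videos need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by identical content. Nothing is deleted."""
    # First pass: bucket by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and not path.is_symlink():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        for path in same_size:
            by_hash[sha256_of(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

You might call `find_duplicates` on your own photo or music folder, then review each group by hand before deciding what, if anything, to remove. Pointing it at the whole drive would surface exactly the legitimate duplicates described above, which is the situation to avoid.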