Why does a file sometimes double in size? Sometimes, I'll be working with a
file that is, say, 25 MB in size. I will add maybe 2.5 MB of text and
photos. When I save this file to my hard drive or external drive or thumb
drive, the file size sometimes doubles. So instead of being something like
27.5 MB, in other words, the sum of the two sizes, it jumps to 55 MB.
This happens very infrequently, but it's a pain when it does. What am I doing
wrong? And how can I correct a file that this has happened to?
In this excerpt from
Answercast #60, I look at the most common reason for files to be larger than
expected after adding new data.
Files doubling in size
You're probably not doing anything wrong at all. It sounds to me like a
feature of the program that you're using to edit the file. Now, unfortunately, I
don't know what that program is, but I'm going to use Microsoft Word as an
example.
How Save works
What some programs do (Microsoft Word being one of them) is that when you
save a file, they assume that saving is a time-consuming operation, so they
take some shortcuts in order to make it appear faster.
One of the decisions that they might make is not to overwrite the original file. Instead:
- They may just write a second copy at the end of the existing file.
- Or they might write only the few changes in a different place.
- Or they might write some of the changes in one place, leave the deleted
  portions alone (just marking them as deleted), and then continue to append
  new data to the end of the file.
As you can imagine, it can get quite complex.
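To make the size effect concrete, here is a minimal sketch of the difference between an append-style save and a full rewrite. It is a toy model only: the functions `fast_save` and `full_save` are hypothetical names, and this is not Word's actual file format. Sizes are scaled down so that 1,000 bytes stands in for 1 MB.

```python
# Toy model only: this is NOT Word's real file format.
# It contrasts an append-style "fast" save with a full rewrite.

def fast_save(stored: bytes, current_document: bytes) -> bytes:
    """Append the new version after the old one; the old bytes stay behind."""
    return stored + current_document


def full_save(stored: bytes, current_document: bytes) -> bytes:
    """Discard whatever was stored and write only the current content."""
    return current_document


UNIT = 1000  # scaled down: 1,000 bytes here stands in for 1 MB

original = b"o" * (25 * UNIT)    # the 25 MB file from the question
added = b"n" * (5 * UNIT // 2)   # the 2.5 MB of new text and photos
current = original + added       # what you see in the editor

on_disk_fast = fast_save(original, current)
on_disk_full = full_save(original, current)

print(len(on_disk_fast) / UNIT)  # 52.5
print(len(on_disk_full) / UNIT)  # 27.5
```

In this simplified model, the append-style save lands at roughly double the expected size, while the full rewrite is exactly the size of the current content.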
Similar to defragmentation
If this kind of thing sounds familiar to you, it should. It's very similar
to what happens on disk with fragmentation and deleted files.
In other words, when you delete a file, it doesn't really get deleted. The
data is still there. The same kind of thing is true for some of these programs
in the way that they save their data. They may not delete the original copy of
the file; they may just write a new one at the end of the actual physical file.
Like I said, it gets kind of complicated.
Turn off Fast Save
The good news is that the solution is usually very simple. The thing to look
for (at least in Microsoft Office programs) is something called Fast
Save; turn that off.
Fast Save does all of these magical things, which may not
result in the most efficient copy of the file.
With Fast Save turned off, Word will go through the work of creating a
completely new version of the file that contains all of the changes you've
made in the correct order, in the correct place, and with only the content
that is currently in the file. It may take a little bit longer.
That's the trade-off. But the net result is you'll get a file that has only the
things that are supposed to be in the file.
Practicality
Now, from an operational perspective (in other words, from just using this
stuff), it's not like there's other "stuff" in your file that you're going to
see when you edit it or print it. There isn't. It's purely a matter of how the
information is stored on disk.
If you didn't pay any attention to the file size, you would never know that this was going on, because when you're editing the file, you only see the file as of your last edit.
So, I wouldn't worry about that.
Sharing files
There is one additional interesting little side effect, and that's when
files get shared with other people.
If this Fast Save magic is going on, the program that you're using
may remove content from what's being displayed (the file that you're seeing
and editing) without really removing it from the actual physical file as
it's stored on disk.
What that means is that if you give that file to somebody else, they could
potentially use some other tools to take a look at the parts of the file that
aren't currently being used. It's very much (once again) like deleting a file in the file system, where you can actually recover deleted files by looking in the right places, as long as the data hasn't been overwritten.
The same thing applies to these kinds of magically fast-saved files. It is
possible that by looking at the areas of the file that are deleted but not
really removed from the file, somebody could find something that you had
previously deleted.
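As a rough illustration of "looking in the right places", the sketch below pulls runs of readable text out of raw bytes, much like the Unix `strings` tool does. The function name `printable_runs` and the sample bytes are made up for the example, and it only handles plain ASCII, so treat it as a sketch rather than a real forensic tool.

```python
import re


def printable_runs(data: bytes, min_length: int = 6) -> list:
    """Return runs of printable ASCII at least min_length bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical stored bytes: a "deleted" sentence left behind in the file,
# followed by the text that is currently visible in the editor.
stored = b"\x00\x01Draft: salary figures attached\x00\x00Final: see the summary\x00"

found = printable_runs(stored)
print(found)  # ['Draft: salary figures attached', 'Final: see the summary']
```

The editor would show only the final text, but anything still sitting in the stored bytes is visible to a scan like this.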
There have in fact been news stories of exactly this kind of thing
happening, where sensitive or embarrassing data was allowed to leak out from an
organization because somebody used the Fast Save option. The file that was sent
out contained not only the final version of the file, but also some of the remnants
of things that had previously been deleted.
Computers have enough speed
So, I actually do recommend in general that Fast Save should be turned
off.
These days, there really isn't a big reason to have it on anymore. Computers,
disks, and so forth are fast enough that you'll never notice the difference on
anything but the largest documents.
But that's probably what's going on here. That's the option to look for.
Like I said, I don't know what specific program you're using, but those are the
kinds of things to be searching for as you search that program's options or
online help.
Files often "save" more than just your content. They save the previous version as well, or other information that is not always visible.
I remember a Word document I was trying to edit that was a single page of text yet took up a staggering 10 MB. After selecting the entire document and copy-pasting it into a new file (including formatting), the size was a mere 400 KB. Identical to look at, identical to edit, but 1/25th the size.
There's a feature in MS Word and most other word processing programs called Track Changes. If this feature is turned on, things you delete from that file will not be deleted, but marked for deletion, in a manner similar to Windows placing deleted files in the Recycle Bin.
This is one of those file sharing features that allows other users of that file to see which changes were made and who made them. I believe this is one of the causes of the cases Leo was talking about when he mentioned that sensitive information was discovered in a file which supposedly had that information deleted.
If you have a file with these changes saved, there are two steps you'll have to follow to clean them up. This is how to do it in Word 2010.
Click on the Review ribbon, then:
1. Click on the arrow under Accept and choose "Accept all changes in document".
2. If the icon above "Track Changes" is highlighted, click on that icon to unhighlight it.
The process is similar in other versions of Word and other word processors such as OpenOffice/LibreOffice and WordPerfect.
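If you want to check a file before sending it out, here is a minimal sketch that counts tracked changes. It assumes the file is a .docx, which is a ZIP archive whose main text lives in `word/document.xml`, with tracked insertions and deletions stored as `w:ins` and `w:del` elements. The function name `count_tracked_changes` is made up, and the sample archive built here is a stripped-down stand-in rather than a complete Word document. Older .doc files and documents saved in the "Strict" Open XML variant would need a different approach.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by WordprocessingML elements in ordinary .docx files.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def count_tracked_changes(docx_file) -> dict:
    """Count tracked insertions and deletions in a .docx path or file object."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return {
        "insertions": sum(1 for _ in root.iter(W + "ins")),
        "deletions": sum(1 for _ in root.iter(W + "del")),
    }


# Build a tiny in-memory stand-in for a .docx with one deletion and one insertion.
sample_xml = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    "<w:body><w:p>"
    '<w:del w:id="1" w:author="A"><w:r><w:delText>secret</w:delText></w:r></w:del>'
    '<w:ins w:id="2" w:author="A"><w:r><w:t>public</w:t></w:r></w:ins>'
    "</w:p></w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", sample_xml)
buffer.seek(0)

print(count_tracked_changes(buffer))  # {'insertions': 1, 'deletions': 1}
```

A non-zero count suggests there are still tracked changes to accept or reject before the file is shared.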
Another thing I have heard about is style sheets. In Excel (as one example), when people work on spreadsheets, their style sheets get added. Not only does this enlarge the file, but it slows it way down, because it needs to open all the connected style sheets when the spreadsheet opens. As more people work on a document, all their style sheets get added. I'm not sure how much space it takes, but I have seen people with hundreds of styles they didn't even know they had. Once they were removed, the spreadsheets opened in seconds instead of minutes.
You can also try "saving as" to reduce file size.
For all the reasons mentioned, a file that is edited and saved can often become bloated. Some programs perform worse in this regard than others. I work in the printing industry and I often see this (in the extreme) when it is necessary to edit a client's PDFs.
After a number of edits, a 5 MB PDF may become 20 or 30 MB (or even more), even though the PDF content was just rearranged rather than added to. Performing a "save as" will always bring these "bloated saves" down to a more reasonable file size.