How do I make sure that my deleted data is really gone?

Question:

I appreciate that a normal file delete simply removes the file name from the
directory system and marks clusters as available for reuse. I also realize
that, just as trying to stick one piece of paper over another identical sized
piece will normally leave a small amount of the lower piece exposed, so
overwriting a disk leaves small areas with the original magnetization. Is it
reasonable to assume that recovering overwritten information is so expensive
that it would only be attempted for disks storing very valuable
information?

How does Windows deal with a normal File Save? Does it attempt to rewrite
the file to the same clusters, simply returning excess cluster to the available
pool if the new file is smaller than the original and adding a few new clusters
if the new file is larger than the original? If every File Save is to a new
area of disk, then what I am suggesting will obviously not work, but if
clusters are reused as far as possible, then is this a feasible way for people
to deal with small amounts of moderately sensitive data?

Are there snags to password protecting a file? I have only a few password
protected files, and I protected them so long ago that I have forgotten how I
did it. If I were to now password protect existing files, the file system would
obviously only know about the password protected files, but would the old files
still be in their original clusters?

You've raised several good points about saving files and the
chance of recovering those files even after they've been deleted. Sometimes
that's a good thing (recovering a file you "accidentally" deleted) and sometimes
it's a bad thing (someone else recovering a file you didn't want them to see).

There are several assumptions in your questions as well, and as we'll see in
a minute, assumptions are rarely a good thing.

I like your overlapping paper analogy, since in essence it's exactly right.
If each "bit" on a hard disk is represented by a single piece of paper that's,
say, either black or white, then you'd think of writing new data to the disk as
putting down a new layer of pieces of paper over what's there already. But as you
say, you still might be able to see the color of the paper just underneath what
you just put down. Or the one underneath that. Or the one underneath that.

And indeed, this is exactly how extremely advanced computer forensics can
sometimes recover "old" data on the disk. By using special tools to examine the
disk media, they can sometimes reconstruct the data that was on the disk prior
to what's there now. And sometimes even the data before that.

The good news is that no, it's not easy, and as I understand it, it does
require special equipment. I don't know if commercially available data recovery
services make this type of recovery available, but I would expect it to be
expensive. And of course I'd expect some government and perhaps even some
corporate facilities to have this technology available. (And for the record,
this only applies to magnetic media. As I understand it, anything
that's written into solid-state devices like flash drives completely overwrites
the prior contents.)

So, yes, I'd currently expect it to be attempted only when there's something
very valuable to be recovered. Though, of course, we've seen technology improve
over time, so who knows if that's going to be a valid assumption a year or five
in the future?

Which leads me to your question about cluster re-use. First, we need to
clarify that exactly how clusters are re-used depends not on Windows as much as
the format you chose for your hard disk. FAT32, for example, allocates files on
disk very differently than NTFS does.

All that being said, we could certainly figure out the file system's re-use
algorithm (hint: it doesn't try to re-use recently released clusters; it more
likely attempts to allocate clusters in such a way as to reduce disk head
movement), but since that's dependent on the file system, and could
easily change, we'd be making an assumption. And if we're making security
decisions based on that assumption, that could be a very dangerous
approach.

The safest assumption is the worst assumption: assume that Windows will
re-allocate disk space in the worst possible way for your security needs. For
example, that means you should never assume that saving a file (which depends
on the application involved as well as the operating system) will overwrite
exactly the file's old clusters. New clusters may
well be allocated somewhere else entirely, and the old clusters will be marked
"free", but otherwise remain untouched and discoverable by recovery tools. It's
not guaranteed that will happen, but from a security perspective it's what
you should assume.
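
To make that concrete, here's a minimal Python sketch of one common way an
application might implement "Save": write the new contents to a brand-new
temporary file, then rename it over the original. The function name
save_via_temp_file is my own invention for illustration, and I'm not claiming
any particular application or version of Windows works exactly this way. The
point is only that a save done like this never touches the space the old
version occupied.

```python
import os
import tempfile


def save_via_temp_file(path: str, data: bytes) -> None:
    """Save data by writing a new file and swapping it in (hypothetical helper)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The new contents go into a brand-new file, so the file system
    # allocates fresh space for them.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as temp_file:
            temp_file.write(data)
            temp_file.flush()
            os.fsync(temp_file.fileno())
        # Swap the new file in under the original name. The space that held
        # the old version is only released; nothing here overwrites it.
        os.replace(temp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

After a call like save_via_temp_file("notes.txt", b"new contents"), the file
name points at the new data, while whatever was in the old version is simply
left behind in space that is now marked free.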

Which brings me to your final point: password protected files. Without
knowing exactly how you've protected the files it's impossible to say
what might happen. However, we can make some general statements:

  • A password protected file is likely "just a file". That means whenever you
    change it, copy it, rewrite it or whatever, the clusters it previously occupied
    may still remain unused and discoverable on disk.

  • Many password protection schemes do not actually encrypt the file's
    contents, or use a very "light" encryption. That means that the contents of the
    file might actually be easily visible outside of the file's intended
    application.

  • File system specific encryption and passwording might be more
    secure, but there are tradeoffs, and it's still safest to assume the worst.

So if we're assuming the worst (contents of deleted files might remain
discoverable for a long time, encrypted files aren't really very special, and
even overwritten files might be recoverable with enough resources), what's a
person to do?

First: understand your exposure: do you really have something on your hard
disk that anyone else would care about? For every person who ignores security
completely, there's another who overstates their security and privacy
needs. As I've said before, many of us just aren't that interesting. No
one wants to steal the pictures of your puppies or your email to your
grandma.

Second: understand the risk: you're much more at risk from security issues
elsewhere. Pissing off your waiter, and then giving him your credit card is my
favorite example. But even beyond that, the paper bank
statements you put out for curbside recycling are much more likely to be stolen
than the information within the deleted files on your hard drive.

If you do have legitimate and important security needs, my advice is
threefold:

  • Use an open source and proven encryption tool like TrueCrypt with an appropriately secure passphrase to keep
    your important documents, or perhaps your entire hard drive, secure.

  • If you're concerned about deleted file or empty space recovery, use a tool
    like SDelete (Secure Delete) which
    will delete and overwrite a file multiple times, and also has an option to
    overwrite the free space on your drive so that too becomes
    unrecoverable.

  • If you're concerned about the prior contents of the used space on
    your hard disk, then I'd use a tool like SpinRite which, as part of its drive maintenance, will rewrite
    every cluster on your hard disk several times, effectively removing any prior
    images "peeking out" from underneath the magnetic equivalent of those slips of
    paper we talked about earlier.
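
The "overwrite, then delete" idea behind secure-delete tools can be sketched
in a few lines of Python. To be clear, this is not how SDelete or SpinRite is
implemented; the function names, the default of three passes, and the chunk
size are all my own assumptions for illustration. It also assumes that writing
to an existing file in place lands on the same spot on disk, which, as
discussed above, the file system and the drive do not promise. Treat it as a
way to understand the concept, not as a replacement for a purpose-built tool.

```python
import os


def overwrite_in_place(path: str, passes: int = 3, chunk_size: int = 1024 * 1024) -> None:
    """Overwrite a file's existing bytes with random data, keeping its size."""
    size = os.path.getsize(path)
    # "r+b" opens the existing file for update. Opening with "wb" would
    # truncate it first and let the file system hand out different space.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass out to the device


def overwrite_and_delete(path: str, passes: int = 3) -> None:
    """Overwrite the file's contents, then remove its directory entry."""
    overwrite_in_place(path, passes)
    os.remove(path)
```

Calling overwrite_and_delete("old-records.txt") would scribble random bytes
over the file's contents before deleting it, so that an ordinary undelete tool
finds only noise, provided the assumption about in-place writes holds.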

I'll wrap up by summarizing what I do.

I do use TrueCrypt to encrypt all
my "sensitive" files on all my systems. By sensitive I mean my
financial records, my master list of passwords and so on.

I also use TrueCrypt to encrypt a
large partition on my laptop that contains all of my work. This isn't as
sensitive, but since laptops are more easily stolen it just makes sense to
ensure that if it is, my work documents, web site files and client information
aren't unnecessarily exposed.

I rarely use SDelete. With my use
of encryption, there's rarely anything to delete that would be left exposed on
disk that I might care about.

I use SpinRite not for its
security aspects described above, but as a maintenance tool to keep my hard
disks performing their best.

7 comments on “How do I make sure that my deleted data is really gone?”

  1. Thanks Leo. The question was prompted by a not-very-computer-literate retired doctor who still had patient records on her computer. The people selling her a new computer offered to physically destroy her old drive. I don’t like the idea of destroying a perfectly good drive, and I also feel she should not let the drive out of her possession until she has removed all traces of patient data.

    SDelete’s writeup does not mention XP amongst the operating systems. Is this a documentation oversight? Her sensitive data has already been moved to the recycle bin, and that has been emptied. I have no experience in interpreting command line parameters, so we would be very grateful if you could tell us exactly what we should type in order to cleanse the free areas of her C drive.

    I was considering spending 8 dollars and getting her PC Magazine’s Shred 2 utility. http://www.pcmag.com/article2/0,1895,219998,00.asp
    Does anyone have any feedback about that product?

  2. It would actually be rare that a write to a file would reuse the same cluster that the original data was in, as this would lead to a file system which was unstable in case of crashes (power outage, etc) in the middle of a rewrite of a file. In fact, much software will actually write an entire new file with the new content under a temporary name, and then do a rename of the two files to move the original to a temp file and the temp to the original filename, then finally remove the original file in its new name. (And yes, many variations on this theme exist. ;-)

  3. To delete the secure content on a disk, could you create a large file and copy it many times? Then defrag the disk. Then delete the files you created, including the ones you want to get rid of. Then defrag the disk again.
    Wouldn’t this scramble the disk contents that don’t get deleted enough so everything would be unreadable except to the most sophisticated recovery tools? Unless you are doing something highly classified, or illegal, wouldn’t this be enough for most personal files?

  4. SDelete works just fine in XP :-).

    Another (free) option to wipe the hard drive is “Darik’s
    Boot and Nuke” (DBAN). http://dban.sourceforge.net/

    Leo

  5. Use a program like DBAN (Darik’s Boot and Nuke – hard disk WIPE). Freeware (just google – program and faq). Create a boot cd (or a boot floppy if the computer has a floppy drive), then run with multi-pass. Can take awhile (run overnight for the 32 pass wipe, but that might be overkill – the DOD 7 pass wipe might suffice), but it does work – disk is CLEAN.

  6. your information was of no help to me, what i want to do is to recover a specific email address that i accidentally deleted from my ignore files. that is all i wish to accomplish. can you help me do this or not?
