Writing over a file is a good way to lose data, but if you are willing to dive into some command line programs, you may be able to recover some of it.
Your situation is actually not that uncommon.
These days, file formats are complex and the programs that read them are often unforgiving when there’s something wrong with the file.
When only portions of a file are recovered, some of the information that the application relies on to open and interpret the file is so badly damaged that the application can’t even recognize the file to open it.
Typically, that happens when the first few pieces of the file are missing. But it actually can happen if any piece of the file is missing, out of order, or just otherwise unrecoverable.
Recovering text from files
When faced with this in the past, I’ve typically used a very bizarre technique that basically tries to recover as much of the text in the document as it can.
Now, I want to be clear about what I’m talking about here. When I say “text,” I mean the words that you’ve written and only those words. Using this technique, you will lose any formatting or layout information as well as any images.
But I’m assuming this is primarily a text document, so you want to recover all of the words.
Download the utility called Strings. It’s a command-line tool from the Sysinternals collection, available from the Microsoft website (it was originally hosted in the TechNet portion of the site).
Very often, the words that you write in a document are stored as plain text. It’s like writing it all in Notepad, where there is no formatting. Word processing programs like WordPerfect, Microsoft Word, and others add formatting and layout information to the file, so you can see that a paragraph in your document is centered, in all italics, or on the next page. Nonetheless, the words are still in plain text.
Strings looks through a file that you specify on the command line and simply displays everything that it recognizes as being a plain-text string.
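To make the idea concrete, here is a minimal Python sketch of that kind of scan. It is not how the Strings utility itself is implemented; the function name and the minimum length of 4 are my own assumptions for illustration. It simply pulls out every run of printable ASCII bytes that is long enough to plausibly be text.

```python
import re


def extract_ascii_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII at least min_length characters long.

    Hypothetical helper for illustration; min_length=4 is an assumed default.
    """
    # Bytes 0x20 through 0x7E are the printable ASCII range; tab is allowed too.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Simulated fragment of a damaged file: text surrounded by binary junk.
    damaged = b"\x00\x01Hello world\x00\xff\x02ab\x00Recovered text\x10"
    for text in extract_ascii_strings(damaged):
        print(text)
```

On the sample bytes above, the short run "ab" is skipped because it falls under the minimum length, and the two longer runs are kept.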
You may have to run it twice:
- In ASCII mode to get simple characters that you obviously would recognize.
- In Unicode mode. Sometimes, programs will store characters in Unicode. This allows them to represent well over a hundred thousand different characters from writing systems around the planet. Even if you’re only writing in English, you’ll want to make sure that you run a Unicode scan.
One scan will probably be better than the other, and it should be fairly quick to tell which one from the output of the tool each time you run it.
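Why would an ASCII scan miss text that a Unicode scan finds? Here is a rough sketch, assuming that "Unicode" means UTF-16 little-endian, which is the usual case on Windows. In that encoding each plain English character is stored as its ASCII byte followed by a zero byte, so an ASCII-only scan sees only one-character fragments. The function name and the minimum length are hypothetical, and this sketch only handles characters in the ASCII range.

```python
import re


def extract_utf16le_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of ASCII-range characters stored as UTF-16LE.

    Hypothetical helper; assumes UTF-16 little-endian and ignores
    characters outside the printable ASCII range.
    """
    # Each character is one printable byte followed by a zero byte.
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    return [match.decode("utf-16-le") for match in pattern.findall(data)]


if __name__ == "__main__":
    # "Hello" stored as UTF-16LE, surrounded by junk and one too-short run.
    sample = (b"\xff\xfe" + "Hello".encode("utf-16-le")
              + b"\x01\x02" + "Hi".encode("utf-16-le"))
    print(extract_utf16le_strings(sample))
```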
Next, you’ll redirect the output of the Strings tool to a file. This is simple command line stuff. Look up “redirect” or “command line redirect” in Windows. For example:
strings -a example.doc >recovered_example.txt
That dumps all of the plain-text strings that it can find into a new file, recovered_example.txt, which you can edit using Notepad or any other text editor.
You will see a lot of junk in these files. Some of it you can just delete. Hopefully, the majority of the text from the original document will be there.
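If the junk is overwhelming, a rough filter can do a first pass before you clean up by hand. The sketch below is only a heuristic of my own, not a feature of Strings: it keeps lines that are reasonably long and made up mostly of letters and spaces. The function names and both thresholds are assumptions you would want to tune.

```python
def looks_like_prose(line: str, min_length: int = 8,
                     min_letter_ratio: float = 0.7) -> bool:
    """Guess whether a recovered line is real text rather than junk.

    Hypothetical heuristic; both thresholds are assumed values.
    """
    stripped = line.strip()
    if len(stripped) < min_length:
        return False
    # Count characters that are letters or whitespace.
    wordlike = sum(ch.isalpha() or ch.isspace() for ch in stripped)
    return wordlike / len(stripped) >= min_letter_ratio


def filter_prose(lines):
    """Keep only the lines that pass the prose heuristic."""
    return [line for line in lines if looks_like_prose(line)]


if __name__ == "__main__":
    recovered = [
        "The quick brown fox jumps over the lazy dog.",
        "x@#1$%",
        "!!~~^^&&**(())%%$$",
        "Times New Roman",
    ]
    for line in filter_prose(recovered):
        print(line)
```

Notice that something like a font name still gets through. A filter like this can shrink the pile, but you will still need to read through what is left, and it can also throw away short lines you wanted, so keep the unfiltered file around.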
I used Strings in the past. It’s been a while since I’ve done it, but Strings has helped me recover corrupt documents from time to time. It definitely beats having to retype everything from scratch.
Remember to back up
Now, I do have a recommendation that I’m sure everybody’s expecting me to make.
This wouldn’t have been a problem if the file had been backed up somewhere.
Even when you’re working on a document, it’s helpful to have a backup copy of it. Even if the backup is only as recent as yesterday and you lose all of today’s work, you don’t lose everything.
So, use Strings to see if it will recover enough of your document and then start backing up that file in the future.