As we use digital technology, we’re continually accumulating digital “stuff”: we take pictures, write documents, record videos, purchase music, acquire software, and much much more.
All of this digital data is either accumulating on our systems, or worse: getting lost.
In the past, we’ve had a very clear concept of how we could store the physical counterparts to today’s data. They were visible and we could move them about as our needs dictated: place them on a shelf next to the TV or store them in a box in the attic.
Digital data requires that we think a little differently about storage.
I want to introduce you to archiving.
To begin with, it’s important to realize that archiving is not the same as backing up. Not at all.
Archiving – what it is
Archiving is nothing more than the process of moving (or copying) digital data that you want to keep for “a long time” to some appropriate device, location, or storage medium.
I’ll use photographs as my example as photos are one of the most common forms of digital data that we’re all now accumulating over time. The concepts will apply to pretty much anything: music, documents, videos, software … any type of digital data that:
- You want to keep for an extended period of time
- Accumulates over time
By necessity, many of the specifics are intentionally vague. What’s an “extended period of time,” for example? Only you can decide that. But that’s also why I chose photographs as my example. For many people, the answer is “forever.”
Archiving in practical terms
When you take photographs with your digital camera, you can either:
- Leave them on the camera
- Copy or move them to somewhere on your computer
Leaving them only on the camera is bad for a whole host of reasons (such as losing the camera and everything on it). At a minimum, archiving your photos begins by copying them to your computer.
In rare cases, that may be enough. Perhaps you don’t take many pictures, perhaps you’ll keep your computer longer than you envision ever wanting the photos. Either way, having them on the computer might be enough. (And yes, this assumes the computer is backed up – more on that in a moment.)
I’m of the opinion that it’s not enough.
As you accumulate more photos, keeping them on your computer is going to become impractical for a few reasons. The biggest reason is disk space: with the increasing resolution and quality of digital photos over time, the files are getting bigger every year. If you take any reasonable number of photos, the amount of space taken can add up quickly. If you add videos into the mix, things get even worse.
There’s also the issue of finding photos that you want. Browsing through a few dozen photos for that one special picture that you remember taking last year is one thing. Browsing through a few thousand that have accumulated over the past decade is something else entirely.
In my opinion the thing to do is to copy all, or move a portion, of your digital photo collection somewhere else.
When it comes to archiving, there are several options, depending on your situation.
- External storage. I keep my entire 232-gigabyte collection of digital photographs on a network-attached-storage device, which is essentially nothing more than a really big external drive accessed over my local network. External drives are actually a particularly convenient approach to archiving.
- Online storage. Photo-sharing sites like Flickr, Picasa, and others can provide one approach to archiving your photographs, if you have the bandwidth to upload them, and your plan with these services includes an appropriate amount of storage. You can archive everything (my recommendation) or only those things that “make the cut,” so to speak. My biggest concern with online archival is that it’s too easy to fail to archive something you might want later.
- Offline storage. Burning your photos and videos to CDs, DVDs, or other offline storage media is one common option. Besides being a bit of a hassle to create and later retrieve, optical media may not be the most appropriate for truly long-term archival, as the media can degrade over time. These days, if this is the direction you’re considering, I’d actually recommend an external hard drive instead.
- Another computer. There are two ways to use a second computer as your archive: one is to set it up as a computer specifically dedicated to that task. Give it a large hard disk, organize some folders on it, and periodically download or copy over whatever it is you want to keep long term. The other approach is to simply share a folder on its hard disk to the local network where it can be used exactly as an external drive described above by the other computers on your network. (The not-so-secret, of course, is that the net result on that computer’s hard disk is pretty much the same either way.)
Regardless of which approach you use, the key is to have this conceptual “other place” that is the official repository for a complete collection of your digital data, thus freeing up your computer – and to some extent your mind – from having to worry about, contain, or browse through absolutely everything.
You can choose to keep copies of some subset in more conveniently accessible locations. For example, while my 232 GB photo collection sits on another machine on my LAN and is thus a little slower to access, I keep a copy of a smaller subset – the last three years of photos – directly on my computer for quicker access.
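The "recent subset on my computer" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `copy_recent_subset` and its parameters are hypothetical, and it uses each file's modification time as a stand-in for the date the photo was taken, which is not always accurate.

```python
import shutil
import time
from pathlib import Path

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # rough figure; leap days are ignored


def copy_recent_subset(archive_dir, local_dir, years=3, now=None):
    """Copy files modified within the last `years` years from the archive
    to a local folder, keeping the same sub-folder layout.

    Assumption: modification time approximates when the photo was taken.
    """
    archive_dir, local_dir = Path(archive_dir), Path(local_dir)
    now = time.time() if now is None else now
    cutoff = now - years * SECONDS_PER_YEAR
    copied = []
    for src in sorted(archive_dir.rglob("*")):
        if src.is_file() and src.stat().st_mtime >= cutoff:
            dest = local_dir / src.relative_to(archive_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 also preserves the modification time
            copied.append(dest)
    return copied
```

The archive itself is never modified; the local folder is just a convenience copy, so the archive remains the complete collection.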
What archiving is not
Archiving is not backing up, and it is not a substitute for backing up.
In fact, it’s very likely that you need to back up your archives.
Remember the “golden rule” about data and backups: if it’s in only one place, it’s not backed up.
If that “one place” failed or went away, then everything it contained could potentially disappear instantly and permanently. If you’ve moved all of your photos to some kind of archive device, and that’s the only place they live, then they’re not backed up.
They should be.
Recall that in my scenario, I have 232 gigabytes of digital photos in my archive. I do have a subset on my computer, so in a sense those are backed up, but what about the rest?
The entire collection (all 232 gigabytes and growing) is backed up to an external USB drive connected to another machine on my network.
If you’re considering archiving, make sure not to leave backups behind. Putting all of your eggs in any single basket is simply asking for trouble.
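One way to check the "golden rule" for an archive is to look for files that have no matching copy in the backup. The Python sketch below is a minimal illustration, not a backup tool: the names `file_digest` and `find_unbacked_files` are hypothetical, and it assumes the backup mirrors the archive's folder layout.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_unbacked_files(archive_dir, backup_dir):
    """List archive files (relative paths) that are missing from the backup
    or whose contents differ from the backup copy.

    Assumption: the backup uses the same folder structure as the archive.
    """
    archive_dir, backup_dir = Path(archive_dir), Path(backup_dir)
    missing = []
    for src in sorted(archive_dir.rglob("*")):
        if not src.is_file():
            continue
        copy = backup_dir / src.relative_to(archive_dir)
        if not copy.is_file() or file_digest(copy) != file_digest(src):
            missing.append(src.relative_to(archive_dir))
    return missing
```

Anything this returns lives in only one place, so by the rule above it is not backed up. Hashing every file is slow on a large collection; comparing sizes first would be a reasonable shortcut.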
Isn’t this just one more thing to manage?
Well, yes. In a way, it is.
And I’m not saying that you must do this.
What I hope is that you’ll look at the collection of accumulating data that you have and make a conscious decision on how you want to handle it before it grows to be completely unmanageable, and perhaps even inadvertently lost.
If you feel that having five to ten or more years of photographs (or videos or documents or …) on your computer in whatever organization or lack thereof you happen to use is sufficient, then please, carry on – as long as you’re backed up, of course.
On the other hand, if you realize that you have a lot of data and you’re not sure how you’d ever find something in it (in fact, you’re not really even sure where it all is), then it’s time to think about managing your data so it is more organized, easier to use and find, and such that you’ll not lose anything.
In other words, it’s time to consider an archive strategy.
18 comments on “Archiving – What it is and why you need to start”
I always back everything up on at least 3 flash drives. The drawback is that if the folders relating to my cinema research are updated, I have to update the same folders on all three flash drives. I have found, though, that after doing a new backup on flash drive 1, I can plug in flash drive 2 and transfer the folder from flash drive 1 to flash drive 2, then unplug flash drive 1, insert flash drive 3, and do the same again, so that all three flash drives hold the updated folder or folders.
I also have my CV on 4 flash drives and a copy on my computer. That way, if anything corrupts the laptop, at least I know I haven’t lost my CV. I also back up files regularly anyway as a further belt-and-braces safeguard.
You might want to read this other article by Leo: http://ask-leo.com/how_do_i_back_up_to_a_memory_stick.html
Flash memory can wear out over time. As long as you’re aware of the risk, use your flash drives if you like. Of course with 3 or 4 of them, as long as they all don’t wear out at the same time, hopefully you’d have at least one good backup.
It basically feels like you’re adding extra redundancy. What you seem to be saying about those 232 GB of pictures is that they’re so many that you just push them out of the way so you don’t have to worry about it, but the same problem will reoccur if you then want to find something that happens to be in the archive. I say the only redundancy needed should be backups, of course, and the way to manage large amounts of data is using trees.
The natural way to arrange things in any (digital) file system is using this tree-based approach, where you have folders/directories that form the nodes, and the files are the leaves. If you feel like you have too many leaves in one level, you just split it up or add extra levels (i.e. make separate folders/directories for them).
For instance, you might make a folder/directory named ‘Vacations’ where you store all your thirty years of photos and videos that you’ve made in 30 vacations (or more of course). If there are too many, a natural way to divide them up would be to make new folders for each of them, which will make the number of files to manage in one folder less while at the same time keeping them in a logical place; if you want to look up a picture from your trip to Italy in 2008, you know you need to go to the Vacations folder, and then in there you can easily find the one labeled Italy 2008 (for instance). In there, if there are still too many files, you can add more levels for the specific things you did during the trip. Similarly, if you think 30 years of vacations is too many for one level, you can add more levels in between and for instance group them by five or ten years, or perhaps by continent of destination.
You can probably see that most of the data we store has a lot of meta-information associated with it that enables you to organize it very efficiently, and pretty much all modern (digital) file systems support this natural organization, so that you never have to worry about losing your files anymore. You don’t need to remember where exactly you put them, because you can just use the memories you already have associated with them to walk your logical data structures and find what you’re looking for.
Of course, if you actually have so much data that your hard disk fills up, there’s little else you can do than to move it to another disk, or just buy a bigger one, since the price per gigabyte is ever decreasing. (Once you have multiple disks you might use the physical disks as the top-level categories, using for instance one disk for the vacations, one for the kids etc.).
Any suggestions on specific photo file management software that allows multiple references to index each of the many photos, which I guess is the trick to finding them a couple of years later?
But then you’d have to be diligent enough to fill out the fields for every photo. While you might start out with good intentions, a lot of people would probably do it less and less over time. I know from personal experience.
When we got our digital camera I created a folder for the year the photo was taken. Then a subfolder for every month. After about a year or two, even something simple like that became too much with two young kids and taking about 800 pictures a year. This year we took 800 pictures on our 2 week trip to the Maritimes!
So now, my file system consists of all the pictures for the year in each year’s folder, and a subfolder for just special things: like our trip to the Maritimes, family reunions, birth of a child, etc.
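A one-folder-per-year scheme like the one described above is easy to automate. The Python sketch below is an illustration only: `sort_into_year_folders` is a hypothetical name, the "inbox" folder is an assumed drop location for freshly downloaded pictures, and the year is taken from the file's modification time rather than from the camera's own metadata.

```python
import shutil
from datetime import datetime
from pathlib import Path


def sort_into_year_folders(inbox_dir, photos_dir):
    """Move each file in `inbox_dir` into `photos_dir/<year>/`.

    Assumption: the file's modification time reflects when the photo was taken.
    """
    inbox_dir, photos_dir = Path(inbox_dir), Path(photos_dir)
    moved = []
    for src in sorted(inbox_dir.iterdir()):
        if not src.is_file():
            continue
        year = datetime.fromtimestamp(src.stat().st_mtime).year
        dest_dir = photos_dir / str(year)
        dest_dir.mkdir(parents=True, exist_ok=True)
        dest = dest_dir / src.name
        if dest.exists():
            continue  # never overwrite an existing photo with the same name
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

Folders for special occasions would still be created by hand, since only you know which events deserve one.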
I have the same system, and it works well for me. I have photos going back through generations of family since I am the designated scanner of old family photos and slides. It’s not hard to find photos at all. Pre-1950 photos are in one folder. They can be broken down later if need arises. The rest are divided into decades, then years, then occasions and months. One thing I am considering doing is using the Windows 7 library system to set up libraries for things like Thanksgivings, birthdays, Christmases, Pets, etc. so that they can be seen together without having to drill down through the years. Almost every year has a folder for these holidays/occasions already, so it shouldn’t be too hard to just put the folders for each year’s Thanksgiving into the library titled “Thanksgivings.” I had eschewed the library system, but I think I have finally found a use for it.
I have used ACDSee for more years than I can remember. To me the important thing is to have an indexing system which is independent of the filing, and ACDSee allows this. After culling duds from the folder where I download to, it doesn’t take long to select all the photos of the same categories and tag them all with the same tag. Usually each photo has one tag, but I can tag a shot as “Water” and also as “Sunset”, in which case I could find it in a water search or a sunset search, or even specify both water AND sunset. For people shots, I add as many person tags as I care to. Tags are hierarchical, so I arrange people tags in family trees, and then find every photo in any branch or sub-branch, or individually. Some branches of the family I tag at the surname level, others individually. Searches can be combined with date ranges, so I can find photos of my grandson at a given age, for instance. ACDSee also allows keywords, but I rarely use that. It also offers grading of photos, so I can find the “best” shots of my grandson. There are also albums, which are really just a specific tag. There’s probably more I forgot. The point is, there is enough there to fine tune my own scheme.
After tagging, I don’t really need to store them in any special way, but I choose to roughly group them in folders and sub-folders. For scenic shots it’s particularly rough (where would I file a photo of a duck in a pond with trees at sunset?) but if I chose to tag at a detailed level, I could retrieve it via many different searches. I do tend to collect “trips” together (Zoo 2013, or Vacation 2009 – the latter being a subfolder of Vacations).
I name each photo so that every single photo in my collection has an individual filename, and I can move it to any folder.
The trick is to tackle culling, editing and tagging often, before the job gets too big. The results speak for themselves when I can pull up every photo of a much loved now deceased cat (even the ones hidden in the “Christmas 2012” folder) in a couple of seconds from 18000 images.
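The idea of an index that is independent of the filing can be shown with a small Python sketch. This is not how ACDSee works internally and uses none of its features; `TagIndex` is a hypothetical class that simply maps each tag to the set of filenames carrying it, which relies on every photo having a unique filename as described above.

```python
from collections import defaultdict


class TagIndex:
    """A tiny in-memory tag index: tag -> set of filenames."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def tag(self, filename, *tags):
        """Attach one or more tags to a file (tags are case-insensitive)."""
        for t in tags:
            self._files_by_tag[t.lower()].add(filename)

    def search(self, *tags):
        """Return the files that carry every one of the given tags (an AND search)."""
        if not tags:
            return set()
        sets = [self._files_by_tag.get(t.lower(), set()) for t in tags]
        return set.intersection(*sets)
```

For example, after `index.tag("duck.jpg", "Water", "Sunset")`, the photo turns up in a search for "water", for "sunset", or for both together, no matter which folder it sits in.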
My first comment is that Leo might want to change the section “What archiving it not” by one letter. :)
Secondly, it is not clear how creating an archive that lives elsewhere than my primary computer’s disk could make a particular photo easier to find. Any filing strategy that could aid retrieval would surely be applicable to my primary disk as well, and if I had only 232GB of photos they would fit nicely on my 1TB drive, so I don’t see how a backup differs from an archive for anything I want regular access to. I do agree that few users would have enough disk space for much of a movie library, so that would almost certainly be on external storage, but if the term “archive” applies properly only to material that is *not* on my computer (i.e. only on external storage) then archiving would seem to be something I would want to do only if absolutely necessary. Otherwise, not having the luxury of a network drive, my photos would be almost as cumbersome to look at as all those albums in the boxes in the attic.
Thanks. You have sharp eyes. It’s fixed now.
As for the second point, according to the article, archiving is moving things from your computer to another storage device to make more room on your hard drive. If you don’t have a disk space problem, archiving may not be necessary for you. I have all of my music, videos and most of my photos archived on an external HDD with 2 backup copies of each. And if you archive files, don’t forget to back them up somewhere else.
I have considered a photo management system, but didn’t like to be forced into the logic and preoccupations of their makers. My choice was to keep it very simple.
I’ve combined archiving and backing up by keeping my photo collection on three different usb disks (of three different brands), each 1 TB, in a simple folder structure that is exactly the same in all three. I keep these disks in different places.
The folders each have a name consisting of date, location, subject, keywords, like this:
The indexes can be names, actions, anything.
This is enough to locate any of the pictures instantly with the find-function of file manager Xplorer2.
For unimportant occasions, I just number the individual pictures per folder.
For special occasions I label every photo with date, location, indexes and sequence number.
I view the photos with IrfanView, which has a great thumbnail overview function.
It takes a bit of effort to code the folders and pictures. This is rewarded over and again by the ease of finding them. The issue is twofold: Storage and Retrieval.
New pictures I load onto my laptop, give names, and copy to one of the disks. Then I remove the pictures from the camera. In the following days I copy the new photos from the laptop to the two other disks.
Cost: three usb-disks, 75 euro apiece.
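The routine of copying the same folder structure to several disks could be scripted along these lines. This Python sketch is only an illustration: `mirror_to_disks` is a hypothetical name, the disk paths are whatever mount points or drive letters your disks appear under, and `dirs_exist_ok` requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def mirror_to_disks(source_dir, disk_dirs):
    """Copy `source_dir` into each disk under the same folder name.

    Returns a status per disk so a disk that is not plugged in is reported
    rather than silently skipped.
    """
    source_dir = Path(source_dir)
    results = {}
    for disk in disk_dirs:
        disk = Path(disk)
        if not disk.is_dir():
            results[str(disk)] = "not connected"
            continue
        dest = disk / source_dir.name
        try:
            # dirs_exist_ok lets a repeat run add new photos to an existing copy
            shutil.copytree(source_dir, dest, dirs_exist_ok=True)
            results[str(disk)] = "ok"
        except OSError as err:
            results[str(disk)] = f"failed: {err}"
    return results
```

Because it only adds and overwrites, it never deletes anything from the disks, which suits an archive where photos are kept indefinitely.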
I used to use a NAS with an external hard drive to back up. It worked great until my house was burgled and both were stolen (along with my laptop, old mobile phones and camera). I was very lucky to have agreed to automatic uploads for various online services otherwise I would have lost everything. I did lose most of my 500GB music collection, lucky they developed streaming services. Now I rely on online services as my main archive with a local hard drive to back up. I’m planning on getting an additional drive connected to my Dad’s network to further back up in future.
Sorry to hear about the burglary Natasha. This, and the house burning down, are the two “unlikely but could happen” scenarios that I build my backups around. Hence I’ve got two offline backups in-house (NAS & HDD) and one online as well (Zovo), as well as various selective-file online backups (DropBox & SpiderOak). Leo, I thought my 206GB of photos was something to be proud of until I read this article. Sigh.
I have been reading about m-discs (magnetic) which are supposed to last almost forever without deteriorating for any reason. I would like to know if an m-disc burner (internal or external) could be used by a lay person who is only moderately computer-savvy, and whether or not this would be a wise purchase to use for files which need to be kept for years, i.e. tax-related, etc.
From what I’ve read you just need an m-disc ready burner and m-discs, and the process of burning an m-disc is the same as burning a standard DVD and uses normal DVD burning software. Checking on Amazon, the cost of an m-disc burner is not extremely expensive.
I back up my archives by copying my files to relatives’ computers. They benefit by getting the photos and I have an external backup. In some cases I gave them a drive with the photos. It makes an excellent gift.
One issue that is not addressed in this Archiving article is the question of future generations being able to find, access and view the items archived. For instance, will future computers be able to read CDs and DVDs, or an external USB-interfaced HD? I still have lots of floppy disks, and soon I’ll be getting rid of the last PCs stored in the basement which still have a floppy drive. I THINK I transferred all I need off of them!
Then there is the issue of file format – I should probably think of converting all my RAW images to JPEGs which will probably be readable to more people than the Nikon version of a RAW file! I think I converted all the documents from my Commodore 64 – what was that? WordPro? Too late now if I didn’t!
Maybe I should spend my retirement years printing everything on archival paper.
What you mention is a valid concern. For example, if you archived your pictures to 5.25″ floppies, there are very few 5.25″ drives to read these, but there are some still available. The solution is pretty obvious. As new storage methods become available and older ones are phased out, it’s a good idea to copy the files to newer media. It fits in with the philosophy of having multiple backups. You can still keep your hard copies too, although hard copies also have their problems. Old photos may have that classic look, but it’s really faded color which will fade more and more as time goes on. Many old books are still around, but most are gone, and those which we have are so delicate they need to be handled in a special way. Not to mention that modern paper is made with chemicals which will cause the paper to self-destruct in much less time than older paper did. As for me, my bet is on modern technology for preservation of data.
“Archiving is nothing more than the process of moving (or copying) digital data that you want to keep for “a long time” to some appropriate device, location, or storage medium.” This also means keeping up to date the hardware and software necessary to read the archived data as well as updating the formats of the archived data to ensure that it is able to be read by the new software on new hardware. The location of the archived data must be in at least 2 widely separated and secure locations where disaster of any kind is unlikely to affect both/all locations. An efficient method of searching for and the retrieval of data is necessary. It is pretty useless to have terabytes of data without having an efficient method to identify and retrieve sought after data.
That pretty well protects personal data. A government, on the other hand, will want additional and more stringent protections, which include keeping both hard copies and digital copies so that data can be retrieved in the case of total loss of technology.
(former career in Records Management for a gov’t agency)