Do you think RAID 1 is a viable alternative for backing up?
No. No. No. No.
And NO! RAID is not a backup, and no RAID array should ever be considered a replacement for backup.
I’ll review what RAID is, and most importantly, what it is not.
RAID stands for Redundant Array of Inexpensive Disks.
It can be used to improve two things: reliability and speed.
RAID is… improved reliability
RAID 1 (which is what you’re asking about) uses what’s called “mirroring” to improve reliability – or more correctly, the fault tolerance – of a disk drive. The two drives appear as a single device. Whenever data is written to the logical drive that your operating system sees (perhaps C:), that data is simultaneously written to both physical drives by the RAID controller.
Should either one of the drives fail, the other is still present and available. The RAID controller will run in single-drive mode until the failed drive is repaired or replaced. Some RAID controllers actually allow this to happen without powering down at all.
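To make mirroring concrete, here's a minimal sketch in Python. It's a toy model, not how any real RAID controller is implemented: the `MirroredPair` class and its methods are names I've made up for illustration, and each "physical drive" is just a dictionary.

```python
class MirroredPair:
    """Toy model of RAID 1: one logical drive backed by two physical drives."""

    def __init__(self):
        # Each hypothetical "physical drive" maps a sector number to its data.
        self.drives = [{}, {}]
        self.failed = [False, False]

    def write(self, sector, data):
        # Every write goes to each drive that is still working.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                drive[sector] = data

    def read(self, sector):
        # Read from the first drive that is still working.
        for i, drive in enumerate(self.drives):
            if not self.failed[i]:
                return drive[sector]
        raise IOError("both drives have failed")

    def fail(self, index):
        # Simulate a dead drive: its contents are gone.
        self.failed[index] = True
        self.drives[index] = {}


pair = MirroredPair()
pair.write(0, b"hello")
pair.fail(0)              # one physical drive dies
survived = pair.read(0)   # the data is still readable from the other drive
```

The program using `pair` only ever sees one logical drive; which physical drive answers a read is the controller's business.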
RAID is… improved speed
RAID 0 uses what’s called “striping” to improve the apparent speed of your hard disk. Striping uses techniques that vary from RAID controller to RAID controller to spread your data across the two (or more) physical hard drives. Once again, they are combined transparently by the RAID controller to look like a single drive, perhaps your C: drive.
The increase in speed comes from the fact that the hard disk head movement and rotation speed both limit the rate at which data can be retrieved from hard disk media. Two drives can each fetch their share of the data at the same time, so alternating the sectors of your data across two physical drives can theoretically double the apparent data rate.
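Here's a rough sketch of the "alternate every other sector" idea in Python. Real controllers use their own layouts and stripe sizes, so treat the `stripe` and `unstripe` functions as hypothetical helpers that only show the principle.

```python
def stripe(sectors, drive_count=2):
    """Deal sectors out to the drives in rotation, like dealing cards."""
    drives = [[] for _ in range(drive_count)]
    for number, sector in enumerate(sectors):
        drives[number % drive_count].append(sector)
    return drives


def unstripe(drives):
    """Reassemble the logical drive by reading the drives in rotation."""
    total = sum(len(d) for d in drives)
    return [drives[n % len(drives)][n // len(drives)] for n in range(total)]


data = ["s0", "s1", "s2", "s3", "s4"]
drives = stripe(data)
# drives[0] holds s0, s2, s4; drives[1] holds s1, s3
restored = unstripe(drives)
```

Since each drive holds only half of the sectors, both can be read at once, which is where the theoretical doubling comes from.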
Important: RAID 0 should never actually be used as it reduces fault tolerance, almost doubling your risk of hard drive failure. If either of the two drives fails, then the entire logical drive will have failed. I use it here as an example of a basic RAID technique, which can be built upon to mitigate that increased risk as we’ll see shortly.
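To see why the risk "almost" doubles rather than exactly doubles, here's a small calculation. It assumes the drives fail independently of each other, and the 5% figure is a made-up number for illustration, not a real failure rate.

```python
def array_failure_risk(per_drive_risk, drive_count=2):
    """Chance that at least one drive in a striped set fails.

    Assumes each drive fails independently with the same probability.
    """
    return 1 - (1 - per_drive_risk) ** drive_count


# Hypothetical 5% chance that a single drive fails in some period.
risk = array_failure_risk(0.05)   # about 0.0975, just under double 0.05
```

The striped set survives only if every drive survives, so the risk of losing the logical drive climbs with each drive you add.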
RAID is… improved speed and reliability
The two techniques that I’ve discussed can be combined in various ways if you add more drives.
A common technique uses both redundancy of data across multiple drives and distribution of data across multiple drives to achieve both improved speed and improved fault tolerance.
Consider this equation:
A + B = Z
Let’s think of A and B as our data (we can also think of them as bytes or sectors – it doesn’t matter), and we’ll call Z a checksum. (RAID documentation usually calls this value “parity.”)
A, B, and Z are each placed on separate hard drives. These three drives together are managed by the RAID controller to look like a single drive.
When you write data to the drive, A and B each get written to their separate drives; the RAID controller calculates A+B and writes that to the third drive as Z.
Why’d we do all that?
- If a drive fails (and it could be any of the three drives), whatever was on it can be re-calculated from the remaining two. The RAID controller can do this so that your system can continue running until the failed drive has been replaced. This gets you the fault tolerance that I discussed as characteristic of RAID 1.
- Your data is spread across two drives – A and B. This allows the RAID controller to stream your data off of those two drives; this simultaneously gets you the speed improvement of a RAID 0 configuration.
Best of both worlds.
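The A + B = Z idea can be sketched in a few lines of Python. This follows the simplified equation above using plain addition; as far as I know, real parity RAID generally uses a bitwise XOR instead, but the principle is the same. The function names `write_stripe` and `rebuild` are made up for this example.

```python
def write_stripe(a, b):
    """Return the contents of the three drives: data A, data B, and check value Z."""
    return {"A": a, "B": b, "Z": a + b}


def rebuild(surviving, lost):
    """Recalculate the contents of the one lost drive from the other two."""
    if lost == "A":
        return surviving["Z"] - surviving["B"]
    if lost == "B":
        return surviving["Z"] - surviving["A"]
    if lost == "Z":
        return surviving["A"] + surviving["B"]
    raise ValueError(f"unknown drive: {lost}")


drives = write_stripe(17, 25)   # Z becomes 42

# Lose each drive in turn and confirm it can be rebuilt from the other two.
for name in ("A", "B", "Z"):
    surviving = {k: v for k, v in drives.items() if k != name}
    assert rebuild(surviving, name) == drives[name]
```

Notice that it only works when exactly one drive is missing. Lose two of the three at once and there's nothing left to calculate from.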
Naturally, I’ve oversimplified, and indeed, there are many ways to configure RAID arrays, but these are the fundamental concepts that pretty much apply across the board.
RAID is… NOT a backup
You might be tempted to look at RAID 1 and say, “Hey, my data is on two drives. That’s backed up, right?”
No. As far as your computer is concerned, your data is on one drive: C:. Yes, you might be more tolerant of a hard disk failure, and that’s a nice thing, but it’s not a backup.
- If your system is infected with a virus, RAID won’t give you anything to restore from, the way a backup can.
- If you accidentally delete a file, you won’t be able to restore it from a RAID array, like you can from the most recent backup.
- If your system goes up in flames, a RAID array is not going to be a copy of your data safely stored elsewhere – like a backup could be.
In general, there are two great rules of thumb for backups that you can apply to any backup approach:
- A backup should never be kept on the same machine. Technically, external drives actually violate this rule, but they’re at least a separate physical box which removes some of the major concerns relating to this rule.
- A backup should never be on the same drive as the thing being backed up. By drive here, I mean logical drive (C: for example) regardless of how many physical drives that might actually be “under the hood.” The reason is simple: software (and users) operate at the logical drive level. If you accidentally instruct your computer to delete all of the files on your drive (don’t laugh, it happens more often than you think – and it has happened to me), that would delete both the original and the backup. A virus, software bug, or any number of other scenarios could produce the same result. And, of course, if the drive fails – be it a single drive, as is most common, or the RAID controller controlling several physical drives – then the backup is once again lost with the original.
Relying on RAID 1 as some kind of backup violates both of these rules.
RAID is… good for what it’s good for
RAID is an important technology that can deliver both speed and fault tolerance. Most higher-end servers, including the server hosting the Ask Leo! site, use some form of RAID for one or both of those purposes.
But don’t confuse it with a backup. Having RAID does not impact your need for proper backups.