It’s turned off now. I finished replacing it the other day, and I want to share why, some of the mistakes I made, some of the mistakes I didn’t make, and what I replaced it with.
And yes, how I dodged a bullet: a data loss bullet that had my name written all over it.
Why a NAS?
An NAS is just a small computer attached to your home or business network, dedicated to providing storage. Rather than being connected to a single computer, like an external drive, a NAS is designed specifically so that other computers on the network can access that storage using Windows file sharing and possibly other protocols.
There are a few reasons you might consider an NAS:
Naturally, it’s the last item – reliability – that finally convinced me.
NAS drive failure: what should happen
A RAID array stores data redundantly, meaning that there is extra data placed on the drives which can allow other data to be recovered in the event of failure. The design is such that if any single drive in a RAID array fails, the array continues to work and no data is lost.
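To make the idea of "extra data" concrete, here is a minimal sketch of parity-based redundancy. It assumes a RAID 5-style XOR parity scheme, which may not be what any particular NAS uses, and the drive contents and the `xor_blocks` helper are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data "drives" plus one parity "drive" (hypothetical contents).
data_drives = [b"home", b"work", b"pics"]
parity = xor_blocks(data_drives)

# Simulate losing the second drive, then rebuild it from the survivors plus parity.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'work'
```

With only one parity block per stripe, a scheme like this can rebuild one missing drive but not two.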
A couple of years ago I had exactly that happen: one of the drives failed. RAID arrays are often designed to be “hot-swappable”, meaning you don’t need to turn power off to replace a drive. I pulled out the failing drive and replaced it with an identical model. The NAS did its magic to set up the newly replaced drive, and within a couple of hours it was “fully redundant” once again.
All was well.
The problem this time?
Two drives were beginning to die. While larger RAID arrays can be designed to handle that with extra redundancy, that’s not something my little NAS could recover from. But since they were “beginning” to fail, they hadn’t failed yet, and I had a chance to preserve the data if I acted in time.
Unfortunately, I came close to shooting myself in the foot on that.
NAS failure notification
An NAS is a nondescript box into which you plug power and network and nothing else.
That means it needs other ways to alert you to problems. The technique here was pretty simple: the NAS would send email whenever anything went wrong. That worked when the single drive failed earlier; I got a message in my inbox that said, in effect: “Pay attention to your NAS or you risk losing data”. I acted on that, replaced the drive, and life went on.
This time I got no message.
The drives could have been on their way to failure for months, and I wouldn’t have known about it.
Why? The NAS had become unable to send email. Even now, I’m not 100% sure what changed or why, but as I looked at its email settings, it was clear: that configuration wouldn’t work. It probably worked at some point, but right now, when I needed it to, no.
The message it was trying to send me? “Pay attention to your NAS soon or you really risk data loss.”
The only reason I stumbled onto it is that the NAS also started crashing when it attempted to read a specific file, and as a result, attempts to use the NAS from the computers on my network were failing. Not only should a crash never happen to an NAS, it’s a Big Red Flag if it does. It was probably related to the impending failure.
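A silent failure like this is easier to catch if the alert path is exercised on a schedule rather than only when something breaks. As a rough sketch, assuming a machine that can run Python and an SMTP server that accepts STARTTLS on port 587, a periodic heartbeat message might look like the following. The function names, addresses, and host are hypothetical, and this is not how the NAS itself sent mail.

```python
import smtplib
from email.message import EmailMessage

def build_heartbeat(sender, recipient, device):
    """Build a short 'alerts still work' message for the given device."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"[heartbeat] {device} can still send email"
    msg.set_content(f"If this stops arriving, check the alert settings on {device}.")
    return msg

def send_heartbeat(msg, host, port=587, username=None, password=None):
    """Send msg over SMTP with STARTTLS; raises if the connection or send fails."""
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.starttls()
        if username:
            smtp.login(username, password)
        smtp.send_message(msg)
```

Scheduled weekly, a call such as `send_heartbeat(build_heartbeat("alerts@example.com", "me@example.com", "basement-pc"), "smtp.example.com")` turns a missing email into the warning sign. It only helps if someone notices the absence.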
The dilemma I faced
I was now faced with a choice. I could:
- Replace the two failing drives in the NAS, one at a time, so that the NAS could rebuild each in turn, and hope it would complete the process before one or both drives failed completely.
- Ditch the NAS and replace it with a single external drive.
On one hand, keeping the NAS running seemed like a reasonable choice. Except that it’s another device, with four hard drives that could fail. Of the five hard drives it had held since I got it, three had failed or were failing.
On the other hand, replacing it with a single, simpler drive – and a larger one at that – attached to a computer I already had running also seemed like a pretty compelling option. I really don’t need the speed and redundancy of a NAS or RAID; I just need some disk space.
The cost, as it turned out, was almost the same either way.
But Leo, don’t you back up?
Now, with all the harping I do on backing up, you’re probably asking yourself why I was concerned at all. I mean, certainly I had a backup of everything, right?
The NAS itself had been getting backed up nightly. By that, I mean that all the data on the NAS was getting replicated to another system. So data on the NAS was unlikely to be lost, no matter what. Backups are good that way.
The problem was this: the drive had been failing (and crashing) for “a while”. By my estimates, it’d been having a problem for a month or more. And that NAS device was the backup drive (my “external backup drive”) for several of my other systems. Those devices hadn’t been getting backed up properly for at least a month.
I stood to lose my most recent month’s worth of updates and changes should they also fail while the NAS was broken.
What this means for me
I went the external USB drive route. I purchased a USB3 interface for an older desktop computer that I run in my basement (my previous primary machine), and after confirming that it worked, I ordered a five-terabyte external USB3 drive. I did some more shuffling for performance: the internal 3TB drive that had been the backup for the NAS now became the NAS’s replacement. The external USB drive became the backup to the NAS drive.
It’s no longer an NAS in the formal sense, since it’s not a dedicated box with multiple disks in a RAID array providing only storage. Now it’s just a drive shared by a PC on my network.2
My lesson was pretty simple:
- Don’t break the way that devices might notify me of a problem.
- Make sure that backup processes that fail notify me somehow. (And that I pay attention to those notifications.)
Being as email-centric as I am, that means don’t break email. 🙂 And set up filters so important emails like this don’t get sent to spam.
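One way to act on the second lesson is to check the backup destination itself, rather than trusting that the backup job ran. The sketch below assumes backups land as files under a single folder and that "no new file in two days" counts as a problem. The folder layout, the threshold, and the function names are illustrative assumptions, not a description of any particular backup tool.

```python
import time
from pathlib import Path

def newest_mtime(folder):
    """Return the most recent modification time of any file under folder, or None."""
    times = [p.stat().st_mtime for p in Path(folder).rglob("*") if p.is_file()]
    return max(times, default=None)

def backup_is_stale(folder, max_age_days=2, now=None):
    """True if folder has no files, or its newest file is older than max_age_days."""
    newest = newest_mtime(folder)
    if newest is None:
        return True
    now = time.time() if now is None else now
    return (now - newest) > max_age_days * 86400
```

Run daily from a different machine than the one being backed up, a check like this would have flagged a month of missed backups within a couple of days.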
What this means for you
So why am I telling you all this?
Two reasons, really:
First, I want to reassure you that this kind of thing happens to all of us. No, I’m not expecting that you have a NAS on the verge of a nervous breakdown – but you may have a hard disk in your machine that could be. The warning signs are easy to overlook or ignore. The bottom line here? There’s nothing like a good backup to protect you from anything.
Second, I want to remind you that hardware breaks. People often assume that nothing could possibly break when it comes to their computer’s hardware. Even when the symptoms are clearly hardware related3, people continue to ask “where’s the setting in the registry to fix this?” It doesn’t always work that way. Hard disks fail more often than we’d like, and often at exactly the wrong time. Computers, keyboards, mice, monitors … they can all fail, in various ways from boring to exciting, and it’s important that you realize this and be prepared.
On one hand, I can look at my scenario and point out places where I failed to sufficiently protect myself – and indeed, I’ll be making a change or two as a result. When you’re working with technology, there’s always something that can be tweaked or improved.
But on the other hand, my system worked. I was lucky, but I was also prepared. I was backed up. There’s enough redundancy built into my setup that I could have lost my NAS, and all three terabytes of data on it, and not be horrifically put out.4
Take a look at each component of your computer, your data, and your online life, and ask one simple question for each: what would I lose if this went away instantly and without warning?
Then decide if that’s something you want to prepare for.
My vote, of course, is that you do.