List Price: $179.99
Sale Price: $94.99
Today's Bonus: 47% Off
Here is a quote from a review at pcper.com:
I'm going to let the cat out of the bag right here and now. Everyone's home RAID is likely an accident waiting to happen. If you're using regular consumer drives in a large array, there are some very simple (and likely) scenarios that can cause it to completely fail. I'm guilty of operating under this same false hope: I have an 8-drive array of 3TB WD Caviar Greens in a RAID-5. For those uninitiated, RAID-5 is where one drive's worth of capacity is volunteered for use as parity data, which is distributed amongst all drives in the array. This trick allows for no data loss in the case where a single drive fails. The RAID controller can simply figure out the missing data by running the extra parity through the same formula that created it. This is called redundancy, but I propose that it's not.
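As a rough illustration of "running the extra parity through the same formula that created it", here is a minimal Python sketch that assumes the parity formula is a plain byte-wise XOR, which is the usual choice for RAID-5. The helper name `xor_blocks` and the toy 4-byte "sectors" are made up for this example, and a real controller also rotates parity across the drives, which the sketch ignores.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per drive (toy 4-byte "sectors", purely illustrative).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data)

# Pretend the second drive died: XOR the surviving blocks with the parity
# block to get the missing block back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

The same function both creates the parity and recovers the lost block, which is why losing a single drive is survivable. Lose a second block from the same stripe and there is nothing left to XOR against.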
Since I'm also guilty here with my huge array of Caviar Greens, let me also say that every few weeks I have a batch job that reads *all* data from that array. Why on earth would I need to occasionally and repeatedly read 21TB of data from something that should already be super reliable? Here's the failure scenario for what might happen to me if I didn't:
* Array starts off operating as normal, but drive 3 has a bad sector that cropped up a few months back. This has gone unnoticed because the bad sector was part of a rarely accessed file.
* During operation, drive 1 encounters a new bad sector.
* Since drive 1 is a consumer drive it goes into a retry loop, repeatedly attempting to read and correct the bad sector.
* The RAID controller exceeds its timeout threshold waiting on drive 1 and marks it offline.
* Array is now in degraded status with drive 1 marked as failed.
* User replaces drive 1. RAID controller initiates rebuild using parity data from the other drives.
* During rebuild, RAID controller encounters the bad sector on drive 3.
* Since drive 3 is a consumer drive it goes into a retry loop, repeatedly attempting to read and correct the bad sector.
* The RAID controller exceeds its timeout threshold waiting on drive 3 and marks it offline.
* Rebuild fails.
At this point the way forward varies from controller to controller, but the long and short of it is that the data is at extreme risk of loss. There are ways to get it all back (most likely without that one bad sector on drive 3), but none of them are particularly easy. Now you may be asking yourself how enterprises run huge RAIDs and don't see this sort of problem? The answer is Time Limited Error Recovery (TLER), where the hard drive assumes it is part of an array, assumes there is redundancy, and is not afraid to quickly tell the host controller that it just can't complete the current I/O request.
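The periodic "read everything" batch job mentioned earlier could look something like the following Python sketch. It is only an assumption about how such a job might be written, not the script the quoted author uses: the function name `read_everything` and the mount point in the usage comment are hypothetical. It also only touches file contents, so it most likely does not exercise parity blocks or free space the way a controller-level scrub would.

```python
import os


def read_everything(root, chunk_size=1024 * 1024):
    """Read every file under root and discard the data.

    Returns (total_bytes_read, failures), where failures is a list of
    (path, error message) pairs for files that could not be read.
    """
    total = 0
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError as exc:
                # A read error here is the early warning we are looking for.
                failures.append((path, str(exc)))
    return total, failures


# Hypothetical usage; "/volume1" is a placeholder mount point:
# total, failures = read_everything("/volume1")
```

The point of the exercise is to trip over a latent bad sector while the array still has its redundancy, rather than in the middle of a rebuild.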
Here's how that scenario would have played out if the drives implemented some form of TLER:
* Array starts off operating as normal, but drive 3 developed a bad sector several weeks ago. This went unnoticed because the bad sector was part of a rarely accessed file.
* During operation, drive 1 encounters a new bad sector.
* Drive 1 makes a few read attempts and then reports a CRC error to the RAID controller.
* The RAID controller maps out the bad sector, locating it elsewhere on the drive. The missing sector is rebuilt using parity data from the other drives in the array.
* Array continues normal operation, with the error added to its event log.
The above scenario is what would play out with an Areca RAID controller (I've verified this personally). Other controllers may behave differently. A controller unable to do a bad sector remap might have just marked drive 1 as bad, but the key is that the rebuild would be much less likely to fail as drive 3 would not drop completely offline once the controller ran into the additional bad sector. The moral of this story is that typical consumer grade drives have data error timeouts that are far longer than the drive offline timeout of typical RAID controllers, and without some form of TLER, two bad sectors (totaling 1024 bytes) is all that's required to put multiple terabytes of data in grave danger.
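The two scenarios boil down to one comparison: how long a drive spends on error recovery versus how long the controller is willing to wait. The Python sketch below is a deliberately simplified toy model of that comparison. The function names and all of the timing values (60, 7 and 8 seconds) are illustrative assumptions and are not taken from the quoted review or from any particular drive or controller.

```python
def dropped_drives(drives_with_bad_sector, recovery_seconds, controller_timeout_seconds):
    """Return the drives the controller would mark offline in this toy model.

    A drive is dropped when its error recovery takes longer than the
    controller is willing to wait.
    """
    if recovery_seconds > controller_timeout_seconds:
        return list(drives_with_bad_sector)
    return []


def array_survives(drives_with_bad_sector, recovery_seconds,
                   controller_timeout_seconds, tolerated_failures=1):
    """RAID-5 tolerates one missing drive; more than that and the rebuild fails."""
    dropped = dropped_drives(drives_with_bad_sector, recovery_seconds,
                             controller_timeout_seconds)
    return len(dropped) <= tolerated_failures


CONTROLLER_TIMEOUT = 8   # seconds, assumed for illustration
CONSUMER_RECOVERY = 60   # seconds, assumed long retry loop
TLER_RECOVERY = 7        # seconds, assumed short, bounded retry

# Bad sectors on drives 1 and 3, as in the walkthroughs above.
print(array_survives([1, 3], CONSUMER_RECOVERY, CONTROLLER_TIMEOUT))
print(array_survives([1, 3], TLER_RECOVERY, CONTROLLER_TIMEOUT))
```

The model treats both drops as if they overlap, which matches the walkthrough: drive 1's replacement has not finished rebuilding when drive 3 falls offline, so the array is effectively missing two drives at once.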
The Solution:
The solution should be simple: just get some drives with TLER. The problem is that until now those were prohibitively expensive. Enterprise drives have all sorts of added features like accelerometers and pressure sensors to compensate for sliding in and out of a server rack while operating, as well as dealing with rapid pressure changes that take place when the server room door opens and the forced air circulation takes a quick detour. Those features just aren't needed in that home NAS sitting on your bookshelf. What *is* needed is a WD Caviar Green that has TLER, and Western Digital delivers that in their new Red drives.
End quote and back to reviewer.
I've got 5 of these in a Synology DiskStation 5-Bay (Diskless) Network Attached Storage (DS1512+). It is really a sweet setup.
The Synology software has a S.M.A.R.T. test that can do surface scans to detect bad sectors. I have their Quick Test check every disk daily and the Extended Test set to automatically run on each of the 5 disks every weekend. (The Extended Test takes about 5 hours per disk so I separate the tests by 12 hours.)
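To see how that staggering works out, here is a small Python sketch that just does the arithmetic: 5 disks, tests started 12 hours apart, each running about 5 hours. The function name `staggered_schedule` and the Saturday-midnight start time are assumptions for illustration. This is not Synology's scheduler or its API.

```python
from datetime import datetime, timedelta


def staggered_schedule(first_start, disk_count, gap_hours, test_hours):
    """Return (disk number, start, end) tuples for tests spaced gap_hours apart."""
    schedule = []
    for index in range(disk_count):
        start = first_start + timedelta(hours=index * gap_hours)
        end = start + timedelta(hours=test_hours)
        schedule.append((index + 1, start, end))
    return schedule


# Hypothetical start: Saturday, October 22, 2016 at midnight.
first_start = datetime(2016, 10, 22, 0, 0)

for disk, start, end in staggered_schedule(first_start, 5, 12, 5):
    print(f"disk {disk}: {start:%Y-%m-%d %H:%M} to {end:%Y-%m-%d %H:%M}")
```

With a 12-hour gap and a roughly 5-hour test, no two disks are being scanned at the same time, and the fifth test begins 48 hours after the first.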
WD Red 2 TB NAS Hard Drive: 3.5 Inch, SATA III, 64 MB Cache
Posted by Unknown on Friday, October 21, 2016