Introduction to RAID

Posted on Dec 29 2012 - 2:59pm by admin

Hard drives have become exceptionally inexpensive in recent years. Current consumer hard drives cost as little as five to ten cents per gigabyte; as recently as ten years ago, they cost five to ten dollars per gigabyte. This not only lets consumers buy hard drives capable of storing more than a million photos or hundreds of high-quality videos; it also lets home users take advantage of technologies that used to be affordable only to businesses, such as RAID.

RAID stands for Redundant Array of Inexpensive Disks and is a common method of providing data redundancy. The second half of the acronym is easily explained by the low cost of current hard drives (also called hard disks). The drives are put into an array (several drives working together) and configured in such a way that your computer can keep running even if a hard drive fails (redundancy). RAID comes in many different varieties, but the home user will mostly be interested in four of the RAID levels: RAID 0, RAID 1, RAID 5 and RAID 10.

RAID 0, also called striping, is sometimes considered a misnomer because it doesn’t actually provide redundancy like the other RAID levels. However, it is still set up through a RAID controller and has benefits of its own. This level of RAID uses two or more identical hard drives and presents them to the operating system as one big drive. For example, a RAID 0 with two 500GB hard drives will be seen as a single 1,000GB (or 1TB) hard drive by the system. Data is written in stripes across all of the drives, so reads and writes are spread over several drives at once and complete faster. This can give a big performance boost and is very popular with gamers. The downside is that if any drive fails, all of the data on all of the drives is lost.
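To make the striping idea concrete, here is a minimal Python sketch of how a logical block might be mapped onto the drives of a RAID 0. The function names are made up for illustration, and the stripe unit is assumed to be a single block; real controllers use a configurable stripe size.

```python
def locate_block(logical_block, num_drives):
    """Return (drive index, block offset on that drive) for a logical block.

    Assumes a stripe unit of one block, which is a simplification.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    if logical_block < 0:
        raise ValueError("block numbers start at 0")
    # Consecutive blocks rotate across the drives, then move down one row.
    return logical_block % num_drives, logical_block // num_drives


def raid0_capacity_gb(drive_gb, num_drives):
    """Usable space is the sum of all (identical) drives."""
    return drive_gb * num_drives


if __name__ == "__main__":
    for block in range(6):
        drive, offset = locate_block(block, 2)
        print(f"logical block {block} -> drive {drive}, offset {offset}")
    print(raid0_capacity_gb(500, 2), "GB visible to the system")
```

Because neighboring blocks land on different drives, a large read or write keeps every drive busy at once, which is where the speed comes from. It also shows why one failed drive ruins everything: every file larger than one block has pieces on every drive.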

RAID 1, or mirroring, is also very popular because it’s the only other common RAID level that works with only two hard drives. Like RAID 0, both hard drives must be identical; however, the operating system only sees as much space as is on one drive. Two 500GB drives in a RAID 1 will be seen as a single 500GB drive. The upside is that because all of the data is mirrored across both drives, a drive failing will not bring down the system. Instead, the system will keep running entirely off the good drive as if it were a normal system with only one hard drive. The bad drive can then be replaced to restore full redundancy to the array. RAID 1 does have a slight performance advantage when reading data from the disks because it can split up the task between the two disks. However, writing data takes a slight performance hit as the data must be written twice: once to each drive.

RAID 5 is less common among home users and is usually only seen on higher end systems due to greater complexity and higher cost. The higher cost comes from requiring a minimum of three identical hard drives, although more can be used. The total storage seen by the system is equal to the capacity of all the drives in the array minus one drive. A RAID 5 of four 500GB drives will be seen as 1,500GB (1.5TB) by the system (500GB per drive times 4 drives minus one 500GB drive). This RAID level works by storing parity information alongside the data, spread across all of the drives. The parity is used to rebuild the missing data if a drive fails. Like RAID 1, RAID 5 can suffer a single drive failure without losing data, although performance drops until the failed drive is replaced and the array is rebuilt. RAID 6 is closely related to RAID 5, but it can suffer two failed drives without losing data. However, RAID 6 is more expensive to implement and is rarely seen in consumer grade systems.
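The parity idea is easier to see in code. The sketch below uses XOR, the usual way RAID 5 parity is computed, on a single stripe of three data blocks plus one parity block. The function names and sample data are invented for illustration, and the sketch ignores the fact that a real array rotates the parity block from drive to drive.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_capacity_gb(drive_gb, num_drives):
    """One drive's worth of space goes to parity."""
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return drive_gb * (num_drives - 1)


# One stripe on a four-drive array: three data blocks and their parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] has failed. XOR-ing the surviving
# data blocks with the parity block gives back the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])

if __name__ == "__main__":
    print(rebuilt == data[1])
    print(raid5_capacity_gb(500, 4), "GB visible to the system")
```

The same trick works no matter which single block is lost, but it cannot recover two missing blocks from one parity block, which is why RAID 5 only tolerates one failed drive.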

RAID 10 (or 1+0) is actually a combination of RAID 0 and RAID 1 and gives the advantages of both. It requires four identical drives to implement. The drives are split into two arrays of two drives each using RAID 1. The two mirrors are then used in a RAID 0. Total storage is equal to half of the total of all four drives. Four 500GB drives in a RAID 10 would give 1,000GB of total storage. Redundancy is greatly improved, as it can suffer one drive failing in each RAID 1 (or two drives total). However, if two drives fail in the same RAID 1, then all data will be lost. The less common RAID 01 (or 0+1) offers the same capacity as RAID 10, but consists of two sets of RAID 0 combined into a large RAID 1, and it is generally less tolerant of a second drive failure.
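Which two-drive failures a four-drive RAID 10 survives can be checked with a short Python sketch. The drive numbering (0 and 1 form one mirror, 2 and 3 the other) and the function names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical layout: drives 0 and 1 mirror each other, as do 2 and 3.
MIRRORS = [(0, 1), (2, 3)]


def raid10_survives(failed, mirrors=MIRRORS):
    """The array survives as long as every mirror keeps one working drive."""
    failed = set(failed)
    return all(any(d not in failed for d in pair) for pair in mirrors)


def surviving_two_drive_failures(mirrors=MIRRORS):
    """List every pair of failed drives that the array can survive."""
    drives = [d for pair in mirrors for d in pair]
    return [c for c in combinations(drives, 2) if raid10_survives(c, mirrors)]


if __name__ == "__main__":
    for pair in combinations(range(4), 2):
        state = "survives" if raid10_survives(pair) else "all data lost"
        print(f"drives {pair} fail: {state}")
```

Of the six possible pairs of failed drives, only the two pairs that wipe out a whole mirror are fatal.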

The best, but most expensive, way to implement a RAID is with its own dedicated RAID controller card. These cards typically plug into the motherboard’s PCI or PCIe slot. The drives then plug into the card. Dedicated cards give the best performance and are often more reliable and offer more features, but may cost more than all of the hard drives. The second best way to implement a RAID is through RAID support built into the motherboard, which is becoming a very common feature among mid-range and high-end motherboards. Although not quite as reliable as a dedicated RAID card, this method tends to be much less expensive and is readily available on many newer systems. The last method is to create a software RAID. In this case, the operating system creates and controls the array. It tends to be slower than a hardware RAID and more error-prone, but may be the only choice for RAID on a budget.

The final caveat is that RAID is not a backup solution. Although it provides redundancy to allow the PC to keep running even if a hard disk fails (except for RAID 0), there are many situations in which only a proper backup can restore your data. Be sure to create regular backups and check that you can restore the data on them to another system in case the RAID totally fails. Furthermore, although it is possible to create arrays using different sized drives, each drive will be treated as if it’s the same size as the smallest drive in the array. Performance and reliability may also be affected if the drives are not identical. When in doubt, always follow the RAID card or motherboard manufacturer’s directions.
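The smallest-drive rule, together with the capacity figures for each level described earlier, can be summed up in one small Python function. This is only a sketch: the function name is made up, and it assumes RAID 10 is built from an even number of drives arranged as two-drive mirrors.

```python
# Minimum number of drives for each RAID level covered in this article.
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 10: 4}


def usable_capacity_gb(level, drive_sizes_gb):
    """Estimate the space the system will see for a given RAID level."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    n = len(drive_sizes_gb)
    if n < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == 10 and n % 2:
        raise ValueError("RAID 10 needs an even number of drives")

    # Every drive is treated as if it were the size of the smallest one.
    smallest = min(drive_sizes_gb)
    if level == 0:
        return smallest * n          # all drives striped together
    if level == 1:
        return smallest              # every drive holds the same data
    if level == 5:
        return smallest * (n - 1)    # one drive's worth goes to parity
    return smallest * n // 2         # RAID 10: half the drives are mirrors


if __name__ == "__main__":
    print(usable_capacity_gb(5, [500, 500, 500, 500]))
    print(usable_capacity_gb(5, [500, 750, 1000]))
```

In the second call, the 750GB and 1,000GB drives are each treated as 500GB, so the extra space on them goes unused.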
