Redundant Array of Independent (originally Inexpensive) Disks, or RAID, can be set up in hardware or in software. Of course, hardware RAID is more expensive than a software array. In return, it usually offers better performance, because a dedicated controller does the work, while software RAID uses the RAM and CPU of the host. Basically, RAID is a storage virtualization technology which combines multiple physical disks into logical arrays.
I won’t talk about specific implementations such as MegaRAID (hardware) or mdraid (Linux software RAID), nor about the historical background of RAID technology. We will cover the best-known RAID levels.
RAID 0 uses two or more disks, and is often called striping (or stripe set, or striped volume).
Data is divided into chunks, and those chunks are spread evenly across every disk in the array.
It provides excellent performance, because reads and writes are shared by all disks.
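The main advantage of RAID 0 is that you can create larger drives. RAID 0 is the only RAID level without redundancy: if one disk fails, the whole array is lost. Here comes an important term, “chunk”. The chunk is the amount of data written to one disk before the array moves on to the next disk, e.g. 64 KiB. As a rule of thumb, if the array will mostly handle large requests, a small chunk size spreads each request over all disks; if requests are small, a bigger chunk size lets a single disk serve each request. This directly affects I/O performance, so it is worth testing with your own workload.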
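To make striping concrete, here is a minimal Python sketch of how a logical offset in a RAID 0 array could be mapped to a disk and an offset on that disk. The function name `locate`, the 64 KiB chunk size and the 3-disk array are assumptions chosen for illustration; real implementations add metadata and other details.

```python
def locate(offset, chunk_size, n_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    chunk_no = offset // chunk_size       # which chunk of the whole array
    stripe_no = chunk_no // n_disks       # which row of chunks (stripe)
    disk = chunk_no % n_disks             # chunks go round-robin over the disks
    disk_offset = stripe_no * chunk_size + offset % chunk_size
    return disk, disk_offset


CHUNK = 64 * 1024  # 64 KiB, an assumed example value

# Walk over a few logical offsets on a hypothetical 3-disk array.
for off in (0, CHUNK, 2 * CHUNK, 3 * CHUNK + 10):
    print(off, locate(off, CHUNK, 3))
```

Consecutive chunks land on consecutive disks, which is why a request larger than one chunk is served by several disks at once.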
RAID 1 uses two or more disks (most often exactly two), and is often called mirroring (or mirror set, or mirrored volume). All data written to the array is written on each disk. The main advantage of RAID 1 is redundancy. The main disadvantage is that you lose at least half of your available disk space (in other words, you at least double the cost). Write throughput is also no better than that of a single disk: every drive must be updated each time, and the slowest drive limits the write performance.
RAID 5 uses three or more disks, each divided into chunks. Every time chunks are written to the array, one of the disks will receive a parity chunk. Unlike RAID 4, the parity chunk will alternate between all disks. The main advantage of this is that RAID 5 will allow for full data recovery in case of one hard disk failure. If you are using physical servers you will always need to monitor your device states: power supply, disks, CPU, RAM, etc. When one disk fails, your system keeps running and can even reboot without any problem. With monitoring in place you will notice the failure early and have time to replace the faulty disk. Nowadays, rebuilding an array after a disk failure can be done online, with no need to shut down your server.
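The alternating parity chunk can be pictured with a small Python sketch. The rotation used here (parity starts on the last disk and moves one disk to the left on each stripe) is only one possible layout and is an assumption for illustration; actual implementations offer several layouts. The function names are hypothetical.

```python
def parity_disk(stripe_no, n_disks):
    """Index of the disk holding parity for a stripe (assumed rotation)."""
    return (n_disks - 1 - stripe_no) % n_disks


def stripe_layout(stripe_no, n_disks):
    """One row of the array: 'P' for the parity chunk, 'D' for data chunks."""
    p = parity_disk(stripe_no, n_disks)
    return ["P" if d == p else "D" for d in range(n_disks)]


# Print the first four stripes of a hypothetical 3-disk RAID 5.
for s in range(4):
    print(s, " ".join(stripe_layout(s, 3)))
```

In RAID 4 the function would simply return the same disk for every stripe, which makes that disk a bottleneck for writes.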
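How will your data be rebuilt from the parity chunks? Well, the disks receive parity chunks alternately, and each parity chunk is the XOR (⊕) of the data chunks in the same stripe. Say you have 3 disks and one of them fails: for every stripe, the two surviving chunks are enough to compute the missing one. The same rule works with more disks, for example with four data bits and one parity bit.
You have 1 ⊕ 0 ⊕ 1 ⊕ X = 1, or the inverse, 1 ⊕ 0 ⊕ 1 ⊕ 0 = X
The condition is: the parity is 1 when the number of data bits set to 1 is odd, and 0 when it is even. From this information, X must be replaced by 1 in the first equation (a third "1" is needed to make the count odd) and by 0 in the second (two "1" bits give an even count). The missing data is recreated this way, stripe by stripe, on the newly added disk.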
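As a rough illustration of a rebuild, the Python sketch below uses bytewise XOR, which is the parity operation RAID 5 is normally described with. The chunk contents and the helper name `xor_chunks` are made up for the example, and the "array" is just one stripe of a hypothetical 3-disk setup.

```python
from functools import reduce


def xor_chunks(chunks):
    """Bytewise XOR of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# One stripe on a hypothetical 3-disk array: two data chunks and their parity.
d0 = b"\x0f\xa5"
d1 = b"\xf0\x5a"
parity = xor_chunks([d0, d1])

# The disk holding d1 fails: XOR of the surviving chunks gives it back.
rebuilt_d1 = xor_chunks([d0, parity])
print(rebuilt_d1 == d1)
```

The same function rebuilds a lost parity chunk as well, since XOR treats data and parity chunks symmetrically.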
RAID 0+1 is a mirror(1) of stripes(0). This means you first create two RAID 0 stripe sets, and then you set them up as a mirror set. For example, when you have six 100GB disks, then the stripe sets are each 300GB. Combined in a mirror, this makes 300GB total. RAID 0+1 will survive one disk failure. It will only survive a second disk failure if this disk is in the same stripe set as the previously failed disk, because that stripe set is already offline after the first failure.
RAID 1+0 is a stripe(0) of mirrors(1). For example, when you have six 100GB disks, then you first create three mirrors of 100GB each. You then stripe them together into a 300GB drive. In this example, as long as not all disks in the same mirror fail, it can survive up to three hard disk failures.
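The difference between the two nested levels is easier to see when you try a few failure combinations. The Python sketch below is only a model under assumed disk numbering: for RAID 0+1 the first half of the disks forms one stripe set and the second half the other, and for RAID 1+0 the disks (0,1), (2,3), (4,5) are the mirror pairs. Both function names are hypothetical.

```python
def survives_raid01(failed, n_disks):
    """RAID 0+1: alive while at least one stripe set has no failed disk."""
    half = n_disks // 2
    set_a_ok = not any(d < half for d in failed)
    set_b_ok = not any(d >= half for d in failed)
    return set_a_ok or set_b_ok


def survives_raid10(failed, n_disks):
    """RAID 1+0: alive while every mirror pair keeps one working disk."""
    failed = set(failed)
    return all(not {2 * i, 2 * i + 1} <= failed for i in range(n_disks // 2))


# Six disks, as in the examples above.
print(survives_raid01({0, 1}, 6))     # both failures in the same stripe set
print(survives_raid01({0, 3}, 6))     # one failure in each stripe set
print(survives_raid10({0, 2, 4}, 6))  # one failure in each mirror
print(survives_raid10({0, 1}, 6))     # both disks of one mirror
```

With six disks, a second random failure is fatal for RAID 0+1 in three out of five cases, but for RAID 1+0 only in one out of five, which is why 1+0 is usually preferred.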
That’s all for basic RAID knowledge. For more details, we will do a tutorial on Linux servers and look at different cases and failures. In that tutorial we will also learn about multipathing.