Monday, August 9, 2010

Software Vs Hardware RAID


A redundant array of inexpensive disks (RAID) allows high levels of storage reliability. RAID is not a backup solution. It is used to improve disk I/O performance and the reliability of your server or workstation. RAID can be deployed in either software or hardware. The real question is which of the two you should use.
In this post I will document my experience with both software and hardware RAID.

Comparison: Hardware vs Software RAID

Feature | Software RAID | Hardware RAID
Cost:
Software RAID is part of the OS, so there is no need to spend extra money.
Software RAID: Low | Hardware RAID: High
Complexity:
Software RAID works at the partition level, and it can sometimes increase complexity if you mix different partitions and hardware RAID.
Software RAID: Medium to high | Hardware RAID: Low
Write-back caching (BBU):
Software RAID cannot add a battery. Hardware RAID can run in write-back mode if it has a battery backup unit (BBU) installed. With a BBU, pending writes are not lost on a power failure.
Software RAID: No | Hardware RAID: Yes
Performance:
With software-based RAID0 and RAID1 the performance overhead is negligible. However, performance goes down when you use parity-based arrays and/or several arrays at the same time. The performance of a software-based array depends on the server's CPU performance and current load.
Software RAID: Depends upon usage | Hardware RAID: High
Overheads (CPU, RAM etc):
Software RAID must use the server's CPU and RAM. The more hard drives you have, the more CPU cycles go to software RAID instead of your Apache / Postfix or MySQL server.
Software RAID: Depends upon usage | Hardware RAID: No
Disk hot swapping:
This means replacing a hard disk without shutting down the server. Many RAID controllers support disk hot swapping.
Software RAID: No | Hardware RAID: Yes
Hot spare support:
A hard disk is physically installed in the array and stays inactive until an active drive fails. The system then automatically replaces the failed drive with the spare and rebuilds the array with the spare hard disk included.
Software RAID: Yes | Hardware RAID: Yes
/boot partition:
It is hard to fail over with software RAID if /boot fails while booting the server. This can result in unexpected errors and data loss. However, LILO and the FreeBSD loader can get around this problem.
Software RAID: No | Hardware RAID: Yes
Open source factor:
*BSD / OpenSolaris and Linux RAID software drivers are open source. This means more people can fix problems as compared to closed source hardware firmware. You can move, mix and match different sizes with open source software RAID.
Software RAID: Yes | Hardware RAID: No
Vendor lock in (open formats):
See above.
Software RAID: No | Hardware RAID: Yes
Higher write throughput:
Hardware RAID with a BBU may offer higher write throughput.
Software RAID: No | Hardware RAID: Yes
Faster rebuilds:
Hardware RAID with a BBU may offer faster rebuilds as compared to a software-based solution.
Software RAID: No | Hardware RAID: Yes
Can act as a backup solution?:
Neither software nor hardware RAID can protect you against human errors, system failures or viruses. Daily scheduled and off-site backups of your system are highly recommended. Use tools such as rsync, rsnapshot, tar, dump, restore and others to make daily backups.
Software RAID: No | Hardware RAID: No
Recommended usage:
Software RAID:
+ Low cost solution
+ Better for RAID0 or RAID1
+ Single server / workstation
+ Perfect for home and small business users
+ No vendor lock-in
Hardware RAID:
+ Do you run a mission critical cluster or setup?
+ Heavy database driven dynamic site
+ Do you want the highest performance possible?

Other Factors

Powerful Modern CPU

The performance of a software-based array depends on the server's CPU performance and load. With today's faster CPUs, software RAID can match or even outperform hardware RAID in many workloads.
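
To make the CPU cost concrete: for parity-based levels such as RAID5, the extra work is mostly XOR over the data blocks of each stripe. The following is a minimal Python sketch under that assumption, not how any real driver is written. The names `xor_blocks`, `xor_parity` and `rebuild_block` are made up for illustration, and real implementations work on much larger blocks with optimized routines.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_parity(blocks: list[bytes]) -> bytes:
    """Parity block for one stripe: the XOR of all data blocks."""
    return reduce(xor_blocks, blocks)


def rebuild_block(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the one missing data block from the survivors plus parity."""
    return reduce(xor_blocks, surviving, parity)


# One stripe spread over three data disks (toy 4-byte blocks).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(stripe)

# Pretend the second disk died and rebuild its block from the rest.
recovered = rebuild_block([stripe[0], stripe[2]], parity)
print(recovered == stripe[1])
```

Every write touches the parity calculation and every rebuild reads all surviving blocks, which is why load on the server CPU matters more for parity-based arrays than for RAID0 or RAID1.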

Can RAID Array Fail?

Yes. The entire RAID array can fail, taking down all your data (yes, hardware RAID cards do die). Use tapes and other servers that can hold copies of the data but don't allow much interaction with it. Move your data offsite. Another option is to use two or three RAID cards and combine them to protect your data. This makes sure you can get your data back when one of your RAID cards dies.

Hardware vs Software Recovery

In my personal experience, recovering from software RAID is easy. However, finding out the exact hardware RAID requirements can sometimes be a nightmare. A good backup can save you from RAID hardware incompatibility problems. Software RAID allows you to mix different drives and sizes. You cannot do something like this with hardware RAID cards. With software RAID you can move the drives to a different server and read the data. There is no vendor lock-in with a software RAID solution.

You Cannot Go Wrong With Hardware RAID

There is an old saying in IT - no one ever got fired for picking RAID controllers.

Use Both Hardware and Software RAID

Sometimes you need to use both hardware and software RAID to get the best of both worlds. For example, set up 4 mirror pairs, 2 on each hardware RAID controller, and use software RAID0 to put it all together. This will give the best performance for a database server. Here is another example from one of our DR site servers (this box mirrors the files and databases of our 30+ production servers):
  1. Server chassis with redundant power supplies
  2. Intel or AMD Dual Core CPU x 2
  3. 16GB ECC RAM
  4. 24 hot swappable drive bays
  5. 2 x PCIe / PCI-X hardware RAID controllers
  6. 4 x Intel 1000 PCI-X LAN cards (bond them together)
  7. 24 x 1TB SATA hard disks
  8. OS - Pick - Linux / FreeBSD / OpenSolaris
  9. Filesystem - Pick - ZFS / UFS / Ext3 (we use RAID-Z)
  10. Backup software - rsync, rsnapshot and MySQL in slave mode.
Now you can configure a RAID0 stripe across the three RAID6 arrays (3 x 8 = 24 disks) using both hardware and software solutions together. This massive storage system is perfect for online live backups.
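
As a rough check on what such a layout yields, usable capacity can be estimated with the standard rules: RAID1 keeps one copy, RAID5 loses one disk to parity, RAID6 loses two, and RAID0 adds its members together. The sketch below assumes the 24 x 1TB disks are split into three RAID6 arrays of 8 disks each, and it assumes 1TB disks for the mirror-pair example as well. The function name is hypothetical, and filesystem and controller overheads are ignored.

```python
def usable_capacity(level: str, members: list[float]) -> float:
    """Estimate the usable capacity of one array from its member sizes (in TB).

    Members may be raw disks or the usable sizes of nested arrays.
    Every member is treated as if it were the size of the smallest one,
    which is a conservative estimate when sizes are mixed.
    """
    n = len(members)
    smallest = min(members)
    if level == "raid0":
        return smallest * n
    if level == "raid1":
        return smallest
    if level == "raid5":
        if n < 3:
            raise ValueError("RAID5 needs at least 3 members")
        return smallest * (n - 1)
    if level == "raid6":
        if n < 4:
            raise ValueError("RAID6 needs at least 4 members")
        return smallest * (n - 2)
    raise ValueError(f"unsupported level: {level}")


# Three RAID6 arrays of 8 x 1TB disks, striped together with RAID0.
raid6_sets = [usable_capacity("raid6", [1.0] * 8) for _ in range(3)]
total = usable_capacity("raid0", raid6_sets)

# Four mirror pairs striped with RAID0 (1TB disks assumed for illustration).
mirror_pairs = [usable_capacity("raid1", [1.0, 1.0]) for _ in range(4)]
mirrored_total = usable_capacity("raid0", mirror_pairs)

print(total, mirrored_total)
```

Under these assumptions the 24-disk box ends up with 18TB usable out of 24TB raw, while the four mirror pairs give 4TB out of 8TB raw.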

Conclusion

So which one is better, software RAID or hardware RAID?
Short answer - Neither.
Long answer - It depends upon your setup and requirements. I strongly recommend running both with benchmarking software to measure your disk I/O. Test both solutions by removing hard disks, i.e. fail a few drives. Try running the system while drives are failed. Note down the system load and errors (if any). Reboot the system. See if you can boot. Can you see your data again? Are you comfortable using the tools provided with both solutions? See what works for you.
Finally, when choosing storage always consider speed, reliability, and cost - pick any two.
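
When you fail drives during such a test on Linux software RAID, the kernel reports the array state in /proc/mdstat. The following is a minimal Python sketch that picks degraded arrays out of that text. The sample text is illustrative (device names and block counts are made up), the function name is hypothetical, and hardware controllers need their vendor's tools instead.

```python
import re

# "md0 : active raid1 ..." starts the description of one array.
ARRAY_RE = re.compile(r"^(md\S*) :")
# "[3/2] [UU_]" means 3 devices expected, 2 active, third one missing.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text: str) -> dict[str, str]:
    """Map each degraded md array name to its member map, e.g. 'UU_'."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active, member_map = status.groups()
            if int(active) < int(expected) or "_" in member_map:
                degraded[current] = member_map
            current = None
    return degraded


SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid5 sdd1[2](F) sdc1[1] sdb1[0]
      2096896 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a real system you would pass in the contents of /proc/mdstat instead of `SAMPLE` and record the result alongside the system load while the drives are failed.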

A Final Note About My Personal Choice

I have been successfully using Linux and FreeBSD software RAID for several years for backing up my own data. I prefer to use software RAID to save money and to avoid vendor lock-in. I back up all my personal data using the following hardware:
  • 1.5 TB USB hard disk - rsync and rsnapshot are used to make backups of all my servers and digital camera.
  • 80GB x 3 hard disks in software RAID using FreeNAS. Again, rsync is used to make all backups. I'm planning to replace UFS with RAID-Z under FreeBSD 8.
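
An rsync-based backup like the ones in this list can be sketched as a small wrapper that builds a dated snapshot command. This is an illustrative sketch rather than the exact setup described here: the paths and the function name are hypothetical. The `--link-dest` option is a standard rsync option that hard-links unchanged files against a previous snapshot, which is the same idea rsnapshot is built on.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Optional


def build_rsync_command(source: str, backup_root: str, today: date,
                        previous: Optional[str] = None) -> list[str]:
    """Build an rsync command that copies `source` into a dated snapshot directory."""
    target = PurePosixPath(backup_root) / today.isoformat()
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Unchanged files become hard links to the previous snapshot.
        cmd.append(f"--link-dest={PurePosixPath(backup_root) / previous}")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", str(target) + "/"]
    return cmd


# Hypothetical paths: a home directory backed up to a USB disk.
cmd = build_rsync_command("/home/user", "/mnt/usbdisk/backups",
                          date(2010, 8, 9), previous="2010-08-08")
print(" ".join(cmd))
```

The function only builds the argument list, so it can be inspected or logged first. To actually run the backup, the list can be handed to `subprocess.run(cmd, check=True)`.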

RAID Alternatives

Local disks on MogileFS storage nodes can be in a RAID, or not. It's cheaper not to, as RAID doesn't buy you any safety that MogileFS doesn't already provide. MogileFS is quite popular among Web 2.0 companies where users upload lots of photos, images and files.
RAID-Z ZFS Storage is a data/parity scheme like RAID-5, but it uses dynamic stripe width. Every block is its own RAID-Z stripe, regardless of blocksize. This means that every RAID-Z write is a full-stripe write. It doesn't require any special hardware.
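
Why full-stripe writes matter can be shown with RAID-5 style parity arithmetic. When only one block in a stripe changes, the array has to read the old data and the old parity before it can write the new parity, whereas a full-stripe write computes parity from the new data alone. The sketch below is a toy Python illustration of that identity, not ZFS code, and all names in it are made up.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(blocks: list[bytes]) -> bytes:
    """Parity computed from the data blocks alone (no reads needed first)."""
    parity = bytes(len(blocks[0]))  # all-zero block of the same length
    for block in blocks:
        parity = xor(parity, block)
    return parity


def partial_update_parity(old_parity: bytes, old_block: bytes,
                          new_block: bytes) -> bytes:
    """Read-modify-write: old data and old parity must be read first."""
    return xor(xor(old_parity, old_block), new_block)


stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = full_stripe_parity(stripe)

# Overwrite only the second block of the stripe.
new_block = b"ZZZZ"
updated = partial_update_parity(parity, stripe[1], new_block)
stripe[1] = new_block

# Both routes must agree on the parity of the modified stripe.
print(updated == full_stripe_parity(stripe))
```

Both routes give the same parity, but the partial update costs two extra reads per small write, which is the overhead a scheme that always writes full stripes avoids.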
