Sorry, but you're not correct. A CPU is a general purpose processor that can do all sorts of things. A RAID controller will have a high speed XOR/Parity engine where that's all it does.
Just like how the fastest CPU can't even touch the performance of a $60 graphics card (when it comes to rendering 3D graphics,) a CPU is also not as fast as a RAID card with performing parity calculations.
Well, maybe a CPU can do it faster, but you'll use a nice chunk of your CPU capacity on disk I/O. Not to mention you have to shove more data back and forth through the system bus because you have to access each disk individually and send/receive the parity chunks.
New RAID controller chips hit the market all the time, each much faster than the previous. That's why we're seeing most RAID controllers support RAID 6 now - the better performing chips in these things are now fast enough to do the additional parity calculations.
Maybe some day it won't matter but right now, it just does.
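To make the parity part concrete, here's a minimal Python sketch of the XOR math a parity engine does for RAID 5. The chunk contents and the xor_blocks helper are made up for illustration; a real controller does this in dedicated hardware on full-size stripes, so treat it as a sketch of the idea and nothing more.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte column at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data chunks from one stripe (made-up contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk: XOR the survivors with the parity chunk
# and the missing chunk falls out.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The same XOR pass is used both to generate parity and to rebuild a lost chunk, which is why every write (and every rebuild) costs extra work somewhere, either on the card or on the CPU.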
Well, Windows does. Taking a snapshot of NTFS, even on a heavily used 1TB+ file server, takes only a few seconds, and under normal operation the file system is still fast.
NTFS is actually a pretty good file system. It's probably because it borrowed ideas from HPFS, which Microsoft and IBM developed together for OS/2.
I agree with you completely, but a disk-based backup system wasn't the intended target for the discussion.
I have to disagree with you about SyncToy. I absolutely hate SyncToy. Use a disk snapshot system instead, such as Backup Exec System Recovery (i.e. Ghost 14 for servers.) You can do incremental with that and you can even preserve bootability and all metadata.
The actual implementation may vary, but in its simplest form it's the same as RAID 5 with two sets of distributed parity.
Well, I did mention FreeNAS so that lends itself to the possibility that I *probably* know what OpenFiler is.
SATA disks actually aren't fine for a lot of applications. Any SINGLE app, I'll bite. But for most VMware installations where you have over 10 virtual machines (that are actually USED in production) your SATA disks might not cut it. Or they might be fine. It really depends.
It's not about disk transfer speed, it's about IOPS. The 10 or 15K SAS/FC disks will get your data faster. And that's what it's all about. Nearly all normal infrastructure-type servers (File servers, e-mail, normal-use databases, etc) require a lot of IOPS but don't really care about throughput. It takes basically the same amount of time to fetch 4k as it does to fetch 1MB.
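Here's a rough back-of-the-envelope sketch of why spindle speed matters for IOPS. It estimates random IOPS as one over (average seek time plus half a rotation). The seek times below are my assumptions for typical drives of each class, not measured numbers, and the model ignores caching and command queuing.

```python
def random_iops(rpm, avg_seek_ms):
    """Estimate random IOPS from spindle speed and average seek time."""
    # On average the platter has to turn half a revolution to reach the sector.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)

# (rpm, assumed average seek in ms) -- illustrative values only.
drives = {
    "7200 RPM SATA": (7200, 8.5),
    "10K SAS": (10_000, 4.5),
    "15K SAS/FC": (15_000, 3.5),
}

for name, (rpm, seek_ms) in drives.items():
    print(f"{name}: ~{random_iops(rpm, seek_ms):.0f} IOPS")
```

With these assumed numbers a 15K disk does a bit more than twice the random I/O of a 7200 RPM disk, which is the gap you feel when a pile of VMs are all hitting the same array.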
I'd love to be able to offer an OpenFiler solution to our customers, and I'm pushing for it for some of our smaller clients that want to go virtual, but it's not an easy sell. For home, it's great. For a one-off project or for a non-critical backup system, sure. Production? I trust it, but I live in the real world where our customers don't.
Really? Because I thought the main reason for not going RAID 4 was the performance hit of having a single disk getting hammered with parity information all the time.
RAID 5 is always preferred because it offers the same protection but distributes the parity for better performance.
You don't count - you understand the risk =)
You're running 300GB disks right? It's amazing, I have a 7 disk RAID 5 array on my file server at home and it's got 4.1TB =) I put it together about two months ago. I'd wanted to do it for a LONG time but until those 750's came down to the right price (about $100 each) I just couldn't justify it. The server also has four 500's in it (I mean, you can buy a 500GB for nearly $50 now!)
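For anyone wondering how seven 750GB disks turn into about 4.1TB, here's the arithmetic as a small sketch. It assumes the usual RAID 5 rule that one disk's worth of space goes to parity, that the drives are sold in decimal gigabytes, and that the OS reports binary units; the function name is just for illustration.

```python
def raid5_usable_bytes(disk_count, disk_bytes):
    """Usable capacity of a RAID 5 set: one disk's worth of space holds parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_bytes

DISK_BYTES = 750 * 10**9  # a "750GB" drive in decimal bytes (assumed)

usable = raid5_usable_bytes(7, DISK_BYTES)
usable_tib = usable / 2**40  # what the OS shows when it counts in binary units
print(f"{usable / 10**12:.1f} TB decimal, {usable_tib:.2f} TiB as reported")
```

So six disks of data plus one of parity, and the binary-vs-decimal gap eats most of the rest.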
The cost comes with the RAID controllers. I wanted good performance so I got a hardware RAID card for it and the SATA RAID cards ain't cheap. I settled on a 12-port Accusys card. I love the thing. Couldn't be happier with it. I purchased a 4-port version for my VM host.
No way I can really back that up so I replicate a good portion of it to a friend's file server at his house. There was a lot of pre-staging involved, obviously, but DFS Replication works wonderfully for keeping things up to date in both directions.
I agree.
But these days, I believe everyone should use RAID if they can. RAID 5 is a great mix of performance, reliability, and price. Who wants to deal with going to the backup system to recover data when a single drive fails, when it could have been avoided completely with a RAID?
That's another funny one I see sometimes.
While I agree with your post and you obviously have a little experience with these matters (versus a lot of folks on Slashdot - it's surprising) there's no such thing as "server grade" when it comes to the quality of a hard drive.
Do you think that they build "server" drives in a clean room, and "desktop" drives in a slightly less clean room?
Hard drives are manufactured right next to one another. Some will be SATA. Some will be SAS. Others will be Fibre Channel. ALL have the same incredible tolerances and accurate measurements. Hard drives are a miracle of modern technology.
The MTBF for hard drives is more of a warranty than a fact. There's no correlation between failure rates in server disks versus PC's.
Indeed.
Personally, I believe the correct answer to ensuring data recoverability is RAID together with real-time replication. You can usually accomplish this with a very acceptable price-point.
RAID is important to prevent down time due to a single disk failure, and replication prevents loss of data due to an array failure.
Personally I think RAID5/6 will be around for a very long time because it works and there's actually people out there that use it correctly (versus SO MANY people on Slashdot, apparently.)
Gosh, there's even one guy a few posts up that's claiming "The biggest problem with RAID is DECAY." Holy crap. Any RAID card made in the last 10 years will periodically scrub the disks and make sure the parity is correct - or else it will mark a disk as bad.
This is mitigated quite a bit by hardware RAID controllers, SMART, and data validation.
I've personally never experienced this "RAID decay" you speak of in the last 15 years of working with storage systems. And some of the arrays our customers have running consist of very old disks and controllers.
Lots of people use RAID 5. Well, let me correct that. MOST people use RAID 5. It's by far the most commonly used RAID system because of a good balance of protection, performance, and cost.
Only the most demanding systems will go 10, or the most paranoid admins. But for most things and most admins, RAID 5 works well. RAID 6 is becoming more popular, too - but it's basically the same thing. Instead of RAID 5 with a hot-spare, folks use RAID 6 without one.
You should always follow common sense: don't build 14-disk arrays, and make sure your backups work.
RAID 5 is still relevant and will remain so for the foreseeable future.
Yea, except for the fact that ZFS isn't fast when you're using all the nifty features.
Normal server-class machines (and many workstations) have hardware RAID controllers, which do all of the parity calculations themselves. ZFS in "RAID" mode is done all in software, so it's got the same disadvantages as traditional RAID in software.
Software RAID like what is used in ZFS has a lot of advantages. It's like what LVM does in Linux - you can RAID individual partitions. But they all have the large performance hit.
Perhaps there could exist a "ZFS Accelerator" card that would do the parity checking for you, but as far as I'm aware that doesn't exist. On a big beefy Sun box it might not be a problem but on your desktop PC it will be.
Then there's the issue of BUS utilization. While it's true that PCIe is a fast bus, you're dealing with quite a bit less data over the bus by using a hardware RAID controller. The only thing that hits the system bus is the actual I/O. With a software RAID, all of the I/O, plus all of the parity reads/writes, must also traverse the bus.
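A quick sketch of the bus math for small writes, assuming the classic RAID 5 read-modify-write cycle: read the old data, read the old parity, write the new data, write the new parity. With software RAID all four transfers cross the host bus; with a hardware card the host hands over one write and the card does the rest on its side. The 4096-byte block size and the function name are assumptions for illustration, and the model ignores caching and full-stripe writes.

```python
BLOCK_BYTES = 4096  # assumed block size

def host_bus_bytes(small_writes, hardware_raid, block_bytes=BLOCK_BYTES):
    """Bytes crossing the host bus for a number of single-block RAID 5 writes."""
    # Software RAID: old data + old parity in, new data + new parity out.
    # Hardware RAID: the host only sends the new data to the controller.
    transfers_per_write = 1 if hardware_raid else 4
    return small_writes * transfers_per_write * block_bytes

writes = 10_000
print("hardware:", host_bus_bytes(writes, hardware_raid=True), "bytes")
print("software:", host_bus_bytes(writes, hardware_raid=False), "bytes")
```

Under those assumptions software RAID pushes four times the data over the bus for the same small-write workload.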
ZFS shows promise but I'll stick with hardware accelerated RAID until such time as software RAID poses no penalties.
You can use large disks in the enterprise without being an idiot.
In fact, the size of the disk has absolutely nothing to do with failure rates.
Basically stated, unless the data is used rarely or by a single host, you'd need a lot more spindles just to get the performance you'd need from that much data consolidated in one spot.
It's cheap - yes. For drives from NewEgg.
I can go and buy a bunch of 500GB disks for around $55 each now. It's amazing!
But what happens if you need Fibre Channel performance and share LUNs for Clustering or VMware? Your SATA disks from NewEgg won't help you.
Even low-end SAN systems with iSCSI connectivity aren't exactly free. It's not really about the cost of each disk, it's the cost of the management unit and the hard disk bays. Not everyone has the ability to just drop in a bunch of grey boxes running FreeNAS. Sometimes you need an actual SAN and they cost money even when you use SATA disks.
I guess you should be considered a new age Luddite?
Are you the same guy that always waits for SP1 before using any software? I thought so.
RAID is a proven technology and it's used in nearly all business IT systems from big to tiny.
RAID isn't meant as a replacement for backups. It's one PART of the entire system of preventing unnecessary data loss, and more importantly, down time. You can keep on running your server while the failed disk is replaced and rebuilt.
So, while I eat Cheetos and surf Slashdot while that RAID array rebuilds itself, you can go ahead and recover your old data from last night all day long while people bitch at you for not using the technology that's been around since the inception of the hard drive.
If you actually did have the experience you claim, you'd slap yourself for such a stupid fucking post.
At home, I use RAID to protect my non-replicated data between backups.
At work, I use RAID to protect my data between backups and to help prevent down time due to a disk failure.
RAID is an excellent tool to protect yourself. There's multiple levels of protection and RAID is just one of them.
Seriously - what's the problem with RAID 5? It's not a FALSE sense of security: It actually DOES prevent data loss or down time on a single disk failure. If you're a moron, you're creating 14 disk arrays. If you're smart, you keep it to 7 disks at the very most.
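Here's a rough sketch of why array width matters. It assumes each surviving disk has the same small, independent chance of dying during the rebuild window. The 0.1% figure is a made-up placeholder, not a measured failure rate, and real disks bought in the same batch aren't truly independent.

```python
def second_failure_probability(disk_count, p_fail_during_rebuild):
    """Chance that at least one surviving disk fails before the rebuild finishes."""
    survivors = disk_count - 1
    return 1 - (1 - p_fail_during_rebuild) ** survivors

P_FAIL = 0.001  # assumed per-disk failure chance during one rebuild window

for disks in (7, 14):
    risk = second_failure_probability(disks, P_FAIL)
    print(f"{disks}-disk RAID 5: {risk:.2%} chance of a second failure")
```

Whatever per-disk number you plug in, the 14-disk set carries a bit more than double the risk of the 7-disk set, and its rebuild has to read more disks too.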
RAID 5 is great. It's fast, unless you have a shit controller without enough cache. It's going to prevent down time on a single disk failure (which is overwhelmingly the most common type of failure) and it doesn't cost you too much capacity.
Usually I'm more concerned with a fire or flood than a double-disk failure.
RAID 6 is good, but you get the same (actually worse) performance hit over RAID 5. More parity calculations. You can lose any two disks, which is nice, and if you can spare the space, go for it!
I don't see RAID 6 as being all that much more of a big deal over RAID 5 and actually it shouldn't really have its own number since it's exactly the same technology and parity system as 5. It should be RAID 5.1 or something. Or maybe RAID5+1. The only reason it's become more available now is because controllers have gotten fast enough to deal with the additional parity.
Wait - WHO SAID THAT?
I don't think I've ever met anyone that thought RAID was a replacement for backups. Have you? Wait don't answer that, I don't need a made-up story.
And I beg to differ on the "many enterprises concern most with disk speed" - no. Even small companies now have large data needs, and the very first thing to consider on any storage solution is usable disk space - because if there's not enough space then it doesn't work does it?
Performance is a close second, and reliability is simply taken for granted. You're always going to use a RAID set. It just depends on how much performance you need and how much you can spend.
Storage capacity is always #1 on the list.
RAID 5 isn't a false sense of security. It actually DOES protect you from a disk failure.
I made the decision about two years ago that all disks at home will be either mirrored or RAID5. Disks are so dirt cheap that there's no reason not to.
RAID doesn't save you from needing some sort of backup solution. But if you can't trust yourself to do backups unless you're being risky with your data, fine: I'll happily avoid restoring data and all that bullshit after a single disk failure, and you can sink your time into doing it all manually.
Easily solved by file system snapshots, which you should be doing for an important file server no matter what operating system you're using.
I'm with you on this, but the problem with many RAID5 systems is that you usually purchase all of the drives at once, so it increases the likelihood of a double-disk failure since all the drives are the same age.
Mind you, I've worked in IT for over 15 years and I've never had the fortune to experience a double disk failure.
On my own system at home, I have three RAID5 sets on two servers with hardware RAID SATA controllers (Accusys controllers - they're really nice!) If I were to experience a disk failure, I'd turn off the server and go get a replacement disk before turning it back on.
I also replicate all of the important data to a friend; we both have file servers, we both run Server 2003 R2 (Well, I run 2008 now) with DFS Replication, and we use an OpenVPN tunnel between us. This way, even if we had a bad disk failure we'd be okay.
In reality, you should have backups of critical data because I'm much more afraid of a fire or flood than a double-disk failure.
There are two very easy-to-use GUI package tools on Ubuntu. You don't have to use apt-get to install software.
You can double-click a .deb package and install it, and all the dependencies will install automatically. I dare say it's easier than many Windows packages.
Unless you're running Gentoo or another source-based distribution, you never have to compile software anymore.
Such an old tired argument. Get over it.
Linux distributions like Ubuntu are easy to use, easy to update, and easy to install new software with.
Photoshop will show up on Linux eventually.
Yea just ignore the whole part about you flying off the deep end because you thought I was making a negative comment about you. Good move.
I mean, you could have just let it be, and retained maybe a LITTLE bit of dignity, but nope, that's not how you roll, little man.
I've been PM for at least a dozen fairly large wireless installations, and I've been all over the Northeast to hundreds of different businesses - big and small - and only five used EAP because of the current device compatibility issues. It's a problem, and hopefully within the next year or two we'll see better compatibility.
But until then, you can continue to masturbate to the sound of your own words.