or stealing national treasures. When I go to Washington DC and find out that somebody ripped off the original Declaration of Independence, if I go to the library and find out someone stole and destroyed the only copy of a Henry Miller book, or if I go to a national park and find out an endangered, protected wild animal was killed by a poacher, it's the same as if someone stole GPL code and refused to release their changes, while simultaneously parading it about as their own code and offering it for sale.
There are three great options to get your servers out of the RAID-controller business. One is NAS (Network Attached Storage), the second is using native SCSI or IDE controllers with RAID provided by your OS. And lastly, you can buy a box that already is a RAID but just looks like one big fat drive and plug it in.
At work we run all our linux boxen with kernel mirroring, and it uses almost NO CPU even under pretty heavy parallel load. Great for the base OS with SCSI or IDE, since the only thing they'll do once they boot is swap to these. Striping your swap space across multiple drives really helps when a server starts running low on memory.
I have mirror sets running at 48 Megabytes a second on two year old 18 Gig 10k SCSI drives for streaming output, and they provide very good performance under parallel load as a database disk set.
I've never had the kernel RAID drivers act flaky since I started using them over two years ago, and I've done various things like hot insert a RAID disk in both RAID 1 and RAID 5 (both were pretty easy to do) and typed the respected, yet undocumented --really-xxxxx (xxxxx=a 5 letter word not mentioned here!) flag a few times.
A friend is in the process of building NAS servers in 2U units with multiple IDE cards and ~500 Gigs of storage for ~$3500 or so. SCSI versions would cost a bit more, be bigger, and probably need more cooling, but be faster too. Right now the IDE ones are fast enough with a RAID 5 configuration.
The IDE ones can flood a 100Base-TX connection, so performance isn't really an issue for anything on less than gigabit, and even then the IDEs will use up a goodly chunk of that.
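A quick back-of-the-envelope check of that claim, as a rough Python sketch. The 28 MB/s disk figure is an assumption borrowed from the IDE mirror bonnie number quoted elsewhere in this post (not a measurement of the RAID 5 NAS boxes), and protocol overhead on the wire is ignored, so real usable bandwidth is a bit lower than shown.

```python
def link_capacity_mb_per_s(megabits_per_s):
    """Raw link capacity in megabytes/second (8 bits per byte, no overhead)."""
    return megabits_per_s / 8.0


def saturates(disk_mb_per_s, link_megabits_per_s):
    """True if the disk set can stream at least as fast as the raw link."""
    return disk_mb_per_s >= link_capacity_mb_per_s(link_megabits_per_s)


def link_fraction(disk_mb_per_s, link_megabits_per_s):
    """Fraction of the raw link that the disk set could fill."""
    return disk_mb_per_s / link_capacity_mb_per_s(link_megabits_per_s)


# Assumed sequential read rate for an IDE set, in megabytes/second.
ide_mb_per_s = 28.0

print("100Base-TX raw capacity:", link_capacity_mb_per_s(100), "MB/s")
print("Floods 100Base-TX:", saturates(ide_mb_per_s, 100))
print("Floods gigabit:", saturates(ide_mb_per_s, 1000))
print("Share of gigabit used:", link_fraction(ide_mb_per_s, 1000))
```

With those assumed numbers the disks outrun Fast Ethernet by more than 2x and would fill a bit over a fifth of a gigabit link.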
The external RAIDs are often the fastest for databases, offering fibre optic connections. They're not cheap, but if you're running eBay's database, cheap isn't the point anymore. :-)
If you have to have a RAID card, I can recommend the AMI MegaRAID 428, which used, on eBay, goes for $100 right now. Not that fast (I never got more than 20 Megabytes a second from one) but very solid and reliable, and they can hold up to 45 SCSI hard drives if you can afford the cooling and electrical for them. Plus the first channel looks like a regular SCSI card to anything other than a hard drive, like a tape drive or CD-ROM, so you don't need another SCSI card if you want a tape drive to back it up.
While the MegaRAID site no longer has configuration software available, this site:
http://domsch.com/linux/#megaraid
points to this site:
http://support.dell.com/us/en/filelib/download/index.asp?fileid=R28825
on Dell where you can find management software for the MegaRAID controllers.
While I'll agree that SCSI is superior for most applications, IDE is no slouch nowadays.
On one of our production servers we have twin 18 Gig 10k rpm Ultrawide SCSI drives for the database, and a pair of 80 Gig IDE drives for the static data like web content.
The pair of U2W SCSI drives in a RAID1 can be read at about 48 Megs a second by bonnie, while the pair of 80 Gig IDEs can be read at about 28 Megs a second.
pgbench, a little benchmarking program for postgresql, gets about 150 to 200 transactions per second on the dual SCSI drives, while it gets about 100 to 120 on the dual IDE drives.
The problem is, even under its heaviest loads, that machine never handles more than 10 or 20 transactions every second. Both sets of drives are plenty fast enough to handle the load.
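To put numbers on "plenty fast enough", here is a small Python sketch that divides the pgbench figures above by the worst-case real load of 20 transactions per second. The function name is made up for illustration, and pgbench results are only a rough stand-in for a real workload.

```python
def headroom(benchmark_tps, peak_load_tps):
    """How many times over the drive set can cover the observed peak load."""
    return benchmark_tps / peak_load_tps


# Worst-case production load quoted in the post, in transactions/second.
peak_load_tps = 20

# (low, high) pgbench transactions/second quoted in the post.
pgbench_tps = {
    "dual SCSI": (150, 200),
    "dual IDE": (100, 120),
}

for name, (low, high) in pgbench_tps.items():
    print(name, "headroom:",
          headroom(low, peak_load_tps), "to", headroom(high, peak_load_tps))
```

Even the IDE pair has about five to six times the capacity the machine ever uses, which is the whole point: past that margin, the extra speed of SCSI buys nothing on this box.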
For servers that need hundreds of gigabytes of storage but only have to provide static storage for a small to medium group, the money you'd spend on SCSI is probably better spent on other options for that server.
For a database server handling hundreds of concurrent users, SCSI (via electrical cables) is a good choice, but maybe a SCSI over FC-AL setup would be needed.
Engineering isn't about which component is the absolute best, it's about which component makes the most sense for what you're doing.
In Linux, and many other flavors of unix, you could buy three drives, set all three up as a mirror set, then pull one of the drives and take it offsite. If one drive dies, you can replace it without interruption; if both fail, you can just bring in the one from offsite and plug it in. Given the speed / price / performance of modern IDE hard drives, you could have two offsite drives AND a mirror set for less than most medium to large tape drives, and as far as I know, hard drives have a pretty long shelf life (usually >10 years) assuming they are treated well.
My votes would go to:
Storm Constantine for the Wraeththu trilogy, Piers Anthony after having read Anthonology ("On the Uses of Torture" and "The Toaster" are two of the best shorts ever) and Parke Godwin for Waiting for the Galactic Bus, among other works.
And everyone knows that the guys at mysql.com are gonna be able to install, configure, and tune postgresql to be an optimal dbms just like they did with mysql.
These guys couldn't even get vacuum to run, a command I've never had fail...
They probably have no idea how to optimize the query planner, change the buffer memory blocks, or create the right types of indexes to accelerate the database. And that's ok, but you should realize these things before you go by a benchmark made by one company against a competing product.
Someone please mod the parent to this comment up. Very valid point.
Many people have their shiny new 50+Gig hard drives in an old AT case with inadequate cooling. I had to move my Maxtor 30 gig to a different area in the case of my old box that had better flow and cooling to get it to work reliably. Who knows how much life I took off of it running it hot...
True, but I think the parent post to yours is saying they'd probably be open to doing it, they just don't realize they're in violation yet, and what that means.
But I'm not so sure. I worked with the parent company a bit, and they were pretty secretive about their technology.
Anybody else remember "The Halley Project" for the Amiga? That was an excellent learning game. You flew around the solar system in a little space ship in missions where you'd do things like "land on a moon with an atmosphere".
Taught you how to navigate, and you had to go to the encyclopedia to look stuff up to know where to fly.
Very cool old game.
Just a couple days ago my IP changed (lease expired) so I had re-entered my port 80 mapping to my linux box through my 675 this very morning before going to work. Glad to know I got it done just in time to keep my 675 working.
Let's just grow it up there. Seeds good to eat, use the oil to power the rocket motors, and make our clothes outta the fiber. Hell, other than a few miscellaneous vegetables, we'd have all we need to survive in one plant.
Yeah, and that site is running apache and php, so it can handle a bit of load. Musta had a (good) prima donna program it too.
Interestingly enough, I've seen corporate environments where the arrogant idiot in marketing gets much further ahead than the arrogant genius in development.
Also, the steel plates used to build the hull had very high sulfur content and were quite brittle, even by the standards of the day.
During the 60s, an SR-71 pilot flying at over 100,000 feet and over mach 3 had a catastrophic SAS failure (SAS is the stability augmentation system that makes big planes easy to fly, basically) when its sensors reported less fuel in the rear tanks than there actually was. The pilot initiated a turn and the plane simply disintegrated around him.
Luckily for him he was wearing the "space suit" built for use in that plane, and he was basically uninjured. I read about that in some literature about the Lockheed Skunk Works once. I wonder if they have that story online somewhere now.
I too answered human for race. The guy gave me a strange look (he had dropped by), but I stuck by my guns.
After talking to the folks at reviewboard, they have agreed to pull the warmed over 50% changed article (and already have). They've also agreed to issue an apology to Josh (we'll see if they do).
So, maybe, just maybe, there's someone there with some morals who'll do the "right thing."
They've changed a few sentences here and there, but it is pretty much all the information, in the same order, from the original article.
http://www.reviewboard.com/Section/Cover/E10k
Such blatant thievery. I truly hope everyone will boycott this shady, fly by night organization; they do NOT deserve your hits.
These guys need to do two things to make this right (TM).
They need to immediately release it in compliance with the GPL, and they need to review all their internal code (possibly allowing a single outside authority to examine it) to assure us they have not violated the GPL in any of their other software.
If every package they offer is just repackaged GPL code, and they have charged money, they owe all of that minus maybe tech support costs to the original authors, along with an apology.
Stealing GPL code and not releasing it is like selfishly stealing a valuable resource held in the public trust for yourself, and denying it to others. Everyone who is interested in the software this company sells owes it to themselves and this company to email them and request they prove they are complying with the GPL in ALL their software modules, or refuse to do business with them.
The reason for treble damages is manifold. The obvious reason is to punish the offender for willfully ignoring the law. But the secondary reason is to offset the court cost of simple people suing for something they rightfully deserved to get. I.e. if it cost a single copyright holder $25,000 to bring suit for a GPL violation, several could throw $5,000 in a hat and start a suit, seeing who joined in. The cost would be minimal if a lot joined in (class action suit) compared to a bunch of single suits.
But if you were the only person on a GPL project attempting to sue someone for a violation, could just put food on the table month to month, and your lawyer takes it on contingency plus costs, you might get into and out of mediation for $10,000 to $50,000 if it was an open and shut case, local/international, etc... found by the judge/panel et al. with prejudice. Then maybe you need treble damages. Heck, you still may not collect them on review, and then you would lose money.
Almost every opinion I've read on treble damages, even from judges that didn't favor the reason for the law that imposed it (labor's a great subject area for that kind of logic)... where was I? Oh yeah. Nearly every one of them supported the treble damages in cases where the violator was flagrant in its abuse of the law.
IANAL, just read a lot of that gunk.
Wasn't it Kepler who looked over Brahe's work to work out his law of conservation of angular momentum?
Actually, you wouldn't have to anchor it. Just make it fairly massive, so that it rises and falls very slowly, and make several hundred upside down bowl enclosures each with its own small (in comparison to the size of the station) generator in it.
mysql. Tim had problems with the earlier (6.5.3) versions of postgresql and corrupted indexes, I believe.
With regards to postgresql... Did you try bracing begin/end commands at each end of your super large insert series? On my box, 10000 inserts run in 11 seconds with begin end. 100 run in 12 seconds without them.
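For anyone who hasn't tried it, here is a minimal sketch of what wrapping the inserts looks like. To keep it self-contained it uses Python's built-in sqlite3 module with an in-memory database instead of postgresql, so that part is a stand-in and the table and function names are made up. The idea is the same: without BEGIN every INSERT is its own transaction, with BEGIN ... COMMIT the whole batch is one (postgresql also accepts END as a synonym for COMMIT). Timings on an in-memory database won't show the on-disk gap described above, so don't read anything into the numbers it prints.

```python
import sqlite3
import time


def timed_inserts(n, one_transaction):
    """Insert n rows, optionally inside a single explicit transaction.

    Returns (rows_inserted, elapsed_seconds).
    """
    # isolation_level=None puts the connection in autocommit mode, so each
    # statement commits on its own unless we issue BEGIN ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

    start = time.perf_counter()
    if one_transaction:
        conn.execute("BEGIN")
    for i in range(n):
        conn.execute("INSERT INTO t VALUES (?, ?)", (i, "row"))
    if one_transaction:
        conn.execute("COMMIT")
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return count, elapsed


batched_rows, batched_secs = timed_inserts(10000, one_transaction=True)
single_rows, single_secs = timed_inserts(100, one_transaction=False)
print("one transaction:", batched_rows, "rows in", batched_secs, "s")
print("autocommit:", single_rows, "rows in", single_secs, "s")
```

On a real on-disk database the per-statement commit is where the time goes, since every commit has to be made durable before the next insert starts.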
The fix for that is easy, either:
?> php -q script_name
or put #!/usr/local/bin/php -q at the top of all your scripts.
flush() will force output to stdout.
Hey, how about using this to put a 1983 era mainframe on your wrist? According to the article on the linux->mainframe emulator I read the other day (see http://www.byte.com/column/BYT20000801S0002) the mainframe runs nicely on 8 megs of ram.
Anyone running MVS on their wrist would have to qualify as cool.