Data Storage Leaders Introduce New Wares
louismg writes "Data storage giant EMC announced upgrades to their storage hardware family this morning, and claimed performance increases of 25% to 100%, with increased capacity and disk speeds. This comes two weeks after competitor BlueArc announced Titan, the world's biggest ever NAS box, which claims throughput of 5 Gbps and 256 terabytes in a single hardware file system.
How much is enough, and as IT administrators, what is the answer to today's issues - improved hardware, or software?"
I predict that the storage industry will continue to produce boring incremental improvements on archaic paradigms until somebody comes out with something revolutionary. Yes, that was vague and truly deep. Since you probably didn't read the article, here's the spoiler: it's essentially the same thing the author of the story said. Given the history of the industry, you can bet you'll get old and go grey before something revolutionary comes from one of the established players.
Something revolutionary is coming soon though.
Also today, Seagate launched a family of server-class 2.5" drives sporting 10k rpm and an Ultra320 SCSI or Fibre Channel interface. No details on Seagate's web site yet, though.
HIV Crosses Species Barrier... into Muppets
...is still broken. My company is finishing up a particularly nasty lawsuit with EMC now over the crap that they "sold" us. I'd advise anyone in a position to make a purchase for their company to consider all the options before going with EMC. Their products are unfinished and unreliable. Ugh.
What they need is improved backups. I don't give a fig about space if I can't back it up. So maybe someone should be looking at how we're supposed to be backing up or archiving this stuff. Or are we supposed to keep a warehouse of EMCs around? I can lay a bet that we are going to need more serious backup infrastructure than what we have today to keep up.
sri
BlueArc appears to charge about $100/GB for storage solutions, and claims that its price is less than its competitors'. At first, this looks to me like an insanely high price because my last hard disk cost $0.88/GB. But after some thought about the other hardware involved, I figure I could build an almost equally capable solution for $8-$20/GB, not counting software development costs. But adding the cost of the room to hold it all, plus the insane electrical and air conditioning costs, $100/GB is starting to look fairly reasonable for those who really need what they offer, and need it soon.
If you're using Linux and want to copy a lot of stuff from one place to another, you can use dd ('disk dump', designed for moving large files) and specify a blocksize of a few megs; this means that you will be moving data a few megs at a time, rather than a few K at a time. Of course, this means that you have to use that much more memory. Also, I would imagine that Cygwin would allow you to use dd under Windows; another option is NTFS, where transfers from one directory to another on a single drive are nearly instantaneous. Of course, then you lose compatibility; while FAT variants are understood by almost all OSes, you will have an unpleasant time trying to mount and use an NTFS volume from anything other than Windows. It's all about tradeoffs, but hopefully something here will help.
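For what it's worth, here is a rough Python sketch of the same idea as dd with a large blocksize: read and write a few megs per call instead of a few K. The function name copy_in_blocks and the 4 MiB default are my own assumptions for illustration, not anything dd itself defines.

```python
def copy_in_blocks(src_path, dst_path, block_size=4 * 1024 * 1024):
    """Copy src_path to dst_path, reading block_size bytes at a time.

    Returns the number of blocks written. Memory use is roughly one
    block, so a bigger block_size trades RAM for fewer read/write calls.
    """
    blocks = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # empty bytes object means end of file
                break
            dst.write(block)
            blocks += 1
    return blocks
```

Whether this actually helps depends on the OS cache and the drive, so treat it as something to measure rather than a guaranteed win.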
That's it. I'm no longer part of Team Sanity.
Because you are copying from a disk to itself.
:-)
All the "max bandwidth" figures you see are for streaming reads, where the disk heads move (relatively) smoothly along logically contiguous chunks of disk.
Compare that to copying from one part of the disk to another. Your 100Mbyte file will be copied in chunks. The sequence of events will go something like this at a low level:
while( data left to copy )
{
    move disk heads to offset in file to be read
    read a chunk
    move disk heads to offset in file to be written
    write a chunk
}
The bit that really costs you is the two seeks. For a disk with an advertised seek time of 10ms, you are paying 20ms per chunk on top of your read+write times.
20ms/chunk == 50 chunks/second. So, 5Mbytes/second would be 100Kbytes/chunk, (assuming the actual read+write are free). [If you meant 5Mbits a second that would be ~12Kbytes/chunk.]
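The arithmetic above can be sketched as a tiny Python model. The function name copy_throughput and the optional media_bytes_per_s parameter are made up for illustration; the 10ms seek time is the advertised figure used above, and by default the read and write themselves are treated as free, same as in the estimate.

```python
def copy_throughput(chunk_bytes, seek_s=0.010, media_bytes_per_s=None):
    """Estimated same-disk copy throughput in bytes per second.

    Each chunk pays for two seeks: one to the read offset and one to
    the write offset. If media_bytes_per_s is given, the chunk is also
    charged for one read and one write at that streaming rate;
    otherwise the transfers are assumed to be free.
    """
    time_per_chunk = 2 * seek_s
    if media_bytes_per_s is not None:
        time_per_chunk += 2 * chunk_bytes / media_bytes_per_s
    return chunk_bytes / time_per_chunk


# 100 Kbyte chunks with free transfers: about 5 Mbytes/second.
print(copy_throughput(100_000))
```

Passing a bigger chunk_bytes shows why large chunks help: the 20ms of seeking gets spread over more data.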
If you had two disks (on different disk controllers, etc etc), the disk heads wouldn't need to seek around much at all, and you'd get much closer to the theoretical bandwidth.
Or if you have a RAMdisk (and enough RAM), you could try:
cp bob/file RAMDISK/file
cp RAMDISK/file fred/file
which should also run at full speed.
Also - note that if you do any other task which involves reading or writing to the disk at the same time, you'll hurt performance even more.
It's not the case that "time taken to perform tasks A and B in parallel" == "time taken to perform task A + time taken to perform task B"; you also pay the cost of switching between them, which is comparatively steep in the case of disk I/O.
Does that make sense? Or Have I Been Trolled?
While the industry..... and consumers.... spend billions a year on R&D for larger storage devices/solutions and more secure ways to store data without losses, has anyone considered making the data SMALLER? Unlimited hours are going into encryption algorithms every year but most of the people I've seen out there are still using WinZip and other useful but not too impressive compression utils. MP3 made audio better at a smaller (data) cost, MPEG for video etc..... what about the rest of the crap on your drive? Is it not possible to keep a compressed/simplified version of files on drives/backups and reinflate them when needed for operation?
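As a rough sketch of the "compress it, then reinflate when needed" idea, here is a lossless round trip using Python's standard zlib module. The helper names deflate and inflate and the sample data are my own, purely for illustration. Note the difference from MP3 and MPEG: those are lossy, while this gives back exactly the bytes you put in, and data that is already compressed will not shrink much further.

```python
import zlib


def deflate(data: bytes, level: int = 6) -> bytes:
    """Return a losslessly compressed copy of data (zlib format)."""
    return zlib.compress(data, level)


def inflate(blob: bytes) -> bytes:
    """Recover the original bytes from a blob produced by deflate()."""
    return zlib.decompress(blob)


# Highly repetitive sample data, chosen because it compresses well.
original = b"the rest of the crap on your drive\n" * 1000
stored = deflate(original)
assert inflate(stored) == original  # nothing was lost
print(len(original), "->", len(stored), "bytes")
```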
OK I guess, but this means there's little point making it faster cuz it'll still be bottlenecked by the seeks, which have stayed about the same over the past n years.
There is also the extra housekeeping that goes on for clearing bits from the freemap, updating the file size in the dest directory entry, etc.
Things like that also contribute to the performance penalty.
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana