US Supercomputer Uses Flash Storage Drives
angry tapir writes "The San Diego Supercomputer Center has built a high-performance computer with solid-state drives, which the center says could help solve science problems faster than systems with traditional hard drives. The flash drives will provide faster data throughput, which should help the supercomputer analyze data an 'order of magnitude faster' than hard drive-based supercomputers, according to Allan Snavely, associate director at SDSC. SDSC intends to use the HPC system — called Dash — to develop new cures for diseases and to understand the development of Earth."
Imagine a beo...... umm.. nevermind
Sounds like trading lifespan for performance. Maybe a fair trade, but how much lifespan?
1) design SSDs with a longer lifespan
FLASH is about read access time. Throughput can be gotten far cheaper with conventional drives and RAID1.
The rest is the usual nonsense for the press.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
"Hard drives are still the most cost-effective way of hanging on to data," Handy said. But for scientific research and financial services, the results are driven by speed, which makes SSDs worth the investment.
Why is the supercomputer ever being turned off? Why not just add more RAM?
SSD is cheaper than DDR ( ~$3/GB vs ~$8/GB ), but also ~100 times slower.
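A quick sketch of that trade-off, using only the per-GB prices and the rough "100 times slower" figure quoted above. The 1 TB working set is an assumed example size, not something from TFA.

```python
# Prices and slowdown are the rough figures quoted in the comment above.
SSD_USD_PER_GB = 3.0
DRAM_USD_PER_GB = 8.0
SSD_SLOWDOWN_VS_DRAM = 100

# Assumption for illustration only: a 1 TB (1024 GB) working set.
capacity_gb = 1024

ssd_cost = capacity_gb * SSD_USD_PER_GB
dram_cost = capacity_gb * DRAM_USD_PER_GB
price_ratio = dram_cost / ssd_cost

print(f"SSD:  ${ssd_cost:,.0f}")
print(f"DRAM: ${dram_cost:,.0f}")
print(f"DRAM costs {price_ratio:.2f}x more but is ~{SSD_SLOWDOWN_VS_DRAM}x faster")
```

On these numbers the price gap is under 3x while the speed gap is about 100x, so the case for SSD rests on capacity per dollar, not on speed per dollar.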
TFA isn't particularly detailed, beyond saying SSDs are used on "4 special I/O nodes".
One obvious thing would be to use SSDs for the Lustre MDS while using SATA as usual for the OSSs. That could potentially help with the "why does ls -l take minutes" issue familiar to Lustre users on heavily loaded systems, while not noticeably increasing the cost of the storage system as a whole.
RAM + uninterruptible power supply, and you're done. The only thing you need storage for is loading apps and data to begin with.
I piss off bigots.
Until supercomputers use SD cards.
The Institute of Incomplete Research has determined that 9 out of 10
No, I'm Batman!
...
Multiple account sock puppets are NOT cool, Barny
I've lost track of how many times hardware dudes have jammed a bunch of the newest fastest hardware into a box to achieve "100x" the "performance" of prior systems. Without a sliver of irony, or the slightest effort to analyze how software will use all this new hardware. Or what the serviceability of the new machine will be. Or any of the hundred other things that will combine to turn their "100x" into "1.25x".
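One way to put numbers on how "100x" shrinks to "1.25x" is Amdahl's law: only the fraction of runtime that actually uses the faster hardware gets sped up. This is a minimal sketch. The fractions in the loop are assumed values for illustration, not measurements of any real system.

```python
def overall_speedup(fraction_accelerated, component_speedup):
    """Amdahl's law: whole-job speedup when only part of the runtime gets faster."""
    remaining = 1.0 - fraction_accelerated
    return 1.0 / (remaining + fraction_accelerated / component_speedup)

# Hypothetical fractions of total runtime that benefit from the 100x hardware.
for fraction in (0.2, 0.5, 0.9, 0.99):
    speedup = overall_speedup(fraction, 100)
    print(f"{fraction:.0%} of runtime accelerated 100x -> {speedup:.2f}x overall")
```

If only about a fifth of the runtime touches the new hardware, the whole job comes out roughly 1.25x faster, however fast the hardware is.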
--------
Boot time is O(1).
I think it was Amdahl who said that a "supercomputer is a machine which is fast enough to turn cpu-bound problems into io-bound problems", which means that disk speed could become a limiting factor.
I have trouble seeing how having SSD arrays can make a big difference though!
All current supercomputers have enough RAM to handle the entire problem set, simply because _all_ disks, including SSDs, are far slower than RAM.
A supercomputer, like those which are used by oil companies to do seismic processing, does need fast disk, but only in order to load the input data, and this is an almost totally sequential process.
Regular disk arrays are just as fast as SSD arrays for sequential IO, so unless they have found a supercomputer problem which requires significant amounts of random access disk IO, having SSDs available should only provide marginal speedups.
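The sequential versus random argument can be sketched with a simple cost model: each request pays a positioning latency, and the data then streams at the device bandwidth. The latency and bandwidth figures below are assumed round numbers chosen for illustration, not measurements of any particular array or SSD.

```python
def read_time_s(total_bytes, request_bytes, latency_s, bandwidth_bytes_per_s, sequential):
    """Estimated time to read total_bytes in request_bytes chunks.

    Sequential reads pay the positioning latency once; random reads pay it
    on every request.
    """
    requests = total_bytes // request_bytes
    positioning = latency_s * (1 if sequential else requests)
    transfer = total_bytes / bandwidth_bytes_per_s
    return positioning + transfer

# Hypothetical device parameters (assumptions, not vendor figures).
DISK = {"latency_s": 8e-3, "bandwidth_bytes_per_s": 200e6}
SSD = {"latency_s": 1e-4, "bandwidth_bytes_per_s": 250e6}

TOTAL = 2**30    # 1 GiB of input data
REQUEST = 4096   # 4 KiB per request

for label, sequential in (("sequential", True), ("random", False)):
    disk_s = read_time_s(TOTAL, REQUEST, sequential=sequential, **DISK)
    ssd_s = read_time_s(TOTAL, REQUEST, sequential=sequential, **SSD)
    print(f"{label:10s} disk {disk_s:9.1f} s  ssd {ssd_s:7.1f} s  ratio {disk_s / ssd_s:5.1f}x")
```

Under these assumptions the two devices are close for sequential loads, and the gap only opens up when the workload is dominated by small random requests.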
Terje
"almost all programming can be viewed as an exercise in caching"
So I gather flash storage technology is a lot less prone to "write failures after 'x' amount of write operations" than it was 5 or 10 years ago?
This is one of the reasons I don't trust solid state drives. Sure, I've had my fair share of crashes for traditional platter drives in my life, but if you have a program that writes thousands of times to the media every hour... what's the lifespan going to be on that flash storage?
I haven't been paying attention to tech news -- maybe some clever inventor improved flash to not have this problem anymore and nobody told me?
Slashdot requires you to wait longer between hitting 'reply' and submitting a comment.
Yeah, I am burning karma, but it's there to be burnt.
Why did the OP just wholesale copy one of my posts from earlier in the thread? Including the comment at the end which states my user name from the forum my original post was in reference to.
...
No you're not!
Everybody knows that Dr. Sheldon Cooper is Batman!
I've just gone through the process of setting up a pair of servers (HP DL380s) for Linux/Postgres. Our measurements show that the Intel X25-E SSDs beat regular 10k rpm SAS drives by a factor of about 12 for fdatasync() speed. This is important for a database system, as a transaction cannot COMMIT until the data has really, really hit permanent storage. [It's unsafe to use the regular disk's write cache, and personally, I don't trust a battery-backed write cache on the RAID controller much either.] So not having to wait for a mechanical seek is really useful. Read speeds are also better (10x less latency), and the sustained throughput is about 2x as good.
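For anyone wanting to try a similar comparison, here is a rough sketch of timing fdatasync() from Python. It is not the benchmark used for the figures above. The function name, write count and payload size are made up for illustration, and the fallback to os.fsync() is an assumption for platforms where os.fdatasync() does not exist.

```python
import os
import tempfile
import time

# os.fdatasync is missing on some platforms; fall back to os.fsync (assumption).
_sync = getattr(os, "fdatasync", os.fsync)

def mean_sync_latency_s(directory, writes=50, payload=b"x" * 4096):
    """Average time for one small write followed by a sync to stable storage."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(writes):
            os.write(fd, payload)
            _sync(fd)  # returns only once the data has been handed to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed / writes

# Point the directory at the device under test; the temp dir is only a placeholder.
latency = mean_sync_latency_s(tempfile.gettempdir(), writes=20)
print(f"mean write+sync latency: {latency * 1000:.3f} ms")
```

The number only means something if the directory sits on the drive being tested and the drive's write cache is configured the way it would be in production. A tmpfs or a cached RAID volume will look unrealistically fast.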
So, yes, SSDs are a good idea for database loads, where the interaction is with the real world, and where once a transaction has completed, some other real-world process has happened. BUT, most supercomputer workloads are, in principle, re-startable (i.e. if you lose an hour's work due to a hardware failure, you can just re-run the simulation code, and throw away the intermediate state).
So, for simulations, the cost of dataloss is an hour of re-work, not irretrievable information. Given that, we can get much better performance by storing everything in RAM, enabling all the write-caches, and sticking with standard SATA, provided that, every so often, the data is flushed out to disk. If something goes wrong, just revert to the last savepoint, which could be an hour ago, rather than having to be 10ms ago.
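A minimal sketch of that savepoint scheme: keep the state in RAM, write it to disk every so often, and restart from the last savepoint after a failure. The state layout, function names and checkpoint interval are all hypothetical, and the "simulation" is a stand-in sum.

```python
import os
import pickle

def save_checkpoint(state, path):
    """Write the state to a temp file, sync it, then atomically swap it in."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())      # the only point where we wait for the disk
    os.replace(tmp_path, path)    # a crash leaves either the old or the new savepoint

def load_checkpoint(path):
    """Return the last savepoint, or a fresh state if none exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {"step": 0, "value": 0}

def run(path, total_steps, checkpoint_every):
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        state["value"] += state["step"]   # stand-in for real simulation work
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(state, path)
    return state
```

Calling run() again with the same path after a crash resumes from the last savepoint, so the most that can be lost is checkpoint_every steps of work.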
[BTW, HP "don't support" SSDs in their servers, but the Intel SSD X25-E disks do work just fine. Though I did, unfortunately, have to buy some of HP's cheapest SAS drives ($250 each) just to obtain the mounting kits for the SSDs.]
"This is important for a database system, as a transaction cannot COMMIT until the data has really, really hit permanent storage. [It's unsafe to use the regular disk's write cache, and personally, I don't trust a battery-backed write cache on the RAID controller much either. So not having to wait for a mechanical seek is really useful. Read speeds are also better (10x less latency), and the sustained throughput is about 2x as good. So, yes, SSDs are a good idea for database loads, where the interaction is with the real world, and where once a transaction has completed, some other real-world process has happened." - by Richard_J_N (631241) on Sunday September 06, @04:38PM (#29334123)
Exactly! I noted that back in 1996, for EEC Systems (now SuperSpeed.com) - which, in turn, helped lead to a GOOD review in "Windows NT Magazine" (now Windows IT Pro) in the April 1997 Issue "Back Office Performance" pg. #61 topic (cover story), & for their ramdisk software (SuperDisk - whilst I improved their block device driver diskcache, SuperCache I/II, by up to 40% on paid contract to them)...
They took the same idea you expound on now, when I noted it back then, & that idea worked to place EEC Systems/SuperSpeed.com as a FINALIST @ Microsoft Tech-Ed 2001-2002, 2 yrs. in a row, in the hardest category there - SQLServer Performance Enhancement (which it works great for, as the results at the URL I first posted above clearly show: SSDs &/or RAMDrives can enhance both DB server performance AND Webserver performance).
IT JUST WORKS!
Others have noted it as well, but NOT ONLY FOR DB performance gains - also for WEBSERVERS, FILESERVERS, & MORE... an example thereof being here -> http://techreport.com/articles.x/17183/8
HOWEVER, I'd like to note some "creative uses" of these units (same ideas I put out for CENATEK, which for years was featured on their main page as "An Independent users review" of their SSD product, of which I am a proud & happy owner of no less) Albeit, this time, for more "home/end user" type application:
System RAM is SHARED RAM, first of all - more than 1 thing is "going on" in it, @ ALL times (this is not the case w/ using SSD's for specialized tasks (& they tend to EXCEL in webserver or DB server environs & tasks. Proof of that much is from the responder I replied to, and, from techreport above (see that url))).
Personally, for more "end-user" type tasks here @ home? Well - I use SSD's here, "true" ones, meaning NOT based on FLASH RAM (with its slower write cycles & inferior longevity).
----
1.) A CENATEK RocketDrive (2 GB PC-133 SDRAM, PCI 2.2, 133 MB/s bus transfer rate)
2.) A GIGABYTE IRAM (4 GB DDR-400 RAM, SATA 1, 150 MB/s bus transfer rate)
----
I use TRUE SSD's in this manner here for performance gains:
----
1.) Pagefile.sys placement (all alone by itself on the CENATEK RocketDrive on a 2 GB NTFS partition, uncompressed, so it is a "dedicated task" there & that one only).
2.) WebBrowser Program Caches (all of them in IE, FireFox, & Opera) - &, on an NTFS compressed partition, so the files are even TINIER & pickup that much faster into memory (small offset due to decompression of data into memory, but, today's CPU's & RAM speeds make up for that - on GIGABYTE IRAM)
3.) OS and application logs (like eventlogs & far more from apps + the OS also - on GIGABYTE IRAM) - again, on an NTFS compressed partition, for the same reasons as above.
4.) %Temp% &/or %tmp% environment alteration (so app & OS 'temp ops' take place in a higher speed environs & off the main disk too - on GIGABYTE IRAM)
5.) %Comspec% placement (cmd.exe on Windows NT-based OS' - on GIGABYTE IRAM)
6.) PRINT SPOOLER location (o
I am blind!
The post was about ramdisks. The AC who signs off as apk replied about ramdisks. He did so with a quote from the original poster, in direct response to that poster's point, and with some good ideas. I'd like to know how he was considered off topic. Somebody's a tad trigger-happy with the down mods, I'd say.