Storing CERN's Search for God (Particles)
Chris Lindquist writes "Think your storage headaches are big? When it goes live in 2008, CERN's ALICE experiment will use 500 optical fiber links to feed particle collision data to hundreds of PCs at a rate of 1GB/second, every second, for a month. 'During this one month, we need a huge disk buffer,' says Pierre Vande Vyvre, CERN's project leader for data acquisition. One might call that an understatement. CIO.com's story has more details about the project and the SAN tasked with catching the flood of data."
Wow! Actually geeky science news, not enough of that here lately!
I don't really think that CERN is going to purchase thousands of Dell PCs to analyze the data they collect. Maybe they are talking about a distributed computing project?
If only I could get porn that fast
there I said it, let's move on now.
I'm god, but it's a bit of a drag really...
I think I just creamed myself. The hardware needed to push that much data must be insane!
Still waiting on Serviscope_minor to wake up to fucking reality and realize that Jessica Price isn't going to fuck him.
Um...no. Actually, it's a product placement PR piece about Quantum's StorNext. (Read page 2...)
Quote from the Slashdot story, as it is now: "... and the SAN tasked with catching the flood of data."
I think the correct word, considering the meaning, is "caching".
"Don't run with scissors" advice: If you play video games too much, it will stunt your growth. People need time to learn about the real world around them, not just a fantasy world. Part of learning about the real world is learning how to communicate with other people.
They're probably using an object-based parallel filesystem like Lustre or something similar. I heard that at Sun they build these all the time, with one customer striping data across 214 PCs acting as data engines, all within one Lustre filesystem. All the storage is direct-attached, but a SAN can't even come close to the speeds generated, and all the equipment being used is commodity hardware.
2,629,743 seconds in a month, so... 2,629,743 GB or 328,717 GB?
It's too late to do math.
You think that's bad? I've gotta download 2 Grateful Dead torrents for, like, 3 months from Lossless Legs! I scoff at your God (particles)!
Based on 1GB/sec * ((3600 * 24) * 31), that means over 2.5 petabytes.
Wow.
Something like 3000 of the current 1TB drives.
How long until Exabyte level storage is required for some project or another?
Trying to associate Microsoft with "fun" is like trying to associate Satan with aromatherapy. -Tycho
http://science.slashdot.org/article.pl?sid=07/05/22/009216 From two months ago.
These physicists always say they are searching for God or "the God Particle". But what happens if they switch the big God Particle generator on, and God suddenly appears? What if we really do find God?
What are all these geeky physicists going to do then? Do we really want to find God? Do we really want physicists finding God? Is this a good thing?
Just wondering ...
"Due for operation in May 2008, the LHC is a 27-kilometer-long device designed to accelerate subatomic particles to ridiculous speeds, smash them into each other and then record the results."
Next up ludicrous speed!!! Better fasten your seat belts...
Shakespeare poems - infinite monkeys with infinite time. Computer tech support - a few trained ones working from 9 to 5.
Hmm, let's see. ~2700 TB of data over one month. Let's store it on 500 GB drives. That's 5400 disk drives just to store the data. Add in the extra drives for parity, and a few hundred hot spares, and this thing could easily use OVER NINE THOUSAND drives.
That refers to the number of PCs involved in storing the data.
1GB/s * 1 month = 1GB/s * 30 day/month * 24 hour/day * 3600s/hour = 2,592,000 GB.
A big disk (Seagate ST3750640AS) is 750GB.
2,592,000 GB / 750GB/disk = 3,456 disks.
At AUD467 per disk this will cost AUD1,613,952 (plus computers+net). Even cheaper if you allow for the fact these are retail prices for wholesale quantities. Let's take the startup current of 2A@12V as the worst case power consumption and we end up with a maximum power of 83kW. That's less than 35 domestic heaters (2.4kW ea).
Okay, it's not trivial stringing together 3,456 disks, but it's not exceptional either. It is no bigger in scale than a typical university network. Or just buy a few of the Internet Archive's Petaboxes off the shelf.
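For anyone who wants to rerun that arithmetic, here is a rough sketch. It only uses the figures quoted in the comment (1 GB/s for a 30-day month, 750 GB disks at AUD467, 2A@12V per disk, 2.4 kW heaters); the variable names are made up for illustration, and it ignores parity, spares and everything else that is not a bare disk.

```python
# Back-of-envelope sizing from the figures quoted in the comment above.
RATE_GB_PER_S = 1             # sustained acquisition rate
SECONDS = 30 * 24 * 3600      # one 30-day month
DISK_GB = 750                 # Seagate ST3750640AS
DISK_PRICE_AUD = 467          # retail price quoted above
DISK_WATTS = 2 * 12           # 2 A at 12 V startup current, worst case
HEATER_KW = 2.4               # one domestic heater

total_gb = RATE_GB_PER_S * SECONDS
disks = -(-total_gb // DISK_GB)          # ceiling division
cost_aud = disks * DISK_PRICE_AUD
power_kw = disks * DISK_WATTS / 1000
heaters = power_kw / HEATER_KW

print(f"{total_gb:,} GB needs {disks:,} disks")
print(f"AUD {cost_aud:,}, {power_kw:.1f} kW, about {heaters:.1f} heaters")
```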
MacBook Pro. Worst name since the Bicycle
The network is one thing, but just processing that amount of data is incredible.
200 computers break the 1GB/sec stream into more manageable 5MB/sec chunks of data, but then they still need to handle the metadata that figures out how to put it all back together. On top of this they'll need to have some redundancy in case of data loss, and a way to redistribute the load if a machine croaks.
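To make the "metadata to put it all back together" point concrete, here is a toy sketch of round-robin striping where each chunk carries a sequence number. This is purely illustrative and not how CERN or StorNext actually do it; `stripe` and `reassemble` are hypothetical names, and there is no redundancy or failure handling.

```python
from collections import defaultdict

def stripe(data: bytes, nodes: int, chunk_size: int):
    """Deal fixed-size chunks round-robin across nodes.

    Each chunk is stored with its sequence number, which is the only
    metadata needed to restore the original order later.
    """
    placement = defaultdict(list)
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        placement[seq % nodes].append((seq, data[offset:offset + chunk_size]))
    return placement

def reassemble(placement) -> bytes:
    """Gather chunks from all nodes and join them in sequence order."""
    chunks = [c for node_chunks in placement.values() for c in node_chunks]
    return b"".join(payload for _, payload in sorted(chunks))

data = bytes(range(256)) * 10
placement = stripe(data, nodes=4, chunk_size=100)
restored = reassemble(placement)
print(len(placement), "nodes used, round trip ok:", restored == data)
```

Losing one node here loses every fourth chunk, which is exactly why the redundancy question matters.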
These are good problems, it would be a fun system to work on.
Not only did the Slashdot editor not catch a spelling mistake, he apparently didn't catch the fact that the linked article is an advertisement from CXO Media, which, according to its web site, mixes articles and advertisements: "Through our integrated media and marketing programs we provide..."
From the linked article: "... the team is using Quantum's StorNext software as its file system..."
Question: Did a Slashdot editor get paid directly for running an advertisement disguised as an article? Or was someone in Slashdot's parent company paid "under the table"? Or did the parent company get paid?
Anyone wanting to read a real article from 2005 about CERN's data handling, data storage, and data processing can download this PDF file: Grid Computing: The European Data Grid Project.
Real articles begin this way: "The computing challenges for LHC are: * the massive computational capacity required for analysis of the data and * the volume of data to be processed."
Advertisements begin by talking about God and murder, this way (from the article linked by Slashdot): "CERN's Search for God (Particles)..."
and "Maybe you last read about CERN (the European Organization for Nuclear Research) and its massive particle accelerators in Angels & Demons by Dan Brown of The Da Vinci Code fame. In that book, the lead character travels to the cavernous research institute on the border of France and Switzerland to help investigate a murder."
./go.sh | bzip2 > results.bz2 Problem solved!
It's only 5x HD SDI single channel ~ 200MB/s. Any major studio could handle this with ease.
SDI is how the movie guys move their digital stuff around. A higher end digital camera will capture at 2x HD SDI for a 2K res, 4:4:4 colour space. A few of em' and you got your 1GB/s easy. Spools onto godlike RAID arrays.
Get em' to call up Warner Bros if they have problems.
Assuming a non-RAID 3x-replication tech solution (what Google do in their datacenters), using 500-GB disks (best $/GB ratio), they would need about 16 thousand disks:
2592000 (GB) * 3 (replica) / 500 (GB/disk) = 15552 disks
Which would cost about $1.7M (disks alone):
15552 (disk) * 110 ($/disk) = $1710720
Packed in high-density chassis (48 disks in 4U, or 12 disks per rack unit), they could store this amount of data in about 30 racks:
15552 (disk) / 12 (disk/rack unit) / 42 (rack unit/rack) = 30.9 racks
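The same estimate as a quick script, in case anyone wants to plug in other disk sizes or prices. The 2,592,000 GB total assumes 1 GB/s over a 30-day month, which is what the 15552-disk figure implies; the constant names are mine.

```python
import math

TOTAL_GB = 1 * 30 * 24 * 3600    # assumed: 1 GB/s for a 30-day month
REPLICAS = 3                     # 3x replication instead of RAID
DISK_GB = 500
DISK_USD = 110
DISKS_PER_RACK_UNIT = 12         # 48 disks in 4U
RACK_UNITS_PER_RACK = 42

disks = math.ceil(TOTAL_GB * REPLICAS / DISK_GB)
cost_usd = disks * DISK_USD
racks = disks / DISKS_PER_RACK_UNIT / RACK_UNITS_PER_RACK

print(f"{disks:,} disks, ${cost_usd:,}, {racks:.1f} racks")
```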
Now for various reasons (vendor influence, inexperienced consultants, my experience in the IT world in general, etc), I have a feeling they are going to end up with a solution that is unnecessarily complex, much more expensive, and hard to maintain and expand... Damn, I would love to be this project leader!
I thought about that, but when was the last time you heard someone talk about "catching a flood"?
1GB/sec is 3.6TB/hour, or 86.4TB/day, or 2.5PB in a month. That's really not all that huge for enterprise or scientific storage. I see that all the time in hosted environments.
I don't know what kind of crack I was on, but I suspect it was decaf.
I was wondering how this ranks against what Google handles in a month. Either way, I'm sure Google's got plenty of storage to handle the needs for the experiment.
Just e-mail it all to Google. By then gMail should be able to handle that much per user.
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
Right now, the average event size for ATLAS is 1.6 MByte and the system is designed to keep around 200 events per second, or roughly 300 MByte per second. This isn't much of course, but you have to consider that the bunch crossing rate (i.e. the rate at which bunches of protons will collide and generate events) is 40 MHz.
So you have to design a system that boils this rate down from 40 MHz to 200 Hz and only keeps the interesting parts, while also buffering all the data in the meantime. For this reason, the first trigger level is entirely implemented in hardware right in the detector and reduces the rate down to 75 kHz with a latency of 2.5 µs. The rest of the trigger runs on clusters of Linux computers and has a latency of O(1 s).
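To put the rates above side by side, a small sketch of the rejection factors at each trigger stage. The raw figure assumes every bunch crossing would produce a full 1.6 MByte event, which is my simplification for illustration and not something the numbers above claim.

```python
EVENT_MB = 1.6             # average ATLAS event size
CROSSING_HZ = 40e6         # bunch crossing rate
LEVEL1_HZ = 75e3           # rate after the hardware trigger
OUTPUT_HZ = 200            # rate actually kept

level1_rejection = CROSSING_HZ / LEVEL1_HZ      # hardware trigger
software_rejection = LEVEL1_HZ / OUTPUT_HZ      # Linux cluster stages
overall_rejection = CROSSING_HZ / OUTPUT_HZ

stored_mb_per_s = EVENT_MB * OUTPUT_HZ
# Assumption: one full-size event per crossing if nothing were rejected.
raw_tb_per_s = EVENT_MB * CROSSING_HZ / 1e6

print(f"level 1 keeps 1 in {level1_rejection:.0f}")
print(f"software keeps 1 in {software_rejection:.0f}")
print(f"overall 1 in {overall_rejection:,.0f}")
print(f"{stored_mb_per_s:.0f} MB/s stored vs {raw_tb_per_s:.0f} TB/s untriggered")
```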
...all this data will be distributed to a handful of TIER1 sites (CERN is TIER0) all over the world (about 10). At the TIER1 sites the data will be preprocessed. The TIER1 sites distribute their preprocessed data to TIER2 sites, which are the places where the international scientists work. I work at a TIER1 site and we face a lot of technical challenges with this project. At a TIER1 site, as I mentioned, the data is preprocessed too, so we will need a compute cluster and the necessary internal bandwidth to move the data around. With each new software release (about every six months), ALL raw data has to be reprocessed with the new software. All results have to be stored. So for every part of raw data we will have to store preprocessed data for every software release. Of course a lot of data will be stored on tape, but we expect that the dataflow from CERN (for us 150 MB/s to disk and 75 MB/s to tape) will be the least of our problems. Moving the data around and preprocessing the data is probably a bigger problem in the long run. And given that the machine will be running for about 15 years or so, this will be a very long run!
After coming home from a party I read this as "CERN Searches for GOLD Particles." Thought to myself, WTF?
"The problem with socialism is eventually you run out of other people's money" - Thatcher.
TFS makes a point about storing 1 GB (presumably GigaBYTE) of data per second, but THAT feat is already in widespread use, specifically for the digital manipulation of 4k film. The company that produces the systems that process this film data is called Baselight.
:P
Basically, 4k film, at a resolution of 4096x3112, requires approximately 50MB per frame @ 24 fps. That comes out to about 1.22GBps, and manipulating the data doubles it to 2.44GBps. The systems [PDF] that Baselight sells run 8 nodes and 16 processors, and it's all built with commodity hardware and some flavor of Linux. Apparently they use 3ware RAID cards... and I found out about this by browsing 3ware's site when I was shopping for a RAID controller.
Either way, my point is, it's been done, and there's a real world application that requires that type of data storage bandwidth and has nothing to do with scientific data.
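A quick check of the 4k numbers. The 4 bytes per pixel is my assumption (it happens to reproduce the ~50MB frame and the 1.22GBps figure, with GB meaning 10^9 bytes); the comment itself does not say which pixel format is used.

```python
WIDTH, HEIGHT = 4096, 3112
BYTES_PER_PIXEL = 4        # assumption, not stated in the comment
FPS = 24

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
playback_bytes_per_s = frame_bytes * FPS
# The comment doubles the rate for manipulation (read plus write).
manipulate_bytes_per_s = 2 * playback_bytes_per_s

print(f"frame: {frame_bytes / 1e6:.1f} MB")
print(f"playback: {playback_bytes_per_s / 1e9:.2f} GB/s")
print(f"manipulation: {manipulate_bytes_per_s / 1e9:.2f} GB/s")
```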
Boot Windows, Linux, and ESX over the network for free.
My God, think of the porn you could download with that setup. It would be biblical (in a pornographic sense).
I am really surprised they did not use the Lustre filesystem for their data storage since it is vendor neutral, open, and designed for exactly this sort of thing. The Lustre guys report being able to obtain tremendous bandwidth and scalability. I have not yet been able to play with Lustre but I look forward to doing so.
The ALICE experiment is actually concentrating on heavy ion collisions, which is why they mainly worry about only one month per year; the rest of the time the machine is running protons for the other experiments, ATLAS and CMS, which will look for the Higgs. ALICE will hopefully study the quark gluon plasma but, as far as I know, has no plans to look for the Higgs.
According to a guy that I met yesterday on the street (he was talking to himself or somebody) the only way I could meet God (and hopefully His particles) was through his son. WTF? Can't even *God* get a good secretary these days?
Don't worry -- the products of particle accelerators only exist for a few picoseconds. If God is created during a collision event, he will wink out of existence so fast that we'll only become aware of his presence by the shower of Mormonions and PatRobertsonite particles impinging on the detection apparatus.
About 15 years ago, I was around 16, we made a one week school-trip to Geneva and we also visited CERN for one day. Even if you don't understand anything of what they are doing there, the place is impressive. I was very surprised that you actually can visit such a facility. I bet there are similar labs/institutions near you happily showing around and showing off what they do :)
1>Oh God!
1>Oh God!
2>Oh God thats is awesome. more!!!
3>Hey wait, you guys are studying the wrong kind of collisions.
1>Sorry just stress testing the hard drives.
2>Yeah we couldn't help it, the vibrations of so many drives...
OK, we got a half way overview of CERN's decision, with some bold statements of questionable validity. I am submitting the criticism purely on the grounds of being really interested in large data storage, I don't work for any large storage vendor, but I am an architect of storage systems.
First of all, with the statement "and it's (StorNext) completely vendor independent": Lots of other solutions provide flexibility about choosing the hardware vendor from a theoretical perspective. The theory says that if vendor A makes a SAN, vendor B makes a RAID controller, C a disk cabinet and D offers a clustered FS, and all comply with the relevant standards, you can plug them together and expect them to function. However, imperfections in the standards and hidden proprietary optimizations always dictate certain configs and combinations for optimum performance. There is a lot of work to be done in StorNext and other similar products before they can claim full flexibility. My experience in deploying a StorNext based solution on a 1200 node setup says so and, to keep the post short, I shall exclude vendor details at this stage, but if someone is interested, I am happy to go over the details. There is vendor dependence if you wish optimum performance. Not to mention that if you mix and match the RAID and SAN cards in the setup, any unfortunate issue might end up in a multi-headache, even if you have solution support (A blames B, B accuses A, and the game of ping-pong begins). You can never exclude vendor dependence in such a large setup; you have to deal with it.
Then you have the "Clustered file systems are still an evolving category, she says, but enterprise IT is warming up to it.". I can imagine what the author classes as enterprise IT here, but I think there is a bit of an orientation issue. CERN is not exactly the classical enterprise IT environment, is it? Not in terms of their requirements for resilience and capacity. These FAR EXCEED enterprise IT requirements. CERN is a research setup. And the mentality of a research setup (that incubated the WWW after all) is (or should be) that of innovation and playing with some of the latest and the greatest. In fact, some US based research setups have long experimented with other cluster FSes. They are not warming up. CIO claims that StorNext is scalable. It is. But to what extent? Have they excluded for example things such as Lustre? http://wiki.lustre.org/index.php?title=Main_Page If yes, why?
...just because a SAN is connected at 1Gbit to a machine does not mean there is 1 Gbit of data passing over there all the time.
If I were to write up my house network I could say 'network switches feed data to several computers at 1Gbit per second' - this would be true if I only use it for web browsing - doesn't mean I'm saturating my bandwidth.
- Printed hardcopy. Many authorities recommend this as you do not need to worry about changes in data formats over time. For exact calculation, we would need to know the font they were planning to use and the character encoding. However, let's take a working assumption that they can cram 10KB of data onto an A4 sheet. That implies 259,200,000,000 pages. They will probably not want to use an inkjet printer if they use this solution and may, indeed, choose to acquire multiple printers and split the load. A single printer at 10 ppm would take approximately 50,000 years to complete the backup. On 70gm paper, it would weigh a little over two million tons. At any rate, this would certainly produce reams of output.
- Diskettes. This was good enough for nearly everyone 15 years ago. It is curious that such a tried and trusted technique is no longer in fashion. I assume regular 3.5" 1.44MB diskettes, generally recognised as easier to handle than 5.25". We shall need around 1,800,000,000 diskettes. One drawback is the person changing the diskettes as each one filled up might become a little bored after a while. On the positive side, the backup will be quite a lot faster than the printed solution. Assuming about one diskette per minute, inclusive of changing disks, the backup could be complete in less than 3,500 years.
- Now considered somewhat old fashioned, punch cards were once a mainstay of every programmer's personal backups. Like printed hardcopy, anyone familiar with the character encoding used, could read the data without needing any access to a computer. If we assume 80 column cards, we would need 32,400,000,000,000 cards. I would be somewhat concerned about the problem of getting this stack of cards back in the correct order if I dropped it. With a weight of about 30 million tons and stretching perhaps 6 million miles end to end, handling certainly would be challenging and an accident very possible.
- Paper (punched) tape was the only alternative on the first computer I used, a basic early model Elliott 803 without the optional magnetic tape. If I recall correctly, you could manage about 10 characters per inch, so you would need a paper tape over 4,000,000,000 miles long. Hmmm, that would be silly. The other solutions are clearly better.
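For the skeptics, a rough sketch of the counts and durations behind the list above. It assumes the 2,592,000 GB monthly total (1 GB/s for 30 days, with GB as 10^9 bytes), 1.44 MB taken as 1.44 million bytes, one byte per punch card column and per paper tape character, and a 365-day year; all constant names are invented here, and weights and card lengths are left out.

```python
TOTAL_BYTES = 2_592_000 * 10**9       # assumed: 1 GB/s for 30 days
MINUTES_PER_YEAR = 60 * 24 * 365
INCHES_PER_MILE = 63_360

# Printed hardcopy: 10 KB per A4 sheet, one printer at 10 pages/minute.
pages = TOTAL_BYTES // 10_000
print_years = pages / 10 / MINUTES_PER_YEAR

# Diskettes: 1.44 MB each, one diskette per minute including the swap.
diskettes = TOTAL_BYTES / 1.44e6
diskette_years = diskettes / MINUTES_PER_YEAR

# Punch cards: 80 columns, one character per column.
cards = TOTAL_BYTES // 80

# Paper tape: 10 characters per inch.
tape_miles = TOTAL_BYTES / 10 / INCHES_PER_MILE

print(f"pages: {pages:.3g}, printing takes {print_years:,.0f} years")
print(f"diskettes: {diskettes:.3g}, swapping takes {diskette_years:,.0f} years")
print(f"punch cards: {cards:.3g}")
print(f"paper tape: {tape_miles:.3g} miles")
```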
I am sure other options will be considered, but I just wanted to bring these up in case CERN had failed to consider them.
Imagine how deep the personality problems must run in a person who gets all hot because of someone's DNA sequences!
"Caching a flood of data" sounds fine to me.
"Imagine how deep the personality problems must run in a person who gets all hot because of someone's DNA sequences!"
You must be new here.
on my old trash-80, the command was CLOADM
ahh, the memories....
I think you're thinking of that guy who got nailed to the cross (Jesus). Noah was born about 5,000 years ago.
Ben Hocking
Need a professional organizer?
what is really fascinating is the data collector array in the first place. ...
the "thing of sensors" that makes these huge amounts of data.
oh well, good luck.
my guess is that it's lossy, from the first "smash" to the last pixel analyzed, sorry
maybe somebody will tell us how much got lost in between.
PR doesn't seem something important to CERN anyway, even if they invented the WWW.
Your units need work: power per velocity is action, not force.
We're all born with nothing.
If you die in debt, you're ahead.
http://208.69.34.230/search?ie=UTF-8&oe=UTF-8&sourceid=navclient&gfns=1&q=1.21+gigawatts+%2F+88+miles+per+hour
(1.21 gigawatts) / (88 miles per hour) = 30 757 874 newtons
It's force.
P/v
= (W/t) / (s/t)
= (W/s)
W = Fs, therefore P / v = W / s = F.
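The same P / v = F check in SI units, as a minimal sketch. The only thing added here is the mile-to-metre conversion (1609.344 m per international mile); the variable names are mine.

```python
METRES_PER_MILE = 1609.344       # international mile, exact by definition
SECONDS_PER_HOUR = 3600

power_w = 1.21e9                                         # 1.21 gigawatts
speed_m_per_s = 88 * METRES_PER_MILE / SECONDS_PER_HOUR  # 88 miles per hour

# Watts are N*m/s, so dividing by m/s leaves newtons: a force.
force_n = power_w / speed_m_per_s

print(f"{speed_m_per_s:.5f} m/s, {force_n:,.0f} N")
```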