A Look At CERN's LHC Grid-Computing Architecture
blair1q writes "Using a four-tiered architecture (from CERN's central computer at Tier 0 to individual scientists' desk/lap/palmtops at Tier 3), CERN is distributing LHC data and computations across resources worldwide to achieve aggregate computational power unprecedented in high-energy physics research. As an example, 'researchers can sit at their laptops, write small programs or macros, submit the programs through the AliEn system, find the necessary ALICE data on AliEn servers, then run their jobs' on upper-tier systems. The full grid comprises small computers, supercomputers, computer clusters, and mass-storage data centers. This system allows 1,000 researchers at 130 organizations in 34 countries to crunch the data, which are disgorged at a rate of 1.25 GB per second from the LHC's detectors."
I wonder when we will have the equivalent computing power at home? :)
A single 10Gb Ethernet connection can handle that quite easily. A single Symmetrix/Shark should be able to keep up.
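For a rough sense of scale, the 1.25 GB per second quoted in the summary can be converted to a link rate. This is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes) and ignores Ethernet/IP framing overhead, and the function name is made up for illustration.

```python
def gigabytes_per_s_to_gigabits_per_s(gb_per_s: float) -> float:
    """Convert GB/s to Gb/s, assuming decimal units and 8 bits per byte."""
    return gb_per_s * 8.0


detector_rate_gb_s = 1.25    # figure quoted in the story summary
link_capacity_gbit_s = 10.0  # nominal 10 Gb Ethernet line rate (assumption: no overhead)

required_gbit_s = gigabytes_per_s_to_gigabits_per_s(detector_rate_gb_s)
utilisation = required_gbit_s / link_capacity_gbit_s

print(f"required: {required_gbit_s} Gb/s, utilisation: {utilisation:.0%}")
```

Under these assumptions the stream works out to the nominal line rate of one 10Gb link, so any protocol overhead would have to come out of a second link or a lower sustained rate.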
I was having lunch with some CERN guys a couple weeks ago, and was asking them about the speed of their analogue to digital converters. I don't remember what the number was, but it seemed low to me, something like 200kHz. So, of course, I had to point out that *my* cheapo converters ran faster than theirs by more than an order of magnitude. They responded with "well, each of our converters does 200kHz on all of our 4000 channels at the same time, so we're really recording at..."
They won.
Your post makes me wonder about a future where I have a home computer powerful enough to run an algorithm which downloads as many tracks off of iTunes as it needs and then can compute by extrapolation the future hits of RIAA, before they are released.
One wonders whether the courts would find that such a program is a circumvention of DRM for the purposes of the DMCA. Unfortunately, the computer, which can answer that question, will be destroyed by the construction of a .....
(Ouch. I should go get some sleep....)
is a botnet!
Thanks in advance.
Yours In Akademgorodok,
K. Trout
That's awesome, but will it run Crysis at max settings?
Good thing they didn't use PS3s to build it.
No folly is more costly than the folly of intolerant idealism. - Winston Churchill
Wow, imagine a Beowulf cluster of those!
I was quite saddened to find that this 'Look at CERN's LHC GRID' ... didn't include any pictures. :-(
What kind of database do you think they would be using? Would it be SQL-based, as they would obviously need to query? But storing that amount of data? At that extremely fast rate?
I just wish they would send some more work units down the LHC@Home pipe. None of my computers have done any work for that project in ages.
-l
Help cure AIDS, cancer, and more. Donate your unused computer time to worldcommunitygrid.org. Join Team Slashdot!
IT people at LHC are the biggest liars in the universe, they are also pretty incompetent and get confused by simple questions...
One actual example (specifics removed to protect the innocent):
"So you configure the IP like this thru this software?"
"Oh dear no! I am a hardware guy, I don't deal with software! IPs are a hardware problem..."
"No they are a software setting..."
"No they are hardware!"
and so on...
Sorry for being pedantic, but the article says there are three tiers between the central computer and the scientists' machines (which are tier 4, not 3).
As someone who worked on the processing of HEP experimental data for a while, let me say that there is a ton of work to do. You have particles entering the detector every ~40ns and hundreds of different instruments making measurements, which leads to a ton of data very quickly. You then have to reconstruct the path of the particle based on the detector information, but it's not straightforward. The detector can have gaps in coverage; neutrinos (which are undetectable) can be created, carrying away momentum; particles from the previous event can still be in the detector; et cetera.
And all of the data crunching you do must be done in 40ns, so that you're ready for the next set. (Of course, you can do some processing offline, but if you don't maintain a 40ns average, then your data will start piling up.)
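The "maintain a 40ns average" point can be sketched with a toy model. This is not how any real trigger or reconstruction system works; it is a deterministic fluid approximation with a hypothetical function name, and it only shows that a backlog grows linearly once the mean processing time exceeds the arrival interval.

```python
def backlog_after(n_events: int, arrival_ns: float, mean_processing_ns: float) -> float:
    """Events still waiting after n_events have arrived (toy fluid model).

    Assumes one event arrives every arrival_ns and a single processor
    needs mean_processing_ns per event on average.
    """
    total_time_ns = n_events * arrival_ns
    # The processor cannot handle more events than have actually arrived.
    processed = min(n_events, total_time_ns / mean_processing_ns)
    return n_events - processed


n = 1_000_000
print(backlog_after(n, 40.0, 30.0))  # faster than arrivals: nothing piles up
print(backlog_after(n, 40.0, 40.0))  # exactly keeping pace: nothing piles up
print(backlog_after(n, 40.0, 50.0))  # slower than arrivals: backlog grows with n
```

In this model a 50ns average against 40ns arrivals leaves a fifth of all events queued, and that fraction never shrinks no matter how long you run, which is why slow steps have to be moved offline or the input has to be filtered.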
There was a good presentation at LISA '07 on this entitled "The LHC Computing Challenge":
http://www.usenix.org/event/lisa07/tech/tech.html#cass
It was given by Tony Cass, who is/was "responsible for the provision of CPU and data storage services to CERN's physics community". They're planning on collecting 15PB/year.
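To put 15PB/year in more familiar units, here is a quick conversion to an average sustained rate. It is a sketch under stated assumptions: decimal petabytes (10^15 bytes), a 365-day year, and data spread evenly over the whole year, which a real accelerator schedule would not do. The function name is invented for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 s, ignoring leap years


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average rate in MB/s for a yearly volume, assuming decimal units."""
    bytes_per_year = petabytes_per_year * 1e15
    return bytes_per_year / SECONDS_PER_YEAR / 1e6


rate = average_rate_mb_per_s(15)
print(f"{rate:.0f} MB/s averaged over the year")
```

That comes out a little under 500 MB/s on average, so the yearly total and the 1.25 GB per second peak figure in the summary are at least in the same ballpark once downtime is taken into account.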
I wonder what they at CERN were expecting to have for infrastructure at the end of the project (i.e. now).
I mean: who could have guessed the processor speed and disk space we have now?
www.vanheusden.com - home of Multitail, HTTPing, CoffeeSaint, EntropyBroker, rsstail, bsod, listener, nagcon, nagi
I can say that the article doesn't explain it very well. Since CERN has been calling the sites "Tier", this terminology has become a buzzword, and everything is a Tier (the managers call their services "Tiered" just to make them sound important).
Tier0 and Tier1 are well described by the article. Tier2 are mostly computing clusters, with of course big storage, but they're mainly for analysis. Tier3 are like Tier2 but not really. They are "uncertified" Tier2 in the sense that they do not strictly adhere to the Tier2 standards in terms of middleware and configuration and policies.
Tier4... never heard of that. I think the buzzword Tier backfired and they're calling their desktops Tiers. When I started managing the Tier3 we did not even call it that... it was just a cluster.
in fact it sounds just like what i do to get ... STFU.
my latest TV-series (from bit-torrent).
-
oh come ON! anybody that still believes IONIZING,
thus changing and INTERFERING with the "to-be-studied"
object at hand is FUNDAMENTAL is plain
It depends on which CERN guys you talk to. When I was a grad student we had a 1 GHz ADC (with fewer channels and only 8 bits IIRC) reading out scintillator which timed protons in a beam to O(10ps) timing resolution. I've been more involved with triggers than front-end digitization since then, but 200 kHz and 400 channels is nothing: the ATLAS calorimeter alone has 110,000 channels and its ADCs operate in the tens of MHz range (IIRC; you'd have to look up the ATLAS detector paper for exact numbers).
...I already do! One of the great things about the LCG is that you can submit and monitor jobs from anywhere. It is used by far more than the 1,000 physicists the article mentions. There are 2,500+ on ATLAS alone, and then there are CMS and LHCb to count as well.
Imagine a Beowulf cluster of these! \o/