IT At the LHC — Managing a Petabyte of Data Per Second

Posted by Soulskill on Friday August 3, 2012 @01:28AM from the take-a-drink-from-the-science-firehose dept.

schliz writes "iTnews in Australia has published an interview with CERN's deputy head of IT, David Foster, who explains what last month's discovery of a 'particle consistent with the Higgs Boson' means for the organization's IT department, why it needs a second 'Tier Zero' data center, and how it is using grid computing and the cloud. Quoting: 'If you were to digitize all the information from a collision in a detector, it’s about a petabyte a second or a million gigabytes per second. There is a lot of filtering of the data that occurs within the 25 nanoseconds between each bunch crossing (of protons). Each experiment operates their own trigger farm – each consisting of several thousand machines – that conduct real-time electronics within the LHC. These trigger farms decide, for example, was this set of collisions interesting? Do I keep this data or not? The non-interesting event data is discarded, the interesting events go through a second filter or trigger farm of a few thousand more computers, also on-site at the experiment. [These computers] have a bit more time to do some initial reconstruction – looking at the data to decide if it’s interesting. Out of all of this comes a data stream of some few hundred megabytes to 1Gb per second that actually gets recorded in the CERN data center, the facility we call "Tier Zero."'"
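A rough sanity check of the rates quoted in the summary, as a sketch only. It assumes decimal units, reads "a few hundred megabytes" as 300 MB/s and "1Gb" as 1 GB/s (both guesses, since the quote is ambiguous), and the variable names are purely illustrative.

```python
# Rough check of the data rates quoted in the summary.
# Assumptions (not from the article): decimal units, "a few hundred
# megabytes" taken as 300 MB/s, and "1Gb" read as 1 GB/s.
RAW_BYTES_PER_S = 1e15            # "about a petabyte a second"
BUNCH_SPACING_S = 25e-9           # 25 ns between bunch crossings
RECORDED_LOW_BYTES_PER_S = 300e6  # assumed low end of the recorded stream
RECORDED_HIGH_BYTES_PER_S = 1e9   # assumed high end of the recorded stream

crossings_per_s = 1 / BUNCH_SPACING_S
raw_bytes_per_crossing = RAW_BYTES_PER_S * BUNCH_SPACING_S
min_reduction = RAW_BYTES_PER_S / RECORDED_HIGH_BYTES_PER_S
max_reduction = RAW_BYTES_PER_S / RECORDED_LOW_BYTES_PER_S

print(f"bunch crossings per second: {crossings_per_s:.3g}")
print(f"raw data per crossing:      {raw_bytes_per_crossing / 1e6:.0f} MB")
print(f"overall reduction factor:   {min_reduction:.1e} to {max_reduction:.1e}")
```

Under these assumptions the two trigger stages together would have to throw away all but roughly one part in a few million of the raw data.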

Call the Interns! by Sponge+Bath · 2012-08-03 01:34 · Score: 3, Funny

We need backup on floppy disk.
1. Re:Call the Interns! by DigiShaman · 2012-08-03 01:44 · Score: 3, Funny
  
  You don't want them to be idle, now do you? Use punch cards instead. Trust me.
  -BOFH
  
  --
  Life is not for the lazy.
Large Organization Has 2 Data Centers by GeneralTurgidson · 2012-08-03 01:35 · Score: 1, Funny

They may also be using something called load balancing, but we're still waiting for sources to confirm.
1. Re:Large Organization Has 2 Data Centers by DahGhostfacedFiddlah · 2012-08-03 06:53 · Score: 1
  
  Good point. Non-story. I can't see anything of interest to nerds here.
  
  --
  Last post!
2. Re:Large Organization Has 2 Data Centers by HappyPsycho · 2012-08-03 07:48 · Score: 2
  
  'Score: 3, Funny' - This is hilarious, from TFA:
  'The Tier Zero facility is the central hub of the Worldwide LHC Computing Grid, which also connects to some dozen ‘Tier One’ data centres for near-real time storage and analysis of data and over 150 ‘Tier Two’ data centres for batch analysis of experiment data.'
Keeping us humble... by Anonymous Coward · 2012-08-03 01:37 · Score: 3, Interesting

My wife, a staff physicist at FermiLab in their computing division, manages to keep me humble when I talk about the "big data" work I'm doing in my commercial engineering position. I think having to deal with a billion or so data points per day is big... Not so much in her universe!
1. Re:Keeping us humble... by plover · 2012-08-03 01:49 · Score: 4, Funny
  
  And we jokingly call our data center the "Large Software Collider". Not as funny when the real thing is even bigger!
  
  --
  John
2. Re:Keeping us humble... by somersault · 2012-08-03 02:12 · Score: 2
  
  Of course hadrons are bigger than softwares, not to mention a lot more fun in collisions.
  
  --
  which is totally what she said
3. Re:Keeping us humble... by ethanms · 2012-08-03 06:26 · Score: 2
  
  My wife, a staff physicist at FermiLab in their computing division
  Much like the HB itself, up until recently I assumed these were only theoretical...
GRID ack by PiMuNu · 2012-08-03 01:49 · Score: 3, Interesting

I tried using the GRID - it's deeply embedded in acronyms and crud, practically impossible to use without a PhD. For crying out loud, it's just a batch farm!
Re:You mean... by Anonymous Coward · 2012-08-03 01:54 · Score: 1

Not yet.
Large Hadron Collider - powered by Linux
pretty described on the LHC-CMS websites by peter303 · 2012-08-03 02:16 · Score: 2

I was looking up how complicated the detectors were, and they were. They have 75M directional sensors and 9K energy detectors (calorimeters), each of which is analyzed 40M times a second for "interesting" events. One out of a billion may be recorded for subsequent deep analysis.
Re:I don't really give a s h i t by Fusselwurm · 2012-08-03 02:34 · Score: 2

What do you want to imply?
That, somehow, he who does not know how to debug the kernel should not play with bit operations?
Something like that?
Or, that we should stop researching the structure of the universe, and instead focus on what we usually do, which is making war, screwing other people and posting photos of our dicks on teh internet?
And Still. by CimmerianX · 2012-08-03 02:42 · Score: 4, Funny

The head researcher will STILL come to IT and ask them to please help him sync his outlook contacts to his phone.
grep by atisss · 2012-08-03 02:44 · Score: 2

So they just used grep
Re:You mean... by qu33ksilver · 2012-08-03 02:52 · Score: 1

Aah I see they use VMWare to manage the virtual machines. .. I guess Citrix is still lagging behind in the server virtualization field.

I am from Citrix ... if you know what I mean . :P
Which amounts to... by Travelsonic · 2012-08-03 03:01 · Score: 2

Roughly, assuming you can round it off to 53 weeks/year, if you do 1Petabyte/ear, and transferred that much constantly, that would be roughly 2887200000000000000000000000000000000000000000000000 BITS [individual 1s or 0s] per year

--
If you believe in privacy, and believe you have "nothing to hide" at the same time, you're a goddammed idiot
1. Re:Which amounts to... by Joce640k · 2012-08-03 04:20 · Score: 1
  
  Is that as much as billions and billions?
  
  --
  No sig today...
2. Re:Which amounts to... by Immerman · 2012-08-03 08:43 · Score: 1
  
  Quite a bit larger actually
  1 billion = 1e9
  Travelsonic's number is ~3e51 which would be 3e6*(1e9)^5
  or millions of billions of billions of billions of billions of billions
  Not quite sure how well Sagan could pull that line off though
  
  --
  --- Most topics have many sides worth arguing, allow me to take one opposite you.
Power limitations by onyxruby · 2012-08-03 03:09 · Score: 3, Informative

Did a bunch of work with some stock exchanges a few years back. It was an interesting environment and I see that CERN had the same problems that the stock exchanges had. They even had the same situation where the number one budgetary item wasn't cost but electric load.
You only had so much power physically available in the data centers next to the exchanges and server rooms inside them. Monetary cost was never an issue, but electric load was everything. It seems funny considering their load is strictly a science based load and not monetary, but their requirements and distribution remind me greatly of the exchanges.
1. Re:Power limitations by girlintraining · 2012-08-03 04:15 · Score: 1
  
  They even had the same situation where the number one budgetary item wasn't cost but electric load.
  Probably true wherever you go, but the NYSE is in the middle of a dense urban area stretching for a hundred miles in every direction. Electricity, along with everything else, is painfully expensive there. I believe that's why so many data centers are built in relatively remote areas. Obviously, the NYSE has a physical location requirement... :\
  
  --
  #fuckbeta #iamslashdot #dicemustdie
2. Re:Power limitations by Mattsson · 2012-08-03 07:57 · Score: 1
  
  On the other hand, at CERN the power used by their computing farm is probably a small trickle compared to what is being pumped into the components of the ring and its detectors.
  
  --
  /.Mattsson - My native language is not English, so please don't whine over linguistic errors. (That's lame anyway...)
Re:I don't really give a s h i t by AchilleTalon · 2012-08-03 03:13 · Score: 1

Well, that one should be called a petadick.

--
Achille Talon
Hop!
Re:You mean... by LordLimecat · 2012-08-03 03:39 · Score: 1

VMWare is pretty widely recognized as the king of virtualization-- at least so long as you arent concerned with money. Its overhead is far far smaller than the others especially when dealing with huge numbers of connections, and it simply has more features than its competitors.
Of course, that assumes you're willing to pony up for vRAM entitlements and Enterprise Plus.
Re:You mean... by cduffy · 2012-08-03 03:58 · Score: 4, Interesting

VMWare is pretty widely recognized as the king of virtualization-- at least so long as you arent concerned with money. Its overhead is far far smaller than the others especially when dealing with huge numbers of connections, and it simply has more features than its competitors.
Which doesn't mean those features are implemented well.
Not so long ago, I built an automated QA platform on top of Qumranet's KVM. Partway through the project, my employer was bought by Dell, a VMware licensee. As such, we ended up putting software through automated testing on VMware, manual testing on Xen (legacy environment, pre-acquisition), and deployment to a mix of real hardware and VMware.
In terms of accurate hardware implementation, KVM kicked the crap out of what VMware (ESX) shipped with at the time. We had software break because VMware didn't implement some very common SCSI mode pages (which the real hardware and QEMU both did), we had software break because of funkiness in their PXE implementation, and we otherwise just plain had software *break*. I sometimes hit a bug in the QEMU layer KVM uses for hardware emulation, but when those happened, I could fix it myself half the time, and get good support from the dev team and mailing list otherwise. With VMware, I just had to wait and hope that they'd eventually get around to it in some future release.
"King of virtualization"? Bah.
Re:I don't really give a s h i t by Joce640k · 2012-08-03 04:19 · Score: 2

News that matters? The human race is not even able to handle itself and it wants to play with atoms.
I assume you're using a computer to post that? Maybe own a cellphone....?
That makes you a hypocrite of the worst kind. Sorry, but there it is in black and white.

--
No sig today...
Re:You mean... by LordLimecat · 2012-08-03 04:24 · Score: 1

King of virtualization when it comes to things like "supports live migration of a VM's execution state and/or permanent storage", or "stability and speed of the networking layer".
I cant speak to KVM as my experience is limited to VMware, and some HyperV and XenServer testing. But just doing a check from RHEV's own fact sheet, there are a number of things that are missing that are quite useful:
*Storage live migration
*Hot add RAM, CPU
*Hot add NICs, disk (note that RHEV has it wrong-- this does not require anything more than the free hypervisor, NOT enterprise plus as they claim)
*Live VM Snapshots (not really clear how RHEV doesnt have this, even Virtualbox has this)
Those are all pretty core features-- to my mind, ESPECIALLY the disk and NIC hot add. There are a lot of times that it is an absolute blessing to be able to roll out a new VLAN on the network and to just hot-add a NIC to the firewall VM on that VLAN, and your network suffers no outage. With disk, its awfully nice to be able to add a USB disk to the VM without having to reboot the entire thing (again, how does HyperV and RHEV not have this?).
Re:You mean... by X0563511 · 2012-08-03 05:03 · Score: 2

The King Joffrey of virtualization, perhaps.

--
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
Re:They need some serious... by X0563511 · 2012-08-03 05:06 · Score: 2

That takes time. Time vs Space tradeoff.

--
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
Re:You mean... by StuartHankins · 2012-08-03 05:29 · Score: 1
VMWare is nice, let's get that out of the way. We have a mix of ESXi and RHEV and are deciding which to use for everything (assuming moving up to paid VSphere is the VMWare option). The fact that RHEV was cheaper, much much better looking, quicker to setup, and easier to use than KVM under RHEL made the decision to migrate from RHEL-based KVM to RHEV fairly easy.

RHEV is getting there, still lacking some features and still rough around the edges. For instance:
- Right now you can't have a VM with one disk on one storage group and another disk on a different storage group (we use this for bulk SAN storage vs fast SAN storage). KVM on RHEL does that just fine; just mount the SAN presentation and place the appropriate disk on the appropriate mount and you're good to go.
- There are too many things which require the command line to either set up or clean up.
- ISOs cannot be uploaded into folders and that's an issue with us since we have a lot of ISOs. You also have to use a command-line tool to upload them rather than a nice GUI tool like VMWare has. And there's no way to remove items without manually diving into the filesystem to remove ISOs.
- They've already fixed the "requires IE" part of the browser for most tasks, however I couldn't get console view to work under Firefox for either SPICE or VNC VMs without having a bootable, running VM running those services... and the only time I need console view is when things are BROKEN...
- Reliance on NFS as "the" VM import / export option is ugly.
- VMs created in KVM using virtio drivers won't use those virtio drivers in RHEV unless they're on a very short list managed by Red Hat. Only RHEL and Windows are there -- no Fedora for instance, sorry! That stinks.
Snapshots are a sore point but I've heard the next version out around this December will fix it and give us a truly live backup option. Most of the other issues you mentioned (hot-add of resources) will hopefully be part of the new version. Dunno yet if that means "total fix" but I've got my fingers crossed. Hey, it's better than KVM under RHEL...
CERN network architecture by Joshua+Fan · 2012-08-03 05:38 · Score: 2

Those with further interest in the article may find this informative:

http://www.geant2.net/upload/pdf/LHC_networking_v1-9_NC.pdf

Apparently, CERN uses BGP between T0 and T1, and uses only ACLs, no firewalls, for security.
And it's on Brocade's 100G Ethernet =) by Desmoden · 2012-08-03 06:34 · Score: 1

http://www.enterpriseinnovation.net/content/brocade-delivers-100-gigabit-ethernet-solution-cern
just say'n
(oh come on, can you blame me? It's freaking COOL!)
Re:You mean... by cduffy · 2012-08-03 06:39 · Score: 1

Those are all pretty core features-- to my mind, ESPECIALLY the disk and NIC hot add. There are a lot of times that it is an absolute blessing to be able to roll out a new VLAN on the network and to just hot-add a NIC to the firewall VM on that VLAN, and your network suffers no outage. With disk, its awfully nice to be able to add a USB disk to the VM without having to reboot the entire thing (again, how does HyperV and RHEV not have this?).
I can't speak to RHEV -- I ran on bare KVM. RHEV eliminated any feature from KVM that Red Hat didn't consider rock-solid enterprise-ready, effectively paring it down to the featureset they were willing to thoroughly test on all supported hardware.
All four of the items you listed were present and usable in upstream KVM back when I was working with it years ago, except for storage migration (which had just been implemented and was still at a place where I wanted to see other folks using it before I wanted to trust it myself... plus, already had shared storage anyhow and didn't need it).
Not really that huge by fa2k · 2012-08-03 07:01 · Score: 1

The summary says it's 100 MByte to 1 Gbit, which is confusing in itself. I think "a few hundred megabytes" is correct. It's impressive to run at that rate continuously with high reliability, but it's nothing compared to Youtube and probably Facebook. If you say a "tweet" takes up 200 bytes including overhead, that's 500 000 tweets per second at 100 MB/s, so maybe even Twitter has to deal with that rate. The requirement for redundancy is probably stricter for the LHC, they have at least triply redundant storage for the original data. The data processing is maybe more demanding for LHC than for big internet companies, as all data are essentially equal, so to do an analysis, all data files have to be accessed (there is more filtering after the trigger, and intermediate files are stored containing reconstructed information).
1. Re:Not really that huge by fa2k · 2012-08-03 07:10 · Score: 1
  
  maybe even Twitter has to deal with that rate.
  Never mind, guys, still a few orders of magnitude lower (340 M messages/day according to WP)
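The tweet comparison can be checked with a few lines of arithmetic. This is only a sketch: the 200 bytes per tweet and the 340 M messages/day are the figures given in the two comments, and the variable names are made up for illustration.

```python
import math

# Figures taken from the comments above; names are illustrative.
LHC_BYTES_PER_S = 100e6      # low end of the recorded stream, 100 MB/s
TWEET_BYTES = 200            # commenter's guess, including overhead
TWEETS_PER_DAY = 340e6       # figure quoted from WP in the follow-up
SECONDS_PER_DAY = 24 * 3600

# How many 200-byte tweets per second would fill 100 MB/s?
tweets_at_lhc_rate = LHC_BYTES_PER_S / TWEET_BYTES

# What the quoted daily volume works out to per second.
actual_tweets_per_s = TWEETS_PER_DAY / SECONDS_PER_DAY
twitter_bytes_per_s = actual_tweets_per_s * TWEET_BYTES

# Gap between the two rates, in orders of magnitude.
orders = math.log10(LHC_BYTES_PER_S / twitter_bytes_per_s)

print(f"tweets/s needed to match 100 MB/s: {tweets_at_lhc_rate:,.0f}")
print(f"tweets/s from 340 M/day:           {actual_tweets_per_s:,.0f}")
print(f"gap: about {orders:.1f} orders of magnitude")
```

With these inputs the gap at 100 MB/s comes out at about two orders of magnitude; it grows if the recorded stream is taken at the higher end of the quoted range.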
Unfortunately by warrax_666 · 2012-08-03 08:25 · Score: 1

Unfortunately that isn't saying much.

--
HAND.
You're a little off there by Immerman · 2012-08-03 08:45 · Score: 1

Actually you're off by 28 orders of magnitude
1PB/s = 8e15 bits/s
8e15 bits/s *(3600s/h) *(24h/day)*(~365.25 days/year) ~= 2.5e23 bits/year
or 252,460,800,000,000,000,000,000 if you prefer counting zeros
even in stereo it'd only be 5e23.
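For anyone who wants to redo the sum, here is a minimal sketch using exact integer arithmetic. It assumes a decimal petabyte (10^15 bytes) and a 365.25-day year, as in the calculation above; the names are illustrative.

```python
BITS_PER_BYTE = 8
BYTES_PER_PETABYTE = 10**15                    # decimal petabyte (assumption)
SECONDS_PER_YEAR = 3600 * 24 * 36525 // 100    # 365.25 days, kept as an integer

bits_per_second = BYTES_PER_PETABYTE * BITS_PER_BYTE
bits_per_year = bits_per_second * SECONDS_PER_YEAR

print(f"{bits_per_year:,} bits/year")
print(f"{bits_per_year:.2e} bits/year")
```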

--
--- Most topics have many sides worth arguing, allow me to take one opposite you.
1. Re:You're a little off there by Travelsonic · 2012-08-04 03:21 · Score: 1
  
  My keyboard is being weird, probably omitted a few 0s when working with the calculations. Either way, it is still a mind boggling number of 1s and 0s. Wonder how long, in continual transfer, one would theoretically have to go to hit transfer of over a googolplex bits of information.
  
  --
  If you believe in privacy, and believe you have "nothing to hide" at the same time, you're a goddammed idiot
2. Re:You're a little off there by Immerman · 2012-08-04 17:55 · Score: 1
  
  Actually the other way around, you've got over twice as many zeros as you should have. You're right though, a mind boggling number regardless. Nowhere near a googolplex (10^googol) though, nor even a googol (10^100) bits. Using my number (~2.5e23 bits/year) you need ~4e76 years to transfer just one googol bits, or 40 thousand trillion trillion trillion trillion trillion trillion years, and the entire universe is currently only estimated to be ~15 billion years old. The universe will probably be so close to absolute heat death by then that it will have lost the ability to support any meaningful form of information processing long before you finished.
  Your question suggests you're intrigued by huge numbers but aren't familiar with scientific notation, the preferred form of dealing with them since people, or even computers, are typically not very good at it. (if I'm wrong then I apologize, ignore the rest of my post. Or better yet critique my explanation, I teach math from time to time) It actually isn't all that hard. Take a number like 2e7, that simply means 2 followed by 7 zeros
  2e7 = 20 000 000
  if there's a decimal place then just write everything after the decimal place "over the top of" the zeros (what you're really doing in either case is moving the decimal point to the right the specified number of times, adding zeros as necessary)
  2.364e7 = 23 640 000
  Multiplying numbers is pretty straightforward, just multiply the parts before the "e"s and add the parts after:
  2e4 * 3e5 = (2*3)e(4+5) = 6e9
  Division is similar - divide the parts before the "e" and subtract the parts after
  6e7 / 3e4 = (6/3)e(7-4) = 2e3
  Often though you'll end up with a number before the "e" that's larger than ten or smaller than one. That's not exactly wrong, anyone who sees the number will know exactly what it means, but it makes things all ugly again. To clean it up just remember that the number after the "e" means "move the decimal point to the right this many times". So let's say I end up with 0.0354e9. For "clean" scientific notation I want one (nonzero) digit to the left of the ".", so move the decimal place right until we get it (_'s show where the decimal "used to be" as I move it):
  0_0_3.54 - so we had to move it to the right 2 times.
  Now remember the part after the "e" tells us how many times we needed to move the decimal place to the right to get the real number, so if we originally needed to move it right 9 times, and then we moved it right two times to clean things up, we would still need to move it 9 - 2 = 7 times to get the real number. So the cleaned up number is:
  3.54e(9-2) = 3.54e7
  The same basic method is used when you end up with a number 10 or larger, except in this case we're moving the decimal place to the *left* to clean it up, so to get the real number afterwards we'd need to move it even farther to the right - so we add the number of clean-up steps instead of subtracting. Let's say we start with
  1538.0e4
  to clean up we have to move the decimal three places to the left
  1.5_3_8_0
  so the final number is
  1.5380e(4 + 3) = 1.538e7
  Notice that I dropped the last "trailing" zero; that's always safe to do. Remember we add zeros back in whenever we need them when we're moving the decimal to the right.
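The multiply, divide and clean-up rules described so far can be sketched in a few lines. This is an illustration, not a library: the helper names `normalize`, `multiply` and `divide` are made up here, numbers are passed as (mantissa, exponent) pairs, and `Decimal` is used so the shifted digits stay exact. The last line applies the division rule to a googol and the ~2.5e23 bits/year figure worked out earlier in the thread.

```python
from decimal import Decimal

def normalize(mantissa, exponent):
    """Shift the decimal point until exactly one nonzero digit is left of it."""
    m = Decimal(str(mantissa))
    if m == 0:
        return Decimal(0), 0
    while abs(m) >= 10:      # too big: move the point left, add to the exponent
        m /= 10
        exponent += 1
    while abs(m) < 1:        # too small: move the point right, subtract
        m *= 10
        exponent -= 1
    return m.normalize(), exponent   # Decimal.normalize() drops trailing zeros

def multiply(a, b):
    """Multiply the parts before the 'e', add the parts after."""
    return normalize(Decimal(str(a[0])) * Decimal(str(b[0])), a[1] + b[1])

def divide(a, b):
    """Divide the parts before the 'e', subtract the parts after."""
    return normalize(Decimal(str(a[0])) / Decimal(str(b[0])), a[1] - b[1])

def show(number):
    mantissa, exponent = number
    return f"{mantissa}e{exponent}"

print(show(multiply((2, 4), (3, 5))))        # 2e4 * 3e5
print(show(divide((6, 7), (3, 4))))          # 6e7 / 3e4
print(show(normalize("0.0354", 9)))          # 0.0354e9
print(show(normalize("1538.0", 4)))          # 1538.0e4
print(show(divide((1, 100), ("2.5", 23))))   # googol bits / (2.5e23 bits/year)
```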
  Adding or subtracting is slightly trickier, but not much. Basically we just need to get both numbers after the "e" to be the same first, and then we can just do normal math on the leading part and leave the "e" part alone. Say we want to know what 5e6 - 3e4 is. Just take the number with the bigger "e" number and move the decimal right, reducing the "e" number by one each time(since we've used up one of the moves it's telling us we need to make). In this case that's 5e6, so moving
  
  --
  --- Most topics have many sides worth arguing, allow me to take one opposite you.
Re:You mean... by LordLimecat · 2012-08-03 10:49 · Score: 1

ESX's cost is a bit of a PITA-- theres essentials plus, but of course that lacks DRS; and theres the free version which truly is nice for a single-server solution... but there are a lot of good contenders out there for less.
Im not gonna say that the others are garbage; I took a peek at Xen and really like that they dont gouge you to death for basic things like "can manage several servers at once". Im just saying that from my experience, as well as from listening to others in the recent ArsTechnica discussion on virtualization, it really sounds like VMware is still on top-- as long as you can pay.
Disclaimer: I just got my VCP and regularly do a lot of work with ESXi.
oracle infrastructure details? by Finite9 · 2012-08-08 22:06 · Score: 1

I'd like to know what their infrastructure looks like for storing that 1GB/s.
I was at OpenWorld in 2003 and they had some guy there from CERN giving a talk about how they were using Oracle9i (I read later that they upgraded to 10g, but no-doubt they upgrade to later versions relatively quickly), and he did mention that petabyte/s buzzword. It would be very interesting to know how it was all implemented, and how they manage to write 1GB/s to disk. Must be some serious RAC clustering going on, and some serious disk bandwidth capacity, but I wonder when they purge the data. After all, 1GB/s is going to require massive storage, and if you've gone with one of the big boys for storage like NetApp then it won't only be their research that could be termed as 'astronomical'.
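To put a number on "massive storage", here is a rough sketch of what a sustained 1 GB/s would add up to. It assumes decimal units and, unrealistically, that the stream never stops, so treat the results as an upper bound rather than CERN's actual figures.

```python
# Upper-bound estimate: assumes 1 GB/s written around the clock, all year.
RATE_BYTES_PER_S = 1e9
SECONDS_PER_DAY = 24 * 3600
DAYS_PER_YEAR = 365.25

bytes_per_day = RATE_BYTES_PER_S * SECONDS_PER_DAY
bytes_per_year = bytes_per_day * DAYS_PER_YEAR

print(f"per day:  {bytes_per_day / 1e12:.1f} TB")
print(f"per year: {bytes_per_year / 1e15:.1f} PB")
```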

--
"Everyone knows that vi vi vi is the number of the beast" -- Richard Stallman