Slashdot Mirror


LHC Data Generation Expected To Scale Up To 400PB a Year

DW100 writes: CERN has said it expects its experiments with the Large Hadron Collider to generate as much as 400PB of information per year by 2023 as the scope of its work continues to expand. Currently, LHC experiments have generated an archive of 100PB, and this is growing by 27PB per year. CERN infrastructure manager Tim Bell, speaking at the OpenStack Summit in Paris, said the organization is using OpenStack to underpin this huge data growth, hoping it can handle such vast reams of potentially universe-altering information.

8 of 99 comments

  1. universe-altering information? by Racemaniac · · Score: 2

    You mean how we see the universe? Because I doubt the universe cares much about the data we generate...

    1. Re:universe-altering information? by lymond01 · · Score: 2

      I sat through a lecture on the Higgs boson. It explained why they were expecting it -- basically the final jigsaw puzzle piece to a long-standing theory. If the theory was correct, they would be able to find the Higgs boson at certain energy levels. If they didn't find it, then it's back to the drawing board to figure out what they missed. So no, they weren't necessarily doing basic "Let's ram particles together and see what we get" science -- we've been doing that for decades. This was more of a "If we ram these particles together at this velocity, this is what we should get." And we got it.

  2. Compared to Facebook by pmontra · · Score: 3, Informative

    To put this in perspective, Facebook says it generates 4 PB per day, which is about 3.6 times the LHC's projected 400 PB per year. Does anybody know of anything generating more data than that?
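
    A quick check of that ratio, as a rough sketch using only the figures quoted in this thread (4 PB per day for Facebook, 400 PB per year projected and 27 PB per year current for the LHC). The 365-day year and the function names are assumptions made for illustration.

```python
# Back-of-the-envelope comparison using only figures quoted in the thread.
FACEBOOK_PB_PER_DAY = 4        # figure quoted for Facebook
LHC_PB_PER_YEAR_2023 = 400     # CERN's projected rate from the summary
LHC_PB_PER_YEAR_NOW = 27       # current archive growth from the summary
DAYS_PER_YEAR = 365            # assumption: non-leap year, generation every day


def facebook_pb_per_year():
    """Facebook's quoted daily rate scaled to a full year."""
    return FACEBOOK_PB_PER_DAY * DAYS_PER_YEAR


def ratio_to_lhc(lhc_pb_per_year):
    """How many times larger Facebook's yearly volume is than the LHC's."""
    return facebook_pb_per_year() / lhc_pb_per_year


print(facebook_pb_per_year())                        # 1460
print(round(ratio_to_lhc(LHC_PB_PER_YEAR_2023), 2))  # 3.65
print(round(ratio_to_lhc(LHC_PB_PER_YEAR_NOW), 1))   # 54.1
```

    Against the current 27 PB per year the gap is much wider than against the 2023 projection, so the roughly 3.6x figure only holds for the projected rate.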

    1. Re:Compared to Facebook by Anonymous Coward · · Score: 2

      NSA, they generate all the data of everyone combined.

    2. Re:Compared to Facebook by Thanshin · · Score: 5, Funny

      So, the LHC should just create a Facebook profile and store all the data on steganographied selfies and baby pictures.

  3. Re:Compress it by Anonymous Coward · · Score: 2, Informative

    The raw data rate that's generated by the particle detectors themselves is unreal. Based on some poorly remembered numbers of the this-many-of-that variety, I think it's in the region of 10TB/second: 144 2.5Gsps 8-bit channels per card, a few dozen cards per cartridge, and some dozens of cartridges that run like a ring around the ATLAS detector's front.
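
    How an estimate like this is assembled can be sketched as follows. The per-card figures (144 channels, 2.5 Gsps, 8 bits) and the 10TB/second total come from the paragraph above; decimal units (1 TB = 1e12 bytes) and the function names are assumptions, and the card and cartridge counts are left out because the comment gives no exact values.

```python
# Per-card readout rate from the channel figures quoted in the comment.
CHANNELS_PER_CARD = 144
SAMPLES_PER_SECOND = 2.5e9          # 2.5 Gsps per channel
BITS_PER_SAMPLE = 8
QUOTED_TOTAL_BYTES_PER_S = 10e12    # "in the region of 10TB/second" (decimal TB assumed)


def card_bytes_per_second():
    """Bytes per second produced by one card if every channel streams continuously."""
    return CHANNELS_PER_CARD * SAMPLES_PER_SECOND * BITS_PER_SAMPLE / 8


def cards_for_rate(total_bytes_per_s):
    """Number of such cards needed to reach a given aggregate rate."""
    return total_bytes_per_s / card_bytes_per_second()


print(card_bytes_per_second() / 1e9)                      # 360.0 (GB/s per card)
print(round(cards_for_rate(QUOTED_TOTAL_BYTES_PER_S), 1)) # 27.8 cards
```

    On these numbers about 28 cards would already reach 10TB/second, which is fewer than "dozens of cartridges" of "a few dozen cards" suggests. That fits the caveat that the figures are poorly remembered, so treat the result as an order-of-magnitude illustration only.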

    The first level trigger/filter rejects the 99.5% of events that are boring and dumb (two protons strike a glancing blow and emit photons; two protons' quarks exchange a gluon and fire out jets, ...). The second level trigger does more detailed analysis of a torrent of data somewhere in the GB/second range. The final stage records the best-available-reconstruction data on several hundred events per second, and I think it ends up at around a few hundred MB/s or less.

    The recorded data will not be appreciably further compressible; events are just a list of the origin, 4-momenta and tagged types of particles associated with each event.

    It's already undergone its first refit/upgrade coincident with the magnet re-engineering shutdown: increased luminosity and increased energy mean more events per crossing (from ~6 to ~20), requiring faster particle trackers and faster event readout (to increase from 400 to 1000Hz at the final to-disk stage).

    The second shutdown/refit around 2020 is planned to increase the luminosity even further, resulting in many dozens of interactions per crossing and requiring even faster electronics. I believe they're going to completely replace the ATLAS inner tracking detector because by that point it'll have absorbed over one million rads of radiation...
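
    The reduction chain described in this comment can be put into rough numbers. This is only a sketch: the 10TB/second raw rate, the 99.5% first-level rejection and the 400Hz and 1000Hz to-disk rates are quoted above, but 300 MB/s is an assumed stand-in for "a few 100MB/s", and holding the event size constant after the upgrade is also an assumption.

```python
# Rough numbers for the trigger chain, using figures quoted in the comment
# plus one labeled assumption for the recorded rate.
RAW_BYTES_PER_S = 10e12            # quoted raw detector rate (decimal TB assumed)
RECORDED_BYTES_PER_S = 300e6       # ASSUMPTION: stand-in for "a few 100MB/s"
RECORDED_EVENTS_PER_S = 400        # quoted to-disk rate before the upgrade
UPGRADED_EVENTS_PER_S = 1000       # quoted to-disk rate after the upgrade
LEVEL1_REJECT_FRACTION = 0.995     # quoted first-level rejection


def overall_reduction():
    """Factor by which the raw stream shrinks before it reaches disk."""
    return RAW_BYTES_PER_S / RECORDED_BYTES_PER_S


def bytes_per_event():
    """Average recorded event size implied by the assumed rate."""
    return RECORDED_BYTES_PER_S / RECORDED_EVENTS_PER_S


def level1_keeps_one_in():
    """First-level trigger keeps roughly one event in this many."""
    return 1 / (1 - LEVEL1_REJECT_FRACTION)


def upgraded_bytes_per_s():
    """To-disk rate after the upgrade if the event size stayed the same."""
    return bytes_per_event() * UPGRADED_EVENTS_PER_S


print(round(overall_reduction()))    # 33333
print(bytes_per_event() / 1e6)       # 0.75 (MB per event)
print(round(level1_keeps_one_in()))  # 200
print(upgraded_bytes_per_s() / 1e6)  # 750.0 (MB/s)
```

    Under these assumptions only about one byte in 33,000 reaches disk, and the move to 1000Hz would lift the recorded rate to roughly 750 MB/s.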

  4. Actually the LHC is bigger! by Roger+W+Moore · · Score: 2

    Actually the LHC generates more data than this. The talk only covers the data at CERN. The last count of all the files in the ATLAS experiment's DQ2 store (a distributed dataset access system with storage around the globe) was 161PB. This value includes all the simulated data, analysis data, etc. I'm certain CMS has a comparable amount, and then there are ALICE and LHCb as well, so the total will be well over the 300PB which Facebook stores.

    While Facebook generates 4 PB of new data per day, they only store 300 PB according to that page, so most of this is either discarded or overwrites existing data. If we look at the LHC, the raw data rate is probably about 1 PB/min, but we throw away most of this (using computers on the surface, not 100m underground as the original talk says) because it's physics we already know about and we can't afford the storage for it. Then there is the generation of new data by analysis and simulation to include.

    So if you actually look at the whole system, not just what is at CERN, we have a larger total storage capacity and generate more data than Facebook...and we plan to scale up.
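
    The rates quoted in this comment can be compared directly. This is a rough sketch that takes the "probably about 1 PB/min" raw rate at face value and assumes it is sustained for a full day, which the comment does not claim; the names are illustrative.

```python
# Comparing the quoted LHC raw rate with the quoted Facebook figures.
MINUTES_PER_DAY = 24 * 60
LHC_RAW_PB_PER_MIN = 1        # "probably about 1 PB/min" before filtering
FACEBOOK_PB_PER_DAY = 4       # quoted generation rate
FACEBOOK_STORED_PB = 300      # quoted total storage


def lhc_raw_pb_per_day():
    """Raw LHC volume per day, ASSUMING the quoted rate ran around the clock."""
    return LHC_RAW_PB_PER_MIN * MINUTES_PER_DAY


def raw_rate_ratio():
    """LHC raw rate relative to Facebook's quoted generation rate."""
    return lhc_raw_pb_per_day() / FACEBOOK_PB_PER_DAY


def minutes_to_match_facebook_store():
    """Minutes of raw LHC output equal to Facebook's quoted stored total."""
    return FACEBOOK_STORED_PB / LHC_RAW_PB_PER_MIN


print(lhc_raw_pb_per_day())               # 1440
print(raw_rate_ratio())                   # 360.0
print(minutes_to_match_facebook_store())  # 300.0 (five hours)
```

    Nearly all of that raw stream is discarded by the filtering described above, so the figure bounds what the detectors produce, not what is stored.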

  5. Re:A new theory by justthinkit · · Score: 2

    (1) You want specific predictions? Hard numbers? Here you go.

    (2) As to "new theories are a dime a dozen, I get two new ones a week." Do you ever ask yourself why this is so? Do chemists get two new theories of chemistry a week? No. Because they have a good base model. I maintain that physics lacks a good base model.

    (3) "Too many people don't realize the vast number of predictions made by current theories that have been tested by measurement."

    The most obviously broken parts of physics, like the inflationary miracle after the Big Bang, are based on what measurements, exactly? The CMB? The same CMB that BICEP 2 based its nonsense on?

    And Black Hole information retention is based on...?

    And our completely broken notion of how stars should be orbiting the Black Hole at the center of our galaxy is a confirmation of our theory? Surely you jest.

    Wikipedia's list of Unsolved Problems in Physics has, by my count, 148 questions (and another 74 things that need to be discussed). Wiki's corresponding U.P. in Chemistry looks to have 25 or 30.

    There are at least a few modern physicists trying to deal with the horrendous state of physics today -- Lee Smolin, Frank Close, Peter Woit, and Amit Goswami come immediately to mind. Others like Anton Z Capri (& Feynman & Einstein) at least kept their sense of humor throughout their careers.

    Far too many are followers, and the system encourages this, big time. Lee Smolin talked about this, and how he tried to go against his gut at first, before ultimately coming out with Loop Quantum Gravity.

    So is it all a bed of roses to you, "Roger"? Or have you a better theory? Or are you just interested in nitpicking?

    --
    I come here for the love