LHC Data Generation Expected To Scale Up To 400PB a Year
DW100 writes: CERN has said it expects its experiments with the Large Hadron Collider to generate as much as 400PB of information per year by 2023 as the scope of its work continues to expand. Currently LHC experiments have generated an archive of 100PB and this is growing by 27PB per year. CERN infrastructure manager Tim Bell, speaking at the OpenStack Summit in Paris, said the organization is using OpenStack to underpin this huge data growth, hoping it can handle such vast reams of potentially universe-altering information.
To put this in perspective, Facebook claims to generate 4 PB per day, which is about 1,460 PB per year, or roughly 3.65 times the LHC's projected 400 PB per year. Does anybody know of anything generating more data than that?
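For anyone who wants to check the arithmetic, here is a quick sketch using only the figures quoted in this thread (4 PB/day for Facebook, 400 PB/year projected and 27 PB/year current growth for the LHC). The 365-day year and the variable names are my own assumptions.

```python
# Rough comparison of the data volumes quoted in the thread.
# Assumption: a 365-day year with a constant daily rate.
DAYS_PER_YEAR = 365

facebook_pb_per_day = 4          # figure quoted for Facebook
lhc_projected_pb_per_year = 400  # CERN's projection for 2023
lhc_current_pb_per_year = 27     # current archive growth

facebook_pb_per_year = facebook_pb_per_day * DAYS_PER_YEAR
ratio_vs_projected = facebook_pb_per_year / lhc_projected_pb_per_year
ratio_vs_current = facebook_pb_per_year / lhc_current_pb_per_year

print(f"Facebook: {facebook_pb_per_year} PB/year")
print(f"vs projected LHC rate: {ratio_vs_projected:.2f}x")
print(f"vs current LHC growth: {ratio_vs_current:.1f}x")
```

Against today's 27 PB/year growth the gap is much larger than against the 2023 projection, which is why the comparison depends on which LHC figure you pick.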
The raw data rate generated by the particle detectors themselves is unreal. Based on some poorly remembered numbers of the this-many-of-that variety, I think it's in the region of 10 TB/second: 144 channels per card, each 8-bit at 2.5 Gsps, a few dozen cards per cartridge, and some dozens of cartridges arranged in a ring around the front of the ATLAS detector.
The first-level trigger/filter rejects the 99.5% of events that are boring and dumb (two protons strike a glancing blow and emit photons; two protons' quarks exchange a gluon and fire out jets, ...). The second-level trigger does more detailed analysis of a torrent of data somewhere in the GB/second range. The final stage records the best-available-reconstruction data on several hundred events per second, and I think it ends up at a few hundred MB/s or less.
The recorded data will not be appreciably further compressible; events are just a list of the origin, 4-momenta and tagged types of the particles associated with each event.
It's already undergone its first refit/upgrade, coincident with the magnet re-engineering shutdown: increased luminosity and increased energy mean more events per crossing (from ~6 to ~20), requiring faster particle trackers and faster event readout (increasing from 400 to 1000 Hz at the final to-disk stage).
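To get a feel for what the faster readout means for storage bandwidth, here is a back-of-envelope sketch. The 400 Hz and 1000 Hz rates are the ones quoted above; the 300 MB/s starting throughput is purely my assumption standing in for "a few hundred MB/s", and the sketch assumes the recorded event size stays the same after the upgrade, which may well not hold with more interactions per crossing.

```python
# Back-of-envelope scaling of the to-disk bandwidth with readout rate.
# Assumption: 300 MB/s stands in for "a few hundred MB/s" at 400 Hz.
# Assumption: recorded event size does not change after the upgrade.
ASSUMED_THROUGHPUT_MB_S = 300.0

old_rate_hz = 400    # to-disk event rate before the refit
new_rate_hz = 1000   # to-disk event rate after the refit

# Implied average size of one recorded event, in MB.
event_size_mb = ASSUMED_THROUGHPUT_MB_S / old_rate_hz

# Bandwidth needed at the new rate if event size is unchanged.
new_throughput_mb_s = event_size_mb * new_rate_hz

print(f"implied event size: {event_size_mb:.2f} MB")
print(f"throughput at {new_rate_hz} Hz: {new_throughput_mb_s:.0f} MB/s")
```

Under those assumptions the to-disk rate scales by the same factor of 2.5 as the readout frequency; a different guess for the starting throughput changes the absolute numbers but not that ratio.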
The second shutdown/refit around 2020 is planned to increase the luminosity even further, resulting in many dozens of interactions per crossing and requiring even faster electronics. I believe they're going to completely replace the ATLAS inner tracking detector because by that point it'll have absorbed over one million rads of radiation...