Slashdot Mirror


CERN Collider To Trigger a Data Deluge

slashthedot sends us to High Productivity Computing Wire for a look at the effort to beef up computing and communications infrastructure at a number of US universities in preparation for the data deluge anticipated later this year from two experiments coming online at CERN. The collider will smash protons together hoping to catch a glimpse of the subatomic particles that are thought to have last been seen at the Big Bang. From the article: "The world's largest science experiment, a physics experiment designed to determine the nature of matter, will produce a mountain of data. And because the world's physicists cannot move to the mountain, an army of computer research scientists is preparing to move the mountain to the physicists... The CERN collider will begin producing data in November, and from the trillions of collisions of protons it will generate 15 petabytes of data per year... [This] would be the equivalent of all of the information in all of the university libraries in the United States seven times over. It would be the equivalent of 22 Internets, or more than 1,000 Libraries of Congress. And there is no search function."

226 comments

  1. OT: The size of the internet by chriss · · Score: 5, Informative

    Okay, the Library of Congress has been estimated to contain about 10 Terabytes, so I buy the 1000 * LoC = 15 Petabytes. But archive.org alone expanded its storage capacity to 1 Petabyte in 2005, so CERN is not going to generate anything near "22 Internets" (whatever that might be). This estimate from 2002 calculates the size of the internet as about 530 Exabytes, 440 Exabytes of which are email, and 157 Petabytes for the "surface web".

    1. Re:OT: The size of the internet by vitya404 · · Score: 2, Insightful

      Have you read that article? Firstly, what you say exa, is peta really. But in my view, the size of the internet is the data available through the internet. And my emails are not available through the web (hopefully). And while the data transmitted through the network is redundant and a huge part of it is worthless (e.g. my post), this experiment will give us an enormous amount of meaningful, and therefore valuable, data.

    2. Re:OT: The size of the internet by Walking+The+Walk · · Score: 1

      Oops, I think you misread your source. Following your link goes to a report by Berkeley that has a table in "2002 Terabytes", not petabytes as you erroneously quoted. So that's 157 Terabytes for the "Surface Web", 93 Petabytes for the "Deep Web", and 532 Petabytes total. I agree that the summary quote underestimates the size of the web, but please don't exaggerate the data from your sources.

      --
      A recursive sig
      Can impart wisdom and truth
      Call proc signature()
    3. Re:OT: The size of the internet by Ceriel+Nosforit · · Score: 1

      You assume the Archive use all that capacity...

      Either way, the Archive also keeps old versions of the sites, meaning multiple copies of what is essentially the same site.

      --
      All rites reversed 2010
    4. Re:OT: The size of the internet by chriss · · Score: 3, Informative

      Firstly, what you say exa, is peta really.

      Me bad, miscalculated, off by a factor of 1000.

    5. Re:OT: The size of the internet by Servo · · Score: 1

      That's 22 times the legitimate content of the internet. Including porn, it's only about 0.3 Internets.

      --
      A slip of the foot you may soon recover, but a slip of the tongue you may never get over. -Benjamin Franklin
    6. Re:OT: The size of the internet by Anonymous Coward · · Score: 5, Funny

      We are from NASA, and would like to offer you a job in mission planning.

    7. Re:OT: The size of the internet by joto · · Score: 2, Interesting

      Meaningful and valuable to whom? If I had to make the choice between using the bandwidth and storage space to store your post, or to store half a kilobyte of CERN sensor data, I would actually choose to store your post. And it's not because I find your post particularly valuable. It's because the CERN data is as meaningless to me as line-noise would be. For me even donkey bukkake with midgets is more meaningful, than random sensor data from CERN. Only when the scientists make discoveries from it that carry important philosophical, economic, and/or practical benefits or changes do I become interested.

    8. Re:OT: The size of the internet by mooingyak · · Score: 1

      For me even donkey bukkake with midgets is more meaningful, than random sensor data from CERN.

      I think I'd rather have the random sensor data, given those two options. It's kind of like staring at the wall in front of you when you're at a urinal. It's not that the wall is so interesting...

      --
      William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
    9. Re:OT: The size of the internet by Jeff+DeMaagd · · Score: 1

      If you mean by the size of the internet being how many bytes you can get to if you want, then that's only half the concern. Getting access to the bytes in a timely manner is another serious concern.

    10. Re:OT: The size of the internet by Anonymous Coward · · Score: 1, Funny

      For me even donkey bukkake with midgets is more meaningful, than random sensor data from CERN.
      Link plz.
    11. Re:OT: The size of the internet by FST777 · · Score: 1

      But the archive is part of the web, no? As is the Google cache...

      --
      Free beer is never free as in speech. Free speech is always free as in beer.
    12. Re:OT: The size of the internet by Belacgod · · Score: 1

      And how much of that is spam, spyware, and porn?

    13. Re:OT: The size of the internet by AI0867 · · Score: 1

      That's going to be the next failure right there, it'll be off by a factor of 1.024

    14. Re:OT: The size of the internet by Anonymous Coward · · Score: 0

      How many hard disks to transport in the airplane?

      That's more stupid than using ssh to execute the scripts on the remote massively parallel servers.

  2. Too much for the 'Net by DTemp · · Score: 3, Insightful

    I hope they're planning on running their own fiber optic line across the Atlantic, or shipping a lot of hard drives, 'cause that's too much data to pass over the public internet.

    FYI 15 petabytes per year = 120 petabits per year = 120,000,000 gigabits per year

    120,000,000 gigabits per year / ~30,000,000 seconds per year = 4gbps of continuous transmission. They could run a fiber across the Atlantic that could handle 4gbps.
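
    The arithmetic above can be re-run as a rough sketch. It assumes decimal units (1 PB = 10^15 bytes) and a 365-day year; the function name is made up for illustration. With the exact number of seconds in a year the average comes out slightly under the rounded 4 Gbps.

```python
# Rough check of the average rate needed to move 15 PB in one year.
# Assumptions (not from the article): decimal petabytes, 365-day year.

PETABYTE = 10**15                    # bytes, decimal convention
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000 s; the post rounds to ~30,000,000


def sustained_gbps(petabytes_per_year: float) -> float:
    """Average rate in gigabits per second for a given yearly data volume."""
    bits_per_year = petabytes_per_year * PETABYTE * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    print(f"15 PB/year is about {sustained_gbps(15):.2f} Gbps on average")
```

    This is only an average; it says nothing about peak rates or about re-sending derived data between sites.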

    1. Re:Too much for the 'Net by Anonymous Coward · · Score: 2, Informative

      They could run a fiber across the Atlantic that could handle 4gbps.

      They have been getting sustained performance (with simulated data) of more than that for several years now. This is the sort of thing that Internet2 does well, when it's not on fire.

    2. Re:Too much for the 'Net by anzev · · Score: 1

      Don't forget these are the guys who INVENTED the internet. It wouldn't be as you know it today if it weren't for them. So I wouldn't be surprised if they actually do something über amazing to transfer all that data.

    3. Re:Too much for the 'Net by Cyberax · · Score: 1

      Actually, it's not that much data.

      Two hard drives can fit 1TB of data now (1TB hard drives are also available), so 15PB can fit on 'just' 30,000 hard drives. A large number, but manageable.

    4. Re:Too much for the 'Net by Anonymous Coward · · Score: 2, Interesting

      "They could run a fiber across the Atlantic that could handle 4gbps."

      The .eu academic networks have a lot more transatlantic bandwidth than that already. When I worked at JANET (the UK academic network) we were one hop from .us and had 10G transatlantic bandwidth (how much of that was on-demand I can't remember). Geant, the .eu research network interconnect, also has direct connections to the .us research networks. The bandwidth is in place and has been for some time. It's being updated right now as well.

      Check out http://www.geant2.net/

    5. Re:Too much for the 'Net by grimJester · · Score: 1

      Is it really too much? The average torrent release of a popular TV show spreads to hundreds of users at an average of perhaps a megabit / second. University networks can probably handle that load without problem right now.

    6. Re:Too much for the 'Net by DarthGregor · · Score: 1

      The solution is "The Grid", a parallel network used to transmit and process data to/from the seven processing and archiving stations around the world. Here's how it works: http://gridcafe.web.cern.ch/gridcafe/index.html

    7. Re:Too much for the 'Net by Anonymous Coward · · Score: 0

      Don't forget these are the guys who INVENTED the internet.

      Is it really so hard to believe that Al Gore invented it?
    8. Re:Too much for the 'Net by ender- · · Score: 2, Funny

      Is it really too much? The average torrent release of a popular TV show spreads to hundreds of users at an average of perhaps a megabit / second. University networks can probably handle that load without problem right now.

      Um, no they can't, they're full to the brim with torrent traffic. :)

    9. Re:Too much for the 'Net by bockelboy · · Score: 4, Interesting

      That's 4Gbps AVERAGE, meaning it's much below the peak rate. That's also the raw data stream, not accounting for site X in the US wanting to read reconstructed data from site Y in Europe.

      LHC-related experiments will eventually have 70 Gbps of private fibers across the Atlantic (most NY -> Geneva, but at least 10 Gbps NY -> Amsterdam), and at least 10 Gbps across the Pacific.

      For what it's worth, here are the current transfer rates for one LHC experiment. You'll notice that there's one site, Nebraska (my site), which averages 3.2 Gbps over the last day. That's a Tier 2 site, meaning it won't even receive the raw data, just reconstructed data.

      Our peak is designed to be 200TB / week (2.6 Gbps averaged over a whole week). That's one site out of 30 Tier 2 sites and 7 Tier 1 sites (each Tier 1 should be about four times as big as a Tier 2).

      Of course, the network backbone work has been progressing for years. It's to the point where Abilene, the current I2 network, rarely is at 50% capacity.

      The network part is easy; it's a function of buying the right equipment and hiring smart people. The extremely hard part is putting disk servers in place that can handle the load. When we went from OC-12 (622 Mbps) to OC-192 (~10 Gbps), we had RAIDs crash because we wrote at 2 Gbps on some servers for days at a time. Try building up such a system without the budget to buy high-end Fibre Channel equipment too!

      And yes, I am on a development team that works to provide data transfer services for the CMS experiment.

    10. Re:Too much for the 'Net by FireBug · · Score: 1

      Brian? Is that you?

      Heh :)

    11. Re:Too much for the 'Net by markov_chain · · Score: 2, Interesting

      If they could get 1GB/s sustained, it would take them... 173 days to transfer 15PB. I hope they have dark fiber to light up!
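
      The 173-day figure can be checked with a small sketch, assuming decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes); the helper name is hypothetical. It also shows what fraction of a year such a link would be busy if the 15 PB arrive spread over twelve months.

```python
# Transfer time for a fixed volume at a sustained rate.
# Assumptions: decimal units, perfectly sustained throughput, no overhead.

SECONDS_PER_DAY = 24 * 3600


def transfer_days(total_bytes: float, bytes_per_second: float) -> float:
    """Days needed to move total_bytes at a constant bytes_per_second."""
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    days = transfer_days(15e15, 1e9)   # 15 PB at 1 GB/s
    print(f"{days:.1f} days")
    # Fraction of a 365-day year the link would be busy with this data.
    print(f"link utilisation over a year: {days / 365:.0%}")
```

      In other words, a sustained 1 GB/s link would keep up with a year of data in under half a year of transfer time.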

      --
      Tsunami -- You can't bring a good wave down!
    12. Re:Too much for the 'Net by kestasjk · · Score: 2, Informative

      They're not going to run the particle accelerator for a day and then spend half a year transferring all the data generated; the lifetime of a particle accelerator is longer than 173 days.

      --
      // MD_Update(&m,buf,j);
    13. Re:Too much for the 'Net by kestasjk · · Score: 2, Funny
      Oh wait, this is Slashdot.
      • Okay, so that's 15 petabytes *tapping on calculator* that's 3.4x10^29 bits.
      • Taking the maximum data rate from a given node as 3 gigabits per second, and taking into account the effect of bandwidth increases over time.. *tapping on calculator*
      • Okay, and taking the average mosquito lifetime as 20 days.. *tapping on calculator*
      • *breaks into a cold sweat*
      • Now, assuming mutations in mosquitos occur at a rate of 1 base pair per generation, *tap tap tap* and that our genes are different from mosquitos by 2.4x10^6 base pairs.. *more tapping on calculator*
      By the time they have transferred this data to scientists across the world mosquitos will have become the new dominant species.
      --
      // MD_Update(&m,buf,j);
    14. Re:Too much for the 'Net by perturbed1 · · Score: 1
      Too many CERN people hanging out on this thread! Hey, people. Get back to work. Coffee time is past!

      *ducks and then goes back to read some article on some exotic particle which we will never find*

    15. Re:Too much for the 'Net by BlueStraggler · · Score: 1

      120,000,000 gigabits per year / ~30,000,000 seconds per year = 4gbps of continuous transmission. They could run a fiber across the Atlantic that could handle 4gbps.

      I doubt they are going to dump *all* of their raw data on the poor ol' US of A, or they would have built the accelerator on this side of the Atlantic in the first place. In fact, it's highly unlikely that they are going to dump all of their raw data on *themselves*, as a major part of these experiments is filtering, cutting, and compressing the data in real time to bring the volume down to the storage farm's throughput capacity. Furthermore, the detector is not simply on or off; it can be turned on in various modes, with various types of beam, and various subdetectors enabled or disabled. Huge amounts of data are collected with pretty much nothing happening, to measure sensors' "dark current" response, or to measure the effects produced by cosmic rays. And vast volumes of calibration data are collected, in which no interesting physics is expected to occur. Nobody wants *all* of this data. The data from these special runs will get broken out into different analysis streams for statistical control. Particular teams will only grab certain streams from certain runs that pertain to their jobs, and pull those down to their local labs, which are located all over the world, not just the USA.

      And the major job of early analysis passes is data reduction. For example, converting collections of sensor readings to energy and time of flight values, converting collections of those to particle tracks, converting those to particle identifications, etc. Anyone requiring a "big picture" data set to do some high-level physics with is going to be grabbing a reduced data set such as this, and the data volume will be much lower.

    16. Re:Too much for the 'Net by Barryke · · Score: 1

      I dreamt about that.

      Too bad it wasn't me at the controls. :(

      --
      Hivemind harvest in progress..
    17. Re:Too much for the 'Net by Anonymous Coward · · Score: 0

      By the time they have transferred this data to scientists across the world mosquitos will have become the new dominant species.

      I live in Florida, and I'm here to tell you, it has begun.
    18. Re:Too much for the 'Net by Anonymous Coward · · Score: 0

      They are waiting for the beam!

    19. Re:Too much for the 'Net by lordholm · · Score: 1

      I suggest the unit Bm/s to determine what transmission medium should be used.

      E.g. one shipload of DVDs (say 100 000 discs), transmitted over the Atlantic (5 572 km), during a one month trip gives us: 100 000 * 4.7 GB * 5 572 km / (86 400 s/day * 30 days) ≈ 1.01*10^15 Bm/s ≈ 1.01 P Bm/s.

      Try to beat that with a wire...
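
      A rough sketch of the Bm/s unit, using 86,400 seconds per day and decimal gigabytes. The function name is made up, and treating a link's Bm/s as throughput times distance is an assumption about how the unit is meant to apply to a wire. The 4 Gbps figure is the average rate estimated earlier in the thread.

```python
# Byte-metres per second: data volume times distance, divided by time.
# Assumptions: 4.7 GB per DVD (decimal), 30-day crossing, 5,572 km route.

SECONDS_PER_DAY = 86_400


def byte_metres_per_second(n_bytes: float, metres: float, seconds: float) -> float:
    """Bm/s for moving n_bytes over a distance of metres in the given time."""
    return n_bytes * metres / seconds


if __name__ == "__main__":
    distance_m = 5_572e3
    ship = byte_metres_per_second(100_000 * 4.7e9, distance_m, 30 * SECONDS_PER_DAY)
    # A link carrying 4 Gbps moves 0.5e9 bytes every second over the same distance.
    wire = byte_metres_per_second(4e9 / 8, distance_m, 1)
    print(f"ship: {ship:.3e} Bm/s")
    print(f"wire: {wire:.3e} Bm/s")
```

      Under these assumptions the shipload lands at roughly 10^15 Bm/s, the same order of magnitude as a single 4 Gbps link over the same distance.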

      --
      "Civis Europaeus sum!"
    20. Re:Too much for the 'Net by stevelinton · · Score: 1

      The 15PB/year is after considerable reduction. As I recall the hadron pulses are something between 10 and 100 nanoseconds apart, so each detector sees between 10^7 and 10^8 collisions per second.
      For each of these, the detector is producing data from millions of sensor elements (these things are the size of a large house, and basically crammed with pixels). So the really raw data is well into the terabytes per second. About 90% of this data is thrown away by the sensor electronics and it then gets whittled down through several more layers of filtering and compression before it's ever recorded at all.

      The 15PB/year is the data that will be shipped to a dozen or two tier 1 data centres, 7 of which are in the US, where it will be stored and made available for analysis. A well-defined set of further reduced data will be shipped to a larger number of tier 2 centres, and so on.

    21. Re:Too much for the 'Net by idontgno · · Score: 1

      Well, I for one welcome our particle-accelerating blood-sucking overlords.

      --
      Welcome to the Panopticon. Used to be a prison, now it's your home.
  3. No Search Function by tacocat · · Score: 4, Interesting

    Google it?

    If Google is so awesome, maybe they can put their money where their mouth is and do something commendable. Of course, they'll probably have a hard time turning this data into marketing material.

    1. Re:No Search Function by gedhrel · · Score: 3, Informative

      Well, there _is_ a search function, and that's what the tier-2 sites will be running. The data describes individual experiments (that is, individual collisions) and comes off LHC at a whacking rate. There's some front-end processing to throw away a lot of it before what's left gets sent to the tier-1 sites for further distribution.

      The data is suitable for high-throughput (ie, batch processing) and the idea is to keep copies of the experimental data in several places during processing. Interesting results get flagged up by the batch processing for further study.

    2. Re:No Search Function by idiat · · Score: 0

      Google are involved in this at some level; they have a project where they ship storage arrays to science labs, who then load them up with data before Google ship them to the universities who use it. It's the old cartload of tapes argument again!

      --
      And remember folks, Gnu's *not* unix.
    3. Re:No Search Function by Raptoer · · Score: 2, Interesting

      The problem is less that there is no search function (with digital data all you're doing is matching one pattern to another), the problem is more that you don't know exactly what you are searching for!
      My guess is that they are looking for anomalies within the data that would indicate the presence of one of these subatomic particles. My guess furthermore is that once they get enough data analyzed they will be able to form a model to base a search function around.
      That or the summary lies (wouldn't be the first time) and in fact they know exactly what they are searching for, and they have a search function, but of course someone has to look at the output of those functions to determine what impact they have on their model/ideas.

    4. Re:No Search Function by scheme · · Score: 4, Informative

      The problem is less that there is no search function (with digital data all you're doing is matching one pattern to another), the problem is more that you don't know exactly what you are searching for!
      My guess is that they are looking for anomalies within the data that would indicate the presence of one of these subatomic particles. My guess furthermore is that once they get enough data analyzed they will be able to form a model to base a search function around.
      That or the summary lies (wouldn't be the first time) and in fact they know exactly what they are searching for, and they have a search function, but of course someone has to look at the output of those functions to determine what impact they have on their model/ideas.

      For a lot of the physics, the researchers know what they are looking for. For example, with the Higgs boson, theories constrain the decay and production to certain channels that have characteristic signatures. So they would be looking for events that have a muon at a certain energy with a hadron jet with another given energy coming off x degrees away and so on. There have been Monte Carlo simulations and other calculations done to predict what the interesting events should look like using various different theories. Of course there may be interesting events that pop up that no one has predicted, but everyone has a fairly good idea of what the expected events should look like.
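
      As a toy illustration of that kind of signature matching (not any experiment's actual code), here is a minimal sketch. The `Event` and `Signature` types, their field names, and all the numeric windows are hypothetical; real selections involve far more quantities and statistical treatment.

```python
# Toy event selection: keep events whose muon energy, jet energy and
# opening angle all fall inside windows predicted by some model.
# Every name and number here is a made-up placeholder.
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Window = Tuple[float, float]  # inclusive (low, high) bounds


@dataclass(frozen=True)
class Event:
    muon_energy_gev: float
    jet_energy_gev: float
    opening_angle_deg: float


def within(value: float, window: Window) -> bool:
    low, high = window
    return low <= value <= high


@dataclass(frozen=True)
class Signature:
    muon_energy_gev: Window
    jet_energy_gev: Window
    opening_angle_deg: Window

    def matches(self, event: Event) -> bool:
        """True when every measured quantity sits inside its window."""
        return (
            within(event.muon_energy_gev, self.muon_energy_gev)
            and within(event.jet_energy_gev, self.jet_energy_gev)
            and within(event.opening_angle_deg, self.opening_angle_deg)
        )


def select(events: Iterable[Event], signature: Signature) -> List[Event]:
    """Return only the events that match the signature."""
    return [event for event in events if signature.matches(event)]


if __name__ == "__main__":
    signature = Signature((40.0, 60.0), (80.0, 120.0), (150.0, 180.0))
    events = [Event(50.0, 100.0, 170.0), Event(10.0, 100.0, 170.0)]
    print(select(events, signature))
```

      The windows themselves would come from the simulations mentioned above; the filter is the easy part.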

      --
      "When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
    5. Re:No Search Function by Benson+Arizona · · Score: 5, Funny

      Buy Higgs Boson now at e-bay.com

      Buy books about Bosons at Amazon.com

    6. Re:No Search Function by oglueck · · Score: 1

      I already see the "related" Google ads:

      * scintillators fix your mortgage
      * Viagra particles
      * free teen bosons

    7. Re:No Search Function by cb372 · · Score: 1

      You're right, but I'm sure I also read something last year about some different work Google was doing for the astronomy community. They were writing some kind of innovative image search framework for use with a new telescope array that produces huge amounts of data.

      The details are all a bit hazy now, but I remember thinking this sounded really interesting at the time. After a few minutes of searching, I can't seem to find any mention of it on the web. Did I dream this? Does anybody know what the hell I'm talking about?!

    8. Re:No Search Function by Anonymous Coward · · Score: 0

      Yes, and they set up their own computer clusters to sift the data looking for these signatures. Just take the search comment as the joke it was and leave it at that.

      One thing the article doesn't mention is that CERN already pushes incredible amounts of data across large distances. From their facts page:

      Fact 24) On 1st October 2003 CERN and the California Institute of Technology set a new Internet Land Speed Record by transferring 1.1 terabytes of data in less than 30 minutes across 7000km of network. The equivalent of transferring a full length DVD movie in 7 seconds.

      Another thing they don't mention is how much data is being thrown out at the source. Collisions will occur in the LHC at 40 MHz. A trigger will only record data from those events that "look right," which cuts that down to about 100 Hz. That right there is a 99.99975% reduction in data.
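
      The reduction figure is easy to verify. A minimal sketch, taking the 40 MHz collision rate and the roughly 100 Hz recorded rate from the paragraph above as given; the function name is made up.

```python
# Fraction of events discarded when a trigger cuts 40 MHz down to ~100 Hz.


def reduction_percent(input_hz: float, output_hz: float) -> float:
    """Percentage of events thrown away by the trigger."""
    return (1 - output_hz / input_hz) * 100


if __name__ == "__main__":
    collisions_hz = 40e6   # collision rate quoted above
    recorded_hz = 100.0    # rate after the trigger, quoted above
    print(f"discarded: {reduction_percent(collisions_hz, recorded_hz):.5f}%")
    print(f"kept: 1 event in {collisions_hz / recorded_hz:,.0f}")
```

      Put differently, only about one collision in 400,000 is kept.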

      And of course, none of the research sites are going to want all of the data.

  4. google by wwmedia · · Score: 1

    I'm sure Google would love to get their hands on this data; they are like one of those energy alien beings from Star Trek that feed and grow, except Google grows on data.

  5. How are they going to sift through all this... by Js+Eagle · · Score: 1

    Really...

    1. Re:How are they going to sift through all this... by GregPK · · Score: 1

      6 months on a sailboat should get you through 1 month of this data. Though, make sure you remember to stock up on your marine supplies at http://www.westmarine.com/ They sell the coolest safety gear that you will most definitely need after being distracted by so much data sifting.

    2. Re:How are they going to sift through all this... by PseudoQuant · · Score: 1
      Easy, it should go something like this ...

      grep higgs raw_data > NobelPrize
    3. Re:How are they going to sift through all this... by Anonymous Coward · · Score: 0

      I thought it should be more like grep -i 'higgs*' raw_data | sed 's/background/QCD_fudge_factor/' > /dev/NobelPrize

  6. 60% by Alsee · · Score: 4, Funny

    The CERN collider will begin producing data in November, and from the trillions of collisions of protons it will generate 15 petabytes of data per year... [This] would be the equivalent of all of the information in all of the university libraries in the United States seven times over. It would be the equivalent of 22 Internets, or more than 1,000 Libraries of Congress. And there is no search function.

    And 60% of it will be porn.

    -

    --
    - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    1. Re:60% by carpe_noctem · · Score: 4, Funny

      mmmm... particle porn!

      See the hottest collisions on the web! Watch as innocent particles get ripped apart, revealing their inner quarks! See protons get exploited and penetrated in their luscious gluons!

      --
      "Quoting famous computer scientists out of context is the root of all evil (or at least most of it) in programming." - K
    2. Re:60% by Anonymous Coward · · Score: 0

      well those beautiful bottoms always give me a hadron

      Actually I still think the best paper title ever came from a CERN conference, prompted by LEP's failure to discover the top quark (Fermilab had yet to discover it). The top quark is the partner of the bottom quark and is required by the Standard Model. However, it was *really* heavy, so it was starting to look like it didn't exist, and people were formulating all sorts of models without it, but they weren't working.

      The title: "Topless Model has Problems with Naked Bottom". Priceless.

    3. Re:60% by lexarius · · Score: 5, Funny

      Talk like that gives me a large hadron.

    4. Re:60% by Hektor_Troy · · Score: 1

      I'm not sure which is worse ... knowing exactly what you're talking about OR getting turned on by the way you explained it ...

      --
      We do not live in the 21st century. We live in the 20 second century.
    5. Re:60% by LordVader717 · · Score: 1

      I heard it was going to be like the Big Bang(TM)

    6. Re:60% by rasputin465 · · Score: 2, Informative

      wow, i almost spit coffee all over my laptop when i read that. careful, yo.

  7. Never mind the data by simong · · Score: 4, Interesting

    What about the backups?

    1. Re:Never mind the data by MichaelSmith · · Score: 1

      What about the backups?

      I don't want to be the one who has to stay back at night to change backup tapes.

    2. Re:Never mind the data by simong · · Score: 1

      Nah, there are robots for that. Big robots.

    3. Re:Never mind the data by dylan_- · · Score: 2, Funny

      Nah, there are robots for that. Big robots.
      Tape backup?
      --
      Igor Presnyakov stole my hat
    4. Re:Never mind the data by rrohbeck · · Score: 1

      >What about the backups?
      Not a big deal. You can buy petabyte tape libraries from a number of vendors... Quantum/ADIC, Overland, Sun/STK etc.

  8. Is there a danger or isn't there? by Excelcia · · Score: 1, Interesting

    catch a glimpse of the subatomic particles that are thought to have last been seen at the Big Bang

    I read of "fringe" scientists who warn that there could be potential catastrophic consequences to the coming generation of colliders. The answer to these warnings seems to be that cosmic rays of higher energy than our colliders can generate have been zipping around for billions of years - so if something "bad" could come of it, then it would have already happened.

    So, is the above quote simply a poster who doesn't know what he is talking about (someone more interested in a catchy phrase in an article than in actually disseminating facts), or are these colliders actually capable of generating particles that haven't existed since the big bang? I tend to think the former - but I'm not a physicist, just a geek.
    1. Re:Is there a danger or isn't there? by vertigoCiel · · Score: 1

      From my (college-level) physics knowledge, the advantage of these colliders is that they come close to recreating the conditions which existed at the time of the beginning of the Universe (according to the Big Bang hypotheses). Whether or not these conditions allow certain never-before-seen particles to be observed is uncertain, but likely, since some kinds of particles (like mesons or bosons) have a tendency to disappear in less than a nanosecond (1*10^-9 seconds).

      On a related note, all the particle colliders of the most recent generation (like the Tevatron at Fermilab or the Relativistic Heavy Ion collider in New York) have the capability (if certain theoretical models are accurate enough) to generate very tiny (around nine millimeters), but stable black holes (though the probability is extremely low). See "How to Destroy the Earth" for more information on this.

    2. Re:Is there a danger or isn't there? by SamSim · · Score: 4, Funny

      all the particle colliders of the most recent generation (like the Tevatron at Fermilab or the Relativistic Heavy Ion collider in New York) have the capability (if certain theoretical models are accurate enough) to generate very tiny (around nine millimeters), but stable black holes (though the probability is extremely low)

      Well, yeah, but the probability is about the same as that of you generating a small black hole by clapping your hands together really hard.

    3. Re:Is there a danger or isn't there? by Excelcia · · Score: 1

      I've read the different theories on what some people say "could" happen. Strangelets and micro-black-holes (where Hawking's theory of evaporation is wrong) tend to be the worst case scenarios. My question, though, is whether the coming generation of colliders is capable of producing energies greater than those already seen in the rare high-energy cosmic rays. If not, then the types of particles or events created in our accelerators are unlikely to be any different from what happens all over the universe when these cosmic rays hit things, and whoever submitted this article was talking out his yin-yang when he implied it would be creating particles not seen since the big bang.

      My understanding is that cosmic rays can reach 10^20eV. Wikipedia has an article on what was believed to be a single proton with an energy of 50 joules. If that article is correct, I'm doubtful that CERN can pump that kind of energy into a proton, which means that these sort of collisions are happening all the time.
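
      For scale, the unit conversion can be sketched as below. The 7 TeV per proton used for the LHC is its design beam energy, which is an assumption brought in from outside this thread. The comparison is of per-particle energies only and ignores the difference between a collider and a cosmic ray striking a stationary target.

```python
# Convert between electronvolts and joules to compare a cosmic-ray proton
# with a proton in the collider beam.

JOULES_PER_EV = 1.602176634e-19   # exact by the SI definition of the electronvolt


def ev_to_joules(energy_ev: float) -> float:
    return energy_ev * JOULES_PER_EV


def joules_to_ev(energy_j: float) -> float:
    return energy_j / JOULES_PER_EV


if __name__ == "__main__":
    print(f"1e20 eV cosmic ray: {ev_to_joules(1e20):.1f} J")
    print(f"50 J particle: {joules_to_ev(50):.2e} eV")
    # Assumption: LHC design energy of 7 TeV per proton.
    print(f"7 TeV proton: {ev_to_joules(7e12):.2e} J")
```

      On these numbers the highest-energy cosmic rays carry tens of millions of times more energy per particle than a beam proton would.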

    4. Re:Is there a danger or isn't there? by locofungus · · Score: 1

      have the capability (if certain theoretical models are accurate enough) to generate very tiny (around nine millimeters)

      9 millimeters? That's huge.

      You'd need a mass of about 6x10^24kg to get a Schwarzschild radius of 9mm.

      Microscopic (much smaller than a proton) black holes, yes but 9mm just doesn't sound credible unless you've got some very outlandish theories about black holes.

      (I've just been to read your link - a 9mm hole is what is left when the entire Earth is consumed by a microscopic black hole.)
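
      For anyone who wants to check the numbers, here is a minimal sketch using r = 2GM/c^2. The constants are standard reference values, and the Earth mass of 5.972x10^24 kg is an assumed textbook figure rather than something from the linked page.

```python
# Schwarzschild radius r = 2 G M / c^2, and the inverse (mass for a radius).

G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s
EARTH_MASS_KG = 5.972e24   # assumed textbook value


def schwarzschild_radius_m(mass_kg: float) -> float:
    return 2 * G * mass_kg / C**2


def mass_for_radius_kg(radius_m: float) -> float:
    return radius_m * C**2 / (2 * G)


if __name__ == "__main__":
    print(f"mass for a 9 mm radius: {mass_for_radius_kg(9e-3):.2e} kg")
    print(f"radius for Earth's mass: {schwarzschild_radius_m(EARTH_MASS_KG) * 1e3:.2f} mm")
```

      So a 9 mm radius corresponds to roughly one Earth mass, which is why that figure only makes sense as the end state and not as something a collider could produce.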

      Tim.

      --
      God said, "div D = rho, div B = 0, curl E = -@B/@t, curl H = J + @D/@t," and there was light.
    5. Re:Is there a danger or isn't there? by alexhs · · Score: 1

      subatomic particles that are thought to have last been seen at the Big Bang.

      Mmh... Big-Bang... Black Holes...

      Reminds me of Commander Blood.

      Rendez-vous at the Big-Bang :)

      Has anyone played it?
      --
      I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
    6. Re:Is there a danger or isn't there? by Anonymous Coward · · Score: 0

      Not just "fringe" physicists. Physics World did a piece on the up and coming high energy accelerators several years ago. This was before the US project was canned. The top bods in the field had meetings to discuss potential problems, particularly "what-if" regarding generating anti-matter.

      They concluded the probability was very low. I.e., they weighed having their new toy as more important than the risk of causing a global meltdown!

      I no longer have the mags, otherwise I'd dig out the full details.

    7. Re:Is there a danger or isn't there? by Hatta · · Score: 1

      The answer to these warnings seems to be that cosmic rays of higher energy than our colliders can generate have been zipping around for billions of years - so if something "bad" could come of it, then it would have already happened.

      What do you think the big bang was?

      --
      Give me Classic Slashdot or give me death!
    8. Re:Is there a danger or isn't there? by delt0r · · Score: 1

      Your information is quite wrong. Nothing even close to mm dimensions will be created. The "stable" black holes that *might* form would last for less than 10^-15 seconds IIRC. Light can't even travel 1 mm in that time.
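A quick check of the light-travel claim, as a sketch: the 10^-15 second lifetime is the "IIRC" figure from the post, not a measured value, and the function names are illustrative.

```python
C = 299_792_458  # speed of light in vacuum, m/s (exact by definition)


def light_distance_mm(seconds: float) -> float:
    """Distance light covers in the given time, in millimetres."""
    return C * seconds * 1000.0


def time_to_cross_mm(millimetres: float) -> float:
    """Time light needs to cover the given distance, in seconds."""
    return millimetres / 1000.0 / C


if __name__ == "__main__":
    print(f"distance in 1e-15 s: {light_distance_mm(1e-15):.2e} mm")
    print(f"time to cross 1 mm:  {time_to_cross_mm(1.0):.2e} s")
```

In 10^-15 seconds light covers about 0.0003 mm, so anything that short-lived stays far below millimeter scale.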

      --
      If information wants to be free, why does my internet connection cost so much?
    9. Re:Is there a danger or isn't there? by BrianRagle · · Score: 1

      There are indeed higher energy beams out there which could or could not be producing the kinds of particles these scientists are seeking. The issue is they will be produced in controlled and observable conditions, as opposed to hoping we are looking in the right place, with the right equipment, and at the right time somewhere out in the black.

      As for those on the fringe who call themselves scientists and warn of things like Earth-devouring black holes being spawned at CERN....truly, they aren't even worth coming up with a witty description for their idiocy.

    10. Re:Is there a danger or isn't there? by Patrik_AKA_RedX · · Score: 1

      So 1 in 15 qazillion?

      hmmmm. allright. I demand 100 billion dollar or I start clapping...

  9. re: 15 petabytes? by GNUThomson · · Score: 2, Funny

    The real fundamental question is not about the beginning of the universe, but something much much more important: Are they going to back up the data?
    On the other hand, I'm sure it will be available on some torrent soon.

  10. Advertising, silly... by Excelcia · · Score: 1

    Revenue from advertising, as always.

    Get yer hot fresh strange quark...

  11. Neutrinos by MichaelSmith · · Score: 4, Funny

    I hope they're planning on running their own fiber optic line across the Atlantic

    You know with the right sort of particle accelerator you could send messages straight through the Earth and save a heap of latency.

    1. Re:Neutrinos by Dunbal · · Score: 4, Funny

      You know with the right sort of particle accelerator you could send messages straight through the Earth and save a heap of latency.

      It's called the "Death Star" project, and we've been having a hell of a time with the receiver...

      --
      Seven puppies were harmed during the making of this post.
  12. I suspect by Rix · · Score: 1

    1 "internet" is being used as the amount of data transferred in a given period.

  13. GASP by Excelcia · · Score: 2, Funny

    Lepton dancers wearing gluons.... WHOA!

  14. Never underestimate the bandwidth of a 747 by Rix · · Score: 2, Insightful

    So long as it's not needed right now pretty much any amount of data can be transmitted.

    1. Re:Never underestimate the bandwidth of a 747 by MikShapi · · Score: 3, Informative

      That's a highly misleading figure (whatever figure you had in mind).

      When you add the amount of time, money, kit and effort that'd go into either burning that many optical disks or filling that many harddrives, then connecting them on the other end and reading it out, it becomes less attractive than fiber optics.

      On the other hand, if the 747 is crammed full of ultra-high-capacity hard-drives (say, the new Hitachi 1TB) in high-density racks that do not need unloading from the aircraft (it lands, it plugs into a power/multiple-10GbE-grid, offloads the data to a local ground facility, then goes out for the next run), you get something that'd possibly be competitive with fiber, as well as a possible business model avenue.

      You would, of course, need someone to be willing to pay the rough equivalent of .. say .. 500 economy airline tickets (shooting from the hip here, I tried compounding business/first-class costs).. to get that through. That's a lot of cash. Then again, at 1TB/drive, it's a LOT of data.

      --
      -
    2. Re:Never underestimate the bandwidth of a 747 by Anonymous Coward · · Score: 1, Funny

      The problem is that CERN will generate more data than it can afford to hold on its own servers' hard drives. If they had an A380 (Airbus for teh win ;-)) worth of hard drives installed and ready to tap data, they would not need to move all that data.

    3. Re:Never underestimate the bandwidth of a 747 by Dunbal · · Score: 4, Funny

      If they had an A380 (Airbus for teh win ;-)) worth of hard drives installed and ready to tap data, they would not need to move all that data.

      I'm sorry, how much is that in Cessna 172's again?

      --
      Seven puppies were harmed during the making of this post.
    4. Re:Never underestimate the bandwidth of a 747 by fbjon · · Score: 5, Informative
      We obviously want the maximum storage per unit of HD weight. The current best is the Hitachi Deskstar 7K1000, which gives 1,000,000,000,000 bytes per a maximum of 700 grams.

      Using the maximum payload weight of an A380F (freighter model), we get with Google calc: (152 400 kg / 700 grams) * 1Tbytes = 193.36913 petabytes, which is 12.8912753 years worth of CERN CMS data over a maximum distance of 5,600 nautical miles.

      The maximum useful load of a Cessna 172 is 371 kg, which gives a meager 0.0313823042 years worth of data over a maximum distance of 687 nm.

      The raw distance between CERN and Purdue University (not including distances to airports and such) is about 3838 nm, well within range of the A380F. The Cessna 172 falls into the ground/ocean long before that however. Since there's no air-refueling option for the Cessna, the plan calls for a fleet of at least 179 Cessna 172's constantly working in relay, just to keep up with the data production rate!

      So, to answer your question: If you want the same leisurely pace of using one A380F, you'll need a massive 2148 Cessnas flying for a full year, every 12 years (the total weight of which is equivalent to 531 A380F's, which should tell you something about the efficiency of said plan).
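The payload arithmetic above can be reproduced with a few lines. This is a sketch under the post's own assumptions (1 TB and 700 grams per drive, 15 PB of data per year); the helper names are hypothetical. One thing it makes visible: the 193.37 figure appears to come from the calculator treating a "petabyte" as 2^50 bytes, while the 15 PB/year rate is decimal, so the years figure depends on which convention you pick.

```python
DRIVE_BYTES = 10 ** 12        # 1 TB drive, decimal bytes (assumption from the post)
DRIVE_KG = 0.7                # maximum drive weight quoted in the post
ANNUAL_BYTES = 15 * 10 ** 15  # 15 PB of data per year
PIB = 2 ** 50                 # binary "petabyte" (pebibyte)


def payload_bytes(payload_kg: float) -> float:
    """Bytes carried if the whole payload is hard drives (fractional drives allowed)."""
    return payload_kg / DRIVE_KG * DRIVE_BYTES


def years_mixed_units(payload_kg: float) -> float:
    """Years of data as in the post: binary petabytes divided by 15 per year."""
    return payload_bytes(payload_kg) / PIB / 15


def years_decimal(payload_kg: float) -> float:
    """Years of data with decimal units used consistently."""
    return payload_bytes(payload_kg) / ANNUAL_BYTES


if __name__ == "__main__":
    for name, kg in (("A380F", 152_400), ("Cessna 172", 371)):
        print(name, round(years_mixed_units(kg), 4), round(years_decimal(kg), 4))
```

With decimal units throughout, the A380F carries about 217.7 PB, or roughly 14.5 years of data instead of 12.9.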

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    5. Re:Never underestimate the bandwidth of a 747 by Anonymous Coward · · Score: 0

      A wonderful bit of math... but what about magnetic tape? Surely that must be more efficient to transport.

    6. Re:Never underestimate the bandwidth of a 747 by Anonymous Coward · · Score: 0

      Cessna moves only 687 nanometers? I love nanotech.

    7. Re:Never underestimate the bandwidth of a 747 by scaramush · · Score: 1

      Math nerds ftw!

      --
      "...you can steal my woman, but you ain't done nuthin' smart."
    8. Re:Never underestimate the bandwidth of a 747 by identity0 · · Score: 1

      If I were moving a lot of data with a 747 (or a C-17, as the US military might), I would build hard drive rack assemblies that were built into standard shipping containers, which could be offloaded onto a truck or train and shipped directly between sites. It's less space-efficient than building them into the 747, but I figure the limiting factor would be weight, not space.

      Obviously, it would use freight 747s, which are cheaper and can handle the containers. Hey, FedEx and UPS manage to do it with letters, so I figure it's possible with hard drives :)

    9. Re:Never underestimate the bandwidth of a 747 by Xyrus · · Score: 1

      Wow. You've just guaranteed that you will never get laid. :)

      ~X~

      --
      ~X~
    10. Re:Never underestimate the bandwidth of a 747 by mcrbids · · Score: 1

      When you add the amount of time, money, kit and effort that'd go into either burning that many optical disks or filling that many harddrives, then connecting them on the other end and reading it out makes it less attractive than fiber optics.

      Do we really need a 747? Well, let's see. 15 PB of data, how many 1TB hard drives would that actually be? According to Wikipedia:

      1 PB = 10^15 bytes
      1 TB = 10^12 bytes

      Thus, 1 PB could be written as 1,000 TB of data. So 15,000 1TB hard drives will do it. Use RAID 5, say 4/5 (where 5 disks replicate 4 images) so we'll add 25%. That brings us to 18,750 HDD with decent redundancy.

      The weight of a 3.5" HDD is apparently as much as about 700 grams so we'll say that's around 25 ounces per drive. That's 375,000 ounces, or 23,437 pounds, for the 15,000 data drives (the redundant drives add another 25%). But a Boeing 747 can carry about 10x that much!

      Methinks you've seriously overbuilt your solution. Heck even a little 727 is still way overbuilt. (max load 58,000 pounds) And 727s are dirt cheap nowadays.
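The drive count and weight can be double-checked like this. It is only a sketch: it assumes groups of 4 data drives plus 1 parity drive and the rounded 25 ounces per drive from the post, and the function names are invented for the example.

```python
import math

OUNCES_PER_DRIVE = 25   # the post's rounding of ~700 grams
OUNCES_PER_POUND = 16


def drives_with_parity(data_drives: int, data_per_group: int = 4, group_size: int = 5) -> int:
    """Total drives when every group of 4 data drives gets 1 parity drive."""
    return math.ceil(data_drives * group_size / data_per_group)


def weight_pounds(drives: int) -> float:
    """Total weight of the given number of drives, in pounds."""
    return drives * OUNCES_PER_DRIVE / OUNCES_PER_POUND


if __name__ == "__main__":
    data_drives = 15_000                       # 15 PB on 1 TB drives
    total = drives_with_parity(data_drives)
    print(total, weight_pounds(data_drives), weight_pounds(total))
```

That gives 18,750 drives in total: 23,437.5 pounds for the data drives alone, and a little under 29,300 pounds with parity included.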

      But is that actually better?

      Fiber optics nowadays can be pushed close to 1 Tb per second. That's certainly in the range of what we're talking about. Actual numbers look like 1 Tb per second could conceivably transfer 15 PB in about 1.4 days, assuming optimal conditions. How much "dark fiber" is there under the ocean? Not much, I'd wager. Meaning this may likely require another cable to be laid == big, expensive, long project.
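For reference, the time to push a year's worth of data through a link is just bytes times 8 divided by the line rate. A rough sketch, assuming a fully utilized link with no protocol overhead (which no real link achieves); the names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_bytes: float, link_bits_per_second: float) -> float:
    """Days needed to move data_bytes over an ideal, fully used link."""
    return data_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    year_of_data = 15e15                      # 15 PB, decimal
    for label, rate in (("1 Tb/s", 1e12), ("100 Gb/s", 1e11), ("10 Gb/s", 1e10)):
        print(label, round(transfer_days(year_of_data, rate), 1), "days")
```

At an ideal 1 Tb/s that is about 1.4 days for 15 PB; at 10 Gb/s it is closer to 139 days, which is still under a year.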

      So the 727 is probably the best bet, since they can get started pretty much right away, and won't have to put together a 5 year project to run cables under the ocean...

      Hmm. More curiosity - a 727 burns about 1,800 gallons of fuel every hour, costing around $1.84 per gallon. 3800 miles, about 3000 nautical miles, or 10 hours at 300 knots... around $65,000 per round trip. Since the budget of the entire project is 6.7 Billion dollars, it would take over 10,000 such trips to equal 10% of the total CERN budget.

      In short, it's a deal at twice the price!
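The fuel figures work out as follows. A back-of-envelope sketch using only the numbers quoted in the post (burn rate, fuel price, flight time and budget are all the poster's estimates, and fuel is the only cost counted); the function names are hypothetical.

```python
def round_trip_fuel_cost(hours_one_way: float = 10,
                         gallons_per_hour: float = 1_800,
                         dollars_per_gallon: float = 1.84) -> float:
    """Fuel cost in dollars for one round trip."""
    return 2 * hours_one_way * gallons_per_hour * dollars_per_gallon


def trips_for_budget_share(budget: float, share: float, cost_per_trip: float) -> float:
    """Number of trips whose fuel cost adds up to the given share of the budget."""
    return budget * share / cost_per_trip


if __name__ == "__main__":
    cost = round_trip_fuel_cost()
    print(round(cost), round(trips_for_budget_share(6.7e9, 0.10, cost)))
```

The unrounded round trip comes to about $66,240, and a little over 10,000 of those add up to 10% of the quoted budget.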

      --
      I have no problem with your religion until you decide it's reason to deprive others of the truth.
    11. Re:Never underestimate the bandwidth of a 747 by BoothbyTCD · · Score: 1

      Unfortunately the Airbus A380 appears to have a 2 year latency.

      --
      snig
    12. Re:Never underestimate the bandwidth of a 747 by MikShapi · · Score: 1

      Thinking of how long it would take to release the aircraft, you'd probably be right.
      However, this would not improve your throughput, as the data still needs to be offloaded from the containers to wherever it is going. You're missing my original point about the difference between delivering the MEDIA with the data (a big pile of optical disks, or a container with hard-drive racks) to delivering the DATA itself (which includes reading it off the media and putting it on the destination). The container approach may have some merit if you have a bandwidth problem between where the 747 lands and where the big mother of a storage data center is, but it's debatable if that was part of the thought exercise.

      Bottom line, limiting factors that will not play a bottleneck role:
      * Individual drive throughput (either on-disk read/write or bus (SATA/FC/etc))
      * Space

      Limiting factors that WILL play a role:

      * Master uplink(s!) from the entire array to someplace meaningful (think multiple 10GbE links)
      * WEIGHT of harddrives. (Volumewise, you won't be able to fill the 747 anyway, so containers are pro'lly a great idea)
      * $$$ :-)

      --
      -
  15. 22 Internets per year? by UnHolier+than+ever · · Score: 4, Funny

    Would that be 0.84 Internet per fortnight? Or 1 kiloLibrary per Congress session? How much in tubes?

    1. Re:22 Internets per year? by eMbry00s · · Score: 1

      NO!

      You don't understand!! Argh, slashdot makes me so aggravated. Don't you understand that you can't just dump stuff on the tubes? It's not like a truck, you know.

    2. Re:22 Internets per year? by AndyboyH · · Score: 2, Funny

      How much in tubes?

      Too much, and that's why we should pay the good companies all our hard earned cash to drill giant tubes for all our torrents, MP3s, smut and VoIP calls. Or at least, wasn't that what they were arguing for? ;)
      --
      Baka Drew
    3. Re:22 Internets per year? by SharpFang · · Score: 2, Funny

      The tube radius of 420 attoparsecs.

      OTOH owning the harddrives capable of holding this much data gives you about 730 kilometers of e-penis.

      --
      45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
  16. 4chan by Anonymous Coward · · Score: 0

    Now, the scientists can be serious when they say "I'll give you 22 Internets if you can get me a shop of Einstein dancing with Paris Hilton" on a 4chan board.

  17. Skynet? by magicchex · · Score: 1

    "The actual data analysis by physicists will take place at Tier-2 sites, so it's important that we can receive whatever data our physicists need," Würthwein says. "We will take data from CERN and push it across the worldwide networks to these seven places. They will receive it, analyze it, the whole gimbang. Once we have the data in all these places, a physicist will be able to submit jobs from their office computer, or even from a laptop in Starbucks."
    2007: CernNET becomes self aware.
    --
    How many fulltime jobs can one man have?
  18. All pages are identical by Laxator2 · · Score: 5, Interesting

    The main difference between the LHC data and the Internet is that all that 15 PB of data will come in a standard format, so a search is much easier to perform. In fact most of the search will consist of discarding non-interesting stuff while attempting to identify the very rare events that may show indications of new particles (Higgs for example). The Internet is a lot more diverse, the variety of information dwarfs the limited number of patterns LHC is looking for, so "no search available" for LHC data sounds more like "no search needed".

    1. Re:All pages are identical by kramulous · · Score: 1

      While I mostly agree with you, I'm not sure that anyone would consider 'discarding non-interesting stuff'. Although this is probably semantics given the quality of some of the other statements.
       
      I'd love to write some algorithms to help search/sort this stuff, to sort via probability of interest?

      --
      .
    2. Re:All pages are identical by Anonymous Coward · · Score: 2

      Actually, "discarding non-interesting stuff" is exactly how particle physicists work! Look at the design of a big experiment, all the different parts of the detector are there to work out which particles are boring, well understood stuff like Muons and discard them.

      I doubt they'll actually delete any of this data once they have it safely on disk, but you can bet your life that most of it is going to be filtered out and basically ignored.

    3. Re:All pages are identical by Laxator2 · · Score: 2, Insightful

      What I mean by "discarding non-interesting stuff" is not actually deleting the data from disk. If that were the case, what need would there be for 15 PB of storage? The thing is that what the LHC people (and the whole physics community) want very badly is some signature of new physics. That means either Higgs, or supersymmetric partners of known particles, or even microscopic black holes (most people are skeptical about that, but look anyway at http://www.slac.stanford.edu/spires/find/hep/www?rawcmd=f+a+thomas+and+giddings&FORMAT=WWW&SEQUENCE= to see how many times it has been cited; that gives an idea of how many papers have been written on the subject). The "non-interesting stuff" will be used to improve current limits on experimental data, but if nothing genuinely new is found, it is very likely that the LHC will be the last large particle accelerator ever built.

    4. Re:All pages are identical by imsabbel · · Score: 1

      If they didn't discard the unwanted stuff, you would have to add 3 or 4 zeros to that number...

      --
      HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
    5. Re:All pages are identical by Hoi+Polloi · · Score: 1

      Do they have to keep track of every single decay particle? They could probably get rid of a lot of data no one is going to consult that way. On the other hand if you are going to spend that much money on CERN every bit of data coming out has a significant price on it.

      --
      It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
    6. Re:All pages are identical by StreetStealth · · Score: 1

      Don't they have algorithms that can look at a segment of data and basically conclude "no Higgs here," waiting to flag anything interesting for the researchers?

      --
      Your mind is clear / The things that you fear / Will fade with how much you / Believe what you hear
    7. Re:All pages are identical by vondo · · Score: 2, Informative

      More like 6 or more extra zeros, actually. There seems to be a lot of confusion about this, so let me try to explain.

      Generally the data coming out of these experiments is filtered in two or more stages. It has to run in real time since the data volume is enormous. A detector like this can easily spew out several TB a second of raw data. The first layer of filtering will look at very small portions of the data and make very loose requirements on it, but can run very fast in dedicated electronics. This might discard 99.99% of the events and keep 90% of the interesting stuff, for instance. Now you have a much smaller volume of data, so you can afford to spend more time on it. So maybe you run a pared down version of the full reconstruction software. This is much more sophisticated software, so maybe you can get rid of 99% of what remains and only toss out 10% of the interesting interactions. This stage might be done on a cluster of 1000 computers or more. At the end, you've kept one out of a million events and only thrown away 20% of what might be useful. But you need both steps. Skip the first step and you need a network with 10,000 times as much bandwidth and a computer cluster of 10 million computers. Skip the second step and instead of 100 PB of storage, you need 10,000. And you need to deal with all that data in the next step.

      The initial filtering is not the end of the story. The one event in a million that passes will be reconstructed with the full, best software available along with the other billion events that pass. Then those will be filtered again based on different types of physics signatures and sent to the researchers looking at that one particular type of interaction. This process also requires thousands of CPUs. The big LHC experiments will have 40 million interactions/second and each interaction might contain 25 collisions. The vast majority of these are understood (not interesting) but the challenge is to sort through those 1 billion interactions a second in a finite amount of time to find the interesting ones. The two stages I've described are called "triggering" and "offline event reconstruction and filtering" if you want to try to find out more.
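The two-stage arithmetic above can be written out as a small cascade. This is only a sketch of the bookkeeping: the percentages are the illustrative ones from this post, not actual ATLAS or CMS trigger settings, and the class and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    reject_fraction: float      # fraction of incoming events discarded
    signal_efficiency: float    # fraction of interesting events that survive


def run_cascade(stages):
    """Return (fraction of all events kept, fraction of interesting events kept)."""
    kept = 1.0
    efficiency = 1.0
    for stage in stages:
        kept *= 1.0 - stage.reject_fraction
        efficiency *= stage.signal_efficiency
    return kept, efficiency


STAGES = [
    Stage("hardware trigger", reject_fraction=0.9999, signal_efficiency=0.90),
    Stage("software filter", reject_fraction=0.99, signal_efficiency=0.90),
]

if __name__ == "__main__":
    kept, efficiency = run_cascade(STAGES)
    interactions_per_second = 40e6
    collisions_per_second = interactions_per_second * 25
    print(f"kept 1 in {1 / kept:,.0f}, signal efficiency {efficiency:.0%}")
    print(f"{collisions_per_second:.0e} collisions/s in, "
          f"{interactions_per_second * kept:.0f} interactions/s out")
```

The two 90% efficiencies multiply to 81%, which is the "only thrown away 20%" figure, and the two rejection factors multiply to one in a million.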

      There go the mod points I assigned earlier in this discussion.

    8. Re:All pages are identical by imsabbel · · Score: 1

      Thanks for the explanation.
      I was vaguely aware of that 2-stage model, but didn't remember the exact numbers, so I put it conservatively.

      Interestingly, it's in some way similar to the stuff I am doing, although on a much bigger scale (here it's experiments on synchrotrons, where we also trigger stuff by the electron bunch...)

      --
      HI O WISE PRINCE. WHT TOOK U SO DAM LONG?
  19. Re:Disturbing and unsettling by Anonymous Coward · · Score: 0

    Both the accident mentioned in that article, and the earlier one linked to, are the result of work by Fermilab, providing equipment/calculations for CERN. Just another case of sabotaging your competitors' work by providing faulty equipment... and just like when the CIA did it to Soviet energy supplies, with little concern about the potential loss of human life.

  20. Re:Disturbing and unsettling by Anonymous Coward · · Score: 0

    Americans don't need science, the bible tells them everything they need to know.

  21. Re:Disturbing and unsettling by denominateur · · Score: 1

    You are aware that the magnet that blew up was built by Fermilab and it was on their side that the mistake was made?

  22. Gaaa aaaaa aaaaaaa by $RANDOMLUSER · · Score: 3, Funny

    "Like an exercise session getting you ready for the big game, we've been going to the physics gym," Hacker says
    Must. Erase. Image.
    Physics locker room.
    --
    No folly is more costly than the folly of intolerant idealism. - Winston Churchill
    1. Re:Gaaa aaaaa aaaaaaa by Anonymous Coward · · Score: 0

      now a regular men's locker room is certainly a welcome sight for one such as yourself

    2. Re:Gaaa aaaaa aaaaaaa by Dunbal · · Score: 1

      Must. Erase. Image.
      Physics locker room.


      It's called a chess club.

      --
      Seven puppies were harmed during the making of this post.
    3. Re:Gaaa aaaaa aaaaaaa by ozbird · · Score: 1

      Must wear sunglasses - that much pasty white skin in one place might blind you...
      Hmm, I wonder if that's the solution to global warming - get geeks outdoors, and increase the albedo of the planet (but decrease its libido.)

    4. Re:Gaaa aaaaa aaaaaaa by Anonymous Coward · · Score: 0

      Must. Erase. Image.
      Physics locker room.


      Speaking as a reasonably buff physicist, I especially like the idea that you mention only a single, presumably unisex locker room. There are some real cuties in this department.

  23. The internets?? by Spy+Handler · · Score: 0, Troll

    It would be the equivalent of 22 Internets, or more than 1,000 Libraries of Congress.

    You mean W was right all along??

  24. Re:Disturbing and unsettling by lloydchristmas759 · · Score: 1

    I just trust that Americans would be more responsible in doing this kind of science. Huh ? And what about that and that ???
    --
    I'd give my right arm to be ambidextrous.
  25. Libraries of Congress by Xoknit · · Score: 0, Troll

    I've got a library of congress in my pants.

  26. Remember by ReidMaynard · · Score: 1, Insightful

    Remember, this data can only get out per the size of CERN's Internet pipe(s) - so even if they have, say 5 10Gig-Ethernet connections - that's not much effect on big OC backbones. I'm just guessing, but I don't think CERN has HTTP/FTP servers right on an OC Internet backbone, or the server structure (think magnitudes greater than Google's) to drive the data.

    --
    -- www.globaltics.net

    Political discussion for a new world

    1. Re:Remember by kramulous · · Score: 2

      I'm willing to bet that they're all over it. And have even considered the possibility of a lot more than your 'average' figures given that a significant event may increase this data deluge. There is a lot at stake with this experiment (series of). A lot of future funding is dependent on how well this project has been managed, down to the smallest (pun originally not intended) detail.

      --
      .
    2. Re:Remember by UnHolier+than+ever · · Score: 1

      I don't know what size their 'pipes' are, but it's more than just 5 ethernet connections. This is the place where the web was invented, remember. And they have truckloads of money. Anyone who can build something that generates this amount of data will have thought of having enough throughput for it, trust me.

    3. Re:Remember by bockelboy · · Score: 5, Informative
      I do work with one of the LCG projects, so let me share some of my personal opinions with you (all this info is mostly available on the web, if you can find it. We keep no secrets.).

      I don't think CERN has HTTP/FTP servers right on a OC Internet backbone, or the server structure (think magnitudes greater than Google's) to drive the data.
      Oh yes we do. You are right though - buying network bandwidth is a lot more straightforward than building a disk / server infrastructure to handle all the data. It's difficult, but being accomplished.

      I think total - transatlantic fiber plus the European equivalent of Internet2 - bandwidth to CERN will amount to 100 Gbps - about 10 OC-192s. Universities buy into private global fiber networks, which are independent of the public internet.

      We then use gridFTP as a transport, which is basically PKI-protected FTP that transfers over N parallel TCP streams. Then we use a protocol called SRM to control the gridFTP transfers, and we (well, the CMS experiment) use a higher-level application called PhEDEx to control worldwide data movement. Right now, PhEDEx directs about 8-10 Gbps worldwide, and we aren't "doing anything" big.

      GridFTP is a fairly effective protocol. I can get near-line speed - 2Gbps from a channel bonded RAID device. Locally, we've been buying large RAIDs - 30TB a box, building up to 200TB this fall. Some sites take a more "clustered" approach - they put a few 500-750 GB drives in each of the cluster's worker nodes, and build up to 200TB that way. Costs are lower, but you have to keep 2 copies of each file in the cluster, plus have the headache of swapping out drives. Of course, I like our method better. In addition, larger, T1 sites have a few petabytes in tape silos.

      Funding agencies don't just throw money into projects for years at a time, then wait for results. Two years ago, we did a test at 25% of the turn-on "complexity" (in terms of jobs run and data movement). Last year, we increased that to 50% complexity. Toward the end of this summer, we will have a challenge called CSA07 which should be between 75-100% complexity. Finally, turn-on should be around November this year.

      This is a multi-billion dollar project which has been under development for 10-15 years. We've been doing lots and lots of careful planning.
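To put the bandwidth figures above in perspective, here is a rough sketch comparing the quoted 100 Gbps with the average rate needed to ship 15 PB in a year. It assumes the standard OC-192 line rate of 9.95328 Gbps and perfectly even traffic over 365 days, which real transfers will not be; the function names are illustrative.

```python
OC192_GBPS = 9.95328              # standard OC-192 line rate
SECONDS_PER_YEAR = 365 * 86_400


def oc192_equivalents(total_gbps: float) -> float:
    """How many OC-192 circuits add up to the given bandwidth."""
    return total_gbps / OC192_GBPS


def average_gbps(bytes_per_year: float) -> float:
    """Sustained rate needed to move bytes_per_year evenly over one year."""
    return bytes_per_year * 8 / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    print(f"100 Gbps is about {oc192_equivalents(100):.1f} OC-192s")
    print(f"15 PB/year averages {average_gbps(15e15):.1f} Gbps")
```

A single copy of 15 PB per year averages under 4 Gbps, so 100 Gbps leaves headroom for bursts, reprocessing and multiple copies going to different sites.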
    4. Re:Remember by Falstius · · Score: 2, Informative

      Another thing to point out is that, at least for ATLAS, researchers don't get their data directly from CERN. CERN has fat dedicated pipes to what are called Tier-1 data centers, which are spread around the world. I think these centers build the raw data into structured events. Then there are smaller Tier-2 data centers (I worked for one of the Universities hosting a Tier-2 center) which get these structured events and that is where Joe physicist gets his data from. Also, these data centers have processing power on site to run programs submitted by physicists, so most of this data will never touch the everyday internet.

      For some reason, ATLAS and CMS don't use the same techniques and technologies for just about anything from detector design down to the style of pen carried in their pocket protectors. So anything said for ATLAS does not necessarily hold true for CMS (the other big detector on the LHC).

    5. Re:Remember by ShakaUVM · · Score: 1

      I used to work with Scott Baden and Fran Berman at UCSD / San Diego Supercomputer Center. It's nice to see what was research 10 years ago running in the prime time. =)

    6. Re:Remember by Anonymous Coward · · Score: 0

      PhEDEx

      Great name, lol. :)

    7. Re:Remember by bockelboy · · Score: 1

      Which ATLAS T2 do you work for? I'm at Nebraska, a CMS T2...

      Most T2 sites won't get anything from CERN (although, I think there is technically a T1 site at CERN, meaning it's possible to see some movement there). However, one break from the original MONARC model is that all T2 sites can receive data from any T1. This means that T1 -> T2 trans-atlantic traffic is going to be increased for CMS.

      The two experiments don't use any common technology / techniques (beyond "the grid", in the broadest sense) for fault-tolerance. The probability of two radically different designs having a false positive for, say, SUSY, is supposed to be low. Plus, if one CMS collaborator claims to find the Higgs, you bet 1800 ATLAS collaborators will be trying to prove him wrong (and vice-versa).

      Healthy competition keeps everyone honest.

    8. Re:Remember by Falstius · · Score: 1

      At the detector level, most of this separation makes perfect sense. Although I think it has more to do with redundancy in case of failure than avoiding false positives. But all of the infrastructure and effort around the experiment is being doubled also. Well, good luck to them both.

  27. I predict the end of the universe by jamesh · · Score: 4, Funny

    This is really bad news. By defining the amount of data in LoC's, they leave themselves open to a huge exploit... If the LoC ever includes this data, then there will be a recursive loop of definitions and the LoC will expand to fill the universe.

    Okay... maybe not, but if they ever did put this data in the LoC, the effort required to re-factor all the LoC based measurements would bankrupt the world. And the confusion that goes on while this re-factoring is happening will surely crash at least one probe into Mars, where the English have used the new LoC units and the Americans will have used the old LoC units.

    1. Re:I predict the end of the universe by TapeCutter · · Score: 4, Interesting

      It seems the metric LoC = 10TB. If that is so then an LoC is no longer based on a physical library but has rather been redefined based on a more basic unit of information, (ie: the byte). This sort of thing has happened before, the standard time unit (second) is no longer based on the earth's rotation, rather it is based on some esoteric (but very stable) feature of cesium atoms.

      IMHO: This is a GoodThing(TM), it could mean the LoC is well on its way to becoming an accepted SI unit. :)

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    2. Re:I predict the end of the universe by 1u3hr · · Score: 1
      an LoC is no longer based on a physical library but has rather been redefined based on a more basic unit of information, (ie: the byte)...it could mean the LoC is well on its way to becoming an accepted SI unit

      Now we will have a whole other schism over whether the 10 TB is binary (10 x 2^40) or decimal (10 x 10^12), with SI purists demanding the binary be distinguished as 10 tebibytes.
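How far apart the two readings of "10 TB" actually are can be shown in a few lines. A small sketch; the 10 TB per LoC figure is the estimate used in this thread, not an official number.

```python
TERABYTE = 10 ** 12   # SI (decimal) terabyte
TEBIBYTE = 2 ** 40    # binary tebibyte

LOC_DECIMAL = 10 * TERABYTE
LOC_BINARY = 10 * TEBIBYTE


def percent_larger(binary_bytes: int, decimal_bytes: int) -> float:
    """How much bigger the binary reading is, in percent."""
    return (binary_bytes - decimal_bytes) / decimal_bytes * 100


if __name__ == "__main__":
    print(f"decimal LoC: {LOC_DECIMAL:,} bytes")
    print(f"binary LoC:  {LOC_BINARY:,} bytes")
    print(f"difference:  {percent_larger(LOC_BINARY, LOC_DECIMAL):.2f}%")
```

The binary LoC is just under 10% larger, so over 1,000 LoCs the schism is worth about 100 decimal LoCs.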

    3. Re:I predict the end of the universe by Hobbled+Grubs · · Score: 1

      Has anyone else been thinking that LoC is lines of code? Some of my lines get close to 10TB, but that is only when I am using full class references in Java, and I wouldn't base a standard LoC on it.

    4. Re:I predict the end of the universe by Anonymous Coward · · Score: 0

      By the time we get to terabytes, even the binary die-hards will give up. It's easy to remember that a kilobyte is "really" 1024 bytes, or that a megabyte is "really supposed to be" 1048576 bytes, and it makes some people feel smart. But who wants to have to keep in mind that a terabyte is "really" 1,099,511,627,776 bytes? I don't even know pi to that many digits. Forget "tebibyte" -- it's time to call a truce and give up the binary units altogether above the mebibyte.

    5. Re:I predict the end of the universe by TapeCutter · · Score: 1

      The EU will go for decimal (metric), the US will go for binary (pocket calculator lobbyists), the rest of the world will put/omit a (US) behind quoted figures to inform/confuse the reader.

      It's far from perfect, but it's better than a recursively expanding unit of information gobbling up the universe.

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
    6. Re:I predict the end of the universe by markov_chain · · Score: 1

      Never! You'll have to take away the binary prefixes from me from my stiff, cold, dead fingers.

      --
      Tsunami -- You can't bring a good wave down!
    7. Re:I predict the end of the universe by cosinezero · · Score: 1

      In Soviet Russia, government library measures you!

    8. Re:I predict the end of the universe by Impy+the+Impiuos+Imp · · Score: 1

      > Never! You'll have to take away the binary prefixes from me from
      > my stiff, cold, dead fingers.

      While you could count in binary using your fingers, most people actually use their fingers to count in unary.

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    9. Re:I predict the end of the universe by CommunistHamster · · Score: 1
      I object to the "tebibyte" "mebibytes" etc naming convention purely because the word doesn't flow as well as "terabyte" "megabyte" etc. I would use L, or R, or some other flowing phoneme, but not B or indeed any other plosive.

      Which flows better, tebibyte or telibyte?

    10. Re:I predict the end of the universe by Anonymous Coward · · Score: 0

      You mean decimal?

    11. Re:I predict the end of the universe by orclevegam · · Score: 1

      Actually I'd rather have internets become a new SI unit. Yes sir, this baby here has 10 whole internets of storage on it.

      --
      Curiosity was framed, Ignorance killed the cat.
  28. CERN Collider as the infinite monkeys? by michaelz · · Score: 1

    Perhaps Shakespeare will be generated as data as well? http://en.wikipedia.org/wiki/Infinite_monkey_theorem

  29. So.. by mindwhip · · Score: 2

    FTA:"catch a glimpse of the subatomic particles that are thought to have last been seen at the Big Bang."

    Who was at the Big Bang to see them then? I suspect that the numbers are a lot lower than the number of people that heard that tree fall in the woods and heard the sound of one hand clapping put together.

    --
    [The Universe] has gone offline.
    1. Re:So.. by Dunbal · · Score: 1, Funny

      and heard the sound of one hand clapping put together.

            Don't be daft. Everyone here at UU knows that the sound of one hand clapping is 'cl-'

      --
      Seven puppies were harmed during the making of this post.
  30. Worst Hyperbole Ever... by Glock27 · · Score: 3, Insightful
    The collider will smash protons together hoping to catch a glimpse of the subatomic particles that are thought to have last been seen at the Big Bang.

    That line is some of the worst hyperbole ever. Here's why. First, there was (almost by definition) no one there to 'see' anything at the Big Bang. (Supernatural explanations aside, and this purports to be a science article.) Second, these subatomic particles are formed frequently in nature, as high-energy astronomy has found various natural particle accelerators that are FAR more powerful than anything we're likely to build on Earth.

    One hopes the author will do better next time.

    --
    Galileo: "The Earth revolves around the Sun!"
    Score: -1 100% Flamebait
    1. Re:Worst Hyperbole Ever... by 1u3hr · · Score: 1
      One hopes the author will do better next time.

      Unlikely. This is his explanation of bosons:
      "...theory of particle physics (boson is the name physicists give subatomic particles with particular properties)."

    2. Re:Worst Hyperbole Ever... by Anonymous Coward · · Score: 0

      They managed to instill, sorry: install hope in a machine, and all you do is criticize metaphors.

  31. Last time this was tried.. by metushelach · · Score: 1

    A black hole made earth go into neverland. http://en.wikipedia.org/wiki/Hyperion_(novel)

    1. Re:Last time this was tried.. by rrohbeck · · Score: 1

      >A black hole made earth go into neverland. http://en.wikipedia.org/wiki/Hyperion_(novel)
      Gregory Benford has a few more interesting black hole novels: Eater and Cosm come to mind. Maybe Artifact too.

  32. Bush and his internets by cl191 · · Score: 2, Funny

    It would be the equivalent of 22 Internets

    So our President was right about the "Internets" after all, he must have access to a few of those 22 Internets!
    1. Re:Bush and his internets by jaavaaguru · · Score: 1

      And with a DS3 connection, he can access seven of them simultaneously!

    2. Re:Bush and his internets by Dunbal · · Score: 0, Offtopic

      he must have access to a few of those 22 Internets!

            I am sure they are being searched for Weapons of Mass Destruction even as we speak.

      --
      Seven puppies were harmed during the making of this post.
  33. 5 internets, please ... by unity100 · · Score: 1

    and also put some Library of Congress saucing on it.

  34. Re:Disturbing and unsettling by Anonymous Coward · · Score: 0

    Americans don't need science, the bible tells them everything they need to know.

    The US had its chance to build a bigger and better particle collider: the Superconducting Super Collider.

    In 1993 the SSC, already partway-built, was cancelled by the Democrats who controlled Congress at that time. If the SSC had not been cancelled, we would already have discovered the Higgs boson 4 or 5 years ago.

    So, yes, in 1993 the US Government reneged on international commitments to scientists from Europe and Japan, and set back the progress of science by years.

    But don't blame the Bible. Blame congressmen like Tom Foley and Dick Gephardt, who preferred to lard up the farm bill with as much pork as possible.

  35. 15 petabytes... by pookemon · · Score: 1

    All that space to store. "Hit, hit, miss (doesn't matter), hit, miss (doesn't matter)"...

    --
    dnuof eruc rof aixelsid
  36. That's a LOT of data by Dunbal · · Score: 1, Funny

    it will generate 15 petabytes of data per year...

          Umm, question. Is this BEFORE or AFTER time stops?

    --
    Seven puppies were harmed during the making of this post.
    1. Re:That's a LOT of data by mux2000 · · Score: 1

      Mu

  37. But... by Cctoide · · Score: 1

    But how many rumors are there going to be on those Internets?

    --
    "Let's face it, it's a good story. Accuracy would kill it."
  38. you broke by Dunbal · · Score: 1

    Rule 1 and 2, asshole. GB2 gaia

    --
    Seven puppies were harmed during the making of this post.
    1. Re:you broke by Anonymous Coward · · Score: 0

      STFU, namefag.

    2. Re:you broke by Anonymous Coward · · Score: 0

      heh I didn't want to wait 20 mins :P

  39. Re: 15 petabytes? by Anonymous Coward · · Score: 0

    ISTR reading an article on this several years ago in which CERN people said that they just accepted the fact that they were going to lose massive amounts of data every year because backing up such huge amounts of data just wasn't possible.

  40. Re:Disturbing and unsettling by Dunbal · · Score: 1

    Not to mention this!

    --
    Seven puppies were harmed during the making of this post.
  41. Re:Disturbing and unsettling by Dunbal · · Score: 0, Flamebait

    Americans don't need science, the bible tells them everything they need to know.

    Americans don't need science, Fox News tells them everything they need to know.

    --
    Seven puppies were harmed during the making of this post.
  42. Re:Disturbing and unsettling by Dunbal · · Score: 1

    we would already have discovered the Higgs boson 4 or 5 years ago.

    Only if it really exists... how can you discover something that you have already discov...gurk too much recursion.

    --
    Seven puppies were harmed during the making of this post.
  43. Re:Disturbing and unsettling by bockelboy · · Score: 1

    You were aware that it wasn't the magnet which blew up, but rather the supporting structure that "kicked" when the magnet quenched?

    Last I heard, they'll be able to add to the structures in-place. FNAL will have to spend some money, but things will be fixed without delaying the project.

    And you were aware that FNAL's work passed multiple independent review committees and CERN signed off on it? It just turned out that the same oversight was made by all.

    In the end, a little egg-on-face for the US, but not a huge deal.

  44. For that analogy... by Anonymous Coward · · Score: 0

    You Win One Internet.

  45. 22 Internets by Zero_DgZ · · Score: 1

    .22 Internets: Rimfire serious business.

  46. Re:Disturbing and unsettling by Anonymous Coward · · Score: 2, Informative
    "fixed without delaying the project" says the parent!

    Truth: There are several news agencies that have booked flights to descend upon CERN at the "supposed" start of the LHC in November. What will they come and see, lots of hype and not much!

    What will happen? Single beam commissioning earliest in May. Collisions probably in August. Not earlier.

    I hate being an Anon Coward, but there you go... Yes, I am sitting at a CERN office right now.

  47. No no no by Anonymous Coward · · Score: 0

    It's a data storm.

  48. Don't forget the security... by siasl · · Score: 2

    The NSA will have to scan the data for potential terrorist Tachyons hiding among the Bosons. That will slow things down a bit.

    1. Re:Don't forget the security... by markov_chain · · Score: 1

      While they are at it, they should scan for Higgs Bosons too, save the scientists some trouble ;)

      --
      Tsunami -- You can't bring a good wave down!
  49. Think for a moment by kilodelta · · Score: 2, Interesting

    There are some other benefits to building such a huge network of high powered computers. And it's not the teleportation you thought, it's more copying of metadata and re-creating the original.

    Think about it, the only thing stopping us is the ability to store and transfer large amounts of data necessary to describe the precise makeup of a human being. I have a feeling this project will branch off into that area.

    1. Re:Think for a moment by Control+Group · · Score: 2, Funny

      kilodelta, I have someone I think you should meet. His name is Werner Heisenberg, and he's got some ideas that may interest you.

      --

      Reality has a conservative bias: it conserves mass, energy, momentum...
    2. Re:Think for a moment by kilodelta · · Score: 1

      Oh I'm familiar with Heisenberg. Granted, we can't know the states precisely but we can assign probabilities to them. It'll be an interesting world and I wouldn't want to be the first to go through said teleporter.

    3. Re:Think for a moment by Control+Group · · Score: 1

      A better counter-argument - and one which just occurred to me - would have been that, by the Heisenberg uncertainty principle, nothing can "know" the exact position and vector of a given particle. Which is precisely why it's legitimate to say that the particle doesn't even have a precise position and vector. Which, in turn, means that its precise position and vector can't be a causal determinant of anything. Which then means that if we can identify the position and vector of every particle in question to the physical limits of knowledge, we can then recreate that same information at the far end, and the Heisenberg error won't matter in the slightest.

      So, really, I'm suddenly of the opinion that Heisenberg is no bar to teleportation.

      However, we do run into a real existential problem, here. If your information is being read and recreated at the far end of the teleporter, there isn't any reason for the current you to vanish in the process. Rather, we've just made an exact duplicate of you. Which would have interesting implications in the philosophy of "self".

      --

      Reality has a conservative bias: it conserves mass, energy, momentum...
    4. Re:Think for a moment by alphamugwump · · Score: 1

      "A better counter-argument - and one which just occurred to me - would have been that, by the Heisenberg uncertainty principle, nothing can "know" the exact position and vector of a given particle. Which is precisely why it's legitimate to say that the particle doesn't even have a precise position and vector. Which, in turn, means that its precise position and vector can't be a causal determinant of anything. Which then means that if we can identify the position and vector of every particle in question to the physical limits of knowledge, we can then recreate that same information at the far end, and the Heisenberg error won't matter in the slightest."

      No, it does matter. By measuring the position or velocity of the particle, you affect the system. What were you planning to measure it with, ESP? See Maxwell's Demon

  50. Data != Information by drerwk · · Score: 1
    I know, it's a minor nit to pick.

    ...15 petabytes of data per year... [This] would be the equivalent of all of the information in all of the university libraries...

    I suspect that 15 petabytes of data will actually be equivalent to at most 2x the information in a number of standard model journal articles and texts. They just have to figure out the right compression kernel.
    1. Re:Data != Information by tknd · · Score: 1

      Are you trying to say that you can compress the big bang?

    2. Re:Data != Information by drerwk · · Score: 1

      Into a relatively small number of equations and descriptive text, yes.

  51. ohh my computer by Ep0xi · · Score: 0

    They should advertise exactly when the collider is going online, so I can turn off my computer, because it is the only one I have and I don't want my hard drive full of subatomic particles around my precious data.

    --
    ?
  52. 22 Internets? by Evil+Cretin · · Score: 2, Funny

    Sounds like the article was written by Senator Stevens. Nothing to fear, 22 emails can't possibly clog our tubes.

    --
    "A deadlock has been reached. One task must die. We must now choose between murder and suicide."
  53. As if... by fandrieu · · Score: 1

    The data deluge from all the HD media wasn't enough...my pc's can't keep up with all this data !!!

  54. Re: 15 petabytes? by markov_chain · · Score: 1

    See, that's just too bad. They spend $8B on the project, and then they don't have a few million to spend on hard drives to save the data produced by the $8B machine.

    --
    Tsunami -- You can't bring a good wave down!
  55. Thanks alot dude... by Vr6dub · · Score: 1
    You're the reason I've been getting all this lag on Xbox Live. Get off our tubes!!!

    All kidding aside, that does sound like some pretty cool stuff.

  56. Re: 15 petabytes? by databyss · · Score: 3, Funny

    My quantum computer has been working on downloading the torrent for the past few weeks.

    --
    Hmmm witty sig or funny sig? Maybe elitest techy sig!
  57. ISP Caps by Mike+Morgan · · Score: 1

    Their ISP is gonna be pissed.

    --
    -USR1
  58. "One hopes the author will do better next time" by patio11 · · Score: 1

    You must be new here.

  59. 22 Internets by rubberbandball · · Score: 2, Funny

    That's a lot of tubes.

    --
    oh marmalade.
  60. Slight problem by FST777 · · Score: 1

    SATA speeds are 1.5 to 3.0 Gbps...
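
    To put a number on the problem, here is a rough sketch comparing the average data rate implied by 15 petabytes per year with a single SATA link. It assumes decimal petabytes and a perfectly even flow over a 365-day year, and it ignores encoding overhead and bursts, so treat it as a lower bound. The function names are made up for illustration.

```python
import math

PETABYTE = 10**15                   # decimal petabyte in bytes (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600  # non-leap year


def average_gbps(bytes_per_year: float) -> float:
    """Average rate in gigabits per second if the data arrived evenly all year."""
    return bytes_per_year * 8 / SECONDS_PER_YEAR / 1e9


def links_needed(bytes_per_year: float, link_gbps: float) -> int:
    """Minimum number of links of the given line rate to sustain that average."""
    return math.ceil(average_gbps(bytes_per_year) / link_gbps)


if __name__ == "__main__":
    yearly = 15 * PETABYTE
    print(f"average rate: {average_gbps(yearly):.2f} Gbps")
    for rate in (1.5, 3.0):
        print(f"SATA at {rate} Gbps: at least {links_needed(yearly, rate)} links")
```

    Even as an average, the stream is more than one 3.0 Gbps link can carry, and real detectors don't produce data evenly.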

    --
    Free beer is never free as in speech. Free speech is always free as in beer.
    1. Re:Slight problem by untaken_name · · Score: 1

      Why wouldn't you use fiber channel? Then you can get 4Gbps...
      Stupid slow-ass 3.0Gbps SATA. Might as well use wheelbarrows to move data around.

    2. Re:Slight problem by Dread+Pirate+Skippy · · Score: 1

      Find me a SATA drive that actually reads/writes at those speeds, and I'll buy you a drink sir.

    3. Re:Slight problem by jZnat · · Score: 1

      This is what things like RAID 5 are for...

      --
      'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
  61. Google deal by jshriverWVU · · Score: 1

    Wonder if they'll hire or contract some Google engineers for a data mining effort. Personally I'd work for free to get a chance to mine that much data.

    1. Re:Google deal by michaelz · · Score: 1

      Hm, I'm wondering what you want to find in that data. It's not like there's pr0n hidden in it. It's like examining the 15 Petabytes data of a banana. Lots of data, not a lot that's worth mentioning.

  62. Corrections and additional info by acidflux4 · · Score: 1

    That's not "High Productivity Computing" Wire... the HPC in "HPC Wire" stands for High-Performance Computing.

    The real story on the ~15PB/year data store is to be found in these two sites:

    This outlines the hardware environment supporting the data (IBM 3584 w/ Ultrium and IBM DS4400):
    ftp://ftp.software.ibm.com/common/ssi/rep_sp/n/GRC03001USEN/GRC03001USEN.PDF

    This outlines the software environment (layered Tivoli Storage Manager and dCache):
    http://www.dcache.org/manuals/tsm-symposium-2005-paper.pdf

    Or is it?
    Here, Sun posts how Storagetek supplied the tape storage:
    http://www.sun.com/customers/storage/cern.xml

    The LCG
    Something could certainly be said about the computing backend that goes through this data. It's called the LCG (LHC Computing Grid) and is described here:
    http://lcg.web.cern.ch/LCG/tdr/LCG_TDR_v1_04.pdf

  63. Re: 15 petabytes? by The_Wilschon · · Score: 1

    Won't be stored on hard drives. At least, only a small portion of the total amount of data taken after a year or two will be stored on hard drives at any given time. The data gets archived on tape drives.

    I don't know for sure that that is how it will be at CERN, but I know that that is how we do it at Fermilab, and I don't know of any change in technology between when that was set up and now that would invalidate the reasoning behind using tape at Fermilab. So, I would expect that CERN would do the same.

    --
    SIGSEGV caught, terminating

    wait... not that kind of sig.
  64. A Question of Scale by airship · · Score: 1

    So... to hold that many Libraries of Congress worth of data, how big will the data server have to be?

    Please express the answer in 'Volkswagens'.

    --
    Serving your airship needs since 1995.
    1. Re:A Question of Scale by Anonymous Coward · · Score: 0

      Wait, that's a trick question.

      Google doesn't recognize LoC *OR* VW's as valid units!

  65. Cool! A Minnie Driver/Anne Hathaway love scene. by Impy+the+Impiuos+Imp · · Score: 1

    > [This] would be the equivalent of all of the information in all of the university
    > libraries in the United States seven times over. It would be the equivalent of
    > 22 Internets, or more than 1,000 Libraries of Congress.

    $349,000, though I'm sure you'd get a decent volume discount for a thousand of these.

    Oh wait, it won't be needed for a year. Halve that.

    --
    (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
  66. Degenerate matter destroying the Earth by Impy+the+Impiuos+Imp · · Score: 1

    So the argument that these experiments are safe, and that they won't introduce exotic states of matter that cascade out of control with regular matter, converting it and destroying the Earth, is that far more energetic events occur in our upper atmosphere all the time (e.g. the WOW type particles hitting so hard and fast they mass as much as a bacterium and pack the momentum of a pitched baseball).

    Yet they claim this all the time:

    > The collider will smash protons together hoping to catch a glimpse of
    > the subatomic particles that are thought to have last been seen at the Big Bang

    So which is it? While I don't believe the experiments are dangerous, this does shoot down their "safety" argument above. Or is their claim really false (e.g. WOW particles would have introduced this via upper atmosphere collisions many times) and just advertising to sell it to politicians and the public?

    --
    (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
    1. Re:Degenerate matter destroying the Earth by FedeLebron · · Score: 1

      The second. Even though these particles do appear in natural high-energy collisions, it's the first time they'll be studied in this much detail. The collisions will be nothing new, and there's no danger to "Life as we know it", but we'll be able to obtain a great amount of detailed information about what happens when these particles collide. So yes, basically, the claim is false if you take "seen" to mean "have happened". "Everyday collisions being studied at CERN" would not get that much attention.

  67. AACS will object. by Anonymous Coward · · Score: 0

    They'll never be able to publish... odds are in all that data somewhere is the AACS key-of-the-week.

  68. How much data? by vrmlguy · · Score: 1

    15 petabytes per year.

    According to Gordon Bell and Jim Gray, recording one person's life as DVD video generates about 7 TB per year, so this is the same as generating life records for 2,000 people.

    BTW, according to one trend line I've seen, the cost of a PB of raw storage will drop below $1,000 around 2020. This means that while it may cost ~$5,000,000 to store the first year's data, by 2020 you could store 13 years worth of the data (i.e. all of the data produced up to that date) for around $250,000. Double that if you want it mirrored.
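
    The arithmetic above can be checked with a short sketch. All inputs are the figures quoted in this post (7 TB per person-year, $1,000 per PB in 2020, about $5,000,000 for the first year); none of them are verified here, and the function names are made up for illustration.

```python
DATA_PB_PER_YEAR = 15  # collider output quoted in the article


def lifelog_equivalents(data_pb: float, tb_per_person_year: float = 7) -> float:
    """How many person-years of recorded video the data corresponds to."""
    return data_pb * 1000 / tb_per_person_year  # 1 PB = 1000 TB (decimal)


def storage_cost(years: int, dollars_per_pb: float) -> float:
    """Cost of holding `years` worth of collider data at a flat price per PB."""
    return years * DATA_PB_PER_YEAR * dollars_per_pb


def implied_price_per_pb(total_dollars: float, data_pb: float) -> float:
    """Price per PB implied by a quoted total cost."""
    return total_dollars / data_pb


if __name__ == "__main__":
    print(f"life records: about {lifelog_equivalents(DATA_PB_PER_YEAR):,.0f} people")
    print(f"13 years at $1,000/PB: ${storage_cost(13, 1000):,.0f}")
    print(f"implied first-year price: ${implied_price_per_pb(5_000_000, 15):,.0f}/PB")
```

    Thirteen years of data at $1,000 per PB comes to a bit under the $250,000 quoted, so the estimate above has some headroom built in.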

    --
    Nothing for 6-digit uids?
    1. Re:How much data? by Anonymous Coward · · Score: 0

      I, for one, am not a unique beautiful snowflake, so I'm most likely far from being worth 7TB a year.

  69. ISPs are already much bigger than that by billstewart · · Score: 1
    There's a Tier 1 ISP that uses a Death Star as its logo - I haven't looked at the current marketing numbers, but they claim to carry somewhere between 5-10 Petabytes a day. This includes public Internet traffic, private networks for businesses, and the voice network --- and the voice network is the smallest of those three. I don't think those numbers include the consumer DSL side of the net, which is run by the local telco side of the company as opposed to the long distance side.


    If you look at consumer broadband, the US has about 50 million homes getting an average of 1.9 Mbps download speed - that's about 100 terabits/sec, though obviously the network's oversubscribed enough that it couldn't actually carry that much at once, but it's still likely to be well above 1 terabit/sec of sustainable throughput if there were enough servers available to pump data that fast. At 1 terabit/sec, CERN@home should be able to download the CERN collider's entire data set for the year in about a day and a half...
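
    A quick check of the multiplication, as a sketch: 50 million homes at 1.9 Mbps each works out to about 95 terabits per second in aggregate. The home count and speed are the figures from this post, the data set is taken as 15 decimal petabytes, and the function names are made up for illustration.

```python
HOMES = 50_000_000              # broadband homes, figure from the post
AVG_BPS = 1_900_000             # 1.9 Mbps average download speed, from the post
DATASET_BITS = 15 * 10**15 * 8  # 15 decimal petabytes, in bits (assumption)


def aggregate_bps(homes: int = HOMES, avg_bps: int = AVG_BPS) -> int:
    """Total download capacity if every home pulled data at once."""
    return homes * avg_bps


def transfer_seconds(rate_bps: float, bits: int = DATASET_BITS) -> float:
    """Time to move the yearly data set at a given sustained rate."""
    return bits / rate_bps


if __name__ == "__main__":
    total = aggregate_bps()
    print(f"aggregate: {total / 1e12:.0f} Tbps")
    print(f"at full aggregate: {transfer_seconds(total) / 60:.0f} minutes")
    print(f"at 1 Tbps sustained: {transfer_seconds(10**12) / 3600:.1f} hours")
```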

    --

    Bill Stewart
    New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
  70. No. by Rix · · Score: 1

    Perhaps someone with more experience than me with HPC can pipe in on the specifics, but it should be relatively trivial to create two, rather than one, data stores as the data is generated. Or if it's acceptable, just ship the original. Further, you wouldn't be stacking hard drives on a passenger plane, you'd charter an appropriately sized cargo aircraft. Or, more likely, throw it all in a standard cargo container for surface transport.

    That's a *lot* less cash than running fibre, and while it has very high latency it has effectively infinite bandwidth.
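
    The "high latency, huge bandwidth" trade-off can be sketched as below. The seven-day transit time is a purely hypothetical figure picked for illustration, not something from the article, and the function name is made up.

```python
def effective_gbps(payload_bytes: float, transit_seconds: float) -> float:
    """Average bandwidth of physically shipping a payload, in gigabits per second."""
    return payload_bytes * 8 / transit_seconds / 1e9


if __name__ == "__main__":
    year_of_data = 15 * 10**15  # 15 decimal petabytes, in bytes
    one_week = 7 * 24 * 3600    # hypothetical door-to-door transit time
    print(f"{effective_gbps(year_of_data, one_week):.0f} Gbps, with a week of latency")
```

    Under those assumptions the shipment averages close to 200 Gbps, and the figure scales linearly with however much you load into the container.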

  71. 15 PB per year "And there is no search function." by Anonymous Coward · · Score: 0

    This sounds like Bush's advance plan for Iraq.

  72. Re:Disturbing and unsettling by bockelboy · · Score: 1

    Yeah -- after I wrote that, I asked someone, and was told the engineering run was "not likely" in November now. Oh well. More time to buy cheaper disks. :)

  73. Re: 15 petabytes? by markov_chain · · Score: 1

    I guess the power consumption of tapes is much better :) It seems that offline storage makes it easier to overlook certain unexpected but possibly groundbreaking events, because it's much harder/more annoying to explore the data. But that's just a layman's view; I'm sure the experts have a better idea of what could be there.

    --
    Tsunami -- You can't bring a good wave down!
  74. Re:Disturbing and unsettling by Anonymous Coward · · Score: 0

    Err, I thought that CMS would jump on that! (I work with your direct competitor, and let me say, we are not as happy.) Sorry that I gotta remain anonymous... truly unfair, I know.

  75. Re: 15 petabytes? by Anonymous Coward · · Score: 0

    Can anyone say "data compression"? Most likely you could reduce this to one tenth the size.
    Also, you would undoubtedly want some error correction encoding (which would add some percentage increase to the size of the data) as well.

  76. most data is lost by shawngiese · · Score: 1

    When I was at a tour of the facility a couple of years ago I asked how they could store so much data so fast. They replied that in fact most (80-90%) of the data was lost instantly at the collision but that they could selectively record certain amounts of data that they would use to validate theories.

  77. Tsk. by Ayanami+Rei · · Score: 1

    A terabyte really is a mega-megabyte (1024*1024*1024*1024). That's all that matters.
    It is a useful property of the kilobyte to be a power of two size (many related reasons for this). As such, it would be bad if a terabyte was assumed to be decimal and not binary, because it could not be expressed in a simple multiple of kilobytes, let alone the convenience of raising the coefficient to a power.

    The only people it seems to actually bother are boorish computer enthusiasts who are trying to cobble together RAIDs on a shoestring budget and are outraged to discover that their 3-disk RAID5 of 500-marketing-GB apiece does not equal 1TB of porn storage.

    Of course that is really 1-marketing-TB, so why this bothers them, I don't know. They should be bugging MS (oh, I'm sorry, Micro$haft) for a patch to shell32.dll that reports the base-10 size they expect.
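
    The RAID example can be worked through with a short sketch. It assumes the usual RAID 5 rule of thumb that one disk's worth of capacity goes to parity, and it ignores filesystem and metadata overhead. The function name is made up for illustration.

```python
def raid5_usable_bytes(disks: int, bytes_per_disk: int) -> int:
    """Usable capacity of a RAID 5 array: one disk's worth is spent on parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disks - 1) * bytes_per_disk


if __name__ == "__main__":
    usable = raid5_usable_bytes(3, 500 * 10**9)  # three 500 GB (decimal) disks
    print(f"decimal: {usable / 10**12:.3f} TB")
    print(f"binary:  {usable / 2**40:.3f} TiB ({usable / 2**30:.1f} GiB)")
```

    So the array is exactly one decimal terabyte and about nine percent short of a binary one, which is the whole complaint in one line.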

    Of course, when they come crying that the low-level disk-check, partitioning, and defragmenting tools actually report the "real" size to the contrary, considering their reliance on "real KB" units for sector and cluster sizing... what are you going to do?

    Just deal with it. XXXXbytes are not SI units, and never will be.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:Tsk. by TapeCutter · · Score: 1

      "Just deal with it. XXXXbytes are not SI units, and never will be."

      I was aiming for humour but somehow got modded interesting.

      --
      And did you exchange a walk on part in the war for a lead role in a cage? - Pink Floyd.
  78. There's a statement by Faux_Pseudo · · Score: 1

    The world's largest science experiment
    Who writes stuff like that? It might be the world's largest or most
    expensive particle physics experiment, but I would have to count making
    it to the moon as the largest overall hard science experiment, or
    for political science, maybe the reintroduction of democracy 2000 years
    after the last time it failed on a large level might be bigger than
    this.

    Do they mean largest as in 'largest amount of data ever to come from
    one single test?' Because that isn't what they said.

  79. Re: 15 petabytes? by The_Wilschon · · Score: 1

    Nobody is looking for particular events. Everything is statistical in nature. "Do the distributions of these umpteen variables, which are calculable for each event, match what would be expected from theory?" is the basic question. So, in some sense, there is no such thing as "groundbreaking events", and certainly pretty much nobody just goes exploring through the data (by which I assume you mean eyeballing individual events one after another). Offline storage has nothing to do with this; the culprit is just the sheer amount of data combined with the excessively low probabilities of all the interesting processes. Everything with a higher probability has been seen already.

    And yes, before somebody asks, people regularly do analyses that are asking the question "Do the distributions of these umpteen variables not match what would be expected from theory, in a statistically significant way?" They very very very rarely find anything, but we keep doing these broad spectrum searches for new physics because it'd just be really cool to be the one to find something utterly unexpected.
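
    As a toy illustration of the "does the distribution match theory?" question, here is a minimal sketch of a Pearson chi-square statistic over a binned distribution. The bin counts are invented for the example, and real analyses use far more elaborate machinery than this.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic between observed and expected bin counts."""
    if len(observed) != len(expected):
        raise ValueError("histograms must have the same number of bins")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical event counts per bin of some measured variable.
    observed = [98, 205, 310, 190, 97]
    expected = [100, 200, 300, 200, 100]  # what "theory" predicts in this toy case
    stat = chi_square(observed, expected)
    print(f"chi-square = {stat:.3f} over {len(observed)} bins")
```

    A statistic that is small compared with the number of bins says the data look like the prediction; a large one is what the broad searches are hoping to stumble on.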

    Also, unless I'm mistaken, the price per bit for tape is better, and the long-term stability of the data on tapes is much better. Hard disks degrade quite rapidly in comparison to tapes. I have no idea about the power consumption.

    --
    SIGSEGV caught, terminating

    wait... not that kind of sig.
  80. Re: 15 petabytes? by markov_chain · · Score: 1

    Thanks for sharing the point of view! I understand your environment better now.

    --
    Tsunami -- You can't bring a good wave down!
  81. Is there a danger or isn't there?-Kryptonite. by Anonymous Coward · · Score: 0

    "Well, yeah, but the probability is about the same as that of you generating a small black hole by clapping your hands together really hard."

    Clark Kent could do it.

  82. Re: Like the Big Bang (Expansion) by Douglas+Goodall · · Score: 1

    Does that mean that an unbelievable amount of data will come into being within a fraction of the first second, a phase called "Expansion"?

  83. steppin back a mo from the geektalk by jm81193 · · Score: 1

    this is all interesting guys but let me tell ya something. i recently returned from a real trip to this place, yes CERN, and there are some great chicks there. Our group had a tour around the computer centre and actually underground to the biggest experiment they have there. its called ATLAS. one of the top chicks at ATLAS and you can find this info online easily, is called Connie Potter. Not only does she know her stuff but she is one seriously great lookin woman. man those guys sure know how to pick 'em. in any case, she says if anybody needed any info or whatever, she'd be happy to mail stuff out. could we start with her phone number ? hehe..