Slashdot Mirror


Google's Academic TB Swap Project

eldavojohn writes "Google is transferring data the old fashioned way — by mailing hard drive arrays around to collect information and then sending copies to other institutions. All in the name of science & education. From the article, 'The program is currently informal and not open to the general public. Google either approaches bodies that it knows has large data sets or is contacted by scientists themselves. One of the largest data sets copied and distributed was data from the Hubble telescope — 120 terabytes of data. One terabyte is equivalent to 1,000 gigabytes. Mr. DiBona said he hoped that Google could one day make the data available to the public.'"

16 of 190 comments

  1. Should we be continuing this fallacy? by garcia · · Score: 3, Informative

    One terabyte is equivalent to 1,000 gigabytes.

    Uhh, no it isn't. It's really 0.9765625 terabytes.

    1. Re:Should we be continuing this fallacy? by Cristofori42 · · Score: 5, Funny

      umm a terabyte is really 1 terabyte. Though 1 terabyte = 1024 gigabytes not 1000... but whatever.

      --
      "Is that dad? Either that or Batman's really let himself go."
    2. Re:Should we be continuing this fallacy? by Professor_UNIX · · Score: 4, Insightful

      * 1 Terabyte = 1000 Gigabyte
      * 1 Tebibyte = 1024 Gibibyte
      Yea, yea, yea. And you also believe a hacker isn't someone who maliciously breaks into computer systems, it's just a curious innocent person right... crackers are the criminals! Give it up. The general public is never going to adopt "Tebibyte" into the language because terabyte sounds much more fucking cool.
    3. Re:Should we be continuing this fallacy? by Anpheus · · Score: 3, Insightful

      That's not the problem. The problem is that when you buy an X GB drive, you don't know what you're getting until you find the fine print. Some manufacturers provide different sizes of the same labeled drive, differing only in whether it's "1 GB = 1,000,000 KB" or "1 GB = 1,000,000,000 B".

      So if you buy a set for RAID one day, the next day they may no longer stock the drive you need and your vital information is put at unnecessary risk because... what, because the hard drive manufacturers can't decide whether they want to screw you out of 7% (using 1 GB = 1 billion bytes) or 5% (using 1 GB = 1 million kilobytes, which they curiously agree on equaling 1.024 billion bytes)? What a coincidence that KB is 2^10, but GB is 10^9.

      Think about that for a moment before you lambast the argument for proper labeling of drives.
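
The 7% and 5% figures in the comment above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the comparison baseline is a binary gigabyte of 2^30 bytes, and the function and variable names are illustrative, not anything a drive vendor publishes.

```python
# Sketch: how far a "1 GB" label falls short of 2**30 bytes under each
# labeling convention mentioned in the comment. Names are illustrative only.
GIB = 2**30  # 1,073,741,824 bytes, the assumed baseline


def shortfall_percent(labeled_bytes: int) -> float:
    """Percentage by which labeled_bytes falls short of one binary gigabyte."""
    return (1 - labeled_bytes / GIB) * 100


decimal_gb = 10**9        # "1 GB = 1,000,000,000 B"
mixed_gb = 10**6 * 1024   # "1 GB = 1,000,000 KB" with 1 KB = 1024 B

for name, size in [("decimal", decimal_gb), ("mixed", mixed_gb)]:
    print(f"{name}: {size:,} bytes, {shortfall_percent(size):.1f}% short")
```

Under these assumptions the decimal label comes out a little under 7% short and the mixed label a little under 5% short, which matches the rounded figures quoted in the comment.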

  2. Large datasets by BWJones · · Score: 4, Informative

    This is absolutely the most cost-effective way of transferring large amounts of data like this. If you do the calculations on terabyte-size files, sneakernet (or FedEx net) is actually faster and less expensive. We also went to one of Jim Gray's seminars when he was here giving an Organick Memorial Lecture and he made an incredibly compelling demonstration using a variety of data types. We ended up talking with him for some time afterward about new projects we are engaging in that will also be generating terabytes of data, and his suggestion was to pass applications rather than data, which was interesting.

    This is becoming more and more the norm in scientific research and Google's work is quite welcome.

    --
    Visit Jonesblog and say hello.
    1. Re:Large datasets by Sobrique · · Score: 4, Funny
      Never underestimate the bandwidth of a lorryload of backup tapes traveling at 60 miles an hour.

      Latency may leave something to be desired though :)
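
The sneakernet comparison in this thread can be roughed out numerically. The sketch below uses the 120 TB Hubble figure from the summary; the 100 Mbit/s link and the two-day courier time are assumptions chosen for illustration, and the time needed to copy data onto and off the drives is ignored.

```python
# Sketch: time to move a data set over a network link versus by courier.
# The link speed and the shipping time are assumed values, not measurements.
DATASET_BYTES = 120 * 10**12   # 120 TB (decimal), the Hubble figure in the summary
SECONDS_PER_DAY = 86_400


def network_days(size_bytes: int, link_bits_per_sec: float) -> float:
    """Days needed to push size_bytes through a fully utilized link."""
    return size_bytes * 8 / link_bits_per_sec / SECONDS_PER_DAY


def effective_mbps(size_bytes: int, days: float) -> float:
    """Average throughput in Mbit/s of a shipment that takes `days` to arrive."""
    return size_bytes * 8 / (days * SECONDS_PER_DAY) / 10**6


ASSUMED_LINK = 100 * 10**6     # hypothetical sustained 100 Mbit/s
ASSUMED_SHIPPING_DAYS = 2      # hypothetical courier time, copy time excluded

print(f"network: {network_days(DATASET_BYTES, ASSUMED_LINK):.1f} days")
print(f"courier: {effective_mbps(DATASET_BYTES, ASSUMED_SHIPPING_DAYS):.0f} Mbit/s")
```

With those assumed numbers the network transfer takes on the order of months while the shipment averages several gigabits per second, which is the bandwidth-versus-latency trade-off the lorry joke points at.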

  3. In Other News by UnknowingFool · · Score: 4, Funny

    FedEx delivered what appeared to be a ton of broken office chairs to Google headquarters this morning. When asked for the sender's ID, the severely beaten FedEx courier would only reply that the sender wished to remain anonymous.

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
  4. Other Uses for Mass Data Transfer by Anonymous Coward · · Score: 4, Funny

    Moe: Say, Barn, uh, remember when I said I'd have to send away to NASA to calculate your bar tab?
    Barney: Oh ho, oh yeah, you had a good laugh, Moe.
    Moe: The results came back today. (reading a printout) You owe me seventy billion dollars.
    Barney: Huh?
    Moe: No, wait, wait, wait, that's for the Voyager spacecraft. Your tab is fourteen billion dollars.

  5. Re:1TB = 1024 GB by 91degrees · · Score: 5, Insightful

    Why?

    Why is a Kilobyte 1024 bytes, if "Kilo" means 1000, according to both the SI and the Greeks (Kilo is derived from khilioi)? If 1 kg = 1000g, 1 kV = 1000V, 1 km = 1000m, why should hard disks break the pattern?

    When we're talking about addressable computer memory, approximating the kilobyte to 1024 is a convenience, but since Terabyte gives such a huge error, and makes absolutely no sense for data transfer or disk sizes, it's really time we stopped this illogical naming convention just because some engineers found a term convenient 40 years ago.
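
The claim that the approximation gets worse at larger prefixes can be illustrated by comparing 1024^n with 1000^n for each prefix. This is a minimal sketch; the function name is hypothetical.

```python
# Sketch: relative gap between the binary and decimal readings of each prefix.
PREFIXES = ["kilo", "mega", "giga", "tera"]


def binary_excess_percent(power: int) -> float:
    """Percent by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100


for power, prefix in enumerate(PREFIXES, start=1):
    print(f"{prefix}: {binary_excess_percent(power):.1f}%")
```

The gap grows from roughly 2.4% at kilo to roughly 10% at tera, so treating the two readings as interchangeable costs more with each step up.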

  6. Bark! Bark! Bark! by ColdWetDog · · Score: 4, Funny

    I'm so tired of this stuff. Byte me!

    --
    Faster! Faster! Faster would be better!
    1. Re:Bark! Bark! Bark! by AchiIIe · · Score: 4, Funny

      > I'm so tired of this stuff. Byte me!

      I'm sorry, that's wrong too:

      * 1 byte == 2 nibbles
      * 1 byte != 1 bite

      --
      Byte nazi police, proudly serving since 2^1025

      --
      Nature journal lied in Britannica vs Wikipedia Ask to retrac
  7. Re:Like days of old by meringuoid · · Score: 3, Interesting
    This sounds almost like stories of scholars trading/copying books from long long ago.

    According to what I'm told every time I watch a DVD, these scholars were in fact stealing books.

    --
    Real Daleks don't climb stairs - they level the building.
  8. ...why not tapes? by Penguinisto · · Score: 3, Interesting
    I understand the whole "HDD w/ a common filesystem = more compatibility" thing, but wouldn't it be easier to simply send along some tapes of a type appropriate to the format/type that the scientific institution uses? LTO-3 can do 800GB compressed, SDLT can do up to 600... and neither is susceptible to data loss when it gets bounced too hard by FedEx/UPS/DHL/Whatever. (Plus it would make for a lighter package, wouldn't require some poor IT schmuck to disassemble a server or wait forever for USB to transfer all of it, etc...)

    I'm not criticizing or anything; just curious is all.

    /P

    --
    Quo usque tandem abutere, Nimbus, patientia nostra?
  9. Re:1TB = 1024 GB by 91degrees · · Score: 3, Informative

    Well, the IEC and IEEE as well as the CIPM and NIST all agree that there are 1000 bytes to a Kilobyte and 1024 bytes to the kibibyte. So there :P

  10. Re:Mod parent up by MajinBlayze · · Score: 5, Informative
    As a former UPS employee (I worked as a package handler, the guy that beats the shit out of your boxes as he loads them on the truck), I will never ship anything of value without paying extra for the insurance. When you do that, a couple of things happen:
    1. the item goes into a big bag with red/white stripes (by itself, not mixed with other items), so employees know not to mess with it
    2. it gets hand-carted to the destination truck, and is the last thing to be loaded, and first unloaded
    3. only seasoned workers ever touch your package, and generally care about the state that it's in
    4. finally, they are good about paying up if the item arrives damaged.
    did I forget to include ???? and Profit!
    --
    "Hate is baggage. Life's too short to be pissed off all the time." Danny Vinyard -American History X
  11. Not according to NIST by Ernesto+Alvarez · · Score: 3, Interesting

    If you want to be strict, the SI defines the "tera" prefix as 10^12, so 1 terabyte = 1000 gigabytes.

    If you want to use the binary values, you might as well use the correct "tebi" prefix. NIST says you should, and it looks like the IEC, IEEE and BIPM agree.