Google's Academic TB Swap Project
eldavojohn writes "Google is transferring data the old fashioned way — by mailing hard drive arrays around to collect information and then sending copies to other institutions. All in the name of science & education. From the article, 'The program is currently informal and not open to the general public. Google either approaches bodies that it knows has large data sets or is contacted by scientists themselves. One of the largest data sets copied and distributed was data from the Hubble telescope — 120 terabytes of data. One terabyte is equivalent to 1,000 gigabytes. Mr. DiBona said he hoped that Google could one day make the data available to the public.'"
One terabyte is equivalent to 1,000 gigabytes.
Uhh, no it isn't. It's really 0.9765625 terabytes.
I get sperm in my keyboard. Can't lick it out. Any suggestions ?
1 TB = 1024 GB, for god's sake
This is absolutely the most cost-effective way of transferring large amounts of data like this. If you do the calculations on terabyte-size files, sneakernet (or FedEx net) is actually faster and less expensive. We also went to one of Jim Gray's seminars when he was here giving an Organick Memorial Lecture, and he made an incredibly compelling demonstration using a variety of data types. We ended up talking with him for some time afterward about new projects we are engaging in that will also be generating terabytes of data, and his suggestion was to pass applications rather than data, which was interesting.
This is becoming more and more the norm in scientific research and Google's work is quite welcome.
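For anyone who wants to run the sneakernet numbers, here is a minimal sketch in Python. The 24-hour door-to-door delivery time is an assumption picked for illustration, not a figure from the lecture, and a terabyte is taken as 10^12 bytes.

```python
def shipping_throughput_mbps(terabytes: float, transit_hours: float) -> float:
    """Effective throughput, in megabits per second, of shipping drives."""
    bits = terabytes * 1e12 * 8      # decimal terabytes -> bits
    seconds = transit_hours * 3600   # transit time -> seconds
    return bits / seconds / 1e6      # bits per second -> megabits per second


# Assumption: the package takes 24 hours door to door.
for size_tb in (1, 120):
    rate = shipping_throughput_mbps(size_tb, 24)
    print(f"{size_tb} TB shipped in 24 h is roughly {rate:,.0f} Mbit/s sustained")
```

The sketch ignores the time spent copying data onto and off the drives, which would lower the effective rate.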
Visit Jonesblog and say hello.
But are they using station wagons?
Never underestimate the bandwidth of a station wagon...
Still very much applies today.
Ryan Fenton
This sounds almost like stories of scholars trading/copying books from long long ago. It's actually a somewhat interesting plan.
"In case of emergency, break glass. Scream. Bleed to death."
How long do you think it will be until some maroon somewhere plunks a hard drive into an unpadded envelope and drops it in the big blue mailbox on the corner?
This guy's the limit!
gonna fix your summary for free... "...one terabyte is equal to 1024 gigabytes..."
Who's going to own the data? I hope Google isn't going to say they do, like they want to with the old books they're scanning. Every time you download a Hubble picture, will it have a Google watermark?
Libertarian Leaning Political Discussion Forum.
It was said some time ago that the fastest way to transfer data was in a station wagon full of backup tapes traveling down the Interstate. I guess we now update that to a minivan full of hard drives...
"It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
The bandwidth of a moving van full of disks.
Looks like Google is hoarding data. Seems they at least are equating information with power and money. And them that has the power and money makes the rules.
Here will be an old abusing of God's patience and the king's English.
I mail my external hard drives to different friends a few times a year. I have several, but one specifically for mailing to friends and co-workers. I thought this was somewhat of a common practice. I have never had a fellow geek gawk at the idea; rather, it seemed like the only logical way to get what we wanted to do done.
:)
Google is doing something cool by getting and hopefully displaying the data, but the method is not really anything newsworthy, is it? I mean, this is the same as using a flash drive to transfer files real quick; this is just on a much larger scale.
Invexi - a Phoenix, AZ based web design and web development company.
FedEx delivered what appeared to be a ton of broken office chairs to Google headquarters this morning. When asked for the sender's ID, the severely beaten FedEx courier would only reply that the sender wished to remain anonymous.
Well, there's spam egg sausage and spam, that's not got much spam in it.
Who's going to own the data? I hope Google isn't going to say they do, like they want to with the old books they're scanning. Every time you download a Hubble picture, will it have a Google watermark?
In 10 years Google will own just about all data worth owning. Then Slashdotters will be railing on them instead of Microsoft... or maybe Google and MS will merge and collect our taxes too.
Help! I've fallen in a karma hole and I can't get up!
Moe: Say, Barn, uh, remember when I said I'd have to send away to NASA to calculate your bar tab?
Barney: Oh ho, oh yeah, you had a good laugh, Moe.
Moe: The results came back today. (reading a printout) You owe me seventy billion dollars.
Barney: Huh?
Moe: No, wait, wait, wait, that's for the Voyager spacecraft. Your tab is fourteen billion dollars.
120 TB of data from the Hubble telescope? I wish I was paid to go through that. And this picture is of a...star and this one is a star And a star another star OMG its a FRICKIN STAR
"Luck is a tag given by the mediocre to account for the accomplishments of genius." -Heinlein
SUVs to transport those hard drives. That would be evil.
The more you regulate a company, the worse its products become.
coding is life
Don't say I didn't warn you guys about this "don't be evil" thing. First they start swapping TB for "academic" purposes, then maybe some avian influenza in some apartments around Mountain View, and next thing you know, there'll be a smallpox outbreak and we will coincidentally receive advertisements on Gmail that we can buy the cure for a few thousand dollars from one of their AdSense "partners."
The only thing you're getting by saying that is a flamewar between 10 kinds of people: those who count only in MB (and disagree with you) and those who count in both MB and MiB (and agree with you)!
For my take on the issue, see this previous post of mine.
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
I really don't want to share, whether it's academic or not!
"The moral of the story is: Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."
-Andrew Tanenbaum
Test your net with Netalyzr
I call B.S. "Lack of engineering time" is why we haven't seen the source to the core search engines or gmail?
I've been thinking that the only home use for lots of HD storage space would be A/V. Now, I guess when 10 PB of HD are $100-1120, then we'll be able to get copies of these 120 TB of Hubble data or TBs of other datasets to fill up those future home PB HDs. One day we'll need home exabyte HDs to store and play around with public PB datasets.
I can only hope that bandwidth can keep up. How long would it take to transfer a 120 TB BitTorrent file over either cable or DSL?
Well, maybe we'll have small TB USB flash drives that we can just mail around instead of upgrading our bandwidth.
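On the question of how long 120 TB would take over cable or DSL, a back-of-the-envelope sketch in Python follows. The link speeds are assumed sustained rates chosen purely for illustration, protocol overhead is ignored, and a terabyte is taken as 10^12 bytes.

```python
def transfer_days(terabytes: float, link_mbps: float) -> float:
    """Days needed to move the data at a sustained link rate, ignoring overhead."""
    bits = terabytes * 1e12 * 8            # decimal terabytes -> bits
    seconds = bits / (link_mbps * 1e6)     # megabits per second -> bits per second
    return seconds / 86400                 # seconds -> days


# Assumed sustained download rates in Mbit/s; real lines vary widely.
for name, mbps in [("DSL", 1.5), ("cable", 6.0)]:
    days = transfer_days(120, mbps)
    print(f"{name} at {mbps} Mbit/s: about {days:,.0f} days ({days / 365:.1f} years)")
```

Under those assumed rates the answer comes out in years rather than days, which is the argument for mailing drives.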
...that a researcher sends them all the printouts of his/her data... on greenbar...
GetOuttaMySpace - The Anti-Social Network
I'm so tired of this stuff. Byte me!
Faster! Faster! Faster would be better!
Here's what happened when I FedExed my RMA to Newegg, packed very carefully. Note the bent motherboard - I didn't even know you could do that. The good news is that FedEx paid part of my claim ... they paid $100 plus the $8.33 that the FedEx store charged me to fax in the claim forms. The bad news is that they did not refund my original shipping or pay more than $100 on the over $280 of damage that they did. It also took about 4 hours of phone calls to even convince FedEx that I was not the seller, and then they lost my claim in their e-mail system (and did not reply to my e-mails) and closed it out for inactivity after a month or so, until I called them and asked what happened.
On a side note, don't bother with UPS insurance. I insured something when I sent it to myself once, and they broke it and the insurance remedy was to return it to the origination address and ask to see an original purchase receipt to award the insurance claim. If you happened to make something yourself or even received something as a gift, don't insure it when you ship it. And hire a private courier (unless someone has found a common carrier that doesn't suck).
That's the only instance of anyone claiming it's a jocular misspelling of 'moron.' Other sites point out why it shouldn't be used as a derogatory name. I suggest gEvil beta refrain from using that word in a negative light, considering what that word (when used as a noun) has meant for a long time for many people.
That excuse is about as weak as George Allen's.
I really don't like the idea of a "private" (yes, I know it's publicly traded) company having control of this public information. The data was paid for by taxpayers. Google will inevitably make money from this; otherwise they wouldn't be doing it.
This is not right.
...what does this new P2P technology mean for me? I guess the RIAA is really in for it now.
The reason that hasn't been released would be "trade secrets."
Relax. Think before you call B.S.
I'm not criticizing or anything; just curious is all.
Quo usque tandem abutere, Nimbus, patientia nostra?
We have been sending two DVDs, with about 6-8 GB of data, around every month for updates. Now we are trying rsync, which in our view has been more convenient.
I'm just happy they're not swapping tuberculosis.
Oh, say does that Star-Spangled Banner entwine / The myrtle of Venus with Bacchus's vine?
that's what I read it as!
It's a clarification of the confusing post above. Mod appropriately.
She's an astronomer, said the Sloan Digital Sky Survey produces about a terabyte of data a year. Not as much as the Hubble, but still pretty cool.
When you sympathize with stupidity, you start thinking like an idiot.
1.3 TB each or so. About $150,000. The drive is about $5500. $155,000 in total. A 750 GB hard disk costs about $1000, so it'd cost about $160k to do the same with hard disks.
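The hard disk side of that estimate can be checked with a short Python sketch. It assumes the comparison is sized for the 120 TB Hubble data set mentioned in the summary, which the comment does not state outright; the disk capacity and price are the figures quoted above.

```python
import math

DATASET_GB = 120_000     # assumption: the 120 TB Hubble data set, decimal units
DISK_GB = 750            # capacity per disk, as quoted in the comment
DISK_PRICE_USD = 1_000   # approximate price per disk, as quoted in the comment

disks = math.ceil(DATASET_GB / DISK_GB)   # whole disks needed to hold the data
total = disks * DISK_PRICE_USD            # cost of the disks alone
print(f"{disks} disks, about ${total:,}")
```

This covers raw capacity only, with no allowance for redundancy, enclosures, or shipping cases.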
Why not? Today Google Earth, tomorrow Google Universe!
How you measure a terabyte depends on whether you are buying disk, or monitoring disk usage on your server.
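The disk manufacturers define it as 1000 gigabytes, each of which is 1000 megabytes, each of which is 1000 kilobytes, each of which is 1000 bytes.
The OS measures it as 1024 gigabytes, each of which is 1024 megabytes, each of which is 1024 kilobytes, each of which is 1024 bytes.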
Why? Because when you're buying a drive, 750 Gigs sounds bigger than 698.5 gigs.
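The 750 versus 698.5 figure can be reproduced with a few lines of Python. This is only a sketch of the unit conversion, with the function name made up for illustration: it takes the manufacturer's gigabyte as 10^9 bytes and the OS's "gig" as 2^30 bytes.

```python
def decimal_gb_to_binary_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes (10**9 bytes) to binary gibibytes (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


# A drive sold as 750 GB, as reported by an OS that counts in powers of two.
print(f"{decimal_gb_to_binary_gib(750):.1f}")
```

The gap widens with each prefix, so a decimal terabyte loses proportionally more than a decimal gigabyte when reported in binary units.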
We ended up buying a bunch of these to ship the arrays around in. Cardboard == bad :-)
Co-Editor, Open Sources
Open Source Program Manager, Google, Inc.
I'm with you, although I have seen FedEx and UPS both damage a lot of packages. I think that their automated systems are a lot rougher on packages than AirBorne Express / DHL or the USPS's Parcel Post. But if you don't insure it, you're accepting that risk when you give them the goods.
A while back I bought a radio-controlled airplane, pre-assembled. It came in a big box, most of which contained the wing. So it was fairly fragile, but well packed, in tri-wall. Got it sent UPS, with insurance for the full value.
They ran it over with a forklift.
To their credit, they called me right away and basically said "uh, so we may have damaged your package a little bit, you might want to look it over." So I went and took a look at it, and it was mangled pretty much beyond recognition. I was a little surprised they had actually bothered to deliver it. But I called them up, told them the stuff inside was ruined, and they sent me a check. (I think that if they hadn't been aware that it was broken already, they might have come and picked it back up, but as it was, they didn't.)
The only problem I have with the way they do insurance, is that they always want the SHIPPER of the goods to file the insurance claim, rather than the receiver. So if you ship something to me, and it arrives to me basically destroyed, and I call UPS, they're going to say "hey, we can't do anything except ship it back, and that guy has to file the claim." It takes a lot of arguing and escalation in order to explain to them, that sometimes things just don't work that way.
I think this is because they're used to working with big businesses and retailers that want to get damaged goods back, and then send out new ones, but for eBay and private shipments, where the RECEIVER is absorbing the transit risk, and the shipper is just basically saying "hey, I'm selling this to you FOB, whatever arrives at your door is your problem" (which is the eBay standard), it creates a big problem. The last thing the shipper wants is for the damaged goods to come back at him, because from his perspective, he washed his hands of the whole business when he dropped it off at UPS.
So overall, I'm not hugely dissatisfied with them, they just need to get through their heads that it's not always the shipper who's going to initiate a claim, and that in many cases, it's going to be the receiver of a shipment who is purchasing the insurance and who is the one at risk if something gets damaged, and it's going to be them who's filing a claim for loss.
Now, when I have fragile stuff that I want to send, I pretty much always use DHL, because I haven't had them mangle anything yet, but you can't beat FedEx Ground for being dirt cheap. You just have to be prepared for a lot of bureaucratic hassle when they drive over it.
The other thing I learned, is to always take a photo of the shipping label, or note the tracking number, on everything. Both UPS and FedEx are absolutely worthless unless you have a tracking or waybill number, and oftentimes, shippers won't bother to keep records of that on their outbound stuff. (Which means if it gets lost, everybody's hosed.)
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
If the average Slashdotter applied the same flawed logic to Microsoft, you'd have to say they're big open source sponsors too. After all, Microsoft has released GB of free source code for utilities, etc. for decades. Sure, the code mostly only works with their proprietary "family jewels" (the OS and development tools), but why quibble?
Actually, at least the earlier versions of MS-DOS *WERE* open source - iirc, Microsoft actually distributed the source code (or at least made it available) of some of the early 1980s MS-DOS.
If you want to be strict, the SI defines the "tera" prefix as 10^12, so 1 terabyte = 1000 gigabytes.
If you want to use the binary values, you might as well use the correct "tebi" prefix. NIST says you should, and it looks like the IEC, IEEE and BIPM agree.
GPG 0x1B479C78
TB is killing people all over Africa, and Google wants to see it swapped around our schools, too?!? I knew those liberal, heathen, California commies would be the downfall of this great nation!
A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
I've got Celestia
I see your informative link, and raise you a pithy comment.
Why don't they make a 'Google Earth' that uses Hubble data? Instead of looking down at the earth you could look up and away, allowing zooming just like Google Earth but with pics of the universe. I'd put up with AdSense to be able to browse that kinda interface.
It's "dragged"