Google Sorts 1 Petabyte In 6 Hours

← Back to Stories (view on slashdot.org)

Google Sorts 1 Petabyte In 6 Hours

Posted by Soulskill on Sunday November 23, 2008 @04:53AM from the sort-of-fast dept.

krewemaynard writes "Google has announced that they were able to sort one petabyte of data in 6 hours and 2 minutes across 4,000 computers. According to the Google Blog, '... to put this amount in perspective, it is 12 times the amount of archived web data in the US Library of Congress as of May 2008. In comparison, consider that the aggregate size of data processed by all instances of MapReduce at Google was on average 20PB per day in January 2008.' The technology making this possible is MapReduce 'a programming model and an associated implementation for processing and generating large data sets.' We discussed it a few months ago. Google has also posted a video from their Technology RoundTable discussing MapReduce."

49 of 166 comments (clear)

Kudos to Google by Anonymous Coward · 2008-11-23 04:54 · Score: 5, Funny

for knowing how important the Library of Congress metric is to us nerds!
1. Re:Kudos to Google by canuck57 · 2008-11-23 06:30 · Score: 5, Funny
  
  for knowing how important the Library of Congress metric is to us nerds!
  But at least now we know Google can sort out petafiles.
2. Re:Kudos to Google by shutdown+-p+now · 2008-11-23 07:46 · Score: 4, Funny
  
  Bah! To pay true homage, they need to add it to the list of units in Google Calc!
3. Re:Kudos to Google by LingNoi · 2008-11-23 11:02 · Score: 3, Funny
  
  So Google can sort through 12 LoCs in 6 hours.
  Wow, that's 2 LoC/pH
Unit conversion by Zarhan · 2008-11-23 04:59 · Score: 4, Funny

Yay! We finally have unit conversion from 1 LoC to bytes! So...20 PB = 6LoC, means that 1 LoC = 3,333... PB :)
1. Re:Unit conversion by xZgf6xHx2uhoAj9D · 2008-11-23 05:15 · Score: 3, Informative
  
  Don't you mean 1PB = 12LoC?
2. Re:Unit conversion by Neon+Aardvark · 2008-11-23 05:22 · Score: 4, Informative
  
  No, 1 PB = 12 LoC, so 1 LoC = 0.0833... PB
  Also, I'd like to make some kind of swimming pool reference.
  
  --
  Azural - instrumentals
3. Re:Unit conversion by Anonymous Coward · 2008-11-23 05:45 · Score: 2, Interesting
  
  Assuming it was written in binary in a font that allows for 1 digit per 2mm, the length of the data would be 183251938 m, or 1145324 times the perimeter of an olympic-sized swimming pool.
4. Re:Unit conversion by Zarhan · 2008-11-23 06:47 · Score: 2, Informative
  
  Oh darn. Clearly I was converting pound-congresses to kilos first.
5. Re:Unit conversion by ewanm89 · 2008-11-23 09:29 · Score: 2, Informative
  
  Well, American's don't even play *foot*ball with there feet.
That's Easy by Lord+Byron+II · 2008-11-23 05:04 · Score: 4, Interesting

Consider a data set of two numbers, each .5 petabyte big. It should only take a few minutes to sort them and there's even a 50% chance the data is already sorted.
1. Re:That's Easy by Blakey+Rat · 2008-11-23 05:08 · Score: 5, Insightful
  
  I came here to post the same thing. If they sorted a petabyte of Floats, that might be pretty impressive. But if they're sorting 5-terabyte video files, their software really sucks.
  Not enough info to judge the importance of this.
  
  --
  Comment of the year
2. Re:That's Easy by farker+haiku · 2008-11-23 05:16 · Score: 5, Informative
  
  I think this is the data set. I could be wrong though. The article (yeah yeah) says that
  In our sorting experiments we have followed the rules of a standard terabyte (TB) sort benchmark.
  Which lead me to this page that describes the data (and it's available for download).
  
  --
  Your sig(k) has been stolen. There is a puff of smoke!
3. Re:That's Easy by Anonymous Coward · 2008-11-23 05:16 · Score: 5, Informative
  
  From TFA: they sorted "10 trillion 100-byte records"
4. Re:That's Easy by sakdoctor · 2008-11-23 05:18 · Score: 4, Funny
  
  And yet google don't even convert petabytes to libraries of congress in the google calculator.
  Or perhaps I got the syntax wrong.
5. Re:That's Easy by sakdoctor · 2008-11-23 05:19 · Score: 4, Funny
  
  Huh? This isn't the parent post I was trying to reply to.
Need to benchmark against the best sorts by Animats · 2008-11-23 05:12 · Score: 4, Insightful

Sorts have been parallelized and distributed for decades. It would be interesting to benchmark Google's approach against SyncSort. SyncSort is parallel and distributed, and has been heavily optimized for exactly such jobs. Using map/reduce will work, but there are better approaches to sorting.
1. Re:Need to benchmark against the best sorts by Pinball+Wizard · 2008-11-23 10:36 · Score: 2, Interesting
  
  Parallel/distributed sorting doesn't eliminate the need for map/reduce, it just helps spread the problem set across machines.
  Here's the thing though...its the distributing of the problem set and the combining of the results that is the hard part - not map/reduce.
  Map and reduce are simple functional programming paradigms. With map, you apply a function to a list - which could be either atomic values or other functions. With reduce, you take a single function(like add or multiply, for instance) and use that to condense the list into a single value or object.
  That's my understanding of map/reduce from my functional language classes in school and that's exactly how Google describes it. I don't really see what the big deal is with map/reduce in itself.
  Like I said, its the distributing the problem among thousands of machines that is the hard part.
  
  --
  No, Thursday's out. How about never - is never good for you?
2. Re:Need to benchmark against the best sorts by ShakaUVM · 2008-11-24 00:22 · Score: 3, Insightful
  
  >>Using map/reduce will work, but there are better approaches to sorting.
  It kinda bugs me that Google trademarked (or, at least, what they named their software) after a programming modality that has been in parallel processing for ages. In fact, MPI has a mapreduce() function that, well, does a map/reduce operation. I.e., farms out instances of a function to a cluster, then gathers the data back in, summates it, and presents the results to someone.
  It kind of bugs me (in their Youtube video linked in TFA, at least) that they make it seem that this model is their brilliant idea, when all they've done is write the job control layer under it. There's other job control layers that control spawning new processes, fault tolerance, etc., and have been for many, many years. Maybe it's nicer than other packages, in the same way that Google Maps is nicer than other map packages, but I think most people like it just because they don't realize how uninspired it is.
  It'd be like them coming out with Google QuickSort(beta) next.
Finally... by aztektum · 2008-11-23 05:17 · Score: 5, Funny

I will be able to catalog my pr0n in my lifetime:
Blondes, Brunettes, Red heads, Beastial^H^H^H^H^H "Other"

--
:: aztek ::
No sig for you!!
1. Re:Finally... by Pugwash69 · 2008-11-23 05:37 · Score: 2, Funny
  
  How do you catalogue the topics? I mean "Clown" and "Monkey" are so different, but something with both elements could be difficult to sort.
  
  --
  Pro Coffee Drinker
One ups Yahoo & Hadoop by DaveLatham · 2008-11-23 05:23 · Score: 3, Interesting

It looks like Google saw Yahoo crowing about winning the 1 TB sort contest using Hadoop and decided to one up them!

Let's see if Yahoo responds!
1. Re:One ups Yahoo & Hadoop by Patrick+May · 2008-11-23 07:21 · Score: 3, Informative
  
  It's older than design patterns. Lisp has provided map and reduce functions for literally decades. It's a standard functional programming idiom.
2. Re:One ups Yahoo & Hadoop by jollyplex · 2008-11-23 11:49 · Score: 5, Interesting
  
  Exactly. It's unclear if their better time was a software engineering or algorithmic feat, though. Hadoop was able to finish sorting the 1 TB benchmark dataset in 209 s; TFA states Google pulled the same event off in 68 s. The Yahoo blog post you linked to says their compute nodes each sported 4 SATA HDDs. Note TFA mentions Google's 1 PB dataset sort used 48,000 HDDs split between 4,000 machines, or 12 HDDs to a machine. If Google used the same machines to perform their 1 TB sort, then they had 3 times as many HDDs on each compute node, and could probably pull data from storage 3 times as fast. 209 s / 68 s ~ 3.1 -- coincidence, or not? =)
Re:Sort? Sort what? by nedlohs · 2008-11-23 05:34 · Score: 5, Informative

I realize, slashdot..., but maybe you could glance at the article which states:
10 trillion 100-byte records
tagging by Hao+Wu · 2008-11-23 05:34 · Score: 4, Interesting

I will be able to catalog my pr0n in my lifetime:

It's not enough to sort by blond, black, gay, scat, etc. Some categories are a combination that don't belong in a hierarchy.

That is where tagging comes in. Sorting can be done on-the-fly, with no one category intrinsically more important.

--
I suggest you read Slashdot
1. Re:tagging by gardyloo · 2008-11-23 05:53 · Score: 5, Funny
  
  pr0n for Geeks, volume 18: Sorting On-the-Fly
Its About Time.... by Anonymous Coward · 2008-11-23 05:36 · Score: 2, Funny

Finaly... A system with enough power to run vista efficiently.
1. Re:Its About Time.... by poetmatt · 2008-11-23 05:55 · Score: 3, Informative
  
  Are you sure? It wasn't marked Vista capable.
2. Re:Its About Time.... by peragrin · 2008-11-23 06:09 · Score: 3, Funny
  
  Not only that the extra processors aren't covered under the EULA and require special extra licenses.
  
  --
  i thought once I was found, but it was only a dream.
Not impressive... by g0dsp33d · 2008-11-23 05:41 · Score: 4, Funny

Not a big deal, that's just the data they have on you.

--
lol: You see no door there!
0s and 1s by johno.ie · 2008-11-23 05:56 · Score: 2, Funny

That's a lot of computing power to use just to get 4,000,000,000,000 0s and 4,000,000,000,000 1s.

--
872835240
nice one, Google... by Tastecicles · 2008-11-23 06:03 · Score: 2, Funny

...fancy doing my mp3 collection?

--
Operation Guillotine is in effect.
Libraries of congress? by TinBromide · 2008-11-23 06:03 · Score: 2, Insightful

First of all, this isn't a straight up "Libraries of Congress" (better known and mentioned in prior posts as a LoC). Its the web archiving arm of the LoC. I call for the coining of a new term, WASoLoC (Web Archival System of Library of Congress) which can be defined as X * Y^Z = 1 WASoLoC where X is some medium that people can relate to (books, web pages, documents, tacos, water, etc), Y is a volume (Libaries, Internets, Encyclopedias, end to end from A to B, swimming pools, etc) and Z is some number that marketing drones come up with because it makes them happy in their pants.

Honestly, How am i supposed to know what "..the amount of archived web data in the US Library of Congress as of May 2008." Looks like!? I've been to the library of congress, i've seen it, its a metric shit-ton of books (1 shit-ton = Shit * assloads^fricking lots), but i have no clue what the LoC is archiving, what rate they're going at it, and what the volume is of it.

--
Is it sad that I am more likely to recognize you and your posts by your sig than your name or UID?
clever strategy by stimpleton · 2008-11-23 06:12 · Score: 2

Good.

They clearly have the ability to respond to emergencies. And this puts it out there that they can...

eg;
1) Foot n mouth out break in cattle
2) A supliment to census data
3) Finding information of dissidents/traitors(bloggers)

--

In post Patriot Act America, the library books scan you.
Re:Sort? Sort what? by Dpaladin · 2008-11-23 06:12 · Score: 5, Funny

Sorting a petabyte sounds pretty impressive, but I don't think it was a whole yotta work.

--
Bad puns gave me bad karma. =(
just in perspective... by wjh31 · 2008-11-23 06:38 · Score: 2, Interesting

i make this about 48GB/s, my hard drive manages about 20MB/s, even my mid-range ram manages only ~6.4GB/s, and top end ram will reach only ~13GB/s (according to wiki) so even ignoring the ability to process that much data in that time, the ability to simply move that much data is quite impressive (at time of print, may not hold one year down the line)
Re:Sort? Sort what? by nedlohs · 2008-11-23 06:46 · Score: 2, Insightful

You do have to merge them all back together at the end...
But I'm sure you can do better tonight.
Re:Sort? Sort what? by chaim79 · 2008-11-23 06:46 · Score: 2, Insightful

right, so it's 250gb sorted in 6 hours... now where does the sorting and integration of the 4000 250gb blocks of sorted data come in? :)

--
DEMETRIUS: Villain, what hast thou done?
AARON: Villain, I have done thy mother.
Shakespeare invents 'your mom'
Re:20,111 Servers ?? by chaim79 · 2008-11-23 06:50 · Score: 3, Insightful

Yah, but you gotta wonder at the computing cost of integrating all those datasets into one complete sorted block of data. It could be that those servers can sort at 1gb per min but the overhead for combining is 25% of the computing time.

--
DEMETRIUS: Villain, what hast thou done?
AARON: Villain, I have done thy mother.
Shakespeare invents 'your mom'
Re:20,111 Servers ?? by johnflan · 2008-11-23 07:04 · Score: 2, Informative

Agreed, but even if it takes 40,000 servers with losses and extra overhead to calculate their daily workload. It makes you wonder what their other estimated 410,000 servers are doing? (2006 estimate)
Re:20,111 Servers ?? by smallfries · 2008-11-23 07:21 · Score: 3, Insightful

Oh dear. 4000*362 ~= 1440*20111 / 20. So you assumed that the sorting would scale linearly. fail.

--
Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php
Amazing feat... by Duncan3 · 2008-11-23 08:32 · Score: 5, Funny

Today from Google, the god of all things and doer of all things good in the universe, many millions of dollars in computer equipment were able to sort lots of things, in about the amount of time you would think it would take for millions of dollars of equipment to sort things.
In other news, a woodchuck was found chucking wood as fast as a woodchuck could chuck wood.
Congrats Google, you have a HUGE data set, and an even bigger wallet.

--
- Adam L. Beberg - The Cosm Project - http://www.mithral.com/
MapReduce = map + reduce by Bitmanhome · 2008-11-23 08:45 · Score: 3, Interesting

If you feel the urge to play with MapReduce (or reade the paper), you don't need a fancy Linux distro to do it. MapReduce is simply the map() and reduce() functions, exactly as implemented in Python. Granted, Google implementation can work with absurdly large data sets, but for small data sets, Python is all you need.

--
Not that this wasn't entirely predictable.
1. Re:MapReduce = map + reduce by boyter · 2008-11-23 09:39 · Score: 3, Informative
  
  True, but not quite the point. The map and reduce functions as you say are implemented in python (and a great many other languages), but what makes MapReduce special is that you replace the Map function with one which distributes it out to other computers. Because any map function can be implemented in parallel you get a speed boost for however many machines you have (dependant on network speeds etc....).
  So yeah, you can do it in Python but you arent going to be breaking any records untill you implement your own infrastructure that lets you span it out to thousands of computers. The nice thing being you dont need to write any new code to take advantage of the speed when you do.
2. Re:MapReduce = map + reduce by Pinball+Wizard · 2008-11-23 10:42 · Score: 2, Informative
  
  Exactly. There is nothing special to map and reduce.
  Here's an example. Map and reduce are functional programming tools that work with lists. So we'll start with a simple list.
  1 2 3 4 5
  Now we'll take a function - x^2, and map it to the list. The list now becomes:
  1 4 9 16 25.
  Now, we'll apply a reduce function to our list to combine it to a single value. I'll use "+" to keep it simple. We end up with:
  55
  And that is pretty much all there is to map and reduce.
  
  --
  No, Thursday's out. How about never - is never good for you?
3. Re:MapReduce = map + reduce by adpowers · 2008-11-23 17:33 · Score: 2, Informative
  
  Almost, but not quite. MapReduce has a slightly different format than just map() and reduce(). Here is the signature of map and reduce from a theoretical functional language:
  map(): A* -> B*
  reduce(): B* -> C
  Whereas in MapReduce:
  map: (K, V)* -> (K1, V1)*
  reduce: (K1, (V1)*)* -> (K2, V2)*
  I think that is mostly accurate. Read more accurate/detailed report in MapReduce revisited[PDF].
Re:BCA's? by SEWilco · 2008-11-23 14:10 · Score: 2, Funny

Can we convert that to number of bad car analogies?

Sure, it's -4.15 Edsels.
Re:MapReduce by adpowers · 2008-11-23 17:43 · Score: 4, Informative

The individual functions map and reduce are quite standard. The innovation here is the systems work they've done to make it work on such a large scale. All the programmer needs to worry about is implementing the two functions, they don't have to worry about distributing the work, ensuring fault tolerance, or anything else for that matter. That is the innovation.
They mention in the article that if you try and sort a petabyte you WILL get hard disk and computer failures. Hell, you can only read a terabyte hard disk a few times before you encounter unrecoverable errors. The system for executing those maps and reduces is what is important here. The important parts are in the design details, such as dealing with stragglers. If you have 4000 identical machines, you won't necessarily get equal performance. If a few of those machines have a bit flipped and started without disk cache, they might see a huge decrease in read/write performance. The system needs to recognize this and schedule the work differently. That can make a huge difference in execution time. If you graph the percentile complete of a MR job, you'll often see that it quickly reaches 95% and then plateaus. The last 5% may take 20% of the time, and good scheduling is required to bring this time down.
But like I said, the innovation isn't in the idea of using a Map and Reduce function, it is the system that executes the work.