Slashdot Mirror


Google Two Years Into Overhaul of the Google File System

El Reg writes "As its ten-year-old file system — GFS — struggles to keep up with Gmail, YouTube, and other apps it was never designed to support, Google is brewing a replacement. According to the company, it's two years into a GFS sequel designed specifically for customer-facing apps that require ultra low latency."

52 of 217 comments

  1. hmm by gnarfel · · Score: 5, Funny

    Well I'm no expert on Google's internal workings, but have any of these protocols or file systems they've developed been released outside of Google for public use?

    --
    Local music(to upstate NY). http://gnarfel.com/ radio.
    1. Re:hmm by buchner.johannes · · Score: 5, Funny

      GFS is proprietary and for internal use only. They only released a paper describing how it works (don't know if that content is enough to rebuild it). I think GFS (global file system) from Red Hat and OpenGFS are something different. Hadoop is what you want. What would we do without the wiki

      --
      NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
    2. Re:hmm by Brian+Gordon · · Score: 5, Funny

      No, they haven't. So why does the editor think we care? "Google Six Months Into Resurfacing Parking Lot"

    3. Re:hmm by mysidia · · Score: 5, Funny

      They have not, and apparently Google thinks of the Google FS as part of their secret sauce, such that they will probably never get it released. Although they seem happy to write papers about it.

      It's actually really sad... Google has built an innovative platform for distributed computing, that solves quite a few problems, vastly superior to the state of the art in distributed computing, but they basically keep the filesystem and clustering implementations completely to themselves, it would seem.

      They use the Linux platform to the absolute max, leveraging all the blood and sweat Linux developers poured into its development over the past 15 years, and yet, not contributing back any of their most significant enhancements.

      I won't call it evil, as they're under no obligation to release GoogleFS or their map reduce implementations, it's just unkind.

      I would equate it to an inventor creating the lightbulb, and their employer saw this, and decided instead of trying to sell the invention to the public, they decided to only allow their own factories to buy lightbulbs, thus netting them a competitive advantage over other factories whose workers had to operate in the dark or by candlelight.

      There is no software product available to the public that even utilizes GoogleFS. Instead, it's all software as a service (the Google search engine service, that is).

    4. Re:hmm by ToadProphet · · Score: 5, Funny

      Parent and GP modded funny? Am I missing the joke or are there some giddy drunks with mod points?

      --
      It's on America's tortured brow, That Mickey Mouse has grown up a cow
    5. Re:hmm by buchner.johannes · · Score: 4, Funny

      Everyone is in a good mood. Why not :-)

      --
      NB: The message above might reflect my opinion right now, but not necessarily tomorrow or next year.
    6. Re:hmm by MeatBag+PussRocket · · Score: 5, Interesting

      They use the Linux platform to the absolute max, leveraging all the blood and sweat Linux developers poured into its development over the past 15 years, and yet, not contributing back any of their most significant enhancements.

      i see your point, but its not like google isnt giving significantly in return. most people would be hard pressed to deny that Googles search engine was a game changer in the interweb. at its release it was leaps and bounds better than just about anything out there, and is still the gold standard for finding information. hell they gave us the verb "to google". we got a pretty decent browser out of it, gmail, google docs, google maps, and a whole bunch of other stuff they've generated. not to mention a forthcoming OS. at this point i can already hear critics screaming about Googles profits driving these services, and you know what, maybe they are, but i havent paid Google a dime, and most likely, neither have you. i dont care if they make money, theres nothing wrong with it, and i'm even happier that they make money without involving me whatsoever. in many ways i would think Google would be a champion to the FOSS community. so they want to keep a filesystem proprietary, frankly thats not so bad, competition is good but competitors arent usually. Google is a good counter balance to Microsoft and other would-be owners of the interwebs. are they "good" as in saintly? no, but they never claimed to be, they claimed "dont be evil" i'd say they're pretty far from that.

      --
      i wage a holy war against the apostrophe.
    7. Re:hmm by ksatyr · · Score: 5, Funny

      "Google Six Months Into Resurfacing Parking Lot"

      And it's still in beta.

    8. Re:hmm by lawpoop · · Score: 5, Insightful

      They use the Linux platform to the absolute max, leveraging all the blood and sweat Linux developers poured into its development over the past 15 years, and yet, not contributing back any of their most significant enhancements.

      Not contributing back!? Dude, they gave us *google*. Remember what it was like before google? When internet search was basically voo-doo crapshoots, that worked 25% of the time? They gave us a search engine that actually *worked*. Before that, you basically had to bookmark or memorize internet sites that you liked. Good luck actually finding what you were looking for without having an actual site in mind beforehand.

      I think that alone has probably spurred the development of free software. Imagine being able to *find things* on the internet!

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    9. Re:hmm by Night+Goat · · Score: 5, Funny

      Yahoo worked fine for me before Google. I think you give it more credit than it deserves. The downside of Yahoo was its advertising and clutter. The searching part worked fine.

    10. Re:hmm by Snarf+You · · Score: 5, Funny

      While I found both posts informative, I find it funny that they were modded funny. It's meta-funny. You know what else is funny, is that the word funny starts to sound funny after saying it enough times.

    11. Re:hmm by billcopc · · Score: 5, Funny

      You clearly weren't an Altavista user.

      Google's results today are no better than the leading search engines 10 years ago. People were gaming the engines then, and Google came up with a smarter algorithm (Pagerank), but today's results page is again full of garbage because people learned how to game Pagerank. Combine that with the web 2.0 fad of scraping and regurgitating everyone else's content, and the resultant pile of URLs for any given keyword is utterly worthless. I call it "metapublishing", because the content is worthless, it's become a twisted game of outwitting Google to maximize ad revenue while providing zero value.

      Searching has always been a game of finding the most specific yet least popular terms to define what you want, and then adding a bunch of negative keywords to filter out the junk. Google scored a hit, many many years ago, but they haven't been able (or willing) to maintain that lead, and all their competitors have pretty much died out anyway.

      If Google hadn't come along when it did, someone else would have stepped up. Maybe Altavista, or Yahoo, or someone else. There was a need, and a provider to address that need. The only reason we don't have a new search engine to beat Google today is because, well, everyone is scared shitless of going head-to-head with Google, except Microsoft with their propaganda-laced Bing embarrassment. They're just not the golden child people seem to think they are.

      --
      -Billco, Fnarg.com
    12. Re:hmm by CharlyFoxtrot · · Score: 5, Funny

      Altavista worked fine, HotBot too. I started using Google primarily because of the cached pages, not because the search was that much better. Plus like you say the Google interface was a breath of fresh air.

      --
      If all else fails, immortality can always be assured by spectacular error.
    13. Re:hmm by ToadProphet · · Score: 5, Funny

      Everyone is in a good mood. Why not :-)

      Modded Troll... now that's delicious.

      --
      It's on America's tortured brow, That Mickey Mouse has grown up a cow
    14. Re:hmm by mysidia · · Score: 5, Informative

      Yahoo was originally a web directory, not a conventional search engine. The search results were provided by others.

      In 2000, they signed an agreement with Google, and Yahoo's search was powered by Google, in other words -- if you used Yahoo, you were using Google.

      That didn't change until 2005, and after several other search engine company acquisitions, when they developed their own search technology.

    15. Re:hmm by negRo_slim · · Score: 5, Funny

      Plus like you say the Google interface was a breath of fresh air.

      Sometimes I wonder if Yahoo hadn't made their default page http://search.yahoo.com/ early on, if they wouldn't have done somewhat better for themselves.

      --
      On the Oregon Cost born and raised, On the beach is where I spent most of my days
    16. Re:hmm by NekoYasha · · Score: 5, Funny

      It's called a "running gag".

    17. Re:hmm by Afforess · · Score: 5, Funny

      If you repeat a word too many times, quickly, your brain becomes tired of that word and it begins to become foreign to it. This event is similar in nature to looking at grid illusions. Your brain becomes tired after a few moments and you see dots.

      --
      If our elected representatives no longer represent us, do we still live in a Democracy?
    18. Re:hmm by jcnnghm · · Score: 5, Informative

      You know from June 2000 to February 2004 Google was the backend for the Yahoo web page search. That was back when Yahoo was a web site "human directory" search first and foremost, and only secondarily a machine-powered internet search. Sort of like how Yahoo search is going to be powered by Bing in the future, and was powered by Inktomi before Google.

      --
      You don't make the poor richer by making the rich poorer. - Winston Churchill
    19. Re:hmm by superdave80 · · Score: 5, Funny

      Funny
      Funny
      Funny
      Fun...

      Shit, you're right!

    20. Re:hmm by lawpoop · · Score: 4, Insightful

      If Google hadn't come along when it did, someone else would have stepped up.

      Doesn't change the fact that it *was* them, who was able to do it when nobody else had been able to. So I think that yes, they did contribute a lot to open source development. It's not enough to have a good idea, or believe that someone will eventually get around to it; someone actually has to sit down and *do* it. If google hadn't done it then, we would be that much further behind in internet search technology.

      --
      Computers are useless. They can only give you answers.
      -- Pablo Picasso
    21. Re:hmm by Afforess · · Score: 5, Funny

      Why is the OP funny, if it says "40% interesting," "30% funny" and "30% informative"? Shouldn't the post be "Interesting?"

      --
      If our elected representatives no longer represent us, do we still live in a Democracy?
    22. Re:hmm by Mostly+a+lurker · · Score: 5, Informative

      Your recollections are different from mine. Prior to Google, I tended to use AltaVista and Hotbot. Searches took at least ten times as long. Results rarely included any recently created pages. The number of indexed pages was several orders of magnitude less than Google handles today (which in turn is one order of magnitude, or so, greater than current competitors). In spite of the fact that gaming of search engines is overwhelmingly targeted at Google, Google still does a relatively better job of finding the genuinely useful pages. Is Google perfect? No, of course not. Search is still only a partially solved problem. However, since its inception, Google has come up with most of the practical advances in the state of the art, as well as the best infrastructure for its implementation.

    23. Re:hmm by MobileTatsu-NJG · · Score: 5, Funny

      They use the Linux platform to the absolute max, leveraging all the blood and sweat Linux developers poured into its development over the past 15 years, and yet, not contributing back any of their most significant enhancements.

      Not contributing back!? Dude, they gave us *google*. Remember what it was like before google? When internet search was basically voo-doo crapshoots, that worked 25% of the time? They gave us a search engine that actually *worked*. Before that, you basically had to bookmark or memorize internet sites that you liked. Good luck actually finding what you were looking for without having an actual site in mind beforehand.

      I think that alone has probably spurred the development of free software. Imagine being able to *find things* on the internet!

      Are you kidding? Search for Quake? Porn. Search for a new version of Netscape? Porn. Google? PFtb. It always gave me Quake and Netscape. My pr0n searching was MUCH more productive before Google!

      --

      "I like to lick butts!" by MobileTatsu-NJG (#32700246) (Score:5, Informative)

    24. Re:hmm by skine · · Score: 5, Funny

      Woooooooooooooo!

      (Apparently just entering "Woooooooooooooo!" creates an error. I have to explain that it's supposed to be a giddy mod, thus destroying any semblance of assuming intelligence present in at least part of the /. community).

    25. Re:hmm by Dahamma · · Score: 4, Funny

      Because it's funny.

    26. Re:hmm by Dahamma · · Score: 4, Interesting

      You clearly weren't a daily Google user 10 years ago.

      The moment I realized Google was completely superior to the others was when I was able to paste an obscure compile error for an equally obscure CPU architecture into Google and immediately get the answer back... the kind of utterly random error that a few years previous would have potentially taken hours to debug...

      If Google hadn't come along when it did, someone else would have stepped up. Maybe Altavista, or Yahoo

      And you were modded Insightful - sigh... So you are saying they decided "oh, well Google is pretty good at this - let's NOT STEP UP." Yeah, that's what companies do in that situation. Or maybe they do try, and fail (nothing wrong with trying and failing... but that's the REALITY of the situation).

    27. Re:hmm by Yvan256 · · Score: 5, Funny

      Move closer to your screen. There's plenty of them.

    28. Re:hmm by wdr1 · · Score: 4, Insightful

      Put the crackpipe down!

      I was an altavista user. A die-hard one, for most of the mid/late-nineties. In fact, I remember the day I finally convinced my boss to switch from Altavista to Google, because he had worked on Altavista.

      Today's results completely blow away the search engines of 10 years ago. In fact, any of the major players -- Yahoo, Microsoft, even Ask & co. -- would blow away the search engines of 10 years ago.

      (Add to that the fact that the number of documents on the web that they need to crawl & rank has exploded.)

      Your comment that "the resultant pile of URLs for any given keyword is utterly worthless" is itself hyperbolic nonsense. If that were true, nobody would use them.

      --
      SlashSig Karma: Excellent (mostly affected by moderatio
    29. Re:hmm by HeronBlademaster · · Score: 5, Funny

      I used Dogpile, back in the day; it would show you the results from ten or so other search engines.

    30. Re:hmm by unity · · Score: 4, Insightful

      The only thing you really missed there was the really simple, non-image intensive interface. That alone spurred people to use google.

    31. Re:hmm by BikeHelmet · · Score: 5, Funny

      Shit, I see dots from the start. My brain must be really lazy.

    32. Re:hmm by Anonymous Coward · · Score: 5, Funny

      Many people think that Google's original claim to fame is PageRank. That's only partially true. Google became as successful as they are because of their systems-scalability work. That is, Google figured out how to build the biggest clusters, with the most storage space, the most computation capacity, and the lowest latency, for the least amount of money (compared to their competitors anyway). If you have 1000x times the computing power of your nearest competitor, then you can do 1000x as much data mining, which means that your search results (and ad relevancy) will be that much better.

      For a long time, Google refused to release any information on their system infrastructure (it was their crown jewel, after all). The GFS paper was released in 2003, well after Google had put the filesystem (and its predecessors) to public use.

      To sum it up: GFS has been one of the strongest contributing factors to Google's dominance. The idea that Google would voluntarily give this code to competitors is laughable.

    33. Re:hmm by sootman · · Score: 4, Insightful

      It really amuses me how all these different comments come up in every thread about search engines. Everyone's experience is different. Google is still very useful to me 99% of the time. As for AltaVista, I remember '96-'97 very well. I would usually use Yahoo first. If Yahoo only produced a small handful of results--literally, 10 or less, and no good ones--then I'd go to AltaVista and get tens of thousands of results. If I was lucky I'd find what I wanted in the first few pages, else I'd give up.

      Google is still literally orders of magnitude better than anything else I've tried. Disclaimer: I've pretty much used only Google for the last... um, however many years it's been since they came on the scene. I won't claim to have used it when they were still hosted at stanford.edu, but I heard about them early on (back when they had , probably from Slashdot, and I was impressed right away. I probably stopped using Yahoo altogether within a couple months.

      --
      Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
    34. Re:hmm by V!NCENT · · Score: 5, Funny

      It takes absolutely zero effort for this post to be modded funny

      --
      Here be signatures
    35. Re:hmm by jimicus · · Score: 4, Funny

      Of course, by posting under your user account, you're not a mod in this thread any more so it's all rather academic.

    36. Re:hmm by Vu1turEMaN · · Score: 4, Funny

      Please take your meds, Ric Flair....

    37. Re:hmm by Xemu · · Score: 5, Informative

      It takes absolutely zero effort for this post to be modded funny

      It doesn't take much to be modded informative either.

      --
      Tell your friends about xenu.net
    38. Re:hmm by D+Ninja · · Score: 5, Funny

      C-C-C-C-COMBO BREAKER!

  2. It's not really GFS by mysidia · · Score: 5, Insightful

    It's GoogleFS.

    GFS refers to the Global File System, which is commonly used in Linux clustering environments.

    By comparison, GoogleFS came second, is basically a no-name filesystem unknown to most of the IT world, because it's not available for use, hasn't been released as a product, compared to the well-established global filesystem.

    It would certainly seem like the Global File system would have priority claim over the name GFS...

    So let's stop calling Google's filesystem, which we'll probably never get to use, GFS :)

    1. Re:It's not really GFS by mysidia · · Score: 5, Funny

      That is a problem that may be getting corrected by the IANA TLA registry :)

    2. Re:It's not really GFS by corsec67 · · Score: 4, Funny

      Tell that to the NWA. Wrestling and a wildlife foundation should be even easier to tell apart, they both aren't as similar as two file systems.

      --
      If I have nothing to hide, don't search me
  3. Google is IT done right... by Alien+Being · · Score: 5, Funny

    but God help us all if they ever do turn evil.

    1. Re:Google is IT done right... by ObitMan · · Score: 4, Funny

      not on your life.
      Developers constantly ruin perfectly good infrastructure.

      --
      Who run Barter Town?
    2. Re:Google is IT done right... by Nefarious+Wheel · · Score: 5, Funny

      Developers aren't IT?

      Not really, no. It's kind of like the difference between a doctor and a patient. Or to use a car analogy, the difference between being an automotive engineer and the guy who takes money for candy bars, magazines and fuel.

      Disclosure: I was a developer for about thirty years before I took a step down and moved into marketing. I learned a lot of languages but was stopped when I discovered I was having trouble mastering Hindi.

      --
      Do not mock my vision of impractical footwear
    3. Re:Google is IT done right... by Duncan3 · · Score: 5, Interesting

      Not really, it's IT done by not letting anyone over 30 or with any experience into the room. Every single issue they had to learn and fix mentioned in the article is quite literally standard textbook stuff in distributed systems, and has been for over 40 years. The failure model, the huge chunk sizes, the single master problems... etc. Nobody who had taken even one decent class would have ever considered the original design viable.

      They really should just stick to buying their tech pre-made like everything else Google is known for - acquisitions. Other companies are willing to hire experienced people. You know, those old lazy bastards that only work 40 hours a week because they have families, cost way too much to provide health insurance to, but get things done 5x as fast because they have done it before :)

      --
      - Adam L. Beberg - The Cosm Project - http://www.mithral.com/
  4. Curiously by ShooterNeo · · Score: 5, Insightful

    In the article, it's stated that the load on the Google file system has grown orders of magnitude greater than it was ever intended to handle. And one of the algorithm changes is that the chunks in the new file system are 1 megabyte in size rather than 64 megabytes. This is to reduce latency, which makes logical sense...but dividing a gigantic database into pieces that are 64 times smaller doesn't make intuitive sense...
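
    For what it's worth, here is a rough sketch of the arithmetic behind that intuition. The 1 TiB dataset size and the `chunk_count` helper are made up for illustration; only the 64 MB and 1 MB chunk sizes come from the article.

```python
MIB = 1024 * 1024

def chunk_count(total_bytes: int, chunk_bytes: int) -> int:
    """Number of fixed-size chunks needed to hold total_bytes (rounded up)."""
    return (total_bytes + chunk_bytes - 1) // chunk_bytes

# Hypothetical dataset size, chosen only to make the numbers concrete.
dataset = 1024 * 1024 * MIB  # 1 TiB

old_chunks = chunk_count(dataset, 64 * MIB)  # 64 MB chunks, as in the original GFS
new_chunks = chunk_count(dataset, 1 * MIB)   # 1 MB chunks, as in the replacement

print(f"64 MB chunks: {old_chunks}")
print(f" 1 MB chunks: {new_chunks}")
print(f"ratio: {new_chunks // old_chunks}x more chunks to keep track of")
```

    So the same data means 64 times as many chunks for the system to keep track of, which is presumably the counterintuitive part: the trade looks like more bookkeeping in exchange for smaller units to read and write.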

    1. Re:Curiously by martin-boundary · · Score: 4, Funny

      It does if it was 64x too big to begin with. Live and learn.

      No need to learn. 64x should be enough for anybody, dammit!

    2. Re:Curiously by Runaway1956 · · Score: 4, Funny

      The question is based on assumptions. I've personally pushed an 18-wheeler over 150 mph. I've pushed a bike over 170 mph. In both cases, the limiting factors were all the 4-wheelers. Take the cars off the roads, and let the bikes and the trucks run.

      Oh yeah - one more thing. Mandate that cop cars have square wheels. They already have radio, they need a handicap to make things fair.

      --
      "Windows is like the faint smell of piss in a subway: it's there, and there's nothing you can do about it." - Charlie Br
  5. Quality of comments going downhill... by s0litaire · · Score: 5, Funny

    There's over 25 comments and not one has attempted to call it "Goatse File System"!

    What's up with you trolls! You guys on a union break or what!!

    --
    Laters Sol "Have you found the secrets of the universe? Asked Zebade "I'm sure I left them here somewhere"
  6. Where's meta-moderation when you need it? by nobodyman · · Score: 5, Funny

    I'm impressed that all of these Reddit users had the attention span to stay long enough to get mod points. But nobody likes a guest who overstays their welcome. Besides, I think somebody posted an animated gif of an old man falling down or something. GO CHECK IT OUT!!!1!1one

  7. Re:GFS? by Vectronic · · Score: 4, Funny

    Wrong, it's the Google GFS File System!