Slashdot Mirror


The Design Of The Google File System

Freddles writes "This is an interesting paper (PDF) describing the design approach to Google's file system. The design had to take account of requirements for huge file sizes, a highly responsive infrastructure and an assumption that hardware components will always fail."

149 of 210 comments

  1. In case you don't like PDF by Brahmastra · · Score: 4, Informative

    Here's the html link

    1. Re:In case you don't like PDF by redJag · · Score: 3, Funny

      In case you hate highlighting as I do:
      try this :)

      I wish I had enough RAM to use as a harddisk. Then I could...well no, I wouldn't do anything useful. It would be cool, in a geeky way.

    2. Re:In case you don't like PDF by gl4ss · · Score: 1

      well, most probably he meant it as a joke(hiding the name behind blank number and all).

      as a sidenote, here is the google pdf-to-html cache of it: http://www.google.fi/search?q=cache:m0TMQYgIlIoJ:www.cs.rochester.edu/sosp2003/papers/p125-ghemawat.pdf+&hl=fi&ie=UTF-8

      --
      world was created 5 seconds before this post as it is.
    3. Re:In case you don't like PDF by Short+Circuit · · Score: 5, Funny

      A ramdisk would make for a great swap partition. :)

    4. Re:In case you don't like PDF by bugnuts · · Score: 2, Funny
      How ironic, that the HTML-ized file on Google is available from Yahoo!...


      Yahoo uses the evil Anti-Google FS. It's the 1's complement called GllgOe. It can store 01111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 1111111111 bytes of data.

    5. Re:In case you don't like PDF by rjch · · Score: 1, Redundant
      A ramdisk would make for a great swap partition. :)
      Oh I wish I had a Redundant (+1) modifier... :)
    6. Re:In case you don't like PDF by nonetheless · · Score: 1

      And see also here.

    7. Re:In case you don't like PDF by proj_2501 · · Score: 1

      actually i used to use a RAM disk on my mac as my browser cache.

    8. Re:In case you don't like PDF by canadiangoose · · Score: 1

      Years ago I had a 386SX/33 with 6MB of RAM. Just as an experiment I created a 4MB RAMDisk and compressed it with DriveSpace, then I put my Windows swapfile on it. Even with a whopping 10MB of memory, Windows still sucked, and it was quite a hassle to set that up every time I needed to run a Windows app.

      --
      Never eat more than you can lift -- Miss Piggy
  2. Thoughtful... by Anonymous Coward · · Score: 5, Funny

    It was thoughtful of the poster to link to google.com for those that have never heard of it.

    1. Re:Thoughtful... by Queuetue · · Score: 5, Funny

      Absolutely - I was about to go look google up on teoma and askjeeves...

    2. Re:Thoughtful... by Anonymous Coward · · Score: 5, Funny

      Last week I had a co-worker ask how to spell it. He is MS cert'd for Win2k Pro. Don't mod this funny, it's sad.

    3. Re:Thoughtful... by Anonymous Coward · · Score: 1, Funny

      Unfortunately, we can't mod it sad, you have to go to fark to do that.

    4. Re:Thoughtful... by Anonymous Coward · · Score: 1, Funny

      That reminds me of all those personal web pages back in 1997 that had link sections, and they always had links to, say, Lycos. As if I couldn't find the search site myself!

    5. Re:Thoughtful... by OneFix · · Score: 1

      While everyone should have already heard of google, it's kinda dumb to use one search engine when you can use a meta-engine like Turbo10 that uses all of the main search engines and some lesser known...

      Of course, AllTheWeb is giving Google a run for its money...in the race to make it to 4 billion pages indexed, so Google may fall back down for a while...

      However, I don't think many ppl will switch because of a few thousand pages...

    6. Re:Thoughtful... by Morosoph · · Score: 2, Funny

      In case of Slashdotting

      Take note: "Google is not affiliated with the authors of this page nor responsible for its content."

    7. Re:Thoughtful... by SquadBoy · · Score: 1

      http://web.ask.com/web?q=google&o=0&qsrc=0&askbutton.x=25&askbutton.y=8

      --

      Cypherpunks: Civil Liberty Through Complex Mathematics. Those who live by the sword die by the arrow.
    8. Re:Thoughtful... by daeley · · Score: 3, Funny

      Last week I had a co-worker ask how to spell it

      I-T. Really now, how hard is that?

      --
      I watched C-beams glitter in the dark near the Tannhauser gate.
    9. Re:Thoughtful... by xoboots · · Score: 4, Insightful

      There's a reason not every search engine is considered the same. Try a simple search for a popular item. I searched for "PHP" on the three sites you mentioned. The top returned results are as follows:

      Google:
      - top result: php.net
      - 2nd place was php.net/downloads

      AllTheWeb:
      - top result: Hands-On PHP Training - 4 days $1695 (also ranked #10 on Turbo10, but not ranked in the top 20 at Google) -- oops, that is a sponsored link, but in AllTheWeb's default view, it looks like a normal link. php.net is actually ranked #1, but it appears 4th in the list of available links.

      Turbo10:
      - will not provide ANY results without Javascript turned on (BOO!)
      - top result: GBF Masonry Cleaning Services..Stone Cleaning
      - php.net ranked 5

      Draw your own conclusions, but meta-search engines existed prior to Google yet even at its launch it excelled over them in terms of provision of relevant links. It appears that it still does. At least for a first pass :)

      I suspect that one of the reasons that Google can bring higher quality links to the forefront is that being #1, they have a wider and more generous revenue base and therefore don't have to be as generous to "paying patrons" *cough cough*.

      Another problem is that meta engines have to mix "high-quality" results (say from Google) with lower quality results (say from some dippy paid for advertising search engine).

    10. Re:Thoughtful... by pepsee · · Score: 1
      Last week I had a co-worker ask how to spell it. He is MS cert'd for Win2k Pro. Don't mod this funny, it's sad.

      Because he got confused with 'googol'?

    11. Re:Thoughtful... by SarekOfVulcan · · Score: 1

      He was probably trying to look it up at http://www.googol.com/.

    12. Re:Thoughtful... by Pfhreakaz0id · · Score: 1

      save him some time... www.google.com/microsoft for windows troubleshooting issues rocks!

      that is sad though.

    13. Re:Thoughtful... by lostchicken · · Score: 1

      Yeah. We need a +1 Sad moderation.
      We can use it whenever people know just too much about trivial things.

      +5 would show up as (Score:5, No Life)

      --
      -twb
    14. Re:Thoughtful... by Steeltoe · · Score: 1

      I suspect that one of the reasons that Google can bring higher quality links to the forefront is that being #1, they have a wider and more generous revenue base and therefore don't have to be as generous to "paying patrons" *cough cough*.

      Not just that. Google revolutionized the web-search stage with their Pagerank software and other improvements. It's not something new, librarians have used such algorithms for a long time. However, it consistently gives "better" results than most of the competition.

      I suspect Google won out due to high competence and a drive to become the best, while not being evil and greedy.

    15. Re:Thoughtful... by Elbow+Macaroni · · Score: 1

      No he was probably one of those guys here from Russia or Monrovia on one of those work visas. He probably thought it was Gogol.com.

      --
      -------------------------------------
      Technically, we are beyond survival.
  3. You mean FAT don't cut it no more? by inertialmatrix · · Score: 1, Redundant

    I say screw the innovation and let's all just move back to FAT16!
    Weeeeeeeeeeeeeeeeeeeeee!

    1. Re:You mean FAT don't cut it no more? by Russ+Steffen · · Score: 2, Funny

      I think you mean "WEEEEEEE.EEE." Or possibly "WEEEEEE~1.EEE."

    2. Re:You mean FAT don't cut it no more? by Wumpus · · Score: 4, Funny

      Surely you mean "WEEEEE~1.EEE".

    3. Re:You mean FAT don't cut it no more? by myom · · Score: 1

      Sure there is. Unless you see the file with explorer, with the default settings (hide registered dos extensions) AND you have registered .EEE as a known file type.

    4. Re:You mean FAT don't cut it no more? by Wumpus · · Score: 1

      Even if there wasn't, it'd still be WEEEEE~1, not WEEEE~1.
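
      For anyone following the tilde arithmetic above, here is a minimal Python sketch of how a FAT-style 8.3 alias is usually built. It assumes a simplified rule: uppercase the name, keep the first six characters of the stem, append ~1, and keep at most three characters of the extension. The function name short_name and the index parameter are made up for illustration, and real VFAT drivers also handle illegal characters and alias collisions, which this ignores.

```python
def short_name(long_name: str, index: int = 1) -> str:
    """Return a simplified 8.3 alias for long_name (illustrative only)."""
    # Split on the last dot; a name without a dot has no extension.
    stem, dot, ext = long_name.rpartition(".")
    if not dot:
        stem, ext = long_name, ""

    # Short names are uppercase and contain no spaces or extra dots.
    stem = "".join(c for c in stem.upper() if c not in " .")
    ext = "".join(c for c in ext.upper() if c != " ")

    # A name that already fits in 8.3 needs no numeric tail.
    if len(stem) <= 8 and len(ext) <= 3:
        return f"{stem}.{ext}" if ext else stem

    # Otherwise truncate the stem so that stem + "~N" is 8 characters long.
    tail = f"~{index}"
    alias = stem[: 8 - len(tail)] + tail
    ext = ext[:3]
    return f"{alias}.{ext}" if ext else alias
```

      With a one-digit tail the stem keeps exactly six characters, which is the point being argued over in the replies above.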

  4. Story summary by slash-tard · · Score: 4, Funny

    Google uses MS access as a backend to store all of its cache files. It is redundant by having a batch file setup with the windows "at" command to "xcopy" the data to another backup server.

    1. Re:Story summary by radish · · Score: 1

      Heh! Someone should show them how to use robocopy!

      --

      ---- Den ene knappen er powerknapp, den andre er Bender voice knapp "Bite My Shiny Metal Ass"

    2. Re:Story Summary by Queelix · · Score: 1

      Why is the earlier comment about Access/At/Xcopy funny but MySQL/cron/cp is troll worthy? Hmmm...

    3. Re:Story Summary by Anonvmous+Coward · · Score: 1

      I metamodded it as unfair. Damn these peeps have no sense of humor.

  5. PDF mirror by Tyler+Eaves · · Score: 4, Informative

    PDF mirror on my server /Feels sorry for the Rochester cs server

    --
    TODO: Something witty here...
    1. Re:PDF mirror by ibmman85 · · Score: 1

      rochester cs server? im at rit just curious

  6. Interesting... by petermdodge · · Score: 3, Insightful

    It's an interesting enough read; it certainly is interesting to see how one of the biggest-volume servers out there copes. Now, the question is, what can us little server guys do to implement the ideas therein to our server? What can we take from it?

    --


    Peter M. Dodge,
    Chief Executive Officer,
    LiquidFire Studios

    Platinum Linux - www.
    1. Re:Interesting... by Mister+Black · · Score: 1

      Now, the question is, what can us little server guys do to implement the ideas therein to our server? What can we take from it?

      Nothing, or you'll be sued for copyright infringement.

      --

      You are standing in an open field west of a white house, with a boarded front door. There is a small mailbox here.
    2. Re:Interesting... by petermdodge · · Score: 1

      Are we talking SCO or Google.com? - plus, I always thought that you cannot copyright ideas, just patent them. (And thus the EU patent issue is resurrected.. bleh)

      --


      Peter M. Dodge,
      Chief Executive Officer,
      LiquidFire Studios

      Platinum Linux - www.
    3. Re:Interesting... by asscroft · · Score: 1

      NONE cause they're all PATENTED.

      If not they SHOULD BE by some people's logic.

      It's very generous of google to share their ideas without NDAs and patents and all that IP Bullshit.

      Google: Putting the Science back in Computer Science.

      --
      because I have been enjoined by this Holy Office to abandon the false opinion which maintains that the Sun is the centre
    4. Re:Interesting... by LiquidCoooled · · Score: 1

      I believe the grand-parent was referring to learning from those who succeed.

      There is nothing wrong with following and learning from our ancestors.

      Google have given a great deal of thought into their filesystem, and most likely made some huge mistakes along the way. In the end they have a stable workable system that still gives me the shivers occasionally.

      I would see these as guidelines for a further next generation filesystem rather than ripping the code from underneath them and calling it our own.

      --
      liqbase :: faster than paper
  7. Just to make it clear.. by Doodhwala · · Score: 4, Informative


    Okay, so I read this paper as a part of the SOSP reading group here at school. Just want to make it clear that this is not the file system used by the front end that we all see. It is used by internal dev groups as well as the web spiders that they employ. Their unique usage has definitely led to a number of interesting choices (such as the atomic appends) for the file system design. Read the paper for more details :-)

    1. Re:Just to make it clear.. by lurker412 · · Score: 1

      OK, so what does the front end file system look like?

    2. Re:Just to make it clear.. by Klaruz · · Score: 3, Insightful

      Could you cite your source please? In the first page of the paper linked:

      "It is widely deployed within Google for the generation and processing of data used by our service as well as research and development that requires large data sets."

    3. Re:Just to make it clear.. by Doodhwala · · Score: 3, Informative

      And if you read that statement, it does not mention the front-end. Generation and processing all takes place offline as most of the query results are only updated once a month (the Google-dance). And this question was asked of Howard Gobioff (one of the co-authors) at a presentation on the Google File System (GFS) at Carnegie Mellon.

    4. Re:Just to make it clear.. by holstein · · Score: 1
      Initially, GFS was conceived as the backend file system for our production systems. Over time, the usage evolved to include research and development tasks.

      Page 14.

  8. Hmmm. by Pig+Hogger · · Score: 4, Funny

    I'd like to see a beow...
    Never mind.

    1. Re:Hmmm. by user32.ExitWindowsEx · · Score: 1

      I'd like to see a single computer use this!

      (it's a joke..laugh)

      --
      "Evil will always triumph because good is dumb." -- Dark Helmet
  9. Everything's stolen nowdays. by Anonymous Coward · · Score: 2, Funny


    Why the google file system is nothing but a waffle iron with a phone attached.

    1. Re:Everything's stolen nowdays. by marine_recon · · Score: 1

      that may be so, but have you ever seen such a useful waffle iron? i think not mmmmmmm, e-waffles

      --
      Jack the sound barrier. Bring the noise.
  10. Only a file system? by jrrl · · Score: 5, Interesting
    Back in the early days at Lycos, Danner Stodolsky, now at Akamai used so many weird little tricks to make things faster that we used to joke that we'd end up with a custom operating system. The supposed name? LycOS.

    Luckily the world was saved from this possibility.

    -John (now, one of those "why, back in my day..." story telling guys... sigh.)

    --
    Self Serving Sig: Hosting Comparison
    1. Re:Only a file system? by jrrl · · Score: 1

      Gasp! My secret is out!

      --
      Self Serving Sig: Hosting Comparison
    2. Re:Only a file system? by Alethes · · Score: 1

      Luckily the world was saved from this possibility.

      Not Really. :)

    3. Re:Only a file system? by FireBreathingDog · · Score: 3, Interesting
      Nice menu: not alphabetized, and "Use a digital camera" appears twice with two different icons. Then there's the inexplicable and unexplained "scribus" menu item, the only item that is neither a phrase nor capitalized.

      Steve Jobs must be shitting in his pants.

  11. Is it open source? by The+Ancients · · Score: 4, Funny

    I need something for my p...err, book collection.

    1. Re:Is it open source? by daeley · · Score: 2, Funny

      book collection

      Ah, yes. You want a new-fangled "ShelFS" system.

      --
      I watched C-beams glitter in the dark near the Tannhauser gate.
    2. Re:Is it open source? by jawahar · · Score: 1

      You may want to try www.aspseek.com http://mundlapati.editthispage.com

  12. Word processor? by Anonymous Coward · · Score: 2, Interesting

    What word processor/text editor is used to write all of these technical papers? Almost every paper I've seen looks like it's written in the same program.

    1. Re:Word processor? by jrrl · · Score: 1
      Way back when, when I was in academia at CMU, it seemed like most conference papers were done in LaTeX (or straight TeX, for the fearless).

      Nowadays, who knows? Probably Word (shudder).

      -John (managing to not be nostalgic for LaTeX hackery).

      --
      Self Serving Sig: Hosting Comparison
    2. Re:Word processor? by gloth · · Score: 1

      I think it's FrameMaker.

    3. Re:Word processor? by UtucXul · · Score: 1

      It is definitely still Latex in both physics and astronomy. I would hope that CS people get to use it too, but I don't have any experience there. And it is significantly superior to any word processor I've ever seen.

    4. Re:Word processor? by Jeremy+Erwin · · Score: 1

      It looks like LaTeX to me, though the macros aren't the default ones. The tables are very much in LaTeX's style.

    5. Re:Word processor? by Jason+Earl · · Score: 1

      I also was curious to see what software they had used to write the paper. It looked like a LaTeX document to me. Sure enough a quick peek at the document info reveals:

      Title: paper.dvi
      Application: dvips(k) 5.86 Copyright 1999 Radical Eye Software

    6. Re:Word processor? by Saunalainen · · Score: 2, Informative

      The PDF file claims to have been made by dvips, so it was written in Latex. It was then converted to PDF using Distiller.

    7. Re:Word processor? by SamBC · · Score: 1

      It's probably LaTeX, which can be prepared from your favourite text editor, and rendered to print or PDF (or postscript) by entirely open-source software.

      It's very nice.

    8. Re:Word processor? by LinuxHam · · Score: 1

      Exactly. I helped build NYU CompSci's very first web site and spent many days converting the technical paper collection to PS when electronically available and scanned to TIFF when it wasn't.. like for papers dating back to the late 60's.

      There was some cool stuff buried in there.

      --
      Intelligent Life on Earth
    9. Re:Word processor? by Mindjiver · · Score: 1

      It's LaTeX using the ACM template that you can get from here. I have used it myself a couple of times. It's really nice.

      --
      I know not what course others may take; but as for me, give me liberty or give me death!
  13. html version by kaan · · Score: 3, Informative

    thanks to, ehh, Google, here's an html version of the article

    I didn't read the whole article (kinda lengthy) but it seems pretty informative. I found their assumptions interesting, as they reveal some of the essence of what makes Google such a great search tool. Here are a few from the article:

    - The system is built from many inexpensive commodity components that often fail. It must constantly monitor itself and detect, tolerate, and recover promptly from component failures on a routine basis.

    - High sustained bandwidth is more important than low latency. Most of our target applications place a premium on processing data in bulk at a high rate, while few have stringent response time requirements for an individual read or write.

    - The workloads primarily consist of two kinds of reads: large streaming reads and small random reads. Successive operations from the same client often read through a contiguous region of a file.

  14. Various hardware life expectancies? by The+Ancients · · Score: 3, Interesting
    ...and an assumption that hardware components will always fail.

    I think perhaps this is something we could all take a little more seriously. Part of me realises this is a comment on the sheer data being manipulated, but then something else that sprung to mind is the gradual reduction of warranties on HDDs, for example. I wonder what sort of stats an operation of this size could gather on various hardware components, and their varying propensities to wither and die.
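
    To put a rough number on "will always fail", here is a back-of-the-envelope sketch in Python. It assumes devices fail independently at a constant rate, which is a simplification. The 3% annual failure rate and the 10,000-device fleet in the usage note are made-up illustrative figures, not numbers from the paper or from any vendor.

```python
def prob_any_failure(n_devices: int, annual_rate: float, days: float = 1.0) -> float:
    """Probability that at least one of n_devices fails within `days` days.

    Assumes independent devices with a constant failure rate, where
    annual_rate is the chance that a single device fails within one year.
    """
    # Chance that one device survives the whole window.
    p_survive_one = (1.0 - annual_rate) ** (days / 365.0)
    # All devices must survive for there to be no failure at all.
    return 1.0 - p_survive_one ** n_devices


def expected_failures_per_year(n_devices: int, annual_rate: float) -> float:
    """Expected number of device failures per year across the fleet."""
    return n_devices * annual_rate
```

    Calling prob_any_failure(10_000, 0.03) under these assumptions gives better-than-even odds of a failure somewhere in the fleet on any given day, even though a single device at that rate would usually last for years. At that scale, failure is the normal case and not the exception.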

    1. Re:Various hardware life expectancies? by forevermore · · Score: 2, Interesting

      Gradual reduction of hard drive warranties? Didn't Maxtor just bump up the warranty on their drives to 5 years? And WD and Seagate both have 3 year warranties on their drives. Granted, I'm talking about the "good" (SATA, 8 meg cache, etc.) drives, not the cheap ones that most of us users are using rebates to get for really-cheap.

      --
      Do you really need reason for beer? Wingman Brewers
    2. Re:Various hardware life expectancies? by Alizarin+Erythrosin · · Score: 1

      Google probably (most likely) uses SCSI drives anyways, which most often carry a 5 year warranty regardless of the company who makes it. Enterprise users wouldn't settle for less.

      --
      There are only 10 kinds of people in this world... those who understand binary and those who don't
    3. Re:Various hardware life expectancies? by CrystalFalcon · · Score: 1

      Enterprise users wouldn't settle for less.

      No, Enterprise users won't settle for interruptions. It's the IT guy's work to figure out how to make a noninterruptible environment as cheap as possible.

      Such a solution may well involve ultra cheap drives (one-third the cost of reliable ones) in a redundant RAID setup with hotspares, for example.

    4. Re:Various hardware life expectancies? by forevermore · · Score: 1

      Actually, most of our customers are enterprise-level types (Microsoft, MIT, Real), and for the most part they buy ATA/SATA RAID systems. They're much cheaper than SCSI, and we've actually had a lot of trouble with the U320 hardware (both drives and controllers) failing. Longer warranties don't do any good if you have to keep sending the hardware in for repair/replacement. Better to buy 2-3 SATA drives for the cost of a SCSI drive and have them around for instant swapping if/when something breaks.

      --
      Do you really need reason for beer? Wingman Brewers
  15. Interactive demo by javaaddikt · · Score: 1, Funny

    Check out the interactive demo of how GFS works.

  16. Google FS? by Dark+Lord+Seth · · Score: 1, Funny

    What's next? GoogleOS? Google Electronics? Google Nuclear Power Plant? Google Search Engine? Oh wait...

    1. Re:Google FS? by Anonymous Coward · · Score: 1, Funny

      Google-Linux :)

      installed with the new extgoogle filesystem...

      using xfreegoogle for frame buffering, browse the internet with mozgooglla, and do your work with googleoffice...

    2. Re:Google FS? by Shadwell · · Score: 1

      Shamelessly stolen from imdb.com and Mel Brooks:

      Yogurt: Merchandising, merchandising, where the real money from the engine is made. Google-the T-shirt, Google-the Coloring Book, Google-the Lunch box, Google-the Breakfast Cereal, Google-the Flame Thrower.

      [turns it on]

      The Dinks: Ooooh!

      Yogurt: (reacts to dinks) The kids love this one. (A dink hands him a doll that looks like Yogurt) And last but not least, Google the doll, me.

    3. Re:Google FS? by Captain+Large+Face · · Score: 1

      Duh! Everyone knows that the common benchmark for application growth is a web browser!

  17. Re:great. now, deal with the spam issue by winkydink · · Score: 4, Funny
    how many times have you searched for something on google, only to find that the search engine spammers have taken over almost every top 10 result?

    Ummm... not very many. Then again, I try not to search on "teen panties" very often. :)

    That reminds me of the winter I spent in Chicago. I needed some galoshes to protect my shoes and keep my feet dry. Back in New England, we called them "rubbers" (I am not making this up). Needless to say, a google search on "buy rubbers" did not yield the intended results.

    --

    "I'd rather be a lightning rod than a seismometer." -Ken Kesey

  18. Everyone still uses Latex in university. by Anonymous Coward · · Score: 2, Funny

    Just for covering their penis, not reading papers.

  19. Re:real "Google"? by Anonymous Coward · · Score: 1, Funny

    No, but we can mod you down.

  20. Fabulous Insights by dolo666 · · Score: 4, Informative

    I really enjoyed that read about the file system Google uses. The fact that they usually append to their files is of special note. By appending data you only need to know a simple pointer address. Seems quick enough. Add a bunch of threaded concurrent writes and you could get into trouble on other systems... The "atomic append" seems interesting because of the use of multiple machines to append simultaneously (hazard free).
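
    A toy illustration of the append idea, in Python. As the paper describes record append, the client hands over only the data, and the system picks the offset and reports it back. The sketch below is a single-process stand-in under loud assumptions: ToyChunkFile and run_writers are hypothetical names, a lock plays the role of whatever serializes appends, and there is no replication, retry, or duplicate handling. It is not how GFS is implemented.

```python
import threading


class ToyChunkFile:
    """In-memory file made of fixed-size chunks (hypothetical, not the GFS API)."""

    def __init__(self, chunk_size: int = 64):
        self.chunk_size = chunk_size    # toy size in bytes
        self.data = bytearray()
        self._lock = threading.Lock()   # serializes appends, one at a time

    def record_append(self, record: bytes) -> int:
        """Append record atomically and return the offset that was chosen."""
        if len(record) > self.chunk_size:
            raise ValueError("record larger than a chunk")
        with self._lock:
            used = len(self.data) % self.chunk_size
            if used + len(record) > self.chunk_size:
                # Record would straddle a chunk boundary: pad out the
                # current chunk and start the record in the next one.
                self.data.extend(b"\0" * (self.chunk_size - used))
            offset = len(self.data)
            self.data.extend(record)
            return offset


def run_writers(f: ToyChunkFile, n_writers: int = 4, per_writer: int = 5):
    """Start several threads that append concurrently; return (offset, record) pairs."""
    results = []

    def writer(i: int) -> None:
        for n in range(per_writer):
            rec = f"w{i}r{n};".encode()
            results.append((f.record_append(rec), rec))

    threads = [threading.Thread(target=writer, args=(i,)) for i in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

    The writers never agree on offsets among themselves. Each one learns where its record landed from the return value, which is what lets many producers share one file without a separate locking protocol.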

    64meg chunk size is pretty huge, but I'm guessing that's blocked out based on continual threads of data, not typical files.

    At first glance, this file system seems fairly wasteful. But hey, Google likely require speed and reliability over cost. Right?

    This reminds me of the discussions about not-so-far-off database filesystems coming to an OS near you.

    1. Re:Fabulous Insights by harmonica · · Score: 1

      64meg chunk size is pretty huge, but I'm guessing that's blocked out based on continual threads of data, not typical files.

      64 MB is the maximum chunk size. The assumptions section at the beginning talks about typical read/write operations working on about 1 MB.
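
      To make the chunk arithmetic concrete: with a fixed 64 MB chunk size, a byte offset in a file maps to a chunk index and an offset inside that chunk by plain integer division. The Python sketch below assumes that fixed size. The helper names chunk_location and chunks_touched are made up for illustration and are not taken from the paper.

```python
CHUNK_SIZE = 64 * 2**20  # 64 MB, the chunk size discussed above


def chunk_location(byte_offset: int, chunk_size: int = CHUNK_SIZE):
    """Map a file byte offset to (chunk index, offset within that chunk)."""
    if byte_offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(byte_offset, chunk_size)


def chunks_touched(byte_offset: int, length: int, chunk_size: int = CHUNK_SIZE):
    """Return the range of chunk indices that a read of `length` bytes covers."""
    if length <= 0:
        return range(0)
    first = byte_offset // chunk_size
    last = (byte_offset + length - 1) // chunk_size
    return range(first, last + 1)
```

      A 1 MB read that starts 10 MB into a file stays inside chunk 0, so most operations of that size only ever touch a single chunk.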

    2. Re:Fabulous Insights by etnoy · · Score: 1

      Well, they seem to use old and recycled hardware as a big cluster, so the cost is very low

      --
      Quantum hacker.
  21. When will it be in the kernel? by caluml · · Score: 3, Funny
    I hope they're going to release it to us mere mortals. I mean, they're probably the only people that need millions of gigabyte+ files floating around thousands of machines, but it would be nice to see

    [ ] Google File System.

    in the kernel config.

    Must be 12pm - the updatedb script is running.

    1. Re:When will it be in the kernel? by Jellybob · · Score: 1
      Must be 12pm - the updatedb script is running.

      Someday I'll set that to a time when I won't be sat at my computer developing.

      Maybe 11am.
    2. Re:When will it be in the kernel? by MikeFM · · Score: 1

      I set mine to run once a week. I think on Sunday afternoons. I also limit it to things I'm likely to want to locate and won't know where to look.. so it ignores my huge disks full of ripped movies, porn, mp3's, etc.

      --
      At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
    3. Re:When will it be in the kernel? by mOdQuArK! · · Score: 1
      I mean, they're probably the only people that need millions of gigabyte+ files floating around thousands of machines

      Actually, since it's designed for lots of hardware which is expected to die regularly, I wonder if any of the technology could be applied to P2P networks?

  22. And starting with Linux 2.7... by JessLeah · · Score: 4, Funny

    ...the Linux kernel will have googlefs support. It will be marked (EXPERIMENTAL), though, and will only run on 10,000-node Babelfish clusters...

    1. Re:And starting with Linux 2.7... by Anonymous Coward · · Score: 3, Interesting

      Actually this sounds exactly like the sort of file system that would be useful in a render farm.

      How long before ILM or Weta has a GFS disk array?

    2. Re:And starting with Linux 2.7... by __past__ · · Score: 1
      10,000-node Babelfish clusters
      Is that the successor of the Shakespeare cluster system based on an infinite number of monkeys with typewriters?
  23. Google's many, MANY sister sites... by Valthonis · · Score: 1

    ...like google.co.jp, google.ca, etc. will fill up pages of hits on a search for Google long before slashdot even makes an appearance. But it is a nice thought.

    --
    "Life in every breath... that is bushido"
  24. they published it ... by trick-knee · · Score: 5, Interesting

    ... which may not have happened from just any company of google's prominence. I mean, they have highly successful business and technical infrastructure models and they didn't HAVE to share it with anyone.

    I wonder what they believe will protect their business from poaching of these ideas?

    1. Re:they published it ... by MoobY · · Score: 1

      I wonder what they believe will protect their business from poaching of these ideas?

      It's called "creating prior art" without patenting the stuff. That's good. It's not evil. It's the google folks.

      --
      --- Sigmentation Fault - Comments Dumped
    2. Re:they published it ... by phch · · Score: 1

      Perhaps they've filed for patent, as they've done before.

    3. Re:they published it ... by baneblackblade · · Score: 1

      and I'm sure it was very cheap, so just about anybody could use their ideas in a small or a large business environment!

    4. Re:they published it ... by Brad+Mace · · Score: 1
      I wonder what they believe will protect their business from poaching of these ideas?

      Copyright law perhaps?

      It's not clear if their filesystem would be GPL'd by default or not. It'd be cool if they released it, but it would be understandable if they kept it to sell to corporations and governments. They provide a very valuable service for free, so if they can make some money licensing their filesystem, good for them. Perhaps they'd make it free for geeks to play with on their own 3-box 'clusters' while charging the people who actually have a use for it.

    5. Re:they published it ... by Anonymous Coward · · Score: 2, Insightful

      The catch up Law.

      Basically it says that if you spend all your time playing catch up you'll never be first.

      If the other Search engines use the GoogleFS then you know they aren't the leader. Sort of like if kernel.org was running windows 2003 or if www.msn.com was running on linux.

      Now if they go and create a FS so they can be the same as google then they are just catching up. Once they catch up to Google, Google will be somewhere else.

      The other thing is there are lots of Clustered file systems around so it's not like they have the only one. They've just optimised one for their needs.

      Basically if the other companies copy the idea it would take them at least a year to get it working; by then the Google FS will have more features or they may have another bottleneck eg Google NUMA or the like.

    6. Re:they published it ... by hankaholic · · Score: 4, Insightful

      I wonder what they believe will protect their business from poaching of these ideas?

      Perhaps the fact that it's taken many very smart people a good amount of time to implement and tune the original design, even after having come up with the basic layout?

      Go take a look at the ReiserFS Future Vision page -- you'll see some more interesting discussion of filesystem design, and overall direction. There are a few solid developers working full-time on the concepts discussed in the Reiser docs, and they still have enough work to keep them busy for years to come.

      Google releasing information regarding the structure of their systems is a bit like John Carmack discussing the structure of his graphics engines: there's a hell of a distance between a conceptual description and a fine-tuned, tested, working implementation.

      Given Google's history, I'd also imagine that they're on the lookout for up-and-coming young researchers. As such, if some grad student takes their work and extends it, they can certainly benefit.

      --
      Somebody get that guy an ambulance!
    7. Re:they published it ... by BobTheLawyer · · Score: 1

      ideas aren't covered by copyright, only expression of ideas. So if you copied their paper you would be breaching google's copyright, but if you created a filesystem using the ideas in the paper you would not.

    8. Re:they published it ... by canadiangoose · · Score: 1

      There could be an initial advantage to this 'catch up' thing. If someone takes this filesystem and improves upon it before implementing it, they have the advantage of implementing it as a clean installation, instead of having to worry about upgrading an existing infrastructure.

      --
      Never eat more than you can lift -- Miss Piggy
  25. RAIC?? by More+Karma+Than+God · · Score: 3, Interesting

    Could we call Google a Redundant Array of Inexpensive Computers?

    What else can it be programmed to do? Could this become the basis for a personal computer where you just add computers seamlessly when you need more power?

    --
    Go here to create your own Slashdot dis
    1. Re:RAIC?? by mindriot · · Score: 1

      Wouldn't that be called Grid?

  26. Re:google groups mostly down all day by Threni · · Score: 1

    "They could use a more robust file system then. It seems like postings within the past 48 have headers, but google dies when accessing the body."

    Sure! Also, some of the counts of messages per thread are optimistic. I guess they've been told 1000 times already..or maybe I should mail them about it too?

  27. Re:great. now, deal with the spam issue by skelley · · Score: 1

    And www.manpages.com is NOT an online resource to get *nix man pages.

    http://www.bash.org/?137303

  28. In case you don't like links at all by Bingo+Foo · · Score: 2, Funny

    In case you don't like reading stories and links before posting, remember this is Slashdot.

    --
    taken! (by Davidleeroth) Thanks Bingo Foo!
  29. Google cache by Skreech · · Score: 5, Funny

    In case Google gets slashdotted, here is the Google cache for Google.

    1. Re:Google cache by trick-knee · · Score: 1

      or here

      funny, I think google is getting /.'d after all: they've started using alternative servers. note that 216.239.41.104 kinda looks like google.com, but not exactly. (is there a google.com in, say, Farsi?)

    2. Re:Google cache by CableModemSniper · · Score: 3, Funny

      Best part is the disclaimer at the top:

      Google is not affiliated with the authors of this page nor responsible for its content.

      --
      Why not fork?
    3. Re: Google Cache by liam193 · · Score: 1

      Just in case the Google Cache gets slashdotted, here's a Yahoo cache of Google.

      http://216.109.117.135/search/cache?p=google&ei=UTF-8&url=zhool8dxBV4J:www.google.com/

    4. Re:Google cache by sburnett · · Score: 1

      Hmm....can I get a Google cache of the Google cache of the...

      Never mind.

  30. Re:great. now, deal with the spam issue by hondo77 · · Score: 1

    how many times have you searched for something on google, only to find that the search engine spammers have taken over almost every top 10 result?

    Err, never. Even searches for porn images are still pretty useful (as useful as porn images are, I guess). Dozens of non-porn searches a day and always useful.

    --
    I live ze unknown. I love ze unknown. I am ze unknown.
  31. Re:Thank God by X · · Score: 1

    This is also a pretty strong indication of just how noteworthy this article is. This kind of stuff has been done time and again. Things like OceanStore are far more innovative. But of course that stuff isn't from Google, which is what makes this article noteworthy. ;-)

    Tomorrow's slashdot headline: Google proves definitively that 1 + 1 = 2

    --
    sigs are a waste of space
  32. RAID by _ph1ux_ · · Score: 1

    No, it would still be RAID, although the D would denote "Devices"... unless they had a purchasing contract with Dell...

  33. GFS and GWS? by cpopin · · Score: 2, Funny

    They designed their own file system as well as Web server? Did they design their own receptionists? If so, I want to work there!

    --
    -=- Many seek good nights and lose good days.
  34. Prevayler anyone? by 12357bd · · Score: 2, Informative

    The in-memory master behaviour described in the paper closely resembles the Prevayler software.
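
    The resemblance is to the "prevalence" pattern: keep all state in memory, log each change before applying it, and rebuild after a crash by replaying the log. A minimal sketch of that pattern follows, assuming a made-up `ToyPrevalentStore` class. It is neither Prevayler's API (Prevayler is a Java library) nor the GFS master's code, and it leaves out the snapshots/checkpoints that both use to keep replay short.

```python
import json


class ToyPrevalentStore:
    """Hypothetical example: all state in a dict, every change logged first."""

    def __init__(self, log_lines=()):
        self.state = {}
        self.log = []
        # Recovery: replay an existing log to rebuild the in-memory state.
        for line in log_lines:
            self._apply(json.loads(line))
            self.log.append(line)

    def _apply(self, op):
        if op["kind"] == "set":
            self.state[op["key"]] = op["value"]
        elif op["kind"] == "delete":
            self.state.pop(op["key"], None)

    def execute(self, op):
        # A real system would flush the log record to disk before applying it.
        self.log.append(json.dumps(op))
        self._apply(op)


store = ToyPrevalentStore()
store.execute({"kind": "set", "key": "/a", "value": [1, 2]})
store.execute({"kind": "set", "key": "/b", "value": [3]})
store.execute({"kind": "delete", "key": "/a"})

# Simulate a restart: a fresh store rebuilt purely from the log.
recovered = ToyPrevalentStore(store.log)
```

    After the simulated restart, `recovered.state` matches `store.state`, which is the whole point of the pattern: the log, not the memory image, is the durable record.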

    --
    What's in a sig?
  35. GooFS? by hajejan · · Score: 2, Funny

    Yeah, that'll definitely sell.

    --
    The Mini Repository - more links
  36. Re:Thank God by kiltedtaco · · Score: 1

    Yes it's true,

    1+1

  37. well... by pr0ntab · · Score: 1

    it's not really a clustered filesystem. It's sort of like uber-intelligent iSCSI.

    A "real" GFS has multiple masters, as far as I'm concerned. This is a very specific app tied to a specific need for Google's web collection system.

    So I think you're okay, even so. :-)

    Also, the article was published before Sept. 17 (earliest commentary I saw), so this is moot.

    But anyway, kids, listen to him, don't procrastinate! And if you do, make sure you have adequate forged documentation on your 17 grandparents' gruesome deaths.

    --
    Fuck Beta. Fuck Dice
  38. PC #1782563 by can56 · · Score: 2, Interesting

    See Verity Stob's article, "Cold Comfort Server Farm", in the August 2003 edition of Dr. Dobb's Journal, for the sad truth about Google's server farm. Sniff ;-(

  39. Release it to Yahoo! by cpopin · · Score: 1

    Yeah, they're going to get right on that... probably release it right to Yahoo!, who is going to try to even think about taking on Google. I wonder if they've patented GFS?

    --
    -=- Many seek good nights and lose good days.
  40. let them know by Therlin · · Score: 1

    I've come across that situation a couple of times. They have an address for that type of complaint. I let them know both times, and a human got back to me within 48 hours and said that they would look at the issue. Sure enough, a week later it was taken care of.

  41. Re:great. now, deal with the spam issue by fermion · · Score: 1
    I don't want to start a funny rubber thread, but I have one from a friend of mine.

    I think he was in college, we are in the U.S., and it must have been soon after the war, late '40s or early '50s. Anyway, he is sitting in class one day taking a test, and this British bloke sitting behind him leans forward and whispers in his ear,
    "Can you spare a rubber?"

    It took my friend several seconds to understand what he meant.

    --
    "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
  42. Damn... by kashmirzoso · · Score: 1

    ..those Google guys are very, very smart....

  43. Chunkservers... by HotNeedleOfInquiry · · Score: 1

    and chunkhandles. I love it. Great read.

    --
    "Eve of Destruction", it's not just for old hippies anymore...
  44. user-mode? PVFS? by penguin7of9 · · Score: 1

    I can't quite tell from a quick reading of the paper, but this seems to be a user-mode file system. That is, if you call the regular POSIX "open" call, you probably can't open a file in the GoogleFS. It appears that some library code linked directly into the application handles all file system operations. A number of distributed file systems take that approach--it can be more efficient.

    I wonder how it compares to PVFS. It seems like GoogleFS deals more aggressively with component failure. Any ideas?
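
    The library-linked approach guessed at above can be sketched roughly as follows. This is a toy, in-memory illustration and not Google's code: the names `ToyMaster`, `ToyChunkserver` and `ToyClient`, and all their methods, are hypothetical. The only detail taken from the paper is the idea of fixed-size chunks (64 MB there, shrunk to 8 bytes here so the example is easy to follow). The application calls the library's `read`, and the POSIX "open" call never comes into it.

```python
CHUNK_SIZE = 8  # bytes; the paper uses 64 MB chunks, shrunk here for the demo


class ToyChunkserver:
    """Hypothetical server holding chunk contents keyed by chunk handle."""

    def __init__(self):
        self.chunks = {}

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]


class ToyMaster:
    """Hypothetical master mapping (filename, chunk index) to a location."""

    def __init__(self):
        self.table = {}
        self.next_handle = 0

    def create(self, name, data, servers):
        # Split the data into fixed-size chunks and spread them over servers.
        for index, start in enumerate(range(0, len(data), CHUNK_SIZE)):
            handle = self.next_handle
            self.next_handle += 1
            server = servers[index % len(servers)]
            server.chunks[handle] = data[start:start + CHUNK_SIZE]
            self.table[(name, index)] = (handle, server)

    def lookup(self, name, index):
        # Returns None when the file has no chunk at that index.
        return self.table.get((name, index))


class ToyClient:
    """Library code linked into the application; no kernel file system."""

    def __init__(self, master):
        self.master = master

    def read(self, name, offset, length):
        out = bytearray()
        while length > 0:
            index, within = divmod(offset, CHUNK_SIZE)
            entry = self.master.lookup(name, index)
            if entry is None:
                break  # past the last chunk
            handle, server = entry
            piece = server.read(handle, within, min(length, CHUNK_SIZE - within))
            if not piece:
                break  # past the end of a short final chunk
            out += piece
            offset += len(piece)
            length -= len(piece)
        return bytes(out)


master = ToyMaster()
servers = [ToyChunkserver(), ToyChunkserver()]
master.create("/crawl/log", b"abcdefghijklmnopqrst", servers)
client = ToyClient(master)
```

    With this setup, `client.read("/crawl/log", 6, 6)` crosses a chunk boundary: the client asks the master where each chunk lives and then fetches the bytes from the chunkservers itself.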

  45. Apples vs. oranges by SoupIsGoodFood_42 · · Score: 1

    Rendering doesn't need super-fast storage. It may need lots of storage for the whole movie, but the render farms spend far more time rendering than they do outputting data.

    1. Re:Apples vs. oranges by SoupIsGoodFood_42 · · Score: 1

      Not sure, but I do know that Lemons run Windows 98.

  46. They have no reason for worry by ttyp0 · · Score: 1
    It's apparent that Google employs by far the best programmers in the world. Google has published numerous white papers detailing their infrastructure and technology. By the time a competitor has had time to implement them, Google would already be far ahead with new innovations.

    Show your hate for SCO. Get a cool t-shirt and donate to the Open Source Now Fund.

  47. LaTex is not a word processor by maxmg · · Score: 3, Informative

    It's more of a "text compiler" where you concentrate on writing the content and leave all of the formatting to a template that is responsible for transforming the content into (normally PostScript) output. Anybody who has worked with LaTeX and then moved to Word, only to have that stupid piece of sh*t bunch all images in a document together, on top of each other, on the first or last page of their document will appreciate the LaTeX workflow. And LaTeX absolutely rocks when it comes to formulas.

    That being said, LaTeX comes with a significant learning curve and, due to its nature, misses some of the features that are important in a business environment (most notably changes tracking). There are some pseudo-WYSIWYG frontends for LaTeX, such as LyX, but they are firmly targeted at an academic audience. Most scientific papers require submissions in .ps format, processed with a specified LaTeX template (at least they did when I was at uni).

    --
    I asked for a refund - and got my monkey back.
    1. Re:LaTex is not a word processor by hanssprudel · · Score: 1

      That being said, LaTeX comes with a significant learning curve and, due to its nature, misses some of the features that are important in a business environment (most notably changes tracking).

      For changes tracking, why not just use cvs?

    2. Re:LaTex is not a word processor by v_1matst · · Score: 1

      LaTeX is professional typesetting software. People don't use LaTeX and then "move to Word"; they are entirely separate things. It's like comparing apples and Volkswagens.

      Also, what makes everyone so sure that this is LaTeX? TeX (which is -NOT- the same as LaTeX) can also produce DVI output and is much more robust than LaTeX. I know I may be splitting hairs, however we (American Mathematical Society) use TeX heavily (actually we contribute quite a bit to the development of TeX) and I wanted a distinction to be made clear.

    3. Re:LaTex is not a word processor by t · · Score: 1
      That being said, LaTeX comes with a significant learning curve and, due to its nature, misses some of the features that are important in a business environment (most notably changes tracking).
      Yes, just like programming in C (or any other programming language) has a significant learning curve. Thus you should stick to ... software Legos? That's quite the odd logic you have there. But mostly I wanted to ask you what exactly is the nature of LaTeX?
  48. Re:great. now, deal with the spam issue by MikeFM · · Score: 1

    Most of the people I see having trouble searching just don't know how to search for things properly. My parents are a prime example. They know how to get to Google but not how to pick the combination of keywords most likely to return the result they're looking for. I wish I could think of a way to put into code my mental process for doing this... if I could, then maybe Google would hire me. :)

    The other major problem is that many webpages aren't made to be easy to locate. At times they don't even include the subject of the page anywhere inside the page's contents. This doesn't exactly make it easier for searchers to find your site especially when you take into account the spam peckers that are including your search terms in their totally unrelated pages just to sneak hits.

    --
    At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
  49. Re:great. now, deal with the spam issue by tconnors · · Score: 1
    how many times have you searched for something on google, only to find that the search engine spammers have taken over almost every top 10 result?

    Ummm... not very many. Then again, I try not to search on "teen panties" very often. :)


    Hmmm, searching for help on LaTeX can sometimes be... distracting.
  50. Already using it by pfifltrigg · · Score: 1

    Since when is googlefs new? I've been using it for ages on my FreeBSD box.

    $ mount
    /dev/ad0s1a on / (googlefs, local)
    devfs on /dev (devfs, local)
    /dev/ad0s1e on /tmp (googlefs, local)
    /dev/ad0s1f on /usr (googlefs, local)
    /dev/ad0s1d on /var (googlefs, local)
    $

  51. This sounds like a GPL violation to me! by CypherDeaz · · Score: 1, Interesting

    Under the true GPL licence, wouldn't such deep modifications to GNU Linux be a GPL violation?

    1. Re:This sounds like a GPL violation to me! by ZigMonty · · Score: 1

      No, they don't have to provide source code unless they distribute binaries. Even if they did, they'd only have to provide source to those they distribute to.

  52. Is there still a Google dance? by harmonica · · Score: 2, Interesting

    I thought the Google dance was history, and the index is now being updated more continuously (how exactly, I don't know)?

  53. Ironically Google has been down all day... by quinkin · · Score: 1
    Well ok, at least from Oz... and it seemed to be a backbone routing issue (Sydney Telstra Reach.com)... but don't ruin my fun with logic and facts! :)

    Q.

    --
    Insert Signature Here
  54. A couple of observations by mt-biker · · Score: 1

    First, the obvious one. This is not for use at home! It's a highly specialised filesystem which, even distributed over several machines, will perform badly for "normal" use.

    At first I was asking myself why Google needed their own filesystem, rather than using one of the many filesystems already available. Actually, I'm still not convinced that another commercial filesystem couldn't do what they need (SGI's CXFS will be available for Linux soon, won't it? True, it's not big on fault tolerance...), but still it's clear that Google's needs are pretty special.

    Also, at which point does the master become a bottleneck? I'm sure they've spec'd it properly, but I'm still curious...
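
    One part of the bottleneck question can be put in numbers. The paper states that the master keeps less than 64 bytes of metadata for each 64 MB chunk; the sketch below just multiplies that out. The function name and the 1 PiB example are my own, each chunk is counted once regardless of replication, and this covers only chunk metadata held in memory. It says nothing about file namespace data or how many requests per second a single master can serve.

```python
CHUNK_BYTES = 64 * 2**20   # 64 MB chunk size, from the paper
META_PER_CHUNK = 64        # upper bound in bytes per chunk, from the paper


def master_chunk_metadata_bytes(stored_bytes):
    """Upper bound on master memory used for chunk metadata (hypothetical helper)."""
    chunks = -(-stored_bytes // CHUNK_BYTES)  # ceiling division
    return chunks * META_PER_CHUNK


one_pib = 2**50
print(master_chunk_metadata_bytes(one_pib) / 2**30, "GiB")  # prints: 1.0 GiB
```

    So on those figures a petabyte of file data costs the master at most about a gigabyte of RAM for chunk metadata, which suggests memory is unlikely to be the first limit hit.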

  55. great job by tshuma · · Score: 1

    Well, I believe that many things can be done better and faster, because there is always a faster way, but just look at the results! Google is fast enough for me! I use it all the time, and they did a great job! Thanks for it!

    --
    There is only one good solution: The simpliest!
  56. Failure by jetmarc · · Score: 1

    Interesting.. Just yesterday the google groups database suffered failures. A lot of threads appeared in the search results, but couldn't be browsed.

  57. Google Groups is teh fux0red by dosius · · Score: 1

    I'm having trouble reading and replying to newsgroups since Google isn't showing comp.sys.apple2, comp.os.cpm and comp.emulators.apple2, and is very spotty with alt.fan.sailor-moon. (Sometimes I have been able to access these groups. YMMV.)

    My ISP unfortunately isn't giving me netnews, so I'm trying to find a solution, and I have not found one.

    This sucks, because I use comp.emulators.apple2 as a help forum for EMU][, among other things.

    -uso.

    --
    What you hear in the ear, preach from the rooftop Matthew 10.27b
  58. Re:great. now, deal with the spam issue by Robmonster · · Score: 1

    Then again, I try not to search on "teen panties" very often. :)

    Nor do I. I know exactly where to find them....

    --
    I have no sig yet I must scream.
  59. People, people by Epistax · · Score: 2, Funny

    The question really on all our minds is can you play doom on it?

  60. Re:great. now, deal with the spam issue by Alizarin+Erythrosin · · Score: 1

    There's an Indian (not Native American, from India) guy here at work who one day asked a coworker if he could "bum a fag"... I don't know if the guy ever figured out he was asking to bum a cigarette or not... I was laughing so hard.

    --
    There are only 10 kinds of people in this world... those who understand binary and those who don't
  61. What a waste.... by abramsh · · Score: 2, Insightful

    Should have just bought one of these: SGI SAN 3000. It would be easier and cheaper to manage, it scales better, and you wouldn't have to spend the money to create and maintain the file system.

  62. What's next after Google? From the Valley... by Seldon+Wells · · Score: 1

    http://groups.yahoo.com/group/SandHillEC/