Slashdot Mirror


How does Google do it?

Doc Tagle writes "With Google reportedly on the verge of going public, more and more people want to know what makes Google tick. The Observer serves up the answers to our questions."

261 comments

  1. Google Problems by williamstephens007 · · Score: 0, Funny

    If Google had chosen to go with a superior platform, they probably would have been able to go pubic already. When I meet with them I will recommend IIS 6.0, ASP.NET on Windows Server 2003.

    --
    William Stephens
    MCSE,MCDST,Well Respected VBScripting Guru
    williams007@yahoo.com,(212)275-4831
    1. Re:Google Problems by Motherfucking+Shit · · Score: 2, Funny
      If Google had chosen to go with a superior platform, they probably would have been able to go pubic already.
      Well, I suppose "Micro soft" isn't the superior platform for anyone's pubic ventures...
      --
      "BSD: Free as in speech. Linux: Free as in beer. Windows 10: Free as in herpes." --Man On Pink Corner in #52607549.
    2. Re:Google Problems by grommit · · Score: 1, Funny

      Yes, and they will smile, pat you on your head and let you know that they'll consider your suggestion while forgetting your name, face and anything about you at the same time.

    3. Re:Google Problems by Anonymous Coward · · Score: 0

      For many hackers, google forgetting your name is a scary, scary concept :)

    4. Re:Google Problems by Anonymous Coward · · Score: 0

      Well Respected VBScripting Guru... that's got to be an oxymoron, surely?

  2. Openness is the first casualty of going public?! by Paul+Townend · · Score: 4, Insightful

    If truth is the first casualty of war, openness is the first casualty of going public

    OK - I can (perhaps) see this as being the case prior to an IPO, but that statement can't be true after it has happened...

    I mean....surely once they've gone public, they'll be obliged to detail and list the sort of information that the article postulates about? The shareholders would be entitled to know how many servers google has, what their specifications are, and what their current commercial strategy is.....surely?!

  3. Google is faltering by Anonymous Coward · · Score: 3, Interesting

    Google has been at 4.285 billion pages for more than three months straight. The count hasn't increased in a long time... The index is maxed.

    Google has recently removed tens of thousands of "duplicate content" sites from its index - where "duplicate content" is as simple as being an affiliate site (e.g. Amazon) and having the same textual item descriptions as many other sites.

    Google is now in the process of dropping millions of link records from its index, presumably to make room for more pages.

    Google is wavering.

    Gmail is a distraction, a venture into some other space to keep people from noticing that their search product is degrading.

    May she last as long as possible...

    1. Re:Google is faltering by shachart · · Score: 1

      Google has been at 4.285 billion pages for more than three months straight. The count hasn't increased in a long time... The index is maxed.

      Either that, or they're indexing more pages than they are letting on. Don't forget they also have 10K servers for around a year.

      --
      Those who can, do. Those who can't, consult.
    2. Re:Google is faltering by jabbadabbadoo · · Score: 5, Insightful

      "Google has been at 4.285 billion pages for more than three months straight. The count hasn't increased in a long time... The index is maxed."
      Hmm... are they using a 32-bit integer to keep the page count?
      2^32 = 4.294 billion, pretty close to 4.285 billion pages.
      Newbies...
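The arithmetic behind that comparison can be checked with a minimal sketch. It assumes, as this thread speculates and nothing confirms, that the page count lives in an unsigned 32-bit integer; the names `UINT32_LIMIT`, `REPORTED_PAGES`, `headroom` and `fraction_used` are made up for illustration.

```python
# Assumption: an unsigned 32-bit page counter. This is the thread's guess,
# not a confirmed detail of how Google stores its index.
UINT32_LIMIT = 2 ** 32            # 4,294,967,296 distinct values
REPORTED_PAGES = 4_285_000_000    # "4.285 billion" from the parent post


def headroom(count: int, limit: int = UINT32_LIMIT) -> int:
    """Return how many more pages fit before the counter is exhausted."""
    return limit - count


def fraction_used(count: int, limit: int = UINT32_LIMIT) -> float:
    """Return the share of the counter's range already consumed."""
    return count / limit


if __name__ == "__main__":
    print(f"limit:    {UINT32_LIMIT:,}")
    print(f"headroom: {headroom(REPORTED_PAGES):,}")
    print(f"used:     {fraction_used(REPORTED_PAGES):.2%}")
```

Under that assumption the reported figure sits within roughly ten million pages of the ceiling, which is why the coincidence caught people's attention.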

    3. Re:Google is faltering by Anonymous Coward · · Score: 0

      Yeah, those hundreds of PhDs they have working there will *never* figure that out. I hear they started with a 16 bit signed integer for their primary key and only after months of hard work upgraded it to 32 bit. Time to close down shop, it's impossible to fix.

    4. Re:Google is faltering by Dayflowers · · Score: 1, Interesting

      Precisely what I thought. But the question now is: did they miscalculate the number of pages they'd end up indexing? Or was it something they accepted for performance reasons? And are they perhaps now in the process of upgrading?

      --
      I am a speak english. Do you not? - Saroto
    5. Re:Google is faltering by Waffle+Iron · · Score: 3, Informative
      Yeah, those hundreds of PhDs they have working there will *never* figure that out. I hear they started with a 16 bit signed integer for their primary key and only after months of hard work upgraded it to 32 bit. Time to close down shop, it's impossible to fix.

      Actually, they already have the fix implemented, and it's currently in the process of being rolled out. The upgraded system makes use of a split primary key which is composed of a "selector" subkey and a "segment" subkey. The selector key is shifted left by four bits and then arithmetically added to the segment key. This clever scheme expands the index by a factor of 16; Google will soon be able to host over 64 billion pages!

    6. Re:Google is faltering by Decameron81 · · Score: 4, Funny

      I bet you wouldn't know you need more than an unsigned 32 bit integer before you hit it.

      On a side note I would really like to know which one is page number 1.

      Diego Rey

      --
      diegoT
    7. Re:Google is faltering by Peridriga · · Score: 1

      Wouldn't it be fair to say that the index probably isn't maxed out because the number of actual websites indexed wouldn't be the limiting factor?

      Wouldn't it be the keyword index?

      Imagine the Slashdot front page. How many 'keywords' or 'whatevers' would you have to categorize and organize to maintain a searchable structure? 100? 1000?

      So if you have 100 rows in a DB relating to 1 page then the DB would have been maxed out a factor of 10^2 ago instead of now...

      Right?

    8. Re:Google is faltering by DrunkenTerror · · Score: 1
      On a side note I would really like to know which one is page number 1.
      Why, www.google.com, of course!
    9. Re:Google is faltering by zcat_NZ · · Score: 2, Funny

      "64 billion should be enough for anybody" .. ?

      --
      455fe10422ca29c4933f95052b792ab2
    10. Re:Google is faltering by reanjr · · Score: 1

      That number is extremely close (within 1%) to the limit of a 32-bit integer. Given that Google almost assuredly runs their servers on 32-bit processors, it would make sense that the number caps out there. The error is probably due to extra records that Google keeps for whatever reason (possibly indexes marked for deletion). Or it could, in some wacky fashion, have to do with the metric powers of ten and computer powers of two^10 difference.

    11. Re:Google is faltering by orthogonal · · Score: 5, Informative
      Actually, they already have the fix implemented, and it's currently in the process of being rolled out. The upgraded system makes use of a split primary key which is composed of a "selector" subkey and a "segment" subkey. The selector key is shifted left by four bits and then arithmetically added to the segment key. This clever scheme expands the index by a factor of 16; Google will soon be able to host over 64 billion pages!

      Ah, youthful mod!

      You've been (humorously) trolled. I suggest posting in this thread to remove your "+1 Informative", or getting a friend to mod it "Funny".

      What the parent is describing is not what Google will do, but what DOS did: the above scheme is how MS-DOS managed memory, except that the "selector" and "offset" were both 16-bit numbers under DOS. (Although "segment" was the more usual term for "selector".) The segment number was shifted left four places -- or put more simply but less graphically, multiplied by 16 -- and then added to the offset number, to give the whole or "flat" address:
      segment (in hex): 0001
      offset ( in hex): 0002
      segment is multiplied by 16 (shifted left 4 bits, or one hex digit)
      segment: 0001x
      offset: 0002
      ---------------
      total: 00012
      This allowed DOS to use 16-bit numbers to address 2^20 = 1 MB of memory, but since DOS reserved the upper 384 KB for the (remapped) BIOS and peripheral cards, programs were able to address at most 640 KB of memory; the parent's mention of "64 billion pages" is probably an allusion (increased several orders of magnitude) to this DOS limit.

      Of course, this was a kludge, pure and simple, required because DOS machines were 16-bit. Among other things, it allowed the same memory locations (all but the very top and bottom memory addresses) to be addressable by several different addresses, and discovering pointer aliasing required calculations that, by their very nature, couldn't be done wholly in the machine's (16-bit) registers.

      Consider: segment 4, offset 0 is 4 * 16 + 0 = 64,
      and segment 3, offset 16 is 3 * 16 + 16 = 64,
      and segment 2, offset 32 is 2 * 16 + 32 = 64
      and segment 1, offset 48 is 1 * 16 + 48 = 64
      and segment 0, offset 64 is 0 * 16 + 64 = 64:

      so all five segment:offset pairs are apparently different but actually point to the same memory location.
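The segment:offset arithmetic described above can be sketched in a few lines. This is an illustration only: `linear_address` and `aliases` are hypothetical names, and the sketch ignores the 20-bit wrap-around that real 8086 hardware applies to addresses above 1 MB.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: segment shifted left 4 bits, plus offset.

    Simplification: no 20-bit wrap-around is applied to the result.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return (segment << 4) + offset


def aliases(address: int) -> list[tuple[int, int]]:
    """Return every (segment, offset) pair that maps to the given address."""
    pairs = []
    for segment in range(0x10000):
        offset = address - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            pairs.append((segment, offset))
    return pairs


if __name__ == "__main__":
    # The worked example from the post: 0001:0002 gives 0x00012.
    print(hex(linear_address(0x0001, 0x0002)))
    # The five aliases of address 64 listed in the post.
    for seg, off in aliases(64):
        print(f"segment {seg}, offset {off}")
```

Comparing two far pointers for equality therefore means normalising both to a linear address first, and that 20-bit result does not fit in a single 16-bit register.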
    12. Re:Google is faltering by eet23 · · Score: 2, Insightful

      I'd rather know which one is page 0.

    13. Re:Google is faltering by jabbadabbadoo · · Score: 1
      I'd rather know which one is page 0.

      Well, there's http://www.zero.com and http://www.null.com

    14. Re:Google is faltering by imroy · · Score: 2, Informative
      ...the above scheme is how MS-DOS managed memory.

      <sarcasm>Wow, I didn't know DOS managed memory at such a low level!</sarcasm>

      s/DOS/the 8086/g;

      You're really referring to the horrible segmented memory layout used by the Intel 8086 processor and its later derivatives. I did all this shit years ago in university. Almost every lesson my fellow students and I (and the lecturer as well) would end up cursing Intel for their wacky processor design. Interestingly, Intel introduced a similar scheme in (IIRC) its Xeon processors to produce (IIRC) 36-bit addresses and access more than 4 gigabytes of physical memory on a 32-bit processor.

    15. Re:Google is faltering by NonSequor · · Score: 2, Informative

      The 36-bit addressing extension began with the Pentium Pro.

      --
      My only political goal is to see to it that no political party achieves its goals.
    16. Re:Google is faltering by Anonymous Coward · · Score: 0

      There was a good reason for the "wacky" processor design. The 8086 was not meant to be Intel's 16 bit processor, it was only meant as an upgrade for people using their 8 bit processors. Code written for the 8080 or 8085, and using 16 bit address arithmetic, would ignore the extra registers, but could gain extra memory with a few extra instructions.

      Intel did design a proper 16 bit processor, but then IBM chose the 8086 for their PC.

    17. Re:Google is faltering by Anonymous Coward · · Score: 0

      Dropping duplicate entries? Why doesn't someone try that with Slashdot comments!!

    18. Re:Google is faltering by pritchma · · Score: 1

      Err, I'm pretty sure the above was meant to be funny :)

      I doubt very much that Google's engineers would think they only needed a 32-bit integer for their doc id.

      I think the original post in this thread needs <fud> delimiters.

    19. Re:Google is faltering by alexburke · · Score: 1

      On a side note I would really like to know which one is page number 1.

      Why, Larry Page, of course!

    20. Re:Google is faltering by catenos · · Score: 1

      Dear Mods, please note that the parent is a troll.

      Although it sounds interesting when presented as speculation, it's not presented as such and shows the same signs that are so typical for hoaxes, aka "present enough half-truth to make the rest believable".

      I want to make clear, that I don't deny that there might be some small possibility that it is as the parent poster says, but as the saying goes, extraordinary claims need extraordinary proof, and he provides (almost) none. The conclusions he draws are not the most plausible ones for the claims that are presented.

      Aside from the lack of any references for his claims, he fails to cover, for example:

      - What about all the smart people Google pays? Nobody saw this coming? [I have no particular problem telling years beforehand, with an accuracy of 1 month, when disk space, CPU, or other design constraints will limit further growth of the services I am responsible for. Google can't do better?]

      - There are other reasons that can equally well explain why they are taking stance against duplicate content. A simple one is: complaints by users or content authors (like the amazon reviews case that's discussed all over the postings here)

      - The fact that indexes are usually not limited by the amount of indexed entities (here: web pages), but the overall size of the entities.
      - The google index could be an exception to that, because they have a distributed file system (which is equivalent to a second index), so they don't need to point to the location of the data, but only the file containing it.
      - So if there is some limit, it sounds more probable to be in the number of nodes on the filesystem or such, but the parent presented it as fact: The index is maxed.

      - Google is now in the process of dropping millions of link records from its index. Maybe I missed it, but except for the duplicates or other "offenders" I didn't hear of any such process. And unfortunately the parent doesn't provide any references.

      - Google is wavering. Where does this come from?

      - Gmail is a distraction, a venture into some other space to keep people from noticing that their search product is degrading. Hm, at least I missed what they tried to distract from by all those nice new features they presented within the last years and are still going to add. In other words, he presented no evidence for the claim that this is more than just coincidence, but there seems to be a lot against it.

      Well, I think I'll stop now.

      --
      Keep an eye on which arguments are silently dropped in replies. Not always, but often times it's very telling.
  4. How does Google do it? by Talez · · Score: 5, Funny

    PigeonRank! Duhhhhhh

  5. My theory: by Anonymous Coward · · Score: 1, Interesting

    A large infusion of cash from some scary-assed three letter agencies that would be very interested in a centralized repository of the tastes and proclivities of nearly everyone in the world connected to the Internet.

    1. Re:My theory: by dAzED1 · · Score: 1

      I think you'll find $25billion to be a bit out of their price range. To get a grasp on reality, you might want to check out the federal budget, then combine the numbers for all 5 of those 3-letter agencies. Then ponder for a moment what a $25Bill whack would do.

    2. Re:My theory: by Anonymous Coward · · Score: 0, Redundant

      So you're telling me you think the amounts shown in the Federal Register are the real budgets for all those agencies? Next you'll be telling me you think the Iraq invasion wasn't about oil and that Diebold is an impartial producer of reliable voting machines.

    3. Re:My theory: by leerpm · · Score: 1

      That's pretty hard when most of the NSA's budget is classified material.

    4. Re:My theory: by nyseal · · Score: 1

      They aren't? Crap.

      --
      [SIG] Remember Mattel handheld games?
    5. Re:My theory: by Anonymous Coward · · Score: 0

      Iraq was about freeing those nice Islamic boys who like to play with explosives...

  6. Here by mfh · · Score: 4, Insightful

    > If truth is the first casualty of war, openness is the first casualty of going public.

    Maybe this is the reason after all, but I think it's more about Google being simple, smart and clean. They play fair (no browser interstitials, no sneaky crap, no registration necessary...etc); I would equate Google's victory thus far to a kind of no-nonsense attitude to business, always, no exceptions.

    --
    The dangers of knowledge trigger emotional distress in human beings.
    1. Re:Here by evilviper · · Score: 5, Insightful
      They play fair (no browser interstitials, no sneaky crap, no registration necessary...etc)

      And the fact that there are so many articles, from people that just can't understand why google is successful, just goes to show you how screwed we all are...

      Practically everyone in business is determined to be as evil as possible towards their customers (and employees) and assumes that anybody doing anything else must be doing something wrong, no matter what all other indicators may say.

      For a great example, read The Wal-Mart Myth.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    2. Re:Here by danharan · · Score: 1

      Excellent article on Wal-Mart!

      I went to tompaine.com, which had originally published it, and found more articles by the same author. Back to Basics was a very thoughtful look at the outsourcing debate.

      --
      Information: "I want to be anthropomorphized"
  7. Dupe! by shachart · · Score: 1

    or at least, a variation on a dupe.

    --
    Those who can, do. Those who can't, consult.
  8. They have built an amazing system using Linux... by Anonymous Coward · · Score: 2, Interesting

    Sure would be nice to see some of that amazing tech coming back into the community...

  9. Article didn't say much by krs-one · · Score: 4, Interesting

    I read the article and it didn't say much at all about how Google operated. Instead, it just said we don't know how they operate because they keep it secret. But maybe that was the point to begin with.

    -Vic

  10. Soon to be everything by WhitePanther5000 · · Score: 4, Interesting

    The only thing it's missing now (IMO) is spellcheck and an online translator, which I'm sure they're already planning. I'm also looking forward to Gmail being open to the public. After they conquer these 3 things, what's next... Google ISP? Google National Army?

    1. Re:Soon to be everything by richard_za · · Score: 5, Informative

      Google already has spell check, and so does Gmail; have a look at the screenshots on my blog. I believe they're looking at releasing it to the public in six months' time; have a look at this article.

    2. Re:Soon to be everything by Anonymous Coward · · Score: 3, Informative

      The only thing it's missing now (IMO) is spellcheck and an online translator, which I'm sure they're already planning. I'm also looking forward to Gmail being open to the public. After they conquer these 3 things, what's next... Google ISP? Google National Army?

      Google has had a builtin spellchecker forever and their translate tool is right here http://www.google.com/language_tools
    3. Re:Soon to be everything by nacturation · · Score: 0, Flamebait

      Google already has spell check...

      What, as in "Did you mean french military defeats"?

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
    4. Re:Soon to be everything by evilmonkey_666 · · Score: 2, Informative

      Umm, is this a joke? They do have a spellchecker built into the search engine. I use it on a daily basis.

      And their online translator is here.

      --


      - PS. This is what part of the alphabet would look like if Q and R where eliminated.
    5. Re:Soon to be everything by Anonymous Coward · · Score: 0

      Google's spellchecker can be found on this page. As for the Google National Army of America, you do realize what the acronym for that is, right? :)

    6. Re:Soon to be everything by wud · · Score: 1

      I'm pretty sure they use Dictionary.com as the spell check. Don't quote me though, unless you say don't quote me after quoting me. ;)

      --
      wud
    7. Re:Soon to be everything by Dave2+Wickham · · Score: 1

      I think they check their DB for similar terms with far more results.

    8. Re:Soon to be everything by jakoz · · Score: 1

      Google has a translate function, which links to translate.google.com addresses.

      Look for the "translate this page" link to the right of search results.

    9. Re:Soon to be everything by WhitePanther5000 · · Score: 1

      I guess all that's left now is the ISP and National Army

    10. Re:Soon to be everything by Anonymous Coward · · Score: 0

      Dude... If you're going to post blurred text, make sure that you don't post the same text blurred 6 times. You have no idea how close you just came to finding out how good gmail's spam filters are. Lucky for you, I sobered up a little in the time it took to get your e-mail address.

      Black it out completely next time, so assholes like me don't... well, be assholes.

  11. As a consultant by elinenbe · · Score: 5, Informative

    Having been a consultant at their data center a year or so back, I can attest that they had well over 50,000 machines. I am not sure about the 80GB drive per machine, because from what I understood they bought whichever drive was cheapest per MB at the time and would replace any dead ones with larger ones. Also, at any given time machines just die, and many of them are not replaced or repaired for months. Their cluster accounts for all this...

    --
    -eric
    1. Re:As a consultant by _Sharp'r_ · · Score: 5, Informative

      But also realize that the data center you were at isn't their only one. I know of at least 7 physical locations and there are probably more out there.

      But yeah, their racks of 4 servers/1U are pretty impressive when you see them lined up in row after row of racks. Their data centers have to bring in extra cooling because they are so densely packed.

      --
      The party of stupid and the party of evil get together and do something both stupid and evil, then call it bipartisan.
    2. Re:As a consultant by Anonymous Coward · · Score: 0

      4 servers per 1U? Assuming 1/4U rack servers aren't available, how is this done? Blade servers which average out to 4 servers/U?

    3. Re:As a consultant by Anonymous Coward · · Score: 0

      I like to use traditional SCSI disks instead of SAN for handling massive amounts of data. /sarcasm

    4. Re:As a consultant by herlitz · · Score: 1

      The discrepancy between the number of machines * the size of their drives vs. the 4+ petabyte figure is most likely due to the use of large SAN storage devices. In such a massively distributed system, local drives are probably only used for logging hardware issues. By using one or more SANs for common storage, your data is much more centralized and, due to the redundancy features of SANs, a lot safer.
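The "machines * drive size" estimate being argued over is plain multiplication, sketched below. The inputs are the figures quoted in this thread (50,000 machines, one 80 GB drive each), not confirmed numbers; `total_petabytes` is a hypothetical helper, and decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) are an assumption.

```python
GB = 10 ** 9    # decimal gigabyte, as drive vendors count (assumption)
PB = 10 ** 15   # decimal petabyte


def total_petabytes(machines: int, drive_gb: int, drives_per_machine: int = 1) -> float:
    """Raw local disk capacity of a cluster, in petabytes."""
    total_bytes = machines * drives_per_machine * drive_gb * GB
    return total_bytes / PB


if __name__ == "__main__":
    # Figures taken from the posts above; treat them as hearsay.
    print(total_petabytes(50_000, 80))
    # Doubling the drives per machine doubles the estimate.
    print(total_petabytes(50_000, 80, drives_per_machine=2))
```

The result swings widely with the assumed machine count and drive size, so a mismatch against a published petabyte figure does not by itself tell you where the extra storage lives.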

    5. Re:As a consultant by Anonymous Coward · · Score: 0

      Google's big win is that their cluster does use local storage (and usually local RAM) instead of a SAN.

      Can you imagine the SAN you'd need to run a cluster of 100,000 mostly disk-bound servers?

      Google's approach is like NUMA, you have to do it to scale beyond a certain point.

  12. That article... by Anonymous Coward · · Score: 0

    ...didn't answer shit.

  13. Huh? by lawrencekhoo · · Score: 1, Informative

    There are no answers in the article at all. Just the usual questions about how Google's publicized statistics don't add up.

    1. Re:Huh? by Anonymous Coward · · Score: 1, Interesting

      Not just that, it seemed to me the entire article was based on 2 statistics that didn't add up. Statistics, I hasten to add, which don't even reflect the internal structure, and which could just as easily have come from an ISP grepping their logs and multiplying quite a few times.

    2. Re:Huh? by ezzzD55J · · Score: 1
      Indeed, just those two very simple-minded calculations.

      It's a very dumb assumption that all machines contain 80gb drives, and that that's all the storage they're using (what about e.g. SANs)..

  14. Interesting by Motherfucking+Shit · · Score: 3, Interesting

    I lost a couple of sites from Google this month, presumably due to duplicate content; they were nearly verbatim clones of some of my other sites. The original sites are still there, the "clones" vanished from Google. As in, even if I search for those domains directly, I get nothing, where I used to get a cached copy of the sites. They've quite literally vanished from Google's database.

    Can you back up your assertions that Google's index is full? It's a rather interesting theory, and perhaps an explanation for all the tweaking they've done lately.

    --
    "BSD: Free as in speech. Linux: Free as in beer. Windows 10: Free as in herpes." --Man On Pink Corner in #52607549.
    1. Re:Interesting by ShaunC · · Score: 4, Informative

      Google is definitely cracking down on duplicate content. In fact, they've recently patented the concept.

      Insert software patent debate (where Google is the default hero due to its geek factor) here...

      --
      Thanks to the War on Drugs, it's easier to buy meth than it is to buy cold medicine!
    2. Re:Interesting by galaxy300 · · Score: 2, Insightful

      It's possible that their index is full. A more likely theory is that they don't really see the benefit of having content duplicated throughout the database.

      How many times have you run a search and seen a link at the bottom that says something like "Google removed information from this search that is redundant to information already displayed on the page" (Can't remember exactly what it says right now). Usually, there's nothing valuable in the hidden links - why index them at all?

    3. Re:Interesting by Psychic+Burrito · · Score: 4, Funny

      Google is cracking down on dupes? Oh no, Slashdot is doomed! :-)

    4. Re:Interesting by AhBeeDoi · · Score: 0, Offtopic

      Wish I had mod points right now. :^)

    5. Re:Interesting by JesterXXV · · Score: 1
      Usually, there's nothing valuable in the hidden links - why index them at all?

      The key word there is "usually". Sure, more often than not that stuff is useless, but I've found useful pages in those entries once or twice before.

      --
      Yo mama so fake, she failed the Turing Test.
  15. Two Thingies by BoldAC · · Score: 5, Interesting

    One -- Slashdot seems to be into content-directed ads now... as google was my ad for this story.

    Two -- If you want your pages indexed faster and more frequently, sign up and place a Google AdSense ad on your page. Many webmasters believe that Google is having to index so many AdSense pages... that it is difficult for Google to add many more non-ad-driven pages.

    Just sign up for adsense and run it a couple of weeks while you build your site. After google has spidered your site well, then just drop adsense.

    Good luck. I would love to hear any of your google-related tricks.

    AC

    1. Re:Two Thingies by Anonymous Coward · · Score: 0

      I got an ad for Windows Server 2003. I'm unable to verify your second claim, but so far you're 0 for 1.

    2. Re:Two Thingies by Anonymous Coward · · Score: 1, Insightful

      Also, just to mention that I've been seeing that Google ad for quite some time on Slashdot, and it gets randomly shown for any article. Keep hitting reload and watch the ad change. Microsoft and Google seem to be the primary two ads being shown in the square box below the article.

    3. Re:Two Thingies by anethema · · Score: 1

      Ads? OHHH, like banners. I haven't seen those for months, since I got the Adblock extension for Firefox. It will even block iframes, so the page just looks like there never was an ad there.

      Try it out

      --


      It's easier to fight for one's principles than to live up to them.
  16. Re:Openness is the first casualty of going public? by Anonymous Coward · · Score: 5, Insightful

    They will not have to disclose the number of machines, the OS, the anything related to the machines. Wall Street isn't buying their technology, they are buying their cash flow.

    If you do not believe me, buy a share of GE. Pick up the phone, call Investor Relations and ask them how many Unix computers they have and what OS and patch level they run.

  17. Google does it with Linux :o) by maharg · · Score: 1

    'nuff said.
    (You may wish to take issue with the above..)

    --

    $ strings FTP.EXE | grep Copyright
    @(#) Copyright (c) 1983 The Regents of the University of California.
  18. Additional questions by lewko · · Score: 0, Redundant
    Why do NONE of the statistics ever mention the pigeons?

    Google search for the letter "a" resulted in 3,530,000,000 hits [search took 0.12 seconds].

    --
    Do you or your partner snore? - Visit www.snoring.com.au
    1. Re:Additional questions by nacturation · · Score: 5, Funny

      Google search for the letter "a" resulted in 3,530,000,000 hits [search took 0.12 seconds].

      Neat. I wonder what doing a Google search would return for other letters:

      "c" -- 299,792,458 hits
      "e" -- 2.71828183 hits
      "h" -- 6.626068 × 10^-34 hits
      "i" -- sqrt(-1) hits
      "k" -- 1.3806503 × 10^-23 hits

      Looks like Google is definitely busted. They should fix these bugs.

      --
      Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  19. The "searching xxx web pages" count by shish · · Score: 1, Redundant

    Perhaps this is another form of secrecy - the number of pages indexed never seems to go up, except in huge jumps. According to archive.org, it's been stuck on 4,285,199,774 pages for about a year now :/

    --
    I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
    1. Re:The "searching xxx web pages" count by bolind · · Score: 1

      Perhaps this is another form of secrecy - the number of pages indexed never seems to go up, except in huge jumps. According to archive.org, it's been stuck on 4,285,199,774 pages for about a year now :/

      Anyone notice 4,285,199,774 just so happens to be ~99.8% of 2^32? Is this a 32bit counter about to overflow?
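What "about to overflow" would mean in practice can be sketched as follows. This assumes an unsigned 32-bit counter, which is speculation from this thread; `UINT32_MASK` and `add_pages` are illustrative names. Python integers never overflow, so the wrap is emulated with a bit mask.

```python
UINT32_MASK = 0xFFFFFFFF       # largest value an unsigned 32-bit counter holds
REPORTED = 4_285_199_774       # page count quoted in the parent post


def add_pages(counter: int, new_pages: int) -> int:
    """Add to a counter the way unsigned 32-bit arithmetic would (wraps at 2^32)."""
    return (counter + new_pages) & UINT32_MASK


if __name__ == "__main__":
    remaining = 2 ** 32 - REPORTED
    print(f"share of 2^32 used: {REPORTED / 2 ** 32:.1%}")
    print(f"pages left before wrap: {remaining:,}")
    # Indexing ten million more pages would wrap the counter to a small number.
    print(f"after 10,000,000 more: {add_pages(REPORTED, 10_000_000):,}")
```

A wrapped counter would report a tiny page count instead of a large one, so a display frozen just under the limit would be consistent with either a hard cap or simply an unchanged marketing figure.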

    2. Re:The "searching xxx web pages" count by Anonymous Coward · · Score: 2, Interesting

      Searching for 'the' gives about 5,740,000,000 pages while they index 'only' 4,285,199,774 web pages... Anyone know why?

    3. Re:The "searching xxx web pages" count by Anonymous Coward · · Score: 0

      Sorry, Google does not serve more than 1000 results for any query. (You asked for results starting from 1073741824.)

      The plot thickens...

    4. Re:The "searching xxx web pages" count by XO · · Score: 1

      This is just like McDonald's.

      Remember way back when, when the McDonald's signs used to say "Over 80 billion served", with a 2-digit counter that would get increased every now and then? Well, when they started building new ones back in the 90's, the signs simply said "Billions and Billions served."

      Just plain ran out of counter.

      No point IN counting at this point.

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    5. Re:The "searching xxx web pages" count by Atragon · · Score: 1

      I remember back in the day when they said 'millions' served...

  20. Re:Openness is the first casualty of going public? by nacturation · · Score: 5, Interesting

    I mean....surely once they've gone public, they'll be obliged to detail and list the sort of information that the article postulates about? The shareholders would be entitled to know how many servers google has, what their specifications are, and what their current commercial strategy is.....surely?!

    Why would a shareholder care about server specifications? Investing is all about money. Read any quarterly report from a public company. Income statement, balance sheet, and cash flow are the primary interests on the numbers side as well as a general roadmap of where the company's heading. Warren Buffett doesn't care if each server has two 80 GB drives, or whether they have four 250 GB drives per server. The only thing that matters is that there are competent people to handle these kinds of "dirty details" that an investor doesn't give a rats ass about.

    Take a look at the kinds of information you could expect from Google's quarterly reports.

    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  21. I know I know! by Flingles · · Score: 0, Redundant

    more and more people want to know what makes Google tick

    Google has already told everyone what makes them tick! Imagine, a beowulf cluster of pigeons

    --
    Karma: -2^0.5 . Mainly due to the imbibing of dihydrogen monoxide
  22. Google can't do it: phrase searches by Anonymous Coward · · Score: 0, Insightful
    How does Google do it? They still can't do accurate phrase searches. A search on "to be or not to be" comes up with 2 or so in the top 10 links not even containing that phrase. (sure, the bogus links are related to the phrase, but they do not contain the actual phrase as Google's own description of how it works says it would). This is just the most obvious example of error results.

    Interestingly, a9.com, which copied Google, contains the same search errors.

    1. Re:Google can't do it: phrase searches by Anonymous Coward · · Score: 0

      Obviously you are full of it as I just did a google search and there are eight exact matches in the top ten. Not two as you claim.

      See for yourself: http://www.google.com.au/search?q=+%22to+be+or+not+to+be%22+&ie=UTF-8&oe=UTF-8&hl=en&meta=

    2. Re:Google can't do it: phrase searches by RenaissanceGeek · · Score: 5, Insightful
      I performed the Google search for the phrase

      "To be or not to be"

      and I honestly can't see what you are going on about: of the first ten results, eight highlighted the phrase in the page synopsis, one used the phrase as a domain name, and one included the partial phrase "...Or Not To Be."

      Note the ellipsis on that last one: it alludes to a larger portion of text preceding the printed portion. And the domain name was found even though the spaces were omitted.

      Those aren't irregular results: those are highly intelligent results.

      Just because they aren't deterministic enough for you to plug them into a piece of code of your own construction (without compensating Google) doesn't mean that they don't fulfill the purpose of the web search.

      --
      What is the difference between a small revolutionary change and a large evolutionary change?
    3. Re:Google can't do it: phrase searches by jonknee · · Score: 1

      A9 actually uses Google for search results... Notice how all the results are the same, and there are Google AdWords? :).

      (You have to put quotes around a phrase to get results that contain it as you typed it)

    4. Re:Google can't do it: phrase searches by Anonymous Coward · · Score: 0
      A search on "to be or not to be" comes up with 2 or so in the top 10 links not even containing that phrase.

      Take a look at the cached versions of these two pages. There you'll see that while the pages themselves do not contain "to be or not to be", pages linking to that page did. It's well known that Google searches anchor text in addition to document text, and my understanding is the anchor text is where "to be or not to be" can be found for these two pages you mention.
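
      To make the anchor-text idea concrete, here is a toy sketch. It is not Google's implementation; the two example pages, `build_index`, and `search_all_words` are all hypothetical. Words in a link's anchor text are credited to the page being linked to, so that page can match a query its own body never contains.

```python
from collections import defaultdict

# Hypothetical toy corpus: page URL -> (body text, list of (target URL, anchor text)).
pages = {
    "a.example": ("famous quotes", [("b.example", "to be or not to be")]),
    "b.example": ("hamlet act three scene one", []),
}

def build_index(pages):
    """Map each word to the set of URLs it is credited to (own body or inbound anchors)."""
    index = defaultdict(set)
    for url, (body, links) in pages.items():
        for word in body.split():
            index[word].add(url)
        for target, anchor in links:
            # Anchor words are credited to the page being linked to, not the linking page.
            for word in anchor.split():
                index[word].add(target)
    return index

def search_all_words(index, query):
    """Return URLs credited with every word of the query (a word match, not a true phrase match)."""
    word_sets = [index.get(word, set()) for word in query.split()]
    return set.intersection(*word_sets) if word_sets else set()

index = build_index(pages)
hits = search_all_words(index, "to be or not to be")
print(hits)
```

      Here b.example comes back as a hit even though none of the query words appear in its body, which is the behaviour described above.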

  23. the reason they keep their mouth shut by gevmage · · Score: 5, Funny
    It's quite possible the reason that they keep their mouth shut about their capabilities is to keep the NSA (or someone like them) from coming calling. After all, they basically have a distributed database of the entire net, which they index efficiently on a continuous basis. Who wants to bet that their system is better at gathering intelligence than any government agency in the world?

    On the other hand, here's the conspiracy theory version: what if Google IS the NSA? The IPO is a smokescreen to try to avert attention. The reason they can't show their true capability is that when the company goes public, only 20% of their hardware will actually go into the public company "Google", the rest of the hardware will still be hidden and a part of the NSA's system. :-)

    [For the humor impaired, I'm just joking, but it does make you wonder...]

    --
    Craig Steffen
    http://www.craigsteffen.net
    1. Re:the reason they keep their mouth shut by tyler_larson · · Score: 1
      It's quite possible the reason that they keep their mouth shut about their capabilities is to keep the NSA (or someone like them) from coming calling.

      Though, interestingly enough, if the NSA or FBI or CIA wanted to extract intelligence from the google database, all they'd have to do is fire up their browser and run a normal search. After all, that's exactly what Google is for. And I'll bet you a dime to a dollar that the federal agencies frequently do exactly that.

      And not only that, there really is no motivation for some agency to poke around the internals of Google's databases because their search interface is almost certainly the most efficient and complete way to extract data anyway.

      --
      "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
      RFC 1925
    2. Re:the reason they keep their mouth shut by ThomK · · Score: 1

      After all, they basically have a distributed database of the entire net, which they index efficiently on a continuous basis. Who wants to bet that their system is better at gathering intelligence than any government agency in the world?

      You weren't joking about that part for good reason, and it made me do a double-take. The government, which will outsource a LOT of things to corporations, is going to have a hard time having to rely on someone else for the ultimate commodity: information.

      --

      TK

  24. the NSA, FBI and CIA would panic by jacquesm · · Score: 0, Redundant

    and google would be nationalised in an eyeblink as soon as they realised that google has enough computing power to do simulation of nuclear weapons :) possibly in realtime !

    that must be why they're so secretive !

    1. Re:the NSA, FBI and CIA would panic by Ilgaz · · Score: 0, Troll

      Why panic? One of Google's chiefs is ex NS... Oh wait, this is /. , telling bad things about Google, privacy, 2038 expiring cookie, never mentioned google news inclusion policy is evil.

      It runs on Linux Beowulf whatever, so it must be good eh? :)

      -call this post a karma suicide btw-

  25. I've thought about this a bit by gtoomey · · Score: 3, Interesting
    The software/hardware architecture seems impressive.

    Putting on my computer scientist hat I would guess:
    - instead of backup, hold data in multiple places at once
    - use a "cascaded rsync" to trickle software changes to thousands of nodes
    - then load software via NFS at node bootup
    - use nodes just to store data; keep software in RAM for speed

    Just a few thoughts.
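
    A rough sketch of why a "cascaded" push scales well, under assumptions that are mine rather than anything Google has published: one seed node starts with the update, and in each round every node that already has it copies it to `fanout` nodes that do not. The function name and the 10,000-node figure are purely illustrative.

```python
def rounds_to_reach(total_nodes, fanout):
    """Rounds until every node has the update, if each updated node
    pushes to `fanout` new nodes per round, starting from one seed node."""
    updated, rounds = 1, 0
    while updated < total_nodes:
        updated += updated * fanout   # every updated node serves `fanout` more
        rounds += 1
    return rounds

# Hypothetical cluster size, chosen only to illustrate the growth rate.
rounds_binary = rounds_to_reach(10_000, fanout=1)
rounds_wide = rounds_to_reach(10_000, fanout=4)
print(rounds_binary, rounds_wide)
```

    The number of rounds grows with the logarithm of the cluster size, so adding nodes barely lengthens a rollout.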

    1. Re:I've thought about this a bit by kasperd · · Score: 2, Insightful

      instead of backup, hold data in multiple places at once
      Even better, instead of backup just crawl the pages again in the event of a lost disk. Of course some data needs to be in multiple places for performance reasons, but not all data are accessed frequently. How often do you think they will need the page with the lowest rank? (OK, I know there will probably be a lot with exactly the same rank, but you get the idea).

      load software via NFS at node bootup
      There are better protocols for this than NFS. But when you build a cluster this size, you surely want boxes that can netboot out of the box. Actually that means you will need to use DHCP and TFTP. Security of the DHCP and TFTP servers is going to be very critical.

      use nodes just to store data; keep software in RAM for speed
      I wouldn't worry about the speed. Linux is going to do fine. But since they probably netboot and download kernel and a ramdisk from a server, it is of course going to be kept in RAM. Now I wonder, does it all run off an initial ramdisk?

      --

      Do you care about the security of your wireless mouse?
    2. Re:I've thought about this a bit by jimicus · · Score: 1

      Still a heckuva distributed database. After all, I very much doubt each node holds a complete copy of the database: with 2x80GB hard disks and 4 billion pages that would give only 40 bytes per record, with no space for operating system or software.

      That'll be a no, then.
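
      The back-of-the-envelope figure works out as follows, taking 80 GB as 80 x 10^9 bytes and rounding the index to 4 billion pages as above (variable names are just for illustration):

```python
# Storage available per indexed page if one node held the whole index.
disks_per_node = 2
disk_bytes = 80 * 10**9        # one 80 GB disk, decimal gigabytes assumed
pages = 4 * 10**9              # roughly 4 billion indexed pages

bytes_per_page = disks_per_node * disk_bytes / pages
print(bytes_per_page)
```

      Forty bytes is less than many URLs, let alone a page's content.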

  26. Idiot. RTFP by Anonymous Coward · · Score: 0
    "Obviously you are full of it as I just did a google search and there are eight exact matches in the top ten. Not two as you claim"

    That's what I said, idiot. See parent: "2 or so in the top 10 links not even containing that phrase"

    Duh. 2 of 10 without means 8 with. Duhhhh....

  27. alternative? by Anonymous Coward · · Score: 0

    So how can an open/free alternative service possibly happen?
    The only way I can think of is to have a distributed system around the world

  28. google instant messenger, or... by zogger · · Score: 4, Interesting

    GIMMEE would be nice. Well, nice for awhile and if they didn't get weird with it. Don't know if that could happen though, nature of man and all that philosophical stuff. Goes along with the current VoIP articles. They would dominate the net then if they implemented that. I know I would pay cash to them to have a universal, works-great, any-OS VoIP and a no-spam, no-commercial email service.

    So far we know they have just a cubic load of servers, most likely the most on the planet for one private company. The government probably has more, but it's a mish mash of them, not near as sleek or coordinated, AFAIK. What COULD be next with them, practical cheap 50 dollar thin clients that you could do a TON on, using distributed computing, from games to communication to running any business? With tech savvy like they got and their already established heavy hardware base and heavy commitment to R&D, they could just 'splode with an extra 25 billion in cash all of a sudden from an IPO. OR, the money could get to them and they become just another weird company that forgets its roots as "brains come first" and switches to "marketing crap comes first" like certain other unnamed megacorps do now.

    Interesting times

  29. How Google do that? by elpecek · · Score: 4, Informative

    For those who haven't read - there is an article written by Brin and Page - maybe a little outdated, but still interesting: The Anatomy of a Large-Scale Hypertextual Web Search Engine

    1. Re:How Google do that? by jvsanford · · Score: 3, Informative

      There is also a paper that describes their storage infrastructure (Google File System) here

  30. Also censoring... by tiltowait · · Score: 1

    See Google's chastity belt too tight (PartsExpress.com listing removed via SafeSearch because "sex" in domain name) and Google In Controversy Over Top-Ranking For Anti-Jewish Site (Google picking out Googlebombed results) for recent examples.

    1. Re:Also censoring... by Anonymous Coward · · Score: 1, Interesting

      Man if you care about that crap you have some serious problems.

      SafeSearch IS A FILTERED SEARCH. Shit happens; these are the limits of technology, which is why you have to take extra steps to use it. Obviously it's going to have false hits, but that's life. Nothing's perfect and it's NOT GOOGLE'S FAULT.

      And the anti-jewish site?

      Well, that's just plain bullshit. Some FUD spread by a reporter trying to get his name and his article spread around.

      If you don't like it, use Altavista.

  31. Do you work for the Congressional Budget Office? by Anonymous Coward · · Score: 0
    Do the math. 8 exact matches in 10 leaves 2 that do not match. The AC said that there were 2 that did not match.

    If you did not know that 10 - 2 = 8, it is a wonder you can even turn on your machine. Or has Mommy let her pre-schooler use slashdot?

  32. Supplemental Result by Richard5mith · · Score: 3, Interesting

    There is plenty of evidence to suggest that Google has run out of docids, hitting the 32-bit integer limit.

    The best evidence is doing a search which returns results which say "Supplemental Result" next to them. That'll be coming from a second document store I'd guess.

    1. Re:Supplemental Result by Webz · · Score: 5, Interesting

      That doesn't make any sense. A well-designed system is a transparent one, so Google would have no reason to let you know that they're running out of IDs.

      By the way, for supplemental result... By doing a quick keyword search on Google using my domain name, I'm led to believe that pages marked "Supplemental Result" are pages that look like search results. That is, they aren't filled with any real content, other than search results from other engines. Results that could "supplement" your "result" from Google.

    2. Re:Supplemental Result by Anonymous Coward · · Score: 0

      Yeah, if I ran out of IDs, I'd just set some kind of magic code - if the ID number is 0, go look at the first 32 bits of the title for the real ID, etc.

      Btw, not really an AC, just writing from somewhere where I can't log in.

    3. Re:Supplemental Result by Anonymous Coward · · Score: 0

      search for "the" -- you'll get results 1 to 10 of an estimated almost 6 billion results. I guess all those doctorates kinda figured it out already, ya think?

    4. Re:Supplemental Result by Dave2+Wickham · · Score: 2, Interesting
      Supplemental results do come from a second store, yes:
      Hey, pages get added to the supplemental index using automatic algorithms. You can imagine a lot of useful criteria, including that we saw a url during the main crawl but didn't have a chance to crawl it when we first saw it.

      Think of this as icing on the cake. If there's an obscure search, we're willing to do extra work with this new experimental feature to turn up more results. The net outcome is more search results for people doing power searches.

      The above is from GoogleGuy in this thread on WebmasterWorld.

      (I think you may need to copy/paste the link, I'm not sure)
  33. Useless story by Anonymous Coward · · Score: 0

    They requoted all of Garfinkel's observations without adding *anything* to it...not a single insightful/informative sentence which adds anything to his article...they might as well have redirected the readers there.

  34. Re:Openness is the first casualty of going public? by Blastercorps · · Score: 3, Insightful

    I disagree. An investor deserves to know at least general information about the goings on of a business. If I were a stock broker I would want to know that, say, FruitCompanyA uses insecticide whereas FruitCompanyB doesn't. I personally would choose FruitCompanyA, as a rise in the insect population would ruin FruitCompanyB.

    With google: before I give them my money, I would like to know how many servers they have, how close to capacity they are, what softwares they use (compatibility issues).

    Honest reporting of operations lets an investor make an intelligent decision about their money and helps avoid boiler-room companies.

  35. Objection! by StarfishOne · · Score: 0

    "For example, how do you implement security patches and operating-system upgrades (much more frequent in Linux than in proprietary systems from Microsoft or Sun)"

    Sustained, thank you :)

  36. I know lots of people don't read the articles... by Anonymous Coward · · Score: 0

    ...hell, as an Anonymous Coward I post a lot of stuff sometimes and don't read the articles either.

    But in this case I think this is an article you really should read. Here is the first paragraph, really great trick too!

    Here's a cheap trick to play on an audience - especially one drawn from the business community. Ask them how many use Microsoft software. Virtually every hand in the room will go up. How many use Apple Macs? One or two - at most. How many use Linux? If the audience is drawn from corporate suits, no hands will show. Now comes the punchline: who uses Google? A forest of hands appears. 'Ah,' you say, 'that's very interesting, because it means you're all Linux users.' Stunned looks all round.

  37. Re:Openness is the first casualty of going public? by BigGerman · · Score: 4, Insightful

    unfortunately the technology spending IS part of the cash flow. "We went dumpster-diving and picked up a dozen new machines for the indexing farm" and "we entered agreement with Dell to secure a reliable source of cheap Intel servers" would both show up on the shareholder statements but the impact would not be the same.
    Going public WILL expose a significant portion of Google technology, more so when it has to do with hardware.

  38. Re:Openness is the first casualty of going public? by Smidge204 · · Score: 4, Insightful

    The problem with that analogy is that what software they run has absolutely nothing to do with what they do to make money.

    With Google, their entire "business" - their means of generating cash flow - relies on sheer quantity of computing muscle and high performance software for their search databases. With GE, their business is making lightbulbs, dishwashers, hair dryers, electric motors and many more of the thousands of different products used in residential, commercial and industrial settings. How many Unix computers they have in all their offices around the world is a consequence of doing business, not their means of doing business.

    I'm sure if you asked the GE Investor Relations department something relevant about how their business operates, you might get somewhere.
    =Smidge=

  39. Not true in the slightest by Anonymous Coward · · Score: 0, Troll

    If they are playing fair, (i.e. sans "sneaky crap"), then:

    1) Why are their terms of service / Privacy Policy so vague?

    2) Why does their cookie stay until the year 2038?

    3) Why does their Google search bar report information and auto-update without permission?

    Google freaks me out after reading this page:
    http://www.google-watch.org/

    Sorry if that's a bit paranoid, but if you have some counter-information I'd be glad to read it.

    1. Re:Not true in the slightest by Anonymous Coward · · Score: 0

      Yes, it is alarming. I've been rejecting cookies from .google.com for about a year now, and I would recommend to anyone else to do the same.

    2. Re:Not true in the slightest by Anonymous Coward · · Score: 0

      2038-42 = 1996. Year when google started. 42 is that meaning of life number etc.

    3. Re:Not true in the slightest by _Sprocket_ · · Score: 1

      Keep in mind that google-watch is run by an individual with an axe to grind. Which doesn't mean that serious issues can't be raised by someone so motivated. But it does cast some doubt on his assertions when there seems to be a fair amount of reaching to get them.

  40. first casualty ?? by Sad+Loser · · Score: 4, Informative


    Recycling without attribution is the first casualty of bad journalism.

    I thought I had read this article before, and then I realised, I had read it before...
    (although I now realise that you are not supposed to read the linked articles before posting comments - sorry)

    --
    Humorous signatures are over-rated.
    1. Re:first casualty ?? by platypussrex · · Score: 4, Informative

      Not sure why you say that. If you read all the way through Naughton's article, he says that the calculations come from Garfinkel, he mentions Technology Review, and then later directly quotes Garfinkel. Sounds like attribution to me.

  41. Why Verbatim Clones??WAS:Interesting by Anonymous Coward · · Score: 0

    IANACS (...computer scientist)

    Why did you have verbatim clones of sites?

    Are you running pr0n sites that exist solely on the purported 'AD' dollars coming your way??

    I do not mean disrespect for the pr0n industry... I know that they generate BILLIONS of dollars..

    But Seriously, what is the general utility/usefulness of numerous identical sites??

    -i am not a comp scie...

    Blah

    1. Re:Why Verbatim Clones??WAS:Interesting by reanjr · · Score: 2, Informative

      I don't know why he has numerous identical sites, but one reason is when a small company purchases several other companies that are in the exact same market. Since the companies are compatible, you merge all their operations into one. But you still want to keep brand identification with your customers so you keep two copies of the site, each branded differently.

    2. Re:Why Verbatim Clones??WAS:Interesting by platypussrex · · Score: 1

      From the comments out there, it seems like Google is more aggressive than just deleting duplicate sites. If you are an affiliate site for say, Amazon.com, and use their descriptions verbatim for products you offer, this seems to give you a good chance for being deleted. Other examples are also given if you read some of the links given earlier in this topic.

    3. Re:Why Verbatim Clones??WAS:Interesting by Anonymous Coward · · Score: 1, Informative

      or corporation #128264 has a complete web-viewable copy of the javadocs for version 1.2. lots of times i've done google searches for something code-related looking for examples/bugs/whatever and come up with a ton of hits on the same API documentation on different websites.

    4. Re:Why Verbatim Clones??WAS:Interesting by Motherfucking+Shit · · Score: 1
      Why did you have verbatim clones of sites?
      platypussrex described the situation that was most fitting for the sites I lost from Google... Affiliate sites. Among other things I operate a variety of e.g. "webmaster book sites," "programmer book sites," etc. We'd set up several sites using the same general content with differing headers/footers. It seems that Google has gotten wise to the fact that our sites were using the same text and item descriptions for books, and dropped some of the "dupe" sites from their index.

      As an example, suppose you set up an affiliate site to sell books through Amazon.com. You call it books1.com. Then you make a site called books2.com with the same books for sale, and then you make a books3.com with those books as well. What Google seems to be doing is keeping books1.com, since it was around first, but deleting books2.com and books3.com from its database. As in, not only do books2.com and books3.com not show up in search results, but even if you search for books2.com or books3.com directly (or through "site:books2.com" etc) there are no results.

      It goes without saying that a number of other sites were also using the same description copy, trying to sell the same books. Some of them got dropped, some of them didn't. It looks like a case of first-come, first-served; where the sites that were around first are still there; some of my sites got dropped, some are still in the index.
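
      How Google actually spots the dupes isn't public as far as I know. One textbook way to flag near-identical copy is to compare overlapping word windows ("shingles") with Jaccard similarity; the sketch below uses that swapped-in technique, and the function names and sample texts are invented for illustration.

```python
def shingles(text, k=3):
    """Return the set of k-word windows ("shingles") in the text, lowercased."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Made-up page copy: the second site adds a header to the first site's description.
books1 = "A practical guide to building web sites with Perl and MySQL"
books2 = "Welcome! A practical guide to building web sites with Perl and MySQL"
unrelated = "Grow tomatoes on a balcony in small containers all summer long"

same_copy = jaccard(shingles(books1), shingles(books2))
different_copy = jaccard(shingles(books1), shingles(unrelated))
print(same_copy, different_copy)
```

      A differing header or footer barely moves the score, which would fit with sites that share description copy being treated as duplicates.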
      --
      "BSD: Free as in speech. Linux: Free as in beer. Windows 10: Free as in herpes." --Man On Pink Corner in #52607549.
  42. Simple answer by Anonymous Coward · · Score: 0

    They're about to go public. Pumping your stock up involves a stream of "improvements" and conquests after you go public to show investors that your company is king of the hill. Why spend that ammo now rather than wait until it actually generates value for the company?

  43. Linux needs more patching? Does it? by MicklePickle · · Score: 2, Interesting

    much more frequent in Linux than in proprietary systems from Microsoft or Sun

    Huh? Does it!? Since when? I like these throw-away lines the media people dish out. What is their basis for this statement? Even when they see Linux obviously succeeding, they dish out a statement like this.

    I certainly don't have to patch my Linux boxes as frequently as my Windows boxes. Actually... no... wait, they're right! I only need to patch Windows once. Ctrl-Alt-Del -> Boot Debian CD.

    --
    -- main(s){printf(s="main(s){printf(s=%c%s%c,34,s,34) ;}",34,s,34);} $p='$p=%c%s%
    1. Re:Linux needs more patching? Does it? by Baumi · · Score: 2, Insightful

      Not sure if it needs more patching, but at least OSS patches come out in a timely manner after the discovery, whereas MS patches sometimes take ages to materialize. Thus, more patches don't necessarily mean more security holes - just better housekeeping.

      Baumi

    2. Re:Linux needs more patching? Does it? by Anonymous Coward · · Score: 0

      much more frequent in Linux than in proprietary systems from Microsoft or Sun

      >>Huh? Does it!? Since when? I like these throw-away lines the media people dish out. What
      >>is their basis for this statement? Even when they see Linux obviously succeeding, they dish out a statement like this.

      >>I certainly don't have to patch my Linux boxes as frequently as my Windows boxes. Actually...
      >>no... wait, they're right! I only need to patch Windows once. Ctrl-Alt-Del -> Boot Debian CD.

      Granted, the sentence was phrased fairly ambiguously, but I think he was talking about upgrading the kernel. You upgrade Windows once every 3/4 years: 98->XP->Longhorn, while you can update your kernel every few weeks, if you want to : 2.4.22, 2.4.23 ...

    3. Re:Linux needs more patching? Does it? by MicklePickle · · Score: 1

      But you don't need to upgrade the kernel, unless that security leak occurs. If I had a choice between updating the kernel on Windows vs Linux, I know which I'd prefer.

      I still think it's a real media throw-away line that will get quoted by at least one person. It's an annoyingly inaccurate non-statement.

      --
      -- main(s){printf(s="main(s){printf(s=%c%s%c,34,s,34) ;}",34,s,34);} $p='$p=%c%s%
    4. Re:Linux needs more patching? Does it? by shish · · Score: 1

      Windows gets one huge patch per month, linux apps get patched whenever they need it.

      --
      I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
  44. Re:Another Rumor... by cbreaker · · Score: 1

    .. or another iPod article..

    --
    - It's not the Macs I hate. It's Digg users. -
  45. Larry Augustin by kevcol · · Score: 1

    Larry used to use that in the pitch for institutional investors before VA went public. That's where that came from.

  46. The linked article is shit by Donny+Smith · · Score: 1, Interesting

    Some comments on the linked article:

    > it means you're all Linux users.
    What is that - guilt by association?

    >how do you implement security patches and operating-system upgrades (much more frequent in Linux than in proprietary systems from Microsoft or Sun) on thousands of servers without causing disruption to service?

    You don't implement any security patches and upgrades because those systems are used only by Web servers; it's not like some Web client will hack into their servers... You boot thousands of servers from NFS or such; you upgrade system images once a quarter, together with Google's own software.

    >yet achieves 100 per cent uptime.

    Uptime of what? Of www.google.com, using round-robin load balancing to several geographically dispersed data centers. What's the big deal about that?
    But I've seen 404 on www.google.com and the paid AdWords Admin Web is down quite often (anyone who ever used it knows what I'm talking about).
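
    For reference, round-robin in its simplest form just hands out the next address in a fixed rotation. This is a minimal sketch, not how Google's DNS is actually set up; the addresses are reserved documentation-range IPs and the names are hypothetical.

```python
from itertools import cycle

# Hypothetical data-centre addresses (documentation-range IPs, not Google's).
DATA_CENTERS = ["192.0.2.10", "198.51.100.10", "203.0.113.10"]

def round_robin(addresses):
    """Return an endless iterator that yields the addresses in rotation."""
    return cycle(addresses)

resolver = round_robin(DATA_CENTERS)
answers = [next(resolver) for _ in range(5)]   # five consecutive lookups
print(answers)
```

    Plain rotation like this knows nothing about whether a data center is up, so any uptime claim also depends on taking dead addresses out of the list.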

  47. Re:Openness is the first casualty of going public? by nacturation · · Score: 4, Informative

    With google: before I give them my money, I would like to know how many servers they have, how close to capacity they are, what softwares they use (compatibility issues).

    I agree it would be nice to know. But if those are your conditions for investing in Google, I think Google would probably tell you to keep your money. I imagine Google's quarterly reports would probably say something like:

    "Our operation depends on having the ability to increase our server and bandwidth resources as we grow our services. Business may be adversely impacted should capacity be unavailable. Our servers are also at risk for viruses, worms, and DDoS attacks which could put the operation of those servers at risk and adversely affect business." etc...

    That would give you, as an investor, the information you need to determine whether those risks are worth your money. In all likelihood you'll just have to rely on the fact that they have an army of PhDs who are smarter than you and I put together and know their shit when it comes to security, databases, clustering, etc.

    Now I could be wrong. Perhaps Google is waiting for the IPO and will then detail their server infrastructure, wow Wall Street (and geeks worldwide) with their amazing capacity, and their stock will skyrocket on the first day of trading. I'd wager that Google's stock is going to have amazing gains anyway given that it's a bit of an industry darling. Other tech companies which have been thinking of going public would be wise to time their IPO very shortly after Google's and ride the wave.

    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  48. This is a repeat.. by netsharc · · Score: 1

    How the heck is this news? The article just summarised the Simson Garfinkel article for the business types. Slashdot already covered the calculations from Garfinkel, and therefore this is just a repeat! Booring.

    --
    What time is it/will be over there? Check with my iPhone app!
  49. Re:Openness is the first casualty of going public? by Vlad_the_Inhaler · · Score: 2, Insightful

    Do you know how many servers IBM have? Akamai? Microsoft?

    Be reasonable.

    Financial information is important, their business plan is important, it is probably important to know that they are running Linux so that SCO-type problems can be factored in. The sort of fine technical details the Observer goes into are totally irrelevant, just an incidental business expense. We know that it all works and that Google are on top of what they do. That is what matters.

    --
    Mielipiteet omiani - Opinions personal, facts suspect.
  50. Tinfoil Hats by mfh · · Score: 4, Informative

    > 1) Why are their terms of service / Privacy Policy so vague?

    This is to keep it simple. Exacting legal language is the path to screwing people. Vague terms of service are good because both sides can wiggle. Has anyone been sued because of these terms of service? I'd like to see some refs to that, but I'm guessing it's just to protect the general public from a-holes who would exploit Google.

    > 2) Why does their cookie stay until the year 2038?

    Not to be funny, but someone at Google likely knows when the end of the world is coming and has set the cookie to reflect this. Seriously, who cares how long cookies stay alive for? You can block them if you like, but I think it's really just to keep Google more effective.

    > 3) Why does their Google search bar report information and auto-update without permission?

    I'm against Spyware, so I don't run it, but Google tracks searches anyway, so what's the point of getting upset about it? These technologies make Google more user-friendly. Google doesn't have loads of popups trying to get you to install the bar -- it's not right in your face. People who want it likely don't care if it auto-updates because then they have the most recent version of it.

    --
    The dangers of knowledge trigger emotional distress in human beings.
    1. Re:Tinfoil Hats by Anonymous Coward · · Score: 1

      I understand your points of view. These could all just be the result of someone looking too closely at a successful business and finding faults. However, whenever confronted about these issues, the top executives always skirt around the questions. That is what kind of drove the nail further for me. If these really are such innocent issues and it is such an honest company, then why do they play coy? I can't buy that...

    2. Re:Tinfoil Hats by Felinoid · · Score: 1

      However, whenever confronted about these issues, the top executives always skirt around the questions.
      Ohhh really? And where were you on the 31st of February hmm? Well? Skirting the issue?
      Or maybe it's a stupid question. (BTW there is no 31st of February..)
      Here is the deal. Top executives have more important things to do than address the favorite pet conspiracy theory of the week.
      There are some questions that shouldn't be dignified with a response.

      --
      I don't actually exist.
    3. Re:Tinfoil Hats by rootus-rootus · · Score: 1

      2038 is when the 32 bit integer which holds unix time rolls over. 64 bit solutions are coming :-)
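
      For anyone who wants to check the date, here is a minimal sketch of where a signed 32-bit time counter runs out. It assumes the usual convention of counting seconds from 1970-01-01 UTC; the names EPOCH, MAX_INT32 and rollover are just for illustration.

```python
from datetime import datetime, timedelta, timezone

# Unix time counts seconds from this instant.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Largest value a signed 32-bit integer can hold.
MAX_INT32 = 2**31 - 1

# Last second that a signed 32-bit counter can represent.
rollover = EPOCH + timedelta(seconds=MAX_INT32)

print(rollover.isoformat())  # 2038-01-19T03:14:07+00:00
```

      One second after that, a signed 32-bit counter wraps to a negative number, which is why a 64-bit counter is the usual fix.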

      --
      The moral of the story is: "Always remember to mount a scratch monkey."
    4. Re:Tinfoil Hats by Beale · · Score: 1

      Possibly not the end of the world, but the end of the Unix Epoch.

    5. Re:Tinfoil Hats by Ilgaz · · Score: 0, Flamebait

      If you tell me what the NSA does and what an ex-NSA guy at Google does, I will even let them inject a Google chip in my brain, as they dream of.

      Wake up! Just because a damn thing runs on Linux, that doesn't automatically make it good.

      Sorry, but I've had enough of the reasonless Google-defending posts over the years.

    6. Re:Tinfoil Hats by Anonymous Coward · · Score: 0

      As opposed to the reasonless Google bashing posts? Some people are just dumb.

  51. Re:How does Google do it? by theRG · · Score: 2, Funny

    My favorite Google features:

    http://labs.google.com/
    http://www.google.com/intl/xx-klingon/
    http://www.google.com/intl/xx-elmer/

  52. IPO is Business by Sumbody · · Score: 1


    Let's leave it to Google, facing an IPO, to play these numbers and the PR game how they feel will most benefit them and deter their competitors.

    This post is brought to you by Microsoft [tm] Internet Explorer (r), the only browser for the Internet. Remember Mosaic and Spyglass? We don't.

  53. Public paper on Google File System by MarkWatson · · Score: 4, Informative
    Here is a PDF file of the paper.


    If that link gets slashdotted, here is another link of a PDF PowerPoint presentation.


    Good read! This paper (with the discussion of the goodness/fastness of file appends) made me more interested in Prevalence - so much so that I am using it for my new project.

    -Mark

    1. Re:Public paper on Google File System by svr0002 · · Score: 4, Informative
      and another good one - http://www.computer.org/micro/mi2003/m2022.pdf

      Interesting that a major problem for Google is managing power and cooling !

  54. No questions answered by T.Hobbes · · Score: 1

    The meat of the article is just the observation that the numbers Google puts out (for # of servers, # of hits, etc) are inconsistent. The only conclusion it comes to is that google has more 'horsepower' than it's letting on.

    1. Re:No questions answered by Bz3rk · · Score: 1

      How about the question of what they are going to do about the fact that 90% of the pages google finds are just links to other "search sites" that have nothing but links to other search sites/link listings that have nothing but links to OTHER search sites/link listings that have nothing but..........

  55. The Google Might Be Falling by aluminumcube · · Score: 2, Interesting

    I think this is the wrong question investors need to be asking about Google before they IPO. Sure, it makes for some great geek gab; the fetishistic wonderment of just how many servers Google is running, how many hits they get and how exactly they manage to, well, manage that many servers. In the end though, answering those questions doesn't tell us anything about what Google is actually selling.

    The more and more I look at it, the more and more I fear Google is nothing more than a very well calculated shell game; the Enron of technology IPOs...

    Pretty much everyone who uses the internet loves Google and we do so for a combination of three compelling reasons. First off, Google offers up what is basically the best search engine on the internet. It isn't perfect, it doesn't work all the time, but it is the best thing out there right now. Second, they offer this high-quality search service without all the excess bullshit that got tacked onto all of the other search engines on the market in the .com heyday. While Yahoo was busy playing in Hollywood and becoming a "Portal" and Alta Vista was going down the tubes, Google's simple, whimsical, easy to use front page didn't get gaudy by trying to make us sign up for accounts or any of the other marketing department crap. Finally, Google has a high Willy Wonka factor, sort of like Apple. We don't hear much from the company in the way of press releases or other information, but every so often, they open the doors and it turns out the PhD Oompa Loompas there developed something totally cool. Local search, Froogle, gMail and Okurit are examples of this...

    The thing that gives me the heeby geebies about Google is how they make all of this look so effortless. Okurit just sort of popped out of the open one day. gMail appeared on April 1 with such an "effortless" air about it all that Google didn't even bother to take the press release seriously. We keep hearing these cryptic references from the company about some overwhelmingly massive amount of computing power they have and how their cabal of PhDs has it humming along with levels of efficiency that are a world beyond most everything else out there.

    All of this has made for a very pumped up environment for an IPO, but we still have yet to get an answer to the question "What is Google's business model?" I "google" words all day. I have an Okurit account that I use. I even use Google as a quick and dirty calculator. When it opens up, I will have a couple of gMail accounts. The problem is, I've never paid these people a single penny for ANY of this. How the hell are they going to make money?

    Sure, we can say that Google has integrated advertising within the search results, but the advertising model has always proven to be of dubious effectiveness at best. Google has an enterprise search division, but the cost of their Google Appliance is a pittance compared to the sort of money big time enterprise software companies like Oracle and SAP are making. How can they survive on that revenue stream and pay the bandwidth bills for all of the free services they offer to the public?

    We always tend to answer these questions with an "I don't know, but Google must be doing something right." Google works very hard to continue to fuel the fire that they are doing something paradigm shifting with all of those PhDs they have on the payroll, and how many servers they have, and how they can just sort of effortlessly announce 1gb free email accounts. We keep drawing up the impression that these guys must have something HUGE up their sleeves, and they have us salivating for the IPO so we too can be part of it.

    Very soon, Google executives are going to pile onto a Gulfstream V and do a roadshow for big time investment houses and institutional investors and they are going to be trying to convince these guys to buy Google IPO. They are going to be asked exactly what sort of business model Google is going to be pushing and one of two things is going to happen:

    - Google will c

    1. Re:The Google Might Be Falling by Anonymous Coward · · Score: 1, Insightful

      The thing that gives me the heeby geebies about Google is how they make all of this look so effortless. Okurit just sort of popped out of the open one day. gMail appeared on April 1 with such an "effortless" air about it all that Google didn't even bother to take the press release seriously.

      BTW, it's Orkut. Google is showing themselves as having excellent business and marketing savvy by putting out the press release on April Fool's Day. What better way to stir up speculation and cause a lot of buzz than by releasing something which sounds like it could be a joke, or just might be true? They probably received much more exposure by releasing a flippant press release on April 1st which turned out to be true, compared to various other companies which only had various pranks.

      I think you can safely put the tinfoil hat away. Google's business is quite clear. Have an awesome search engine and other services. Charge advertisers for listings. It's basically eBay in a search engine. eBay gets a lot more revenue from each transaction than Google does, but Google makes it up bigtime on volume.

    2. Re:The Google Might Be Falling by Anonymous Coward · · Score: 0

      Are you really that oblivious, or are you trolling? Do you not see the advertisements on the side of every search result page (and starting to appear on many non-google pages also)? Do you not realize that many people and companies are willing to pay lots of money to appear in those spots?

    3. Re:The Google Might Be Falling by laura20 · · Score: 5, Insightful

      The problem is, I've never paid these people a single penny for ANY of this. How the hell are they going to make money?

      Um, you do realize that Google already makes a profit, don't you? I daresay the IPO will puff the value of the company up beyond the rational amount, but that's not 'Enron' -- if you are going to use buzzwords, use the right ones. Enron was a case of internal actors in the company using financial games to siphon off profits and inflate the value of the company on the books. You accusing Google of financial fraud? If you are going to use a buzzword, use 'Yahoo' or something -- a solid company that got its stock price puffed up excessively due to investor mania.

      How the hell did this get moderated up, except as 'Funny'?

    4. Re:The Google Might Be Falling by mrm677 · · Score: 1

      Nice theory. Except that it is well-known that Google is already profitable and that the two founders are billionaires.

    5. Re:The Google Might Be Falling by _Sprocket_ · · Score: 3, Informative


      The problem is, I've never paid these people a single penny for ANY of this. How the hell are they going to make money?


      1) Google has an effective advertisement system

      2) My last two employers bought Google boxes for their intranet
    6. Re:The Google Might Be Falling by Lord_Dweomer · · Score: 3, Insightful
      "Sure, we can say that Google has integrated advertising within the search results, but the advertising model has always proven to be of dubious effectiveness at best."

      Correction, the ad model has proven to be of dubious effectiveness with companies that have no credibility.

      Google is perhaps the most trusted company on the net today, and with the traffic they get, I'm not surprised at all that they can support all their financial needs with ad revenue, especially with some of the big bucks that large companies dump into advertising with Google. I challenge you to show evidence showing that their advertising business model cannot support their costs, because so far you've done nothing but toss up tin-foil hat ideas without any proof to back it up, and as someone else so kindly pointed out to you, Google is ALREADY in the black.

      --
      Buy Steampunk Clothing Online!
  56. How do they do it? Two words by JoeBaldwin · · Score: 2, Funny

    Underpants gnomes.

  57. Google Programming Contest: Robust Hyperlinks by just+someone · · Score: 1

    Google programming contest pays off again.
    Using Phelps and Wilinski's "Robust Hyperlinks" concepts to detect duplicate content.
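
    A rough sketch of how that could work, as I understand the robust hyperlinks idea: identify a page by a handful of its rarest words (a "lexical signature"), then treat pages with the same signature as duplicates. The TF-IDF scoring, the signature size of five, and the sample pages below are my own assumptions for illustration, not the contest entry's or Google's actual method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(doc, corpus, size=5):
    """Return the `size` highest-scoring TF-IDF terms of `doc` as a frozenset.

    `corpus` is a list of texts used to estimate how rare each word is.
    """
    doc_freq = Counter()
    for text in corpus:
        doc_freq.update(set(tokenize(text)))
    term_freq = Counter(tokenize(doc))
    scores = {
        # Add-one smoothing keeps the ratio defined for unseen words.
        term: count * math.log((1 + len(corpus)) / (1 + doc_freq[term]))
        for term, count in term_freq.items()
    }
    # Sort by score, then alphabetically, so ties break deterministically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return frozenset(ranked[:size])


# Hypothetical pages: two affiliate copies with reshuffled text, one unrelated.
pages = [
    "cheap widgets buy cheap widgets online from widgetco",
    "buy cheap widgets online from widgetco cheap widgets",
    "the history of the steam locomotive in britain",
]
signatures = [lexical_signature(page, pages) for page in pages]
print(signatures[0] == signatures[1])  # the two affiliate copies collide
print(signatures[0] == signatures[2])  # the unrelated page does not
```

    Exact signature matches only catch copies with the same word counts; a real system would presumably also need some tolerance for small edits.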

  58. You may also find this interesting... by lunar_legacy · · Score: 5, Informative

    Another wonderful speculation about Google's infrastructure, which you can find here.

  59. Leprechauns! by penginkun · · Score: 2, Funny

    I mean, how else could they do it?

    1. Re:Leprechauns! by Anonymous Coward · · Score: 0

      Google appoints a 2 foot tall Leprechaun as their new CFO. Prior to the IPO, the Leprechaun goes around pitching the benefits of investing their money and says "Hey, we could use some more funding... we're a little short."

      ba-dum-dum!

  60. Google by mrcutrer · · Score: 1

    Their success lies with the name.

    I mean, google, that's pure genius.

    I started using it for the name only, then found it to be much more useful than hotbot.

    --
    "When I look back, my life is not a foreign country, it's more like a library book returned long ago." - ????
  61. Re:They have built an amazing system using Linux.. by B1ackDragon · · Score: 3, Insightful

    As far as I can tell there is no better way for that hardware to have come "back into the community."

    The service is free, and they're really good at what they do. I would say I'd be lost without google on the internet, but really this compliment goes for lots of search engines - I'm really very grateful this sort of service still exists for free (well, with ads.)

    Unless you want to talk about cures for diseases through protein folding simulations, I can't think of a better way for this hardware to be used, such that it begets a greater net benefit.

    --
    The snow doesn't give a soft white damn whom it touches. -- ee cummings
  62. Re:Openness is the first casualty of going public? by Anonymous Coward · · Score: 2, Interesting

    Akamai?

    "When I visited the company in January, the screen said that Akamai was serving 591,763 hits per second, with 14,372 CPUs online, 14,563 gigahertz of total processing power, and 650 terabytes of total storage. On April 14 [2004], the number had jumped to a peak rate of 900,000 hits per second and 43.71 billion requests delivered in a 24-hour period."

    From this article.
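
    To put those figures side by side, here is a quick back-of-the-envelope check of the average rate implied by the 24-hour total against the quoted peak. It assumes a "hit" and a "request" in the quote are the same unit, which the quote does not actually confirm.

```python
SECONDS_PER_DAY = 24 * 60 * 60

requests_per_day = 43.71e9       # quoted 24-hour total
peak_hits_per_second = 900_000   # quoted peak rate

# Average rate implied by the daily total.
average_hits_per_second = requests_per_day / SECONDS_PER_DAY

# How far the peak sits above that average.
peak_to_average = peak_hits_per_second / average_hits_per_second

print(f"average: {average_hits_per_second:,.0f} hits/s")
print(f"peak/average: {peak_to_average:.2f}")
```

    Under that assumption the average works out to roughly 506,000 per second, so the quoted peak is a bit under twice the daily average.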

  63. Re:They have built an amazing system using Linux.. by B1ackDragon · · Score: 1

    Why did I read "tech" as "hardware"? I assume you refer to the people... oh well.

    I guess it would be nice for google to work some of its technology/people into the community. Maybe someday they will.

    --
    The snow doesn't give a soft white damn whom it touches. -- ee cummings
  64. Re:They have built an amazing system using Linux.. by Anonymous Coward · · Score: 1, Funny

    You betcha! I'm looking at starting up a search engine and would really love their technology for free!

  65. "serves up the answers to our questions"??? by tsadi · · Score: 2, Insightful
    The Observer, serves up the answers to our questions.

    the article never answered any of our questions - heck, i even looked for a "Page 2" link after reading the entire thing, sadly, the article ended w/o even attempting to answer its own questions.

  66. Hey! by Anonymous Coward · · Score: 0

    This isn't the first article [of late] to attempt to describe how Google works - but it's one of the most recent which doesn't include pretty pictures.

    We want pretty pictures!!!!

  67. Re:Openness is the first casualty of going public? by espo812 · · Score: 2, Interesting
    Why would a shareholder care about server specifications? Investing is all about money.
    I, for one, would. Now, unfortunately I don't have enough money to start investing on Wall Street, but hopefully that will change soon. So, why would I want to know technical details for a company? Obviously, because I'm a geek. But someone has to track this kind of stuff to produce a stock report. You can't have a company saying "We bought an IBM X Server and it now balances our accounts and brokers international deals for us - so our $10,000 server produces $10Million in revenue." I'd like to know I was making a good investment, instead of one based on snake oil.

    No, they have to have people who understand technical details to be able to produce legitimate forecasts of output. I'm sure there are people who analyze how many workers and robots Ford has to estimate how many cars they can produce, right? So the equivalent is how many coders and systems Google has, no?

    Well if they don't, big brokerage houses can reply and I will consider the most lucrative offer.
    --

    espo
  68. Number of results? by Anonymous Coward · · Score: 1, Interesting

    Searching for "the" returns:

    Results 1 - 10 of about 5,690,000,000 for the [definition]. (0.11 seconds)

    But Google homepage says:

    Searching 4,285,199,774 web pages

    Is there a difference?

  69. Questions by ipour · · Score: 1

    Google is spending its time maintaining an unparalleled search engine, and they simply haven't had either the time or inclination to send someone around counting up all the capacities of all the hard drives, so that some fool can ask how much capacity they have.

    However, now that the company is contemplating going public, people DO want to know the answers to these questions, and as a publicly traded company, they may be required to answer many of these questions.

    So expect to see the company go from mostly techies to mostly lawyers!

  70. Moderate Up... by Ayanami+Rei · · Score: 1

    Durrrh... that's exactly what I was thinking. :-)

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
  71. Re:They have built an amazing system using Linux.. by HockeyPuck · · Score: 1

    If they 'give back some of their amazing technology back to the community', then they would lose their competitive edge. A Ferrari is fast b/c of the engine, if they had to 'open source' their engine, then we'd see minivans capable of 180mph.

  72. google becomes the first AI? by Anonymous Coward · · Score: 0

    Is what happens if the entire complex reaches an order of magnitude close to the complexity of the mammalian brain.

    And instantly becomes SELF-AWARE.

    Google----> First AI?

  73. One word. by Viceice · · Score: 3, Informative

    robots.txt

    The Google bot respects it, so if you're up to no good, it's easy to get Google to not index your page.

    Anyway, I'd like to see a version of google that didn't respect robots.txt. You used to be able to dig up a lot of information on people on google before they started to use robots.txt on a lot of sites.
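
    For anyone who hasn't seen how a well-behaved crawler honors that file, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the "ExampleBot" agent name are made up for illustration; this is not Google's crawler code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# Feed the rules in directly instead of fetching them over the network.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, agent="ExampleBot"):
    """Return True if `agent` may fetch `url` under the parsed rules."""
    return parser.can_fetch(agent, url)


print(allowed("http://example.com/index.html"))
print(allowed("http://example.com/private/diary.html"))
```

    The file is only a request: a crawler that never calls anything like can_fetch simply ignores it, which is the kind of search engine the comment above is wishing for.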

    --
    Sometimes I wish I was a plumber, then I'd know how to deal with other people's shit.
  74. OT (Good article) by Ayanami+Rei · · Score: 1

    BTW, I was quite surprised to hear about the part about CostCo's success with its fairer treatment of employees. I much prefer shopping at CostCo than I do at Sam's Club. (At the very least, the free samples of various food they hand out make shopping less monotonous) Coincidence? I think not.

    Do yourself a favor, slashdot, and get a membership. It's worth it.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:OT (Good article) by evilviper · · Score: 1
      I much prefer shopping at CostCo than I do at Sam's Club.

      I agree completely. The same goes for WalMart as well. The two chain-stores are a miserable place to be. Everything is always a mess. People are packed-in, with never enough employees to do everything that needs to be done. They are kind-of like the junk-heap of retailers. You have to dig through unorganized piles of products, stand in long, slow lines, etc. And if you're lucky, your hours of endless hassles will be rewarded by saving 50 cents over buying the same product from a good store, where you aren't put through all that mess.

      The non-name-brand products they sell are always junk. I would never expect that from a big store chain. When I go into Target, K-Mart, Sears, or Costco, I know that I can buy any product in the store, and it will work as it's supposed to. If you try the same thing at Walmart/Sams Club, you'll get things like lamps that fall apart, furniture that won't stand-up, etc. In fact, Walmart has pressured Levis to make a cheapo line of jeans that practically fall apart.

      Frankly, I have never ever been able to understand why anyone at all ever shops at Walmart/Sams Club.
      --
      Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
  75. Re:Openness is the first casualty of going public? by Vlad_the_Inhaler · · Score: 1

    Yup, Akamai. That link of yours points to 'Technology Review' (I can't say more because it appears to be slashdotted) and not the 'Wall Street Journal'.

    This is just fun stuff that the company chooses to publish, it is not 'Investor Information'.

    --
    Mielipiteet omiani - Opinions personal, facts suspect.
  76. just read almost everything on google-watch.org by tsadi · · Score: 2, Funny

    i say google-watch.org is as credible a site as this one: www.realultimatepower.net - go ahead, click the link - it's a hilarious site

    1. Re:just read almost everything on google-watch.org by Anonymous Coward · · Score: 0

      I disagree entirely...
      Ninjas actually exist.. "The Ultimate Power" just takes the facts and blows them way out of proportion.. Sprinkle on a healthy serving of made up stories to support it.

      Google watch is entirely made up stuff blown out of proportion, with some actual events sprinkled on to support it.

      Once a conspiracy chart becomes more complex than the Pentium 4 schematics, credibility is blown sky high.

  77. Re:Openness is the first casualty of going public? by doktorstop · · Score: 1

    Not necessarily. The shareholder information is mostly and primarily business information relating to investments, profits etc. As to the technology used, there is a pretty simple excuse ... confidentiality of technology used. For instance, Coca-Cola does not have to publish their secret formula just because they have shareholders to report to. The know-how and technical details are their only and most valuable asset, and describing how they index, how they patch and what OS they use would be suicidal.

    --
    http://www.automatiq.se
  78. Yes. by Ayanami+Rei · · Score: 2, Informative

    very simple example of 15 servers in 3U. Many vendors are also offering a "dual dual" system in 1U... that is, two dual-CPU motherboards that fit in one case.

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:Yes. by _Sharp'r_ · · Score: 1

      The Google design predates the big blade craze. They purchased custom designed cases that are like 1U steel boxes with no top on them and populated them with the bare minimum they needed (MB, CPU, RAM, NIC, single HD) to create 4 servers on each tray.

      --
      The party of stupid and the party of evil get together and do something both stupid and evil, then call it bipartisan.
  79. Re:Openness is the first casualty of going public? by chipset · · Score: 3, Insightful

    The original analogy is a little off. However, if you look at eBay, do they disclose how many systems they are running? How about Amazon? Do I care?

    The real fact of the matter is, they have custom software that they run. The number of systems, speed, memory and OSs are simply a byproduct of what they really offer: a service.

    Google is no different. They offer a service. As long as they are profitable, as an investor, I could care less if the systems were running on Dell's, White Boxes, Mac, or Commodore-64s. They have found a way to make the business run on the systems they have.

  80. Google full? Or just tweaking the algorithm? by Saeed+al-Sahaf · · Score: 2, Interesting
    Google has recently removed tens of thousands of "duplicate content" sites from its index - where "duplicate content" is as simple as being an affiliate site (e.g. Amazon) and having the same textual item descriptions as many other sites.

    Google is now in the process of dropping millions of link records from its index, presumably to make room for more pages.

    It's possible that the index is full, but I would imagine that they would have seen this coming long ago, as it "filled up", and taken measures. What's more likely behind the elimination of duplicate pages is that more and more people have been complaining about the search results relevancy and how site owners have been taking advantage of certain known flaws in the Google algorithm. So, they are taking steps to fix the algorithm, and kill off all the fake sites.

    --
    "Who are in control, they are not in control of anything - they don't even control themselves!" - Glen Beck
  81. gmail by yic · · Score: 1

    btw, how many of you have tried gmail? i think the interface is incredible .. its hard to imagine that web based email has been around for such a long time and nobody came up with an interface as nice as this. perhaps its just that i have a very fast connection, i'd have to try whether it works as well on slower connection speeds also.

    to the answer of how google does it: google seems to take whatever they do, and do it very well! even the example of email graphical user interface, which nobody even cares to talk about.

  82. Slashdot this... by blizzard854 · · Score: 0, Offtopic

    Hmmm... 1000 queries per second at peak time?.?

    Perhaps we should test out their systems...

    Google... try slashdotting that...

  83. they don't have to patch and update very often by tmalsburg · · Score: 3, Insightful
    For example, how do you implement security patches and operating-system upgrades (much more frequent in Linux than in proprietary systems from Microsoft or Sun)

    Come on, the nodes in their clusters are not desktop computers with office software on it.

    The systems running on these machines are very stripped down: They only need very few applications and a very simple kernel (not many device drivers, maybe no graphics card driver, ...).

    Furthermore there are no local users on the machines, so many security flaws won't affect the integrity. And remote holes in the kernel do not occur very often.

    And above all these cluster nodes are certainly shielded by some sort of firewall. Therefore they don't have to care for network security themselves.

    All in all: I believe that you need to update such machines rather infrequently. At least not for security reasons.

    Titus

    1. Re:they don't have to patch and update very often by BCW2 · · Score: 1

      Also it's fairly simple and quick to get one update done and working on a test machine, and then do every other one with an image from the master server. Quick and simple, the testing is done in advance and the actual update takes very little time. Google's speed would be slowed very little if two or three servers were out of the loop for a few minutes at a time. Kind of like how most places test a program before releasing it in their production/working system.

      --
      Professional Politicians are not the solution, they ARE the problem.
    2. Re:they don't have to patch and update very often by burns210 · · Score: 1

      furthermore, since most nodes would likely just have enough instructions to pull down the OS from the network, you would only need to reboot, in waves, the nodes in a serious(!) security problem, after having patched the central image.
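
      The "reboot in waves" part could be sketched roughly like this. It is only an illustration of the batching idea: the node names, the wave size, and the reboot callback are all hypothetical, and nothing here reflects how Google actually manages its clusters.

```python
def reboot_waves(nodes, wave_size):
    """Split `nodes` into consecutive batches of at most `wave_size` nodes."""
    if wave_size < 1:
        raise ValueError("wave_size must be at least 1")
    return [nodes[i:i + wave_size] for i in range(0, len(nodes), wave_size)]


def rolling_reboot(nodes, wave_size, reboot):
    """Reboot nodes one wave at a time and return the order they went down in.

    `reboot` is a caller-supplied function that restarts a single node so it
    pulls the already-patched image from the network on the way back up.
    """
    done = []
    for wave in reboot_waves(nodes, wave_size):
        for node in wave:
            reboot(node)
            done.append(node)
        # A real system would wait here until this wave is serving again.
    return done


# Hypothetical cluster of seven nodes, rebooted three at a time.
nodes = [f"node{i:02d}" for i in range(7)]
print(reboot_waves(nodes, 3))
```

      Keeping each wave small relative to the cluster is what lets the rest of the nodes carry the load while a few are down.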

  84. Re:They have built an amazing system using Linux.. by agurkan · · Score: 1

    Maybe they do not give back in terms of technology, but they do give back to community! I can think of at least two ways.
    1) They provide an alternative to Microsoft. Not only search, it looks like they will give a blow to hotmail as well. They prevent MSN from becoming the portal. I think this is very important, people see things can be done better than the Microsoft way, and it can be done with Linux ;-)
    2) They make the communication within Open Source and Free Software community much easier. I keep a log of visits to my webpages, and 90% of hits come from Google searches. I almost exclusively use Google to find a resource for any project I am working on, including Free Software resources. Without Google, I might have had to filter through a hundred compiler advertisement pages, before getting information on a trick for GCC preprocessor. Now I type what I need, it is usually on the first page. Granted, I use Google because I am lazy, but people are generally lazy.
    I think Google does give back to community, in a way, they enable us to be a community.

    --
    ato
  85. Re: Sig, WAS: How do they do it? Two words by platipusrc · · Score: 1
    Not Found

    The requested URL http://joebaldwin.homelinux.com was not found on this server.

    Additionally, an error Joe's a fucking dimwit who destroyed half of his own website while migrating to Windows XP, like a dimwit was encountered while serving this request.

    Apache/2.0.49 (Win32) PHP/4.3.6 Server at localhost Port 80


    That IS amazing!

    --
    And the muscular cyborg German dudes dance with sexy French Canadians
  86. You have a Point here by AmericanInKiev · · Score: 1

    Say I just invested twenty years at Google figuring out search engines.

    Now figure I am selling my options.

    Now add that more people will buy them at a higher price if they are impressed with the number of computers.

    I think there is a big temptation for Google to expose whatever it has to expose if it means getting the option value up.

    After they cash out their options - google can compete, not compete or whatever - it will be the public's problem.

    AIK

  87. Re:Openness is the first casualty of going public? by martums · · Score: 1

    These two articles seem really similar. The article at Technology Review has a bit more detail, and also covers Akamai well. Interesting as heck. But is the first page copied a bit, or plagiarized?

    I'm going to go Google for some of those same sentences. ;-)

    --
    Those who would give up Essential Liberty to purchase a little Temporary Safety, deserve neither Liberty nor Safety
  88. Error results explained by Anonymous Coward · · Score: 0
    "and I honestly can't see what you are going on about: of the first ten results, eight highlighted the phrase in the page synopsis, one used the phrase as a domain name, and one included the partial phrase "...Or Not To Be."

    The two results you mention are obviously wrong: the one that used the phrase as the domain name obviously did not: it used an ALTERED version of the phrase (with all spaces removed). The partial phrase is so obviously off it needs no explanation, except to add, if partial phrase errors are OK, you are probably someone who would search for information on president "Andrew Jackson" and be happy that "Michael Jackson" results are in the list.

    "Those aren't irregular results: those are highly intelligent results."

    As intelligent as Homer Simpson. These are "D'oh!" results. I've seen this in many other searches: I ask for an exact thing and get many resulting pages that do not even contain it.

    "doesn't mean that they don't fulfill the purpose of the web search."

    I asked for A, they gave me B. It is a big bug: if you read Google's own search instructions, it says that it is not supposed to glitch out this way..

    "And the domain-name was found even though the spaces were omitted."

    The space-omitted version is a glitch, as I did not want that.

  89. Re:They have built an amazing system using Linux.. by NDPTAL85 · · Score: 1

    Anyone else notice its usually an anonymous coward who selfishly demands access to that which they do not deserve access to?

    --
    Mac OS X and Windows XP working side by side to fight back the night.
  90. Google linux by Anonymous Coward · · Score: 0, Interesting

    Yes, we need a Google Linux for the desktop. If Google sells out, what a shame; Warren Buffett will have taken over the planet. The whole idea being: how to make a lot of money without working for it.

    If work is defined as 'the expenditure of energy to the benefit of others that cannot be achieved by automation', then not many people accumulate money by honourable actions. In other words, they get paid without doing something for it.

    The lads at Google have done a marvelous job, and all the greedy fat capitalist bastards want to do is stick their noses in the trough and suck their sustenance. Whilst shagging the planet.

  91. Re:Openness is the first casualty of going public? by hazem · · Score: 1

    If I were going to become a serious investor (in my mind, $millions) in a particular company, I would want to know certain details about it.

    In your Coca-Cola example, I'd want to know some rough numbers about their production capacity. If they're planning on taking on a new market, will they be able to meet the demand? If one facility goes offline (maybe a terrorist attack), can their other facilities absorb the needed supply?

    I would want to know some similar things about google. I would need to know something about their infrastructure. How close are they to maxing their current resources? What will it take to add more? If they lose a data center, can they make up the load?

    One of the biggest problems in the market these days is that people invest lots of money in things they don't understand. Then they wonder why they lose all their money. There's going to be a lot of ignorant people who will be investing lots of money in google just because someone else told them it's the next hottest thing. It might be, but that's a dumb reason to put money in it.

  92. Re:Openness is the first casualty of going public? by nyseal · · Score: 1

    Having a PhD does NOT make one smarter, it just means that said person found (or has) the financial means to become more educated; or is it edumicated....I forget.

    --
    [SIG] Remember Mattel handheld games?
  93. IPO signals more World Poker Tour participants by mabu · · Score: 2, Interesting

    I can understand how in some cases an IPO can help generate revenue necessary to operate and break into new markets, but does this apply to Google? I really don't think so. They have market share; they have resources. Any infusion of funds to the company is more likely to give them the ability to further diversify and enter different markets, which history has shown is more often than not, a bad business idea.

    So one has to assume the IPO is the first phase of the principals "cashing out". The press will probably signal this as a sign of the next dot com boom, and a bunch of nerds within the company will suddenly become millionaires, and subsequently quit their job and open up a Bed & Breakfast in some obscure town or join the World Poker Tour. There goes the talent.

  94. How does Google do it? by thames · · Score: 1

    > How does Google do it?

    What kind of kinky question is that! I don't wanna know!

  95. Re:-1 baby by Anonymous Coward · · Score: 0

    What did I tell you? -1 baby! Guess where I am? Go on, guess!

    No.

    No.

    Yes, that's right, in heaven! Woohoo, now I dance.

  96. Do what? by Anonymous Coward · · Score: 0

    Suck ass? Google used to rock, but they've been fucking up BIG TIME lately. Couldn't care less how they tick. Probably much like Microsoft lately.

  97. Re:Openness is the first casualty of going public? by krosk · · Score: 2, Interesting
    Not necessarily. You can easily fudge this information. You don't have to relate any specific information; in fact Google could quite easily just say "$10,000 for capital investments" and everything would conform perfectly to GAAP (Generally Accepted Accounting Principles). Capital investments could be anything from computer servers, a new piece of land, a new building, or pens/pencils for all we would know. Google, through its financial statements, doesn't have to say exactly what they spent their cash on, just what category it fits in (Operating, Investing, and Financing).

  98. Re:Openness is the first casualty of going public? by AhBeeDoi · · Score: 3, Funny
    With google: before I give them my money, I would like to know how many servers they have, how close to capacity they are, what softwares they use (compatibility issues).
    Not to mention source code for custom applications, maintenance schedules, software upgrade schedules, standard permissions settings, root passwords, type and model of CPU cooling fans used, average uptimes and other relevant information which all prudent investors need.
  99. Re:Openness is the first casualty of going public? by Anonymous Coward · · Score: 0

    Remember, "There is no search business." There is however an advertising business.

  100. There is no search business. by Anonymous Coward · · Score: 0

    There is no search business. Read up on these: "Google is out of control" and "There is no search business".

  101. Re:Openness is the first casualty of going public? by Anonymous Coward · · Score: 0

    Are low interest student loans that hard to come by? Am I missing something?

  102. Gmail Screenshots by Anonymous Coward · · Score: 0
  103. Google remembers. . . by Fantastic+Lad · · Score: 0, Flamebait
    What can possibly be more revealing about a person than all the searches one makes, (and which are never forgotten.) What kind of searches have YOU made in the last two years?

    What drives the Google architecture? I dunno. Borrowed muscle/money from the shadow government?

    I mean, Google already has 'ex'-NSA guys (no such thing as 'ex') on their payroll.

    Ho hum. . .


    -FL

  104. Re:They have built an amazing system using Linux.. by DA-MAN · · Score: 1

    1) They provide an alternative to Microsoft. Not only search, it looks like they will give a blow to hotmail as well. They prevent MSN from becoming the portal. I think this is very important, people see things can be done better than the Microsoft way, and it can be done with Linux ;-)

    Very important. Microsoft thinks they own so much of the browser market that you now have to accept their self signed certs to log out of hotmail. This means non-MS people are pulled with a message saying that there is no Certificate Authority behind this site. What's next? Drop the global DNS system and only allow MS browsers to find your site???

    --
    Can I get an eye poke?
    Dog House Forum
  105. Google started to make me mad by Ilgaz · · Score: 0, Flamebait

    It's not Google, in fact; it's some geek coder who uses Google himself and forces USERS to use it too.

    I give you a list
    1) Safari
    2) Omniweb
    3) Opera 7.x versions
    4) Camino
    5) Of course, Mozilla

    Those browsers come with Google search as the default. In Safari it's more than .ini hacking: you must HACK THE EXE WITH A HEX EDITOR, I mean, the application, .app, whatever.

    Opera is commercial, and so is Omniweb. I understand they make money by referring searches to Google by default, just like paid bookmark inclusion. Of course, I send feedback to them too.

    Do I have to use Google? Do we all have to? As a guy who paid for OSX, I have to hack the Safari app itself just to use another engine?

    Oh, on OSX, guess which browser gives users choice for Search Engine? IE 5.2 :)

    1. Re:Google started to make me mad by XO · · Score: 2, Informative

      Chill out, brother.

      Try clicking in the address entry bar on Safari, and typing in "www.lycos.com", or whatever other search engine you would like to use.

      Just because the menu bar's search function pulls up google, doesn't mean you have to use it. Or did using a Mac for this long rot your brain to the point where you can only do things either the Mac way or the Extremely Difficult way?

      --
      "Champagne for my real friends - and real pain for my sham friends!" http://ericblade.postalboard.com/
    2. Re:Google started to make me mad by Ilgaz · · Score: 0, Flamebait

      Yes, the Mac rotted my brain. I won't click on the address bar and type Lycos.com; I will use another browser and send complaint feedback to Apple.

      I don't have to use Hixie's or whoever's favorite search engine on an OS I paid for.

    3. Re:Google started to make me mad by Anonymous Coward · · Score: 0

      Hey dipshit. Click on the search bar and then click the "add engines" option. It's revolutionary!

    4. Re:Google started to make me mad by Ilgaz · · Score: 1

      I show my finger to whoever "flamebait" this post. Yes, I pay for my OS, I don't download a pirated copy from Gnutella, and I expect an app, which is funded by APPLE who I pay, to RESPECT my choices. It's Safari in this case.

      Fucking NSA infested Google.

      Now THIS POST is a Flamebait... Like I fucking care.

    5. Re:Google started to make me mad by DC1 · · Score: 1

      I got your OS right there!

  106. Why 4.285 billion? by NotQuiteReal · · Score: 4, Interesting
    Just because the front page says "©2004 Google - Searching 4,285,199,774 web pages " doesn't mean you have to believe them. Maybe it is understated. For example, I just did a search on "the" and got:

    Results 1 - 10 of about 5,750,000,000 for the [definition]. (0.11 seconds)

    Doesn't that imply more than 4.285 billion?

    --
    This issue is a bit more complicated than you think.
    1. Re:Why 4.285 billion? by ThomK · · Score: 1

      Doesn't that imply more than 4.285 billion?

      You aren't accounting for duplicates there. Just because they have that many pages indexed it doesn't mean they aren't counting a lot of those multiple times.

      --

      TK

    2. Re:Why 4.285 billion? by GebsBeard · · Score: 1

      That's actually right in line with what I've been reading. Google has been notoriously tight lipped about their finances and technology in general. They tend to underplay everything to keep potential competitors at bay. It doesn't take a giant leap of faith to conclude that 4.2 billion page count (which conveniently maps into 32 bits unsigned) is just more of the same subterfuge.
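
      The "maps into 32 bits unsigned" remark is easy to check with arithmetic. A quick sketch, using only the two figures quoted in this thread; whether Google actually stores page IDs in 32 bits is the poster's speculation, not something the article confirms:

```python
UINT32_MAX = 2**32 - 1           # largest unsigned 32-bit value: 4,294,967,295

reported_pages = 4_285_199_774   # front-page figure quoted in the thread
the_estimate = 5_750_000_000     # "about" count quoted for the query "the"

fits_reported = reported_pages <= UINT32_MAX
fits_estimate = the_estimate <= UINT32_MAX
headroom = UINT32_MAX - reported_pages

print(f"reported count fits in 32 bits: {fits_reported}")
print(f"'the' estimate fits in 32 bits: {fits_estimate}")
print(f"headroom below 2**32: {headroom:,}")
```

      The front-page count sits just under the 32-bit ceiling, while the estimate for "the" is well above it, which is why the two numbers look inconsistent to the posters here.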

    3. Re:Why 4.285 billion? by mac_eng_ryan · · Score: 1

      Ok, just because your search of "the" resulted in 5.750 billion pages doesn't mean that they have indexed that many pages. This is because MANY pages may contain "the" many times. Remember that searches are based on content, not always just on the name of the site.

  107. Doing half as well as Google by alien_tracking_devic · · Score: 4, Funny
    From the article:

    "Google manages to achieve this with sophisticated techniques for rippling changes through the cluster, yet achieves 100 per cent uptime. This is serious stuff, and there are a lot of IT managers out there who would give their eye-teeth to be able to do it half as well."

    Sigh...as an IT manager I can only dream of 50% uptime. Damn you, Google!

    1. Re:Doing half as well as Google by Anonymous Coward · · Score: 0

      50% uptime? You must be running WindozeNT. Still, for Windoze that's pretty good.

  108. Interesting comparable... by ron_ivi · · Score: 1
    An interesting comparable for Google. Also quiet about most of their infrastructure; but they do answer some questions such as electricity bills, budget, etc...

    How much electricity they use
    - "the 2nd largest user of electrical power in Maryland. ... yearly electrical bill is more than $21 million. "
    How big in # of people and budget
    - "if ... considered a corporation in terms of dollars spent, floor space occupied, and personnel employed, it would rank in the top 10 percent of the Fortune 500 companies."

    Wonder how google ranks in those metrics - and we may get a good ballpark feel of how much data they can store and process.

    1. Re:Interesting comparable... by atrizzah · · Score: 1

      Are you a crackhead? Neither of those statistics have anything to do with google

  109. This will be the beginning of the end for Google by rpsoucy · · Score: 2, Interesting

    Wall Street should be seen for what it is: a plague upon American businesses and innovation.

    You get your initial investment, which seems great, but then you sell your soul. You will be forced to "cut the fat" and "yield higher short-term profits" and all the research projects that make tech companies great will vanish.

    This has happened with almost every great American tech company. How often do we see the type of research that came out of Bell Labs today? We don't; instead we see former researchers who were once considered the "crème de la crème" of computer scientists out looking for work (most taking up teaching positions at universities).

    Along with the pressure of Wall Street, Microsoft will be releasing their direct competitor to Google soon and they will be pushing hard for industry domination.

    Wall Street is the reason that our tech jobs are going to India; Wall Street is the reason that America is slowly becoming less and less of the technological superpower that it used to be.

    IMHO, Google should stay out of Wall Street and keep doing what it has been doing.

    Then again, there are plenty of examples of companies that had a lot of hype for an IPO and are still strong and innovating today, VA Linux Systems for example, oh, I mean VA Software, and their one product that is slowly being made obsolete by Free and Open Source alternatives.

  110. Re:Openness is the first casualty of going public? by cookie_cutter · · Score: 1
    "how many servers google has, what their specifications are"

    Those two points are irrelevant. Google is all about software. The hardware is whatever they can pick up cheap. You may be able to tell how integer or floating point their calcs are, but that's pretty useless to any other company as well.

    As for their current commercial strategy, take over the world, I think we all know that.

  111. Re:Openness is the first casualty of going public? by falsified · · Score: 1

    That's because you (by you I mean we) are a geek. I wanna know too. So do most people here. Because we're geeks. Okay? We're fucking geeks. Dammit.

    --
    HI, MY NAME IS ISAAC.
  112. Mod parent down by Anonymous Coward · · Score: 0

    The two examples he mentioned did not contain the phrase.

  113. Cap on shares by sashang · · Score: 1

    I don't know much about shares and stocks and stuff, but I've heard complaints in previous threads about 'greedy wall street bastards' buying heaps of shares to make a profit. Why doesn't Google set a maximum number of shares per person/organization? Then it would encourage a more even distribution of wealth.

  114. Idiot #2 by Anonymous Coward · · Score: 0
    "(You have to put quotes around a phrase to get results that contain it as you typed it)"

    Read the original item, and try the search. I used quote marks around the phrase, and it still came up with bogus results.

  115. Idiot #3 by Anonymous Coward · · Score: 0
    "There you'll see that while the pages themselves do not contain "to be or not to be", pages linking to that page did."

    This is clearly a bug. If you read Google's own documentation, it says that it returns pages actually containing the phrase. Not some useless "pages linked to" thing. The "pages linked to" is to determine ranking, not actual determined results.

    "and my understanding is the anchor text is where "to be or not to be" can be found for these two pages you mention."

    ...but not at all in the two pages.
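
    The quoted explanation (a page can be returned because links pointing at it carry the phrase) can be sketched with a toy index. This is an illustration under stated assumptions: the two pages, the `build_index` and `search` helpers, and the flag `include_anchor_text` are all hypothetical and say nothing about how Google's index is really built.

```python
from collections import defaultdict

# Hypothetical toy web: URL -> (body text, list of (anchor text, target URL)).
pages = {
    "a.example": ("Hamlet said: to be or not to be.",
                  [("to be or not to be", "b.example")]),
    "b.example": ("A page about Shakespeare's plays.", []),
}

def build_index(pages, include_anchor_text):
    """Map each URL to the texts it is searchable by."""
    searchable = defaultdict(list)
    for url, (body, links) in pages.items():
        searchable[url].append(body.lower())
        if include_anchor_text:
            # Credit the link's anchor text to the page being linked TO.
            for anchor, target in links:
                searchable[target].append(anchor.lower())
    return searchable

def search(index, phrase):
    """Return URLs where any searchable text contains the phrase."""
    phrase = phrase.lower()
    return sorted(url for url, texts in index.items()
                  if any(phrase in text for text in texts))

body_only = search(build_index(pages, include_anchor_text=False), "to be or not to be")
with_anchors = search(build_index(pages, include_anchor_text=True), "to be or not to be")
print(body_only, with_anchors)
```

    With anchor text included, `b.example` is returned even though its own body never contains the phrase, which is the behaviour one side calls a feature and the other calls a bug.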

  116. Google phrase searches don't work very well by Anonymous Coward · · Score: 0
    "Just because they aren't deterministic enough for you to plug them into a piece of code of your own construction (without compensating Google) doesn't mean that they don't fulfill the purpose of the web search."

    No, this proves that for this, and other phrase examples, Google does not fulfill the purpose of the web search, since two of the top 10 pages do not contain the phrase asked for and are thus errors. Chaff. Trash links.

    I don't plug code and do Google searches from that; I just do searches from www.google.com. I've noticed that Google's returned results are very buggy. Other search engines like Altavista have no problem with this: their returns are 100% accurate. If it is so easy that Altavista can do it, why not Google?

  117. annoying ads by 602 · · Score: 2, Funny

    It's a good article, but the page as a whole is annoying, due to several animated ads. I won't put up with that shit. I copied the text to my word processor for reading.

    1. Re:annoying ads by shish · · Score: 1
      There are ads? Where? I don't see any...

      Are you not using FireFox or something?

      --
      I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
  118. Re:Openness is the first casualty of going public? by builderbob_nz · · Score: 0

    FruitCompanyA uses insecticide whereas FruitCompanyB doesn't. I personally would choose FruitCompanyA as a rise in the insect population would ruin FruitCompanyB.

    Good example, but I guess it's my turn to be nit-picky. Growing up around apple orchards teaches you a lot about how to grow apples. One lesson learned is that the best way to stop insects is not to use insecticide, but to use other insects, aka natural enemies (ever wonder why NZ-grown apples are so popular around the world?).

    Again a good example, as a good investor would have an understanding of where they are putting their money; otherwise they would be better off going to somewhere like Las Vegas and putting it all on a black-jack table.

    --

    Karma? Hey I just call it as I see it.
  119. Semi random? by NotQuiteReal · · Score: 1
    Sorry to reply to myself, but here it is, just a few hours later, and now I get:

    Results 1 - 10 of about 5,660,000,000 for the

    When I search google for "the".

    So, 90 million pages just vanished?

    --
    This issue is a bit more complicated than you think.
  120. What this article didn't mention... by rocksh · · Score: 0

    "Google cluster actually has 100,000 servers" "More than half the company's 1,000 employees are techies" => Google has 100 servers/employee => Google is an IT company

    --
    >
  121. Re:Openness is the first casualty of going public? by Lord_Dweomer · · Score: 2, Interesting
    "With Google, their entire "business" - their means of generating cash flow - relies on sheer quantity of computing muscle and high performance software for their search databases."

    Actually, their means of generating cash flow relies on how beneficial advertisers feel it is to advertise on Google.

    --
    Buy Steampunk Clothing Online!
  122. Re:Openness is the first casualty of going public? by Smidge204 · · Score: 1

    Which is directly tied to their "brand". Google is a household name because they provide fast, relevant search results with a clean interface and relevant ads. This is only possible because of the hardware and software they run. It is what made them famous, and it is what keeps them up front. Their hardware and software are critical to their livelihood.

    This is not the case with a company like General Electric.
    =Smidge=

  123. GUGL? GOOG? GPLX? by grikdog · · Score: 1

    Any idea what symbol Google plans to trade under? Is Google-IPO.com the best source for news?

    --
    ``Tension, apprehension & dissension have begun!'' - Duffy Wyg&, in Alfred Bester's _The Demolished Man_
  124. Mathematical Billion? by Syphtor · · Score: 1

    Out of curiosity, does anyone know if that is a Metric Billion or US Billion?

    The difference being:
    Metric Billion is 1 million, million ie:
    1,000,000,000,000

    Whereas US Billion is 1 thousand million ie:
    1,000,000,000

    A fair three orders of magnitude in difference! The Metric Billion is also referred to as a Mathematic Billion. The US Billion is also referred to as a European Milliard.

    --
    It's in that place where I put that thing that time
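
    For reference, the two conventions in the comment above are usually called the long scale (a billion is a million million) and the short scale (a billion is a thousand million). A small sketch of the arithmetic, applied to the page count quoted elsewhere in this thread; the constant names are illustrative only:

```python
SHORT_SCALE_EXP = 9    # short scale: 1 billion = 10**9 (a thousand million)
LONG_SCALE_EXP = 12    # long scale:  1 billion = 10**12 (a million million)

short_billion = 10**SHORT_SCALE_EXP
long_billion = 10**LONG_SCALE_EXP

ratio = long_billion // short_billion                     # how many times larger
orders_of_magnitude = LONG_SCALE_EXP - SHORT_SCALE_EXP    # gap in powers of ten

reported_pages = 4_285_199_774                            # figure from Google's front page
in_short_billions = reported_pages / short_billion
in_long_billions = reported_pages / long_billion

print(ratio, orders_of_magnitude)
print(f"{in_short_billions:.3f} short-scale billions")
print(f"{in_long_billions:.6f} long-scale billions")
```

    Since the front page spells the count out digit by digit, "4.285 billion" can only be the short-scale (thousand million) reading.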
  125. Re:Openness is the first casualty of going public? by nacturation · · Score: 1

    Having a PhD does NOT make one smarter, it just means that said person found (or has) the financial means to become more educated...

    Sure, anyone can buy a PhD from a diploma mill. That's not very hard. But reputable PhDs, the kind that Google would want to hire, have had to learn a lot of material, prepare a thesis, successfully defend that thesis and demonstrate their broad as well as in-depth knowledge of the subject. But I might be out to lunch on this one -- why don't you buy yourself a PhD for $199 and apply for a job at Google. Let us know how the interview goes, ok?

    --
    Want to improve your Karma? Instead of "Post Anonymously", try the "Post Humously" option.
  126. Re:Openness is the first casualty of going public? by logicnazi · · Score: 1

    Yes, but why couldn't the response simply be something like: "We believe we currently have the computing capacity to handle Y hits per second. It is evenly distributed across X locations, with the destruction of i of those facilities leaving us with (X - i)/X of Y hits per second. We can add additional hardware at $Z per 10,000 hits."

    Nothing in the information you asked for, other than the peak load they can handle, requires them to answer how many machines, what each machine can do, etc.

    --

    If you liked this thought maybe you would find my blog nice too:

  127. great mp3 on it here by kernel2 · · Score: 1

    There's an older but great mp3 of how Google is set up at ddj's technetcast website. The speaker is Jim Reese, Chief Operations Engineer at Google.

    Link

    PS. On that website, I think the link to the mp3 doesn't work, but if you manually ftp into the server and get the file, it's fine.

  128. Why go to Wal-Mart? by Anonymous Coward · · Score: 0

    Because Wal-Mart does not force its workers to join political organizations. Costco does.