Slashdot Mirror


Google Doubles Server Farm

Mitch Wagner writes "Here's our followup story on Google's colossal server farm. When we first wrote about Google last spring, they had 4,000 Linux servers, now they run 8,000. Last year we focused on the Linux angle, this year we thought it was more interesting to go into the hardware, giving a little detail about some of the things Google has to do to build and run a server farm that big." Impressive. I always think our 8 boxes are cool, until I see this kinda thing.

65 of 258 comments

  1. Pictures! by Anonymous Coward · · Score: 3

    I want to see pictures.

  2. Re:Kudos to Google by Anonymous Coward · · Score: 3

    and without whoring themselves

    I have to say it's so nice not having a giant animated "Punch the monkey for $20" at the top of the screen. With Google, you actually have to look for the ads to see if there are any. It would be nice if a few other major sites learned something from this. What would that lesson be? Giant flashing ads only annoy people and do not bring in new customers.

  3. Re:Why? by Bill+Currie · · Score: 3
    IMO, it's not the CPU power they're after (though it doesn't hurt), it's the io bandwidth. Think of it as a giant RAID array. Assuming their systems can pull 20MB/s off the hdds, that's 160000MB/s (or 156.25GB/s) total bandwidth (ignoring overheads).
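A quick check of the arithmetic above, as a minimal Python sketch. The 20MB/s per-server rate is the poster's assumption, the function name is made up for illustration, and the conversion uses 1024MB per GB because that is what reproduces the 156.25GB/s figure.

```python
MB_PER_GB = 1024  # binary convention; this is what yields the 156.25 figure


def aggregate_disk_bandwidth(servers: int, mb_per_sec_each: float) -> tuple[float, float]:
    """Return total read bandwidth as (MB/s, GB/s), ignoring all overheads."""
    total_mb = servers * mb_per_sec_each
    return total_mb, total_mb / MB_PER_GB


# 8000 servers, each assumed to pull 20MB/s off its disks.
total_mb, total_gb = aggregate_disk_bandwidth(8000, 20)
print(f"{total_mb:.0f} MB/s = {total_gb:.2f} GB/s")
```

This is an upper bound only: it assumes every disk streams at full rate at the same time and that nothing else (network, bus, CPU) gets in the way.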

    Bill - aka taniwha
    --
    Leave others their otherness. -- Aratak

  4. google modifications available by Brigadier · · Score: 3



    I'm curious whether the optimizations made by Google are readily available to the public, i.e. GNU,

  5. "Google downloads Red Hat for free" by cpeterso · · Score: 3


    "Google downloads Red Hat for free, taking advantage of the company's open source distribution. And Linux's open source nature allowed Google to make extensive modifications to the OS to meet its own needs, for remote management, security and to boost performance."

    I'm sure Red Hat is upset that they are missing out on the sale of 8000+ Linux licenses!! :-) Maybe they should block downloads from the *.google.com domain.

    1. Re:"Google downloads Red Hat for free" by shyster · · Score: 5
      I'm sure Red Hat is upset that they are missing out on the sale of 8000+ Linux licenses!! :-) Maybe they should block downloads from the *.google.com domain

      I imagine they only download it once, then distribute via LAN. Besides, from last year's coverage, "Google actually paid for only about 50 copies of Red Hat, and those purchases were more of a goodwill gesture. "I feel like I should be nice, so when I go to Fry's I pick up a copy," Brin said."

  6. Re:Electric bill by the+eric+conspiracy · · Score: 3

    Buffalo, NY would have to be the ideal location for this. Cold as hell, and right next to the Niagara hydro plant for cheap power.

  7. Re:Kudos to Google by Chewie · · Score: 3

    Well, Google has recently added paid links near the top of searches (but, thankfully, they've taken pains to identify them as such). Also, they make a metric buttwad of money licensing out their search engine to other sites (Yahoo!(TM) anyone?).

    --
    49 20 68 61 76 65 20 74 6F 6F 20 6D 75 63 68 20 66 72 65 65 20 74 69 6D 65 2E
  8. Re:What about hardware maintenance by crimoid · · Score: 3

    With all those machines you could just pull the dead ones out of service and leave them there until you wanted to do periodic maintenance (at which time you simply yank out the dead ones, replace them, flip on the power switch and walk away). Assuming you've got some clever auto-assimilation software you may not even need to configure the box manually.

  9. Really doubled or part of a cost cutting move? by jonathanclark · · Score: 3

    As part of the infrastructure expansion, Google is consolidating. The company is moving out of datacenters in the San Francisco Bay and Washington D.C. areas, and consolidating in a new facility in the D.C. area. That means Google is moving from five to four datacenters--this, after adding three datacenters in the past year or so.

    I wonder if they really need that many servers, or whether they doubled their size in order to have a seamless transition during the move? I.e., get the new site up and running and handling load, and then take down the old site? Maybe they will sell off the old computers instead of moving them. This could just be PR spin to say "we doubled our size." Just devil's advocate conjecture, but they are probably moving to DC from SF to save money on space, so this is more of a cost-cutting thing than anything else.

    Don't get me wrong, I love Google and use it everyday, but I don't see any reason they would suddenly double their capacity.

  10. MG (Managing Gigabytes) by harmonica · · Score: 3
  11. Re:it's still not as 31337... by DonkPunch · · Score: 3

    Interesting slogan on those shirts.

    http://www.elj.com/elj-quotes/elj-quotes-1999.html

    --

    Save the whales. Feed the hungry. Free the mallocs.
  12. Re:a petabyte?!!?! by segmond · · Score: 3

    4 copies of Microsoft Windows 2100.

    --
    ------ Curiosity killed the cat. {satisfaction brought it back | it didn't die ignorant | lack of it is killing mankind
  13. Re:im not really clear on.. by MustardMan · · Score: 3

    A bit of a correction to my own point, it's not a petabyte database, that petabyte of storage contains several hundred copies of the database. It's still a friggin LOT of data.

  14. Re:Why not Windows 2000? by eric17 · · Score: 3

    Well, $120 per license is a pretty good deal. Maybe the government should get the same deal for us citizens. For 150 million copies, the discount should be down to say, $100 a copy. That's only $15 billion, just a drop in the bucket for rich old uncle sam, and just a bit more than half of M$'s yearly revenues, so it won't hurt them either, but OMG--think of the savings!

  15. Re:a petabyte?!!?! by Tackhead · · Score: 3
    > petabyte == 1million gigabytes
    > can you just imagine how much _______ (insert your choice: mp3s, pr0n, divX;), etc) you could store! damn. *drool*

    A full USENET feed (including binaries) is about 250GB per day (yes, about an OC-3 saturated), and growing at 50-60% per year.

    One petabyte works out to only four more years of future USENET, give or take 50%.

    Scary, ain't it?
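The four-year estimate above can be roughly checked with a short Python sketch. It assumes a petabyte is 1 million gigabytes (as in the quoted post), takes the 250GB/day and 50-60% growth figures from the comment, and applies the growth in one step at the end of each year, which is a simplification. The function name is hypothetical.

```python
def years_to_fill(capacity_gb: float, daily_gb: float, annual_growth: float) -> int:
    """Count whole years of feed needed before the stored total reaches capacity.

    Growth is applied once per year (a simplification of continuous growth).
    """
    stored = 0.0
    yearly = daily_gb * 365  # volume received during the first year
    years = 0
    while stored < capacity_gb:
        stored += yearly
        yearly *= 1 + annual_growth
        years += 1
    return years


PETABYTE_GB = 1_000_000
for growth in (0.5, 0.6):
    print(f"growth {growth:.0%}: full during year {years_to_fill(PETABYTE_GB, 250, growth)}")
```

Under these assumptions the petabyte fills up during the fifth year at either growth rate, that is, after four full years, which is in line with the poster's estimate.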

  16. Compression by dopolon · · Score: 3

    They actually use some compression algorithm (gzip I think) to compress the pages of the cache, because it would be silly to keep a complete uncompressed mirror of the cache, since it's a feature that's probably used by only 20% of users.

    --
    "The obvious mathematical breakthrough would be development of an easy way to factor large prime numbers." Bill Gates,
  17. Re:im not really clear on.. by turbodog42 · · Score: 3

    Well, when was the last time you searched on Google? It has a stunning amount of servers indexed. I can search for just about anything, and Google always finds more accurate hits, faster, than any other search engine. (Don't turn this into a search engine flame war, either.) They have to constantly refresh their indexes, and they have to turn around fast answers.

    Yeah, 1.3 billion pages indexed is stunning. But even more stunning is the fact that the total number of "pages" (an overly broad term, I concede) on the Internet is at least 100, if not 500, times that size. Basically, Google is behind on indexing by 2 to 3 orders of magnitude.

    It's true that they constantly refresh their index. But it takes them about 2 months to do it. That ain't fast no matter how you look at it. As evidence, take a look at the date on the cached CNN.com home page.

  18. Electric bill by HerrGlock · · Score: 3

    I wonder which gives them the highest electric bill, the servers themselves or the airconditioner required to do it?

    I'd just give up and get a handful of S/390s and do the same thing.

    DanH
    Cav Pilot's Reference Page

    --
    UNIX - Not just for Vestal Virgins anymore
  19. Re:Argghh by ichimunki · · Score: 3

    What good would open source search engine code do? Unless you wrote it in such a way that it ran on some sort of distributed basis, only your direct competitors would have the hardware to run it. I mean, Google is in the business of providing search results. If they give away the software that does this, anyone with a server farm can build the same engine. Now if they were a not-for-profit company (you know, a charity) or a volunteer effort like DMOZ, then I could see it, but I expect the stakeholders at Google prefer black ink on their bottom line.

    Free software makes all kinds of sense when users demand it, especially when it comes to operating systems, programming languages, and "productivity" applications. But it makes zero sense for a company who has not only written the software, but has the only machine running that software, to give away the software.

    --
    I do not have a signature
  20. Re:But..how do they finance? by kinnunen · · Score: 3
    http://www.google.com/corporate/index.html (under business mode).

    Also, do a search for "porn". Ads.

    --

  21. And this is good? by update() · · Score: 3
    Disclaimer: I don't know anything about enterprise-scale IT. If I'm saying something ridiculous, let me know!

    That said, I'm surprised by the positive slant on this story. 8000 boxes that have to be separately administered? This is cost-effective (and environmentally sound) compared to a small number of heavy-hitter Solaris, AIX or Tru64 systems? I have to say I was a lot more impressed by hearing what cdrom.com does with a single FreeBSD system than by how many Linux boxes Google has had to cobble together.

    I've got to wonder - if this were a story about 8000 W2K servers powering Hotmail, would it get the same spin?

    Unsettling MOTD at my ISP.

    1. Re:And this is good? by bellings · · Score: 5
      8000 boxes that have to be separately administered?

      Why would 8,000 identical boxes be difficult to administer? The guys that develop the monitoring software and the install and upgrade processes are probably pretty smart cookies. But the actual maintenance of the machines could probably be handled by monkeys.

      Think about it: the instructions for handling a hardware failure in one of these machines are probably:
      1. Identify bad part.
      2. Replace bad part with any of the two dozen exactly identical parts we keep in the spare parts closet.
      3. Put system recovery CD in drive.
      4. Reboot.
      5. Remove system recovery CD when it automatically ejects at the end of the recovery process.
      6. If this doesn't work, call our system engineer at 555-1212.
      The spare parts closet probably just has boxes with labels like: "This box contains 80GB Maxtor hard drives -- exact match for every hard drive in rack 5, 7, and 8." Another box might be labeled: "AMI A571 motherboards -- exact match for all motherboards in rack 1, 2, 3, 4, and 7."

      Another box in the closet is probably labeled "Empty, pre-labeled Fed-Ex shipping boxes that are exactly the right size for our rack mounted hardware. Use to ship any badly broken machines back to our system engineer. Call first!"
      --
      Slashdot is jumping the shark. I'm just driving the boat.
  22. Why? by rabtech · · Score: 3

    Why bother to put together 8,000 Linux boxes, when one could obtain high-powered 64-bit computers to accomplish the same task?

    You can always go with Tru64, W2K Datacenter, AIX, et al.

    It would be interesting to figure out how much high-powered hardware would be required to replace those 8,000 boxen and the software to run it, and see if it comes out less or more than running the 8k separate Linux boxes.
    -------
    -- russ

    "You want people to think logically? ACK! Turn in your UID, you traitor!"

    --
    Natural != (nontoxic || beneficial)
    1. Re:Why? by Chewie · · Score: 5

      Several points here: W2K DC doesn't run 64-bit, at least not until Itanium is released. Second, for something like this, there are two reasons to do a large server farm: scalability and throughput. They said that they do not have one monolithic storage system, but instead partition the database up into small segments in the servers themselves. This means that they can handle many more I/Os per second than one (or several) big iron boxes could do. Also, those big 64-bit boxes are damn expensive (both hardware and software). For the price of one of those, you can get cheap servers and cluster them together. The big iron boxes are great for large databases that can't be split up among several servers/storage systems, but if you can split the database up (as they have done), a farm of small servers will always provide better scalability and throughput than one big box. And aren't those two things the secret behind the web game?
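To make the "split the database up among many small servers" idea concrete, here is a minimal Python sketch of a document-partitioned index with scatter-gather search. It is not Google's design; every class and method name is hypothetical, the "servers" are just in-memory objects, and the CRC32-based placement is one arbitrary way to spread documents.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


class Shard:
    """One small 'server' holding an inverted index for its slice of documents."""

    def __init__(self) -> None:
        self.index: dict[str, set[str]] = {}

    def add(self, doc_id: str, text: str) -> None:
        for term in text.lower().split():
            self.index.setdefault(term, set()).add(doc_id)

    def search(self, term: str) -> set[str]:
        return self.index.get(term.lower(), set())


class ShardedIndex:
    """Assigns each document to one shard and queries all shards in parallel."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [Shard() for _ in range(num_shards)]

    def _shard_for(self, doc_id: str) -> Shard:
        # Deterministic placement: same document always lands on the same shard.
        return self.shards[zlib.crc32(doc_id.encode("utf-8")) % len(self.shards)]

    def add(self, doc_id: str, text: str) -> None:
        self._shard_for(doc_id).add(doc_id, text)

    def search(self, term: str) -> list[str]:
        # Scatter the query to every shard, then merge the partial results.
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partials = list(pool.map(lambda shard: shard.search(term), self.shards))
        return sorted(set().union(*partials))


idx = ShardedIndex(num_shards=4)
idx.add("doc1", "Google runs Linux servers")
idx.add("doc2", "Linux on cheap hardware")
idx.add("doc3", "Big iron mainframes")
print(idx.search("linux"))
```

Each shard only ever touches its own slice, so adding shards adds independent disks and I/O paths, which is the throughput argument made above. A real system would also replicate each shard so that a dead box does not lose part of the index.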

      --
      49 20 68 61 76 65 20 74 6F 6F 20 6D 75 63 68 20 66 72 65 65 20 74 69 6D 65 2E
  23. Google architecture by SpaceLifeForm · · Score: 3

    If you want to really know how it works.

    http://www-db.stanford.edu/~backrub/google.html
    Note: the document was written in 1998.
    Two snippets:
    6.3 Scalable Architecture

    Aside from the quality of search, Google is designed to scale. It must be efficient in both space and time, and constant factors are very important when dealing with the entire Web. In implementing Google, we have seen bottlenecks in CPU, memory access, memory capacity, disk seeks, disk throughput, disk capacity, and network IO. Google has evolved to overcome a number of these bottlenecks during various operations. Google's major data structures make efficient use of available storage space. Furthermore, the crawling, indexing, and sorting operations are efficient enough to be able to build an index of a substantial portion of the web -- 24 million pages, in less than one week. We expect to be able to build an index of 100 million pages in less than a month.

    9.1 Scalability of Google

    We have designed Google to be scalable in the near term to a goal of 100 million web pages. We have just received disk and machines to handle roughly that amount. All of the time consuming parts of the system are parallelize and roughly linear time. These include things like the crawlers, indexers, and sorters. We also think that most of the data structures will deal gracefully with the expansion. However, at 100 million web pages we will be very close up against all sorts of operating system limits in the common operating systems (currently we run on both Solaris and Linux). These include things like addressable memory, number of open file descriptors, network sockets and bandwidth, and many others. We believe expanding to a lot more than 100 million pages would greatly increase the complexity of our system.

    --
    You are being MICROattacked, from various angles, in a SOFT manner.
  24. Re:Further info on box specs? by shyster · · Score: 3
    Anybody out there have more nitty gritty details on the specs of the latest boxes added? I am interested in CPU speeds, gigabit ethernet, RAM. 8000 of these things! The mind boggles...

    Evidently, they shun multiprocessor boxes, use big & fast IDE drives (2 per PC, one on each IDE channel), and from last year's article, use 100 Mbps links on the racks, with gigabit links between the racks. Last year's article also quotes "256 megabytes of memory and 80 gigabytes of storage", though I imagine it's closer to 512MB (at least) and 180 GB per server now. It also says that they pack them in 1U on each side of a rack.

    But, here's the kicker, "Many of the systems are based on Intel Celeron processors, the same chips in cheap consumer PCs."!

  25. Re:a petabyte?!!?! by chris_mahan · · Score: 3

    The point of failure thing is a good point. If 10% of their servers fail (800) they still have 7200 that work fine, and they can probably handle things just fine.

    If 50% of their servers fail, then they would be slow, but still work fine.

    If 90 percent of their servers failed, they would still have 800 up. It would be very slow, but might still handle the load.

    If you had 1000 servers with disk array and your system failed, then ouch!

    On the other hand, they probably have half a dozen burned CDs of their implementation of Linux (depending on the HW configuration), so if a server fails, they take it offline, put another one in its place, load the OS already preconfigured from the CD (with all conf and stuff done already) and bring it online.

    One tech can probably put 10 servers online a day.

    So 30 techs can probably put up 300 servers a day.

    Assuming each Linux box operates without admin intervention for 90 days, there would be 88 boxes that need to be fixed each day (about 1%), and so 9 techs could handle it.

    They probably have more than that.

    And since the technology is not hard to understand because it's a dual pentium PC, they don't have to call the IBM mainframe guy over. Also, they probably have a few dozen servers already configured, ready to be popped into the rack.
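The back-of-the-envelope staffing numbers above can be reproduced with a small Python sketch. All inputs (8000 servers, 90 days between interventions, 10 servers per tech per day) are the poster's guesses, and the function names are made up for illustration.

```python
import math


def surviving(fleet: int, failed_percent: int) -> int:
    """Servers still running after the given percentage of the fleet has failed."""
    return fleet - fleet * failed_percent // 100


def repairs_per_day(fleet: int, days_between_interventions: int) -> float:
    """Average number of boxes needing attention each day."""
    return fleet / days_between_interventions


def techs_needed(repairs: float, servers_per_tech_per_day: int) -> int:
    """Whole technicians required to keep up with the daily repair load."""
    return math.ceil(repairs / servers_per_tech_per_day)


for pct in (10, 50, 90):
    print(f"{pct}% failed -> {surviving(8000, pct)} servers still up")

daily = repairs_per_day(8000, 90)
print(f"{daily:.1f} repairs/day -> {techs_needed(daily, 10)} techs")
```

The averages hide bursts (a bad batch of drives, a power event), which is presumably why the poster guesses that the real staff is larger.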

    --

    "Piter, too, is dead."

  26. Why not Windows 2000? by dougel · · Score: 3

    I mean, why not... Really: Windows 2000 Server OEM at $642.60 times 8000 PCs is only $5,140,800. Now, for the peace of mind that comes with a crash-proof Windows box, why would Linux even be an alternative? The worst part about this post is that there are MCSEs who are reading it and saying "right on, my brainwashed friend!" =-=-=- Doug

  27. Re:Seen it by Anonymous Coward · · Score: 4

    Funny story. Google got into the Virginia facility when Globalcenter owned the datacenter. Before Google, the sales people would only sell "floor space". Google's one and a half cages, jammed full of 1U Linux boxes, pulled so much power that it rendered 6 surrounding cages unsellable. After that, sales people began selling "Amp capped floor space" rather than just square ft.

  28. Re:Why do you think Google needs 8000 servers? by ch-chuck · · Score: 4

    Microsoft would be woefully inefficient in that environment

    ... 8000 Msft boxen is probably getting to the point where you'd need 3 shifts of McSE's full time just to reboot the damn things - kinda like the days they made computers with so many vacuum tubes that their failure rate caught up with them, and it would barely run before another tube needed replacing.

    --
    try { do() || do_not(); } catch (JediException err) { yoda(err); }
  29. I'd bet they've already done the math by pivo · · Score: 4
    Considering that they're not necessarily Linux advocates, I'd imagine they did that calculation *before* buying all those machines.

    In any case, they'd have done it at some point along the line before the 8000th server arrived, and if they found they were making a mistake I can't see why they wouldn't have switched by now. Especially since if they thought NT would somehow be so much better they could have just removed Linux and installed NT and not have had to buy more hardware.

    Sounds like Linux is working out pretty well for them.

  30. Re: Multithreaded TCP/IP stack by kinkie · · Score: 4


    Let's recap how a single packet is to be handled (and probably I forgot something):
    you get the ethernet interrupt, you have to DMA the frame off the board, check to what protocols it belongs (if it's not IP, drop), checksum, check if you have to do any reassembly, check what protocol it is (it might not be TCP after all), check that the packet makes sense given the connection's history (i.e. sequence numbers and various other bits here and there), identify the process waiting for the packet, copy to userspace, signal process.
    A multithreaded TCP/IP stack means that more than one packet can be in the pipeline at the same time. It makes no real difference on a UP (uniprocessor) system, but on an N-processor system it can multiply your throughput by N (at least theoretically), just as a multithreaded app could increase throughput on a multiprocessor system.
    Of course, to be feasible, as many parts of the stack as possible must be reentrant, or you'll have to do locking and thus (in MS-ese) "serialize".
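The per-packet steps listed above can be sketched as a toy pipeline in Python. This only illustrates the structure (reentrant checks running in parallel, with a lock around shared connection state); it is not how any real kernel does it. The checksum is a stand-in, out-of-order segments are simply dropped instead of buffered, non-TCP traffic is dropped instead of being handed to another protocol, and all names are hypothetical. Python threads also do not give real CPU parallelism, so this shows the locking pattern, not a speedup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

ETHERTYPE_IP = 0x0800
PROTO_TCP = 6
PROTO_UDP = 17


def toy_checksum(data: bytes) -> int:
    """Stand-in for the real IP/TCP checksums."""
    return sum(data) & 0xFFFF


@dataclass(frozen=True)
class Frame:
    ethertype: int
    protocol: int
    conn_id: int
    seq: int
    payload: bytes
    checksum: int


class ToyStack:
    def __init__(self) -> None:
        self._lock = threading.Lock()               # guards the two dicts below
        self.next_seq: dict[int, int] = {}          # conn_id -> expected sequence number
        self.delivered: dict[int, list[bytes]] = {}  # conn_id -> payloads handed to the "process"

    def handle(self, frame: Frame) -> str:
        # These checks touch no shared state, so any number of threads can run them at once.
        if frame.ethertype != ETHERTYPE_IP:
            return "drop: not IP"
        if toy_checksum(frame.payload) != frame.checksum:
            return "drop: bad checksum"
        if frame.protocol != PROTO_TCP:
            return "drop: not TCP"
        # Connection history is shared, so this part is serialized with a lock.
        with self._lock:
            expected = self.next_seq.get(frame.conn_id, 0)
            if frame.seq != expected:
                return "drop: unexpected sequence number"
            self.next_seq[frame.conn_id] = expected + len(frame.payload)
            self.delivered.setdefault(frame.conn_id, []).append(frame.payload)
        return "delivered"


def make_frame(conn_id, payload, *, ethertype=ETHERTYPE_IP, protocol=PROTO_TCP, seq=0, corrupt=False):
    checksum = toy_checksum(payload) ^ (1 if corrupt else 0)
    return Frame(ethertype, protocol, conn_id, seq, payload, checksum)


frames = [
    make_frame(1, b"hello"),
    make_frame(2, b"world"),
    make_frame(3, b"arp?", ethertype=0x0806),
    make_frame(4, b"oops", corrupt=True),
    make_frame(5, b"dns", protocol=PROTO_UDP),
    make_frame(6, b"late", seq=100),
]

stack = ToyStack()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(stack.handle, frames))
print(results)
```

The single coarse lock is the "serialize" point mentioned above: the more of the work that sits inside it, the less a second processor helps. A finer-grained design might use one lock per connection.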

    --
    /kinkie
  31. Seen it by travisd · · Score: 4

    I've seen their cage out at Exodus in Virginia. Pretty cool.. They have like 6 racks of servers there - each rack is 80 servers I believe. They use systems from Rackable. Generally in a hosting facility you pay per rackspace and bandwidth -- more servers/rack means less cost/month in space.

  32. can you say pr0n? by Ender+Ryan · · Score: 4

    I thought I was really cool with my 100 gigs of storage at home filled with DivX ; ) movies and MP3s. 1 million gigabytes, that's insane.

    Ok, new poll

    What do you think is stored at Google?
    1. Huge search engine index
    2. Pr0n
    3. MP3s
    4. DivX ; ) Movies
    5. DivX ; ) Pr0n
    6. Marketing data collected with satellites and video cameras attached to flies... just like MLB
    7. Cowboyneal's transporter pattern buffer

    note: I own _MOST_ of the mp3's and divx movies I have...

    --
    Sticking feathers up your butt does not make you a chicken - Tyler Durden
  33. ROI on Linux by GreyyGuy · · Score: 4

    Just think how much it would cost to license 8000 servers with win2k and whatever database they would use. Would Google even be able to do this on M$?

  34. The power drain is staggering! by clink · · Score: 4

    I hope these people aren't located in California. Otherwise I think we've located the source of the electricity crunch.

  35. What about hardware maintenance by Once&FutureRocketman · · Score: 4
    The scalability of many small servers is great, but I would think they would run into a wall eventually due to the effort required to maintain all those machines. I mean, even if the failure rate is very low on a per-machine, per-time basis, if you have enough machines, you're going to wind up replacing multiple hard drives, cards, mobos, etc. every day. Their system is redundant enough that this doesn't affect performance, but there is a cost associated with the manpower required to do all that maintenance.

    I just gotta wonder at what point they would get better overall efficiency by replacing all those little boxes with a couple of big iron mainframes.

    --

    "Research is what I am doing when I don't know what I am doing." -- Wernher von Braun

  36. Interesting points by sumengen · · Score: 4

    I listened to a senior Google engineer speak about 10 months ago. They are really good at load balancing and should serve as a good example for other companies. Interesting points I remember:

    - The number of websites is increasing exponentially, so the number of computers or CPU cycles required is increasing exponentially. On the other hand, the price per CPU MHz also decreases exponentially (Moore's law???). That is the key to scalability. At least the problem is not exponential.
    - As mentioned in this article, they were running Celeron 500 + 256MB RAM + 2x 40GB hard disks back then. When a computer fails, it is easier to replace because the hardware is cheap.
    - Buy systems with as many parts integrated into the main board as possible (NIC, etc.). It is supposedly more reliable.
    - They are not running Linux because it is cheaper. I have seen headlines about this, including on Slashdot, but it is not true. They do not deny that they saved a lot of money because of it, but when they started Google that wasn't the issue. He mentioned that they could have gotten a good deal from Sun for Solaris. The reason was the openness of the source code, along with the other reasons mentioned in the article. By the way, he mentioned that TCP stack issues were also considered when the decision was made. It looks like they are confident that they can fix problems in-house if any exist.
    - Google wants to design all the software they run at Google. They don't want to use third-party software because it introduces instability and makes bugs difficult to fix.
    - They are not running Apache. Using Linux doesn't mean running Apache. They designed their own web server, which is the simplest possible and therefore the fastest. They don't need a complicated web server. All the computation is done in the background on the 8000 Linux servers. The web server only needs to send the query to the query server and display the results.
    - Google's job is easier than people might think. Their database is not dynamic. It only gets updated once a month. Updating means replacing the old files with the new ones, which is an offline process. Compare this with an e-commerce site displaying real-time statistics, and you can see that Google has an advantage that makes things easier for them.
    - Let's say spidering and crawling are done in one datacenter. You need to copy these terabytes of data over to the other datacenters and then replicate it to multiple server farms in each datacenter. You have to do this fast and without any errors. You don't want to use OS file system functions.
    - They rent multi-gigabit bandwidth for off-peak hours when there is not much traffic, for a very, very cheap price of course. They use this bandwidth to copy data files from the West Coast to the East Coast. We are talking about many terabytes.
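The first point in the list (exponential growth in demand offset by an exponential drop in price per unit of compute) can be illustrated with a tiny Python sketch. The growth and decline factors below are invented purely for illustration; nothing in the talk as reported gives actual numbers.

```python
def relative_cost(years: int, demand_growth: float, price_decline: float) -> list[float]:
    """Hardware spend relative to year 0, for each year from 0 to `years`.

    demand_growth: yearly multiplier on compute needed (e.g. 2.0 = doubles)
    price_decline: yearly multiplier on price per unit of compute (e.g. 0.5 = halves)
    """
    return [(demand_growth ** t) * (price_decline ** t) for t in range(years + 1)]


# If demand doubles while the price per unit halves, spend stays flat.
flat = relative_cost(3, demand_growth=2.0, price_decline=0.5)
print(flat)

# If prices fall more slowly than demand grows, spend still rises,
# but far more slowly than demand itself.
rising = relative_cost(3, demand_growth=2.0, price_decline=0.6)
print([round(c, 3) for c in rising])
```

So the spend grows as the product of the two factors per year. It stays flat only if they cancel exactly; otherwise it is still exponential, just with a much smaller base than the raw demand curve.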

  37. Crud.... by V50 · · Score: 4

    They are still NOWHERE near a Googol Servers like their name suggests... Humph...


    --Volrath50

  38. Re:Loadbalancing large websites by baptiste · · Score: 4
    however I haven't seen that many testimonies/reviews from sites that use it.

    http://slashdot.org/article.pl?sid=01/04/26/0339219

    Anandtech.com is using it.

    --

  39. Amazing by Anonymous Coward · · Score: 5

    This is what you can tell people when they tell you that linux is a toy. The best search engine in the world is *not* a toy.

  40. Re:Loadbalancing large websites by Precision · · Score: 5

    We have been using LVS on SourceForge, Linux.com and Themes.org and I have nothing but good things to say about it. I have yet to have any real problems. We have 2 firewalls with automagic failover using heartbeat. We also use keepalived to automagically remove webservers from the queue if they go down.. all in all it's been a great piece of software.

    --
    - U
  41. Re:Locking into a OS by ethereal · · Score: 5

    Totally not the case - they've made their OS what they want, and they can change it if they want to. Don't confuse the cost of rolling out changes to 8000 machines with the cost of forcing a proprietary OS vendor to make the changes you need - you can roll out 8000 machines on a rolling basis in a week, assuming a conservative 1 hour automatic install 80 at a time (1% unavailability). You may never be able to get Sun or Microsoft to make the changes you need in an OS, if it isn't in their best interest to do so. Google's only "locked in" to RH in the sense that they can only achieve sufficient flexibility with an open source OS, and it sounds like they just went with RH because it's easier to hire admins. I bet they could run on any other flavor of Linux pretty easily, and *BSD without too much pain if they had to.
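The rolling-upgrade estimate above works out as follows, in a minimal Python sketch. The fleet size, batch size, and one-hour install time are the poster's assumptions; the function name is hypothetical.

```python
import math


def rolling_upgrade(fleet: int, batch_size: int, hours_per_batch: float) -> dict:
    """Estimate a rolling upgrade where one batch at a time is out of service."""
    batches = math.ceil(fleet / batch_size)
    total_hours = batches * hours_per_batch
    return {
        "batches": batches,
        "hours": total_hours,
        "days": total_hours / 24,
        "unavailable_fraction": batch_size / fleet,
    }


plan = rolling_upgrade(fleet=8000, batch_size=80, hours_per_batch=1)
print(plan)
```

With these inputs the upgrade runs around the clock for 100 hours, a little over four days, with 1% of the fleet offline at any moment, so "in a week" leaves some slack for failed installs.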

    Moderators, the above was only insightful if you don't care to think very hard...

    Caution: contents may be quarrelsome and meticulous!

    --

    Your right to not believe: Americans United for Separation of Church and

  42. Loadbalancing large websites by blinx_ · · Score: 5

    In recent months I've been trying to read everything I can find about load balancing large web sites, and Google sure does make an interesting example.
    My company is in the process of moving from one big server to several smaller ones, to allow for greater scalability; there is just a limit to how much CPU + memory you can put in a single box. Our future site will probably use Linux Virtual Server, which seems quite nice, however I haven't seen that many testimonies/reviews from sites that use it. The company I work for creates online image manipulation services, and part of the process is rendering large high quality images - and the hard part seems to be shared storage of these images (SCSI over TCP/IP seems very interesting); load balancing with static pages seems easy enough. Anyway, Google's way of using many small machines is an inspiration.

    --
    Resistance is not futile - www.gnu.org
  43. Do they give back? by leperjuice · · Score: 5
    Google's applications are unique, requiring far more extensive load-balancing, computing, and input-output bandwidth than other enterprise applications.

    The question that should be asked here is whether they are sharing the results of their work. I bet that they're probably lifting some of their techniques hot and fresh off of research papers and they may be the first to actually use them in an enterprise environment.

    Note that I personally believe that closed source is not necessarily a bad thing. But if Google has made radical changes to these enterprise-grade tools, it would be nice to see them trickle down into the mainstream distros. While we as home users would probably never need them, it would certainly put to rest some of the pro-Microsoft arguments against Linux as a server-grade OS.

    Of course, for all I know, they could be actively working with Cox et al to incorporate their findings into the kernel and related tools.

    Either way, a very impressive job done with an operating system that "is simply a fad that has been generated by the media and is destined to fall by the wayside in time."

    Note that I use Windows and Linux so I'm no bigot... (some of my best friends are Microsoft programmers!)

    --

    -- "I am disrespectful to dirt. Can you not see that I am serious!"

  44. Re:A Real Reason They Can Get Away With That by ottffssent · · Score: 5

    "And no, Linux on IBM/390 WILL NOT help them because it is just an emulation, and disk arrays of this one huge computer will get swamped by the billions of read requests (the same way they will get swamped on Starfire or the same S390 under OS390)"

    Exactly. Even at ~1M/s per IDE drive (lots of random reads), that's 1M/s * 8000 machines * 2 drives/machine (yeah, some have 4, but the article doesn't say how many) = 16GB/sec. It would take a hell of a SCSI setup to equal that bandwidth, let alone the massive numbers of IOs.

    Further, even if the boxen only have 2G memory each, that's 16TB of memory, which you could put in one big server, but no single memory system is going to provide the throughput that 8000 SDRAM channels will.
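The totals above are easy to verify with a short Python sketch. The per-drive rate (about 1MB/s under random reads), two drives per machine, and 2GB of RAM per machine are the poster's guesses; the function name is made up, and decimal units (1000MB per GB) are used because that is what gives the round 16GB/s and 16TB figures.

```python
def farm_totals(machines: int, drives_per_machine: int,
                mb_per_sec_per_drive: float, ram_gb_per_machine: float) -> tuple[float, float]:
    """Return (aggregate disk bandwidth in GB/s, total RAM in TB), decimal units."""
    disk_gb_per_sec = machines * drives_per_machine * mb_per_sec_per_drive / 1000
    ram_tb = machines * ram_gb_per_machine / 1000
    return disk_gb_per_sec, ram_tb


disk_gb_per_sec, ram_tb = farm_totals(8000, 2, 1, 2)
print(f"{disk_gb_per_sec:.0f} GB/s of random-read bandwidth, {ram_tb:.0f} TB of RAM")
```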

  45. Re:im not really clear on.. by Brento · · Score: 5

    what in gods name do you need 8000 linux servers for? quake? I cant figure out what google could possibly use all that power for... if they really *need* all that power, they're obviously doing something wrong with their code.

    Well, when was the last time you searched on Google? It has a stunning amount of servers indexed. I can search for just about anything, and Google always finds more accurate hits, faster, than any other search engine. (Don't turn this into a search engine flame war, either.) They have to constantly refresh their indexes, and they have to turn around fast answers.

    Yahoo even uses them for their search engine. I can't imagine being able to service Yahoo's search needs with anything less than a full-fledged data center split across two cities.

    --
    What's your damage, Heather?
  46. Kudos to Google by revscat · · Score: 5

    This is only tangentially related to the story at hand, but I would just like to compliment Google on a job done extremely well. They have successfully built the fastest search engine out there, using open methodologies and without whoring themselves out like any number of other search engines. They continue to add interesting (and [gasp!] useful) features such as searching PDF documents and their translation engine. They have really helped the Open Directory Project along, as well.

    There are successful .coms out there, but I think their business practices are so foreign to the "regular" business community that they aren't quite sure how to handle it.

    BTW: Anyone else see a philosophical relationship between Google and ArsDigita?

    1. Re:Kudos to Google by Tackhead · · Score: 5
      > All true, but are they really making money? I rarely see an ad there (not banner ad, mind you, but their own form of search-related targeted ads). So are they still going off of vc, or do the few ads I see cover the bills?

      Actually, I think they're being smart about it.

      If the typical query returns one USENET post - maybe 2-3 kilobytes of text - why would you want to (as Deja did) spend money sending 20-30 kilobytes of HTML for the associated frames and banners and other ad support?

      The user's gonna see one ad. Google's bandwidth and I/O costs are gonna explode if the HTML wrapped around each ad takes up 10 times as much space as each query's results.

      By going with text-based ads and a non-frames approach, they not only make the site more user-friendly (thereby adding value), they cut their own costs by a sizable fraction.

      With lower bandwidth costs and I/O requirements, Google can make money with fewer ads, not more. That's where (IMHO) Deja went wrong - the more they needed the ad revenue, the more they escalated the cost of serving the ads, in a vicious circle that consumed them.

      It's also where (IMHO) Google is doing it right.
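
      To put rough numbers on the overhead argument, here is a small sketch. The 2-3 KB result and 20-30 KB of frame/banner HTML come from the comment above (midpoints are used); the 0.5 KB size for a text ad is purely an assumption for illustration, not a figure from Google or Deja.

```python
# Rough per-response payload comparison; all sizes in kilobytes.
RESULT_KB = 2.5          # midpoint of the 2-3 KB USENET post in the comment
BANNER_WRAPPER_KB = 25   # midpoint of the 20-30 KB of frames and banner HTML
TEXT_AD_KB = 0.5         # hypothetical size of a plain text ad (assumption)


def response_kb(result_kb, ad_overhead_kb):
    """Total bytes sent for one query: the result plus its ad wrapper."""
    return result_kb + ad_overhead_kb


def overhead_ratio(result_kb, heavy_kb, light_kb):
    """How many times more data the heavy wrapper sends than the light one."""
    return response_kb(result_kb, heavy_kb) / response_kb(result_kb, light_kb)


if __name__ == "__main__":
    ratio = overhead_ratio(RESULT_KB, BANNER_WRAPPER_KB, TEXT_AD_KB)
    print(f"banner-style page sends about {ratio:.1f}x the data per query")
```

      Under these assumed sizes the banner-style page costs several times the bandwidth per query, which is the whole point of the vicious-circle argument.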

  47. Petabyte? Try pedobyte! :) by Phrogz · · Score: 5
    Google indexes 1.3 billion Web pages on over a petabyte of storage--that's more than a million gigabytes. "That's not to say that the index takes up a petabyte..."

    And what takes up all that size? You know it--pr0n. The storage size says it all...it's not a petabyte they've got there, but a pedobyte. Sick google bastards. :)

  48. Seen it firsthand... by supabeast! · · Score: 5

    I have seen some of Google's stuff in Northern Virginia. Those guys really know how to do high density racks. They have double-sided racks of 1U servers, with what I believe is 47 servers per side. The cabling alone is gorgeous. The bright red and shiny steel racks full of hundreds of flashing LEDs look like something out of a rave.
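
    As a quick sanity check on the density, here is a sketch of the rack count. It assumes every rack is double-sided with 47 servers per side (the figure above is only a recollection) and that all 8,000 servers are racked this way, which the article does not confirm.

```python
# Rack count under the assumption that every rack is fully populated on both sides.
SERVERS = 8000
SERVERS_PER_SIDE = 47   # recollected figure from the comment, not confirmed
SIDES_PER_RACK = 2


def racks_needed(servers, per_side, sides):
    """Smallest whole number of racks that can hold all the servers."""
    per_rack = per_side * sides
    return -(-servers // per_rack)  # integer ceiling division


if __name__ == "__main__":
    print(racks_needed(SERVERS, SERVERS_PER_SIDE, SIDES_PER_RACK), "racks")
```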

  49. Wait, I have the Answer by StoryMan · · Score: 5

    What they should do is utilize the heat escaping from that chimney of theirs to power steam turbines.

    Then use the turbines to drive generators.

    Then send the power from those generators to the western united states.

    Now -- follow me here -- this would be a self-sustaining system, no?

    Users use google to search the web and read their embarrassing usenet posts from 1995. Power is generated. That power is funneled back to the user so that his or her computer stays on, the lights stay on, and they don't have to worry about getting stuck in an elevator during a rolling blackout.

    Users are happy, nuclear opponents don't have to worry about radioactive leaks into the environment from improperly sealed cooling tanks and leaking water, and google remains up and active, chugging away ad infinitum.

    Simple.

    Tomorrow, I'll work on my plan for cold fusion. Maybe a couple of Guinness glasses filled with tapwater, a couple of batteries, and a beowulf cluster ...

    1. Re:Wait, I have the Answer by BMazurek · · Score: 5
      Now -- follow me here -- this would be a self-sustaining system, no?

      "Lisa! In this house we obey the laws of thermodynamics" -- Homer Simpson

  50. A Real Reason They Can Get Away With That by Poligraf · · Score: 5

    It is that their information and the cost of failure are not critical. If one of Google's servers (or hard drives) dies they can just find out what pages were stored there (from the master DB) and reload them into the storage on a new PC (and I'm sure they have some PCs with identical data).

    Now imagine an e-commerce site built like that. Loss of any part of user list or merchandise catalog is a major failure. This is why such sites are usually powered by a moderate (typical site) to huge (Amazon, eBay) database with an enormous redundancy built in.

    And no, Linux on IBM/390 WILL NOT help them because it is just an emulation, and disk arrays of this one huge computer will get swamped by the billions of read requests (the same way they will get swamped on Starfire or the same S390 under OS390). The entire idea of the setup is that you have a lot of independent disk channels.

    Another interesting insight is that they have made some improvements in administering all of these machines remotely. Otherwise they would blow all their money on paying sysadmins ;-)
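
    The recovery idea from the first paragraph can be sketched in a few lines. This is a guess at the general shape, not Google's actual mechanism: the `master_db` dictionary, the server names, and the `recover` function are all hypothetical.

```python
# Sketch of "look up what the dead box held and reload it onto a new one".
# master_db maps page id -> set of servers holding a copy (hypothetical layout).

def recover(master_db, dead, replacement):
    """Reassign every page held by `dead` to `replacement`.

    Returns a list of (page, source) pairs, where source is a surviving
    replica to copy from, or None if the page must be fetched again.
    """
    copies = []
    for page in sorted(master_db):
        holders = master_db[page]
        if dead in holders:
            holders.discard(dead)
            source = min(holders) if holders else None
            holders.add(replacement)
            copies.append((page, source))
    return copies


if __name__ == "__main__":
    db = {"a": {"s1", "s2"}, "b": {"s2"}, "c": {"s1"}}
    print(recover(db, "s1", "s9"))
```

    Nothing here has to be transactional, which is the point: losing a box costs some re-copying, not data.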

    --
    Tigers respect lions, elephants and hippos. Maggots respect no one. (C) S. Dovlatov
  51. missing email by Matthew+Luckie · · Score: 5
    "It doesn't look like Google got the e-mail that the dotcom boom is over"

    three possible explanations:

    1. they have a spam filter in place
    2. they have a microsoft exchange server somewhere
    3. they were too busy going through everyone else's embarrassing usenet postings to read their own email
    my guess is the third one

  52. Multi-Threading Madness by Sinjun · · Score: 5

    I wonder what kind of information Google has about the deficiencies of the Linux TCP/IP stack? Certainly with 8,000 servers they could have some input as to how the lack of multi-threading affects performance on a major site. I know that the most recent kernels and Apache versions were supposed to have dealt with this issue, but has anyone seen such a large scale experiment?

    1. Re:Multi-Threading Madness by epiphani · · Score: 5

      not necessarily commenting on the multi-threading issue, kernels 2.4.x have substantially better socket handling... there were articles floating around on slashdot and linux.com a while back about a DALnet server breaking 38,000 simultaneous active open sockets at one time. Linux has done wonders with their 2.4.x tcp/ip stack.. until recently, nobody even considered linux's stack worthy of an attempt at an IRC server of any reasonable size.

      --
      .
  53. google's new language features by stype · · Score: 5

    Go to google, click on preferences and change your language to "bork, bork, bork." From now on the site is completely in Swedish Chef (no joke).
    -Stype

    --
    -Stype
    Bus error -- driver executed.
  54. Interesting detail the article didn't go into: by vslashg · · Score: 5
    "That's not to say that the index takes up a petabyte. We have several hundred copies of the index," Felton said. "Most of the servers are serving up some fraction of the index." The index is partitioned into individual segments, and queries are routed to the appropriate server based on which segment is likely to hold the answer.
    An interesting metric that they don't go into in this article:
    • 4,718 of the servers index pr0n
    • 2,148 of the servers index warez
    • 1,634 of the servers index MP3 sites
    • 1,139 of the servers index various "ate my balls", "all your base", and other joke-of-the-month sites
    • 278 of the servers index content
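
    On the serious side of the quote, the "partitioned into segments, several hundred copies" setup can be sketched roughly like this. The article does not say how a query is matched to a segment, so hashing the search term is an assumption, as are the function names and the server names.

```python
# Sketch of routing a query to one replica of the index segment that owns it.
import random
import zlib


def segment_for(term, num_segments):
    """Pick a segment by hashing the term (assumed scheme, stable across runs)."""
    return zlib.crc32(term.encode("utf-8")) % num_segments


def route(term, replicas_by_segment, rng=random):
    """Return (segment, server): any replica of the owning segment can answer."""
    seg = segment_for(term, len(replicas_by_segment))
    return seg, rng.choice(replicas_by_segment[seg])


if __name__ == "__main__":
    replicas = [["a1", "a2"], ["b1", "b2"], ["c1"]]
    print(route("linux", replicas))
```

    Spreading each segment over many replicas is what lets any single box die without taking part of the index offline.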
  55. Where does Google get their money? by SirChive · · Score: 5

    Google is wonderful. But I'm left wondering where they get their financing and what their long term goals are.

    The Google site features minimal advertising. So they are most likely funded with VC money. This means that they must have a plan for making money at some point. What is it and when will it kick in?

  56. Ads are secondary... by daveym · · Score: 5

    "The Google site features minimal advertising. So they are most likely funded with VC money. This means that they must have a plan for making money at some point. What is it and when will it kick in?"

    Ummm...If you go to google and read about their company, you will learn that most of their income comes from licensing their awesome search engine for internal use by other companies. NOT from advertising. With everyone just now learning that advertising on the web sucks balls, this looks like a pretty shrewd move on the part of Google....

    --
    "Chill, Orrin!"---Trent Lott
  57. Ironic timing... by nrozema · · Score: 5

    I just spent all of yesterday afternoon installing a 63-node rack from Rackable. The build quality of these units is excellent... amazingly dense and efficient. According to the installers, in addition to google, their systems are also used extensively by yahoo and hotmail.

  58. Re:Doesn't this seem wrong to anyone? by dhamsaic · · Score: 5
    Has Google bragged about how much electricity they are consuming to run 8,000 electrical heaters? Have they boasted about how much pollution their power consumption generates?

    They haven't bragged about *anything* - an article was simply written by an outside source which gave some details of their setup. They also note that they have hundreds of copies of the index, so that the redundancy is there - if one server goes down, another hops back up. Google *is* a business, and they need to be reliable. They're out to a) provide a useful service and b) make money. It's not useful if you can't get to it.

    They're using 8,000 computers to accomplish a pretty amazing feat, and they're doing this instead of buying a pretty huge farm of larger and faster computers anyway. Sometimes more smaller parts are better - you don't have one big machine that fails, separate parts are replaceable (say 10 or 20 machines instead of a few larger servers).

    You don't build a house starting with a large block of concrete - you use bricks. Google is doing the same thing. Cut them some slack.

    --
    Every once in a while I like to masturbate a new word into my vocabulary, even if I don't know what it means.
  59. They're efficient too by Magumbo · · Score: 5
    "they direct heat to a central chimney which is blown up to a high-powered fan"

    And these high powered fans then blow the blisteringly hot air along a complex series of ducts which lead to facilities which:

    a) generate electricity for the wall-o-lava-lamps
    b) are used to fill state-of-the-art, floating, hot-air furniture
    c) keep folks warm-n-toasty in the sauna
    d) make you hot and thirsty

    --