Google's Technology Explored
RobotWisdom writes "Internetnews offers a moderately detailed peek at Google's technology. For example, they use stripped-down Red Hat on a massively redundant network, and they're starting to have success with automatic clustering of concepts, so that pages can match even if none of the words in your query actually appear on the page." Additional analysis on InformationWeek and C|Net. From the article: "As a search query comes into the system, it hits a Web server, then is split into chunks of service. One set of index servers contains the index; one set of machines contains one full index. To actually answer a query, Google has to use one complete set of servers. Since that set is replicated as a fail-safe, it also increases throughput, because if one set is busy, a new query can be routed to the next set, which drives down search time per box."
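For anyone trying to picture the routing described in that quote, here is a rough sketch, not Google's actual code. It assumes a toy in-memory index; the names `IndexServer`, `ReplicaSet`, `route_query`, `make_set` and the `busy` flag are all made up for illustration. Each replica set holds one full copy of the index split across several servers, and a query goes to the first set that is not busy.

```python
from dataclasses import dataclass


@dataclass
class IndexServer:
    """Hypothetical machine holding one chunk of the index."""
    postings: dict  # term -> set of page ids

    def search(self, term):
        return self.postings.get(term, set())


@dataclass
class ReplicaSet:
    """Hypothetical group of servers that together hold one full index."""
    name: str
    servers: list
    busy: bool = False

    def answer(self, term):
        # A query must touch every server in the set, since each one
        # holds only part of the index.
        hits = set()
        for server in self.servers:
            hits |= server.search(term)
        return sorted(hits)


def route_query(term, replica_sets):
    """Send the query to the first replica set that is not busy."""
    for replica_set in replica_sets:
        if not replica_set.busy:
            return replica_set.name, replica_set.answer(term)
    raise RuntimeError("all replica sets are busy")


def make_set(name):
    """Build one replica set with the same toy index split over two servers."""
    return ReplicaSet(name, [
        IndexServer({"linux": {"page1"}}),
        IndexServer({"linux": {"page7"}, "redhat": {"page3"}}),
    ])


if __name__ == "__main__":
    sets = [make_set("A"), make_set("B")]
    sets[0].busy = True  # pretend set A is busy with another query
    print(route_query("linux", sets))
```

Because every set is a full copy, marking set A busy just shifts the query to set B with the same results. That is the fail-safe doubling as extra throughput that the article describes.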
It then returns a random blogger's page that has no useful information. Then you look it up in a book.
TWW
"Encyclopedia" is to "Wikipedia" what "Library" is to "Some people at a bus stop"
Google seems to fail to realize that, according to the latest reports, IIS's TCO is much lower than that of their current solution. Microsoft's solution allows communication between a box and the outside, or between two boxes, and allows the running of arbitrary code, which is all that is required by this application. On top of this, engineers could inspect their running application with an XP GUI to diagnose issues (or rather see that there are none!). Their Red Hat solution, while a cute research prototype, lacks the backing of a serious, committed enterprise. I'm outraged that a booming company like Google can make such an obviously wrong choice of server technology.
I do this all the time. Before I buy anything electronic, for example, I type its model number or maker's name into Google and search site:slashdot.org to find out why it sucks.