Google Plans to Reveal Some of its Code
Andy Beal writes "According to Australia's The Age, Google plans to reveal some of the code it uses to great success. It says '
"The time has come for Google to "give something back", Wayne Rosing, the company's vice-president of engineering, told students while on a recruiting drive in Melbourne last week.
"There have been a lot of conversations in the company in the past two months about (how) . . . it's time for us to give something back. So our technical director, Craig Silverstein, has started a project to look at all the Google code and start figuring out what parts of it we want to give back," Rosing said.'"
Don't need the code for that
http://googlebar.mozdev.org/
They released some of their code in previous programming contests. The code allowed users to access their compressed data file format (compressed/indexed HTML) and quickly run searches on it. They also provided 20->200 megs of sample data (something like that). It was a couple of years ago: April 30, 2002. http://www.google.com/programming-contest/
Also, from python.org:
"Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we're looking for more people with skills in this language." said Peter Norvig, director of search quality at Google, Inc.
25% Funny, 25% Insightful, 25% Informative, 25% Troll
Didn't they publish their search algorithms in Patent 6,285,999 "Method for node ranking in a linked database"? That's the PageRank algorithm; since it's patented it's publicly documented and available for public use 21 years later.
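For anyone who hasn't seen it, the published PageRank idea can be sketched as a plain power iteration. This is a minimal illustration only, not Google's code: the function name `pagerank`, the damping value 0.85, the fixed iteration count, and the choice to spread a dangling page's rank evenly are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    All parameter names and defaults here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outgoing links): spread its rank evenly.
                share = damping * rank[p] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(toy_web).items()):
        print(page, round(score, 4))
```

On the toy graph, the page with the most incoming links ("c") ends up with the highest score and the page nobody links to ("d") with the lowest.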
Why can't I moderate something "Wrong" or at least "Grossly Misinformed"?
I heard about it through the grapevine only... And I am a sysadmin at the uni where it was held. I think they were trying to keep it hush-hush in order to sort the snotty undergrads from the people they were potentially interested in employing: researchers. Come back when you get your Ph.D., kid...
BTW, a friend who went to the career night said that they were looking for these things, in order: extremely high intelligence, fit with their culture, programming skills desirable but optional.
It's almost entirely JavaScript with a hell of a lot of preloading going on. It's a very good system actually; I'm amazed that everything works identically in both IE and Firefox.
Since you brought it up, this article, "Old Search Engine, the Library, Tries to Fit Into a Google World," is definitely worth a look.
In the article, Wayne Rosing explicitly says that Google is not planning on open-sourcing the Google code base, but that they will publish academic papers on their work. "I'm not saying we're going to open-source Google, because that would be a little dumb when we have these Microsoft guys making noise. . . We're encouraging the software engineers to submit papers where it makes sense, particularly where it is landmark work and it is really important that other people know."
Google already has published a number of papers on their systems, including descriptions of PageRank, their clustering architecture, and their high availability file system (the Google File System). Seems like this is merely an announcement that they intend to do more of the same.
It's also described in one of their research papers.
Link to patent.
-jim
The original PageRank patent, 6,285,999, lists Larry Page as inventor, but The Board of Trustees of the Leland Stanford Junior University is the assignee. Google has an exclusive license on that patent through 2011. There's a later patent, 6,526,440, listing Krishna Bharat as inventor and Google as the assignee. The latter patent appears to be a minor refinement, per the abstract:
"A search engine for searching a corpus improves the relevancy of the results by refining a standard relevancy score based on the interconnectivity of the initially returned set of documents. The search engine obtains an initial set of relevant documents by matching a user's search terms to an index of a corpus. A re-ranking component in the search engine then refines the initially returned document rankings so that documents that are frequently cited in the initial set of relevant documents are preferred over documents that are less frequently cited within the initial set."
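The re-ranking step in that abstract can be sketched roughly as follows. The abstract does not give the actual scoring formula, so the linear blend of the initial score with a normalized in-set citation count, the `weight` parameter, and the name `rerank` are all assumptions for illustration only.

```python
def rerank(initial_scores, links, weight=0.5):
    """Re-rank an initial result set by how often its members cite each other.

    initial_scores maps document -> relevancy score from the index lookup.
    links maps document -> documents it cites (may include outside documents).
    The blending formula and 'weight' are hypothetical, not from the patent.
    """
    in_set = set(initial_scores)

    # Count citations that come from other documents in the initial set only.
    citations = {doc: 0 for doc in initial_scores}
    for src in in_set:
        for dst in set(links.get(src, ())):
            if dst in in_set and dst != src:
                citations[dst] += 1

    # Normalize citation counts to [0, 1]; avoid dividing by zero.
    max_cites = max(citations.values(), default=0) or 1

    combined = {
        doc: (1.0 - weight) * score + weight * citations[doc] / max_cites
        for doc, score in initial_scores.items()
    }
    # Highest combined score first; ties broken by document name.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))


if __name__ == "__main__":
    scores = {"x": 0.9, "y": 0.8, "z": 0.7}
    cites = {"x": ["z"], "y": ["z", "outside"], "outside": ["x"]}
    print(rerank(scores, cites))
```

Here "z" starts with the lowest relevancy score but is cited by both other results, so it moves to the top; the link from "outside" to "x" is ignored because "outside" is not in the initial set.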
"...but a nifto OS that can combine a few computers and let me run stuff across them trivially?" They have that: Plan 9, Inferno, and Amoeba.
"A Note on the Eigenvalues of the Google Matrix"
http://arxiv.org/abs/math.RA/0401177
Interview with Matt Wells (GigaBlast)
http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=135&page=1