Slashdot Mirror


Google Programming Contest

AccordionGuy writes: "Google has just announced its first annual programming contest! The objective is to write a program that will do something "interesting" with the roughly 900,000 Web pages' worth of data that Google provides. In addition to writing the program, contestants also have to convince the judges why their program is interesting (or useful) and why it will scale (that is, handle a load of data that keeps growing as the Web grows). The prize is US$10,000 in cash, a V.I.P. tour of the Google facility in Mountain View, California and possibly a chance to run their program on Google's complete billion-Web-page store."

18 of 629 comments

  1. Googlewhacking by waldoj · · Score: 4, Informative

    An automated Googlewhacking system.

    Ingenious!

    -Waldo Jaquith

  2. Re:So basically... by anthony_dipierro · · Score: 5, Informative

    That's assuming that any contest entries automatically become the property of Google.

    With regard to an entry you submit as part of the Contest, you grant Google a worldwide, perpetual, fully paid-up, non-exclusive license to make, sell, or use the technology related thereto, including but not limited to the software, algorithms, techniques, concepts, etc., associated with the entry

    So basically, Google doesn't own your code, only the right to use it. GPLing your code would satisfy the worldwide, perpetual, non-exclusive license grant.

  3. Re:Some Inspiration by jimbo3123 · · Score: 2, Informative

    it occurred to me, since you are evaluating the number of links pointing to a page anyway, that it would be a very nice thing to
    have a sort of "Top 40 Links of the Day" page, regularly updated to include only new and unique stuff. You could use an
    algorithm similar to the one used by


    It's Called Google Zeitgeist.

    It is at:
    Zeitgeist[Google.com]

    --
    There should be a moderation category "Dumbest Comment EVER"
  4. Re:Security Risk by Anonymous Coward · · Score: 1, Informative

    Reading the page, it sounds like this is only the code to extract the data from their format. It's from the backend, indexing part - not the publicly visible front end.

  5. Yeah, But for 10K, Google owns it by mattvd · · Score: 2, Informative

    "With regard to the software and repository that you obtain for the Contest, you agree to the license terms as stated in files you download or receive. With regard to an entry you submit as part of the Contest, you grant Google a worldwide, perpetual, fully paid-up, non-exclusive license to make, sell, or use the technology related thereto, including but not limited to the software, algorithms, techniques, concepts, etc., associated with the entry.

    If you are selected as a contest winner, you agree that Google may publicize your name, likeness, and the description of work you did to win the contest. Apart from the prizes associated with being selected as a winner, Google shall not be obligated to compensate you in any way for such publicity."


    So in other words, Google buys the next great thing for $10K. The only upside of the above is that it's a non-exclusive license, which means you could go and sell it to a competing search engine too...

    Of course, good luck finding a competing search engine :-)

  6. Re:Can _you_ count? by RetardHumper · · Score: 1, Informative

    Uh... you didn't read the article, did you?
    The 900K pages are provided from Google's cache of about 100 .edu sites. This is just for the programmers to play with a fairly large set of data before scaling it to 2 billion.

    Also, 900,000 + 100K != 1 billion, just FYI...

  7. Re:Can _you_ count? by MavEtJu · · Score: 2, Informative

    You can't count either, 100k + 900k != a billion ;-)

    This is what it reads:

    Google is providing a selection of about 900,000 web pages in pre-parsed and raw format

    That is what you get for the 57MB or five CDs.

    The billion-Web-page store is what your program might be run on if it wins.

    --
    bash$ :(){ :|:&};:
  8. Re:This is brilliant by epsalon · · Score: 5, Informative

    If you read the rules, you will see that you don't even have to assign copyrights to Google. You only have to give them a license. This means you can GPL your code or even BSD it. Sounds fair to me.

  9. Re:Notice their contest agreement? (was Re:Well th by Anonymous Coward · · Score: 1, Informative

    Does the GPL allow the creator to grant license to certain commercial vendors? Otherwise, you wouldn't be able to GPL it.

    You own the copyright, so you can GPL it even if you grant a different license to others. See "Mozilla" for an example of this.

  10. Re:Free Labor - Tom Sawyer Effect by bannerman · · Score: 2, Informative

    kids these days... I remember Tom Sawyer. As the story goes, he does not hold a contest. He makes them think that he's having the time of his life and in fact talks them into paying him to be allowed to paint the fence. It was a great idea. And the idea of holding a contest for a cool program for Google is a pretty good idea too.

    --
    I keep forgetting my place. Jesus is for losers. Why do I still play to the crowd?
  11. Re:57Mb = 5 CD ?!? by metsfan · · Score: 2, Informative

    The 57MB download only includes the code, not the 900,000 web pages. Instructions for downloading those are included with the initial download. This is what takes up most of the space on the CDs.

  12. Re:swedish chef filter by BlacKat · · Score: 2, Informative

    You can set Google's language to Swedish Chef, and h4x0r as well. Just look under "Preferences". :)

  13. Re:Well, here's an idea.. by YoJ · · Score: 4, Informative
    I like this idea. But I would limit the definition of "annoyance" to something easily quantifiable. Broken links might be the easiest, but even for that you have the problem of internet addresses being sporadically available, or just slow some days.


    Another idea is to just count the number of HTML errors as the annoyance factor. I'm sure there are many tools out there that can do this rather quickly. If this were actually implemented by Google, so sites with bad HTML were ranked below all other sites, imagine how much cleaner the web would get!
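    A rough Python sketch of the error-counting idea, assuming "HTML errors" just means mismatched or unclosed tags. A real validator checks far more, and HTML legally allows some end tags to be omitted, so treat the number as a crude signal at best. The names TagBalanceChecker and annoyance_score are made up for illustration.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    """Counts tag-nesting mistakes; a rough stand-in for 'HTML errors'."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of elements that are currently open
        self.errors = 0

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.open_tags:
            self.errors += 1  # closing tag with no matching opening tag
            return
        # Anything opened after the matching tag was never closed.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.errors += 1


def annoyance_score(html_text):
    """Return the number of nesting errors plus tags still open at the end."""
    checker = TagBalanceChecker()
    checker.feed(html_text)
    checker.close()
    return checker.errors + len(checker.open_tags)
```

    Something like annoyance_score(page_source) could then be one input to a ranking, but how to weight it against everything else is left open here.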

  14. Re:57mb Download by godless · · Score: 1, Informative

    "just under 300", according to this

    Regards,
    G

  15. Re:Ummm... by Tom7 · · Score: 3, Informative


    In general, it's not wise to learn about computer science from O'Reilly books!

    The languages that can be expressed by NFAs, DFAs, and regular expressions are the same. I promise I know what I'm talking about; I've taught this material to undergraduates, in fact. It might be the case that O'Reilly has a word for something in Perl or Python, and they call it "Nondeterministic Finite Automaton", but whatever that is, it isn't a real NFA. NFAs also cannot capture back-references or counted sub-expressions; they are subject to the same shortcomings as DFAs. But it might be an abuse of the terminology "NFA", just as everyone calls the (non-)regular expressions that Perl uses "regular expressions". Anyway, I just hate to see technical terms get misused... no big deal.
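    For anyone who wants to see the NFA/DFA half of that equivalence concretely, here is a minimal Python sketch of the textbook subset construction. It assumes an NFA without epsilon moves, stored as a dict from (state, symbol) to a set of next states. The function names and the example automaton are illustrative, not taken from any library or from the thread.

```python
from itertools import product


def nfa_accepts(nfa, start, accepting, word):
    """Simulate the NFA directly by tracking every state it could be in."""
    current = {start}
    for symbol in word:
        current = {nxt for state in current
                   for nxt in nfa.get((state, symbol), set())}
    return bool(current & accepting)


def nfa_to_dfa(nfa, start, accepting, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    dfa_start = frozenset({start})
    dfa = {}
    seen = {dfa_start}
    todo = [dfa_start]
    while todo:
        subset = todo.pop()
        for symbol in alphabet:
            target = frozenset(nxt for state in subset
                               for nxt in nfa.get((state, symbol), set()))
            dfa[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset accepts if it contains at least one accepting NFA state.
    dfa_accepting = {subset for subset in seen if subset & accepting}
    return dfa, dfa_start, dfa_accepting


def dfa_accepts(dfa, start, accepting, word):
    """Run the DFA: exactly one transition per input symbol."""
    state = start
    for symbol in word:
        state = dfa[(state, symbol)]
    return state in accepting


def languages_agree(nfa, start, accepting, alphabet, max_len):
    """Check that NFA and derived DFA agree on every word up to max_len."""
    dfa, d_start, d_accepting = nfa_to_dfa(nfa, start, accepting, alphabet)
    for length in range(max_len + 1):
        for word in product(alphabet, repeat=length):
            if (nfa_accepts(nfa, start, accepting, word)
                    != dfa_accepts(dfa, d_start, d_accepting, word)):
                return False
    return True


# Example NFA: strings over {a, b} whose second-to-last symbol is 'a'.
EXAMPLE_NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "a"): {2},
    (1, "b"): {2},
}
```

    Calling languages_agree(EXAMPLE_NFA, 0, {2}, "ab", 6) compares the two machines on every string up to length 6. That is a spot check, not a proof, but the construction itself is the standard argument that the DFA recognizes exactly the NFA's language.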

  16. I can solve your problem: by Dave_bsr · · Score: 2, Informative

    Go download Mozilla 0.9.8 and go to Edit/Preferences/Privacy and Security. It fixes popups and allows for cookie rejection, ad blocking, image blocking by site... it's what you need. And it handles lousy HTML pretty well too.

    --


    Who is this Anonymous Coward character, how does he post so much, and why is he always such a whore?
  17. Re:The other company by jason_hutchens · · Score: 2, Informative

    I worked for Ai (the Israeli company) as its Chief Scientist, and I still take great interest in its activities and progress. Ai didn't go bankrupt. It has frozen its operations by choice, simply because today's climate isn't conducive to the kind of work we were doing.

    I personally proposed the "Machine Learning Challenge" when I first joined Ai, in mid-2000. Our intentions in running the contest were noble. We really were interested in finding out how well competing machine learning techniques fared in head-to-head battles.

    Unlike Google's, our entry criterion was "by entering the challenge you transfer to us no rights apart from the right to evaluate your program by running the round-robin tournament". We offered a prize of $2,000 and a round trip for the creators of the top three entries to our research facilities for a research workshop. We also offered an additional prize of $25,000 to any entrant with whom we entered into an agreement (e.g. by buying their technology).

    The Machine Learning Challenge went ahead, thanks to Dror Kessler volunteering his time to run it. The winners were recently announced, and the workshop is scheduled to happen soon. See Ai's home page for more information.

  18. RTFM by Anonymous Coward · · Score: 1, Informative
    Google more-or-less owns your entry once you submit it. Doesn't matter if you win or not. Read the fine print on their contest entry rules:

    With regard to an entry you submit as part of the Contest, you grant Google a worldwide, perpetual, fully paid-up, non-exclusive license to make, sell, or use the technology related thereto, including but not limited to the software, algorithms, techniques, concepts, etc., associated with the entry.

    Or to make it simpler for you...

    if (code.Submitted())
        code.licenseTo(Google);