Google Programming Contest

The average color of the WWW by I+am+the+blob · 2002-02-06 10:28 · Score: 5, Interesting

Much like the recent discovery of the average color of the universe, this would be a pointless, but fun, use of the data. Of course, I'm not sure exactly what to average. Do you take into account browser real-estate a particular color occupies? Do you simply average each color= and stylesheet instance?

Ideas?
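
To make the question concrete, here's a rough sketch of one possible answer: average the red, green and blue channels of every color found, with an optional weight per color. The function name `average_color`, the `(hex_color, weight)` input format and the weight itself (a stand-in for "browser real estate") are assumptions for illustration, not anything the contest data provides.

```python
def average_color(weighted_colors):
    """Average (hex_color, weight) pairs channel by channel.

    The weight is a hypothetical stand-in for "browser real estate";
    use 1 for every entry to count each color= instance equally.
    """
    total = 0.0
    sums = [0.0, 0.0, 0.0]
    for hex_color, weight in weighted_colors:
        value = hex_color.lstrip("#")
        # Split "rrggbb" into three integer channels.
        channels = [int(value[i:i + 2], 16) for i in (0, 2, 4)]
        for idx, channel in enumerate(channels):
            sums[idx] += channel * weight
        total += weight
    if total == 0:
        raise ValueError("no colors to average")
    return "#" + "".join(f"{round(s / total):02x}" for s in sums)
```

Passing a weight of 1 everywhere gives the "simply average each instance" variant; anything smarter depends on how you estimate the area a color covers.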

--

All sweeping generalizations suck.

Usefulness? by Hi-Tech+Redneck · 2002-02-06 10:29 · Score: 2, Interesting

I'm honestly curious as to what kind of useful programs could be run on that collection of pages and still be interesting. Statistical analysis? Boring! Or maybe market analysis? Again, BORING! Some of the more trivial interesting things, like how often phrase or word x appears on the internet, couldn't really be termed useful... Hopefully, somebody will prove me wrong. Good luck to all you developers...

What a coincidence! by ctkrohn · 2002-02-06 10:30 · Score: 2, Interesting

I was just talking to someone on IRC, and we were playing a game with Google. You had to find two correctly spelled words which would obtain a page or less of results. He mentioned that a distributed client which searches for the longest string of words returning less than a page would be a cool idea.

Just a thought...

The biggest Dictionary by p-n-wise · 2002-02-06 10:37 · Score: 4, Interesting

I'd go for a dictionary of every word ever used on the web. Complete with common usage examples.

--
I am the NUL and the DEL, the beginning and the end.

I know! by AntiFreeze · 2002-02-06 10:38 · Score: 2, Interesting

Someone could do a CRC (cyclic redundancy check) on all the pages in the cache; that way, one could tell when the Internet's been updated...

Even Stupider: Not only easy, but it could allow google to create static result pages for common searches: it would just update the result page when the cache CRC changes.
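
For what it's worth, the checksum part is only a few lines. This is a minimal sketch, assuming the cache can be read as `(url, html_bytes)` pairs; the function name `cache_crc` and that input shape are made up for illustration. Keep in mind a CRC only detects change, and two different caches can in principle share a value.

```python
import zlib

def cache_crc(pages):
    """Fold the CRC-32 of every page into one running checksum.

    `pages` is a hypothetical iterable of (url, html_bytes) pairs,
    visited in sorted order so the result is repeatable.
    """
    crc = 0
    for url, body in sorted(pages):
        # zlib.crc32 accepts the previous value to continue a running CRC.
        crc = zlib.crc32(url.encode("utf-8"), crc)
        crc = zlib.crc32(body, crc)
    return crc
```

A static result page would then be regenerated only when the stored value differs from a fresh `cache_crc(...)` run.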

--

---
"Of course, that's just my opinion. I could be wrong." --Dennis Miller

map of the internet, using the internet... by edrugtrader · 2002-02-06 10:38 · Score: 3, Interesting

how about have google parse every page, and save the homepage as an image. then take the map of the internet, and make it using tiny thumbnails of the most heavily linked (popular) sites.

this would be just like those mosaic photos, only much nerdier. thinkgeek execs are drooling already....

--
MARIJUANA, SHROOMS, X: ONLINE?! - E

How about... by Anonymous Coward · 2002-02-06 10:39 · Score: 1, Interesting

a program that figures out how many "degrees of separation" there are between websites?

Use that idea if you want, I'll only ask for 65% consultation fee. You can keep the tour.

Make the check out to "CASH"

How about a FPS game? by t0qer · 2002-02-06 10:40 · Score: 3, Interesting

A few years back there was a game, I think it was called Virus or something like that. It would scan your directory structure and make a map for the FPS world based on that.

Looking at the web, I always thought it would be cool to make a game based on the same concept, but use web pages instead of your hard drive directory.

I'm just throwing out ideas.

one word (or maybe two): spellcheck by option8 · 2002-02-06 10:41 · Score: 4, Interesting

i actually bugged the google guys a while ago about adding a spellchecking function to google. throw a URL or a set of pages at it, and it spits out a list of misspelled or questionable words - highlighted in the way they already do search terms in the cache...

anyway, someone there emailed me back basically saying it was an interesting idea, but not something on their agenda.

maybe someone out there can work up a scalable google spellchecker that i can run my big-ass database-driven website through (which is a major pain to spellcheck, considering the client simply refuses to do it when they provide the content)

--
- Entertaining Bits from the Ancient Kernel Tree

Restoring meta-tags by Charles+Dodgeson · 2002-02-06 10:43 · Score: 5, Interesting

I've been kicking around an idea for a scheme to end meta-tag (keyword, description) abuse so that they can actually become useful again. But it would require the cooperation and effort of google (and others) to do this.

The idea is roughly to refuse to index sites which engage in keyword/description abuse.

Index keywords and description data.
Allow users to search with keywords on or off.
If users search with keywords on, provide a mechanism for users to nominate a site as engaging in keyword abuse.
Semi-automatically, and then manually, review nominations.
Refuse to index sites which have engaged in keyword abuse.

This isn't so much a system that meets the specs of the contest. And there is a scaling issue, but it is on my wish-list for google (and others) to do.

--
Prime numbers are exactly what Alan Greenspan says they are -S. Minsky

Re:Some Inspiration by costas · 2002-02-06 10:44 · Score: 3, Interesting

I hate to link a beta-level site from /., but that's exactly what I am trying out...

The entire internet on a floppy by KenSentMe · 2002-02-06 10:44 · Score: 2, Interesting

Something to think about... you know that cool caching feature that google has? That basically means they have the entire internet saved on their disk array. Seriously though, I've been doing a lot of work and research in the area of neural nets, fuzzy logic, evolutionary algorithms, etc. etc. I wouldn't mind feeding 900,000 webpages into a neural net, and seeing how well it learns, or *what* it learns.

Re:The entire internet on a floppy by rgmoore · 2002-02-06 11:58 · Score: 3, Interesting

I'm not sure if using USENET is such a great idea. While there are some areas where it has a great signal to noise ratio and intelligent commentary, there are a ton of places where it's simply awful. It's loaded with misinformation, flameage, and proof of the correctness of Godwin's Law. I doubt that I'd be very excited about chatting with a bot that learned to communicate by reading the USENET archives.
OTOH, you might be able to do some very clever work on using the page cache as a knowledge store for a chatbot. You'd just take the incoming message, try to find some keywords in it (probably using previous parts of the conversation to help) and use them to search Google for relevant information. Then you'd reformat the information you found into something like a conversational reply and send it.

--
There's no point in questioning authority if you aren't going to listen to the answers.

jargon watcher by MbM · 2002-02-06 10:47 · Score: 5, Interesting

Write an application to track keyword usage over time; when a keyword goes from only 10 hits to several thousand, flag it as jargon. The jargon can then be presented as a webpage of the top whatever with various statistics on popularity and suspected origin URLs.
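
The flagging step might look something like this. It's a sketch only, assuming two `{keyword: hit_count}` snapshots taken at different times; the name `flag_jargon` and the default thresholds are placeholders to tune, not measured values.

```python
def flag_jargon(old_hits, new_hits, low=10, high=1000):
    """Return keywords that went from at most `low` hits to at least `high`.

    Both arguments are hypothetical {keyword: hit_count} snapshots;
    a keyword missing from the old snapshot counts as 0 hits.
    """
    flagged = []
    for word, count in new_hits.items():
        before = old_hits.get(word, 0)
        if before <= low and count >= high:
            flagged.append((word, before, count))
    # Biggest absolute jump first, for the "top whatever" page.
    flagged.sort(key=lambda item: item[2] - item[1], reverse=True)
    return flagged
```

Tracking suspected origin URLs would need a third input (where each keyword was first seen), which is left out here.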

--
- MbM

Regular Expressions! by Oink.NET · 2002-02-06 10:51 · Score: 3, Interesting

If someone can come up with a regular expression search engine that scales to billions of pages, that would be the killer app for Google. It would probably have to be a Deterministic Finite Automaton (DFA) regex engine, not the more powerful Nondeterministic Finite Automaton (NFA) engines like you have in Perl, Python, Emacs, and Tcl, but still, that would rock!

Spam page deleter by www.sorehands.com · 2002-02-06 10:51 · Score: 3, Interesting

How about a program that checks for SPAM, then deletes the entries in the database that SPAMMERs have used to publicize? Then if there are more than 3 SPAMs, notify the ISP and delete every page in the database from that ISP.

--
Fight Spammers!

six degrees of google-ation by anthony_dipierro · 2002-02-06 10:51 · Score: 5, Interesting

Connect any two pages on the web to each other with the minimum number of hyperlinks.
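
On an unweighted link graph, "minimum number of hyperlinks" is a textbook breadth-first search. A minimal sketch, assuming the links have already been extracted into a `{page: [linked pages]}` dictionary (that structure and the name `shortest_link_path` are assumptions for illustration):

```python
from collections import deque

def shortest_link_path(links, start, goal):
    """Breadth-first search over a hypothetical {page: [linked pages]} map.

    Returns the list of pages from start to goal using the fewest
    hyperlinks, or None if goal cannot be reached.
    """
    parents = {start: None}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            # Walk the parent pointers back to the start.
            path = []
            while page is not None:
                path.append(page)
                page = parents[page]
            return path[::-1]
        for target in links.get(page, []):
            if target not in parents:
                parents[target] = page
                queue.append(target)
    return None
```

At web scale the dictionary would not fit in memory, so the hard part is storage and distribution rather than the search itself.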

Bah to their definition of 'interesting'. by Xzzy · 2002-02-06 10:53 · Score: 3, Interesting

I think their example ideas pretty much suck, dunno, maybe they did it on purpose so no one would try that stuff or maybe they just don't wanna see much creativity.

I personally think it'd be coolest to turn it into an art project.. imagine you had a repository of the consciousness of an entire race and could run a script on it. Things like the map of the internet. Or the web collage. Or use it to power some kind of AI chatterbot.

I dunno. Their webpage on it didn't seem to do much to promote being creative; they just want to pay someone 10k to develop a new way to make more relevant search results.

Useful or interesting? by Mr.+Sketch · 2002-02-06 10:56 · Score: 5, Interesting

It seems like it would be very easy to come up with something interesting, and only a small fraction of those interesting things are actually useful.

Examples of a few interesting non-useful things I can come up with just off the top of my head:
Google Poet: Generate rhyming poetry from randomly rhyming sentences on the webpages in the database.
Googlesaic: Input a picture and scavenge the webpages for pictures from which to create a large mosaic of the input picture.
Google Map: Create a picture/graph of all the website connections (links) in the webpage list, perhaps add 3d/navigation. Perhaps perform graph operations and maybe find the longest path one can travel through the links and still stay within the Google search results/database.

These are just a few, I'm sure plenty of other people can find much more exciting/interesting things to do, but they won't always be useful to the google company.

--
Things you think are in the Constitution, but are not.

Search Engine Wars by Van+Halen · 2002-02-06 10:58 · Score: 5, Interesting

I already made a game last year I called Search Engine Wars. I wonder if it would qualify?

It's a party game. The basic idea is that a bunch of people are in the game, and it goes around in turns. On your turn, you type in a few words to search for. The game goes and queries google for the first hit on that search, and sends everyone's browser to that page. Then the other players get 100 seconds to guess which words you searched for. The first player to guess correctly gets points for the amount of time remaining.

It's written using BYOND, which you'll have to download if you want to play.

--

Say hello to zMac.

Re:I know what someone should make! by foobar104 · 2002-02-06 10:59 · Score: 3, Interesting

How about adding the option to have google understand what I *mean* to search for, not what I tell it to search for.

You might have been kidding, but you've got a really good idea there.

How about semantic searching: equip Google with a database that organizes words in a relational hierarchy from the general to the specific. For example, "orange" is a more specific form of "fruit," and also a more specific form of "color."

When you search for "orange," Google might also have the ability to search for "fruit" and "color," depending on how broad you want your search to be.

Just a thought.
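
A toy version of that lookup, as a sketch: the `BROADER` table below is a hand-written, hypothetical stand-in for the word hierarchy, and `levels` is an assumed knob for how broad the search should be.

```python
# Hypothetical toy hierarchy: each word maps to its more general terms.
BROADER = {
    "orange": ["fruit", "color"],
    "fruit": ["food"],
}

def expand_query(word, levels=1, broader=BROADER):
    """Return the word plus its broader terms up to `levels` steps away."""
    terms = [word]
    frontier = [word]
    for _ in range(levels):
        next_frontier = []
        for term in frontier:
            for parent in broader.get(term, []):
                if parent not in terms:
                    terms.append(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return terms
```

With `levels=1`, "orange" expands to "fruit" and "color"; raising it keeps climbing toward more general terms.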

JWZ Has the winner, and the runner up... by thehossman · 2002-02-06 11:08 · Score: 5, Interesting

JWZ already wrote the coolest apps I've ever seen that harvest the power of Internet search engines...

Webcollage -- slowly builds a random collage of images from the net.

DadaDodo -- generates random sentences based on word probabilities in pages on the net.

--
-- The Hoss Man

Well, here's an idea.. by shayne321 · 2002-02-06 11:13 · Score: 5, Interesting

Here's a free idea to anyone who has the time/initiative to code it (i.e. Not Me): a program that scans a page and rates it with an annoyance rating (x out of 100?) based on annoying things you'll find on the page if you open it: webbugs, cookies sent back to doubleclick, pop-unders, banner ads, java applets, BLINK tags, poorly formed HTML/CSS, broken images, sql/asp/php errors, etc. The higher the number the more annoying the page, and therefore the more likely the user is to click a different search result. Google could also tie it in to their ranking system to rank annoying pages lower in the results. Seems to me like it'd make the web a better place.
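
Just to show the shape of it, here is a very rough sketch of the scoring. The three patterns and their weights are invented for illustration; a real rater would need a proper HTML parser and many more signals (cookies, broken images, server errors and so on).

```python
import re

# Hypothetical patterns and weights, chosen only for illustration.
ANNOYANCES = [
    (re.compile(r"<blink\b", re.IGNORECASE), 15),
    (re.compile(r"window\.open\s*\(", re.IGNORECASE), 25),
    (re.compile(r"<applet\b", re.IGNORECASE), 10),
]

def annoyance_rating(html):
    """Score a page from 0 (clean) to 100 (unbearable)."""
    score = 0
    for pattern, weight in ANNOYANCES:
        # Every occurrence adds its weight; the total is capped at 100.
        score += weight * len(pattern.findall(html))
    return min(score, 100)
```
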

Shayne

--
Today I didn't even have to use my AK; I got to say it was a good day -- Icecube

Re:Well, here's an idea.. by shayne321 · 2002-02-06 12:54 · Score: 5, Interesting

Another idea is to just count the number of HTML errors as the annoyance factor.

That's not really what I had in mind... HTML errors are nowhere NEAR as annoying as pr0n sites that pop open ads all over the place, resize your browser, bookmark themselves, etc, etc. That's what I mean by annoyance, the kind of site that makes Joe Sixpack (as well as me) get upset when he gets stuck in a loop that for every window he closes two pop open. I'm more worried about discouraging sites from using bad behavior than I am encouraging them to use proper html. Of course, malformed html should ADD to the annoyance factor, but not be the only thing counted. That's my opinion anyway.

Shayne

--
Today I didn't even have to use my AK; I got to say it was a good day -- Icecube

57mb Download by RageMachine · 2002-02-06 11:17 · Score: 2, Interesting

I have to say the download is quite smooth. 160k a second is nice. I wonder how much bandwidth google actually has? Probably a gigabit or more?
This many people with Cable/DSL downloading that file, and it's not even slashdotted.

I haven't untarred the file yet. But I wonder just how many people it takes to run google. How many are on staff? And how many work on the actual code that powers such a huge site?

--

--------------------------
Is this a sig?
--------------------------

Re:Free ideas and free code development for Google by CmdrPinkTaco · 2002-02-06 11:39 · Score: 4, Interesting

While I am all for Free Software, I have to agree with the poster of this comment, at least in principle. 10k is a small price to pay for tons of ideas. While I'm sure the majority of the ideas will not be worth the time spent reviewing them, there will always be that precious gem buried somewhere.

For once, I just might agree with a binary only submission. That way if Google is truly interested they can license the code from the developer or have some sort of other agreement / arrangement.

It isn't like Google is offering up their source to the rest of the world, so I don't see why it is unreasonable to only offer up a binary to them. At the risk of sounding like a "me too" post - I still think that this would be something fun to be involved in if I had the creativity or the passion to pursue something of this sort.

--
Please give your mod points to others, Im at the cap. They will appreciate it more

Re:Free ideas and free code development for Google by notsoanonymouscoward · 2002-02-06 11:48 · Score: 3, Interesting

would binary only even matter? it's the IDEA they need... they have tons of coders easily available to implement whatever ideas they can glean from this. it's not always about source control.

--
I ate my sig.

Re:Free ideas and free code development for Google by kill+-9+$$ · 2002-02-06 12:02 · Score: 3, Interesting

For once, I just might agree with a binary only submission.

Ahh, but if you read the submission requirements, you have to submit your source, a Makefile, and use only GPL or other open source libraries, so they've covered their butt there.

I hope anybody who does decide to participate in this contest realizes the implications of it. $10K is nothing for Google to pay to get ideas, source code, etc. Also note, in the submission requirements, any entry made to Google becomes their sole property. Christ, I can afford $10K, a tour of my house, and letting somebody run their prize-winning code on the data on my computers if somebody's going to give me this kind of intellectual property. I really think that it's a pretty raw deal for the developer.

--

-- A computer without COBOL and Fortran is like a piece of chocolate cake without ketchup and mustard

Finding Programmers! by rbeattie · 2002-02-06 12:20 · Score: 5, Interesting

Sure beats hiring programmers.

No, that's it!

According to this article Google is getting deluged by resumes; this is just a way for them to weed out the 600+ resumes they get a day.

The winner of this contest (and maybe a few of the runners-up) will most likely get a job offer as well. Beats having to weed through 4200 greatly exaggerated CVs every week...

-Russ

--
Me

Re:Free ideas and free code development for Google by WNight · 2002-02-06 12:26 · Score: 5, Interesting

The problem is that ideas aren't worth a lot without a way to use them. I've had a lot of neat thoughts about mapping connectivity and so on, but without something like Google to run it on I'd have to spider the whole web myself on my cable.

They might get a good idea, but if you don't win the contest they don't really have much of a legal leg to take your idea, so you're pretty safe unless you're the winner, in which case you get $10k for hacking together a script that you never could have afforded to run anyways. (It's only a concept they want, not the polished results of a 2-month dev process.)

It honestly sounds like a good deal to me. I hack for a night or two on a project that I find interesting. If I lose, no big deal. If I win I get 10k USD (3 months wages for me, I get paid in Canadian $s) and I'd be famous in exactly the circles who are looking to hire a coder with good ideas...

People go on about the value of ideas all the time, but really, without proper backing ideas are a dime a dozen. I've said many times "Hey, how about a ..." and seen it advertised a few years later. That doesn't mean I lost out on it, because I didn't have the cash to develop it let alone market it.

This is why patents on wide ideas are so damaging. Any idiot can have a good idea every now and then, but it takes more work (and funding unfortunately) to make them fly. If you let someone with an undeveloped idea block off a whole field it does a great disservice to the people with the ability to follow through, who likely had the idea independently.

Re:This is brilliant by Tom7 · 2002-02-06 13:42 · Score: 5, Interesting

Unfortunately, all the comments at 4 and above are complaining about how Google intends to rip people's ideas off.

Obfuscated Code? by arglesnaf · 2002-02-06 16:42 · Score: 2, Interesting

It says you must provide source. But that does not mean that you can't also enter it in an obfuscated programming contest!

Re:Free ideas and free code development for Google by Ragin'Cajun · 2002-02-06 17:34 · Score: 5, Interesting

For once, I just might agree with a binary only submission. That way if Google is truly interested they can license the code from the developer or have some sort of other agreement / arrangement.

It isn't like Google is offering up their source to the rest of the world, so I don't see why it is unreasonable to only offer up a binary to them.

Well, they *have* been running the best search engine on the web FOR FREE for the past 3 years. They don't clutter their main page with flashing X10 ads, or the irritating news+sports+weather+financialnews+email combo that everybody seems to think people want. This might not be a bad way to give something back to the company that's saved us so much time and effort finding information.

And to the guys out there who wouldn't bother with this contest for less than $100K: if your idea is so good, go develop it yourself! Get a lawyer, and work out a deal with Google that suits you better.

--
--It's all fun and games, 'till someone loses an eye. Then it's one-eyed fun!--

Re:Free ideas and free code development for Google by onepoint · 2002-02-06 19:06 · Score: 2, Interesting

Oh this has to be the funniest set of posts on slashdot in a long time.

What happened to the free sharing of ideas and code? They want it GPL, so post your code when it's done on sourceforge.

gee when it's for your own benefit it has to be free, but when somebody desires something it has to cost a lot.

Thank you all for the laugh

ONEPOINT

--
if you see me, smile and say hello.

Sort results by W3C standards conformance by chrysalis · 2002-02-06 21:20 · Score: 5, Interesting

So that pages that can properly be read by any browser come first.
Then, maybe webmasters will stop doing IE-only pages.

--
{{.sig}}

I wish I could code this by cascadefx · 2002-02-07 02:58 · Score: 4, Interesting

I hope Google reads these pages and gets some free ideas from them. At least take mine! Please. God knows that I don't have the coding chops to do it myself. I sent this same idea to Allaire (remember them?) a long time ago and I had a couple of software engineers write me back, but nothing ever came of it. My guess is that this is a hard problem.

I want a browser control/plugin/whatever that harnesses a backend of web information to make my surfing more productive/predictive.

The gist would be to have a hover option for links which would give you information about what is behind the link without having to actually follow it. While browsing, the user would just hover over a link in a page and information pertaining to the page beyond the link would show up in a hovering menu or a sidebar (this would be great with mozilla, but I could see an activex control as well).

The types of information are where it gets useful. Using some of the more advanced summarization algorithms out there, it would pull up the summaries of those pages if they were in the offsite database (Allaire, Google, and the WayBack Machine being possible backends). Based on your preferences a short, medium or long summary would be displayed. If it wasn't in the cache, it could be summarized on the fly and then presented after some delay (the new summary now being cached).

It would also list, in an orderly way and subject to preferences, links from the page on the other side. That way the user could follow one of those if it turns out that she only needed the summary and a link. It would also list the elements of the page, like graphics, and give their specs (i.e. dimensions and estimated download times and ALT tag entries if present) and give the option to display them on a page by page basis. All of this would be nested, of course, so that a user could hover over links in the summary pages and get the same information all over again for that link (which is why I see it more as a "sidebar" feature). Theoretically a user could just surf by these summaries if they wanted.

Now, I realize that this would pose some problems like trusting the summaries and so forth. However, the nice thing about it would be features that could be built into the user's preferences. For instance, you could make it so that the user could have certain words or phrases set that would then be scanned for during the summarization process. You could then either relax the amount of summary for the entire page or, better yet, still pull the cached summary but also pull a user-definable number of lines before and after their keywords (best of both worlds).
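
The "lines before and after their keywords" part is the easy bit and could be sketched like this. The function name `keyword_context` and the plain substring matching are assumptions for illustration; the summarization itself is the hard problem and is not shown.

```python
def keyword_context(lines, keywords, before=1, after=1):
    """Return the lines around every keyword hit, in document order.

    `before` and `after` stand in for the user-definable number of
    lines; matching is a plain case-insensitive substring test.
    """
    wanted = set()
    lowered = [kw.lower() for kw in keywords]
    for index, line in enumerate(lines):
        text = line.lower()
        if any(kw in text for kw in lowered):
            start = max(0, index - before)
            stop = min(len(lines), index + after + 1)
            # Overlapping windows merge because indexes live in a set.
            wanted.update(range(start, stop))
    return [lines[i] for i in sorted(wanted)]
```
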

Each summary could also list a numeric rank of where that page fits in "status" (like google's ranking system) based on the summary (generically) or the keywords of the user (specifically). Finally, it could pay for itself with text advertising (small and innocuous like the ones seen on Google).

If you start to think about it for a while, there are all sorts of things you could do with this and it would help cut through the "padding" that you usually go through while looking for information on a certain subject. I think it would be great! It is kind of based on the idea of the "magic spyglass" that was heralded almost a decade ago, but never implemented in any OS that I know of.

Like I said, I can't code it, but I would love to see it done. So have at it if you think it is good. Google's cache of pages and images and its ranking technology make it perfectly suited for this type of problem and they have enough PhDs that the summarization issue should prove an "interesting" problem to solve.

Then again, it might suck. If you do implement it, let me know. I would love to beta-test it. I called the whole thing the Clairvoyant Browser Plugin... but you could use what you want.

36 of 629 comments