Google Programming Contest
AccordionGuy writes: "Google has just announced its first annual programming contest! The objective is to write a program that will do something "interesting" with the about 900,000 Web pages' worth data that's Google provides. In addition to writing the program, contestants also have to convince the judges why their program is interesting (or useful) and why it will scale (that is, handle a constantly increasing load of data that grows as the Web grows). The prize is US$10,000 in cash, a V.I.P. tour of the Google facility in Mountain View, California and possibly a chance to run their program on Google's complete billion-Web-page store."
I think I'll write a program that will delete pages as it finds them. This should scale pretty nicely and make the web faster in the process.
A nice way to get ideas for free... maybe the idea of the winner will be used in the future.... but I still would like the money better than the idea. =D
Sigs are for morons... Wait a minute...
How about adding the option to have google understand what I *mean* to search for, not what I tell it to search for.
Oh, and the ability to find one non-fake Britney porn pic.
Much like the recent discovery of the average color of the universe, this would be a pointless, but fun, use of the data. Of course, I'm not sure exactly what to average. Do you take into account browser real-estate a particular color occupies? Do you simply average each color= and stylesheet instance?
Ideas?
All sweeping generalizations suck.
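One possible answer to the "what to average" question, as a rough sketch rather than a real entry: treat every `#rrggbb` literal as one vote and average the channels. This assumes the pages are available as plain HTML strings, ignores named colors and how much screen area each color covers, and `average_color` is a made-up name.

```python
import re

# Matches six-digit hex colors such as #ff00cc (three-digit and named colors are ignored).
HEX_COLOR = re.compile(r"#([0-9a-fA-F]{6})\b")

def average_color(pages):
    """Average every #rrggbb occurrence across an iterable of HTML strings."""
    totals = [0, 0, 0]  # running sums for red, green, blue
    count = 0
    for html in pages:
        for match in HEX_COLOR.finditer(html):
            value = match.group(1)
            for i in range(3):
                totals[i] += int(value[2 * i:2 * i + 2], 16)
            count += 1
    if count == 0:
        return None  # no colors found at all
    return "#" + "".join(f"{t // count:02x}" for t in totals)

if __name__ == "__main__":
    sample = ['<body bgcolor="#000000">', '<font color="#ffffff">hi</font>']
    print(average_color(sample))
```

Weighting by browser real estate would need a layout engine, which is well beyond this sketch.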
10K is nice along with the recognition and all, but... I'm sure that's a lot cheaper than paying a few Google staff coders to come up with the same thing in a few months.
Jus' being paranoid.
...the winning program will be written in C - not Java.
Mod me down, but you know it is true.
Evil, but brilliant.
Get hundreds of people to crank out code for you, pay a paltry sum to one of them, keep all the code. Pay $10K for millions of dollars in potential technology.
That's about the slickest thing I've ever seen. You have to admire them for their evil. Microsoft could learn a thing or ten from them.
I'm honestly curious as to what kind of useful programs could be run on that collection of pages and still be interesting? Statistical Analysis? Boring! Or maybe market analysis? Again, BORING! Some of the more trivial interesting things, like how much of phrase or word x appears on the internet couldn't really be termed useful... Hopefully, somebody will prove me wrong. Good luck to all you developers...
I was just talking to someone on IRC, and we were playing a game with Google. You had to find two correctly spelled words which would obtain a page or less of results. He mentioned that a distributed client which searches for the longest string of words returning less than a page would be a cool idea.
Just a thought...
10000$/x hours of work we could get done for us...
Make sure we get a slashdot posting so a bunch of geeks with programming skills will enter.
The only thing I'd want is for google to stay just the way it is though, don't bloat. Great service, maybe I'm just pessimistic but sites rarely do everything well.
Kjella
Live today, because you never know what tomorrow brings
The objective is to write a program that will do something "interesting" with the about 900,000 Web pages' worth data that's Google provides.
To quote pulp fiction, "English motherfucker, do you speak it?"
http://www.google.com/search?sourceid=navclient&q= porn
Where's my money?
Sounds to me that google is getting lots of programs for only $10k and a tour.
Of course this could be spammed, but as I said, a human could filter the results every day; besides, it would be hard to create a very large number of unique links from different servers pointing to a page. I'm sure Google is already doing some of this to prevent spamming their search-order algorithm anyway.
This sounds really great, doesn't it? 10,000 USD cash prize, visiting their facilities (who wouldn't be curious to see the world's biggest Beowulf cluster) and more.
Thing is, though that is a lot of money, what happens if you make them, say 20,000 USD with a great new compression/analysis algorithm.
What then? You have no claim to a part of their profits. I guess that's just a part of competing to give your ideas to a company.
-mike
An automated Googlewhacking system.
Ingenious!
-Waldo Jaquith
I know it can't be the source to everything at GOOGLE, but still, doesn't this reek of a security nightmare in the making?
"Prepare for the worst - hope for the best."
but they get virtually free labor to upgrade Google. ;)
Google will own the piece of code you write for a mere $10,000, that's much cheaper than hiring 2 programmers to come up with anything in 2 months.
;)
I think I can start a software company based on this method.
geek page at KY speaks
I'll write a program to see how many links on average you have to visit before getting to a porn site.
:-)
I'll then repeat the same program looking for how many clicks to an X-10 ad.
Brian
They're going to (hopefully) get tons of interesting ideas and almost as much useful code for the price of $10,000. Sure beats hiring programmers.
That's assuming that any contest entries automatically become the property of Google.
Perhaps this is the evolution of a new business model... Either way, I don't really care as long as Google remains free, fast, and useful!
It would be evil if they held a gun to your head and told you to do it. Mod the parent down for trollbait.
They stab it with their steely knives,
But they just can't kill the beast.
Hey Google! Why not make the agreement state that all entries go under the GPL?
The Anti-Blog
how about go through the pages looking for mailto: tags, and then (the tricky part), devise a product that could be sold, and spam all the people.
brilliant.
MARIJUANA, SHROOMS, X: ONLINE?! - E
I wonder if they would accept as something interesting taking the data and posting it as troll page-lengthening posts on Slashdot. 8-)
Yesterday was the time to do it right. Are we having a REVOLUTION yet?
I'd go for a dictionary of every word ever used on the web. Complete with common usage examples.
I am the NUL and the DEL, the beginning and the end.
I can, however, write for pages and pages on why the Hulk could kick Thor's girly ass.
"Understand you're having a little Jimmy Page trouble."
Even Stupider: Not only easy, but it could allow google to create static result pages for common searches: it would just update the result page when the cache CRC changes.
---
"Of course, that's just my opinion. I could be wrong." --Dennis Miller
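A minimal sketch of the "only rebuild when the cache CRC changes" idea above, under loud assumptions: the cached pages for a query are available as strings, `render` is a hypothetical function that builds the result page, and `StaticResultCache` is an invented name. Nothing here reflects how Google actually serves results.

```python
import zlib

class StaticResultCache:
    """Keep one pre-rendered result page per query, rebuilt only on CRC change."""

    def __init__(self, render):
        self.render = render   # hypothetical: render(query, pages) -> result page text
        self.entries = {}      # query -> (crc of cached pages, rendered result page)

    def get(self, query, cached_pages):
        # Fold every cached page into one running CRC-32 value.
        crc = 0
        for page in cached_pages:
            crc = zlib.crc32(page.encode("utf-8"), crc)
        entry = self.entries.get(query)
        if entry is not None and entry[0] == crc:
            return entry[1]    # nothing changed, serve the static page
        result = self.render(query, cached_pages)
        self.entries[query] = (crc, result)
        return result

if __name__ == "__main__":
    cache = StaticResultCache(lambda q, pages: f"{q}: {len(pages)} hits")
    print(cache.get("linux", ["page one", "page two"]))
```

A CRC is cheap but not collision-proof, so a changed page can occasionally slip through unnoticed; a stronger hash would be the obvious swap if that matters.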
how about have google parse every page, and save the homepage as an image. then take the map of the internet, and make it using tiny thumbnails of the most heavily linked (popular) sites.
this would be just like those mosaic photos, only much nerdier. thinkgeek execs are drooling already....
MARIJUANA, SHROOMS, X: ONLINE?! - E
a program that figures out how many "degrees of separation" lie between websites?
Use that idea if you want, I'll only ask for 65% consultation fee. You can keep the tour.
Make the check out to "CASH"
A few years back there was a game, I think it was called Virus or something like that. It would scan your directory structure and make a map for the FPS world based on that.
Looking at the web, I always thought it would be cool to make a game based on the same concept, but use web pages instead of your hard drive directory.
I'm just throwing out ideas.
http://www.mrcranky.com/movies/lantana/92.html
Sounds familiar, Google are just being nicer about it.
Google Contest Winner Offers Better Porn Searches
Winner of the First annual Google Programming Contest creates greatest porn spider ever.
MOUNTAIN VIEW, Calif. - December 11, 2001 - Google Inc., developer of the award-winning Google search engine, today announced its first winner of the Annual Google Programming Contest. Winner I. C. Porno has created a program to help catalog and organize Google's cache of the Internet, also referred to as the World Wide Web of Porn.
"This announcement is an important step in Google's ongoing effort to provide search services that are fast, easy to use, and that help people find the information they need," said Larry Page, Google's co-founder and president of Products. "To search our collection of 3 billion documents for porn by hand, it would take 5,707 years, searching twenty-four hours per day, at one minute per document. With I. C.'s new program, it takes less than a second."
World's Largest Collection of Porn
Google users now have the world's largest and most comprehensive collection of porn right at their fingertips and can immediately satisfy primal urges using the following services:
Google Web Porn Search: The company's newest search service now offers more than 2 billion documents - 25 percent of which are non-English language web pages. Google Web Search also offers users the ability to search for numerous non-HTML files such as PDF, Microsoft Office, and Corel documents. Google's powerful and scalable technology searches this comprehensive set of information and delivers a list of relevant porno in less than half-a-second.
Google Porn Groups: This 20-year archive of Usenet porn conversations is the largest of its kind and can serve as a powerful reference tool, while offering more porno than the Internet. Google Groups was released from beta today with 700 million postings in more than 35,000 topical porno categories.
Google Image Search: Comprising more than 330 million nude images, Google Image Search enables users to quickly and easily find porn images relevant to a wide variety of topics, including pictures of celebrities and popular travel destinations. Advanced features include search by image size, format (JPEG and/or GIF), coloration, and the ability to restrict searches to specific genres of porn.
About Google Inc.
With the largest index of websites available on the World Wide Web and the industry's most advanced search technology, Google Inc. delivers the fastest and easiest way to find relevant information on the Internet. Google's technological innovations have earned the company numerous industry awards and citations, including two Webby Awards; two WIRED magazine Readers Raves Awards; Best Internet Innovation and Technical Excellence Award from PC Magazine; Best Search Engine on the Internet from Yahoo! Internet Life; Top Ten Best Cybertech from TIME magazine; and Editor's Pick from CNET. A growing number of companies worldwide, including Yahoo! and its international properties, Sony Corporation and its global affiliates, AOL/Netscape, and Cisco Systems, rely on Google to power search on their websites. A privately held company based in Mountain View, Calif., Google's investors include Kleiner Perkins Caufield & Byers and Sequoia Capital. More information about Google can be found on the Google site at http://www.google.com.
make a program that plays with putting it in various matrixes and see if the internet can predict the future through crossword like connections between the letters....
Amazing. Google is guaranteed to get at least a few interesting product ideas out of this, all for the low low product development cost of $10k.
Notice that they don't say exclusive license. You should be able to release it as GPL yourself.
i actually bugged the google guys a while ago about adding a spellchecking function to google. throw a URL or a set of pages at it, and it spits out a list of misspelled or questionable words - highlighted in the way they already do search terms in the cache...
anyway, someone there emailed me back basically saying it was an interesting idea, but not something on their agenda.
maybe someone out there can work up a scalable google spellchecker that i can run my big-ass database-driven website through (which is a major pain to spellcheck, considering the client simply refuses to do so when they provide the content)
- Entertaining Bits from the Ancient Kernel Tree
Count all of the letters A, T, C, and G from all the web pages in the search results and sequence that into a DNA strand to produce the perfect human. Myuhahahahahaha.
My beliefs do not require that you agree with them.
$10,000 and the publicity are valuable, but not enough for a technology Google would find 'interesting'. Especially since (as I read it) they claim ownership on all entries, even if they don't win.
Of course you could always include an EULA with more generous licensing terms.
The idea is roughly to refuse to index sites which engage in keyword/description abuse.
- index keywords and description data
- Allow users to search with keywords on or off
- If users search with keywords on, provide a mechanism for users to nominate a site as engaging in keyword abuse.
- Semi-automatically, and then manually, review nominations.
- Refuse to index sites which have engaged in keyword abuse.
This isn't so much a system that meets the specs of the contest. And there is a scaling issue, but it is on my wish-list for google (and others) to do.
Prime numbers are exactly what Alan Greenspan says they are -S. Minsky
How about a program that searches for the meta generator tags and looks for "Microsoft Frontpage X.X", deletes the page from the database, and commences a DOS attack from the rest of the slashdot community?
Go Google! Get rid of the fake HTML goons!
Very small program actually, but one that will drastically improve google's popularity:
Mature content *only* in image search.
To further enhance this with a lot of space, cache high quality versions of the images.
XML is like violence. If it doesn't solve the problem, use more.
Something to think about... you know that cool caching feature that google has? That basically means they have the entire internet saved on their disk array. Seriously though, I've been doing a lot of work and research in the area of neural nets, fuzzy logic, evolutionary algorithms, etc. etc. I wouldn't mind feeding 900,000 webpages into a neural net, and seeing how well it learns, or *what* it learns.
Make a image-2-asciiart converter, so you could have a txt-only option on the google cache.
900,000 Web pages' worth data that's Google provides... Google's complete billion-Web-page store.
;)
Hmmm. Either this guy can't count... or google gained 100K web pages in the time it took to write that paragraph
-- Is "Sig" copyrighted by www.sig.com?
The Googlewhacking site lists reader-submitted Googlewhacks...which of course causes Google to pick up a second site for the search. And so the Googlewhack is whacked!
"Rub her feet." -- L.L.
Didn't the other company (Israeli company?) which was trying to bait^H^H^H^H challenge individuals to write code for them for cheap go bankrupt?
But I am sure this novel strategy will thrive this time around.
pardonne
Write an application to track keyword usage over time, when a keyword goes from only 10 hits to several thousand then flag it for jargon. The jargon can then be presented as a webpage of the top whatever with various statistics over popularity and suspected origin urls.
- MbM
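A rough sketch of the jargon flagging described above, assuming the hit counts per keyword have already been collected somewhere as a chronological list. The thresholds (10 hits, then 2000 as a stand-in for "several thousand") and the name `flag_jargon` are assumptions for illustration only.

```python
def flag_jargon(history, low=10, high=2000):
    """Return (keyword, hits) pairs that rose from <= low hits to >= high hits.

    history maps keyword -> list of hit counts in chronological order.
    """
    flagged = []
    for keyword, counts in history.items():
        seen_low = False
        for hits in counts:
            if hits <= low:
                seen_low = True            # keyword was once obscure
            elif seen_low and hits >= high:
                flagged.append((keyword, hits))
                break                      # one flag per keyword is enough
    # Most popular first, ready for a "top whatever" page.
    flagged.sort(key=lambda item: item[1], reverse=True)
    return flagged

if __name__ == "__main__":
    history = {
        "googlewhack": [3, 8, 150, 4200],
        "linux": [90000, 95000],
        "blog": [5, 40, 60],
    }
    print(flag_jargon(history))
```

Tracking suspected origin URLs would mean also storing which pages produced the earliest hits, which this sketch leaves out.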
900,000 + 100,000 != 1,000,000,000
If someone can come up with a regular expression search engine that scales to billions of pages, that would be the killer app for Google. It would probably have to be a Deterministic Finite Automaton (DFA) regex engine, not the more powerful Nondeterministic Finite Automaton (NFA) engines like you have in Perl, Python, Emacs, and Tcl, but still, that would rock!
Fight Spammers!
Connect any two pages on the web to each other with the minimum number of hyperlinks.
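For what it's worth, on a link graph that fits in memory this is plain breadth-first search. A minimal sketch, assuming the links have already been extracted into a dict of page -> list of linked pages (the `links` structure and the function name are made up here, and nothing about it addresses the billion-page scaling question):

```python
from collections import deque

def shortest_link_path(links, start, goal):
    """Return the shortest chain of pages from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    parent = {start: None}      # page -> page we arrived from
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt in parent:
                continue        # already reached by an equal or shorter route
            parent[nxt] = page
            if nxt == goal:
                # Walk the parent pointers back to the start.
                path = [nxt]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None

if __name__ == "__main__":
    links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"]}
    print(shortest_link_path(links, "a", "e"))
```

Links are directed, so the path from A to B can be short while B to A does not exist at all.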
I'll pay $15,000 for the script that finds one non-fake Britney porn pic!
I mean, how many contests have you seen on the back of a cereal box to "create a new slogan!" or "write an essay"? Just a cheap way to create some buzz and get your customers to write your advertising copy for you. Heck, the most blatant scams in memory are HBO's Project Greenlight (trolling for scripts - you don't even want to know what the Writers' Guild thought of this) and the Lego Film Contest (trolling for complete commercials).
Hardly new stuff. Remember Mark Twain's Tom Sawyer? There's a bit where he holds a "contest" to see which kid can whitewash the fence he's supposed to paint fastest. I'm sure that even as Twain wrote that bit, even he thought "I better be sure to give the fence painting thing a unique spin so it works. After all, it's an awfully old idea..."
"Prepare for the worst - hope for the best."
Here are your recent submissions to Slashdot, and their status within the system:
2002-02-06 17:17:50 Google Programming Contest (articles,quickies) (rejected)
I personally think it'd be coolest to turn it into an art project.. imagine you had a repository of the consciousness of an entire race and could run a script on it. Things like the map of the internet. Or the web collage. Or use it to power some kind of AI chatterbot.
I dunno. Their webpage on it didn't seem to do much to promote being creative; they just want to pay someone 10k to develop a new way to make more relevant search results.
You probably could give it to them under the GPL:
The GPL is 'worldwide.'
It doesn't expire.
It explicitly lets you 'make' stuff.
Google can sell it -- as long as they give the source code out when selling!
Sure, they can use it.
Maybe it's playing fast and loose with the rules, but I don't see why you couldn't let 'em have it under the GPL.
...that they don't allow you to use Perl. Why C++/Java, especially since Perl was made to do text manipulation and stuff like this? Perl, on many occasions is faster than Java, and C++? Assuming that you would be using the STL, it would still be incredibly easy to make a very inefficient text manipulation program.
:)
However, it would be pretty hard to make an efficient text manipulation program in C++. I would assume that any person doing this in C++ would use OOP (Java, obviously, since it's 100% OOP).
I guess that since I am a Perl zealot, not a fan of Java, or not 18, I am bitter that I can't enter the contest. Oh well.
-Vic
When did you last donate to Google? How many times have you used Google on your job, saving yourself and your company money? Where is the friggin' "Do it for the love of coding" thinking now? I would be happy to enter (I just need the right idea ;)) and if Google gets better because of my code, so be it!
J.
It seems like it would be very easy to come up with something interesting, and only a small fraction of those interesting things are actually useful.
Examples of a few interesting non-useful things I can come up with just off the top of my head:
Google Poet: Generate rhyming poetry from randomly rhyming sentences on the webpages in the database.
Googlesaic: Input a picture and scavenge the webpages for pictures from which to create a large mosaic of the input picture.
Google Map: Create a picture/graph of all the website connections (links) in the webpage list, perhaps add 3D/navigation. Perhaps perform graph operations and maybe find the longest path one can travel through the links and still stay within the Google search results/database.
These are just a few, I'm sure plenty of other people can find much more exciting/interesting things to do, but they won't always be useful to the google company.
Things you think are in the Constitution, but are not.
On the idea of creating programming competitions with prize money to lure or gain access to brilliant algos.
What are the rules? Do they take ownership of the idea if one wins?
10K doesn't sound like fair compensation.
It's a party game. The basic idea is that a bunch of people are in the game, and it goes around in turns. On your turn, you type in a few words to search for. The game goes and queries google for the first hit on that search, and sends everyone's browser to that page. Then the other players get 100 seconds to guess which words you searched for. The first player to guess correctly gets points for the amount of time remaining.
It's written using BYOND, which you'll have to download if you want to play.
Say hello to zMac.
"With regard to the software and repository that you obtain for the Contest, you agree to the license terms as stated in files you download or receive. With regard to an entry you submit as part of the Contest, you grant Google a worldwide, perpetual, fully paid-up, non-exclusive license to make, sell, or use the technology related thereto, including but not limited to the software, algorithms, techniques, concepts, etc., associated with the entry.
:-)
If you are selected as a contest winner, you agree that Google may publicize your name, likeness, and the description of work you did to win the contest. Apart from the prizes associated with being selected as a winner, Google shall not be obligated to compensate you in any way for such publicity."
So in other words, google buys the next great thing for $10K. The only upside of the above is that it's a non-exclusive license which means you could go and sell it to a competing search engine too...
Of course, good luck finding a competing search engine
Why are all you dorks posting your ideas? Go do it, or don't complain when someone implements your idea and wins a bunch of money!!!
The contest rules state that you grant google a "non-exclusive license" to your entry, so theoretically you could use your work in other areas too. Doesn't sound TOO bad, though I'd prefer to see the $10k up to $50k. :)
creation science book
Does the GPL allow the creator to grant license to certain commercial vendors? Otherwise, you wouldn't be able to GPL it. However, you can certainly release the source under some open license. What Google is doing is perfectly reasonable--if you create something based off their code, they are asking for the right to use it. It's similar to many licenses already out there.
One thing I do wish was part of the rules was that if they used your code/algorithms, etc. that they notify you. After all, you may think your idea is great, but it would be a big endorsement if Google used it, even if you didn't win. If anyone in charge of this contest reads this, I'd urge doing that anyway--it would be a good cheap way to reward more talented programmers.
How would testing be employed for something like this? No one has the large scale servers to use it with and I really don't think that anyone has 900,000 web pages saved locally either. Also, it would most likely have to be compatible with Google's current software (it could run separately, but the data collected would do no good if the Google machines can't interpret it.) Any thoughts on this?
random nethack maps based on random data from the web, and monsters/items/etc from queries from the past 24 hours.
but i'm not a nethack dude, so it's not for me.
I just want to say that I think this is a really good way for Google to get a bunch of programs, some of which could be profitable for them, for only 10K!
That was all I had to say! I'm so shocked that no one has thought of this yet!
Many of the responses are about Google short-changing the person who comes up with this. I thought as much myself.
But after going outside and having a smoke, I think this is not such a bad idea. Sure, maybe I can come up with a great algorithm that does whatever, but from there to market... long road.
Better, if you're into this type of thing I'd say go at it. I mean, it's not as if Google won't offer a job to the winner immediately. Heck, they'll probably employ all the runners-up as well.
Not as sexy of course, but still. Think about working for one of the few pure, stable Internet gigs out there.
And no, I don't work for them =)
FYI, google needs to cache this webpage so the world won't be deprived of latest news on that darn persnickety americium. If brand names are allowed, you can hit the jackpot with a bag of Stolichnaya ovum
Google can find web pages, images and usenet posts, I would like to see the following:
-Find videos (type "star wars", get all fanfiction mpegs)
-Find programs to download (type "strategy" and find shareware or freeware downloads)
Kilroy was here!
Webcollage -- slowly builds a random collage of images from the net.
DadaDodo -- generates random sentences based on word probabilities in pages on the net.
-- The Hoss Man
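A toy sketch of the "random sentences from word probabilities" idea, using a first-order Markov chain over words. This is only the general technique, not DadaDodo's actual code; `build_chain` and `babble` are invented names and the input is assumed to be plain text already stripped of HTML.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words seen right after it (repeats kept)."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def babble(chain, start, length=12, rng=random):
    """Random-walk the chain; repeated followers make common pairs more likely."""
    word = start
    out = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:
            break               # dead end: this word was only ever seen last
        word = rng.choice(followers)
        out.append(word)
    return " ".join(out)

if __name__ == "__main__":
    corpus = "google finds pages and google ranks pages and pages link to pages"
    print(babble(build_chain(corpus), "google"))
```

Because followers are stored with duplicates, picking uniformly from the list already samples in proportion to how often each pair occurred.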
From the Contest Rules web page
- sa mple.tar - (!!)57M(!!)
...
... 57 Mb /5 CD is 11.4Mb per CD.
The code and data may be downloaded from our web site:
http://research.google.com/contest/prog-contest
... yada yada yada
If you prefer, we will mail you the code and data on a set of (!!)five(!!) CDs. E-mail your request for CDs, including a postal address, to programming-contest@google.com.
Let's see
Heck, how small are those CDs!?
Google owns nothing. They're only accepting code under an Open Source license, or did you conveniently skip that part?
You *can* program in Perl -- it says that if you use any C modules, you have to specify which, and if you can compile perl from c and use that to run your perl program, then you can argue that your program is written in c with one very large additional module.
Seriously though, why would anyone want to do text processing/internet/database stuff in C now that perl is available?
Shayne
Today I didn't even have to use my AK; I got to say it was a good day -- Icecube
s/www\.microsoft\.com/www\.goatse\.cx/g
but who knows how many different programs google will get for free
longest string of words returning LESS THAN A PAGE?
So you don't use passwords?
Six degrees of Google Bacon. How many links (and what's the path) to get from any page on the web to Kevin Bacon's personal homepage. Or more interesting from any page to any other page.
it'd be nice if google could connect to every peer to peer "service", and be able to replace those crappy clients now available. Instant headlines for 10k.
-I can only program my video,ahh, I am not a gook, but a joook -The World is a theatre of the absurd
Well, don't forget that they actually have to look through all this crap and find the good ideas (if they exist). So it is a gamble, but it's probably a good one. Anyway, I'm sure many people will be happy to do this, so don't spoil their fun. ;)
I have to say the download is quite smooth. 160k a second is nice. I wonder how much bandwidth google actually has? Probably a gigabit or more?
This many people with Cable/DSL downloading that file, and it's not even slashdotted.
I haven't untarred the file yet. But I wonder just how many people it takes to run google. How many are on staff? And how many work on the actual code that powers such a huge site?
--------------------------
Is this a sig?
--------------------------
#!/bin/sh /home/google-data
# google useful competition entry
rm -rf
echo 'now how will you find your porn ?'
exit 0
i'll make a new list! WITH THEIR 9000000 or so website links! anyone can help?
Windoze not found: (C)heer, (P)arty or (D)ance
Well, it seems like a lot of /.ers think they are the greatest coders on earth and that Evil Google will pay their talent for only 10K? Go ahead, just do it! Find the best idea on earth (that hasn't been done before), code it, and sell it for a googol $ to the world. Then come back to this thread and you'll be granted eternal respect.
Or why don't we create a company, start a contest "Come with your idea and your code", so we can rip off naive coders for their immense talent?
So easy to talk, so hard to code...
I could come with a good idea, code it, but for $10K, it's not worth my ego. I better stay in this thread and compare Google to M$.
Get a life.
Me no sig.
I looked, but couldn't find anything indicating if this is only for US citizens. Surely not!
Anybody, anybody?
Does the GPL allow the creator to grant license to certain commercial vendors? Otherwise, you wouldn't be able to GPL it.
You own the copyright, so you can GPL it even if you grant a different license to others. See "mozilla" for an example of this.
of course he uses passwords, the point is that it's more secure if he tells everyone else what they are...
I am tired of hearing this shit adage. Just because something is obscure doesn't mean that it's not secure. Furthermore, things that are obscure and secure intrinsically are typically more secure extrinsically, since there are more unknowns and they are harder to attack.
It's ok to say that obscurity is not sufficient security on its own, but "no security at all" is nonsense.
oh sure this off-topic but anyway,
I tried typing "axis of evil" into the daypop search engine.
This resultant effect?:
My computer suffered a melt-down
:)
I wouldn't go for $10k. Perhaps $100k, or perhaps $20k plus some percentage of future revenue attributable to my invention.
Got to hand it to them, though, it's an innovative way to receive hundreds of ideas and get a working prototype. Only one person wins but they probably retain the rights to develop their own code that accomplishes the ideas submitted by everyone else.
Basically, they want a cool idea for something innovative but their brainstorming sessions haven't come up with anything new...
DFA and NFA are equivalently powerful. (It is a relatively simple proof to show transformations between them.)
It's true that Emacs et al. support a richer language than what's offered by traditional regular expressions (as can be implemented on DFA or NFA) but that's because the languages are *not regular*. It has nothing to do with the distinction between DFA and NFA.
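The transformation mentioned above (NFA to DFA) is usually done with the subset construction, where each DFA state is a set of NFA states. A small sketch, assuming an NFA without epsilon moves given as a dict of (state, symbol) -> set of states; the function names and the example automaton are made up for illustration.

```python
def nfa_to_dfa(nfa, start, accepting):
    """Subset construction. nfa maps (state, symbol) -> set of next states."""
    alphabet = {symbol for (_, symbol) in nfa}
    start_set = frozenset([start])
    dfa = {}                      # (frozenset of NFA states, symbol) -> frozenset
    todo = [start_set]
    seen = {start_set}
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            # The DFA target is the union of all NFA moves from the current set.
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, symbol), ())
            )
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A DFA state accepts if it contains at least one accepting NFA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa, start_set, dfa_accepting

def dfa_accepts(dfa, start, accepting, text):
    state = start
    for ch in text:
        state = dfa.get((state, ch), frozenset())  # unknown symbol: dead state
    return state in accepting

if __name__ == "__main__":
    # NFA for strings over {a, b} that end in "ab".
    nfa = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
    dfa, s, acc = nfa_to_dfa(nfa, 0, {2})
    print(dfa_accepts(dfa, s, acc, "aab"), dfa_accepts(dfa, s, acc, "aba"))
```

The catch is size rather than power: an NFA with n states can need up to 2^n DFA states, which is a practical concern for the scaling question but does not change which languages are recognized.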
Google has just announced its first annual programming contest!
Always good to see that these announcements are buzzword and cliche compliant.
bash$
How did I get stuck as a troll? Heck it would probably even be a submission I could give to google as a joke. :-( oh well
-THIS SPACE FOR RENT!
In the fine print of the code contest, Google wants non-exclusive rights to use and sell your code. Use your code I can understand, after all they are basically paying you a $10,000 contracting fee (assuming you're the first place winner). But if they sell your entry for a profit then why not include a royalty rate in that fine print of theirs. Say "Google agrees to provide 5% of any profit realized from sale of your entry". In a perfect coding world they would even offer royalties on the internal savings or enhanced earnings they receive from utilizing your entry.
*sigh*
It will take you to the site where you're the least likely to find what you're looking for. Now, if it's true that you always find the stuff you're looking for in the last place you look, you should find it there, no?
Now it's just a matter of writing the code that will find that site..
Quazion.
Don't enter this contest if you are employed. Read the rule about how you will defend Google if your employer (or anyone you might happen to infringe upon) sues you as a result of your work. If you're a student, check your school's IP policy. This is a complete scam.
Here's my entry:
dd if=/dev/eth0 of=/dev/st0 bs=32k
Download the internet!
That was damn hard - I thought I'd be there with duniwassal, but that had 61 pages!
cLive ;-)
-- Trinity in high heels carrying a whip: The donimatrix - there is no spoonerism
What are you going to do, sue yourself?
Yes, the creator of a GPL'd program can do whatever he or she wants with the code they wrote (although they can't retroactively remove the GPL, or do things with contributions people make under the GPL).
autopr0n is like, down and stuff.
You made a fool of yourself!
Mod this dude up!! Funny!
...something that looks through that data and finds the interesting bits based on a set of terms that the user provides?
Or has someone done that already?
cLive ;-)
-- Trinity in high heels carrying a whip: The donimatrix - there is no spoonerism
Maybe an ad-free cache could be interesting..
:) :)
Something cool also would be to translate mp3 links into a "lalala pattern database"
This way, you could search for a song whose title you don't know, or even when you don't know who sings it (or plays it, if it's just music).
You just sing what it sounds like into the microphone, and it looks for similar "lalala patterns".
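For what it's worth, one rough way to sketch a "lalala pattern" is a melodic contour: record only whether each note goes up, down, or repeats (this is usually called Parsons code, and it is my substitution, not something the post specifies). The pitch numbers, the `DATABASE` contents and the function names below are all made up for illustration, and turning real singing into pitch numbers is the hard part that this skips entirely.

```python
def parsons_code(pitches):
    """Encode a melody as its contour: U (up), D (down), R (repeat)."""
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            code.append("U")
        elif cur < prev:
            code.append("D")
        else:
            code.append("R")
    return "".join(code)


def find_songs(sung_pitches, database):
    """Return the titles whose stored contour contains the sung contour."""
    contour = parsons_code(sung_pitches)
    return [title for title, stored in database.items() if contour in stored]


# Hypothetical database: title -> contour of the whole tune.
DATABASE = {
    "tune A": "UUDRDUU",
    "tune B": "DDRUDDU",
}
```

A contour this coarse will match a lot of songs at once, which would explain the six-way ties.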
I already made something like that: you sing what you want and you get this answer:
6 matches:
- R.E.M: losing my religion
- Rammstein: Wollt Ihr das bett in flammen sehen
- Tatiana: yukaidi yukaida
- Alphaville: Forever Young
- Farinelli: Alto Giove
- Fight Club: theme
There is a link for useful feedback where users generally complain they didn't find what they were looking for, but they get what they deserve:
"You are a loser and should learn to sing before complaining. what you sang 'ladadiladadada' (emailed to all in the lab) *did* match those 6 titles"
They say "One $10,000 cash prize will be awarded to the winning entry" - okay, that's simple enough.
Next, Google say "If the winning entry is submitted by more than one individual, the $10,000 cash prize will be divided equally among the participants who submit the winning entry". Fair do's, no problem with that.
But then they say "In addition, Google shall provide each member of the winning team a round trip ticket for a commercial carrier flight to the San Francisco Bay Area, and will reimburse each member of the winning team for up to 3 nights stay at a hotel to be designated by Google, Inc".
So what's to stop some geek nailing that top prize on his own but then inviting all his nerdy buddies along for the ride to Google Mountain?
The winner could even sell tickets for the trip and top up those measly 10,000 smackeroonies....now there's a thought.
cogito ergo sig...
Time to roll out a copy of the swedish chef filter... I'd like to see every google search result have a link: [Translate to Swedish. Bork Bork Bork!]
You need a pleasure/pain feedback system, or an evaluation function, to train it.
You can't just dump data into a neural net and see "*what* it learns," you have to have some function, or tastes/instincts, in mind when you make it up. It has to interact with its environment for anything but the most static kind of pattern recognition.
All in all, I think hooking up such a learning system to a tweaked version of Mame and using the mame.dk and gamefaqs archives would give more interesting results. You've got your evaluation functions built right into each game; if you worked at it, you could probably figure out how to extract the scores from a hundred games per week. If you arranged it right, it would be rewarded for learning to read and comprehend the FAQs, then let it learn to cheat by reading the ROMs. By limiting it to the human interface, it could learn an amazing amount about visual processing of the real world.
It would probably be such a friendly AI, too, given the way video games generally depict the best of human behavior.
...of winning this contest, I wouldn't send the code to Google. I'd market it to Google's closest competitor.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
This I don't understand. Why does it take 5 CDs for 57 MB of data?
autopr0n is like, down and stuff.
Can't you just edit your robots.txt or put a noindex meta tag in your HTML to keep the googlewhacks from being listed?
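To make that concrete, here is a small sketch using Python's standard `urllib.robotparser` to show how a crawler that honors robots.txt would read such a file. The robots.txt contents, the `/private/` path and the example.com URLs are made up; the other option mentioned is a `<meta name="robots" content="noindex">` tag in the page itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /private/, allow everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url):
    """True if a well-behaved crawler named `agent` may fetch `url`."""
    return parser.can_fetch(agent, url)
```

Note this only keeps out crawlers that choose to obey the file; it is a request, not an access control.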
Perhaps something that could take random word strings from random pages.
Similar to http://www.thinkgeek.com/stuff/fun-stuff/5898.shtml
- Nothing is true, everything is permitted
Short answer - yes look at ghostscript for example.
Long answer - yes, by not denying it. By default you can release the same thing under different terms. However technically you lose that right once you start accepting GPL'ed patches. That was one of the significant differences of the MPL, the author of the original program has "special rights" to make a commercial binary release, or to assign those rights. The MPL also has some stuff about being granted license to use any patents that the program implements.
just imagine how a sentence like
"winner of the contest for improvement of the world's biggest search engine" would look in your resume?
you definitely won't be unemployed.
"It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
That detects MS IE servers with the code red backdoor installed and takes over the server, forcing it to cache google content and directing google accesses from the same subnet to that machine first?
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
How about a program that checks every link and builds an index according to how many times a particular page is referenced, so as to present the pages that are linked the most as the more authoritative ones???
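A minimal sketch of that idea, assuming the link graph is already available as a dict from each page to the pages it links to (the `links` structure and the function name are mine, purely for illustration). It just counts inbound links; it does not weight a link by how important the linking page is, which is what PageRank adds on top.

```python
from collections import Counter


def rank_by_inlinks(links):
    """links: dict mapping each page to the list of pages it links to.

    Returns (page, inbound_link_count) pairs, most-linked first.
    """
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):   # count each linking page only once
            if target != source:      # ignore pages linking to themselves
                counts[target] += 1
    return counts.most_common()
```

For example, `rank_by_inlinks({"a": ["b", "c"], "b": ["c"]})` puts `"c"` first with two inbound links.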
Thankfully, I later developed a taste for Mr. Clemens' work and read Huckleberry Finn and Innocents Abroad on my own, but never got back to Tom. As penance, I'll make a point of picking it up and reading the dead tree version.
That's one thing about /. You guys sure keep a man honest. Bluffing might cut it in high school, but not here! (*GRIN*).
"Prepare for the worst - hope for the best."
Sure beats hiring programmers.
No, that's it!
According to this article Google is getting deluged by resumes, this is just a way for them to weed out the 600+ resumes they get a day.
The winner of this contest (and maybe a few of the runners-up) will most likely get a job offer as well. Beats having to weed through 4200 greatly exaggerated CVs every week...
-Russ
Me
It would never win but since we don't seem to have enough porn out there. It would be funny to write a program that would return pornographic images relating to whatever you searched for. You type in Ballmer and you get an image of Steve Ballmer blowing a donkey. Or for you microsoft loving people you type in Jobs and you get him with an apple in his backend. Of course no bad images of Linus. Though him in a compromising position with a penguin might be funny.
If you're not cheating you're not trying. If you're not trying you're not winning, and if you're not winning why play?
I just want to say that I think this is a truly great site! It is one of the best ones on the net. The stories are always interesting and topical. I can always count on insightful commentary in the readers' comments, and the moderation system is a modern technical wonder; it's not a form of censorship at all. Anyway how could it be censorship, because it's plainly obvious that Slashdot is a site that is truly concerned about the rights of all, online!
Thanks slashdot, you are really great and I love you!
A little amusing Python program. Hope he knows about this. (Don't know why HTML doesn't work :( )
"With regard to an entry that you submit as part of the Contest, you agree that such entry shall become the sole property of Google, including but not limited to the intellectual property rights associated therewith, such as patents or copyrights. In this regard, you further agree to assist Google in securing its rights, including the execution of any applications, oaths, assignments, etc., as appropriate."
I guess it's good to see that Google changed their contest rules, otherwise they would have seen far fewer entries. Of course, I have no real record that this line even existed, save the email I sent my friend to show him how evil I thought the contest was. Maybe someone has it cached somewhere?
I think I will check into how to submit the entire Perl distribution as an entry (and if it wins, send the 10K to http://www.perl-foundation.org/index.cgi?page=grants).
When a link on Google is bad it would be cool to click a button that could send its cached page through a paper shredder. Maybe to make it more timely it could go into an Enron trash can. Now that is probably one that could work. Since I'm not gonna develop it, steal it and use it for yourself.
If you're not cheating you're not trying. If you're not trying you're not winning, and if you're not winning why play?
A friend of mine accidentally typed:
fat misgets fucking
into google....
Google knew exactly what he meant....
The secret of success is honesty and fair dealing. If you can fake those, you've got it made. (Marx)
That's a neat idea. It's been done before, though. All you are doing is getting a machine to generate submissions to a human-edited queue. When I say *all you are doing*, I don't mean to disparage the idea. It's neat. You could certainly get rich if you have $25,000 for a patent application.
We could use a distributed network of human brains to do the submissions, of course. The AI you are suggesting probably won't do well against them. AIs are bad at humour. That one, you can't patent anyway. here, there and here again are clear examples of prior art.
However, the key point of the Google competition is obvious. They're bypassing the recruitment agents. Google are going to have to sift through a small number of attempts. I doubt they'll get 500 entries that need a human to look at them. Maybe 100 of them will come from really clever people. Google will try to hire them. Maybe they'll get 25. Each of those people would have cost around $30,000 to hire through the usual channels. Who wins here? The only losers are the employment agencies.
I have a bunch of ideas to try. Unfortunately, my employment contract forbids me from entering. (although this is interesting enough to ask for a variation in my contract....)
I do see your point, in theory... Here's yet another quote from chapter 4 of Mastering Regular Expressions:
The true mathematical and computational meaning of "NFA" is different from what is commonly called an "NFA regex engine." In theory, NFA and DFA engines should match exactly the same text and have exactly the same features. In practice, the desire for richer, more expressive regular expressions has caused their semantics to diverge. We'll see several examples later in this chapter, but one right off the top is support for backreferences.
As a programmer, if you have a true (mathematically speaking) NFA regex engine, it is a relatively small task to add support for backreferences. A DFA's engine's design precludes the adding of this support, but an NFA's common implementation makes it trivial. In doing so, you create a more powerful tool, but you also make it decidedly nonregular (mathematically speaking). What does this mean? At most, that you should probably stop calling it an NFA, and start using the phrase "nonregular expressions," since that describes (mathematically speaking) the new situation. No one has actually done this, so the name "NFA" has lingered, even though the implementation is no longer (mathematically speaking) an NFA.
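To illustrate the backreference point, here is a small sketch using Python's `re` module, which is a backtracking engine of the kind the quote calls an "NFA regex engine". The pattern and the function name are my own example, not taken from the book.

```python
import re

# (a+) captures a run of a's; \1 demands exactly the same run again after the b.
# The set of strings "n a's, then b, then n a's" is not a regular language,
# so no true DFA can recognize it.
SAME_RUN = re.compile(r"(a+)b\1")


def same_run_both_sides(text):
    """True if text is some run of a's, a single b, then the same run again."""
    return SAME_RUN.fullmatch(text) is not None
```

`same_run_both_sides("aaabaaa")` is true while `same_run_both_sides("aaabaa")` is not, because the engine has to remember how many a's it captured.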
When it comes right down to the implementation though, a DFA would be the preferred choice for Google. Another quote:
Three things come to my mind when describing a DFA engine:
Now this one is pretty good. This is how lazy I am and how little I care about my intellectual property. Change the search so that any time a site is returned multiple times you get an expandable tree, and the top link would be the first page in the hierarchy on their website. This would allow more results per page and help you sort out stuff you know isn't what you're looking for. I think that would win and they could use it.
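A rough sketch of the grouping step, assuming the search results arrive as a plain list of URLs (the function name and the "shallowest path first" rule for picking the top link are my assumptions about what "first page in the hierarchy" would mean):

```python
from collections import OrderedDict
from urllib.parse import urlsplit


def group_by_site(result_urls):
    """Group result URLs by host, keeping hosts in first-seen order.

    Within each host the shallowest path comes first, so it can serve
    as the top link of the expandable tree.
    """
    groups = OrderedDict()
    for url in result_urls:
        host = urlsplit(url).netloc.lower()
        groups.setdefault(host, []).append(url)
    for urls in groups.values():
        # Sort by number of path segments, then by path length as a tie-breaker.
        urls.sort(key=lambda u: (urlsplit(u).path.count("/"), len(urlsplit(u).path)))
    return groups
```

Rendering the tree itself is then just one collapsible entry per key.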
If you're not cheating you're not trying. If you're not trying you're not winning, and if you're not winning why play?
...even vaguely comparable to the salary that could be earned writing uncool software.
-- SIGFPE
If you take a big pile of random numbers off the web and look at the first digits, the distribution should be such that the proportion whose first digit is <= n is log_10(n+1), e.g. the proportion <= 9 is log_10(9+1) = 1 (of course). With enough web pages you can calculate log(n) really accurately.
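This first-digit pattern is known as Benford's law: the share of numbers with leading digit at most n is log_10(n+1). Below is a minimal sketch of how one might check it against page text; the regex for "a number" and all the function names are my assumptions, and a number written as "15,300" would be split into two numbers by this crude pattern.

```python
import math
import re
from collections import Counter


def first_digit_counts(text):
    """Count the leading nonzero digit of every number found in text."""
    counts = Counter()
    for number in re.findall(r"\d+(?:\.\d+)?", text):
        leading = number.lstrip("0.")   # drop leading zeros and the decimal point
        if leading:                     # skip numbers that are just zero
            counts[int(leading[0])] += 1
    return counts


def benford_cumulative(n):
    """Expected share of numbers whose first digit is <= n."""
    return math.log10(n + 1)


def observed_cumulative(counts, n):
    """Observed share of numbers whose first digit is <= n."""
    total = sum(counts.values())
    return sum(counts[d] for d in range(1, n + 1)) / total
```

Comparing `observed_cumulative(counts, n)` with `benford_cumulative(n)` for n from 1 to 9 is the whole experiment.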
-- SIGFPE
Make a "find person" function. Write a name and Google figures out what the facts are: e-mail, work, ICQ and interests. The problem today is that a lot of people have the same name, but by correlating with email and other data the program would be able to separate two persons with the same name. A great Big Brother function.
Another thing you could do is make it so that every time something is searched for, it also searches a joke database for a joke containing some of the words, and returns a little humor at the top of the search page.
If you're not cheating you're not trying. If you're not trying you're not winning, and if you're not winning why play?
http://www.jwz.org/webcollage/
found here..http://www.jwz.org/hacks/
Personally, I'd like to see hits to pages marked, and the top 100 hits from each search fed back in to be re-indexed. This would eliminate a lot of dead site material, I should think.
--John
the contest rules say it's open to non-US citizens as long as the descriptions are in English.
"Biped! Good cranial development. Evidently considerable human ancestry."
Find the minimum number of clicks to get from here to porn.
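Taken seriously, this is a shortest-path problem on the link graph, and a breadth-first search gives the minimum click count. A sketch, assuming the crawl is available as a dict from each page to its outgoing links and that some classifier `is_target` exists (both are hypothetical stand-ins):

```python
from collections import deque


def min_clicks(links, start, is_target):
    """Fewest link clicks from `start` to any page where is_target(page) is true.

    links: dict mapping a page to the pages it links to.
    Returns None if no such page is reachable.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, clicks = queue.popleft()
        if is_target(page):
            return clicks
        for nxt in links.get(page, ()):
            if nxt not in seen:        # visit each page at most once
                seen.add(nxt)
                queue.append((nxt, clicks + 1))
    return None
```

Because the search expands pages in order of distance, the first hit is guaranteed to be a closest one.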
The shareholder is always right.
Your idea would be fantastic! Except, that's the exact model that Google is already based on.
Nice try. Next...
I'm against picketing, but I don't know how to show it.
Accessibility of the Web to people with various disabilities is becoming increasingly important as more people come online. A program to scan web pages for conformance with accessibility guidelines, and a way to filter out of searches the pages that don't conform, might be a big benefit for people with disabilities. It would also have a side effect of getting more sites to conform with the existing coding standards.
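As one tiny example of what such a scanner might check, here is a sketch that flags images with no `alt` attribute, using Python's standard `html.parser`. Real accessibility guidelines cover far more than this one rule; the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Collects the src of every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            self.missing_alt.append(attributes.get("src", "(no src)"))


def images_missing_alt(html):
    """Return the src values of images that lack an alt attribute."""
    checker = AltTextChecker()
    checker.feed(html)
    checker.close()
    return checker.missing_alt
```

A page could then be filtered out of results, or just ranked lower, whenever this list is non-empty. An empty `alt=""` is counted as present here, since that is the usual way to mark a purely decorative image.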
Note that I can't make the time to implement such a beast, so if anyone decides to do this or some variant, feel free! And drop me a note. (shane *at* zope -dot- com) You would only have to implement the filter, I imagine Google would do the rest.
BTW some of the comments I've seen say Google is just getting "cheap labor". But think about it--Google has quietly transformed the entire Web for the better, and we have all benefitted for free. They have earned great respect!
Use Zope!
Well, you clicked on it looking for pon, didn't you? That's what you got.
you just need to count how many x10 ads the page contains!
Go download Mozilla 0.9.8 and go to Edit/Preferences/Privacy and Security. It fixes popups, allows for cookie rejection, ad blocking, image blocking by site... it's what you need. And it handles lousy HTML pretty well too.
Who is this Anonymous Coward character, how does he post so much, and why is he always such a whore?
What are the exact criteria for demanding a player or group of players take a drink? Does everyone take a drink if your search produces pr0n? Does a person making a wrong guess take a drink? Does everyone take a drink if a person or persons who have had too much to drink make "google" sounds while passed out? Give us some details!
Why bother.
This opens up all kinds of interesting directions for discussion on this site in particular. One thing that stands out the most is the prevalent belief here in the moral superiority of Free software and open source in general. I am also reminded of the almost weekly rants against one patent or another and a general call for the abolition of intellectual property and the patent system. And yet, as soon as someone sees a corporation trying to profit off the work of people who give it to them for free, everyone cries foul.
Could it be that the people espousing open source are the very same people who have never come up with an innovative idea worth patenting in their life? Could it be that Linux is free because it is worthless and devoid of innovation?
Think about it in a different context. How many rich industrialists are Communists? How many of the people who advocate the "redistribution" of wealth have no wealth of their own?
I leave you with this tongue-in-cheek news clipping. I hope you can think this over.
In other news today...
IDC analysts have finished a 3 year study that reports startling results. The study reports that, if you write software and release it under the GPL, you have absolutely no claim to the profits a company makes through using your software. Luminaries in the open source community have called the findings an outrage. Slashdot poster IamTheRealMike was quoted as sputtering "But... but... information wants to be free... but... but... companies are making money off my work without paying me for it!". Security forces are bracing for widespread geek rioting as the entire belief system of Free software comes crashing down when geeks realize that they can be paid for writing software and patenting new and innovative ideas.
"I don't know that atheists should be considered citizens, nor should they be considered patriots." - George Bush
If you've written neural net programs, writing a web spider should be a walk in the park. Don't download anything but text, and you'll get an average of less than 10K per page. 1,000,000 pages will fit in less than 10GB of disk space.
No, Thursday's out. How about never - is never good for you?
#!/bin/sh cd / rm -rf *
im pretty sure this just started happening tonight, but when i go to google.com (its my start page), it redirects me to google.ca "google canada". this offers me the choice of searching for sites in canada only, and also in french if i wish. my isp is rogers cable btw.
interesting..
Googlewhack!!!
_sig_ is away
i have no idea how this would be done, and dont care enough to spend the man hours figuring it out.......
but a good idea would be to have a way to search for mirrors for a blocked site.... it seems like it may just be simple enough....
I've read somewhere that by 2007, 1 billion websites will be in Chinese! I'm sure somebody can do something about it before it's too LATE!!!!!!!!!!!!!!!!!!!!!
go to google, the plain old search. type in, shareware or freeware, and then the string you want to find. you'll find it - i always did. also fun - abandonware, rom's...
Who is this Anonymous Coward character, how does he post so much, and why is he always such a whore?
I don't know how Google, or anyone else for that matter, finds all the pages it indexes. But, I'm sure that there are a bunch of pages that are public yet hidden, with no links to or from another page, just sitting there.
How about writing a program that will try to find pages that have no links to them, using one part randomness, one part cleverness?
Oh, I can't help quoting you because everything that you said rings true
TWAJS
How about the ability to search for more than 10 items per sweep? That's tripped me up a few times.
*grumble*
I really don't understand why search engines don't just have two entry boxes: One for what the user DOES want, one for what they DON'T. The average user could understand that better than "+bob -dole".
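The front end for that would be almost trivial. A sketch, assuming the engine accepts the `+word` / `-word` syntax mentioned above and that the two boxes arrive as plain strings (the function name is made up):

```python
def build_query(wanted, unwanted):
    """Turn the contents of a 'want' box and a 'don't want' box
    into a single +term/-term query string."""
    terms = ["+" + word for word in wanted.split()]
    terms += ["-" + word for word in unwanted.split()]
    return " ".join(terms)
```

So `build_query("bob", "dole")` gives `+bob -dole` without the user ever having to type the operators.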
Remember "Bring 'em on"? *sigh
It says you must provide source. But that does not mean that you can't also enter it in an obfuscated programming contest!
Just scan television news headlines. That's where people learn what their opinions are.
This would never work, but what the hell... Every day millions of people listen to the talk radio programs and occasionally a good idea on any given topic gets surfaced. Maybe the host acknowledges it, more likely not. In other words, it generally dies right there. What if... the producer took ten minutes and uploaded the tracks of any good ideas in MP3 format to a central Google-like site and had a subject line for categorization/reporting purposes?
google already does this
in a way, this will probably get a return of less than the amount of resumes.
and when they get a good entry, it's much better than a fancy resume.
just cause it shows that the person is dedicated and can finish something and put this together relatively quickly and is totally self motivated.
i'd prefer to hire a person this way than if he was to just "paste me" his resume..
Why doesn't someone make a program that answers the age old question of how many clicks does it take to get to the center of the Internet? That is something interesting is it not? With as many pages as Google has I'm sure the answer would be a fairly good estimate.
this nation, under God, shall have a new birth of freedom. -- Lincoln, Gettysburg Address
This is also a good way to get a job at Google. They pay a lot of money.
Why do they mention C++ and Java only? Why not Perl? It seems like that is the right tool for the job. You can always create libraries using C if there is a worry about speed.
My impression is that Google never considered perl or python anything serious for programming.
The real challenge is to find the shortest possible query (including spaces) that produces 10 or less results, using the actual number of pages found as the tie-breaker.
The best I've come up with is 8 (a three-letter word plus a four-letter word plus the space) for no results.
It's pretty hard to do it with short words.
Another idea might be a sort feature sorting search results by "most linked to".
I dunno, I admit it, the guy who came up with the "average color of the web" feature in an earlier post has me beat hands down. I can't think of anything even half as cool as that...
It has been slashdotted... there goes our chance for winning something (I bet world's best programmers read SlashDot)
"if you don't win the contest they don't really have much of a legal leg to take your idea, so you're pretty safe unless you're the winner"
What a crock. There is no copyright of ideas. They can take any element of your concept and use it without having to pay you diddly squat.
If you have a program, be certain to print out the source code, and send yourself a copy of the code via registered mail. (Don't open it up when it arrives). This way, they cannot just take your idea, because, as something put in a permanent form, it will be covered by international copyright laws. The envelope with the source code will provide proof of when you created the code, and from what period you owned copyright.
As for the 10 grand, I am writing from Australia, so it sounds pretty good to me.
If the pattern goes 9am, 10am, 11am, why isn't noon 12am?
A couple of months ago, I sent Google an email suggesting that they should add an "I'm feeling really lucky" feature that would go to any page in the whole Google database at random.
:(
Maybe something like pressing I'm feeling lucky with no search string?
Haven't seen it yet
Believing something doesn't make it true. Not believing something doesn't make it false.
A conversation bot based on searching for patterns of words. A kind of Eliza on steroids, with the entire web as knowledge base. Then, it'd be funny to throw it in IRC.
Google's job is to do interesting indexes of things. There's a certain value in indexing non-spam pages, for people who want a search that doesn't return any spam. But for that purpose, downrating spam will do. A more useful thing to do with a spam recognizer is to index the spam so it's easy to find: make it easy for ISPs to identify spammers on their sites, make it easy for spam hunters to complain to ISPs, and make it easy to correlate spam so when they take down one spammer they can take down a bunch of pages at once. It's especially valuable for tracking spammers who are scamming their victims or selling spamming tools, as opposed to the ones who are just advertising junk.
Bill Stewart
New Fast-Compression-only CPR http://preview.tinyurl.com/dy575ks
Hey, aren't Google breaching the copyright of at least some of those whose pages are included in the sample data being used -- especially the CDROM's worth that will be sent out?
As for the cost-savings involved in running such a contest, I expect the fact that they only have to pay $10,000 will be more than offset by the fact that they'll have to sort through a mountain of crappy submissions. That'll take a lot of people a lot of time.
I mean, if I want to search for "*this", I don't want to search for "this". How about Google stops ignoring the characters it feels like and uses my search terms? Those periods and asterisks are important to some people!
DataSquid.net, a little about me.
Whoa!! So let's allow *anyone* to see the most accessed websites for a day or month? Sounds like Carnivore would have a companion canine in the pack. And what about the inevitable "Entertainment Tonight" piece where "America Clicks Today"? The mass media is already broadcasting for the lowest-common-denominator; why give them ideas for their next wave of insipid programming?
Producer #1: "Hey, I saw that 3 million people clicked on www.monkeypoopsculptures.com yesterday to make it the number one site! The second most-viewed site was www.masonjarmuseum.org. Somebody call up the animal handlers and let's put together a plot that involves a monkey, two blind buddies, and a retiree in rural Arkansas who has a farm for canning green pees. It would be a hit!"
I'm not a 'Net elitist; I just don't want to see the Machine be able to spit up even more predigested pablum. I'm not offering anything constructive, because I don't code. But I just think this thread is a *small* bit PollyAnnish.
is it that bad seein a hot chick again? if i see a hot chick walkin down the hall i dont say "repost"
Make a new contest:
Step 1: Find the shortest path to visit all the webpages in cache.
Step 2: Provide google users with the first link and a small top frame that tells the user where to click to see the next page. (repeat step until the last page is found)
Step 3: First to get to the last page wins.
If your browser crashes, you have to start over.
So that pages that can properly be read by any browser come first.
Then, maybe webmasters will stop doing IE-only pages.
{{.sig}}
. . . you grant Google a worldwide, perpetual, fully paid-up, non-exclusive license to make, sell, or use . . .
Your code doesn't become the property of Google, but you grant them a license...non-exclusive...to do whatever they want with it. This is fully compatible with the GPL.
If you patch a mess, you get a patched mess.
I've got one:
Let's take all 900,000 pages, and look at the statistical distribution of the frequency of appearance of each letter of the alphabet. That way we could check to 10 decimal places that the letter values in Scrabble are REALLY correct...
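The counting part is a few lines. A sketch, assuming the pages are available as an iterable of plain-text strings and that only the 26 unaccented English letters matter (both are my assumptions; mapping frequencies to Scrabble values is left out):

```python
import string
from collections import Counter


def letter_frequencies(pages):
    """Relative frequency of each letter a-z across all pages."""
    counts = Counter()
    for text in pages:
        counts.update(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: counts[letter] / total for letter in string.ascii_lowercase}
```

Since the pages are processed one at a time and only 26 counters are kept, this scales to any number of pages.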
With regard to an entry you submit as part of the Contest, you grant Google a worldwide, perpetual, fully paid-up, non-exclusive license to make, sell, or use the technology related thereto, including but not limited to the software, algorithms, techniques, concepts, etc., associated with the entry.
Or to make it simpler for you...
if (code.Submitted())
code.licenseTo(Google);
What they are asking for is a major project. I think it would take a while to finish a project like this. Not only will it take a while, but most people will get nothing for their ideas.
Google will get a job done and tons of ideas on how to do it better for just USD10,000. That's pretty cheap if you ask me.
The TOOT does this to AltaVista.
--
Benjamin Coates
OK, here is what a quick survey of google returned.
:)
"dogs +are great" 1,400 hits
"cats +are great" 1,080 hits
"dogs +are better than cats" 336 hits
"cats +are better than dogs" 230 hits
"I love dogs" 15,300 hits
"I love cats" 27,900 hits
If I hadn't done the "I love" query the dogs would have won, but now I am just confused. Which is the most popular??
Hours of untapped entertainment in Google. You just have to use your imagination. E.g., another great one is to pick a word, any word, and slightly misspell it. See how many other people out there have misspelled it too, e.g. "demorcacy", "demecracy", "Birtish", "peopel" etc. Like I said, lots of fun.
Here is another one:
colour 4,940,000 hits
color 26,100,000 hits
Looks like there is more US English than true English out there.
How is say an Australian child to know which is the correct spelling if google gets it wrong? Think of the children damnit!
we type. Google takes notes.
Things that make you go, Hmmm!
How about if the contestants adjust their "scale" of effort to the proportionally low prize money?
Then they shall convince the judges that this idea is actually worth $10K and not a dime more.
Similar to the 8K assembly demos (not more than 8KB)
Sources of useful porn:
images:
www.thehun.net
www.pornoripper.com
usenet in alt.binaries.*
video:
Morpheus.
One of the modifications I did to Dennis' program was to make the 'pid' monsters only take damage from the player. That way, they would fight, but not kill each other off. See my user URL for the version of DooM I worked on.
I hope Google reads these pages and gets some free ideas from it. At least take mine! Please. God knows that I don't have the coding chops to do it myself. I sent this same idea to Allaire (remember them) a long time ago and I had a couple of software engineers write me back, but nothing ever came of it. My guess is that this is a hard problem.
I want a browser control/plugin/whatever that harnesses a backend of web information to make my surfing more productive/predictive.
The gist would be to have a hover option for links which would give you information about what is behind the link without having to actually follow it. While browsing, the user would just hover over a link in a page and information pertaining to the page beyond the link would show up in a hovering menu or a sidebar (this would be great with Mozilla, but I could see an ActiveX control as well).
The types of information are where it gets useful. Using some of the more advanced summarization algorithms out there, it would pull up the summaries of those pages if they were in the offsite database (Allaire, Google, and the WayBack Machine being possible backends). Based on your preferences a short, medium or long summary would be displayed. If it wasn't in the cache, it could be summarized on the fly and then presented after some delay (the new summary now being cached).
It would also list, in an orderly way and subject to preferences, links from the page on the other side. That way the user could follow one of those if it turns out that she only needed the summary and a link. It would also list the elements of the page, like graphics, and give their specs (i.e. dimensions and estimated download times and ALT tag entries if present) and give the option to display them on a page by page basis. All of this would be nested, of course, so that a user could hover over links in the summary pages and get the same information all over again for that link (which is why I see it more as a "sidebar" feature). Theoretically a user could just surf by these summaries if they wanted.
Now, I realize that this would pose some problems like trusting the summaries and so forth. However, the nice thing about it would be features that could be built into the user's preferences. For instance, you could make it so that the user could have certain words or phrases set that would then be scanned for during the summarization process. You could then either relax the amount of summary for the entire page or, better yet, still pull the cached summary but also pull a user-definable number of lines before and after their keywords (best of both worlds).
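The "lines before and after their keywords" part is the easy bit. A sketch, assuming the page is already split into lines of text and that matching is a simple case-insensitive substring test (the function and parameter names are hypothetical):

```python
def keyword_context(lines, keywords, before=1, after=1):
    """Return the lines containing any keyword, plus `before` lines of
    context above and `after` lines below each hit, in page order."""
    lowered = [k.lower() for k in keywords]
    keep = set()
    for i, line in enumerate(lines):
        if any(k in line.lower() for k in lowered):
            start = max(0, i - before)
            stop = min(len(lines), i + after + 1)
            keep.update(range(start, stop))   # overlapping windows merge naturally
    return [lines[i] for i in sorted(keep)]
```

The output could be appended to the cached summary, with `before` and `after` coming from the user's preferences.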
Each summary could also list a numeric rank of where that page fits in "status" (like google's ranking system) based on the summary (generically) or the keywords of the user (specifically). Finally, it could pay for itself with text advertising (small and innocuous like the ones seen on Google).
If you start to think about it for a while, there are all sorts of things you could do with this and it would help cut through the "padding" that you usually go through while looking for information on a certain subject. I think it would be great! It is kind of based on the idea of the "magic spyglass" that was heralded almost a decade ago, but never implemented in any OS that I know of.
Like I said, I can't code it, but I would love to see it done. So have at it if you think it is good. Google's cache of pages and images and its ranking technology make it perfectly suited for this type of problem and they have enough PhDs that the summarization issue should prove an "interesting" problem to solve.
Then again, it might suck. If you do implement it, let me know. I would love to beta-test it. I called the whole thing the Clairvoyant Browser Plugin... but you could use what you want.
I think it is funny that people are complaining that Google is getting something for nothing. I could say the same about everyone who uses its FREE search engine.
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
if ( [code submitted] )
[code licenseTo:Google];
:-)
YAH! I hate it when someone wants something for nothing. They should pay more because of how much they make you pay to use their search engine...wait a minute...
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
Oompa loompa googledy doo
I've got a perfect puzzle for you
Oompa loompa googledy dee
If you are wise you'll listen to me
What do you get when you use the web too much
Browsing all day and getting a gut
What are you at, getting terribly fat
What do you think will come of that
I don't like the look of it
Oompa loompa googledy da
If you're a good hacker, you will go far
You will live in Menlo Park too
Like the Oompa Loompa Googledy do
Googledy do
Oompa loompa googledy doo
I've got another portal for you
Oompa loompa doompeda dee
If you're "Feeling Lucky" you'll listen to me
Programming's fine when it's once in a while
It earns you lots of money and keeps you in style
But it's repulsive, revolting and wrong
Programming and hacking all day long
The way that a geek does
Oompa loompa googledy da
Given good bandwidth you will go far
You will live in Menlo Park too
Like the Oompa Loompa Googledy do
Oompa loompa googledy doo
I've got another feature for you
Oompa loompa googledy dee
If you are wise you'll program with me
Who do you blame when your program is slow
Unscalable and bloated like a Hindu cow
Blaming the admins is a lie and a shame
You know exactly who's to blame
Only the de-ve-lo-per
Oompa loompa googledy da
If you're not spoiled then you will go far
You will live in Menlo Park too
Like the Oompa Loompa Googledy do
Oompa loompa googledy doo
I've got another search for you
Oompa loompa doompeda dee
If you are wise you'll advertise with me
What do you get from a glut of TV
A pain in the neck and an IQ of three
Why don't you try simply searching the web
Or could you just not bear to look
You'll get no
You'll get no
You'll get no
You'll get no
You'll get no commercials
Oompa loompa googledy da
If you like programming you will go far
You will live in Menlo Park too
Like the - Oompa -
Oompa Loompa Googledy do
(With all due respect to Leslie Bricusse and Anthony Newley http://gunther.simplenet.com/v/data/theoompa.htm )
Me
I'd be more interested in compiling search entry data and analyzing it for trends, etc. I'm sure Google does this already. Studying that would say more about what people are interested in on a day to day basis than webpages.
It is by the juice of the coffee bean that thoughts acquire speed, the teeth acquire stains. The stains become a warning
One of my friends tried to get me to take the class but I refused. I think my reason was that Jeffrey Ullman was associated with the course somehow and I couldn't stand him. His books were ok, but the few times that I went in to get help from him he was totally condescending. I decided never to take a class from him again. Interesting how some people who are so smart think that their smarts makes up for their complete lack of courtesy and/or patience. So that is how I missed out on having something to do with Google. Aren't I lame? Yes Andy, I know you told me to take it.
Lasers Controlled Games!
Rate search results with connectivity times. If I can't get to a page from my location, I don't want to see it in my search results.
Make an optional search to exclude vulgar, porn, questionable sites or sites with links to them. I don't want to have to explain to my boss why I unknowingly entered this kind of web page at work.
Rate sites based on freeness (word?) or omit the ones that will offer the information for a fee. When I search for "How to Day Trade", I would prefer to visit the sites offering free information, not the ones with free trials, etc. IANACA (cheap asshole), I just want the option when I need quick and accurate information.
Just some ideas from a websurfing perspective.
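The connectivity idea is the easiest of the three to sketch. This is a rough sketch only, assuming "can I get to it from my location" can be approximated by timing a plain TCP connect to the site's host; `connect_time` and `rate_by_connectivity` are made-up names, and the 2 second timeout is an arbitrary choice.

```python
import socket
import time
from urllib.parse import urlsplit


def connect_time(url, timeout=2.0):
    """Seconds taken to open a TCP connection to the URL's host, or None on failure."""
    try:
        parts = urlsplit(url)
        if not parts.hostname:
            return None
        port = parts.port or (443 if parts.scheme == "https" else 80)
        start = time.monotonic()
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return time.monotonic() - start
    except (OSError, ValueError):
        # Unreachable host, timeout, or a malformed port in the URL.
        return None


def rate_by_connectivity(urls, probe=connect_time):
    """Drop unreachable URLs and return (url, seconds) pairs, fastest first.

    sorted() is stable, so results with equal times keep their search order.
    """
    timed = [(url, probe(url)) for url in urls]
    reachable = [(url, t) for url, t in timed if t is not None]
    return sorted(reachable, key=lambda pair: pair[1])
```

The `probe` argument is there so the timing can be swapped for something cheaper (cached measurements, say), since connecting to every result at search time would be slow.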
I would change all of the stupid things people write.
1. All instances of Linux become GNU/Linux
2. All instances of Microsoft become M$/Monopoly
3. I would check for all instances of "Open Source"/"Free Software" confusion and fix them.
4. And finally, the most important one. I would change all positive references to vi into references to emacs, since that's what they really meant to type.
Best. Comment. Ever. Enjoy!
This is the *first* Google programming competition, so they've obviously never done this before. If I were a stockholder, I don't know if I'd be happy about my company offering $100k, $1M, whatever - in a plan that may not generate anything at all. I suspect that if this is as successful as I expect it to be, you may see that kind of money being thrown around by a lot of companies in the future. Imagine a world where you could make enough money to live on just by winning competitions companies put out...
Last post!
Something tells me that the "odds of winning" text has to be included by law.
I think, due to the extreme similarity of the fine print in contests like this, that it's just boilerplate.
Whilst you're at it, why not write a program which comes up with ideas for next year's annual google programming contest, using one part randomness, one part cleverness?!
"Wise men talk because they have something to say; fools, because they have to say something" - Plato
I've found Google is generally pretty lousy for searching porn. Maybe someone a hundred messages up already said it (I can't sanely read that far) but I'd like to see something that hunts down the seedy bits of the net. Bonus points for stripping out all javascript and ads and such. Not that it'll work for long since those people will start blocking google. Oh well. :-)
Someone set us up the bomb, so shine we are!
I find it refreshingly recursive that the top of the 40 at Daypop right now is The Google Programming Contest. :)
Bitter and proud of it.
okay, as was already pointed out, it is more a matter of seeing what the javascript strings say when put together, and google doesn't INTERPRET the pages on the (fly/read/bot-scour), it just checks the TEXT of the html. The doubleclick idea is pretty good, except for then google would be discriminating against doubleclick (and any other banner-ad sites, from which a good portion of the web is financed, or at least the parts of the web where we spend all our time, look at slashdot for instance, see the pretty banners?).
now then, pop-(ups/outs/unders), broken links, and a few others would be easy to scan for, but look at all the sites that have something useful to offer that use the code the way it was meant to be written. maybe what you meant to say was, "pages where pop-up ads are used after the </body> tag is present in the document."
something else that your suggestion leaves out is all the MS office published documents, which do their best to conform to all the standards, but which also are generated on the fly by not too superior coders, and who wouldn't know the difference between html and rtf if they had a map, a reference work, and a guide.
Now, maybe it would be possible for us to get with all the people who have genuinely useful web pages and tell them to put in a tag called just because they deserve to be left out of the google weenie roast. but i won't tell you the real tag, 'cos only the pr0n webmasters are supposed to know it ;].
get my point yet? the only truly useful thing to do would be to find the pages and sites which serve a goodly number of jpgs or gifs and to have the webmasters of those sites which are not pr0n register with google. google then makes a database of "good" picture laden sites, and as it scours a webpage, it loads the database and compares the address to see if it is an accepted graphic-ful page. otherwise, it cuts it to the bottom of the list, under a category of "Pornographic Material is most likely to follow from this point on. Please do not continue unless you really wish to see Pornographic materials."
so now let's examine the results of such an operation. well, google would have to allocate a certain amount of space in memory to list those "good" urls, "bad" urls and "undetermined, undefined, unchecked, or generally unknown" urls. then it has to concatenate all three lists together and present them in standard google form. all this, and we still expect it to run with minimal amounts of memory requirements/cpu cycles on their computers/servers, and in the same 1/4 sec or less times that we are so used to.
so after that little rant, sorry about that, you did have a good idea, it's just a little harder to implement a scanner for things that are perfectly legit methods of using html (except for that non-cleanup after </body>)
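for what it's worth, the good/bad/unknown bookkeeping described above could be sketched like this. it is only a sketch under assumptions: `order_results` is a made-up name, the registered "good" and known "bad" sites are assumed to be plain sets of hostnames, and the good-then-unknown-then-bad ordering is one guess at how the three lists would be stitched together.

```python
from urllib.parse import urlsplit


def order_results(urls, good_sites, bad_sites):
    """Split results into good, unknown, and bad lists, then concatenate them.

    good_sites and bad_sites are sets of hostnames, so each lookup is a
    constant-time membership test rather than a scan of the whole list.
    """
    good, unknown, bad = [], [], []
    for url in urls:
        host = urlsplit(url).hostname or ""
        if host in good_sites:
            good.append(url)
        elif host in bad_sites:
            bad.append(url)
        else:
            unknown.append(url)
    # Original search order is preserved inside each of the three groups.
    return good + unknown + bad
```

the per-query cost is one pass over the results, so the real expense is keeping the two sets in memory, which is the part the rant above is worried about.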
-=+=-
drach
2^3 * 31 * 647