Wikia Search Engine to be Launched on January 7th
cagnol writes "The Washington Post reports that Jimmy Wales, the founder of online encyclopedia Wikipedia, has announced the launch of a new open-source search engine, Wikia Search, on January 7th, 2008. The project will allow the community to help rank search results, in a model close to Wikipedia. However the company is a for-profit organization. This new search is supposed to challenge Google and Yahoo."
I guess that's their response to Google's Knol (http://en.wikipedia.org/wiki/Knol). Pity to see things heat up between the 'good guys'.
Since this project would seem to depend on the participation and good-will of users in order to work, my guess is that a nonprofit version will follow shortly afterwards, paralleling the open-source model. I also predict that without the benefit of a massive Microsoft-esque head start, the for-profit version will be put out of business in short order.
A-Bomb
Point well made - while spam attacks may be pretty obvious, they could be spread out over time to make them less obvious.
Additionally, I can see this search engine being very much affected by public mood. For example, say there was a royal death and a certain right-wing 'upmarket' tabloid newspaper decided to claim that it was a conspiracy by the Government to kill the royal off. The story is linked to from said newspaper's web site, and thus people improve its ranking. It therefore floats to the top of the results pile, which gives it more exposure and sets off a vicious cycle.
Just a hypothetical situation, but certainly possible. Such a model would also make it possible to carry out smear attacks and to ruin the rankings of competing companies, parties, organisations, whatever - a practice that IMHO should be left to search engine admins.
Those using pirated Tinysoft signatures(TM) are a real threat to society and should all be thrown in jail.
Live today, because you never know what tomorrow brings
I completely agree. I am continually amazed at how good Google's input-correction is - if I do a search for 'pale gire', it knows to correct it to 'pale fire', yet if I do a search for 'canadian gire', it's clever enough to work out that I mean 'canadian tire'. I'm also continually amazed that people running other search services haven't yet realised just how vital this feature is - it's probably one of my favourite things about Google. Less so for monosyllables, but it's useful for words like "monosyllables". I'm particularly surprised that prominent online dictionaries don't have similar functionality, seeing as I would imagine a large portion of their usage is to find the correct spelling of words.
There have only been two fundamental revenue models of content for 25 years now - EndUser and Advertiser. The ISPs went through the throes of the switch from PerHour to FlatRate in the 1990s, and the RIAA is struggling with it now.
I don't know anyone who would "pay to search" casual queries. There are some professional databases which do operate on this principle for high powered content.
From the RIAA threads we learn people don't want to pay as endusers for their content. The post above asks about the advertiser model.
The absolutely tough part about Free Open Source models is that it takes a MUCH longer cycle for the benefits to wind around the social benefit cycle. The monthly rent/mortgage whips around much sooner. The first person to absolutely nail this problem will be the mogul of the 2010 decade.
My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
As trollish as parent is perhaps, he is unfortunately speaking a trollish truth.
Speaking explicitly as a reader of Slashdot, with all the group-think biases a site like this introduces, Wikipedia is floundering in a mire of their own arrogance, and the dissatisfaction with this needs to be heard.
Wikipedia receives most of its traffic from its articles appearing in Google's search results, Wikipedia being relevant content, and Google being the top search engine.
How is Wikipedia to draw traffic to their search engine? Obviously not via Google, as search engines are content free on their own. Integrating it with Wikipedia? But again, Wikipedia is the end target, not a start point, so how could this work?
I don't think Wikipedia has the strategy or money for this to reach critical mass and show its potential, but it'll be interesting as an experiment.
Hey, what would you say to another Slashdot interview so you could answer more questions at greater leisure? :)
timothy
jrnl: http://tinyurl.com/c2l8yr / foes: http://tinyurl.com/ckjno5
"You operate under the sham of an open community, yet exclude those outside a very narrow political agenda. Your a fraud, using open source principals as a smokescreen that presents your personal world-view set as fact to the world."
Actually, no. Wikipedia can be criticized on a lot of grounds, some of them even valid :-); but that it presents my personal world-view or that we exclude people outside a narrow political agenda is just... not grounded in fact.
Perhaps you'd like to come to my talk page at Wikipedia and tell me what you're upset about.
Wikia
Google's mentioned a variety of techniques publicly, although there's sure to be some secret sauce as well. The most obvious check would be a dictionary-based spellchecker. They can also look for letter transpositions, misstruck keys, word-form matching, etc.
:)
They also do a variety of statistical analysis on a ridiculously large data set. For example, if a particular phrase appears over and over again, and all of the words in the query match the phrase save one, it may be more likely that the non-matching word is incorrect.
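To make the phrase-matching idea concrete, here is a minimal sketch in Python. It is not Google's method, which isn't public; it just combines a dictionary check, one-edit candidates, and phrase counts. The counts in PHRASE_COUNTS are made-up toy numbers, and the names edits1, candidates and correct_phrase are hypothetical.

```python
from collections import Counter

# Hypothetical toy data standing in for counts from a very large corpus.
PHRASE_COUNTS = Counter({
    ("pale", "fire"): 120,
    ("canadian", "tire"): 300,
    ("canadian", "fire"): 5,
    ("pale", "tire"): 1,
})
VOCABULARY = {word for phrase in PHRASE_COUNTS for word in phrase}
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings one edit away: deletion, transposition, substitution, insertion."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def candidates(word):
    """Dictionary check first, then known words one edit away, else the word itself."""
    if word in VOCABULARY:
        return {word}
    return (edits1(word) & VOCABULARY) or {word}


def correct_phrase(first, second):
    """Pick the candidate pair that appears most often in the phrase counts."""
    pairs = [(a, b) for a in candidates(first) for b in candidates(second)]
    return max(pairs, key=lambda pair: PHRASE_COUNTS[pair])


print(correct_phrase("pale", "gire"))
print(correct_phrase("canadian", "gire"))
```

With these toy counts, 'gire' has both 'fire' and 'tire' as one-edit candidates, and the neighbouring word decides which one wins.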
Google often (always?) tracks click-throughs on search pages, so it would be able to deduce the accuracy of its suggestion by seeing if a user clicks through to a given result and doesn't come back to the search results. Also, Google correlates different terms that frequently appear together.
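The click-through idea could be tracked with something as simple as the sketch below. This is only a guess at the shape of it, assuming a log of (query, suggestion, clicked, returned) events; the SuggestionStats class and its method names are hypothetical and say nothing about what Google actually runs.

```python
from collections import defaultdict


class SuggestionStats:
    """Counts how often a spelling suggestion led to a click with no return."""

    def __init__(self):
        self.shown = defaultdict(int)
        self.satisfied = defaultdict(int)

    def record(self, query, suggestion, clicked, returned):
        """Log one impression of a suggestion and what the user did next."""
        key = (query, suggestion)
        self.shown[key] += 1
        # Assumption: a click with no return to the results page means success.
        if clicked and not returned:
            self.satisfied[key] += 1

    def success_rate(self, query, suggestion):
        """Fraction of impressions that ended in a click with no return."""
        key = (query, suggestion)
        shown = self.shown.get(key, 0)
        if shown == 0:
            return 0.0
        return self.satisfied.get(key, 0) / shown


stats = SuggestionStats()
stats.record("canadian gire", "canadian tire", clicked=True, returned=False)
stats.record("canadian gire", "canadian tire", clicked=True, returned=True)
print(stats.success_rate("canadian gire", "canadian tire"))
```

A suggestion whose rate stays low over many impressions would be a candidate for demotion.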
It's amazing what kind of stats you can do with a workforce full of Ph.D.s and half a million servers
My response? That you are misleading people.
There are a huge number of sites in the interwiki link map:
http://meta.wikimedia.org/wiki/Interwiki_map
Including for example, uhm, slashdot. And Citizendium. And Merriam-Webster.
And finally, I have nothing to do with the list. I've never edited it, never asked anyone to edit it, and I have no input into what goes on it.
I am sure you will apologize for spreading this information. Right?
Wikia