Wikipedia's Search Engine Plan
jasonoik writes "Wikia, the commercial company founded by Wikipedia's Jimmy Wales, reveals plans for a new, editable search engine. They say that the goal of the project is to get 5% of the search market. The service does not yet an official release date. The article also leaves open the possibility that the search results may contain ads, and concludes by listing figures of the web advertisement market." Update: 03/11 17:24 GMT by KD : Wikia and Wikipedia are separate companies.
...which sounded delicious.
"Do No Evil" became "Be as corrupt and evil as possible."
An "editable search engine"? Great, now even MORE of the searches I run will pop up ads for v14GR4 and enhancements for body parts I don't possess, never mind those linkspam sites that just insert the entire fucking dictionary in metacode.
You searched for: Bill Gates
you got: 400 pictures of penises, vaginas, and one picture of a penis covered in something that looks like it came out of the OTHER opening.
Just imagine what all those malcontents out there with too much time on their hands will do with this! It could be truly amusing.
Not *everything* works best when edited by the hordes.
Wikia is not the "company" behind Wikipedia. The Wikimedia Foundation, which is a non-profit foundation, is what's behind Wikipedia. Wikia is a totally separate for-profit company that is run by Jimbo Wales.
Cyde Weys Musings - Scrutinizing the inscrutable
All those bloggers-for-hire that are starting to find themselves unemployed suddenly have new embedded job opportunities.
... is looking for missing 'have' words ...
Why UNIX?
Wikia, the company behind wikipedia reveal plans for a new, editable search engine. They say that the goal of the project is to get 5% of the search market.
According to Wikipedia, that goal of 5% will triple in the next six months.
The theory of relativity doesn't work right in Arkansas.
Maybe their first project should be: make Wikipedia's internal search work correctly! It can't even handle the most basic misspellings now.
If you're serious about this, don't compete with Google. Instead, partner with Google and make a wiki.google.com that provides Google's own search results & ads, but filtered and processed in various ways, which are handled by the wiki.
For example, you want to give only unique sites/hits but this may depend upon the host's url.
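As a rough illustration of the unique-sites idea, here is a minimal Python sketch that keeps only the first hit per host. The function name `unique_by_host`, the choice to treat `www.` as the same site, and the sample URLs are all assumptions made for illustration; nothing here reflects how Google or Wikia actually process results.

```python
from urllib.parse import urlsplit

def unique_by_host(urls):
    """Keep the first result seen for each host, preserving rank order."""
    seen = set()
    kept = []
    for url in urls:
        host = urlsplit(url).hostname or ""
        # Assumption: "www.example.com" and "example.com" count as one site.
        if host.startswith("www."):
            host = host[4:]
        if host not in seen:
            seen.add(host)
            kept.append(url)
    return kept

# Hypothetical result list, highest-ranked first.
results = [
    "http://www.example.com/a",
    "http://example.com/b",
    "http://example.org/c",
]
print(unique_by_host(results))
```

Whether two URLs count as the "same site" is exactly the part that depends on the host's URL, so that rule is the piece a wiki community would presumably want to edit.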
The Christian religion has been and still is the principal enemy of moral progress in the world. -- Bertrand Russell
Better than "flipping through an encyclopedia" is not good enough (especially with over a million articles). If you have the technology to do better, then why not? The Wikipedia search leaves much to be desired when not searching for something that is the exact name of an article. Searching for something that's in the body of the article is painful at best. From my experience, Google is still the better option.
I agree completely - the default wiki search needs major, major work. If they get this search software working and add it to MediaWiki, it'll be a major improvement. As a standalone search engine, however, I don't see the point.
What's the advantage of having user-editable search results? Anyone can submit sites to Google already. I don't know the exact statistics, but I'd imagine that most sites that aren't complete trash end up getting accepted - my site is a jumble of code I put together to learn PHP and MySQL, and looks like something out of 1995. It got included just fine. Therefore, the only difference this search engine would have is the inclusion of Google's rejects.
Then we have the editing. Don't get me wrong - I'm a big fan of Wikipedia and a believer in the "everyone can edit" system. Nevertheless, I really don't see how free editing can be useful in a search engine. I remember back when Google was first released, one of the things that made it so special was that none of the results were placed by hand. Other search engines placed higher-paying customers at the top (I have no idea if they still do that - I never use anything but Google these days) and consequently the results tended to have problems. User-editing will likely have an even worse effect, with people putting sites that don't belong on top before those that do.
Yes, there will be the community to catch that, but there's a major difference between an encyclopedia and a search engine. In an encyclopedia, there is a limited number of articles, and each one is about a very specific subject matter. There are an infinite number of search possibilities, and very few of them describe only one thing. For example, I'm a big fan of Heroes. Therefore, I go to the search engine and edit the search for "Mr. Bennet" (one of the characters) to list some sites about him before everything else. Then my evil clone, swd09, comes along. He is a big fan of Pride and Prejudice, and changes my edit to list sites about Mr. Bennet from Pride and Prejudice before everything else. I then change it back, and an edit war begins. In an encyclopedia, it could eventually be settled by virtue of the fact that an article is about one or the other. If someone tries to put information about Mr. Bennet from Pride and Prejudice in the Heroes article, it's clear that they're in the wrong. In a search engine, though, how can anyone say whether Mr. Bennet from Heroes or Mr. Bennet from Pride and Prejudice is more important? There's no way to come to a true consensus. To solve the problem, the administration will have to put its foot down and arbitrarily decide, and we end up with a non-user edited system without the neutrality of an algorithm.
Well, you'd better hope no one tries to search for a webcomic on this thing.
while my post just above got modded "troll"?
:(
Someone gave a wikinazi mod points.
Or whatever other term you want to use for bullshit, misleading, false claims of disparity between a corporation and a corporate shell game.
Now I search and my results are:
"Tom is a FAG"
"Bilbo Lives!"
"Search engine optimization: do it the Wiki way"
"In the year 1432, the United States of America was founded by Bill Gates and his horde of windows-operating-system killbots..."
Will this include any attempt at local search? This seems the weakest area in search right now to me. From what I can tell, the information gets online from the old paper directories getting scanned and put in databases, but in a very imperfect way (okay, I briefly worked on this once). It's a hard problem to take all the blocks of text and pull out the relevant fields in a way that works across all the different formats of directory listings and ads as they appeared in paper yellow pages, etc. It always seemed to me the wrong way to go about it. Instead, I'd want a Wikipedia-type setup, with the proprietors of the businesses perhaps given special weight on confirming the accuracy of their own information, but also with freedom from corporate control on information such as reviews, etc. Perhaps with the current scanned-in information as a starting point (but are there copyright problems with this info?) Maybe this has already been done somewhere but hasn't gotten critical mass?
With the inevitability of it being funded by advertising, there's a chance the search results will be biased towards returning links to companies that pay more. Yes, I know Google works like this with its officially sanctioned adverts on the top & side of the search results, but what's to stop companies editing the main results to bias them in their own favour?
To do something right, you often have to roll up your sleeves and get busy.
The thing that really rocks about Wikipedia's search is the Disambiguation function. Even Google does not have something like this.
google --> site:wikipedia.org [put your search term]
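For anyone who wants to script that trick, here is a small sketch that builds the same `site:`-restricted query as a URL. The helper name `wikipedia_only_query` is made up, and the `https://www.google.com/search?q=` form is simply the ordinary search URL you see in a browser, not an official API.

```python
from urllib.parse import urlencode

def wikipedia_only_query(term):
    """Build a Google search URL restricted to wikipedia.org via the site: operator."""
    query = f"site:wikipedia.org {term}"
    # urlencode percent-escapes the colon and turns spaces into "+".
    return "https://www.google.com/search?" + urlencode({"q": query})

print(wikipedia_only_query("Mr. Bennet"))
```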
... but as it ages it becomes more difficult to so quickly find what you are searching for.
there is an upside and down side to what is proposed.
The upside is that you might get better results; the downside is that you might not get any result for what you are searching for, unless.....
It really all depends on how the programmers and users map all the possible findings.
I'd imagine that some sort of thesaurus like plan of classification and tabular synopsis of categories could allow all to be found by providing refinement focus, without trying to refine the initial search text. But rather a refinement of what all is found from such search text.
But we will see.
However, the most notable down of this is that a for profit company will be getting free labor and brain work.
Anyone up for a non-profit effort.......... OR maybe Google can do it given its already existing resources.
And....Do we really need more search engine web crawlers? (vs. better ability to sort thru what is already found.)
For profit means advertising dollars to motivate bias injection.
I mean, what would happen if free TV put all commercials on a specific channel so as to leave shows commercial free?
Web search engines are no different when applying ad income.
So with this in mind, it seems clear that a commercial free version will do better.
To research this ad effect, perhaps an ad-based version of Wikipedia needs to be created.
Google has become less and less relevant. Way too often I google for a search item, and that item isn't anywhere in the results page at all.
So... this ain't my day. I tried to find a very good example of this, so I put, in quotes, the name of what I thought was a little known group even when they were still together 35 years ago and googled ["joe byrd and the field hippies" lyrics].
Damn, Google must have fixed it. The last time I googled for that I got tons of lyrics sites, none of which had Joe Byrd. This time the first entry is Wikipedia (which is the first place I look for lyrics or track listings any more) and all the rest are relevant as well.
Kudos to Google, good luck to Wales. I'm still hopeful, and besides, an open source search engine can only be a GOOD thing.
mcgrew's razor: Never attribute to stupidity that which can be explained by greedy self-interest
.. a Wikipedia-like editable search engine would be no use if a bunch of politically correct environmental-marxists run the show?
Your post is now "2, insightful". It will probably go higher as well.
They might not realize it, but they already have 50 percent of the search market. At least 50 percent of the "Intelligentsia" search market.
Fifty percent of the stuff I used to "look up" through a google search - I now get through wikipedia. You just have to be smart enough to know that the info you are looking for is most likely in wikipedia. And it most often is. Especially since wikipedia is so open - they've got articles for tons and tons of things that no mainstream encyclopedia would ever touch. I no longer use "fan sites" or "episode guide companies" for the episode guides of TV Series, they're all in wikipedia, and the layout and presentation is even better.
Wikipedia is already a search engine, because the "no original research" rule means that it doesn't contain anything that isn't summarized from somewhere else, usually somewhere on the web.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
...at least it would get corrected. ;)
Chu vi parolas Vikipedion?
Yes. It's going to be interesting to see if Wales reports this conflict of interest. It should be reported on IRS Form 990, under "Relationship to Other Organizations". That's where, if you're involved with both a for-profit and a non-profit in the same area, you have to report it.
Form 990 is a public record. GuideStar has them all on line, although you have to register there.
I can see it now... Google acquires Wikipedia, news @ 11
I don't know about 50%, but with me they've easily attained 5-10% of my searches.
Adeptus
No trees were killed in the making of this post; however, many trillions of electrons were horribly inconvenienced.
Now that could be useful!
Engineering is the art of compromise.
There! I feel better already.
According to Wikipedia, that goal of 5% will triple in the next six months.
FYI, that's a Colbert reference. He tried to have mentions of the white elephant population tripling in 6 months added randomly to WP.
Editable searching could be quite useful. From the search criteria you can guess the type of porn the person wants and direct them accordingly. After all, they might type in "lawn mower" but you really know that deep down they want some shaved chick porn.
... I thought they were going to fix Wikipedia's search function. :(
You make a good point, but this is mostly true only in English.
In other languages you get much less from Wikipedia.
Problem is this will require a small band of fanatics to do the editing. Now for the "central/core-cultural" stuff that you might expect in an encyclopedia this model may work, but web searches are probably more long tail/niche. Not sure that the editing group could ever be representative. Furthermore, the risk of bias from a small sample size gets even larger. Some of the bias mightn't even be conscious: e.g. exhibiting a preference for a rigorous page over a "dummies guide" (which might be more popular/widely useful).
Much better would be a behaviour-based search engine that inferred when users were happy or unhappy with results, e.g. the user doesn't come back for more searches or click more links on the existing results. Also, if a user does a "poor" search first & then uses "clearer" terms, the engine ought in future to suggest the "clearer" terms as an alternative search, or even return some of their results. Even better, the engine might "cluster" you with other similar users & return more relevant results (e.g. effectively inferring that you prefer rigorous complete guides rather than dummies intros).
This would be simpler & actually rely on the wisdom of masses rather than some central command editors, in fact this type of thinking was behind PageRank.
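To make the behaviour-based idea a bit more concrete, here is a toy Python sketch. It is only an illustration under loud assumptions: the class `FeedbackRanker`, its method names, the smoothed click-through score and the single-letter page names are all invented here, and it ignores the user-clustering part entirely. It is not a description of PageRank or of any real engine.

```python
from collections import Counter, defaultdict

class FeedbackRanker:
    """Toy model: re-rank results by observed clicks and suggest clearer queries."""

    def __init__(self):
        self.shown = Counter()      # (query, url) -> times displayed
        self.clicked = Counter()    # (query, url) -> times clicked
        self.rewrites = defaultdict(Counter)  # poor query -> clearer queries

    def record(self, query, shown_urls, clicked_urls):
        for url in shown_urls:
            self.shown[(query, url)] += 1
        for url in clicked_urls:
            self.clicked[(query, url)] += 1

    def record_rewrite(self, poor_query, clearer_query):
        # Called when a user retries a search with clearer terms.
        self.rewrites[poor_query][clearer_query] += 1

    def score(self, query, url):
        # Laplace-smoothed click-through rate; never-shown pages score 0.5.
        return (self.clicked[(query, url)] + 1) / (self.shown[(query, url)] + 2)

    def rerank(self, query, urls):
        # sorted() is stable, so ties keep the engine's original order.
        return sorted(urls, key=lambda u: self.score(query, u), reverse=True)

    def suggest(self, query):
        best = self.rewrites.get(query, Counter()).most_common(1)
        return best[0][0] if best else None

ranker = FeedbackRanker()
ranker.record("mr bennet", ["a", "b", "c"], ["b"])
ranker.record("mr bennet", ["a", "b", "c"], ["b"])
ranker.record_rewrite("bennet", "mr bennet heroes")
print(ranker.rerank("mr bennet", ["a", "b", "c"]))
print(ranker.suggest("bennet"))
```

Nobody edits anything by hand here; the ordering shifts only because of what users did with the results, which is the point being argued.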
I find Wikipedia's search not as good as Google's (fixing typos and such), so I tend to do most of my searches with Google, adding "wiki" as a keyword, and the relevant wiki articles typically show up as the first matches. Works well.
Love many, trust a few, do harm to none.
Well get going! :)
If you want all the support you can get take all the content you can get. Exclusionism by the very definition of the term drives people away.
. . . a much more promising commercial spinoff of Wikipedia which I profiled in a recent blog post.
You actually hit upon one of my uses for Google:
Being a quick spell checker.
If there is a word that I am writing and I don't want to bother with trying to look it up in a dictionary or can't think of the proper spelling, I'll punch it into google and ignore the search items themselves, other than to see how many other people suck at spelling as bad as I do and even published content with the misspelling.
What is surprising is how many times even deliberate misspellings still turn up content on the Google search.
Wikipedia is a mess. Many of its articles remain barren, even after all these years, while others are bloated with useless information and anecdotal nonsense. Plus tons of external links and SPAM. Even after all that crap is excised, what's left is often inaccurate and unreliable.
And as for the Wikipedia concept? LOL --- that's a failure too! Many of Wikipedia's most prominent articles are LOCKED! The whole system appears to be controlled by a cartel of admins whose full time jobs are to LOCK articles and DELETE changes at will. They run the show; normal people have NO ACCESS to VAST SECTIONS of the website.
Why can't people just admit this thing's a failure and move on?