Seven Search Engine Evolutions for '07
eldavojohn writes "I found a short but interesting list of predicted evolutions of search engines that will most likely be implemented in 2007. While some are vague and obvious like a better human interactive experience, there are others that are worth looking into like alternative means of indexing and using semantics — not keywords — for matching documents. The author of this list is Dr. Riza Berkan, also the author of 'Fuzzy Systems Design Principles.'"
An article beginning with a Bill Gates quote can't be wrong.
~~~ Paf. Le chien.
Most of the things listed seem to make results more ambiguous, not narrow them down.
Even if you could radically change the way a search engine works, you then face an even bigger task: Forcing users to radically change their searching habits to fit your search engine.
And what the hell is "QDEXing"? Google reveals nothing, therefore we can conclude it does not, in fact, exist.
I wonder if I use bold in my signature, people will notice my posts.
Ah, let me just tag this article 'semanticweb'... there, much better now...
As early as 2007? Now I don't really believe that.
It may get partially implemented, and probably only in English.
Maybe Chinese as well.
Most of the other languages will have to wait quite a while longer...
Not to say semantic search is a bad idea or anything... I, for one, would like to see some image-, audio- and video-search based on some kind of semantics, not tags and names... but that'll just have to wait.
Ignore this signature. By order.
Unfortunately, the explosion in the number of blogs is likely to make that goal impossible.
~~~ Paf. Le chien.
Putting this kind of emphasis on search is wise. MS knows this and put vastly improved searching into Vista. Some say it's better than other desktop searches. This is similar to having a good memory in a human being: quite useful in practice.
...that I was unimpressed when "fussy logic" was a buzzword a decade ago. I do not look forward to its resurgence in the marketing lexicon.
Is it just my observation, or are there way too many stupid people in the world?
Hmmm.... and these predictions are in a press release from someone with a new search engine which (surprise!) opens shop in '07.
Can someone tag the article "spam"?
Yoople! has already introduced a more engaging human-like search experience and let the people collaborate in order to create a better indexing.
OK, someone could say it's the perfect way to permit abuse and that a lot of work still has to be done, but it's a smart proposal to start from. Don't you think?
http://www.yoople.net/
Fortunately that's going to peak in 2007 as well. At least according to Gartner for what that's worth http://news.bbc.co.uk/1/hi/technology/6178611.stm
But then you have a browser called Internet Explorer... Confused yet?
Ignore this signature. By order.
Let's hope it's better than the author's search engine, hakia.com. Just used it to search for "nike stores in the uk". First result is an etailer in the US, all the others are spam sites. Looks like we've got a long way to go before search engines actually understand what I'm looking for.
This article is a promo for hakia by its CEO. For example, his #1 point is:
1. The first time a search engine will have an alternative to indexing; new technology like QDEXing will be developed.
Unsurprisingly, QDEXing is a term invented by hakia; see http://www.hakia.com/technology.html
3. The first time that search results will include highlighted best sentences as a result of semantic analysis rather than bolded keywords as a result of finding incidences.
Unsurprisingly, this is what hakia does. And so on down the list.
The part that is surprising to me is how BADLY hakia does these things. For example, I searched for a bird I saw yesterday, the "eared grebe". The results are weak relative to MSN, Ask, Google, and Yahoo, but-- aside from that-- here's an example where they highlight the "best sentence" in a snippet:
Page 3 of 3 | Page 1 | Page 2 | Eared Grebe Podiceps nigricollis Albino. Antelope Island Causeway. Davis County, Utah. October 2003 : by Nicky Davis
The fact that they're not even finding sentence BOUNDARIES properly pretty much destroys the claim that they're extracting meanings from text. This isn't to say that someone won't do it someday, but... not these guys, not today.
Can somebody please explain what a semantic search would look like? I'm not sure if I understand the meaning.
Full Tilt
Honestly, I can't see myself NOT using Google in the years ahead. It's become too ingrained in my lifestyle. If I don't know what something means, I google it. In fact, in the rare times that Google is down, I find myself lost and constantly clicking "home" (www.google.com) only to find it doesn't work.
Clearly the way forward for search is to make an algorithmic search engine, and have it scrape information from a dead human edited directory.
Google directory. Bringing you the future today.
Why not some "intelligent design" when it comes to search engines? "Spore" looks like a nice game but I wouldn't base a search engine on it since the result set would be too Darwinian. :P
I find it to be a DNS error local to me.
next time, try the IP
(currently, 64.233.187.99)
http://64.233.187.99/
every day http://en.wikipedia.org/wiki/Special:Random
"The first time a search engine will let users evaluate answers on the spot by displaying uninterrupted and coherent text snippets, often letting searchers forgo having to click through to links and saving time."
Doesn't Ask.com give you this functionality already?
That was a pretty good article, even though most of the stuff on there was pretty obvious (for most of us /.'ers) to begin with.
I think it was only inevitable that internet searching focuses more on the "type as you speak" initiative rather than the older term-by-term searching of the past. This would be great for us, but I really see that the benefits would cater more to the average man/woman who already has a difficult time searching because they are using "the wrong terms."
I really think that Google will be the first search engine to implement most of these changes, since their user base and R&D are already through the roof. I think that Microsoft will also implement this soon with Live, since a sizable portion of their research teams are testing searching based on semantics as well.
4. The first time that a single query will bring a gallery of
results equivalent to running multiple queries about the
meaningful variations of the same topic.
5. The first time a search engine will let users evaluate answers
on the spot by displaying uninterrupted and coherent text
snippets, often letting searchers forgo having to click through
to links and saving time.
Both of these have been available for a couple of years: e.g. searching on the single query "semantic web" using CQ web, reveals clusters such as these:
fuzzy sets
fuzzy systems
neural networks
set theory
soft computing
artificial intelligence
control systems
expert systems
Each of these is linked to a specific page of results using sentences instead of snippets, e.g. for artificial intelligence:
1. This paper will present the foundations of fuzzy systems...noteworthy objections to its use with examples drawn from current research in the field of artificial intelligence.
Fuzzy Systems - A Tutorial
2. The most obvious implementation for the fuzzy logic is the field of artificial intelligence.
Fuzzy Logic
3. Ultimately it will be demonstrated...fuzzy systems makes a viable addition to the field of artificial intelligence and perhaps more generally to formal mathematics.
Fuzzy Systems - A Tutorial
4. The paper gives examples of the fuzzy logic applications with emphasis on the field of artificial intelligence.
Fuzzy Logic
5. A collection of articles and other technical resources for artificial intelligence.
PC AI - Fuzzy Logic
Most image searches give you a filtered and a non-filtered option, but where is the porn-only search from a big company? You can trick Alexa into giving you porn only. I wonder if they know, but I bet they got the rankings wrong! rofl. Come on, Google, give us what we want!
You want to do a search that only covers pages you've seen before. Doesn't Google Desktop do this?
Some of these things (1, 6) sound pretty specific to technology that the author's company, Hakia, is promising to produce this year. Some of these items (2, 4, 5) are already being performed by major search engines, but are done behind the scenes and are not immediately obvious to the user. #2 will continue to be perfected over the next 20 years, not the next 12 months. #3 sounds like a reasonable extension to the traditional practice of bolding keywords. I'd like to see this implemented, though I think it will only come with good progress in the area of #2. #7 is actually a pretty good insight into the way top engines will use their computing power after they've already crawled and indexed, in the standard fashion, most of the 15B good pages on the current web.
Yeah, there is some Google thing you can do that tracks your search history and promotes links you frequent.
Personally, that freaks the shit out of me, so I don't have it enabled -- but you're right, the basics of that technology are at least partially implemented by at least one major search engine.
:-((((. I know it requires enormous resources, but I need it badly. It would be like heaven if I could wildcard search, because you could show exactly what you are searching for.
Use Delicious and search your bookmarks... easy-peasy and you can do it now.
But you're right. Search Engines should be keeping a list of sites you visit and associating them with your user account and IP address... so you can get a list of previously visited sites that meet your keywords at the top of the page or in a sidebar... OR SHOULD THEY?
A fool throws a stone into a well and a thousand sages can not remove it.
I know from an ex-wife, a librarian, that librarians have been doing searches for fifteen or twenty years using such constructions as NEAR. None of the popular search engines, from Google on down, do this. It would *certainly* make my life easier, and result in relevant hits, rather than 100k hits because some asshole advertisers have thrown a laundry list into their META tag.
mark
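For anyone wondering what a NEAR-style operator boils down to, here is a minimal sketch in Python. It is only an illustration, not how any library catalogue or search engine actually implements it: the `tokenize` and `near` names, the regex tokenizer, and the default window of five words are all assumptions made up for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def near(text, term_a, term_b, max_distance=5):
    """Return True if term_a and term_b occur within max_distance tokens of each other.

    Hypothetical helper: a real engine would use a positional index
    instead of rescanning the document text for every query.
    """
    tokens = tokenize(text)
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    return any(abs(i - j) <= max_distance
               for i in positions_a
               for j in positions_b)

def filter_near(docs, term_a, term_b, max_distance=5):
    """Keep only the documents in which the two terms appear close together."""
    return [d for d in docs if near(d, term_a, term_b, max_distance)]
```

A page that merely lists both words far apart (the META laundry-list case) fails the proximity test, while a page that uses them in the same sentence passes.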
I just want to tell the engine that keyword 1 is 5 times as important as keyword 2
Give me a slider control that instantly filters the results... ie: have the first 100 results waiting for me with 20 showing, then let me adjust the weight of my keywords until I get the list I am looking for with individual items falling off or being added to the list as I adjust the controls.
A fool throws a stone into a well and a thousand sages can not remove it.
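The weighted-keyword idea above can be sketched in a few lines of Python. This is a toy under stated assumptions: the `score` and `rerank` functions, the plain term-count scoring, and the weights dictionary are invented for illustration and say nothing about how a real engine ranks pages. A slider in the UI would simply call `rerank` again with new weights over the results already fetched.

```python
import re
from collections import Counter

def score(text, weights):
    """Sum of (keyword weight x number of occurrences) for one document."""
    counts = Counter(re.findall(r"[a-z0-9']+", text.lower()))
    return sum(weight * counts[term.lower()] for term, weight in weights.items())

def rerank(docs, weights, top_n=20):
    """Return the top_n documents ordered by weighted keyword score, best first.

    Hypothetical helper: docs stands in for the first 100 results that
    are already waiting on the client side.
    """
    ranked = sorted(docs, key=lambda d: score(d, weights), reverse=True)
    return ranked[:top_n]

# Example: keyword 1 is five times as important as keyword 2.
results = ["grebe grebe lake", "lake lake lake grebe", "lake"]
favour_grebe = rerank(results, {"grebe": 5, "lake": 1})
favour_lake = rerank(results, {"grebe": 1, "lake": 5})
```

Because only the cached results are rescored, items can fall off or join the visible list instantly as the weights change, without a new round trip to the engine.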
I am sick of getting 100,000 irrelevant hits & then doing dozens of narrowing searches, only to find that the word & phrase hits are all in different paragraphs or even unrelated articles on the same page.
Bring it on NOW.
Good grief... it's not a scientific paper, it's not a journalist's article, it isn't any meaningful content at all - it's a press release. Right off of BusinessWire. What's next, Ron Popeil's predictions for 2007?
When I go to do a search on my computer, I'm comforted by that little doggy. I wish Google would follow Microsoft's example and replace the little box to type in a search query with an animated animal, something with more limbs for going out on those internets and finding stuff.
The flag just makes more sense than the constitution. - Judas Gutenberg
"It's never the things that happen to us that upset us, it's our view of them." -Epictetus
The search history wouldn't be enough. There's lots of pages you visit through links in other pages (or typed-in urls, for that matter), not just pages you searched for and followed in the search results.
That's even scarier.
It's been extraordinarily difficult to get the kind of results this guy is talking about, and that was in a research environment that was free of SEO spammers deliberately attacking the algorithms.
Well, I have tried some keywords, and Hakia returns objectively less relevant sites than Google does.
Don't advertise with the Slashdot guys' help when you have nothing new to offer.
This is already a problem to some extent - Nielsen wrote about this in 2k4.
As am I. That's why I switched my homepage from Google to the Wikipedia. It's a faster gateway into real information.
You do realize that it's not possible to put pages you've seen before first without knowing which pages you've seen before, right?
Predictions are always difficult. Here are some comments from somebody
working in the field:
> 1. The first time a search engine will have an alternative to
> indexing; new technology like QDEXing will be developed.
Indexing is a pre-requisite for fast access of retrieval results.
Even distributed peer-to-peer indices that are a very attractive
idea suffer from bad performance due to the absence of a monolithic
index owned by an organization with huge bandwidth.
> 2. The first time ontological semantics will be used that will
> enable a search engine to perceive concepts beyond words and
> retrieve results with meaningful equivalents.
The problem with applying ontology based search is the disambiguation,
i.e. the mapping from natural language words (terms) to the unambiguous
nodes of the ontology (concepts). Automatic disambiguation needs to be
pretty good in order to help in search, but this is unfortunately still
an open research problem.
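To make the disambiguation step concrete, here is a deliberately tiny sketch in Python. Everything in it is an assumption for illustration: the `TOY_ONTOLOGY` table, the concept names, and the overlap-counting heuristic are invented, and real systems need far more than this, which is exactly why the problem is still open.

```python
# Hypothetical toy ontology: each ambiguous term maps to candidate
# concepts, and each concept lists words that tend to appear near it.
TOY_ONTOLOGY = {
    "jaguar": {
        "animal/jaguar": {"cat", "jungle", "prey", "species"},
        "car/jaguar": {"engine", "dealer", "sedan", "drive"},
    },
}

def disambiguate(term, context_words, ontology=TOY_ONTOLOGY):
    """Map a term to the concept whose cue words overlap the context most.

    Returns None when the term is unknown or no cue word matches, i.e.
    when this naive heuristic has nothing to go on.
    """
    senses = ontology.get(term.lower())
    if not senses:
        return None
    context = {w.lower() for w in context_words}
    best = max(senses, key=lambda concept: len(senses[concept] & context))
    if not senses[best] & context:
        return None
    return best
```

A two- or three-word query rarely carries enough context for even this crude overlap test, which hints at why mapping terms to concepts is so hard at Web scale.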
> 3. The first time that search results will include highlighted best
> sentences as a result of semantic analysis rather than bolded
> keywords as a result of finding incidences.
This prediction seems to mix presentational issues (bold layout) with
processing issues. The problem with the former is that flagging a whole
sentence bold perhaps isn't well liked; otherwise it would already be in
use, since current technology allows it. The problem with the latter is
what exactly is meant by "semantic analysis": "deep" automatic natural
language processing is still a very costly operation and may not be an
option as early as 2007 to be applied to a whole Web index.
> 4. The first time that a single query will bring a gallery of
> results equivalent to running multiple queries about the
> meaningful variations of the same topic.
We would not notice this, since it would be carried out internally.
However, this processing-intensive step could (preferably) be replaced
by a result-equivalent change in the ranking algorithm.
> 5. The first time a search engine will let users evaluate answers
> on the spot by displaying uninterrupted and coherent text
> snippets, often letting searchers forgo having to click through
> to links and saving time.
Giving answers is certainly an emerging trend, cf.
http://www.infonortics.com/searchengines/sh05/slides/leidner.pdf
But it may take longer than one year to become pervasive.
The repeated mention of snippets seems to suggest that the author of
this set of predictions has found fault with snippets and considers
this a priority, whereas most people - at least on desktop PCs - seem
to be okay with the way results are summarized today.
> 6. The first time a search engine will have a dialogue utility that
> will help point out best answers or suggest a Gallery for a more
> engaging human-like search experience.
Further work in interactive search is certainly ongoing, in some sense
a dialog feature is already operative, as real search engine logs show
that users keep re-phrasing and refining their searches all the time
to converge to the results they desire.
> 7. The first time a search engine's data will grow by detection of
> new knowledge rather than by detection of new pages. Search
> engine growth by knowledge will be the new direction for the
> industry for 2007.
This depends on a universally accepted notion of knowledge, and how to
measure/acquire it automatically. Perhaps one of the strengths of modern
search engines is that NO commitment to any kind of theory of knowledge
has to be made, it works - for better or worse - because all it needs
are strings.
I'm good friends with a guy who worked for Kozoru just before they folded up. I've got a couple of their cluster machines running right here next to me, heck I've got one for sale on Craigslist right now... Anyway, they had natural language search. It worked. It wasn't vaporware. Want to know what the #1 problem with their system was? The users. Most people are so mentally tied into keyword searching that they wouldn't believe that they could actually *ask the system a question* and get a meaningful result. They even built a chat-based system, called BYOMS, where you could define what particular sites you wanted to search for various topics. The tech was great, it worked, and no one with money cared. http://www.google.com/search?q=kozoru http://www.google.com/search?q=byoms
I think human input will definitely come into play in the future of search. Ultimately you can make machines very good at recognizing spam content, but how can you possibly identify what people really want to see without asking them?
The way forward is to allow people to reorder their results and to delete spam results. This way we'll have a search engine that actually learns what people want and acts appropriately. Sites like Digg and Reddit are on to something in this sense. They use 'swarm' technologies to determine what is most relevant in a certain narrow category at a certain time.
As another commenter mentioned, there is already something like this: Yoople. A couple of months back I wrote that Google's SearchMash was secretly playing around with something like that too.
Right, but there's a difference between the pages I've searched for from Google and clicked out on vs. all pages I've seen before, period.
My first comment in this thread was about being uneasy with Google tracking all out-clicks to your user name. I was then reminded that for ultimate efficacy, this system would have to record all pages you've visited, to which I responded that it was scarier still.
But I guess expecting an AC to go back through parent comments is asking far too much, so I'll just pat you on the head and send you on your way.
Don't you hate it when marketing BS is written about as if it were something significant? If it ain't on Google, then it isn't even on a website that knows how to get itself properly indexed.
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.