Search Beyond Google
An anonymous reader writes: "'Search Beyond Google', the cover story of the March issue of Technology Review, is one of the few current Google stories that discusses whether their technology can stay ahead of the competition in the months to come."
... or bad things ... or pretty much anything, come to an end sometime. Except Microsoft of course.
I think Google has deviated too much from searching, with their Blogger acquisition, and other stuff like that. We'll see how long they stay around.
The key for Google providing relevancy is certainly eliminating "search engine spam". Almost everything that comes up on the first page for most things I search for is a referral program selling either something I'm looking for information about, or selling something completely different.
Google has had the last few years virtually unchallenged as the #1 search engine, because nobody has yet come out with anything that's better than PageRank.
But, five years is a long time to sit on an innovation without making it better. It gives the competition time to catch up. Furthermore, since PageRank doesn't seem to have seriously changed much, it's actually slipped backwards a bit as more and more people have figured out how to "beat the system" by posting nonsense sites with links to the site they want on top. Google's clearly trying to fight this, but that's an uphill battle.
Meanwhile, Yahoo now owns three distinct web-crawl based search engines: AltaVista, AllTheWeb, and Inktomi. They also own Overture, which began life as GoTo.com, the first to associate real search results with targeted ads. Put all these pieces together. Yahoo also has the original mega-directory site, which Google tries to duplicate by presenting the Open Directory Project on their site. In short, Yahoo's got all the resources to launch a brand with everything that Google has going for it... and when you look at AltaVista and AllTheWeb they feel quite a bit like Google already. Clearly, Yahoo's gearing up to issue a challenge to Google.
It really seems like Yahoo is making sure they have all the tech in place right now. When they're sure that they're better than Google, I fully expect to see a marketing campaign claiming that and inviting people to do head-to-head searches.
Google, as it stands now, is going to look pale in such showdowns. They've got to seriously modify PageRank so that the link spammers get downranked before Yahoo issues that challenge, or else Yahoo could reclaim the search market under its "Google-killer" product line, and then direct people back to the original Yahoo site for their other portal needs.
Really good search website...
www.alltheweb.com
If you mod this up, your slashdot background will turn into a beautiful sunset!
Hopefully google will not go public anytime soon like they were talking about earlier. I fear that this would stifle their innovation and bring it closer to some of the other failed portals.. ie more ads in an attempt to satisfy investors.
I think it is a good idea for other search engines to step up to the plate and challenge Google. It stops them from becoming complacent and spurs innovation from a desire to be #1.
I see no reason why the cycle cannot repeat. In fact, the cycle may be much like the semiconductor memory business, which has seen boom-bust cycles every few years since the early 70's. Sometimes a name will ride out for many cycles, but usually the company (and as necessary the technology) behind the name changes radically.
rather a document organizer. It gets some of its results from Google anyway and just reorganizes it. Search results have the flavor of
See books about "more stupid f---ing shit" at Amazon.
targeted organization as in targeted selling. All they want is your demographic datum.
IOW google will crush them.
...maintain their technological lead, goodwill toward them will give them some breathing room. I continued to use Altavista for quite a long time after Google came out. It was what I was familiar with, I liked it, and it worked. Why switch? Eventually, I realized that Google had keen "read your mind" powers and finally switched. :-)
Javascript + Nintendo DSi = DSiCade
What will stop google is not their technology,
but the ossification that takes over every
large company as it grows. Changes won't be
made because it is too big a change. Changes
won't be made because it's not cost justified.
Marketing concerns will override technology.
People will get fat and happy. And unlike microsoft
i can switch to a different search engine
in a second. Yahoo is looking pretty good...
Every couple of months it's "Can Google stay ahead of new competitor x?" And so far, every time, the answer has been yes. People shift from search engines quickly when they no longer work, and people are still heading to Google.
I type something in and it spits an answer back at me.
As long as that answer is in the first page, usually the first three items listed, people simply will not care about the backend technology.
MS and others will brag about the vastness of the numbers of matching items they can find; most people only worry about finding one or two sites.
This is going to be a big non-event...mark my word.
Google uses PageRank to rank pages for their searching. This may be their downfall. The porn and ad agencies have found out how to take advantage of this. I would say that unless Google finds a new way to rank/sort, someone else will come up with one that filters out the crap and take Google's market share.
Evolution or ID?
Enough branding studies have shown that it's very very hard to knock someone off their post once they seize a certain mindshare - e.g. Coke, Windows(grin), and now Google.
So, irrespective of the technical competence, or otherwise of Google, it is going to be around and the leader, for a long time to come. P.S. My favorite missing google feature: search for bittorrent files
Google has been successful due to original thinking. It needs to ride its wave of reputation now rather than later in order to snatch up some of the finest minds to stay on top of this industry that is all about originality and fresh ideas. They seem to be on the right track by providing the work environment that they do.
But no more stuff like that Friendster wannabe site.
I'm utterly fed up with eBay and the bloody-mindedness of their "enhancement" and roll-out policy. Holding a near stranglehold on the online-auction market, they are blind to the aggravations they inflict upon users.
Radical changes to a familiar interface shouldn't take place without dire need, unfortunately some people think it's fine to dust users. Google is all I want in a search engine and it works very well. The only reason I'd seek another search engine is if they (Google) drive me away.
BTW, did you know there's a calculator? I found it when I did a search for 'stones to pounds'
A feeling of having made the same mistake before: Deja Foobar
Don't forget that they also took in DejaNews. Doesn't Google now offer a free language translation service too? I think Google might want to reconsider offering so many services.
I've always been a Google fan, but this article is essentially dated on its release, given the fact that the Yahoo! switch has already occurred.
I do hope Google can continue its innovation, and reduce much of the annoyance of bad results through blogs.
I'm surprised more attention wasn't given to the Google IPO, and what effect that might have on the "relatively small" 1000 person company.
-m.
for a p2p distributed transparent encrypted indexing system with voted super-nodes.
You can't handle the truth.
Once a really great tool goes "commercial" it's all downhill from there. One of the main reasons I switched to Google back in the day wasn't because it was fast and accurate (which was great) but because it had such a clean interface. Now there are sponsored links that clutter things up. And who knows when/if popups will be a necessary evil to "stay in business".
Are you Corn Fed?
welcome our new search engine overlords. No, really, I'm serious.
Google is awesome, and is by far the best search engine out there. Google became the best by being the best. I use it because it works, and it works well.
In order to be dethroned, a search engine needs to work BETTER than Google. I welcome any search engine that can beat Google, as it has to be DAMN good to take that title. Microsoft search flat out sucks. If I look for articles on linux, I get articles about linux alternatives (mostly M$ content). If I google for linux, I get real linux stuff. This is just an example, but it's true across the board. I have yet to see a search engine superior to Google, and I welcome any tool that can prove itself better.
There is no reasonable defense against an idiot with an agenda
:wq
Forget about Yahoo and Microsoft. If I was google I would keep an eye on booble. No way they can compete.
Google pays hundreds of researchers and software developers, including more than 60 PhDs, to man the front lines in this technology war
Google is famous for only hiring the academic best (except for those they pick up in acquisitions), but I'm wondering if things are getting stale over there at Google. Google Labs has shown us some interesting concepts, but when a company opens the field to everyone and asks for people to develop ideas for them (as in the recent $10k prize thing), does it mean those PhDs are sitting around eating pizza all day?
PhDs are not the guys you leave around to do server maintenance or fix up problems in the clusters. They also don't make great coffee. So if you've got 60 extremely bright individuals (we're talking way into the top percentile) sitting around for a few years.. and Google has tons of money.. why aren't we seeing some major stuff coming out of Google?
My theory is that either 1) the PhDs are being stifled by upper management, 2) the PhDs aren't really as smart as they're meant to be, or 3) Google has something absolutely massive just around the corner... Take your bets, gentlemen.
Exactly. When Google falls behind, you'll know it because you'll be using something else. This kind of "Entity XXXXXXX may suffer setback YYYYYYY any day now" story isn't reporting at all, it's speculation and ghost stories.
That's what technology is, isn't it? The constant search for something better than what's available? And the approach of many companies (insert any NASDAQ 100 company here) is wait-and-see. See how the pioneer does it, do the same, but throw some more bells and whistles in, or just market it better.
Google has a brilliant algorithm, thanks to their 60 PhDs. But there are plenty of other PhDs out there, some of whom I'm sure are just finishing up their newest, succeeding algorithm. It's a constant game of king of the hill.
And Google isn't exactly dead; it's alive and coping with the new stuff all the time.
"Until you do what you believe in, how do you know whether you believe in it or not?" -- Leo Tolstoy
I'm a heavy google user, but I still miss altavista's ability to search for stems. For example, an altavista search for "slid* rul*" will get 'slide rules,' 'sliding rulers,' and plenty of other variations. Google does support whole word wildcards (try "miserable * failure") but stems are even more useful.
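For anyone curious what stem matching of that kind involves, here is a rough sketch. It is only an illustration of the idea, not how AltaVista or Google actually implement it; the function name `stem_query_to_regex` and the sample documents are made up for the example.

```python
import re

def stem_query_to_regex(query):
    """Turn a stem query such as 'slid* rul*' into a compiled regex.

    A trailing '*' on a term means "this stem followed by any word characters".
    (Hypothetical helper, not any search engine's real query parser.)
    """
    parts = []
    for term in query.split():
        if term.endswith("*"):
            parts.append(re.escape(term[:-1]) + r"\w*")
        else:
            parts.append(re.escape(term))
    # Terms must appear in order, separated by whitespace.
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

pattern = stem_query_to_regex("slid* rul*")
docs = ["vintage slide rules", "Sliding rulers for sale", "slippery rules"]
matches = [d for d in docs if pattern.search(d)]
print(matches)
```

A real engine would expand stems against its index rather than scan documents one by one, but the matching rule is the same.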
--- Often in error; never in doubt!
"Every couple months it's "Can Google stay ahead of new competitor x?" And so far, everytime, the answer has been yes. People shift from search engines quickly when they no longer work, and people are still heading to Google."
This is because no one has created a significant advancement in searches and marketed it well. If that happens watch out Google.
Evolution or ID?
.....in that everyone uses it, and everyone HAS used it for the past five years, or longer. People trust it, and that is something that just doesn't vanish. Plus, they HAVE done new things, such as google news.
After just a quick bit of playing around with Teoma (mentioned in the article), it seems to be better than Google. I was surprised ...
People seem to think Google is simply a place to find HTML pages. You type in your words, and poof, you get some relevant sites. Could this be replaced in 3 months? Google has a huge index, a very good search algorithm, and works for most people, but (in theory) someone might come up with a working alternative in that period. However:
And more. Babelfish translation? Caching like a billion pages? Simple design, with text ads that are actually relevant? In 3 months.
Yeah, right.
Don't think of it as a flame---it's more like an argument that does 3d6 fire damage
"Can Google stay ahead of new competitor x?"
"Apple is going out of business because it can't take out Microsoft"
"Repent, for the end is near"
"We are living in the last days"
"Live Free or Die." Don't like it? Then keep out of the USA
Someone should invent a search engine with regular expression support. *sigh* A world with regexp-enabled search engines... That would be a wonderful world to live in.
GAAH! MY PRINTER IS ON FIRE!!! PUT IT OUT! PUT IT OUT!
Google's market position when they IPO has nothing to do with their technology. It has to do with their brand. "Googling" for something is the effective equivalent of going to get a Kleenex. No one asks for a tissue. The market is going to be buying faith in the Google brand, and its loyal userbase.
-- http://www.criticalassets.com
Can you imagine a Beowulf cluster of those!
If my answers frighten you, stop asking scary questions.
Google is in a no man's land, of sorts. It has penetrated enough to be considered almost quasi-public, yet it does not have the security that such a status would offer, and must constantly watch its back. The company should know by now that users are not happy with the level of transparency. Yet, we see Orkut christened with very little explanation. End-users won't support a company that is overly secretive if there are alternatives.
It would be the rebuttal to Google bombing... searchers could fight back by giving the crap a thumbs-down. Of course, then you would have the bombers voting down all the legit sites. Dammit.
The Philosophy of Liberty | lewrockwell.com
I kinda like this one, but not enough to not slashdot them. A cool pun, a funky gui, what more could you want in a nextgen search engine.
They have fucking masseuse, doctor and dentist onsite (or so I heard). All of this will be gone if they IPO.
Gotta love a calculator that can tell you there's 153,388,225 furlongs in a microparsec
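For anyone who wants to check that figure by hand, here is a rough sketch of the arithmetic. The constants are assumptions (the IAU definitions of the astronomical unit and the parsec, and the international furlong of 660 feet); the last few digits come out slightly different from the quoted figure, presumably because the calculator uses a slightly different value for the parsec.

```python
import math

# Assumed constants, not taken from the calculator itself.
AU_M = 149_597_870_700               # astronomical unit in metres
PARSEC_M = AU_M * 648_000 / math.pi  # parsec in metres, roughly 3.0857e16
FURLONG_M = 660 * 0.3048             # 660 feet in metres, i.e. 201.168

# A microparsec is one millionth of a parsec.
furlongs_per_microparsec = (PARSEC_M * 1e-6) / FURLONG_M
print(f"{furlongs_per_microparsec:,.0f}")
```

The result lands within about one part in a million of 153,388,225, so the calculator's answer checks out.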
Today's search engines work a lot like information sieves, or panning for gold. The idea seems to be to take a bunch of stuff and wash away the un-needed, leaving behind (we hope) what we were looking for. However the very nature of the web provides the opportunity for looking at the relationships between ideas, the synthesis of knowledge as opposed to just collections of information. While the 'tricks' from the Microsoft research projects look promising, only a true 'learning machine' will be able to go beyond the information and develop a 'meta-interpretation/representation' of the raw data in order to support a 'meta-understanding' that is traversable and navigable in that we can not only connect with what we don't know, but that we can explore the unknown in terms of its relationship with what we do know.
"Can there be a Klein bottle that is an efficient and effective beer pitcher?"
I quite like Vivisimo (after I figured out how to make it include Google in its query by adding 'google' to the 'sources=' part of the query URL).
dogpile is also quite good, when you've got it set to display results by relevance rather than by engine.
Remember, Amazon isn't the only online bookstore, ebay isn't the only online auction site and google isn't the only search engine...
The majority of users who use search engines are just end users anyways and appreciate the simplicity of Google's page design. I go to Yahoo, Altavista and Lycos and there's half a million links all over the place. I go to Google, and there's a nice clean page with the text box smack right in the middle.
Visual appeal still counts.
Someone of course will come in not so much with a better search, but a different search, and that will be equated with better. The major search engines will have to fold in these innovations to stay relevant. The newcomer will have to adopt the best of the entrenched players if they want to last. All around it's a big win for users, until that fateful and unavoidable day when people start to realize that uber-searches are the de facto "big brother" everyone fears will materialize at some point.
There are three very distinct elements involved in creating a powerhouse search engine:
- A large crawl: A search engine with nothing in its database isn't going to work very well. A search engine needs as big of a crawl as possible in order to have any results at all. This takes huge resources in terms of bandwidth and computing power. Some of the early search engines met their demise when they couldn't afford to keep their crawlers growing as fast as new web content comes out.
- The Sorter: Once the long list of results that match the keywords are pulled out of the crawl, a sort needs to be applied in order to locate the best results and present them first. Google got vaulted to the top because PageRank was better than anybody else has ever put out. However, PageRank isn't perfect, so there is still room for somebody to make something better than PageRank.
- Promotion: A web site just sits there unused if it isn't promoted. Google never spent much on advertising and it just relied on word of mouth since it was so strong in the other two areas. And now that everyone turns to them first without even checking other engines, that has given them the strong advantage of a strong brand image. However, we've seen plenty of cases where superior technology has been beaten out by better marketing. If somebody's tech passes Google, without marketing it nobody will know about it. Therefore, look for the challengers to be launching major ad campaigns inviting people to at least try them before they assume Google is better.
Can anybody put it all together? We're about to find out...
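To make the "sorter" element a bit more concrete, here is a minimal sketch of link-based ranking in the spirit of the published PageRank idea. It is a textbook-style power iteration under stated assumptions (a damping factor of 0.85, a fixed 50 iterations, a toy four-page link graph invented for the example), not Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank

# Toy graph: three pages point at "hub", which points back at "a".
toy_links = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```

The same mechanics show why link spam works against a naive version of this: anyone who can create many pages pointing at a target raises that target's score.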
So? Part of the newscasters, and all of sports announcers would be out of a job if people weren't interested in random speculation about possible upsets.
I'm sure other people have come across this. It's when you do a search, and the resulting page contains the keyword, but only as part of a list of other somewhat related terms. For example, if you search for "Malamute" and google returns a page which is a list of every single dog breed. Kind of a tricky problem though.
I don't really care who has the most advanced search capabilities. I use google because all the paid links appear off to the side in a different color.
That's all I really want . . . to get my search results separate from the commercially paid for product placements.
--Tsiangkun
Google should allow users to 'help' them. Install something on your computer so they can see when you bookmark something, and what bookmarks everyone has. For people who have 'good taste' in bookmarks, their results count higher than those of someone with BS bookmarks. I think this would improve accuracy.
Personally I put Google above any of the other search engines out there; its clean, no-nonsense interface gives you what you want. I am not interested in having a search engine have about 700 billion different features that have nothing to do with what I am seeking. Yahoo may have some nice tie-ins, but I don't need a page full of ads to get in my way or an advertisement for YAHOO SMALL BUSINESS.
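A rough sketch of how that kind of taste-weighted bookmark voting could be scored. Everything here is hypothetical: the `bookmark_scores` function, the user names, and the taste weights are invented for illustration, and how a user's taste weight would actually be estimated is left open.

```python
from collections import defaultdict

def bookmark_scores(bookmarks, taste):
    """Score each URL by summing the taste weights of users who bookmarked it.

    bookmarks: user -> list of bookmarked URLs
    taste:     user -> weight (users not listed default to 1.0)
    """
    scores = defaultdict(float)
    for user, urls in bookmarks.items():
        weight = taste.get(user, 1.0)
        for url in set(urls):  # count each user at most once per URL
            scores[url] += weight
    return dict(scores)

bookmarks = {
    "alice": ["example.org/malamutes", "example.org/sled-dogs"],
    "bob": ["example.org/malamutes"],
    "spammer": ["example.com/buy-now", "example.com/buy-now"],
}
taste = {"alice": 2.0, "bob": 1.0, "spammer": 0.1}
scores = bookmark_scores(bookmarks, taste)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

As with any voting scheme, the hard part is the weights: whoever wants to game it will try to look like a user with good taste.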
You have been sig'd
... it could be great idea to publish unanswered questions as weblog.
i haventfound.blogspot.com/
Even google cannot answer everything. Web is limited even if you don't believe it. You post your question. Answers will come through trackback, comments, email. Googling the web after you posted the question. Or not.
All you need is some tag to mark post as answer or question. Hot list like metafilter to aggregate.
Is it a good idea or does it belong to recycle bin?
Mailing lists used to be about that. Discussing specific problems. Finding answers. Nowadays they are quite dead. Except some. Newbies, spam, whatever is the reasons. Problem is that those who possess knowledge don't have enough stimulus to share it. I don't solve that problem. The answer might be micropayments or gifts via amazon.
But make a good deed today. Answer one or two questions. In a year it might make quite a lot. In some day you might need answer to something yourself.
http://answers.google.com/answers/main
Google recently added stemming as a search of {quit smoke} will reveal. You can read about it in their help section. Stemming can be disabled on specific words. Otherwise the update came around November 15, 2003, but is probably still in flux, so there isn't too much good info about it yet.
But google is an acronym for 'search the internet'
...
...
...
Hey can you jump on your computer and MSN Search this for me
Hey get on and Yahoo this term
Nothing else will work
*DrugCheese rants*
-Don't try to bend the spoon, that's impossible. Instead, try to realize the truth.
-What truth?
-There is no spoon.
You can't handle the truth.
A lot of articles (including this one) are focused on how Google (and their would-be competitors) can improve search via algorithms like PageRank; and again and again the proposed/imagined solutions are based on server-side computation. IMHO, the real solution to improving search is client-side -- and I don't mean search toolbars -- but rather using the computational power of the client to provide a better experience than what is available inside your browser. Searching in a browser is cool, but why not build a powerful Google search client app?
As a simple example: if you're a Mac user, Beholder is really a much more useful image search frontend than using images.google.com alone (yes, I've mentioned this before, but hey, a developer has to eat).
I discovered Discount Watcher via what seemed like a spam link on Google but it turns out to be a very cool service that finds the latest discounts on almost anything you want and turns it into an RSS feed. Now my aggregator is filled with spam. But it is spam I want.
After reading this, it seems that all of the n+1 search engines are fast approaching an interactive Doug Adams Guide to the Galaxy.
Oh. Wait. www.mooter.com/corp/index.html
When Google falls behind, you'll know it because you'll be using something else
Except that there is some hysteresis (for most people) in finding the better solution, convincing themselves it's really better, and switching over. Most of us here, I'm sure, pride ourselves on being among the first crowd to really switch over to Google from AltaVista and all the rest. And we want to know who, and what, the next thing is -- and don't doubt it, there will be a next thing eventually, though it may be years away yet. The only way to be the first to know is to stay alert and keep watching.
I've had this sig for three days.
Google really helped with research papers a couple of years ago, but now I find there's too much spam. So much so that, now that I'm in grad studies, I am going to Lexis-Nexis to find information about topics. Also, I have found that the Internet is certainly not what it used to be in terms of quality of content. There used to be a lot more academic sites appearing when I searched for information on a topic. Now, especially being in a political science related field, International Affairs, doing a web search on some topics leads to dozens of ranting bloggers instead of more academic type work.
"The problem with socialism is eventually you run out of other people's money" - Thatcher.
of their message in the subject line
remind me of
tomshardware, and sites which think that too much information
in one place might spoil things
Interesting idea.
You said: "However the very nature of the web provides the opportunity for looking at the relationships between ideas, the synthesis of knowledge as opposed to just collections of information." How so?
As I see it, it's just a very large collection of information. I guess I find it hard to see where the web uniquely defines the relationship of ideas. I suppose there's the linking that is unique to the web, but that's what PageRank is all about, right? You are right in that the way PageRank currently seems to be used is precisely an "information sieve". But, as I see it, how could it be used any other way? Hyperlinks are, essentially, "dumb links". Each link just points to another page, but there are no specific characteristics to any link, so how does this actually "define a relationship between ideas"?
"Injustice anywhere is a threat to justice everywhere." - Martin Luther King, Jr.
Actually, there essentially is a meta-moderate link tucked down at the bottom of the page:
It's not an automated system, but it does let you report "bad moderation".
There is a theory that the results of New Coke were exactly as planned.
"Coke Classic" is different from original Coke. Original Coke used cane sugar for sweetener, while "Coke Classic" (and "New Coke") used much cheaper corn syrup.
Supposedly Coke figured that a few-month separation would disguise the taste change and that the loss of profits during that time was worth it.
Also the risk-taking will drop off a cliff once they are public. The litmus test for new products is much more stringent once you have quarterly reports. And yes, they are going to have an IPO, stop debating it.
In my view (and for lack of a better term to describe it) the first step is to 'assimilate' information from the web as opposed to indexing it. What is involved here is to identify the key concepts in relation to other key concepts. These relationships are largely interdisciplinary in nature. And as we see ideas explored on the web we can almost describe an evolutionary progression, where one thought leads to another.
PageRank is a little oversimplified in that it counts the links, but does not actually qualify the conceptual relationships between the ideas themselves. Another issue is who the source and destination of links are. When little Jimmy is doing his pinewood derby racecar website and links to the Indy 500 website, shouldn't that link be weighted differently than that of collaborating (or conflicting) points of view among top rated experts in a field? I'm not saying that important conceptual links can't come from unusual places; but it is less likely.
One problem is that the structure of links on the web is, as you describe, 'dumb.' While I'm not in 100% agreement with him, Ted Nelson's Xanadu gives us an alternative to how things might be done.
From my perspective a smart link might actually have its own embedded search rules and biases, and be 'free standing,' i.e. not statically linked at any one time, but able to generate links from an indexing database based on the criteria present when it is invoked; because smart links have to be aware of both the user's needs and the conceptual framework rules. When coupled with an instructional objective (for example) a smart link (in a smart browser) can help to distinguish between what the end user knows (can be demonstrated) and areas which they don't understand, and modify the weights of items in the search to 'direct' the user towards the concept they don't get through related ideas that they do understand (much as we are doing here.)
"Can there be a Klein bottle that is an efficient and effective beer pitcher?"
If anyone thinks that the most truly creative minds will jump through the hoops to get a PhD so they can learn to innovate, please think again. Case in point, Google's founders.
In fact, the inverse may be true:
Creativity implies not having a PhD.
Google needs to start flaunting its other options a little more. Froogle, for example, could really take off if they refine it.
Who doesn't like free music?
This is a great service. The Computer and Electronics have some really great deals!
Ah, I see. That's some pretty interesting stuff.
I can't help but conclude that the current state of HTML is simply not advanced enough to provide an easy and apparent way for better searching (with my limited view of course). I suppose there would be ways around it, but after reading your comments I simply think that we need to improve our current infrastructure, in this case HTML, until we can implement more intelligent searching in a straightforward and reliable manner.
I find this to be illustrated by a simple case of looking at the sorts of links I might have on my personal website. Some links may be to products I bought, others may informational pages on a product I purchased, maybe I linked goatse but I told people that they shouldn't view it, perhaps I linked a site that I thought was crappy, etc. Either way, the importance of the link to me is essentially qualitative and, like you said, Google tries to draw a qualitative assessment based on a quantitative value (how many times you've linked the page and others have linked to this page).
I guess what I'm saying, other than restating the obvious, is that maybe Google is doing the best (or near best) that one possibly could do with the state of the web today. I'm inclined to think, with my shallow view of the history of AI in the past 30 years and its, imho, relatively unsuccessful path other than few blips here and there, that any other clever ways of parsing the web will provide questionable results at best. (Which is why I think Ask Jeeves never worked that well and which is why I'm further skeptical about Microsoft's new plans. Don't get me wrong; I think it will work fine for simple questions like "How many eggs are in a baker's dozen?" but I think it will fail when I ask it "What are the worldwide cultural consequences of World War 1 compared to World War 2 both in the economic and political context?")
"Injustice anywhere is a threat to justice everywhere." - Martin Luther King, Jr.
Some of these smaller natural language engines are beginning to look very promising, see: answerbus,brainboost,webqa
Interesting as to why the big boys are largely ignoring this domain. I suspect old man Jeeves has turned people off to the possibility of reliable QA.
Searching for: href="mailto
3657 hits (33678633 - 33682289)
href="mailto:.www@vanderbilt.edu">www@www.utexas.e
...etc...
href="mailto: "> Name Law Journ
href="mailto: JMims </FONT></TD>..<TD WIDTH=72> <F
So far this is a prototype running on my machine with a corpus of 50MB of html, but I'm hoping to get a demo site up in a week or two. Performance is so far very good.
There are still a few questions to resolve, such as whether people want to search raw html, or text with the tags stripped out like most search engines use. Right now it's working on straight html, and displaying contexts in raw format. When the demo is up I'm hoping to have people try it and give feedback on what options are the most useful.
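To illustrate the raw-HTML versus stripped-text question, here is a small sketch. It is not the prototype described above; the `search` function, the two-document corpus, and the naive tag-stripping regex are all assumptions made for the example.

```python
import re

# Naive tag matcher; real HTML needs a proper parser, this is only a sketch.
TAG = re.compile(r"<[^>]+>")

def search(corpus, pattern, strip_tags=False, context=20):
    """Return (doc_id, snippet) for every regex match in the corpus.

    corpus maps a document id to its raw HTML. With strip_tags=True the
    tags are removed before matching, as most search engines do.
    """
    rx = re.compile(pattern)
    hits = []
    for doc_id, html in corpus.items():
        text = TAG.sub("", html) if strip_tags else html
        for m in rx.finditer(text):
            start = max(0, m.start() - context)
            hits.append((doc_id, text[start:m.end() + context]))
    return hits

corpus = {
    "a.html": '<p>Contact <a href="mailto:www@example.edu">the webmaster</a></p>',
    "b.html": "<p>No address here</p>",
}
query = r'href="mailto:[^"]*"'
raw_hits = search(corpus, query)
stripped_hits = search(corpus, query, strip_tags=True)
print(len(raw_hits), len(stripped_hits))
```

A query like `href="mailto` only finds anything in the raw markup, which is one argument for offering both modes.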
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
On the Web, loyalty changes like the wind.
When you're talking about search sites that require no login and there is no friction associated with changing your preferred site, users are even more fickle.
Sure, Google has strong brand presence, but remember that Google obtained that stature purely through ease of use and effectiveness of their product. They spent no money on marketing.
The brand followed the product, which is in stark contrast to most new brands, where marketing positions the product.
If SearchCompanyX comes along with an easy to use engine that supports stemming, eliminates spamming, indexes zillions of documents of all types, and provides some form of advanced site thumbnailing so users have more of an idea of their destination before they follow a link, Google will go down hard.
On the Web, branding and performance are much more intertwined than in the offline world..
Read the EFF's Fair Use FAQ
Yes, Google has a spam problem. It has been getting worse over the last year. In April, 2003 Google stopped crawling the web once per month, and then recalculating PageRank based on that monthly crawl. Since then, there has been a question of whether PageRank can even be calculated accurately by Google.
I speculated about a 4-byte docID overflow problem in an essay last June at Google Watch. In recent months Google started a "Supplemental Index" for some curious, unexplained reason. Their total number of pages indexed was recently updated to 4,285,199,774 -- just below the maximum for a 32-bit integer. It looks as suspicious now as it did last June.
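For anyone who wants to check the arithmetic behind that suspicion, here is a minimal sketch. It assumes the docID would be an unsigned 32-bit value, which is the speculation above and not something Google has confirmed; the names REPORTED_PAGES and headroom are made up for illustration.

```python
# Rough check of how close the reported index size is to a 4-byte limit.
# Assumption: docIDs are unsigned 32-bit integers (speculation, not confirmed).

REPORTED_PAGES = 4_285_199_774   # page count quoted in the comment above
MAX_UINT32 = 2**32 - 1           # largest unsigned 32-bit value: 4,294,967,295
MAX_INT32 = 2**31 - 1            # largest signed 32-bit value: 2,147,483,647


def headroom(count: int, limit: int = MAX_UINT32) -> int:
    """Return how many more IDs fit before `count` reaches `limit`."""
    return limit - count


if __name__ == "__main__":
    left = headroom(REPORTED_PAGES)
    print(f"IDs left before the unsigned 32-bit limit: {left:,}")
    print(f"Fraction of the ID space still unused: {left / 2**32:.4%}")
    # The count already exceeds the signed limit, so only an unsigned ID fits.
    print(f"Exceeds signed 32-bit limit: {REPORTED_PAGES > MAX_INT32}")
```

Under that assumption there are fewer than ten million IDs of headroom left, roughly 0.2% of the ID space.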
Last November, Google began using an on-the-fly filter to further refine the search results for ecommerce sites. Some spam was deleted, a lot of other spam took its place, and a lot of mom and pop ecommerce sites were dropped inadvertently. Many people were unhappy.
Further evidence that Google's old ranking system is broken is the fact that three famous Googlebombs, "french military victories," "weapons of mass destruction" and "miserable failure" are all still working. The first one is eleven months old. It used to be that such Googlebombs were suppressed at the next monthly crawl, when PageRank was recalculated. Now it seems that suppressing them is beyond Google's ability. How else can you explain why Google puts up with these widely-publicized embarrassments?
Google's results remain unsurpassed for noncommercial sites from EDU, ORG, and GOV domains, however. Their crawling of the noncommercial sector is the most complete of any engine. The reason Google does so well here is probably because spam isn't much of a problem in this area.
So far Yahoo doesn't appear to be making much of an effort at covering the noncommercial web. It should be added that Google has more of a spam problem simply because spammers have been focused on Google for so long. Once Yahoo gets the same attention from spammers, then we'll be able to make a fair comparison of Yahoo with Google.
FWIW, in Mozilla Firebird, you can select a bunch of text, right-click on it, and go "Search the Web". I've never had to open a separate window for searching. Now, it would be so nice to have this in other apps.
Take email, for example. My idea is that when I'm posting a query to a mailing list, as I type in the words, the program should dynamically build a set of "related links" for the content I have typed in the email. That way, people won't have to tell me to STFW every time I act clueless and send a simple query to the list.
Alright, I'm kidding. I'm not a clueless user, but you get the idea. For any content on my screen at any given time, I'd like to be able to access "related content" from... er... a sidebar on the screen?
Google's IPO stock is likely to be highly priced. That gives it currency for snatching up other search startups before Microsoft does. It also makes Google pricey to acquire. Google's financial options are limited until it raises a large amount of cash by going public.
A checkbox under advanced search that says "Don't show me any sites that are trying to sell me something" and possibly another that says "Don't show me any sites that are reviewing something".
Do that, and most of my searches will now give the results I want to see instead of 5000 domains all pointing at each other for pageranking.
What you're describing is the basis of PageRank: links from sites with high Google karma will increase your Google karma, but a link from a site with zero karma will have no effect. You don't have to eliminate the cycles in the graph before you iterate. Instead, you have a fixed "signal strength" reduction which guarantees that the iterations will converge on a single solution. It's an eigenvector-finding problem. Read the original PageRank paper, or the explanation in Raph Levien's PhD thesis.
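To make the iteration concrete, here is a minimal sketch of the idea on a toy graph that contains a cycle. This is not Google's implementation; the function name pagerank, the toy graph, and the tolerance are illustrative assumptions, and the 0.85 damping factor (the "signal strength" reduction) is just the commonly cited value.

```python
# Power-iteration sketch of PageRank on a small link graph.
# Assumptions: damping of 0.85, uniform starting ranks, and rank held by
# pages without outgoing links is spread evenly over all pages.

def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """links maps each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank sitting on dead-end pages is redistributed to everyone.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for src, targets in links.items():
            if targets:
                # Each page passes a damped, equal share to every page it links to.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:  # successive iterations agree: converged
            break
    return rank


if __name__ == "__main__":
    # a -> b -> c -> a is a cycle; d links in but nobody links to d.
    graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

The cycle needs no special handling: the damping shrinks the difference between successive iterations, so the ranks settle. In this sketch every page keeps a small baseline rank, so a page nobody links to (d here) passes on very little, not exactly nothing.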
Xenu loves you!
Here you go (it's my sig too ;)
VIVA1023.com | Political Fashion.
as they are at Google. You get a massage and a crown and you don't pay a dime. Sounds like a dream employer to me. :-)
What about KartOO, which visually maps out relationships between sites? At the moment it's a meta search engine (the beauty's in the visuals, not the out-of-date results it gets from AllTheWeb and Lycos), but if it became the new way of looking at Google's results I think it'd be the Next Big Thang.
$ echo "ceci n'est pas une pipe" | sed -Ee 's/(eci n|pas )//g'
Searching beyond Google will only lead you to black holes and you will turn yourself into nothingness!
-------
FM Clan
You mean it's pretty similar to what I'm trying to do with litigious bastards ?
One, you could pass it as part of the URL (i.e. using GET) in your bookmark. Like this.
Two, you could roll a search engine plug-in and pass the limiter as part of the form (i.e. using POST) with <input type="hidden" ..., or add it to an existing plug-in.
Or, three, you could make your own extensions to the tool bar of the browser, if there isn't already one to do the trick.
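For the first option, a rough sketch of building such a bookmark URL. The limiter site:edu and the helper name search_url are made-up examples, not the limiter from the parent post; the sketch assumes the engine takes its query in a q parameter, as Google's /search page does.

```python
# Sketch of option one: put the limiter into the query string (GET).
# Assumptions: the engine reads the query from a "q" parameter, and
# "site:edu" stands in for whatever limiter you actually want.
from urllib.parse import urlencode


def search_url(terms, limiter="", base="https://www.google.com/search"):
    """Return a bookmarkable URL for `terms`, with `limiter` appended."""
    query = f"{terms} {limiter}".strip()
    # urlencode percent-escapes the colon and turns spaces into '+'.
    return f"{base}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(search_url("pagerank", "site:edu"))
```

Pointing base at a different engine works the same way as long as that engine also reads its query from q; otherwise the parameter name has to change too.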
Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.