AllTheWeb Claims Bigger Index Than Google
An anonymous reader writes: "Hoping to attract more mass appeal for an online search engine with a cult following, Norwegian search engine AlltheWeb on Monday declared that it indexes more Internet information than longtime pacesetter Google. Boston.com has the story." Of course, the number of pages indexed is not the only measure of a search engine and probably isn't even the most important.
Windows declares itself better than linux, ...
:)
Gnome declares itself better than KDE
Emacs declares itself better than VI
PHP declares itself better than Perl
Let the flames fly
The Anti-Blog
but I'm sure google is faster, and its results probably match better to what you were looking for. in any case it'll be interesting to see
--fetch daddy's blue fright wig, i must be handsome when i release my rage
God forbid someone presents an objective comparison between Alltheweb and Google. Responses such as "Google is my God" and Timothy's little snip in the article do nothing for anyone really interested in using a useful search engine.
I just used Alltheweb for some common searches I do, and you know what? It found a lot more useful hits than Google did. Yea, imagine that.
But Alltheweb didn't seem to have a cache, which I thought was very useful in Google.
So, come on, folks, give it a chance, and don't jump to conclusions without an objective analysis. The tendency to blindly worship things like google/linux/linus/transmeta is far too common on this site.
I just tried to pull up one of my own pages with this engine. Got:
"Redirection limit for this URL exceeded. Unable to load the requested page."
Which, as near as I can tell, is their way of throttling commercial hits. Wonderful. Moving the mouse over the link doesn't reveal the address in the bottom bar, either, so the only way I can think of to obtain the address of the item it matches is by right-clicking and selecting 'copy link address', opening a new window and pasting it in (and having a browser that is capable of doing this), then editing the URL so only the target link text remains.
You can't even right-click and open in a new window to do this. If you try, you get "about:blank" which, afaik, means they're using javascript.
These people sure go through a lot of pains to render a result and then not let you anywhere near it. Saying they're bigger than Google is a bit like someone bragging about how their PDP-11 is bigger than my Athlon. Cripes.
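The manual URL editing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the redirect host `search.example.com` and the query parameter name `u` are hypothetical, since the post does not say what AllTheWeb's link format actually looked like.

```python
from urllib.parse import urlsplit, parse_qs


def extract_target(redirect_url, param="u"):
    """Return the destination URL embedded in a redirect link, or None.

    `param` is a hypothetical parameter name; a real engine may use another.
    """
    query = parse_qs(urlsplit(redirect_url).query)  # also undoes %-encoding
    values = query.get(param)
    return values[0] if values else None


# Hypothetical redirect link of the kind you would get from 'copy link address'.
link = "http://search.example.com/go?c=web&u=http%3A%2F%2Fwww.example.org%2Fpage.html"
print(extract_target(link))  # http://www.example.org/page.html
```

If the link carries no such parameter, the function returns None rather than guessing.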
My
Limekiller
Google is currently listing Operation Clambake first.
> What are you, AN AMERICAN TALIBAN?
No. I'm English - and you're a colonial who has a drinking song for a national anthem.
Not everything that can be measured matters; Not everything that matters can be measured.
The reason I'm for Google has little to do with technology. It has everything to do with advertisements and capitalism.
;-).
I'd rather support a company that uses subtle advertisements like Google does than a company that uses in-your-face banner ads, etc. (Then again I'm posting on Slashdot!) Also I make a point to check out the ads every now and then on Google and visit the company's site. I may be getting hosting from an advertiser on Google soon.
If people who advertise on Google make more money than they do with banner ads, pop-ups, etc. then we'll see the idea spread. I don't like in-my-face ads, so I do what I can to tell companies that. It's called being a responsible consumer.
Plus more valid hits come up when I search for myself on Google
Of course, as has been mentioned a few times above, competition is a Good Thing (TM).
- Ardenstone
Norwegian search engine AlltheWeb on Monday declared that it indexes more Internet information than longtime pacesetter Google.
Then how come the word with the most search results on Google (FYI: "the") returns fewer results on alltheweb?
Google always seems to give me what I want, faster than anything else. Either this is because of its search algorithms, or that it has only the indexes linked... example: I search for engsoc (looking for Canadian University Engineering Societies) and I find all the "main" entry pages with google, and I find a littering of "inside" pages with obscure titles with this new one. I'll stick with google-- and my chances of using the "i feel lucky" button are high, since what I want is usually the first or second link.
Alltheweb's technology is, or at least was, better. Google uses 4000+ Linux boxes where alltheweb uses (used) about 400 servers running FreeBSD. I could explain exactly how alltheweb's tech works, but I don't think my IP agreement would allow it :)
This did return more results for some search terms than google. Not many of the extras seemed all that useful, though. The signal to noise ratio seems a bit lower.
The ordering of pages seems less helpful. In many cases, the page I'm looking for is farther down the page.
The sponsored links and advertising are way more noticeable, and get in the way of the search results, although they're probably easy enough to ignore.
Google seems to be better at rating by search term proximity, under the useful assumption that if the search terms occur close to each other, it is less likely to be a random hit. One irritation with AllTheWeb is that for many results, it doesn't show you the context of the search terms in the summary.
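To make the proximity idea concrete, here is a rough Python sketch. It is not how Google or AllTheWeb actually rank anything (neither engine's method is described here). It just assumes a score equal to the length of the shortest run of words that contains every search term, smaller being better. The function names and sample documents are made up for illustration.

```python
def min_span(words, terms):
    """Length in words of the shortest window containing every term, or None."""
    terms = {t.lower() for t in terms}
    last_seen = {}  # term -> index of its most recent occurrence
    best = None
    for i, word in enumerate(words):
        word = word.lower()
        if word in terms:
            last_seen[word] = i
            if len(last_seen) == len(terms):
                # Shortest window ending here starts at the oldest last-seen term.
                span = i - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best


def rank_by_proximity(docs, query):
    """Return the docs that contain all query terms, tightest match first."""
    terms = query.split()
    scored = []
    for doc in docs:
        span = min_span(doc.split(), terms)
        if span is not None:
            scored.append((span, doc))
    scored.sort(key=lambda pair: pair[0])
    return [doc for _, doc in scored]


docs = [
    "engineering students formed a society",
    "the engineering society meets weekly",
    "society news",
]
for doc in rank_by_proximity(docs, "engineering society"):
    print(doc)
```

A page where the terms sit next to each other gets the smallest span and floats to the top, which is the "less likely to be a random hit" assumption in code form.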
Obviously AllTheWeb lacks the excellent USENET archive. The video and MP3 search features might be pretty useful, but I haven't had a chance to try them.
I realize I'm coming across as entirely pro-Google, but these are the only observations I have right now. I'll give AllTheWeb a chance, and let internet darwinism settle the issue.
Do a search for slashdot: Google = 2,250,000, AllTheWeb = 1,649,088. What's up with that?
Your sig here!
is that it allows me to download its search result pages from within a PHP script. I wrote it to get a concordance (list of real usage cases) for any phrase from within Emacs, and it works flawlessly with alltheweb (and greatly facilitates writing in English for a non-native speaker like me). I tried to hook up to Google in the same way, but it refused to share its wisdom with my script, apparently because the user_agent field was not that of a "real" browser. I find this rather stupid - what are they afraid of, robots stealing their bandwidth? Any robot can easily fake its user_agent, I was just a bit too lazy for this hack. And besides, alltheweb works just fine for me.
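For anyone curious what such a script boils down to, here is a minimal Python sketch (the poster's own was PHP called from Emacs, and it isn't shown here). Everything specific is an assumption for illustration: the `search.example.com` URL, the `q` parameter, and the User-Agent string are hypothetical, not either engine's real interface.

```python
import re
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(phrase,
                         base_url="http://search.example.com/search",  # hypothetical
                         user_agent="Mozilla/5.0 (compatible; concordance-script)"):
    """Build an HTTP request for an exact-phrase search with an explicit User-Agent."""
    url = base_url + "?" + urlencode({"q": '"%s"' % phrase})
    return Request(url, headers={"User-Agent": user_agent})


def concordance(text, phrase, width=30):
    """Return each occurrence of `phrase` with up to `width` characters of context."""
    lines = []
    for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        lines.append(text[start:end].strip())
    return lines


req = build_search_request("give it a chance")
# Sending it would be urllib.request.urlopen(req); left out so the sketch runs offline.
sample = "So, come on, folks, give it a chance, and do not jump to conclusions."
for line in concordance(sample, "give it a chance", width=10):
    print(line)
```

Setting the header is the whole "hack": the server only sees whatever string the script chooses to send.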