AllTheWeb Claims Bigger Index Than Google
An anonymous reader writes: "Hoping to attract more mass appeal for an online search engine with a cult following, Norwegian search engine AlltheWeb on Monday declared that it indexes more Internet information than longtime pacesetter Google. Boston.com has the story." Of course, pages indexed is not the only measure of a search engine and probably isn't even the most important.
This is the same AllTheWeb that has been buying banner ads that launch their website into pop-ups again and again...no thanks....
Great, you have a huge index. I know a haystack that has more than one needle, but the stack is about the size of Texas.
Neck_of_the_Woods
#/usr/local/surf/glassy/overhead
Well, I think this might finally answer the question I have been wondering about my love of google for a long time.
Do I love google because it's so simple and easy to use, with very quick download times, simple graphic interfaces, and good search algorithms that more often than not give me the sites that I am looking for in one page?
Or do I love google because it has a ton of useful sites logged in its database including all copies, half sites, under construction sites, etc.?
I am willing to say that it's likely the first one, and I think that it might be that for most other people.
But either way, it'll be neat to see what AllTheWeb.com does well.
~ kjrose
Ya think *somebody* might be compensating for something here?
The ultimate test: how many webpages about me:
;^)
Google: 185
AllTheWeb: 57
I'll stick with google. It indexes more interesting stuff.
DNA is the ultimate spaghetti code.
Unfortunately their ads are at the top of the page, followed by "top news" and then the links themselves.
However the first two returns for Scientology are the Scientology homepage and Operation Clambake. I wonder how long it will be before AllTheWeb is threatened.
As an aside, I'll need more proof that this thing is more accurate than google before I would consider switching.
Feminism is the radical notion that women are people.
are they going to have any cutesy cartoons made out of their name for special occasions?
Well, I was pretty happy with the results of a search on my name...happier than with Google in that one case, though that's but a single tiny datapoint.
In any case, it would be terrific to have a viable alternative to Google...despite Google's almost unnerving ability to do *so* many things Right, it is good to have somewhere to turn just in case something went wrong there. Not having a monoculture (which is what we're almost on the verge of with Google) is generally a good thing.
SO YOU'RE GOING TO DIE: The Comic for Dealing with Death
The story says AlltheWeb.com is owned by a Norwegian company. Should people really support a socialist Scandinavian country? Any real American should only use capitalist homeland-based search engines, like Google or MSN Search. Like Bush says, "You're either with us or against us." Only a traitor would go against Bush's wisdom. What are you, AN AMERICAN TALIBAN?
I just did some searches, and it appears to be ok for finding information. Whether its logic is as good as google's is hard to tell. Little slower than google. It doesn't look to me like there is any reason to use it over google. How many sites worth visiting are not in google's index?
This may be a case of a company picking a poor benchmark as their performance measurement. Google's draw is their great ranking logic, not index size.
-Pete
Soccer Goal Plans
Windows declares itself better than linux, ...
:)
Gnome declares itself better than KDE
Emacs declares itself better than VI
PHP declares itself better than Perl
Let the flames fly
The Anti-Blog
I did some searches, and I ended up with different results than google. Perhaps of note, the results I got with alltheweb are from 1998, whereas google's are from this year.
Google counts as a single page both their cache and the site in its current form, so the number of web pages you can get to from a google search is significantly higher than the number of pages they have actually 'indexed'.
This is far more important to me as a user than some extra pages that alltheweb may have (presumably because they ignored a few 'nobots' tags that Google's crawlers respected?)
A pizza of radius z and thickness a has a volume of pi z z a
Jumpstart the tartan drive.
I'm too spoiled by Google, I think. I took one glance at the search results screen that had a few banner ads, and decided never to go there again. I understand they want to offset costs/make money off of the engine, but banner ads are ugly as sin. I'll stick with Google.
Al Qaeda has ninjas!
I think I remember Teoma making the same claim, "we're better than Google."
They should be featured on one of those shows - Where Are They Now?
God forbid someone presents an objective comparison between Alltheweb and Google. Responses such as "Google is my God" and Timothy's little snip in the article do nothing for anyone really interested in using a useful search engine.
I just used Alltheweb for some common searches I do, and you know what? It found a lot more useful hits than Google did. Yea, imagine that.
But Alltheweb didn't seem to have a cache, which I thought was very useful in Google.
So, come on, folks, give it a chance, and don't jump to conclusions without an objective analysis. The tendency to blindly worship things like google/linux/linus/transmeta is far too common on this site.
when a search for
;)
"php regular expression" AND "tutorial"
on AllTheWeb gives me 131 results, with more than half being a reference to a PHP website manual (and even a disclaimer footer because it had the words "PHP" and "and" in it ???). Moreover, it took my "and" literally as a search criterion, though my advanced searching techniques could probably use a bit of help.
In comparison, Google gives me 73 links (without omitted results showing) with many results displaying ALL my keywords in bold and not ONE of them using "and" as a keyword.
Dunno, I'm probably a bit biased anyway since "Google" types out so much easier for me (repetition i guess) than "alltheweb".
pblt....
Because it indexes all of the domain names of the same site as different hosts.
;)
Google returns one accurate site for the company "DataHive", one domain name (not the proper one, but how would it know =)
This site returns 3 different domains, and tries to present them as different pages, though they all have the same content.
I can imagine it's easy to claim more than google when you multiply the number of real hits.
I must say though, the results I found were pretty good for a number of queries. Definitely a google competitor. It does not seem to find all of the newsgroup/mailing-list stuff that google returns, good or bad depending on what you are searching for.
It's nice to have another competent option.
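For anyone curious how much that kind of duplication inflates a hit count, here is a minimal Python sketch. It assumes you already have the text of each hit, and it only catches byte-for-byte identical pages; the URLs, the page text and the `collapse_mirrors` name are made up for illustration and say nothing about how either engine actually counts.

```python
import hashlib


def collapse_mirrors(results):
    """Group (url, page_text) pairs whose text is byte-for-byte identical.

    Returns a list of URL lists, one per distinct page, in first-seen order.
    """
    by_digest = {}
    for url, page_text in results:
        digest = hashlib.sha256(page_text.encode("utf-8")).hexdigest()
        by_digest.setdefault(digest, []).append(url)
    return list(by_digest.values())


# Hypothetical hits: three domain names serving the same page, plus one other.
hits = [
    ("http://example.com/", "Welcome to the company home page"),
    ("http://example.net/", "Welcome to the company home page"),
    ("http://example.org/", "Welcome to the company home page"),
    ("http://example.edu/about", "A different page entirely"),
]
groups = collapse_mirrors(hits)
print(len(hits), "raw hits,", len(groups), "distinct pages")
```

Near-duplicates (same page with a different footer, say) would slip through an exact hash, so treat this as a lower bound on the inflation.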
Google is my favourite search engine. Even now, its ads are unobtrusive and don't pollute the search results. They've been good net citizens and they've done substantial research into how to better search. Their results are typically the best as well.
In this case their search results were very broken however, at least for the purposes of my search. What I'd like to see is google, or an engine as effective as google, add in the ability to constrain your search to subject areas. In this instance I'd constrain my search to historical sites and would have received mostly uncorrupted hits. This is different than a web directory. Web directories don't classify sites based on their quality. Google does in a roundabout fashion: it lists sites with more people linking to them higher than sites with fewer links.
I'm not sure how the details of this would work, self-nomination would not necessarily work. Porn companies would gladly pollute the keywords on the off chance that somebody looking for history would buy a membership to their site. Letting individuals vote a site into or out of a keyword might work, though you'd be in danger of concerted efforts to, say, vote out anti-Scientologist information and vote in pro-Scientologist information when both actually could be under a religious keyword.
Anyway, linking to more sites isn't necessarily helpful in my opinion. What I'd prefer is the ability to narrow the focus of my searches.
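To make the "more people linking to it ranks higher" idea concrete, here is a rough Python sketch that simply counts distinct inbound links. This is a deliberately naive stand-in, not Google's actual ranking; the site names and the `rank_by_inbound_links` function are invented for the example.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Order sites by how many distinct other sites link to them.

    `links` is an iterable of (source_site, target_site) pairs. Self-links
    and repeated links from the same source are ignored. Ties are broken
    alphabetically so the output is deterministic.
    """
    inbound = Counter()
    for source, target in set(links):
        if source != target:
            inbound[target] += 1
    return sorted(inbound, key=lambda site: (-inbound[site], site))


# Hypothetical link graph.
links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("a.example", "b.example"),
    ("c.example", "c.example"),  # self-link, ignored
    ("a.example", "c.example"),  # repeat, ignored
]
print(rank_by_inbound_links(links))
```

A raw count like this is trivially gamed by link farms, which is the same pollution problem as the keyword self-nomination worry above.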
Chris Kuivenhoven is a thief, beware
Google: 63,500,000
AllTheWeb: 25,435,205
I think I'll stick with Google :o)
Avantslash - View Slashdot cleanly on your mobile phone.
More pages means more crap and shitty search results. Similar to programming, more lines of code doesn't mean better.
Skiers and Riders -- http://www.snowjournal.com
For those who are unable to reach AllTheWeb here is the homepage through the usual Google cache.
I just tried to pull up one of my own pages with this engine. Got:
"Redirection limit for this URL exceeded. Unable to load the requested page."
Which, as near as I can tell, is their way of throttling commercial hits. Wonderful. Moving the mouse over the link doesn't reveal the address in the bottom bar, either, so the only way I can think of to obtain the address of the item it matches is by right-clicking and selecting 'copy link address', opening a new window and pasting it in (and having a browser that is capable of doing this), then editing the URL so only the target link text remains.
You can't even right-click and open in a new window to do this. If you try, you get "about:blank" which, afaik, means they're using javascript.
These people sure go through a lot of pains to render a result and then not let you anywhere near it. Saying they're bigger than Google is a bit like someone bragging about how their PDP-11 is bigger than my Athlon. Cripes.
My
Limekiller
It was hidden as ftpsearch.lycos.com for some time, but now it seems to have come "home".
BTW: the last time their OS was visible through the firewall, it was FreeBSD...
Anyone remember archie ?
Windows 2000 - from the guys who brought us edlin
True, indexed pages on the internet is not the single most important thing for a search engine. But it is definitely up there in the top. Personally I would prefer more indexed pages before most other things one can measure a search engine by, simply because then I know there's a greater chance to find what I'm looking for, even if maybe it will be a little more difficult.
Will work for bandwidth
There is actually a help link. "php regular expression" + "tutorial" would have given you what you wanted. If you want to compare two tools you should at least use enough time to see if you have to use the two differently, and then see what is best at getting the job done.
- We are the slashdot. Resistance is futile. Prepare to be moderated -
I'll just say this:
:-)
;-)
Google manages to get a graph of the slashdot effect among the first 20 hits, while AllTheWeb just manages to get Cliff showing a Think Unix book (in weirdo hawaiian clothes).
I don't know about you, but Google gives me more relevant matches as usual.
Beware: In C++, your friends can see your privates!
Worse than the outdated and useless search results is the way they are presented - there is no grouping by site to put similar pages under one entry. Of the 167 results, almost all of them are from two distinct sites, but you have to wade through all of them to find any different ones. With a more common search string, it will be almost impossible to find what you are looking for, and it is still difficult with a narrow focus search. Google ain't going down that easy...
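The grouping-by-site complaint is easy to picture in code. Below is a small Python sketch that buckets result URLs by hostname, assuming all you have is a flat list of URLs; the URLs and the `group_by_site` name are hypothetical, and a real engine would presumably also normalise things like a leading "www.".

```python
from urllib.parse import urlsplit


def group_by_site(urls):
    """Bucket result URLs by hostname, keeping first-seen order of hosts."""
    groups = {}
    for url in urls:
        host = urlsplit(url).netloc.lower()
        groups.setdefault(host, []).append(url)
    return groups


# Hypothetical result list dominated by two sites.
results = [
    "http://one.example/page1.html",
    "http://one.example/page2.html",
    "http://two.example/a/index.html",
    "http://one.example/page3.html",
    "http://two.example/b/index.html",
    "http://three.example/",
]
for host, pages in group_by_site(results).items():
    print(host, "-", len(pages), "result(s), e.g.", pages[0])
```

Showing one entry per host with a "more from this site" link is then just a matter of printing the first page of each bucket.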
...the object of claimed affection really IS as good as everyone says.
And, Google forbid, should google start to suck, or something else start to be better, then I think most of us would find another search engine to "worship", like I (and I assume many others) did when Yahoo went down the toilet.
For me, the one mention of pop-ups and heavy graphic ads is more than enough to make it not worth my while to check out (and yes, I know, at home, I can filter out all the banner ad and pop-up garbage, but here at work I don't have the luxury of arbitrarily installing proxies and browsers to do that sort of thing. besides, web sites that use pop-ups piss me off).
..is bigger than your index.
Computer scientists - pfft...
Unable to read configuration file '/bigassraid/htdig//conf/14229.conf'
Geocrawler error message.
Today the New York Times claimed that it had published "All the News That's Fit to Print."
One question remains unanswered: Will they be able to do it again tomorrow?
Note to moderators: This is sarcasm. It isn't off-topic. I'm implying that some marketing ploy by alltheweb.com isn't exactly newsworthy. Thank you.
My Karma was at 49, then they switched to words. All that work for nothing!
Your comment sounded to me like it was specifically designed to diminish the importance of AllTheWeb's claim, in favor of Google. Perhaps it was unintentional, but I doubt it.
I don't know if you're a Google-worshiper, but you certainly ran to its defense when faced with a strong claim from a competing search engine.
If you are looking for something really specific (e.g. the DNS entry of your machine, to see which of the webpages you look at publish their log files), then alltheweb in my experience will find a number of pages which google misses.
For general searching google still rocks.
It's not how big it is, it's how you use it.
Google is still way more useful in my opinion.
$45 per U Colocation Special
Alltheweb's claims are not unfounded, and I find it always worth checking when google fails.
Here is one of several real life cases where it found software for me that google didn't.
(It still does, and google still doesn't.)
Timeo idiotikOS et dona ferentes
Google Pr0n Search finds 46,200 results.
Searching for pr0n via alltheweb.com leads to 2318 more potential pieces of pr0n to be seen.
The reason I'm for Google has little to do with technology. It has everything to do with advertisements and capitalism.
;-).
I'd rather support a company that uses subtle advertisements like Google does than a company that uses in your face banner ads, etc. (Then again I'm posting on Slashdot!) Also I make a point to check out the ads every now and then on Google and visit the company's site. I may be getting hosting from an advertiser on Google soon.
If people who advertise on Google make more money than they do with banner ads, pop-ups, etc. then we'll see the idea spread. I don't like in-my-face ads, so I do what I can to tell companies that. It's called being a responsible consumer.
Plus more valid hits come up when I search for myself on Google
Comes up with more hits for my name than google.
I can't believe how many people have my "Subtle mind control? why do all the HTML buttons say 'submit'" quote on their sites.
autopr0n is like, down and stuff.
Of course, as has been mentioned a few times above, competition is a Good Thing (TM).
- Ardenstone
Your comment sounded to me like it was specifically designed to diminish the importance of AllTheWeb's claim, in favor of Google.
I think you're right, except for the "in favor of Google" part. Timothy said, "pages indexed is not the only measure of a search engine and probably isn't even the most important." AllTheWeb claims that their page index is big, and Timothy is reality-checking that claim.
I think Slashdot editors get too snippy too often in their story posts. But this isn't one of those occasions.
I don't think there's anything wrong with opening a new window when you click an ad, it's not the same thing as a popup, and most of the time it's the choice of the website admin, not the advertiser.
Also, lots of people prefer opening new sites in new windows. Myself included.
autopr0n is like, down and stuff.
Norwegian search engine AlltheWeb on Monday declared that it indexes more Internet information than longtime pacesetter Google.
Then how come the word with the most search results (FYI: the) on Google returns fewer results on alltheweb?
wow, that's the first "'insert name' is dying!!!" post I have ever seen that is legitimate. interesting.
Google always seems to give me what I want, faster than anything else. Either this is because of its search algorithms, or that it has only the indexes linked... example: I search for engsoc (looking for Canadian University Engineering Societies) and I find all the "main" entry pages with google, and I find a littering of "inside" pages with obscure titles with this new one. I'll stick with google-- and my chances of using the "i feel lucky" button are high, since the first or second link.
'I've got it Herbert! Let's make some inflammatory claim about Google that has nothing to do with the actual quality of either site's results and sit back and watch the hits roll in!'
yeah, so it's an obvious troll, but i guarantee you it's true.
I searched both Google and AllTheWeb for the name of my company. (For privacy reasons, I'm not going to tell you the name.) We are a small company, and probably few pages on the web link to our site, but Google pulled up our home page as its first search result. AllTheWeb failed to list it in its first page of links.
It's not hard to find our site, either. Our company's name is "foo bars"* and our URL is "foobars.com." Google nailed it, while AllTheWeb bombed.
Doing a more complex search with lots of words from our home page did, finally, get AllTheWeb to cough up our site. So I know it's in there.
So in my opinion it has little to do with how big their index is. It has to do with how good they are at finding what I'm looking for. For me, Google almost always finds what I'm looking for. I've even started using the "I Feel Lucky" button to skip the search results altogether and just take me straight to the first listed site.
*Incidentally, I've always wanted to open a pub called the Foo Bar, but I don't think many people would get it.
This did return more results for some search terms than google. Not many of the extras seemed all that useful, though. The signal to noise ratio seems a bit lower.
The ordering of pages seems less helpful. In many cases, the page I'm looking for is farther down the page.
The sponsored links and advertising are way more noticeable, and get in the way of the search results, although they're probably easy enough to ignore.
Google seems to be better at rating by search term proximity, under the useful assumption that if the search terms occur close to each other, it is less likely to be a random hit. One irritation with AllTheWeb is that for many results, it doesn't show you the context of the search terms in the summary.
Obviously AllTheWeb lacks the excellent USENET archive. The video and MP3 search features might be pretty useful, I haven't had a chance to try them.
I realize I'm coming across as entirely pro-Google, but these are the only observations I have right now. I'll give AllTheWeb a chance, and let internet darwinism settle the issue.
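The proximity point can be sketched in a few lines of Python. This is only one plausible way to score it (the width of the smallest window of words that contains every query term), not a description of what Google or AllTheWeb really do; the sample texts and the `min_span` name are assumptions for illustration.

```python
from collections import Counter


def min_span(text, terms):
    """Width in words of the smallest window containing every term.

    Returns None if any term is missing. Matching is a naive lowercase
    whitespace split, so punctuation is not handled.
    """
    needed = {t.lower() for t in terms}
    if not needed:
        return None
    words = text.lower().split()
    hits = [(i, w) for i, w in enumerate(words) if w in needed]
    counts = Counter()
    best = None
    left = 0
    for right_pos, right_word in hits:
        counts[right_word] += 1
        # Shrink from the left while the window still covers every term.
        while all(counts[t] > 0 for t in needed):
            span = right_pos - hits[left][0] + 1
            if best is None or span < best:
                best = span
            counts[hits[left][1]] -= 1
            left += 1
    return best


close = "php regular expression tutorial"
far = "php is great and this unrelated page mentions a tutorial"
print(min_span(close, ["php", "tutorial"]), min_span(far, ["php", "tutorial"]))
```

A smaller span suggests the terms belong together, so a ranker could sort ascending by it; pages where the terms sit far apart are more likely to be the random hits described above.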
Do a search for slashdot GOOGLE = 2,250,000 AllTheWeb = 1,649,088 What's up with that ?
Your sig here!
What was their pitch?
I'm interested.
Black holes are where the Matrix raised SIGFPE
telnet www.kaosinc.com 80
Trying 192.203.175.245...
Connected to www.kaosinc.com.
Escape character is '^]'.
GET
HTTP/1.1 302 Found
Date: Mon, 17 Jun 2002 16:51:47 GMT
Server: Apache/1.3.23 (Unix) Debian GNU/Linux PHP/4.1.2 ApacheJServ/1.1.2
Location: http://www.kaosinc.com/index.shtml
Connection: close
Content-Type: text/html; charset=iso-8859-1
302 Found
Found
The document has moved here.
Connection closed by foreign host.
"A mind is a terrible thing to taste."
Redirection limit for this URL exceeded. Unable to load the requested page.
That is a Mozilla error message (source) and does not come from alltheweb. Your web server is broken. http://www.kaosinc.com/jen.shtml redirects to http://www.kaosinc.com/index.shtml, which then redirects to itself. This happens regardless of where I find the link to http://www.kaosinc.com/jen.shtml, or what browser I use to load it. IE appears to just sit there, Opera bounces between various stages of trying to connect, and Netscape 4 gives up after a few redirects and displays a raw 302-found page ("The document has moved _here_") without redirecting.
Moving the mouse over the link doesn't reveal the address in the bottom bar, either, so the only way I can think of to obtain the address of the item it matches is by right-clicking and selecting 'copy link address', opening a new window and pasting it in (and having a browser that is capable of doing this), then editing the URL so only the target link text remains.
An easier way to see the URL of the link is to hold the mouse down over the link, and then move off of the link before you lift the mouse button. But I still get the infinite-redirect error message if I type your URL directly.
You can't even right-click and open in a new window to do this. If you try, you get "about:blank" which, afaik, means they're using javascript.
If I right-click on a link from the alltheweb search results and select "open link in new window", I see http://www.alltheweb.com/go/1/H/web/http/www.kaosinc.com/jen.shtml in the location bar and get the same error message. What version of Mozilla are you using?
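For what it's worth, the self-redirect described above is easy to model offline. The Python sketch below follows redirects through a plain dictionary instead of the network, so it only illustrates the loop logic; the `follow_redirects` name, the `max_hops` default and the dictionary stand-in for the server's 302 responses are all assumptions, not anything Mozilla or the server actually uses.

```python
def follow_redirects(start_url, redirect_map, max_hops=20):
    """Follow redirects listed in `redirect_map` (url -> Location target).

    Returns (last_url, status) where status is "ok" when a URL with no
    redirect is reached, "loop" when a URL repeats, and "limit" when
    max_hops redirects have been followed without finishing.
    """
    seen = set()
    url = start_url
    hops = 0
    while url in redirect_map:
        if url in seen:
            return url, "loop"
        if hops >= max_hops:
            return url, "limit"
        seen.add(url)
        url = redirect_map[url]
        hops += 1
    return url, "ok"


# Models the behaviour reported in this thread: jen.shtml -> index.shtml,
# and index.shtml -> itself.
redirects = {
    "http://www.kaosinc.com/jen.shtml": "http://www.kaosinc.com/index.shtml",
    "http://www.kaosinc.com/index.shtml": "http://www.kaosinc.com/index.shtml",
}
print(follow_redirects("http://www.kaosinc.com/jen.shtml", redirects))
```

A browser that only enforces a hop limit will report something like "redirection limit exceeded" in this situation, while one that tracks visited URLs can say outright that it is a loop.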
The shareholder is always right.
I don't agree; the article was clearly making a comparison between ATW and Google, to which Timothy responded by diminishing ATW's claim, clearly to the benefit of Google.
If the article had been about Google indexing its N-billionth page, do you think Timothy would have quipped something about its irrelevance? I doubt it. More likely there'd be the usual drooling.
Being a Norwegian company, would they be under the same mandate to hand over all 'suspect' search queries for abuse by the US's new CIAFBINSASSSASD (known in PRSpeak as the Information Awareness Office)?
Years ago, I used to use ftpsearch to find warez left in public incoming dirs by warez couriers.
Glad to see it's back, after a sojourn as a non working component of Lycos.
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
The reason why AllTheWeb will surpass google is that it has a much catchier name.
As a bonus, alltheweb (when properly separated with spaces) is proper English.
Being a Norwegian company, would they be under the same mandate to hand over all 'suspect' search queries for abuse by the US's new CIAFBINSASSSASD (known in PRSpeak as the Information Awareness Office)?
I would hope not, but perhaps there might be a profit angle involved.
More seriously, do you have any knowledge that this "mandate" exists? Is it public law? Executive order? Secret executive order? A directive from "high levels"? Or is this more of an "intelligent concern" of yours? There's nothing wrong with that - everyone with half a brain should be concerned about these possibilities.
A dingo ate my sig...
come on, we all know that *cough* it isn't the size that matters, but how well you use it to um, achieve results.
:)
Okay, I get it. You've got a bug up your ass. I'll stop trying to have an objective conversation with you now.
Let's look at the numbers shall we?
Fnord: Google: 104000 AllTheWeb: 46439
Cheese: Google: 3690000 AllTheWeb: 7718252
Linux: Google: 48000000 AllTheWeb: 26670311
Windows: Google: 44600000 AllTheWeb: 66545303
Extropian: Google: 4460 AllTheWeb: 3999
Kumquat: Google: 32600 AllTheWeb: 42889
Question Authority and the authorities will question you.: Google: 90 AllTheWeb: 74
Hot man meat: Google: 229 AllTheWeb: 1661
Hot pussy: Google: 104000 AllTheWeb: 770057
"undefined reference to" error: Google: 31700 AllTheWeb: 8548
"Antimatter-Catalyzed MicroFission / Fusion": Google: 6 AllTheWeb: 1
Surprisingly alltheweb does return more hits in some areas, most notably for cheese, windows, and pr0n. With the cheese test, AllTheWeb helpfully cluttered my screen with a banner for food products. Google, thankfully, is still bannerless, and returns more linux hits, fnords, and Voltaire quotes. Alltheweb also stalled several times and I had to resubmit a search. Conclusion: If you're a linux geek or you want to know about fnords, futuristic philosophies, compilation errors, or advanced space propulsion concepts, google is better. If you're a horny windows user and want to find gay or straight pr0n, and if you for some reason like kumquats and want to learn more about cheese, use alltheweb.
Seriously, I'll probably stick with google, better numbers or no. The only thing AllTheWeb has going for it is the ftp search. The original is owned by lycos now and broken.
...is how often they include cute, relevant, wacky different variations on their logo.
Right?
Hey, check out one of the new Google beta programs, answers.google.com. Even you can now earn fame and fortune, and yes, even internet cash, by searching google's archives for answers to people's questions...
Could be interesting, since AllTheWeb is based in Norway, the same country where Operation Clambake is. They might say "DMCA, what?"
But then, they might not, since the index itself is probably in the US, and besides, our Big Sister Sunde thinks DMCA is Norwegian law anyway, so she'll be banging on the doors once she gets $cientology on the phone.
Employee of Inrupt, Project Release Manager and Community Manager for Solid
here.
I'll stick with google, those 3 extra characters are too much for me. Seriously though, alltheweb seems pretty good, I tried their mp3 search and it was ok, not really comparable to the myriad p2p clients, but still a nice feature.
All I've read suggests that the IAO effort is geared toward building a massive data infrastructure that will allow fast access to all manner of information related to a specific target (which could be anyone that fits into a specific profile, for one reason or another). So, it's not a mandate per se, but the mere fact that our browsing habits, including search queries, could be part of it, is, and should be, unsettling for every American citizen. The problem is that the government will have more and more access to information with less and less control or accountability.
AllTheWeb has summed all their formats to get 2.1 billion.
If you add Google's 700mil USENET articles, 300mil images, etc, Google has >3,000,000,000 documents to search. That kills ATW.
-twb
To me, Alltheweb really does have a superior sense of style, it's more aesthetically pleasing. And even with the banners it's still really fast.
I'm probably going to start using both regularly.
autopr0n is like, down and stuff.