AllTheWeb Claims Bigger Index Than Google
An anonymous reader writes: "Hoping to attract more mass appeal for an online search engine with a cult following, Norwegian search engine AlltheWeb on Monday declared that it indexes more Internet information than longtime pacesetter Google. Boston.com has the story." Of course, pages indexed is not the only measure of a search engine and probably isn't even the most important.
This is the same AllTheWeb that has been buying banner ads that launch their website into pop-ups again and again...no thanks....
Great, you have a huge index. I know a haystack that has more than one needle, but the stack is about the size of Texas.
Neck_of_the_Woods
#/usr/local/surf/glassy/overhead
Well, I think this might finally answer the question I have been wondering about my love of google for a long time.
Do I love google because it's simple and easy to use, with very quick download times, a simple graphical interface, and good search algorithms that more often than not give me the sites I am looking for on the first page?
Or do I love google because it has a ton of useful sites logged in its database, including all copies, half sites, under construction sites, etc.?
I am willing to say that it's likely the first one, and I think that holds for most other people.
But either way, it'll be neat to see what AllTheWeb.com does well.
~ kjrose
Ya think *somebody* might be compensating for something here?
The ultimate test: how many webpages about me:
;^)
Google: 185
AllTheWeb: 57
I'll stick with google. It indexes more interesting stuff.
DNA is the ultimate spaghetti code.
Unfortunately their ads are at the top of the page, followed by "top news" and then the links themselves.
However the first two returns for Scientology are the Scientology homepage and Operation Clambake. I wonder how long it will be before AllTheWeb is threatened.
As an aside, I'll need more proof that this thing is more accurate than google before I would consider switching.
Feminism is the radical notion that women are people.
Regardless of what this article claims, or what alltheweb says they do, Google has proven day in and day out to be the best and fastest search engine I have ever used. It is going to be a while until someone takes google's place, and I really do not see allthecobweb doing it.
dam)(
Useless sig.
are they going to have any cutesy cartoons made out of their name for special occasions?
Well, I was pretty happy with the results of a search on my name...happier than with Google in that one case, though that's but a single tiny datapoint.
In any case, it would be terrific to have a viable alternative to Google...despite Google's almost unnerving ability to do *so* many things Right, it is good to have somewhere to turn just in case something went wrong there. Not having a monoculture (which is what we're almost on the verge of with Google) is generally a good thing.
SO YOU'RE GOING TO DIE: The Comic for Dealing with Death
The story says AlltheWeb.com is owned by a Norwegian company. Should people really support a socialist Scandinavian country? Any real American should only use capitalist homeland-based search engines, like Google or MSN Search. Like Bush says, "You're either with us or against us." Only a traitor would go against Bush's wisdom. What are you, AN AMERICAN TALIBAN?
I just did some searches, and it appears to be ok for finding information. Whether its logic is as good as google's is hard to tell. It is a little slower than google. It doesn't look to me like there is any reason to use it over google. How many sites worth visiting are not in google's index?
This may be a case of a company picking a poor benchmark as their performance measurement. Google's draw is their great ranking logic, not index size.
-Pete
Soccer Goal Plans
Windows declares itself better than linux, ...
:)
Gnome declares itself better than KDE
Emacs declares itself better than VI
PHP declares itself better than Perl
Let the flames fly
The Anti-Blog
I did some searches, and I ended up with different results than google. Perhaps of note, the results I got with alltheweb are from 1998, whereas google's are from this year.
Google counts as a single page both their cache and the site in its current form, so the number of web pages you can get to from a google search is significantly higher than the number of pages they have actually 'indexed'.
This is far more important to me as a user than some extra pages that alltheweb may have (presumably because they ignored a few 'nobots' tags that Google's crawlers respected?)
A pizza of radius z and thickness a has a volume of pi z z a
Jumpstart the tartan drive.
I'm too spoiled by Google, I think. I took one glance at the search results screen that had a few banner ads, and decided never to go there again. I understand they want to offset costs/make money off of the engine, but banner ads are ugly as sin. I'll stick with Google.
Al Qaeda has ninjas!
but I'm sure google is faster, and its results probably match better to what you were looking for. In any case it'll be interesting to see
--fetch daddy's blue fright wig, i must be handsome when i release my rage
...the search engine analog of MHz as a measure of CPU performance.
I think I remember Teoma making the same claim: "we're better than Google."
They should be featured on one of those shows - Where Are They Now?
Face it, most of the World Wide Web is junk. Search for information and you're likely to come across unrelated personal homepages or data from unreliable sites. Similarly you might be overwhelmed with too much data - information overload.
That's where good queries, source scoping, and ranking algorithms come in. In order to sift through the gazillion pages on the Net, we need a way to find out which pages are likely to interest us. Indexing more pages may help, but that's only one part of the answer.
As for search engine comparisons... well, Google's been really, really nice. =) Googlebars. Innovations. Funky things going on in Google Labs. I don't think Google's going to be easily replaced as _the_ search engine. I'll try AllTheWeb - looks interesting - but Google's cool.
God forbid someone presents an objective comparison between Alltheweb and Google. Responses such as "Google is my God" and Timothy's little snip in the article do nothing for anyone really interested in using a useful search engine.
I just used Alltheweb for some common searches I do, and you know what? It found a lot more useful hits than Google did. Yea, imagine that.
But Alltheweb didn't seem to have a cache, which I thought was very useful in Google.
So, come on, folks, give it a chance, and don't jump to conclusions without an objective analysis. The tendency to blindly worship things like google/linux/linus/transmeta is far too common on this site.
when a search for
;)
"php regular expression" AND "tutorial"
on AllTheWeb gives me 131 results, with more than half being a reference to a PHP website manual (and even a disclaimer footer because it had the words "PHP" and "and" in it ???). Moreover, it took my "and" literally as a search criterion, though my advanced searching techniques could probably use a bit of help.
In comparison, Google gives me 73 links (without omitted results showing) with many results displaying ALL my keywords in bold and not ONE of them using "and" as a keyword.
Dunno, I'm probably a bit biased anyway since "Google" types out so much easier for me (repetition i guess) than "alltheweb".
pblt....
I'll take Google's simple interface over the cluttered feel of AlltheWeb's any day. I don't know about AlltheWeb, but Google has so many cool tricks (phone number lookup, file search etc.) that it seems like I learn something new about Google almost every day. To AlltheWeb's credit, though, their search was fast, even comparable to Google's speed.
For now, I'll stick with Google though.
Posting as directed.
Because it indexes all of the domain names of the same site as different hosts.
;)
Google returns one accurate site for the company "DataHive", one domain name (not the proper one, but how would it know =)
This site returns 3 different domains, and tries to present them as different pages, though they all have the same content.
I can imagine it's easy to claim more than google when you multiply the number of real hits.
I must say though, the results I found were pretty good for a number of queries. Definitely a google competitor. It does not seem to find all of the newsgroup/mailing-list stuff that google returns, good or bad depending on what you are searching for.
It's nice to have another competent option.
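To make the inflated-count point concrete, here is a rough sketch of how an index could collapse mirrors of one site into a single entry by hashing page bodies. This is only an illustration under my own assumptions: the URLs and page bodies are made up, and nothing here is claimed to be how AllTheWeb or Google actually deduplicate.

```python
import hashlib
from collections import defaultdict


def group_by_content(pages):
    """Group URLs whose page bodies are byte-for-byte identical.

    pages is an iterable of (url, body) pairs. Returns a list of URL groups,
    in the order each distinct body was first seen.
    """
    groups = defaultdict(list)
    for url, body in pages:
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return list(groups.values())


# Hypothetical crawl results: three host names serving the same content.
pages = [
    ("http://datahive.example/", "<html>DataHive home</html>"),
    ("http://www.datahive.example/", "<html>DataHive home</html>"),
    ("http://datahive-mirror.example/", "<html>DataHive home</html>"),
    ("http://other.example/", "<html>Something else</html>"),
]

groups = group_by_content(pages)
print(len(pages), "URLs indexed,", len(groups), "distinct pages")
```

Exact hashing only catches identical copies; pages that differ by a timestamp or a banner would need some kind of near-duplicate comparison instead.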
Google is my favourite search engine. Even now, its ads are unobtrusive and don't pollute the search results. They've been good net citizens and they've done substantial research into how to search better. Their results are typically the best as well.
In this case their search results were very broken however, at least for the purposes of my search. What I'd like to see is google, or an engine as effective as google, add in the ability to constrain your search to subject areas. In this instance I'd constrain my search to historical sites and would have received mostly uncorrupted hits. This is different than a web directory. Web directories don't classify sites based on their quality. Google does in a roundabout fashion: it lists sites with more people linking to them higher than sites with fewer links.
I'm not sure how the details of this would work, self-nomination would not necessarily work. Porn companies would gladly pollute the keywords on the off chance that somebody looking for history would buy a membership to their site. Letting individuals vote a site into or out of a keyword might work, though you'd be in danger of concerted efforts to say vote out anti-Scientologist information and vote in pro-Scientologist information when both actually could be under a religious keyword.
Anyway, linking to more sites isn't necessarily helpful in my opinion. What I'd prefer is the ability to narrow the focus of my searches.
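The "more people linking to it ranks higher" idea from above can be sketched as a plain inbound-link count. To be clear, this is the simplified description given in this comment, not Google's actual algorithm, and the site names below are invented for the example.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank targets by how many distinct sources link to them.

    links is an iterable of (source, target) pairs. Duplicate pairs and
    self-links are ignored. Ties are broken alphabetically by target.
    """
    inbound = Counter()
    for source, target in set(links):
        if source != target:
            inbound[target] += 1
    return sorted(inbound.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical link graph: one history site with three independent linkers,
# one spammy site that a single source links to repeatedly.
links = [
    ("a.example", "history.example"),
    ("b.example", "history.example"),
    ("c.example", "history.example"),
    ("a.example", "spam.example"),
    ("a.example", "spam.example"),
]

ranking = rank_by_inbound_links(links)
print(ranking)
```

Counting distinct sources rather than raw links is one cheap defence against a single site stuffing links, though it does nothing about the coordinated voting problem described above.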
Chris Kuivenhoven is a thief, beware
I emailed Google a while back about the possibility of changing the label from 'Hacker' to 'H4x0r' (meaning script kiddie of course). They said they had people calling themselves hackers (in the true sense) so they would consider it. Guess not. Bah.
Google: 63,500,000
AllTheWeb: 25,435,205
I think I'll stick with Google :o)
Avantslash - View Slashdot cleanly on your mobile phone.
Plus: It groups matches with a site. So if, say, you get a hit on Salon on a search for 'DDR', you can click on 'more hits from' to get other matches on Salon. I've found this to not work very well on Google while it seems to work well on this site.
Big, BIG minus: Doesn't cache. Which is a huge reason why I use Google.
Conclusion: I probably won't even bookmark this site. It doesn't do much that just a little bit of digging wouldn't do anyway, and probably comes with a lot more cruft in the process.
My
Limekiller
More pages means more crap and shitty search results. Similar to programming, more lines of code doesn't mean better.
Skiers and Riders -- http://www.snowjournal.com
For those who are unable to reach AllTheWeb here is the homepage through the usual Google cache.
I just tried to pull up one of my own pages with this engine. Got:
"Redirection limit for this URL exceeded. Unable to load the requested page."
Which, as near as I can tell, is their way of throttling commercial hits. Wonderful. Moving the mouse over the link doesn't reveal the address in the bottom bar, either, so the only way I can think of to obtain the address of the item it matches is by right-clicking and selecting 'copy link address', opening a new window and pasting it in (and having a browser that is capable of doing this), then editing the URL so only the target link text remains.
You can't even right-click and open in a new window to do this. If you try, you get "about:blank" which, afaik, means they're using javascript.
These people sure go through a lot of pains to render a result and then not let you anywhere near it. Saying they're bigger than Google is a bit like someone bragging about how their PDP-11 is bigger than my Athlon. Cripes.
My
Limekiller
Actually, the comment I made applies to Google just as much as to any other search engine - pages indexed aren't the only thing.
For now, Google is the best search engine I know, but before that, Hotbot was the best search engine I knew, and before that etc etc.
I do admire Google for how well it works, but no worship. I'd love to find an engine that works even better. I think Kartoo (even if only a meta engine) has an interesting approach to the display of results.
I get good results from alltheweb.
timothy
jrnl: http://tinyurl.com/c2l8yr / foes: http://tinyurl.com/ckjno5
I have seen the Fast technology briefing. They wanted to sell their product to our company. From the technology briefing and from their reference reports, I believe they do have a much better search engine than google, both in terms of the product and their site.
Perhaps, but each web surfer has a different 1% that they consider relevant. It's highly unlikely that I'm interested in your 1%.
...phil
"For a list of the ways which technology has failed to improve our quality of life, press 3."
It was hidden as ftpsearch.lycos.com for some time, but now it seems to have come "home".
BTW: the last time their OS was visible through the firewall, it was FreeBSD...
Anyone remember archie ?
Windows 2000 - from the guys who brought us edlin
True, the number of indexed pages is not the single most important thing for a search engine. But it is definitely up there near the top. Personally I would prefer more indexed pages over most other things one can measure a search engine by, simply because then I know there's a greater chance of finding what I'm looking for, even if it may be a little more difficult.
Will work for bandwidth
There is actually a help link. "php regular expression" + "tutorial" would have given you what you wanted. If you want to compare two tools you should at least use enough time to see if you have to use the two differently, and then see what is best at getting the job done.
- We are the slashdot. Resistance is futile. Prepare to be moderated -
Here are some features that I would love to see a decent search engine supply. Altavista used to offer all of these but its indexing is, well, you know how they've been lately.
(i) Case sensitive searches. - Great for searching for acronyms that are also real words.
(ii) stemming - this is my biggest gripe. Say you remember a company whose name begins with "blue" but you can't remember the rest. With a search engine that supports stemming you can search for "blue* inc" or "blue* +whateverindustry". With my knowledge of information retrieval limited to a single grad class, I'd say google canned stemming for search performance, but damn, it's a useful feature.
(iii) proper language filtering - google's language filtering is somewhat broken. I have english as my language and still get even oriental language pages in my search results. I don't know how AV did this, but it worked much better.
IMHO, other than that Google is a great search engine. "Google news" is a great resource.
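For what it's worth, the "blue*" style of query in (ii) is usually implemented as a prefix lookup over the index's sorted term list (often called truncation rather than stemming). A minimal sketch, assuming the terms are already sorted and small enough to hold in memory; the vocabulary below is invented:

```python
import bisect


def prefix_matches(sorted_terms, prefix):
    """Return every term in sorted_terms that starts with prefix.

    Uses binary search for both ends of the matching range, so the cost is
    logarithmic in the vocabulary size plus the number of matches.
    """
    lo = bisect.bisect_left(sorted_terms, prefix)
    # "\uffff" sorts after any character likely to appear in a term.
    hi = bisect.bisect_left(sorted_terms, prefix + "\uffff")
    return sorted_terms[lo:hi]


# Hypothetical index vocabulary.
terms = sorted(["black", "blue", "bluebird", "blues", "bluestone", "blunt"])
matches = prefix_matches(terms, "blue")
print(matches)
```

Each matching term then has to be expanded into its own posting-list lookup, which is a plausible reason for an engine to skip the feature for performance, as guessed above.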
Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
I'll just say this:
:-)
;-)
Google manages to get a graph of the slashdot effect among the first 20 hits, while AllTheWeb just manages to get Cliff showing a Think Unix book (in weirdo hawaiian clothes).
I don't know about you, but Google gives me more relevant matches as usual.
Beware: In C++, your friends can see your privates!
Until all the facts are in, I wouldn't suggest telling anyone you use the *engine* with the *largest* index. Just buy an SUV, it's more impressive and kills people!
Creationists are a lot like zombies. Slow, but powerful and numerous. And they all want to eat our brains.
Search for "search engine" in both:
AllTheWeb:
First Hit: Search Engine Watch (www.searchenginewatch.com)
First Search Engine: Search Engine Colossus (3) (www.searchenginecolossus.com)
Hits Google at 22
Hits AllTheWeb at >50
Google:
First Hit: Google
First Search Engine: Google (1)
Hits Google at 1
Hits AllTheWeb at 16
> It's highly unlikely that I'm interested in your 1%.
You almost certainly are interested in a fair proportion of the OP's 1% - you both read slashdot...
Not everything that can be measured matters; Not everything that matters can be measured.
I just tried to search for "simplicity twinkle twinkle" and alltheweb didn't even return anything useful. Google in fact returned LOTS of useful links, including a link to an old friend of mine who went to VT and then later on dropped out due to too much partying and girls :-)
Go google!
Worse than the outdated and useless search results is the way they are presented - there is no grouping by site to put similar pages under one entry. Of the 167 results, almost all of them are from two distinct sites, but you have to wade through all of them to find any different ones. With a more common search string, it will be almost impossible to find what you are looking for, and it is still difficult with a narrow focus search. Google ain't going down that easy...
...the object of claimed affection really IS as good as everyone says.
And, Google forbid, should google start to suck, or something else start to be better, then I think most of us would find another search engine to "worship", like I (and I assume many others) did when Yahoo went down the toilet.
For me, the one mention of pop-ups and heavy graphic ads is more than enough to make it not worth my while to check out (and yes, I know, at home, I can filter out all the banner ad and pop-up garbage, but here at work I don't have the luxury of arbitrarily installing proxies and browsers to do that sort of thing. besides, web sites that use pop-ups piss me off).
Good to see that everybody is agreeing on Google. Having a big collection means nothing, but having a quality collection is what everybody wants. Google combines similar pages (but still gives you an option if you are bored enough and want to visit all those pages). Searches are more directed and useful, at least in my case, as compared to alltheweb. Go Google!!
AllTheWeb has had a larger index for a long time now. They started small and just kept indexing and collecting pages, and they don't throw away search results based on words like "the" and other common thingies. I used to use them back in the day, and keep going back every so often. Their FTP search seems to have degenerated into a few public FTPs so I kinda stopped going back. See, to me google is good for certain kinds of searches like "youll-never-find-what-you-want-in-all-the-fake-results", and AllTheWeb is good for things like "youll-find-more-links-that-arent-content-or-media-censored". I'm not saying Google limits your searching... But have you ever wondered what pages it doesn't let you see?
..is bigger than your index.
Computer scientists - pfft...
Unable to read configuration file '/bigassraid/htdig//conf/14229.conf'
Geocrawler error message.
AllTheWeb doesn't do a very good job of filtering pron type material from an image search... try the following (but don't do this at work!):
Go to Google's image search and type "fist" (no quotes) as your search item, and you see a nice list of images that have a fist in them. Now try the same on AllTheWeb, and you get a nice little surprise on the first page of image results. Note that each has filtering on as default.
An optimist believes we live in the best world possible; a pessimist fears this is true.
Today the New York Times claimed that it had published "All the News That's Fit to Print."
One question remains unanswered: Will they be able to do it again tomorrow?
Note to moderators: This is sarcasm. It isn't off-topic. I'm implying that some marketing ploy by alltheweb.com isn't exactly newsworthy. Thank you.
My Karma was at 49, then they switched to words. All that work for nothing!
If you are looking for something really specific (eg. the DNS entry of your machine to see which webpages you look at publish log files), then alltheweb in my experience will find a number of pages which google misses.
For general searching google still rocks.
It's not how big it is, it's how you use it.
Google is still way more useful in my opinion.
$45 per U Colocation Special
It seems that alltheweb includes multiple listings of the same directory on my site. I run a php enabled webserver so it tries a bunch of directories with ?M=D or ?A=S and then proceeds to list them. Click here for an example of this. It also lists every single sub directory in all 5 or 6 subdomains that I host. Google is much better at not going into sub directories and just giving the main site. Google is by far a better choice when looking for main pages and not having to filter through all the /whatever/ directories.
Alltheweb's claims are not unfounded, and I find it always worth checking when google fails.
Here is one of several real life cases where it found software for me that google didn't.
(It still does, and google still doesn't.)
Timeo idiotikOS et dona ferentes
Google Pr0n Search finds 46,200 results.
Searching for pr0n via alltheweb.com leads to 2318 more potential pieces of pr0n to be seen.
For obscure info, no single search engine is enough. A search engine summarizer like copernic is a good idea. Dogpile is pretty good, too.
The Uncoveror: It's the real news.
Google by contrast tells me:
In order to show you the most relevant results, we have omitted some entries very similar to the XX already displayed
Which I actually find quite useful insofar as there's less repetitive crap to wade thru to find the result(s) you were looking for...
Having said that, I do agree that competition is a good thing, much as I swear by Google right now...
The reason I'm for Google has little to do with technology. It has everything to do with advertisements and capitalism.
;-).
I'd rather support a company that uses subtle advertisements like Google does than a company that uses in your face banner ads, etc. (Then again I'm posting on Slashdot!) Also I make a point to check out the ads every now and then on Google and visit the company's site. I may be getting hosting from an advertiser on Google soon.
If people who advertise on Google make more money than they do with banner ads, pop-ups, etc. then we'll see the idea spread. I don't like in-my-face ads, so I do what I can to tell companies that. It's called being a responsible consumer.
Plus more valid hits come up when I search for myself on Google
Index Envy, sheesh!
Comes up with more hits for my name than google.
I can't believe how many people have my "Subtle mind control? why do all the HTML buttons say 'submit'" quote on their sites.
autopr0n is like, down and stuff.
Of course, as has been mentioned a few times above, competition is a Good Thing (TM).
- Ardenstone
I don't think there's anything wrong with opening a new window when you click an ad. It's not the same thing as a popup, and most of the time it's the choice of the website admin, not the advertiser.
Also, lots of people prefer opening new sites in new windows. Myself included.
autopr0n is like, down and stuff.
Norwegian search engine AlltheWeb on Monday declared that it indexes more Internet information than longtime pacesetter Google.
Then how come the word with the most search results on Google (FYI: the) returns fewer results on alltheweb?
wow, that's the first "'insert name' is dying!!!" post I have ever seen that is legitimate. interesting.
Google always seems to give me what I want, faster than anything else. Either this is because of its search algorithms, or that it has only the indexes linked... example: I search for engsoc (looking for Canadian University Engineering Societies) and I find all the "main" entry pages with google, and I find a littering of "inside" pages with obscure titles with this new one. I'll stick with google-- and my chances of using the "i feel lucky" button are high, since the first or second link.
'I've got it Herbert! Let's make some inflammatory claim about Google that has nothing to do with the actual quality of either site's results and sit back and watch the hits roll in!'
yeah, so it's an obvious troll, but i guarantee you it's true.
I searched both Google and AllTheWeb for the name of my company. (For privacy reasons, I'm not going to tell you the name.) We are a small company, and probably few pages on the web link to our site, but Google pulled up our home page as its first search result. AllTheWeb failed to list it in its first page of links.
It's not hard to find our site, either. Our company's name is "foo bars"* and our URL is "foobars.com." Google nailed it, while AllTheWeb bombed.
Doing a more complex search with lots of words from our home page did, finally, get AllTheWeb to cough up our site. So I know it's in there.
So in my opinion it has little to do with how big their index is. It has to do with how good they are at finding what I'm looking for. For me, Google almost always finds what I'm looking for. I've even started using the "I Feel Lucky" button to skip the search results altogether and just take me straight to the first listed site.
*Incidentally, I've always wanted to open a pub called the Foo Bar, but I don't think many people would get it.
I'd generalise that to:
The tendency to blindly worship things... is far too common.
deus does not exist but if he does
This Mozilla preference seems to eliminate pop-up advertisement windows without affecting the creation of new windows by user demand:
Edit -> Preferences
Advanced -> Scripts & Windows
Uncheck "Open unrequested windows."
but how you use it. At least that's what my wife keeps telling me.
People who bite the hand that feeds them usually lick the boot that kicks them
This did return more results for some search terms than google. Not many of the extras seemed all that useful, though. The signal to noise ratio seems a bit lower.
The ordering of pages seems less helpful. In many cases, the page I'm looking for is farther down the page.
The sponsored links and advertising are way more noticeable, and get in the way of the search results, although they're probably easy enough to ignore.
Google seems to be better at rating by search term proximity, under the useful assumption that if the search terms occur close to each other, it is less likely to be a random hit. One irritation with AllTheWeb is that for many results, it doesn't show you the context of the search terms in the summary.
Obviously AllTheWeb lacks the excellent USENET archive. The video and MP3 search features might be pretty useful; I haven't had a chance to try them.
I realize I'm coming across as entirely pro-Google, but these are the only observations I have right now. I'll give AllTheWeb a chance, and let internet darwinism settle the issue.
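The proximity heuristic mentioned above (terms that occur close together are less likely to be a random hit) can be sketched as "find the shortest window of words that contains every search term". This is just one simple way to score proximity, not a description of what Google or AllTheWeb actually do, and the sample texts are made up.

```python
def smallest_span(text, terms):
    """Return the length in words of the shortest window of text that
    contains every term at least once, or None if some term is missing.

    Matching is case-insensitive on whitespace-separated words.
    """
    words = text.lower().split()
    wanted = {t.lower() for t in terms}
    # Only positions holding a wanted term matter for the window.
    positions = [(i, w) for i, w in enumerate(words) if w in wanted]

    best = None
    counts = {}  # wanted terms currently inside the window
    left = 0
    for pos, word in positions:
        counts[word] = counts.get(word, 0) + 1
        # Shrink from the left while the window still covers every term.
        while len(counts) == len(wanted):
            span = pos - positions[left][0] + 1
            if best is None or span < best:
                best = span
            left_word = positions[left][1]
            counts[left_word] -= 1
            if counts[left_word] == 0:
                del counts[left_word]
            left += 1
    return best


close = "a php regular expression tutorial for beginners"
far = "php is a language . many words later here is an unrelated tutorial"
print(smallest_span(close, ["php", "tutorial"]))
print(smallest_span(far, ["php", "tutorial"]))
```

A ranker could then prefer pages with a smaller span, and the same window is a natural choice for the context snippet shown in the result summary.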
Do a search for slashdot GOOGLE = 2,250,000 AllTheWeb = 1,649,088 What's up with that ?
Your sig here!
telnet www.kaosinc.com 80
Trying 192.203.175.245...
Connected to www.kaosinc.com.
Escape character is '^]'.
GET
HTTP/1.1 302 Found
Date: Mon, 17 Jun 2002 16:51:47 GMT
Server: Apache/1.3.23 (Unix) Debian GNU/Linux PHP/4.1.2 ApacheJServ/1.1.2
Location: http://www.kaosinc.com/index.shtml
Connection: close
Content-Type: text/html; charset=iso-8859-1
302 Found
Found
The document has moved here.
Connection closed by foreign host.
"A mind is a terrible thing to taste."
I use Google; it finds anything I need anytime. Why would I change?
Google News
rm -rf / is the evil of all root
Redirection limit for this URL exceeded. Unable to load the requested page.
That is a Mozilla error message (source) and does not come from alltheweb. Your web server is broken. http://www.kaosinc.com/jen.shtml redirects to http://www.kaosinc.com/index.shtml, which then redirects to itself. This happens regardless of where I find the link to http://www.kaosinc.com/jen.shtml, or what browser I use to load it. IE appears to just sit there, Opera bounces between various stages of trying to connect, and Netscape 4 gives up after a few redirects and displays a raw 302-found page ("The document has moved _here_") without redirecting.
Moving the mouse over the link doesn't reveal the address in the bottom bar, either, so the only way I can think of to obtain the address of the item it matches is by right-clicking and selecting 'copy link address', opening a new window and pasting it in (and having a browser that is capable of doing this), then editing the URL so only the target link text remains.
An easier way to see the URL of the link is to hold the mouse down over the link, and then move off of the link before you lift the mouse button. But I still get the infinite-redirect error message if I type your URL directly.
You can't even right-click and open in a new window to do this. If you try, you get "about:blank" which, afaik, means they're using javascript.
If I right-click on a link from the alltheweb search results and select "open link in new window", I see http://www.alltheweb.com/go/1/H/web/http/www.kaosinc.com/jen.shtml in the location bar and get the same error message. What version of Mozilla are you using?
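The failure described here is easy to reproduce without touching the network. Below is a small sketch that follows a chain of Location headers held in a dictionary and reports either a loop or an exceeded hop limit, which is roughly what a browser's "redirection limit exceeded" check amounts to. The limit of 20 is an arbitrary value picked for the example, not Mozilla's actual setting, and the redirect map simply restates the behaviour reported above.

```python
def follow_redirects(start, redirects, limit=20):
    """Follow redirects until a URL with no redirect is reached.

    redirects maps a URL to the Location it redirects to.
    Returns (url, status) where status is "ok", "loop" or "limit exceeded".
    """
    seen = set()
    url = start
    hops = 0
    while url in redirects:
        if url in seen:
            return url, "loop"
        if hops >= limit:
            return url, "limit exceeded"
        seen.add(url)
        url = redirects[url]
        hops += 1
    return url, "ok"


# Redirect behaviour as reported in this thread.
redirects = {
    "http://www.kaosinc.com/jen.shtml": "http://www.kaosinc.com/index.shtml",
    "http://www.kaosinc.com/index.shtml": "http://www.kaosinc.com/index.shtml",
}

result = follow_redirects("http://www.kaosinc.com/jen.shtml", redirects)
print(result)
```

Tracking visited URLs catches the self-redirect after two hops; a pure hop counter, which is closer to what the browsers in this thread seem to do, only gives up once the limit is hit.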
The shareholder is always right.
Who cares if you index more? Google rules. They have the best results. GM sells more cars than Porsche, but there isn't a GM that I would take over any Porsche any day...
Being a Norwegian company, would they be under the same mandate to hand over all 'suspect' search queries for abuse by the US's new CIAFBINSASSSASD (known in PRSpeak as the Information Awareness Office)?
What does VI do with all computer storage?
> So slashdot is a fair part of 1% of the whole Internet? Sure.
Clearly not. And I don't believe I said that. I wouldn't have thought anyone would be dumb enough to need it spelt out for them. I apologise profusely for overestimating your intelligence. (On the other hand, since that comment was posted as AC, I have to assume you can't spell your own name).
Pick any two regular slashdot readers at random. I guarantee they will have more in common than just reading /. - since there has to be something that motivates them to read /. There's probably a common interest in technology, or the net as an information distribution channel, etc. etc. Thus, the two readers' "1%" of the web will overlap - at places other than /.
Not everything that can be measured matters; Not everything that matters can be measured.
Years ago, I used to use ftpsearch to find warez left in public incoming dirs by warez couriers.
Glad to see it's back, after a sojourn as a non working component of Lycos.
What we call folk wisdom is often no more than a kind of expedient stupidity.-Edward Abbey
The reason why AllTheWeb will surpass google is that it has a much catchier name.
As a bonus, alltheweb (when properly separated with spaces) is proper English.
is that it allows me to download its search result pages from within a PHP script. I wrote it to get a concordance (list of real usage cases) for any phrase from within Emacs, and it works flawlessly with alltheweb (and greatly facilitates writing in English for a non-native speaker like me). I tried to hook up to Google in the same way, but it refused to share its wisdom with my script, apparently because the user_agent field was not that of a "real" browser. I find this rather stupid - what are they afraid of, robots stealing their bandwidth? Any robot can easily fake its user_agent, I was just a bit too lazy for this hack. And besides, alltheweb works just fine for me.
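For anyone curious what "faking the user_agent" looks like in practice, here is a minimal sketch in Python rather than PHP. It only builds the request and never sends it. The host name, the "q" parameter and the User-Agent string are all placeholders I made up; they are not the real query interface of alltheweb or Google.

```python
import urllib.request
from urllib.parse import urlencode


def build_search_request(base_url, query, user_agent):
    """Build (but do not send) a GET request for a search results page,
    with an explicit User-Agent header instead of the library default."""
    url = base_url + "?" + urlencode({"q": query})
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical endpoint and browser-like User-Agent string.
req = build_search_request(
    "http://search.example/search",
    "worth checking",
    "Mozilla/5.0 (compatible; concordance-script)",
)
print(req.full_url)
print(req.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would actually fetch the page. Whether a given engine permits scripted access is a separate question from whether the header check can be bypassed.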
Being a Norwegian company, would they be under the same mandate to hand over all 'suspect' search queries for abuse by the US's new CIAFBINSASSSASD (known in PRSpeak as the Information Awareness Office)?
I would hope not, but perhaps there might be a profit angle involved.
More seriously, do you have any knowledge that this "mandate" exists? Is it public law? Executive order? Secret executive order? A directive from "high levels"? Or is this more of an "intelligent concern" of yours? There's nothing wrong with that - everyone with half a brain should be concerned about these possibilities.
A dingo ate my sig...
This is a very unusual site, with crude animated gif graphics and a bizarre interface.
But, you can ask it ANY QUESTION in ENGLISH and get an answer (by email) in a few minutes.
I once asked it how it works, and it told me it was a giant parallel neural network consisting of 100s of Linux boxes distributed around the country.
It has some weird training mode where you have to answer a question it asks. I don't know what that's all about either, but I comply.
--
Ask the Ya-Hoot Oracle Anything!
come on, we all know that *cough* it isn't the size that matters, but how well you use it to um, achieve results.
:)
When I walk into a library I normally don't want ALL the books on system design, just the best ones!
FYI-
I tried doing an image search for images of "servers" or "server". All was cool until about 100 images in, then it was pretty much all gay porn.
So much for alltheweb...
Jake
Let's look at the numbers shall we?
Fnord: Google: 104000 AllTheWeb: 46439
Cheese: Google: 3690000 AllTheWeb: 7718252
Linux: Google: 48000000 AllTheWeb: 26670311
Windows: Google: 44600000 AllTheWeb: 66545303
Extropian: Google: 4460 AllTheWeb: 3999
Kumquat: Google: 32600 AllTheWeb: 42889
Question Authority and the authorities will question you.: Google: 90 AllTheWeb: 74
Hot man meat: Google: 229 AllTheWeb: 1661
Hot pussy: Google: 104000 AllTheWeb: 770057
"undefined reference to" error: Google: 31700 AllTheWeb: 8548
"Antimatter-Catalyzed MicroFission / Fusion": Google: 6 AllTheWeb: 1
Surprisingly, alltheweb does return more hits in some areas, most notably for cheese, windows, and pr0n. With the cheese test, AllTheWeb helpfully cluttered my screen with a banner for food products. Google, thankfully, is still bannerless, and returns more linux hits, fnords, and Voltaire quotes. Alltheweb also stalled several times and I had to resubmit a search. Conclusion: If you're a linux geek or you want to know about fnords, futuristic philosophies, compilation errors, or advanced space propulsion concepts, google is better. If you're a horny windows user and want to find gay or straight pr0n, and if you for some reason like kumquats and want to learn more about cheese, use alltheweb.
Seriously, I'll probably stick with google, better numbers or no. The only thing AllTheWeb has going for it is the ftp search. The original is owned by Lycos now and broken.
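If you want to rerun this kind of comparison, a small Python sketch like the one below will do the tallying. It uses a handful of the hit counts quoted above, in (Google, AllTheWeb) order; the `compare` helper is just an illustrative name, and raw hit counts say nothing about how relevant the results are.

```python
# A few of the hit counts quoted above, as (Google, AllTheWeb).
counts = {
    "Fnord": (104000, 46439),
    "Cheese": (3690000, 7718252),
    "Linux": (48000000, 26670311),
    "Windows": (44600000, 66545303),
    "Extropian": (4460, 3999),
    "Kumquat": (32600, 42889),
}


def compare(counts):
    """Return (term, engine with more hits, ratio of larger to smaller count)."""
    rows = []
    for term, (google, atw) in counts.items():
        winner = "Google" if google > atw else "AllTheWeb"
        ratio = max(google, atw) / min(google, atw)
        rows.append((term, winner, ratio))
    return rows


if __name__ == "__main__":
    for term, winner, ratio in compare(counts):
        print(f"{term}: {winner} reports {ratio:.1f}x as many hits")
```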
...is how often they include cute, relevant, wacky different variations on their logo.
Right?
I just happened to search for "Combined Yang-Wu" at Google yesterday (which returned two documents) and I tried it at AllTheWeb, which didn't find it.
Dilute! Dilute! OK!
Hey, check out one of the new Google beta programs, answers.google.com. Even you can now earn fame and fortune, and yes, even internet cash, by searching google's archives for answers to people's questions...
Could be interesting, since AllTheWeb is based in Norway, the same country where Operation Clambake is. They might say "DMCA, what?"
But then, they might not, since the index itself is probably in the US, and besides, our Big Sister Sunde thinks DMCA is Norwegian law anyway, so she'll be banging on the doors once she gets $cientology on the phone.
Employee of Inrupt, Project Release Manager and Community Manager for Solid
It seems the general consensus here is size doesn't matter. Imagine that coming from a bunch of geeks. . .
I know more than you drink.
"Never mind the quality, just feel the width."
-- js.
here.
- AllTheWeb
- Google
- Teoma
- WiseNut
- AltaVista
If you click on the above links, you will find that all of the search engines except AllTheWeb give you the correct answer (10) in the first few hits. Actually, the answer appears in the hit abstracts, so you don't even have to fetch the hits, unless you want the fascinating background info.
I'll stick with google, those 3 extra characters are too much for me. Seriously though, alltheweb seems pretty good, I tried their mp3 search and it was ok, not really comparable to the myriad p2p clients, but still a nice feature.
All I've read suggests that the IAO effort is geared toward building a massive data infrastructure that will allow fast access to all manner of information related to a specific target (which could be anyone that fits into a specific profile, for one reason or another). So, it's not a mandate per se, but the mere fact that our browsing habits, including search queries, could be part of it is, and should be, unsettling for every American citizen. The problem is that the government will have more and more access to information with less and less control or accountability.
Google and Alltheweb both found 6 results for my sister's web pages and for Rockapella (the Carmen Sandiego group) entries on the net. If they both have all of these, not only have they managed to index large portions of the internet, but they manage to index the most pointless stuff too.
AllTheWeb has summed all their formats to get 2.1 billion.
If you add Google's 700mil USENET articles, 300mil images, etc, Google has >3,000,000,000 documents to search. That kills ATW.
-twb
To me, Alltheweb really does have a superior sense of style; it's more aesthetically pleasing. And even with the banners it's still really fast.
I'm probably going to start using both regularly.
autopr0n is like, down and stuff.
Mod parent up!
One simple rule for its versus it's
The first test I do is typing my name,
it doesn't find ancient UseNet posts,
but it does find my latest OSS projects.
(Whereas Google does NOT ! )
Thumbs up for them.
My other sig is Funny.
Stated perfectly! I tried AllTheWeb and didn't like the results reporting. Google is far superior in reporting the results you want in the fashion you want (and need!).
While I mostly use Google nowadays, I sometimes find their query syntax restricting.
1. No complex boolean expressions
2. No proximity operators
3. No wildcards or stemming
It would be possible to work around these restrictions, but there I stumble into the big one:
4. Queries are limited to 10 words or less.
These restrictions are even more problematic when I search Google's Usenet archives because often only a couple of posts will contain the information that I need.
I miss the more powerful query syntax of DejaNews. While Google is nominally faster, that speed is irrelevant when I have to reformulate my query 4 times to zoom in on a specific post.
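One clumsy workaround for the 10-word limit is to break a long query into several shorter ones and merge the results by hand. A minimal Python sketch of the splitting step follows. It assumes words are counted by splitting on whitespace, which may not match how Google actually counts them, and `split_query` is a made-up helper name.

```python
def split_query(query, limit=10):
    """Split a query into chunks of at most `limit` whitespace-separated words."""
    words = query.split()
    return [
        " ".join(words[i:i + limit])
        for i in range(0, len(words), limit)
    ]


if __name__ == "__main__":
    long_query = "one two three four five six seven eight nine ten eleven twelve"
    for chunk in split_query(long_query):
        print(chunk)
```

This does nothing for the missing boolean, proximity, and wildcard operators; it only keeps each query under the word limit.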