Why AltaVista Lost Ground To Google Sooner Than Expected
techtsp writes: Marcia J. Bates, UCLA Professor Emerita of Information Studies recently explained why Google's birth led to the downfall of AltaVista. According to Bates, early search engines including AltaVista adapted the classical IR methods. At the other hand, Google founders started off with a completely different approach in mind. Google successfully recognized the potential of URLs, which could be added to the algorithms for the sake of information indexing altogether. Google's modern age techniques were a huge boost to those older techniques. Whatever other business and company management issues AltaVista faced, it was the last of the old style information retrieval engines.
https://www.quora.com/Why-did-Altavista-search-engine-lose-ground-so-quickly-to-Google/answer/Marcia-J-Bates
Seriously? No one ever says this. Ever. It is, and always has been, "on the other hand".
The article says "URLs" when the Quora post, cited as the source, says LINKS. Also the article is basically devoid of any information, other than "Google did better because it used LINKS to help determine ranking." Thanks for the headline, with a summary, linking to an article that misquotes the linked source, that has a headline's worth of information. No really, thanks.
Often wrong but never in doubt.
I am Jack9.
Everyone knows me.
What is this non-article doing on the front page? There is absolutely no useful content or detail; the summary is even better than the complete article.
Submitter "techtsp" seems to just spam links to this pc-tablet low-quality site, guess one randomly passed the filter.
I found the summary of this article very confusing. Phrases such as "At the other hand" and "indexing altogether"?? Oh, and call me ignorant for not understanding what "IR" means. Infrared? Then I read the article and found that the summary is just a badly strung together quotation of the text, including all of the grammatical errors. I'm still confused, but slightly less so.
Literally. That's what the article says if you click through the summary and rewrite to actually read it. To quote: "What the Google founders recognized about search on the Web was that information about LINKS could be added to the algorithms." Which isn't wrong, of course, but if you call yourself a nerd you already know a hell of a lot more about the page ranking algorithm than this already.
Altavista was popular for a small web. Once it got big we needed a better tool.
Oddly enough, we tech guys, who seem to be in charge of state-of-the-art technology, are very reluctant to change. So Altavista didn't change fast enough for the newer, larger web.
When Google came out, it was designed for the larger web. What made it stay was that they weren't afraid to give us die-hard techies the middle finger and make a lot of upgrades; however, they did it in a way that most of us didn't notice much.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
... any useful to begin with? Before Google, I think I was using Yahoo.
Providing links to search results was obviously far more useful to web users than Infrared.
Duh.
That was always a huge differentiating factor in favor of Google. People trust a clear, ad-free, lean page that provides lightning-fast results. It serves as a good Home page too!
Note that Bing, the only serious competitor, has been doing the same, quite in contrast to the visual assault that the failed MSN front page used to offer (or still offers).
Whenever someone says Information Retrieval I think about that agency in Brazil.
Some drink at the fountain of knowledge. Others just gargle.
Then you are doing it wrong. I find everything that is not totally obscure like some error cases in a library almost nobody uses.
First paragraph at Wikipedia: "AltaVista was an early web search engine founded in 1995. It was once one of the most popular search engines, but it lost ground to Google and was purchased by Yahoo! in 2003, which retained the brand but based all AltaVista searches on its own search engine. On July 8, 2013, the service was shut down by Yahoo! and since then, the domain redirects to Yahoo!'s own search site.[2]"
Second and third lines of TFA: "Founded in 1995, AltaVista was a very popular Internet search engine website. Nevertheless, AltaVista lost ground to Google and was purchased by Yahoo! in 2003. Ten years later, Yahoo! officially shut down AltaVista in July 2013 and redirected the domain name to its own search engine website."
Hmm...
Firehose moderation picked this article? Editors allowed it in? or did Dice just take a big payoff?
Some drink at the fountain of knowledge. Others just gargle.
I don't specifically recall using Alta Vista, but I do remember how terrible all of the search engines were before Google came along. They didn't return the most relevant results, they returned the web sites that paid them to be placed higher; Google was the first one to actually do what the user wanted from a search engine - return relevant results.
Instead of ranking relevancy by hits of a word inside the document, which was how it was done before, google ranked relevancy by references to the content.
Note that most in-house wikis still rank things the old way, which is why most search results from your internal wiki suck. Even google's custom search on your internet page sucks...because without humans performing relevancy ranking via links google is just as bad as the old stuff.
By the time Altavista got popular, the interface was a cluttered mess where you could hardly find the search line. Google came with an almost empty screen with a logo and a search line. You'd have switched just to save your eyes. More like the good old Webcrawler interface.
Every sentence is a paragraph, and most aren't even complete sentences. It's like a random collection of thoughts taken from other sources or something.
i don't pretend to understand the system, but if this made it through, slashdot is clearly broken.
it does not seem possible that rubbish like TFA could make it through the review process.
am i therefore correct to assume that slashdot, being owned, now posts what it wants, when it wants?
irrespective of the algorithm that supposedly IS slashdot?
please tell me i'm wrong..
i usually am.
The article says "URLs" when the Quora post, cited as the source, says LINKS. Also the article is basically devoid of any information, other than "Google did better because it used LINKS to help determine ranking." Thanks for the headline, with a summary, linking to an article that misquotes the linked source, that has a headline's worth of information. No really, thanks.
It's a paid-for "article" on an ad-infested link-farm.
Here's a link to the ACTUAL story: https://www.quora.com/Why-did-...
If you want news from today, you have to come back tomorrow.
Altavista had better results than Google for years, especially because you could use all sorts of search modifiers that Google didn't support till later like -no_pages_with_this_word or +must +have +all +these and logical operators.
But then as the leaders they got cocky and wanted to be a portal and filled up the page with so much crap and spam it hurt. Meanwhile Google's page was still just search box, go, I'm feeling lucky, and a few other tiny things.
That's why I switched after Google got good enough that they were comparable, NOT better. It was just less annoying. That's why most of the people I knew back then switched.
AltaVista realized too late what they'd done and tried to rebrand as 'Raging' with just a simple search page, but by then it was too late.
I'm sure the Google approach is much more scaleable but the article seems terribly confused and like it's trying to make some bizarre sense out of a cultural artifact from a time they can't comprehend.
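The `+word` and `-word` modifiers mentioned in this comment can be sketched roughly as follows. This is only an illustration, not AltaVista's actual query parser; the function name `matches` and the sample pages are made up for the example.

```python
def matches(query: str, text: str) -> bool:
    """Return True if text satisfies the +required and -excluded terms in query.

    Terms without a + or - prefix are treated as optional and ignored here.
    """
    words = set(text.lower().split())
    for term in query.lower().split():
        if term.startswith("+") and term[1:] not in words:
            return False  # a required word is missing
        if term.startswith("-") and term[1:] in words:
            return False  # an excluded word is present
    return True


# Hypothetical page texts, keyed by a short name.
pages = {
    "a": "alpha servers index the web",
    "b": "alpha servers for sale cheap",
}
hits = [name for name, text in pages.items()
        if matches("+alpha +servers -sale", text)]
print(hits)
```

Page "b" is dropped only because it contains the excluded word; both pages contain the two required words.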
...a coworker pointed out that this new search engine "Google" was much better for finding academic papers. At that time, Google was excellent for academic papers, but useless for most other things.
My how times have changed. Not that I can obtain academic papers without paying through the ... nose ... anyway.
If you want news from today, you have to come back tomorrow.
not google that cause AltaVista to fail.
by TheSpoom (715771) Uncaring Linux user here. I have nothing to add to this but please continue. *munches popcorn*
A problem was that these search engines, unlike Google, did a kind of "grep" for the word through the whole data set, which yielded a bigger number of results. Searching for "book" gave results like "bookmark", "bookmaker", "bookkeeper", etc., while Google returned results about books and derivatives.
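The difference described here can be shown with a small sketch, assuming a handful of made-up page titles. The regular expression is just one simple way to match "book" and its plural as a whole word; it is not a claim about how either engine worked internally.

```python
import re

# Hypothetical page titles.
titles = [
    "bookmark manager",
    "rare book shop",
    "bookkeeper wanted",
    "books on physics",
]

# grep-style: any title containing the characters "book" anywhere.
substring_hits = [t for t in titles if "book" in t]

# Word-aware: "book" or the plain plural "books" as a whole word.
word_hits = [t for t in titles if re.search(r"\bbooks?\b", t)]

print(substring_hits)
print(word_hits)
```

The substring search returns all four titles, while the word-aware search keeps only the two that are actually about books.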
Slashdot, fix the reply notifications... You won't get away with it...
Today search is ALMOST ENTIRELY SHIT. It is used because shit is king.
If you think that, you don't remember Alta Vista, which had millions of links to "Page not Found" and in the search results had multiple listings to the same (often broken) page.
"First they came for the slanderers and i said nothing."
There's a thing called the science citation index that sorts papers that are referenced more to a higher score than those that are not referenced much, and it's a good way to find those papers on a topic that others have found most useful.
Google saw it worked and applied a similar method using links (as the above poster wrote). That method brought human judgment that had already been applied into the mix and enabled them to index far more rapidly than AltaVista with better results than AltaVista's simple keyword searches. It was more likely to lead people to a key site that many used instead of an abandoned fan site.
That's the main difference.
Inertia? AltaVista, Hotbot, and Excite had the inertia. They were the big players when a couple of college students thought up the idea that became Google. AltaVista and the other established players had the inertia.
The established search engines also had algorithms based on word frequency in various parts of the page. I did search engine optimisation back then, so I studied it in detail. The simplified explanation is that searching for "Einstein" would return whichever page had the word Einstein repeated the most on the page. Minus points for repeating it "too many" times.
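A toy version of that simplified explanation might look like this. The density threshold and the penalty factor are invented for illustration; the real engines' weights were never published in this form.

```python
def keyword_score(text: str, term: str, max_density: float = 0.2) -> float:
    """Score a page by how often term appears, with a crude stuffing penalty.

    max_density and the 0.1 penalty factor are hypothetical values.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    count = words.count(term.lower())
    density = count / len(words)
    if density > max_density:
        # "Minus points" for repeating the term too many times.
        return count * 0.1
    return float(count)


# Hypothetical pages.
pages = {
    "bio": "einstein was a physicist and einstein published papers on relativity in 1905",
    "spam": "einstein einstein einstein einstein einstein buy now",
    "other": "newton studied gravity",
}
ranked = sorted(pages, key=lambda p: keyword_score(pages[p], "einstein"),
                reverse=True)
print(ranked)
```

With these numbers the stuffed page still ranks above a page that never mentions the term, but below the page that uses it naturally.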
Google had a revolutionary idea. If lots of good pages link to abouteinstein.com, it's probably a good page. That's Page Rank, and it worked quite well. That's far and away the most important reason Google won: their ranking system was far superior because it was based on a different, better idea.
* You might wonder how Google knows which pages are "good", in order to calculate which pages are linked to by good pages, and are therefore also good. It's recursive across the whole internet. If lots of pages link to princeton.edu/physics/, and princeton.edu/physics/ links to lab.gov/particles/, then lab.gov/particles/ gains some "good" points. Specifically, it gains an equal share of the Princeton page's rank value, the same as every other link on Princeton's page. In other words, whatever value a page has, that value is divided equally among the pages it links to. So a page "vouches" for each page it links to, but if it links to many pages, it can only pass a small amount of credibility to each.
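The recursive "equal share" idea can be sketched as a simple iteration. This is a minimal illustration, not Google's production code: the page names are placeholders, the damping factor of 0.85 comes from the published PageRank description rather than from this comment, and spreading the rank of pages with no outgoing links evenly over all pages is one common convention that I am assuming here.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Iteratively compute a PageRank-style score for every page in links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page divides its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no links spreads its rank everywhere.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: three pages link to "princeton",
# which links to "lab" and "library"; nobody links to "lonely".
links = {
    "fan1": ["princeton"],
    "fan2": ["princeton"],
    "fan3": ["princeton"],
    "princeton": ["lab", "library"],
    "lab": [],
    "library": [],
    "lonely": [],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: {ranks[page]:.3f}")
```

In this graph "lab" and "library" each inherit half of what "princeton" passes on, so they end up below "princeton" but above pages that nobody links to.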
Learn your lesson. Avoid posting anything sourced from websites such as these. There are lots of them, the buzzfeed of IT-related news, mostly done by Indians, and all they do is copy from other websites and even forum posts.
When Google first came on the scene, most people accessed the internet by dialup. Google's simple page loaded faster. That's it. There is no other reason. Anyone who doesn't remember the "dialup" internet cannot comment on why one page was more popular than another.
Around '92/93 I was an Alta Vista user. Then they decided that if you shoveled money their way they would put your search results at the top of the list. I, and evidently a couple others, said "fark that" and went looking for alternatives. Google was the alternative that gave the best search results.
Fark Alta Vista, I'm glad you're dead and buried.
Altavista was Boston based. Infoseek was Seattle based. Yahoo and Google were silicon valley/Stanford based where all the incestuous famous money networking and bribe-able journalists are. Google search sucked for porn and wares, though they tried hard to be the best at it pre-IPO. Their second page of results was always garbage (i.e. their query system was broken) till long after their IPO.
After DEC was destroyed by Intel/Compaq's anticompetitive offer they couldn't refuse to kill Alpha, Altavista got axed and sold off through a couple of exchanges to Yahoo (while wetting the right people's beaks) to kill it off as well. These were all moves by the same cabal of Sand Hill scum to remove competition in their investments.
Altavista started to think it was a news site and not a search site. It was mistaken. A blank page + text box killed them. Serves them right.
Alta Vista decided to go the portal route, with a bunch of crap on the search page. Google came out with a simple look, with only the keyword field.
https://web.archive.org/web/19981202230410/http://www.google.com/
vs.
https://web.archive.org/web/19990125093146/http://www.altavista.com/
"Even for Slashdot, that was a very obscure reference!" - Anonymous Coward
Altavista tried to monetize their search by biasing results based on ad revenue; Google didn't (at first). It turns out, people aren't interested in a biased product, even if it's free.
I used search only to fix computer problems. I was the go-to guy for work and home: WAN, LAN, mixed topology, OS, you name it. Whatever gave me those answers, I used. People saw what we used and followed. That's the way I see it.
From my recollection it was because it did away the mess of the portal concept, did away with intrusive ads and focused on search. It was simple and effective. Everything else was a marketer's wet dream, but a mess for anyone else.
I am sure people who used the net back then can confirm that it was the simplicity and elegance of Google that gave it the advantage. I certainly switched because of that.
Jumpstart the tartan drive.
Hit 1: Wikipedia entry.
Hit 2...n: Random URLs.
Though much has been made of "the potential of URLs" for searching, aka PageRank, my own experience as someone who used AltaVista up to the moment he discovered Google was that Google was the first full-text web search engine, or at least the first one I experienced.
Prior to Google, all the search engines simply indexed extracts of pages, primarily meta-data such as a page's own description of itself. That led to frequent disconnects between the preview content provided by the search engine and the actual content of the resulting page. Sometimes, I would search on a quoted term, see that on the search results, then not find it on the page. Very frustrating. Though I preferred AltaVista at the time, the other major search engines of that time all had the same problem and were all pretty comparable in terms of user experience.
Upon first using Google, it quickly became clear that Google was different. You could actually tell from the user experience that it was a full-text search, unlike all the others. Basically, the problem above never happened. Although PageRank may also have been an important part of its success, the difference between full-text search and what the others were providing at the time was so compelling that it just didn't matter: there simply wasn't any comparison from the user's point of view.
Now (and for many years past) all of the major search engines provide full-text search, so we just take it for granted. They probably also all use something like PageRank, which probably isn't too hard to implement once somebody has thought of it. Personally, I find it hard to tell the difference between them now, though I still prefer Google, probably simply because of having had a long and happy experience using it. (Oh, except for when they shut me off once years ago for doing too many queries via a Python script...)
They had a complementary idea, not a different idea. Page Rank ranks a page in general terms, but tells you nothing about whether it has anything to do with Einstein (from what I understand). You still need some form of the old way of judging the Einsteininess of a page.
Troll is not a replacement for I disagree.
BS.
Google won because its page loaded fast, and wasn't full of crap all over the screen. The others were slow loading and infested with so much advertising, paid adverts and stuff that you couldn't even locate the actual search field 90% of the time.
Users voted with their feet and went to a search engine that was fast to load and easy to use. win #1.
The quality of the results was actually secondary, however Google won that too with a policy that meant nobody could "buy rankings" or game the engine to get yourself higher up the results list. Quality search results = win #2.
It was a technology demonstration of DEC's (remember Digital Equipment? If so you are old!) new Alpha chips and servers, so powerful that they could index the entire early 1990s web. A very minor side project.
When Compaq bought DEC, they were surprised to find that they had also bought Alta Vista. Around then somebody tried to commercialize it and killed it in the process.
https://en.wikipedia.org/wiki/...
AltaVista failed because Yahoo failed to capitalize on it, just as Yahoo subsequently lost its position in search because of a total lack of vision by the management.
It probably also helped that Google was a simple UI, where AltaVista and all the others were aiming for the portal type UI's with ever increasing clutter and load times.
The details are already vague, however as far as I recall, Google was so much better at finding things, and altavista links were getting stale and polluted with a lot of rubbish in between. It took so much more effort to find links related to your actual search in Altavista.
Google had a cooler logo.
systemd is Roko's Basilisk.
What a load of bollocks. Google "won" on the back of a huge advertising campaign. I remember seeing ads at bus stops FFS, not just on the telly.
Alta Vista lost ground to Google because it was an ad-infested junk portal where you could barely find the search field, while Google got it right: it was a search field and a search field only.
It was about speed of loading. Google had a blank white page with a search box. Altavista had gone the horrible "portlet"-style approach of gluing loads of things together. Google's page loaded quickly, Altavista's did not.
When I, and those I was working with, first switched to Google the actual search results were different to what you'd "expect" (Altavista's results were the gold standard, any deviation was looked on suspiciously) but they were about the same in quality. Later they became better, but it wasn't the driver at first; it was all about the clean page.
This was my first reaction after hearing an ACM presentation circa 1992-4 about this new search mechanism, which I realized, after shuffling through their academic gobbledegook, was essentially page ranking, even with its "refined" inheritance method. I thought: "A search term will be judged the most relevant because of how many pages link to it. Purely a frequency (popularity) criterion with 'fancy' ways of using frequency to assess quality." 20 years later, the Kardashians become the Gabors of the net...
Charly in SJ
The whole bit about Google using links as an integral part of PageRank (and this being different from AltaVista, et al) has been public information since around the day Google went live. Google, for all their secretiveness, has never been shy about that bit. (And, of course, it led to the creation of the SEO industry, since AltaVista-baiting by simply stuffing keywords colored white past the article over and over stopped working.)
"in order to calculate which pages are linked to by good pages, and are therefore also good. It's recursive across the whole internet"
You speak in the present tense, but I think it's widely believed that today, the original pagerank algorithm plays only a minor role. The original algorithm was very easy to game by building a site with a million auto-generated pages, all linking to each other and to the main page. How they actually do it today is a closely guarded secret, although it's likely that links between sites and internal links play a role.
Avantslash: low-bandwidth mobile slashdot.
On top of this, Google was fast!
It is hard to imagine now, but in those days "surfing" included a good deal of waiting, because of slower connections and probably slower servers. I remember Altavista being significantly faster than e.g. Yahoo search, and Google being faster than Altavista, most likely because the two academics that started it had a more sober web site.
It was about speed of loading. Google had a blank white page with a search box. Altavista had gone the horrible "portlet"-style approach of gluing loads of things together. Google's page loaded quickly, Altavista's did not.
True! Suddenly I didn't have to search for the search box among a cacophony of blinking and bleeping cascades of disturbance. It was like walking the red light district and then stepping into the library. And the search results were presented just as soberly.
Aaah! Peace! I immediately fell in love. <3
As soon as someone figures out how to play the ranking game the rules will change. It has been played over and over again. If I remember correctly there was a hack that caused a search for "general failure" (or similar) to direct to G. W. Bush.
If builders built buildings the way programmers wrote programs, then the first woodpecker would destroy civilization.
The search page was slow, was full of ads and the results were almost irrelevant. The search quality really took a dive when sites started loading up the metadata with keywords. Unsurprisingly when a better search engine appeared everyone jumped ship.
Today search is ALMOST ENTIRELY SHIT. It is used because shit is king.
If you think that, you don't remember Alta Vista, which had millions of links to "Page not Found" and in the search results had multiple listings to the same (often broken) page.
I remember AltaVista well and broken links in many search engines in the 1990s. But just because search engines don't direct you to broken links as often anymore doesn't mean they're better.
Now, rather than millions of broken links, search engines direct me to millions of websites that don't contain my search terms and often have nothing to do with what I'm searching for.
Is that really much of an improvement? Actually, I think it's worse -- because it takes me a fraction of a second to see a 404 error and go back and try a different hit. But when a search engine directs me to mostly links that have nothing to do with my search terms, it can take me many seconds of skimming to discover that a particular hit is bogus.
Google probably reached its optimum usefulness a little over 10 years ago. Ever since, it has gradually tried to become more like "Ask Jeeves" and less useful for people who actually have serious research to do. First, you had Google offering corrections to misspellings (a useful feature), but then those would replace your actual search. Then you had dropping of the default "AND" operator that made Google efficient and useful at the beginning. Then they dropped the "+" operator a few years ago. Then they broke double quotes and verbatim search to various unpredictable degrees. And now whenever I search for obscure terms, by default Google tries to replace them with what it thinks are "synonyms" (but which often aren't, or which I don't want). So I stopped using Google for many of my default searches a few years back... and there's really nothing out there that rivals the efficiency and precision of Google ca. 2000.
Bottom line: If you're a moron who can't spell, can't bother to think about what words might actually appear in what you're looking for, and likely don't even really have a clue what you're even looking for -- well, today's search is much "better" for you. Granted, 90+% of people are probably like this, so that's why Google targets the "lowest common denominator." If you're actually looking for a serious SEARCH engine, it's not Google anymore.
The problem is the rise of SEO. If Google just gives the straight, obvious answers (which they did 10 years ago), then people will SEO and you will get garbage. Google really had a down period 7 years ago, because of all the SEOers trying to push garbage pages up in the results.
"First they came for the slanderers and i said nothing."
I remember that Google loaded much faster over my dial up modem but mainly the quality of the porn^H^H^H^H tech results was better. Yes, better results for technical information, not porn, nothing whatever to do with porn.
All I want is a secure system where it's easy to do anything I want. Is that too much to ask ~~ Randall Munroe
It still "works" to create thousands of pages which link to the page which you want ranked high, but it doesn't (and didn't) work that well because the feeder pages lack PR. I know of only two significant changes in that regard. As always, a page can only pass on the PR that it has. Because no one links to your feeder pages, they each have an actual PR of 1/(number of pages on the entire internet). To have a really high PR without external links, the number of pages you create has to be a significant fraction of all pages on the internet. So millions of billions of pages WILL create PR.
Two simple new additions make manufacturing PR more difficult. First, duplicate page detection. You need millions of DIFFERENT pages. Second, and more important, Domain Rank. It's calculated just like page rank, but with domains instead of full URLs. If lots of different domains link to wkp.org, then wkp.org is ranked high. Pages on many different domains link to wikipedia.org, so wikipedia.org has strong domain rank. To manufacture this, you need not thousands of pages, but thousands of domains. From there, it's simple fraud detection to find the few people who buy up thousands of domains and put bogus pages on them. Are thousands of similar pages, devoid of content, hosted at the same place? Might be BS, and therefore penalized specifically, without changing the basic algorithm.
The key algorithms don't need to be secret, the thresholds for certain penalties do.
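A rough sketch of why ranking by domain blunts a single-domain link farm: collapse page-level links into domain-level links before scoring. The URLs below are invented, and counting distinct linking domains is a simplification I am assuming for brevity; the comment describes running the full page-rank-style calculation on the domain graph.

```python
from urllib.parse import urlparse


def domain_graph(page_links: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Collapse (source URL, target URL) pairs into domain -> linked domains."""
    graph: dict[str, set[str]] = {}
    for source, target in page_links:
        src = urlparse(source).netloc
        dst = urlparse(target).netloc
        if src != dst:  # ignore links within the same domain
            graph.setdefault(src, set()).add(dst)
    return graph


# Hypothetical links: three farm pages on one domain push "target.example",
# while two unrelated domains link to "wiki.example".
page_links = [
    ("http://farm.example/p1", "http://target.example/"),
    ("http://farm.example/p2", "http://target.example/"),
    ("http://farm.example/p3", "http://target.example/"),
    ("http://a.example/x", "http://wiki.example/einstein"),
    ("http://b.example/y", "http://wiki.example/einstein"),
]

graph = domain_graph(page_links)

# Count how many distinct domains link to each domain.
inbound: dict[str, int] = {}
for src, targets in graph.items():
    for dst in targets:
        inbound[dst] = inbound.get(dst, 0) + 1
print(inbound)
```

The three farm pages collapse into a single vote, so the farmed target ends up with fewer inbound domains than the page linked from two independent sites.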
Understanding PR, one way of finding good pages relevant to Einstein is obvious:
1) You already have the PR, so you know how popular each page is.
2) Disregard all pages that don't mention Einstein.
3) Run PR again, starting with each page's general PR as the initial seed.
From this you'll find that many good pages which mention Einstein link to wikipedia.org/einstein/. Therefore, that page is probably relevant to people looking for information about Einstein.
If you want to, you can also subtract a portion of the page's non-Einstein PR. In other words, although Einstein pages link to blah.com, so do NON-einstein pages. Links from pages which do NOT mention Einstein weaken the inference that the page is relevant to einstein. So, total Einstein rank is the PR from Einstein pages minus the PR from non-Einstein pages.
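The steps and the subtraction above can be approximated in a single pass, as a sketch. All names and PR values here are hypothetical, and this does one round of "vote passing" instead of re-running the full iteration described in step 3.

```python
def topic_rank(links: dict[str, list[str]], general_pr: dict[str, float],
               mentions_topic: set[str]) -> dict[str, float]:
    """One-pass topic score: PR passed from on-topic pages minus PR from others."""
    score: dict[str, float] = {}
    for page, targets in links.items():
        if not targets:
            continue
        # Each page splits its general PR equally among its links.
        share = general_pr[page] / len(targets)
        # Links from pages that mention the topic count for, others against.
        sign = 1.0 if page in mentions_topic else -1.0
        for t in targets:
            score[t] = score.get(t, 0.0) + sign * share
    return score


# Hypothetical graph and precomputed general PR values.
links = {
    "physics_blog": ["wiki_einstein", "blah"],
    "bio_site": ["wiki_einstein"],
    "shop": ["blah"],
    "news": ["blah"],
}
general_pr = {"physics_blog": 0.2, "bio_site": 0.1, "shop": 0.2, "news": 0.2}
mentions_einstein = {"physics_blog", "bio_site"}

scores = topic_rank(links, general_pr, mentions_einstein)
print(scores)
```

Here "blah" is linked from one Einstein page but also from two unrelated pages, so its topic score goes negative while "wiki_einstein" stays positive.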
youtube is destroying the internet's usefulness for obtaining how-to information. instead of a clearly written and illustrated guide now you get to sit through 15 minutes of a stuttering autist trying several times, failing, then finally accomplishing the task.
Why on earth is anyone talking about AltaVista? It's been gone darn near 15 years now!
"miserable failure", "french military victories" "worst band in the world"
Snowden and Manning are heroes.
My experience with Alta Vista was that sometimes it seemed to go "off-track", answering, not my question, but a similar question. When I tried to refine my query, it still seemed stuck on what it thought I asked before, not what I was asking.
I later found out that they used "Bayesian Logic", where the answers to the previous questions guided the answer to the new question. No wonder I had this problem!
When Google came along, of course I went with them, and still do. They are still the #1 Search Engine, although some of their other services, like Google Maps, have become untrustworthy.
Yep. Speed of loading and no clutter. I switched the instant I saw Google's home page, because while I was doing fine with Altavista's search results I hated all the crap that took forever to load.
Since 2010, of course: DuckDuckGo. For similar reasons, really.
wg
RIP to my first porn search engine. I was in teenage-heaven when they introduced picture and video searching.
This is a little article that tells us everything we already know, after going through the clickbait. Thanks /.