Yahoo! Vs. Google: Algorithm Standoff
An anonymous reader writes "There's a new report out from the guys who brought us the Google keyword density analysis. As they put it, 'the goal of this analysis is to compare the keyword density elements of Yahoo's new algorithm with Google's algorithm.' They compared 2000 low-traffic, non-competitive keywords in the hopes of seeing the algorithms more clearly, without any possible search engine tweaking related to high-traffic keywords. Their findings are interesting. Should you go and rebuild your site based on these findings? Maybe not. It's worth a look though."
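For anyone wondering what "keyword density" means in practice, here is a minimal sketch. It assumes the simplest definition (occurrences of the keyword divided by total words on the page); the report may well measure it differently, and the function name `keyword_density` is made up for illustration.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Fraction of the words in `text` that equal `keyword` (case-insensitive).

    Assumption: density = keyword hits / total words. This is one common
    definition, not necessarily the one used in the report.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return hits / len(words)
```

A page that repeats the keyword in every other word scores 0.5 under this definition, which is roughly what a keyword-stuffed page looks like.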
Gee, aren't these the guys responsible for continually diluting the quality of search engine results? I'm getting really tired of sites that present one thing to search engines and something totally different to me.
So you are way out of touch, I'm afraid
*--BigMan--- Time flies like an arrow.. but personally I prefer a nice glass of wine!
Nope, the big changeover was a few days ago. Even had a story here on it. Inktomi now provides the smarts for the Yahoo search, and MSN and Lycos as well.
Tequila: It's not just for breakfast anymore!
...they'll have to get rid of all that junk on their home page. Much of the reason for my using Google is that its home page is simple, it loads quickly, and it is just so easy to _search_, which is what a search engine should be. Yahoo failed when it became a "portal" and tried to do too much by itself. If they could somehow reduce the size of Yahoo's page down to that of Google (that would mean getting rid of those ads, guys) then maybe I'd consider trying it.
Just grab a friend and a deck of cards, and you can play Yahoo vs. Google at home.
The speed of time is one second per second.
Google is way too embedded in everyone's everyday life; it will just naturally be more widely used. When was the last time you heard someone say "Yahoo it"?
Setec Astronomy
Yahoo! Switches Search Engines (Wednesday February 18, @09:51AM) has the info on when this happened.
Wasn't there a Slashdot article claiming that the Google servers may be the fastest supercomputer in the world, but they are so busy they couldn't run the benchmark? I can't find it now. If that's the case, how does Yahoo compete? By dividing the traffic? Can anyone link me?
Can I bum a sig?
RTFM, Yahoo is switching to their own engine.
Personally, I find the differences in how the two engines handle bold text to be most interesting. If only for that, I'd stick to Google.
Most pages that have 17 occurrences of your search text in bold are only going to be porn sites (unrelated to your search) or spam sites (unrelated to your search).
Yesterday I heard a colleague curse at his computer because he was looking for something specific.
"Man, Google SUCKS now! I'll try Yahoo."
"DAMN! Yahoo sucks even more!"
I have to admit that I used to think Google was incredible just after it came out, but nowadays I'm used to wading through 10-15 pages of results before finding something relevant to what I need.
This is essentially a problem in pattern recognition, and it's a damn hard problem to solve because of the disparity between the high-volume and low-volume words.
Information is essentially the inverse of entropy. Entropy can be calculated, and you can use Bayes probability theory to get a hold on the information content of a given word within a set of words.
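To make the "information content of a word" idea concrete, here is a rough sketch. It assumes the simplest possible estimate: a word's probability is the fraction of documents it appears in, and its information is the negative log of that probability, in bits. The name `word_information` and the whitespace tokenization are assumptions for illustration only.

```python
import math
from collections import Counter


def word_information(docs):
    """Return bits of self-information per word, estimated from document frequency.

    Assumption: p(word) = (documents containing word) / (total documents).
    Words that appear everywhere carry about 0 bits; rare words carry more.
    """
    total = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Count each word once per document.
        doc_freq.update(set(doc.lower().split()))
    return {word: -math.log2(count / total) for word, count in doc_freq.items()}
```

This is why a word like "the" is useless for ranking while an unusual word narrows things down quickly.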
What is difficult to do, and what search engines are trying to do, is measure the mutual information inherent between the set of pages that the word appears in, and the word itself, then apply that to all the words in the searched-for phrase; this is commonly called 'context'. This is plainly impossible to do for every given phrase, for every word combination, for every page indexed. The best you can do is use a statistical approach (and Bayes is your friend again) to come up with "good" matches.
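A small stand-in for the mutual-information idea, under loud assumptions: this computes pointwise mutual information (PMI) between two words based on how often they share a page, which is a simpler quantity than the page-set measure described above. The name `pmi` and the set-of-words page model are hypothetical.

```python
import math


def pmi(pages, word_a, word_b):
    """Pointwise mutual information (in bits) of two words co-occurring on a page.

    `pages` is a list of sets of words. Assumption: probabilities are plain
    page-level frequencies. Returns -inf if the words never share a page.
    """
    total = len(pages)
    p_a = sum(1 for page in pages if word_a in page) / total
    p_b = sum(1 for page in pages if word_b in page) / total
    p_ab = sum(1 for page in pages if word_a in page and word_b in page) / total
    if p_ab == 0:
        return float("-inf")
    return math.log2(p_ab / (p_a * p_b))
```

Even this toy version needs a pass over every page per word pair, which hints at why doing it for every phrase on every indexed page is out of reach.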
The problem with the statistical approach is the class unbiasing, since once you have wildly different statistical populations, your choice of context gets harder and harder - the "easy" standard models don't cope very well. You don't have the computational resources to do a good analysis, so you're essentially stuck between a rock and a hard place.
This is why the Google idea of strengthening the importance of a word depending on linked pages was such a good one - it "did" the hard work by relying on the entire planet to do it for them, by creating links. Of course, what one man can do, another can undo, and Google has got progressively worse over time. It's still by far the best though, and my search engine of choice. When I look at the queries arriving from search sites, I get 100x as many from Google as from Yahoo (the next nearest)....
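For the curious, the link idea can be sketched as a simplified PageRank-style iteration. This is a textbook toy, not Google's actual algorithm: the damping factor of 0.85, the iteration count, and the handling of pages with no outgoing links are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank: `links` maps each page to the list of pages it links to.

    Assumptions: uniform starting rank, fixed iteration count, and pages
    with no outgoing links spread their rank evenly over all pages.
    """
    pages = set(links) | {target for targets in links.values() for target in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline, then shares flow along links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

A page's score comes from who links to it, so the "work" is done by everyone who creates links. It also shows the weakness: anyone who can create links can move the scores.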
People think searching is easy, and it is. What's really really hard is searching *well*.
Simon
Physicists get Hadrons!
When I search for something, I don't want to get a page that's a marketing front for what I'm trying to find, I want an informational, probably technical, page on the item I'm searching for.
Such pages don't usually mindlessly repeat the keyword I'm searching for over and over again.
tasks(723) drafts(105) languages(484) examples(29106)
Thanks for clearing that one up. I did read that part of the article, but I was actually wondering where the results were coming from (whatever algorithm you use, you need to use it on a data set). Now I know.
I use Teoma a lot these days, it's very much like Google was about 6 years ago. Fresh, relevant and speedy. Plus their twist on pagerank is a pretty sweet idea that's worth a look.
Just typed in the name of the company I work for (8 employees). First hit on Google; on Yahoo, I gave up after 9 pages.
I'm one of those greybeards who was writing college reports in the pre-BBS days, never mind the World Wide Web. Remembering back to when I used to spend a half-day of research in the library to mine info that now magically appears on my computer screen in ten seconds, well...it's hard to throw stones. I'm just happy the damned things work at all.
Anti-gravity? That was *my* little secret! But I never patented it! Boy, was *that* dumb!
Search for "slash" in Google and the results are:
...
...
1) Slashdot
2) Slash's Snakepit
Put the same "slash" keyword and search with Teoma:
1) Slash's Snakepit
2) Slashdot
Personally for this keyword search I feel Slash's Snakepit is more relevant and belongs at the top of the heap.
I've been on vacation and away from internet and most mass media for a week. Got back on Monday and have noticed a drop in traffic to my web sites while I was gone. Didn't have a clue why. Well, now I know.
I'll be watching this very closely. Inktomi (sp?), which is what this is based on, sucked. I think it's too early to tell right now if the results are any good. Along the same lines, it will probably take about 6 months for marketers to learn to effectively spam the results, which is something Google has historically been very good at keeping at bay.
This will be interesting to watch over the next few months.
-Pete
Soccer Goal Plans
Slightly off topic: yesterday someone said that Google ranks W3C-compliant pages higher than non-W3C-compliant pages. I'm still confused. Could this be true?
my other sig is a 500 page novel
Isn't this missing the point of how Google works? OK, so it measures the success, but it won't tell you anything (or much) about the actual search algorithm, as Google bases the score not only on the page you link to but also on the pages that link to IT.
Hence, it's an interesting read, and maybe you could draw your own preferences from what the weighting turns out to be in the listed cases, but it's not a very fair representation of how Google works. *NB* I've no clue how Yahoo/Inktomi works, so I couldn't comment.
I have seen that sites that do nothing but sell stuff have gotten higher rankings lately. But maybe I just need to be more specific in my searches.
you need to change your google preference from 10 results displayed to something larger...
:)
if you have already done this and you're still wading through that many pages of results you suck at specifying what you want to search for
For example if I search for me (Sam Smith), I show up 4th on Google, but 51st on Yahoo.
I guess Yahoo really doesn't love me after all.
While I know that various search engines use various core ideas in search, I would think that a better way to search would use multiple approaches. Some combination of link-based analysis, keyword analysis, expert analysis, cluster-analysis, etc. rather than a single "this-is-how-we-do-it-here" algorithm.
The first big challenge in search is in disambiguating what the searcher really wants without requiring a long string of inputs. A multi-algorithmic approach would let a search engine serve up hits gathered in multiple ways (e.g., hit #1 was top ranked using method 1, hit #2 was top ranked using method 2, etc.). The search company could then see which algorithm provides the best hits for a given search (i.e., by watching which hits the searcher clicks on).
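A minimal sketch of that interleaving idea, assuming each method simply returns an ordered list of URLs. The function names `interleave` and `credit_clicks` and the method labels are hypothetical; no real engine is claimed to work this way.

```python
from itertools import zip_longest


def interleave(rankings):
    """Round-robin merge of several ranked lists, remembering which method supplied each hit.

    `rankings` maps a method name to its ordered list of URLs. A URL that
    several methods return is credited to the first method that placed it.
    """
    merged, seen = [], set()
    for tier in zip_longest(*rankings.values()):
        for method, url in zip(rankings, tier):
            if url is not None and url not in seen:
                seen.add(url)
                merged.append((url, method))
    return merged


def credit_clicks(merged, clicked_urls):
    """Count clicks per method, given the merged list and the URLs a searcher clicked."""
    source = dict(merged)
    tally = {}
    for url in clicked_urls:
        if url in source:
            tally[source[url]] = tally.get(source[url], 0) + 1
    return tally
```

For example, `interleave({"links": ["a", "b", "c"], "keywords": ["b", "d"]})` alternates between the two methods, and `credit_clicks` then tells you which method's hits the searcher actually chose.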
The second big challenge is all the nasty spammers and SEOs (Search Engine Optimizers) who will try to use knowledge of any search algorithm to game the system and artificially raise their page rank for commercial purposes. This is probably one reason why Google cannot maintain dominance - any dominant search engine attracts the concerted efforts of SEOs, thus ruining its search quality, thus ruining its dominance.
Yet a multi-algorithmic search engine could create a moving target that frustrates SEOs. By rotating the algorithms and even using negative weights on some algorithm results, a multi-algorithmic search company could cause high-ranked pages to plummet in rank over time. One week, a heavily keyworded site (e.g., one listing every possible keyword in metadata) might be at the top of the list; the next week it is at the bottom of the list. This raises the cost to sites trying to game the system. (The search company might even reward or penalize sites that change structure too often, either to find the freshest sites or to penalize the efforts of SEOs.)
There never can be one right way to do search.
Two wrongs don't make a right, but three lefts do.
As someone who does search engine optimization of his own sites, I believe there is an important distinction between ethical and non-ethical (spam) activities.
Search Engine Optimization - doing all things possible to tell a search engine what your page is about while being balanced for humans to read as well. Ethical. Sometimes considered spam when really the search engine returns poor results, usually because the page you are looking for is not easy for spiders to understand.
Search Engine Manipulation - trying to do things to get search engines to return your page in results when the page may not otherwise be something the engine considers relevant or high quality. Showing something different to the search engine falls under this category, is commonly referred to as cloaking, and is against many search engines' "rules" for designing pages. Not ethical, aka spam.
-Pete
Soccer Goal Plans
yeah trying to figure out how to get to the top of search engines by analysing keyword density so you can then construct copy text with fake entry pages or as the se.spammers call them "gateway" pages with 302 redirects via the useragent or constructing urls/with/the/keywords using ModRewrite
we know what they are up to, spamming search engines peddling shite with their referrer links
fuckers, these people are the reason 90% of search engines suck and who are rapidly poisoning google so in 5 years no-one can find shit without being taken for circlejerks and wading through shitty websites peddling porn, viagra and whatever shit is flavour of the month, if that's what the internet i see is gonna turn into then why the fuck do i bother
and we link em here at slashdot
i wouldnt give these people the time of day
A>S
Is that I'm pissed off for suddenly losing my ranking a month ago. I used to be in almost every spot for the top 30 results for the keyword "QQQ", but now I am below 100. =(
Still #1 -- Lonely Gay Geek
Actually, I find an interesting way to rate search engines is to search for the word "cocks"
yeah, I know what you're thinking.
You typically get a couple things from this search:
Porn (duh)
Chicken related things
and the band "The Revolting Cocks"
By looking at which ones come up first, you can infer some interesting and useful things about how an engine works. What those things are I will let you decide.
Mostly because it's funnier.
But seriously, folks, try it out.
From what I have seen in the past as well as currently, more results is not always better. One of the primary reasons I use Google as my search engine is because it has very accurate results. I would rather have a search engine display 10 results which are accurate than 100 results which are completely wrong. This article might show that Yahoo displays more results in certain areas, but I plan on using both services for searches over the next few weeks to see which one is more accurate.
The challenge for Google and Yahoo is to filter out the SEO spam (Doorways, cloaking, ...)
Check out the algorithms yourself by comparing google and yahoo search results side by side.
2004-1998=~6 For more details you can google
Food not Bombs is a nice platitude but it breaks down when you notice that the Bombees are usually well fed
I was running query after query to see who came up with better results. I typed in 'quote AMD' and Yahoo brought up a nice little stock quote with a graph. In Google a generic graph icon was on top with a link to 'Show Stock quotes for AMD' ... it linked to Yahoo's finance page. Just thought that was a bit ironic..
RTFA and cite your sources or prepare to get pwnd
But the only choices should be "Interesting" and "Troll." If each vote added or subtracted a very small amount from the page rank, and steps were taken to prevent stuffing the ballot box, I think this would actually improve the search results for the users.
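A rough sketch of how such voting might work, assuming one vote per user per page as the anti-stuffing step and a fixed small adjustment per vote. The step size, the vote labels, and the name `apply_votes` are all made up for illustration.

```python
def apply_votes(base_scores, votes, step=0.01):
    """Nudge page scores by user votes, counting only one vote per user per page.

    `votes` is an iterable of (user, url, choice) where choice is
    "interesting" or "troll". Assumption: each counted vote moves the
    score by a fixed `step`; repeat votes and unknown URLs are ignored.
    """
    scores = dict(base_scores)
    already_voted = set()
    for user, url, choice in votes:
        if (user, url) in already_voted or url not in scores:
            continue
        already_voted.add((user, url))
        if choice == "interesting":
            scores[url] += step
        elif choice == "troll":
            scores[url] -= step
    return scores
```

The one-vote-per-user check is the weakest part of this sketch, since anyone who can create many accounts can still stuff the ballot box.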
Actually google has got worse.
Now many of my web searches tend to turn up tons of mailing list archives. If I wanted to search those I'd use Google Groups (I get about the same results for my search terms in Google Groups).
I'm actually not that surprised - when I first heard they were using Page Rank some years back, I wondered how long that would keep working. It's easy to manipulate, plus it's kind of circular.
The article submitter is SPECIFICALLY trying to profile slashdot readership. Clearly the Anonymous Coward is either the article's author, or someone with a vested interest in our opinions on this topic, but someone who can't look at gorank's referral logs.
This is VERY sneaky (akin to putting an Amazon referral link in a book review).
Do NOT click on the link. If the submitter had actually bothered to use a logged in slashdot account, I would be more trusting.
Copy Link location, open new browser window, paste.
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
But you know I'm right. If you have similar sentiments, mod me back up. Let those fucknuts know how you feel about "Search Engine Optimization", i.e., "you'll never find an objective review about a commercial product EVER AGAIN with a search engine... HAHAHAHhAHAHHAAA"
THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
I've had excellent luck using Google's ads for one thing -- when I'm looking for a retailer to buy something. Not infrequently when trying to buy something, I come up with plenty of garbage and irrelevant results, but the paid advertisements are there because the people are trying to sell me what I want (and they are interested in not wasting impressions on people that *aren't* interested in their product, so they have a positive incentive to focus their ads).
May we never see th
I have never played any games whatsoever to get there. What I do however is try very hard to place interesting and useful content on my site (mostly 'free web books').
I don't think that it matters so much what you do in life so long as you love doing it. I have been programming computers since the early 1960s, and I still love it!
-Mark
- Incoming link popularity appears to play a far smaller role than on Google. Pages that are "top of page 1" material in Google due to their incoming links don't even show up on top of Yahoo.
- Yahoo is using the meta Description tag, at least in the display (but it also looks like they're using it for ranking.)
- They're giving extreme weight to items that show up in the Yahoo directory (which has been pay-for-inclusion for the most part the past several years.) In fact, one of my pages which has changed titles shows up in yahoo search under a 6 year old title (the one used to list it in the directory, natch.)
- Yahoo is also giving heavy weight to keywords that show up in URLs.
- Keyword cramming seems to move sites up on Yahoo (very annoying, especially for those of us who would rather get placed via honest content.)
To be honest, Yahoo's new engine reminds me of circa-1996 engines. Go run the same search on Yahoo and Google and see what comes back with better relevance (Google still looks better to me.)
I am completely befuddled as to how/why you charge so low with so much experience and a top Google rank. What's up? Is money just not an issue?
Your monitor is staring at you.
Domain Names.
Search engines definitely give rank to domains which contain your keyword in them. Tons of sites out there seem to have figured this out and make searches useless. There are tons of "keyword.useless-site.com" dictionary pages out there.
I would really like to see the search engines be able to figure out that certain pages make no sense. They read like something from the old SNL subliminal man skits. Or sites that bounce you somewhere else as soon as you arrive.
According to Whois information (CAPTCHA required), yahooslurp.com is owned by a flower store site. How long until Yahoo figures this out and hammers the store into the ground?
As an operation with several dozen websites with fairly substantial traffic, we tend to look at all this from the other direction. Google consistently delivers a whopping THIRTY TIMES more traffic than Yahoo, network-wide. Guess whose "algorithm" we like better...