Slashdot Mirror


Google Previews New Search Infrastructure

Google has announced a "developer preview" of a new search infrastructure, though one wouldn't have to be a developer to try it out. Google is asking for feedback on how the search results in the new regime stack up against the old. Matt Cutts has posted a mini FAQ. Some early testing indicates that the new search may be faster in some cases, and return more relevant results, than the old one. Those who attempt to game Google search for a living will be scrambling henceforth. Has anyone identified the new crawler bot in log files?

15 of 129 comments

  1. Re:New crawler bot... by libcrypto · · Score: 2, Interesting

    My thoughts exactly. They probably developed a new algorithm for finding the best results. There is no need for a new crawler. Found this link on search engine architecture which is helpful. http://infolab.stanford.edu/~backrub/google.html

  2. Re:New algorithm = more relevant results by CarpetShark · · Score: 5, Interesting

    Remember, in the beginning the old algorithm used to be very good at finding relevant results.

    I'm not convinced that the degradation is entirely due to SEO. Google used to be a much more technical search -- when you used specific terms, you got specific matches. It seemed to be very much like Altavista with AND between each term. Now, you get a mix of things, as if it was OR between each term. Granted, *that* could be just SEO.
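    To make the AND/OR distinction concrete, here is a minimal sketch of both matching modes over a toy inverted index. The documents and the function names (build_index, search_and, search_or) are made up for illustration; nothing here reflects how Google actually evaluates queries.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_and(index, terms):
    """Return ids of documents that contain every query term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, terms):
    """Return ids of documents that contain at least one query term."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))


# Hypothetical three-document corpus.
docs = {
    1: "gentoo kernel bug",
    2: "gentoo install guide",
    3: "kernel panic",
}
index = build_index(docs)
and_hits = search_and(index, ["gentoo", "kernel"])
or_hits = search_or(index, ["gentoo", "kernel"])
```

    With AND, adding a term can only shrink the result set; with OR, it can only grow it, which would explain getting "a mix of things" for a specific query.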

    Secondly though, if you search for X, you're asked if you meant Y, and your search results already seem to be for the popular Y result they think you meant.

    Likewise, you used to be able to search for hyphenated-terms (I hyphenated all the time because it's usually one character fewer, and requires less editing after the fact than putting quotes around words), but now it seems to split them into two terms.

    I think google have dumbed down their search for people who don't know how to use search engines.

  3. What I'd like to see from Search 2.0 by Zocalo · · Score: 3, Interesting

    Actually, I'm mostly fine with the speed and typical results I'm getting at the moment. What annoys me the most about searching is when the first several pages of results are full of links to places that require you to have an account before you can access the answer or download the file. If I could define a blacklist that automatically excludes some of the worst offenders from my queries, that would be worth far more to me than shaving a few milliseconds off each search.
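    A personal blacklist like that could be applied on the client side. The following is only a sketch under stated assumptions: the result list is a plain list of URLs, the blocked domains are hypothetical placeholders, and filter_results is an illustrative name rather than any real search API.

```python
from urllib.parse import urlparse


def filter_results(result_urls, blocked_domains):
    """Drop every URL whose host is a blocked domain or a subdomain of one."""
    blocked = {domain.lower() for domain in blocked_domains}
    kept = []
    for url in result_urls:
        host = (urlparse(url).hostname or "").lower()
        is_blocked = any(
            host == domain or host.endswith("." + domain) for domain in blocked
        )
        if not is_blocked:
            kept.append(url)
    return kept


# Hypothetical search results and blacklist.
results = [
    "https://www.paywalled.example/answer/1",
    "https://forum.open.example/thread/42",
    "https://paywalled.example/download",
]
visible = filter_results(results, ["paywalled.example"])
```

    Matching on the host suffix rather than on a substring of the URL avoids accidentally hiding unrelated sites whose address merely contains the blocked name.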

    --
    UNIX? They're not even circumcised! Savages!
    1. Re:What I'd like to see from Search 2.0 by LordLimecat · · Score: 3, Interesting

      All that matters is that your referrer is google. Doesn't have to be cached-- if what you see on the live page is different from what the googlebot sees, google will drop them from the results for SEO violations.

  4. Re:New algorithm = more relevant results by dublindan · · Score: 3, Interesting

    I agree. What I hate is if I search for "foo bar baz" it seems to ignore that I put quotes around it.. If I put quotes, I'm looking for EXACT matches.. but Google seems to still treat it as foo OR bar OR baz... :'(

  5. Re:Major Disappointment by Korin43 · · Score: 4, Interesting

    The least they could do is update the calculator.. I mean, why can't I put in "2 pounds of chocolate in cups" and get an answer? I realize that finding out the density of chocolate may be difficult for Google to do, but why not team up with Wikipedia (have people add things like densities to articles, and then Google can crawl that and use it for calculator results). Or even easier, things that can be found on the periodic table, like "10 kg of lithium in moles" or "atomic weight of calcium".

    There seem to be so many things that it could be much more helpful with, and it can't be that hard since it already can answer questions like "What is the mass of the earth times the speed of light squared?", so why can't I ask for the "mass of the earth expressed as energy" (or possibly "mass of the earth in joules")?
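    The arithmetic behind those two kinds of query is short. This is a rough sketch, not how Google's calculator works: the function names are made up, and the constants are standard reference values (lithium 6.94 g/mol, calcium 40.078 g/mol, Earth's mass about 5.972e24 kg, speed of light 299,792,458 m/s).

```python
# Standard atomic weights in grams per mole.
ATOMIC_WEIGHT_G_PER_MOL = {"lithium": 6.94, "calcium": 40.078}

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by definition
EARTH_MASS_KG = 5.972e24              # approximate


def kilograms_to_moles(mass_kg, element):
    """Convert a mass of a pure element to moles: n = m / M."""
    return mass_kg * 1000.0 / ATOMIC_WEIGHT_G_PER_MOL[element]


def mass_to_energy_joules(mass_kg):
    """Rest-mass energy in joules: E = m * c**2."""
    return mass_kg * SPEED_OF_LIGHT_M_PER_S ** 2


lithium_moles = kilograms_to_moles(10, "lithium")     # roughly 1441 mol
earth_energy = mass_to_energy_joules(EARTH_MASS_KG)   # roughly 5.37e41 J
```

    The chocolate case is harder only because it needs a density lookup to go from mass to volume; the conversion itself has the same shape.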

    I guess it's probably just that Google doesn't get many ad clicks when people ask the calculator questions :(

  6. Interesting Search Result by Anonymous Coward · · Score: 1, Interesting

    I entered "search engine" on the old infrastructure as well as the new. On the old engine, two of the hits on the first page were for bing.com and msn.com. On Google's new infrastructure neither of those sites shows up on the first page.

    Maybe they are taking a page out of Microsoft's book?

  7. Re:Major Disappointment by koolfy · · Score: 5, Interesting

    Two words:
    Exalead
    Yauba

    Exalead is more powerful, and Yauba is a little less effective for specific searches like "gentoo bug kernel 2.6.30 fglrx", but guarantees 100% anonymity, and is pretty powerful and useful in some cases.

    Google is not the best search engine on the web. Their new engine is very good, but Google itself hasn't evolved since... I don't know, it's always the same, and we barely see new features added (take a look at Exalead Labs).

    After testing several search engines, it appears that Google is not the one with the best ideas, and that the relevance and engines of others like Exalead aren't bad enough to consider them inferior to Google. Google is the best known, and other well-known ones like Bing are not as powerful as those two lesser-known search engines.

    --
    Segmentation Fault in "Life, Universe and Everything" at line 42. Don't Panic.
  8. Could we please go back to Google Search ~v2003? by CAIMLAS · · Score: 4, Interesting

    I don't know about anyone else, but I used to get much more search-contextual information on fringe information from Google, even when compared to a highly-tailored search. I don't know if Google does its indexing differently now, or if it's indexing/crawling different subsets of data, but the results are not only different, but often less useful in an academic/info-junkie sense.

    For instance, searching for "hammurabi" now results in Wikipedia being the first link. This is true for most searches where there's a wiki page, and for many where the search phrase is simply mentioned in the wp page (yet there is no individual wp page for the topic). A lot of the sites I've got bookmarked from researching superstitions and myth surrounding his code (giants, atlantis, etc.), which are still online, do not show up in the search results today - but did around 2003.

    Likewise, search for anything which might have current cultural significance ('bush war crimes') and then compare it to something that had cultural significance just a couple years ago ('saddam war crimes'). The results are drastically different and (in the case of the former) cater to lazy people; they also make actually finding a -site- (as opposed to just a 'current event' article) on the topic somewhat more frustrating. (This is just an example, though there are plenty of other similar situations - forgive my 3am brain.)

    Now, it might be that Google has actually gotten a lot better at returning pertinent results: so good that those little things I see and go "ohhh interesting! *click*" don't occur nearly as often, and as an info junkie, I view google as having degraded.

    Who knows. Still head over heels better than Bing or anything else out there, as far as I'm concerned. I'm glad more progress on 'searching better' is being made. I just wish they'd not clog the works making -cultural- assumptions about what I'm after and stick to the semantics of my search phrases.

    --
    ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
  9. Re:New algorithm = more relevant results by value_added · · Score: 2, Interesting

    Google used to be a much more technical search ...

    I tend to agree, but IIRC, casual searches for technical terms were never that good. In my case, I invariably still get an unfiltered (read "near-endless") list of links to mailing list posts (identical content hosted by different list aggregators), or my favourite, the same frigging README file stored on what seems to be every other server on the internet. At least in the past, some of us could rely on usenet (as archived by Google groups) searches to separate out the chaff, but today everyone insists that web-forums are the way to go, so the signal-to-noise ratio is worse than ever.

    Granted, there are typically few ads possible for technical searches, so Google has no monetary incentive to improve them, but you'd think some geek employed by Google and trying to find useful information in a web search would step up and suggest an improvement or two.

    Then, again, maybe he's searching for things like deals on cameras (or Britney Spears) like everyone else. ;-)

  10. Re:Major Disappointment by ubrgeek · · Score: 2, Interesting

    I found and started playing around with iseek for my Master's classes and have been impressed with the results. Being able to ask questions using natural language is really helpful when I'm not sure exactly for what terms I'd be searching when I first start looking for answers.

    --
    Bark less. Wag more.
  11. Re:New algorithm = more relevant results by Serious+Callers+Only · · Score: 4, Interesting

    Google seems to ignore punctuation, that's why you'd get those results.

    You put in "foo, bar, baz", it searches for "foo bar baz". It does not search for foo OR bar OR baz, as you suggested, it just strips the punctuation, and then searches for that exact phrase. There's a guide to the methodology you can google for.
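    As a rough illustration of that behaviour, here is a sketch of punctuation stripping applied to a query before matching. The function name strip_punctuation is invented for this example, and treating every ASCII punctuation character as a space is an assumption, not Google's documented rule set.

```python
import string


def strip_punctuation(query):
    """Replace ASCII punctuation with spaces and collapse repeated whitespace."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return " ".join(query.translate(table).split())


phrase = strip_punctuation("foo, bar, baz")
hyphenated = strip_punctuation("hyphenated-terms")
code_query = strip_punctuation("std::vector<int>")
```

    Under this assumption "foo, bar, baz" and "foo bar baz" become the same query, and a code fragment such as std::vector<int> loses exactly the characters that made it specific.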

    I understand why they omit punctuation, but it'd be nice if you could easily ask it to search including punctuation (not sure if you can), as stripping it makes searching for code or precise phrases (with punctuation) very difficult.

  12. Google's Changes will impact long-tail more. by Anonymous Coward · · Score: 1, Interesting

    1. From what I have seen, improved results are not coming from a different algorithm, but from improved indexing. Long tail keyword searches are more likely to be influenced in these cases (where sites that rank might also be on the verge of falling through the cracks of Google's new indexing patterns).

    2. From my experience, there appears to be a marked improvement in speed.

    3. Don't underestimate the power of the Top 10. One thing that Google does very well is it only rarely screws with a simple top 10 list of the most relevant pages. Innovation in the search results GUI has rarely yielded success (Ask.com for example).

  13. Re:Major Disappointment by ChienAndalu · · Score: 2, Interesting

    My head is fine without any tinfoil, thank you. I have much personal information on google and don't care much about anonymity. I often use my real name on the Internet (maybe even here someday).

    But I know the difference between using a site that says "I promise you anonymity" and using Tor.

  14. Re:New crawler bot... by Will.Woodhull · · Score: 2, Interesting

    New crawlers are needed because the web is changing.

    1. The automated cross referencing system on some blogs requires new logic to identify which article is the true search target, and which ones are simply referencing that article.
    2. The increasing use of ajax techniques to update portions of a web page requires a new approach to crawling.
    3. Other new ways of delivering content are also forcing changes, but these two are sufficient to make the point. Teh intarwebs is changing, and teh spiders need to be redesigned to crawl through all them new types of tubes.
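    The first point in the list above can be sketched with one naive heuristic: among a cluster of pages that reference each other, treat the page with the most inbound references as the likely search target. This is purely an illustration under that assumption; the function name, the page ids, and the tie-breaking rule are all hypothetical, and real crawlers would need far more signals.

```python
from collections import Counter


def pick_primary_article(outbound_links):
    """Return the page with the most inbound links from other pages in the cluster.

    outbound_links maps each page id to the list of page ids it references.
    Ties are broken alphabetically; None is returned if no page is referenced.
    """
    inbound = Counter()
    for source, targets in outbound_links.items():
        for target in set(targets):  # count each referencing page once
            if target != source and target in outbound_links:
                inbound[target] += 1
    if not inbound:
        return None
    return max(sorted(inbound), key=lambda page: inbound[page])


# Hypothetical cluster: two blog posts reference the original article.
cluster = {
    "original-article": [],
    "blog-post-1": ["original-article"],
    "blog-post-2": ["original-article", "blog-post-1"],
}
primary = pick_primary_article(cluster)
```

    Automated trackbacks complicate exactly this kind of count, since the original may also link back to everything that references it.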

    Some of these problems will be mitigated by HTML5 (assuming that web developers adopt the new standard-- which is likely for those not married to the Microsoft ecosystem). But even when HTML5 becomes fully mature, there will need to be some big changes in crawler and indexing technology.

    --
    Will