Slashdot Mirror


Trending Low-Volume Google Searches with Gootrude

michaelrash writes "The Google Trends project provides some visibility into how the popularity of search terms like 'Myspace' or '2008 Election' changes over time, and points out relevant news articles that create jumps in search volume. This is a handy tool, but there are many search terms for which Google Trends does not display any results. According to the error message that Google Trends generates, such terms (for example 'Linux Firewalls', with the quotes) have insufficient search volume to display graphs. Fair enough. Google sets an internal threshold on search volume, and this threshold could be set for reasons that range anywhere from Google Trends is still experimental to Google not wanting to provide data on how it builds its massive search index for emerging search terms. Either way, I would like a way to see search term trends that Google doesn't currently make available to me. So, I've released an open source project called 'Gootrude' to do just this. For the past year Gootrude has collected data on a set of low-volume search terms and interfaced with Gnuplot to visualize them."
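
The collect-then-plot workflow described in the summary can be sketched roughly as follows. This is not Gootrude's actual code: the function names, file names, and sample counts are all made up for illustration, and nothing here queries Google. It only formats dated result counts as a Gnuplot data file and builds a matching plot script.

```python
from datetime import date


def to_gnuplot_data(samples):
    """Render (date, result_count) samples as whitespace-separated rows, oldest first."""
    rows = [f"{day.isoformat()} {count}" for day, count in sorted(samples)]
    return "\n".join(rows) + "\n"


def gnuplot_script(term, data_path, png_path):
    """Build a Gnuplot script that plots the data file as a time series."""
    lines = [
        "set terminal png",
        f'set output "{png_path}"',
        "set xdata time",                 # treat the x column as dates
        'set timefmt "%Y-%m-%d"',         # how dates are written in the data file
        'set format x "%b %Y"',           # how dates are shown on the axis
        'set ylabel "Reported result count"',
        f'plot "{data_path}" using 1:2 with lines title "{term}"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical samples (not real measurements), deliberately out of order.
samples = [
    (date(2008, 2, 1), 1300),
    (date(2008, 1, 1), 1100),
    (date(2008, 3, 1), 950),
]

data_text = to_gnuplot_data(samples)
script_text = gnuplot_script("single packet authentication", "spa.dat", "spa.png")
print(data_text)
print(script_text)
```

Saving the two strings to `spa.dat` and a script file, then running Gnuplot on the script, should produce a PNG of the series. Keeping the data as plain text makes it easy to append one row per collection run.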

10 of 37 comments

  1. wow by Gewalt · · Score: 2, Insightful

    wow, um...congrats I think? I mean, after you get over your pat on the back, can anyone explain why this matters?

    --
    Modding Trolls +1 inciteful since 1999
  2. Is it only me.... by vidarh · · Score: 4, Insightful

    ... or does the author of this tool seemingly not realize that Google Trends reports the volume of searches, while what he's tracking is the number of documents indexed for a search term, and that there's no basis for assuming the two are correlated in a meaningful way?

    1. Re:Is it only me.... by Gewalt · · Score: 5, Interesting

      I find it highly unlikely that someone who can make the page in question would not be smart enough to also understand what Google Trends is really doing, and as such, I choose to believe instead that the author is being intentionally deceptive.

      --
      Modding Trolls +1 inciteful since 1999
    2. Re:Is it only me.... by aleph42 · · Score: 3, Insightful

      Agreed, the summary is misleading, as is the comparison (from TFA) to Google Trends.

      This aside, the interest of "gootrude" is that it's not provided by Google, and so it's part of the many efforts to reverse engineer how Google comes up with its numbers.

      Specifically, it appears from TFA that the "number of results" stated by Google is a wild guess for low numbers (1,000-10,000), with very sharp variations which hint at an iterative process.

      So as I get it, it's not a tool for you and me, rather for google specialists.

      --
      Don't take my posts literally; it's just code to control my botnet.
  3. Different data by UnHolier than ever · · Score: 2, Informative

    Google Trends plots the frequency of queries, i.e. the number of times information is asked about a subject. Gootrude plots the number of pages found, i.e. the quantity of information Google can retrieve on this subject. These are completely different.

  4. Not allowed by google by swarsron · · Score: 3, Informative

    Besides not being the same as Google Trends, this tool is not allowed by Google's TOS. Automated querying of their services without prior permission is forbidden. But since it probably won't put any noticeable load on their network, they most likely won't care.

    1. Re:Not allowed by google by Vectronic · · Score: 4, Insightful

      Until there was an article posted on Slashdot, that is.

    2. Re:Not allowed by google by swarsron · · Score: 2, Informative

      Google doesn't give out any more keys for this API; only old keys continue to work. So if you don't already have a key, you're out of luck.

  5. Re:a few different results... by lpq · · Score: 2, Informative

    Just did searches on all of the terms the author mentions and got a few different numbers:

    1. "iptables attack visualization" -- 19 results (~35) (close)
    2. "single packet authentication" -- 93 (1,300) -- off by more than an order of magnitude
    3. "linux firewalls attack detection" -- 9,290
    3a. "Linux Firewalls Attack Detection" -- 9,240 (~9,000) (close)
    4. cipherdyne -- 85,200 (~70,000) -- off a bit
    4a. Cipherdyne -- 84,500 (~70,000)
    5. gpgdir (same)
    6. fwsnort (same)
    -------
    Note...caps vs. no caps made no difference on 1, 2 and 5. But for terms 3 & 4, caps made a slight difference ... anyone know why? I thought caps were supposed to be ignored?

    Most were close. cipherdyne differed by roughly 20%, and the worst was "single packet authentication": that one was off by more than 10x! Wonder what's up with that.

    Interesting curiosities...
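
The gaps discussed in the comment above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the first number in each list entry is the commenter's own result count and the parenthesized number is the article's (often approximate) figure. That reading is an interpretation, and the function names are invented here.

```python
def relative_difference(observed, reference):
    """Signed difference of observed from reference, as a fraction of reference."""
    return (observed - reference) / reference


def ratio(a, b):
    """How many times larger the bigger count is than the smaller one."""
    return max(a, b) / min(a, b)


# (commenter's count, article's figure), read off the list above under the
# stated assumption; figures marked "~" in the comment are approximate.
pairs = {
    "iptables attack visualization": (19, 35),
    "single packet authentication": (93, 1300),
    "Linux Firewalls Attack Detection": (9240, 9000),
    "cipherdyne": (85200, 70000),
}

for term, (mine, article) in pairs.items():
    diff = relative_difference(mine, article)
    print(f"{term}: {diff:+.1%} vs article, ratio {ratio(mine, article):.1f}x")
```

Reporting both a signed percentage and a larger-over-smaller ratio avoids ambiguity about which count is the baseline, which matters when the two numbers differ by an order of magnitude.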

  6. Privacy? by Temporal · · Score: 2, Insightful

    "Google sets an internal threshold on search volume, and this threshold could be set for reasons that range anywhere from Google Trends is still experimental to Google not wanting to provide data on how it builds its massive search index for emerging search terms."
    Or maybe for privacy reasons? Some search queries implicitly reveal the identity of the person making them. Such queries are naturally low-volume, so refusing to show low-volume queries is an effective way to protect the privacy of the searchers.