Trending Low-Volume Google Searches with Gootrude
michaelrash writes "The Google Trends project provides some visibility into how popular search terms like 'Myspace' or '2008 Election' change over time and points out relevant news articles that create jumps in search volume. This is a handy tool, but there are many search terms that Google Trends does not display any results for. Such terms (such as 'Linux Firewalls' — with the quotes) have insufficient search volumes to display graphs according to the error message that Google Trends generates. Fair enough. Google sets an internal threshold on search volume, and this threshold could be set for reasons that range anywhere from Google Trends is still experimental to Google not wanting to provide data on how it builds its massive search index for emerging search terms. Either way, I would like a way to see search term trends that Google doesn't currently make available to me. So, I've released an open source project called 'Gootrude' to do just this. For the past year Gootrude has collected a set of low-volume search terms and interfaced with Gnuplot to visualize them."
Google Trends plots the frequency of queries, i.e. how often people search for a subject. Gootrude plots the number of pages found, i.e. how much information Google can retrieve on that subject. These are completely different measurements.
Besides not being the same as Google Trends, this tool violates Google's TOS, which forbids automated querying of their services without prior permission. But since it probably won't put any noticeable load on their network, they most likely won't care.
Just did searches on all of the terms the author mentions and got a few different numbers. Anyone know why? I thought caps were supposed to be ignored. My numbers (the author's in parentheses):
1. "iptables attack visualization" -- 19 (~35) -- close
2. "single packet authentication" -- 93 (1,300) -- off by more than an order of magnitude
3. "linux firewalls attack detection" -- 9,290
3a. "Linux Firewalls Attack Detection" -- 9,240 (~9,000) -- close
4. cipherdyne -- 85,200 (~70,000) -- off a bit
4a. Cipherdyne -- 84,500 (~70,000)
5. gpgdir (same)
6. fwsnort (same)
-------
Note: caps vs. no caps made no difference on 1, 2 and 5, but for terms 3 and 4, caps made a slight difference.
Most were close. cipherdyne was off by roughly 20%, and the worst was "single packet authentication", which was off by more than 10x! Wonder what's up with that.
Interesting curiosities...
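For anyone who wants to check the arithmetic, here is a minimal Python sketch that compares the counts above. It assumes the figures in parentheses are the author's approximate values; the `compare` helper and the `counts` list are just illustrative names, not part of Gootrude.

```python
import math

# (term, count from my search, approximate count quoted by the author)
# Assumption: the second number is the author's figure, as listed above.
counts = [
    ("iptables attack visualization", 19, 35),
    ("single packet authentication", 93, 1300),
    ("Linux Firewalls Attack Detection", 9240, 9000),
    ("cipherdyne", 85200, 70000),
]


def compare(mine, quoted):
    """Return (ratio, orders of magnitude) between two result counts."""
    # Always divide the larger count by the smaller so the ratio is >= 1.
    ratio = max(mine, quoted) / min(mine, quoted)
    return ratio, math.log10(ratio)


for term, mine, quoted in counts:
    ratio, magnitude = compare(mine, quoted)
    print(f"{term}: {ratio:.2f}x apart ({magnitude:.2f} orders of magnitude)")
```

A ratio above 10 (more than one order of magnitude) is what flags "single packet authentication" as the outlier.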