Which Web Statistics Package Would You Use?
ken-doh asks: "We host about 200 customers' web sites on a Windows platform, and we want to provide them with a simple web statistics package to track hits and other useful pieces of information. We have been using Deepmetrix LiveStats XSP, which has been perfect for our customers, but since Microsoft purchased it, the product is no more, with support ending next year. So we need to buy a new stats package. Any ideas?"
I'd choose awstats. It's fast, very easy to use, looks pretty, and best of all ... it's free to use on Windows as well as Linux. Here is their main page on sourceforge, which also includes a nice little demo.
Crack - Free with every butt and set of boobs
If it ain't broke keep using it ;)
or is that not an option for some reason?
-nB
whois gawk date unzip strip find touch finger mount join nice man top fsck grep eject more yes exit umount sleep dump
I've seen lots of different packages and frankly I sometimes wonder why people pay for them. They're typically (actually I guess they're literally) off-the-shelf stuff that, while offering nice and interesting features, doesn't cover everything for everybody. I think it's a "you get what you pay for" mentality, i.e., people insist on buying packages to do this kind of analysis.
I've written probably more than 20 different web filters for various analyses because the OTS stuff didn't get me the info I wanted.
And for any more-than-small IT staff, there's always someone there who knows the tools, and can slap together stat info and tweak it ad nauseam until management sees the analysis they think they want. Lots of staff will even write it on their own time -- they like to tinker with that stuff.
Also, though I haven't looked, I'll bet there are some great CPAN modules that get you what you want as a good start with the added benefit of having the code for your own tweaking.
Considering the article is specifically asking for simple web stats, I think sed, awk, perl, and the like are a perfect way to go.
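To give a rough idea of how little code "simple web stats" can take, here is a minimal sketch in Python rather than sed/awk/perl. It assumes the logs are in Apache-style Common Log Format (IIS logs use a different layout and would need a different pattern); the `LOG_LINE` pattern, the `summarize` helper and the sample lines are made up for illustration and are not from any package mentioned in this thread.

```python
import re
from collections import Counter

# Assumed input: Common Log Format, i.e.
#   host ident user [time] "METHOD path PROTOCOL" status bytes
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def summarize(lines):
    """Count requests per path and per status code, skipping unparseable lines."""
    paths, statuses = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a CLF line; ignore rather than guess
        paths[match.group("path")] += 1
        statuses[match.group("status")] += 1
    return paths, statuses


# Hypothetical sample data, standing in for a real access log file.
SAMPLE = [
    '10.0.0.1 - - [10/Oct/2006:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.1 - - [10/Oct/2006:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 200 512',
    '10.0.0.2 - - [10/Oct/2006:14:02:10 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '10.0.0.3 - - [10/Oct/2006:14:05:00 -0700] "GET /missing HTTP/1.0" 404 -',
]

if __name__ == "__main__":
    paths, statuses = summarize(SAMPLE)
    for path, count in paths.most_common(3):
        print(count, path)
    for status, count in sorted(statuses.items()):
        print(status, count)
```

For a real log you would pass an open file object to `summarize` instead of `SAMPLE`; anything fancier (per-customer reports, date ranges) is a matter of adding more counters.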
Or, you could buy yet another package and risk Microsoft buying that product and disappearing it.
Avoid SmarterStats. In fact, avoid the entire SmarterTools suite of products at all costs. Buggy. Poorly designed. Horrible!
Job done. If you get enough traffic, you can pay, but it's free for something like under 1M hits/mo. And their campaign/tracking tools "pwn" you.
I'd put a link, but, c'mon. google.com
http://www.analog.cx/ - that works well, at least for my servers
Wulfram II - Free Online Mutiplayer 3D Tank Shooting Gam
Google Analytics is a somewhat more sophisticated tool that requires you to embed a little bit of their code on every one of your pages. Also free to use.
For totally custom reporting, move your log data to the database following the guide I wrote earlier this year.
Your design to a real part online: Big Blue Saw
Click here for the solution to your problem
So what's the difference between "hits" and "visits"? My website averages ~700 hits and ~80 visits per day. Which is more meaningful and/or significant?
I surf without javascript and have adblocked the analytics-domain. If you rely on google analytics, you won't see me. Log-based stats are way better.
Informant Advanced, and Cacti - do you really need anything else???
The above comments are not guaranteed to make sense to anyone other than the author...
I guess these are nowhere near feature rich for your needs, but I'll mention them anyway:
Visitors - Simple logfile analyzer
BBClone - Nice visitor counter running on PHP
"More hits than germans surfing fetish websites
Yo, that is a lot of hits."
There's 45 hits in this song not counting the title.
http://www.youtube.com/watch?v=B38c1e52vfY
Install BBClone on your server. It's fairly easy to install, and you get good graphics and automatic statistics collection. It's free and open-source, so you can't go wrong. Look for their demo at http://bbclone.de/
Awstats seems to be the usual modern answer (http://awstats.sourceforge.net/), used and recommended by many admins and groups (in my case EGEE, the European Science Grid initiative, http://www.eu-egee.org/), but for traditionalists with no eye-candy desires, there is a copy of Webalizer (http://www.mrunix.net/webalizer/) lurking on most servers and in almost all distribution package repositories. It's worth looking at the Wikipedia page for specials, extended versions and general info on web server statistics and analysis: http://en.wikipedia.org/wiki/Webalizer.
In particular, Stone Steps Webalizer is an interesting feature-full and candy-enabled version: http://www.stonesteps.ca/projects/webalizer/. Others can be easily found on Freshmeat: http://freshmeat.net/search/?q=webalizer&section=projects (e.g. Webalizer Extended with included Geolizer and extensive 404 analysis support, http://www.patrickfrei.ch/webalizer/, and AWFFull with usability, CSS and geo-ip features, http://www.stedee.id.au/awffull, etc.).
Others can be found on Freshmeat (117 hits at this time http://freshmeat.net/search/?q=web&trove_cat_id=245&section=trove_cat) and Wikipedia (very short and poor stub of a list that you might want to improve after your extensive testing :-) : http://en.wikipedia.org/wiki/Category:Free_web_analytics_software.
There is also Sherlog, an Apache log analyser specialized in user experience tracking more than statistics - an interesting complementary tool (http://sherlog.europeanservers.net/).
-Kvorg
Hosted solution (you put a bit of javascript in every page, trivial if you are using some sort of templating system).
Don't have to deal with web logs, always updated in real time, AMAZING functionality. Just pricey. Our company found most of the open source or cheaper ones to be a bit lacking in functionality... it just depends on your needs.
In my pants!
Thank you, thank you... I'll be here all week.
It's not that the links are inherently dangerous. The problem is that clicking such a link will take you to the site the link points to (obviously), and your browser will dutifully report your referrer to the remote server. And if your referrer looks something like "http://www.example.com/top-secret-stats-directory/awstats-referrers.html" then you've just given some unknown server a "back door" into your web stats, allowing them to gather intelligence about your site. In many cases that's unimportant - either the site is an inconsequential personal web page, or the directory is password protected, or you're smart enough to use something to prevent your browser from sending referrer information. But as we all know, many people don't do what they should, and sometimes little data leaks like this can lead to compromises.
I looked at a couple of the popular ones, installed Awffull and played with it for a bit. But it wasn't immediately obvious to me that any of the common ones supported aggregating stats across domains / hosts. Eg, I have 10 virtual servers on this Apache box, give me a sorted list of hits per domain/host. Probably one or more of the popular open-source stats packages *does* do this, but I didn't feel like spending hours examining different ones and installing them. Since my needs were very basic I just wrote something of my own.
Since all my domains are ultimately served by a Java webapp running on JBoss (I redirect from Apache to JBoss with mod_jk), I just wrote a servlet filter to write hits to a PostgreSQL database. That's it: one table with the hostname, date-time, user-agent, and a handful of other things I care about. Now, getting the info I need is as simple as a quick SQL query with pgAdmin III. I'm looking at using the Eclipse BIRT stuff for viewing the data as my next project, though.
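For anyone curious what the "one table plus a quick query" approach looks like, here is a rough sketch of the hits-per-host report. It is not the poster's actual code: it uses Python with an in-memory SQLite database as a stand-in for PostgreSQL, and the table name `hits`, its columns, and the sample rows are all assumptions chosen for illustration.

```python
import sqlite3


def hits_per_host(conn):
    """Return (hostname, hit_count) rows, busiest host first."""
    return conn.execute(
        "SELECT hostname, COUNT(*) AS hits "
        "FROM hits "
        "GROUP BY hostname "
        "ORDER BY hits DESC, hostname"
    ).fetchall()


def make_demo_db():
    """Build an in-memory database with a hypothetical 'hits' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hits ("
        " hostname TEXT NOT NULL,"
        " hit_time TEXT NOT NULL,"
        " user_agent TEXT)"
    )
    # Made-up rows standing in for what a request filter would insert.
    rows = [
        ("www.example.com", "2006-10-10 13:55:36", "Mozilla/5.0"),
        ("www.example.com", "2006-10-10 13:55:37", "Mozilla/5.0"),
        ("www.example.com", "2006-10-10 14:01:02", "Opera/9.0"),
        ("shop.example.com", "2006-10-10 14:02:10", "Mozilla/5.0"),
        ("shop.example.com", "2006-10-10 14:03:44", "Mozilla/4.0"),
        ("blog.example.com", "2006-10-10 14:05:00", "Opera/9.0"),
    ]
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    for hostname, count in hits_per_host(make_demo_db()):
        print(count, hostname)
```

The `GROUP BY hostname` query is the whole report; the same statement should work essentially unchanged against PostgreSQL, where the timestamp column would more naturally be a real timestamp type.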
// TODO: Insert Cool Sig
Next time you'll know to choose an open source solution.
With a proprietary solution, the customers are the ones who support the product, and are then shafted when the product is discontinued.
Paid Q&A/Research
http://www.tracewatch.com/ for a free analytics program that actually really helps. I have tried most of the above for my site http://usa.tiouw.com/ and none came even close. Analysis is useless if you can't trace exactly how a user is going through your site. It might mean more work for you, but the pay-off is there. I know for sure that your clients will love you for it and that it will mean more business.
something cheap and easy we use is weblog expert, works off of IIS log files and has a lot of nice reports/features. we're an educational agency so the statistics aren't mission critical or anything, just more of a nice thing to have occasionally to see what's happening on the site.
sigs suck
Check out http://www.hitslink.com/
I've been using them for several years and they're very good. The only downside is that you have to have a small bit of javascript on every page in order for it to track that page.
One organization I work with uses Webtrends... we have Pro 7, the predecessor to their current Analytics 8. For various reasons, it only relies on analyzing log files (vs. Google Analytics' JavaScript implementation, which I use elsewhere). This is not always clear, and frequently numbers on one report don't totally jibe with numbers on another report for the same data (e.g. Webtrends has dozens of "report" pages).
It's also remarkably expensive. I'd look elsewhere if you could.
Sawmill is an excellent package. It's easy to configure, has nice drill-down features and great reporting. I'm not associated with the vendor, just a satisfied user.
WebTrends 8.0 is nice... if you have a dedicated Windows server and deep pockets.
http://kitties.b-log.ca
"since microsoft bought it and now it is gone and.." and yada yada. And you want advice?
Ok Handwriting -> Wall, or cluestick -> up 'longside de head
How many times do you have to get burned before you realise that dealing with those lying sleazy criminal goons is a long term expensive no-win situation? That is, if you stay stuck on their platform, that situations like this are the norm, not the exception?
That's my advice. MS was ok in the olden days somewhat, but it is well past time to move on now and admit reality: they have peaked as to usefulness, they are past peak as regards cash for quality, they are WAY past peak as being even close to being honest, and you will keep throwing good money after bad in huge wads now the longer you stay stuck on them. Do you honestly think that will be the last problem you will run up against with them?
It's actually much more complicated than most people think. The best write up I've seen is on Analog's site:
This section is about what happens when somebody connects to your web site, and what statistics you can and can't calculate. There is a lot of confusion about this. It's not helped by statistics programs which claim to calculate things which cannot really be calculated, only estimated. The simple fact is that certain data which we would like to know and which we expect to know are simply not available. And the estimates used by other programs are not just a bit off, but can be very, very wrong. For example (you'll see why below), if your home page has 10 graphics on, and an AOL user visits it, most programs will count that as 11 different visitors!
I am, and always will be, an idiot. Karma: Coma (mostly effected by
What problem are you trying to solve?
This appears to me to be another project looking for a problem to solve. This is way too common in IT. Application cruft on the PC occupying memory, and generally bogging things down. Then the users complain that the network is too slow. Then you have to buy new hardware, and the new hardware needs new cruft... Lather, Rinse, Repeat.
Of course, this could also be a "Review Fodder" project with the goal of adding a new line to your self-assessment paper to the boss. In that case, carry on and add some good cruft!
- High Tech workers, please say NO to Union Carpenters, their Union sees fit to control our compensation.
There are two types of web analytics technologies: log analysis and page embedding. As you are providing statistics for 200 web sites (clients), you want to stick with a log analysis solution. As the two big boys (Microsoft and Google) are going to be providing FREE web analytics based on page embedding, many of the companies currently in the web analytics game will go out of business as half their market will be lost. That said, stick with a company that is big and has not put too many eggs in the hosted service (page embedding) basket side of things. My personal suggestion would be to stick with LiveStats.XSP v8.03 (the last release). Even though support will expire, your installation will still be good forever. Some of the info in it will become outdated but... All in all, it's a gamble either way. You're dealing with a company that has been bought out, the next one you go with may fold. Good luck!
I'm not sure if it quite fits your needs, but it's both fantastic and well-designed: http://www.haveamint.com/
Switch to a LAMP stack and choose one of the many freely available analyzers for Apache on Linux. (Yeah, like you didn't see that coming...)
Actually, no. A hit is not a page view. A hit is a single HTTP request to the server. This could be for a page, a page part, an image, a style sheet, a separate js file, an embedded object, an iframe (itself with multiple hits possibly), a redirect, even a 206. So, if you load the Slashdot homepage you are causing around 45 hits just for a single page view (that's 5 style sheets, about 40 images and the page itself). Depending on your browser cache settings, a reload may cause anywhere from 1 to the full 45 hits again.
This is why, in almost every case, hits > page views > visits > visitors.
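To make the ordering concrete, here is a small Python sketch that counts all four numbers from the same list of requests. The rules it uses are assumptions, not anything a particular stats package is known to do: a "page view" is any request whose path ends in `.html`, `.htm` or `/`, a "visit" ends after 30 minutes of silence from a client, and a "visitor" is simply a distinct client address. As the Analog write-up quoted elsewhere in this thread warns, that last part is only an estimate, since proxies and shared addresses blur who is who.

```python
from datetime import datetime, timedelta

# Assumed rules for this sketch only.
PAGE_SUFFIXES = (".html", ".htm", "/")
VISIT_GAP = timedelta(minutes=30)


def count_levels(requests):
    """Count hits, page views, visits and visitors.

    requests: iterable of (client, datetime, path) tuples in any order.
    """
    hits = page_views = visits = 0
    last_seen = {}  # client -> time of that client's most recent request
    for client, when, path in sorted(requests, key=lambda r: r[1]):
        hits += 1  # every request is a hit
        if path.endswith(PAGE_SUFFIXES):
            page_views += 1  # only page-like requests count as views
        previous = last_seen.get(client)
        if previous is None or when - previous > VISIT_GAP:
            visits += 1  # first request, or a long silence, starts a new visit
        last_seen[client] = when
    return {
        "hits": hits,
        "page_views": page_views,
        "visits": visits,
        "visitors": len(last_seen),
    }


# Made-up traffic: client A browses in the morning and returns in the
# afternoon; client B loads one page and its style sheet.
SAMPLE = [
    ("A", datetime(2006, 10, 10, 9, 0, 0), "/"),
    ("A", datetime(2006, 10, 10, 9, 0, 1), "/style.css"),
    ("A", datetime(2006, 10, 10, 9, 0, 1), "/logo.png"),
    ("A", datetime(2006, 10, 10, 9, 5, 0), "/about.html"),
    ("A", datetime(2006, 10, 10, 14, 0, 0), "/"),
    ("B", datetime(2006, 10, 10, 10, 0, 0), "/"),
    ("B", datetime(2006, 10, 10, 10, 0, 1), "/style.css"),
]

if __name__ == "__main__":
    print(count_levels(SAMPLE))
```

With this sample the four counts come out strictly decreasing, which is the usual pattern; change the suffix list or the timeout and the middle two numbers move, which is a good reminder that they are conventions rather than measurements.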
Burns: We're building a casino!
McAllister: Arrr. Give me 5 minutes.
That whole page is well worth reading.
Many of the web stats packages other than analog really try to make you think they can get more data out than they really can.
That page and the one above it (What the results mean) should be required reading for anyone about to read a web stats report. I certainly send it to all my customers whenever I set them up with a report.
analog has my vote too. I've used it for years. It's got to be the most controllable package around. There are options to tweak most everything, and to do pretty complex filtering, aliasing, etc. Admittedly, most people never need to know all that, but it's good that it's there for the times we do.