Webtrends - Reporting Site Usage and Other Stats?
gammoth asks: "My company has a successful web site which gets roughly 1,800,000 hits from 45,000 sessions a day. A few years ago, our web stats software, HitList, broke when we crossed its capacity threshold (~1,000,000 hits). I replaced it with a tailored version of Webalizer supported by an array of Perl scripts and a Suitespot server plugin. My reporting system runs with little intervention, managing log files from 4 hosts, and competently reports on hits, popular pages, referrers, etc. But it's not perfect and I'm the first to admit it doesn't provide the kind of info the marketing department would find really useful. I have plans for a comprehensive system using a DB and a report engine, but I've not had the time to implement it. (We're interested in info on marketing campaign success, path through site, etc.) Meanwhile, marketing is tired of waiting and the otherwise exceptionally supportive IT management (truly) is considering contracting out some of our site usage reporting. Webtrends is being looked at seriously. I was wondering if any readers out there have had any experience with Webtrends or other software packages or service providers. Are there any OS packages that provide features well beyond Webalizer?"
http://urchin.com/ - seriously, we've used it for a while now, and it looks great and can report on just about anything.
"supported by an array of perl scripts... doesn't provide the kind of info the marketing department would find really useful"
I think that a LART, applied tactfully, is in order. Obviously, the marketing department needs a crash course in the elegance of Perl. =)
WebTrends annoys me greatly, because it is poorly documented, has a sucky interface, and misleads naive users into thinking they are getting reports on "visitors" and "sessions" when in fact they are simply getting stats on a window of visits from an IP number.
Read this document, "Why web usage statistics are worse than meaningless", and memorise it.
Also, remind your marketing folks that quantitative data from your logfiles can only be interpreted with qualitative data from interviews/focus groups/usability studies. If people stay for less time on your site than before, is it because your design sucks, or because they found what they wanted and left quickly? Only qualitative research can tell you.
Whenever marketing people spot trend variations, they will ask you why. You will need to know the above in order to respond properly.
NetMining is located in our office building. They might have some products that interest you and/or your marketing department.
Sorry for the shameless plug...
Flourescent (adj): smelling like ground wheat.
It's closed-source, commercial software, but I've been a big fan of NetTracker from Sane Solutions for a few years now.
I use it in an ISP environment, running with Apache logs on FreeBSD, and haven't had a problem with it yet. Plus, their support is outstanding.
It's one of the few pieces of closed-source software I have recommended. They have a demo version, so you can try it out on your logfiles and see if it works for you. But I highly recommend it.
Disclaimer: I have no relationship with Sane aside from being a happy customer
This story looks a lot like this Ask Slashdot.
Two men sat on a raft
I used WebTrends for several sites with about the same traffic you're looking at analyzing. In short, WebTrends sucks bigtime. It would crash for no reason almost every day, and their DNS resolver code is sloooooooow. I had to write a custom DNS resolver that would replace all of the IPs with the hostnames in the logfiles before running them through WebTrends. I've used both the Windows version and the Linux WebTrends server. The Windows version actually worked better, but it still sucked bigtime. Their customer support sucks too. A new version came out a week after I spent $2000 on their software, which was filled with bugs. The new version fixed most of the bugs, but they were going to make me buy it again to get the upgrade. Analog with Report Magic did the same things WebTrends did, but it was free, and it worked much better.
Another package I've used is Accrue. I think this is by the same people that make HitList, but it's much better. It's not without its problems, but it would work great for a site with the amount of traffic that you are analyzing. We didn't run into problems until trying to analyze more than 150 million hits/day. It has a sniffer that sits on your network and watches web traffic. It generates its own logs, which are more comprehensive than your webserver logs. Every hour, it uploads its data to the "warehouse" box, which analyzes it at the end of the day. It requires beefy hardware: big, expensive Sun Enterprise systems. It has some nice marketing stats stuff, like path analysis, and other crap. Very expensive though; expect to spend 5 to 6 figures on the software, and another 5 to 6 figures on hardware. They purchased another company that did nearly the same thing about a year ago, and they have a new version based on the technology from the other company, version 6.0/6.1. I haven't used the new version, but supposedly it's much better. The price is still insane though, so unless this is something you really really need, I'd stay away. It also requires a good DBA who knows RedBrick or Oracle (you can use either for a database).
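For anyone who wants to try the same pre-resolution trick, here is a minimal sketch in Python. It is not the resolver described above, just one way it might look. It assumes the client IP is the first field of each log line (as in Common Log Format); `make_resolver` and `resolve_log_lines` are names made up for this example. Each IP is looked up once and cached, and anything that fails to resolve is left as the bare IP.

```python
import re
import socket
from functools import lru_cache

# Assumption: the client IPv4 address is the first field on each log line.
IP_AT_START = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})(?=\s)")


def make_resolver(lookup=socket.gethostbyaddr):
    """Return a cached IP -> hostname function (hypothetical helper).

    `lookup` is injectable so the code can be exercised without real DNS.
    """

    @lru_cache(maxsize=100_000)
    def resolve(ip):
        try:
            # gethostbyaddr returns (hostname, aliases, addresses)
            return lookup(ip)[0]
        except OSError:
            # No reverse record (or lookup failure): keep the raw IP.
            return ip

    return resolve


def resolve_log_lines(lines, resolve):
    """Yield each log line with its leading IP replaced by a hostname."""
    for line in lines:
        yield IP_AT_START.sub(lambda m: resolve(m.group(1)), line, count=1)
```

Usage would be something like `for line in resolve_log_lines(open("access.log"), make_resolver()): ...`, writing the result to a new file that the analyzer then reads with its own DNS lookups turned off. The cache is what makes this fast, since a busy site sees the same IPs over and over.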
Another option is a managed log service like Digimine. They work well, but it's a recurring fee since it's a service, not software. And you have to upload your logs to them every day.
There's a company that's been hitting me up lately, I forget their name now. But they have a Linux-based version which has clustering capability. The database is stored compressed in chunks across the entire cluster. It scales linearly, so you can add machines as you need them. They've been taking business away from Digimine and Accrue. They are based in Minneapolis I think, but like I said, I forget their name now. Their software can correlate different logs together too, and get you stats on email campaigns, video streaming, and your webservers. If you're into spending money, this would likely be your best bet.
I would stay far far away from WebTrends if I were you. Webtrends is a sucky product, and you can get the same info with Analog and ReportMagic, for free, and with better performance. 1.8 million hits isn't really that much, so a product like Accrue would likely be overkill. And most companies balk at services since they can't depreciate the expenditure over time, it's an operating cost not a capital expense.
Need Free Juniper/NetScreen Support? JuniperForum
I've read that document before, and I suggest that perhaps you need to re-read it with a more jaundiced eye towards your prejudices.
The document now contains several disclaimers admitting that the author's original conclusions have been undermined somewhat by his own hyperbole, ignorance and by new technology (the original was written in 1995 - in web terms, it may as well be written in hieroglyphics on decaying papyrus) (ok, so that's a little exaggeration of my own... :P ). It's still worth reading, but only after you filter it a little.
In particular, he doesn't account for cookies, which are great for web tracking (personally, I block nearly all cookies, but I don't think that session tracking is a malicious use). Cookies can give you very accurate data on visitor use, and proper reporting can turn that into very useful information.
Also, the points he (or she) and you make about IP addresses vs. sessions vs. users are valid, but overblown. Very few people access the same site from different IP addresses in a given session. You wouldn't want to bet your life savings on these numbers, but they're accurate over 90% of the time, and that's more than enough to get good information (as someone else once said, "Don't believe me? Next time you have a blood test, tell them to take it all to make sure they get an accurate reading.").
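To make the IP-vs-session argument concrete, here is a rough sketch of how a log analyzer might turn hits into "sessions" using nothing but the IP address and an idle timeout. The 30-minute timeout and the `count_sessions` name are assumptions for illustration, not what any particular product does.

```python
from datetime import datetime, timedelta


def count_sessions(hits, timeout=timedelta(minutes=30)):
    """Count IP-based sessions (hypothetical helper).

    hits: iterable of (ip, datetime) pairs.
    A new session starts whenever an IP has been idle longer than `timeout`
    (30 minutes is an assumed default, not a standard).
    """
    last_seen = {}
    sessions = 0
    for ip, ts in sorted(hits, key=lambda h: h[1]):
        previous = last_seen.get(ip)
        if previous is None or ts - previous > timeout:
            sessions += 1
        last_seen[ip] = ts
    return sessions
```

This also shows where the numbers go wrong: two people behind the same proxy collapse into one session, and one person whose IP changes mid-visit gets counted twice. Whether that error is small enough to live with is exactly what the two sides here are arguing about.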
We've used WebTrends for months, and I like them quite a lot. For some things they are excellent; for others, not so. A word about methodology: WebTrends tracking code consists of a primary method and a fallback. The primary method uses JavaScript to compute a compressed string of data including much client information and appends this to an HTML image tag - this data is slurped into a database at WebTrends. If JavaScript is disabled, the hit still gets recorded, but without all the fancy extra info. They try to place a unique, persistent cookie with each image load (once per page).
According to WebTrends, over 95% of our visitors have both cookies and JavaScript enabled.
Their reporting tools are very good and comprehensive, containing everything I've seen from the best log analysis software and some things that software can't get (average screen resolution and window size, for instance - I love this). You can customize content groups to your heart's content by modifying some variables in their JS. Their site itself is well made and smart: their help system pops up a content-sensitive window with information for each specific page; if you click to a new page, the help window is updated. Yes, this is relatively easy to implement, but how many sites do it? Too few.
Now, not all is Madam George and roses (to coin a phrase). I've found that WebTrends reports at best 95% of our traffic. Periodically I run a couple of home-brew Perl scripts on our logs and they always count more hits than WebTrends shows (not an issue with my Perl-fu, BTW). Their tech support is decent, but not wonderful - if you have a real issue, you might run around a little. A couple of times they've flat-out dropped large chunks of our traffic (e.g. 40% for a day), never to be seen again.
Finally, we get about 10% the traffic the original poster does, so I can't tell you how well they scale. They'll charge a pretty penny for that amount of traffic, too.
To summarize (whew): (a) WebTrends is pretty decent, and excellent for some things; (b) IP-based assumptions and cookie tracking can get you very accurate statistics as long as you can live with the limitations.
This isn't as much "normalization" as it is "don't take so many drugs when you're designing tables."
Where is the Free Software alternative? Shouldn't something like this be open source? To me it seems like software like WebTrends won't work with everything all the time. Upgrades to fix bugs would help I'm sure, but in the time it takes to fix them you've lost some hits forever. Better to have a GPL'd alternative IMHO.
I design user interfaces for a free network management application,
Analog is great, and free. I think it's from Cambridge University.
I have used the Webtrends hosted service, WebTrendsLive, and have no complaints.
Applications that run locally are much less expensive, but they put a bigger load on your servers (I don't have a lot of experience with them, though).
PowerPhlogger: http://www.phpee.com
AwStats: http://awstats.sourceforge.net
AXS: http://www.xav.com/scripts/axs
Howdy,
You might like to try Funnel Web Analyzer Standard (free), which we pit against WebTrends's standard Log Analyzer. We used to sell this for $399, but it's now free.
We have an Enterprise version that delves deeper into stats with Clickstreams, etc, but the free version might be sufficient for you:
http://www.funnelwebcentral.com
Cheers,
Suren
The requirement from the IT department was that it had to be able to do a two-pass analysis: the first pass to read all the raw data into a raw database, and the second one to filter through all of that (i.e., hits from within the company and from search bots were discarded) and to generate the reports. The reason for that was that we didn't have room on our servers to store 20 megs of log files a day, and if we suddenly discovered that a certain IP address that had been registering all kinds of hits was actually a searchbot, we'd want to be able to rebuild the database without having to go back to the original log files. At any rate, I spent a solid week on nothing but this, and here is what I found:
1) Webtrends - We already use this one. We don't like it as much because it doesn't track the clicks through the JSP post commands as well as we would like it to. If your company uses HTML pages, then it has a great ability to track users through your site, like what percentage of people who were on the main page clicked on this link, etc. It only uses a one-pass database, so whenever we discover that a certain IP is a searchbot or we need to put on some other filter, we have to have someone go through hordes of data and clean it up a bit. It also has a web interface, so you can just dedicate an NT box (Mod: -1, Suggested Using Microsoft) to hosting the server and analyzing the data, and not have to dedicate anything else to it.
2) NetTracker by Sane Solutions: This is the best that I was able to find. It also has the web interface, and I was able to run the MySQL server, the NetTracker server, and the web browser. It has a one-pass system also, but because it uses a simpler database structure than Webtrends, it's easier to maintain the data. You can use an Oracle database, an SQL database, or its own internal database. It also has the ability to track users through your website. It can export the reports through Microsoft Word or Excel (marketing people love that). It also has the ability to create custom reports easily, so that we don't have to custom make them for the marketing people.
3) The last one is Sawmill. This has all the basic features that NetTracker had, but can only use its own database, and as far as I could tell couldn't export the graphs. I will say, though, that it costs several orders of magnitude less than NetTracker or the full version of Webtrends does.
This is my analysis of web traffic analysis tools. Most of it is more than a month old, and comes from the demos that I could get at the time. If you think that I got something wrong, please post. Hope this helps a bit.
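For what it's worth, the two-pass idea described at the top of this comment can be sketched in a few lines of Python with SQLite. This is only an illustration of the approach, not how any of the three products work; the table names (`raw_hits`, `filtered_ips`) and function names are made up. Pass one stores every hit unfiltered; pass two rebuilds the report from the raw table, skipping whatever IPs are currently known to be bots or internal.

```python
import sqlite3


def load_raw(conn, hits):
    """Pass 1: store every hit, unfiltered (hypothetical schema).

    hits: iterable of (ip, timestamp, url, user_agent) tuples.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_hits "
        "(ip TEXT, ts TEXT, url TEXT, agent TEXT)"
    )
    conn.executemany("INSERT INTO raw_hits VALUES (?, ?, ?, ?)", hits)
    conn.commit()


def build_report(conn, excluded_ips):
    """Pass 2: per-URL hit counts, ignoring the excluded IPs.

    Can be re-run at any time with a new exclusion list; the raw
    table is never modified, so nothing has to be re-imported.
    """
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS filtered_ips (ip TEXT PRIMARY KEY)"
    )
    conn.execute("DELETE FROM filtered_ips")
    conn.executemany(
        "INSERT OR IGNORE INTO filtered_ips VALUES (?)",
        [(ip,) for ip in excluded_ips],
    )
    return conn.execute(
        "SELECT url, COUNT(*) AS n FROM raw_hits "
        "WHERE ip NOT IN (SELECT ip FROM filtered_ips) "
        "GROUP BY url ORDER BY n DESC, url"
    ).fetchall()
```

The point of the design is the second function: when a "visitor" turns out to be a searchbot, you add its IP to the exclusion list and call `build_report` again, instead of having someone clean up an already-filtered database by hand.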
Had to use it on 2 customers' sites running Lotus Domino and IIS. IIS results were barely reasonable; however, if you have a Domino site, avoid Webtrends like the plague.
A year after implementation, and we STILL can't get reports out of it. I think the biggest problem so far is that Webtrends doesn't use MIME types for determining the file type; it uses the .ext filename extension. This is really lame, and a fatal flaw for Lotus Notes based content, as Notes databases can contain anything. Needless to say, it's also standards non-compliant.
As the poor tech who had to live with this crud, I advise you to please stay the hell away. At least if you're in a Notes/Domino environment.
Anyone who considers arithmetical methods of producing random numbers is, of course, in a state of sin.-John von Neumann
http://ask.slashdot.org/article.pl?sid=02/04/23/0247226&mode=thread&tid=106
Certainly my most favorite troll post, by far.
If your marketing department is looking for marketing information, the IT department will have a hard time meeting their needs. They should look at a solution specifically built for their needs. Among the ones that come to mind (Accrue, Keylime, etc.), the clear leader is Coremetrics. IT doesn't have to worry about it and marketers are given all the advertising-to-conversion numbers that they need.