Slashdot Mirror


On Maintaining httpd Logs...

A nameless submitter dropped this in my in-bin: "I help run a site that's rapidly gaining popularity. However, I wonder how other people out there handle the large amount of logs that are generated on a busy system. How long do you keep all those Apache logs? What about the messages logs, etc.? They take a lot of space and quite often I just don't think they're worth keeping around. Thoughts?" My thoughts on this are simple: If you are serious about your site, then the logs are worth keeping. You don't have to keep them online (tape backups work well here), but the statistics within can give you valuable information on the future handling of your site. Any other thoughts?

3 of 13 comments

  1. Re:On that note, Web Log Parsers? by Paulo · · Score: 2

    Analog?

    (Don't remember the URL now, but I'm sure that you'll find it easily in any search).

  2. What to do with logs? by dlc · · Score: 3

    In my view, the logs themselves aren't as important as the information they contain. Therefore, use a comprehensive analysis tool (one of the commercial tools, a free one written in Perl, or one you write yourself), extract the relevant information, and then remove your logs.

    Tape backups do indeed work well here, but not all log entries are created equal. If your site is very image-heavy, you probably don't want to keep the thousands of entries for each inline jpeg; you want the records of the page views.

    Sites running Apache/mod_perl (or sites where the administrator is not afraid of Apache and their C compiler) can modify Apache so that it logs only what you want. A PerlLogHandler under mod_perl with return DONE if $r->content_type =~ /image/ at the top will save you hundreds, if not thousands, of (possibly useless) entries in your log files. On the other hand, a 30 Gig tape will hold years' worth of bzipped logfiles...
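
If changing what Apache logs isn't an option, roughly the same trimming can be done after the fact, when the logs are archived. The following is only a sketch in Python, not the mod_perl handler described above: it assumes image requests can be recognised by file extension (a plain access log line carries no content type), and the names drop_image_hits and archive_without_images are made up for illustration.

```python
import bz2
import re

# Assumption: inline images are identified by file extension in the request
# line, since the access log does not record the response content type.
IMAGE_REQUEST = re.compile(
    r'"(?:GET|HEAD) \S+\.(?:jpe?g|gif|png)(?:\?\S*)? HTTP/[\d.]+"',
    re.IGNORECASE,
)


def drop_image_hits(lines):
    """Yield only the log lines that are not requests for image files."""
    for line in lines:
        if not IMAGE_REQUEST.search(line):
            yield line


def archive_without_images(src_path, dst_path):
    """Copy src_path to a bzip2-compressed dst_path, skipping image hits."""
    with open(src_path, encoding="latin-1") as src, \
         bz2.open(dst_path, "wt", encoding="latin-1") as dst:
        dst.writelines(drop_image_hits(src))
```

Something like archive_without_images("access_log", "access_log.bz2") (both paths are placeholders) would then leave a compressed file holding only the page views, which is the part worth putting on tape.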

    darren

    --
    (darren)
  3. Learning experience by h2odragon · · Score: 2

    I'm kinda surprised that nobody's mentioned it yet. Building a log statistics and storage system is an excellent way for somebody to pick up Perl or Python knowledge, or to enhance what they have. Use Apache's CustomLog directive to get referer and user-agent info in your logfiles, and let your imagination run wild as to what kind of data can be mined out of them. User tracking from page to page doesn't require cookies. As compressible as log data is, there's no real excuse not to save it. If you've got enough traffic that logs are taking up disk space you want, you've already got a tape drive or something (right? you'd better...)
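
As a starting point for that kind of mining, here is a rough Python sketch. It assumes the logs are written in Apache's "combined" format (host, ident, user, time, request, status, bytes, referer, user-agent), which is what a CustomLog line using the combined nickname produces. The function names are invented for this example, and treating host plus user-agent as one visitor is only a heuristic: proxies and shared machines will blur it.

```python
import re
from collections import Counter, defaultdict

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse(line):
    """Return a dict of fields for one log line, or None if it doesn't match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def page_transitions(lines):
    """Count (referer, requested path) pairs: which page led to which."""
    counts = Counter()
    for line in lines:
        hit = parse(line)
        if hit and hit["referer"] not in ("", "-"):
            counts[(hit["referer"], hit["path"])] += 1
    return counts


def visitor_paths(lines):
    """Group requested paths, in log order, by (host, user-agent).

    Assumption: host plus user-agent approximates a single visitor.
    """
    paths = defaultdict(list)
    for line in lines:
        hit = parse(line)
        if hit:
            paths[(hit["host"], hit["agent"])].append(hit["path"])
    return dict(paths)
```

Feeding the same open logfile to page_transitions gives a crude "where do people go next" table, while visitor_paths gives per-visitor click trails, all without a cookie in sight.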