How Should an Application's Logs Work?
emmjayell writes "You've been there, loaded up a new application (think server-based app like Apache or Samba ...), it's working okay for a few days or a few months, then the intermittent problems start. Usually it's the CEO or someone else of relative importance that is the first victim. You can't readily duplicate the problem, so you go to find out where the application put's it's logs - maybe it's in var/log/messages - maybe in it's own directory - sometimes it's right there and available in some administrative GUI. So what makes you happiest when diagnosing the problem? Do you want tools to access it? UI or command line? Do you want it formatted to use tools like cut and sed? Do you have any examples of an app that does a great job with system logging and diag logging? Background: My team is working on an application that is gearing up for a first release. We have a logging framework in place already (we are using Apache: logging.apache.org/) -- so that covers how we are logging, but not what we should log and how it should be laid out for optimal use."
JBoss 4. Even though I can't figure it the first time most time I look at it, the answer is always in the log.
Even though it is resource intensive, I prefer the developer log everything, and let me decide how verbose or terse I want the logs to be.
-- Bryan
Event log. Standard place to find all logs.
Pay girls to strip!
Should the administrator of the software already know where the log file is saved? If he/she wanted the application to create a log file, then he/she would have made sure it work before unleashing the application.
I just pooped your party.
Any kid of log is fine with me as long as it's there and it gives me some kind of insight into what's going wrong, e.g. "can't open this file," "that file's corrupt," "null pointer." Of course, text files are nice, because you can actually search through them.
Sadly, most applications for M$ operating systems usually just leave things like, "Error #543892157893421 occured." When you go to look up what error 84901257893423 is, no one in the world seems to have had it. Tech support proceeds to blame your hardware vendor, who blames your software vendor, ad nauseum. Seems like most applications for m$ operating systems just pull error numbers out of their asses.
Join the Slashcott! Stay away entirely Feb 10 thru Feb 17! Close all tabs to prevent autorefresh!
Maybe you want to know
/registered user did
When from Where did What by Whom
When = ISO 8601 Timestamp (from)
Where = IP Adress / Name of computer...
Who = which Login
What requested file foo/bar?a=213b=dfg
Grundgesetz * 23. Mai 1949 - 30. November 2007 - http://www.vorratsdatenspeicherung.de/
Shouldn't the administrator of the software already know where the log file is saved? If he/she wanted the application to create a log file, then he/she would have made sure it worked before unleashing it.
Comment removed based on user account deletion
I say this as a sysadmin for a large number (several thousands) of servers. So many problems could be solved by Apache et al using, and modifying where necessary, the existing solutions. But they don't, they roll their own, and so we see problems.
I realize this is a sorta OT rant, but it's causing major problems for me at work and so I think it's justified. Stuff like rotatelog has functionality to tell Apache "dump your logs, I'm rotating the file!" (which by the way doesn't always work) which could be easily rolled into a single notification of the syslog daemon. But it isn't. For whatever perverse reason, Apache and many others decided to roll their own.
It pisses me the fuck off, having to deal with it. The greatest shortcoming by far of OSS is this insistency on reimplemtning proven, robust, existing solutions in favor of a trivial fix. This is a particularly egregious example, one the OSS world would be served well by acknowledging.
as I once said to a colleague. /var/log
/var/log/appname/* is good.
If you have simple logging needs, log via syslog and leave the details to the site.
For more complex needs, especially if you have several logs,
Obviously, the logs should be a text file. You ask if special tools should be provided. For text files we already have grep, sed, awk, perl.
The exception is if you are providing some kind of administrative GUI, say a web app. Logs that relate to specific functionality should be near the controls for that functionality. By using a GUI you are saying "I don't want to get my hands dirty" which, for time-pressed admins, is a perfectly legitimate approach for apps with complicated configuration architectures (Sendmail, WebShere 5). So the GUI should take away the complexity of having to know where the logs are. It should always be possible, though, to get at the text of the logs and run standard tools against them.
MHO.
Yours Sincerely, Michael.
My favorite logs are the ones where I get control over what events get logged and in what detail they get logged. There's no such thing as a one-size-fits-all software solution; why do we believe in one-size-fits-all logging?
The alternative is to log everything in great detail, but do so in such a way as to make it truly trivial for me to strip out everything except the specific events in which I'm interested, in the level of detail in which I'm interested.
That's the message I get in my Grammar Nazi log when I put your submission through my "Grammar Macht Frei 2.0" application.
is easily the best logging package.
Configurable log levels, and you can define your own appenders.
Eg, I typically configure an email appender for severe/fatal errors, so they come straight to my inbox. Often I know of problems before the users do as a result of this.
Also, something described in one the Pragmatic Series of books is an RSS appender - just point your RSS reader at the channel, and wait for any errors to occur.
Write to a god damned text file. Add a timestamp. Be verbose. Is it really that hard to figure out?
Here's a great app to look at both for how to do logs right and how to do them wrong. The default setup gives one line for each email, including time/IP addr/status, all of which is good. It can also be incomprehensible and not well documented - bad.
Intron: the portion of DNA which expresses nothing useful.
http://www.faqs.org/docs/artu/ch05s02.html
------ Take away the right to say fuck and you take away the right to say fuck the government.
I really don't much care where logs are kept or what particular format they are in. However, it's important that the man page tells me where the logs are, and clearly documents the format of the log files. What do flags mean? What do particular messages mean?
Also, formatting the logs in such a way that they can be quickly searched with grep or parsed by a simple script is most helpful. One of my favorite loggers does this:
This lets me see everything in chronological order, but I can quickly parse the log. Splitting on ':' will yeild the first two feilds consistently, and the first four chars are *always* the type of log message. So doing something like:
Lets me immediately see all the errors for a given day. The key to good logging is, IMO, making sure that the logs can be parsed effectively.
We may not imagine how our lives could be more frustrating and complex—but Congress can. – Cullen Hightower
Comment removed based on user account deletion
Whatever your logging strategy, please remember to rotate your logfiles.
/var or your custom directory under /a
You'd be amazed how many Internet applications crash simply because the logs fill up the partition that hosts the application.
This problem was threefold, because:
1. The logfiles should never get that size
2. Logs should usually exist on a seperate partition like
3. People usually ignore the messages in a Production debug log
I run 12 moderate websites. We easily generate 5GB of logfiles a week (Rotated). It used to be 10GB, but then I switched off debug logging and fixed the most common errors.
When I started my current job 1 year ago, I deleted 40GB of logs which ran back to year 2000.
94% of Repubs and 21% of Dems voted to renew the Patriot Act
So, in order to make storage, analysis and reporting easy, your framework should attempt to coerce a consistent approach to the data logged - even the plaintext "human readable" data if you can. If you can do the same with metadata about the event (e.g. ID fields, links to online KBs etc), so much the better.
BB
MS requires that non-MS software sets bit 29 on all custom error codes, giving absurd decimal interpetations of these codes.
-mkb
It is nice to allow the end user to specify the verbosity of the log file - do you log each request, only failures, or every entrance/exit to a function? If the log is to assist in diagnosing the error, it is nice to be able to turn on extra information that would normally just quickly fill up disk space.
I agree with what others have said--I don't care about the format too much as long as it's text and has a timestamp in front.
/blah/blah" is infintely better since it lets you actually fix the error without looking something up on google. Speaking of that, if you *do* have some kind of strange error you are reporting and you know it's not going to make sense, at least make it long and unique so I *can* look it up on google. Terseness can be a good quality in a program, but it is *not* a good quality in an error log.
What really matters to me is that errors get logged nicely. "Error number #57575" or even "permission denied" doesn't help at all. It needs to be specific "permission denied while opening file
Just remember, the people installing your software and using probably don't the first thing about the way it works on the inside. You have to explain to them what went wrong and give them clues on how to fix it in a short error line.
Oh, it's also nice to have a link back to the original source line that caused the problem. In C, use __FILE__, __LINE__, and __func__ (if you have it) so that if I'm really stumped I can download your source, quickly find the error and start working backwards to find the cause. However, this could get confusing if there are a number of different versions of your code floating around, and it's not as important as the other things I mentioned.
-David
There. Now go play some cool javascript games!
To paraphrase an Apache (1.x) error I recently encountered, was: "The client was denied access by configuration rule.".
First of all: which rule did the denial? Second: does the program recall the file or line this rule was in or on? Simply having the text or location of a rule would be a huge help in debugging.
pflogsumm.
Set it up with cron, get a mail every day and keep them for a few days. Saves you a lot of headache.
For all other stuff use egrep (or grep), awk and sed. I did my own scripts to search for specific abuses. vim in command line mode may also come in handy.
Meme of the day: I browse "Disable Sigs: Checked". So should you.
Related to logging, is program output. If you are writing tools that run at a Unix console, then someday you'll want to run them from scripts and use their output:
1. Use stdout and stderr appropriately: I should be able to run "foo > file" and see warnings on my console, but only get useful data in the file. Don't write messages to stdout if they aren't useful output. C++ has std::clog too. Don't make scripts use "grep -v" to remove random status messages or whatever.!
2. prefix your warnings/errors with the name of the program. So if a bunch of programs run in a script-- or in the background-- you can sort them out.
3. Indicate if something in an error or a warning.
4. Don't get too emotional, unless it's really neccesary:
Bad: WARNING! SOMETHING REALLY HORIBLE HAPPENED! AHH!!!
Good: fooprogram: Warning: Something really horrible happened!
Notice that you can use a ! at the end for emphasis, but it's not obnoxious.
Make reasonable suggestions if you think an error might be caused by user error:
fooprogram: Error: No status file found. (Use fooprogram -s to initialize.)
fooprogram: Error: No response from server at localhost:2323. Did you run it first?
(But try not to be too patronizing!)
5. Use error codes. (in exit() or return from main). Make an enum or #defines somewhere. Document them in the man page and/or --help output.
What the heck is a bind variable?
While we are on the subject, let me make the following recommendations :
1. When your app creates a new log file, put in a header. In the header put the following information : retention period, logging level, full application path and name, where the source code to this application is located, who (at the time the code was last changed) can accurately interpret the contents of this log file, the original intent / purpose of the log file, the time / date stamp of the creation of the file. If you can look at the header of any log file and understand all that you are half way there.
2. If your log file is for debugging purposes, put in a line with ProgramName::SourceCodeFile.ext::RoutineName() and include any parameters passed in when the routine was called. If you know the parameters most of the time you can recreate the problem and step through the code by hand.
3. You can put whatever you want in if there is an error, but for God's sake don't suppress the Java dump (stack trace) that comes at the end. Nothing worse than having absolutely nothing to work with when debugging a crash.
4. Consider having the app create a new file on a per-day basis. It doesn't take any more room (don't forget the header as it is a new file) and you don't have to kill the application to delete the old log files. Keeps the log file sizes manageable.
Glonoinha the MebiByte Slayer
+1, informative
Join the Slashcott! Stay away entirely Feb 10 thru Feb 17! Close all tabs to prevent autorefresh!
You know what logs I really like? The Windows 2000 system logs. All the services write in them, so there is never any hunting for files. They can have a lot of information packed into them, if the application takes some time to do it. They rotate automatically. They have severity icons so you know which ones are the errors and which are not. And they are all in a nice GUI list, so you don't need a command line PhD to view them. User friendly, indeed!
I usually make a list of event codes to help ID the type of event or error and it's serverity. Then I make sure that whoever reads the log entry can find the line of code that generated the error.
....
For example, if I am coding in perl on apache I might have a handler called 'employees.pm'
sub add_employee
{
try { [something that generates an error]}
catch {
log("Error 1001 at employees.add_employee.100: Can't find the file $filename");
}
try { [something else that generates an error]}
catch {
log("Error 2001 at employees.add_employee.110: Bad ID of $ID assigned");
}
try { [something that generates an error]}
catch {
log("Error 1001 at employees.add_employee.120: Can't find the file $path/$somefile");
}
}
This way I can easy find exactly where in the code the error was generated. I can also create a cron to grep the log for certain type of error codes and SMS a warning to my mobile.
I'm sure there are perl modules that make it easier to do this, but I've been doing it this way for a while without trouble.
Peace, or Not?
We use log4j, which lets us dynamcally flip into debug mode, with more verbose info, when we want. So I've recently added support in our app for turning on debug mode logging on the fly. That's pretty big if the app's going into a hosting center...restarting the app may mean a walk to a secured room in another building, or at least remoting into the box. But even then, there's work in progress with that app you may not want to lose. What I did was set up a timer so that debug mode is turned off automagically after 24 hours, just in case the hosting center staff forget to turn it off. Be sure to test for screwy conditions that can cause the log to fill with junk.
So your log's filled with interesting tidbits of code gone wild...who is going to find out about it? Good options are: - SNMP (for hosting centers) - Email (for hosting centers, and otherwise) - Sysout (if somebody watches the screen)
Make sure you can express every single conceivable output line in each log file using the *same* regular expression. If you have optional fields (that don't appear in every line), use a set of containing braces if necessary, to keep the field mappings consistent.
The degree to which this simplifies logfile analysis (of any kind) is staggering..
Do not invent your own.
Use the openlog(3) and syslog(3). Document the facility you are going to use (like LOG_LOCAL0) and allow it to be changed through config-file or command-line option.
Once there, use different priorities for messages of different importance (from LOG_DEBUG to LOG_EMERG) -- the verbosity can be set once during the config-processing through the openlog(3).
There are tools to process such logs, rotate them, and to send messages to a central location. You don't need to worry about timestamping either. On better OSes (with modern syslog-daemons) run-time processing of certain messages (identified by the above-mentioned facility) is possible -- a sysadmin will be able to set things to her own liking.
If you have to support Windows -- it has logging too, and there are API-compatible syslog.h implementations.
In Soviet Washington the swamp drains you.