Am I the only person who doesn't want logging to block or buffer? Streams are great until you realize that when the thing you are piping stdout to blocks, your program goes to sleep too... all because something couldn't process your debug logs? Files have the same problem... no space left on device, so all your apps have to crash!?
Using non-blocking IO is a solution, but then when the fd is unwritable for a while, you buffer data and your machine runs out of memory. So also not good.
I'm a big fan of the strategy Varnish uses. It mmaps a block of memory, and then treats it as a ring buffer. A separate program can connect to this and read the logs to a file or a pipe, but if that program fails, you don't stop serving web pages. As long as allocated memory doesn't randomly disappear and the Universe doesn't implode, logging adds no overhead or risk of failure to your app.
Logs are essential, but I don't necessarily want a failure in the logging system to break my app. When you use files or pipes, you run this risk.
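For what it's worth, here is a minimal sketch of the mmap'd ring-buffer idea in Python. This is just the concept, not Varnish's actual shared-memory log format; the path, sizes, and layout are made up for illustration.

    import mmap, os, struct

    # Sketch of an mmap'd ring-buffer log: a shared, fixed-size region the writer
    # can always append to. Layout: an 8-byte write cursor, then the ring itself.
    RING_SIZE = 1 << 20   # 1 MiB of log space
    HEADER = 8            # room for the cursor

    def open_ring(path):
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        os.ftruncate(fd, HEADER + RING_SIZE)
        return mmap.mmap(fd, HEADER + RING_SIZE)

    def log(ring, message):
        """Append a record; old entries get overwritten, and nothing ever blocks."""
        data = message.encode() + b"\n"
        (cursor,) = struct.unpack_from("<Q", ring, 0)
        for byte in data:
            ring[HEADER + (cursor % RING_SIZE)] = byte
            cursor += 1
        struct.pack_into("<Q", ring, 0, cursor)

    ring = open_ring("/tmp/app.ring")
    log(ring, "GET /index.html 200 12ms")

A separate reader process can mmap the same file and chase the cursor; if it dies or falls behind, the writer neither blocks nor buffers, it just keeps overwriting.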
For this very same reason you should have separate ring buffers for different types of logs. I would prefer to overwrite INFO entries but not access entries, because we frequently have to deal with court orders that require us to provide access logs. It's relatively rare to use production server error logs - we only need them when things go very wrong, and we try to avoid that.
This is not entirely on topic, but I'm still blown away by the UDP approach from the other day. Send a UDP packet to your log collector and forget about it. It's lightning fast, the recipient can drop packets if it can't handle the load, and your program will never be any the wiser.
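For anyone who hasn't tried it, the fire-and-forget part is about as small as logging code gets. A sketch in Python, with a placeholder collector address:

    import socket

    # Fire-and-forget UDP logging: sendto() on an unconnected datagram socket
    # doesn't wait for the collector; a dead or overloaded collector just means
    # dropped packets, never a stuck application.
    LOG_COLLECTOR = ("10.0.0.42", 5140)   # placeholder address and port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def udp_log(message):
        try:
            sock.sendto(message.encode(), LOG_COLLECTOR)
        except OSError:
            pass  # never let logging take the app down

    udp_log('severity=info component=web msg="cache miss for /index.html"')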
Last year our upstream provider broke TCP (they upgraded a router/firewall box that wasn't supposed to affect us, but it dropped asymmetrically routed TCP traffic), yet UDP packets still got out, and seeing UDP syslog traffic make it out helped us diagnose the issue. Using the "more reliable TCP method" (rsyslogd supports this) would have made the issue harder to track down.
Syslog is more about human-readable logging, but you're right, it's more relevant in this context. The graphite approach was about fast aggregation of samples (app response times, load, etc) and the graphing of these.
Syslog had blocking and non-blocking writes, depending on the severity of the log. I suspect that it's still that way in syslog-ng, but I haven't looked at it recently.
You can log to mongodb capped collections. You get a semi-structured ring buffer with fast writes. I'm not a huge mongo fanboi, but this is one situation I've seen it excel in.
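For the curious, setting one up is a couple of lines with pymongo; the database and collection names here are made up:

    from datetime import datetime, timezone
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    db = client.logs

    # A capped collection is effectively a ring buffer: fixed size, insertion
    # order preserved, oldest documents silently dropped once the cap is hit.
    if "app" not in db.list_collection_names():
        db.create_collection("app", capped=True, size=50 * 1024 * 1024, max=100000)

    db.app.insert_one({"ts": datetime.now(timezone.utc), "level": "info",
                       "component": "web", "msg": "cache miss for /index.html"})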
"Logs are a stream, and it behoves[sic] everyone to treat them as such. Your programs should log to stdout and/or stderr and omit any attempt to handle log paths, log rotation, or sending logs over the syslog protocol."
If one were being pedantic, one might say that logs are a repository of temporally tagged notifications of events that occurred in the past. You can store these temporally tagged event streams in files, in a memory cache, in an email inbox, or in a round-robin database (RRD).
Not particularly profound, but useful. Logs from disparate processes that share a common time basis, however, are the systems analyst's go-to tool for trying to untangle the secondary and tertiary properties of a loosely coupled system.
"Everything is a file" is the biggest myth of UNIX. Everthing is represented by an integer that we call a "file descriptor", that's true. But try calling "accept" on a file descriptor you made with "open", or try calling "sendfile" on a pipe instead of a socket. Doesn't work.
I'm well familiar with the expression. However it's not helpful to someone who isn't already familiar. It's another form of "you're doing it wrong" without explanation.
Not really. It's a completely different statement than "everything is rocks". If everything is a file then everything shares properties common to files. It turns out that this is really useful. If everything were a rock then... well, let's say that it would not be so useful :-)
This assumes, though, that you only have a single log. Often I find breaking up logs based on concerns/packages/names or even level to be useful, which you could not do without reinventing the wheel of output filtering.
This also assumes that anyone is actually really reinventing the appenders as most frameworks already have a library of common appenders.
I would argue that you want to do the filtering at the aggregation layer and not at the application level. So while you may want to set different flags or parameters that will be logged (i.e. severity, component, etc.), those parameters should be injected into the log stream rather than into alternative streams. Let the aggregation point use those clues to filter the logs appropriately.
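Concretely, that can be as simple as one structured line per event on a single stream, with severity and component carried as fields. The field names below are just an example, not a standard:

    import json, sys, time

    def log(severity, component, message, **fields):
        """Emit one JSON object per line on stdout; the aggregation layer
        filters or routes on the embedded fields instead of on separate streams."""
        record = {"ts": time.time(), "severity": severity,
                  "component": component, "msg": message, **fields}
        sys.stdout.write(json.dumps(record) + "\n")

    log("error", "billing", "charge failed", order_id=1234)
    log("info", "web", "request served", path="/index.html", ms=12)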
This should definitely take place at the aggregation level. There are great tools for this already. This is definitely an implementation detail we developers should delegate to the systems guys.
Not to mention, it's just less development work to send out messages with, say, different syslog levels and be done with it.
If they wish to log to syslog, each program needs to implement the syslog protocol internally.
This is simply not true. I changed the logging method of a C program recently from timestamped text file to syslog in about 10 lines... Utterly trivial in any language.
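The C change itself isn't shown here, but the same idea in Python is a handful of lines against the standard syslog module, which should give a sense of how small the switch is:

    import syslog

    # Roughly the whole change: open a syslog connection once, then route
    # messages through it instead of a hand-rolled timestamped text file.
    syslog.openlog("myapp", syslog.LOG_PID, syslog.LOG_DAEMON)

    syslog.syslog(syslog.LOG_INFO, "server started")
    syslog.syslog(syslog.LOG_ERR, "could not open config file")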
So it's basically arguing that server-side code benefits from logging to stdout rather than a named file? Interesting idea if you aren't using stdout for something else, but it doesn't seem particularly radical since you can just use cat.
A few months ago I published a suite of scripts that allow me to run complex queries using shell pipes. They work by communicating in YAML. The first stage, cat_logs, simply parses logs from a file and emits yaml for each line. cat_logs understands that log files are split across multiple files, and that they may be gzipped.
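The scripts themselves aren't reproduced here, but a rough Python sketch of what cat_logs is described as doing (the YAML keys are my guess, not the actual output format) might look like:

    import glob, gzip, sys
    import yaml  # PyYAML

    def open_log(path):
        """Transparently open plain or gzipped log files."""
        return gzip.open(path, "rt") if path.endswith(".gz") else open(path)

    def cat_logs(pattern):
        """Emit one YAML document per log line, across rotated/gzipped files."""
        for path in sorted(glob.glob(pattern)):
            with open_log(path) as f:
                for line in f:
                    yaml.safe_dump({"file": path, "raw": line.rstrip("\n")},
                                   sys.stdout, explicit_start=True)

    cat_logs("/var/log/myapp/access.log*")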
Even better, logs are databases, and not a linear sequence at all. Often, I find that I am interested in some very specific subset of what happened; between these dates, from this IP address, in this module, errors of this kind, etc. If log entries were put in a database with a well-thought out schema, using the log in this way would be trivial.
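As a sketch of what that buys you, here's a made-up schema in SQLite; the point is that the "between these dates, from this IP, in this module" question becomes a one-line query:

    import sqlite3

    db = sqlite3.connect("logs.db")
    db.execute("""CREATE TABLE IF NOT EXISTS log (
                    ts      TEXT,   -- ISO-8601 timestamp
                    ip      TEXT,
                    module  TEXT,
                    level   TEXT,
                    message TEXT)""")
    db.execute("INSERT INTO log VALUES (?, ?, ?, ?, ?)",
               ("2012-03-01T12:00:00Z", "192.0.2.10", "auth", "error", "bad password"))
    db.commit()

    # "Between these dates, from this IP address, in this module, errors only":
    rows = db.execute("""SELECT ts, message FROM log
                         WHERE ts BETWEEN ? AND ?
                           AND ip = ? AND module = ? AND level = 'error'""",
                      ("2012-03-01", "2012-03-02", "192.0.2.10", "auth")).fetchall()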
Another project worth mentioning (although "as a service") is http://www.graylog2.org/ The web interface is not there yet (no complicated queries, or easy drilldown), but it definitely performs well and has an interesting design.
Interesting project, but I was hoping that it was something that it isn't. I've been looking around for something that would do ad hoc querying and graphing of 'stuff' from the debug and production logs that I have on hand, stuff like page gen time statistics vs time, or split by heaviest pages, or split by web app machine. And then follow that till I find something actionable.
I've taken a look at statsd+graphite, but that seems to be much more of a realtime solution, and I want to mine the logs I have on hand.
Statsd takes realtime info and dumps it in. You have to at least know what you're measuring when you start. I've got a month or two of debug logs, and I don't know where the gems are yet, though I'd probably start with timings and go from there.
I disagree. Logs and streams are very different kinds of output. For number one, streams are usually more fluid, and I tend to have a higher volume of stream output than logs. For number two, it's more important to produce logs at regular intervals, IMO.
There are differences between tail -F and tail -f.
tail -f will continue reading a file for as long as possible, but if that file is renamed underneath it, then tail won't know, and you will be wondering why you don't see any more log lines being output to your terminal.
tail -F checks to see that the file we opened is still in the same ___location as before, so if the file is renamed underneath us, and a new file is created in its place (think log rotation) then it will open that new file and continue outputting data to its standard out.
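The rename detection is easy to see in a small sketch: roughly, compare the inode behind the name with the inode of the file you have open, and reopen when they differ. This is a simplified Python version of that idea (no handling of truncation, paths are hypothetical):

    import os, time

    def tail_F(path):
        """Follow a file across rotations, roughly the way tail -F does."""
        f = open(path)
        f.seek(0, os.SEEK_END)
        while True:
            line = f.readline()
            if line:
                yield line
                continue
            try:
                # If the name now points at a different inode, the file was
                # rotated: reopen so we follow the new file, not the old one.
                if os.stat(path).st_ino != os.fstat(f.fileno()).st_ino:
                    f.close()
                    f = open(path)
            except FileNotFoundError:
                pass  # rotation in progress; keep waiting
            time.sleep(0.5)

    # for line in tail_F("/var/log/myapp/access.log"):
    #     print(line, end="")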
I wish there were a tail -f <wildcard> that would keep following even when new files matching the pattern are created. My systems timestamp their logs by hour, and more than once I've been confused as to why the app just "stopped", only to realize that it's five past.
I am on my phone now (and, while its heart is a Linux kernel, it doesn't have a terminal, bash, or tail). Maybe if you don't expand the wildcard, tail will detect new files.
I think it would be wonderful if everyone started logging to twitter.
Then you can just follow each machine to get a read of its logs... or, at least pipe the output of LogWatch or something like it to twitter.
Maybe for 4/1/2012 - we should organize all sys admins to output logs to twitter randomly choosing @jack or whomever else and flood them with @ log bursts...