Centralized Logging With Rsyslog (urbanairship.com)
35 points by gmcquillan on Oct 5, 2010 | 19 comments



I've had much better results from using Syslog-NG along with php-Syslog-NG.

Syslog-NG can already split your log files into subdirectories with the hostname of each server, but it also has the capability of redirecting messages to named pipes. This is great because you can pipe it into mysql and stuff all of your log messages in a database. Combine with a php front-end and now your developers and sysadmins can search logs intelligently across multiple servers, and get really fine-grained on their search strings. Want to tail the output from all Tomcat servers in your app server pool looking for a specific string? Go right ahead.
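Roughly, that setup looks like this (a sketch only; the hostnames, paths, and the MySQL-loader pipe are illustrative):

    # receive remote logs
    source s_net { udp(ip(0.0.0.0) port(514)); };

    # split into per-host directories using the $HOST macro
    destination d_hosts {
        file("/var/log/remote/$HOST/messages.log" create_dirs(yes));
    };

    # also feed a named pipe that a loader script reads and inserts into MySQL
    destination d_mysql_pipe { pipe("/var/run/syslog-ng/mysql.pipe"); };

    log { source(s_net); destination(d_hosts); destination(d_mysql_pipe); };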


One big advantage of rsyslog over syslog-ng is that you can spool messages to disk if the remote syslog server is down (syslog-ng only offers this in their 'enterprise' paid version).
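For reference, a sketch of disk-assisted spooling in rsyslog's legacy config syntax (the hostname and sizes are just placeholders):

    $WorkDirectory /var/spool/rsyslog      # where queue files are kept
    $ActionQueueType LinkedList            # in-memory queue...
    $ActionQueueFileName fwdq              # ...that spills to disk when needed
    $ActionQueueMaxDiskSpace 1g            # cap the on-disk spool size
    $ActionQueueSaveOnShutdown on          # persist the queue across restarts
    $ActionResumeRetryCount -1             # retry forever while the loghost is down
    *.* @@loghost.example.com:514          # @@ = forward over TCP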


I've been pushing to implement this for our application, but I'm told that we used to, and had to turn it off because it would saturate the IO of the logging server.

Has anyone else experienced this? Is it just a simple configuration tuning problem?


it would saturate the IO of the logging server.

Rather unlikely unless your deployment is very large or you're doing extraordinarily expensive filtering/binning on the sink-host.

The first bottleneck is normally disk space, not disk I/O. Those logs pile up very quickly, depending on how long you retain them.

The raw network and disk I/O, however, are rarely of concern. Before you approach either limit you're already logging to the tune of ~300G per hour - and have probably switched to a distributed architecture of some sort long ago.

Writing broad streams of sequential text is very cheap.

Making sense of what you wrote, ideally before actually writing it, but at the very least before being forced to purge it due to storage constraints, is the difficult part. ;-)


I've had to up the maximum simultaneous connections, but otherwise I've been able to log extremely large volumes of data from thousands of machines with syslog-ng. I'd assume the same is true with rsyslog.

Raising the limits on open files and tuning TCP timeouts and cookies were all we needed. Total volume was on the order of high tens of GB.
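In case it helps anyone else, the relevant knobs were roughly these (a sketch; the numbers are illustrative rather than what we actually run):

    # syslog-ng.conf: raise the per-source connection cap
    source s_net {
        tcp(ip(0.0.0.0) port(514) max-connections(2000));
    };

    # and raise the open-file limit for the daemon, e.g. in its init script
    ulimit -n 8192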

If you want to unpack offline, contact details are in my profile.


It definitely helped to separate the incoming queue from the actions queue on the central loghost. Unfortunately, I don't have good data comparing the two. At the moment our loghost is an EC2 small instance, so it's relatively susceptible to I/O disruption, and we haven't had any problems.
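A sketch of that separation on the loghost, in rsyslog's legacy syntax (queue sizes and thread counts are illustrative):

    # main (input) queue: decouples reception from processing
    $MainMsgQueueType LinkedList
    $MainMsgQueueSize 100000
    $MainMsgQueueWorkerThreads 2

    # give the expensive action its own queue so it can't block reception
    $ActionQueueType LinkedList
    $ActionQueueWorkerThreads 2
    *.* /var/log/central/all.log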

If you're running a whole datacenter's worth of logs into your loghost, then you might want to consider a distributed approach. For medium-sized companies, I don't see why rsyslog wouldn't do the trick.


It depends - if you're logging from thousands of machines or services, you might want to use UDP logging. Then, provided you don't saturate the network connection, you should be fine. UDP is connectionless, so you definitely won't have as much overhead.
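A minimal UDP setup in rsyslog, for illustration (the loghost name is a placeholder):

    # on the clients: a single @ means UDP forwarding
    *.* @loghost.example.com:514

    # on the loghost: accept UDP syslog
    $ModLoad imudp
    $UDPServerRun 514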

On the other hand, a lot of high-security environments want to encrypt their syslog traffic using something like stunnel, which introduces OpenSSL overhead as well as TCP connection overhead. With thousands of clients and lots of encryption going on, you're definitely going to hit some limits sooner rather than later. Check the kernel parameter net.ipv4.ip_local_port_range (on Linux) and make sure you have a large enough range to accommodate all of the clients.
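For example (a sketch; ports and hostnames are illustrative), on the sending side you might wrap syslog in stunnel and widen the ephemeral port range:

    ; stunnel.conf on the client: local plaintext in, TLS out to the loghost
    [syslog-tls]
    client  = yes
    accept  = 127.0.0.1:5140
    connect = loghost.example.com:6514

    # check and widen the ephemeral port range (Linux)
    sysctl net.ipv4.ip_local_port_range
    sysctl -w net.ipv4.ip_local_port_range="10240 65535"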


Protip: Use syslog-ng.

Besides longer log messages (arbitrarily long, with a recompile) and reliable delivery, it obviates my main use for logrotate, since it can be configured to write to a filename (including directory) based on time, date, or other variables.
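Something along these lines (a sketch; $HOST/$YEAR/$MONTH/$DAY are standard syslog-ng macros, the path is illustrative):

    destination d_dated {
        # one file per host per day; no logrotate needed for the naming
        file("/var/log/remote/$HOST/$YEAR-$MONTH-$DAY.log" create_dirs(yes));
    };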


Protip: look at rsyslog, syslog-ng, and splunk and decide what is right for your environment.


The "pro" that I am is a sysadmin, and I'm asserting that evaluating all three is a waste of time.

Splunk, given its cost and complexity, is almost never right for startups.

Non-ng syslog is, on the other hand, so simplistic that it's not worth the effort of fancy configuration. Is there some kind of compelling advantage that I've been overlooking?

I never quite understood the conceit that every environment is a precious-and-unique snowflake requiring careful evaluation of any given tool.


Turns out I am a sysadmin as well, and I'm asserting that each has various strengths. I have used syslog-ng as far back as 2001, so I have some experience with it. Today I would recommend rsyslog. It is the default logger in Ubuntu 10.04 LTS, and Fedora is also transitioning to it:

http://fedoraproject.org/wiki/Releases/FeatureRsyslog

Further, I think that RELP and on-demand disk spooling of messages are compelling features. Its performance and reliability are good enough to feed your web-server access logs through.
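For anyone curious, a sketch of RELP plus disk spooling in rsyslog's legacy syntax (the host, port, and queue name are illustrative):

    # client side: forward via RELP with a disk-assisted queue
    $ModLoad omrelp
    $ActionQueueType LinkedList
    $ActionQueueFileName relpq
    *.* :omrelp:loghost.example.com:2514

    # server side: accept RELP
    $ModLoad imrelp
    $InputRELPServerRun 2514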

I wouldn't overlook rsyslog, but I'm also not saying "just use it" because syslog-ng is certainly worth evaluating as well.

Edit: see also http://www.linuxjournal.com/content/centralized-logging-web-...


This kind of in-depth discussion has more value. Thank you.

I think that RELP and on-demand disk spooling of messages are compelling features

I think we're coming at the question from different perspectives. One of my primary goals is to avoid wasting my time. Since I've already evaluated and experimentally proven syslog-ng, switching means a large time investment.

As such, features like RELP and arguable misfeatures[1] like disk spooling fail to compel such an investment.

Once rsyslog has matured, something that I expect will be accelerated by its inclusion in major distros, it may be a no-brainer.

For my "money," there are far more interesting and productive problems to work on than logging, which is why I do give the "just use it" advice.

Turns out I am a sysadmin as well

By choice or necessity? Just curiosity on my part.

[1] I have yet to encounter an environment of non-trivial size where the risk of losing logging outweighs the risk of disk filling up and/or performance degradation from additional contentious I/O. For me, it's a killer feature of centralized logging: elimination of a particular source of failure/degradation.


"Splunk, given its cost and complexity, is almost never right for startups."

Many, many startups would disagree with you. If cost is a factor, we're starting a program for startups. Feel free to ping me for more info.

We'll also have a new developer license for our next release, which will make it easier for cash-strapped developers to use Splunk.

And finally, I would just point out that using rsyslog, syslog, or syslog-ng does not preclude you from using something like Splunk. Many of our users use Splunk with a log manager, such as syslog-NG, because they like our analytics engine and reporting tools. YMMV.

Now back to your regularly scheduled discussion :)

-John Mark Splunk Community Guy


Many, many startups would disagree with you. If cost is a factor, we're starting a program for startups. Feel free to ping me for more info.

Thanks for the reply. Admittedly, my statement is better qualified by "early" and "web scale."

Cost is very much a factor, as $500/mo buys a lot of ramen, and even for a slightly larger company it's still a noticeable expense for something whose value is unknown[1] ahead of time.

Even just the nomenclature "enterprise" implies pricing that's optimized to pay for the sales process or other hand-holding, rather than something technical. Such inference is further bolstered by the fact that a full price list isn't published.

If cost is a factor, we're starting a program for startups. Feel free to ping me for more info.

The companies I work for, in general, have neither the free time nor the inclination to pay for such a sales process. The program I'd want to see is one where I can order online with a credit card.

Time is a portion of cost, not just cash.

[1] Perhaps even unknowable, since, if the output isn't consumed, it's all potential value.


I'm a huge Splunk user, and I agree that with its price tag it would be very difficult to justify the cost against the gains when you're operating a small and tight ship. However, I would be very interested in the tools that they use to analyse their data. Obviously they could use a series of grep/awk scripts to pull out the data in key-value pairs, but what do they do with it after that?


I have karma to burn[1], but I'm embarrassed and ashamed that pointed advice gets downvoted (with no discussion) while a platitude is upvoted.

This is an example of why Startup School speakers censor themselves: we bring it upon ourselves.

[1] Especially since it's not meaningful.


Has anyone tried Facebook Scribe?


Yes, and it works pretty solidly. We used to use syslog-ng, writing to named pipes, but you have to be sure that there is something reading from the pipe before you start writing to it, otherwise it blocks. You don't really want to use the syslog protocol (via the syslog(3) library call) for random logging because you may hit the upper bound on log-line length. I wanted to use rsyslog, mainly for the local buffering, but it ONLY seems to support the syslog format/protocol, including prefixing all lines with a date, time, and hostname.

We use scribe and we have a stdin2scribe program (python) that can be used to hook into any log output (like apache access and error logs). We have it set up in a two tier system, all systems that we'd want to log from run a "scribe leaf" on a port on localhost, and this forwards all logs to a "scribe aggregator" (behind a load balancer), with a buffering space on the local disk when the aggregator can not be contacted. It's a pretty solid system and I recommend it.
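The stdin2scribe piece is an in-house tool, but roughly speaking it just reads stdin and pushes each line to the local scribe leaf over Thrift. A minimal sketch in Python, assuming the Thrift bindings that ship with scribe (the category name and port are illustrative):

    import sys
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol
    from scribe import scribe

    # connect to the local "scribe leaf" listening on localhost
    socket = TSocket.TSocket(host='127.0.0.1', port=1463)
    transport = TTransport.TFramedTransport(socket)
    protocol = TBinaryProtocol.TBinaryProtocol(trans=transport,
                                               strictRead=False,
                                               strictWrite=False)
    client = scribe.Client(iprot=protocol, oprot=protocol)
    transport.open()

    # forward every stdin line (e.g. piped from apache) as a scribe entry
    for line in sys.stdin:
        entry = scribe.LogEntry(category='apache_access',
                                message=line.rstrip('\n'))
        client.Log(messages=[entry])

    transport.close()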

We also have services, and command-line and library interfaces to those services, that let you grab all the logs that came in during a certain time frame, or tail all the data coming into the aggregators in real time (one of them is a wrapper around a more generic tool that just tags the logs; the wrapper takes grep-style filtering arguments and the output is piped to a pretty-printer).


If you're interested in more volume and flexibility, check out Flume, a new open-source project from Cloudera (the Hadoop/logging experts). Solid software and community behind it. http://archive.cloudera.com/cdh/3/flume/UserGuide.html



