Apache- and Zope logging with multilog

by Frank Tegtmeyer <fte@fte.to>

The daemontools package by Daniel J. Bernstein contains the multilog program that does not lose logging information and does not accidently eat up all your disk space. I use multilog for the Apache- and Zope-logging in a virtual server environment.

Overview

I have many virtual servers mapped to a Zope instance. Logging accesses to many virtual Apache servers eats many file descriptors, memory and CPU power if you run a logger for every server. That's why I use only one log which includes the virtual hostname and the port. The hostname is used to map the log entries to the virtual server later.
Multilog is used to get an exact timestamp in external TAI64N format at the beginning of each line. The major advantage of multilog is that it rotates logfiles based on size. It keeps only a fixed number of logfiles so that you never will see your disk get filled by growing logfiles.
Here is an example (the long line is wrapped, continuation lines are marked with the plus sign):


 @40000000393975c730047524 fte.tegtmeyer.net 80 195.145.137.253 - "GET /zope/logging 
+HTTP/1.0" 200 6794 "-" "Mozilla/4.72 [de] (X11; I; Linux 2.2.15 i586)" 

Multilog has it's own built in filter capabilities but I decided to split the logging itself from analyzing. So the filtering for virtual hosts is done during periods of low activity by an additional script.

Contents

  1. Hints
  2. Installing multilog
  3. Setting up logging in Apache
  4. Zope logging with multilog
  5. Watching a logfile
  6. Conversion for log file analyzers
  7. About this document

Hints

This article assumes Linux. If you are running a BSD variant or another system you probably have problems with the scripts. In this case install and use the GNU tools instead of the native ones (for example gsed instead of sed on FreeBSD, thanks to Andreas Brenk!).

Installing multilog

The multilog program is part of the daemontools package made by Daniel J. Bernstein which you can download from his webserver.

The installation is straightforward:

  1. download and unpack the sources
  2. change into the created directory
  3. as root do make setup check
That's all what to do when only using the multilog program. If you also want to control your services with daemontools, see the documentation of svscan. If you want to control Zope with daemontools see my monitoring article.

Setting up logging in Apache

The multilog program expects the data at standard input. The Apache webserver is able to specify a logger which uses a pipe to a logger program. It is necessary to log two streams: the error log and the access log. To log to the two directories /web/logs/access and /web/logs/error, you would need the following entries in the common area of the Apache httpd.conf file. It assumes that logging is done under the httplog user:


 ErrorLog "| /usr/local/bin/setuidgid httplog /usr/local/bin/multilog 
+s4194304 n10 /web/logs/error"
 LogFormat "%v %p %h %u \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" 
+combined
 CustomLog "| /usr/local/bin/setuidgid httplog /usr/local/bin/multilog t 
+s4194304 n50 /web/logs/access" combined 

Note the changed logging format: the timestamp of the Apache webserver is removed because of it's horrible format and insufficient accuracy. The line is prepended with an external TAI64N timestamp (provided by the t switch of multilog), the name of the virtual host and the port the client is accessing. Also removed is the column for results of ident-lookups. This column was not used on any webserver I have ever seen. What counts for a webserver is speed, so nobody turns on ident-lookups for incoming requests. If you have a need for ident information, change the format of the logfile and the scripts mentioned later in this article.

A mail from Jamie Heilman made me aware of a discussion on the damontools mailinglist about problems using multilog from Apache on FreeBSD. You may have a look at the archive. The discussion was in June and July of 2001. You have to look for the subjects patch to have apache run supervised and apache, mulitlog, & fifo trick redux.
To solve the problem you may use the same technique as described here for Zope: let Apache log to a named pipe and let multilog read from it. Be sure that the loggers are up when Apache starts.

Zope logging with multilog

In a virtual server environment with Apache the logging provided by Zope is mostly important for debugging because all requests are also logged by Apache. So I wanted to have a Zope access log, but only for a limited time and of course limited size.

Zope includes hooks for different logging mechanisms. A logger that rotates logfiles is also available but it is necessary to edit the Zope sources to activate another logging mechanism. Because multilog does all what we need, we looked for a solution to pipe the logged data to a program. Luckily we use Unix instead of Windows NT, so there is a simple solution without fiddling with the sources of zope.

The standard setup of Zope simply appends all log data to the File Z2.log in the var subdirectory. If we create a named pipe with this name Zope will happily log to it - all we have to do now is to ensure that the multilog program listens at this named pipe.
Here are the commands that are used to create the named pipe. We assume Linux and that Zope runs under the user id zope. The multilog program runs under the group id httplog:


 cd var 
 rm Z2.log 
 mkfifo Z2.log 
 chmod 240 Z2.log 
 chown zope.httplog Z2.log 

There are many ways to ensure that multilog listens at the pipe. Because we are monitoring Zope with daemontools, it is natural to start the logging with daemontools too. See my document Monitoring Zope with Daniel J. Bernsteins daemontools for more information.

Simply set up an additional service (called zopelog at our site) that starts the multilog program. The run script looks like that:


 #! /bin/sh 
 exec < /web/zope/var/Z2.log 
 exec /usr/local/bin/setuidgid httplog \ 
   /usr/local/bin/multilog t n5 /web/logs/zopeaccess 

Of course the /web/logs/zopeaccess directory must exist and has to be writable by the user httplog:


 mkdir /web/logs/zopeaccess 
 chown httplog /web/logs/zopeaccess 
 chmod 700 /web/logs/zopeaccess 

The access log of Zope is not the only output that Zope generates. There is also the output that is displayed by the start script. We decided to log this too to have an idea how often, why and when Zope is restarted because of a crash. The output goes to stdout and this is the standard situation the daemontools are designed for.
The svscan program sets up a logger for a service if

  1. a subdirectory log exists in the service directory
  2. a run script exist in the log subdirectory and
  3. the service directory has it's sticky bit set (this is not anymore required for daemontools 0.75 and above).

The run script for this setup (located in the log subdirectory of the Zope service):


 #! /bin/sh
 exec /usr/local/bin/setuidgid httplog /usr/local/bin/multilog t /web/logs/zopedebug

To set up all the necessary files use service directories under the Zope directory. I already prepared the files for downloading. To activate the services you link them to the /service directory (or the one svscan runs on). Don't place them directly in the /service directoy - svscan may be confused by intermediate states when you are creating or changing files:


 ln -s /web/zope/service_zopelog /service/zopelog 
 ln -s /web/zope/service_zope /service/zope

Watching a logfile

Watching the logfiles has two problems. First it is not very useful to watch the timestamps in TAI64N. Second it is possible that multilog switches the logfile while watching.
I use the GNU version of tail that has an option to recognize a switched file. The conversion of the TAI64N timestamps is done by Dan Bernsteins utility tai64nlocal.
The solution:


 tail --follow=name /web/logs/zopeaccess/current | tai64nlocal 

 on FreeBSD:
 tail -f --follow=name /web/logs/zopeaccess/current | tai64nlocal 

Conversion for log file analyzers

If you reallly are not able to modify the parser of a logfile analyzer to recognize the created format you may use the following tools and commands. It's lazy programming, so don't use it for learning Perl or something else!
You may download the scripts.

Let's say you want the converted logfile for the webserver www.fte.to on port 80, you may issue the following commands inside of the logfile directory:


 cat ./\@* current | fgrep ' www.fte.to 80 ' | \ 
     ./filter | ./iso_to_weird > fte-log 

The functionality of filter and iso_to_weird is divided to gain more flexibility.

The filter script:


 #!/bin/sh 
 sed 's/\([^ ]\+ \+\)\([^ ]\+ \+\)\([^ ]\+ \+\)\([^ ]\+ \+\)/\4\1\2\3/' \ 
    | logresolve \ 
    | sed 's/\([^ ]\+ \+\)\([^ ]\+ \+\)\([^ ]\+ \+\)\([^ ]\+ \+\)/\2\3\4\1/' \ 
    | /usr/local/bin/tai64nlocal \ 
    | sed 's/\([^ ]\+ \+\)\([^\.]\+\)\([^ ]\+ \+\)\(.*\)/\1\2 \4/' 

The first line of the script moves the ip address of the accessing host to the beginning of the line because the logresolve program (ond others) does expect it there. After piping through logresolve the ip address or name is moved back to the old place.
The TAI64N timestamp at the beginning of the line is converted to ISO format (including fractional seconds) by using Daniel J. Bernstein's tai64nlocal program.
The last line of the script then removes the fractional seconds.

At first it seemed expensive to do it this way but practice showed that the conversion has nearly no performance impact.

The iso_to_weird script:


 #!/usr/bin/perl 
 
 @months = ('Jan','Feb','Mar','Apr','May','Jun','Jul', 
            'Aug','Sep','Oct','Nov','Dec'); 
 
 while (<>) { 
    ($date,$time,$server,$port,$from,$user,@rest) = split / /; 
    ($year,$month,$day) = split('-',$date); 
    $month--; 
    $month=$months[$month]; 
    $rest = join (' ',@rest); 
    print "$from - $user [$day/$month/$year:$time] $rest"; 
 } 

This Perl-Script parses the ISO time and generates the format that many logfile analyzers do expect. The server and port are removed from the log and the useless field for identd lookups is inserted again. The rest of the line (request, return code, referer etc.) is kept like it comes in.

About this document

Please tell me if you find some errors.
2002-10-09
FreeBSD hints by Andreas Brenk
2001-07-07
downloadable scripts, some improvements
2000-07-11
Added conversion for logfile analyzers
2000-06-28
Added watching logfiles
2000-06-04
First version