ROTATION(1) User Contributed Perl Documentation ROTATION(1)
NAME
savelogs - Log file rotation made easy
SYNOPSIS
savelogs was written with log file rotation in mind. With it you can
ensure that your account does not run out of disk space because of
large log files. It allows you to preserve important Web traffic
information and system data while conserving precious disk space.
DESCRIPTION
This document details several uses of savelogs designed to meet
specific challenges in log file rotation. Some of the examples in this
tutorial are taken from the savelogs(1) man page.
savelogs was written with the axiom "make the common case fast" in
mind. This means that assumptions were made about what most people want
to do with their log files and that savelogs was optimized to make the
common scenarios intuitive and simple.
If these assumptions are not correct for your particular need, it may
mean that what you're trying to do could possibly be done in a better
way, or that perhaps you should reconsider what you're trying to do.
Equally likely, it could mean that the assumptions savelogs makes
really aren't correct after all; I'm sure the author would love to hear
about it ;o)
This document does not replace the savelogs(1) manual page. You should
read the savelogs man page thoroughly before reading this document.
Examples in this document are for savelogs version 1.32 or later (or as
otherwise specified--version specific directives are noted).
EXAMPLES
The rest of this document contains a variety of examples of different
ways to use savelogs. Each example describes a problem scenario and
then explains possible ways to solve the problem using savelogs.
savelogs can be run from the command-line or as a cron job. Options may
be given to savelogs using a configuration file or the command-line. In
our descriptions of problems and solutions we will use savelogs
configuration files rather than command-line options to control
savelogs's behavior.
For each solution offered, there will be a complete savelogs
configuration file given along with any command-line arguments if
needed. If no command-line is shown (either via cron or a shell
prompt), you should assume that the command-line is:
% savelogs --config=/path/to/savelogs.conf
or a sample cron job:
5 2 * * * $HOME/usr/local/bin/savelogs --config=/path/to/savelogs.conf
EXAMPLE 1: ARCHIVING SYSTEM LOGS
Most core network services log special information to the system log
file /var/log/messages. sendmail, popper, imapd, and ftpd all write
authentication and some debugging information to this file. This
information is important to track down problems, compile server usage
statistics, and record possible hacker attempts.
Unfortunately, many people simply delete this file daily or weekly
because of how quickly it can grow on heavily loaded servers. Some of
these people often regret deleting their logs so quickly when a
security incident occurs for which they need to review some of the
information in their system log.
Now we're convinced that we should keep the system log around for a
little while. What approach should we take to preserve log files?
This example will illustrate three popular methods of system log
archival: permanent storage in separate archives, permanent storage in
a single archive, and newsyslog(8) style rotation.
Possible Solution 1: Permanent storage in separate archives
We want our system log file rotated daily to preserve disk space and
keep order; this will let us quickly find a particular log and view it.
Create the following savelogs configuration file:
savelogs Configuration File
## ==== begin savelogs-1a.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Touch yes
## ===== end savelogs-1a.conf ===== ##
Solution Results
Before we run savelogs with the above configuration file, we might see
something like this in our ~/var/log directory:
server% ls -l
-rw-r--r-- 1 server vuser 4455 Sep 12 08:24 messages
After we run savelogs with the above configuration file, we would
see this in our ~/var/log directory:
-rw-r--r-- 1 server vuser 0 Sep 12 12:01 messages
-rw-r--r-- 1 server vuser 1047 Sep 12 08:24 messages.010912.gz
Solution Explanation
The Log directive in our configuration file tells savelogs which log to
process. You may specify the Log directive multiple times to process
multiple logs.
The Touch directive tells savelogs to execute the system touch command
which creates an empty log file. While most services do not require the
log file to already exist before appending to it, it is a good habit to
create empty log files since a) it does no harm and b) some programs
will not create the log file for you and will not log until it is
created.
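For orientation, the effect of this configuration is roughly what the
following shell commands do by hand. This is only a sketch of the
default move and compress phases plus Touch, run in a scratch
directory with made-up log content; it is not how savelogs itself is
implemented.

```sh
#!/bin/sh
set -eu

# Scratch directory standing in for ~/var/log (hypothetical sample data).
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
printf 'Sep 12 08:24:01 server sendmail[123]: sample entry\n' > messages

ext=$(date +%y%m%d)            # default-style extension: today's date
mv messages "messages.$ext"    # move phase: rename the live log
touch messages                 # Touch yes: recreate an empty log
gzip "messages.$ext"           # compress phase

ls -l
```

The listing should show an empty messages file next to a compressed
messages.<date>.gz, matching the results above.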
An alternative to this solution is to also specify the Ext directive in
the configuration file:
## ==== begin savelogs-1b.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Touch yes
Ext yesterday
## ===== end savelogs-1b.conf ===== ##
which will create a file like this:
-rw-r--r-- 1 server vuser 1047 Sep 12 08:24 messages.010911.gz
Notice that the date in the file extension is a day before the previous
example's date. You may want to do this if you run savelogs just after
midnight but want the name of the archived log to reflect the date for
which the log file contains data (instead of the day after).
Yet another alternative is to specify a date format for the rotation.
This gives you a lot of flexibility when renaming log files.
## ==== begin savelogs-1c.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
DateFmt %y-%m-%d
## ===== end savelogs-1c.conf ===== ##
which will create a file like this:
-rw-r--r-- 1 server vuser 1047 Sep 12 08:24 messages.01-09-12.gz
You can see that the log extension has hyphens between year, month, and
day as we described in our DateFmt directive. You could even specify
hours, minutes, and seconds if you wanted; all these options (and
more!) are described in the strftime(3) man page.
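A quick way to preview an extension before committing it to a
configuration file is date(1), which understands the same %
conversions. The sketch below assumes the default extension
corresponds to %y%m%d (as the earlier listings suggest); the variable
names are ours, not savelogs's.

```sh
#!/bin/sh
# Preview a few candidate extensions for a rotated log name.
default_ext=$(date +%y%m%d)          # default style, e.g. 010912
hyphen_ext=$(date +%y-%m-%d)         # DateFmt %y-%m-%d, e.g. 01-09-12
stamp_ext=$(date +%Y%m%d-%H%M%S)     # with hours, minutes, and seconds

echo "messages.$default_ext.gz"
echo "messages.$hyphen_ext.gz"
echo "messages.$stamp_ext.gz"
```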
Possible Solution 2: Permanent storage in a single archive
The previous solution is a fine solution for most uses: it's easy to
tell which logs contain data for a given date range. Even if you ran
savelogs less often than daily (e.g., weekly or monthly) you would
easily be able to tell which log had the data you wanted.
You'll notice, however, that if you do run savelogs often (e.g., daily
or hourly) in just a few days you'll have more logs than you would like
to look at. You could download files to another machine periodically in
order to reduce the sheer numbers of files to work with. Or you could
use this next approach and store all logs in a compressed archive.
savelogs Configuration File
## ==== begin savelogs-1d.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Touch yes
Process all
## ===== end savelogs-1d.conf ===== ##
Solution Results
When we run savelogs with the above configuration file, we'll see this
in our ~/var/log directory:
-rw-r--r-- 1 server vuser 0 Sep 12 12:01 messages
-rw-r--r-- 1 server vuser 1150 Sep 12 13:01 messages.tar.gz
The archive messages.tar.gz contains a single file:
server% gtar -ztf messages.tar.gz
messages.010912
You may also use the Ext option in your configuration file again if you
wish the stored file to have yesterday's date instead of today's date
(default).
Solution Explanation
This savelogs configuration file looks a lot like our previous example
except that we have added the Process directive. The Process directive
tells savelogs which phases to include while processing logs. The
savelogs phases are:
move
Log files are renamed to whatever you specify in the Ext directive
(which is today's date by default) during the move phase.
filter
The filter process phase takes the recently renamed log files (or
file) and pipes them through a command that you specify. If you
don't specify a filter command, this phase is quietly skipped.
archive
During the archive phase, logs which have been renamed (and
optionally filtered) are added to a tar archive.
compress
The compress phase takes logs and compresses them. If the archive
process phase was activated in this savelogs session, the compress
phase will compress the archive instead of the log file.
delete
After logs have been optionally renamed, filtered, archived, and
compressed, the original file (or the file after it has been
renamed) may be deleted because it now resides in an archive. This
occurs during the delete phase.
If no Process option is given, savelogs uses move,compress as its
default setting.
We specified all for our Process option; this means that savelogs
should apply all phases if they are applicable. As such, our log file,
~/var/log/messages is first renamed to ~/var/log/messages.010912.
Because we did not specify a filter, the log file is not modified in
any way after it is renamed. Then during the archive phase, the file is
added to a new tar archive. The archive is compressed during the
compress phase and the renamed file, ~/var/log/messages.010912, is
deleted during the delete phase.
Now each night when savelogs runs, the previous day's log file will be
added to this single archive and compressed.
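The phases described above can be imitated by hand with standard
tools. The following is a sketch under our own assumptions (a scratch
directory, made-up log content, plain tar and gzip); it mirrors the
order of the phases, not the savelogs source.

```sh
#!/bin/sh
set -eu

work=$(mktemp -d)                      # scratch stand-in for ~/var/log
trap 'rm -rf "$work"' EXIT
cd "$work"
echo 'sample log line' > messages      # made-up log content

ext=$(date +%y%m%d)
mv messages "messages.$ext"            # move
touch messages                         # Touch yes
                                       # filter: skipped, none configured
tar -cf messages.tar "messages.$ext"   # archive
gzip messages.tar                      # compress (the archive, not the log)
rm "messages.$ext"                     # delete the renamed log

tar -ztf messages.tar.gz               # lists messages.<date>
```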
Possible Solution 3: newsyslog(8) style rotation
The primary drawback to using a single archive for storage is that you
never get the full space savings of compression. Yes, the archive is
compressed most of the time, but savelogs depends on the underlying
system gtar or tar to modify archives. Currently, neither gtar nor tar
can append to compressed tar files; they can only read from them.
This means that before savelogs can write to the compressed tar file,
it must first decompress it, then append the new file, then re-compress
the file. If you have many log files, this may take considerable disk
space.
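The decompress, append, re-compress cycle looks like this when done by
hand. It is a sketch with made-up file names and contents; the point
is only that the archive sits on disk at full size between the gunzip
and gzip steps.

```sh
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# An existing compressed archive holding one earlier log (sample data).
echo 'day one' > messages.010912
tar -cf messages.tar messages.010912
gzip messages.tar
rm messages.010912

# Adding the next day's log: decompress, append, re-compress.
echo 'day two' > messages.010913
gunzip messages.tar.gz                 # archive is full size on disk here
tar -rf messages.tar messages.010913   # -r appends to an uncompressed tar
gzip messages.tar
rm messages.010913

tar -ztf messages.tar.gz
```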
This final solution involves a compromise which minimizes storage space
requirements while maintaining only a predetermined number of files in
the ~/var/log directory. The compromise is that your log files are not
as easily indexed because the date is not stored in the filename.
savelogs Configuration File
## ==== begin savelogs-1e.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Touch yes
Period 25
## ===== end savelogs-1e.conf ===== ##
Solution Results
-rw-r--r-- 1 server vuser 0 Sep 12 13:35 messages
-rw-r--r-- 1 server vuser 1042 Sep 12 08:24 messages.0.gz
Solution Explanation
If we were to run savelogs with the above configuration file many
times, you would see files like this:
-rw-r--r-- 1 server vuser 1163 Sep 16 08:24 messages.0.gz
-rw-r--r-- 1 server vuser 1388 Sep 15 08:24 messages.1.gz
-rw-r--r-- 1 server vuser 1021 Sep 14 08:24 messages.2.gz
-rw-r--r-- 1 server vuser 1048 Sep 13 08:24 messages.3.gz
-rw-r--r-- 1 server vuser 1042 Sep 12 08:24 messages.4.gz
You can see that the most recent log file is named messages.0.gz and
the oldest log file has the highest number. With the Period directive
in our configuration file, we will save 25 periods of logs. That is, if
we ran savelogs daily, we would have at most 25 days' worth of logs
stored. After 25, logs begin to "fall off" the end: the oldest logs are
not renamed but simply clobbered by more recent logs. If we ran
savelogs hourly, we would have 25 hours' worth of logs.
This style of log rotation is called newsyslog-style rotation, named
after newsyslog(8) which is a UNIX system utility that does
approximately the same thing.
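To make the shifting concrete, here is a small sketch of numbered
rotation in shell. The rotate_numbered function is a hypothetical
helper written for this tutorial, and we assume that a period of N
keeps files numbered 0 through N-1; consult savelogs(1) for the exact
behavior of Period.

```sh
#!/bin/sh
set -eu

# rotate_numbered LOG PERIOD: hypothetical helper, not part of savelogs.
# Keeps LOG.0.gz (newest) through LOG.(PERIOD-1).gz (oldest).
rotate_numbered() {
    log=$1
    period=$2
    i=$((period - 1))
    while [ "$i" -gt 0 ]; do
        prev=$((i - 1))
        # Shift each archive up by one; the oldest is overwritten.
        if [ -f "$log.$prev.gz" ]; then
            mv -f "$log.$prev.gz" "$log.$i.gz"
        fi
        i=$prev
    done
    mv "$log" "$log.0"
    : > "$log"              # fresh empty log, as with Touch yes
    gzip -f "$log.0"
}

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
for day in 1 2 3 4; do
    echo "entries for day $day" > "$work/messages"
    rotate_numbered "$work/messages" 3
done
ls "$work"
```

With a period of 3 and four runs, the first day's log has been
clobbered and only messages.0.gz through messages.2.gz remain.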
EXAMPLE 2: ARCHIVING MULTIPLE SYSTEM LOGS
Now we have multiple system logs we would like to archive like we did
in the previous example. In addition to ~/var/log/messages we also want
to archive ~/var/log/procmail and ~/var/mail/cron.
Possible Solution 1: an archive for each file
The simplest approach is to archive each file separately. Each file is
easily accessible and cleanly indexed with the date embedded in the
filename.
savelogs Configuration File
## ==== begin savelogs-2a.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Log /var/log/procmail
Log /var/mail/cron
Touch yes
## ===== end savelogs-2a.conf ===== ##
Solution Results
In ~/var/log:
-rw-r--r-- 1 server vuser 0 Sep 12 14:19 procmail
-rw-r--r-- 1 server vuser 159 Sep 12 14:03 procmail.010912.gz
-rw-r--r-- 1 server vuser 0 Sep 12 14:19 messages
-rw-r--r-- 1 server vuser 1047 Sep 12 08:24 messages.010912.gz
and in ~/var/mail:
-rw-r--r-- 1 server vuser 0 Sep 12 14:19 cron
-rw-r--r-- 1 server vuser 94 Sep 12 14:18 cron.010912.gz
Solution Explanation
Like the cases in our previous example, we create a simple
configuration file that lists the logs we want to process. savelogs
will go through each of its phases (by default, since we didn't specify
any Process options, savelogs will execute the move and compress
phases) and process each log in turn.
If we were to store each log file in a tar archive using the Process
directive set to all, we would have three different compressed tar
files, one for each log.
Possible Solution 2: a single archive for all files
Suppose we wanted these three files placed into a single archive so
that we could download just one file periodically instead of many
files.
savelogs Configuration File
## ==== begin savelogs-2b.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Log /var/log/procmail
Log /var/mail/cron
Touch yes
Process all
Archive /var/log/logs.tar
## ===== end savelogs-2b.conf ===== ##
Solution Results
The resulting file in ~/var/log:
-rw-r--r-- 1 server vuser 32413 Sep 12 17:16 logs.tar.gz
contains our three original files:
server% gtar -ztf logs.tar.gz
messages.010912
procmail.010912
cron.010912
Solution Explanation
The primary strength of this solution is that files scattered about
your file system are stored centrally, making downloading logs
convenient.
This solution has the same drawback, however, as the similar case in
the previous example: the tar file must be decompressed before adding
files to it. This solution is fine if you can guarantee that your total
disk space is enough to handle all of the archived logs together in
their decompressed sizes.
Possible Solution 3: a single archive for all files where path information is preserved
We like our single archive, but what happens if we have two files with
the same name (e.g., ~/var/log/cron and ~/var/mail/cron)? gtar is able
to put both files in the archive, but when we extract them one of them
is going to overwrite the other. The solution is to store path
information with our archives. While most people who post-process
their log files will probably not use this technique, if you're saving
the logs "just in case", this solution will work well.
savelogs Configuration File
## ==== begin savelogs-2c.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Log /var/log/procmail
Log /var/log/cron
Log /var/mail/cron
Touch yes
Process all
Archive /var/log/logs.tar
Full-Path yes
## ===== end savelogs-2c.conf ===== ##
You can see that we have two files ~/var/log/cron and ~/var/mail/cron
that will conflict inside our tar file unless we preserve path
information.
Solution Results
The resulting file, like the previous case, is ~/var/log/logs.tar.gz:
-rw-r--r-- 1 server vuser 32471 Sep 13 13:51 logs.tar.gz
contains the four log files we included in our configuration file:
server% tar -ztf logs.tar.gz
var/log/messages.010913
var/log/procmail.010913
var/log/cron.010913
var/mail/cron.010913
except the archive contains full path information, allowing us to store
two files with the same name (~/var/log/cron and ~/var/mail/cron).
Solution Explanation
This solution is well-suited for archiving scattered logs, some of
which may have the same name. It is also ideal for preserving directory
hierarchy information, as well as the actual log files themselves.
While most people who actually perform some log analysis on these files
may find that extracting the log files from the archive is cumbersome,
the only other good alternative is to archive logs separately and
download them separately.
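The effect of storing path information can be seen with tar alone. In
this sketch the scratch directory stands in for the home directory and
the file contents are made up; tar's -C option makes the stored names
relative to that directory, so the two cron logs do not collide.

```sh
#!/bin/sh
set -eu

root=$(mktemp -d)                      # scratch stand-in for the home directory
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/var/log" "$root/var/mail"
echo 'from var/log'  > "$root/var/log/cron.010913"
echo 'from var/mail' > "$root/var/mail/cron.010913"

# Paths are stored relative to $root, so both cron logs stay distinct.
tar -cf "$root/logs.tar" -C "$root" var/log/cron.010913 var/mail/cron.010913
gzip "$root/logs.tar"
tar -ztf "$root/logs.tar.gz"
```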
Possible Solution 4: a single archive for each directory containing logs
This solution is a hybrid of the last two solutions. We don't want to
preserve path information in our archive, but we still wish to be able
to store files with common names. This solution takes advantage of one
of savelogs's hidden features to allow us to create a single archive
per directory root.
savelogs Configuration File
## ==== begin savelogs-2d.conf ==== ##
## our log file we want to rotate
Log /var/log/messages
Log /var/log/procmail
Log /var/log/cron
Log /var/mail/cron
Touch yes
Process all
Archive logs.tar
## ===== end savelogs-2d.conf ===== ##
We have stripped the path from the Archive directive and no longer
include the Full-Path directive.
Solution Results
After running savelogs with the above configuration file we have two
archives, one in ~/var/log/logs.tar.gz:
-rw-r--r-- 1 server vuser 32337 Sep 13 14:04 logs.tar.gz
whose contents are:
server% tar -ztf logs.tar.gz
messages.010913
procmail.010913
cron.010913
The other archive is ~/var/mail/logs.tar.gz:
-rw-r--r-- 1 server vuser 182 Sep 13 14:04 logs.tar.gz
whose contents are:
server% tar -ztf logs.tar.gz
cron.010913
Solution Explanation
This solution is like the previous except that instead of preserving
path information to store files with the same name, it uses a separate
archive for each directory root (e.g., ~/var/log and ~/var/mail).
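A hand-rolled equivalent of the per-directory layout might look like
the sketch below. It assumes already-renamed logs in a scratch tree
(sample data) and simply drops each one into a logs.tar that lives
beside it; this illustrates the resulting layout, not savelogs's
internal logic.

```sh
#!/bin/sh
set -eu

root=$(mktemp -d)
trap 'rm -rf "$root"' EXIT
mkdir -p "$root/var/log" "$root/var/mail"
for f in var/log/messages var/log/procmail var/log/cron var/mail/cron; do
    echo "sample: $f" > "$root/$f.010913"
done

# Each log goes into logs.tar in its own directory, stored by basename.
for f in "$root"/var/log/*.010913 "$root"/var/mail/*.010913; do
    dir=$(dirname "$f")
    if [ -f "$dir/logs.tar" ]; then
        tar -rf "$dir/logs.tar" -C "$dir" "$(basename "$f")"
    else
        tar -cf "$dir/logs.tar" -C "$dir" "$(basename "$f")"
    fi
    rm "$f"
done
gzip "$root/var/log/logs.tar" "$root/var/mail/logs.tar"

tar -ztf "$root/var/log/logs.tar.gz"
tar -ztf "$root/var/mail/logs.tar.gz"
```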
EXAMPLE 3: ROTATING APACHE LOGS
Everyone, sooner or later, has to rotate Apache log files. savelogs has
a number of options to help make Apache log rotation as simple and
efficient as possible. To accomplish this, we introduce three new
directives: ApacheConf, ApacheLog, and ApacheLogExclude.
Suppose we have all of our system log rotation under control. Now we
have some Apache log files that are consistently eating up our disk
space and we want to somehow keep the amount of disk space used by
files to a minimum. Our first instinct is to add a Log directive to our
savelogs configuration file for each log that we want to process. But
after adding a few, we realize that there must be a better way to do
this.
Possible Solution 1: Automatic Logfile Detection
savelogs is Apache-aware. That is, it knows how to read Apache style
configuration files and look for certain patterns. These patterns are
treated as filenames of log files to process. If the ApacheConf
directive is specified, savelogs will read the specified Apache
configuration file and parse it for log files. Each log file found will
be processed as if it had been specified with the Log directive or
passed on the command-line.
Assume our Apache configuration file has the following directives:
TransferLog "|/usr/local/bin/logwatch /usr/local/etc/httpd/logs/access_log"
ErrorLog "|/usr/local/bin/logwatch /usr/local/etc/httpd/logs/error_log"
<VirtualHost domain.name1 www.domain.name1>
ServerAdmin webmaster@domain.name1
DocumentRoot /usr/local/etc/httpd/vhosts/domain.name1
ServerName www.domain.name1
ErrorLog logs/error_log-domain.name1
TransferLog logs/access_log-domain.name1
</VirtualHost>
<VirtualHost domain.name2 www.domain.name2>
ServerAdmin webmaster@domain.name2
DocumentRoot /usr/local/etc/httpd/vhosts/domain.name2
ServerName www.domain.name2
ErrorLog logs/error_log-domain.name2
TransferLog /dev/null
</VirtualHost>
<VirtualHost domain.name3 www.domain.name3>
ServerAdmin webmaster@domain.name3
DocumentRoot /usr/local/etc/httpd/vhosts/domain.name3
ServerName www.domain.name3
ErrorLog logs/error_log-domain.name3
TransferLog logs/access_log-domain.name3
</VirtualHost>
We have the main server TransferLog and ErrorLog directives, and each of
the three virtual hosts has its own TransferLog and ErrorLog directives
as well.
savelogs Configuration File
## ==== begin savelogs-3a.conf ==== ##
ApacheConf /www/conf/httpd.conf
PostMoveHook /usr/local/bin/restart_apache
## ===== end savelogs-3a.conf ===== ##
Solution Results
After running savelogs with the above configuration file, we'll have in
our logs directory the following files:
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 access_log-domain.name1
-rw-r--r-- 1 server vuser 9360 Sep 13 23:06 access_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 access_log-domain.name3
-rw-r--r-- 1 server vuser 1040 Sep 13 23:06 access_log-domain.name3.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 error_log-domain.name1
-rw-r--r-- 1 server vuser 859 Sep 13 23:06 error_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 error_log-domain.name2
-rw-r--r-- 1 server vuser 352 Sep 13 23:06 error_log-domain.name2.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 error_log-domain.name3
-rw-r--r-- 1 server vuser 661 Sep 13 23:06 error_log-domain.name3.010913.gz
Solution Explanation
When savelogs sees the ApacheConf directive, it reads the given Apache
configuration file, looking for directives that might be log files.
savelogs decides what Apache configuration directives might be log
files with the ApacheLog directive, which defaults to:
TransferLog|ErrorLog|AgentLog|RefererLog|CustomLog
After finding all of the Apache lines that match the above pattern,
lines that also match the ApacheLogExclude pattern are removed from the
list. The savelogs default ApacheLogExclude pattern is:
^/dev/null$|\|
(read "/dev/null OR a pipe character").
Notice that our second virtual host logs its transfer log to /dev/null.
savelogs recognizes this and skips that log since trying to rotate
/dev/null demonstrates poor taste. Also, our main server log files are
piped through a program first:
TransferLog "|/usr/local/bin/logwatch /usr/local/etc/httpd/logs/access_log"
ErrorLog "|/usr/local/bin/logwatch /usr/local/etc/httpd/logs/error_log"
savelogs detects the pipe character and does not attempt to rotate
these logs either. To rotate these logs, you could add them with the
Log directive to the savelogs configuration file or on the command line.
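The two default patterns can be tried out with grep and sed on a
trimmed copy of the Apache configuration. This is a rough sketch: it
applies the ApacheLog pattern to the directive name and the
ApacheLogExclude pattern to the rest of the line, which is our reading
of the description above, and it ignores details such as the format
argument of CustomLog.

```sh
#!/bin/sh
set -eu

conf=$(mktemp)
trap 'rm -f "$conf"' EXIT
cat > "$conf" <<'EOF'
TransferLog "|/usr/local/bin/logwatch /usr/local/etc/httpd/logs/access_log"
ErrorLog logs/error_log-domain.name1
TransferLog logs/access_log-domain.name1
ErrorLog logs/error_log-domain.name2
TransferLog /dev/null
EOF

# Keep directives named by the ApacheLog pattern, strip the directive
# name, then drop values matching the ApacheLogExclude pattern.
found=$(grep -E '^[[:space:]]*(TransferLog|ErrorLog|AgentLog|RefererLog|CustomLog)[[:space:]]' "$conf" \
    | sed -E 's/^[[:space:]]*[A-Za-z]+[[:space:]]+//' \
    | grep -Ev '^/dev/null$|\|')
echo "$found"
```

Only the three plain file names survive; the piped main-server log and
/dev/null are dropped.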
Possible Solution 2: Automatic Logfile Detection with Exceptions
We like what savelogs did for us in the last solution, but we have an
exception to make. One virtual host insists on managing its own log
files, and we want to leave its logs alone. What to do?
savelogs Configuration File
## ==== begin savelogs-3b.conf ==== ##
ApacheConf /www/conf/httpd.conf
NoLog /usr/local/etc/httpd/logs/*_log-domain.name3
PostMoveHook /usr/local/bin/restart_apache
## ===== end savelogs-3b.conf ===== ##
Solution Results
After running savelogs with the above configuration file, we'll have in
our logs directory the following files:
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 access_log-domain.name1
-rw-r--r-- 1 server vuser 9360 Sep 13 23:06 access_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 1040 Sep 13 23:06 access_log-domain.name3
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 error_log-domain.name1
-rw-r--r-- 1 server vuser 859 Sep 13 23:06 error_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 error_log-domain.name2
-rw-r--r-- 1 server vuser 352 Sep 13 23:06 error_log-domain.name2.010913.gz
-rw-r--r-- 1 server vuser 661 Sep 13 23:06 error_log-domain.name3
Solution Explanation
The NoLog directive tells savelogs to skip files that match the
pattern. We supplied '*_log-domain.name3' as our pattern. The asterisk
follows standard shell globbing conventions (yikes! that just means
that savelogs patterns will work just like they do from your UNIX shell
command prompt). In this case, our pattern expanded to
access_log-domain.name3 and error_log-domain.name3, so these files were
removed from the list of files to process that the ApacheConf directive
made for us. NoLog is new with savelogs version 1.40.
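Shell-style matching of this kind can be tested ahead of time with a
case statement, which uses the same globbing rules. The skip_nolog
function below is a hypothetical helper for illustration only; it
reads candidate log paths and prints the ones that would remain after
the NoLog pattern is applied.

```sh
#!/bin/sh
set -eu

# skip_nolog PATTERN: read log paths on stdin, print those that do NOT
# match the shell glob PATTERN (hypothetical helper, not part of savelogs).
skip_nolog() {
    while IFS= read -r path; do
        case $path in
            $1) ;;                           # matches the pattern: skip it
            *)  printf '%s\n' "$path" ;;
        esac
    done
}

kept=$(printf '%s\n' \
    /usr/local/etc/httpd/logs/access_log-domain.name1 \
    /usr/local/etc/httpd/logs/access_log-domain.name3 \
    /usr/local/etc/httpd/logs/error_log-domain.name2 \
    /usr/local/etc/httpd/logs/error_log-domain.name3 \
    | skip_nolog '/usr/local/etc/httpd/logs/*_log-domain.name3')
echo "$kept"
```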
Possible Solution 3: Log File Analysis Embedding
Many log file analysis programs require a static and consistently named
log file. This means that the log file must not be in use by Apache
(i.e., Apache is not logging to it) and the name of the log file must
not vary from day to day (i.e., many log analysis programs require you
to enter the name of the log file in a static configuration file).
These two requirements are often at odds with the objectives of log
file rotation programs. The object of a log rotation system is to
reduce disk space use while preserving data. An additional objective is
to locate a log quickly for a particular day. savelogs's default
behavior is to rename a log to include today's date in the filename and
then compress the file, achieving all three goals.
In order to allow log file analysis programs the ability to have their
cake and eat it too, without letting the log file analysis program
rotate your logs for you (often crude and clumsy) and without forcing
you to write complicated cron jobs to run, savelogs introduces
stemming.
Stemming is simply a fancy way of saying "makes a link to the freshly
renamed log file". This link has the same name every day (or however
often you invoke savelogs) which lets you use the stem name in your log
file analysis program. The link points to a new log each day, however,
which means that your log file analysis program will always be reading
current information.
savelogs Configuration File
You may recall that our main access_log and error_log files are piped
through separate processes in the Apache configuration file, so they
won't be found by savelogs's ApacheConf directive. We use the Log
directive twice here to add them explicitly:
## ==== begin savelogs-3c.conf ==== ##
ApacheConf /www/conf/httpd.conf
Log /www/logs/access_log
Log /www/logs/error_log
PostMoveHook /usr/local/bin/restart_apache
StemHook $HOME/usr/local/urchin/urchin
## ===== end savelogs-3c.conf ===== ##
For Analog, our StemHook line would look like this:
StemHook /usr/local/bin/virtual /usr/local/analog/analog
Solution Results
As before, our logs have been tidily rotated. Before they were
completely rotated, however, Urchin ran and processed them.
-rw-r--r-- 1 server vuser 0 Sep 13 18:42 access_log
-rw-r--r-- 1 server vuser 0 Sep 13 18:42 access_log-domain.name1
-rw-r--r-- 1 server vuser 9360 Sep 13 18:06 access_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 18:42 access_log-domain.name3
-rw-r--r-- 1 server vuser 1040 Sep 13 18:06 access_log-domain.name3.010913.gz
-rw-r--r-- 1 server vuser 274755 Sep 13 18:41 access_log.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 18:42 error_log
-rw-r--r-- 1 server vuser 0 Sep 13 18:42 error_log-domain.name1
-rw-r--r-- 1 server vuser 859 Sep 13 18:06 error_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 18:42 error_log-domain.name2
-rw-r--r-- 1 server vuser 352 Sep 13 18:06 error_log-domain.name2.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 18:42 error_log-domain.name3
-rw-r--r-- 1 server vuser 661 Sep 13 18:06 error_log-domain.name3.010913.gz
-rw-r--r-- 1 server vuser 225964 Sep 13 18:38 error_log.010913.gz
Solution Explanation
The StemHook directive works like this. We begin with a log file:
-rw-r--r-- 1 server vuser 6348250 Sep 13 18:41 access_log
When savelogs starts up with something like this:
% savelogs --postmovehook=/usr/local/bin/restart_apache \
--stemhook=\$HOME/usr/local/urchin/urchin \
/www/logs/access_log
it first detects /www/logs/access_log and renames it:
-rw-r--r-- 1 server vuser 6348250 Sep 13 18:41 access_log.010913
Any PostMoveHook commands are executed at this time. In this example we
restart Apache so that Apache will close its file descriptors on
/www/logs/access_log.010913 and re-open a new descriptor on
/www/logs/access_log. Renaming (moving) a file will not necessarily
close descriptors that other processes may have open on that file.
savelogs then notices that we've supplied a StemHook, so it enters its
stem phase. The first thing that savelogs does in the stem phase is
create a symbolic link to the recently renamed file. The name of the
symbolic link (by default) is the name of the log followed by a period
and the string 'today'. You can change the string with the Stem option.
Now we have something like this:
-rw-r--r-- 1 server vuser 6348250 Sep 13 18:41 access_log.010913
lrwxr-xr-x 1 server vuser 17 Sep 13 18:42 access_log.today -> access_log.010913
savelogs next executes the command specified in the StemHook directive;
in our case above it will run $HOME/usr/local/urchin/urchin. The $HOME
variable is a savelogs internal variable that corresponds to your home
directory. Urchin should be configured something like this:
#RestartCommand: /usr/local/bin/restart_apache
#LogDestiny: archive
...
<Report>
ReportName: server.com
ReportDirectory: /usr/home/server/usr/local/etc/httpd/htdocs/urchin/server.com/
TransferLog: /usr/home/server/usr/local/etc/httpd/logs/access_log.today
ErrorLog: /usr/home/server/usr/local/etc/httpd/logs/error_log.today
</Report>
Notice that we commented out the 'restart_apache' command for Urchin.
Because we already renamed the log file and restarted Apache in the
move phase, we don't need to do it again. Further, we have commented
out Urchin's 'LogDestiny' command: we don't want Urchin rotating our
logs or deleting our logs for us, thank you.
The Urchin report section has been modified to look for our Stem files.
After the StemHook has run, savelogs removes the links it created and
continues on through its other phases as usual.
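The whole stem sequence can be walked through by hand. In the sketch
below a simple line count stands in for the StemHook command, and the
log entries are made up; the order of steps follows the explanation
above.

```sh
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
printf 'GET /index.html\nGET /about.html\n' > access_log   # made-up entries

ext=$(date +%y%m%d)
mv access_log "access_log.$ext"              # move phase
touch access_log                             # (PostMoveHook would run here)
ln -s "access_log.$ext" access_log.today     # stem: stable name for the hook
hits=$(wc -l < access_log.today)             # stand-in for the StemHook command
rm access_log.today                          # link removed after the hook
gzip "access_log.$ext"                       # compress phase
echo "hook saw $hits lines"
```

Because the analysis step only ever opens access_log.today, its own
configuration never has to change from day to day.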
The corresponding Analog configuration file would include this line
(other virtual host lines would follow this pattern):
LOGFILE /www/logs/access_log.today
By the end of the compression phase, we have this:
-rw-r--r-- 1 server vuser 274755 Sep 13 18:41 access_log.010913.gz
It may be that some command you issue in StemHook will not be able to
read a symbolic link. If this is the case, you should specify the
StemLink directive with another parameter:
hard
Creates a hard link to the file. This new file is indistinguishable
from the original file. Any changes made to this new file will be
reflected in the original file. This method requires no extra disk
space.
copy
Creates a copy of the original file. Any changes made to this new
file will NOT be reflected in the original file. Changes will be
completely discarded after the StemHook phase has completed and the
copy is deleted. This method requires as much extra disk space as
the size of the original log.
The rule of thumb is to use a hard link when the default symbolic link
doesn't work (which is rare) and use a copy of the file if your
StemHook command makes changes to the file which you don't want to
preserve.
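The difference between the two link types is easy to demonstrate with
ln and cp. This sketch uses made-up file names in a scratch directory
and appends a line through each alias to show which writes reach the
original log.

```sh
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"
echo 'original line' > access_log.010913

ln access_log.010913 access_log.hard     # hard link: same file, no extra space
cp access_log.010913 access_log.copy     # copy: independent file

echo 'added via hard link' >> access_log.hard
echo 'added via copy'      >> access_log.copy

cat access_log.010913    # shows the hard-link write, but not the copy's
```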
Possible Solution 4: Rotate Logs for VirtualHost
As of version 1.80, you may specify a hostname for a VirtualHost block.
savelogs will look for any VirtualHost blocks in the Apache
configuration file that match the names you specify and process any
logs found.
savelogs Configuration File
## ==== begin savelogs-3d.conf ==== ##
ApacheConf /www/conf/httpd.conf
ApacheHost www.domain.name1
ApacheHost www.domain.name3
PostMoveHook /usr/local/apache/bin/apachectl restart
## ===== end savelogs-3d.conf ===== ##
Solution Results
After running savelogs with the above configuration file, we'll have
the following changes:
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 access_log-domain.name1
-rw-r--r-- 1 server vuser 9360 Sep 13 23:06 access_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 access_log-domain.name3
-rw-r--r-- 1 server vuser 1040 Sep 13 23:06 access_log-domain.name3.010913.gz
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 error_log-domain.name1
-rw-r--r-- 1 server vuser 859 Sep 13 23:06 error_log-domain.name1.010913.gz
-rw-r--r-- 1 server vuser 3442 Sep 13 23:07 error_log-domain.name2
-rw-r--r-- 1 server vuser 0 Sep 13 23:07 error_log-domain.name3
-rw-r--r-- 1 server vuser 661 Sep 13 23:06 error_log-domain.name3.010913.gz
Notice that no changes were made to error_log-domain.name2 since it
wasn't specified in the savelogs configuration file.
Solution Explanation
The new ApacheHost directive tells savelogs to only process log files
for Apache VirtualHost blocks whose ServerName directive matches one of
the specified host names. The ApacheHost directive may be given
multiple times to process multiple hosts.
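The selection by ServerName can be approximated with a few lines of
awk. The logs_for_host function is a hypothetical helper written for
this tutorial; it handles only the simple, unindented layout shown
earlier and does not apply the exclusion pattern.

```sh
#!/bin/sh
set -eu

conf=$(mktemp)
trap 'rm -f "$conf"' EXIT
cat > "$conf" <<'EOF'
<VirtualHost domain.name1 www.domain.name1>
ServerName www.domain.name1
ErrorLog logs/error_log-domain.name1
TransferLog logs/access_log-domain.name1
</VirtualHost>
<VirtualHost domain.name2 www.domain.name2>
ServerName www.domain.name2
ErrorLog logs/error_log-domain.name2
TransferLog /dev/null
</VirtualHost>
EOF

# logs_for_host HOST FILE: hypothetical helper that prints the log paths
# found in VirtualHost blocks whose ServerName equals HOST.
logs_for_host() {
    awk -v host="$1" '
        /^<VirtualHost/    { n = 0; hit = 0; next }
        $1 == "ServerName" { if ($2 == host) hit = 1 }
        $1 == "ErrorLog" || $1 == "TransferLog" { logs[n++] = $2 }
        /^<\/VirtualHost>/ {
            if (hit) for (i = 0; i < n; i++) print logs[i]
            n = 0; hit = 0
        }
    ' "$2"
}

selected=$(logs_for_host www.domain.name1 "$conf")
echo "$selected"
```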
EXAMPLE 4: FILTERING LOGS
We do not have root.exe or cmd.exe on our web server and we never will
if we have any say in matters.
Nevertheless, we grow weary of our Apache log files growing out of
control mostly due to requests for these files from a slew of new
Windows IIS worms. When we process our logs with our favorite log file
analysis tool, we want to get rid of these kinds of entries before our
log file analysis tool ever gets the log. What to do?
Possible Solution 1: Filtering with savelogs
We can strip these bogus requests from our log files before they are
processed. Each night we'll run our logs through a filter that will
make them clean and free of any Windows worm requests.
savelogs Configuration File
## ==== begin savelogs-4a.conf ==== ##
ApacheConf /www/conf/httpd.conf
PostMoveHook /usr/local/bin/restart_apache
Filter /usr/bin/egrep -v '(root|cmd)\.exe' $LOG
## ===== end savelogs-4a.conf ===== ##
Solution Results
When we started, our logs looked like this:
server:~ $ ls -l
-rw-r--r-- 1 server vuser 278115 Jan 7 10:25 access_log
-rw-r--r-- 1 server vuser 34989 Jan 7 00:10 error_log
If we could sneak a peek at the logs after they had been renamed and
filtered, they'd look like this:
server:~ $ ls -l
-rw-r--r-- 1 server vuser 260882 Jan 7 11:00 access_log.020107
-rw-r--r-- 1 server vuser 18838 Jan 7 11:00 error_log.020107
You can see that we stripped out about 17k from the access_log and about 16k
from the error_log. After the entire process is complete, our logs look
like this:
server:~ $ ls -l
-rw-r--r-- 1 server vuser 26807 Jan 7 11:00 access_log.020107.gz
-rw-r--r-- 1 server vuser 2247 Jan 7 11:00 error_log.020107.gz
Solution Explanation
We use the ApacheConf directive to tell savelogs which logs to process.
savelogs searches the file specified in ApacheConf for log files. The
PostMoveHook directive restarts our Apache daemon after the logs have
been renamed. We do this so that Apache closes its descriptors on the
recently renamed logs and starts writing to fresh log files. Lastly we
use the Filter directive to remove lines with
the strings 'root.exe' or 'cmd.exe' from the log.
The Filter directive should be a program that alters the log files in
some way. The output of the Filter command is saved to a temporary file
which later replaces the log file itself, so be careful how you filter.
In this specific example, we pipe our log file through egrep(1); the -v
option tells egrep to exclude lines that match the pattern. $LOG is a
special savelogs variable (see the savelogs(1) manpage) that refers to
the log currently being processed.
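The same filter can be tried by hand on a sample file before trusting
it with real logs. In this sketch LOG is an ordinary shell variable
that we set ourselves, and the log lines are made up; the temporary
file and the final mv mimic the replace-the-log behavior described
above.

```sh
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
LOG="$work/access_log.020107"
cat > "$LOG" <<'EOF'
GET /index.html HTTP/1.0
GET /scripts/root.exe?/c+dir HTTP/1.0
GET /c/winnt/system32/cmd.exe?/c+dir HTTP/1.0
GET /images/logo.png HTTP/1.0
EOF

# Filter output goes to a temporary file, which then replaces the log.
tmp="$LOG.tmp"
grep -Ev '(root|cmd)\.exe' "$LOG" > "$tmp"
mv "$tmp" "$LOG"
cat "$LOG"
```

Note that grep exits with a non-zero status when every line is
filtered out, which is worth keeping in mind if the command is wrapped
in a script that stops on errors.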
SUMMARY
We have presented a few important examples which illustrate the
abilities of the savelogs program. No tutorial is complete without
reading the original manual. Please see savelogs(1) if this tutorial
has left you with unanswered questions.
CAVEATS
If you're careless you might accidentally delete logs or move logs
somewhere you didn't want to. Make sure you run savelogs with the
dry-run option enabled whenever you experiment, especially if the log
data might be remotely useful.
You are also encouraged to keep a log of savelogs actions. See the
LogLevel and LogFile directives in the savelogs(1) manual.
SEE ALSO
savelogs(1), cron(8), crontab(5), newsyslog(8), perl(1)
AUTHOR
Scott Wiersdorf, <scott@perlcode.org>
COPYRIGHT
Copyright (c) 2001 Scott Wiersdorf. This document may not be duplicated
in any form without prior written consent of the author or his
employer.
perl v5.20.2 2007-11-01 ROTATION(1)