Converting sendmail logs to SQL

Sendmail's default logging is ugly, because the logs for a single
message are spread across several lines. If I want to find emails
from
x [at] x.com to y [at] y.com, I can't do:

grep x [at] x.com /var/log/maillog | grep y [at] y.com

because the sender and recipient aren't on the same log line.
Instead,
you have to do:

grep x [at] x.com /var/log/maillog | tee somefile.txt

extract the message ids from somefile.txt to msgids.txt and then do

grep -f msgids.txt /var/log/maillog | grep y [at] y.com

or something similar.

An obvious workaround would be to put the logs into an SQL db (each
record would represent one message, and have columns for message id,
sender, recipient, sending relay, receiving relay, message status,
xdelay, etc. Then I could do:

SELECT * FROM logs WHERE sender = 'x [at] x.com' AND recipent = 'y [at] y.com';

Is there any open source software that does this? I believe splunk
and
some other commercial products may do this, but I'm looking for
something open source.
Kelly Jones [ Fr, 24 August 2007 03:01 ] [ ID #1804725 ]

Re: Converting sendmail logs to SQL

In article <1187917309.343053.6130 [at] q5g2000prf.googlegroups.com>
"kelly.terry.jones" <kelly.terry.jones [at] gmail.com> writes:
>Sendmail's default logging is ugly, because the logs for a single
>message are spread across several lines.

Well, that's because it's, uh, a log - and all things don't happen to a
message at a single point in time.

> If I want to find emails
>from
>x [at] x.com to y [at] y.com, I can't do:
>
>grep x [at] x.com /var/log/maillog | grep y [at] y.com
>
>because the sender and recipient aren't on the same log line.
>Instead,
>you have to do:
>
>grep x [at] x.com /var/log/maillog | tee somefile.txt
>
>extract the message ids from somefile.txt to msgids.txt and then do
>
>grep -f msgids.txt /var/log/maillog | grep y [at] y.com
>
>or something similar.

I occasionally use the perl script below - it prints all lines for a
message if one of them matches the given pattern, so it doesn't quite do
what you want (match on the combined lines), but could easily be changed
to do so. However it will read the complete log into memory, so will
probably not be usable by really high-volume sites.

>An obvious workaround would be to put the logs into an SQL db (each
>record would represent one message, and have columns for message id,
>sender, recipient, sending relay, receiving relay, message status,
>xdelay, etc. Then I could do:
>
>SELECT * FROM logs WHERE sender = 'x [at] x.com' AND recipent = 'y [at] y.com';

Well, in general messages may have multiple recipients (one of the
reasons for the multiple log lines, since delivery to different
recipients may be days apart), but you could probably handle that
somehow.

>Is there any open source
matching entries may be good enough if your volume isn't too high.
otherwise you software that does this? I believe splunk
>and
>some other commercial products may do this, but I'm looking for
>something open source.

I don't know of anything like that (which doesn't mean anything since
I've never looked for it). But it doesn't seem like a difficult problem,
assuming you know how to feed data into your SQL db. Using a syslogd
implementation that allows you to specify a program/script to be fed
matching entries, and a bit of regexp matching in that program/script,
should do the trick. FreeBSD syslogd has this funtionality, as well as
syslog-ng that can be used on on a variety of Unices.

--Per Hedeland
per [at] hedeland.org

------------------------
#!/usr/bin/perl

$pattern = shift;

while (<>) {
split;
$id = $_[5];
$lines{$id} .= $_;
if (!$want{$id} && /$pattern/o) {
$want{$id} = 1;
$wantid[++$n] = $id;
}
}
for ($i = 1; $i <= $n; $i++) {
print $lines{$wantid[$i]};
}
per [ Fr, 24 August 2007 22:45 ] [ ID #1804732 ]
Miscellaneous » comp.mail.sendmail » Converting sendmail logs to SQL

Vorheriges Thema: Saved mail file; want to reinsert messages
Nächstes Thema: Ruleset 5 - aliased or not aliased