Sendmail Woes
Greetings
We have a client running FC6 as a mail server. It doesn't do anything
unusual - acts as primary MX for its domain, accepts incoming mail,
runs spamd and a few other things across it and then hands it off to an
MS Exchange Server box for the staff to pick up in Outlook. Outgoing
mail is taken by the MS box and handed off to the mail server as a smart
host.
All works well except for...
We constantly see messages like this in the mail log:
> Sep 18 06:06:05 m1 sendmail[30958]: l8HJ66xS030958: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 06:17:38 m1 sendmail[31148]: l8HJHb5i031148: SYSERR(root): collect: read timeout on connection from mail40.messagelabs.com, from=<xxxxxxxx>
> Sep 18 06:30:06 m1 sendmail[31182]: l8HJU6Dv031182: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 06:33:04 m1 sendmail[31188]: l8HJX54L031188: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 07:04:43 m1 sendmail[31342]: l8HK4hCw031342: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 07:04:51 m1 sendmail[31344]: l8HK4qOx031344: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 07:07:14 m1 sendmail[31500]: l8HK7F4w031500: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxxt>
> Sep 18 07:13:05 m1 sendmail[31516]: l8HKD4Xq031516: SYSERR(root): collect: read timeout on connection from qsrv01sl.mx.bigpond.com, from=<xxxxxxxx>
> Sep 18 07:31:48 m1 sendmail[31558]: l8HKVkGM031558: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 07:34:13 m1 sendmail[31564]: l8HKYBoL031564: SYSERR(root): collect: read timeout on connection from qsrv01ps.mx.bigpond.com, from=<xxxxxxxx>
> Sep 18 07:42:33 m1 sendmail[31582]: l8HKgXed031582: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 07:42:58 m1 sendmail[31584]: l8HKgwxh031584: SYSERR(root): collect: read timeout on connection from qsrv03ps.mx.bigpond.com, from=<xxxxxxxx>
> Sep 18 07:53:04 m1 sendmail[31648]: l8HKr4Px031648: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 08:08:50 m1 sendmail[31723]: l8HL8owr031723: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 08:23:51 m1 sendmail[31970]: l8HLNpYY031970: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 08:23:54 m1 sendmail[31972]: l8HLNrU7031972: SYSERR(root): collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>
> Sep 18 08:29:37 m1 sendmail[31986]: l8HLTakK031986: SYSERR(root): collect: read timeout on connection from host81-151-82-230.range81-151.btcentralplus.com, from=<xxxxxxxx>
> Sep 18 08:50:58 m1 sendmail[32143]: l8HLovdK032143: SYSERR(root): collect: read timeout on connection from mail40.messagelabs.com, from=<xxxxxxxx>
And so on and so forth (addresses obfuscated for privacy).
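To see which remote hosts account for most of these timeouts, a quick tally over the log can help. This is only a minimal sketch, assuming one syslog entry per line in the format shown above; the function name count_timeouts and the sample lines are illustrative, and in practice you would feed it the lines of your own mail log.

```python
import re
from collections import Counter

# Captures the remote host in entries such as:
#   ... SYSERR(root): collect: read timeout on connection from HOST, from=<...>
TIMEOUT_RE = re.compile(r"collect: read timeout on connection from ([^,\s]+)")


def count_timeouts(lines):
    """Return a Counter mapping remote host -> number of read timeouts."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative sample in the same format as the log excerpt.
sample = [
    "Sep 18 06:06:05 m1 sendmail[30958]: l8HJ66xS030958: SYSERR(root): "
    "collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>",
    "Sep 18 06:17:38 m1 sendmail[31148]: l8HJHb5i031148: SYSERR(root): "
    "collect: read timeout on connection from mail40.messagelabs.com, from=<xxxxxxxx>",
    "Sep 18 06:30:06 m1 sendmail[31182]: l8HJU6Dv031182: SYSERR(root): "
    "collect: read timeout on connection from oberon.tpgi.com.au, from=<xxxxxxxx>",
]

# Print hosts ordered by number of timeouts, most frequent first.
for host, hits in count_timeouts(sample).most_common():
    print(hits, host)
```

If one or two upstream hosts dominate the tally, that points at the path to those senders rather than at the mail server itself.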
When I run a ps -eaf | grep sendmail at any time, I see things like this:
> [root@m1 log]# ps -eaf | grep sendmail
> root 873 3079 0 10:37 ? 00:00:00 sendmail: l8I0bZx3000873 mail40.messagelabs.com [216.82.245.83]: DATA
> root 885 3079 0 10:43 ? 00:00:00 sendmail: l8I0h5SZ000885 mail.vanuatu.com.vu [202.80.33.51]: DATA
> root 887 3079 0 10:43 ? 00:00:00 sendmail: l8I0hMVh000887 mail.vanuatu.com.vu [202.80.33.51]: DATA
> root 891 3079 0 10:43 ? 00:00:00 sendmail: l8I0hppk000891 mail.vanuatu.com.vu [202.80.33.51]: DATA
> root 893 3079 0 10:44 ? 00:00:00 sendmail: l8I0iOaa000893 mail.vanuatu.com.vu [202.80.33.51]: DATA
> root 895 3079 0 10:45 ? 00:00:00 sendmail: l8I0j1l2000895 oberon.tpgi.com.au [203.12.160.4]: DATA
> root 909 3079 0 10:52 ? 00:00:00 sendmail: l8I0q6ri000909 mail1.emailcash.com.au [202.177.214.85]: DATA
> root 936 3079 0 10:59 ? 00:00:00 sendmail: l8I0xjWW000936 mail9.tpgi.com.au [203.12.160.104]: DATA
> root 1128 3079 0 11:03 ? 00:00:00 sendmail: l8I13323001128 oberon.tpgi.com.au [203.12.160.4]: DATA
> root 1212 3079 0 11:06 ? 00:00:00 sendmail: l8I16cQU001212 oberon.tpgi.com.au [203.12.160.4]: DATA
> root 1214 3079 0 11:07 ? 00:00:00 sendmail: l8I17767001214 oberon.tpgi.com.au [203.12.160.4]: DATA
> root 1245 3079 0 11:18 ? 00:00:00 sendmail: l8I1IF5R001245 oberon.tpgi.com.au [203.12.160.4]: DATA
> root 1253 3079 0 11:20 ? 00:00:00 sendmail: l8I1KtGc001253 qsrv01sl.mx.bigpond.com [144.140.92.181]: DATA
> root 1303 1173 0 11:37 pts/0 00:00:00 grep sendmail
> root 3079 1 0 Sep04 ? 00:00:02 sendmail: accepting connections
> smmsp 3083 1 0 Sep04 ? 00:00:00 sendmail: Queue runner@00:15:00 for /var/spool/clientmqueue
> root 3087 1 0 Sep04 ? 00:00:00 sendmail: Queue runner@00:15:00 for /var/spool/mqueue
Obviously something is hanging horribly and timing out but I cannot for
the life of me work out what. I've searched for these kinds of messages
and come up with almost nothing apart from some old (and I mean old)
solutions to a BSD problem.
Software is as follows:
FC6
sendmail-8.14.1-4.1.fc6
spamassassin-3.1.9-1.fc6
mailscanner-4.60.8-1
Any input or suggestions short of taxidermy would be (very) gratefully
accepted.
TIA
Nigel.
Re: Sendmail Woes
In article
<46ef3eb8$0$32496$5a62ac22@per-qv1-newsreader-01.iinet.net.au>,
Nigel Allen <"dna at remove this dot com dot au"> wrote:
> Greetings
>
> We have a client running FC6 as a mail server. It doesn't do anything
> unusual - acts as primary MX for its domain, accepts incoming mail,
> runs spamd and a few other things across it and then hands it off to an
> MS Exchange Server box for the staff to pick up in Outlook. Outgoing
> mail is taken by the MS box and handed off to the mail server as a smart
> host.
This is odd, since those timeouts mean that you were expecting something
from the sending side and they were quiet too long. Not so odd when
talking to random spam zombies, but many of those seem to be legit mail
systems.
I'd do three things:
1. If you are behind some sort of firewall that may be playing games
with your mail connections, eliminate that issue first. The canonical
example is the Cisco PIX 'smtp fixup' idiocy, but I've heard reports of
other NAT and packet massaging devices that screw up SMTP in similar
ways, and sometimes that can include dropping the terminating
<CRLF>.<CRLF> of a message.
2. Increase the verbosity of your logging. Your sendmail LogLevel should
be at least 10 (and much higher is not generally useful) and you should
have syslog capturing messages at 'info' severity. You don't mention how
you've integrated SpamAssassin and I am not familiar with Mailscanner,
but you should make sure they are logging in detail as well, in
particular you want to be able to tell how long messages are taking to
scan.
3. Review your SA configuration. That's a slightly old version, and
while the package you have could have fixed some issues, there may still
be some serious problems. The biggest one is that SA by default queries
both the NJABL "dynablock" and CompleteWhois DNSBLs, neither of which
should be used now. CompleteWhois is a particular problem because it
has essentially gone offline, so queries have to time out. That stalls
every SA scan, and the result can be senders timing out while waiting
for your scan to complete. To disable the checking of that DNSBL
altogether, you should add these to /etc/mail/spamassassin/local.cf:
score RCVD_IN_WHOIS_INVALID 0
score RCVD_IN_WHOIS_BOGONS 0
score RCVD_IN_WHOIS_HIJACKED 0
score __RCVD_IN_WHOIS 0
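If you want to check whether a DNSBL lookup is what stalls the scans, you can time a query by hand. A DNSBL is queried by reversing the IPv4 octets and appending the list's zone. The following is a rough sketch only: the zone dnsbl.example.org is a placeholder (substitute whichever zone your SA rules actually query), and dnsbl_query_name and time_lookup are illustrative helper names, not part of SpamAssassin.

```python
import ipaddress
import socket
import time


def dnsbl_query_name(ip, zone):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    return ".".join(reversed(str(addr).split("."))) + "." + zone


def time_lookup(name):
    """Resolve name and return (seconds_taken, answered).

    answered is True if an address came back and False if the lookup
    failed, whether because the IP is not listed or the list never replied.
    """
    start = time.monotonic()
    try:
        socket.gethostbyname(name)
        answered = True
    except OSError:
        answered = False
    return time.monotonic() - start, answered


# Query name for one of the sending hosts from the log; the zone is a
# placeholder, not a real DNSBL.
query = dnsbl_query_name("203.12.160.4", "dnsbl.example.org")
print(query)  # 4.160.12.203.dnsbl.example.org
```

Calling time_lookup(query) performs a real DNS query. A lookup that takes many seconds before failing suggests the list is timing out rather than answering, which is the behaviour that would hold up every scan.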
--
Now where did I hide that website...
Re: Sendmail Woes
On Tue, 18 Sep 2007, Bill Cole wrote:
> you've integrated SpamAssassin and I am not familiar with Mailscanner,
> but you should make sure they are logging in detail as well, in
MailScanner wouldn't be the problem: sendmail must first collect the
message (run as -ODeliveryMode=queueonly), then MailScanner grabs it from
that queue, does its stuff, and hands it back to be delivered by a second
sendmail process, to which MailScanner issues a delivery trigger.
--
Cheers
Res
Re: Sendmail Woes
Addendum
What makes this even weirder is that if I send mail to the customer it
never fails. I've tried small files, big files, even damned big files -
always gets through.
I've run tcpdump on port 25 for an hour or so and then examined the
capture using Ethereal (which always complains - naturally - that the
transactions were not complete). IANAE with Ethereal but it appears that
the normal conversation happens up until the DATA command. Then we either
get nothing or (occasionally) the first packet. Then nothing until the
process times out.
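That would fit how the DATA phase works: once DATA has been accepted, the receiving side keeps reading until it sees a line containing only a dot (<CRLF>.<CRLF>), so if packets stop arriving mid-message all it can do is wait for its read timeout. A minimal sketch of that end-of-data check, with data_complete as an illustrative name (this is not sendmail's actual code):

```python
# End-of-data marker for the SMTP DATA phase: a line holding a single dot.
TERMINATOR = b"\r\n.\r\n"


def data_complete(received: bytes) -> bool:
    """Return True once the bytes received after DATA end with <CRLF>.<CRLF>.

    An empty message is just ".<CRLF>", because the preceding CRLF belongs
    to the DATA command line itself.
    """
    return received == b".\r\n" or received.endswith(TERMINATOR)


# A message that arrived in full, terminator included.
whole = b"Subject: test\r\n\r\nhello\r\n.\r\n"
# Only the first packet arrived; the rest never shows up.
stalled = b"Subject: test\r\n\r\nhel"

print(data_complete(whole))
print(data_complete(stalled))
```

In the stalled case the receiver has no way of knowing the message is finished, which is consistent with the processes sitting in DATA until the timeout is logged.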
Thanks for the comments so far - I can't see it being SA as sendmail is
not getting the data so I believe that SA does not even get a look in at
that stage.
I'm starting to think maybe (as it's only from certain ISPs) that it
might be connection related. Our machines are all FC6 (as are the
customer's) and our router is a small Billion (as is the customer's). Am I
clutching at straws here?
Regards
Nigel.
On 18/09/2007 12:58 PM, Nigel Allen wrote:
> Greetings
>
> We have a client running FC6 as a mail server. It doesn't do anything
> unusual - acts as primary MX for its domain, accepts incoming mail,
> runs spamd and a few other things across it and then hands it off to an
> MS Exchange Server box for the staff to pick up in Outlook. Outgoing
> mail is taken by the MS box and handed off to the mail server as a smart
> host.
Re: Sendmail Woes
Update.
Drove in to the customer's after working on this half the night. Rebooted
the router and would you believe everything went back to normal?
Damnedest thing I ever saw in my life.
The customer has had a litany of problems with their current ISP after
an enforced change from layer 3 to layer 2. After this one I think we'll
be saying adios to the current supplier.
Thanks to all who contributed.
N/
On 19/09/2007 8:43 AM, Nigel Allen wrote:
> Addendum
>
> What makes this even weirder is that if I send mail to the customer it
> never fails. I've tried small files, big files, even damned big files -
> always gets through.