LinuxQuestions.org


tdominik 03-16-2016 11:29 AM

Postfix on CentOS 6.7 crashes after about 7 hours of running what is wrong
 
My company runs a Postfix relay on CentOS 6.7. It forwards emails from web forms to an Exchange mailbox that customer service uses to respond to users, and it also handles our outgoing mail and customer email subscriptions. The Exchange server's send connector is pointed at this relay. When the relay crashes, mail from customers either does not arrive or is delayed for hours; messages sit in the queue until I reboot the server and restart Postfix. After that the backlog is delivered without problems, but only for a little under 7 hours. The problem first appeared a few months ago as occasional outages that were infrequent and mild, so nobody looked into it. A few days ago it suddenly got worse, and now the relay crashes every 7-8 hours.

We use a PRTG sensor to monitor hardware and applications. Every time this happens, the SMTP response time reported by the sensor jumps from about 10-20 ms to 20,000-60,000 ms. RAM usage also rises but stays low (about 800 MB of 6,000 MB, compared with about 500 MB normally). CPU usage is normal whether the relay crashes or not. I ran yum update and updated the kernel, but that changed nothing.

I have looked at /var/log/maillog and /var/log/messages. I cannot find anything meaningful in the messages log.

This is what I get from /var/log/messages:

Mar 15 14:36:03 relay04 rsyslogd-2177: imuxsock lost 113 messages from pid 1158 due to rate-limiting
Mar 15 15:40:57 relay04 rsyslogd-2177: imuxsock begins to drop messages from pid 1158 due to rate-limiting
Mar 15 15:41:02 relay04 rsyslogd-2177: imuxsock lost 434 messages from pid 1158 due to rate-limiting
Mar 15 15:45:58 relay04 rsyslogd-2177: imuxsock begins to drop messages from pid 1158 due to rate-limiting
Mar 15 15:46:03 relay04 rsyslogd-2177: imuxsock lost 110 messages from pid 1158 due to rate-limiting
Mar 15 16:50:57 relay04 rsyslogd-2177: imuxsock begins to drop messages from pid 1158 due to rate-limiting
Mar 15 16:51:15 relay04 rsyslogd-2177: imuxsock lost 425 messages from pid 1158 due to rate-limiting
Mar 15 16:55:57 relay04 rsyslogd-2177: imuxsock begins to drop messages from pid 1158 due to rate-limiting
Mar 15 16:56:18 relay04 rsyslogd-2177: imuxsock lost 102 messages from pid 1158

PID 1158 belongs to the Postfix qmgr process:

postfix 1158 0.0 0.1 88776 9128 ? S 13:20 0:07 qmgr -l -t fifo -u


It seems Postfix is generating so many log messages that rsyslog is dropping some of them to enforce its rate limit.
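To gauge how much logging is being dropped, the "imuxsock lost N messages" lines can be totalled per PID. The following is only a minimal Python sketch that assumes the rsyslogd line format shown in the excerpt above; the function name lost_per_pid is illustrative, not part of any tool.

```python
import re
from collections import Counter

# Matches rsyslogd lines of the form seen in the excerpt, e.g.
# "... imuxsock lost 113 messages from pid 1158 due to rate-limiting"
LOST_RE = re.compile(r"imuxsock lost (\d+) messages from pid (\d+)")


def lost_per_pid(lines):
    """Return a Counter mapping pid -> total messages rsyslog reported as lost."""
    totals = Counter()
    for line in lines:
        match = LOST_RE.search(line)
        if match:
            totals[int(match.group(2))] += int(match.group(1))
    return totals


# Two lines copied from the /var/log/messages excerpt in the post.
sample = [
    "Mar 15 14:36:03 relay04 rsyslogd-2177: imuxsock lost 113 messages "
    "from pid 1158 due to rate-limiting",
    "Mar 15 15:41:02 relay04 rsyslogd-2177: imuxsock lost 434 messages "
    "from pid 1158 due to rate-limiting",
]
for pid, total in lost_per_pid(sample).most_common():
    print(f"pid {pid}: {total} messages lost")
```

In practice the same function could be fed the open /var/log/messages file instead of the sample list. Lines that only say "begins to drop messages" carry no count and are ignored.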

When I look at /var/log/maillog during outages, I see messages like these, which seem to indicate that the SMTP client is losing connections and holding mail in the queue:

Mar 11 07:46:02 OUR-RELAY postfix/smtp[2123]: connect to mail.crisprn.top[31.192.241.72]:25: Connection timed out
Mar 11 07:46:02 OUR-RELAY postfix/smtp[2123]: 957E860C24: to=<Donald-Trump@dchje.crisprn.top>, relay=none, delay=89791, delays=89702/59/30/0, dsn=4.4.1, status=deferred (connect to mail.crisprn.top[31.192.241.72]:25: Connection timed out)
Mar 11 07:46:02 OUR-RELAY postfix/smtp[2134]: connect to mail.saudisk.top[31.192.241.88]:25: Connection timed out
Mar 11 07:46:02 OUR-RELAY postfix/smtp[2134]: 9944760D14: to=<Sexy-Russian-Singles@sijkt.saudisk.top>, relay=none, delay=82698, delays=82609/59/30/0, dsn=4.4.1, status=deferred (connect to mail.saudisk.top[31.192.241.88]:25: Connection timed out)
Mar 11 07:46:02 OUR-RELAY postfix/smtp[2135]: connect to mail.awregal.top[179.43.133.146]:25: Connection timed out
Mar 11 07:46:02 OUR-RELAY postfix/smtp[1941]: connect to mail.lostuq.top[93.174.90.212]:25: Connection timed out
Mar 11 07:46:02 OUR-RELAY postfix/smtp[2140]: connect to mail.stiffsk.top[179.43.133.135]:25: Connection timed out
Mar 11 07:46:02 OUR-RELAY postfix/smtp[2110]: connect to mail.tpworst.top[66.85.78.101]:25: Connection timed out
Mar 11 07:46:06 OUR-RELAY postfix/smtp[2178]: 983B1608CB: to=<nataly@mycompanyname.com>, relay=mx1.mycompanyname.com[38.102.x.x]:25, conn_use=3, delay=47, delays=0.71/37/5/5, dsn=4.3.2, status=deferred (host mx1.mycompanyname.com[38.102.x.x] said: 421 4.3.2 Service not available (in reply to MAIL FROM command))
Mar 11 07:46:08 OUR-RELAY postfix/smtpd[1436]: connect from 44-227.soderhamn.com[80.245.227.44]
Mar 11 07:46:09 OUR-RELAY postfix/smtpd[1436]: NOQUEUE: reject: RCPT from 44-227.soderhamn.com[80.245.227.44]: 554 5.7.1 Service unavailable; Client host [80.245.227.44] blocked using zen.spamhaus.org; https://www.spamhaus.org/query/ip/80.245.227.44; from=<Diamond_Suzette@baljobe.com> to=<laurat@mycompanyname.com> proto=SMTP helo=<44-227.soderhamn.com>
pt=1 (queue active)
Mar 11 07:49:36 OUR-RELAY postfix/smtp[2193]: A1FC060636: to=<noreply@mycompanyname.com>, relay=mx1.mycompanyname.com[38.102.x.x]:25, delay=17282, delays=17277/0/0.1/5, dsn=4.3.2, status=deferred (host mx1.mycompanyname.com[38.102.x.x] said: 421 4.3.2 Service not available (in reply to MAIL FROM command))
Mar 11 07:49:36 OUR-RELAY postfix/smtp[2152]: ADE9D60FBB: to=<amanda@mycompanyname.com>, relay=mx1.mycompanyname.com[38.102.x.x]:25, delay=12981, delays=12976/0.01/0.1/5, dsn=4.3.2, status=deferred (host mx1.mycompanyname.com[38.102.x.x] said: 421 4.3.2 Service not available (in reply to MAIL FROM command))
Mar 11 07:49:36 OUR-RELAY postfix/smtp[2235]: AD1B160F74: to=<noreply@mycompanyname.com>, relay=mx1.mycompanyname.com[38.102.x.x]:25, delay=17123, delays=17118/0.03/0.08/5, dsn=4.3.2, status=deferred (host mx1.mycompanyname.com[38.102.x.x] said: 421 4.3.2 Service not available (in reply to MAIL FROM command))
Mar 11 07:51:59 OUR-RELAY postfix/smtp[2115]: 756E46121C: to=<nataly_tully@gmail.com>, relay=gmail-smtp-in.l.google.com[2607:f8b0:4001:c1e::1b]:25, delay=2.1, delays=0.67/0/0.69/0.69, dsn=2.0.0, status=sent (250 2.0.0 OK 1457711519 m8si3883984igv.95 - gsmtp)
Mar 11 07:51:59 OUR-RELAY postfix/qmgr[1227]: 756E46121C: removed
Mar 11 07:52:01 OUR-RELAY postfix/smtp[2127]: connect to gmail.co[209.85.145.17]:25: Connection timed out
Mar 11 07:52:01 OUR-RELAY postfix/smtp[2139]: connect to gmail.co[209.85.145.83]:25: Connection timed out
Mar 11 11:04:31 OUR-RELAY postfix/smtp[7109]: connect to mail.bonuscardslook.top[69.162.107.243]:25: Connection refused
Mar 11 11:04:31 OUR-RELAY postfix/cleanup[6916]: 4D9A06021B: message-id=<006e01d17bc8$ce2b12a0$6a8137e0$@cathyphotography.com>
Mar 11 11:04:31 OUR-RELAY postfix/smtp[6791]: 0393960326: to=<CostcoReward@klsxx.bonuscardslook.top>, relay=none, delay=2383, delays=2383/0/0.11/0, dsn=4.4.1, status=deferred (connect to mail.bonuscardslook.top[69.162.107.243]:25: Connection refused)
Mar 11 11:04:31 OUR-RELAY postfix/smtp[7105]: E3C9260F67: to=<CostcoReward@klsxx.bonuscardslook.top>, relay=none, delay=2360, delays=2360/0.03/0.07/0, dsn=4.4.1, status=deferred (connect to mail.bonuscardslook.top[69.162.107.243]:25: Connection refused)
Mar 11 11:04:31 OUR-RELAY opendkim[996]: 4D9A06021B: no signing table match for 'cc@cathyphotography.com'
Mar 11 11:04:31 OUR-RELAY postfix/smtp[5805]: 28D8E60F5C: to=<CostcoReward@klsxx.bonuscardslook.top>, relay=none, delay=2366, delays=2366/0/0.1/0, dsn=4.4.1, status=deferred (connect to mail.bonuscardslook.top[69.162.107.243]:25: Connection refused)
Mar 11 11:04:31 OUR-RELAY postfix/smtp[7108]: BE40A60395: to=<CostcoReward@klsxx.bonuscardslook.top>, relay=none, delay=2378, delays=2378/0.05/0.05/0, dsn=4.4.1, status=deferred (connect to mail.bonuscardslook.top[69.162.107.243]:25: Connection refused)
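One way to see which destinations account for the deferred mail in an excerpt like the one above is to count status=deferred entries per recipient domain. This is a rough Python sketch that assumes the postfix/smtp delivery line layout shown in the log (to=<...>, then later status=...); the name deferred_by_domain is made up for illustration.

```python
import re
from collections import Counter

# Captures the recipient domain and the delivery status from a
# postfix/smtp line such as:
#   "... to=<user@example.com>, relay=none, ..., status=deferred (...)"
DELIVERY_RE = re.compile(r"to=<[^@>]*@([^>]+)>.*?status=(\w+)")


def deferred_by_domain(lines):
    """Count deferred deliveries per recipient domain (lower-cased)."""
    counts = Counter()
    for line in lines:
        match = DELIVERY_RE.search(line)
        if match and match.group(2) == "deferred":
            counts[match.group(1).lower()] += 1
    return counts


# Two lines copied from the maillog excerpt in the post.
sample = [
    "Mar 11 11:04:31 OUR-RELAY postfix/smtp[6791]: 0393960326: "
    "to=<CostcoReward@klsxx.bonuscardslook.top>, relay=none, delay=2383, "
    "delays=2383/0/0.11/0, dsn=4.4.1, status=deferred (connect to "
    "mail.bonuscardslook.top[69.162.107.243]:25: Connection refused)",
    "Mar 11 07:51:59 OUR-RELAY postfix/smtp[2115]: 756E46121C: "
    "to=<nataly_tully@gmail.com>, relay=gmail-smtp-in.l.google.com"
    "[2607:f8b0:4001:c1e::1b]:25, delay=2.1, delays=0.67/0/0.69/0.69, "
    "dsn=2.0.0, status=sent (250 2.0.0 OK 1457711519 m8si3883984igv.95 - gsmtp)",
]
for domain, count in deferred_by_domain(sample).most_common():
    print(f"{count:6d}  {domain}")
```

Lines without a status= field, such as the NOQUEUE reject and the bare "connect to ... timed out" entries, are skipped. Only completed delivery attempts marked deferred are counted.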

dijetlo 03-17-2016 03:22 AM

Have you checked the process limits on the mail server?
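"Process limits" could mean Postfix's own default_process_limit setting (shown by postconf default_process_limit) or the per-process limits the kernel enforces. Assuming the latter, here is a small Python sketch that reads /proc/<pid>/limits on Linux. The helper names parse_limits and read_limits are hypothetical, and the parsing assumes the usual column layout of that file.

```python
import re

# One data row of /proc/<pid>/limits looks like:
#   "Max open files            1024                 4096                 files"
# The limit name is followed by at least two spaces, then soft limit,
# hard limit, and an optional units column.
LIMIT_RE = re.compile(r"^(.+?)\s{2,}(\S+)\s+(\S+)(?:\s+(\S+))?\s*$")


def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the column header row
        match = LIMIT_RE.match(line)
        if match:
            limits[match.group(1)] = (match.group(2), match.group(3))
    return limits


def read_limits(pid):
    """Read the kernel-reported limits of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as handle:
        return parse_limits(handle.read())
```

For example, read_limits(1158)["Max open files"] would return the soft and hard file-descriptor limits as strings for the qmgr PID mentioned earlier, provided that process is still running and the script has permission to read its /proc entry.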

