Yesterday around 12:36 AM my main server, mowgli, went into a temporary coma (a disk volume fell down and did not get back up) and stopped receiving mail.

No problem: thanks to the magic of DNS MX records, mail goes to my backup server, dixie. Good thing I was clever and had dixie forward all mail to Optimum Online’s mail relay… When the relay got the dembowski.net mail, it tried to deliver it to mowgli (which was down) and then fell back to dixie. So the mail dixie received went into a loop between dixie and my ISP’s mail relay.

Each hop adds a Received: line to the message’s headers, and when an MTA sees that a message is looping, it typically sends the sender a non-delivery notice and discards the original mail.
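To make the hop counting concrete, here is a minimal Python sketch of how a loop can be spotted by counting Received: headers. It is not how any particular MTA implements it. The relay hostname relay.example.net, the helper names, and the sample message are made up for illustration, and the limit of 50 mirrors what I understand to be the default of Postfix's hopcount_limit parameter.

```python
from email import message_from_string

# Assumed limit for this sketch; Postfix's hopcount_limit is 50 by default
# as far as I know.
HOP_LIMIT = 50


def count_hops(raw_message: str) -> int:
    """Count Received: headers; each MTA that handles the mail adds one."""
    msg = message_from_string(raw_message)
    return len(msg.get_all("Received", []))


def looks_like_loop(raw_message: str, limit: int = HOP_LIMIT) -> bool:
    """Treat a message as looping once it has more hops than the limit."""
    return count_hops(raw_message) > limit


def build_looping_message(hops: int) -> str:
    """Fake a message bounced between dixie and a hypothetical ISP relay."""
    hosts = ["dixie.dembowski.net", "relay.example.net"]  # relay name is made up
    received = [f"Received: by {hosts[i % 2]}; hop {i}" for i in range(hops)]
    return "\n".join(received + ["Subject: test", "", "body"])
```

Calling looks_like_loop(build_looping_message(60)) reports a loop, while a message with only a handful of hops passes.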

I lost about 20 hours of mail messages for my domain. Once mowgli was fixed I made a change to mowgli’s Postfix configuration. In the main.cf file I changed this line from

smtpd_recipient_restrictions = permit_sasl_authenticated, permit_mynetworks, reject_unauth_destination, reject_rbl_client zen.spamhaus.org

to now include a whitelist

smtpd_recipient_restrictions = permit_sasl_authenticated, permit_mynetworks, reject_unauth_destination, check_client_access hash:/etc/postfix/whitelist, reject_rbl_client zen.spamhaus.org

The /etc/postfix/whitelist file just contains one line for dixie’s IP address

24.46.186.255 OK

I ran postmap hash:/etc/postfix/whitelist and tested. From dixie I was able to telnet to mowgli on TCP port 25 and send mail by typing in the SMTP commands directly. Before this I would get an error message like

554 Service unavailable; Client host [24.46.186.255] blocked using zen.spamhaus.org; http://www.spamhaus.org/query/bl?ip=24.46.186.255
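For anyone who has not typed SMTP by hand, the telnet test amounts to a short list of commands. The Python sketch below only builds that list so it can be pasted into a telnet session; it does not connect to anything. The sender and recipient addresses and the function name are placeholders I made up, not the ones I actually used.

```python
def smtp_dialogue(helo_host: str, sender: str, recipient: str, body: str) -> list[str]:
    """Return the lines a person would type into 'telnet mowgli 25'."""
    return [
        f"HELO {helo_host}",
        f"MAIL FROM:<{sender}>",
        # Recipient restrictions are applied here, so a blocked client
        # gets its 554 in response to this command.
        f"RCPT TO:<{recipient}>",
        "DATA",
        "Subject: whitelist test",
        "",
        body,
        ".",  # a lone dot ends the message
        "QUIT",
    ]


# Hypothetical addresses, for illustration only.
for line in smtp_dialogue(
    "dixie.dembowski.net",
    "test@dixie.dembowski.net",
    "postmaster@dembowski.net",
    "Testing the whitelist.",
):
    print(line)
```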

Now my main server accepts mail from the one IP address on the whitelist before the Spamhaus check occurs. The reject_rbl_client check is still working (open mail relays are BAD); it’s just my one IP address that gets a pass.
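The reason the position of the whitelist matters is that Postfix walks the restriction list left to right and stops at the first entry that gives a definite answer. Here is a rough Python model of that behavior. It is a simplification and not Postfix's code: the client dictionaries and their field names are invented, and each restriction is reduced to a one-line stand-in.

```python
WHITELIST = {"24.46.186.255"}  # stands in for /etc/postfix/whitelist


# Each stand-in returns "OK", "REJECT", or None (no opinion, keep going).
def permit_sasl_authenticated(c):
    return "OK" if c["sasl"] else None

def permit_mynetworks(c):
    return "OK" if c["in_mynetworks"] else None

def reject_unauth_destination(c):
    return None if c["dest_is_ours"] else "REJECT"

def check_client_access(c):
    return "OK" if c["ip"] in WHITELIST else None

def reject_rbl_client(c):
    return "REJECT" if c["rbl_listed"] else None


OLD = [permit_sasl_authenticated, permit_mynetworks,
       reject_unauth_destination, reject_rbl_client]
NEW = OLD[:3] + [check_client_access] + OLD[3:]


def evaluate(restrictions, client):
    """First restriction with a definite verdict wins."""
    for check in restrictions:
        verdict = check(client)
        if verdict is not None:
            return verdict, check.__name__
    return "OK", "end of list"


# Hypothetical description of dixie as seen by mowgli.
dixie = {"ip": "24.46.186.255", "sasl": False, "in_mynetworks": False,
         "dest_is_ours": True, "rbl_listed": True}

print(evaluate(OLD, dixie))
print(evaluate(NEW, dixie))
```

With the old list dixie falls through to reject_rbl_client and is rejected; with the new list check_client_access answers first. Because the whitelist sits after reject_unauth_destination, even a whitelisted client cannot relay mail to domains that are not mine.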

The configuration on my backup server dixie was simple. I added one line to main.cf

transport_maps = hash:/etc/postfix/transport

The file /etc/postfix/transport contained

dembowski.net smtp:[mowgli.dembowski.net]

I ran postmap hash:/etc/postfix/transport and restarted Postfix. Now when dixie needs to deliver mail to dembowski.net it sends it directly to mowgli. If mowgli is unreachable, dixie will just queue up the mail until mowgli becomes available. Every other domain gets forwarded to my ISP’s mail relay and all is good.
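The routing decision dixie now makes can be sketched in a few lines of Python. This is a simplified model that only does an exact match on the recipient domain, where the real transport table supports more lookup forms. The relay name [relay.example.net] is a placeholder, since I have not written out my ISP's relay host here, and the function name is made up.

```python
# Mirrors the single entry in /etc/postfix/transport.
TRANSPORT = {"dembowski.net": "smtp:[mowgli.dembowski.net]"}

# Placeholder for the ISP relay; the real hostname is not shown in this post.
RELAYHOST = "[relay.example.net]"


def next_hop(recipient: str, transport=TRANSPORT, relayhost=RELAYHOST) -> str:
    """Pick where dixie sends a message: a transport entry, else the relay."""
    domain = recipient.rsplit("@", 1)[1].lower()
    return transport.get(domain, relayhost)


print(next_hop("someone@dembowski.net"))
print(next_hop("friend@example.org"))
```

The square brackets around mowgli.dembowski.net tell Postfix to skip the MX lookup and connect to that host directly, which is what keeps dixie from following the MX records back to itself or out to the relay.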