MX01 SmarterMail – Outage Report for 06/29/2016

Incident Report

At 6:30PM UTC on 06/29/2016, our primary SmarterMail server (MX01) experienced a full outage that interrupted all incoming and outgoing mail.

Timeline (times are UTC)

  • 6:30PM Outage occurred
  • 6:33PM Technician dispatched to verify local equipment
  • 6:45PM Technician reported that the server was unable to power on
  • 6:45PM Technician replaced both power supplies and reported that the server was still not powering on
  • 7:20PM Technician replaced the motherboard and reported that the server was still unable to power on
  • 7:30PM A replacement standard ATX power supply was ordered and scheduled for pickup
  • 8:45PM Technician installed the power supply and reported that the server powered on successfully
  • 9:00PM Server online and operational

Root Cause

The primary SmarterMail server (MX01) uses dual power supplies for redundancy in case of a single PSU failure. After contacting Supermicro, we were informed that the internal power distribution unit that both power supplies plug into had failed, leaving the server without power. This component is not stocked at our data center and cannot be picked up locally; it must be ordered directly from Supermicro.

The server is currently operational, but is running on a standard ATX power supply. Once we’ve received the replacement power distribution unit from Supermicro, we will schedule maintenance to install it and restore the server to full working order.

We’d like to thank everyone for their patience and understanding during this time and for choosing ASPnix!