In context: Located about one degree of latitude (137 km) north of the equator, the city-state of Singapore has no distinct seasons, uniform temperature and pressure, and high humidity throughout the year. If something goes wrong during a system upgrade, the tropical climate can pose a challenge to the data centers operating on the island.
On October 14, DBS and Citibank suffered an IT outage that affected millions of payment transactions in Singapore. Banking apps were down, servers could not be reached, and the two banks' customers were left with very few means to pay for their purchases or receive payments. The city-state relies heavily on digital banking systems, an approach that government authorities are now viewing more cautiously.
The October outage resulted in full or partial unavailability of the online banking services provided by DBS and Citibank, Minister Alvin Tan confirmed during a parliamentary Q&A session. The root cause was later identified as a malfunctioning cooling system at the Equinix data center used by both banks, which caused server temperatures to rise above the optimal operating range.
The outage led to 810,000 failed access attempts, Tan said this Monday, along with 2.5 million unsuccessful payment and ATM transactions. According to Equinix, the overheating issue was caused by a contractor who sent an incorrect signal to "close the valves from the chilled water buffer tanks" during a planned system upgrade.
DBS and Citibank had some backup plans prepared for this kind of situation, but those turned out to be absolutely worthless. DBS was unable to reach its backup data center because of a "network misconfiguration," Singapore's government said, while Citibank had some unspecified connectivity issues.
The two financial institutions didn't comply with the requirements set by the Monetary Authority of Singapore (MAS) for the resilience of critical IT systems. MAS requires that unscheduled downtime for critical banking systems not exceed four hours within a 12-month period, and the October issue clearly went beyond that limit.
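As a rough illustration of how a limit like this could be checked, the sketch below adds up unscheduled downtime inside a 12-month window and compares it with the four-hour threshold. The function names, the 365-day approximation of the window, and the outage log are all hypothetical; they are not MAS's method or figures from the October incident.

```python
from datetime import datetime, timedelta

# Limit described above: at most four hours of unscheduled downtime
# within a 12-month period.
MAX_DOWNTIME = timedelta(hours=4)
# Assumption: the 12-month period is approximated as 365 days.
WINDOW = timedelta(days=365)


def downtime_in_window(outages, window_end):
    """Sum the outage time that falls inside the window ending at window_end."""
    window_start = window_end - WINDOW
    total = timedelta(0)
    for start, end in outages:
        # Clip each outage to the window so only the overlapping part counts.
        overlap_start = max(start, window_start)
        overlap_end = min(end, window_end)
        if overlap_end > overlap_start:
            total += overlap_end - overlap_start
    return total


def exceeds_limit(outages, window_end):
    """Return True if cumulative downtime in the window is over the limit."""
    return downtime_in_window(outages, window_end) > MAX_DOWNTIME


# Hypothetical outage log (start, end); not data from the actual incident.
example_outages = [
    (datetime(2030, 3, 1, 9, 0), datetime(2030, 3, 1, 10, 30)),    # 1.5 hours
    (datetime(2030, 9, 10, 14, 0), datetime(2030, 9, 10, 17, 0)),  # 3 hours
]

print(exceeds_limit(example_outages, datetime(2030, 12, 31)))
```

In this made-up log, neither outage breaches the limit on its own, but together they add up to 4.5 hours, which is why the check is cumulative over the window.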
According to Kevin Reed, chief information security officer at Singapore-based backup company Acronis, Equinix should have had a redundant cooling system for its servers. As is often the case, Reed remarked, an incident is not a single issue but "a chain of interconnected events," as the DBS and Citibank case clearly demonstrates.
Minister Tan also had some remarks about the "digital first" approach within Singapore's financial market, which shouldn't be a "digital only" affair. Consumers and businesses should be aware of the risks related to paperless money, and companies should provide alternative payment options for when the servers and apps are unavailable.