Azure Went Dark
And down went all Microsoft 365 services around the world.
One popular argument against trusting your business to the cloud is that if your hyperscale cloud provider goes down, so does your business. Well, early on the U.S. East Coast morning, it happened. Microsoft Azure went down, and along with it went Microsoft 365, Exchange Online, Outlook, SharePoint Online, OneDrive for Business, GitHub, Microsoft Authenticator, and Teams. In short, pretty much everything running on Azure went boom.
Azure’s status page revealed the outage hit everything in the Americas, Europe, Asia-Pacific, the Middle East, and Africa. The only area to avoid the crash was China.
Microsoft first reported the problem at 2:31 a.m. Eastern, just as Europe was getting to work. The Microsoft 365 Status Twitter account reported, “We’re investigating issues impacting multiple Microsoft 365 services.”
Of course, by that time, users were already screaming. As one Reddit user on the sysadmin subreddit wrote, “Move it to the cloud, they said, it will never go down, they said, we will save so much money, they said.”
Later, Microsoft reported, “We’ve rolled back a network change that we believe is causing impact. We’re monitoring the service as the rollback takes effect.” By 9:31 a.m., Microsoft said the disaster was over. “We’ve confirmed that the impacted services have recovered and remain stable.” But, “We’re investigating some potential impact to the Exchange Online Service.” So, Exchange admins and users? Don’t relax just yet.
What Caused It?
So, what really caused it? Microsoft isn’t saying, but my bet, as a former network administrator, is that it was either a Domain Name System (DNS) or Border Gateway Protocol (BGP) misconfiguration. Given the sheer global reach of the failure across multiple Azure regions, I’m putting my money on BGP: a DNS problem tends to fail gradually as cached records expire, while a bad BGP route change can make entire address blocks unreachable everywhere at once.
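To see why a BGP misconfiguration has such a wide blast radius, here’s a minimal toy sketch. It is not Azure’s actual configuration or real BGP code — the router names, prefix, and next-hop values are all illustrative assumptions — but it shows the key property: every peer that learned a route from the same announcement loses it the moment that route is withdrawn.

```python
# Toy model of BGP announce/withdraw behavior.
# All names (regions, prefix, next hop) are illustrative assumptions,
# not details of Azure's real network.

class Router:
    def __init__(self, name):
        self.name = name
        self.routes = {}  # prefix -> next hop learned via announcements

    def announce(self, prefix, next_hop):
        # A peer announces reachability for a prefix.
        self.routes[prefix] = next_hop

    def withdraw(self, prefix):
        # A withdrawal removes the route entirely.
        self.routes.pop(prefix, None)

    def can_reach(self, prefix):
        return prefix in self.routes

# Routers in several regions all learn the same announcement...
regions = [Router(r) for r in ("americas", "europe", "apac")]
for r in regions:
    r.announce("20.0.0.0/8", next_hop="cloud-edge")

# ...so one bad withdrawal, propagated to every peer, takes the
# prefix out of every region's table at the same time.
for r in regions:
    r.withdraw("20.0.0.0/8")

print(all(not r.can_reach("20.0.0.0/8") for r in regions))  # True
```

Contrast that with DNS: resolvers cache answers for the record’s TTL, so even a bad DNS change usually degrades service unevenly over minutes or hours rather than knocking every region offline at once — which is why a near-simultaneous, worldwide outage smells like routing.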