Monday night’s Facebook outage, which took the social networking site, WhatsApp and Instagram offline for users across much of the world, is expected to have cost the company roughly $99.75 million (€86.08 million), according to calculations by Fortune.

This figure is based on the company’s second-quarter earnings, which showed revenue of $29.08 billion (€25.09 billion) over a 91-day period.
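
The arithmetic behind an estimate of this kind can be sketched in a few lines. The article does not describe Fortune's exact method, so the snippet below assumes revenue is spread evenly across every hour of the quarter; the variable names are illustrative only.

```python
# Figures quoted in the article
QUARTER_REVENUE_USD = 29.08e9   # second-quarter revenue
QUARTER_DAYS = 91               # length of the reporting period
ESTIMATED_LOSS_USD = 99.75e6    # Fortune's estimate of the outage cost

# Assumption: revenue accrues evenly over every hour of the quarter
revenue_per_hour = QUARTER_REVENUE_USD / (QUARTER_DAYS * 24)

# Hours of average revenue that the estimated loss corresponds to
implied_outage_hours = ESTIMATED_LOSS_USD / revenue_per_hour

print(f"Average revenue per hour: ${revenue_per_hour / 1e6:.2f} million")
print(f"Hours of revenue implied by the estimate: {implied_outage_hours:.1f}")
```

Under that even-spread assumption, the company earns a little over $13 million an hour, and the $99.75 million estimate equates to roughly seven and a half hours of average revenue.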

It represents the most catastrophic downtime for the company since 2019, when a technical error knocked its sites offline for more than 14 hours, leaving them mostly inaccessible across the world.

So, what actually happened?

Though the company was largely quiet on its official channels throughout the outage, acknowledging it only briefly, its vice president of infrastructure, Santosh Janardhan, has since published a statement apologising for the incident and providing some information on what happened.

Essentially, the outage was due to faulty configuration changes on its routers, the effect of which cascaded into other sections of the company’s services, making it harder to deal with the problem.

BusinessNow.mt reached out to cybersecurity and IT expert Leon Allen, director of cybersecurity at Continent 8 Technologies, to expand on what exactly this means.

Firstly, he clarified, the outage was not the result of a cybersecurity attack, but rather it “appears to be a case of human error, after Facebook’s own engineers pushed a flawed network configuration change.

“This change inadvertently withdrew BGP (Border Gateway Protocol) which is the system by which one network figures out the best route to a different network. As such, with no BGP routes into Facebook’s network, Instagram, WhatsApp and Facebook were inaccessible.”
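
To illustrate the point, the toy model below shows what happens to a lookup once a network's routes are withdrawn. It is a deliberately simplified sketch and not an implementation of BGP: the `RoutingTable` class, the `peer-A` label and the address range (a block reserved for documentation) are all hypothetical.

```python
import ipaddress


class RoutingTable:
    """Toy routing table: maps announced prefixes to a next-hop label."""

    def __init__(self):
        self.routes = {}

    def announce(self, prefix, next_hop):
        # A network advertises that it can deliver traffic for this prefix
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        # The advertisement is taken back; other networks forget the route
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        # Pick the most specific (longest) prefix that contains the address
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the destination is unreachable
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.announce("203.0.113.0/24", "peer-A")   # hypothetical prefix and peer
before = table.lookup("203.0.113.10")

table.withdraw("203.0.113.0/24")
after = table.lookup("203.0.113.10")

print("Before withdrawal:", before)
print("After withdrawal:", after)
```

The servers behind the address have not changed between the two lookups. Once the route is withdrawn, however, the rest of the network simply has no way to find them.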

Asked whether the incident demonstrates a vulnerability in Facebook’s cybersecurity, given online speculation that the outage could be due to a ransomware attack, Mr Allen said that this sort of incident was unlikely to be the result of a cyber attack.

“Pushing a BGP change requires a very high level of access and if a bad actor had that level of access, they could have perpetrated far wider levels of damage,” he said.

Human error

So, evidence seems to point to a very bad day at the office for an engineer, but there are other possible explanations, and indeed, it seems notable that Facebook did not mention human error in its statement.

As pointed out by Mr Allen, Facebook launched an automated peering configuration system in May, which could be responsible for the mistake.

Regardless of who’s to blame, it seems the problem was aggravated by difficulties in responding to the issue – reports suggested that engineers and technicians were struggling to gain physical access to the facilities needed to bring the services back online.

This is because, reportedly, the company’s internal systems, such as its electronic smart key card systems, were also knocked offline.
