CrowdStrike Outage: The Global Impact of a Software Update Gone Wrong
Transcript
It’s 11:00 PM on 18th July 2024 at Sydney Airport, Australia, and a worker shuts down the machines responsible for checking in customers.
In Terminal 1 International, the airport’s busiest terminal, it’s uncharacteristically quiet. The last few flights have departed, and departures won’t resume until after the terminal reopens at 2:30 AM.
Check-in desks, departure screens, and other Windows-based computer equipment go into sleep mode.
In the background, a key part of the systems’ antivirus is updating.
Unbeknownst to anyone yet, this update will wreak havoc on not just Sydney Airport, but the whole world…
Falcon is an endpoint detection product distributed by the company CrowdStrike Holdings, Inc. The software is intended for enterprises and runs in the background on the computers it is installed on, looking for anomalies that might indicate a vulnerability or cyberattack.
The issue with this software is that, to be so thorough, it has to run at the most privileged level of the computer, inside the core of the operating system itself. Software that runs here is said to be in the system’s “kernel,” and this level of access is typically reserved for drivers, although it’s now also used by advanced types of anti-cheat meant to prevent cheating in online games.
Because this software runs in the critical path of the computer, it has to load and start correctly for the computer to boot into its operating system. If it fails while Windows is booting, Windows will simply show this glorious screen of evil, known rather ominously as the “blue screen of death,” and looking at the global impact, I don’t think it was improperly named.
Back at Sydney Airport, the terminal has reopened, and the opening routine for workers is going as usual, until the computers are booted up. None of the machines are operational; they simply boot-loop as soon as they start.
Overnight, an automatic update had temporarily bricked part of the airport’s most important infrastructure.
Before long, it wasn’t just flights at Sydney Airport. Shortly after, United Airlines grounded its flights, and the damage wasn’t limited to air travel, with the London Stock Exchange also experiencing issues.
Soon, trains were cancelled, Sky News had to pause its broadcast, and self-checkout machines across the world were broken. At hospitals, records were unreachable and machinery inoperable. Even some telecommunication networks and the emergency call centres in Alaska were affected. Not to mention my girlfriend’s favourite coffee shop, which was operating purely by click and collect due to issues with card payments.
If I were to list all of the affected companies and infrastructure, I think I would be here all day. So here is the short version: this is estimated to be the worst IT issue since the WannaCry cyber-attack in 2017, and CrowdStrike, the company behind the software, lost about 16 billion dollars of market cap in one night. While the total costs are still being estimated, it’s likely that the cost is big enough to start with a T and end in -illions.
So that’s all I have for you now. I hope you learnt something, and as it turns out, Y2K was real; it was just 24 years delayed, which is considered a minor delay for Ryanair. Thanks for listening.
Sources:
1 - https://www.bbc.com/news/live/cnk4jdwp49et
2 - https://www.bbc.com/news/live/cn056371561t
3 - https://www.thesun.co.uk/tech/27223882/microsoft-crowdstrike-meltdown-trillions-cost-world-economy/
4 - https://youtu.be/4yDm6xNeYas?si=cEO7anKOoPZidXTP
5 - https://www.youtube.com/watch?v=deJuXfwS7Bo