I assume Crowdstrike is software you usually want to update quickly, given it is (ironically) designed to counter threats to your system.
Very easy for us to second-guess today, of course. But in another scenario, a manager is being torn a new one because they fell victim to a ransomware attack via a zero-day that systems were left vulnerable to because Crowdstrike wasn't updated in a timely manner.
Maybe, if there's a new major zero-day exploit that is spreading like wildfire. That's not the normal case. Most successful exploits and ransomware attacks use old vulnerabilities against unpatched and unprotected systems.
Mostly, if you are reasonably timely about keeping updates applied, you're fine.
> Maybe, if there's a new major zero-day exploit that is spreading like wildfire. That's not the normal case.
Sure. And Crowdstrike releasing an update that bricks machines is also not the normal case. We're debating between two edge cases here; the answers aren't simple. A zero-day spreading like wildfire is not normal, but if it were to happen it could be just as destructive as what we're seeing with Crowdstrike, if not more.
In the context of the GP, where they were actively treating a heart attack, the act of restarting the computer (let alone it never coming back) in and of itself seems like an issue.
I believe this update didn't restart the computer; it just loaded some new data into the kernel, which didn't crash anything the previous 1000 times. A successful background update could hurt performance, but machines where that's considered a problem probably just don't run a general-purpose multitasking OS?
Crowdstrike pushed a configuration change that was a malformed file, which was picked up by every computer running the agent (millions of computers across the globe). It's not like hospitals and IT teams are manually running this update and can roll it back.
As to why they didn't catch this during tests, or why they don't perform gradual change rollouts to hosts, your guess is as good as mine. I hope we get a public postmortem for this.
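For what it's worth, a gradual rollout doesn't have to be fancy. Here's a minimal Python sketch of the general idea (hypothetical names, not anything Crowdstrike actually ships): hash each host into a stable bucket and only widen the cohort once earlier stages look healthy.

    import hashlib

    ROLLOUT_STAGES = [0.01, 0.10, 0.50, 1.00]  # fraction of the fleet per stage

    def host_bucket(host_id: str) -> float:
        """Map a host ID to a stable value in [0, 1] so cohorts stay consistent across stages."""
        digest = hashlib.sha256(host_id.encode()).hexdigest()
        return int(digest[:8], 16) / 0xFFFFFFFF

    def should_receive_update(host_id: str, stage: int) -> bool:
        """Hosts whose bucket falls under the current stage's fraction get the new content."""
        return host_bucket(host_id) <= ROLLOUT_STAGES[stage]

    # Stage 0: roughly 1% of hosts pick up the new channel file; widen only if they stay healthy.
    print(should_receive_update("host-0042", stage=0))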
Considering Crowdstrike mentioned in their blog that systems that had their 'falcon sensor' installed weren't affected [1], and the update is Falcon content, I'm not sure it was a malformed file; it may just be software that requires this sensor to be installed. Perhaps their QA only checked whether the update broke systems with the sensor installed, and didn't do a regression check on Windows systems without it.
It says that if a system isn't “affected”, meaning it isn't rebooting in a loop, then the “protection” works and nothing needs to be done. That's because the Crowdstrike central systems, which the agents running on clients' machines rely on, are working well.
The “sensor” is what the clients actually install and run on their machines in order to “use Crowdstrike”.
The crash happened in a file named csagent.sys, which on my machine was something like a week old.
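If anyone wants to check the same thing on their own machine, a quick Python sketch (the driver path is an assumption based on the usual Crowdstrike install location; adjust it if yours differs):

    from datetime import datetime
    from pathlib import Path

    # Typical location of the Crowdstrike agent driver on Windows (assumption).
    driver = Path(r"C:\Windows\System32\drivers\CrowdStrike\csagent.sys")
    if driver.exists():
        mtime = datetime.fromtimestamp(driver.stat().st_mtime)
        print(f"{driver} last modified {mtime:%Y-%m-%d %H:%M}")
    else:
        print("csagent.sys not found at the expected path")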
Likely because staggered updates would harm their overall security service. I'm guessing this software shares telemetry across their clientele, and that gets hampered if you have a thousand different software versions in the field.
My guess is this was an auto-update pushed out by whatever central management server they use. Given CS is supposed to protect you from malware, IT may have staged and pushed the update in one go.