Hacker News new | past | comments | ask | show | jobs | submit login

No, I'm not surprised either. But if you're operating at this kind of scale and with this level of immediate roll-out, what I would expect are:

* A staggered process for the roll-out, so that machines that are updated check-in with some metrics that say "this new version is OK" (aka "canary deployment") and that the update is paused/rolled back if not.

* Basic smoke testing of the files before they're pushed to any customers

* Validation that the file is OK before accepting an update (via a checksum or whatever, matched against the "this update works" automated test checksums)

* Fuzz tests that broken files don't brick the machine

Literally any of the above would have saved millions and millions of dollars today.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: