
This is the whole point of chaos engineering, which was invented at Netflix to test the resiliency of these systems.

I guess we now know the limits of what "at scale" is for Netflix's live-streaming solution. They shouldn't be failing at scale on a huge stage like this.

I look forward to reading the post mortem about this.




Everyone keeps mentioning "at scale". I seriously doubt this was an "at scale" problem. I have a strong suspicion this was a failure at the origination point to push a stable signal. That is not an "at scale" issue, but the hubris of thinking "we can do better/cheaper than standard broadcasting practices".


As a counterpoint, I observed 2-3 drops in bitrate, but an otherwise fine experience. So the problem seems to have been in dissemination, not at the origin.


Yeah, I was switching between my phone and desktop to watch the stream and I had a seamless experience on both devices the entire time. I’m not sure why so many people are assuming this was a universal experience.


I highly doubt this. Netflix has a system of OCAs (Open Connect Appliances) that are loaded with hard disks, installed in ISPs’ networks, and serve the majority of those ISPs’ customers.

Given that many people had no problems with the stream, it is unlikely to have been an origin problem and more likely the mechanism that fans out quickly to the OCAs. Normally latency to an OCA doesn’t matter when you’re replicating new catalogs in advance, but live streaming makes a bunch of code that previously “didn’t need to be fast” get promoted to the hot path.


I am not sure that it is an issue with the origination point. In fact, I just thought it was my ISP, because my daughter's boyfriend was watching while on FaceTime with her, and my video was dropping but his was not. I have 2 Gb fiber and we regularly stream to five TVs without any issue, so it should not have been a bandwidth issue.


I tried to watch an old Seinfeld episode during this event. It froze every few minutes, even at a downgraded bitrate, and that is a video that should be on my local CDN node.


If it was a problem at origin, why did it get better/worse as viewership fell/rose?


Perhaps it was, or perhaps it was not.

I was watching a pirated, live retransmission of the event on Twitch (in Portuguese), and there was zero buffering on my end.



