Texas Pickup Truck Crash Shuts Down the Internet (2007) (toolbox.com)
8 points by tosh on Sept 16, 2018 | 4 comments



> emergency crews shut off power to rescue the truck driver without notifying Rackspace, the double whammy that the company had never planned on: the double power outage within the 30-minute chiller up-cycle. Oops.

Wow, what a design fault.

I used to do data center electrical/mechanical commissioning, which means putting equipment and systems through their paces to prove things work so the vendor can get paid. We had a fairly standard process that we applied to validation - but every once in a while we'd have to skip/tweak a test because of something specific to the datacenter. Usually, this was a good thing, like an extra layer of redundancy - but sometimes it was a glaring assumption that we had to run with, despite our objections.

I remember one DC had spec'd their cooling based on the power needs and climate of the area. They had balanced things out to very thin margins - which was in some ways very efficient. But... they didn't factor in global warming. Come summer, some days they could only load to 50% capacity because they couldn't keep things cool enough. They ended up redoing much of the DC to support beefier cooling.

Single points of failure can take many forms and occur at many points along a redundancy plan. The industry has oodles more examples to pull from. But now that I'm nerdier, I think the better approach is to tolerate DC failure in your topology rather than try to prop up unicorn buildings. IIRC, Google doesn't even have backup generators, because utility power is good enough.


I'll never forget the phone call after the truck crash. It was about noon, and I was scheduled to start a support shift at 2pm. My team lead called, and not long after I was in Datapoint along with every other available racker, answering the phones.

The way Rackspace responded to that incident has forever shaped my perspective on what makes up an acceptable level of customer service.

Fanatical Fuckin Support.


Does your username reference the incident somehow?


Not the event, but the team I was on at the time.

-subs



