
I got fired from a (sysadmin) job because a servo in a tape library failed while I was working 14-hour days on a conference that I was told was "my only priority".

We didn't need the backups at all, prod was fine, and I found out 2 business days after the servo failed, when the conference was over. Apparently 2 days of not having backups due to a mechanical failure out of my control was beyond unacceptable.




In the early 2000s I was a young guy and board observer on a .com that was growing extremely rapidly and had a 9-day complete outage.

The venture funds, including the one I was part of, came in screaming, "Who's getting fired for this?", to which the CEO responded: "Are you kidding me? We just spent $1.5M educating this team and you want me to hand them over to our competition?"

Every investor and other board member got real quiet, realizing how correct the CEO was. Good CEOs recognize that mistakes happen. The best make sure they retain the people who have learned through those mistakes.


Early 2000s, 9-day complete outage.

Don't know if that was the same one, but this happened to me at Rent.com. The story is that a change in a shell script meant that backups were not actually being sent properly to tape. That was OK, there was another online backup copy. But the restore process deleted that copy for 1 hour each day before it was recreated.

The production database died during that hour. We had to take the last good backup (several months earlier) and replay WAL logs to bring it up to date.

The sysadmin whose mistake it was offered her resignation, and was turned down by the head of tech because he knew she wouldn't have made it if she had had a more reasonable workload. The head of tech offered his resignation to the CEO and was turned down because the CEO knew that it was due to incorrect company priorities.

The next tech hire was a DBA whose sole task was to make sure that we had multiple levels of verified backups.

In less than a year we were sold to eBay at a nice price. Part of the reason was that they thought that the way that we handled failure said very positive things about the organization.


And this is how you run a business. It seems every person who worked with you had integrity and true leadership in spades!

That's rare. Too rare!


Oh, and one other memorable detail that I couldn't make up.

The database went down DURING the CEO's 50th birthday party!


I love happy endings :)


Perhaps one of the senior guys should have learned from past mistakes, but they still ended up in this situation?


Ah, the good old "blameful postmortem." Sounds like a toxic culture...



