I take that back; this is (to me) the most interesting part:
"Although we’ve only ever used datacenter class SSDs and HDDs failures and replacements every few weeks were a regular occurrence on the old fleet of servers. Over the last 3+ years, we’ve only seen a couple of SSD failures in total across the entire upgraded fleet of servers. This is easily less than one tenth the failure rate we used to have with HDDs."
I wanted to revisit this after checking my own anecdata. (But based on logfiles, not just recollections.)
I've had a ZFS system of some sort for about 10 years, and before that I had proprietary RAID chassis like Pegasus2, Synology, etc.
I can't quite say how many drives I have used, because my records are not that good. But maybe it's like 100 drives since 2008. Maybe 150. Less than 200.
I've had over 10 HDD devices fail (probably 13, confidence of like 90%).
I've only ever had 1 SSD fail.
I've also used the absolute cheapest shite SSDs.
I suspect the failure modes tend to be
- hard disks fail whenever the fuck, who knows
- SSDs fail at the beginning or end of their reasonable service life
P.S.
With ZFS though, you don't really care if/when they fail. I've so far (knock on wood) never lost any data with a ZFS config with >1 disk redundancy and reasonable backups.
The why is the interesting part of this article.
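For intuition on the >1 disk redundancy point, here's a rough back-of-envelope sketch in Python. The vdev size and per-disk failure probability are made-up numbers, purely for illustration:

    # Chance of actually losing a raidz2 vdev: more than 2 of its disks
    # have to die before the resilver finishes. The per-disk failure
    # probability over that window (p_disk) is an assumed number, not a measurement.
    from math import comb

    def p_vdev_loss(n_disks, parity, p_disk):
        # P(more than `parity` of `n_disks` fail), assuming independent failures
        return sum(
            comb(n_disks, k) * p_disk**k * (1 - p_disk)**(n_disks - k)
            for k in range(parity + 1, n_disks + 1)
        )

    # 6-disk raidz2, assume a 1% chance any given disk dies during the window:
    print(p_vdev_loss(6, 2, 0.01))  # ~2e-05, versus 1% for a lone unprotected disk

Which is roughly why, with 2 disks of parity, individual drive deaths stop being scary; the backups cover everything else.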