I agree with most of the points here, but it's interesting that you mention:
> In my experience, good dedicated servers practically never crash
One of my toy servers (ECC RAM, Xeon CPU, but bought "second hand" via Hetzner's auction) disappeared the other day. I thought maybe a disk had failed, but I couldn't bring it up in their network-booted rescue mode, so I requested a "hands on" power cycle. A few minutes later the server was up again:
> Dear Client.
> A fault in your neighbor servers PSU tripped the fuse of the small rack segment which your server is located in too. We have fixed the issue and now your OS is back online.
Now, I think that box had 700-900 days of uptime before that, and I didn't really have to do anything (or pay) to get it back up.
But it was kind of surprising.
I guess all I'm saying is that I do like cheap, dedicated servers from Hetzner - but if you need to guarantee five nines uptime, the architecture part is important.
> I guess all I'm saying is that I do like cheap, dedicated servers from Hetzner - but if you need to guarantee five nines uptime, the architecture part is important.
Five nines is a little over 5 minutes of downtime per year (about 5.26). I doubt anyone is really guaranteeing that without 24/7 active monitoring and extensive automated failover systems, and maintaining those is already several full-time jobs. No solo operator is credibly providing that level of service.
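For anyone who wants to check the arithmetic, here is a minimal sketch of the downtime budget at a few availability levels. It assumes a 365-day year and ignores leap years; the function name is just illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Allowed downtime in minutes per year for an availability fraction (e.g. 0.99999)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# Print the yearly downtime budget for three, four and five nines.
for label, availability in [
    ("three nines", 0.999),
    ("four nines", 0.9999),
    ("five nines", 0.99999),
]:
    print(f"{label}: {downtime_minutes_per_year(availability):.2f} min/year")
```

Five nines comes out at roughly 5.26 minutes per year, so a single hands-on power cycle like the one described above can use up the whole budget.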
I’m running five nines with my setup and I’m the only operator. Monitoring and automatic failover are not difficult, but I think they require a solid architecture from the ground up. When I first started in 2011 I was running DRBD in VMs and zebra to unicast my presence. Later upgrades were incremental steps toward more resilience, up to where I am today with a fully redundant architecture in 2 data centers. In fact, the only thing that made me miss my uptime target one year was failed generator maintenance by my provider.
> I’m running five nines with my setup and I’m the only operator. Monitoring and automatic failover are not difficult, but I think they require a solid architecture from the ground up.
OK, I concede that it is not completely inconceivable. But unless the service you're operating is relatively light in its demands on the tech stack, I think it's a very impressive achievement to maintain infrastructure on your own that consistently and reliably delivers that performance, especially if you're also the person doing the development work and your infrastructure costs aren't getting silly.
We have a simple, fully redundant architecture at one of my businesses as well, and I suppose we probably do achieve five nines most years, but I wouldn't be willing to guarantee that to customers who would have serious money on the line if we missed it. We're still only D disk failures at similar times away from degraded performance while we spin up new machines from scratch, or N network failures away from degraded performance until we can bring up more capacity where it's still available.
Agreed. I doubt most people/services should build to 'guarantee' even .9999.
.9990 or .9995 is much cheaper, much easier, and probably closer to what your end user's network connectivity is anyway. (Yes, they're multiplicative, but if your user is connecting from a single-path residential connection and a $50 router, your fifth nine gets lost in the noise of their local 5-10 hours of downtime per year.)
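To put rough numbers on the "multiplicative" point, here is a small sketch. It assumes the user's connection and the service fail independently and sit in series, and the 0.999 figure for a residential connection is a made-up illustration, not a measurement.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def end_to_end(*availabilities: float) -> float:
    """Availability of components in series, assuming independent failures."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def downtime_hours_per_year(availability: float) -> float:
    """Expected downtime in hours per year for an availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


user_link = 0.999  # hypothetical residential connection, about 8.76 h/year down

# Compare what the user actually experiences with a four-nines vs five-nines service.
for label, service in [("four nines", 0.9999), ("five nines", 0.99999)]:
    total = end_to_end(user_link, service)
    print(f"service at {label}: user sees about "
          f"{downtime_hours_per_year(total):.2f} h/year of downtime")
```

Under those assumptions, going from four to five nines on the server side changes what this user experiences by well under an hour per year, against the roughly nine hours their own connection already costs them.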