
The idea I go for with that status check is 'Can this node connect to everything it needs to successfully respond to HTTP requests?'. However, my approach wouldn't identify an overloaded server, which might be a good thing if we need to scale up: taking down an overloaded server is just going to make the other servers that much more overloaded.
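
To make that concrete, here is a rough sketch of the kind of check I mean, not the actual code. It assumes each dependency can be probed with a plain TCP connect; `tcp_reachable`, `health_status`, and the dependency names are made up for illustration. The endpoint reports 200 only if every dependency probe passes, and 503 otherwise. Note that nothing here measures load, so an overloaded but connected node still reports healthy.

```python
import socket
from typing import Callable, Dict, Tuple


def tcp_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def health_status(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, Dict[str, bool]]:
    """Run every dependency check; return 200 only if all of them pass, else 503."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that blows up counts as a failed dependency.
            results[name] = False
    code = 200 if all(results.values()) else 503
    return code, results
```

You would wire it up with something like `health_status({"db": lambda: tcp_reachable("db.internal", 5432)})`, where the host and port are placeholders, and return the status code from the health endpoint.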

I'm always up for hearing about other ways people solve health checks.




> taking down an overloaded server is just going to make the other servers that much more overloaded

Emphatically agree, but it's important to design and deploy your infrastructure in the first place so that basic increases in scale are accounted for; prevention is the most important piece of the puzzle. Get that right and an overloaded server is a symptom of something else, in which case cutting off access to the unruly resource is the first priority.

IMO, the big takeaway here is that they were load balancing simply by only hitting the top level - selectivity is somewhat tedious to build but worth it in the long run.


Just because it can connect to everything doesn't mean it can successfully respond, though.
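
One way to close that gap, sketched under assumptions: instead of only opening connections, run a small end-to-end probe (say, a cheap real request through the normal handler path) and require it to finish within a deadline. `responds_in_time` and the probe callable are hypothetical names, and the 0.5 second default is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable


def responds_in_time(probe: Callable[[], bool], timeout: float = 0.5) -> bool:
    """Return True only if the probe finishes within the timeout and reports success."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(probe)
    try:
        return bool(future.result(timeout=timeout))
    except FutureTimeout:
        # The node may be connected to everything but still too slow to answer.
        return False
    except Exception:
        # The probe itself failed, so the node cannot serve this request.
        return False
    finally:
        # Do not block the health endpoint waiting for a stuck probe thread.
        pool.shutdown(wait=False)
```

The catch is the one raised earlier in the thread: a check like this will mark an overloaded node as unhealthy, so what the load balancer does with that signal needs some thought.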



