> Borg will remain orders of magnitude beyond Kubernetes until Kubernetes is completely rearchitected. It’s not scalability bugs. It’s decisions regarding how the cluster maintains state that hamstring it, and that’s so fundamental to everything it’s not a find/squish loop.
Can you say more about this? Borgmaster uses Paxos for replicating checkpoint data, and etcd uses Raft for replicating the equivalent data, but these are really just two flavors of the same algorithm. I don't doubt that there are more efficient ways for Kubernetes to handle state (I don't claim to be an expert in that area), but I don't think those approaches would look any more like Borg than Kubernetes does today.
If you're at liberty to do so, could you say what orchestrators the customers you mentioned chose in lieu of Kubernetes? What scale are they running at for a single cluster?
[Disclaimer: I work on Kubernetes/GKE at Google.]