
>So why are they fucking with disks if ScyllaDB is so good?

The database layer isn't magic. The database can't give you low single-digit-millisecond P99 response times if a single I/O request can stall for almost 2ms.
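To make it concrete, here's a toy simulation (the read count, base latency and stall probability are my own made-up numbers, not anything from the article): if each request fans out to a handful of serial reads and even 1% of them hit a ~2ms stall, the stall owns your P99.

    import random

    def request_latency_us(reads_per_request=8, stall_prob=0.01,
                           base_read_us=120, stall_us=2000):
        # serial reads; each one occasionally hits a ~2ms stall
        return sum(stall_us if random.random() < stall_prob else base_read_us
                   for _ in range(reads_per_request))

    samples = sorted(request_latency_us() for _ in range(100_000))
    p99 = samples[int(len(samples) * 0.99)]
    print(f"simulated request P99: {p99 / 1000:.2f} ms")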

That said, I don't think AWS would fare any better here, as the infrastructure issue is the same. Networked EBS drives on AWS are not going to be magically faster than networked PD drives on GCP. The bottleneck is the same: the length of the cable between the two hosts.




At a huge price, EBS can finally get you near-local-NVMe performance. If you use an io2 volume attached to a sufficiently sized r5b instance (and, I think, a few other instance types), you can achieve 260,000 IOPS and 7,500 MB/s of throughput.
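For reference, the plumbing for that looks roughly like this with boto3 (IDs, sizes and the per-volume IOPS figure below are placeholders; the 260k per-instance aggregate involves io2 Block Express and/or striping across volumes):

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # hypothetical example: a single high-IOPS io2 volume attached to an r5b
    volume = ec2.create_volume(
        AvailabilityZone="us-east-1a",
        Size=1000,            # GiB
        VolumeType="io2",
        Iops=64_000,          # provisioned IOPS for this single volume
    )
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

    ec2.attach_volume(
        VolumeId=volume["VolumeId"],
        InstanceId="i-0123456789abcdef0",   # e.g. an r5b.24xlarge in the same AZ
        Device="/dev/sdf",
    )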

But up until the last year or two you couldn't get anywhere near that with EBS, and I'm sure that as hardware advances, EBS will once again lag and you'll need to come up with similar solutions to close the gap.

Also, I guess AWS would fight them a little less here: the lack of live migration at least means that a failed local disk is just a failed disk, and you can keep using the others.


Google Cloud also has a metric ton of IOPS and throughput on its networked persistent disks. But what this article is talking about is latency.


What are the latency characteristics like for io2? You're mentioning near-local-NVMe performance, but describing throughput (IOPS can be deceptive since they're pipelined, so you could still be seeing 2ms latency at times).
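If someone has one of these volumes attached, what I'd actually measure is single outstanding 4KiB random reads rather than trusting the IOPS number. A rough probe (device path is a placeholder; needs root and Linux):

    import os, time, random, mmap

    DEV = "/dev/nvme1n1"          # placeholder: the io2 volume under test
    BLOCK = 4096
    SAMPLES = 10_000

    fd = os.open(DEV, os.O_RDONLY | os.O_DIRECT)
    buf = mmap.mmap(-1, BLOCK)    # page-aligned buffer, required by O_DIRECT
    dev_size = os.lseek(fd, 0, os.SEEK_END)

    lat_us = []
    for _ in range(SAMPLES):
        off = random.randrange(dev_size // BLOCK) * BLOCK
        t0 = time.perf_counter_ns()
        os.preadv(fd, [buf], off)          # one read at a time, no pipelining
        lat_us.append((time.perf_counter_ns() - t0) / 1000)

    lat_us.sort()
    for p in (50, 99, 99.9):
        print(f"p{p}: {lat_us[int(len(lat_us) * p / 100) - 1]:.0f} us")
    os.close(fd)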


Apparently io2 Block Express (I'm not sure how it differs from plain io2) is capable of "sub-millisecond average latency"[0].

[0]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/provisio...


I have this same conversation about AWS autoscaling all too frequently. It is a cost-control mechanism, not a physics cheat. If you suddenly throw a tidal wave of traffic at a server, that traffic is going to queue and/or drop until there are more servers. If you saturate the network before the CPU (which is easy to do with nginx), or your event loop is too slow to accept the connections so they are dropped without being processed (easy to do in Node.js), then you might not scale at all.
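The arithmetic is brutal even when scaling does kick in (numbers made up for illustration):

    capacity_rps = 2_000          # what the current fleet can serve
    spike_rps = 10_000            # sudden incoming traffic
    scale_up_seconds = 180        # detect + launch + warm up new instances

    excess = (spike_rps - capacity_rps) * scale_up_seconds
    print(f"requests that must queue or drop before scale-out lands: {excess:,}")
    # => 1,440,000 -- no autoscaling policy makes these disappear; they wait or fail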


I don't think it's the cable length: 0.5ms is on the order of 100km, whether in fiber or copper. Cable length is important in HFT, where you are measuring fractions of microseconds.
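Back-of-the-envelope:

    # how far a signal propagates in 0.5 ms, ignoring switching/protocol overhead
    C = 299_792_458  # m/s in vacuum
    for medium, vf in [("fiber (~0.68c)", 0.68), ("copper twisted pair (~0.65c)", 0.65)]:
        print(f"{medium}: ~{C * vf * 0.0005 / 1000:.0f} km in 0.5 ms")
    # datacenter cable runs are tens to hundreds of meters, i.e. a few microseconds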

It's really quite amazing to me that HFT reduces RPC latency by about three orders of magnitude; I feel like there are lessons there that aren't being transferred to the broader tech world.



