TCP-brutal: Congestion control algorithm that increase speed on packet lost

bdd8f1df777b · on Nov 7, 2023

I'm surprised that this got posted on Hacker News and got enough attention. It was designed mostly for China's special situation.

China only has a handful nodes for connecting to the rest of the world. Each of the node is conceptually a single router handling billions of connections. The conventional model of congestion control algorithm breaks down:

1. The exit nodes are always under congestion. By conventional wisdom of congestion control, we should reduce our speed to wait until the congestion is resolved. That is, to reduce speed to 0 and wait until infinity.

2. The exit nodes cannot do real time traffic shaping of individual connections due to the sheer amount of them. As a result, the packet loss and RTT changes are mostly random noise. Basing the send rate on random noise does not make sense.

3. Even if one connection halves its rate, it only improves the situation by less than 0.000000001%.

4. If I sustain a high speed (300 Mbps) for several hours, my connection to the same server will be throttled heavily for at least a week. So the exit nodes probably have an offline batch job picking out outliers and put them in a naughty list.

So the only thing that matters to an end user like me is to stay below the naughty threshold. Beyond that, dynamically adjusting the send rate on packet loss or RTT is mostly behavioral art.

Per my own experiment, keeping around 90 Mbps for long will not trigger the punishment of the exit node (or my ISP). Meanwhile, BBR gives only about 45 Mbps, and Cubic 15 Mbps (barely usable).

This is all assuming that the exit nodes are the bottleneck. When other parts of the network can be the bottleneck too, such as a crowded restaurant WiFi, I switch to BBR for the congestion control.

In addition, rest assured that Brutal will not see widespread adoption. In most cases, the sever does not know the bandwidth of the client, and cannot or will not trust the bandwidth data sent from the client. It's only in the special case where both the server and client are controlled by the same person where Brutal can be applied.

codetrotter · on Nov 7, 2023

> I'm surprised that this got posted on Hacker News and got enough attention.

> Proceeds to post a lot of interesting commentary.

Well, there you have it :P that’s why it gets posted and upvoted and commented on, because it is interesting to HN readers :D

pests · on Nov 7, 2023

Yeah, similar thoughts.

dkbrk · on Nov 7, 2023

It's curious that you say RTT is mostly random noise yet BBR gives you only 45 Mbps. The whole idea of BBR is that it probes bandwidth vs RTT and finds the point at which RTT starts to increase.

Maybe BBR is just misbehaving due to too much noise? Wouldn't a better approach be to tweak BBR to spend more time probing or change the model parameters that it uses to estimate link bandwidth?

rfoo · on Nov 8, 2023

> It's curious that you say RTT is mostly random noise yet BBR gives you only 45 Mbps.

I believe OP meant RTT fluctuate somewhere between 150ms to 250ms completely randomly even with no load at all, hence (the delta is) mostly random noice. Not that the RTT is below 0.1ms and barely measurable.

bdd8f1df777b · on Nov 7, 2023

Your idea does make sense. I'll try to see if I can tweak the BBR parameters on my server and achieve better performance.

mhandley · on Nov 6, 2023

The Internet only works because TCP (and QUIC) congestion control does a reasonable job of matching offered load to available capacity. Without congestion control, the network is apt to get into congestion collapse, where the network is increasingly busy but getting no useful work done. We saw such congestion collapses in the 1980s. Van Jacobson's TCP congestion control algorithm was the response and its descendents have kept the Internet working ever since.

Now, in certain limited domains something like Brutal can work, but you really wouldn't want everyone to do it.

lq0000 · on Nov 6, 2023

> Like Hysteria, Brutal is designed for environments where the user knows the bandwidth of their connection, as this information is essential for Brutal to work.

They don't quite say that this is a bad idea for use over WAN. If they intentionally avoided ruling out such usage in this qualification, they're making an implicit assumption here that either the last-mile connection or the endpoints themselves are going to be the bottleneck. If some router in between is having a bad day, it would definitely make its day worse.

edit: I wasn't familiar with Hysteria but now that I'm reading those docs, I guess the intent is for this to be used on the internet. In that case, it does seem pretty like it'd be pretty adversarial to run this. I bet if it saw widespread adoption it'd make ISPs pretty upset.

edit 2: Going slightly off-topic now, but I wonder if the bandwidth profile of Hysteria compromises its HTTP/3 masquerade?

NeverBehave · on Nov 6, 2023

It is intentionally used on WAN. Brutal part of Hysteria(https://news.ycombinator.com/item?id=38026756) internal components, and Hysteria is a proxy made for people in China under censorship, where outbound Internet access is heavily degraded.

rfoo · on Nov 7, 2023

> but I wonder if the bandwidth profile of Hysteria compromises its HTTP/3 masquerade?

Most likely so. GFW is not able to reassemble and analyze QUIC (and AFAIK, any UDP-based multiplexed protocol) traffic, yet. If Hysteria takes off, GFW will try to kill it and so far it's likely to be degraded severely just as Shadowsocks, V2Ray or (ironically) Trojan.

Very few "censorship-resistance" proxy implementations out of China were designed to systematically evade traffic analysis, they usually just avoid general techniques and rely on being niche enough to fly under radar. Which is not wrong: being diverse is also a good strategy.

_flux · on Nov 6, 2023

A long time ago I had a bad cable internet connection (high packet loss), but I also had good shell access to Uni's computers, so what I did was that I downloaded large files there and then I had encountered a tool that would be given three parameters:

- Destination IP and port

- Bytes per second

- Files to transfer

On the receiving end there was the counterpart.

The tool would send the data from the beginning to the end with given BPS and then start again from the beginning, skipping frames that it had received acknowledgements for, until all frames were acked.

Worked great! I was able to just select a bitrate that worked well and let it churn.

I don't remember the name of the tool, but I doubt there would be many use cases for it today—nor would it be very difficult to reimplement, given its brutal nature.

rft · on Nov 6, 2023

About 2 years ago I had a similarly bad cable connection. I could push nearly 500mbit/s UDP through it, as advertised, but HTTPS downloads and TCP streams would only reach about 80mbit/s each. I could run multiple of them in parallel to max out the downstream on TCP alone. I also tracked it down to a low packet loss on the connection that caused TCP to reduce its rate.

I wondered whether you could add a VPN-like layer that retransmits and potentially reorders TCP packets itself without letting the actual endpoint TCP stack handle it. That way it should transparently remove the packet loss at the cost of additional complexity and higher latency.

I wish I knew about Hysteria, but I never made the connection to censored networks. It seems like it could be (ab)used for this use case.

don-code · on Nov 6, 2023

Norton Ghost could do something similar, but with UDP multicast. You'd boot a room full of machines to Ghost, have them join a multicast group, multicast an OS image to them, and then have the host machine just send the image out once (from its perspective), saving a ton of bandwidth in the process. The individual machines would then rerequest any chunks that they missed.

ale42 · on Nov 6, 2023

Except that Ghost was using (or still uses? I switched to udpcast since ages) ACKs from the destination hosts to adapt the bandwidth dynamically. So you didn't have to give/impose the bandwidth. The funny part was when a machine in a room with 45 PCs was misbehaving (read: failing HD), basically bringing the transfer rate close to zero for the whole deployment... and good luck finding out which machine it was.

EvanAnderson · on Nov 6, 2023

Exactly my experience w/ Ghost, too. I moved over to udpcast as well. (The last time I did this kind of work I simply ended up blasting the output of "ntfsclone" via udpcast. I was very happy w/ how it worked.)

ale42 · on Nov 7, 2023

Am almost doing the same, basically udpcasting a file that was created with partclone.ntfs -c, and piping that to partclone.ntfs -r :)

phyzome · on Nov 6, 2023

Sounds very torrent-y.

wbl · on Nov 6, 2023

Fountain codes would be another choice for that application

neilalexander · on Nov 6, 2023

I would think that more intelligent packet scheduling (like fair queuing) is also playing a significant role these days, with routers on the path being empowered to drop packets from one flow in order to service another. That helps to bully most TCP congestion control algorithms into throttling back to a fairer pace, as they'll just end up with lots of retries otherwise.

mhandley · on Nov 6, 2023

At the edge, if Brutal comes up against FQ-Codel or Cake in your (recent) Linux router, then if it causes congestion it should shoot itself in the foot, and other traffic shouldn't see too much adverse impact. When Brutal is limited to its fair share and sees loss, it will likely increase, get more loss, and only cause itself pain. Unfortunately Cake isn't all that widespread yet in home routers.

In large routers at ISPs, the best tool that is likely to be available is WFQ, with a limited number of hash buckets available. Likely WFQ will bundle many other flows together in the same bucket with a Brutal flow, so Brutal will still cause collareral damage if it encounters congestion.

dtaht · on Nov 8, 2023

We are making significant progress with CAKE & fq_codel in the WISP & and fiber markets as a middlebox. See Preseem, Bequant, and LibreQos.io. Still best to get these native on the CPE as Mikrotik and many many others have one.

I wonder how well these would work on the GFW?

BillFranklin · on Nov 6, 2023

If you're interested in recent state-of-art improvements to TCP congestion control, check out the Remy project, it's pretty amazing https://web.mit.edu/remy/.

twic · on Nov 6, 2023

> But there is a closely-related question, which is to ask: how much does Remy depend on the way the simulator works? Will it work on real networks? We have some confidence that the results don't depend on fine details of the simulator, because RemyCCs are optimized within Remy's own simulator but evaluated inside ns-2 and its TCP implementations, which were developed independently and long predate Remy. But the only way to know for sure is to try it on a real network, which we haven't done yet.

Did they try it on a real network?

Also, it seems unclear that using a Remy congestion control algorithm while all the machines around you are using Cubic will help very much.

BillFranklin · on Nov 7, 2023

I’m not sure if much more research has been done here from this team. I was able to produce similar results in my dissertation, also not in a production setting.

Regarding your second point, the way congestion avoidance algorithms typically work is to punish the local client for the state of the global congestion. If Remy can do that in a more efficient way (ie greater throughput or fairness for the local client) then it would be beneficial to use the algorithm before others switch.

champtar · on Nov 7, 2023

The page mentions Ubuntu 13.04 and doesn't mention BBR, is the project still active ?

joelthelion · on Nov 6, 2023

Out of curiosity, what do modern OSes use?

Agingcoder · on Nov 6, 2023

Cubic, or bbr I’ve seen in practice. Bbr is extremely efficient in particular ( and I’d like to know how Rémy compares to it )

Edit: bbr essentially builds and fits a model - in some ways it’s learning ( it’s stats essentially).

hnav · on Nov 6, 2023

cubic and bbr

kazinator · on Nov 6, 2023

Car-brutal: new congestion control algorithm!

Pass everyone fast in the left lane, get to within 50 yards of the target exit (where they are are all also going) and then hit your brakes to cut back in, sending a peristaltic wave of stoppage backward in the fast lane.

In networking, packets can disappear due to hardware (mainly on wireless) or due to being deliberately dropped due to congestion. You can't tell these two scenarios apart. It makes sense to try harder in the former scenario, but trying harder in the latter scenario makes the congestion worse for the entire network, while eking out a minor improvement for that connection. If every node does it, the entire network will be far worse off, so it is counterproductive.

TCP congestion control depends on cooperation: that every node complies with the RFC requirements to implement all the necessary algorithms. That unfortunately leaves room for idiots and dickheads, hence from time to time we may see work like this. "Hey, if I blatantly break the RFC, I get faster transfers. Holy shit, how come nobody knows about this?"

twic · on Nov 6, 2023

> Pass everyone fast in the left lane, get to within 50 yards of the target exit (where they are are all also going) and then hit your brakes to cut back in, sending a peristaltic wave of stoppage backward in the fast lane.

Interesting. Looking into it.

JohnBooty · on Nov 6, 2023

Reminds me of those "alternative" bittorrent clients that did neat things by, basically, behaving in selfish ways.

The main example behavior I remember was that these clients would sort of cheat by downloading the file chunks in sequential order (so they could start playing the media sooner) rather than in random order (which is a big part of what makes bittorrent a pretty effective protocol/algorithm)

FeepingCreature · on Nov 6, 2023

If lots of people are watching the file sequentially, then sequential order is close to optimal. Consider a Bittorrent-based lifestream. Clients should focus on getting and distributing the chunks that everyone wants.

JohnBooty · on Nov 6, 2023

Bittorrent does livestreams? Did not know that.

But for "normal" Bittorrent usage, oh hell naaaaah.

Where you are trying to actually obtain a complete file, that can be a disaster. What happens is the chunks at the end of the file will be much rarer than the chunks at the beginning. Getting the blocks sequentially offers a small convenience benefit (you can start viewing a video while the torrent is still downloading) but if significant numbers of people did that it would harm every other use case including your own ability to actually download the entire file.

If you have a surplus of seeders this isn't a problem, but for sparsely-seeded torrents that is not good.

zamadatix · on Nov 6, 2023

This implies the reason most sparsely seeded torrents have chunk availability is because the transient leechers just happen to arrive consistently in time to hand off to one another. I think reality is sparse torrents tend to be kept alive by a few seeders slowly kicking out data (most likely because they are seeding a ton of other stuff too). Sequential or not, if you and another guy are planning on dumping a few seconds after finishing the torrent then it doesn't really matter if that is a chunk near the end or a random chunk - either way you're both trying to squeeze the last chunk out of the seeder.

Densely seeded torrents may be slightly different and play out more like you describe, but it doesn't really matter as much there as the first parts can be sourced from those watching sequentially and the end parts left to those that actually seed after getting a full file.

JohnBooty · on Nov 6, 2023

    This implies the reason most sparsely seeded torrents 
    have chunk availability is because the transient leechers 
    just happen to arrive consistently in time to hand off to 
    one another.

No. I'm not sure where you got that idea.

This is a pathological and unrealistic example but please try to extrapolate to more realistic scenarios. Imagine a file with three pieces. 200 torrent users have zero pieces, 100 torrent users have the first piece, 50 have the first and second piece, and "Joe" is the sole seeder that has all three.

Obviously, assuming Joe doesn't have godlike bandwidth, there are 350 users who want that third piece and only a single Joe and there is going to be contention. In this pathological situation things will work itself out as soon as supply of the third piece increases, assuming people don't kill their clients as soon as they hit 100%, but things will be slower than they need to be at first.

What I think you're trying to point out is that there are situations where sequential torrent downloads aren't harmful to download speed or overall availability. Yes, that is true. If seeder bandwidth for a given piece is more than sufficient for leecher demand for that piece then... yes, no harm done.

As another user posted, http://bittorrent.org/bittorrentecon.pdf section 2.4.2 "Rarest first"

You're free to disagree with the design of bittorrent, obviously.

FeepingCreature · on Nov 7, 2023

If people download the complete file eventually, it should average out. And if people view half the file and then drop out, then the earlier chunks being more available is optimal.

JohnBooty · on Nov 7, 2023

    If people download the complete file eventually, it should average out

Yes. As supply of the rarer chunks increases, the problems go away.

Notice the implicit fact up there. The uneven distribution of chunks causes problems.

    then the earlier chunks being more available is optimal.

For most use cases, no, the first chunks of the file are not more valuable. Having the first 99 chunks of a Linux .ISO is not valuable at all if you don't have the 100th chunk. The chunks are equally valuable and the chunks are useless unless you have them all.

There are some (or were?) torrent clients that let you start watching incomplete downloads while they're still downloading. This is a dubious use case. I do not notice many people using these nonstandard clients.

Also even for some kind of livestreaming scenario, what are even the "first" chunks? The start of the stream? The stream at the current time? It's probably a moot point anyway because it seems like Bittorrent livestreaming hasn't exactly taken the world by storm but maybe there are uses of it with which I'm not familiar: https://www.google.com/search?client=firefox-b-1-d&q=bittorr...

FeepingCreature · on Nov 8, 2023

> For most use cases, no, the first chunks of the file are not more valuable. Having the first 99 chunks of a Linux .ISO is not valuable at all if you don't have the 100th chunk.

Sure but we just established that people are dropping out in mid-watch. For most use cases, people aren't interested in downloading the first chunks first; those cases where they are, and where it doesn't eventually average out, are exactly the cases where the earlier chunks are more important.

nix0n · on Nov 6, 2023

If multiple users download in sequential order, wouldn't that actually make it pretty likely that the previous downloader would have the chunk that the next downloader is looking for?

Or is there a specific advantage to random order?

TonyTrapp · on Nov 6, 2023

With clients constantly appearing and disappearing, the probability that you will be able to finish the download will be much higher if everyone has a random collection of file parts. Imagine the original seeder goes away after one day. Everyone is stuck at 99% because they downloaded in sequential order. Nobody has the last missing part. This would be much less likely if the clients had downloaded random parts instead, because every part would be roughly equally likely to be available in the end.

the8472 · on Nov 6, 2023

see http://bittorrent.org/bittorrentecon.pdf section 2.4.2

nix0n · on Nov 6, 2023

Thanks! That answers my question.

For anyone who doesn't feel like clicking on a PDF, the section title is "Rarest First".

fmajid · on Nov 6, 2023

That's pretty much what Google is doing with QUIC and BBR, aka HTTP/3. Tragedy of the Commons in the making.

progbits · on Nov 6, 2023

I don't think you even get faster transfers, at least beyond some short term gain. You will overload some router on the path and just increase packetloss for everyone without improving your goodput.

Maybe my intuition is wrong but there are no graphs and real world measurements.

ta1243 · on Nov 6, 2023

In some cases you could. I had an issue last week with my 1G leased circuit with a 160ms rtt

That circuit had a 1% packet loss on it due to a dodgy SFP. This devastated cubic, with peak TCP transfer dropping from 150mbit (we police it to about that) to less than 1mbit. BBR was better but still down a fair bit.

Using this algorithm I presume it would have continued at about 140-150mbit.

I don't really do TCP so haven't looked too closely at different algorithms in different latency/loss (burst or constant) conditions, but I can see where this type of algorithm could be useful.

xiconfjs · on Nov 7, 2023

You could check if you have SACKs [1] enabled - these can help by a lot on lossy uplinks using tcp.

[1] https://packetlife.net/blog/2010/jun/17/tcp-selective-acknow...

progbits · on Nov 6, 2023

OK I suppose that is what GP meant with the first category - if there is no congestion but your physical layer is dodgy this can indeed help.

But if many people start using it this will fail on congestion (and even if there is little congestion to begin with, this will amplify it so there is sure to be some).

kazinator · on Nov 6, 2023

If many people are driving aggressively and cutting in, everyone is made late; but those who engage in the behavior arrive earlier than those who refrain.

A router that is dropping packets due to being congested is still forwarding some packets. If you wallop that router with your own packets, that's your best chance of getting more of your packets to be forwarded, at the expense of someone else's being dropped. (Assuming the router has no countermeasure against that behavior.)

If we take a simple leaky bucket model: the router drops the packet because it has no resources in that moment. A moment later, space opens up because of transmitted/cleared data, so when a packet is received in that later moment, is not dropped. There is likely a chain of leaky buckets: multiple points in the router traversal where packets can be dropped due to resource issues: dropping could be on the receive side or transmit. Wherever there is a queue whose length is capped or a memory allocation operation.

If you wallop that machine with your packets, you increase the chances that when those moments come when resources are available that allow a packet to be retained rather than leaked onto the floor, that opportunity goes to one of your packets rather than someone else's.

the8472 · on Nov 6, 2023

That's only true if the router drops randomly rather than hashing into buckets by host

zamadatix · on Nov 6, 2023

Yeah, you'd probably get a better rate in this case. In this kind of non-congestion related path loss you'd be even better off just doing something like FEC instead though.

mort96 · on Nov 6, 2023

> It's particularly effective at seizing bandwidth in congested, best-effort delivery networks, hence its name.

So basically it's an algorithm designed to push out people who play nice?

This seems like an absolutely terrible idea. It seems like in terms of congestion control algorithms, the Internet has been balancing in the good quadrant of the Prisoner's Dilemma, probably mostly because the people who work on that low of a level are nerds with a functioning moral compass. Is that era coming to an end?

Sesse__ · on Nov 6, 2023

There's already a variant of this from the receiver side, the so-called “download accelerators”. I've seen some of those literally open hundreds of connections for the same file until the server was full (hit its MaxServers limit), and when the server started sending 503 Too Busy, they would retry every 10 milliseconds. Several of them would mask their user agent. When I asked a user (unrelated to that specific program) about why they'd use something like this, their answer was “I've paid for 50 Mbit/sec, I will have 50 Mbit/sec”.

I can tell you, hosting files in such an environment and trying to work from the same machine is not a pleasant experience.

phh · on Nov 6, 2023

Funny thing is, I used to (and still sometime do) host on a webserver with HDD storage. The users using such downloaders were DoSing themselves and got less than a tenth of a sequential download.

Sesse__ · on Nov 6, 2023

The dataset here is 17 TB+, so HDD storage is the only thing that's economically viable, especially for a service hosted as a hobby. :-) It doesn't really help that they DoS themselves, they still do it.

(After I limited connections to one per IP address, 90% of the problems went away. After I started blackholing users ignoring 503 responses, the remaining 10% were solved. I get the occasional “I didn't knooow, please give me access again” email now and then.)

elitepleb · on Nov 6, 2023

it's main use is in censorship resistant proxies, where the network at large does not have a good quadrant.

https://hysteria.network/

mort96 · on Nov 6, 2023

I don't understand the connection between censorship and congestion control. Surely there's a good quadrant where everyone plays nice in terms of congestion control, even though there are bad actors doing censorship in other ways? Surely censorship isn't generally performed by making the network artificially congested so that you may access the censored material but a bit more slowly?

NeverBehave · on Nov 6, 2023

That's actually part of the situation in China --- degraded network situation to certain part of the Internet. You may be able to establish connections e.g. Github, and even okay to download release from S3, but usually speed is stable around 2-3kb/s, which is effectively useless.

I am not certain this is due to the censorship, but this issue is sitting there for at least a decade.

Big Corps in China usually setup their own VPN/Private Line to workaround this situation.

mort96 · on Nov 6, 2023

Okay, say it's part of the censorship, and parts of the Internet are intentionally "soft-blocked" in the way you describe.

If the degraded performance some kind of artificial throttling, then I have a hard time understanding how an antisocial congestion control algorithm would help. If there's some middle box tasked with providing every IP address no more than 2kbit/s, then it should be able to do that job just fine even if you keep throwing lots of packets at it, right?

If the degraded performance is simply due to intentionally terrible infrastructure and there's real congestion going on due to many Chinese people trying to access the Internet at the same time, then using an antisocial congestion control algorithm might give you faster transfers, at the cost of everyone else. If everyone started using these antisocial congestion control algorithms, the end result would simply be that nobody would get to communicate with those "soft-blocked" parts of the Internet, not even at those 2 kbit/s.

In short, I don't understand how this could even in principle be an effective tool for fighting censorship. I'm happy to reconsider if anyone describes such a use case in technical detail though.

NeverBehave · on Nov 6, 2023

Here is the opinion from the author, unfortunately it only has Chinese version but here is the relevant part translated using deepl:

> And if you insist on sending packets, even though the other traffic does not give way, since your packets are taken more proportion of the traffic, they would more likely to be selected. Whether it's "ethical" to "grab" bandwidth in this way is a subjective question, but the objective root cause is the urgent need to expand the operator's equipment with insufficient bandwidth. Operators should not expect users to be "sympathetic" to the lack of backbone capacity - the operator has contracted a rate with the user and the user is not cracking that limit, just using the bandwidth that the operator has committed to them, which is reasonable behavior.

Personally I don't think it is a good idea, but I was in that situation during my high school, and it was terrible. I get the idea why this project would exist sooner or later and just trying to present some context for discussion.

https://v2.hysteria.network/zh/docs/misc/Hysteria-Brutal/

elitepleb · on Nov 6, 2023

>it should be able to do that job just fine even if you keep throwing lots of packets at it, right?

there's circumvention techniques that abuse the fact that they can't: https://upb-syssec.github.io/blog/2023/record-fragmentation

the intentionally terrible speeds are decided by software heuristics, so antisocial limits are fought with antisocial techniques, with continued brutality being enough to skip the slow deep packet inspection path via force

mort96 · on Nov 6, 2023

If that's actually the case, then using this congestion control algorithm for that purpose is okay I suppose.

fragmede · on Nov 6, 2023

it's your first scenario. the infrastructure is fine, but there's a middlebox in the way, doing whatever it can to fuck with your connection. the middlebox will drop packets. this middlebox will invent RST packets and spam them at you to make you think the connections been dropped. using the antisocial congestion control manages to get you more than 2Kbps with the middlebox still doing its thing.

ajb · on Nov 6, 2023

True, it's a pretty dumb idea

On consumer networks at least, there is normally a queue scheduling stage where customers are weighted equally. So this would just sieze bandwidth from other connections from the same household. I think a similar thing usually happens on hosting platforms etc

giardini · on Nov 6, 2023

>"So basically it's an algorithm designed to push out people who play nice?"<

Perhaps should be renamed TCP-Peking in contrast to TCP-HongKong where queuing is accepted behavior.

ta1243 · on Nov 6, 2023

You should see what products like Signiant do

MadnessASAP · on Nov 6, 2023

What does Signiant do? And what does Signiant DO?

ta1243 · on Nov 6, 2023

Uses UDP to aggressively use every drop of bandwidth to spray files in a way which makes brutal look like cubic

muxamilian · on Nov 6, 2023

> Unlike BBR, Brutal operates on a fixed rate model and does not reduce its speed in response to packet loss or RTT changes. [...] It's particularly effective at seizing bandwidth in congested, best-effort delivery networks, hence its name.

So if there's one TCP flow using Brutal, all other traffic gets pushed out. Fair queuing can prevent this. If one can be sure that there's fair queuing, one can do much smoother congestion control: https://github.com/muxamilian/fair-queuing-aware-congestion-...

mhandley · on Nov 6, 2023

It's interesting how long we've been waiting for fair queuing to fix congestion problems in the Internet. The first proper research on this was published in 1989: https://dl.acm.org/doi/10.1145/75246.75248

I'm a big fan of fair queuing, and have it enabled for my home network. But in core routers, the best approximation is likely to be WFQ, where you're likely to have each flow hashed to one of something like 256 queues. This means one badly behaved flow can't force well-behaved traffic out of the way and take over the whole link, but it can take over its WFQ queue, starving well behaved flows that hash to the same queue.

I'm not aware of any backbone router that implements true fair queuing. But even if all routers did, it's not a complete solution. Typically flows are mapped to queues based on the 5-tuple (src IP, dst IP, src port, dst port, proto). If you do this, then all Brutal-NG needs to do is use many source ports so it gets many queues, thus many times its fair share, and take over the link again. In fact, this would enable DoS attacks on router state, so no-one is going to do this.

An alternative would be to map to queues using just the source and destination IP addresses. But this has problems too. Brutal-NG could spoof the source address of most of the packets (but send ACKs back to the one unspoofed address), again taking over the link. And it could still cause DoS issues on router state.

The only thing you can't spoof if you want to actually exchange data (as opposed to DoSing the network) is the destination IP address. But now one Brutal flow can achieve the same fair share as all the traffic headed for a busy Google server or an entire ISP's CGNAT. Equally, one flow Brutal flow sending to a host behind the CGNAT can deny service to everyone else sending to the same CGNAT IP address.

So in the end, while I really like what fair queuing does for my VoIP latency on my home network, it is unlikely to ever be a complete solution for constraining misbehaving flows.

dtaht · on Nov 8, 2023

We designed cake to be implementable in hardware if anyone could be convinced it was useful: https://blog.cerowrt.org/post/juniper/ - I thought this test series rather compelling.

1) The data paths MUST prepend a timestamp to every packet 2) HW invsqrt for codel or cobalt AQM is 3k gates, checked on output 3) 32k 4 way set associative cache is a std IP block 4) HW Packet hasher - three different outputs

There are a few states in cake that are optional, but I remain boggled that step 1!! did not happen within 2 hardware releases of switch chips after van Jacobson and kathie Nichols published codel as a replacement for RED in 2012.

https://queue.acm.org/detail.cfm?id=2209336

dtaht · on Nov 8, 2023

Nice to see you mark! and that you are using CAKE. Cake has a feature that helps on regulating multiple flows to a single destination: per host/per flow FQ. In the middle of a network this only works well with IPv6, along the edge it has a nat mode that helps there. We put it in to manage things like bittorrent really well.

Example:

https://www.reddit.com/r/eero/comments/u7xm83/gen_2_sqm_vs_g...

The 8 way set associative hash is also much better than the direct map hash other things use, and I agree 256 flows is too small, 1024 set associative is great, 16k direct map comparable.

xxpor · on Nov 7, 2023

Fair queuing would be extremely difficult to support while maintaining overall peak performance. Way too much shared state.

mhandley · on Nov 7, 2023

I can think of ways to approximate FQ on an input-queued switch while maintaining performance, but it doesn't really help because the total number of queues being a function of the traffic still opens you up to resource exhaustion attacks.

muxamilian · on Nov 7, 2023

Fair queuing shouldn't be necessary in the core. If every base station/WiFi router/client device support fair queuing that should be enough.

dtaht · on Nov 8, 2023

I think FQ helps everywhere, and FQ + aqm is needed at every fast->slow transition on the network. Ideally the core is over provisioned... except where it is not.

dtaht · on Nov 8, 2023

Pushing CAKE to 10Gbit/xeon core in libreqos currently (the bottleneck is on the read path actually). Or, 10k ISP subscribers per $2k xeon box.

xxpor · on Nov 9, 2023

10 gbit per xeon core is bad though

dtaht · on Nov 9, 2023

The bottleneck is mostly on the read path,even leveraging xdp heavily. dpdk/vpp would be faster. In the case however of an ISP 10k subscribers/network segment at 25Gbit is more than enough and busies 40% of 20 cores, on libreqos. Others are making boxes capable of more using smarter ethernet cards. If only more ISPs cared about QoE, we could cause a run on ebay on slightly obsolete Xeons and finish up fixing the bufferbloat problem right quick!

sam0x17 · on Nov 6, 2023

I remember as an undergrad thinking it was crazy that a lot of congestion control is essentially left up to the client to play nice.

AnthonyMouse · on Nov 6, 2023

It's because everyone knows what would happen if they didn't. If you deploy this kind of algorithm at scale, multiple clients fail to back off which only causes the router to drop both of their packets. Then they have to retransmit them instead of transmitting them to begin with at a rate that causes them to not be dropped, which is actually slower.

muxamilian · on Nov 6, 2023

It sounds crazy but fortunately fair queuing is becoming more and more prevalent, making sure everyone plays nice. Every iPhone has it: https://blog.cerowrt.org/post/state_of_fq_codel/

keuin · on Nov 7, 2023

I live in China. And I have to admit the global internet connection is very busy and limited here. But I don't think you are right. Filling the wire forcefully with retransmitted packets is a very bad behavior. It's not surprising to get blocked temporarily by your ISP.

dtaht · on Nov 8, 2023

Wish more would look harder there at the actual delays being observed and how bad congestion collapse gets at +250ms of it.

https://blog.cerowrt.org/post/juniper/

flemhans · on Nov 7, 2023

I remember the best internet connection was my own cellphone using roaming (Three), which seemed to bypass all the China bullshit.

chrisweekly · on Nov 6, 2023

mods: title grammar:

increase -> increases

packet lost -> packet loss