
Elaborate?



The solution described in the article is likely to be extremely wasteful in both time and memory, by allocating a queue entry for each call, and then O(n) scanning and dropping stale entries on each successive call.

Tabulating call count by division(s) of time would be less obviously problematic.


> The solution described in the article is likely to be extremely wasteful in both time and memory, by allocating a queue entry for each call, and then O(n) scanning and dropping stale entries on each successive call.

Even if n elements are scanned in the worst case, the expected time it takes to perform such a scan is O(1). This is because we perform a scan on each insertion, and a scan of length n deletes O(n) elements. That means the number of elements scanned is proportional to the number of elements inserted.
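
For concreteness, here's a minimal sketch of that queue-based limiter (the class name and the 10-calls-per-60-seconds numbers are my own assumptions, not the article's). Each timestamp is appended exactly once and popped at most once, which is where the expected O(1) per call comes from:

    from collections import deque
    import time

    class SlidingWindowLimiter:
        def __init__(self, max_calls=10, window=60.0):
            self.max_calls = max_calls
            self.window = window
            self.calls = deque()   # timestamps of accepted calls, oldest first

        def allow(self, now=None):
            now = time.monotonic() if now is None else now
            # Drop entries that fell out of the window. Each timestamp is
            # pushed once and popped at most once over the limiter's lifetime,
            # so the total scanning work is proportional to the number of calls.
            while self.calls and now - self.calls[0] >= self.window:
                self.calls.popleft()
            if len(self.calls) >= self.max_calls:
                return False
            self.calls.append(now)
            return True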

> Tabulating call count by division(s) of time would be less obviously problematic.

Tabulating call count by division(s) of time doesn't quite work. If 9 calls are made in the last second of one division of time and 9 more calls are made in the first second of the next division, you hit 18 calls within a two-second interval, which is over the limit of 10 calls for any one-minute interval, even though neither division's count exceeds the limit.
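
A small illustration of that boundary problem, with a hypothetical fixed-window counter that buckets by whole minutes (all names and numbers here are assumptions for the example):

    class FixedWindowLimiter:
        """Counts calls per whole-minute bucket rather than a sliding window."""
        def __init__(self, max_calls=10, window=60):
            self.max_calls = max_calls
            self.window = window
            self.bucket = None      # index of the current minute
            self.count = 0

        def allow(self, now):
            bucket = int(now // self.window)
            if bucket != self.bucket:   # a new minute started: reset the count
                self.bucket, self.count = bucket, 0
            if self.count >= self.max_calls:
                return False
            self.count += 1
            return True

    limiter = FixedWindowLimiter()
    burst = [limiter.allow(59.9) for _ in range(9)]    # end of minute 0
    burst += [limiter.allow(60.1) for _ in range(9)]   # start of minute 1
    print(all(burst))   # True: 18 calls accepted within 0.2 seconds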


> That means the number of elements scanned is proportional to the number of elements inserted.

And the number of elements inserted is in the worst case N, hence the proposed solution has a worst case complexity of O(N)?


> hence the proposed solution has a worst case complexity of O(N)?

Worst case yes, but the question is concerned with the average case.


It's not specified in the article?

Do that for rate limiting and it'll be super easy to DoS you - which just shows how fundamentally you can't brute-force your way out of algorithmic questions, despite the article's suggestion (or how interviews are fundamentally flawed, but that's another debate ;)


> It's not specified in the article?

From the article:

> The function should have expected O(1) performance.

> Do that for rate limiting and it'll be super easy to DoS you

How so? The rate limiter has the same performance as, for example, a hash table. Operations are usually O(1), but are periodically O(n). It's not like every service that uses a hash-table is DoS-able.


Genuinely curious: does "expected" in English necessarily mean average? My understanding was that "expected" depends on the application ___domain, so it can mean best, worst, or average.

If you have an implementation that is O(N) in the worst case, it's (theoretically) DoSable, since an attacker would always hit that case - so the expected complexity in case of an attack is O(N).

A trivial solution that is O(60) = O(1) in the worst case is to store the number of calls made in each second in a fixed-size array and loop over it. With some arithmetic you can even avoid looping over it.
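
A sketch of that fixed-size-array idea, assuming the same 10-calls-per-60-seconds limit (the lazy clearing of stale buckets is my own choice of bookkeeping, not something from the article):

    class BucketedLimiter:
        """60 per-second counters; worst case O(60) = O(1) work per call."""
        def __init__(self, max_calls=10, window=60):
            self.max_calls = max_calls
            self.window = window
            self.counts = [0] * window      # calls counted in each second
            self.seconds = [None] * window  # which absolute second each bucket holds

        def allow(self, now):
            sec = int(now)
            total = 0
            for i in range(self.window):    # fixed-length loop: O(60) = O(1)
                if self.seconds[i] is not None and self.seconds[i] <= sec - self.window:
                    self.seconds[i] = None  # lazily clear stale buckets
                    self.counts[i] = 0
                if self.seconds[i] is not None:
                    total += self.counts[i]
            if total >= self.max_calls:
                return False
            idx = sec % self.window
            if self.seconds[idx] != sec:    # reuse this slot for the current second
                self.seconds[idx] = sec
                self.counts[idx] = 0
            self.counts[idx] += 1
            return True

Counting whole seconds quantizes the window a bit at the boundaries, which is the trade-off for the fixed cost.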


"Expected value" or "expectation" is a precise mathematical term in probability theory that corresponds to "average-case", always.


How bad is the code I want to write, even though I know it is wrong:

initialize:

    double credit = N

At every request:

    penalty = (elapsedTime / window_size) - (1/N)
    gain = N * (elapsedTime / window_size)
    credit = min(credit + gain - (penalty < 0), N)
    if (credit < 0) throw exception


I wouldn't call it penalty, but cost, and if credit = N, then I assume the cost of one call would just be 1. So:

    gain = N * (elapsedTime / window_size);
    credit = min(credit + gain, N);
    if (credit >= 1) {
        credit--;
        log(...);
    } else {
        return; /* not enough credits */
    }
Your approach has the nice property that after the initial burst, you get regularly spread out log messages, whereas the linked list approach will stay bursty.
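
A runnable version of that sketch, assuming a monotonic clock and my own choice of names (RateLimitedLogger, log); it's the same credit/refill idea, not the article's code:

    import time

    class RateLimitedLogger:
        """Credit-based limiter: credits refill continuously, capped at N."""
        def __init__(self, n=10, window_size=60.0):
            self.n = n
            self.window_size = window_size
            self.credit = float(n)        # start with a full budget
            self.last = time.monotonic()

        def log(self, message):
            now = time.monotonic()
            elapsed = now - self.last
            self.last = now
            gain = self.n * (elapsed / self.window_size)
            self.credit = min(self.credit + gain, self.n)
            if self.credit >= 1:
                self.credit -= 1          # each accepted message costs one credit
                print(message)
                return True
            return False                  # not enough credit: drop the message

After an initial burst of N messages, accepted messages end up roughly window_size/N apart, which is the "regularly spread out" behaviour mentioned above.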


I was trying to express the sliding-window constraint as a violation of the pigeonhole principle: to violate the constraint you have to have at least N+1 intervals shorter than window_size/N within the window_size.

But I'm not really sure of the value it adds over simply taking a cost of 1 as you suggest. When the times between requests are spread out by more than window_size/N, they don't cost credit. This means you have a smaller number of "hot" clients (clients not at full credit) to monitor.

The second idea that often occurs in sliding-window problems is the telescoping sum hiding implicitly in the sum of elapsed times: (t_1 - t_0) + (t_2 - t_1) + ... + (t_N - t_{N-1}) = t_N - t_0.


Probably safe to say you aren't having to rate limit over a thousand endpoints from a single spot, so the linear scan sounds like a non-issue. That said, I'd hesitate to get too fancy in client rate limiting. Sounds like an area that will hide bugs that are stupid hard to deal with.


I don't quite see it that way, although maybe we read the problem differently. Suppose the rate limit is 500 per second; a client makes a burst of 500 requests in one second, sleeps two seconds, then another burst of 500. Isn't the author's planned approach going to end up doing a bunch of bogus work?

  - putting 500 timestamps into a queue
  - inspecting 500 queue entries and removing each
  - putting 500 new timestamps into a queue


Depends on the requirements. If it's enough that the client doesn't make >N requests within the same "calendar second", then counting in bins is enough; no need to store timestamps. But then you could have a client send N requests in the second half of second k and N requests in the first half of second k+1, so you'd have a 1-second period (spanning two "calendar seconds") with 2N requests allowed. Since such rate limits are often not extremely exact, just an order of magnitude, this quantization/rounding effect may be negligible and the speed may be worth it. But maybe not always. The original problem statement did not say that it's fine. It's probably a good sign if an interviewee asks whether this would be okay too.


I'd assume a linked list was used to be able to drop a lot in a single shot. In particular, it seems safe to assume the queue is sorted by time, such that as soon as you see a time that is outside the window, you can drop all of the rest in one shot.

The page wasn't loading for me, so I can't see the proposal. Just don't let a linear scan scare you, for small n.
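
A minimal sketch of that "drop the stale tail in one shot" idea, assuming a singly linked, mutable list of timestamps kept newest-first (names and limits are my own assumptions):

    class Node:
        def __init__(self, t, nxt=None):
            self.t = t
            self.next = nxt

    class LinkedListLimiter:
        """Newest-first list of timestamps; the stale tail is cut off in one step."""
        def __init__(self, max_calls=10, window=60.0):
            self.max_calls = max_calls
            self.window = window
            self.head = None

        def allow(self, now):
            count = 0
            node, prev = self.head, None
            while node is not None and now - node.t < self.window:
                count += 1
                prev, node = node, node.next
            if prev is None:
                self.head = None          # even the newest stored call is stale
            else:
                prev.next = None          # everything past here is stale: drop it all
            if count >= self.max_calls:
                return False
            self.head = Node(now, self.head)   # newest call goes at the front
            return True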


There are various approaches that might or might not improve the runtime outlook (maybe a priority queue, maybe a list that supports fast random access for binary or exponential search, etc.) but wouldn't the only "fast" path to drop a bunch of entries hinge on moving the head reference and pushing the work onto the GC? One might start to think about something like a ring at the point where this would start to matter.

Asserting small n seems like a way to wash out of coding interviews that are focused on identifying and avoiding costly naive solutions. Interviewers are like as not to say "okay, now the rate limit is 5,000/sec/customer/operation, and there are 100,000 customers."
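
One way to make the ring idea concrete, under the assumption that the per-window limit N is fixed: keep only the last N accepted timestamps in a circular buffer, so each call is O(1) in the worst case at the cost of O(N) memory per limiter:

    class RingLimiter:
        """Circular buffer of the last N accepted call times: O(1) worst case."""
        def __init__(self, max_calls=10, window=60.0):
            self.window = window
            self.times = [None] * max_calls   # one slot per accepted timestamp
            self.idx = 0                      # points at the oldest stored timestamp

        def allow(self, now):
            oldest = self.times[self.idx]
            # If the oldest of the last N accepted calls is still inside the
            # window, accepting this one would mean N+1 calls within the window.
            if oldest is not None and now - oldest < self.window:
                return False
            self.times[self.idx] = now        # overwrite the oldest slot
            self.idx = (self.idx + 1) % len(self.times)
            return True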


I'm assuming a mutable linked list where you can just set the rest from any point to null.

Though, I confess I don't see benefits of linked list for this, all told.

Sounds like the intent is to scan forward until you hit the limit of calls, or see you have hit the boundary, in which case you set the current next pointer to the current call? Is that what you are describing?



