Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider "burst" support for rate limits #148

Open
Mr0grog opened this issue Dec 13, 2023 · 0 comments
Open

Consider "burst" support for rate limits #148

Mr0grog opened this issue Dec 13, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@Mr0grog
Copy link
Member

Mr0grog commented Dec 13, 2023

This is low priority! It’s a bit complex and the use cases where it’s valuable are few. I’ve mainly filed this issue just to get the idea written down.

We currently implement rate limits by evenly spacing out every HTTP request. However, it would be extremely useful to make paginated CDX index requests or redirect requests for Mementos more quickly when possible.

To address this, we could add some sort of “burst” capability to RateLimit objects that bypasses the rate limit as long as doing so still fits within the overall limit for some larger timeframe and adds additional delay to the next normal request to make up for it. For example:

# Make a rate limit of 1 request/s, but burstable over a 60-second window:
limit = RateLimit(per_second=1, burst_window=60)

# No wait the first time, same as today:
limit.wait()

# Sleeps for 1 second, same as today:
limit.wait()

# Does not sleep since we haven’t exceeded 1 r/s over the last 60 seconds:
limit.burst()

# Same as above, no wait:
limit.burst()

# Sleeps for 3 seconds to even out from the two bursts:
limit.wait()

# Does not sleep since we haven't exceeded 1 r/s over the last 60 seconds:
for i in range(55):
    limit.burst()

# Sleeps for 57 seconds to get back under 1 r/s average over 60 seconds:
limit.burst()  # OR `limit.wait()`; the results would be the same here

# Sleeps for 1 second to stay under 1 r/s average over 60 seconds:
limit.burst()  # OR `limit.wait()`; the results would be the same here

IA folks said “currently we are looking at requests/minute over a 5 minute period,” so the correct default window here is probably 300 seconds.

I don’t think this is super important to do right now. It could be pretty complex to implement correctly (need to consider things like some threads calling burst() while others call wait() at nearly the same time), but the value we get out of it is probably low (nicer performance if you are just making a handful of requests and not likely to hit the limits, but no added value if you are making a lot of requests or using several threads — you’re going to hit the limits and take the same time overall regardless). But I did want to get the concept written down so it doesn’t get lost.

@Mr0grog Mr0grog added the enhancement New feature or request label Dec 13, 2023
@Mr0grog Mr0grog moved this to Unreleased in Wayback Roadmap Dec 13, 2023
@Mr0grog Mr0grog moved this from Unreleased to Backlog in Wayback Roadmap Dec 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: Backlog
Development

No branches or pull requests

1 participant