feat(middleware): add rate_limit#10

Merged
hidetzu merged 1 commit into main from feat/middleware-rate-limit on Apr 11, 2026

@hidetzu hidetzu commented Apr 11, 2026

Summary

PR 4/9 of the Phase 2 series. Adds the second defensive middleware: per-IP token-bucket rate limiting backed by `golang.org/x/time/rate`. Each unique client IP gets its own `*rate.Limiter`, and requests that exceed the limit get `429` + `response.CodeRateLimited` before reaching the handler.

Also includes a small internal refactor: the `clientIP` helper moves from `logging.go` into a dedicated `clientip.go` file so both `logging.go` and `ratelimit.go` can share it without drift. Behavior is unchanged.

Not wired into `app.New()`. Wiring happens in PR 6 along with `body_limit` and `concurrency_limit`.

Key design decisions

Keying by clientIP, not RemoteAddr

Fly.io terminates TLS at its edge, so every request arriving at the machine shares the same RemoteAddr (the edge proxy's internal IP). Without X-Forwarded-For keying, every client would share a single limiter, and one busy client could exhaust the budget for everyone else.

clientIP prefers the first entry of X-Forwarded-For and falls back to RemoteAddr only when no XFF header is present. The rate limit tests include a KeysByXForwardedForNotRemoteAddr case that explicitly simulates the Fly edge scenario.
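The keying rule above can be sketched roughly as follows. This is a minimal illustration and not the helper from `clientip.go`: stripping the port from RemoteAddr via `net.SplitHostPort` is an assumption, and `keyFor` is a hypothetical demo wrapper that does not exist in the PR.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// clientIP returns the first X-Forwarded-For entry when the header is present,
// otherwise the host part of RemoteAddr. Sketch only; the real helper may
// differ, for example in how it treats the port.
func clientIP(r *http.Request) string {
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		first, _, _ := strings.Cut(xff, ",")
		return strings.TrimSpace(first)
	}
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr // no port present; use the value as-is
	}
	return host
}

// keyFor (hypothetical) builds a bare request and returns the key clientIP picks.
func keyFor(xff, remoteAddr string) string {
	r := &http.Request{Header: http.Header{}, RemoteAddr: remoteAddr}
	if xff != "" {
		r.Header.Set("X-Forwarded-For", xff)
	}
	return clientIP(r)
}

func main() {
	// Two clients behind the same edge proxy address get different keys.
	fmt.Println(keyFor("203.0.113.7, 10.0.0.1", "172.16.0.2:4000"))
	fmt.Println(keyFor("198.51.100.9", "172.16.0.2:4000"))
	// With no XFF header the key falls back to RemoteAddr.
	fmt.Println(keyFor("", "172.16.0.2:4000"))
}
```

Taking the first entry trusts whatever the outermost hop wrote into the header, which is reasonable only when an edge proxy such as Fly.io's sits in front of the machine.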

rpm → tokens per second conversion

rate.Limit is expressed in tokens per second. The middleware converts from rpm via rate.Limit(float64(rpm) / 60.0). For the Phase 1 config defaults (RATE_LIMIT_RPM=10, RATE_LIMIT_BURST=20), that's 0.167 tokens/sec (one token every 6 seconds) with a 20-token burst.
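As a quick check of the arithmetic, the conversion can be reproduced with plain `float64` values (`rate.Limit` is itself a `float64` type). The function names `tokensPerSecond` and `refillInterval` are illustrative only and are not part of the PR.

```go
package main

import (
	"fmt"
	"time"
)

// tokensPerSecond mirrors the float64(rpm) / 60.0 conversion described above.
func tokensPerSecond(rpm int) float64 {
	return float64(rpm) / 60.0
}

// refillInterval is the time needed to earn one token at the given rpm.
// The caller must pass rpm > 0.
func refillInterval(rpm int) time.Duration {
	return time.Duration(float64(time.Minute) / float64(rpm))
}

func main() {
	// Default config: 10 rpm.
	fmt.Printf("%.3f tokens/sec, one token every %v\n", tokensPerSecond(10), refillInterval(10))
	// The RecoversOverTime test setting: 600 rpm.
	fmt.Printf("%.3f tokens/sec, one token every %v\n", tokensPerSecond(600), refillInterval(600))
}
```

At 600 rpm a token is earned every 100 ms, which is why a 150 ms sleep is enough for the recovery test to see the limiter admit a request again.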

rpm <= 0 disables the limiter

Matches the convention established by body_limit. Useful for tests and opt-in no-limit local dev.

Unbounded map growth is accepted for Phase 2

Per the Phase 2 instruction §14, eviction of stale per-IP entries is not required in this phase. The map will grow over process lifetime. For short-lived Fly.io machines (auto-stop after idle) this is a non-issue in practice; a background janitor can be added in a later phase if long-lived deployments see problematic growth. The design comment in ratelimit.go flags this explicitly.
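The decisions above (per-key limiters, `rpm <= 0` pass-through, a map that is never evicted) can be sketched together as below. To stay standard-library only, a small hand-rolled token bucket stands in for `*rate.Limiter`; the PR itself uses `golang.org/x/time/rate`. The names `rateLimitSketch`, `bucket`, `xffKey`, `okHandler`, and `statuses` are hypothetical, and the plain-text 429 body is a placeholder for the PR's standard error response.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// bucket is a minimal token bucket standing in for *rate.Limiter.
type bucket struct {
	mu            sync.Mutex
	tokens, burst float64
	perSec        float64
	last          time.Time
}

// allow refills by elapsed time (capped at burst), then spends one token if it can.
func (b *bucket) allow(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.tokens += now.Sub(b.last).Seconds() * b.perSec
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// rateLimitSketch is a hypothetical constructor, not the PR's RateLimit.
// key extracts the per-client key (the PR uses its clientIP helper).
func rateLimitSketch(rpm, burst int, key func(*http.Request) string) func(http.Handler) http.Handler {
	if rpm <= 0 {
		// Disabled: return the next handler untouched.
		return func(next http.Handler) http.Handler { return next }
	}
	var mu sync.Mutex
	buckets := map[string]*bucket{} // never evicted, so it grows with distinct keys
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			k := key(r)
			mu.Lock()
			b, ok := buckets[k]
			if !ok {
				b = &bucket{tokens: float64(burst), burst: float64(burst),
					perSec: float64(rpm) / 60.0, last: time.Now()}
				buckets[k] = b
			}
			mu.Unlock()
			if !b.allow(time.Now()) {
				// Placeholder body; the PR writes its standard error response here.
				http.Error(w, "rate limited", http.StatusTooManyRequests)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

// xffKey is a simplified key function for the demo only.
func xffKey(r *http.Request) string { return r.Header.Get("X-Forwarded-For") }

// okHandler always answers 200.
func okHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
}

// statuses sends n requests with the given X-Forwarded-For value and collects the status codes.
func statuses(h http.Handler, xff string, n int) []int {
	codes := make([]int, 0, n)
	for i := 0; i < n; i++ {
		req := httptest.NewRequest(http.MethodGet, "/", nil)
		req.Header.Set("X-Forwarded-For", xff)
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
		codes = append(codes, rec.Code)
	}
	return codes
}

func main() {
	limited := rateLimitSketch(10, 2, xffKey)(okHandler())
	fmt.Println(statuses(limited, "203.0.113.7", 3))  // third request exceeds burst=2
	fmt.Println(statuses(limited, "198.51.100.9", 1)) // a different key has its own bucket
	disabled := rateLimitSketch(0, 2, xffKey)(okHandler())
	fmt.Println(statuses(disabled, "203.0.113.7", 5)) // rpm <= 0: nothing is rejected
}
```

A later janitor would only need to take the same map mutex and delete entries whose bucket has been idle long enough to be full again, so the accepted growth does not constrain the design.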

clientIP refactor

Moved from logging.go into a dedicated clientip.go file to avoid duplication. Both callers live in the same middleware package so the helper stays package-private (not exported). The logging tests still pass against the moved definition — the refactor is mechanical.

Test plan

7 test cases in `ratelimit_test.go`:

| Test | Verifies |
| --- | --- |
| `AllowsWithinBurst` | 3 requests with `burst=3` all pass |
| `RejectsBeyondBurstReturns429` | 3rd request with `burst=2` returns 429 with standard error body (`code=rate_limited`, non-empty message, correct Content-Type) |
| `PerIPIsIndependent` | IP A exhausts its burst; IP B still has full burst |
| `KeysByXForwardedForNotRemoteAddr` | Two clients sharing one `RemoteAddr` but with different XFF headers are keyed independently (Fly edge scenario) |
| `RecoversOverTime` | 600 rpm + burst 1: exhaust, reject, sleep 150ms, recover. Verifies the limiter is time-based |
| `DisabledWhenRPMZero` | `RateLimit(0, ...)` → handler runs 5/5 times |
| `NegativeRPMDisabled` | `RateLimit(-1, ...)` → handler runs |

Local checks

  • go vet ./... clean
  • go test ./... -race -count=1 all pass (existing + 7 new + 2 body_limit + 2 logging)
  • golangci-lint run 0 issues
  • go mod tidy leaves go.mod / go.sum clean (CI tidy check will pass)

Dependencies added

golang.org/x/time v0.15.0

Per `docs/development_rules.md` §3 this is an explicitly-allowed dependency for rate limiting.

Phase 2 context

| # | PR | Status |
| --- | --- | --- |
| 1 | response codes | merged |
| 2 | validation helpers | merged |
| 3 | body_limit middleware | merged |
| 4 | rate_limit middleware (this PR) | open |
| 5 | concurrency_limit middleware | blocked on 1 |
| 6 | wire defensive middleware + E2E test | blocked on 3, 4, 5 |
| 7 | analyze usecase adapter for pkg/prism | blocked on 1 |
| 8 | /v1/analyze handler and routing | blocked on 1, 2, 7 |
| 9 | /v1/prompt endpoint | blocked on 1, 2, 7 |

Intended to be squash-merged.

Second of the three defensive middleware. Enforces a per-IP
token-bucket rate limit using golang.org/x/time/rate. Each unique
client IP gets its own *rate.Limiter configured with rpm requests
per minute and a burst capacity of burst tokens; exceeded
requests are rejected with 429 and response.CodeRateLimited
before the handler runs.

Keying uses the same clientIP helper that logging already uses,
so rate limiting honors X-Forwarded-For (Fly.io edge proxy) and
falls back to RemoteAddr. To keep the two consumers in sync, the
helper is moved out of logging.go into a dedicated clientip.go
file; both logging.go and ratelimit.go now reference the same
package-internal clientIP function. Behavior is unchanged; the
logging tests still pass against the new location.

rpm <= 0 disables the limiter entirely (returns a pass-through
middleware), matching the convention established by body_limit.

Memory note: the per-IP limiter map grows unbounded over process
lifetime. Phase 2 explicitly accepts this (Phase 2 instruction
§14 says eviction is not required); a background janitor will
be added in a later phase if long-lived deployments see
meaningful growth.

Not yet wired into app.New(); the full defensive stack
(body_limit + rate_limit + concurrency_limit) will land together
in a later PR along with an end-to-end test.

Tests cover within-burst passthrough, beyond-burst rejection
with the standard error body, per-IP independence,
X-Forwarded-For keying (important for the Fly.io edge
shared-proxy case), time-based recovery with a 150 ms sleep,
and the disabled paths for zero and negative rpm.

@hidetzu hidetzu merged commit 0529de1 into main Apr 11, 2026
2 checks passed