feat(middleware): add concurrency_limit#11

Merged: hidetzu merged 1 commit into main from feat/middleware-concurrency-limit on Apr 11, 2026

@hidetzu hidetzu commented Apr 11, 2026

Summary

PR 5/9 of the Phase 2 series. Adds the third and final defensive middleware: a global in-flight request cap using `golang.org/x/sync/semaphore`. Overflow is rejected immediately with `503` + `response.CodeServiceUnavailable`; requests do not queue.

Not wired into `app.New()`. PR 6 will insert `body_limit` + `rate_limit` + `concurrency_limit` into the chain in §6 order and add an end-to-end integration test, completing the defensive stack.

Key design decisions

`TryAcquire` (immediate rejection), not `Acquire` (queue)

`semaphore.Weighted` supports both. This middleware uses `TryAcquire`, so overflowing requests get `503` immediately instead of blocking until a slot frees up. Rationale: under load we want to shed traffic fast. Queueing ties up connections, delays client retry/backoff logic, and in the worst case compounds the overload. Phase 2 instruction §14 matches this choice ("503 on overflow") and explicitly says no `Retry-After` header is needed in Phase 2.

`maxInFlight <= 0` disables

Matches the convention established by `body_limit` and `rate_limit`. Useful for tests and opt-in no-limit local dev.
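The two decisions above (reject instead of queue, and `maxInFlight <= 0` as pass-through) can be sketched roughly as follows. This is not the code from this PR: to stay standard-library only, a buffered channel stands in for `semaphore.Weighted`, with a non-blocking send playing the role of `TryAcquire` and a receive playing the role of `Release`. The names `concurrencyLimit`, `probe`, and `demoStatuses` are hypothetical, and the plain-text `503` body is a placeholder for the project's standard error body.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// concurrencyLimit is a hypothetical stand-in for the PR's ConcurrencyLimit.
// Assumption: a buffered channel replaces semaphore.Weighted so the sketch
// needs no external module.
func concurrencyLimit(maxInFlight int) func(http.Handler) http.Handler {
	if maxInFlight <= 0 {
		// Limiter disabled: hand back the wrapped handler untouched.
		return func(next http.Handler) http.Handler { return next }
	}
	slots := make(chan struct{}, maxInFlight)
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			select {
			case slots <- struct{}{}: // slot acquired without waiting
				defer func() { <-slots }() // give the slot back when the handler returns
				next.ServeHTTP(w, r)
			default: // every slot is busy: reject now instead of queueing
				http.Error(w, "service unavailable", http.StatusServiceUnavailable)
			}
		})
	}
}

// probe sends one GET for path through h and returns the status code.
func probe(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

// demoStatuses exercises the sketch: a request within capacity, a request
// made while the only slot is held, and a request through a disabled limiter.
func demoStatuses() (within, overflow, disabled int) {
	var limited http.Handler
	limited = concurrencyLimit(1)(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/outer" {
			// The outer request still holds the only slot, so this one is shed.
			overflow = probe(limited, "/inner")
		}
	}))
	within = probe(limited, "/outer")
	noop := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {})
	disabled = probe(concurrencyLimit(0)(noop), "/")
	return
}
```

Because the `default` branch never blocks, a rejected request costs only the time needed to write the `503`, which is the load-shedding behaviour the rationale describes.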

Parameter name `maxInFlight` (not `max`)

Go 1.21 introduced `max` as a predeclared built-in function; using `max` as a parameter name shadows it, and `revive` flags that. `maxInFlight` is more descriptive anyway.
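A small illustration of the shadowing problem, assuming a Go 1.21 or newer toolchain. The functions `clampShadowed` and `clamp` are hypothetical and unrelated to the middleware; they only show what a parameter named `max` does to the built-in inside the function body.

```go
package main

// clampShadowed takes a parameter named max. Inside this function the name
// refers to the int parameter, so the built-in is out of reach: a call such
// as max(v, 0) would fail to compile here.
func clampShadowed(v, max int) int {
	if v > max {
		return max
	}
	return v
}

// clamp uses a descriptive parameter name instead, which leaves the
// built-ins min and max usable in the body.
func clamp(v, limit int) int {
	return min(max(v, 0), limit)
}
```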

Test plan

6 tests, all race-clean:

| Test | Verifies |
| --- | --- |
| `AllowsWithinCapacity` | `ConcurrencyLimit(2)` + 3 serial requests → all 200 |
| `RejectsBeyondCapacityReturns503` | 1-slot limit, one request in flight, second request gets 503 with standard error body (`code=service_unavailable`, non-empty message, correct Content-Type) |
| `RecoversAfterRelease` | After an in-flight request completes, the slot is actually returned so a subsequent request can acquire it |
| `MultipleSlots` | 2-slot limit, 2 concurrent requests in flight both succeed, 3rd rejected |
| `DisabledWhenMaxZero` | `ConcurrencyLimit(0)` → 5 concurrent requests all pass (handler runs 5 times, confirmed via `sync.WaitGroup`) |
| `NegativeMaxDisabled` | `ConcurrencyLimit(-1)` → handler runs |

The concurrent tests synchronize via a gate channel (`release`) + signal channel (`inHandler`) rather than `time.Sleep`, so the race detector sees clean ordering and the tests do not rely on timing.
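The gate/signal pattern might look roughly like the sketch below. It is an illustration, not the PR's test code: `oneSlot` is a hypothetical stdlib-only 1-slot limiter that exists only to give the pattern something to observe, and `get` and `gatedStatuses` are likewise made-up names. Only the channel names `inHandler` and `release` come from the description above.

```go
package main

import (
	"net/http"
	"net/http/httptest"
)

// oneSlot is a hypothetical 1-slot limiter (buffered channel instead of
// semaphore.Weighted); it is not the implementation from this PR.
func oneSlot(next http.Handler) http.Handler {
	slot := make(chan struct{}, 1)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case slot <- struct{}{}:
			defer func() { <-slot }()
			next.ServeHTTP(w, r)
		default:
			w.WriteHeader(http.StatusServiceUnavailable)
		}
	})
}

// get sends one GET through h and returns the status code.
func get(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

// gatedStatuses returns the status of a request held in flight, of a second
// request sent while the first is held, and of a third sent after release.
func gatedStatuses() (first, second, third int) {
	// Signal channel: the handler announces that it is running. Buffered so
	// the later third request does not block when nobody is listening.
	inHandler := make(chan struct{}, 2)
	// Gate channel: the handler may not return until this is closed.
	release := make(chan struct{})

	h := oneSlot(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		inHandler <- struct{}{}
		<-release
	}))

	done := make(chan int)
	go func() { done <- get(h) }()

	<-inHandler     // the first request now provably holds the slot
	second = get(h) // shed by the limiter; it never reaches the handler
	close(release)  // open the gate so the first request can finish
	first = <-done  // the slot was released before get returned
	third = get(h)  // acquires the freed slot; the gate is already open
	return
}
```

Every step waits on a channel operation rather than a timer, so the ordering is the same on a slow CI machine as on a fast laptop, and the race detector sees explicit happens-before edges between the goroutines.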

Local checks

  • `go vet ./...` clean
  • `go test ./... -race -count=1` all pass
  • `golangci-lint run` 0 issues (after renaming `max` → `maxInFlight` to avoid shadowing the built-in)
  • `go mod tidy` leaves go.mod / go.sum clean

Dependencies added

```
golang.org/x/sync v0.20.0
```

Per `docs/development_rules.md` §3, an explicitly-allowed dependency for concurrency limiting.

Phase 2 context

| # | PR | Status |
| --- | --- | --- |
| 1 | response codes | merged |
| 2 | validation helpers | merged |
| 3 | body_limit middleware | merged |
| 4 | rate_limit middleware | merged |
| 5 | concurrency_limit middleware (this PR) | open |
| 6 | wire defensive middleware + E2E test | blocked on 3, 4, 5 |
| 7 | analyze usecase adapter for pkg/prism | blocked on 1 |
| 8 | /v1/analyze handler and routing | blocked on 1, 2, 7 |
| 9 | /v1/prompt endpoint | blocked on 1, 2, 7 |

With this PR, all three defensive middleware will be complete and ready for chain wiring in PR 6. Intended to be squash-merged.

Third and last of the defensive middleware. Caps the number of
in-flight requests using golang.org/x/sync/semaphore and rejects
overflow immediately with 503 response.CodeServiceUnavailable.

Immediate rejection via semaphore.TryAcquire is intentional:
under load we would rather shed traffic fast than let clients
hog connections waiting for a slot. Per Phase 2 instruction §14
no Retry-After header is emitted.

maxInFlight <= 0 disables the limiter entirely (pass-through),
matching the convention used by body_limit and rate_limit. The
parameter name maxInFlight (not max) avoids shadowing the Go
1.21+ built-in max function.

Not yet wired into app.New(); all three defensive middleware
plus an end-to-end chain test will land together in the next PR.

Tests cover capacity passthrough, beyond-capacity rejection
with the standard error body, recovery after slot release,
multiple-slot independence (capacity 2, third rejected while
two in flight), and the disabled paths for zero and negative
maxInFlight. The concurrent tests synchronize via a gate
channel + signal channel so the race detector sees clean
ordering.
@hidetzu hidetzu merged commit 1342392 into main Apr 11, 2026
2 checks passed