feat(app): wire defensive middleware into chain + E2E test#12
Merged
Conversation
Insert the three defensive middleware from the Phase 2 series
into app.New()'s Chain call in the canonical §6 order:
RequestID
Recover
Logger
BodyLimit (new)
RateLimit (new)
ConcurrencyLimit (new)
Timeout
router/handler
Each middleware consumes config values that were already loaded
at startup in Phase 1 (MaxRequestBytes, RateLimitRPM,
RateLimitBurst, MaxConcurrentRequests); no new config knobs
required.
Add internal/app/app_chain_test.go for end-to-end coverage of
the wired chain (Phase 1 QA gap: full middleware chain was never
exercised against the real app.New() output). The tests drive
requests through a.server.Handler so they see exactly what a
production request would see.
Cases:
- Positive path: GET /healthz /readyz /version all return 200
with X-Request-Id header and Content-Type: application/json;
the Logger middleware emits "request completed" log lines
with the request_id field populated, proving RequestID ->
Logger ordering.
- body_limit with a tight MaxRequestBytes rejects an oversized
POST with 413, and the error body carries the same request_id
as the X-Request-Id response header — this proves RequestID
runs upstream of BodyLimit, which is the single most load-
bearing ordering assertion in the chain.
- rate_limit rejects the third request from a single client IP
when burst=2, and the standard 429 body is returned.
- rate_limit keys by X-Forwarded-For, not RemoteAddr: two
clients sharing a Fly edge proxy (same RemoteAddr) but
distinct XFF values are limited independently.
concurrency_limit is not E2E-tested here because exercising it
against the fast-returning health endpoints would require
orchestrating blocking handlers; the middleware is thoroughly
unit-tested in its own file and the wiring correctness is
observable by inspection.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR 6/9 of the Phase 2 series. Wires `body_limit`, `rate_limit`, and `concurrency_limit` (merged individually in PRs 3-5) into the `app.New()` middleware chain in the canonical §6 order, and adds end-to-end integration tests that exercise the wired chain through the real `app.server.Handler`.
This PR is also the first Fly.io verification point after merge — health endpoints must remain operational through the new chain before we move on to PRs 7-9.
Wiring change
Before (Phase 1):
```
RequestID → Recover → Logger → Timeout → mux
```
After (this PR):
```
RequestID → Recover → Logger → BodyLimit → RateLimit → ConcurrencyLimit → Timeout → mux
```
Each new middleware reads config values that were already loaded at startup in Phase 1 (`MaxRequestBytes`, `RateLimitRPM`, `RateLimitBurst`, `MaxConcurrentRequests`). No new config knobs.
Why this needs an E2E test
The Phase 1 QA review flagged "full middleware chain never tested end-to-end" as a real gap. Unit tests for each middleware cover their individual behavior, but they can't catch:
This PR adds `internal/app/app_chain_test.go` to close that gap by driving real requests through `a.server.Handler`.
Test plan
All tests drive `a.server.Handler.ServeHTTP(...)` — the exact same handler chain that production serves.
`TestChain_HealthEndpointsPassThroughAllLayers`
Positive path. GET `/healthz`, `/readyz`, `/version` all return 200 with `X-Request-Id` header and `Content-Type: application/json`. Captures slog output and asserts the Logger middleware emitted "request completed" entries with the `request_id` field populated — proves RequestID → Logger ordering.
`TestChain_BodyLimitRejectsOversizedWithRequestID`
Sets `MaxRequestBytes=64` and POSTs a 1024-byte body. Asserts 413, verifies the error body has `code=payload_too_large`, and — crucially — that `error.request_id` equals the `X-Request-Id` response header. This is the single most load-bearing assertion in the entire chain: if BodyLimit were placed before RequestID, the error body would have empty request_id and this test would catch it.
`TestChain_RateLimitRejectsExcessFromSameIP`
Sets `RateLimitRPM=60`, `RateLimitBurst=2`, same `RemoteAddr` across 3 requests. First 2 return 200, 3rd returns 429.
`TestChain_RateLimitKeysByXForwardedFor`
Sets `RateLimitRPM=60`, `RateLimitBurst=1`, and sends requests with the same `RemoteAddr` but different `X-Forwarded-For` values. Verifies that two logical clients arriving through a shared Fly edge proxy are keyed independently — client A can be rate-limited without affecting client B. This is the scenario that motivated the `clientIP` refactor in PR 4.
Why concurrency_limit isn't E2E-tested here
Exercising `concurrency_limit` against the fast-returning health endpoints would require orchestrating blocking handlers (goroutines + signal channels) from inside an E2E test. That's doable but makes the test much more complex with little added value: the middleware's own unit tests already cover capacity, rejection, recovery, and multiple-slot scenarios with identical orchestration patterns, and its presence in the chain is observable by inspection of `app.go`. This is a conscious coverage tradeoff, called out in the commit message.
If a subtle wiring bug slipped in (e.g. concurrency_limit accidentally removed from the Chain call), the same-file unit tests wouldn't catch it but the code would fail review. Acceptable.
Local checks
Fly.io verification plan (after merge)
Per the Phase 2 plan, this PR is the 1st of 3 Fly verifications. The target is to confirm that health endpoints still work after the new chain is active:
Phase 2 context
Intended to be squash-merged.