
Conversation


@alessandrobologna alessandrobologna commented Dec 10, 2025

📬 Issue #: #1065

✍️ Description of changes:

Support Lambda Managed Instances / concurrent runtime (refs #1065)

Note

I originally built this to experiment with Lambda Managed Instances in a test workload and thought it might be a useful starting point for upstream support. Parts of the implementation were drafted with the help of an AI assistant and have only been exercised in my own workloads and the existing test suite so far, so please treat this as a starting point for discussion rather than a final design. A super simple demo/benchmark of this implementation is at https://github.com/alessandrobologna/rust-lmi-demo

This draft PR implements a concurrent runtime mode for Lambda Managed Instances, plus the API changes required in lambda-runtime and lambda-http. It preserves existing single-invocation behavior for classic Lambda. The implementation is based on the AWS guidance for building runtimes that support Lambda Managed Instances, as described in Building custom runtimes for Lambda Managed Instances.

Architecture (high level)

The runtime chooses between a classic "one invocation at a time" loop and a concurrent mode controlled by AWS_LAMBDA_MAX_CONCURRENCY:

  • Sequential mode (default): a single /next long-poll loop fetches one event at a time and invokes the handler synchronously, preserving existing behavior.
  • Concurrent mode (AWS_LAMBDA_MAX_CONCURRENCY >= 2): a window of /next requests feeds a queue of invocations; a semaphore bounds the number of in-flight handler tasks, and a shared shutdown signal lets the runtime stop opening new polls and drain in-flight work before exit.
```mermaid
sequenceDiagram
    participant Lambda as Lambda orchestrator
    participant Runtime as lambda-runtime
    participant API as Runtime API
    participant Handler as User handler

    Lambda->>Runtime: Start process (env AWS_LAMBDA_MAX_CONCURRENCY = N)
    loop up to N concurrent polls
        Runtime->>API: GET /runtime/invocation/next
        API-->>Runtime: Invocation event + context headers
        Runtime->>Handler: Spawn handler task(event, context)
        Handler-->>Runtime: Result or error
        Runtime->>API: POST /response or /error
    end
    Lambda-->>Runtime: Shutdown (SIGINT/SIGTERM via extension)
    Runtime->>Runtime: Stop new polls, wait for in-flight handlers to finish
```
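
To make the mode selection above concrete, here is an illustrative-only sketch of the decision that `Config::is_concurrent()` is described as making (not the actual implementation):

```rust
// Illustrative sketch only: concurrent mode is enabled when
// AWS_LAMBDA_MAX_CONCURRENCY is present and parses to 2 or more; anything
// else (unset, 0, 1, or unparsable) keeps the classic sequential loop.
fn is_concurrent(max_concurrency: Option<u32>) -> bool {
    matches!(max_concurrency, Some(n) if n >= 2)
}

fn main() {
    let max_concurrency = std::env::var("AWS_LAMBDA_MAX_CONCURRENCY")
        .ok()
        .and_then(|v| v.parse::<u32>().ok());

    if is_concurrent(max_concurrency) {
        println!("concurrent mode, up to {} in-flight invocations", max_concurrency.unwrap());
    } else {
        println!("sequential mode");
    }
}
```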

Breaking changes & compatibility

  • Existing entrypoints (lambda_runtime::run, lambda_http::run, and lambda_http::run_with_streaming_response) keep their original signatures and sequential behavior. Handlers that compiled against the current release should continue to compile unchanged.
  • New concurrent entrypoints are added:
    • lambda_runtime::run_concurrent
    • lambda_http::run_concurrent
    • lambda_http::run_with_streaming_response_concurrent
      These require handler services to implement Clone + Send + 'static (with responses/stream bodies Send/Sync + 'static) so they can be safely cloned and driven by the concurrent runtime.
  • Config gains a new public field max_concurrency: Option<u32>, populated from AWS_LAMBDA_MAX_CONCURRENCY. This is a semver-visible change because struct literals in downstream code may need updating; I'm happy to follow whatever versioning/visibility strategy maintainers prefer (e.g., major bump vs. constructor helpers).
  • In concurrent mode (when the new entrypoints are used and AWS_LAMBDA_MAX_CONCURRENCY > 1), the runtime no longer sets _X_AMZN_TRACE_ID in the process environment. The per-invocation X-Ray trace ID is available via Context::xray_trace_id and tracing spans instead. Sequential mode behavior is unchanged.
  • For the concurrent entrypoints, the behavior when AWS_LAMBDA_MAX_CONCURRENCY is set changes from "always sequential per environment" to "per-environment concurrency up to that value". Code that continues to call the existing run functions will remain strictly sequential even if the env var is set.

In other words: the earlier versions of this branch tightened the bounds on the existing run functions, but after maintainer feedback those entrypoints are left as-is and concurrency is opt-in via the new *_concurrent APIs.
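
For reference, here is a minimal usage sketch of the opt-in entrypoint, assuming `lambda_runtime::run_concurrent` accepts the same `service_fn`-based handlers as `lambda_runtime::run` (with the additional `Clone + Send + 'static` bound described above):

```rust
// Hypothetical usage sketch: opting into concurrent mode via the new
// entrypoint. Assumes run_concurrent mirrors run's signature; an async fn
// wrapped in service_fn is Clone, so it satisfies the new bounds.
use lambda_runtime::{service_fn, Error, LambdaEvent};
use serde_json::{json, Value};

async fn handler(event: LambdaEvent<Value>) -> Result<Value, Error> {
    Ok(json!({ "req_id": event.context.request_id, "msg": "ok" }))
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    // Sequential behavior is unchanged for lambda_runtime::run; concurrency is
    // opt-in and only kicks in when AWS_LAMBDA_MAX_CONCURRENCY >= 2.
    lambda_runtime::run_concurrent(service_fn(handler)).await
}
```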

Below is a concise summary of the changes (unfortunately many) by area.

Runtime & Config (lambda-runtime)

  • Add Config.max_concurrency: Option<u32> populated from AWS_LAMBDA_MAX_CONCURRENCY, plus Config::is_concurrent() to decide whether concurrent mode should be enabled.
  • Runtime::new now sizes the lambda_runtime_api_client HTTP pool from max_concurrency so the number of idle connections matches expected concurrency.
  • Runtime::run remains the original sequential /next loop via run_with_incoming, preserving existing behavior.
  • New Runtime::run_concurrent implements a windowed /next loop (FuturesUnordered) plus a Semaphore to enforce at most max_concurrency active handler tasks, with graceful shutdown coordinated via SHUTDOWN_NOTIFY. When Config::is_concurrent() is false, it falls back to the same sequential loop as Runtime::run (see the sketch after this list).
  • Internal layers (api_client, api_response, trace) now implement/derive Clone so they can be composed into a cloneable service stack for the concurrent entrypoints.
  • Context::new is more robust when lambda-runtime-client-context / lambda-runtime-cognito-identity headers are present but empty (treated as None instead of failing JSON parse).
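
To make the polling pattern concrete, here is a simplified, self-contained sketch of the windowed /next loop with a semaphore bound. It uses tokio and futures directly; the real Runtime::run_concurrent also wires in the Runtime API client, /error reporting, and the SHUTDOWN_NOTIFY-driven drain, all of which are omitted here:

```rust
// Simplified sketch of the concurrency pattern only, not the actual
// lambda-runtime code: a Semaphore caps in-flight handler tasks and a
// FuturesUnordered drives them to completion. Requires the `tokio` (with
// "full" features) and `futures` crates. A fixed event budget stands in for
// the shutdown signal; /next polling and /response posting are stubbed.
use std::sync::Arc;
use std::time::Duration;

use futures::stream::{FuturesUnordered, StreamExt};
use tokio::sync::Semaphore;

async fn poll_next() -> String {
    // Stand-in for `GET /runtime/invocation/next`.
    tokio::time::sleep(Duration::from_millis(10)).await;
    "event".to_string()
}

async fn handle(event: String) {
    // Stand-in for the user handler plus the `POST /response` call.
    println!("handled {event}");
}

async fn run_concurrent_sketch(max_concurrency: usize, total_events: usize) {
    let semaphore = Arc::new(Semaphore::new(max_concurrency));
    let mut in_flight = FuturesUnordered::new();
    let mut remaining = total_events;

    while remaining > 0 || !in_flight.is_empty() {
        tokio::select! {
            // Open a new poll only while events remain and a permit is free.
            permit = semaphore.clone().acquire_owned(), if remaining > 0 => {
                let permit = permit.expect("semaphore closed");
                remaining -= 1;
                in_flight.push(tokio::spawn(async move {
                    let event = poll_next().await;
                    handle(event).await;
                    drop(permit); // free the concurrency slot once the handler is done
                }));
            }
            // Drain handler tasks as they complete.
            Some(joined) = in_flight.next() => {
                joined.expect("handler task panicked");
            }
        }
    }
}

#[tokio::main]
async fn main() {
    run_concurrent_sketch(4, 16).await;
}
```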

HTTP & streaming (lambda-http, lambda-runtime-api-client)

  • lambda_runtime_api_client::Client:
    • Gains with_pool_size(usize) on the builder and threads a pool_size: Option<usize> into the Hyper client to set pool_max_idle_per_host.
    • Still works as before when pool_size is not provided.
  • lambda_http::run and run_with_streaming_response keep their existing signatures and sequential behavior, delegating to lambda_runtime::run.
  • New lambda_http::run_concurrent and lambda_http::run_with_streaming_response_concurrent wrap the same handler types but require them to be Clone + Send + 'static (with response/stream bounds aligned to lambda_runtime::run_concurrent) so they can be driven by the concurrent runtime (see the usage sketch after this list).
  • HTTP adapters (Adapter, StreamAdapter) are now Clone when the inner service is Clone, and the streaming path uses BoxCloneService internally for the concurrent entrypoint so the composed service stack can be cloned.
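
Similarly, a hedged sketch of the lambda-http side, assuming `lambda_http::run_concurrent` takes the same handlers as `lambda_http::run` with the extra `Clone + Send + 'static` bound:

```rust
// Hypothetical usage sketch of lambda_http::run_concurrent. An async fn
// handler wrapped in service_fn is Clone, so it satisfies the new bounds.
use lambda_http::{service_fn, Body, Error, Request, Response};

async fn handler(event: Request) -> Result<Response<Body>, Error> {
    let name = event.uri().query().unwrap_or("world");
    Ok(Response::builder()
        .status(200)
        .body(Body::from(format!("hello {name}")))?)
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    lambda_http::run_concurrent(service_fn(handler)).await
}
```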

Tooling & examples

  • Makefile and scripts/test-rie.sh:
    • Add RIE_MAX_CONCURRENCY and a test-rie-lmi target that runs RIE with AWS_LAMBDA_MAX_CONCURRENCY set, making it easy to exercise managed-instances behavior locally.
  • examples/basic-lambda/src/main.rs:
    • Wraps lambda_runtime::run(func).await in an if let Err(err) block to log and propagate runtime errors when testing under RIE.
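
For illustration, a trimmed-down sketch of that entrypoint shape (the real example uses its own request/response types; eprintln! here is just a stand-in for whatever logging the example actually uses):

```rust
// Sketch only: log the runtime error before propagating it so failures are
// visible when invoking the function locally through RIE.
use lambda_runtime::{service_fn, Error, LambdaEvent};
use serde_json::{json, Value};

async fn func(event: LambdaEvent<Value>) -> Result<Value, Error> {
    Ok(json!({ "msg": format!("received: {}", event.payload) }))
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    if let Err(err) = lambda_runtime::run(service_fn(func)).await {
        eprintln!("lambda runtime error: {err}");
        return Err(err);
    }
    Ok(())
}
```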

Validation

On feat/concurrent-lambda-runtime, I ran:

  • cargo +stable fmt --all
  • cargo +stable test --all
    • All unit tests pass; the integration test requiring TEST_ENDPOINT still fails as expected without a deployed stack.
  • cargo +stable clippy --all-targets --all-features
    • All Clippy lints clean.
  • ./scripts/test-rie.sh basic-lambda:
    • Built the Docker image with RIE.
    • Started the container.
    • Verified a request:
      • curl -XPOST http://localhost:9000/2015-03-31/functions/function/invocations -d '{"command":"test from RIE"}'
      • Response: {"req_id": "...", "msg": "Command test from RIE executed."}

If maintainers prefer, this could be split into smaller PRs (e.g., builder/Config prep, handler Clone changes, and finally the concurrent runtime), but this branch shows the full "end-to-end" implementation so that it can be tested with Lambda Managed Instances.


🔏 By submitting this pull request

  • I confirm that I've run cargo +nightly fmt.
  • I confirm that I've run cargo clippy --fix.
  • I confirm that I've made a best effort attempt to update all relevant documentation.
  • I confirm that my contribution is made under the terms of the Apache 2.0 license.

- Add Config.max_concurrency and size runtime HTTP client pool from AWS_LAMBDA_MAX_CONCURRENCY.

- Introduce windowed concurrent /next polling with semaphore-limited handler tasks and shutdown coordination.

- Require Clone + Send + 'static handlers in lambda-runtime and lambda-http, and make internal layers/HTTP adapters cloneable.

- Adjust streaming HTTP to use BoxCloneService and align bounds for concurrent execution.

- Add RIE LMI helper (Makefile + test-rie.sh) and minor robustness improvements (Context parsing, basic example error logging).

Tests: cargo +stable fmt --all; cargo +stable clippy --all-targets --all-features; cargo +stable test --all (integration test requiring TEST_ENDPOINT not configured); ./scripts/test-rie.sh basic-lambda
@alessandrobologna alessandrobologna marked this pull request as ready for review December 12, 2025 06:18