-
Notifications
You must be signed in to change notification settings - Fork 359
feat(tracing, lambda-runtime): add support for custom writer with default tracing subscriber, add turnkey graceful shutdown helper behind 'graceful-shutdown' feature flag #982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
87277be
to
763ad8b
Compare
A couple updates: I do have a PR out to tokio for If we merge this with the TODO still in place, I can circle back and cut it over once it lands. I also have I'll push a tweak to the README tomorrow giving local testing instructions once that lands. |
763ad8b
to
56922f3
Compare
I might scope creep this PR to also add in a turnkey 'graceful shutdown' hook - #983 But I'd also be happy to merge it as is if a maintainer reviews it before then. In that case I'd backport whatever new API for graceful shut down to the example. |
56922f3
to
5c60ffc
Compare
db25a8a
to
ff784ee
Compare
…graceful-shutdown` feature flag
…tracing module requires tracing feature
82ed6ba
to
eacb593
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
📬 Issue #, if available:
Closes #983
✍️ Description of changes:
Custom writer support for default tracing subscriber
The current tracing subscriber uses the default stdout writer. This mostly is the right choice, but it does run the risk of occasionally blocking the tokio runtime - ref: tokio-rs/tracing#2653
For high throughput, performance sensitive lambdas that have async concurrency in their handlers, they might instead prefer to use a non-blocking writer on a dedicated logging thread.
You can do that today by initializing a totally custom tracing subscriber, but the default one has convenient integrations with some native Lambda ENVs, that I'd like to keep using.
Ideally we would have an API to return the configured
fmt::SubscriberBuilder
rather than implicitly initializing it in our helper. But, the tracing types are very unpleasant to work with due to the JSON vs full formatting choice. We would need to introduce a layer of dynamic dispatch, custom traits, or something else like that, if we wanted to bake in the handling for theAWS_LAMBDA_LOG_FORMAT
ENV. We could also have a default subscriber builder that adds the JSON format at the end, but that felt very heavyweight for a simple 'default subscriber' helper API.I felt that adding a custom writer is an important enough knob that a new
init_default_subscriber_with_writer()
API was warranted. It is much simpler code.I don't think the semantic version of risk of surfacing the fmt::MakeWriter trait bound is significant - first of all, we already re-export all the tracing types, so presumably we would treat a major version tracing-subscriber bump as our own major version bump. Second, the
MakeWriter
trait is very simple and seems unlikely to change; it just expects a function that returns something that implementsstd::io::Writer
.Note that we also now explicitly initialize
init_default_subscriber()
using STDOUT writer. This is the current default behavior of tracing subscriber, but it could presumably change across a major version bump. Probably better to pin it and make it a conscious choice on our own end to change it, if we later desire.Documentation decisions
I didn't explicitly document this in the README, since I think it would add confusion as it is not the right choice for most users. I do reference the API in the new
Graceful shutdown
README section, which I think is a useful breadcrumb for the sort of user that might need it.The function docs do give a pretty good overview of what scenarios it might be appropriate for.
Turnkey 'graceful shutdown hook'
Previously I had a more extended graceful shutdown example that added manual non-op extension registry, signal handling, etc. @bnusunny pointed out that this could be very cheaply abstracted away by the library, and he was right!
I went with a route that was pretty opinionated and on-rails, to reduce cognitive load of the API. Specifically:
JoinHandle
or since the caller generally will be thinking of this as a background taskNote that the dummy extension is very low cost, since the extension.nextEvent API will keep a connection but then hang indefinitely if no events come. So it won't frequently wake the task that contains it and the signal handling future. Generally waking will be from the signal handling. The cost is essentially the memory to keep the connection open (and also in my testing, the extension future was exiting on run() when no events were registered, though I wouldn't count on that behavior).
Nothing is stopping the caller from manually constructing a more sophisticated set of signal handling, avoiding registering a dummy extension if they already have another one, etc.
I did reference these choices in the function docs. I also shared the panic conditions as well as a realistic usage example. I also added a small section in the README to help with discoverability.
Anyway, I think that this is the API that is proper for 95%+ of callers, and anyway we can expand on it if we need to. I assume at that point it probably is worth baking into the
Runtime
type since anyway that is where we add more granularity of behavior.Dependency changes
I added a new feature flag for this,
graceful-shutdown
, since it adds new dependencies:tokio
'ssignal
featurelambda-extension
Also worth noting is that this is the first time we actually need the
tokio
rt
feature (tokio::signal::signal()
requires it, as doestask::spawn()
). That matters more for #984 , but I made this feature a default feature to make it easier to rip that out without breaking people if we ever remove the vestigial dependency onrt/rt-multi-threaded
without feature flags (across a semver boundary).I didn't add a
lib.rs
passage documenting this new feature flag, but I did mention it in a new README section. I also added somedocs.rs
metatags to give us the nicefeature
indicator for both this as well as the tracing module re-export.Testing
I didn't see any great existing unit tests or integration tests that can be run locally, to extend to cover this case. My doctest validates that it will compile, but not that it runs, since it would hang indefinitely waiting for events. It would be cool to set up some sort of build-time integ testing that uses
cargo lambda
, but that wasn't scope I had time for.I did throw in the graceful shutdown helper in our integration test that presumably runs against github actions - that at least will capture a fatal regression in our logic implicitly spinning up the non-op extension, etc, that would prevent the function from starting properly. Open to cutting a new integration test to run in CI that more specifically probes this, if folks feel it is necessary. Itust will be a bit awkward, since we'll need to have it sleep for a while to let the function spin down. Also either we will need sort of callback from the function to call against our github CI runner from inside the shutdown hook (aka expose a server in Github CI), or else poll eg CloudWatch Metrics or CloudWatch Logs to validate that the graceful shutdown logic fired.
Manual e2e testing
I did test a complete example manually both:
cargo-lambda watch
signal handling fix in place - refBoth showed the expected behavior:
Full AWS logs are at the end of this section.
I previously had my test code checked in as a complete
examples/
folder code. But, it felt kind of ridiculous when literally the doc comment was sufficient to demonstrate the behavior. I'm glad to add it back in if we think it is useful. Another option would be to use that example to show a more 'custom' signal handling use case that doesn't use our helper, but I was concerned that that would confuse readers.AWS logs:
🔏 By submitting this pull request
cargo +nightly fmt
.cargo clippy --fix
.