fix: instrumentation context runtime blocking #226
Open
fallintoplace wants to merge 1 commit into
dhruv9vats reviewed Jun 15, 2026
Comment on lines +215 to +220

    let join_result = thread::spawn(move || {
        let exporter = runtime
            .block_on(create_exporter(kind, id))
            .map_err(|e| e.to_string())?;
        Ok::<_, String>((runtime, exporter))
    })
Member
Thanks for the PR @fallintoplace!
Can we not just spawn a new thread, as is done here, on an existing runtime if one exists, instead of creating another runtime? Also, can we consider `spawn_blocking` here?
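For readers following the thread: the snippet under review produces two layers of failure, the helper thread panicking and exporter creation itself failing. Below is a minimal, std-only sketch of how those layers could be folded into one error. `Runtime`, `Exporter`, `create_exporter_blocking`, and `build_on_helper_thread` are hypothetical stand-ins rather than the PR's actual types, and the stand-in function takes the place of `runtime.block_on(create_exporter(kind, id))`.

```rust
use std::thread;

// Hypothetical stand-ins for the real runtime and exporter types.
#[derive(Debug, PartialEq)]
struct Runtime;

#[derive(Debug, PartialEq)]
struct Exporter {
    id: u32,
}

// Stand-in for `runtime.block_on(create_exporter(kind, id))`.
// It fails for id 0 so that the error path can be exercised.
fn create_exporter_blocking(_runtime: &Runtime, id: u32) -> Result<Exporter, String> {
    if id == 0 {
        Err("invalid exporter id".to_string())
    } else {
        Ok(Exporter { id })
    }
}

// Run the blocking construction on a helper thread, then fold both failure
// layers (thread panic, construction error) into a single String error.
fn build_on_helper_thread(runtime: Runtime, id: u32) -> Result<(Runtime, Exporter), String> {
    let join_result = thread::spawn(move || {
        let exporter = create_exporter_blocking(&runtime, id)?;
        // The runtime is moved back out so the caller keeps owning it.
        Ok::<_, String>((runtime, exporter))
    })
    .join();

    match join_result {
        Ok(inner) => inner,
        Err(_) => Err("exporter helper thread panicked".to_string()),
    }
}
```

Calling `build_on_helper_thread(Runtime, 7)` returns the runtime together with the exporter, while id `0` surfaces the construction error as a plain `String`. Whether a dedicated thread or `spawn_blocking` is the better vehicle is the open question in this thread; the sketch only shows the result handling.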
211de5d to e079b0c
rapids-bot Bot pushed a commit that referenced this pull request Jun 24, 2026
# Description

This PR mainly touches the instrumentation infrastructure / support crates and is based on two issues:

1. Have one exporter per observer instead of one exporter per context (see #229).
2. Make the context id unrelated to the id of the root entity (e.g. the engine id in the query engine domain model).

Item 2 allows every context to write events without the need to synchronize on the root id or to treat the root entity as something special versus other entities. While this would ideally be propagated back to the modeling API, #191 will remove that API in its current form entirely anyway, so it is left as is for now so that upstream model definitions do not experience breaking changes twice in a short period of time.

I initially tried to deal with 1 and 2 separately, but since this touches a lot of the same code across many layers, it seemed best to tackle these in one pass.

The consequences are as follows. In this description, only {App}Context and {Entity}Observer/Handle are user-facing; plain Context and Observer/Handle mentions are not user-facing, but are the model-agnostic backing structures for their generated counterparts.

- There is no longer a need for an "umbrella" event type representing all events of an application event model. For now this is still generated by the model macros because the analyzer side still relies on it. Removing it is intentionally not done in this PR to reduce the scope.
- The quent_instrumentation::Context (not user-facing) is now mainly responsible for ensuring there is an async runtime that observers can work with, and for providing a bridge between sync and async code. It no longer owns the exporter.
- A generated {App}Context now concurrently writes the model provenance sidecar file, constructs (+connects) exporters and observers, and blocks until all of that is ready.
- Each Observer now manages its own Exporter (which lives in an async forwarder task that it cancels on drop).
- Each {Entity}Observer deals out {Entity}Handles for entity instances, which hold an Arc to the inner Observer. This ensures everything is kept alive as long as a handle exists, so events can be emitted as long as there are any live handles, even when all Observers or the Context are dropped. Also, constructing or dropping Context / Observer / Handle can be done from both synchronous and asynchronous code as long as that code is not using a `current-thread` runtime (also raised in #226). This PR adds tests for all flavors of runtime environments for the Context to live in.
- Filesystem exporters now write in the following directory tree:
  ```
  - <context uuid>
    - <entity_name>
      - <uuid>.<extension> (using a uuid here is just a non-enforced convention that mainly batching no-append file format exporters will benefit from as they can just generate a new uuid for each batch of events as file name).
    - <another entity name>
      - <uuid>.<extension>
      - <uuid>.<extension>
      - ...
    - model.qmi (sidecar file)
  ```
- For the query engine domain, since a (distributed) engine's events could now be spread out across multiple contexts / directories, listing engines is done by first scanning all engine events across the entire data source folder to obtain all engine ids. Then, all worker events are scanned to figure out which workers were spawned on behalf of those engines. For now it is assumed that these entities are directly tied to all event-producing processes of the engines, such that their instrumentation contexts together capture all events for an entire engine model. This is a rather brittle assumption that should not be relied on in follow-up work. I intend to address this with a generated importing stack that leverages the quent-ref-tree constraint after #191 is done, plus some sort of indexing service hinted at in #40.
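As an illustration of the filesystem layout described above, here is a small std-only sketch that builds and parses paths of that shape. The function names, the plain-string ids, and the `cbor` extension used in the usage note are assumptions made for this example only; the real exporters may construct paths differently.

```rust
use std::path::{Path, PathBuf};

// Path of one event file: <root>/<context uuid>/<entity_name>/<uuid>.<extension>
fn event_file_path(
    root: &Path,
    context_id: &str,
    entity: &str,
    file_id: &str,
    ext: &str,
) -> PathBuf {
    root.join(context_id)
        .join(entity)
        .join(format!("{file_id}.{ext}"))
}

// Path of the sidecar file: <root>/<context uuid>/model.qmi
fn sidecar_path(root: &Path, context_id: &str) -> PathBuf {
    root.join(context_id).join("model.qmi")
}

// Recover (context id, entity name) from an event file path below `root`.
// Returns None for paths that are not of the form <ctx>/<entity>/<file>.
fn context_and_entity(root: &Path, file: &Path) -> Option<(String, String)> {
    let rel = file.strip_prefix(root).ok()?;
    let mut parts = rel.components();
    let ctx = parts.next()?.as_os_str().to_str()?.to_string();
    let entity = parts.next()?.as_os_str().to_str()?.to_string();
    // Require a third component so that <ctx>/model.qmi is not read as an entity.
    parts.next()?;
    Some((ctx, entity))
}
```

For example, `event_file_path(Path::new("data"), "ctx-1", "file_stats", "batch-1", "cbor")` yields `data/ctx-1/file_stats/batch-1.cbor`, and `context_and_entity` maps that path back to `("ctx-1", "file_stats")` while rejecting the sidecar path. A scan across all contexts, as described for listing engines, could group files by the entity name recovered this way.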
## Related Issues

Closes #229

## For reference: example of macro generated code by 🤖:

### Generated event type + stream name

```rust
pub enum FileStatsEvent {
    Checksum(Checksum),
    Decompressed(Decompressed),
}

impl quent_model::EntityEvent for FileStatsEvent {
    const NAME: &'static str = "file_stats"; // exporter subdir / wire tag / ingest key
}
```

### `{App}Context` (the user entry point), the relevant slice

```rust
pub struct AppContext {
    file_stats: FileStatsObserver,
    // ... one field per entity
    _inner: quent_model::Context,
}

impl AppContext {
    pub fn try_new(exporter: Option<…ExporterOptions>) -> Result<Self, …> {
        let inner = quent_model::Context::try_new(exporter)?; // sync: resolve runtime only
        Self::assemble(inner)
    }

    // The single sync→async bridge: sidecar + every observer built concurrently,
    // blocked once.
    fn assemble(inner: quent_model::Context) -> Result<Self, …> {
        let (file_stats, …) = inner.block_on(async {
            let (_sidecar, file_stats, …) = quent_model::tokio::try_join!(
                async {
                    inner.write_sidecar(<App as ModelSource>::model_info()).await;
                    Ok(())
                },
                inner.observer::<FileStatsEvent>(),
                // ... one per entity
            )?;
            Ok((file_stats, …))
        })?;
        Ok(Self {
            file_stats: FileStatsObserver::new(file_stats),
            …,
            _inner: inner,
        })
    }

    pub fn file_stats_observer(&self) -> FileStatsObserver {
        self.file_stats.clone() // cheap Arc clone
    }
}
```

### `{Entity}Observer` facade → `{Entity}Handle`

```rust
pub struct FileStatsObserver {
    inner: Arc<quent_model::Observer<FileStatsEvent>>,
}

impl FileStatsObserver {
    pub fn new(observer: quent_model::Observer<FileStatsEvent>) -> Self {
        Self { inner: Arc::new(observer) }
    }

    // pre-built (collector path)
    pub fn send(&self, event: quent_model::Event<FileStatsEvent>) {
        self.inner.send(event);
    }

    pub fn create(&self, id: Uuid) -> FileStatsHandle {
        FileStatsHandle { id, inner: self.inner.clone() } // handle co-owns the observer
    }
}

pub struct FileStatsHandle {
    id: Uuid,
    inner: Arc<quent_model::Observer<FileStatsEvent>>,
}

impl FileStatsHandle {
    pub fn uuid(&self) -> Uuid {
        self.id
    }

    pub fn checksum(&self, event: Checksum) {
        // Observer::emit -> EventSender -> forwarder
        self.inner.emit(self.id, FileStatsEvent::Checksum(event));
    }

    pub fn decompressed(&self, event: Decompressed) {
        self.inner.emit(self.id, FileStatsEvent::Decompressed(event));
    }
}
```

Emit path: `AppContext.file_stats_observer().create(id).checksum(..)` → `Observer::emit` → mpsc → forwarder task → exporter.

### Read/route side (collector + analyzer), same entity

```rust
// CollectorSink::ingest (server replays a received stream)
if entity == <FileStatsEvent as EntityEvent>::NAME {
    let e: Event<FileStatsEvent> = ciborium::from_reader(event)?;
    self.file_stats.send(e); // into the (already built) observer
    return Ok(());
}

// {Model}::import_events (analyzer reconstruction), per entity:
let path = dir.join(<FileStatsEvent as EntityEvent>::NAME); // <ctx>/file_stats
if path.is_dir() {
    let importer = create_importer::<FileStatsEvent>(&FileSystem { format, path })?;
    streams.push(importer.map(|e| Event::new(e.id, e.timestamp, AppEvent::from(e.data))));
}
```

`AppEvent` (the umbrella enum) + its `From<{Entity}Event>` impls are still generated solely for this analyzer reconstruction; nothing on the capture path uses them.

Authors:
- Johan Peltenburg (https://github.com/johanpel)

Approvers:
- Matthijs Brobbel (https://github.com/mbrobbel)
- Dhruv Vats (https://github.com/dhruv9vats)

URL: #241
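The keep-alive behavior of handles and the emit path described in the commit message can be modeled with the standard library alone. The following is a simplified sketch, not the generated code: a plain thread stands in for the async forwarder task, the thread simply collects events where the real forwarder would hand them to an exporter, and `Observer`, `Handle`, `spawn_observer`, and `emit_after_observer_drop` are hypothetical names. Unlike the real forwarder, which is cancelled on drop, this one ends when the channel closes.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

// Model-agnostic observer: owns the channel sender. The forwarder thread
// owns the receiver and plays the role of the exporter.
struct Observer {
    tx: Sender<(u32, String)>,
}

impl Observer {
    fn emit(&self, id: u32, event: &str) {
        // A send error only means the forwarder has already stopped.
        let _ = self.tx.send((id, event.to_string()));
    }
}

// Handle for one entity instance; it co-owns the observer through the Arc.
struct Handle {
    id: u32,
    inner: Arc<Observer>,
}

impl Handle {
    fn checksum(&self, value: &str) {
        self.inner.emit(self.id, value);
    }
}

// Spawn the forwarder; it collects events until every sender is dropped.
fn spawn_observer() -> (Arc<Observer>, JoinHandle<Vec<(u32, String)>>) {
    let (tx, rx) = mpsc::channel();
    let forwarder = thread::spawn(move || rx.into_iter().collect::<Vec<_>>());
    (Arc::new(Observer { tx }), forwarder)
}

// Drop the observer first, then emit through a surviving handle.
fn emit_after_observer_drop() -> Vec<(u32, String)> {
    let (observer, forwarder) = spawn_observer();
    let handle = Handle { id: 1, inner: Arc::clone(&observer) };
    drop(observer); // the handle's Arc keeps the sender alive
    handle.checksum("abc123");
    drop(handle); // last Arc gone: the channel closes and the forwarder finishes
    forwarder.join().expect("forwarder thread panicked")
}
```

In this model the event emitted after the observer is dropped still reaches the forwarder, because the handle's `Arc` keeps the sender alive until the handle itself goes away.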
Summary

Make `Context::try_new` and `Context` teardown safe when called from inside an existing Tokio runtime.

Details

`Context::try_new` detected `Handle::try_current()` and then called `handle.block_on(create_exporter(...))`. Tokio does not allow blocking the current runtime from a runtime worker thread, so async callers could panic during context construction. `Drop` had the same shape while waiting for the forwarder and flushing the exporter.

This keeps the existing synchronous API and preserves the original runtime behavior: contexts reuse an existing Tokio runtime when one is present, and only create an owned runtime when no runtime exists. When construction or teardown happens from inside Tokio, the blocking `Handle::block_on` work runs on a helper thread instead of the async worker thread. Noop contexts still short-circuit without creating or using a runtime.

Regression coverage now exercises:

Validation

- `cargo test -p quent-instrumentation`
- `cargo clippy -p quent-instrumentation --all-targets -- -D warnings`
- `cargo test`

Note: the broad `cargo test` run emitted existing C++ bridge build warnings from macOS `ar -D`, but completed successfully.