
feat: add Rust-based NVTX injection library #87

Draft
johanpel wants to merge 14 commits into rapidsai:main from johanpel:nvtx
Conversation

@johanpel johanpel commented Apr 1, 2026

Contributor

First step toward solving #76, where we want to be able to observe NVTX events with Quent.

WIP

johanpel and others added 7 commits April 1, 2026 17:59
Adds the NVIDIA/NVTX repository as a git submodule for vendored C headers
and the Rust nvidia-nvtx crate. Registers the four new integration crates
in the workspace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Raw NVTX event types (NvtxEvent enum with 19 variants) representing
verbatim NVTX API calls. Pure serde types with no Quent dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stateless NVTX injection library that intercepts all NVTX API calls via
the static injection mechanism and forwards them as NvtxEvent values
through a user-provided hook. Includes bindgen for C types, 30 callback
implementations (CORE + CORE2), and a C symbol file for weak symbol
override.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
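To illustrate the hook forwarding described in the commit above, here is a minimal sketch, not the PR's actual code. The three-variant `NvtxEvent` is a hypothetical stand-in for the real 19-variant enum, and `forward` is an assumed helper name. Storing the hook in a `OnceLock` is also an assumption. The `bool` return of `install_hook` follows the convention noted in a later commit of this PR.

```rust
use std::sync::OnceLock;

/// Hypothetical, heavily reduced stand-in for the PR's `NvtxEvent` enum.
#[derive(Debug, Clone, PartialEq)]
pub enum NvtxEvent {
    Mark { message: String },
    RangePush { message: String },
    RangePop,
}

/// Type-erased user hook; every intercepted NVTX call would be passed to it.
type Hook = Box<dyn Fn(NvtxEvent) + Send + Sync>;

/// Process-wide hook slot (assumption: the hook can be set once and never replaced).
static HOOK: OnceLock<Hook> = OnceLock::new();

/// Install the hook. Returns `false` if a hook was already installed.
pub fn install_hook<F>(hook: F) -> bool
where
    F: Fn(NvtxEvent) + Send + Sync + 'static,
{
    HOOK.set(Box::new(hook)).is_ok()
}

/// Called from each injected callback: hand the event to the hook, if any.
/// Events are silently dropped while no hook is installed.
pub fn forward(event: NvtxEvent) {
    if let Some(hook) = HOOK.get() {
        hook(event);
    }
}
```

In this sketch the library itself holds no state beyond the hook slot, which is one way to read "stateless": all bookkeeping (range pairing, domain tables) would live on the consumer side of the hook.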
Uses nvidia-nvtx to emit real NVTX C API calls and verifies correct
NvtxEvent types, messages, thread IDs, and range ID pairing arrive
through the injection hook.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Thin Quent-specific wrapper that connects quent-nvtx-injection to
Quent's EventSender. Single install() function that type-erases the
sender via the injection hook.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stub crate for the NVTX analyzer that will reconstruct Quent traces,
FSMs, and handle mappings from raw NvtxEvent streams.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
12 design documents covering event mapping, injection architecture,
push/pop and start/end range mapping, payload extension for FSM
correlation, programmatic installation, analyzer integration, and
resolved open questions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
johanpel and others added 7 commits April 1, 2026 20:36
- Fix domain_handle_id typing: use Option<u64> consistently on all event
  types (ResourceCreate, RegisterString, DomainCreate, DomainDestroy)
- Rename NvtxMessage::Ascii to NvtxMessage::String (wraps both ASCII and
  converted unicode)
- Add Clone and PartialEq derives to all event types
- Document payload extension variants as not yet emitted (Phase 5)
- Add compile_error! guard for Windows wchar_t (2-byte UTF-16 unsupported)
- Remove empty symbol.rs module, move doc comment to lib.rs
- Bump bindgen to 0.72 to align with nvtx-sys
- Move quent-nvtx-events to sibling path (integrations/nvtx/events/)
- Fix 5 stale design doc claims (install signature, cdylib, timestamps,
  registered strings)
- Expand integration test: domains, registered strings, categories, colors,
  domain-scoped push/pop, domain destroy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the NVTX git submodule and use a Cargo git dependency on
nvidia-nvtx from the release-v3 branch instead. The build script
locates NVTX C headers via cargo_metadata by finding the nvtx-sys
crate source in the dependency graph.

Add NVIDIA/NVTX to deny.toml's allow-git list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Delete leftover empty .gitmodules
- Bounds-check table size before writing callback slots in init.rs
- DomainCreate/DomainDestroy/RegisterString: domain_handle_id is u64
  (always present), not Option<u64>
- Fix design doc paths (events crate moved to integrations/nvtx/events/)
- Document force-load linker arg in build.rs
- Add safety comment on union field reads in convert.rs
- Use replacement character for invalid wchar_t codepoints
- Move Windows compile_error! to crate root
- install_hook() returns bool (false if already installed)
- Document NO_PUSH_POP_TRACKING constant
- Fix doc example crate name (quent_nvtx_injection)
- Document thread ID capture infallibility
- Remove quent-nvtx-analyzer placeholder (not in scope for this PR)
- Document Relaxed ordering on atomic counters
- Derive Default on NvtxAttributes, simplify helper constructors
- Replace magic number with NVTX_ETID_CALLBACKS constant

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
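The "replacement character for invalid wchar_t codepoints" item above could look roughly like the following sketch. It assumes a 4-byte `wchar_t` holding one Unicode code point per unit (the non-Windows case). The function name `wide_to_string` is hypothetical and is not taken from `convert.rs`.

```rust
/// Convert 4-byte wide characters to a Rust `String`.
/// Values that are not valid Unicode scalar values (surrogates, or anything
/// above U+10FFFF) are replaced with U+FFFD instead of aborting the conversion.
pub fn wide_to_string(units: &[u32]) -> String {
    units
        .iter()
        .map(|&cp| char::from_u32(cp).unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect()
}
```

For example, `wide_to_string(&[0x41, 0xD800, 0x42])` yields `"A\u{FFFD}B"`: the lone surrogate is replaced and the surrounding characters are kept.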
- Fix null pointer dereference: RegisterString callbacks now use
  domain_handle_id() (null-safe) instead of handle_to_id(), and the
  field type is Option<u64> for consistency
- Remove unused handle_to_id() function
- Document handle lifetime contract (NVTX spec: caller must not use
  handle after destroy)
- Add compile_error! for non-64-bit targets (union field access assumes
  pointer and u64 are the same size)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
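A minimal sketch of the two safeguards listed in the commit above: a null-safe handle-to-ID conversion and a compile-time guard for 64-bit targets. The function body and the use of `*const c_void` for the handle are assumptions for illustration. The PR's real `domain_handle_id()` may differ.

```rust
use std::ffi::c_void;

// Refuse to build where pointers are not 64 bits wide, since the ID
// conversion below treats a pointer and a u64 as interchangeable.
#[cfg(not(target_pointer_width = "64"))]
compile_error!("this sketch assumes 64-bit pointers");

// The same assumption, checked as a constant assertion at compile time.
const _: () = assert!(std::mem::size_of::<*const c_void>() == std::mem::size_of::<u64>());

/// Map an opaque handle to a numeric ID without dereferencing it.
/// A null handle becomes `None` rather than being read through.
pub fn domain_handle_id(handle: *const c_void) -> Option<u64> {
    if handle.is_null() {
        None
    } else {
        Some(handle as usize as u64)
    }
}
```

Because the pointer is only compared and cast, never dereferenced, the function needs no `unsafe` block even for dangling or null handles.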
Move agent-generated design documents to .gitignore. The files remain
on disk for reference but are not part of the PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove banner-style section separators (// --- X --- and // ----...----).
Use plain // X comments instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
johanpel changed the title from "Add Rust-based injection library for NVTX consumption" to "feat: add Rust-based NVTX injection library" on Apr 2, 2026
2 participants