feat(selfhost): sampled Sentry spans for review latency #1922
Add sampled Sentry tracing across the review pipeline so operators can answer "why was this review slow / which stage failed" without digging through logs (JSONbored#1734).

Tracing is strictly opt-in: spans are a complete no-op until SENTRY_TRACES_SAMPLE_RATE is configured above 0 (default 0), independent of error capture. resolveTracesSampleRate clamps the rate to [0,1] and treats a malformed value as off. withSentrySpan (in sentry.ts) runs its callback inside a Sentry span only when sampling is on, tagging it with the safe, low-cardinality attribute subset (secrets/null dropped, non-finite/non-scalar dropped, strings truncated) and never with prompts, diffs, tokens, or bodies.

withReviewSpan lives in a neutral src/selfhost/tracing.ts module (not coupled to either tracer's module): it opens ONE boundary that feeds BOTH tracers (OpenTelemetry and Sentry), each independently a no-op when its backend is off. It replaces the existing withOtelSpan boundaries for the queue-job and AI-provider stages, so a sampled review produces a connected trace (whole-review job span with the AI span nested) where slow/failed stages are filterable.

Fully unit-tested: rate resolution, attribute scrubbing, no-op when off, span emission + safe attributes when on, error propagation, and the cross-tracer composition. Documented in .env.example.
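The two guards described in this commit message (rate resolution and attribute scrubbing) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function signatures, the secret-key pattern, and the 200-character truncation limit are all assumptions made for the example; only the names resolveTracesSampleRate and sentrySpanAttributes come from the PR text.

```typescript
// Hypothetical sketch. The signatures, SECRET_KEY_PATTERN and MAX_STRING_LENGTH
// are assumptions for illustration and are not taken from the PR.

type SpanAttributeValue = string | number | boolean;

const SECRET_KEY_PATTERN = /token|secret|password|authorization|api[-_]?key/i; // assumed
const MAX_STRING_LENGTH = 200; // assumed truncation limit

/** Turn a raw env value into a rate in [0, 1]; unset or malformed means "off" (0). */
function resolveTracesSampleRate(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return 0;
  const rate = Number(raw);
  if (!Number.isFinite(rate)) return 0; // "abc", "NaN", "Infinity" are treated as off
  return Math.min(1, Math.max(0, rate)); // clamp into [0, 1]
}

/** Keep only safe scalar attributes: drop secret-keyed, null, non-finite, non-scalar. */
function sentrySpanAttributes(
  attrs: Record<string, unknown>,
): Record<string, SpanAttributeValue> {
  const safe: Record<string, SpanAttributeValue> = {};
  for (const key of Object.keys(attrs)) {
    if (SECRET_KEY_PATTERN.test(key)) continue; // secret-keyed: dropped
    const value = attrs[key];
    if (value === null || value === undefined) continue; // null/undefined: dropped
    if (typeof value === "string") {
      safe[key] = value.slice(0, MAX_STRING_LENGTH); // long strings are truncated
    } else if (typeof value === "number") {
      if (Number.isFinite(value)) safe[key] = value; // NaN/Infinity: dropped
    } else if (typeof value === "boolean") {
      safe[key] = value;
    }
    // objects, arrays, functions and other non-scalars fall through and are dropped
  }
  return safe;
}

// Usage: a typo in the env value disables tracing instead of sampling everything.
const typoRate = resolveTracesSampleRate("0.1x"); // 0
const clampedRate = resolveTracesSampleRate("7"); // 1
const scrubbed = sentrySpanAttributes({ stage: "ai", apiToken: "abc", latency: NaN });
// scrubbed contains only { stage: "ai" }
```

Resolving anything unparseable to 0 rather than to a default rate keeps the failure mode conservative: a misconfigured deployment emits no trace traffic at all.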
Gittensory review result: approve/merge recommended
Review updated: 2026-07-01 05:57:33 UTC
Suggested action: Approve/Merge
Review summary: Nits, 6 non-blocking
pg-queue.ts and sqlite-queue.ts call withOtelSpan for the admission-deferred span (like server.ts does) but import only withReviewSpan, so tsc fails with 'Cannot find name withOtelSpan' on main — #1922 added the calls and #1986 landed close behind, green separately but broken together. Add the missing import from ./otel. Behavior-preserving (the span was already intended); unblocks CI.
Summary
Adds sampled Sentry tracing across the review pipeline so operators can answer "why was this review slow?" / "which stage failed?" without digging through logs (#1734). Additive, opt-in, and safe by construction; coexists with the existing OpenTelemetry spans by reusing the same boundaries.
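One way to picture "reusing the same boundaries" is a single wrapper that nests the callback inside both tracers, where each tracer is a pass-through when its backend is off. The sketch below is an assumption-laden illustration, not the PR's implementation: makeTracer and makeReviewSpan are hypothetical helpers, the tracers are in-memory stand-ins that append to a log instead of calling OpenTelemetry or Sentry, the attribute keys are invented, and callbacks are synchronous to keep it short.

```typescript
// Hypothetical sketch. makeTracer/makeReviewSpan and the in-memory event log are
// stand-ins for illustration; the real helpers talk to OpenTelemetry and Sentry.

type Attrs = Record<string, string | number | boolean>;
type SpanRunner = <T>(name: string, attrs: Attrs, fn: () => T) => T;

/** Build a stand-in tracer: pass-through when disabled, logs span events when enabled. */
function makeTracer(label: string, isEnabled: () => boolean, log: string[]): SpanRunner {
  return <T>(name: string, _attrs: Attrs, fn: () => T): T => {
    if (!isEnabled()) return fn(); // backend off: no span, just run the callback
    log.push(`${label}:start:${name}`);
    try {
      return fn();
    } catch (err) {
      log.push(`${label}:error:${name}`);
      throw err; // errors propagate to the caller unchanged
    } finally {
      log.push(`${label}:end:${name}`);
    }
  };
}

/** One boundary feeding both tracers: the Sentry span is opened inside the OTel span. */
function makeReviewSpan(otel: SpanRunner, sentry: SpanRunner): SpanRunner {
  return <T>(name: string, attrs: Attrs, fn: () => T): T =>
    otel(name, attrs, () => sentry(name, attrs, fn));
}

// Usage: OTel is on, Sentry sampling is off, so only OTel events are recorded.
const events: string[] = [];
let sentryOn = false;
const withOtelSpan = makeTracer("otel", () => true, events);
const withSentrySpan = makeTracer("sentry", () => sentryOn, events);
const withReviewSpan = makeReviewSpan(withOtelSpan, withSentrySpan);

const result = withReviewSpan("selfhost.queue.job", { queue: "pg" }, () =>
  withReviewSpan("selfhost.ai.provider", { provider: "example" }, () => "reviewed"),
);
```

Because the AI-provider boundary runs inside the queue-job callback, each enabled tracer sees the AI span nested under the job span, which is what makes the trace connected.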
Opt-in & safe
- Spans are a no-op until SENTRY_TRACES_SAMPLE_RATE > 0 (default 0), independent of error capture.
- resolveTracesSampleRate clamps to [0,1] and treats a malformed value as off, so a typo can't flood the tracer. Sampling off ⇒ no span started, no trace traffic.
- sentrySpanAttributes keeps only the safe, low-cardinality subset (drops secret-keyed, null, non-finite, and non-scalar values; truncates strings): never prompts, diffs, bodies, tokens, or headers.
Design
- withSentrySpan(name, attrs, fn) (sentry.ts): runs fn inside a Sentry span when sampling is on; a pass-through otherwise.
- withReviewSpan(name, attrs, fn, options) (neutral tracing.ts): opens one boundary feeding both tracers (OTel + Sentry); each side independently no-ops when off.
- … consume) with the nested AI-provider span. A sampled review ⇒ a connected trace with the major stages visible and slow/failed stages filterable.
Validation (reproducible, Node 24)
Coverage of the changed code (from coverage/lcov.info):
- src/selfhost/tracing.ts withReviewSpan: covered (line hit 90×)
- src/selfhost/sentry.ts new helpers: 100% lines (rate resolution, attribute scrubbing incl. NaN/non-scalar/secret drop + truncation, no-op-when-off, span emission, error propagation)
- selfhost.queue.job (pg-queue 25×, sqlite-queue 43×) and selfhost.ai.provider (ai.ts 19×), all hit by the existing queue/AI suites
Acceptance criteria
… fn()).
Documented in .env.example. Part of #998. Closes #1734