Skip to content

feat(selfhost): sampled Sentry spans for review latency#1922

Merged
JSONbored merged 1 commit into
JSONbored:mainfrom
nickmopen:feat/selfhost-review-tracing-spans
Jul 1, 2026
Merged

feat(selfhost): sampled Sentry spans for review latency#1922
JSONbored merged 1 commit into
JSONbored:mainfrom
nickmopen:feat/selfhost-review-tracing-spans

Conversation

@nickmopen

Copy link
Copy Markdown
Contributor

Summary

Adds sampled Sentry tracing across the review pipeline so operators can answer "why was this review slow?" / "which stage failed?" without digging through logs (#1734). Additive, opt-in, and safe by construction; coexists with the existing OpenTelemetry spans by reusing the same boundaries.

Resubmit of the gate-approved tracing change with both review nits addressed: withReviewSpan now lives in a neutral src/selfhost/tracing.ts module (callers no longer import ./sentry for the boundary), and the error test no longer asserts SDK-internal span-status behavior the mock can't prove.

Opt-in & safe

  • No-op until configured. Spans emit only when SENTRY_TRACES_SAMPLE_RATE > 0 (default 0), independent of error capture. resolveTracesSampleRate clamps to [0,1] and treats a malformed value as off, so a typo can't flood the tracer. Sampling off ⇒ no span started, no trace traffic.
  • Never sensitive payloads. sentrySpanAttributes keeps only the safe, low-cardinality subset (drops secret-keyed, null, non-finite, and non-scalar values; truncates strings) — never prompts, diffs, bodies, tokens, or headers.

Design

  • withSentrySpan(name, attrs, fn) (sentry.ts) — runs fn inside a Sentry span when sampling is on; a pass-through otherwise.
  • withReviewSpan(name, attrs, fn, options) (neutral tracing.ts) — opens one boundary feeding both tracers (OTel + Sentry); each side independently no-ops when off.
  • Wired at the latency-heavy boundaries: the queue-job span (whole-review latency — GitHub reads, REES call, AI, gate, publish all run inside consume) with the nested AI-provider span. A sampled review ⇒ a connected trace with the major stages visible and slow/failed stages filterable.

Validation (reproducible, Node 24)

npm run typecheck                                                          # clean (exit 0)
npx vitest run test/unit/selfhost-sentry.test.ts test/unit/selfhost-tracing.test.ts   # 49 pass
npm run test:coverage                                                      # exit 0

Coverage of the changed code (from coverage/lcov.info):

  • src/selfhost/tracing.ts withReviewSpancovered (line hit 90×)
  • src/selfhost/sentry.ts new helpers — 100% lines (rate resolution, attribute scrubbing incl. NaN/non-scalar/secret drop + truncation, no-op-when-off, span emission, error propagation)
  • instrumented boundaries — selfhost.queue.job (pg-queue 25×, sqlite-queue 43×) and selfhost.ai.provider (ai.ts 19×), all hit by the existing queue/AI suites

Acceptance criteria

  • ✅ Sampling enabled → one review produces a connected trace (queue-job span with the AI-provider span nested).
  • ✅ Sampling disabled → behavior unchanged, no Sentry trace traffic (helpers reduce to fn()).
  • ✅ Slow/failed stages filterable without exposing sensitive payloads.

Documented in .env.example. Part of #998. Closes #1734

Add sampled Sentry tracing across the review pipeline so operators can answer "why was this review slow / which
stage failed" without digging through logs (JSONbored#1734).

Tracing is strictly opt-in: spans are a complete no-op until SENTRY_TRACES_SAMPLE_RATE is configured above 0
(default 0), independent of error capture. resolveTracesSampleRate clamps the rate to [0,1] and treats a malformed
value as off. withSentrySpan (in sentry.ts) runs its callback inside a Sentry span only when sampling is on,
tagging it with the safe, low-cardinality attribute subset (secrets/null dropped, non-finite/non-scalar dropped,
strings truncated) — never prompts, diffs, tokens, or bodies.

withReviewSpan lives in a neutral src/selfhost/tracing.ts module (not coupled to either tracer's module): it opens
ONE boundary that feeds BOTH tracers — OpenTelemetry and Sentry — each independently no-op when its backend is off.
It replaces the existing withOtelSpan boundaries for the queue-job and AI-provider stages, so a sampled review
produces a connected trace (whole-review job span with the AI span nested) where slow/failed stages are filterable.

Fully unit-tested: rate resolution, attribute scrubbing, no-op when off, span emission + safe attributes when on,
error propagation, and the cross-tracer composition. Documented in .env.example.
@nickmopen nickmopen requested a review from JSONbored as a code owner June 30, 2026 23:03
@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Jun 30, 2026
@gittensory-orb

gittensory-orb Bot commented Jun 30, 2026

Copy link
Copy Markdown

Tip

🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩

✅ Gittensory review result - approve/merge recommended

Review updated: 2026-07-01 05:57:33 UTC

8 files · 1 AI reviewer · no blockers · readiness 55/100 · CI green · clean

✅ Suggested Action - Approve/Merge

  • safe to merge

Review summary
The change cleanly adds an opt-in Sentry tracing path and routes the existing self-host review boundaries through a neutral `withReviewSpan` wrapper, preserving the existing OpenTelemetry span calls while adding Sentry only when a DSN and positive sample rate are configured. The safety pieces in the visible diff are coherent: malformed sample rates resolve to 0, span attributes are filtered to low-cardinality scalar values, and tests cover disabled tracing, enabled tracing, attribute scrubbing, and wrapper composition. I do not see a reachable correctness break in the changed hunks.

Nits — 6 non-blocking
  • nit: `src/selfhost/sentry.ts:204` ties tracing to `active`, so a future reader could miss that `SENTRY_TRACES_SAMPLE_RATE > 0` alone is intentionally insufficient; a short inline note at the predicate would make the DSN+sample-rate requirement explicit.
  • nit: `test/unit/selfhost-tracing.test.ts:23` verifies that both wrappers receive the same name but not that the same attribute object is forwarded to both, which is the more important part of the single-boundary contract.
  • In `test/unit/selfhost-tracing.test.ts`, add assertions for `otel.withOtelSpan.mock.calls[0]![1]` and `sentry.withSentrySpan.mock.calls[0]![1]` so the test locks the attribute forwarding contract, not only the span name.
  • In `src/selfhost/sentry.ts`, consider documenting on `sentryTracingEnabled()` that error capture must be initialized because Sentry tracing is deliberately not independently initialized.
  • Pull request duplicates other open work — Check for an existing pull request or issue covering this change and coordinate or consolidate before continuing.
  • Readiness score is below the configured threshold — Use the readiness panel as advisory maintainer context; the score does not block this PR.
Signal Result Evidence
Code review ✅ No blockers 1 reviewer
Linked issue ✅ Linked #1734
Related work ⚠️ 1 scoped overlap Top overlaps are listed below; lower-confidence bulk is hidden.
Change scope ❌ 8/20 High review scope from cached public metadata (size label size:M; 1 linked issue).
Validation posture ❌ 5/25 Preflight is holding this PR; address the blocker before review.
Contributor workload ✅ 10/10 Author activity: 96 registered-repo PR(s), 63 merged, 1 issue(s).
Contributor context ✅ Confirmed Gittensor contributor nickmopen; Gittensor profile; 96 PR(s), 1 issue(s).
Gate result ✅ Passing No configured blocker found.
Review context
  • Author: nickmopen
  • Role context: outside_contributor
  • Public audience mode: oss maintainer
  • Lane context: Repository registration is not available in the local Gittensory cache.
  • Public profile languages: not available
  • Official Gittensor activity: 96 PR(s), 1 issue(s).
  • Related work: Titles/paths share 7 meaningful terms. (PR #1945)
Contributor next steps
  • Review top overlaps.
  • Add a concise scope and risk note.
  • Fix the blocker.
  • Triage stale or unlinked PRs.
  • Refresh registry data or choose a registered active repo.
  • Check active issues and PRs before submitting.
Signal definitions
  • Related work = same linked issue, overlapping active PRs, or title/path similarity.
  • Change scope = cached public metadata such as size labels, draft state, and review-burden hints.
  • Validation posture = whether the PR provides enough public validation/test evidence for maintainer review.
  • Contributor workload = public contributor activity and cleanup pressure, not a repo-wide quality failure.
  • Contributor context = public GitHub/Gittensor identity context; non-Gittensor status is not a blocker.

🟩 Safe / merged · 🟦 Advisory · 🟨 Held for review · 🟥 Blocked / closed


💰 Earn for open-source contributions like this. Gittensor lets GitHub contributors earn for the work they already do — register to start earning →.

Checked by Gittensory, a quiet PR intelligence layer for OSS maintainers.

  • Re-run Gittensory review

@gittensory-orb gittensory-orb Bot added gittensor Gittensor contributor context gittensor:feature Gittensor-scored feature linked to a feature issue — scores a 1.25x multiplier. labels Jun 30, 2026
@dosubot dosubot Bot added the lgtm Approved by a maintainer. label Jul 1, 2026
@JSONbored JSONbored merged commit d0d0b8f into JSONbored:main Jul 1, 2026
8 checks passed
@github-project-automation github-project-automation Bot moved this from Todo to Done in gittensory - v1 roadmap Jul 1, 2026
JSONbored added a commit that referenced this pull request Jul 1, 2026
pg-queue.ts and sqlite-queue.ts call withOtelSpan for the admission-deferred
span (like server.ts does) but import only withReviewSpan, so tsc fails with
'Cannot find name withOtelSpan' on main — #1922 added the calls and #1986 landed
close behind, green separately but broken together. Add the missing import from
./otel. Behavior-preserving (the span was already intended); unblocks CI.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

gittensor:feature Gittensor-scored feature linked to a feature issue — scores a 1.25x multiplier. gittensor Gittensor contributor context lgtm Approved by a maintainer. size:M This PR changes 30-99 lines, ignoring generated files.

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

observability(tracing): add sampled Sentry spans for review latency

2 participants