
observability(selfhost): add Sentry cron monitors for scheduled work #1733

Description

@JSONbored

Part of #998. Related: #982 and #1667.

Context

The self-host runtime has important scheduled and repeating work: the two-minute scheduled loop, relay drains, registry refreshes, sweeps, backfills, and retry jobs. Exception reporting is useful, but it cannot catch the worst class of failure: work that silently stops running.

Sentry cron monitors should catch missed, late, and failed scheduled work without adding mandatory infrastructure for self-host operators.

Requirements

  • Add Sentry monitor check-ins only when Sentry is enabled; unset SENTRY_DSN remains fully inert.
  • Monitor the self-host scheduled loop and the highest-value recurring jobs first.
  • At minimum, consider monitor coverage for: scheduled loop, Orb relay drain, registry refresh, regate sweep fanout, relay retry, RAG full index, and open-data/backfill jobs.
  • Monitor names/slugs must be stable and environment-aware.
  • Failed check-ins must include safe operational context only: job type, repo when relevant, count/latency where useful, release, and environment.
  • Do not attach request bodies, queue payload bodies, tokens, headers, or private prompt/review content.
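
The slug and context requirements above could be met along these lines. This is a minimal sketch, not the project's implementation: `JOB_TYPES`, `monitorSlug`, `SAFE_KEYS`, and `safeContext` are hypothetical names, the `selfhost-` prefix and the key names (`latencyMs` and so on) are assumptions, and the job list simply mirrors the coverage bullet.

```typescript
// Hypothetical names throughout; nothing here is taken from the codebase.

// Job types drawn from the coverage list in this issue.
const JOB_TYPES = [
  "scheduled-loop",
  "orb-relay-drain",
  "registry-refresh",
  "regate-sweep-fanout",
  "relay-retry",
  "rag-full-index",
  "open-data-backfill",
] as const;
type JobType = (typeof JOB_TYPES)[number];

// Build a stable, environment-aware slug from a fixed job name and the
// environment. The environment is lowercased and reduced to [a-z0-9-].
function monitorSlug(job: JobType, environment: string): string {
  const env = environment
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `selfhost-${env || "unknown"}-${job}`;
}

// Only these keys may be attached to a failed check-in (assumed key names).
const SAFE_KEYS = ["jobType", "repo", "count", "latencyMs", "release", "environment"] as const;
type SafeKey = (typeof SAFE_KEYS)[number];
type SafeContext = Partial<Record<SafeKey, string | number>>;

// Allowlist filter: copy known keys that hold a string or number and drop
// everything else, so bodies, headers, and tokens never pass through.
function safeContext(raw: Record<string, unknown>): SafeContext {
  const out: SafeContext = {};
  for (const key of SAFE_KEYS) {
    const value = raw[key];
    if (typeof value === "string" || typeof value === "number") {
      out[key] = value;
    }
  }
  return out;
}
```

An allowlist is used rather than a denylist so that a newly added field is excluded by default. Note that the filter does not inspect the values stored under allowed keys, so callers still need to avoid putting sensitive strings there.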

Deliverables

  • Sentry monitor helper for start/success/failure check-ins.
  • Instrumentation around the selected scheduled and recurring work paths.
  • Tests for success, failure, missing-config/no-op, and sanitization behavior.
  • Docs describing which monitors exist and what an alert means operationally.
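
One possible shape for the check-in helper is sketched below. It assumes a small `CheckInClient` interface (a hypothetical seam, modeled on the `captureCheckIn` call in the Sentry JavaScript SDK) that the runtime would construct only when `SENTRY_DSN` is set; `createMonitor` and `monitored` are illustrative names, and reporting the duration in seconds is an assumption to verify against the SDK version in use.

```typescript
// CheckInClient is a hypothetical seam; in real code it would delegate to the SDK.
type CheckIn =
  | { monitorSlug: string; status: "in_progress" }
  | { monitorSlug: string; status: "ok" | "error"; checkInId?: string; duration: number };

interface CheckInClient {
  captureCheckIn(checkIn: CheckIn): string; // returns the check-in id
}

// Returns a wrapper that reports start/success/failure around a job.
// With no client (Sentry disabled) the job runs untouched.
function createMonitor(client: CheckInClient | undefined, now: () => number = Date.now) {
  return async function monitored<T>(slug: string, job: () => Promise<T>): Promise<T> {
    if (!client) return job();
    const active = client;

    // Reporting must never break or mask the job itself.
    const report = (checkIn: CheckIn): string | undefined => {
      try {
        return active.captureCheckIn(checkIn);
      } catch {
        return undefined;
      }
    };

    const startedAt = now();
    const checkInId = report({ monitorSlug: slug, status: "in_progress" });
    const finish = (status: "ok" | "error"): void => {
      const duration = (now() - startedAt) / 1000; // seconds (assumed unit)
      report({ monitorSlug: slug, status, checkInId, duration });
    };

    try {
      const result = await job();
      finish("ok");
      return result;
    } catch (err) {
      finish("error");
      throw err; // preserve the original failure for existing error handling
    }
  };
}
```

Injecting the client and the clock keeps the helper testable without a network connection: a fake client can record check-ins for the success and failure cases, and passing `undefined` exercises the no-op path.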

Acceptance criteria

  • A healthy self-host runtime shows successful check-ins for the monitored scheduled work.
  • A forced scheduled job failure records a failed check-in with actionable but sanitized context.
  • Disabling Sentry leaves behavior and runtime overhead effectively unchanged.

Metadata

Assignees

Labels

maintainer-only: Work to be completed solely by jsonbored; yields no gittensor points.

Projects

Status
Todo

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
