🛰️ LLMOps Telemetry Command Center

A decision-ready Streamlit command center for LLM reliability, latency, cost, routing-policy review, drift signals, triage thresholds, and operational evidence exports.

LLMOps Telemetry Command Center turns offline LLM telemetry and evaluation artifacts into an operator-facing review workflow:

telemetry data → validation → KPIs → hotspots → routing policy scenarios → triage simulation → review queue → evidence exports

This is an offline, self-contained review system: it does not call external LLM providers or require live telemetry infrastructure.

✨ What this dashboard does

📌 Summarizes operational posture across request volume, failure rate, p95 latency, cost, and health score.
🔥 Ranks risk hotspots by provider, model, use case, latency pressure, SLA breaches, failure rate, and cost.
🧪 Separates held-out artifact evidence from live scenario review so exploratory filters support investigation without changing audit artifacts.
🧭 Simulates routing policy choices with transparent assumptions around failure cost, SLA penalty, and minimum traffic.
🎯 Explores triage thresholds across review share, expected cost, precision, recall, and confusion-matrix trade-offs.
📋 Builds a filter-aware review queue for operational handoff and evidence review.
🧾 Surfaces drift and decision artifacts with controlled JSON expanders and clear status summaries.
📤 Exports filtered operational evidence for follow-up analysis, documentation, or release review.
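
The KPI strip described above can be sketched in a few lines. This is a minimal illustration, not the project's `src.metrics` code: the `Request` fields and the nearest-rank percentile are assumptions, and the bundled CSVs may use different column names. The health score is left out because its formula is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    # Hypothetical request-level fields; the bundled CSV may name them differently.
    latency_ms: float
    cost_usd: float
    failed: bool


def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile: smallest value with at least 95% of data at or below it."""
    if not values:
        raise ValueError("p95 needs at least one value")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def summarize(requests: list[Request]) -> dict[str, float]:
    """Collapse request rows into the headline operational KPIs."""
    if not requests:
        raise ValueError("no requests to summarize")
    failures = sum(1 for r in requests if r.failed)
    return {
        "request_volume": len(requests),
        "failure_rate": failures / len(requests),
        "p95_latency_ms": p95([r.latency_ms for r in requests]),
        "total_cost_usd": sum(r.cost_usd for r in requests),
    }


# Synthetic sample: 20 requests, every tenth one fails.
sample = [
    Request(latency_ms=100.0 * i, cost_usd=0.01, failed=(i % 10 == 0))
    for i in range(1, 21)
]
kpis = summarize(sample)
```

Raising on empty input, rather than returning zeros, mirrors the fail-fast stance the project takes toward missing evidence.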

🧭 Dashboard pages

| Page | Purpose |
| --- | --- |
| Command | Current operating state, KPI strip, incidents, and top risk slices |
| Hotspots | Provider/model/use-case segments driving reliability, latency, or cost pressure |
| Policy Lab | Held-out routing verdicts plus live filtered routing scenarios |
| Triage Simulator | Threshold, review-load, cost, precision, recall, and confusion-matrix analysis |
| Review Queue | Filter-aware triage queue with priority, reason, probability, latency, and cost fields |
| Evidence | Drift report, decision artifact, routing artifact, and raw evidence browser |
| Data Explorer | Searchable telemetry tables, cohorts, and instruction-template diagnostics |
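
To make the Triage Simulator trade-offs concrete, here is a small sketch of evaluating one threshold. It assumes each row carries a predicted failure probability and an observed outcome. The `review_cost` and `miss_cost` parameters are hypothetical operator knobs, not values taken from `triage_threshold_policy.json`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdResult:
    tp: int  # flagged for review and actually failed
    fp: int  # flagged for review but fine
    fn: int  # not flagged but actually failed
    tn: int  # not flagged and fine

    @property
    def review_share(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.fp) / total if total else 0.0

    @property
    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    @property
    def recall(self) -> float:
        positives = self.tp + self.fn
        return self.tp / positives if positives else 0.0

    def expected_cost(self, review_cost: float, miss_cost: float) -> float:
        # Every flagged row costs a review; every missed failure costs miss_cost.
        return (self.tp + self.fp) * review_cost + self.fn * miss_cost


def evaluate_threshold(scored: list[tuple[float, bool]], threshold: float) -> ThresholdResult:
    """scored holds (predicted failure probability, actually failed) pairs."""
    tp = fp = fn = tn = 0
    for probability, failed in scored:
        flagged = probability >= threshold
        if flagged and failed:
            tp += 1
        elif flagged:
            fp += 1
        elif failed:
            fn += 1
        else:
            tn += 1
    return ThresholdResult(tp, fp, fn, tn)


scored = [(0.9, True), (0.8, False), (0.6, True), (0.4, True), (0.2, False), (0.1, False)]
result = evaluate_threshold(scored, threshold=0.5)
cost = result.expected_cost(review_cost=1.0, miss_cost=10.0)
```

Sweeping `threshold` over a grid and collecting `review_share`, `precision`, `recall`, and `expected_cost` for each value would give a curve of the kind the simulator explores.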

✅ Key engineering details

Uses a thin app.py entrypoint and a dedicated Streamlit view controller.
Loads telemetry through a typed DataBundle with startup schema and integrity checks.
Validates required CSV and JSON artifact contracts before rendering.
Keeps notebook-generated artifacts as the audit source of truth.
Applies sidebar filters only to live exploratory views and matched review queues.
Uses explicit Plotly keys to prevent duplicate Streamlit element IDs.
Renders custom UI through controlled helpers instead of raw Markdown HTML blocks.
Keeps missing or invalid evidence visible through fail-fast checks rather than silent fallback charts.
Includes unit, contract, chart, UI, policy, and project-quality tests.
Ships with Docker, Docker Compose, CI, deployment notes, and release docs.
Provides a cross-platform test runner for Windows, Linux, macOS, and GitHub Actions.
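
The fail-fast contract checks mentioned above might look roughly like the following. This is a sketch using only the standard library. The `ContractError` class, the helper names, and the column and key names in the example are all assumptions for illustration, not the project's actual contracts.

```python
import csv
import io
import json


class ContractError(ValueError):
    """Raised when a bundled file does not match its expected contract."""


def check_csv_columns(name: str, text: str, required: set[str]) -> list[dict[str, str]]:
    """Parse CSV text and fail loudly if any required column is absent."""
    reader = csv.DictReader(io.StringIO(text))
    missing = required - set(reader.fieldnames or [])
    if missing:
        raise ContractError(f"{name}: missing columns {sorted(missing)}")
    return list(reader)


def check_json_keys(name: str, text: str, required: set[str]) -> dict:
    """Parse JSON text and fail loudly if it is not an object with the required keys."""
    payload = json.loads(text)
    if not isinstance(payload, dict):
        raise ContractError(f"{name}: expected a JSON object")
    missing = required - set(payload)
    if missing:
        raise ContractError(f"{name}: missing keys {sorted(missing)}")
    return payload


# Hypothetical contracts; the real artifacts may require different fields.
curve_text = "threshold,expected_cost\n0.5,13.0\n"
rows = check_csv_columns(
    "triage_threshold_curve.csv", curve_text, {"threshold", "expected_cost"}
)
policy = check_json_keys(
    "triage_threshold_policy.json", '{"threshold": 0.5}', {"threshold"}
)
```

Running checks like these before any page renders means a broken artifact surfaces as a named error instead of an empty or misleading chart.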

🧱 Architecture at a glance

Bundled telemetry CSVs + notebook artifacts
                |
                v
        src.data.load_bundle()
                |
        schema checks + type coercion
        artifact contract validation
        cross-table integrity checks
                |
                v
             DataBundle
                |
    +-----------+-----------+-----------+
    |                       |           |
src.metrics              src.policy   src.charts
KPIs, hotspots,          routing +    Plotly figures
queues, cohorts          triage
    |                       |           |
    +-----------+-----------+-----------+
                |
                v
          src.dashboard
       Streamlit workflow tabs
                |
                v
             src.ui
   cards, safe HTML helpers, tables

For the full design, see docs/architecture.md. For bundled file checksums and row counts, see docs/artifact_manifest.md.
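
As a rough illustration of the load path in the diagram (type coercion, then cross-table integrity checks, then an immutable bundle), consider the sketch below. It is not the real `src.data.load_bundle()`: the `build_bundle` helper, the two-table bundle shape, and the `session_id` and `latency_ms` fields are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBundle:
    # Hypothetical shape: the real bundle likely carries more tables and artifacts.
    interactions: tuple[dict, ...]
    sessions: tuple[dict, ...]


def build_bundle(interactions: list[dict], sessions: list[dict]) -> DataBundle:
    """Coerce types, verify cross-table references, and return a frozen bundle."""
    # Type coercion: CSV readers yield strings, so numeric fields are converted up front.
    coerced = tuple(
        {**row, "latency_ms": float(row["latency_ms"])} for row in interactions
    )

    # Cross-table integrity: every interaction must point at a known session.
    known_sessions = {s["session_id"] for s in sessions}
    orphans = sorted({r["session_id"] for r in coerced} - known_sessions)
    if orphans:
        raise ValueError(f"interactions reference unknown sessions: {orphans}")

    return DataBundle(interactions=coerced, sessions=tuple(sessions))


bundle = build_bundle(
    interactions=[
        {"session_id": "s1", "latency_ms": "120.5"},
        {"session_id": "s2", "latency_ms": "80"},
    ],
    sessions=[{"session_id": "s1"}, {"session_id": "s2"}],
)
```

Because the dataclass is frozen and holds tuples, downstream modules can read the bundle but cannot rebind its tables by accident.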

📦 Project structure

llmops-telemetry-command-center/
├── app.py
├── requirements.txt
├── requirements-dev.txt
├── pyproject.toml
├── README.md
├── CONTRIBUTING.md
├── LICENSE
├── CHANGELOG.md
├── DEPLOYMENT.md
├── VERSION
├── Dockerfile
├── docker-compose.yml
├── Makefile
├── assets/
│   └── preview.png
├── artifacts/
│   ├── decision_artifact.json
│   ├── drift_report.csv
│   ├── routing_backtest_summary.csv
│   ├── routing_policy_use_case.csv
│   ├── triage_actions_preview.csv
│   ├── triage_baseline_comparison.csv
│   ├── triage_threshold_curve.csv
│   └── triage_threshold_policy.json
├── data/
│   ├── llm_system_interactions.csv
│   ├── llm_system_sessions_summary.csv
│   ├── llm_system_users_summary.csv
│   ├── llm_system_prompts_lookup.csv
│   └── llm_system_instruction_tuning_samples.csv
├── docs/
│   ├── architecture.md
│   ├── artifact_provenance.md
│   ├── artifact_manifest.md
│   ├── data_dictionary.md
│   ├── operational_boundaries.md
│   └── testing_strategy.md
├── scripts/
│   ├── docker_smoke_test.py
│   └── run_tests.py
├── src/
│   ├── __init__.py
│   ├── charts.py
│   ├── dashboard.py
│   ├── data.py
│   ├── metrics.py
│   ├── models.py
│   ├── policy.py
│   ├── ui.py
│   └── views/
│       ├── command.py
│       ├── data_explorer.py
│       ├── evidence.py
│       ├── hotspots.py
│       ├── overview.py
│       ├── policy_lab.py
│       └── triage.py
├── tests/
│   ├── conftest.py
│   ├── test_artifact_contracts.py
│   ├── test_bundle_contract.py
│   ├── test_charts.py
│   ├── test_command_view.py
│   ├── test_data.py
│   ├── test_metrics.py
│   ├── test_policy.py
│   ├── test_project_quality.py
│   └── test_ui.py
└── .github/workflows/ci.yml

📊 Included data and artifacts

The repository includes synthetic/offline telemetry files for a frictionless first run:

| File | Role |
| --- | --- |
| `data/llm_system_interactions.csv` | Request-level telemetry and outcome fields |
| `data/llm_system_sessions_summary.csv` | Session-level rollups |
| `data/llm_system_users_summary.csv` | User/account-level synthetic summaries |
| `data/llm_system_prompts_lookup.csv` | Prompt and instruction metadata |
| `data/llm_system_instruction_tuning_samples.csv` | Instruction-template examples |
| `artifacts/decision_artifact.json` | Notebook-generated decision artifact |
| `artifacts/routing_backtest_summary.csv` | Held-out routing-policy backtest summary |
| `artifacts/triage_threshold_policy.json` | Offline triage policy metadata |
| `artifacts/triage_threshold_curve.csv` | Threshold/cost/review-load curve |
| `artifacts/triage_actions_preview.csv` | Review-queue preview artifact |

The included data is synthetic and does not contain real customer, billing, incident, or user records. It is intentionally bundled with the repository so the dashboard can run immediately after cloning without external services, private datasets, or notebook regeneration steps.
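
Since docs/artifact_manifest.md records checksums and row counts for the bundled files, a reader may want to recompute them after cloning. The sketch below shows one way to do that. It assumes SHA-256 and a single header row per CSV; the manifest may use a different hash or counting convention, and the `fingerprint` helper is hypothetical.

```python
import csv
import hashlib
import io


def fingerprint(raw: bytes) -> dict[str, object]:
    """Return a checksum and data-row count for the raw bytes of a CSV file."""
    digest = hashlib.sha256(raw).hexdigest()
    reader = csv.reader(io.StringIO(raw.decode("utf-8")))
    rows = list(reader)
    # Assumption: the first row is a header and is excluded from the count.
    data_rows = max(len(rows) - 1, 0)
    return {"sha256": digest, "rows": data_rows}


# In-memory stand-in for a bundled file; on disk you would use Path(...).read_bytes().
raw = b"request_id,latency_ms\n1,120\n2,80\n"
info = fingerprint(raw)
```

Comparing `info` against the manifest entry for the same file is a quick way to confirm that a local copy has not drifted from the committed artifacts.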

⚖️ Artifact evidence vs live scenario review

This project deliberately separates two scopes:

Held-out evaluation artifacts: Notebook-generated outputs such as routing backtests, triage curves, drift reports, and decision artifacts. These are treated as audit evidence.
Live filtered scenario review: Sidebar filters, operator knobs, routing assumptions, and queue filters. These support investigation and candidate review while held-out artifacts remain the audit trail.

This design keeps the interface useful for operational analysis while preserving a clean separation between evidence review and rollout approval.
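
The separation can be illustrated with a small routing scenario. In this sketch the held-out verdict is wrapped in a read-only mapping, while the live scenario scores filtered segments under operator-chosen assumptions. The scoring formula, the `Segment` fields, and the provider names are assumptions for illustration, not the project's `src.policy` logic.

```python
from dataclasses import dataclass
from types import MappingProxyType

# Held-out artifact: loaded once and exposed read-only so live knobs cannot alter it.
HELD_OUT_VERDICT = MappingProxyType({"recommended_provider": "provider_b"})


@dataclass(frozen=True)
class Segment:
    # Hypothetical per-provider aggregates over the currently filtered rows.
    provider: str
    requests: int
    avg_cost_usd: float
    failure_rate: float
    sla_breach_rate: float


def rank_routes(
    segments: list[Segment],
    failure_cost: float,
    sla_penalty: float,
    min_traffic: int,
) -> list[tuple[str, float]]:
    """Rank providers by assumed effective cost per request, cheapest first."""
    # Segments below the minimum traffic are excluded as too thin to judge.
    eligible = [s for s in segments if s.requests >= min_traffic]
    scored = [
        (
            s.provider,
            s.avg_cost_usd
            + s.failure_rate * failure_cost
            + s.sla_breach_rate * sla_penalty,
        )
        for s in eligible
    ]
    return sorted(scored, key=lambda item: item[1])


segments = [
    Segment("provider_a", 1000, 0.02, 0.05, 0.10),
    Segment("provider_b", 800, 0.03, 0.01, 0.02),
    Segment("provider_c", 20, 0.001, 0.0, 0.0),  # below min_traffic in this scenario
]
ranking = rank_routes(segments, failure_cost=1.0, sla_penalty=0.5, min_traffic=100)
```

Changing `failure_cost`, `sla_penalty`, or `min_traffic` reorders `ranking`, but `HELD_OUT_VERDICT` stays exactly as loaded, which is the property the two-scope design depends on.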

🚀 Run locally

Create a fresh virtual environment:

python -m venv .venv

Activate it.

Windows PowerShell:

.venv\Scripts\Activate.ps1

macOS/Linux:

source .venv/bin/activate

Install dependencies and run:

pip install -r requirements.txt
streamlit run app.py

Open:

http://localhost:8501

🐳 Run with Docker

Build the image:

docker build -t llmops-telemetry-command-center .

Run the app:

docker run --rm -p 8501:8501 llmops-telemetry-command-center

Production-style Docker validation:

python scripts/docker_smoke_test.py --image llmops-telemetry-command-center:ci --build

Or use Docker Compose:

docker compose up --build

🧪 Validate locally

Install development dependencies:

pip install -r requirements.txt -r requirements-dev.txt

Run the same checks used by CI:

python -m ruff check app.py src tests scripts
python -m ruff format --check app.py src tests scripts
python -m compileall app.py src tests scripts
python scripts/run_tests.py

Or use:

make check

Expected result:

Ruff passes, compileall completes, and the full pytest suite passes.

The repository is CI-tested on Python 3.11 and 3.12. CI also validates Docker image build and runtime health through a Streamlit smoke test.

🌐 Deployment targets

This project is suitable for:

GitHub technical repository
Streamlit Community Cloud
Hugging Face Spaces with sdk: streamlit
Docker-based local walkthroughs
Internal technical review sessions over offline telemetry artifacts

See DEPLOYMENT.md for deployment commands and runtime assumptions.

🛡️ Data and operations note

The command center uses synthetic/offline telemetry and notebook-generated artifacts. It is designed for LLMOps analysis, evaluation review, routing-policy inspection, triage-threshold planning, and dashboard engineering.

A live operating environment would add telemetry ingestion, persistence, authentication, access controls, alert delivery, service monitoring, and scheduled artifact refresh jobs around this interface.

📄 License

MIT License. See LICENSE.
