docs: reflect the now-live Phoenix-MCP loop in compliance + evidence docs #86

Merged
ComBba merged 1 commit into main from docs/phoenix-mcp-live-compliance on Jun 6, 2026

@ComBba ComBba commented Jun 6, 2026

Follow-up audit (prompted by 'did you fix everything?'): the primary surfaces (web chips, README table, Devpost) were flipped to live in #84, but two judge-facing docs still understated/contradicted it.

  • rapid-agent-compliance.md: Phoenix-MCP status Wired -> Live; section 3 reframed so the DEPLOYED path is the live PhoenixMcpConsultant (per-request read + write-back over MCP, Cloud-SQL-backed Phoenix) and TableConsultant is the no-endpoint fallback.
  • evidence/rapid-agent-visual-proof-2026-05-24.md: dated Superseded banner (kept the historical amber/wired content intact — it was accurate on that date).

Docs-only.
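
The deployed/fallback split described above can be sketched roughly as follows. This is a minimal illustration, assuming the choice depends only on whether `PHOENIX_COLLECTOR_ENDPOINT` is set. Only the names `PhoenixMcpConsultant` and `TableConsultant` come from this PR; the class bodies, the `consult(anchor)` signature, and the `pick_consultant` helper are hypothetical and are not the repository's code.

```python
from dataclasses import dataclass
from typing import Mapping, Protocol


class Consultant(Protocol):
    """Hypothetical interface: return a score delta for one anchor."""

    def consult(self, anchor: str) -> float: ...


@dataclass
class TableConsultant:
    """Fallback path: a static table of per-anchor deltas (assumed shape)."""

    table: Mapping[str, float]

    def consult(self, anchor: str) -> float:
        # Unknown anchors get a neutral delta in this sketch.
        return self.table.get(anchor, 0.0)


@dataclass
class PhoenixMcpConsultant:
    """Stand-in for the live path; the real class talks to Phoenix over MCP."""

    endpoint: str

    def consult(self, anchor: str) -> float:
        # The MCP read and write-back are deliberately left out of this sketch.
        raise NotImplementedError("live MCP round-trip not modelled here")


def pick_consultant(env: Mapping[str, str], table: Mapping[str, float]) -> Consultant:
    """Use the live consultant when an endpoint is configured, else the table."""
    endpoint = env.get("PHOENIX_COLLECTOR_ENDPOINT", "").strip()
    if endpoint:
        return PhoenixMcpConsultant(endpoint=endpoint)
    return TableConsultant(table=table)
```

Passing `os.environ` as `env` would mirror a deployed service; passing an empty mapping exercises the no-endpoint fallback.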

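Summary by CodeRabbit

Release Notes

  • Documentation
    • Updated the documentation to reflect the latest status information.
    • Added the audit status and system state changes to the documentation.
    • Revised the technical requirements and system flow diagram with the latest deployment information.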

Caught while auditing whether everything was updated: the judge-facing compliance
doc still understated the Phoenix-MCP path.

- rapid-agent-compliance.md: status Wired -> Live; section 3 reframed so the
  DEPLOYED path is the live PhoenixMcpConsultant (read + write-back over MCP per
  request, Cloud-SQL-backed Phoenix) and the TableConsultant is the no-endpoint
  fallback.
- evidence/rapid-agent-visual-proof-2026-05-24.md: added a dated Superseded banner
  (kept the historical amber/wired content intact rather than rewriting the record).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

coderabbitai Bot commented Jun 6, 2026

Caution

Review failed

Pull request was closed or merged during review

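Overview

This pull request updates the evidence and compliance documentation to reflect the current state of the Phoenix MCP live audit loop. It marks the earlier amber/wired status description with a superseded note and states explicitly that the deployed audit performs a live per-request MCP round-trip.

Changes

Phoenix MCP live audit documentation

| Layer / File(s) | Summary |
|---|---|
| Historical evidence status note: `docs/evidence/rapid-agent-visual-proof-2026-05-24.md` | Adds a superseded note dated 2026-06-06 at the top of the document, stating that the earlier description of the Phoenix MCP chip status (amber/wired/table prior) has changed, that the currently deployed audit runs the live Phoenix-MCP loop, and that all 5 chips are green/live. |
| Live audit compliance description: `docs/rapid-agent-compliance.md` | Changes the status of requirement #4 (Partner MCP server) from "Wired" to "Live" and states that the deployed audit performs a live MCP round-trip per request against a Cloud-SQL-backed Phoenix MCP. Also updates the §3 architecture diagram to describe the branching logic between the "DEPLOYED" path, PhoenixMcpConsultant (MCP stdio interaction and the add-dataset-examples write-back loop), and the "FALLBACK" path, TableConsultant (used when no Phoenix endpoint is configured). |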

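Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~8 minutes

Related pull requests

  • Two-Weeks-Team/glasshat#39: the main PR that updates the compliance description for the Phoenix MCP live audit loop and its evidence results; this PR builds directly on the Phoenix MCP agent-loop contract that PR #39 added in the same files.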

🐰 The Phoenix chip sparkles, glowing green,
The audit loop dances, running live,
The amber of the past steps aside,
The docs now hold the latest state!
Onward we go, with truthful evidence!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately reflects the core of the change. It states that the 'Phoenix-MCP loop' now runs live and names the affected documents ('compliance + evidence docs') specifically. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the documentation to reflect that the deployed audit now runs a live Phoenix-MCP loop (with read and write-back capabilities against a Cloud-SQL-backed Phoenix on Cloud Run) instead of using a static table prior. The feedback points out that the documentation still references @latest for the Phoenix MCP package instead of the pinned version @4.0.13 used in the code, and contains outdated line numbers for the referenced functions.

| 2 | **Code-owned agent runtime** (rules name "Agent Builder"; the **Arize track** requires a code-owned runtime — *Gemini CLI / Agent Platform SDK / **Google ADK** / Agent Runtime / **Cloud Run***, and states **"Visual Agent Builder alone is insufficient. Direct code instrumentation is required."**) | **Google ADK** orchestrator, OpenInference-instrumented, deployed on **Cloud Run**. No visual Agent Builder app — that path is *explicitly disallowed* for this track. See §2. | `services/pipeline-orchestrator/src/glasshat/pipeline/adk_runtime.py` (`instrument_adk`, `run_via_adk`); engine `…/pipeline/engine.py`; deploy `infra/deploy.sh` | §2 below + `claudedocs/hackathon-source-2026-05-21/03-arize-resources.md` (the rule, quoted) | ✅ Resolved |
| 3 | **Arize partner integration** (OpenInference tracing → Arize/Phoenix) | OpenInference auto-instrumentation → **Arize AX** at `otlp.arize.com`; **one span per agent** (`RubricSynthesizer · BluePlanner · SixHatPanel · Audit · BMADScorer · ReportAssembler`) + per-hat `hat_assess`, all carrying `glasshat.*` attributes | `packages/shared/src/glasshat/shared/tracing.py` → `ArizeTracer` (registers via `arize.otel`, line 68); span sites `…/pipeline/engine.py:115–149` | `uv run python scripts/real_arize_ax_e2e.py`; live run `2b2e29c2` (final 56.93, 4 self-corrections) | ✅ Live |
| 4 | **Partner MCP server** (Phoenix MCP — required by the track) | ADK **`MCPToolset` over stdio** → `npx @arizeai/phoenix-mcp@latest`. The audit's calibration consultant calls the Phoenix MCP **`get-dataset-examples`** tool, parses per-anchor score deltas, and feeds them into the self-correction. See §3. | `…/pipeline/adk_runtime.py` → `build_phoenix_mcp_toolset` (l.31), `PhoenixMcpConsultant.consult` (l.53–96, tool `get-dataset-examples` l.82) | `uv run python scripts/real_e2e.py` (real ADK → Phoenix MCP stdio → pipeline) | ✅ Wired — exercised by e2e (see §3 on deployed vs. live-trace path) |
| 4 | **Partner MCP server** (Phoenix MCP — required by the track) | ADK **`MCPToolset` over stdio** → `npx @arizeai/phoenix-mcp@latest`. The audit's calibration consultant calls the Phoenix MCP **`get-dataset-examples`** tool, parses per-anchor score deltas, and feeds them into the self-correction. See §3. | `…/pipeline/adk_runtime.py` → `build_phoenix_mcp_toolset` (l.31), `PhoenixMcpConsultant.consult` (l.53–96, tool `get-dataset-examples` l.82) | `uv run python scripts/real_e2e.py` (real ADK → Phoenix MCP stdio → pipeline) | ✅ **Live** — the deployed audit does a per-request MCP round-trip (read `get-dataset-examples` + write-back `add-dataset-examples`) against a Cloud-SQL-backed Phoenix on Cloud Run (`PHOENIX_COLLECTOR_ENDPOINT` set). See §3. |

medium

There are two inconsistencies in this row compared to the actual implementation in services/pipeline-orchestrator/src/glasshat/pipeline/adk_runtime.py:

  1. Pinned Version: The documentation mentions npx @arizeai/phoenix-mcp@latest, but adk_runtime.py pins the package to @4.0.13 (_PHOENIX_MCP_PACKAGE = "@arizeai/phoenix-mcp@4.0.13") for supply-chain hardening.
  2. Outdated Line Numbers: The referenced line numbers for build_phoenix_mcp_toolset (l.31), PhoenixMcpConsultant.consult (l.53–96), and get-dataset-examples (l.82) are outdated. They should be updated to l.102, l.158, and l.147 respectively to match the current codebase.

Here is the suggested replacement:

| 4 | **Partner MCP server** (Phoenix MCP — required by the track) | ADK **`MCPToolset` over stdio** → `npx @arizeai/phoenix-mcp@4.0.13`. The audit's calibration consultant calls the Phoenix MCP **`get-dataset-examples`** tool, parses per-anchor score deltas, and feeds them into the self-correction. See §3. | `…/pipeline/adk_runtime.py` → `build_phoenix_mcp_toolset` (l.102), `PhoenixMcpConsultant.consult` (l.158, tool `get-dataset-examples` l.147) | `uv run python scripts/real_e2e.py` (real ADK → Phoenix MCP stdio → pipeline) | ✅ **Live** — the deployed audit does a per-request MCP round-trip (read `get-dataset-examples` + write-back `add-dataset-examples`) against a Cloud-SQL-backed Phoenix on Cloud Run (`PHOENIX_COLLECTOR_ENDPOINT` set). See §3. |
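
Both kinds of drift flagged in this review (an `@latest` reference where the code pins a version, and stale line numbers) could be caught mechanically. The sketch below is one possible approach using only the Python standard library; the helper names `definition_lines` and `unpinned_refs` are hypothetical and do not exist in the repository. It assumes the referenced functions are top-level functions or methods defined directly on top-level classes.

```python
import ast
import re


def definition_lines(source: str) -> dict[str, int]:
    """Map 'func' and 'Class.method' names to the 1-based line of their def."""
    found: dict[str, int] = {}
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.parse(source).body:
        if isinstance(node, func_types):
            found[node.name] = node.lineno
        elif isinstance(node, ast.ClassDef):
            # Only methods defined directly in the class body are recorded.
            for item in node.body:
                if isinstance(item, func_types):
                    found[f"{node.name}.{item.name}"] = item.lineno
    return found


def unpinned_refs(doc: str, package: str) -> list[str]:
    """Return every '<package>@latest' occurrence found in the doc text."""
    pattern = re.escape(package) + r"@latest\b"
    return re.findall(pattern, doc)
```

A docs check might read `adk_runtime.py`, look up `build_phoenix_mcp_toolset` and `PhoenixMcpConsultant.consult` in the result of `definition_lines`, and compare those numbers with the `l.NNN` values quoted in the compliance table, while `unpinned_refs(doc_text, "@arizeai/phoenix-mcp")` should come back empty once the doc cites the pinned version.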

@ComBba ComBba merged commit b69b211 into main Jun 6, 2026
4 of 5 checks passed
@ComBba ComBba deleted the docs/phoenix-mcp-live-compliance branch June 6, 2026 08:12