fix(mcp): surface 401s as an actionable "needs sign-in" state (#3719)#3733
Conversation
|
Warning Review limit reached
More reviews will be available in 13 minutes and 50 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (26)
📝 WalkthroughWalkthroughAdds a distinct ChangesMCP Unauthorized Status (HTTP 401)
Sequence Diagram(s)sequenceDiagram
participant MCP_Server as MCP Server (HTTP)
participant read_response as read_response (client.rs)
participant connections as mcp_registry/connections
participant all_status as all_status / classify_server_status
participant Frontend as InstalledServerDetail (React)
MCP_Server-->>read_response: HTTP 401 + WWW-Authenticate header
read_response->>connections: Err(McpUnauthorizedError { endpoint, resource_metadata })
connections->>connections: is_unauthorized_error? → ConnectFailure { unauthorized: true }
connections->>all_status: snapshot failure records
all_status-->>Frontend: ConnStatus { status: "unauthorized", last_error: None }
Frontend->>Frontend: render auth-required alert + "Sign in" button
Note over Frontend: No raw "HTTP 401" string shown to user
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 602d04949e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@coderabbitai full review |
✅ Action performedFull review finished. |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/openhuman/mcp_registry/connections.rs (1)
503-507:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve typed 401s during connected tool calls.
read_responsenow returnsMcpUnauthorizedErrorfor any HTTP 401, includingtools/call, but this path immediately stringifies it and leaves the stale connection in the live map. If a token expires after connection, status can remainConnectedand the raw unauthorized diagnostic can still surface instead of triggering the sign-in flow.Handle auth expiry before stringifying the tool-call error
- conn.client - .call_tool(tool_name, arguments) - .await - .map(|r| r.raw_result) - .map_err(|e| e.to_string()) + match conn.client.call_tool(tool_name, arguments).await { + Ok(result) => Ok(result.raw_result), + Err(err) if is_unauthorized_error(&err) => { + let removed = { + let mut map = connections().write().await; + map.remove(server_id) + }; + if let Some(conn) = removed { + let _ = conn.client.close_session().await; + } + connect_failures().write().await.insert( + server_id.to_string(), + ConnectFailure { + message: err.to_string(), + auth_required: true, + }, + ); + Err("[mcp-registry] server requires authentication; please sign in again".to_string()) + } + Err(err) => Err(err.to_string()), + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/mcp_registry/connections.rs` around lines 503 - 507, The error handling in the call_tool path stringifies errors with .map_err(|e| e.to_string()), which loses the typed McpUnauthorizedError information that read_response returns for HTTP 401 responses. This prevents the caller from detecting and properly handling 401 auth expiry errors. Instead of stringifying the error, preserve the actual error type so that McpUnauthorizedError can be properly identified and handled by the caller to trigger sign-in flows and remove stale connections from the live map. Modify the error mapping in the conn.client.call_tool() chain to return the error type rather than converting it to a string.app/src/lib/i18n/es.ts (1)
1368-1388:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winTighten the Spanish auth copy.
"conectarse"makes the server the subject here, which reads awkwardly in UI copy. Use second-person phrasing instead, and make the status label a bit more natural.💡 Suggested copy
- 'Este servidor necesita que inicies sesión o agregues un token de acceso antes de poder conectarse. Haz clic en “Iniciar sesión” para autenticarte.', + 'Este servidor necesita que inicies sesión o agregues un token de acceso antes de poder conectarte. Haz clic en “Iniciar sesión” para autenticarte.', - 'mcp.status.unauthorized': 'Inicio de sesión necesario', + 'mcp.status.unauthorized': 'Se requiere iniciar sesión',🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/lib/i18n/es.ts` around lines 1368 - 1388, In the mcp.detail.authRequired translation string, rephrase the Spanish text to use second-person phrasing instead of making the server the grammatical subject, particularly replacing the awkward "conectarse" construction. Additionally, in the mcp.status.unauthorized translation string, adjust the label to sound more natural in Spanish UI copy while maintaining the meaning of login being required.src/openhuman/workflows/README.md (1)
17-28:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate stale
Called bybullets that still describe removed injection flow.This section still references “propagates injected skills” and a per-turn injection point, which conflicts with this PR’s event-driven refresh model and inject-module removal.
Suggested patch
-- `src/openhuman/agent/harness/fork_context.rs` — fork context propagates injected skills. -- `src/openhuman/agent/harness/session/turn.rs` — per-turn injection point. +- `src/openhuman/agent/harness/fork_context.rs` — fork context propagates installed workflow metadata to sub-agent runs. +- `src/openhuman/agent/harness/session/turn/{tools,core}.rs` — event-driven installed-skill refresh and one-shot announcement wiring.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/workflows/README.md` around lines 17 - 28, The "Called by" section in the README.md file contains outdated references to a removed injection flow. Remove or update the two bullets that reference "fork_context propagates injected skills" (from src/openhuman/agent/harness/fork_context.rs) and the per-turn injection point (from src/openhuman/agent/harness/session/turn.rs) since these mechanisms have been replaced by the new event-driven refresh model. Keep the remaining bullets that accurately describe the current skill catalog interactions.
🧹 Nitpick comments (1)
app/src/components/channels/mcp/InstalledServerDetail.test.tsx (1)
282-285: ⚡ Quick winStrengthen the non-leak assertion for OAuth metadata URLs.
Great coverage here; consider also asserting that the
.well-known/oauth-protected-resourcestyle URL is not rendered, since that’s the specific leak pattern described in the auth-401 bug.Suggested test addition
expect(screen.getByText('Sign in needed')).toBeInTheDocument(); expect(screen.getByText(/needs you to sign in or add an access token/i)).toBeInTheDocument(); expect(screen.queryByText(/HTTP 401/i)).not.toBeInTheDocument(); + expect(screen.queryByText(/oauth-protected-resource|\.well-known/i)).not.toBeInTheDocument();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src/components/channels/mcp/InstalledServerDetail.test.tsx` around lines 282 - 285, Strengthen the test assertions by adding a check to verify that the `.well-known/oauth-protected-resource` style OAuth metadata URL is not rendered in the document. After the existing assertions in the test case (which already check for "Sign in needed" text and verify "HTTP 401" is not present), add an additional assertion using screen.queryByText to ensure this specific OAuth metadata URL pattern does not appear, as this is the exact leak pattern described in the auth-401 bug being tested.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/src/components/channels/mcp/InstalledServerDetail.tsx`:
- Around line 259-262: The error display condition in InstalledServerDetail.tsx
allows the error variable to render even when status is 'unauthorized', which
exposes raw error messages that should be hidden in that state. Modify the
conditional check to prevent the error from displaying when status is
'unauthorized' by adding the same status check to the error variable as is
already applied to connStatus?.last_error. Apply the condition status !==
'unauthorized' to the error variable so that both error sources are suppressed
when the unauthorized state is active.
In `@src/openhuman/agent/harness/session/turn/mod.rs`:
- Around line 133-135: The skill announcement message in the frozen catalog
context contains contradictory information: it states newly installed skills
"are in your `## Installed Skills` list" but that list is intentionally frozen
mid-session, which can confuse tool selection. Revise the message text (starting
with the string starting at line 133) to remove the claim that skills are
already visible in the frozen `## Installed Skills` section, and instead focus
on the fact that they are installed and immediately available to use via the
`run_skill` command without needing the block to be unfrozen first.
In `@src/openhuman/mcp_registry/connections.rs`:
- Around line 273-297: The issue is that `connect` updates `LAST_ERRORS` and
`AUTH_REQUIRED` under separate locks, creating a race condition where
`all_status` can observe inconsistent state. Replace the separate locks with a
single atomic failure state structure that stores both the error message and the
failure classification together. In the error handling branch of the `connect`
function (around the `Err(err)` match arm), instead of separately calling
`last_errors().write().await.insert()` and then `auth_required().write().await`
conditionally, insert a single atomic failure state that contains both the error
information and whether it is unauthorized (from `is_unauthorized_error(err)`).
Then update `all_status` (at lines 561-585) to read from this single atomic
structure instead of reading `LAST_ERRORS` and `AUTH_REQUIRED` separately,
ensuring status classification always sees a consistent snapshot of the failure
state.
In `@src/openhuman/skill_runtime/agent/skill_executor/prompt.md`:
- Around line 35-43: The fenced code block example in prompt.md starting with
the "## Completed" section is missing a language identifier, which violates
markdownlint rule MD040. Add the language tag `text` to the opening triple
backticks of this code block by changing ``` to ```text to specify that the
content is plain text output.
---
Outside diff comments:
In `@app/src/lib/i18n/es.ts`:
- Around line 1368-1388: In the mcp.detail.authRequired translation string,
rephrase the Spanish text to use second-person phrasing instead of making the
server the grammatical subject, particularly replacing the awkward "conectarse"
construction. Additionally, in the mcp.status.unauthorized translation string,
adjust the label to sound more natural in Spanish UI copy while maintaining the
meaning of login being required.
In `@src/openhuman/mcp_registry/connections.rs`:
- Around line 503-507: The error handling in the call_tool path stringifies
errors with .map_err(|e| e.to_string()), which loses the typed
McpUnauthorizedError information that read_response returns for HTTP 401
responses. This prevents the caller from detecting and properly handling 401
auth expiry errors. Instead of stringifying the error, preserve the actual error
type so that McpUnauthorizedError can be properly identified and handled by the
caller to trigger sign-in flows and remove stale connections from the live map.
Modify the error mapping in the conn.client.call_tool() chain to return the
error type rather than converting it to a string.
In `@src/openhuman/workflows/README.md`:
- Around line 17-28: The "Called by" section in the README.md file contains
outdated references to a removed injection flow. Remove or update the two
bullets that reference "fork_context propagates injected skills" (from
src/openhuman/agent/harness/fork_context.rs) and the per-turn injection point
(from src/openhuman/agent/harness/session/turn.rs) since these mechanisms have
been replaced by the new event-driven refresh model. Keep the remaining bullets
that accurately describe the current skill catalog interactions.
---
Nitpick comments:
In `@app/src/components/channels/mcp/InstalledServerDetail.test.tsx`:
- Around line 282-285: Strengthen the test assertions by adding a check to
verify that the `.well-known/oauth-protected-resource` style OAuth metadata URL
is not rendered in the document. After the existing assertions in the test case
(which already check for "Sign in needed" text and verify "HTTP 401" is not
present), add an additional assertion using screen.queryByText to ensure this
specific OAuth metadata URL pattern does not appear, as this is the exact leak
pattern described in the auth-401 bug being tested.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5647486a-c129-4a46-ba0a-dcce69727aca
📒 Files selected for processing (43)
app/src/components/channels/mcp/InstalledServerDetail.test.tsxapp/src/components/channels/mcp/InstalledServerDetail.tsxapp/src/components/channels/mcp/InstalledServerList.tsxapp/src/components/channels/mcp/McpConnectionHealthToolbar.tsxapp/src/components/channels/mcp/McpServersTab.tsxapp/src/components/channels/mcp/McpStatusBadge.test.tsxapp/src/components/channels/mcp/McpStatusBadge.tsxapp/src/components/channels/mcp/types.tsapp/src/lib/i18n/ar.tsapp/src/lib/i18n/bn.tsapp/src/lib/i18n/de.tsapp/src/lib/i18n/en.tsapp/src/lib/i18n/es.tsapp/src/lib/i18n/fr.tsapp/src/lib/i18n/hi.tsapp/src/lib/i18n/id.tsapp/src/lib/i18n/it.tsapp/src/lib/i18n/ko.tsapp/src/lib/i18n/pl.tsapp/src/lib/i18n/pt.tsapp/src/lib/i18n/ru.tsapp/src/lib/i18n/zh-CN.tssrc/core/event_bus/events.rssrc/core/event_bus/events_tests.rssrc/openhuman/agent/harness/session/builder/setters.rssrc/openhuman/agent/harness/session/tests.rssrc/openhuman/agent/harness/session/turn/core.rssrc/openhuman/agent/harness/session/turn/mod.rssrc/openhuman/agent/harness/session/turn/tools.rssrc/openhuman/agent/harness/session/turn_tests.rssrc/openhuman/agent/harness/session/types.rssrc/openhuman/agent_registry/agents/orchestrator/prompt.mdsrc/openhuman/agent_registry/agents/orchestrator/prompt.rssrc/openhuman/mcp_client/client.rssrc/openhuman/mcp_client/mod.rssrc/openhuman/mcp_registry/connections.rssrc/openhuman/mcp_registry/types.rssrc/openhuman/skill_runtime/agent/skill_executor/prompt.mdsrc/openhuman/workflows/README.mdsrc/openhuman/workflows/inject.rssrc/openhuman/workflows/mod.rssrc/openhuman/workflows/ops_create.rssrc/openhuman/workflows/ops_install.rs
💤 Files with no reviewable changes (1)
- src/openhuman/workflows/inject.rs
…mansai#3719) MCP HTTP-remote servers that reject the connect handshake with HTTP 401 were collapsed into a generic `Error` carrying the raw transport string (incl. the internal OAuth metadata URL), leaving the user with no usable path forward and no tools. Distinguish auth failures from generic ones end to end: - core: typed `McpUnauthorizedError` returned from the transport on 401 (instead of a bare `bail!` string) so the connect path classifies by type, not text; a new `AUTH_REQUIRED` set drives a `ServerStatus:: Unauthorized` in `all_status` and intentionally withholds the raw error (no leaked OAuth URL). - ui: amber "Sign in needed" badge + a graceful, actionable notice replacing the raw 401; the primary button relabels to "Sign in" and opens the existing auth modal (browser OAuth / token entry); status polling now continues for unauthorized/error/connecting servers so the state recovers after sign-in or a supervisor reconnect; the health toolbar buckets unauthorized under "disconnected" so "Retry all" never blindly re-401s it. - i18n: 3 new keys translated across all 14 locales. Tests: new Rust classification test + types coverage; new Vitest cases for the unauthorized badge/notice rendering.
…nches Factor the per-server status decision out of `all_status` into a pure `classify_server_status` so the priority order (Disabled > Connected > Unauthorized > Error > Disconnected) is unit-testable without a live connection registry or DB — covers the changed lines for the diff coverage gate.
…en unauthorized
- connections: collapse the separate LAST_ERRORS (message) and AUTH_REQUIRED
(flag) stores into a single `ConnectFailure { message, unauthorized }` record
under one lock, so a concurrent `all_status` can never read a torn snapshot
(message updated but auth flag stale). `connect` writes once; `all_status`
reads one snapshot.
- InstalledServerDetail: suppress the raw-error block entirely while
`unauthorized` (covers the local `error` state too, e.g. a reconfigure
reconnect that re-hits the 401), so the amber sign-in notice is the only
message shown and no raw transport/OAuth text can re-surface.
602d049 to
01c820f
Compare
|
@coderabbitai I addressed the review and rebased the branch to scope this PR to the #3719 MCP change only:
|
|
I'll kick off a fresh full review against the rebased diff. 🐇 ✨ ✅ Action performedFull review finished. |
oxoxDev
left a comment
There was a problem hiding this comment.
Approve
Walkthrough. A remote MCP server rejecting the connect handshake with HTTP 401 was stranded in a generic Error state and rendered the raw transport string (leaking the internal oauth-protected-resource URL) with no re-auth path. This PR makes 401 a first-class, recoverable Unauthorized state: typed McpUnauthorizedError from the transport, classified via downcast (not message text), withheld raw message, plus an amber "Sign in needed" badge + re-auth button + extended polling, with 3 i18n keys across all locales.
Prior CodeRabbit concerns are STALE — already fixed in head:
- "separate locks" race — no longer two maps; single
ConnectFailure { message, unauthorized }under one lock, written in one insert / read in one snapshot (ConnectFailureDisabled > Connected > Unauthorized > Error > Disconnected). - "raw error leak at :259" — error block now gated off entirely when
unauthorized(dedicated amber notice renders instead); new test asserts noHTTP 401text leaks.
Low-sev nitpicks (non-blocking):
needs_auth()accessor appears unused by surfaced paths — confirm a caller exists or drop the dead public API.- The unauthorized gate also hides a genuine non-auth reconfigure error if the server happens to be
unauthorized— acceptable, and called out in a code comment.
Question. For a pure-OAuth server (no declared env fields) — does the re-auth modal present a browser-OAuth affordance, or only a token field? That's the boundary between "401 is actionable" (this PR's stated scope, met) and "resolves end-to-end" (the deferred ACs). Not a blocker.
Scope (make 401 actionable + stop the leak) is delivered with typed classification and a leak-asserting test. CI green.
Summary
Unauthorized("needs sign-in") state instead of a genericErrorcarrying the raw transport string.oauth-protected-resourceURL) is no longer surfaced to users; the UI shows a localized, plain-language notice.unauthorized/error/connectingservers, so state recovers after sign-in or a background-supervisor reconnect (no manual refresh).unauthorizedunder "disconnected" so "Retry all" never blindly re-401s an auth-gated server.Problem
Per #3719, connecting an OAuth-protected MCP server (e.g. Brave Search) showed an Error badge immediately with a raw string —
MCP unauthorized forhttps://brave.run.tools` (HTTP 401; resource metadata: …/.well-known/oauth-protected-resource)— no tools, no re-auth prompt, and a "Connect" button that looped back to the same failure. Root cause: a 401 was indistinguishable from any other connect failure (ServerStatus` had no auth variant), so the raw transport error was passed straight through to the UI and the server was stranded in a dead-end.Solution
mcp_client/client.rs):read_responsereturns a typedMcpUnauthorizedErroron HTTP 401 instead ofbail!-ing a bare string, so the connect path classifies by type (chaindowncast), not by message text.mcp_registry/connections.rs): a separateAUTH_REQUIREDset records 401s;all_statusdelegates to a pure, fully-testedclassify_server_status(priorityDisabled > Connected > Unauthorized > Error > Disconnected) and withholds the raw error for theUnauthorizedcase.mcp_registry/types.rs): newServerStatus::Unauthorized(serializes"unauthorized").ServerStatusunion + 3 new i18n keys across all 14 locales.Scope note: the OAuth detection internals (detect↔begin DCR requirement mismatch;
discover_authorizationmislabeling OAuth servers astokenon a metadata-fetch failure) are intentionally out of scope here — this PR makes the 401 actionable and removes the raw-error dead-end; hardening detection is a tracked follow-up.Submission Checklist
classify_server_statusbranch coverage +ServerStatus::as_str; Vitest: unauthorized badge + graceful-notice rendering (asserts no rawHTTP 401leaks).pnpm test(MCP suite, 68 passing) and targetedcargo testlocally.## Related— no matrix feature IDs apply to this bug fix.Closes #3719below.Impact
"unauthorized") on the existingmcp_clients_statusRPC output — additive and backward-compatible (older clients fall back to the disconnected style).resource_metadataURL into user-facing error text.Related
discover_authorizationfallback when protected-resource/auth-server metadata fetch fails).AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/3719-mcp-unreliable-auth602d04949eaddf0a942c034fe8a5c508b9158e8aValidation Run
pnpm --filter openhuman-app format:checkpnpm typecheckcargo test mcp_registry::connections(8),mcp_client::client(15)cargo check --libcleanapp/src-taurifiles changedValidation Blocked
command:pnpm rust:check(cargo check --manifest-path app/src-tauri/Cargo.toml) — pre-push hookerror:failed to load source for dependency \tauri`— the vendoredapp/src-tauri/vendor/tauri-cef` git submodule is not initialized in this worktreeimpact:none on this change — it touches 0app/src-taurifiles; the core lib + frontend compile clean. Pre-push hook was bypassed with--no-verifyper the repo's sanctioned policy for pre-existing unrelated breakage; CI runs the full Tauri check with the submodule present.Behavior Changes
Unauthorizedstatus (was genericError).Parity Contract
Errorwith their message; connected/disconnected/disabled unchanged.classify_server_statuspreserves prior ordering; unknown statuses fall back to the disconnected style in the UI.Duplicate / Superseded PR Handling
Summary by CodeRabbit
New Features
Localization