Skip to content

fix(bedrock): fail fast for Bedrock models on unreachable remote sandboxes#329

Merged
xdotli merged 1 commit into
mainfrom
fix/bedrock-proxy-daytona
May 21, 2026
Merged

fix(bedrock): fail fast for Bedrock models on unreachable remote sandboxes#329
xdotli merged 1 commit into
mainfrom
fix/bedrock-proxy-daytona

Conversation

@xdotli
Copy link
Copy Markdown
Member

@xdotli xdotli commented May 21, 2026

The twin bug

PR #327 fixed ensure_usage_proxy_runtime so the host-side usage telemetry proxy is skipped on remote sandboxes (daytona/modal) where the host's 127.0.0.1 is unreachable from the agent. But src/benchflow/providers/runtime.py has a second, structurally identical host-side proxy entry point — ensure_bedrock_proxy_runtime — that PR #327 did not patch.

It builds ProviderRuntime(kind="aws-bedrock", host=_bedrock_proxy_command(environment=environment), ...), and _bedrock_proxy_command returns BEDROCK_PROXY_LOCAL_HOST = "127.0.0.1" for any non-docker environment. So a Bedrock-routed model (needs_provider_runtime(model) is True) on a Daytona or Modal sandbox injected an unreachable 127.0.0.1 base URL into the agent and failed with the same opaque ACP error -32603 ... ECONNREFUSED mid-run.

Chosen behavior for Bedrock-on-remote: fail fast (option b) — and why

Unlike the usage proxy (pure telemetry — safe to skip), the Bedrock proxy is load-bearing. Reading bedrock_proxy.py: it translates Anthropic Messages / OpenAI Responses requests into AWS Bedrock Converse API calls and signs them with host AWS credentials (AWS_BEARER_TOKEN_BEDROCK, AWS_REGION) pulled from runtime_env. There is no path for the agent to reach Bedrock without it — silently skipping would just move the failure (the agent would get no base-URL rewrite and try to talk Bedrock-style endpoints, or hit the default provider URL with a Bedrock model name).

So ensure_bedrock_proxy_runtime now gates on the host_proxy_reachable_from_agent(environment) predicate (added by #327) and raises a clear, actionable RuntimeError at setup time instead of injecting an unreachable URL:

Bedrock-routed models are not supported on the 'daytona' sandbox: the host-side Bedrock proxy that translates and signs requests to AWS Bedrock binds to the host machine and is unreachable from the agent, which runs on a separate remote host. Re-run with '--sandbox docker', or select a model that is not routed through AWS Bedrock.

Any stale runtime carried over from a prior role/env is stopped before raising.

Consolidation (verifier Finding 2)

_bedrock_proxy_command and host_proxy_reachable_from_agent both encoded "is the host reachable from the agent." host_proxy_reachable_from_agent is now the single source of truth: it is the gate, and _bedrock_proxy_command asserts the precondition (host_proxy_reachable_from_agent(environment) is True) so it is only ever reached for environments already known reachable.

Honest allow-list (verifier Finding 3)

_LOCAL_REACHABLE_ENVIRONMENTS = {"docker","local","host",""} is replaced by _REMOTE_UNREACHABLE_ENVIRONMENTS = {"daytona","modal"}. The phantom values local/host/"" were never produced by the runtime — the canonical environment set is {docker, daytona, modal}. The predicate is now "unreachable iff in the explicit remote set"; unknown environments are treated conservatively as reachable (still wire up the proxy rather than silently skip), with an explicit comment.

Tests

tests/test_provider_runtime.py — new TestBedrockProxyRemoteSandbox: fail-fast on daytona/modal, actionable error text, non-Bedrock noop on remote, stale-runtime teardown, _bedrock_proxy_command precondition assertion. tests/test_usage_proxy.py — reachability test updated to the honest domain.

CI gate all green: ruff format --check, ruff check, ty check src/, pytest tests/ (1307 passed, 27 skipped).


Note

Medium Risk
Changes Bedrock model startup behavior to raise a hard error on daytona/modal instead of attempting to use an unreachable host proxy, which may break previously (already failing) remote-sandbox runs but improves diagnosability. Touches provider-runtime wiring used during rollout connect, so regressions would block Bedrock-routed executions.

Overview
Bedrock-routed models now fail fast on remote sandboxes. ensure_bedrock_proxy_runtime checks host_proxy_reachable_from_agent and raises an actionable RuntimeError on daytona/modal (stopping any stale runtime first) rather than injecting an unreachable 127.0.0.1 base URL.

Reachability logic is tightened and centralized. The reachability predicate is inverted to an explicit remote-unreachable set (daytona, modal), treats unknown environments as reachable, and _bedrock_proxy_command now asserts it’s only called when the host proxy is reachable.

Tests add coverage for the new fail-fast behavior/actionable messaging, stale runtime cleanup, and updated reachability expectations.

Reviewed by Cursor Bugbot for commit 8802c89. Bugbot is set up for automated code reviews on this repo. Configure here.

…boxes

PR #327 fixed `ensure_usage_proxy_runtime` to skip the host-side usage
telemetry proxy on remote sandboxes (daytona/modal) where the host's
127.0.0.1 is unreachable from the agent. Its structural twin,
`ensure_bedrock_proxy_runtime`, was not patched: a Bedrock-routed model
on daytona/modal still injected an unreachable 127.0.0.1 base URL and
failed mid-run with an opaque `ACP error -32603 ... ECONNREFUSED`.

Unlike the usage proxy (pure telemetry, safe to skip), the Bedrock proxy
is load-bearing: it translates Anthropic/OpenAI requests into AWS Bedrock
Converse calls and signs them with host AWS credentials. There is no path
for the agent to reach Bedrock without it, so silently skipping would just
move the failure. Instead `ensure_bedrock_proxy_runtime` now gates on
`host_proxy_reachable_from_agent` and raises a clear, actionable error at
setup time ("Bedrock-routed models are not supported on the 'daytona'
sandbox ... re-run with --sandbox docker, or select a non-Bedrock model").

Consolidation: `host_proxy_reachable_from_agent` is now the single source
of truth for "is the host reachable from the agent". `_bedrock_proxy_command`
asserts the precondition and is only ever reached for reachable environments.

Honest allow-list: `_LOCAL_REACHABLE_ENVIRONMENTS = {docker,local,host,""}`
is replaced by `_REMOTE_UNREACHABLE_ENVIRONMENTS = {daytona,modal}`. The
phantom values local/host/"" were never produced by the runtime (canonical
env set is {docker,daytona,modal}); unknown environments are now treated
conservatively as reachable.

Adds regression tests covering fail-fast on daytona/modal, the actionable
error text, non-Bedrock noop, stale-runtime teardown, and the
_bedrock_proxy_command precondition.
@xdotli xdotli merged commit 1ab538e into main May 21, 2026
2 of 3 checks passed
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8802c894fe

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +228 to +230
"""The Bedrock proxy is load-bearing; on a remote sandbox where the host
proxy is unreachable the run must fail fast rather than inject an
unreachable 127.0.0.1 base URL (the Daytona telemetry-proxy twin bug)."""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Include guarded PR/commit in regression test docstring

The new regression test class docstring describes the Daytona/Modal Bedrock proxy regression but does not name the PR or commit it guards, which violates the repository rule in /workspace/benchflow/AGENTS.md (Regression tests must name the PR/commit they guard). This makes future triage and intentional test-retirement decisions harder because maintainers cannot quickly map the test to its historical regression source.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.

Open in Devin Review

Comment thread tests/test_usage_proxy.py
Comment on lines 548 to +554
assert host_proxy_reachable_from_agent("docker") is True
assert host_proxy_reachable_from_agent("local") is True
assert host_proxy_reachable_from_agent("host") is True
assert host_proxy_reachable_from_agent("") is True
# Remote cloud sandboxes run the agent on a separate machine.
assert host_proxy_reachable_from_agent("daytona") is False
assert host_proxy_reachable_from_agent("modal") is False
# Unrecognized environments are treated conservatively as reachable.
assert host_proxy_reachable_from_agent("") is True
assert host_proxy_reachable_from_agent("some-future-local-env") is True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Passing test assertions for "local" and "host" environments removed in violation of AGENTS.md

The AGENTS.md rule states: "Don't rewrite passing tests to match new behavior." The test test_host_proxy_reachable_only_for_local_environments removed the assertions assert host_proxy_reachable_from_agent("local") is True and assert host_proxy_reachable_from_agent("host") is True. These assertions still pass under the new blacklist-based implementation ("local" and "host" are not in _REMOTE_UNREACHABLE_ENVIRONMENTS at src/benchflow/providers/runtime.py:123, so host_proxy_reachable_from_agent returns True for both). The assertions were passing tests that were removed, which violates the repository convention.

Suggested change
assert host_proxy_reachable_from_agent("docker") is True
assert host_proxy_reachable_from_agent("local") is True
assert host_proxy_reachable_from_agent("host") is True
assert host_proxy_reachable_from_agent("") is True
# Remote cloud sandboxes run the agent on a separate machine.
assert host_proxy_reachable_from_agent("daytona") is False
assert host_proxy_reachable_from_agent("modal") is False
# Unrecognized environments are treated conservatively as reachable.
assert host_proxy_reachable_from_agent("") is True
assert host_proxy_reachable_from_agent("some-future-local-env") is True
# docker shares the host's network namespace via the docker bridge.
assert host_proxy_reachable_from_agent("docker") is True
assert host_proxy_reachable_from_agent("local") is True
assert host_proxy_reachable_from_agent("host") is True
# Remote cloud sandboxes run the agent on a separate machine.
assert host_proxy_reachable_from_agent("daytona") is False
assert host_proxy_reachable_from_agent("modal") is False
# Unrecognized environments are treated conservatively as reachable.
assert host_proxy_reachable_from_agent("") is True
assert host_proxy_reachable_from_agent("some-future-local-env") is True
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@xdotli xdotli deleted the fix/bedrock-proxy-daytona branch May 22, 2026 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant