Sandbox backend registry with lazy-secret Daytona provider #45
Replace the single-slot sandbox backend with a provider registry: the harness validates a supported set + a default at startup, and per-agent selection errors if a requested provider isn't supported. Remote providers record only a secret-name reference at startup and resolve the credential from the secret store lazily on first use (Daytona reads DAYTONA_API_KEY / DAYTONA_ORGANIZATION_ID / DAYTONA_TARGET), so the harness reads zero environment variables. Backends build without holding the cache lock, and duplicate providers are rejected at startup. Adds the Daytona REST backend under sandbox_provider/ with a typed sandbox-state enum (reuse a sandbox only when runnable; start only durably-stopped states), plus hermetic wiremock tests for it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
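The startup validation described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `ProviderKind`, `RegistryError`, and `ProviderRegistry` are hypothetical names, and the real registry presumably stores backend builders rather than bare identifiers.

```rust
use std::collections::HashSet;

/// Hypothetical provider identifiers (the variant names come from the PR
/// summary; the type itself is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ProviderKind {
    LocalProcess,
    Daytona,
}

/// Hypothetical error type covering the three failure cases named above.
#[derive(Debug, PartialEq, Eq)]
pub enum RegistryError {
    DuplicateProvider(ProviderKind),
    DefaultNotSupported(ProviderKind),
    ProviderNotSupported(ProviderKind),
}

#[derive(Debug)]
pub struct ProviderRegistry {
    supported: HashSet<ProviderKind>,
    default: ProviderKind,
}

impl ProviderRegistry {
    /// Validate the supported set and the default once, at startup.
    pub fn new(supported: &[ProviderKind], default: ProviderKind) -> Result<Self, RegistryError> {
        let mut set = HashSet::new();
        for &kind in supported {
            // `insert` returns false when the value was already present.
            if !set.insert(kind) {
                return Err(RegistryError::DuplicateProvider(kind));
            }
        }
        if !set.contains(&default) {
            return Err(RegistryError::DefaultNotSupported(default));
        }
        Ok(Self { supported: set, default })
    }

    /// Per-agent selection: fall back to the default, error on unsupported providers.
    pub fn select(&self, requested: Option<ProviderKind>) -> Result<ProviderKind, RegistryError> {
        match requested {
            None => Ok(self.default),
            Some(kind) if self.supported.contains(&kind) => Ok(kind),
            Some(kind) => Err(RegistryError::ProviderNotSupported(kind)),
        }
    }
}
```

With this shape, a harness started with only `LocalProcess` would reject an agent that asks for `Daytona` at selection time rather than at sandbox creation.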
| @@ -0,0 +1,7 @@ | |||
| //! Per-provider [`crate::sandbox::ManagedSandboxBackend`] implementations, | |||
I'm going to try adding support for my exe.dev VMs which I love so dearly, just to see how well the pattern fits. they're much more durable than all of the ephemeral sandbox providers, but it'd be great to be able to support that too.
| toolbox_url: crate::DEFAULT_DAYTONA_TOOLBOX_URL.to_string(), | ||
| api_key_secret: "DAYTONA_API_KEY".to_string(), | ||
| organization_id_secret: Some("DAYTONA_ORGANIZATION_ID".to_string()), | ||
| target_secret: Some("DAYTONA_TARGET".to_string()), |
note: I realize these technically aren't "secrets", but it felt clean to have these come from the store so they are configured the same way. we can alternatively take them in some other form of configuration.
btw, we may want to have a config file eventually for exo, rather than command line flags for everything.
that's what bindings are for. we should add sandbox provider bindings if we need to
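For context on this thread, the "record a secret name at startup, resolve on first use" behavior from the PR description might look something like the sketch below. `SecretStore`, `MapStore`, and `LazySecret` are hypothetical names introduced for illustration; the real secret store's API is not shown in this excerpt.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical secret-store interface (an assumption, not the PR's API).
pub trait SecretStore {
    fn get(&self, name: &str) -> Option<String>;
}

/// In-memory stand-in for the store, used only for illustration.
pub struct MapStore(pub HashMap<String, String>);

impl SecretStore for MapStore {
    fn get(&self, name: &str) -> Option<String> {
        self.0.get(name).cloned()
    }
}

/// Holds only the secret's name at construction; the value is fetched on first use.
pub struct LazySecret {
    name: String,
    value: OnceLock<String>,
}

impl LazySecret {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string(), value: OnceLock::new() }
    }

    /// Resolve the secret, caching the value after the first successful lookup.
    /// A failed lookup is not cached, so a later call can still succeed.
    pub fn resolve(&self, store: &dyn SecretStore) -> Result<&str, String> {
        if let Some(cached) = self.value.get() {
            return Ok(cached.as_str());
        }
        let fetched = store
            .get(&self.name)
            .ok_or_else(|| format!("secret {} is not set", self.name))?;
        Ok(self.value.get_or_init(|| fetched).as_str())
    }
}
```

Because nothing is read at construction time, a provider built this way can be registered before its credentials exist, and a missing secret only surfaces when the provider is actually used.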
btw – once PR #21 is reviewed (and approved), I'll separately work on reconciling the container lifecycles to be the same across the different backends. there are a few common states to consider: right now, the pattern is a little different for auto-stop interval. but I want to add snapshotting to daytona in a followup PR, and there I'll also reconcile behavior to match. once that is done, I'll add the adapter to make teleportation work!! (requires a small adapter b/c daytona doesn't natively produce docker tarballs from its snapshots)
| // `snapshot` is a pre-registered snapshot name, not an image ref; only | ||
| // the restore path sets it, so fresh creates use Daytona's default image. | ||
| let body = DaytonaCreateRequest { | ||
| snapshot: snapshot_name.map(str::to_string), |
i think if the snapshot is empty, we need to fall back to the request.spec.image right? Otherwise when is the image ever used?
| // (starting a running/mid-transition sandbox would race). | ||
| Some(existing) if existing.is_reusable() => { | ||
| if existing.needs_start() { | ||
| self.start_sandbox(&existing.id).await?; |
do we need to wait until it finishes starting?
| // Daytona's exec endpoint is request/response, not streaming: run the | ||
| // command, then hand back already-populated stdout/stderr/wait. stdin | ||
| // goes to a sink — this endpoint doesn't accept piped input. | ||
| let output = |
maybe we throw a runtime error here instead? i'm neutral
| } | ||
| async fn create_sandbox(&self, request: CreateSandboxRequest) -> Result<SandboxId> { | ||
| let _guard = self.harness.inner.write_lock.lock().await; |
not new code but maybe we shouldn't lock the whole exoharness on this?
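One way to narrow lock scope, in the spirit of the PR's note that backends build without holding the cache lock, is to hold the lock only for the lookup and the final insert. The sketch below uses a synchronous `std::sync::Mutex` for brevity, whereas the harness code shown uses an async lock; `BackendCache` and `get_or_build` are hypothetical names, and this is not the harness's actual implementation.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical cache of built backends, keyed by provider name.
pub struct BackendCache<B> {
    built: Mutex<HashMap<String, Arc<B>>>,
}

impl<B> BackendCache<B> {
    pub fn new() -> Self {
        Self { built: Mutex::new(HashMap::new()) }
    }

    /// Return the cached backend, or build one with no lock held.
    pub fn get_or_build<F>(&self, key: &str, build: F) -> Arc<B>
    where
        F: FnOnce() -> B,
    {
        // Short critical section: the guard is dropped at the end of this statement.
        let existing = self.built.lock().unwrap().get(key).cloned();
        if let Some(found) = existing {
            return found;
        }

        // Potentially slow construction happens outside the lock.
        let candidate = Arc::new(build());

        // Re-lock only to publish. If another caller won the race, keep
        // theirs and drop our candidate.
        let mut guard = self.built.lock().unwrap();
        let shared = guard.entry(key.to_string()).or_insert(candidate).clone();
        shared
    }
}
```

The trade-off is that two concurrent callers may both build a backend and one result is discarded, which is usually acceptable when construction has no side effects.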
Summary
Replaces the single-slot sandbox-backend model with a provider registry:
… LocalProcess, and Daytona.
… exo secret set (DAYTONA_API_KEY, DAYTONA_ORGANIZATION_ID, DAYTONA_TARGET) with no code/flag changes.
Adds the Daytona REST backend under crates/exoharness/src/sandbox_provider/, with a typed sandbox-state enum (reuse a sandbox only when runnable; start only durably-stopped states; replace terminal/error ones). Backends build without holding the cache lock, and duplicate providers are rejected at startup.
This supersedes #22, reworked to fit the registry paradigm and addressing its review feedback (secrets-not-env, move out of the crate root, typed states, doc links).
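The typed sandbox-state rules in the summary could be expressed along these lines. The method names `is_reusable` and `needs_start` appear in the diff excerpts above; the variant names, the `is_terminal` helper, and the exact grouping of states are assumptions made for this sketch (in particular, "runnable" is taken here to mean started or durably stopped).

```rust
/// Hypothetical state set; the real enum's variants are not shown in this excerpt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxState {
    Started,
    Stopped,
    Starting,
    Stopping,
    Error,
    Destroyed,
}

impl SandboxState {
    /// Reuse only sandboxes that are running or can be brought back up.
    pub fn is_reusable(self) -> bool {
        matches!(self, SandboxState::Started | SandboxState::Stopped)
    }

    /// Only a durably stopped sandbox gets a start call; starting a running
    /// or mid-transition sandbox would race.
    pub fn needs_start(self) -> bool {
        matches!(self, SandboxState::Stopped)
    }

    /// Terminal or error states are replaced rather than reused.
    pub fn is_terminal(self) -> bool {
        matches!(self, SandboxState::Error | SandboxState::Destroyed)
    }
}
```

In this sketch the transient states (`Starting`, `Stopping`) are neither reusable nor terminal; how the caller treats them is left open here.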
Test plan
cargo build --workspace --all-features clean; clippy clean.
cargo test --workspace: all unit/contract tests pass, including the lazy-secret resolution tests.
cargo test -p exo --test daytona_backend (wiremock): 12 tests covering find-or-create routing, label query shape, stop-not-delete, snapshot manifest, toolbox-vs-control-plane host routing, and transient/dead-state handling.
… exo secret set, the same conversation lazily resolves the secret and provisions a real remote sandbox.

🤖 Generated with Claude Code