Sandbox backend registry with lazy-secret Daytona provider by akrentsel · Pull Request #45 · ankrgyl/exo

akrentsel · 2026-06-04T02:17:33Z

Summary

Replaces the single-slot sandbox-backend model with a provider registry:

The harness starts with a set of supported providers plus one default. The CLI default registers the OS-local container backend (Apple on macOS, Docker on Linux), LocalProcess, and Daytona.
Per-agent/per-conversation selection picks a provider; unset → default; specified-but-unsupported → error at use time.
Two-tier secret model — prespecify the reference, deref lazily. Remote providers record only a secret-name reference at startup (touching zero keys). The credential is read/decrypted from the secret store on first use, then the backend is cached. Missing secret → clear error at use time. The harness reads zero environment variables.
Daytona is fully configurable via exo secret set (DAYTONA_API_KEY, DAYTONA_ORGANIZATION_ID, DAYTONA_TARGET) with no code/flag changes.
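
As a rough illustration of the points above, here is a minimal, std-only sketch of a registry that records only secret names at startup, rejects duplicates, and dereferences the secret on first use. All type and function names (`SecretStore`, `ProviderSpec`, `Registry`, `backend`) are hypothetical stand-ins, not the PR's actual API, and the in-memory store only imitates the real encrypted secret store.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical in-memory stand-in for the secret store.
#[derive(Default)]
pub struct SecretStore {
    values: Mutex<HashMap<String, String>>,
}

impl SecretStore {
    pub fn set(&self, name: &str, value: &str) {
        self.values.lock().unwrap().insert(name.to_string(), value.to_string());
    }
    pub fn get(&self, name: &str) -> Option<String> {
        self.values.lock().unwrap().get(name).cloned()
    }
}

/// A built backend; here it only remembers the credential it was built with.
#[derive(Debug, PartialEq)]
pub struct Backend {
    pub provider: String,
    pub credential: Option<String>,
}

/// What gets recorded at startup: at most a secret *name*, never a value.
pub struct ProviderSpec {
    pub name: String,
    pub api_key_secret: Option<String>,
}

pub struct Registry {
    specs: HashMap<String, ProviderSpec>,
    default: String,
    cache: Mutex<HashMap<String, Arc<Backend>>>,
}

impl Registry {
    /// Validates the supported set and the default; touches no secrets.
    pub fn new(specs: Vec<ProviderSpec>, default: &str) -> Result<Self, String> {
        let mut map = HashMap::new();
        for spec in specs {
            let name = spec.name.clone();
            if map.insert(name.clone(), spec).is_some() {
                return Err(format!("duplicate provider: {name}"));
            }
        }
        if !map.contains_key(default) {
            return Err(format!("default provider not supported: {default}"));
        }
        Ok(Self { specs: map, default: default.to_string(), cache: Mutex::new(HashMap::new()) })
    }

    /// Unset selection falls back to the default; unsupported or
    /// secret-less providers fail here, at use time.
    pub fn backend(&self, requested: Option<&str>, secrets: &SecretStore) -> Result<Arc<Backend>, String> {
        let name = requested.unwrap_or(self.default.as_str());
        let spec = self.specs.get(name).ok_or_else(|| format!("unsupported provider: {name}"))?;

        // The guard is dropped at the end of this statement.
        let cached = self.cache.lock().unwrap().get(name).cloned();
        if let Some(hit) = cached {
            return Ok(hit);
        }

        // Build without holding the cache lock; the secret is read only now.
        let credential = match &spec.api_key_secret {
            Some(secret) => Some(
                secrets.get(secret).ok_or_else(|| format!("missing secret: {secret}"))?,
            ),
            None => None,
        };
        let built = Arc::new(Backend { provider: name.to_string(), credential });

        // If another caller raced us, keep whichever backend was cached first.
        let mut cache = self.cache.lock().unwrap();
        Ok(Arc::clone(cache.entry(name.to_string()).or_insert(built)))
    }
}
```

In this sketch a failed build is not cached, so a request that errors with a missing secret can succeed on a later call once the secret has been set, which matches the behavior described in the test plan.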

Adds the Daytona REST backend under crates/exoharness/src/sandbox_provider/, with a typed sandbox-state enum (reuse a sandbox only when runnable; start only durably-stopped states; replace terminal/error ones). Backends build without holding the cache lock, and duplicate providers are rejected at startup.
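
The reuse/start/replace rule can be sketched as a small typed enum. The method names `is_reusable` and `needs_start` appear in the diff; the variant names, the `Action` type, the `plan` function, and the choice to wait on transient states are illustrative assumptions, not the PR's actual definitions or Daytona's real state strings.

```rust
/// Illustrative sandbox states; the real enum may have more or different variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxState {
    Started,
    Stopped,
    Starting,
    Stopping,
    Error,
    Destroyed,
}

impl SandboxState {
    /// Runnable now, or durably stopped and therefore safe to start.
    pub fn is_reusable(self) -> bool {
        matches!(self, Self::Started | Self::Stopped)
    }

    /// Only durably-stopped sandboxes get a start call.
    pub fn needs_start(self) -> bool {
        matches!(self, Self::Stopped)
    }

    /// Terminal or error states are replaced rather than reused.
    pub fn is_terminal(self) -> bool {
        matches!(self, Self::Error | Self::Destroyed)
    }
}

/// Hypothetical decision type for the find-or-create path.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Create,
    Reuse,
    StartThenReuse,
    Replace,
    Wait, // assumption: transient states are waited on, not started
}

pub fn plan(existing: Option<SandboxState>) -> Action {
    match existing {
        None => Action::Create,
        Some(s) if s.is_reusable() && s.needs_start() => Action::StartThenReuse,
        Some(s) if s.is_reusable() => Action::Reuse,
        Some(s) if s.is_terminal() => Action::Replace,
        Some(_) => Action::Wait,
    }
}
```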

This supersedes #22, reworked to fit the registry paradigm and addressing its review feedback (secrets-not-env, move out of the crate root, typed states, doc links).

Test plan

cargo build --workspace --all-features clean; clippy clean.
cargo test --workspace — all unit/contract tests pass, including the lazy-secret resolution tests.
New hermetic cargo test -p exo --test daytona_backend (wiremock) — 12 tests covering find-or-create routing, label query shape, stop-not-delete, snapshot manifest, toolbox-vs-control-plane host routing, and transient/dead-state handling.
Live end-to-end against real Daytona: local sandbox runs a command; Daytona request with no secret errors cleanly; after exo secret set, the same conversation lazily resolves the secret and provisions a real remote sandbox.

🤖 Generated with Claude Code

Replace the single-slot sandbox backend with a provider registry: the harness validates a supported set + a default at startup, and per-agent selection errors if a requested provider isn't supported. Remote providers record only a secret-name reference at startup and resolve the credential from the secret store lazily on first use (Daytona reads DAYTONA_API_KEY / DAYTONA_ORGANIZATION_ID / DAYTONA_TARGET), so the harness reads zero environment variables. Backends build without holding the cache lock, and duplicate providers are rejected at startup.

Adds the Daytona REST backend under sandbox_provider/ with a typed sandbox-state enum (reuse a sandbox only when runnable; start only durably-stopped states), plus hermetic wiremock tests for it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

akrentsel · 2026-06-04T02:44:02Z

@@ -0,0 +1,7 @@
+//! Per-provider [`crate::sandbox::ManagedSandboxBackend`] implementations,


I'm going to try adding support for my exe.dev VMs which I love so dearly, just to see how well the pattern fits. they're much more durable than all of the ephemeral sandbox providers, but it'd be great to be able to support that too.

this is the way

akrentsel · 2026-06-04T02:45:49Z

+            toolbox_url: crate::DEFAULT_DAYTONA_TOOLBOX_URL.to_string(),
+            api_key_secret: "DAYTONA_API_KEY".to_string(),
+            organization_id_secret: Some("DAYTONA_ORGANIZATION_ID".to_string()),
+            target_secret: Some("DAYTONA_TARGET".to_string()),


note: I realize these technically aren't "secrets", but it felt clean to have these come from the store so they are configured the same way. we can alternatively take them in some other form of configuration.

btw, we may want to have a config file eventually for exo, rather than command line flags for everything.

that's what bindings are for. we should add sandbox provider bindings if we need to

akrentsel · 2026-06-04T03:27:28Z

btw – once PR #21 is reviewed (and approved), I'll separately work on reconciling the container lifecycles to be the same across the different backends.

there's a few common states to consider:
(1) running
(2) paused – costs some small amount of storage $ on Daytona, consumes RAM in docker
(3) deleted – no cost on either, but loss of data (unless snapshotted)
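
A compact way to write down these three states, as a sketch only. The type and function names are hypothetical, and the cost of the running state is not stated above, so it is marked as an assumption in the code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lifecycle {
    Running,
    Paused,
    Deleted,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Daytona,
    Docker,
}

/// What a sandbox keeps consuming while it sits in the given state.
pub fn idle_cost(state: Lifecycle, provider: Provider) -> &'static str {
    match (state, provider) {
        // Assumption: the comment does not spell out the running cost.
        (Lifecycle::Running, _) => "full runtime cost",
        (Lifecycle::Paused, Provider::Daytona) => "small storage cost",
        (Lifecycle::Paused, Provider::Docker) => "RAM",
        (Lifecycle::Deleted, _) => "none",
    }
}

/// Data is lost on delete unless a snapshot was taken first.
pub fn data_survives(state: Lifecycle, snapshotted: bool) -> bool {
    match state {
        Lifecycle::Deleted => snapshotted,
        Lifecycle::Running | Lifecycle::Paused => true,
    }
}
```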

right now, the pattern is a little different for auto-stop interval. but I want to add snapshotting to daytona in a followup PR, and that will also reconcile behavior to match.

once that is done, I'll add the adapter to make teleportation work!! (requires a small adapter b/c daytona doesn't natively produce docker tarballs from its snapshots)

ankrgyl

did a first pass

ankrgyl · 2026-06-04T04:21:19Z

+        // `snapshot` is a pre-registered snapshot name, not an image ref; only
+        // the restore path sets it, so fresh creates use Daytona's default image.
+        let body = DaytonaCreateRequest {
+            snapshot: snapshot_name.map(str::to_string),


i think if the snapshot is empty, we need to fall back to the request.spec.image right? Otherwise when is the image ever used?
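
One way the suggested fallback might look, as a sketch under assumptions: `SandboxSpec`, `build_create_request`, and the `image` field on the request are hypothetical, and whether Daytona's create endpoint accepts an image reference in this form is not confirmed by this thread.

```rust
/// Hypothetical, trimmed-down stand-in for `request.spec`.
pub struct SandboxSpec {
    pub image: Option<String>,
}

/// Hypothetical request shape; only `snapshot` appears in the diff above.
#[derive(Debug, PartialEq)]
pub struct DaytonaCreateRequest {
    pub snapshot: Option<String>,
    pub image: Option<String>, // assumption: the API takes an image here
}

/// Prefer the snapshot on the restore path; otherwise fall back to the
/// image from the spec so that the requested image is actually used.
pub fn build_create_request(snapshot_name: Option<&str>, spec: &SandboxSpec) -> DaytonaCreateRequest {
    match snapshot_name {
        Some(name) => DaytonaCreateRequest {
            snapshot: Some(name.to_string()),
            image: None,
        },
        None => DaytonaCreateRequest {
            snapshot: None,
            image: spec.image.clone(),
        },
    }
}
```

If both fields end up `None`, the request carries neither and the provider's default would presumably apply.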

ankrgyl · 2026-06-04T04:22:51Z

+            // (starting a running/mid-transition sandbox would race).
+            Some(existing) if existing.is_reusable() => {
+                if existing.needs_start() {
+                    self.start_sandbox(&existing.id).await?;


do we need to wait until it finishes starting?
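
If waiting does turn out to be necessary, a poll-until-started loop is one option. The sketch below is synchronous and std-only so it stays self-contained; the real backend is async and would poll the Daytona API instead of a closure. `State`, `wait_until_started`, and the error strings are all hypothetical.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Minimal set of states relevant while a start is in flight.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Starting,
    Started,
    Error,
}

/// Polls `poll` until the sandbox reports `Started`, fails fast on `Error`,
/// and gives up once `timeout` has elapsed.
pub fn wait_until_started<F>(mut poll: F, interval: Duration, timeout: Duration) -> Result<(), String>
where
    F: FnMut() -> State,
{
    let deadline = Instant::now() + timeout;
    loop {
        match poll() {
            State::Started => return Ok(()),
            State::Error => {
                return Err("sandbox entered error state while starting".to_string());
            }
            State::Starting => {}
        }
        if Instant::now() >= deadline {
            return Err("timed out waiting for sandbox to start".to_string());
        }
        sleep(interval);
    }
}
```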

ankrgyl · 2026-06-04T04:26:13Z

+        // Daytona's exec endpoint is request/response, not streaming: run the
+        // command, then hand back already-populated stdout/stderr/wait. stdin
+        // goes to a sink — this endpoint doesn't accept piped input.
+        let output =


maybe we throw a runtime error here instead? i'm neutral

ankrgyl · 2026-06-04T04:30:45Z

    }

    async fn create_sandbox(&self, request: CreateSandboxRequest) -> Result<SandboxId> {
        let _guard = self.harness.inner.write_lock.lock().await;


not new code but maybe we shouldn't lock the whole exoharness on this?

akrentsel requested a review from ankrgyl June 4, 2026 02:27

akrentsel mentioned this pull request Jun 4, 2026

Daytona remote-container sandbox backend #22

Closed

6 tasks

akrentsel wants to merge 1 commit into main from sandbox-backend-registry
