Cold flake evaluation is dominated by per-derivation nix-daemon IPC. Each .drv instantiation fires many small synchronous round-trips: wopAddTempRoot (op 11) per input path and wopAddToStore (op 7) per builder script. These are serialized on one daemon socket per eval-worker and on the daemon's global store-DB lock (seen via strace: op 11 / op 7 ↔ STDERR_LAST).
This is the per-commit lever: the eval-cache is keyed by the whole-flake fingerprint, so a normal CI commit is a cache miss and re-evaluates cold every time. Parallel sharding has landed; the remaining cold-eval cost is these round-trips.
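To make the cost structure concrete, here is a rough back-of-the-envelope model, assuming one round-trip per input path (op 11) and one per builder script (op 7), all serialized. `DrvShape`, `round_trips`, and `serialized_ipc_seconds` are hypothetical names for illustration, and the per-round-trip latency has to come from a real measurement.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DrvShape:
    """Hypothetical per-derivation counts; not taken from any real trace."""
    input_paths: int      # one wopAddTempRoot (op 11) round-trip each
    builder_scripts: int  # one wopAddToStore (op 7) round-trip each


def round_trips(drvs: Iterable[DrvShape]) -> int:
    """Total daemon round-trips if every op is its own synchronous exchange."""
    return sum(d.input_paths + d.builder_scripts for d in drvs)


def serialized_ipc_seconds(drvs: Iterable[DrvShape], rtt_seconds: float) -> float:
    """Lower bound on wall time spent waiting, assuming no pipelining at all."""
    return round_trips(drvs) * rtt_seconds
```

Because the round-trips are serialized, the model is linear: halving either the number of ops or the per-op latency halves the estimated wait.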
Why the daemon
The eval-worker opens the default (auto) store, which resolves to the daemon for a non-root worker. The .drv registration writes /nix/var/nix/db/db.sqlite either way; the daemon path just adds the IPC round-trip on top of that unavoidable write.
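The resolution described above can be approximated as follows. This is a simplified model of how `auto` picks a store, not the actual Nix implementation: real Nix has further fallbacks that are collapsed into one case here, and the default paths in `probe` are assumptions about a standard install.

```python
import os


def resolve_auto_store(state_dir_writable: bool, daemon_socket_exists: bool) -> str:
    """Simplified model of `auto`: prefer direct access, else the daemon."""
    if state_dir_writable:
        return "local"   # root / store owner: open the store DB directly
    if daemon_socket_exists:
        return "daemon"  # non-root worker: every store op becomes IPC
    return "local"       # simplification: real Nix has more fallbacks here


def probe(state_dir: str = "/nix/var/nix",
          socket_path: str = "/nix/var/nix/daemon-socket/socket") -> str:
    """Apply the model to this machine (paths are assumed defaults)."""
    writable = os.access(state_dir, os.R_OK | os.W_OK)
    return resolve_auto_store(writable, os.path.exists(socket_path))
```

Running `probe()` as the eval-worker's user is a quick way to check which branch that worker would be expected to take.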
Constraints found
- A direct/local store cannot safely share the daemon's db.sqlite while the daemon is live (concurrent writers, same WAL-contention class).
- An eval-store (a separate, worker-owned store for .drv files) avoids the daemon and writes locally without root, but the .drv files then land under the eval-store root: the closure-walk drv_reader must read from there, and the .drv files must be copied into the main store before build. Needs eval-store wiring in nix-bindings / the fork.
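A sketch of the path handling drv_reader would need, assuming the eval-store is a chroot-style local store that keeps the logical /nix/store prefix under its own root. `physical_drv_path` is a hypothetical helper, and the actual on-disk layout depends on how the eval-store is configured.

```python
from pathlib import PurePosixPath

STORE_DIR = PurePosixPath("/nix/store")  # logical store dir, assumed default


def physical_drv_path(eval_store_root: str, drv_store_path: str) -> PurePosixPath:
    """Map a logical .drv store path to where it would sit under the eval-store root."""
    drv = PurePosixPath(drv_store_path)
    if drv.parent != STORE_DIR or drv.suffix != ".drv":
        raise ValueError(f"not a top-level .drv store path: {drv_store_path}")
    # Re-root the absolute logical path under the eval-store root.
    return PurePosixPath(eval_store_root) / drv.relative_to("/")
```

Keeping the logical path unchanged and translating only at read time means the .drv contents, which reference /nix/store paths, stay valid once copied into the main store.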
Options
- Eval-store at a worker-writable path (no root) + drv_reader path handling + copy-to-main before build.
- Run the eval-worker with direct store access (root / store owner) so auto resolves to a local store (conflicts with a running daemon).
- Measure first: strace -c an eval-worker to confirm daemon read/write dominates vs CPU forcing (use statistics_json).
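For the measure-first option, a small parser for the `strace -c` summary table makes before/after comparison easier. This is a minimal sketch that assumes the usual column order (% time, seconds, usecs/call, calls, optional errors, syscall); `parse_strace_summary` and `time_share` are hypothetical helper names.

```python
from typing import Dict, Iterable


def parse_strace_summary(text: str) -> Dict[str, dict]:
    """Parse `strace -c` output into {syscall: {pct_time, seconds, calls}}."""
    rows: Dict[str, dict] = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, separator rows, and blank lines.
        if len(fields) < 5 or not fields[0].replace(".", "", 1).isdigit():
            continue
        name = fields[-1]
        if name == "total":
            continue
        rows[name] = {
            "pct_time": float(fields[0]),
            "seconds": float(fields[1]),
            "calls": int(fields[3]),
        }
    return rows


def time_share(rows: Dict[str, dict], syscalls: Iterable[str]) -> float:
    """Summed % time for the given syscalls (e.g. read and write)."""
    return sum(rows[s]["pct_time"] for s in syscalls if s in rows)
```

Note that `strace -c` summarizes system time by default; adding `-w` switches it to wall-clock time per syscall, which is likely the more relevant figure when the cost is waiting on round-trips. The call counts also feed directly into the syscall numbers asked for under Acceptance.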
Acceptance
Cold eval no longer bottlenecks on serialized daemon IPC; show before/after wall-time and syscall counts.
Deferred from the parallel-eval / eval-cache-deadlock work. Related: #386 (optimized evaluations).
Maybe save the new derivations to an in-memory store.