feat: add file locking to prevent concurrent state.json corruption#20
feat: add file locking to prevent concurrent state.json corruption#20don-petry wants to merge 2 commits intooneirosoft:mainfrom
Conversation
Use an advisory lock file (.dagger/lock) with atomic create_new to ensure only one dgr process can modify state.json at a time. The lock is held for the duration of a StoreSession and released on drop. Relates to oneirosoft#10 item 1. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR introduces an advisory lock file to prevent multiple concurrent dgr processes from simultaneously mutating the store (notably state.json), reducing the risk of store corruption under parallel invocations.
Changes:
- Added a
StoreLockguard that acquires an exclusive lock via atomiccreate_newfile creation and removes the lock file on drop. - Integrated lock acquisition into
StoreSessionopening/creation flows. - Added unit tests for lock acquisition, mutual exclusion, and cleanup; added
tempfileas a dev-dependency.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/core/store/lock.rs | New lock guard implementation and unit tests. |
| src/core/store/session.rs | Acquires the lock as part of session creation/opening. |
| src/core/store/mod.rs | Exposes the new lock module within store. |
| src/core/commit.rs | Switches session construction to a helper that acquires the lock. |
| Cargo.toml | Adds tempfile for the new unit tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pub fn acquire(dagger_root: &Path) -> io::Result<Self> { | ||
| let lock_path = dagger_root.join("lock"); | ||
|
|
||
| // Try to create the lock file exclusively. | ||
| // O_CREAT | O_EXCL ensures atomic creation — fails if file already exists. | ||
| match fs::OpenOptions::new() | ||
| .write(true) | ||
| .create_new(true) | ||
| .open(&lock_path) |
There was a problem hiding this comment.
StoreLock::acquire assumes dagger_root already exists. That prevents taking the lock before store initialization (e.g., in open_or_initialize) and will fail with NotFound if the .dagger dir is missing. Consider creating the directory (e.g., fs::create_dir_all(dagger_root)?) before attempting create_new so locking can protect initialization too.
There was a problem hiding this comment.
Addressed in the latest push.
There was a problem hiding this comment.
Fixed in b5373c0. StoreLock::acquire now calls fs::create_dir_all(dagger_root) before attempting to create the lock file, so the lock can be taken even before the store directory is fully initialized.
src/core/store/session.rs
Outdated
| let store_initialization = initialize_store(&paths, trunk_branch)?; | ||
| let config = | ||
| load_config(&paths)?.ok_or_else(|| io::Error::other("dagger config is missing"))?; | ||
| let lock = StoreLock::acquire(&paths.root)?; |
There was a problem hiding this comment.
In open_or_initialize, the lock is acquired after initialize_store (which may write config.json/state.json/events). Two concurrent dgr processes can still race during initialization and write state concurrently before either takes the lock. If StoreLock::acquire is updated to create_dir_all on the root, consider acquiring the lock before calling initialize_store so initialization and subsequent reads/writes are fully mutually exclusive.
| let store_initialization = initialize_store(&paths, trunk_branch)?; | |
| let config = | |
| load_config(&paths)?.ok_or_else(|| io::Error::other("dagger config is missing"))?; | |
| let lock = StoreLock::acquire(&paths.root)?; | |
| let lock = StoreLock::acquire(&paths.root)?; | |
| let store_initialization = initialize_store(&paths, trunk_branch)?; | |
| let config = | |
| load_config(&paths)?.ok_or_else(|| io::Error::other("dagger config is missing"))?; |
There was a problem hiding this comment.
Addressed in the latest push.
There was a problem hiding this comment.
Fixed in b5373c0. Lock acquisition is now moved before initialize_store in open_or_initialize, so two concurrent processes can no longer race on initialization.
| let actions = | ||
| restack::plan_after_branch_advance(&state, node.id, &node.branch_name, old_head_oid)?; | ||
| let mut session = StoreSession { | ||
| repo: context.repo.clone(), | ||
| paths: store_paths, | ||
| config, | ||
| state, | ||
| }; | ||
| let mut session = StoreSession::from_parts(context.repo.clone(), store_paths, config, state)?; | ||
| record_branch_divergence_state(&mut session, node.id, BranchDivergenceState::Diverged)?; |
There was a problem hiding this comment.
StoreSession::from_parts acquires the store lock only at this point, but load_state and plan_after_branch_advance were already done earlier without holding the lock. That means this code can still read state.json while another dgr process holds the lock (and may be mid-write), and the planned restack actions may be based on stale state. Consider acquiring the lock before reading state / planning (or pass an already-acquired StoreLock into session construction) and reloading state after the lock is held.
There was a problem hiding this comment.
Addressed in the latest push.
There was a problem hiding this comment.
Fixed in b5373c0. The lock is now acquired in maybe_restack_after_commit_inner before load_state, and the pre-acquired lock is passed into the session via the new StoreSession::from_lock constructor. Note that load_config remains outside the lock because the config is written once during init and is effectively immutable after that—it is never modified by concurrent operations, so reading it without the lock is safe. The mutable state.json is the resource that needs serialization, and that is now fully covered.
- Ensure dagger_root directory exists before creating lock file - Move lock acquisition before initialize_store in open_or_initialize - Acquire lock before state read in maybe_restack_after_commit_inner - Add from_lock constructor to StoreSession for pre-acquired locks Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Why?
Running multiple
dgrprocesses concurrently (e.g., in editor hooks or parallel terminals) can cause them to read and writestate.jsonsimultaneously, leading to silent data corruption or lost updates. A file lock serializes access so only onedgrprocess modifies state at a time.Summary
StoreLockguard (src/core/store/lock.rs) that uses atomiccreate_newfile creation to prevent concurrentdgrprocesses from writing tostate.jsonsimultaneously.daggerdirectory if it doesn't exist before acquiring the lock, so locking works even on first initializationinitialize_storeto prevent races during initial setupStoreSessionso it is acquired when a session opens and released on dropRelates to #10 item 1 (file locking).
Test plan
cargo testpasses (new lock tests inlock.rs)dgrcommands simultaneously and verify the second reports "another dgr process appears to be running".dagger/lockrecovers from a stale lock after crash🤖 Generated with Claude Code