-
Notifications
You must be signed in to change notification settings - Fork 1
feat: add file locking to prevent concurrent state.json corruption #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| use std::fs; | ||
| use std::io; | ||
| use std::path::{Path, PathBuf}; | ||
|
|
||
| /// An advisory lock file that prevents concurrent dagger operations. | ||
| /// The lock is released (file deleted) when the guard is dropped. | ||
| pub struct StoreLock { | ||
| lock_path: PathBuf, | ||
| } | ||
|
|
||
| impl StoreLock { | ||
| /// Acquire an advisory lock by creating a lock file. | ||
| /// Returns an error if another process holds the lock. | ||
| pub fn acquire(dagger_root: &Path) -> io::Result<Self> { | ||
| fs::create_dir_all(dagger_root)?; | ||
| let lock_path = dagger_root.join("lock"); | ||
|
|
||
| // Try to create the lock file exclusively. | ||
| // O_CREAT | O_EXCL ensures atomic creation — fails if file already exists. | ||
| match fs::OpenOptions::new() | ||
| .write(true) | ||
| .create_new(true) | ||
| .open(&lock_path) | ||
|
Comment on lines
+14
to
+23
|
||
| { | ||
| Ok(mut file) => { | ||
| use std::io::Write; | ||
| let _ = writeln!(file, "{}", std::process::id()); | ||
| Ok(Self { lock_path }) | ||
| } | ||
| Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Err(io::Error::other(format!( | ||
| "another dgr process appears to be running; \ | ||
| if this is incorrect, delete '{}'", | ||
| lock_path.display() | ||
| ))), | ||
| Err(e) => Err(e), | ||
| } | ||
| } | ||
| } | ||
|
|
||
| impl Drop for StoreLock { | ||
| fn drop(&mut self) { | ||
| let _ = fs::remove_file(&self.lock_path); | ||
| } | ||
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
|
|
||
| #[test] | ||
| fn acquire_creates_lock_file() { | ||
| let dir = tempfile::tempdir().unwrap(); | ||
| let lock = StoreLock::acquire(dir.path()).unwrap(); | ||
| assert!(dir.path().join("lock").exists()); | ||
| drop(lock); | ||
| } | ||
|
|
||
| #[test] | ||
| fn drop_removes_lock_file() { | ||
| let dir = tempfile::tempdir().unwrap(); | ||
| let lock = StoreLock::acquire(dir.path()).unwrap(); | ||
| drop(lock); | ||
| assert!(!dir.path().join("lock").exists()); | ||
| } | ||
|
|
||
| #[test] | ||
| fn second_acquire_fails_while_held() { | ||
| let dir = tempfile::tempdir().unwrap(); | ||
| let _lock = StoreLock::acquire(dir.path()).unwrap(); | ||
| let result = StoreLock::acquire(dir.path()); | ||
| assert!(result.is_err()); | ||
| let err = result.unwrap_err(); | ||
| assert!(err.to_string().contains("another dgr process")); | ||
| } | ||
|
|
||
| #[test] | ||
| fn acquire_succeeds_after_release() { | ||
| let dir = tempfile::tempdir().unwrap(); | ||
| let lock = StoreLock::acquire(dir.path()).unwrap(); | ||
| drop(lock); | ||
| let _lock2 = StoreLock::acquire(dir.path()).unwrap(); | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StoreSession::from_partsacquires the store lock only at this point, butload_stateandplan_after_branch_advancewere already done earlier without holding the lock. That means this code can still readstate.jsonwhile anotherdgrprocess holds the lock (and may be mid-write), and the planned restack actions may be based on stale state. Consider acquiring the lock before reading state / planning (or pass an already-acquiredStoreLockinto session construction) and reloading state after the lock is held.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in the latest push.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in b5373c0. The lock is now acquired in
maybe_restack_after_commit_innerbeforeload_state, and the pre-acquired lock is passed into the session via the newStoreSession::from_lockconstructor. Note thatload_configremains outside the lock because the config is written once duringinitand is effectively immutable after that—it is never modified by concurrent operations, so reading it without the lock is safe. The mutablestate.jsonis the resource that needs serialization, and that is now fully covered.