-
Notifications
You must be signed in to change notification settings - Fork 9
Resolving Pacman Lock File Issues After Rollback #23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Resolving Pacman Lock File Issues After Rollback #23
Conversation
|
Hey @rankaiyx thank you for the contribution and for the very clear description of both the problem and the solution you came up with. I previously got a request for this here which I rejected on the grounds of script robustness and simplicity. Since you're the second person to request this exact feature, let's take a closer look.
I'd therefore like to suggest the following tradeoff:
The entire section is optional. Assuming it is present, the
This means that to auto-delete pacman lockfiles post-rollback:
How does that sound? |
snapper-rollback.conf
Outdated
|
|
||
| # Directory to which your btrfs root is mounted. | ||
| mountpoint = /btrfsroot | ||
| mountpoint = /mnt/snapper-rollback/btrfs_root |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's switch this back. Everyone has their own setting. I was thinking of moving this to somewhere under $TMPDIR or /tmp if that's unset, but this seems like it will break a fair few configs when people upgrade so it requires a proper migration plan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I just wanted to keep both mount directories in the same folder and keep the root directory clean, but it's not mandatory. Considering compatibility, maybe it really shouldn't be changed.
snapper-rollback.py
Outdated
| try: | ||
| mountpoint_newsubvol = config.get("root", "mountpoint_newsubvol") | ||
| except configparser.NoOptionError: | ||
| mountpoint_newsubvol = None | ||
| LOG.info("mountpoint_newsubvol not configured, skipping pacman db.lck removal") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than doing a try/except here, I think it would be better to add a new --sanitize flag. If a failure occurs, the script fails fast and loud
snapper-rollback.py
Outdated
| if mountpoint_newsubvol: | ||
| subvol_name = config.get("root", "subvol_main") | ||
| mount_and_remove_db_lck( | ||
| subvol_name, | ||
| mountpoint_newsubvol, | ||
| dev, | ||
| dry_run=args.dry_run | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should avoid mounting a second time. Once we've rolled back, we can remove the lock file directly from the mounted snapshot - unless you had a specific reason for wanting to mount things separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea, mountpoint_newsubvol is also no longer needed.
|
First, I tried to delete the lck file after taking a snap-pac snapshot by setting the read-only snapshot as writable, deleting the file, and then restoring the snapshot attribute to read-only. However, it broke the strict correspondence between the snapshot metadata created by snapper and the btrfs read-only snapshot, causing some inexplicable problems. Another idea was to set the directory where the db.lck file resides as a subvolume, but this would cause other files of pacman not to be snapshoted, so it didn't work. I also tried to set db.lck as a dangling link soft connection, linking to other subvolumes that are not snapshoted, but the presence of dangling soft links would also hinder the operation of pacman. |
These are all good suggestions, I'll try to optimize them later today. |
|
The code update is complete and has been preliminarily tested. It works as expected. Looking forward to further review and testing. |
jrabinow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking great, I especially want to call out my appreciation for how you're handling errors 🙇
If you could please address the comments and run the black formatter on the code, I'd be happy to merge this in :-)
| target_path = target_root / rel_path | ||
|
|
||
| if dry_run: | ||
| LOG.info(f"[DRY-RUN] Would check and clean: {target_path}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's print out the exact shell commands which we're running an equivalent for. I should be able to --dry-run the script and get the exact commands that need to be run from a shell if the script wasn't available
|
|
||
| # If the following files exist in the file system after the rollback, clean them up. | ||
| # Use absolute paths and separate multiple files with commas. | ||
| [cleanup] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's name this [root.sanitize]
The idea here is that right now, the tool only supports rolling back the root partition. Out of scope here - someday this tool will be adapted to handle other partitions as well. However, when that day comes, we will want to ensure that cleaning files up applies only when rolling back the associated partition
Apologies, I should have mentioned this in my initial proposal
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[root.cleanup] may be better? in line with Simple English, and good for internationalization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleanup is ok, but I was hoping for something slightly more specific. How about purge and the associated similarly-named flag?
If you prefer cleanup to purge, let's go with your preference
| # Use absolute paths and separate multiple files with commas. | ||
| [cleanup] | ||
| enabled = false | ||
| paths = /var/lib/pacman/db.lck, /var/cache/foo/temp.lock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each file should be on its own line (suppose I have 15 files I want to cleanup -> we want the config to remain legible), and paths should be on a line of its own
Valid file paths include , so we want to ensure we can support that as well (valid file paths also include the newline character but it's far less common)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that multi-line key values are not supported, although there are workarounds, but I'm not sure if they are good or bad.New weaknesses may be introduced, such as strict requirements for sequence numbers.
Is there an elegant way to do this?
a workaround way:
[Plugins]
plugin[0] = core
plugin[1] = auth
plugin[2] = storage
config = configparser.ConfigParser()
config.read('config.ini')
plugins = []
i = 0
while True:
key = f'plugin[{i}]'
if key in config['Plugins']:
plugins.append(config['Plugins'][key])
i += 1
else:
break
print(plugins)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, weird, I tested before making this comment:
diff --git a/snapper-rollback.conf b/snapper-rollback.conf
index e2b52f0..4a658cd 100644
--- a/snapper-rollback.conf
+++ b/snapper-rollback.conf
@@ -36,4 +36,6 @@ mountpoint = /btrfsroot
# Use absolute paths and separate multiple files with commas.
[cleanup]
enabled = false
-paths = /var/lib/pacman/db.lck, /var/cache/foo/temp.lock
+paths =
+ /var/lib/pacman/db.lck
+ /var/cache/foo/temp.lock
$ echo "macOS $(sw_vers -productVersion) $(sw_vers -buildVersion) $(uname -m)"
macOS 15.5 24F74 arm64
$ python --version
Python 3.13.4
$ ipython
Python 3.13.4 (main, Jun 7 2025, 00:36:51) [Clang 17.0.0 (clang-1700.0.13.5)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.31.0 -- An enhanced Interactive Python. Type '?' for help.
[ins] In [1]: from configparser import ConfigParser
[ins] In [2]: cfg = ConfigParser()
[ins] In [3]: cfg.read("snapper-rollback.conf")
Out[3]: ['snapper-rollback.conf']
[ins] In [4]: paths = cfg.get("cleanup", "paths")
[ins] In [5]: paths
Out[5]: '\n/var/lib/pacman/db.lck\n/var/cache/foo/temp.lock'
As you can see, I'm using mac at the moment so maybe configparser is implemented differently according to platforms? It seems doubtful though.
I did the same with python3.3 (python3.2 won't build for me) the behavior is identical, so it's not related to python versions.
What does your system do? If it's really not working, I'm afraid we'll have to stick with commas like you initially did, and so be it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
python3 & archlinux
I know the key point, there must be at least one space before each line of multi-line key value.
I will implement it.
| else: | ||
| LOG.warning(f"Cleanup skipped: {target_path} is not a file") | ||
| except OSError as e: | ||
| LOG.error(f"Error cleaning {target_path}: {str(e)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"Error deleting '{target_path}': {e}"
Variable interpolation is already handled, no need for str(e), using quotes makes it very clear what is the filepath and what isn't, and deleting is more explicit than cleaning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let’s do it.
| if target_path.is_file(): | ||
| target_path.unlink() | ||
| LOG.info(f"Found and removed file: {target_path} ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- let's do the dry-run check inside here, to ensure we aren't reporting removing target_path when in reality it would be untouched.
- nit: what if we combined the
.exists()and the.is_file()check? - bonus: add support for deleting directories? Potentially dangerous (then again what isn't in this context?) -> I'm not sure it's such a good idea to do this one but it would be consistent with principle of least astonishment. Your call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When dry-run, the program will not run here because there is no real target subvolume(target_path).
Perhaps we could do this, in a simulation run, to check if there are matching files in the source subvolume, and if so list them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point. In that case, I think you had the right idea initially. We can print rm -f '${target_path}'. This command won't error out even if the file doesn't exist. We print the rm command regardless of whether the file exists/is a file or not, like you already coded. How's that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rm -f '${target_path}'
Well, it's very easy to understand, even though it's not actually the rm command used.
Just display it like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly. Easy to understand and and convert into a shell script to run in an environment where the python script doesn't work (python runtime or btrfsutil module not available)
| dev, | ||
| dry_run=args.dry_run, | ||
| ) | ||
| cleanup_files(config, subvol_main, dry_run=args.dry_run) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to suggest let's call this function only if the user made explicit their intention to cleanup files by one of the following methods:
- explicitly passed in the
--sanitizeflag when calling the script on the CLI - explicitly enabled the feature in the config file
I also think we should keep contained the logic for whether cleanup gets run or not, e.g. we shouldn't check the flag value in one place and the enabled field value in another
|
For the original problem, I had a flash of inspiration and came up with a good solution in the upstream project. |
Pull Request: Resolving Pacman Lock File Issues After Rollback
Problem Description
When using snapper-rollback with Arch Linux's recommended btrfs layout, rolling back to snapshots created by snap-pac causes pacman to stop working. This occurs because pacman's lock mechanism doesn't interact well with snapshot hooks:
Problematic Sequence:
Ideal Sequence:
➜ ~ sudo snapper list
...
19 │ single │ │ Sat 07 Jun 2025 08:04:42 PM │ root │ │ test2 │
20 │ pre │ │ Sat 07 Jun 2025 08:04:51 PM │ root │ number │ pacman -S iperf3 │
21 │ post │ 20 │ Sat 07 Jun 2025 08:04:52 PM │ root │ number │ iperf3 lksctp-tools │
22 │ single │ │ Sat 07 Jun 2025 08:04:58 PM │ root │ │ test3 │
➜ ~ sudo snapper status 19..20
c..... /home/abc/.zsh_history
+..... /var/lib/pacman/db.lck
➜ ~ sudo snapper status 21..22
c..... /home/abc/.zsh_history
-..... /var/lib/pacman/db.lck
When rolling back to snapshots containing the
db.lckfile, pacman incorrectly assumes the database is still locked, preventing further operations.Solution
This PR adds an optional feature to automatically remove pacman's lock file after rollback:
New Configuration Option:
Users enable the feature by uncommenting and configuring this option
Automatic Post-Rollback Processing:
/var/lib/pacman/db.lckif presentSafe Implementation:
Tested btrfs Layout
(Matches Arch Wiki recommended layout)
User Benefits
This feature seamlessly resolves compatibility issues between snapshot rollback and pacman's lock mechanism while preserving snapper-rollback's simplicity and reliability.