Resolving Pacman Lock File Issues After Rollback #23

rankaiyx · 2025-06-08T05:53:54Z

Pull Request: Resolving Pacman Lock File Issues After Rollback

Problem Description

When using snapper-rollback with Arch Linux's recommended btrfs layout, rolling back to snapshots created by snap-pac causes pacman to stop working. This occurs because pacman's lock mechanism doesn't interact well with snapshot hooks:

Problematic Sequence:

Pacman locks database 🔒
Pre-snapshot hook runs 📸
Post-snapshot hook runs 📸
Pacman unlocks database 🔓

Ideal Sequence:

Pre-snapshot hook 📸
Pacman locks 🔒
Pacman unlocks 🔓
Post-snapshot hook 📸

➜ ~ sudo snapper list
...
19 │ single │ │ Sat 07 Jun 2025 08:04:42 PM │ root │ │ test2 │
20 │ pre │ │ Sat 07 Jun 2025 08:04:51 PM │ root │ number │ pacman -S iperf3 │
21 │ post │ 20 │ Sat 07 Jun 2025 08:04:52 PM │ root │ number │ iperf3 lksctp-tools │
22 │ single │ │ Sat 07 Jun 2025 08:04:58 PM │ root │ │ test3 │
➜ ~ sudo snapper status 19..20
c..... /home/abc/.zsh_history
+..... /var/lib/pacman/db.lck
➜ ~ sudo snapper status 21..22
c..... /home/abc/.zsh_history
-..... /var/lib/pacman/db.lck

When rolling back to snapshots containing the db.lck file, pacman incorrectly assumes the database is still locked, preventing further operations.

Solution

This PR adds an optional feature to automatically remove pacman's lock file after rollback:

New Configuration Option:

# Optional: Uncomment and set directory to mount new subvolume for post-rollback operations
# Enables automatic removal of pacman lock file in new snapshots
# Example: 
# mountpoint_newsubvol = /mnt/snapper-rollback/btrfs_newsubvol

Users enable the feature by uncommenting and configuring this option.

Automatic Post-Rollback Processing:
- Mounts newly created subvolume in RW mode to temporary directory
- Checks for and deletes /var/lib/pacman/db.lck if present
- Unmounts temporary mount point
- Full dry-run mode support

Safe Implementation:
- Error handling: Catches and logs all operation exceptions
- Resource cleanup: Ensures temporary mount point is always unmounted
- Graceful degradation: Skips operation if option not configured
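
The mount, delete, unmount flow listed above could look roughly like the sketch below. It is an illustration only, not the code in this PR: `remove_lock_after_rollback` and its injectable `run` parameter are hypothetical names, the mountpoint is assumed to exist already, and the only external commands assumed are the standard `mount -o subvol=...` and `umount`.

```python
import subprocess
from pathlib import Path

# Location of pacman's lock file relative to the root of the subvolume.
LOCK_REL_PATH = Path("var/lib/pacman/db.lck")


def remove_lock_after_rollback(dev, subvol, mountpoint, dry_run=False, run=subprocess.run):
    """Mount `subvol` read-write, delete pacman's lock file, then unmount.

    Returns the shell-equivalent steps; in dry-run mode nothing else happens.
    `run` is injectable so the flow can be exercised without real mounts.
    """
    mountpoint = Path(mountpoint)
    lock = mountpoint / LOCK_REL_PATH
    mount_cmd = ["mount", "-o", f"subvol={subvol},rw", str(dev), str(mountpoint)]
    umount_cmd = ["umount", str(mountpoint)]
    steps = [mount_cmd, ["rm", "-f", str(lock)], umount_cmd]
    if dry_run:
        return steps

    run(mount_cmd, check=True)
    try:
        # missing_ok (Python 3.8+) avoids an error when there is no lock file.
        lock.unlink(missing_ok=True)
    finally:
        # Always unmount, even if the deletion raised.
        run(umount_cmd, check=True)
    return steps
```

The `try`/`finally` is what backs the "always unmounted" guarantee: a failed deletion still propagates, but only after the temporary mount has been released.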

Tested btrfs Layout

├── @
├── @snapshots
└── @var_log

(Matches Arch Wiki recommended layout)

User Benefits

Fixes pacman unavailability after rollback
Maintains snapper-rollback's elegance and efficiency
Optional feature doesn't disrupt existing workflows
Clear logging provides operation status feedback

This feature seamlessly resolves compatibility issues between snapshot rollback and pacman's lock mechanism while preserving snapper-rollback's simplicity and reliability.

jrabinow · 2025-06-08T07:35:30Z

Hey @rankaiyx thank you for the contribution and for the very clear description of both the problem and the solution you came up with.

I previously got a request for this here which I rejected on the grounds of script robustness and simplicity. Since you're the second person to request this exact feature, let's take a closer look.

1. this script is still meant to be robust through simplicity
2. this script also runs on other OSes than arch, see [spiral-linux](https://spirallinux.github.io/) based on Debian, it also got some interest from [gecko-linux](https://github.com/jrabinow/snapper-rollback/issues/4) at one point (which is based on openSuse). The script should keep supporting them - which it will given that the feature needs to be explicitly enabled 🙇 👍
3. rather than adding file-specific hardcoded support in the script, I think it would be better to not include the lock in the snapshot when it's created. It looks like excluding `/var` is [not recommended](https://www.reddit.com/r/archlinux/comments/sacp6g/comment/htsyqk3/)
4. better would be if snap-pac could create the snapshot before the lockfile is in place (would require changes to pacman, and those people know what they're doing) or it could delete the lockfile right after creating the snapshot. I would suggest making a feature request or a PR on [snap-pac](https://github.com/wesbarnett/snap-pac) since the root cause lies with the timing of its snapshot creation
5. my instinct is that the simplest solution here from an engineering perspective is for people with specific needs to write their own wrapper script which calls snapper-rollback, then cleans up whatever is needed, since this may vary across distros. A good place for the wrapper script could be `/usr/local/bin`
6. simplicity from an engineering perspective is not always simplicity from an end-user's perspective

I'd therefore like to suggest the following tradeoff:

* we add a new section to the config file, as follows:

[sanitize]
enabled = false
paths =
    /var/lib/pacman/db.lck
    /var/cache/foo/temp.lock

The entire section is optional. Assuming it is present, the enabled field is optional as well.

* we add a `--sanitize` flag to ensure that sanitized files do NOT get deleted by default, unless the `enabled` field is set to true
* users can place whatever paths they want to auto-delete from their snapshots into the config file, and the script automatically deletes them post rollback - but only if sanitization is explicitly requested

This means that to auto-delete pacman lockfiles post-rollback:

* users would put an entry into the config file
* either the user sets `enabled` to true in the config file, or they call the script with the `--sanitize` flag

How does that sound?
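
To make the proposal concrete, here is a minimal sketch of how the optional section might be read with `configparser`. `SanitizeConfig` and `read_sanitize_config` are hypothetical names used for illustration; the section name and fields follow the proposal above and may differ from what finally gets merged.

```python
import configparser
from dataclasses import dataclass, field


@dataclass
class SanitizeConfig:
    enabled: bool = False
    paths: list = field(default_factory=list)


def read_sanitize_config(config):
    """Read the optional [sanitize] section; a missing section or field means 'off'."""
    if not config.has_section("sanitize"):
        return SanitizeConfig()
    enabled = config.getboolean("sanitize", "enabled", fallback=False)
    raw = config.get("sanitize", "paths", fallback="")
    # One path per line; the empty first line after "paths =" is dropped.
    paths = [line.strip() for line in raw.splitlines() if line.strip()]
    return SanitizeConfig(enabled=enabled, paths=paths)


# The `enabled` field is left out on purpose to show that it is optional.
EXAMPLE = """
[sanitize]
paths =
    /var/lib/pacman/db.lck
    /var/cache/foo/temp.lock
"""

config = configparser.ConfigParser()
config.read_string(EXAMPLE)
print(read_sanitize_config(config))
```

Treating a missing section and a missing `enabled` field the same way keeps existing config files valid without any migration.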

jrabinow · 2025-06-08T05:57:23Z

snapper-rollback.conf

+
 # Directory to which your btrfs root is mounted.
-mountpoint = /btrfsroot
+mountpoint = /mnt/snapper-rollback/btrfs_root


let's switch this back. Everyone has their own setting. I was thinking of moving this to somewhere under $TMPDIR or /tmp if that's unset, but this seems like it will break a fair few configs when people upgrade so it requires a proper migration plan

Ok, I just wanted to keep both mount directories in the same folder and keep the root directory clean, but it's not mandatory. Considering compatibility, maybe it really shouldn't be changed.

jrabinow · 2025-06-08T06:06:22Z

snapper-rollback.py

+    try:
+        mountpoint_newsubvol = config.get("root", "mountpoint_newsubvol")
+    except configparser.NoOptionError:
+        mountpoint_newsubvol = None
+        LOG.info("mountpoint_newsubvol not configured, skipping pacman db.lck removal")


rather than doing a try/except here, I think it would be better to add a new --sanitize flag. If a failure occurs, the script fails fast and loud

jrabinow · 2025-06-08T06:11:46Z

snapper-rollback.py

+        if mountpoint_newsubvol:
+            subvol_name = config.get("root", "subvol_main")
+            mount_and_remove_db_lck(
+                subvol_name,
+                mountpoint_newsubvol,
+                dev,
+                dry_run=args.dry_run
+            )


I think that we should avoid mounting a second time. Once we've rolled back, we can remove the lock file directly from the mounted snapshot - unless you had a specific reason for wanting to mount things separately

good idea, mountpoint_newsubvol is also no longer needed.

rankaiyx · 2025-06-08T07:54:08Z

First, I tried to delete the lck file after taking a snap-pac snapshot by setting the read-only snapshot as writable, deleting the file, and then restoring the snapshot attribute to read-only. However, it broke the strict correspondence between the snapshot metadata created by snapper and the btrfs read-only snapshot, causing some inexplicable problems.

Another idea was to make the directory that contains db.lck its own subvolume, but then pacman's other files in that directory would not be snapshotted either, so it didn't work.

I also tried making db.lck a dangling symbolic link pointing into a subvolume that is not snapshotted, but the presence of a dangling symlink also prevents pacman from operating.

rankaiyx · 2025-06-08T08:15:38Z

Hey @rankaiyx thank you for the contribution and for the very clear description of both the problem and the solution you came up with.

I previously got a request for this here which I rejected on the grounds of script robustness and simplicity. Since you're the second person to request this exact feature, let's take a closer look.

1. this script is still meant to be robust through simplicity

2. this script also runs on other OSes than arch, see [spiral-linux](https://spirallinux.github.io/) based on Debian, it also got some interest from [gecko-linux](https://github.com/jrabinow/snapper-rollback/issues/4) at one point (which is based on openSuse). The script should keep supporting them - which it will given that the feature needs to be explicitly enabled 🙇 👍

3. rather than adding file-specific hardcoded support in the script, I think it would be better to not include the lock in the snapshot when it's created. It looks like excluding `/var` is [not recommended](https://www.reddit.com/r/archlinux/comments/sacp6g/comment/htsyqk3/)

4. better would be if snap-pac could create the snapshot before the lockfile is in place (would require changes to pacman, and those people know what they're doing) or it could delete the lockfile right after creating the snapshot. I would suggest making a feature request or a PR on [snap-pac](https://github.com/wesbarnett/snap-pac) since the root cause lies with the timing of its snapshot creation

5. my instinct is that the simplest solution here from an engineering perspective is for people with specific needs to write their own wrapper script which calls snapper-rollback, then cleans up whatever is needed, since this may vary across distros. A good place for the wrapper script could be `/usr/local/bin`

6. simplicity from an engineering perspective is not always simplicity from an end-user's perspective

I'd therefore like to suggest the following tradeoff:

* we add a new section to the config file, as follows:

[sanitize]
enabled = false
paths =
    /var/lib/pacman/db.lck
    /var/cache/foo/temp.lock

The entire section is optional. Assuming it is present, the enabled field is optional as well.

* we add a `--sanitize` flag to ensure that sanitized files do NOT get deleted by default, unless the `enabled` field is set to true

* users can place whatever paths they want to auto-delete from their snapshots into the config file, and the script automatically deletes them post rollback - but only if sanitization is explicitly requested

This means that to auto-delete pacman lockfiles post-rollback:

* users would put an entry into the config file

* either the user sets `enabled` to true in the config file, or they call the script with the `--sanitize` flag

How does that sound?

These are all good suggestions, I'll try to optimize them later today.

rankaiyx · 2025-06-08T14:47:39Z

The code update is complete and has been preliminarily tested. It works as expected.

Looking forward to further review and testing.

jrabinow

This is looking great, I especially want to call out my appreciation for how you're handling errors 🙇

If you could please address the comments and run the black formatter on the code, I'd be happy to merge this in :-)

jrabinow · 2025-06-09T02:56:09Z

snapper-rollback.py

+        target_path = target_root / rel_path
+
+        if dry_run:
+            LOG.info(f"[DRY-RUN] Would check and clean: {target_path}")


let's print out the exact shell commands which we're running an equivalent for. I should be able to --dry-run the script and get the exact commands that need to be run from a shell if the script wasn't available

jrabinow · 2025-06-10T03:44:17Z

snapper-rollback.conf

+
+# If the following files exist in the file system after the rollback, clean them up. 
+# Use absolute paths and separate multiple files with commas.
+[cleanup]


Let's name this [root.sanitize]

The idea here is that right now, the tool only supports rolling back the root partition. Out of scope here - someday this tool will be adapted to handle other partitions as well. However, when that day comes, we will want to ensure that cleaning files up applies only when rolling back the associated partition

Apologies, I should have mentioned this in my initial proposal

[root.cleanup] may be better? in line with Simple English, and good for internationalization.

cleanup is ok, but I was hoping for something slightly more specific. How about purge and the associated similarly-named flag?
If you prefer cleanup to purge, let's go with your preference

jrabinow · 2025-06-10T03:45:59Z

snapper-rollback.conf

+# Use absolute paths and separate multiple files with commas.
+[cleanup]
+enabled = false
+paths = /var/lib/pacman/db.lck, /var/cache/foo/temp.lock


Each file should be on its own line (suppose I have 15 files I want to cleanup -> we want the config to remain legible), and `paths =` should be on a line of its own
Valid file paths can include a comma (`,`), so we want to ensure we can support that as well (valid file paths can also include the newline character, but that's far less common)

It seems that multi-line key values are not supported. There are workarounds, but I'm not sure whether they are good or bad. New weaknesses may be introduced, such as strict requirements for sequence numbers.
Is there an elegant way to do this?

A workaround:
[Plugins]
plugin[0] = core
plugin[1] = auth
plugin[2] = storage

import configparser

config = configparser.ConfigParser()
config.read('config.ini')

# Collect plugin[0], plugin[1], ... until the first missing index.
plugins = []
i = 0
while True:
    key = f'plugin[{i}]'
    if key in config['Plugins']:
        plugins.append(config['Plugins'][key])
        i += 1
    else:
        break
print(plugins)

Huh, weird, I tested before making this comment:

diff --git a/snapper-rollback.conf b/snapper-rollback.conf
index e2b52f0..4a658cd 100644
--- a/snapper-rollback.conf
+++ b/snapper-rollback.conf
@@ -36,4 +36,6 @@ mountpoint = /btrfsroot
 # Use absolute paths and separate multiple files with commas.
 [cleanup]
 enabled = false
-paths = /var/lib/pacman/db.lck, /var/cache/foo/temp.lock
+paths =
+    /var/lib/pacman/db.lck
+    /var/cache/foo/temp.lock

$ echo "macOS $(sw_vers -productVersion) $(sw_vers -buildVersion) $(uname -m)"
macOS 15.5 24F74 arm64
$ python --version
Python 3.13.4
$ ipython
Python 3.13.4 (main, Jun 7 2025, 00:36:51) [Clang 17.0.0 (clang-1700.0.13.5)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.31.0 -- An enhanced Interactive Python. Type '?' for help.

[ins] In [1]: from configparser import ConfigParser
[ins] In [2]: cfg = ConfigParser()
[ins] In [3]: cfg.read("snapper-rollback.conf")
Out[3]: ['snapper-rollback.conf']
[ins] In [4]: paths = cfg.get("cleanup", "paths")
[ins] In [5]: paths
Out[5]: '\n/var/lib/pacman/db.lck\n/var/cache/foo/temp.lock'

As you can see, I'm using mac at the moment so maybe configparser is implemented differently according to platforms? It seems doubtful though.
I did the same with python3.3 (python3.2 won't build for me) the behavior is identical, so it's not related to python versions.
What does your system do? If it's really not working, I'm afraid we'll have to stick with commas like you initially did, and so be it.

python3 & archlinux
I found the key point: each continuation line of a multi-line value must start with at least one space.
I will implement it.
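
The indentation requirement can be reproduced with a short, self-contained check. This is only an illustration of `configparser`'s behavior using the `[cleanup]` section name from the diff above; `parse_paths` is a hypothetical helper, not a function from the script.

```python
import configparser

INDENTED = """
[cleanup]
paths =
    /var/lib/pacman/db.lck
    /var/cache/foo/temp.lock
"""

# Same content, but the continuation lines start in column 0.
UNINDENTED = """
[cleanup]
paths =
/var/lib/pacman/db.lck
/var/cache/foo/temp.lock
"""


def parse_paths(text):
    """Return the non-empty lines of the `paths` value."""
    config = configparser.ConfigParser()
    config.read_string(text)
    return [p for p in config.get("cleanup", "paths").splitlines() if p]


print(parse_paths(INDENTED))

try:
    parse_paths(UNINDENTED)
except configparser.ParsingError as exc:
    # Unindented lines are not continuations, and they are not valid options either.
    print("rejected:", type(exc).__name__)
```

Because each path sits on its own line, commas inside a path need no special handling.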

jrabinow · 2025-06-10T03:48:26Z

snapper-rollback.py

+            else:
+                LOG.warning(f"Cleanup skipped: {target_path} is not a file")
+        except OSError as e:
+            LOG.error(f"Error cleaning {target_path}: {str(e)}")


f"Error deleting '{target_path}': {e}"

Variable interpolation is already handled, no need for str(e), using quotes makes it very clear what is the filepath and what isn't, and deleting is more explicit than cleaning

Okay, let’s do it.
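
For reference, a deletion helper using the agreed message format might look like the sketch below. `delete_file` and the logger name are hypothetical; the actual function in the PR may be structured differently.

```python
import logging
from pathlib import Path

LOG = logging.getLogger("snapper-rollback-sketch")


def delete_file(target_path):
    """Delete `target_path` if it is a regular file; return True if it was removed."""
    target_path = Path(target_path)
    try:
        # is_file() is False for directories and for paths that do not exist.
        if not target_path.is_file():
            LOG.warning(f"Skipped '{target_path}': not a regular file")
            return False
        target_path.unlink()
        LOG.info(f"Deleted '{target_path}'")
        return True
    except OSError as e:
        # The f-string formats the exception itself, so str(e) is not needed.
        LOG.error(f"Error deleting '{target_path}': {e}")
        return False
```

Quoting the path in every message makes trailing spaces or odd characters in a file name visible in the log.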

jrabinow · 2025-06-10T03:53:40Z

snapper-rollback.py

+            if target_path.is_file():
+                target_path.unlink()
+                LOG.info(f"Found and removed file: {target_path} ")


let's do the dry-run check inside here, to ensure we aren't reporting removing target_path when in reality it would be untouched.

nit: what if we combined the .exists() and the .is_file() check?

bonus: add support for deleting directories? Potentially dangerous (then again what isn't in this context?) -> I'm not sure it's such a good idea to do this one but it would be consistent with principle of least astonishment. Your call.

In dry-run mode the program never reaches this point, because no real target subvolume (target_path) exists.
Perhaps, in a simulated run, we could check whether matching files exist in the source subvolume and, if so, list them.

Great point. In that case, I think you had the right idea initially. We can print rm -f '${target_path}'. This command won't error out even if the file doesn't exist. We print the rm command regardless of whether the file exists/is a file or not, like you already coded. How's that sound?

rm -f '${target_path}'
Well, it's very easy to understand, even though it's not actually the rm command used.
Just display it like this.

Exactly. Easy to understand and convert into a shell script to run in an environment where the python script doesn't work (python runtime or btrfsutil module not available)
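
One possible way to produce those copy-pasteable lines is sketched below. `dry_run_commands` is a hypothetical helper, and the example root `/btrfsroot/@` is an assumption based on the default mountpoint and the `@` subvolume mentioned earlier. Note that `shlex.quote` only adds quotes when a path needs them, which differs slightly from the always-quoted form discussed above.

```python
import shlex
from pathlib import Path


def dry_run_commands(target_root, paths):
    """Return the `rm -f` lines a dry run would print, one per configured path."""
    commands = []
    for path in paths:
        # Re-root the absolute config path under the rolled-back subvolume.
        target_path = Path(target_root) / path.lstrip("/")
        # Printed whether or not the file exists; rm -f tolerates missing files.
        commands.append(f"rm -f {shlex.quote(str(target_path))}")
    return commands


for line in dry_run_commands("/btrfsroot/@", ["/var/lib/pacman/db.lck", "/tmp/it's here.lock"]):
    print(line)
```

Quoting through `shlex.quote` keeps the printed commands safe to paste even when a path contains spaces or quote characters.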

jrabinow · 2025-06-10T03:59:36Z

snapper-rollback.py

            dev,
            dry_run=args.dry_run,
        )
+        cleanup_files(config, subvol_main, dry_run=args.dry_run)


I'd like to suggest that we call this function only if the user has made explicit their intention to clean up files, by one of the following methods:

explicitly passed in the --sanitize flag when calling the script on the CLI

explicitly enabled the feature in the config file

I also think we should keep contained the logic for whether cleanup gets run or not, e.g. we shouldn't check the flag value in one place and the enabled field value in another
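
A minimal sketch of keeping that decision in one place, assuming the `--sanitize` flag and an `enabled` field as proposed earlier in this thread. `build_parser` and `should_sanitize` are hypothetical names, and the section name is a parameter because it is still under discussion.

```python
import argparse
import configparser


def build_parser():
    parser = argparse.ArgumentParser(description="sketch of the CLI surface")
    parser.add_argument("--sanitize", action="store_true",
                        help="delete the configured paths after rollback")
    return parser


def should_sanitize(args, config, section="sanitize"):
    """Single place that decides whether post-rollback deletion runs."""
    if args.sanitize:
        return True
    # fallback covers both a missing section and a missing `enabled` field.
    return config.getboolean(section, "enabled", fallback=False)
```

The call site then reduces to one `if should_sanitize(args, config):` guard around the cleanup call, so the flag and the config field are never checked in separate places.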

rankaiyx · 2025-06-11T18:27:11Z

For the original problem, I had a flash of inspiration and came up with a good solution in the upstream project.
wesbarnett/snap-pac#59

rankaiyx and others added 2 commits June 8, 2025 13:31

feat: add automatic removal of pacman db.lck after rollback

052cdb9

Update snapper-rollback.conf to add mountpoint_newsubvol config option

1703949

jrabinow reviewed Jun 8, 2025

rankaiyx added 2 commits June 8, 2025 22:35

feat(cleanup): implement post-rollback file cleanup functionality

efe8f0b

feat(config): add cleanup section with file removal paths

e210820

jrabinow reviewed Jun 10, 2025
