Better hardlink handling #5
Comments
I see, thank you for reporting. Initially it didn't occur to me that anything could be done about hardlinks, as they're quite good at hiding their linked status. The simplest (perhaps naive?) way to handle this would be to keep a HashSet of all the inodes encountered and assign 0 bytes to any duplicate inode occurrence. Another possibility might be to use inodes as indices for the tree structure, though that would be a more involved change. We'd also need to come up with an equivalent implementation for Windows (or we could ignore it, since hard links are not that common on Windows).

I also intend to support "size on disk" detection; I use a compressed filesystem myself and the advertised sizes don't quite match reality. The hard-link elimination you propose would be a good addition as well. I cannot promise a timeline of completion, as I'm currently busy with university, but this is still my favorite project. Thank you for reminding me to work on it. :^)
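The HashSet idea above can be sketched roughly like this on Unix (a minimal sketch, not spaceman's actual code; the function name and recursion structure are my own, and it assumes the `MetadataExt` inode/device accessors from the standard library):

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::path::Path;

// Unix-only: device and inode numbers come from MetadataExt.
use std::os::unix::fs::MetadataExt;

/// Recursively sum file sizes under `dir`, counting each
/// (device, inode) pair only once so later hardlinked
/// occurrences contribute 0 bytes.
fn disk_usage(dir: &Path, seen: &mut HashSet<(u64, u64)>) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // DirEntry::metadata does not follow symlinks, so
        // symlinked directories are not traversed.
        let meta = entry.metadata()?;
        if meta.is_dir() {
            total += disk_usage(&entry.path(), seen)?;
        } else if meta.is_file() {
            // First sighting of this inode counts; duplicates are free.
            if seen.insert((meta.dev(), meta.ino())) {
                total += meta.len();
            }
        }
    }
    Ok(total)
}
```

Because the `seen` set is passed in from the caller, deduplication naturally spans the whole scan rather than a single subtree, which matches du's within-a-run behavior.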
The solution that simply tracks encountered inodes sounds like it would produce the same output as du. It still doesn't give an ideal overview of what to delete and whether deleting something will actually help (deleting directory 2, say, might not free the space it appears to occupy, since its contents are shared with the other directories), but it's a big improvement over double-counting.

Re Windows: I'm not surprised that NTFS supports hardlinks. I'm somewhat surprised the Windows API actually supports creating them. A cursory look indicates that if GetFileInformationByHandle()'s nNumberOfLinks field is greater than 1, the file has other hardlinks, so a similar approach should be possible there.
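The link-count check has a Unix analogue too: `nlink()` plays the role of nNumberOfLinks, and consulting it first keeps the inode set small, since singly-linked files never need to enter it. A sketch under that assumption (the function name is hypothetical, Unix-only via `MetadataExt`):

```rust
use std::collections::HashSet;
use std::os::unix::fs::MetadataExt;

/// Bytes this file should contribute to the total. Files with a
/// single link are charged directly; only files that can actually
/// have other links (nlink > 1) are tracked in the seen-set, and
/// only their first occurrence is charged.
fn charged_size(meta: &std::fs::Metadata, seen: &mut HashSet<(u64, u64)>) -> u64 {
    if meta.nlink() > 1 {
        if seen.insert((meta.dev(), meta.ino())) {
            meta.len()
        } else {
            0
        }
    } else {
        meta.len()
    }
}
```

On Windows the same shape should work with the device/inode pair replaced by the volume serial number and file index that GetFileInformationByHandle() returns alongside nNumberOfLinks.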
I have a project where there is lots of shared state between various versions, which is handled using hardlinks.
Spaceman and du give different results.
Example on a directory of about 1 GB with 14 subdirectories sharing a lot of data:
```shell
du -hs . ; du -hs *; du -hs 2
```
As you can see, du only assigns the size of each inode to the first directory in which it encounters it, avoiding double-counting within a single run.
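That within-a-run deduplication is easy to reproduce with a tiny experiment (the paths here are made up for illustration):

```shell
# Build a tree where demo/b/file is a hardlink to demo/a/file.
mkdir -p demo/a demo/b
dd if=/dev/zero of=demo/a/file bs=1024 count=100 2>/dev/null
ln demo/a/file demo/b/file

# In a single run, du charges the inode only to the first
# directory that contains it; demo/b shows almost nothing.
du -s demo/a demo/b
```

Run `du -s demo/b` on its own, though, and it reports the full 100K again, because the deduplication state does not survive between runs.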
However, spaceman says that every folder is about 690 MB, and gives a total of 9.2 GB. This is very misleading when it comes to finding stuff to delete.
I'm not sure what a good approach would be, but this makes spaceman quite a bit less useful for me.