-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
package: skip unpack errors on paths excluded by manifest #19324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth (I'm not on the core team, just a contributor), this approach seems reasonable to me. Using an iterator interface in the package unpacking logic definitely seems nicer than requiring each file tree structure to expose its own interface to "hook" into the extraction logic using Diagnostics
.
I'll be happy to help with any support needed on the Git side of things for this PR, such as the iterator approach I mentioned in my other comment.
src/Package/Unpack.zig
Outdated
// Same interface as std.fs.Dir.createFile, symLink | ||
const inf = struct { | ||
parent: *Self, | ||
|
||
pub fn makePath(_: @This(), _: []const u8) !void {} | ||
|
||
pub fn createFile(t: @This(), sub_path: []const u8, _: fs.File.CreateFlags) !fs.File { | ||
return (try t.parent.createFile(sub_path)) orelse error.Skip; | ||
} | ||
|
||
pub fn symLink(t: @This(), target_path: []const u8, sym_link_path: []const u8, _: fs.Dir.SymLinkFlags) !void { | ||
try t.parent.symLink(target_path, sym_link_path); | ||
} | ||
}{ .parent = self }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why this was introduced, but I think a better long-term solution to solve this use-case would be for git.zig
to expose an iterator over tree entries and use that from the unpacking code. That would avoid the use of anytype
and error.Skip
. Since git.zig
isn't a public API and is used only for package fetching, the checkout
function could be completely removed in favor of the tree iterator design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
That was also my thinking but since checkout is used in test and the main function I decided for the first step to make just minimal changes so those two can be mostly unchanged.
Iterator which returns all entries, and does not require a recursive call for each directory, will be nice to use from the outside.
That error.Skip
is a really ugly hack.
I think both ways of doing it are reasonable. To me it is I think one or the other can be better depending on the problem, and in general beginner software developers have a bias towards the first one while intermediate programmers have a bias towards the second one. For this part of the codebase, I did consider both options and reasoned that, while the former does involve having similar logic implemented multiple times, it makes up for it in the sense that you can more easily read the code for a particular fetch mechanism, and perhaps optimize it better. I made this mistake in the backend of the compiler - I tried to do ABC ABC ABC with respect to CPU architecture, but it turned out to be untenable, and now it has been reworked to the AAA BBB CCC pattern. It's difficult to tell objectively which way is better. I will say that I do have a preference for status quo. |
Fixes #18089 (checks the remaining 2 items). Before this any unpack error will be raised after the unpack step. Now this is postponed until we get the manifest. Manifest is then used to filter errors for the files which are not included. There are two types of unpack errors:
Here is example which has both:
It is packed in .paths = .{
"src",
}, On Linux both tars get unpacked. On Windows and macOS both fail before Windows (with disabled symlinks), before both failing:
after, just first is failing:
macOS, before both failing:
after, just first is failing:
I checked that fetch behavior is unchanged for a list of hopefully representative packages. |
Tar pipeToFileSystem and git checkout have repeating logic: collecting diagnostic on errors, writing that errors to error bundle. Recurisive directory copy and pipeToFileSystem are both creating dir when writing file while dir still don't exists. Implementing new logic, thinking about file executable bit, requires changes on few places. This collects file handlling logic on single place and leaves other places with their job: unpacking tarball or git pack.
Just refactoring to put similar code closer.
Logic moved to Unpack.
Diagnostic error collection has been moved to Unpack. Mode does not have implementation, and after this changes implementation will be in Unpack.
This error type is ignored during tar unpacking.
Option to remove errors from paths not included in package.
Prior to [this](ziglang@ef9966c#diff-08c935ef8c633bb630641d44230597f1cff5afb0e736d451e2ba5569fa53d915R805) commit tar was not a valid extension. After that this one is valid case.
Before this any unpack error will be raised after the unpack step. Now this is postponed until we get manifest, and if then use included folders from manifest to filter errors which are not for the files from included folders. I made a folder which has symlinks and file names with same names in different casing. ``` sample ├── build.zig ├── build.zig.zon ├── dir1 │ └── dir2 │ └── dir3 │ ├── file1 │ ├── file2 │ └── file3 ├── dir4 │ └── dir5 │ └── dir6 │ ├── file1 -> ../../../dir1/dir2/dir3/file1 │ └── file2_symlink -> ../../../dir1/dir2/dir3/file2 ├── dir7 │ └── dir8 │ └── dir9 │ ├── file4 │ └── File4 └── src ├── main.zig └── root.zig ``` Then pack this into sample.tar and another one sample_src.tar where build.zig.zon has only src in paths (excluded all problematic folders): ```Zig .paths = .{ "src", }, ``` On Linux both tars get unpacked. On Windows and macOS both fail before this changes and second succeds after this change. Windows (with disabled symlinks): ``` C:\Users\ianic\code\zig-bin\zig.exe build --global-cache-dir zig-global-cache-release C:\Users\ianic\code\issues\fetch\build.zig.zon:78:20: error: unable to unpack tarball .url = "http://10.211.55.26:8000/sample.tar", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ note: unable to create file 'dir7/dir8/dir9/file4': PathAlreadyExists note: unable to create symlink from 'dir4/dir5/dir6/file1' to '../../../dir1/dir2/dir3/file1': AccessDenied note: unable to create symlink from 'dir4/dir5/dir6/file2_symlink' to '../../../dir1/dir2/dir3/file2': AccessDenied C:\Users\ianic\code\issues\fetch\build.zig.zon:85:20: error: unable to unpack tarball .url = "http://10.211.55.26:8000/sample_src.tar", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ note: unable to create file 'dir7/dir8/dir9/file4': PathAlreadyExists note: unable to create symlink from 'dir4/dir5/dir6/file1' to '../../../dir1/dir2/dir3/file1': AccessDenied note: unable to create symlink from 'dir4/dir5/dir6/file2_symlink' to '../../../dir1/dir2/dir3/file2': AccessDenied PS C:\Users\ianic\code\issues\fetch> C:\Users\ianic\code\zig-bin\bin\zig.exe build --global-cache-dir zig-global-cache-release C:\Users\ianic\code\zig-io\build\stage3\bin\zig.exe build --global-cache-dir zig-global-cache-master C:\Users\ianic\code\issues\fetch\build.zig.zon:78:20: error: unable to unpack .url = "http://10.211.55.26:8000/sample.tar", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ note: unable to create file 'dir7/dir8/dir9/file4': PathAlreadyExists note: unable to create symlink from 'dir4/dir5/dir6/file1' to '../../../dir1/dir2/dir3/file1': AccessDenied note: unable to create symlink from 'dir4/dir5/dir6/file2_symlink' to '../../../dir1/dir2/dir3/file2': AccessDenied ``` macOS ``` Fetch Packages [45/40] /Users/ianic/code/zig/issues/fetch/build.zig.zon:75:20: error: unable to unpack tarball .url = "file:///Users/ianic/code/zig/issues/fetch/sample.tar", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ note: unable to create file 'dir7/dir8/dir9/file4': PathAlreadyExists /Users/ianic/code/zig/issues/fetch/build.zig.zon:80:20: error: unable to unpack tarball .url = "file:///Users/ianic/code/zig/issues/fetch/sample_src.tar", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fetch Packages [41/40] /Users/ianic/code/zig/issues/fetch/build.zig.zon:75:20: error: unable to unpack .url = "file:///Users/ianic/code/zig/issues/fetch/sample.tar", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ note: unable to create file 'dir7/dir8/dir9/file4': PathAlreadyExists ```
In pipeToFileSystem when option mode_mode option is set on posix file systems.
Not used from Unpack any more.
Unpack tar without removing leading root folder. Then find package root in unpacked tmp folder. Fixes ziglang#17779
Adding logic from another PR: #19111 which fixes #17779. This now fixes 0.12 milestone package manager related issues: #17779 and #18089. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's difficult to tell objectively which way is better. I will say that I do have a preference for status quo.
I'm going to strengthen my stance on this. Sorry to do this after you have put so much work into it. But as the person who ultimately must maintain this code, I do want to keep my preference here. I don't want this restructuring to unify the git unpacking and archive unpacking. I liked it better before, when there were separate unpack implementations that had a set of well-defined tasks they were supposed to do, even though it was possible for them to get out of sync.
No problem. |
The purpose of this is to prepare the ground for package manager related changes like:
#17463 and #18089.
I found that there is are same logic implemented in both tar and git unpacking.
Tar pipeToFileSystem and git checkout are both: collecting file system errors,
writing that errors to error bundle.
Recursive directory copy and pipeToFileSystem are both creating Dir when
writing file while Dir still doesn't exist.
Adding something new requires changes in a few places.
This collects file system logic in a single place and leaves tar/git with their
primary job: decoding tarball or git pack.
I'll mark this draft hoping to get some kind of go/no go to implementing fixes
to the above issues on top of this.