extractcode's behaviour and error message output on damaged archives #8
Comments
Thanks! I was not aware of this problem. There are interesting cases there, as some of these archives may be deliberately damaged archives used by GCC for testing, or these could be issues in extractcode.
Interesting! Yes, this makes sense. How might someone approach this? Obviously, if such files are used for testing within the component and are faulty on purpose, we should avoid extracting those. Does extractcode continue the job after failing on such corrupt archive files?
Yes, the extraction is never interrupted at large; it only chokes on faulty archives (and usually makes a best effort in these cases, but obviously it is not trying hard enough here) and then keeps on trucking on the rest.
This is not entirely trivial... I guess there are a couple of ways. One idea would be to combine extraction with file classification: e.g. if a failed-to-extract archive is part of a directory classified as "test" files, then the error could be re-qualified as a warning or silenced. Another idea is to brute-force the problem and maintain a list of test archives known to fail extracting for the (few?) extraction (or other) tools that would keep such test files, e.g. gcc, tar, infozip, gunzip, libarchive ... Yet another way would be to bypass the problem entirely: if GCC 4.9 had been pre-scanned and that scan peer reviewed by the community, then the fact that it has some test files that do not extract correctly becomes moot, doesn't it?
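To make the first two ideas a bit more concrete, here is a rough Python sketch. It is not extractcode code and does not use its API: the function name, the set of "test" directory names, and the known-bad list are all made up for illustration, and real file classification would need to be smarter than matching directory names.

```python
from pathlib import PurePosixPath

# Hypothetical directory names that mark a "test" location.
# This is an assumption for illustration, not an extractcode setting.
TEST_DIR_NAMES = {"test", "tests", "testsuite", "testdata"}


def failure_severity(archive_path, known_bad=frozenset()):
    """Decide how loudly to report an archive that failed to extract.

    Returns "ignored" for archives on a maintained known-to-fail list,
    "warning" for archives located under a test-like directory,
    and "error" for everything else.
    """
    path = PurePosixPath(archive_path)

    # Brute-force idea: a curated list of archives known to fail on purpose.
    if str(path) in known_bad:
        return "ignored"

    # Classification idea: any parent directory that looks like a test directory.
    parents = path.parts[:-1]
    if any(part.lower() in TEST_DIR_NAMES for part in parents):
        return "warning"

    return "error"
```

With this, a failing archive at a (made-up) path such as `gcc/testsuite/broken.zip` would be downgraded to a warning, while the same failure under `src/lib/` would still be reported as an error.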
So here is the deal:
This looks like a damaged-on-purpose test file. It should, however, extract correctly or at least partially with extractcode. It may contain a fake file pretending to be a super big 16GB file. This is another test case.
and gunzip fails to decompress it:
Here extractcode would never be able to process it properly because of the encryption. Yet we should likely report a warning and a more explicit message instead. Note that all these are test files used in GCC, so overall these are annoying yet probably not critical issues. What's your take?
Thank you for your comprehensive response.
This was my main concern. I'm glad extractcode behaves this way. Also, I already do get some information about the errors. I don't think that we're on the wrong track here; I rather misinterpreted the error messages as critical. I'd honestly leave extractcode's behaviour as it is. However, further information in the error message might be handy, like you said. I'd really like to see a summary of what was going on at the end, e.g. "extracted n files successfully, failed on m files". For the detection after a failed extraction attempt you might - as you mentioned:
I will update the title of the issue as it is misleading with the current state of knowledge.
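For what it's worth, the end-of-run summary suggested above could look roughly like the following sketch. This is only an illustration under assumptions: `summarize` and the shape of `results` (pairs of path and error message, with `None` meaning success) are hypothetical and not how extractcode reports results today.

```python
def summarize(results):
    """Build a short end-of-run report.

    `results` is an iterable of (path, error) pairs, where `error` is
    None for a successful extraction and a message string otherwise.
    This input shape is an assumption made for the sketch.
    """
    results = list(results)
    failed = [(path, error) for path, error in results if error]
    succeeded = len(results) - len(failed)

    lines = [
        f"extracted {succeeded} files successfully, "
        f"failed on {len(failed)} files"
    ]
    # List each failure with its message so it can be reviewed afterwards.
    lines.extend(f"  {path}: {error}" for path, error in failed)
    return "\n".join(lines)
```

Printing the returned string after a run would give the one-line count first, followed by one line per failed archive.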
In terms of resolution, I think this would work best:
When running extractcode on gcc-4.9 (download here: https://packages.debian.org/jessie/all/gcc-4.9-source/download), extractcode fails with these error messages:
With the --verbose flag, the error messages look like this:
Are you aware of this problem?
It seems that either the depth of further archives within the initial archive, the depth that those have, or both depths summed up is the problem here.