-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cope with invalid hash algorithms in RECORD #179
base: main
Are you sure you want to change the base?
Conversation
5d00415
to
ba5d586
Compare
ba5d586
to
87f0df5
Compare
I can't tell what the test failure at https://github.com/pypa/installer/actions/runs/4646834563/jobs/8680555289?pr=179 was complaining about I pushed again in the hope of kicking a retry but apparently that needs manual approval |
oh this
I find it hard to believe that windows 3.9 uniquely fails to get to the line in question, so not sure what to do about that |
I pushed a commit that adds |
hash_value = Hash.parse(hash_) | ||
if hash_value.name not in hashlib.algorithms_available: | ||
issues.append(f"invalid hash algorithm '{hash_value.name}'") | ||
hash_value = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be performing this validation in the validate method somehow, since this would prevent a RecordEntry
from being created with a hash that isn't in algorithms_available
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking was that it's desirable to be unable to create a RecordEntry
with a bad hash algorithm, but I'm happy to be guided in another direction if you have a preference.
Yet another approach, doubling down on "invalid hash algorithms are invalid" would be to raise a ValueError
from Hash.parse()
, though the error message from that path isn't currently very informative
Thanks for filing this PR @dimbleby, and appreicate your patience on this as I went away from OSS for a month.
The only reason I can think of is that |
No worries, take your time! if |
It would, if it's conditionally available on only certain platforms. 😬 |
if hash_: | ||
try: | ||
hash_value: Optional[Hash] = Hash.parse(hash_) | ||
hash_value = Hash.parse(hash_) | ||
if hash_value.name not in hashlib.algorithms_available: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP 376 / https://packaging.python.org/en/latest/specifications/recording-installed-packages/ requires that RECORD must be in hashlib.algorithms_guaranteed, using _available is not good enough when producing installed databases in site-packages.
But the wheel format itself (https://packaging.python.org/en/latest/specifications/binary-distribution-format/) doesn't say anything one way or another about it...
RECORD is a list of (almost) all the files in the wheel and their secure hashes. Unlike PEP 376, every file except RECORD, which cannot contain a hash of itself, must include its hash. The hash algorithm must be sha256 or better; specifically, md5 and sha1 are not permitted, as signed wheel files rely on the strong hashes in RECORD to validate the integrity of the archive.
So it disagrees with PEP 376 in at least two points:
- hash all files, even pyc
- md5 and sha1 are algorithms_guaranteed, but still not allowed
And therefore the PEP cannot be taken as authoritative on algorithms_guaranteed. But only for wheels. Once e extracted it does need to be in _guaranteed.
Fun.
...
In theory this would solve the platform-conditional concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me what the intended availability of installer.records
is. Can I use it to parse the RECORD in a wheel file, or one in site-packages? Is the appropriate guard here to enforce this when creating new RECORD files only?
Experimental patch to validate this as part of destinations, where we know that any algorithm passed is definitely intended for creating a new installed RECORD file: 67da026.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what conclusions, if any, you reach from the above.
I am therefore politely saying "noted"... but not doing anything about it!
Please say if you have opinions about how this MR ought to look (or submit an MR, I'm fine with this one being overtaken by something better)
@pradyunsg if The test verifies that we get an error complaining that the algorithm name is invalid: I don't see how that test can pass without hitting the line of code in question. I really don't see how this can be anything other than coverage weirdness. |
d019252
to
cca5692
Compare
- raise an InvalidRecordEntry - catch such things during wheel validation and include them in the issues reported
cca5692
to
4eda07a
Compare
Fixes #178
I said there that I didn't have a fix to offer; but perhaps even a wrong fix is a better starting point - or anyway has more momentum - than no fix at all