-
Notifications
You must be signed in to change notification settings - Fork 91
Description
Recently we granted unsigned access to a bucket. When testing quilt3.Package.browse (top_hash=None) without S3 credentials configured, I was getting a boto3 error from resp = s3_client.get_object(**part_params) saying I was unable to GetObject:
quilt/api/python/quilt3/data_transfer.py
Lines 425 to 437 in 2964063
| def download_part(part_number): | |
| nonlocal remaining_counter | |
| with dest_file.open('r+b') as chunk_f: | |
| if part_number is not None: | |
| start = part_number * part_size | |
| end = min(start + part_size, size) - 1 | |
| part_params = dict(params, Range=f'bytes={start}-{end}') | |
| chunk_f.seek(start) | |
| else: | |
| part_params = params | |
| resp = s3_client.get_object(**part_params) |
Upon digging deeper I learned that the S3Provider was not building and selecting an unsigned client but rather a standard client. It appears that this check was passing:
quilt/api/python/quilt3/data_transfer.py
Lines 207 to 220 in 2964063
| def check_get_object_works_for_client(s3_client, params): | |
| try: | |
| head_args = dict(Bucket=params["Bucket"], Key=params["Key"]) | |
| if "VersionId" in params: | |
| head_args["VersionId"] = params["VersionId"] | |
| s3_client.head_object(**head_args) # HEAD/GET share perms, but HEAD always fast | |
| except ClientError as e: | |
| if e.response["Error"]["Code"] == "403": | |
| # This can also happen if you have full get_object access, but not list_objects_v2, and the object does not | |
| # exist. Instead of returning a 404, S3 will return a 403. | |
| return False | |
| return True |
It appears
head_object call is actually being made here instead of a get_object call for performance reasons.
Ultimately, adding s3:GetObjectVersion to the same bucket access policy for Principal * resolved the issue (Note: also added s3:GetObjectTagging but I don't imagine this one did the trick).
I wasn't able to print the part_params for whatever reason, but it's possible that VersionId and Range are being specified which cause the need for additional permissions that the head_object call likely does not account for.
It would be ideal if the get_object check could account for the scenario where get_object(...) with VersionId or other likely parameters is called.