wr.s3.read_parquet() errors if any files are missing #1320

Open
@dhorkel

Is your idea related to a problem? Please describe.

Currently, when wr.s3.read_parquet(path=list_of_paths) is called with a list of file paths and any one of those files does not exist, the following error is raised:

botocore.exceptions.ClientError: An error occurred (404) when calling the HeadObject operation: Not Found

As a workaround I've used wr.s3.does_object_exist(), but this can be slow for many files because it only accepts a single file path.
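For anyone hitting the same problem, here is a rough sketch of how the workaround can be sped up in user code by running the existence checks in a thread pool. The helper name `filter_existing`, its `exists` parameter, and the default of 16 workers are my own placeholders, not part of awswrangler; only `wr.s3.does_object_exist()` is the library call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def _default_exists(path: str) -> bool:
    # Imported lazily so the helper can be exercised without AWS access.
    import awswrangler as wr

    return wr.s3.does_object_exist(path)


def filter_existing(
    paths: Iterable[str],
    exists: Optional[Callable[[str], bool]] = None,  # hypothetical hook, e.g. for tests
    max_workers: int = 16,  # assumed default; tune for your workload
) -> List[str]:
    """Return the paths for which `exists` is true, preserving input order."""
    check = exists or _default_exists
    path_list = list(paths)
    # Each check is an I/O-bound HeadObject-style call, so threads overlap the waits.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        flags = list(pool.map(check, path_list))
    return [p for p, ok in zip(path_list, flags) if ok]
```

The filtered list can then be passed on, e.g. `wr.s3.read_parquet(path=filter_existing(list_of_paths))`. This still issues one request per file; it only overlaps them.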

Describe the solution you'd like
I would like wr.s3.read_parquet() to have an optional argument, such as error_missing_files=, that lets the user specify whether to raise an error, emit a warning, or silently ignore missing files.

Alternatively (or additionally), wr.s3.does_object_exist() could accept a list of files for path= and check whether the objects exist in parallel.
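To make the requested raise/warn/ignore behaviour concrete, here is a minimal sketch of the semantics as a user-side helper. Everything in it is an assumption for illustration: the function name `select_paths`, the `on_missing` parameter, and its three values are not existing awswrangler options.

```python
import warnings
from typing import Callable, List, Sequence


def select_paths(
    paths: Sequence[str],
    exists: Callable[[str], bool],
    on_missing: str = "raise",  # hypothetical values: "raise", "warn", "ignore"
) -> List[str]:
    """Split paths into found/missing and handle the missing ones as requested."""
    if on_missing not in ("raise", "warn", "ignore"):
        raise ValueError(f"unknown on_missing value: {on_missing!r}")

    found: List[str] = []
    missing: List[str] = []
    for path in paths:
        (found if exists(path) else missing).append(path)

    if missing:
        message = f"{len(missing)} object(s) not found: {missing}"
        if on_missing == "raise":
            raise FileNotFoundError(message)
        if on_missing == "warn":
            warnings.warn(message)
        # "ignore" falls through and drops the missing paths silently.
    return found
```

With awswrangler this might be used as `wr.s3.read_parquet(path=select_paths(list_of_paths, wr.s3.does_object_exist, "warn"))`. The sketch checks paths sequentially and leaves the case where no files are found to the caller.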
