Open
Description
It looks like we can easily add fsspec support for paths with a few modifications. This would let docling save to fsspec-compatible backends (S3, GCS, Azure, etc.) via https://github.com/fsspec/universal_pathlib, which should increase docling's versatility in cloud deployments.
Currently, methods like save_as_json, save_as_yaml, save_as_markdown, save_as_html and their load_from_* counterparts only work with local filesystem paths. Something like:
Before
from upath import UPath
s3_path = UPath("s3://my-bucket/doc.json")
doc.save_as_json(s3_path) # TypeError: expected str, bytes or os.PathLike object, not S3Path
After
from docling_core.types.doc import ImageRefMode
from upath import UPath
s3_path = UPath("s3://my-bucket/doc.json")
doc.save_as_json(s3_path)  # Works!
doc.save_as_json(
    s3_path,
    artifacts_dir=UPath("s3://my-bucket/images"),
    image_mode=ImageRefMode.REFERENCED,
)  # Images saved to S3 too!
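A rough sketch of the kind of change this might involve, not docling's actual code: one plausible cause of the TypeError above is that a path gets passed to the built-in open(), which needs an os.PathLike object. Calling the path object's own open() method instead should work for pathlib.Path and, assuming UPath mirrors the pathlib API, for remote paths too. The helper names save_json_to_path and load_json_from_path are hypothetical, and the sketch only uses the standard library so it runs without upath installed.

```python
import json
from pathlib import Path
from typing import Any, Union


def save_json_to_path(data: dict, target: Union[str, Any], indent: int = 2) -> None:
    """Hypothetical helper: write JSON through the path object's own open()."""
    # Plain strings are treated as local filesystem paths in this sketch.
    if isinstance(target, str):
        target = Path(target)
    # target.open() is defined by pathlib.Path; the assumption here is that
    # UPath objects (S3, GCS, Azure, ...) expose the same method.
    with target.open("w", encoding="utf-8") as fh:
        json.dump(data, fh, indent=indent)


def load_json_from_path(source: Union[str, Any]) -> dict:
    """Hypothetical counterpart: read JSON back through the path object."""
    if isinstance(source, str):
        source = Path(source)
    with source.open("r", encoding="utf-8") as fh:
        return json.load(fh)
```

With a local path this behaves like the existing code path; the idea is that passing UPath("s3://my-bucket/doc.json") would go through the same two calls, though that part is untested here and depends on the backend.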