[feature] Handle mounting shared layers cross-images on upload #85

Open
Dramelac opened this issue Jan 3, 2025 · 4 comments
@Dramelac
Contributor

Dramelac commented Jan 3, 2025

Hello,

First of all, thank you very much for this tool!

I'm opening this issue because one of the registry behaviors is not implemented yet. It's not the most common use case, but I wonder whether it's planned to be implemented soon.

When docker pushes an image, each layer is uploaded only if this is the first time that layer/blob is sent to the registry. When an image shares layers that have already been pushed to repository A and a new image is then pushed to another repository B, docker tries to mount the layer from repository A into B to avoid re-uploading it.

This behavior is described in the Cross Repository Blob Mount section of the registry API documentation: https://docker-docs.uclv.cu/registry/spec/api/#pushing-an-image

  • Docker client will send this query to mount a layer from image-A to the new image-B with the common blob ID to mount:
    POST /v2/image-B/blobs/uploads/?from=image-A&mount=sha256%3A6b17377d415bc5e55a5a71a28a74139961b94b9f835c9c29852e7658d76c2ab0
  • The registry should respond with 201 Created if the mount is successful; otherwise it returns 202 Accepted and the client re-uploads the layer
  • Currently the serverless-registry (in a wrangler dev environment) replies with 202 and docker uploads the layer again (which works with the docker push command).
  • Some client commands (like docker manifest push) don't handle a failed blob mount and crash with the message: error mounting image-a@sha256:6b17377d415bc5e55a5a71a28a74139961b94b9f835c9c29852e7658d76c2ab0 to localhost:8787/image-b:test
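
To make the expected exchange concrete, here is a minimal sketch of how a registry could answer the mount request described above. It is not the serverless-registry code: `BlobIndex`, `handleMountRequest` and `newUploadId` are hypothetical names, and an in-memory map stands in for real blob storage. The response headers follow my reading of the registry API spec and should be checked against it.

```typescript
// Hypothetical in-memory stand-in for blob storage:
// maps a repository name to the set of blob digests it holds.
type BlobIndex = Map<string, Set<string>>;

interface MountResponse {
  status: 201 | 202;
  headers: Record<string, string>;
}

// Decide how to answer
// POST /v2/<name>/blobs/uploads/?from=<repo>&mount=<digest>
function handleMountRequest(
  requestUrl: string,
  blobs: BlobIndex,
  newUploadId: () => string, // assumption: caller supplies upload session ids
): MountResponse {
  const url = new URL(requestUrl);
  const match = url.pathname.match(/^\/v2\/(.+)\/blobs\/uploads\/$/);
  if (match === null) {
    throw new Error(`not a blob upload URL: ${url.pathname}`);
  }
  const target = match[1];
  const source = url.searchParams.get("from");
  // searchParams already percent-decodes "sha256%3A..." to "sha256:..."
  const digest = url.searchParams.get("mount");

  if (source !== null && digest !== null && blobs.get(source)?.has(digest)) {
    // Mount succeeded: record the blob under the target repository
    // without any upload, and answer 201 Created.
    let targetBlobs = blobs.get(target);
    if (targetBlobs === undefined) {
      targetBlobs = new Set<string>();
      blobs.set(target, targetBlobs);
    }
    targetBlobs.add(digest);
    return {
      status: 201,
      headers: {
        Location: `/v2/${target}/blobs/${digest}`,
        "Docker-Content-Digest": digest,
      },
    };
  }

  // Mount not possible: fall back to a normal upload session (202 Accepted).
  const uuid = newUploadId();
  return {
    status: 202,
    headers: {
      Location: `/v2/${target}/blobs/uploads/${uuid}`,
      "Docker-Upload-UUID": uuid,
    },
  };
}
```

With this shape, a client that cannot fall back (such as docker manifest push in the report above) only succeeds on the 201 branch.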
@gabivlj gabivlj added the enhancement New feature or request label Jan 6, 2025
@Dramelac
Contributor Author

Hello @gabivlj

I can try to submit a PR for this use case, but I'm wondering which method would be best suited to duplicate an object in R2. I've read the Worker R2 API documentation but found nothing about a duplication operation. Is there any official support coming soon?
If we were to use the S3 API, the layers could be >5 GB, and only CopyObject seems to support the x-amz-copy-source header, not UploadPartCopy, which would be needed for larger files...

Any ideas on how to solve this problem?

Thank you!

@gabivlj
Collaborator

gabivlj commented Jan 15, 2025

Copying objects in R2 is kind of a hard problem, as we'd have to copy the object manually. One idea is to redesign how we store layers, based on #78: basically, making a "layer" object point to a "blob" (/<path>/blobs/<digest> pointing to /storage/<digest>). Then, if a mount request comes in, we just create a reference to the original blob. I think it might force us to redesign garbage collection a bit.
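
A rough sketch of that reference idea, for discussion only: two in-memory maps stand in for the bucket (in R2 these would presumably be key prefixes in the same bucket), and every name here (`Store`, `putBlob`, `mountBlob`, `readBlob`, `deleteLayer`) is hypothetical rather than taken from the codebase or from #78. It also shows why garbage collection changes: a blob can only be removed once no layer reference points to it.

```typescript
// Hypothetical stand-in for the bucket: layer references and blob contents.
interface Store {
  layers: Map<string, string>; // "<repo>/blobs/<digest>" -> "storage/<digest>"
  blobs: Map<string, string>;  // "storage/<digest>" -> content
}

const storageKey = (digest: string): string => `storage/${digest}`;
const layerKey = (repo: string, digest: string): string => `${repo}/blobs/${digest}`;

// Regular upload: store the content once, then point the layer at it.
function putBlob(store: Store, repo: string, digest: string, content: string): void {
  store.blobs.set(storageKey(digest), content);
  store.layers.set(layerKey(repo, digest), storageKey(digest));
}

// Mount: create a new reference to the existing blob, no data is copied.
// Returns false when the source repository does not hold the layer.
function mountBlob(store: Store, fromRepo: string, toRepo: string, digest: string): boolean {
  const ref = store.layers.get(layerKey(fromRepo, digest));
  if (ref === undefined) return false;
  store.layers.set(layerKey(toRepo, digest), ref);
  return true;
}

// Read: follow the layer reference to the shared blob.
function readBlob(store: Store, repo: string, digest: string): string | undefined {
  const ref = store.layers.get(layerKey(repo, digest));
  return ref === undefined ? undefined : store.blobs.get(ref);
}

// Delete one repository's layer; the blob itself is only collected
// when no other layer reference still points to it.
function deleteLayer(store: Store, repo: string, digest: string): void {
  const key = layerKey(repo, digest);
  const ref = store.layers.get(key);
  if (ref === undefined) return;
  store.layers.delete(key);
  const stillReferenced = Array.from(store.layers.values()).includes(ref);
  if (!stillReferenced) store.blobs.delete(ref);
}
```

The linear scan in `deleteLayer` is only for illustration; a real implementation would likely need a cheaper way to know whether a blob is still referenced.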

@Dramelac
Contributor Author

We came to the same conclusion. I have a "symlink" system that works, but I still have the garbage collector part to do, and I think I'm going to change a lot of things: there are even a lot of bugs in the current version (for example, in untagged mode, manifest lists are not taken into account and all layers are wrongly deleted).
PR soon when everything works!

@Dramelac
Contributor Author

@gabivlj should I wait for PR #78 to be merged before continuing?
