
Download models on the fly #60

Open
TimPietrusky opened this issue Aug 20, 2024 · 1 comment
TimPietrusky (Member) commented Aug 20, 2024

Is your feature request related to a problem? Please describe.
The Docker images are very large, and users would like to use just the base image (which doesn't contain any models). This would make it more convenient for them to work on the image and add things, without having to upload a huge image every time they make changes.

Describe the solution you'd like

  • Specify the desired models via env variables
  • Ensure that private models can also be handled
  • Models are downloaded on the first request to the worker or when the worker starts
  • Models are downloaded onto the network storage. If no network storage exists, we don't download anything at all, as doing so would completely ruin the cold start time
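The bullets above could be sketched as a startup hook roughly like this. Everything here is an assumption for illustration, not part of this repo: the env variable names (`MODEL_IDS`, `HF_TOKEN`), the `/runpod-volume` mount path, and the use of `huggingface_hub` for the actual download.

```python
import os
from pathlib import Path

# Assumed mount point of the attached network storage on serverless workers.
NETWORK_STORAGE = Path("/runpod-volume")


def download_models():
    """Download the models named in MODEL_IDS onto the network storage.

    Intended to run once on worker start (or on the first request).
    """
    # Skip entirely when no network storage is attached, so cold starts
    # are not ruined by downloading onto ephemeral disk.
    if not NETWORK_STORAGE.exists():
        print("No network storage attached; skipping model downloads")
        return

    # Imported lazily so the module loads even without the dependency.
    from huggingface_hub import snapshot_download

    model_ids = [m for m in os.environ.get("MODEL_IDS", "").split(",") if m]
    token = os.environ.get("HF_TOKEN")  # needed for private models

    for model_id in model_ids:
        target = NETWORK_STORAGE / "models" / model_id.replace("/", "--")
        if target.exists():
            continue  # already downloaded during an earlier cold start
        snapshot_download(repo_id=model_id, local_dir=target, token=token)
```

Subsequent workers sharing the same volume would find the models already present and skip straight to loading them.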

Describe alternatives you've considered

  • Use the base image
  • Download the models manually onto the network storage
  • Use the network storage with the deployed endpoint

Additional context
This idea came to life because of https://discord.com/channels/912829806415085598/1273963578369642557 and https://discord.com/channels/912829806415085598/1270792081580753047.

@TimPietrusky TimPietrusky self-assigned this Aug 20, 2024
jelling commented Sep 6, 2024

It's been a while, but I recall trying this via a custom image I built. The long initial download times caused the RunPod serverless process to restart the worker repeatedly, which then restarted the download each time. Perhaps that's been changed since, I don't know.

A simpler and more reliable way to accomplish your goal of a smaller image: mount your network volume on a CPU-only or cheap GPU pod, download the models to it, and then have your image load the models from that volume. But keep in mind that loading models from network storage is much slower than from local disk.
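That workaround might look like the following one-off script, run from the cheap pod with the network volume mounted. The `/workspace/models` path and the use of `huggingface_hub` are assumptions, and the helper name is hypothetical.

```python
from pathlib import Path


def seed_network_volume(model_ids, volume_root="/workspace/models"):
    """Download each listed Hugging Face model onto the mounted volume once.

    Returns the target directories, so callers can verify what was seeded.
    """
    downloaded = []
    for model_id in model_ids:
        # Imported lazily so the helper is loadable without the dependency.
        from huggingface_hub import snapshot_download

        target = Path(volume_root) / model_id.replace("/", "--")
        if not target.exists():
            snapshot_download(repo_id=model_id, local_dir=target)
        downloaded.append(target)
    return downloaded
```

After seeding, the serverless endpoint just points its model path at the same volume, trading the image size for the slower NAS load jelling mentions.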
