Vgpu plugin does not restrict memory in container #1

Open
kunal642 opened this issue May 3, 2024 · 7 comments

Comments

kunal642 commented May 3, 2024

Hi @archlitchi,
I'm creating this issue as a continuation of the conversation we were having on volcano issue #3384.

kunal642 commented May 3, 2024

@archlitchi Is the plugin version 1.9.0 compatible with volcano 1.8.2?

kunal642 commented Jun 5, 2024

Hey @archlitchi,

We got hard isolation working by explicitly mounting "/tmp/gpu" and "/tmp/gpulock" into the container.

Can you explain why we are not able to assign more than 4 vGPUs to a single pod? (We have 4 GPU cards on a single node.)
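
For readers reproducing the workaround, the explicit mounts might look roughly like the following. This is only a sketch: it assumes the two directories are exposed as `hostPath` volumes and mounted at the same paths inside the container. The pod name, container name, image, and volume names are hypothetical placeholders, since the thread does not show the actual manifest.

```python
import json

# Host directories named in the thread. Mounting them at the same path
# inside the container is an assumption, not something the thread confirms.
SHARED_DIRS = {"vgpu-tmp": "/tmp/gpu", "vgpu-lock": "/tmp/gpulock"}


def build_pod_manifest(name, image):
    """Return a pod manifest (as a dict) with the two hostPath mounts."""
    volumes = [
        {"name": vol, "hostPath": {"path": path}}
        for vol, path in SHARED_DIRS.items()
    ]
    mounts = [
        {"name": vol, "mountPath": path}
        for vol, path in SHARED_DIRS.items()
    ]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # "main" is a placeholder container name.
            "containers": [
                {"name": "main", "image": image, "volumeMounts": mounts}
            ],
            "volumes": volumes,
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    manifest = build_pod_manifest("vgpu-demo", "example/cuda-app:latest")
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be adapted into whatever job or pod template is actually in use.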

archlitchi (Contributor) commented

> @archlitchi Is the plugin version 1.9.0 compatible with volcano 1.8.2?

I recommend using 1.9.0.

archlitchi (Contributor) commented

> Hey @archlitchi,
>
> We got hard isolation working by explicitly mounting "/tmp/gpu" and "/tmp/gpulock" into the container.
>
> Can you explain why we are not able to assign more than 4 vGPUs to a single pod? (We have 4 GPU cards on a single node.)

Yes, there are only 4 devices in the /dev folder, so you can use at most 4 GPUs. We can't mount a nonexistent GPU device into a container and have it recognized by the NVIDIA driver.
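
To see the limit described in this reply on a given node, one option is to count the per-GPU device nodes under /dev. The sketch below assumes the usual NVIDIA naming (`/dev/nvidia0`, `/dev/nvidia1`, and so on); the helper name `gpu_indices` is made up for illustration.

```python
import os
import re

# Per-GPU device nodes are named nvidia0, nvidia1, ...;
# control nodes such as nvidiactl or nvidia-uvm do not match.
GPU_NODE = re.compile(r"^nvidia(\d+)$")


def gpu_indices(dev_entries):
    """Return the sorted GPU indices found among /dev entry names."""
    matches = (GPU_NODE.match(entry) for entry in dev_entries)
    return sorted(int(m.group(1)) for m in matches if m)


if __name__ == "__main__":
    found = gpu_indices(os.listdir("/dev"))
    print(f"{len(found)} GPU device node(s): {found}")
```

On the 4-card node discussed here, this would be expected to report four device nodes, which matches the maximum number of vGPUs a single pod can be given.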

kunal642 commented Jun 6, 2024

Does this mean that the device plugin only restricts memory and not compute resources?

If not, then how can a pod use the full GPU with the vGPU config?

archlitchi (Contributor) commented

> Does this mean that the device plugin only restricts memory and not compute resources?
>
> If not, then how can a pod use the full GPU with the vGPU config?

It can restrict compute resources if you specify volcano.sh/vgpu-cores. If you want to use the full GPU, specify only volcano.sh/vgpu-number in the task.
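
The two cases in this reply can be sketched as a small helper that builds the resource entries. The resource names come from the thread. Everything else is an assumption: that the entries go under the container's `resources.limits`, that `volcano.sh/vgpu-cores` is a percentage between 1 and 100, and the helper name `vgpu_limits` itself.

```python
def vgpu_limits(number, cores=None):
    """Build the resource limits for a vGPU request.

    Omitting `cores` leaves out volcano.sh/vgpu-cores, which per the
    reply above is how a task asks for the full GPU.
    """
    if number < 1:
        raise ValueError("number must be at least 1")
    limits = {"volcano.sh/vgpu-number": number}
    if cores is not None:
        # Assumption: the value is a percentage of one GPU's compute.
        if not 0 < cores <= 100:
            raise ValueError("cores must be in (0, 100]")
        limits["volcano.sh/vgpu-cores"] = cores
    return limits


if __name__ == "__main__":
    print(vgpu_limits(1))            # full GPU: only vgpu-number is set
    print(vgpu_limits(1, cores=50))  # compute restricted to 50
```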

kunal642 (Author) commented

Got it. Is there a way to check how many cores are allocated in the container? If we configure 50% cores, we want to make sure that only 50% is allocated.
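
The thread leaves this question unanswered. One generic way to get an indication, not a feature of the plugin, is to sample GPU utilization with `nvidia-smi` while a compute-heavy workload runs in the container. This sketch makes several assumptions: `nvidia-smi` is available in the container, the workload is the only one on that GPU, and the reported figure reflects the limit. `nvidia-smi` reports device-level utilization, so treat the result as a rough check only.

```python
import statistics
import subprocess
import time

QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_utilization(output):
    """Parse nvidia-smi CSV output into one integer percentage per GPU.

    Lines that are not plain integers (for example "[N/A]") are skipped.
    """
    values = (line.strip() for line in output.splitlines())
    return [int(v) for v in values if v.isdigit()]


def mean_utilization(samples):
    """Average utilization per GPU over a list of per-GPU samples."""
    return [statistics.mean(column) for column in zip(*samples)]


if __name__ == "__main__":
    samples = []
    for _ in range(10):  # ten samples, one second apart
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
        samples.append(parse_utilization(result.stdout))
        time.sleep(1)
    print(mean_utilization(samples))
```

If the workload would otherwise saturate the GPU, an average that stays near the configured percentage is consistent with the limit being applied, though it does not prove it.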
