GPU utilization limit issue #37

Open

testadbck opened this issue Nov 30, 2024 · 0 comments

testadbck commented Nov 30, 2024

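I started a pod with GPU utilization limited to 30% by adding the following configuration to the YAML:
nvidia.com/vgpu: "1"
nvidia.com/gpucores: 30
After the pod started, CUDA_DEVICE_SM_LIMIT=30 is visible in the pod's environment variables.
The GPU is an NVIDIA 4090.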
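To confirm from inside the container that the limit was actually injected, a small check along these lines could be used. This is only a sketch: the helper name `read_sm_limit` is hypothetical, and the assumption that the value is an integer percentage between 0 and 100 is mine, inferred from the `CUDA_DEVICE_SM_LIMIT=30` value reported above.

```python
import os


def read_sm_limit(environ=os.environ):
    """Return CUDA_DEVICE_SM_LIMIT as an int percentage, or None.

    Assumption (not confirmed by the project): the variable holds an
    integer in the range 0..100. Anything else is treated as invalid.
    """
    raw = environ.get("CUDA_DEVICE_SM_LIMIT")
    if raw is None:
        return None
    try:
        value = int(raw)
    except ValueError:
        return None
    return value if 0 <= value <= 100 else None


if __name__ == "__main__":
    limit = read_sm_limit()
    if limit is None:
        print("CUDA_DEVICE_SM_LIMIT is not set or not a valid percentage")
    else:
        print(f"SM limit requested for this container: {limit}%")
```

Running this inside the pod only shows what was requested. It says nothing about whether the limit is enforced.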

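The problems are as follows:
1. When running inference with Qwen2.5-7B-Instruct, the utilization seen on the host stays above 90% during inference.
2. After adding the environment variable GPU_CORE_UTILIZATION_POLICY=FORCE, the utilization seen on the host during inference is sometimes above 90% and sometimes 0%.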

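Is it the case that the current implementation can only limit GPU utilization to a certain extent? (From the code I looked at, when not enough GPU cores are available it calls nanosleep and waits until enough are available.) And does that mean the current implementation cannot make the maximum utilization seen on the host equal to the CUDA_DEVICE_SM_LIMIT value?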
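To make the question concrete, here is a toy model of a sleep-based limiter. It is not the project's actual code. It assumes a simple token budget: each time tick adds `limit_percent` units, a tick of GPU work costs 100 units, and the workload sleeps whenever the budget is short. All names (`simulate`, `window_utilization`) are hypothetical. In this model the GPU is either fully busy or idle in any single tick, and the limit only shows up as an average over a longer window.

```python
def simulate(limit_percent, ticks):
    """Return per-tick busy flags (1 = GPU busy, 0 = throttled).

    Toy token-bucket model of a sleep-based limiter; an assumption for
    illustration, not the real implementation.
    """
    tokens = 0
    busy = []
    for _ in range(ticks):
        tokens += limit_percent      # budget refilled every tick
        if tokens >= 100:            # enough budget for one full tick of work
            tokens -= 100
            busy.append(1)
        else:                        # stands in for the nanosleep wait
            busy.append(0)
    return busy


def window_utilization(busy, window):
    """Average utilization (percent) over consecutive windows of ticks."""
    return [
        100.0 * sum(busy[i:i + window]) / window
        for i in range(0, len(busy) - window + 1, window)
    ]


if __name__ == "__main__":
    flags = simulate(limit_percent=30, ticks=1000)
    print("distinct 1-tick readings:", sorted(set(window_utilization(flags, 1))))
    print("distinct 10-tick readings:", sorted(set(window_utilization(flags, 10))))
```

With a 30% limit, one-tick windows in this model only ever read 0% or 100%, while ten-tick windows read 30%. If the real limiter behaves similarly, a monitoring tool on the host that samples over short intervals could show readings that swing between high values and 0% even though the long-run average is near the limit. Whether that is what happens here is for the maintainers to confirm.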
