Kalos cluster under-utilised #1

Open
HDRah opened this issue May 18, 2024 · 0 comments

HDRah commented May 18, 2024

Hi, when I computed the cluster utilisation rate (number of running GPUs / total number of GPUs) from each job's start time, end time, and number of GPUs, I found that the utilisation of the Kalos cluster peaks at only around 70%, and there are many periods where fewer than 40%, or even 20%, of the cluster's GPUs are in use. This is quite odd and is not the case for Seren. I also found that the Seren data has ~800k job records, while Kalos has only ~60k. Does this mean that not all jobs are recorded for Kalos, which would in turn explain the apparent severe under-utilisation?
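For reference, the calculation described above can be sketched roughly as follows. This is a minimal illustration, not the exact script used: the `(start, end, gpus)` record layout, the function names, and the `total_gpus` argument are assumptions for the example and do not reflect the dataset's actual column names or the real cluster size.

```python
from collections import defaultdict


def utilisation_timeline(jobs, total_gpus):
    """Return [(time, utilisation)] at every instant the busy-GPU count changes.

    jobs: iterable of (start, end, gpus) tuples with start < end.
    This record layout is hypothetical, chosen only for the sketch.
    """
    # Net change in the number of busy GPUs at each timestamp.
    delta = defaultdict(int)
    for start, end, gpus in jobs:
        delta[start] += gpus  # job begins: its GPUs become busy
        delta[end] -= gpus    # job ends: its GPUs are released

    timeline = []
    running = 0
    for t in sorted(delta):
        running += delta[t]
        timeline.append((t, running / total_gpus))
    return timeline


def peak_utilisation(jobs, total_gpus):
    """Highest utilisation reached at any point in the trace."""
    return max((u for _, u in utilisation_timeline(jobs, total_gpus)), default=0.0)


if __name__ == "__main__":
    # Toy trace: three overlapping jobs on a hypothetical 8-GPU cluster.
    toy_jobs = [(0, 10, 4), (5, 15, 2), (10, 20, 1)]
    for t, u in utilisation_timeline(toy_jobs, total_gpus=8):
        print(t, u)
    print("peak:", peak_utilisation(toy_jobs, total_gpus=8))
```

With this approach, any job missing from the trace lowers the computed utilisation directly, which is why the difference in record counts between the two clusters seems relevant.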

I would sincerely appreciate it if you could help clarify this. Thank you also for sharing this fantastic dataset.
