Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cuebot reserve all cores #1313

Conversation

KernAttila
Copy link
Contributor

@KernAttila KernAttila commented Aug 26, 2023

Link the Issue(s) this Pull Request is related to.
Fixes #1297

Summarize your change.
As in many render engines, we should be able to set a negative core requirement.
minCores=8 > reserve 8 cores
minCores=0 > reserve all cores
minCores=-2 > reserve all cores minus 2

This PR addresses this feature by handling negative core requests.
Cuebot will try to match this number against the number of cores on each host.
The frame will be booked only if all cores are available in this scenario.
If the host is busy (even slightly), the frame is not booked, to avoid filling the remaining cores.

Testing
I would need some guidance to create proper tests for cuebot.

Screenshot
negative_cores
Update:
There is now a "ALL" text for zero cores, or "ALL (-2)" for negative cores reservation.
core_reservation
(cuesubmit feature in another PR #1284)

KernAttila and others added 30 commits May 31, 2023 12:49
doc: added some comments
doc: Added some documentation
doc: fix docstrings and parameters
doc: added debug message
doc: explain why we allow negative value
doc: update debug message for rqd
@KernAttila KernAttila force-pushed the cuebot-reserve-all-cores branch from 669be2d to cc0f3a9 Compare August 26, 2023 02:12
@bcipriano
Copy link
Collaborator

Very interesting addition, thanks for sending this!

I haven't dug into this much yet but I brought it up during the TSC meeting, we're going to discuss some more.

One thing to note -- we'll definitely want to wrap this in a flag so it can be turned on and off. Scheduling changes are always tricky and we don't want to disrupt other folks too much.

@bcipriano
Copy link
Collaborator

@DiegoTavares Let me know if y'all have thoughts on this.

@KernAttila
Copy link
Contributor Author

@bcipriano Would you recommand it to be on or off by default ?

Copy link
Collaborator

@DiegoTavares DiegoTavares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should default to keeping this feature disabled. Very well implemented and an interesting feature to have.

Would you mind writing a blog post explaining the concept and how to use it?

@KernAttila
Copy link
Contributor Author

IMO we should default to keeping this feature disabled. Very well implemented and an interesting feature to have.

Would you mind writing a blog post explaining the concept and how to use it?

Thanks for the review.

I don't have access to a Linux farm at the moment, but we are doing a Windows set-up very soon.
I'll need a bit more time for this one, and I will try to have that blog post ;)

@DiegoTavares
Copy link
Collaborator

DiegoTavares commented May 21, 2024 via email

…o cuebot-reserve-all-cores

# Conflicts:
#	cuebot/src/main/java/com/imageworks/spcue/dispatcher/HostReportHandler.java
@KernAttila KernAttila marked this pull request as draft August 27, 2024 10:19
@KernAttila
Copy link
Contributor Author

@bcipriano
As discussed with the team in the last meeting, the logic doesn't change the behavior of current submissions.
Launching with zero or negative cores before this feature seems unlikely.
We figured that disabling this feature with a flag would mean introducing a lot of code, with no added value or safety for current setups.
Instead, we settled on changing the layer's "cores" column display in cuegui to "ALL" for 0 and "ALL (-2.00)" for negative values (see the last screenshot in the description).

Would that work for you ?

@KernAttila KernAttila marked this pull request as ready for review September 21, 2024 02:03
@DiegoTavares DiegoTavares merged commit da2fe01 into AcademySoftwareFoundation:master Sep 25, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle negative core requests
3 participants