-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libjob: rabbit jobspec tests #6727
libjob: rabbit jobspec tests #6727
Conversation
Yeah, I don't think it would help. We'll need to think of some simple update to correctly count resources in jobspec V1+ or whatever this is. I actually didn't realize that jobspec could have more than one slot. We may need to sit down and devise some rules for a v2 jobspec that scheduling rabbits requires, encode that in an RFC, then update libjj using those rules. Maybe we can discuss this at the next meeting. |
Problem: some flux-coral2 rabbit jobs insert an 'ssd' vertex type which is not accepted by the jj jobspec reader. Ignore unknown types.
c42cf32
to
d82fb69
Compare
Problem: the jj-reader tests need to be updated for jj's new handling of unknown resource types. Change the tests, removing two tests that were previously expected to fail but now succeed.
Problem: flux-coral2 sometimes generates jobspecs with entries above 'node' (usually 'slot' entries) where the intended meaning is that whatever is below should be multiplied by that count. Add logic so that if there are entries in jobspec above 'node', the nodecount read in by `jj.c` is multiplied.
Problem: there are no tests for the multiplication of the nodecount in jobspec if there are entries above "node". Add tests.
d82fb69
to
d159c71
Compare
After discussion in the meeting yesterday, I think this is ready for review @grondo . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks!
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6727 +/- ##
==========================================
- Coverage 83.82% 83.80% -0.02%
==========================================
Files 534 534
Lines 89281 89279 -2
==========================================
- Hits 74838 74823 -15
- Misses 14443 14456 +13
🚀 New features to boost your workflow:
|
WIP PR to address #6710. Several of the tests in
t/t0022-jj-reader.t
fail. Some of them because I didn't bother to spend the time to get them right, but others because nowjj-reader
can incorrectly count the nodes. A jobspec like this:should give a nodecount of 2, but it currently gives a nodecount of 1 because the reader doesn't take into account the
count
of the higher-levelslot
entry. That could be easily fixed. However, it will probably get a wrongslot
count but I don't know how the slot count is supposed to be used so I'm not as sure how to fix it.@grondo @cmoussa1 I have to step away from development on this issue for a while to get back to SCR work, but I wanted to put it up so you could see what I was thinking.
I am wondering if
flux-coral2
should drop the use ofjobspec_update
and just update the job-manager's version of the jobspec in-place like it used to, so that only the scheduler sees the updated version (I think that's how it works). That would let us punt on the issue until we have a better way to handle it. But if I'm wrong about only the scheduler seeing the updated jobspec, and in particular if themf_priority
plugin sees the in-place-updated jobspec, we'd still hit the same problem and that wouldn't save us anything.