[BE] [float8] Run test_everything.sh in float8 test CI using linux.aws.h100.4 #2541
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2541
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 Cancelled Job
As of commit 2530c2d with merge base c57226b:
CANCELLED JOB - The following job was cancelled. Please retry:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
63a6069 to dcb7d63
dcb7d63 to aca4873
fyi @vkuzo
this is the main way we are testing our integrations, add back
It also seems that there still exist tests that can run on fewer than 4 GPUs and are worth running
Can you clarify what you mean here? We could use |
My comment was just that we are now requesting 4 GPUs for all of these tests where previously we only needed 1. I am not sure if this ends up being harder to schedule or adds more delays
Hmm I see, perhaps we can see how it goes over the next week or so and if it's slowing us down then we can revisit
how about:
- split test_everything.sh into one piece for single GPU and another one for multi GPU
- keep the single GPU one in the current target
- make a new target for multi GPU
@danielvegamyhre IMO worth reverting or forward fixing asap, because this PR seems to run the single GPU float8 tests twice, once from the original code and once again from |
Fix forward: #2561
Fixes #2477