Skip to content

Multi agent parallel testing in CI #18523

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 24 commits into
base: main
Choose a base branch
from
Open

Conversation

majocha
Copy link
Contributor

@majocha majocha commented Apr 29, 2025

Make #18517 work.

Copy link
Contributor

✅ No release notes required

@majocha
Copy link
Contributor Author

majocha commented Apr 29, 2025

This kind of worked, but now Linux is the bottleneck, oh well :)

@T-Gro
Copy link
Member

T-Gro commented Apr 29, 2025

Linux and MacOs will likely alternate in being the slowest, MacOs also has the smallest pool of available machines.

I do like the numbers very much.
I will be happy to get this in on a per-partes basis ;; always focusing on the currently slowest leg(s).

let addBatchTrait (testCase: ITestCase) =
let data = Text.Encoding.UTF8.GetBytes testCase.DisplayName
let hashCode = BitConverter.ToUInt32(sha.ComputeHash(data), 0)
let batch = hashCode % 4u + 1u
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why % 4 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to get a number from 1 to 4. this is hardcoded just for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to get the batch number from a custom hashCode that is stable between executions. I heard bad things about String.GetHashcode() in this regard.

@majocha
Copy link
Contributor Author

majocha commented Apr 29, 2025

Now it's important to verify the that number of tests add up correctly 😅
image

@T-Gro
Copy link
Member

T-Gro commented Apr 29, 2025

.. number of unique executed tests.

I would recommend to download testlog.xml files and have a script count unique test entries and compare two sets (prior exection, new execution from 4 jobs)

@majocha
Copy link
Contributor Author

majocha commented May 5, 2025

So I had some fun splitting the build and test phases into separate jobs.
Nominally the run times are nice, but it doesn't seem very beneficial as for now, because there is a significant wait time when starting a windows vm agent.

@majocha
Copy link
Contributor Author

majocha commented May 5, 2025

I verified that in fact all tests do have a batch number trait, and none are left out when run in batches.

The failsafe, in case this ever becomes problematic, is to remove the -testBatch x from script invocations and it will run all the tests without filtering.

As for the build / test split into separate jobs, I'm not sold on this. it's a matter of reverting the single commit.

@majocha majocha marked this pull request as ready for review May 5, 2025 09:08
@majocha majocha requested a review from a team as a code owner May 5, 2025 09:08
@majocha
Copy link
Contributor Author

majocha commented May 5, 2025

For future reference:
image
image

@T-Gro
Copy link
Member

T-Gro commented May 12, 2025

As for the build / test split into separate jobs, I'm not sold on this. it's a matter of reverting the single commit.

Indeed, the overall time of the CI (and not duration of individual jobs) is the same, if not slower.
If we ever feel the pressure to reduce the resource utilization here, it is a step we could undertake.

=> I think we can keep each leg do it's own build.

if ($testDesktop -and -not $ci ) {
TestUsingMSBuild -testProject "$RepoRoot\FSharp.sln" -targetFramework $script:desktopTargetFramework -testadapterpath "$ArtifactsDir\bin\FSharp.Compiler.ComponentTests\"
if ($testDesktop) {
TestUsingMSBuild -testProject "$RepoRoot\FSharp.sln" -targetFramework $script:desktopTargetFramework
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am in general fine with the approach and like it - of testing the entirety of FSharp.sln solution, since for me it is the one I am using the most as well.

Can you please add a sentence or two to the test guide, that any new test project that wishes to be part of CI should be added to this solution?

And inversely, test project which needs special guidance (like editor specific tests) should not be in FSharp.sln - and have to catered for explicitely in the script file, just like it was until now.

Copy link
Contributor Author

@majocha majocha May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I noticed this is not consistent between linux and windows build scripts. The bash script does not run `FSharp.Suite' at all. Probably it could, it runs on net9.0 without problems on Windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave this for another PR. Bash is not my forte :)

@majocha
Copy link
Contributor Author

majocha commented May 12, 2025

Indeed, the overall time of the CI (and not duration of individual jobs) is the same, if not slower.

Yes much of this comes from waiting for the windows vms to get running. There is significant wait time between the build and test phases because of this.

=> I think we can keep each leg do it's own build.

I'm away but I'll restore this when I get back 🙂

@T-Gro T-Gro marked this pull request as draft May 14, 2025 07:38
@T-Gro
Copy link
Member

T-Gro commented May 14, 2025

Indeed, the overall time of the CI (and not duration of individual jobs) is the same, if not slower.

Yes much of this comes from waiting for the windows vms to get running. There is significant wait time between the build and test phases because of this.

=> I think we can keep each leg do it's own build.

I'm away but I'll restore this when I get back 🙂

Thanks @majocha . Please mark it as ready for review onace that is changed, I am eager to have this in 👍

@majocha majocha force-pushed the multi-agent-ci branch 2 times, most recently from ff7fa06 to 88be2a4 Compare May 15, 2025 22:06
@majocha majocha marked this pull request as ready for review May 16, 2025 09:18
@majocha
Copy link
Contributor Author

majocha commented May 16, 2025

This is ready, overall CI run time is now around 48 minutes. It is determined by the slowest -testVs leg, which is unfortunately not parallelized at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: New
Development

Successfully merging this pull request may close these issues.

2 participants