Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Thread per test class" via MultiEnvTestEngine #9453

Merged
merged 2 commits into from
Sep 6, 2024

Conversation

snazy
Copy link
Member

@snazy snazy commented Aug 30, 2024

See #9441 for a complete description of the underlying problem. TL;DR is: ThreadLocals from various 3rd party libraries leak into the single Test worker thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs.

This change updates the MultiEnvTestEngine by using the new ThreadPerTestClassExecutionExecutorService and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. The logic to distinguish multi-env from non-multi-env tests via MultiEnvExtensionRegistry.registerExtension() via test discovery is not perfect (but good enough), it can add multi-env tests to the non-multi-env tests, so an additional check is needed there.

Since each test class runs on "its own thread", the ThreadLocals are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed.

The bump of the max-heap size for test workers is also reduced back to 2g (was changed in #9433).

Fixes #9441

Copy link
Contributor

@adutra adutra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I think we should wait for @dimas-b review as well.

import org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutorService;
import org.junit.platform.engine.support.hierarchical.ThrowableCollector;

public class ThreadPerTestClassExecutionTestEngine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor drawback here is that this new engine is only used as a delegate to MultiEnvTestEngine, never as a "top-level" engine. IOW, only modules that include :nessie-multi-env-test-engine will benefit from the improvement.

Granted, most "problematic" modules have a dependency on the multi-env engine anyways. But i was thinking of the kubernetes operator PR: this module does not have multi-env tests, so it wouldn't benefit from these improvements as it is today. But maybe it's just a matter of adding a dependency to :nessie-multi-env-test-engine there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or use JUnit, if junit-team/junit5#3939 gets merged and released.

/** Returns {@code junit-jupiter-engine} as the artifact ID. */
@Override
public Optional<String> getArtifactId() {
return Optional.of("junit-jupiter-engine");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is correct. Do we have to use this ID to make stuff work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean: would a Nessie artifact ID work here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ID doesn't really matter. And with the Junit-PR this would go away entirely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the Junit PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an issue ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See projectnessie#9441 for a complete description of the underlying problem. TL;DR is: `ThreadLocal`s from various 3rd party libraries leak into the single `Test worker` thread that runs all the tests, resulting in TL objects/suppliers from the various Quarkus test class loaders, eventually leading to nasty OOMs.

This change updates the `MultiEnvTestEngine` by using the new `ThreadPerTestClassExecutionExecutorService` and also "assimilate" really all tests, even the non-multi-env tests, so that those also run on a thread per test-class. The logic to distinguish multi-env from non-multi-env tests via `MultiEnvExtensionRegistry.registerExtension()` via test discovery is not perfect (but good enough), it can add multi-env tests to the non-multi-env tests, so an additional check is needed there.

Since each test class runs on "its own thread", the `ThreadLocal`s are registered on that thread. Once the test class finishes, the thread is disposed and its thread locals become eligible for garbage collection, which is what is needed.

The bump of the max-heap size for test workers is also reduced back to 2g (was changed in projectnessie#9433).

Fixes projectnessie#9441
@snazy snazy force-pushed the multi-env-thread-per-test-class branch from 79eb782 to 7789b10 Compare September 4, 2024 15:32
@snazy snazy requested review from dimas-b and adutra September 5, 2024 14:16
// Also execute all other tests via the MultiEnv test engine to get the "thread per
// test-class" behavior.
registry()
.probablyNotMultiEnv()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why collect test descriptors in an "extension" registry? Why not reuse preliminaryResult here and filter out all multi-env tests?

We already rediscover tests for each env. ID, so we're probably not saving much time by collecting test descriptors during the extension registration process... On the other hand using a discovery result directly would be much more clear... WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result is sadly not correct that way. We can make it "perfect" in a follow-up.

@snazy snazy merged commit 862bc90 into projectnessie:main Sep 6, 2024
16 checks passed
@snazy snazy deleted the multi-env-thread-per-test-class branch September 6, 2024 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix class/resource leak in :nessie-quarkus:test
3 participants