Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load test classes with runtime classloader #34681

Merged
merged 1 commit into from
Mar 25, 2025

Conversation

holly-cummins
Copy link
Contributor

@holly-cummins holly-cummins commented Jul 11, 2023

Bugs fixed by this PR

Bugs created by this PR (doh!)

Outstanding issues/breaking changes (input to release notes)

What problem is this solving?

We see a lot of problems caused by the fact that we load test classes with the deployment classloader, and then intercept the execution and reload the classes with the runtime classloader. Although the new test is loaded with the runtime classloader, its arguments are still loaded with the system classloader. To work around that we sometimes need to clone the arguments by serializing and de-serializing. This was always brittle and no longer worked at all on Java 17+ (until #40601 fixed that). We also see issues because parts of the test infrastructure see the 'wrong' instance of the class. See, for example, quarkiverse/quarkus-pact#73 and #22611.

We have several feature raised against the JUnit team to allow us more control over classloading. The first of this features was introduced in JUnit 5.10, and allows an interceptor to be registered before any tests are launched. This interceptor can set a thread context classloader, which is then used by JUnit to load tests.

My experiments with this feature were thoroughly disappointing. It turns out, setting a TCCL early in the test lifecycle doesn't really help us, because we overwrite our 'early' TCCL with other TCCLs later in the test lifecycle. The following diagram shows some of the places we set the TCCL.

life-of-a-tccl-2023-02-28-1714

Source: https://excalidraw.com/#json=HFPHIKx8wv0iiyXgNhAzw,8IlEmPcMRvm9pfCGShdClQ

What if we just used one of the existing interception points to set the 'right' classloader, before tests are loaded? If the tests were loaded with our preferred classloader, we wouldn’t need to intercept the factory. Loading the tests with the runtime classloader needs us to move some of our app initialisation earlier in the lifecycle, but I don't think there's any fundamental barrier to this. (We would have had to do this with a solution based on the new JUnit Launcher Interceptor anyway.)

The logic for starting Quarkus needs to be in the test discovery phase, rather than in the extension. This allows us to create the runtime classloader before the test is loaded. The JUnitTest runner already knows about the Quarkus Extension, so it’s only a small extra bit of knowledge to do some of the startup actions.

This only gets us part of the way, though. @stuartwdouglas raised the point that if we have to set only a single classloader, that's not very flexible, because we have a runtime classloader for each test profile. A Quarkus test run doesn't just use one classloader, it uses several. Every resource/unique profile triggers an app relaunch, which means a new classloader. What I've done to handle this is create a FacadeClassLoader. It takes the classloading requests, and then either routes them on to the quarkus application (for vanilla @QuarkusTests), or, if there's a profile/resource, it makes a new app + classloader and sends the request to that.

What we used to before was load a throwaway copy of the the test, pass it to JUnit discovery, let JUnit launch it, and then intercept the execution, figure out what profiles+resources the test declares, create a quarkus app with that information, start the quarkus app, reload the test with the runtime classloader of the quarkus app (and clone its parameters), and execute the test.

The new model is load a throwaway copy of the the test, figure out what profiles+resources the test declares, create a quarkus app with that information, reload the test with the runtime classloader of the quarkus app, pass the ‘right’ class to JUnit discovery, let JUnit launch it, and then intercept the execution, start the quarkus app, and execute the test.

One fundamental limitation of "load tests with the classloader used to execute them" is that a single test cannot run with multiple classloaders, which means it cannot support multiple profiles. We know some people do use this feature, but we also know there have been suggestions that we drop support for it, since it is complex to support (#45349). There is an easy workaround, which is to use one test per profile.

Thoughts on serialization and cloning

A big initial goal of this PR was to get rid of the xstream serialization, since it didn't work on Java 17+. #40601 fixes this issue by switching to use the JBoss marshaller for serialization. Does that mean this work item isn't needed any more? No, although it does mean its benefits are smaller. Here's why it's still useful:

  • Even with the JBoss serializer, higher-level test infrastructure (such as @TestTemplate) does not see Quarkus bytecode transformations done by extensions
  • Although the JBoss serializer works a lot better than xstream with the Java 17 access restrictions (as in, it works), serialization may continue to be a challenge going forward. See https://bugs.openjdk.org/browse/JDK-8164908 for some context. Most serializers use sun.misc.unsafe, but unsafe is shrinking. It seems certain the JDK team will have to come up with some solution and API to open up access for serializers, but the final design could have security implications (perhaps opening up access in a blanket way), or performance implications (reflection fun), or user experience implications (a need to manually set flags such as --enable-serialization?). If we can avoid serialization, we avoid all that.

Todo before this merges

Todo after this merges

  • Monitor ecosystem CI for new failures; we do not have coverage of every code path in our current suite (for example, the tests added in Continuous Testing: add support for build system like test selection #46389 broke with this change, by going down an uncovered code path)
  • Address the 'do not start dev services in augmentation' issue
  • Removal of all dead code in QuarkusTestExtension
  • Consolidation of duplicated code between AppMakerHelper and TestSupport
  • Tests for interaction with QuarkusProdModeTest, particularly for the tests in Add tests which exercise more complex JUnit extensions #35124
  • Continued improvements to fix hacky config usage in existing tests (such as relying on mutable system properties, etc)
  • Consolidation + streamlining of logic in the test order (unify around a key-based approach?)
  • Adding tests based on various outstanding issues which do not yet have reproducers in our test suite

@holly-cummins holly-cummins marked this pull request as draft July 11, 2023 14:13
@quarkus-bot quarkus-bot bot added area/dependencies Pull requests that update a dependency file area/devtools Issues/PR related to maven, gradle, platform and cli tooling/plugins area/platform Issues related to definition and interaction with Quarkus Platform area/testing labels Jul 11, 2023
@quarkus-bot quarkus-bot bot added the area/arc Issue related to ARC (dependency injection) label Aug 30, 2023
@holly-cummins holly-cummins marked this pull request as ready for review August 30, 2023 18:56
@holly-cummins holly-cummins changed the title Load test classes with runtime classloader Load test classes with runtime classloader (draft) Aug 30, 2023
@quarkus-bot

This comment has been minimized.

@quarkus-bot

This comment has been minimized.

@quarkus-bot quarkus-bot bot added the area/gradle Gradle label Aug 31, 2023
@holly-cummins holly-cummins changed the title Load test classes with runtime classloader (draft) Load test classes with runtime classloader Aug 31, 2023
@holly-cummins holly-cummins marked this pull request as draft August 31, 2023 21:03
@quarkus-bot

This comment has been minimized.

@The-Funk
Copy link

Still watching this one. I've created a minimal not-working-example to see if this fixes it. :)

@holly-cummins
Copy link
Contributor Author

Still watching this one. I've created a minimal not-working-example to see if this fixes it. :)

Still going on it ... :)

The profile support in normal mode turned out to be a bit thorny, so looking at that now. I haven't checked for a while, but at one point I had reproducers for three of the test-classloading-related defects. One was passing, but two (annoyingly) were failing. If your reproducer is shareable, I'm happy to take it and include it in what I'm checking, or to add it into the test suites, if it's a gap in what we're testing now. (It'd be worth checking what I've added in #35124 to see if one of those covers your scenario, too.)

@holly-cummins holly-cummins marked this pull request as ready for review March 22, 2024 18:31
@holly-cummins holly-cummins changed the title Load test classes with runtime classloader Draft: Load test classes with runtime classloader Mar 22, 2024
@geoand
Copy link
Contributor

geoand commented Mar 20, 2025

I'm favor of merging (and dealing with any consequences in main) at this point

@aloubyansky
Copy link
Member

@holly-cummins could you rebase please? There is a conflict

@gsmet
Copy link
Member

gsmet commented Mar 21, 2025

@holly-cummins I think you can probably squash now as I think the idea is to merge this ASAP.

Co-Authored-By: Alexey Loubyansky <[email protected]>
Co-Authored-By: Roberto Cortez <[email protected]>
@geoand
Copy link
Contributor

geoand commented Mar 21, 2025

+1

Copy link

quarkus-bot bot commented Mar 21, 2025

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit ac829ad.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

You can consult the Develocity build scans.


Flaky tests - Develocity

⚙️ JVM Integration Tests - JDK 21

📦 integration-tests/hibernate-orm-panache

io.quarkus.it.panache.defaultpu.PanacheFunctionalityTest.testBug7102 - History

  • expected: <jozo> but was: <pero> - org.opentest4j.AssertionFailedError
org.opentest4j.AssertionFailedError: expected: <jozo> but was: <pero>
	at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
	at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
	at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
	at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
	at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1145)
	at io.quarkus.it.panache.defaultpu.PanacheFunctionalityTest.testBug7102(PanacheFunctionalityTest.java:220)

io.quarkus.it.panache.defaultpu.PanacheFunctionalityTest.testPanacheFunctionality - History

  • 1 expectation failed. Response body doesn't match expectation. Expected: is "OK" Actual: {"details":"Error id ca5710be-f48b-4390-9b30-986985d15479-1, org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <1> but was: <0>","stack":"org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <1> but was: <0>\n\tat org.jboss.resteasy.core.ExceptionHandler.handleApplicationException(ExceptionHandler.java:107)\n\tat org.jboss.resteasy.core.ExceptionHandler.handleException(ExceptionHandler.java:344)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.writeException(SynchronousDispatcher.java:205)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:452)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$invokePropagateNotFound$6(SynchronousDispatcher.java:275)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:154)\n\tat org.jboss.res... - java.lang.AssertionError
java.lang.AssertionError: 
1 expectation failed.
Response body doesn't match expectation.
Expected: is "OK"
  Actual: {"details":"Error id ca5710be-f48b-4390-9b30-986985d15479-1, org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <1> but was: <0>","stack":"org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <1> but was: <0>\n\tat org.jboss.resteasy.core.ExceptionHandler.handleApplicationException(ExceptionHandler.java:107)\n\tat org.jboss.resteasy.core.ExceptionHandler.handleException(ExceptionHandler.java:344)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.writeException(SynchronousDispatcher.java:205)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:452)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$invokePropagateNotFound$6(SynchronousDispatcher.java:275)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.ja...
  • 1 expectation failed. Response body doesn't match expectation. Expected: is "OK" Actual: {"details":"Error id 955fdfdb-9976-4071-9d63-ae9e01a5cd76-1, org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <0> but was: <1>","stack":"org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <0> but was: <1>\n\tat org.jboss.resteasy.core.ExceptionHandler.handleApplicationException(ExceptionHandler.java:107)\n\tat org.jboss.resteasy.core.ExceptionHandler.handleException(ExceptionHandler.java:344)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.writeException(SynchronousDispatcher.java:205)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:452)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$invokePropagateNotFound$6(SynchronousDispatcher.java:275)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:154)\n\tat org.jboss.res... - java.lang.AssertionError
java.lang.AssertionError: 
1 expectation failed.
Response body doesn't match expectation.
Expected: is "OK"
  Actual: {"details":"Error id 955fdfdb-9976-4071-9d63-ae9e01a5cd76-1, org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <0> but was: <1>","stack":"org.jboss.resteasy.spi.UnhandledException: org.opentest4j.AssertionFailedError: expected: <0> but was: <1>\n\tat org.jboss.resteasy.core.ExceptionHandler.handleApplicationException(ExceptionHandler.java:107)\n\tat org.jboss.resteasy.core.ExceptionHandler.handleException(ExceptionHandler.java:344)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.writeException(SynchronousDispatcher.java:205)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:452)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$invokePropagateNotFound$6(SynchronousDispatcher.java:275)\n\tat org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.ja...

Copy link
Contributor

@brunobat brunobat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked out the PR and run the OTel related tests for execution time differences. Nothing meaningful was spotted.

Copy link
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Release the KRAAAKEN!

@gsmet gsmet merged commit 90fc415 into quarkusio:main Mar 25, 2025
56 checks passed
@quarkus-bot quarkus-bot bot added this to the 3.22 - main milestone Mar 25, 2025
gemmellr added a commit to amqphub/quarkus-qpid-jms that referenced this pull request Mar 26, 2025
@gemmellr
Copy link
Contributor

Just an FYI note for anyone else later, I had to update my extensions tests to Surefire 3.x to fix a break which I'm guessing began due to this change (apologies if not!).

The extension test run against Quarkus main started failing last night, with:

Caused by: java.lang.RuntimeException: Internal error. The test class class org.amqphub.quarkus.qpid.jms.it.QpidJmsReceiveTest should have been loaded with a QuarkusClassLoader, but instead it was loaded with jdk.internal.loader.ClassLoaders$AppClassLoader@2cdf8d8a. This is caused by the FacadeClassLoader not correctly identifying this class as a QuarkusTest.

The 'same tests' (repackaged) worked ok in last nights quarkus-platform run against Quarkus main however. Checking for differences, I noticed that the platform run was using Surefire 3.0.0 whilst the extension builds own runs were still on 2.22.1. Updating to the latest surefire and failsafe (3.5.2) the extension tests now work ok against quarkus main again.

@holly-cummins
Copy link
Contributor Author

Just an FYI note for anyone else later, I had to update my extensions tests to Surefire 3.x to fix a break which I'm guessing began due to this change (apologies if not!).

The extension test run against Quarkus main started failing last night, with:

Caused by: java.lang.RuntimeException: Internal error. The test class class org.amqphub.quarkus.qpid.jms.it.QpidJmsReceiveTest should have been loaded with a QuarkusClassLoader, but instead it was loaded with jdk.internal.loader.ClassLoaders$AppClassLoader@2cdf8d8a. This is caused by the FacadeClassLoader not correctly identifying this class as a QuarkusTest.

The 'same tests' (repackaged) worked ok in last nights quarkus-platform run against Quarkus main however. Checking for differences, I noticed that the platform run was using Surefire 3.0.0 whilst the extension builds own runs were still on 2.22.1. Updating to the latest surefire and failsafe (3.5.2) the extension tests now work ok against quarkus main again.

Thanks for the tip, @gemmellr! I noticed several ecosystem CI runs started failing with this change (which isn't too surprising, given how big the change is). Investigating the failures is today's job, but I think you might have saved me a lot of trouble.

I'm surprised the surefire version would make a difference. I wonder what's different in surefire? One possibility is how the classpath gets passed to the FacadeClassLoader. I might be able to make my stuff tolerate Surefire 2, I will investigate.

@gemmellr
Copy link
Contributor

gemmellr commented Mar 26, 2025

Thanks for the tip, @gemmellr! I noticed several ecosystem CI runs started failing with this change (which isn't too surprising, given how big the change is). Investigating the failures is today's job, but I think you might have saved me a lot of trouble.

I'm surprised the surefire version would make a difference. I wonder what's different in surefire? One possibility is how the classpath gets passed to the FacadeClassLoader. I might be able to make my stuff tolerate Surefire 2, I will investigate.

Np, I was also surprised so thought it worth a mention.

I was also pleased though when I realised that was the case; it meant it was an easy fix, and also doesnt need me to release a new version for the next platform hehe.

I had skimmed the description here and couldn't see anything suggesting a break should really be expected in my case, or anything specific I could even change that would fix whatever the issue was. It was only then I went looking for how much broke in the platform build that I was then surprised to find the quarkus-qpid-jms tests had actually worked there, which pretty quickly pointed out the difference/solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/arc Issue related to ARC (dependency injection) area/config area/core area/dependencies Pull requests that update a dependency file area/devtools Issues/PR related to maven, gradle, platform and cli tooling/plugins area/documentation area/flyway area/gradle Gradle area/infra-automation anything related to CI, bots, etc. that are used to automated our infrastructure area/kafka area/kotlin area/kubernetes area/maven area/oidc area/platform Issues related to definition and interaction with Quarkus Platform area/testing area/vertx kind/bugfix release/noteworthy-feature triage/flaky-test
Projects
8 participants