[SPARK-56744][INFRA] Document test base class hierarchy in AGENTS.md#55707
Open
zhengruifeng wants to merge 10 commits intoapache:masterfrom
Open
[SPARK-56744][INFRA] Document test base class hierarchy in AGENTS.md#55707zhengruifeng wants to merge 10 commits intoapache:masterfrom
zhengruifeng wants to merge 10 commits intoapache:masterfrom
Conversation
Add a "Test Base Classes" section to AGENTS.md describing the layered SparkFunSuite -> PlanTest -> QueryTest -> SharedSparkSession hierarchy, when to pick each base, common redundant-mixin patterns to avoid, and the linearization gotcha when the first item in an `extends` clause is a pure trait without the SparkFunSuite class chain. Generated-by: Claude opus-4-7
…sion) in AGENTS.md Clarify that QueryTest declares `spark` abstractly and that `SharedSparkSession` is one of several session-providing traits in the repo. List the common providers (`SharedSparkSession`, `TestHiveSingleton`, `RemoteSparkSession`) and note the rare manual-override pattern. Generated-by: Claude opus-4-7
…ENTS.md Add a "Base chain" column to the session provider table and explain that RemoteSparkSession directly extends AnyFunSuite (not SparkFunSuite), so Connect client suites must combine it with QueryTest to recover the SparkFunSuite-only features (per-test timeout, gridTest, retry, log-capture). Generated-by: Claude opus-4-7
…S.md Add SparkTestSuite as the bottom of the test base class chain — it holds the common Spark test functionality (thread audit, fixed timezone/locale, withTempDir, withLogAppender, checkError) and is style-agnostic. SparkFunSuite is AnyFunSuite + SparkTestSuite. Mention the alternative-style demonstration suites under core/.../*SparkTestSuite.scala and update the linearization gotcha to note that SparkTestSuite is itself a pure trait. Generated-by: Claude opus-4-7
…TS.md Drop the RemoteSparkSession row and follow-up paragraph from the session provider section. The trait does not formally extend SparkSessionProvider and serves a separate audience (Connect client tests under sql/connect/client/jvm), so it is out of scope for this guide. Generated-by: Claude opus-4-7
QueryTest extends SparkFunSuite with QueryTestBase with PlanTest, so PlanTest is a parent of QueryTest. Indent QueryTest one level deeper to reflect the actual hierarchy and show QueryTest as PlanTest + QueryTestBase (PlanTest already brings the SparkFunSuite chain). Generated-by: Claude opus-4-7
Add a parallel diagram for the style-agnostic `*Base` helper traits (`PlanTestBase` -> `QueryTestBase` -> `SharedSparkSessionBase`) and note that `PlanTest` / `QueryTest` / `SharedSparkSession` are the FunSuite-flavored bundles built on top of them, analogous to `SparkFunSuite` = `AnyFunSuite` + `SparkTestSuite`. Generated-by: Claude opus-4-7
Drop the "manual session override", helper-trait combination, and "common mistakes" subsections from the Test Base Classes section to keep the guide focused on the layered base hierarchy and the linearization gotcha. Generated-by: Claude opus-4-7
Reorder the Test Base Classes section so the style-agnostic helper traits (SparkTestSuite, PlanTestBase, QueryTestBase, SharedSparkSessionBase) are introduced first, followed by the class-bearing FunSuite-style bases that bundle each helper with AnyFunSuite. The helpers are where the actual test functionality lives, so leading with them makes the relationship clearer. Generated-by: Claude opus-4-7
The section covers Scala test bases only (PySpark tests are handled separately under Build and Test), so the more specific name avoids confusion. Generated-by: Claude opus-4-7
HyukjinKwon
approved these changes
May 6, 2026
cloud-fan
reviewed
May 7, 2026
| SparkTestSuite (core; thread audit, fixed timezone/locale, withTempDir, withLogAppender, checkError) | ||
| PlanTestBase (sql/catalyst; plan-comparison helpers — comparePlans, normalizePlan) | ||
| <- QueryTestBase (sql/core; adds SQL/DataFrame helpers + abstract `spark` via `SparkSessionProvider`) | ||
| <- SharedSparkSessionBase (sql/core; provides the actual `TestSparkSession`) |
Contributor
There was a problem hiding this comment.
hmmm do we need to mention then? Inside Spark we should always pick AnyFunSuite style for consistency
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Add a
Scala Test Base Classessection toAGENTS.mdthat documents the layered Scala test base hierarchy in this repo and how to pick a base class for a new test suite.The actual test helpers live in style-agnostic traits (no ScalaTest style committed):
The class-bearing test bases bundle each helper with
AnyFunSuite(the default ScalaTest style) so concrete suites can extend them directly:The new section also includes:
QueryTest, full SQL with session) to the right base.QueryTestdeclaresspark: SparkSessionabstractly viaSparkSessionProvider, plus a session-provider table listing the two formalSparkSessionProviderimplementations in the repo:SharedSparkSession(sql/core) andTestHiveSingleton(sql/hive).extendsclause must transitively extend a class. Pure traits (SparkTestSuite,*ErrorsBase,*Helper) cannot be put first.CLAUDE.mdis a symlink toAGENTS.md, so this change is picked up by both AI agent toolchains.Why are the changes needed?
Picking the wrong test base class (e.g. extending
QueryTestdirectly when a session is needed, orSparkFunSuitewhenPlanTestwould do) is a common stumble when adding new Scala test suites. The information is currently spread across the source ofSparkTestSuite,SparkFunSuite,PlanTest,QueryTest, andSharedSparkSession(and the corresponding*Basetraits), with no single place that summarizes when to use which. Documenting it inAGENTS.mdgives both contributors and AI coding agents a quick reference.Does this PR introduce any user-facing change?
No. Documentation-only change to a developer/agent guide file.
How was this patch tested?
N/A. Documentation-only change; no code or tests are affected.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude opus-4-7