
Conversation

@rahil-c (Contributor) commented on Dec 3, 2025

Description

  • This PR aims to add an integration test for the polaris-hudi integration, following a similar pattern to what was done in SparkDeltaIT

Checklist

  • 🛡️ Don't disclose security issues! (contact [email protected])
  • 🔗 Clearly explained why the changes are needed, or linked related issues: Fixes #
  • 🧪 Added/updated tests with good coverage, or manually tested (and explained how)
  • 💡 Added comments for complex logic
  • 🧾 Updated CHANGELOG.md (if needed)
  • 📚 Updated documentation in site/content/in-dev/unreleased (if needed)

@rahil-c (Contributor, Author) commented on Dec 3, 2025

@flyrain @gh-yzou @singhpk234

@gh-yzou (Contributor) commented on Dec 3, 2025

@rahil-c there is also ongoing work on Spark 4.0 support in #3188; does the Hudi change also work with 4.0 without extra changes?

// TODO: extract a polaris-rest module as a thin layer for
// clients to depend on.
implementation(project(":polaris-core")) { isTransitive = false }
testImplementation("org.apache.hudi:hudi-spark3.5-bundle_${scalaVersion}:1.1.0")
@flyrain (Contributor) commented on Dec 3, 2025

Nit: we put versions in the file pluginlibs.versions.toml; reference it the way line 35 does.
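
For reference, a minimal sketch of that approach, assuming the catalog defined in pluginlibs.versions.toml is exposed to the Kotlin DSL as pluginlibs (the entry name hudi below is hypothetical):

// pluginlibs.versions.toml (hypothetical entry):
//   hudi = "1.1.0"

// build.gradle.kts: reference the catalog entry instead of hard-coding the version
testImplementation(
  "org.apache.hudi:hudi-spark3.5-bundle_${scalaVersion}:${pluginlibs.versions.hudi.get()}"
)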

@flyrain (Contributor) left a comment

Looks great! Thanks @rahil-c !

@github-project-automation (bot) moved this from PRs In Progress to Ready to merge in Basic Kanban Board on Dec 3, 2025
// TODO: extract a polaris-rest module as a thin layer for
// clients to depend on.
implementation(project(":polaris-core")) { isTransitive = false }
testImplementation("org.apache.hudi:hudi-spark3.5-bundle_${scalaVersion}:1.1.0")
A contributor commented:

For the actual Spark project, we don't really intend to introduce any table-format-specific dependency, even for testing. I didn't see any change in the actual Spark project; is there a reason we need this?

exclude("org.slf4j", "jul-to-slf4j")
}

// Add spark-hive for Hudi integration - provides HiveExternalCatalog that Hudi needs
A contributor commented:
@rahil-c could you also update the README to include the Hudi support?

It would be great if we could also have a notebook in the getting-started material to help people onboard with Hudi; we could do that in a follow-up. We should also extend the regression tests to include an actual end-to-end test for Hudi, to avoid any potential break of the feature.
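
As a rough starting point for such an end-to-end test, here is a sketch of the Spark session wiring it might use. The Hudi settings come from Hudi's own Spark SQL documentation; the Polaris catalog class and option values are assumptions modeled on the existing getting-started configuration, not taken from this PR, so they may need adjusting:

import org.apache.spark.sql.SparkSession

// Minimal local session combining Hudi's SQL extension with a Polaris-backed catalog.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("polaris-hudi-smoke")
  // Hudi SQL support, provided by the hudi-spark bundle.
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.sql.extensions", "org.apache.spark.sql.hudi.HoodieSparkSessionExtension")
  // Polaris Spark catalog pointed at a locally running Polaris server (placeholder values).
  .config("spark.sql.catalog.polaris", "org.apache.polaris.spark.SparkCatalog")
  .config("spark.sql.catalog.polaris.uri", "http://localhost:8181/api/catalog")
  .config("spark.sql.catalog.polaris.warehouse", "quickstart_catalog")
  .getOrCreate()

// Round-trip a tiny Hudi table through the Polaris catalog to catch regressions.
spark.sql("CREATE NAMESPACE IF NOT EXISTS polaris.ns")
spark.sql("CREATE TABLE IF NOT EXISTS polaris.ns.hudi_smoke (id INT, name STRING) USING hudi")
spark.sql("INSERT INTO polaris.ns.hudi_smoke VALUES (1, 'a')")
spark.sql("SELECT * FROM polaris.ns.hudi_smoke").show()
spark.stop()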

@adam-christian-software (Contributor) commented on Dec 5, 2025

(quoting @gh-yzou) @rahil-c there is also ongoing work on Spark 4.0 support in #3188; does the Hudi change also work with 4.0 without extra changes?

@gh-yzou - I believe we can merge both this and the Spark 4 work, and then address Hudi support on Spark 4 afterwards. Maybe we can file a GitHub issue for that to move forward?
