[SPARK-52243][CONNECT] Add NERF support for schema-related InvalidPlanInput errors #50997

Open
wants to merge 1 commit into
base: master

Conversation

heyihong
Contributor

@heyihong heyihong commented May 23, 2025

What changes were proposed in this pull request?

This PR adds NERF (New Error Framework) support for schema-related InvalidPlanInput errors in Spark Connect. The changes include:

  1. Added new error conditions in error-conditions.json for schema validation:

    • INVALID_OUTPUT_SCHEMA_TYPE_FOR_TRANSFORM_WITH_STATE_IN_PANDAS_NON_STRUCT
    • INVALID_PYTHON_UDTF_RETURN_TYPE_NON_STRUCT
    • INVALID_SCHEMA_TYPE_NON_STRUCT
    • INVALID_STATE_SCHEMA_TYPE_FOR_FLAT_MAP_GROUPS_WITH_STATE_NON_STRUCT
  2. Refactored error handling in InvalidInputErrors.scala to use NERF:

    • Added helper function invalidPlanInput for consistent error message generation
    • Updated schema validation error methods to use NERF error conditions
    • Made quoteByDefault method accessible to other packages
  3. Added a test suite InvalidInputErrorsSuite.scala to verify error handling
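
The pattern described in the list above can be sketched roughly as follows. This is a self-contained model, not the PR's code: `ErrorCondition`, the `conditions` map, the `InvalidPlanInput` class defined here, `render`, and `quote` are hypothetical stand-ins, and the message template and SQLSTATE value are illustrative placeholders rather than the actual contents of error-conditions.json. Only the condition name and the helper name `invalidPlanInput` come from the description.

```scala
// Self-contained model of the pattern; none of these types are Spark's real classes.
object NerfSketch {
  // Hypothetical stand-in for one entry in error-conditions.json.
  final case class ErrorCondition(messageTemplate: String, sqlState: String)

  // Hypothetical registry; the message text and SQLSTATE are placeholders (assumptions).
  val conditions: Map[String, ErrorCondition] = Map(
    "INVALID_SCHEMA_TYPE_NON_STRUCT" -> ErrorCondition(
      "Invalid schema type. Expected a struct type, but got <dataType>.",
      "42K09"))

  // Hypothetical exception that keeps the condition, SQLSTATE, and parameters as fields.
  final class InvalidPlanInput(
      val condition: String,
      val sqlState: String,
      val parameters: Map[String, String],
      message: String)
    extends Exception(message)

  // Substitute each <name> placeholder in the template with its parameter value.
  def render(template: String, parameters: Map[String, String]): String =
    parameters.foldLeft(template) { case (msg, (k, v)) => msg.replace(s"<$k>", v) }

  // Helper in the spirit of invalidPlanInput: build the exception from a condition name.
  def invalidPlanInput(condition: String, parameters: Map[String, String]): InvalidPlanInput = {
    val c = conditions(condition)
    val text = s"[$condition] ${render(c.messageTemplate, parameters)} SQLSTATE: ${c.sqlState}"
    new InvalidPlanInput(condition, c.sqlState, parameters, text)
  }

  // Hypothetical quoting helper: wrap a value in double quotes for display.
  def quote(s: String): String = "\"" + s + "\""

  // One schema validation error method routed through the shared helper.
  def invalidSchemaTypeNonStruct(dataType: String): InvalidPlanInput =
    invalidPlanInput("INVALID_SCHEMA_TYPE_NON_STRUCT", Map("dataType" -> quote(dataType)))

  def main(args: Array[String]): Unit =
    println(invalidSchemaTypeNonStruct("INT").getMessage)
}
```

Routing every error method through one helper keeps the condition name, parameters, and SQLSTATE available as structured fields instead of burying them in a hand-written message string.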

Why are the changes needed?

These changes are needed to:

  1. Standardize error reporting across Spark Connect using the NERF framework
  2. Improve error messages with better parameterization and consistency
  3. Ensure proper SQL state codes are associated with schema-related errors
  4. Provide clearer error messages for users when schema validation fails

Does this PR introduce any user-facing change?

No

How was this patch tested?

build/sbt "connect/testOnly *InvalidInputErrorsSuite"

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor 0.50.5 (Universal)

@heyihong heyihong force-pushed the SPARK-52243 branch 12 times, most recently from f2d5abb to 01fddeb Compare May 25, 2025 15:57
@HyukjinKwon HyukjinKwon changed the title [SPARK-52243] Add NERF support for schema-related InvalidPlanInput errors [SPARK-52243][CONNECT] Add NERF support for schema-related InvalidPlanInput errors May 26, 2025

@@ -3432,6 +3444,12 @@
],
"sqlState" : "42602"
},
"INVALID_SCHEMA_TYPE_NON_STRUCT" : {
Contributor

Shall we use this error condition everywhere? All of those features need to specify a schema, and INVALID_SCHEMA_TYPE_NON_STRUCT is general enough to fit all of them.

Contributor

The error context is a better place to indicate where the error happened.
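
One way to read this suggestion is sketched below, under loud assumptions: `SchemaError`, its `context` field, and `nonStructSchema` are hypothetical names invented for illustration, and the message wording is a placeholder. It does not use Spark's real error or query-context classes; it only shows a single general condition reused across features, with the call site carried separately.

```scala
// Hypothetical sketch: one general condition, with the call site carried as context.
object ContextSketch {
  // Not a Spark class; "context" here is a plain string naming the feature.
  final case class SchemaError(condition: String, dataType: String, context: String) {
    def message: String =
      s"[$condition] Expected a struct type, but got $dataType. Context: $context"
  }

  // Every feature reports the same condition; only the context differs.
  def nonStructSchema(dataType: String, context: String): SchemaError =
    SchemaError("INVALID_SCHEMA_TYPE_NON_STRUCT", dataType, context)

  def main(args: Array[String]): Unit = {
    println(nonStructSchema("INT", "Python UDTF return type").message)
    println(nonStructSchema("STRING", "flatMapGroupsWithState state schema").message)
  }
}
```

The trade-off is fewer condition names to maintain, at the cost of callers filtering on context rather than on the condition itself.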
