Commit 1d575f2

sryza authored and dongjoon-hyun committed
[SPARK-54067][CORE] Improve SparkSubmit to invoke exitFn with the root cause instead of SparkUserAppException
### What changes were proposed in this pull request?

Hides the `SparkUserAppException` and stack trace when a pipeline run fails.

### Why are the changes needed?

I hit this when I ran a pipeline that had no flows:

```
org.apache.spark.SparkUserAppException: User application exited with 1
    at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:127)
    at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:569)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1028)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:226)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:95)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1166)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1175)
    at org.apache.spark.deploy.SparkPipelines$.main(SparkPipelines.scala:42)
    at org.apache.spark.deploy.SparkPipelines.main(SparkPipelines.scala)
```

This is not information that's relevant to the user.

### Does this PR introduce _any_ user-facing change?

Not for anything that's been released.

### How was this patch tested?

Ran the CLI and observed this error was gone and the other output remained the same:

```
> spark-pipelines run --conf spark.sql.catalogImplementation=hive
WARNING: Using incubator modules: jdk.incubator.vector
2025-10-28 13:22:49: Loading pipeline spec from /Users/sandy.ryza/sdp-test/demo2/pipeline.yml...
2025-10-28 13:22:49: Creating Spark session...
WARNING: Using incubator modules: jdk.incubator.vector
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/10/28 13:22:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/conf.py:64: UserWarning: Failed to set spark.sql.catalogImplementation to Some(hive) due to [CANNOT_MODIFY_STATIC_CONFIG] Cannot modify the value of the static Spark config: "spark.sql.catalogImplementation". SQLSTATE: 46110
2025-10-28 13:22:53: Creating dataflow graph...
2025-10-28 13:22:53: Registering graph elements...
2025-10-28 13:22:53: Loading definitions. Root directory: '/Users/sandy.ryza/sdp-test/demo2'.
2025-10-28 13:22:53: Found 2 files matching glob 'transformations/**/*'
2025-10-28 13:22:53: Importing /Users/sandy.ryza/sdp-test/demo2/transformations/example_python_materialized_view.py...
2025-10-28 13:22:53: Registering SQL file /Users/sandy.ryza/sdp-test/demo2/transformations/example_sql_materialized_view.sql...
2025-10-28 13:22:53: Starting run...
25/10/28 13:22:55 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
25/10/28 13:22:55 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore sandy.ryza@10.15.139.54
Traceback (most recent call last):
  File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 413, in <module>
    run(
  File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 340, in run
    handle_pipeline_events(result_iter)
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/pipelines/spark_connect_pipeline.py", line 53, in handle_pipeline_events
    for result in iter:
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1186, in execute_command_as_iterator
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1619, in _execute_and_fetch_as_iterator
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1893, in _handle_error
  File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1966, in _handle_rpc_error
pyspark.errors.exceptions.connect.AnalysisException: [PIPELINE_DATASET_WITHOUT_FLOW] Pipeline dataset `spark_catalog`.`default`.`abc` does not have any defined flows. Please attach a query with the dataset's definition, or explicitly define at least one flow that writes to the dataset. SQLSTATE: 0A000
25/10/28 13:22:57 INFO ShutdownHookManager: Shutdown hook called
25/10/28 13:22:57 INFO ShutdownHookManager: Deleting directory /private/var/folders/1v/dqhbgmt10vl6v3tdlwvvx90r0000gp/T/spark-1214d042-270d-407f-8324-0dfcdf72c38c
```

### Was this patch authored or co-authored using generative AI tooling?

Closes #52770 from sryza/user-app-exited-error.

Authored-by: Sandy Ryza <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
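The fix works because `Option.apply` maps `null` to `None`, while `Some` wraps its argument unconditionally. The trace above has no `Caused by` section — `PythonRunner` throws the exception without attaching a cause — so `Option(e.getCause)` yields `None` and `exitFn` has nothing to print. A minimal sketch of that distinction in plain Scala (the `wrapper` value below is illustrative, not Spark's actual exception):

```scala
// Option.apply maps null to None; Some wraps unconditionally.
val wrapper = new RuntimeException("User application exited with 1") // no cause attached
Some(wrapper)            // Some(java.lang.RuntimeException: ...) -- old code: always passed the wrapper along
Option(wrapper.getCause) // None -- getCause is null here, so there is no stack trace to print
```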
1 parent f37cd07 · commit 1d575f2

1 file changed: +1 −1 lines changed

core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala

Lines changed: 1 addition & 1 deletion
```diff
@@ -1166,7 +1166,7 @@ object SparkSubmit extends CommandLineUtils with Logging {
       super.doSubmit(args)
     } catch {
       case e: SparkUserAppException =>
-        exitFn(e.exitCode, Some(e))
+        exitFn(e.exitCode, Option(e.getCause))
     }
   }
```

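To see the behavioral difference end to end, here is a self-contained sketch; `UserAppException` and `exitFn` below are simplified, hypothetical stand-ins for `SparkUserAppException` and `SparkSubmit`'s `exitFn`, not the real implementations:

```scala
// Self-contained before/after demo. UserAppException and exitFn are simplified,
// hypothetical stand-ins for SparkUserAppException and SparkSubmit's exitFn.
object ExitCauseDemo {
  final class UserAppException(val exitCode: Int, cause: Throwable = null)
      extends RuntimeException(s"User application exited with $exitCode", cause)

  // Stand-in for exitFn: print a stack trace only when a cause is supplied.
  def exitFn(exitCode: Int, exception: Option[Throwable]): Unit = {
    exception.foreach(_.printStackTrace())
    println(s"would exit with code $exitCode")
  }

  def main(args: Array[String]): Unit = {
    val e = new UserAppException(1) // thrown without a cause, matching the trace above
    exitFn(e.exitCode, Some(e))            // before: the wrapper's irrelevant stack trace is printed
    exitFn(e.exitCode, Option(e.getCause)) // after: Option(null) == None, output stays clean
  }
}
```

Running this prints the wrapper's stack trace for the `Some(e)` call only; the `Option(e.getCause)` call exits quietly, which matches the output change verified in the manual test above.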