Commit 1d575f2
[SPARK-54067][CORE] Improve SparkSubmit to invoke exitFn with the root cause instead of SparkUserAppException
### What changes were proposed in this pull request?
Hides the `SparkUserAppException` wrapper and its stack trace when a pipeline run fails: `SparkSubmit` now invokes `exitFn` with the root cause instead of the `SparkUserAppException`.
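The actual one-line change in Spark is not reproduced in this page, so the sketch below is only an illustration of the general pattern under assumed names: a hypothetical `UserAppExited` wrapper and `exitFn` hook stand in for Spark's internals. The idea is that the launcher exits with the wrapped exit code and reports only the root cause, rather than letting the wrapper exception's stack trace propagate to the user.

```scala
// Hypothetical, self-contained sketch; names are illustrative and do not
// correspond to Spark's actual internals.
case class UserAppExited(exitCode: Int, cause: Throwable)
  extends RuntimeException(s"User application exited with $exitCode", cause)

object Launcher {
  // Injectable exit hook so callers (and tests) can observe the exit code
  // instead of terminating the JVM.
  var exitFn: Int => Unit = code => sys.exit(code)

  def runUserApp(app: () => Unit): Unit = {
    try {
      app()
    } catch {
      // Instead of letting the wrapper (and its stack trace) reach the user,
      // surface only the root cause and exit with the wrapped code.
      case e: UserAppExited =>
        Option(e.getCause).foreach(c => System.err.println(c))
        exitFn(e.exitCode)
    }
  }
}
```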
### Why are the changes needed?
I hit this when I ran a pipeline that had no flows:
```
org.apache.spark.SparkUserAppException: User application exited with 1
at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:127)
at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:569)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1028)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:226)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:95)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1166)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1175)
at org.apache.spark.deploy.SparkPipelines$.main(SparkPipelines.scala:42)
at org.apache.spark.deploy.SparkPipelines.main(SparkPipelines.scala)
```
This is not information that's relevant to the user.
### Does this PR introduce _any_ user-facing change?
Not for anything that's been released.
### How was this patch tested?
Ran the CLI and observed that the stack trace was gone while the rest of the output remained the same:
```
> spark-pipelines run --conf spark.sql.catalogImplementation=hive
WARNING: Using incubator modules: jdk.incubator.vector
2025-10-28 13:22:49: Loading pipeline spec from /Users/sandy.ryza/sdp-test/demo2/pipeline.yml...
2025-10-28 13:22:49: Creating Spark session...
WARNING: Using incubator modules: jdk.incubator.vector
Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
25/10/28 13:22:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/conf.py:64: UserWarning: Failed to set spark.sql.catalogImplementation to Some(hive) due to [CANNOT_MODIFY_STATIC_CONFIG] Cannot modify the value of the static Spark config: "spark.sql.catalogImplementation". SQLSTATE: 46110
2025-10-28 13:22:53: Creating dataflow graph...
2025-10-28 13:22:53: Registering graph elements...
2025-10-28 13:22:53: Loading definitions. Root directory: '/Users/sandy.ryza/sdp-test/demo2'.
2025-10-28 13:22:53: Found 2 files matching glob 'transformations/**/*'
2025-10-28 13:22:53: Importing /Users/sandy.ryza/sdp-test/demo2/transformations/example_python_materialized_view.py...
2025-10-28 13:22:53: Registering SQL file /Users/sandy.ryza/sdp-test/demo2/transformations/example_sql_materialized_view.sql...
2025-10-28 13:22:53: Starting run...
25/10/28 13:22:55 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
25/10/28 13:22:55 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore sandy.ryza10.15.139.54
Traceback (most recent call last):
File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 413, in <module>
run(
File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 340, in run
handle_pipeline_events(result_iter)
File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/pipelines/spark_connect_pipeline.py", line 53, in handle_pipeline_events
for result in iter:
File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1186, in execute_command_as_iterator
File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1619, in _execute_and_fetch_as_iterator
File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1893, in _handle_error
File "/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py", line 1966, in _handle_rpc_error
pyspark.errors.exceptions.connect.AnalysisException: [PIPELINE_DATASET_WITHOUT_FLOW] Pipeline dataset `spark_catalog`.`default`.`abc` does not have any defined flows. Please attach a query with the dataset's definition, or explicitly define at least one flow that writes to the dataset. SQLSTATE: 0A000
25/10/28 13:22:57 INFO ShutdownHookManager: Shutdown hook called
25/10/28 13:22:57 INFO ShutdownHookManager: Deleting directory /private/var/folders/1v/dqhbgmt10vl6v3tdlwvvx90r0000gp/T/spark-1214d042-270d-407f-8324-0dfcdf72c38c
```
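The verification above was manual. As a hypothetical complement (not part of this patch), the exit behavior of the `Launcher` sketch shown earlier could be checked by swapping in a recording exit function:

```scala
// Hypothetical check built on the Launcher sketch above: swap in a recording
// exit function and assert that only the wrapped exit code surfaces.
object LauncherCheck extends App {
  var observed: Option[Int] = None
  Launcher.exitFn = code => observed = Some(code)

  Launcher.runUserApp { () =>
    throw UserAppExited(1, new IllegalStateException("pipeline dataset has no flows"))
  }

  assert(observed.contains(1), s"expected exit code 1, got $observed")
  println("ok: exit code surfaced without the wrapper's stack trace")
}
```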
### Was this patch authored or co-authored using generative AI tooling?
Closes #52770 from sryza/user-app-exited-error.
Authored-by: Sandy Ryza <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
1 parent f37cd07 commit 1d575f2
File tree
1 file changed: core/src/main/scala/org/apache/spark/deploy (+1, -1)
Lines changed: 1 addition & 1 deletion
[Diff table not captured; the hunk covers lines 1166-1172, with the single changed line at 1169 (one deletion, one addition).]