Commit a8cb80a
[SPARK-56482][SQL][FOLLOWUP] Simplify UnionExec codegen and narrow partition-index gate
### What changes were proposed in this pull request?
Followup to SPARK-56482 (#55425). Two groups of changes to `UnionExec`'s whole-stage codegen path.
**Code cleanup:**
- Hoist `metricTerm("numOutputRows")` to `doProduce` and store it on the instance. `doConsume` runs once per child during emission, so the previous code registered the same metric N times in `references[]` for an N-child Union; now once.
- Drop the dead `assert` in `perChildProjections` and the duplicate `allChildOutputDataTypesMatch` lazy val. The dataType comparison now has a single source of truth in the `type-mismatch` branch of the gate.
- Inline the one-shot `hasAnyPartitionIndexDependentDescendant` lazy val.
- Drop the unreachable `case other` in the `UnionPartition` match and replace with `asInstanceOf`. `unionedInputRDD` is built as `new UnionRDD(...)` two lines up, and `getPartitions` only ever returns `UnionPartition[_]`.
- Factor out an `isPlainUnion` helper, used by both the gate and `doExecute`, so the invariant "codegen path matches `sparkContext.union` semantics" lives in one place.
- Bind `currentPartitionIndexVar` to the array-deref expression `((int[]) refs[K])[partitionIndex]` directly. An earlier revision hoisted this to a `childLocalIdx` local at helper entry, but `SampleExec.doConsume` reads `currentPartitionIndexVar` from inside an `addMutableState` initializer, which is emitted into the state-init function — outside the per-child helper — so the local was not in scope and the generated code failed to compile. The expression form resolves in any emission scope (helper parameter or `BufferedRowIterator` field).
- Drop the `try/finally` around codegen state restoration. A codegen failure aborts the whole stage, so nothing reads the restored state on the failure path and the `finally` adds no protection.
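
To illustrate the `currentPartitionIndexVar` scoping issue from the list above, here is a minimal Scala sketch. It is not code from this patch: `Scope`, `Binding`, `compilesIn`, and the identifier sets are hypothetical simplifications that model an emission scope as the set of Java names visible in it.

```scala
object PartitionIndexBindingSketch {
  // Hypothetical model: an emission scope is the set of Java identifiers visible in it.
  final case class Scope(name: String, visible: Set[String])

  // Per-child helper: sees `partitionIndex` (as a parameter), `refs`,
  // and any local declared at helper entry, such as `childLocalIdx`.
  val childHelper: Scope =
    Scope("per-child helper", Set("partitionIndex", "refs", "childLocalIdx"))

  // State-init function: sees only fields of the generated iterator class,
  // so the helper-local `childLocalIdx` is not visible here.
  val stateInit: Scope =
    Scope("state-init function", Set("partitionIndex", "refs"))

  // A binding is the Java snippet plus the identifiers it needs to resolve.
  final case class Binding(javaExpr: String, needs: Set[String])

  // Earlier revision: bind to a local hoisted at helper entry.
  val localForm: Binding = Binding("childLocalIdx", Set("childLocalIdx"))

  // This patch: bind to the array-deref expression directly.
  def exprForm(k: Int): Binding =
    Binding(s"((int[]) refs[$k])[partitionIndex]", Set("refs", "partitionIndex"))

  // A snippet "compiles" in a scope when every identifier it needs is visible there.
  def compilesIn(b: Binding, s: Scope): Boolean = b.needs.subsetOf(s.visible)

  def main(args: Array[String]): Unit = {
    for (b <- Seq(localForm, exprForm(0)); s <- Seq(childHelper, stateInit)) {
      println(s"${b.javaExpr} in ${s.name}: ${compilesIn(b, s)}")
    }
  }
}
```

Under these assumptions the local form resolves only inside the helper, while the expression form resolves in both scopes, which is the property the `SampleExec` case relies on.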
**Gate narrowing:**
- Narrow `hasPartitionIndexDependentCodegen` to exclude `InputFileName`, `InputFileBlockStart`, and `InputFileBlockLength`. These are `Nondeterministic` but read from `InputFileBlockHolder` (a per-task thread-local) and do not embed `partitionIndex`, so they are safe under fusion. Queries like `SELECT input_file_name() FROM a UNION ALL SELECT input_file_name() FROM b` now fuse.
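
The narrowing can be sketched with a toy expression tree. This is an illustration only: the `Expr` hierarchy, `broadGate`, and `narrowGate` are hypothetical stand-ins rather than Catalyst classes or the real gate, and the sketch assumes the earlier gate rejected every `Nondeterministic` expression. `Rand` stands in for an expression whose state is seeded from `partitionIndex`.

```scala
object GateNarrowingSketch {
  // Toy expression tree (hypothetical; not the Catalyst classes).
  sealed trait Expr { def children: Seq[Expr] = Nil }
  case object InputFileName extends Expr
  case object InputFileBlockStart extends Expr
  case object InputFileBlockLength extends Expr
  case object Rand extends Expr // stand-in for a partition-index-seeded expression
  final case class Column(name: String) extends Expr
  final case class Concat(override val children: Seq[Expr]) extends Expr

  // All four leaf expressions below are nondeterministic.
  def isNondeterministic(e: Expr): Boolean = e match {
    case InputFileName | InputFileBlockStart | InputFileBlockLength | Rand => true
    case _ => false
  }

  // These read a per-task thread-local and never embed partitionIndex.
  def readsInputFileHolderOnly(e: Expr): Boolean = e match {
    case InputFileName | InputFileBlockStart | InputFileBlockLength => true
    case _ => false
  }

  // Assumed earlier behavior: any nondeterministic node blocks fusion.
  def broadGate(e: Expr): Boolean =
    isNondeterministic(e) || e.children.exists(broadGate)

  // Narrowed behavior: input-file expressions no longer block fusion.
  def narrowGate(e: Expr): Boolean =
    (isNondeterministic(e) && !readsInputFileHolderOnly(e)) ||
      e.children.exists(narrowGate)

  def main(args: Array[String]): Unit = {
    val fileExpr = Concat(Seq(Column("a"), InputFileName))
    val randExpr = Concat(Seq(Column("a"), Rand))
    println(s"input_file_name: broad=${broadGate(fileExpr)} narrow=${narrowGate(fileExpr)}")
    println(s"rand:            broad=${broadGate(randExpr)} narrow=${narrowGate(randExpr)}")
  }
}
```

A gate returning `true` means "partition-index dependent, do not fuse". In this sketch only the `Rand` tree is still rejected after narrowing.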
### Why are the changes needed?
The cleanups remove accidental complexity in the fused code path: an N-fold metric reference, two duplicated dataType comparisons, an unreachable defensive guard, and a `try/finally` that protects against an unreachable case. The gate narrowing turns a missed optimization (file-scan unions) into a fused plan.
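
The N-fold metric reference can be shown with a small simulation. `Ctx`, `Metric`, `emitPerChild`, and `emitHoisted` are hypothetical names invented for this sketch. It assumes only that each registration call appends one entry to the `references` array and returns a term that points at it.

```scala
import scala.collection.mutable.ArrayBuffer

object MetricReferenceSketch {
  // Stand-in for the codegen context: each registration appends to `references`.
  final class Ctx {
    val references: ArrayBuffer[AnyRef] = ArrayBuffer.empty[AnyRef]
    def addReferenceObj(name: String, obj: AnyRef): String = {
      val idx = references.length
      references += obj
      s"references[$idx] /* $name */"
    }
  }

  final class Metric(val name: String)

  // Before: the per-child consume step registers the metric on every call.
  def emitPerChild(ctx: Ctx, metric: Metric, numChildren: Int): Seq[String] =
    (0 until numChildren).map(_ => ctx.addReferenceObj(metric.name, metric))

  // After: register once in the produce step and reuse the term for every child.
  def emitHoisted(ctx: Ctx, metric: Metric, numChildren: Int): Seq[String] = {
    val term = ctx.addReferenceObj(metric.name, metric)
    Seq.fill(numChildren)(term)
  }

  def main(args: Array[String]): Unit = {
    val metric = new Metric("numOutputRows")
    val before = new Ctx
    val after = new Ctx
    emitPerChild(before, metric, 4)
    emitHoisted(after, metric, 4)
    println(s"references before: ${before.references.length}, after: ${after.references.length}")
  }
}
```

Both variants hand every child a usable term; the hoisted one simply stops the reference array from growing with the number of children.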
### Does this PR introduce _any_ user-facing change?
No. `spark.sql.codegen.wholeStage.union.enabled` remains off by default; when on, the new behavior fuses additional plans (file-scan unions with `input_file_name()`) that the previous gate over-rejected.
### How was this patch tested?
`UnionCodegenSuite`, `UnionCodegenAnsiSuite`, `UnionCodegenAqeSuite`, and the relevant `SQLMetricsSuite` test all pass. Three tests added:
- `partitioning-aware union falls back to non-codegen` — covers a `supportCodegenFailureReason` branch that lacked explicit coverage.
- `input_file_name child fuses (Nondeterministic but partition-index-free)` — validates the gate narrowing.
- `union with sample children fuses (or falls back) without crashing` — regression test for the `currentPartitionIndexVar` binding (caught by LuciferYang in review).
The `columnar` fallback branch is not covered by a new test: reliably constructing a plan where `Union.supportsColumnar` is true via the user-facing API turned out to be brittle, since `ApplyColumnarRulesAndInsertTransitions` aggressively rebalances columnar/row transitions.
### Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code
Closes #55719 from cloud-fan/SPARK-56482-followup.
Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit d905e73)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent fd05859 commit a8cb80a
2 files changed
Lines changed: 128 additions & 82 deletions
File tree
- sql/core/src
- main/scala/org/apache/spark/sql/execution
- test/scala/org/apache/spark/sql/execution
Lines changed: 77 additions & 82 deletions
Lines changed: 51 additions & 0 deletions