Changes from all commits
1222 commits
49a3c13
[SPARK-53632][PYTHON][DOCS][TESTS] Reenable doctest for `DataFrame.pa…
zhengruifeng Sep 18, 2025
2639792
[SPARK-53523][SQL][FOLLOWUP] Udpate scaladocs and add tests in Proced…
pan3793 Sep 18, 2025
fb46424
[SPARK-53626][DOCS] Add invalid mixed-type operations to ANSI migrati…
xinrong-meng Sep 18, 2025
552effc
[SPARK-53637][BUILD] Demote bcprov-jdk18on to test scope
pan3793 Sep 18, 2025
a8bb8b0
[SPARK-53625][SS] Propagate metadata columns through projections to a…
liviazhu Sep 19, 2025
db13a38
[SPARK-53578][CONNECT] Simplify data type handling in LiteralValuePro…
heyihong Sep 19, 2025
4f10262
[SPARK-53623][SQL] improve reading large table properties performance
yeshengm Sep 19, 2025
589141e
[SPARK-53233][SQL][FOLLOWUP] Add compatibility class/object for org.a…
cloud-fan Sep 19, 2025
686d844
[SPARK-53592][PYTHON] Make `@udf` support vectorized UDF
zhengruifeng Sep 20, 2025
71c67b0
[SPARK-53641][DOCS] Add PARTITION BY support in Arrow Python UDTF docs
allisonwang-db Sep 22, 2025
36ed5ee
[SPARK-53429][PYTHON] Support Direct Passthrough Partitioning in the …
shujingyang-db Sep 22, 2025
f48de10
[SPARK-53592][PYTHON][TESTS][FOLLOW-UP] Remove unused config in the p…
zhengruifeng Sep 22, 2025
984e16b
[SPARK-53657][PYTHON][TESTS] Enable doctests for `GroupedData.agg`
zhengruifeng Sep 22, 2025
ed2692f
[SPARK-53654][SQL][PYTHON] Support `seed` in function `uuid`
zhengruifeng Sep 22, 2025
f9aa4c9
[SPARK-53655][SQL][TESTS] Fix the intention of 'read parquet footers …
yaooqinn Sep 22, 2025
c37ab6e
[SPARK-53653][DOC] Update `rexml` gem version to 3.4.4
bjornjorgensen Sep 22, 2025
69031c9
[SPARK-53661][BUILD][TESTS] Upgrade `bouncycastle` to 1.82
dongjoon-hyun Sep 22, 2025
a2adc43
[SPARK-53668][BUILD] Add `--enable-native-access=ALL-UNNAMED` to `bui…
dongjoon-hyun Sep 22, 2025
1e7169e
[SPARK-53660][SQL][TESTS] Add unit test for Metadata equality check
Yicong-Huang Sep 23, 2025
33196fe
[SPARK-53643][DOCS] Add Arrow UDF to debugging and user guide
xinrong-meng Sep 23, 2025
dde895c
[SPARK-53651][SDP] Add support for persistent views in pipelines
sryza Sep 23, 2025
cf30da2
[SPARK-47110][INFRA] Reenble AmmoniteTest tests in Maven builds
sarutak Sep 23, 2025
fdcd140
[SPARK-53629][SQL] Implement type widening for MERGE INTO WITH SCHEMA…
szehon-ho Sep 23, 2025
0e42b95
[SPARK-53673][CONNECT][TESTS] Fix a flaky test failure in `SparkSessi…
sarutak Sep 23, 2025
b6993cb
[SPARK-53516][SDP] Fix `spark.api.mode` arg process in SparkPipelines
pan3793 Sep 23, 2025
a13187c
[SPARK-53676][PYTHON][TESTS] Skip UDF type check with numpy 1.x
zhengruifeng Sep 23, 2025
c6cea73
[SPARK-53591][SDP] Simplify Pipeline Spec Pattern Glob Matching
jackywang-db Sep 23, 2025
1841dd2
[SPARK-53516][CORE][TESTS][FOLLOWUP] Fix compilation errors in `Spark…
LuciferYang Sep 23, 2025
e95f12b
[SPARK-53633][SQL] Reuse InputStream in vectorized Parquet reader
pan3793 Sep 23, 2025
2a9999f
[SPARK-53671][PYTHON] Exclude 0-args from `@udf` eval type inference
zhengruifeng Sep 24, 2025
fa9e787
[SPARK-53425][PYTHON][TESTS] Add more table argument tests for Arrow …
allisonwang-db Sep 24, 2025
661d611
[SPARK-53645][PS] Implement `skipna` parameter for ps.DataFrame `any()`
petern48 Sep 24, 2025
7dfd9e2
[SPARK-53681][BUILD][TESTS] Upgrade `snowflake-jdbc` to 3.26.1
dongjoon-hyun Sep 24, 2025
8ac0382
[SPARK-53689][BUILD] Respect RELEASE_VERSION environment variable if …
HyukjinKwon Sep 24, 2025
b4592c4
[SPARK-53678][SQL] Fix NPE when subclass of ColumnVector is created w…
manuzhang Sep 24, 2025
06f7ad2
[SPARK-53682][INFRA] Refresh `spark-rm` Docker Image with `jammy-2025…
dongjoon-hyun Sep 24, 2025
de7ba3b
[SPARK-53631][CORE] Optimize memory and perf on SHS bootstrap
pan3793 Sep 24, 2025
b1bbf02
[SPARK-53688][PYTHON][INFRA][TESTS] Increase the timeout and skip UDF…
zhengruifeng Sep 24, 2025
3f2c623
[SPARK-53689][BUILD][FOLLOW-UP] Respect RELEASE_VERSION environment v…
HyukjinKwon Sep 24, 2025
b005f56
[SPARK-53689][BUILD][FOLLOW-UP] Check if RELEASE_VERSION is already s…
HyukjinKwon Sep 24, 2025
4418d6e
[SPARK-53562][PYTHON] Limit Arrow batch sizes in `applyInArrow` and `…
zhengruifeng Sep 24, 2025
35c7208
[SPARK-53674][SQL] Handle single-pass analyzer LCAs when assigning al…
mihailotim-db Sep 24, 2025
997e538
[SPARK-53677][SQL] Improve debuggability for JDBC data source when qu…
urosstan-db Sep 24, 2025
32f7b43
[SPARK-53692][DOCS] Remove a unused configuration spark.appStatusStor…
yaooqinn Sep 24, 2025
466c608
[SPARK-53667][SQL] Fix EXPLAIN for CALL with IDENTIFIER
aokolnychyi Sep 24, 2025
ed326b2
[SPARK-52844][PYTHON] Update `protobuf` Python package to 5.29.5
eschcam Sep 24, 2025
969a342
[SPARK-53492][CONNECT] Reject second ExecutePlan with an operation id…
nija-at Sep 24, 2025
f3a69b2
[SPARK-53356][PYTHON][DOCS] Small improvements to python data source …
sryza Sep 24, 2025
25c550e
[SPARK-53695][PYTHON][TESTS] Add tests for 0-arg grouped agg UDF
zhengruifeng Sep 24, 2025
a8cdae8
[SPARK-53112][SQL][PYTHON][CONNECT] Support TIME in the make_timestam…
Yicong-Huang Sep 25, 2025
a863eee
[SPARK-53694][SQL][TESTS] Improve `V1WriteHiveCommandSuite` test cove…
pan3793 Sep 25, 2025
f7309f2
[SPARK-53709][BUILD] Upgrade Junit to 5.13.4
LuciferYang Sep 25, 2025
5aa9057
[SPARK-53708][BUILD] Upgrade `ZooKeeper` to 3.9.4
dongjoon-hyun Sep 25, 2025
ba92e8e
[SPARK-51831][SQL] Column pruning with existsJoin for Datasource V2
jackylee-ch Sep 25, 2025
70ce0cf
[SPARK-53695][PYTHON][TESTS][FOLLOW-UP] Additional test for type hint
zhengruifeng Sep 25, 2025
64b0cb6
[SPARK-53372][INFRA][FOLLOWUP] Synchronize the installation of `pyyam…
LuciferYang Sep 25, 2025
b37d51a
[SPARK-53700][SQL] Remove redundancy in `DataSourceV2RelationBase.sim…
aokolnychyi Sep 25, 2025
76d9a7b
[SPARK-53717][SQL] Revise `MapType.valueContainsNull` parameter comme…
heyihong Sep 25, 2025
6ff9edc
[SPARK-52407][SQL] Add support for Theta Sketch
Sep 25, 2025
00928f4
[SPARK-53716][PYTHON][DOCS] Document vectorized UDF with `@udf`
zhengruifeng Sep 25, 2025
75cb479
[SPARK-53112][SQL][PYTHON][CONNECT][FOLLOW-UP] Change date and time t…
Yicong-Huang Sep 26, 2025
2f2abd5
[SPARK-53719][SQL][PYTHON][CONNECT] Enhance type checking in `_to_col…
Yicong-Huang Sep 26, 2025
cb28340
[SPARK-53700][SQL][TEST][FOLLOWUP] Regenerate `ProtoToParsedPlanTestS…
dongjoon-hyun Sep 26, 2025
75267bc
[SPARK-53715][SQL] Refactor getWritePrivileges for MergeIntoTable
beliefer Sep 26, 2025
3d19a65
[SPARK-52407][SQL][TESTS][FOLLOWUP] Regenerate thetasketch.sql_analyz…
yaooqinn Sep 26, 2025
984d578
[SPARK-53729][PYTHON][CONNECT] Fix serialization of `pyspark.sql.conn…
zhengruifeng Sep 26, 2025
470d622
[SPARK-53112][PYTHON][TESTS][FOLLOW-UP] Remove some tests for Non-ANSI
zhengruifeng Sep 26, 2025
0cad1cd
[SPARK-53707] Improve attribute metadata handling
ksbeyer Sep 26, 2025
65b9da5
[SPARK-46679][SQL] Fix for SparkUnsupportedOperationException Not fou…
szehon-ho Sep 26, 2025
e56ab2f
[SPARK-53593][SDP] Add response field for DefineDataset and DefineFlo…
cookiedough77 Sep 26, 2025
9e12201
[SPARK-49547][SQL][PYTHON] Add iterator of `RecordBatch` API to `appl…
Kimahriman Sep 27, 2025
6cdc62e
Revert "[SPARK-53015][BUILD] Upgrade log4j to 2.25.1"
pan3793 Sep 27, 2025
7a0bf9e
[SPARK-53127][SQL][FOLLOWUP] Clean up golden files for and add commen…
Pajaraja Sep 28, 2025
573c7da
[MINOR][INFRA] Free disk space in releasing workflow
HyukjinKwon Sep 29, 2025
922adad
[SPARK-53575][CORE] Retry entire consumer stages when checksum mismat…
ivoson Sep 29, 2025
776ffd5
[SPARK-53735][SDP] Hide server-side JVM stack traces by default in sp…
sryza Sep 29, 2025
1692b55
[SPARK-53728][SDP] Print PipelineEvent Message with Error In Test
jackywang-db Sep 29, 2025
746db3d
[SPARK-53743][SS] Remove the usage of fetchWithArrow in ListState.put…
HeartSaVioR Sep 30, 2025
3c8c714
[SPARK-53574][SQL][FOLLOWUP] still reset the default AnalysisContext
cloud-fan Sep 30, 2025
46ac78e
[SPARK-53734][SQL] Prefer table column over LCA when resolving array …
mihailotim-db Sep 30, 2025
d5729f0
[SPARK-53621][CORE] Adding Support for Executing CONTINUE HANDLER
TeodorDjelic Sep 30, 2025
0e9e27d
[SPARK-53766][DOC] Improve execute immediate docs
srielau Sep 30, 2025
5dc061c
[SPARK-53593][SDP] Fix: Use unquoted for response fields
cookiedough77 Sep 30, 2025
e04fd59
[SPARK-52807][SDP] Proto changes to support analysis inside Declarati…
SCHJonathan Oct 1, 2025
ca9c054
[SPARK-53536][CORE] Adding a Golden File Test With Randomly Generated…
TeodorDjelic Oct 1, 2025
2ed58ab
[SPARK-53773][SQL] Recover alphabetic ordering of rules in `RuleIdCol…
dongjoon-hyun Oct 1, 2025
6526293
[SPARK-53741][BUILD] Upgrade ORC to 2.2.1
dongjoon-hyun Oct 1, 2025
0ecb519
[SPARK-52614][SQL] Support RowEncoder inside Product Encoder
eejbyfeldt Oct 1, 2025
848e47a
[SPARK-53737][SQL][SS] Add Real-time Mode trigger
jerrypeng Oct 2, 2025
7ed0e37
[SPARK-53536][CORE][FOLLOWUP] Fixing Flakiness of Golden File Test Wi…
TeodorDjelic Oct 2, 2025
65ff85a
[SPARK-52640][SDP] Propagate Python Source Code Location
anishm-db Oct 2, 2025
1124b09
[SPARK-53691][PS][INFRA][TESTS] Further reorganize tests for Pandas API
zhengruifeng Oct 6, 2025
08bd390
[SPARK-53794][SS] Add option to limit deletions per maintenance opera…
anishshri-db Oct 6, 2025
fd02372
[SPARK-53804][SQL] Support TIME radix sort
bersprockets Oct 7, 2025
6bcd095
[SPARK-53784] Additional Source APIs needed to support RTM execution
jerrypeng Oct 7, 2025
c3d6eea
[SPARK-53788][CORE] Move VersionUtils to `common` module
pan3793 Oct 7, 2025
be1de5b
[SPARK-53802][SDP] Support string values for user-specified schema in…
sryza Oct 7, 2025
e312318
[SPARK-53621][CORE][FOLLOWUP] Adding Comments And Enriching Tests of …
TeodorDjelic Oct 7, 2025
e0cb512
[SPARK-53795][CONNECT] Remove unused parameters in LiteralValueProtoC…
heyihong Oct 8, 2025
8bf6640
[SPARK-53638][SS][PYTHON] Limit the byte size of arrow batch for TWS …
zeruibao Oct 8, 2025
c5f007b
[SPARK-53829][PYTHON] Support `datetime.time` in column operators
zhengruifeng Oct 8, 2025
032a4e7
[SPARK-53831][INFRA] Update script `free_disk_space`
zhengruifeng Oct 8, 2025
ee15beb
[SPARK-53808][CONNECT] Allow to pass optional JVM args to `spark-conn…
sarutak Oct 8, 2025
57a9fc9
[SPARK-53832][K8S] Make `KubernetesClientUtils` Java-friendly
dongjoon-hyun Oct 8, 2025
011b2b8
[SPARK-53833][PYTHON] Update `dev/requirements.txt` to skip `torch/to…
dongjoon-hyun Oct 8, 2025
22d9709
[SPARK-53738][SQL] PlannedWrite should preserve custom sort order whe…
pan3793 Oct 8, 2025
3b51e19
[SPARK-53834][INFRA] Add a separate docker file for Python 3.14 daily…
dongjoon-hyun Oct 8, 2025
b83d701
[SPARK-53836][INFRA] Update script `free_disk_space_container`
zhengruifeng Oct 8, 2025
4cc0b51
[SPARK-53806][SQL] Allow empty input on AES decrypt to have error class
richardc-db Oct 8, 2025
c393419
[SPARK-53125][TEST] RemoteSparkSession prints whole `spark-submit` co…
pan3793 Oct 8, 2025
65336a0
[SPARK-53810][SS][TESTS] Split large TWS python tests into multiple s…
huanliwang-db Oct 8, 2025
9ebe4a2
[SPARK-53792][SS] Fix rocksdbPinnedBlocksMemoryUsage when bounded mem…
Oct 8, 2025
9a1c742
[SPARK-53751][SDP] Explicit Versioned Checkpoint Location
jackywang-db Oct 8, 2025
bf2457b
[SPARK-53843][BUILD] Upgrade `netty-tcnative` to 2.0.74.Final
dongjoon-hyun Oct 8, 2025
d6f713e
[SPARK-53844][TESTS] Remove `SPARK_JENKINS*` and related logics from …
dongjoon-hyun Oct 8, 2025
22e24df
[SPARK-53812][SDP] Refactor DefineDataset and DefineFlow protos to gr…
sryza Oct 8, 2025
18f0463
[SPARK-53846][PYTHON][TESTS] Skip `test_profile_pandas_*` tests if pa…
dongjoon-hyun Oct 9, 2025
49e2c9e
[SPARK-53849][BUILD] Upgrade Netty to 4.2.6.Final
dongjoon-hyun Oct 9, 2025
2a5d03a
[SPARK-53854][PYTHON][TESTS] Skip `test_collect_time` test if pandas …
dongjoon-hyun Oct 9, 2025
ce3437a
Revert "[SPARK-53738][SQL] PlannedWrite should preserve custom sort o…
peter-toth Oct 9, 2025
37f8df4
[SPARK-53507][CORE] Don't use case class for BreakingChangeInfo
imarkowitz Oct 9, 2025
26ba0ed
[SPARK-53564][CORE] Avoid DAGScheduler exits due to blockManager RPC …
ivoson Oct 9, 2025
05a1ffb
[SPARK-51169][PYTHON] Add Python 3.14 support in Spark Classic
zhengruifeng Oct 9, 2025
efcc8f6
[SPARK-53850][SDP] Define proto for Sinks and Rename DefineDataset to…
jackywang-db Oct 9, 2025
282e7d3
[SPARK-53862][DSTREAM][TESTS] Fix `CheckpointSuite.'get correct spark…
dongjoon-hyun Oct 10, 2025
b34bc29
[SPARK-53860][BUILD][TESTS] Upgrade `sbt-jupiter-interface` to 0.17.0
dongjoon-hyun Oct 10, 2025
7b7cb9a
[SPARK-53562][PYTHON][TESTS][FOLLOW-UP] Add more tests for `maxBytesP…
zhengruifeng Oct 10, 2025
dc11b66
[MINOR][PYTHON][TESTS] Retry `test_observe_with_map_type`
zhengruifeng Oct 10, 2025
cd6d701
[SPARK-53859][BUILD][TESTS] Upgrade `JUnit` to 6.0.0
dongjoon-hyun Oct 10, 2025
ef1eec4
[SPARK-53858][PYTHON][TESTS] Skip doctests in `pyspark.sql.functions.…
zhengruifeng Oct 10, 2025
ea0f4fa
[MINOR][DOCS] Add `build_python_3.14.yml` to `Build Pipeline Status` …
dongjoon-hyun Oct 10, 2025
f041118
[SPARK-53861][PYTHON][INFRA] Factor out streaming tests from `pyspark…
zhengruifeng Oct 10, 2025
cd23ad5
[SPARK-53856][CORE] Remove `blacklist` alternative config names
dongjoon-hyun Oct 10, 2025
6032a40
[SPARK-53779][SQL][CONNECT] Implement `transform()` in Column API
Yicong-Huang Oct 10, 2025
ed1140f
[SPARK-53786][SQL] Default value with special column name should not …
szehon-ho Oct 10, 2025
320d09d
[SPARK-53605][INFRA] Restore pyspark execution timeout to 2 hours
zhengruifeng Oct 10, 2025
474731f
[SPARK-53866][PYTHON][TESTS] Skip doctest `pyspark.sql.pandas.functio…
zhengruifeng Oct 10, 2025
b616068
[SPARK-53864][BUILD] Upgrade `commons-lang3` to 3.19.0
LuciferYang Oct 10, 2025
6eb4d3c
[SPARK-53865][SQL] Extract common logic from ResolveGenerate rule
mikhailnik-db Oct 10, 2025
0702d58
[SPARK-53822][PYTHON][SS][TESTS] Add Python TransformWithState test c…
jiateoh Oct 10, 2025
a35c9f3
[SPARK-53805][SQL] Push Variant into DSv2 scan
huaxingao Oct 10, 2025
72fc87b
[SPARK-53690][SS] Fix exponential formatting of avgOffsetsBehindLates…
jayantdb Oct 11, 2025
418cf56
[SPARK-53796][SDP] Add `extension` field to a few pipeline protos to …
SCHJonathan Oct 11, 2025
9aab260
[SPARK-53868][SQL] Only use signature with Expression[] of `visitAgg…
alekjarmov Oct 12, 2025
81d9d1f
[SPARK-53847] Add ContinuousMemorySink for Real-time Mode testing
jerrypeng Oct 12, 2025
3f663bf
[SPARK-53870][PYTHON][SS] Fix partial read bug for large proto messag…
jiateoh Oct 12, 2025
47e4108
[SPARK-53879] Upgrade `Ammonite` to 3.0.3
dongjoon-hyun Oct 12, 2025
00d2a54
[SPARK-53881][BUILD][TESTS] Upgrade `Selenium` to 4.32.0
dongjoon-hyun Oct 12, 2025
3ac4a48
[SPARK-53585][BUILD] Upgrade Scala to 2.13.17
vrozov Oct 13, 2025
264ca4d
[SPARK-53455][CONNECT] Add `CloneSession` RPC
vicennial Oct 13, 2025
343a25b
[SPARK-53868][SQL] Use array length check instead of direct reference…
alekjarmov Oct 13, 2025
56f8b3b
[SPARK-53877] Introduce BITMAP_AND_AGG function
uros7251brick Oct 13, 2025
e38a651
[SPARK-53845] SDP Sinks
jackywang-db Oct 13, 2025
d72e6a3
[SPARK-53884][BUILD] Upgrade `ZSTD-JNI` to 1.5.7-5
dongjoon-hyun Oct 13, 2025
f92816c
[SPARK-53878][SQL][CONNECT] Fix race condition issue related to Obser…
sarutak Oct 13, 2025
6bae835
[SPARK-53720][SQL] Simplify extracting Table from DataSourceV2Relatio…
aokolnychyi Oct 13, 2025
0cb933e
Revert "[MINOR][PYTHON][TESTS] Retry `test_observe_with_map_type`"
sarutak Oct 14, 2025
710c607
[SPARK-53892][SS] Use `DescribeTopicsResult.allTopicNames` instead of…
dongjoon-hyun Oct 14, 2025
7d028c6
[SPARK-53609][PYTHON] Limit Arrow batch sizes in SQL_GROUPED_AGG_PAND…
zhengruifeng Oct 14, 2025
8898ec9
[SPARK-53893][TESTS] Regenerate benchmark results after upgrading to …
dongjoon-hyun Oct 14, 2025
6f0f587
[SPARK-53894][BUILD][TESTS] Upgrade `docker-java` to 3.6.0
dongjoon-hyun Oct 14, 2025
1e515d3
[MINOR][DOCS] Fix 404 for Python Package Management link in rdd-progr…
yaooqinn Oct 14, 2025
e6a76df
[SPARK-53896][CORE] Enable `spark.io.compression.lzf.parallel.enabled…
dongjoon-hyun Oct 14, 2025
37564db
[SPARK-51426][PYTHON][SQL] Fix 'Setting metadata to empty dict does n…
petern48 Oct 14, 2025
85c9fd1
[SPARK-53867][PYTHON] Limit Arrow batch sizes in SQL_GROUPED_AGG_ARRO…
zhengruifeng Oct 14, 2025
cba28ea
[SPARK-53841][PYTHON][CONNECT] Implement `transform()` in Column API
Yicong-Huang Oct 14, 2025
800309e
[SPARK-53857] Enable messageTemplate propagation to SparkThrowable
miland-db Oct 14, 2025
ff0f1ab
[SPARK-53900][CONNECT] Fix unintentional `Thread.wait(0)` under rare …
vicennial Oct 14, 2025
e05c75e
[SPARK-53906][K8S] Protect `ExecutorPodsAllocator.numOutstandingPods`…
dongjoon-hyun Oct 14, 2025
4eacd0b
[SPARK-53907][K8S] Support `spark.kubernetes.allocation.maximum`
dongjoon-hyun Oct 15, 2025
9ae8198
[SPARK-53611][PYTHON] Limit Arrow batch sizes in window agg UDFs
zhengruifeng Oct 15, 2025
2d22064
[SPARK-53897][CONNECT][TESTS] Add dependency checks for Python-relate…
LuciferYang Oct 15, 2025
694cc72
[SPARK-53895][SS][TESTS] Add `ContinuousMemorySuite`
jerrypeng Oct 15, 2025
8ff9feb
[SPARK-53760][GEO][SQL] Introduce GeometryType and GeographyType
uros-db Oct 15, 2025
e3e6982
[SPARK-53867][PYTHON][FOLLOW-UP] Fix `pa.concat_batches` for old pyar…
zhengruifeng Oct 15, 2025
98010b3
[SPARK-53913][DOCS] Document newly added K8s configurations
dongjoon-hyun Oct 15, 2025
abe7853
[SPARK-53789][SQL][CONNECT] Canonicalize error condition CANNOT_MODIF…
pan3793 Oct 15, 2025
17d15a4
[SPARK-53762][SQL] Add date and time conversions simplifier rule to o…
peter-toth Oct 15, 2025
9e14f5f
[SPARK-53111][SQL][PYTHON][CONNECT] Implement the time_diff function …
uros-db Oct 15, 2025
b0285f8
[SPARK-53902][SQL] Add tree node pattern bits for supported expressio…
mihailoale-db Oct 15, 2025
fcd8371
[SPARK-53923][CORE] Rename `spark.executor.(log -> logs).redirectCons…
dongjoon-hyun Oct 15, 2025
2ae81cd
[SPARK-53925][INFRA] Use `MacOS 26` in `build_maven_java21_macos15.yml`
dongjoon-hyun Oct 15, 2025
bc8020f
[SPARK-53919][BUILD] Make Maven plugins up-to-date
dongjoon-hyun Oct 15, 2025
032dcf8
[SPARK-53926][DOCS] Document newly added `core` module configurations
dongjoon-hyun Oct 16, 2025
61a024c
[SPARK-53916][PYTHON] Deduplicate the variables in PythonArrowInput
zhengruifeng Oct 16, 2025
9823daf
[SPARK-53931][INFRA][PYTHON] Fix scheduled job for numpy 2.1.3
zhengruifeng Oct 16, 2025
deb7b62
[SPARK-53925][FOLLOW-UP][DOCS] Fix the link in README
zhengruifeng Oct 16, 2025
83be7e7
[SPARK-53929][SQL] Support TIME in the make_timestamp and try_make_ti…
uros-db Oct 16, 2025
24a6abf
[SPARK-53935][BUILD] SBT assembly should handle META-INF correctly
pan3793 Oct 16, 2025
ea71991
[SPARK-53936][BUILD] Upgrade sbt-pom-reader from 2.4.0 to 2.5.0
gemelen Oct 16, 2025
eb117a6
[SPARK-53908][CONNECT] Fix observations on Spark Connect with plan cache
ueshin Oct 16, 2025
983d384
[SPARK-53573][SQL] Use Pre-processor for generalized parameter marker…
srielau Oct 16, 2025
f689ff7
[SPARK-53939][PYTHON] Use batch.num_columns instead of len(batch.colu…
ueshin Oct 17, 2025
136201a
[SPARK-53927][BUILD][DSTREAM] Upgrade kinesis client and fix kinesis …
vrozov Oct 17, 2025
d799aa7
[SPARK-53149][CORE] Fix testing whether BeeLine process run in backgr…
pan3793 Oct 17, 2025
2f57459
[MINOR][PYTHON][DOCS] Fix the docstring of `assert_true`
zhengruifeng Oct 17, 2025
c041671
[SPARK-53943][PYTHON][DOCS] Add examples for function unwrap_udt
zhengruifeng Oct 17, 2025
8499a62
[SPARK-53785][SS] Memory Source for RTM
jerrypeng Oct 17, 2025
1c0bca9
[SPARK-52798][SQL] Add function approx_top_k_combine
yhuang-db Oct 18, 2025
2c7bc89
[SPARK-53760][GEO][SQL][FOLLOWUP] Fix error message and comment for S…
uros-db Oct 18, 2025
dd8d13b
[SPARK-53944][K8S] Support `spark.kubernetes.executor.useDriverPodIP`
dongjoon-hyun Oct 18, 2025
5edebd2
[SPARK-53938][PYTHON][CONNECT] Fix decimal rescaling in LocalDataToAr…
zhengruifeng Oct 20, 2025
5ae573f
[SPARK-53656][SS] Refactor MemoryStream to use SparkSession instead o…
ganeshas-db Oct 20, 2025
9c38696
[SPARK-53945][BUILD] Upgrade `semanticdb-shared` to `4.13.10`
sarutak Oct 20, 2025
4ab7d5a
[SPARK-53949][CONNECT] Use `Utils.getRootCause` instead of `Throwable…
LuciferYang Oct 20, 2025
b1f5428
[SPARK-42857][PYTHON][TESTS] Enable parity test `test_supported_types`
zhengruifeng Oct 20, 2025
2bb73fb
[SPARK-53950][BUILD] Upgrade scala-xml to 2.4.0
LuciferYang Oct 20, 2025
9c40c12
[SPARK-53946][BUILD] Upgrade SBT to 1.11.7
sarutak Oct 20, 2025
57b4cd2
[SPARK-53696][PYTHON][CONNECT][SQL] Default to bytes for BinaryType i…
xianzhe-databricks Oct 20, 2025
b0327f3
[SPARK-53947][SQL] Count null in approx_top_k
yhuang-db Oct 20, 2025
1f21a8b
[SPARK-53951][BUILD] Upgrade `protobuf-java` to 4.33.0
LuciferYang Oct 20, 2025
37ee992
[SPARK-53535][SQL] Fix missing structs always being assumed as nulls
ZiyaZa Oct 20, 2025
94cccad
[SPARK-53755][CORE] Add log support in BlockManager
ivoson Oct 21, 2025
8430dbf
[SPARK-53961][SQL][TESTS] Fix `FileStreamSinkSuite` flakiness by usin…
dongjoon-hyun Oct 21, 2025
6739e4f
[MINOR][PYTHON][TESTS] Update `test_arrow_udf_output_timestamps_ltz` …
zhengruifeng Oct 21, 2025
128fb13
[SPARK-53914][BUILD][CONNECT] Add connect-client-jdbc module
pan3793 Oct 21, 2025
9f32542
[SPARK-53963][PYTHON][TESTS] Drop temporary functions in regular UDF …
zhengruifeng Oct 21, 2025
f33d8aa
[SPARK-53738][SQL] Fix planned write when query output contains folda…
pan3793 Oct 21, 2025
e963eb7
[SPARK-53965][CONNECT][SS][PYTHON] Upgrade buf plugins to `v29.5`
LuciferYang Oct 21, 2025
85bf6ec
[SPARK-53958][BUILD] Simplify Jackson deps management by using BOM
pan3793 Oct 21, 2025
ed9957b
[SPARK-53964][BUILD] Simplify Java Home finding for SBT unidoc
pan3793 Oct 21, 2025
0f42632
[SPARK-53636][CORE] Fix thread-safety issue in SortShuffleManager.unr…
Ngone51 Oct 21, 2025
a8cfe0c
[SPARK-53960][SQL] Let approx_top_k_accumulate/combine/estimate handl…
yhuang-db Oct 21, 2025
6a77ec4
[SPARK-53971][BUILD] Bump zstd-jni 1.5.7-6
pan3793 Oct 21, 2025
748de5f
[SPARK-53969][PYTHON][TESTS] Drop temporary functions in Arrow UDF tests
zhengruifeng Oct 21, 2025
e58a3db
[SPARK-53954][BUILD] Bump Avro 1.12.1
pan3793 Oct 21, 2025
41d3619
[SPARK-53921][GEO][PYTHON] Introduce GeometryType and GeographyType t…
uros-db Oct 22, 2025
3d18fe1
[SPARK-53973][AVRO] Classify errors for AvroOptions boolean casting f…
siying Oct 22, 2025
7427ff4
[SPARK-53979][PYTHON][TESTS] Drop temporary functions in Pandas UDF t…
zhengruifeng Oct 22, 2025
3109488
[SPARK-53974][BUILD] Bump Jackson 2.20.0
pan3793 Oct 22, 2025
a96e9ca
[SPARK-53968][SQL] Store decimal precision loss conf in arithmetic ex…
stefankandic Oct 22, 2025
db81309
[SPARK-53917][CONNECT] Support large local relations
khakhlyuk Oct 22, 2025
4a62f75
[SPARK-53319][SQL] Support the time type by try_make_timestamp_ltz()
uros-db Oct 22, 2025
d65ed4a
[SPARK-53687][SQL][SS][SDP] Introduce WATERMARK clause in SQL statement
HeartSaVioR Oct 22, 2025
0be5f96
[SPARK-53972][SS] Fix streaming query recentProgress regression in cl…
Oct 22, 2025
a6d17f7
[SPARK-53956][PYTHON] Support TIME in the try_make_timestamp function…
uros-db Oct 23, 2025
0e10341
[SPARK-53930][PYTHON] Support TIME in the make_timestamp function in …
uros-db Oct 23, 2025
c707f59
Add PipelineAnalysisContext message to support pipeline analysis duri…
cookiedough77 Oct 23, 2025
c70c728
[SPARK-53981][BUILD] Upgrade Netty to 4.2.7.Final
yaooqinn Oct 23, 2025
96093bd
[SPARK-53922][GEO][SQL] Introduce physical Geometry and Geography types
uros-db Oct 23, 2025
76d4718
[SPARK-53920][GEO][SQL] Introduce GeometryType and GeographyType to J…
uros-db Oct 23, 2025
fcbafc3
[SPARK-53914][BUILD][FOLLOWUP] Fix branch-4.0 daily maven test
pan3793 Oct 23, 2025
f72435a
[SPARK-53966][CORE] Add utility functions to detect JVM GCs
Oct 23, 2025
c9ae4ab
[SPARK-53999][CORE] Native KQueue Transport support on BSD/MacOS
yaooqinn Oct 23, 2025
92ac08c
[SPARK-50205][SQL][TEST] Re-enable `SparkSessionJobTaggingAndCancella…
sarutak Oct 23, 2025
2da5c62
Cleanup shuffle from fallback storage
EnricoMi May 23, 2025
3f54903
Use conf key const in doc string
EnricoMi Sep 3, 2025
d11df30
Swap configs to allow for cross-referencing key
EnricoMi Sep 3, 2025
bbff22d
Fix indentation
EnricoMi Oct 23, 2025
1c6d17b
Simplify message match logic in FallbackStorage.ask
EnricoMi Oct 24, 2025
The diff you're trying to view is too large. We only load the first 3000 changed files.
2 changes: 1 addition & 1 deletion .asf.yaml
@@ -13,7 +13,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# https://cwiki.apache.org/confluence/display/INFRA/git+-+.asf.yaml+features
# https://github.com/apache/infrastructure-asfyaml/blob/main/README.md
---
github:
description: "Apache Spark - A unified analytics engine for large-scale data processing"
30 changes: 28 additions & 2 deletions .github/workflows/benchmark.yml
@@ -50,6 +50,11 @@ on:
description: 'Number of job splits'
required: true
default: '1'
create-commit:
type: boolean
description: 'Commit the benchmark results to the current branch'
required: true
default: false

jobs:
matrix-gen:
@@ -195,10 +200,31 @@ jobs:
# To keep the directory structure and file permissions, tar them
# See also https://github.com/actions/upload-artifact#maintaining-file-permissions-and-case-sensitive-files
echo "Preparing the benchmark results:"
tar -cvf benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar `git diff --name-only` `git ls-files --others --exclude=tpcds-sf-1 --exclude=tpcds-sf-1-text --exclude-standard`
tar -cvf target/benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar `git diff --name-only` `git ls-files --others --exclude=tpcds-sf-1 --exclude=tpcds-sf-1-text --exclude-standard`
- name: Create a pull request with the results
if: ${{ inputs.create-commit && success() }}
run: |
git config --local user.name "${{ github.actor }}"
git config --local user.email "${{ github.event.pusher.email || format('{0}@users.noreply.github.com', github.actor) }}"
git add -A
git commit -m "Benchmark results for ${{ inputs.class }} (JDK ${{ inputs.jdk }}, Scala ${{ inputs.scala }}, split ${{ matrix.split }} of ${{ inputs.num-splits }})"
for i in {1..5}; do
echo "Attempt $i to push..."
git fetch origin ${{ github.ref_name }}
git rebase origin/${{ github.ref_name }}
if git push origin ${{ github.ref_name }}:${{ github.ref_name }}; then
echo "Push successful."
exit 0
else
echo "Push failed, retrying in 3 seconds..."
sleep 3
fi
done
echo "Error: Failed to push after 5 attempts."
exit 1
- name: Upload benchmark results
uses: actions/upload-artifact@v4
with:
name: benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}-${{ matrix.split }}
path: benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar
path: target/benchmark-results-${{ inputs.jdk }}-${{ inputs.scala }}.tar

109 changes: 88 additions & 21 deletions .github/workflows/build_and_test.yml
@@ -112,7 +112,7 @@ jobs:
ui=false
docs=false
fi
build=`./dev/is-changed.py -m "core,unsafe,kvstore,avro,utils,network-common,network-shuffle,repl,launcher,examples,sketch,variant,api,catalyst,hive-thriftserver,mllib-local,mllib,graphx,streaming,sql-kafka-0-10,streaming-kafka-0-10,streaming-kinesis-asl,kubernetes,hadoop-cloud,spark-ganglia-lgpl,profiler,protobuf,yarn,connect,sql,hive,pipelines"`
build=`./dev/is-changed.py -m "core,unsafe,kvstore,avro,utils,utils-java,network-common,network-shuffle,repl,launcher,examples,sketch,variant,api,catalyst,hive-thriftserver,mllib-local,mllib,graphx,streaming,sql-kafka-0-10,streaming-kafka-0-10,streaming-kinesis-asl,kubernetes,hadoop-cloud,spark-ganglia-lgpl,profiler,protobuf,yarn,connect,sql,hive,pipelines"`
precondition="
{
\"build\": \"$build\",
@@ -122,6 +122,8 @@
\"tpcds-1g\": \"$tpcds\",
\"docker-integration-tests\": \"$docker\",
\"lint\" : \"true\",
\"java17\" : \"$build\",
\"java25\" : \"$build\",
\"docs\" : \"$docs\",
\"yarn\" : \"$yarn\",
\"k8s-integration-tests\" : \"$kubernetes\",
@@ -240,7 +242,7 @@ jobs:
# Note that the modules below are from sparktestsupport/modules.py.
modules:
- >-
core, unsafe, kvstore, avro, utils,
core, unsafe, kvstore, avro, utils, utils-java,
network-common, network-shuffle, repl, launcher,
examples, sketch, variant
- >-
@@ -360,7 +362,7 @@
- name: Install Python packages (Python 3.11)
if: (contains(matrix.modules, 'sql') && !contains(matrix.modules, 'sql-')) || contains(matrix.modules, 'connect') || contains(matrix.modules, 'yarn')
run: |
python3.11 -m pip install 'numpy>=1.20.0' pyarrow pandas scipy unittest-xml-reporting 'lxml==4.9.4' 'grpcio==1.67.0' 'grpcio-status==1.67.0' 'protobuf==5.29.1'
python3.11 -m pip install 'numpy>=1.22' pyarrow pandas pyyaml scipy unittest-xml-reporting 'lxml==4.9.4' 'grpcio==1.67.0' 'grpcio-status==1.67.0' 'protobuf==5.29.5'
python3.11 -m pip list
# Run the tests.
- name: Run tests
@@ -512,37 +514,34 @@
pyspark-core, pyspark-errors, pyspark-streaming, pyspark-logger
- >-
pyspark-mllib, pyspark-ml, pyspark-ml-connect, pyspark-pipelines
- >-
pyspark-structured-streaming, pyspark-structured-streaming-connect
- >-
pyspark-connect
- >-
pyspark-pandas
- >-
pyspark-pandas-slow
- >-
pyspark-pandas-connect-part0
- >-
pyspark-pandas-connect-part1
- >-
pyspark-pandas-connect-part2
pyspark-pandas-connect
- >-
pyspark-pandas-connect-part3
pyspark-pandas-slow-connect
exclude:
# Always run if pyspark == 'true', even infra-image is skip (such as non-master job)
# In practice, the build will run in individual PR, but not against the individual commit
# in Apache Spark repository.
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-sql, pyspark-resource, pyspark-testing' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-core, pyspark-errors, pyspark-streaming, pyspark-logger' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-mllib, pyspark-ml, pyspark-ml-connect' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-structured-streaming, pyspark-structured-streaming-connect' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark != 'true' && 'pyspark-connect' }}
# Always run if pyspark-pandas == 'true', even infra-image is skip (such as non-master job)
# In practice, the build will run in individual PR, but not against the individual commit
# in Apache Spark repository.
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas-slow' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas-connect-part0' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas-connect-part1' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas-connect-part2' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas-connect-part3' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas-connect' }}
- modules: ${{ fromJson(needs.precondition.outputs.required).pyspark-pandas != 'true' && 'pyspark-pandas-slow-connect' }}
env:
MODULES_TO_TEST: ${{ matrix.modules }}
HADOOP_PROFILE: ${{ inputs.hadoop }}
@@ -605,8 +604,9 @@ jobs:
run: |
for py in $(echo $PYTHON_TO_TEST | tr "," "\n")
do
echo $py
$py --version
$py -m pip list
echo ""
done
- name: Install Conda for pip packaging test
if: contains(matrix.modules, 'pyspark-errors')
@@ -766,7 +766,7 @@ jobs:
python-version: '3.11'
- name: Install dependencies for Python CodeGen check
run: |
python3.11 -m pip install 'black==23.12.1' 'protobuf==5.29.1' 'mypy==1.8.0' 'mypy-protobuf==3.3.0'
python3.11 -m pip install 'black==23.12.1' 'protobuf==5.29.5' 'mypy==1.8.0' 'mypy-protobuf==3.3.0'
python3.11 -m pip list
- name: Python CodeGen check for branch-3.5
if: inputs.branch == 'branch-3.5'
@@ -919,6 +919,42 @@ jobs:
- name: R linter
run: ./dev/lint-r

java17:
needs: [precondition]
if: fromJson(needs.precondition.outputs.required).java17 == 'true'
name: Java 17 build with Maven
runs-on: ubuntu-latest
timeout-minutes: 120
steps:
- uses: actions/checkout@v4
- uses: actions/setup-java@v4
with:
distribution: zulu
java-version: 17
- name: Build with Maven
run: |
export MAVEN_OPTS="-Xss64m -Xmx4g -Xms4g -XX:ReservedCodeCacheSize=128m -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN"
export MAVEN_CLI_OPTS="--no-transfer-progress"
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl clean install

java25:
needs: [precondition]
if: fromJson(needs.precondition.outputs.required).java25 == 'true'
name: Java 25 build with Maven
runs-on: ubuntu-latest
timeout-minutes: 120
steps:
- uses: actions/checkout@v4
- uses: actions/setup-java@v4
with:
distribution: zulu
java-version: 25
- name: Build with Maven
run: |
export MAVEN_OPTS="-Xss64m -Xmx4g -Xms4g -XX:ReservedCodeCacheSize=128m -Dorg.slf4j.simpleLogger.defaultLogLevel=WARN"
export MAVEN_CLI_OPTS="--no-transfer-progress"
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl clean install

# Documentation build
docs:
needs: [precondition, infra-image]
@@ -998,10 +1034,14 @@ jobs:
# Should unpin 'sphinxcontrib-*' after upgrading sphinx>5
python3.9 -m pip install 'sphinx==4.5.0' mkdocs 'pydata_sphinx_theme>=0.13' sphinx-copybutton nbsphinx numpydoc jinja2 markupsafe 'pyzmq<24.0.0' 'sphinxcontrib-applehelp==1.0.4' 'sphinxcontrib-devhelp==1.0.2' 'sphinxcontrib-htmlhelp==2.0.1' 'sphinxcontrib-qthelp==1.0.3' 'sphinxcontrib-serializinghtml==1.1.5'
python3.9 -m pip install ipython_genutils # See SPARK-38517
python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.20.0' pyarrow pandas 'plotly<6.0.0'
python3.9 -m pip install sphinx_plotly_directive 'numpy>=1.22' pyarrow pandas 'plotly<6.0.0'
python3.9 -m pip install 'docutils<0.18.0' # See SPARK-39421
- name: List Python packages
- name: List Python packages for branch-3.5 and branch-4.0
if: inputs.branch == 'branch-3.5' || inputs.branch == 'branch-4.0'
run: python3.9 -m pip list
- name: List Python packages
if: inputs.branch != 'branch-3.5' && inputs.branch != 'branch-4.0'
run: python3.11 -m pip list
- name: Install dependencies for documentation generation
run: |
# Keep the version of Bundler here in sync with the following locations:
@@ -1010,7 +1050,8 @@
gem install bundler -v 2.4.22
cd docs
bundle install --retry=100
- name: Run documentation build
- name: Run documentation build for branch-3.5 and branch-4.0
if: inputs.branch == 'branch-3.5' || inputs.branch == 'branch-4.0'
run: |
# We need this link to make sure `python3` points to `python3.9` which contains the prerequisite packages.
ln -s "$(which python3.9)" "/usr/local/bin/python3"
@@ -1031,6 +1072,30 @@
echo "SKIP_SQLDOC: $SKIP_SQLDOC"
cd docs
bundle exec jekyll build
- name: Run documentation build
if: inputs.branch != 'branch-3.5' && inputs.branch != 'branch-4.0'
run: |
# We need this link to make sure `python3` points to `python3.11` which contains the prerequisite packages.
ln -s "$(which python3.11)" "/usr/local/bin/python3"
# Build docs first with SKIP_API to ensure they are buildable without requiring any
# language docs to be built beforehand.
cd docs; SKIP_ERRORDOC=1 SKIP_API=1 bundle exec jekyll build; cd ..
if [ -f "./dev/is-changed.py" ]; then
# Skip PySpark and SparkR docs while keeping Scala/Java/SQL docs
pyspark_modules=`cd dev && python3.11 -c "import sparktestsupport.modules as m; print(','.join(m.name for m in m.all_modules if m.name.startswith('pyspark')))"`
if [ `./dev/is-changed.py -m $pyspark_modules` = false ]; then export SKIP_PYTHONDOC=1; fi
if [ `./dev/is-changed.py -m sparkr` = false ]; then export SKIP_RDOC=1; fi
fi
export PYSPARK_DRIVER_PYTHON=python3.11
export PYSPARK_PYTHON=python3.11
# Print the values of environment variables `SKIP_ERRORDOC`, `SKIP_SCALADOC`, `SKIP_PYTHONDOC`, `SKIP_RDOC` and `SKIP_SQLDOC`
echo "SKIP_ERRORDOC: $SKIP_ERRORDOC"
echo "SKIP_SCALADOC: $SKIP_SCALADOC"
echo "SKIP_PYTHONDOC: $SKIP_PYTHONDOC"
echo "SKIP_RDOC: $SKIP_RDOC"
echo "SKIP_SQLDOC: $SKIP_SQLDOC"
cd docs
bundle exec jekyll build
- name: Tar documentation
if: github.repository != 'apache/spark'
run: tar cjf site.tar.bz2 docs/_site
@@ -1259,9 +1324,9 @@ jobs:
sudo apt update
sudo apt-get install r-base
- name: Start Minikube
uses: medyagh/[email protected].19
uses: medyagh/[email protected].20
with:
kubernetes-version: "1.33.0"
kubernetes-version: "1.34.0"
# Github Action limit cpu:2, memory: 6947MB, limit to 2U6G for better resource statistic
cpus: 2
memory: 6144m
@@ -1279,8 +1344,10 @@
kubectl create clusterrolebinding serviceaccounts-cluster-admin --clusterrole=cluster-admin --group=system:serviceaccounts || true
if [[ "${{ inputs.branch }}" == 'branch-3.5' ]]; then
kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/v1.7.0/installer/volcano-development.yaml || true
else
elif [[ "${{ inputs.branch }}" == 'branch-4.0' ]]; then
kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/v1.11.0/installer/volcano-development.yaml || true
else
kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/v1.12.2/installer/volcano-development.yaml || true
fi
eval $(minikube docker-env)
build/sbt -Phadoop-3 -Psparkr -Pkubernetes -Pvolcano -Pkubernetes-integration-tests -Dspark.kubernetes.test.volcanoMaxConcurrencyJobNum=1 -Dtest.exclude.tags=local "kubernetes-integration-tests/test"
28 changes: 14 additions & 14 deletions .github/workflows/build_infra_images_cache.yml
@@ -33,13 +33,13 @@ on:
- 'dev/spark-test-image/python-minimum/Dockerfile'
- 'dev/spark-test-image/python-ps-minimum/Dockerfile'
- 'dev/spark-test-image/pypy-310/Dockerfile'
- 'dev/spark-test-image/python-309/Dockerfile'
- 'dev/spark-test-image/python-310/Dockerfile'
- 'dev/spark-test-image/python-311/Dockerfile'
- 'dev/spark-test-image/python-311-classic-only/Dockerfile'
- 'dev/spark-test-image/python-312/Dockerfile'
- 'dev/spark-test-image/python-313/Dockerfile'
- 'dev/spark-test-image/python-313-nogil/Dockerfile'
- 'dev/spark-test-image/python-314/Dockerfile'
- 'dev/spark-test-image/numpy-213/Dockerfile'
- '.github/workflows/build_infra_images_cache.yml'
# Create infra image when cutting down branches/tags
@@ -153,19 +153,6 @@ jobs:
- name: Image digest (PySpark with PyPy 3.10)
if: hashFiles('dev/spark-test-image/pypy-310/Dockerfile') != ''
run: echo ${{ steps.docker_build_pyspark_pypy_310.outputs.digest }}
- name: Build and push (PySpark with Python 3.9)
if: hashFiles('dev/spark-test-image/python-309/Dockerfile') != ''
id: docker_build_pyspark_python_309
uses: docker/build-push-action@v6
with:
context: ./dev/spark-test-image/python-309/
push: true
tags: ghcr.io/apache/spark/apache-spark-github-action-image-pyspark-python-309-cache:${{ github.ref_name }}-static
cache-from: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-pyspark-python-309-cache:${{ github.ref_name }}
cache-to: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-pyspark-python-309-cache:${{ github.ref_name }},mode=max
- name: Image digest (PySpark with Python 3.9)
if: hashFiles('dev/spark-test-image/python-309/Dockerfile') != ''
run: echo ${{ steps.docker_build_pyspark_python_309.outputs.digest }}
- name: Build and push (PySpark with Python 3.10)
if: hashFiles('dev/spark-test-image/python-310/Dockerfile') != ''
id: docker_build_pyspark_python_310
@@ -244,6 +231,19 @@ jobs:
- name: Image digest (PySpark with Python 3.13 no GIL)
if: hashFiles('dev/spark-test-image/python-313-nogil/Dockerfile') != ''
run: echo ${{ steps.docker_build_pyspark_python_313_nogil.outputs.digest }}
- name: Build and push (PySpark with Python 3.14)
if: hashFiles('dev/spark-test-image/python-314/Dockerfile') != ''
id: docker_build_pyspark_python_314
uses: docker/build-push-action@v6
with:
context: ./dev/spark-test-image/python-314/
push: true
tags: ghcr.io/apache/spark/apache-spark-github-action-image-pyspark-python-314-cache:${{ github.ref_name }}-static
cache-from: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-pyspark-python-314-cache:${{ github.ref_name }}
cache-to: type=registry,ref=ghcr.io/apache/spark/apache-spark-github-action-image-pyspark-python-314-cache:${{ github.ref_name }},mode=max
- name: Image digest (PySpark with Python 3.14)
if: hashFiles('dev/spark-test-image/python-314/Dockerfile') != ''
run: echo ${{ steps.docker_build_pyspark_python_314.outputs.digest }}
- name: Build and push (PySpark with Numpy 2.1.3)
if: hashFiles('dev/spark-test-image/numpy-213/Dockerfile') != ''
id: docker_build_pyspark_numpy_213
2 changes: 1 addition & 1 deletion .github/workflows/build_maven_java21_arm.yml
@@ -21,7 +21,7 @@ name: "Build / Maven (master, Scala 2.13, Hadoop 3, JDK 21, ARM)"

on:
schedule:
- cron: '0 15 * * *'
- cron: '0 15 */2 * *'
workflow_dispatch:

jobs:
.github/workflows/build_maven_java21_macos15.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build / Maven (master, Scala 2.13, Hadoop 3, JDK 21, MacOS-15)"
name: "Build / Maven (master, Scala 2.13, Hadoop 3, JDK 21, MacOS-26)"

on:
schedule:
@@ -33,7 +33,7 @@ jobs:
if: github.repository == 'apache/spark'
with:
java: 21
os: macos-15
os: macos-26
arch: arm64
envs: >-
{
1 change: 1 addition & 0 deletions .github/workflows/build_non_ansi.yml
@@ -40,6 +40,7 @@ jobs:
"PYSPARK_IMAGE_TO_TEST": "python-311",
"PYTHON_TO_TEST": "python3.11",
"SPARK_ANSI_SQL_MODE": "false",
"SPARK_TEST_SPARK_BLOOM_FILTER_SUITE_ENABLED": "true"
}
jobs: >-
{
.github/workflows/build_python_3.14.yml
@@ -17,7 +17,7 @@
# under the License.
#

name: "Build / Python-only (master, Python 3.9)"
name: "Build / Python-only (master, Python 3.14)"

on:
schedule:
@@ -37,8 +37,8 @@ jobs:
hadoop: hadoop3
envs: >-
{
"PYSPARK_IMAGE_TO_TEST": "python-309",
"PYTHON_TO_TEST": "python3.9"
"PYSPARK_IMAGE_TO_TEST": "python-314",
"PYTHON_TO_TEST": "python3.14"
}
jobs: >-
{