This is an experimental Swift library that shows how to connect to a remote Apache Spark Connect server and run SQL statements to manipulate remote data.
So far, this project tracks upstream changes such as the Apache Spark 4.0.0 RC3 release and the Apache Arrow project's Swift support.
- Apache Spark 4.0.0 RC3 (March 2025)
- Swift 6.0 (2024)
- gRPC Swift 2.1 (March 2025)
- gRPC Swift Protobuf 1.1 (March 2025)
- gRPC Swift NIO Transport 1.0 (March 2025)
- Apache Arrow Swift
Create a Swift project.
$ mkdir SparkConnectSwiftApp
$ cd SparkConnectSwiftApp
$ swift package init --name SparkConnectSwiftApp --type executable
Add the SparkConnect package as a dependency, as follows.
$ cat Package.swift
// swift-tools-version: 6.0
import PackageDescription

let package = Package(
  name: "SparkConnectSwiftApp",
  platforms: [
    .macOS(.v15)
  ],
  dependencies: [
    .package(url: "https://github.com/apache/spark-connect-swift.git", branch: "main")
  ],
  targets: [
    .executableTarget(
      name: "SparkConnectSwiftApp",
      dependencies: [.product(name: "SparkConnect", package: "spark-connect-swift")]
    )
  ]
)
Use SparkSession from the SparkConnect module in Swift.
$ cat Sources/main.swift
import SparkConnect
let spark = try await SparkSession.builder.getOrCreate()
print("Connected to Apache Spark \(await spark.version) Server")
let statements = [
  "DROP TABLE IF EXISTS t",
  "CREATE TABLE IF NOT EXISTS t(a INT) USING ORC",
  "INSERT INTO t VALUES (1), (2), (3)",
]
for s in statements {
  print("EXECUTE: \(s)")
  _ = try await spark.sql(s).count()
}
print("SELECT * FROM t")
try await spark.sql("SELECT * FROM t").cache().show()
try await spark.range(10).filter("id % 2 == 0").write.mode("overwrite").orc("/tmp/orc")
try await spark.read.orc("/tmp/orc").show()
await spark.stop()
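In the listing above, a failing statement ends the program at the first `try` without saying which statement failed. The following is a minimal sketch of one way to report it. `StatementFailure` and `runAll` are hypothetical names introduced here for illustration and are not part of the SparkConnect module; the executor is a do-nothing stand-in so the sketch runs without a server.

```swift
// Hypothetical helper type; not part of the SparkConnect module.
struct StatementFailure: Error {
  let statement: String
  let underlying: any Error
}

// Hypothetical helper: runs statements in order, stops at the first failure,
// and returns the number of statements executed.
func runAll(
  _ statements: [String],
  using execute: @Sendable (String) async throws -> Void
) async throws -> Int {
  var executed = 0
  for s in statements {
    print("EXECUTE: \(s)")
    do {
      try await execute(s)
    } catch {
      // Attach the failing statement to the original error.
      throw StatementFailure(statement: s, underlying: error)
    }
    executed += 1
  }
  return executed
}

// Stand-in executor that does nothing, so this sketch needs no server.
let total = try await runAll([
  "DROP TABLE IF EXISTS t",
  "CREATE TABLE IF NOT EXISTS t(a INT) USING ORC",
  "INSERT INTO t VALUES (1), (2), (3)",
]) { _ in }
print("Executed \(total) statements")
```

With a live session, the closure body would presumably wrap the same `spark.sql(s).count()` call used in the listing above.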
Run your Swift application.
$ swift run
...
Connected to Apache Spark 4.0.0 Server
EXECUTE: DROP TABLE IF EXISTS t
EXECUTE: CREATE TABLE IF NOT EXISTS t(a INT) USING ORC
EXECUTE: INSERT INTO t VALUES (1), (2), (3)
SELECT * FROM t
+---+
| a |
+---+
| 2 |
| 1 |
| 3 |
+---+
+----+
| id |
+----+
| 2 |
| 6 |
| 0 |
| 8 |
| 4 |
+----+
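The rows in both tables appear in no particular order, which is expected because neither query sorts its result. As a local cross-check of the second table, the sketch below applies the same predicate as `filter("id % 2 == 0")` to the ids 0 through 9 in plain Swift. It does not contact Spark and only illustrates which ids should appear.

```swift
// Local stand-in for spark.range(10).filter("id % 2 == 0"); no Spark involved.
let evenIds = (0..<10).filter { $0 % 2 == 0 }
print(evenIds)  // [0, 2, 4, 6, 8]

// Compare as sets, since the server may return the same rows in any order.
let shown: Set<Int> = [2, 6, 0, 8, 4]  // ids in the order printed above
print(Set(evenIds) == shown)  // true
```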
You can find this example in the following repository.