feat: add configurable spark configuration (#3)
* feat: add configurable spark configuration

* fix: scala formatting
fmarsault authored Aug 25, 2022
1 parent 99d2dd8 commit 22bdc20
Showing 5 changed files with 35 additions and 8 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,3 @@
target/
.bsp
.idea/
15 changes: 10 additions & 5 deletions README.md
@@ -1,4 +1,4 @@
# SparkTest - 0.1.0
# SparkTest - 0.2.0

**SparkTest** is a Scala library for unit testing with [Spark](https://github.com/apache/spark).
For now, it is only made for DataFrames.
@@ -56,20 +56,21 @@ To use **SparkTest** in an existing maven or sbt project:
<dependency>
<groupId>com.bedrockstreaming</groupId>
<artifactId>sparktest_2.12</artifactId>
<version>0.1.0</version>
<version>0.2.0</version>
<scope>test</scope>
</dependency>
```

### SBT

```scala
libraryDependencies += "com.bedrockstreaming" % "sparktest_2.12" % "0.1.0" % "test"
libraryDependencies += "com.bedrockstreaming" % "sparktest_2.12" % "0.2.0" % "test"
```

## Tools
### SparkTestSupport
This small `trait` provides a simple SparkSession with the log level set to warnings, letting you focus on your tests rather than on the technical setup needed to create them.
If your SparkSession needs additional configuration, you can pass it through the val `additionalSparkConfiguration`.

Example:
```scala
@@ -83,8 +84,12 @@ class MainSpec
with Matchers
with SparkTestSupport {

override lazy val additionalSparkConfiguration: Map[String, String] =
Map("spark.sql.extensions" -> "io.delta.sql.DeltaSparkSessionExtension",
"spark.sql.catalog.spark_catalog" -> "org.apache.spark.sql.delta.catalog.DeltaCatalog")

"main" should "do stuff" in {
# A SparkSession `spark` is built in trait `SparkTestSupport`
// A SparkSession `spark` is built in trait `SparkTestSupport`
import spark.implicits._

// ...
@@ -110,7 +115,7 @@ class MainSpec
with SparkTestSupport {

"main" should "do stuff" in {
# A SparkSession `spark` is built in trait `SparkTestSupport`
// A SparkSession `spark` is built in trait `SparkTestSupport`
import spark.implicits._

val df1 = Seq(("id1", 42)).toDF("id", "age")
2 changes: 1 addition & 1 deletion build.sbt
@@ -1,5 +1,5 @@
ThisBuild / organization := "com.bedrockstreaming"
ThisBuild / version := "0.1.0"
ThisBuild / version := "0.2.0"
ThisBuild / scalaVersion := "2.12.11"

// ********
@@ -8,13 +8,17 @@ trait SparkTestSupport
lazy val appName: String = "SparkTest Session"
lazy val logLevel: Level = Level.WARN
lazy val shufflePartitions: Int = 2
lazy val additionalSparkConfiguration: Map[String, String] = Map()

implicit val spark: SparkSession = SparkSession
private val sparkBuilder = SparkSession
.builder()
.master("local[*]")
.appName(appName)
.config("spark.sql.shuffle.partitions", shufflePartitions.toString)
.getOrCreate()

additionalSparkConfiguration.foreach { case (k, v) => sparkBuilder.config(k, v) }

implicit val spark: SparkSession = sparkBuilder.getOrCreate()

spark.sparkContext.setLogLevel(logLevel.name())
}
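
With this change, a consuming test class can feed extra settings into the shared session by overriding `additionalSparkConfiguration`; the map is applied to the builder before `getOrCreate()` runs. A minimal sketch (the class name `TimezoneSpec` and the choice of `spark.sql.session.timeZone` as the setting are illustrative, not part of this commit; it assumes Spark and ScalaTest on the test classpath):

```scala
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers
import com.bedrockstreaming.data.sparktest.SparkTestSupport

// Hypothetical consumer test: tune the shared SparkSession through the
// overridable `additionalSparkConfiguration` hook added in this commit.
class TimezoneSpec extends AnyFlatSpec with Matchers with SparkTestSupport {

  // `spark.sql.session.timeZone` is a standard Spark SQL setting; every
  // entry in this map is passed to the session builder before
  // getOrCreate() is called.
  override lazy val additionalSparkConfiguration: Map[String, String] =
    Map("spark.sql.session.timeZone" -> "UTC")

  "the session" should "pick up the extra configuration" in {
    spark.conf.get("spark.sql.session.timeZone") shouldBe "UTC"
  }
}
```

Note that `getOrCreate()` returns an existing session if one is already running in the JVM, so settings supplied this way are only guaranteed to apply when the trait creates the session itself.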
@@ -0,0 +1,17 @@
package com.bedrockstreaming.data.sparktest

import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class SparkTestSupportSpec extends AnyFlatSpec with Matchers with SparkTestSupport {

override lazy val additionalSparkConfiguration: Map[String, String] =
Map("spark.test.toto" -> "false", "spark.test.titi" -> "0")

"spark" should "possess passed additional configuration" in {
val sparkConf = spark.conf.getAll
sparkConf.contains("spark.test.toto") shouldBe true
sparkConf.contains("spark.test.titi") shouldBe true
sparkConf.contains("spark.test.tata") shouldBe false
}
}
