Scala Spark unit test run on cluster



How can I run a Scala Spark unit test in cluster mode (Hortonworks/Cloudera)? I would appreciate guidance on how to approach this.
Say I have a class with a method that I need to test.
For the sake of learning, I am using the example below.

import org.apache.spark.sql.SparkSession
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

class wordcountlogic {

  def wc1(file: String, sc: SparkContext): RDD[(String, Int)] = {
    val lines = sc.textFile(file, 2)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)
  }
}

The test class is below.

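import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{BeforeAndAfterAll, FunSuite} // ScalaTest 3.0.x style

class WordCountTest extends FunSuite with BeforeAndAfterAll {

  var sparkConf: SparkConf = _
  var sc: SparkContext = _

  override def beforeAll(): Unit = {
    // No master is set here; it is expected to come from the launcher (e.g. spark-submit).
    sparkConf = new SparkConf().setAppName("test wordCount")
    sc = new SparkContext(sparkConf)
  }

  val wordcount = new wordcountlogic

  test("get word count rdd") {
    val result = wordcount.wc1("file.txt", sc)
    // take(10) returns an Array, so compare its length rather than the array itself.
    // Assumes file.txt contains at least 10 distinct words.
    assert(result.take(10).length == 10)
  }

  override def afterAll(): Unit = {
    // Guard against a context that failed to start.
    if (sc != null) sc.stop()
  }

}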
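
For comparison, here is a minimal sketch of how the same word-count logic might be checked in local mode, without a cluster. It assumes ScalaTest 3.0.x and Spark on the test classpath. The class name WordCountLocalTest, the helper countWords, and the in-memory input are hypothetical and only for illustration; they are not part of the classes above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.scalatest.{BeforeAndAfterAll, FunSuite}

// Hypothetical local-mode variant of the test, for illustration only.
class WordCountLocalTest extends FunSuite with BeforeAndAfterAll {

  private var sc: SparkContext = _

  override def beforeAll(): Unit = {
    // local[2] runs Spark inside the test JVM with two worker threads.
    val conf = new SparkConf()
      .setAppName("test wordCount (local)")
      .setMaster("local[2]")
    sc = new SparkContext(conf)
  }

  // Same transformation as wc1, but taking an RDD so no input file is needed.
  private def countWords(lines: RDD[String]): RDD[(String, Int)] =
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _)

  test("counts each word in a small in-memory input") {
    val lines = sc.parallelize(Seq("a b a", "b c"), 2)
    val counts = countWords(lines).collect().toMap
    assert(counts == Map("a" -> 2, "b" -> 2, "c" -> 1))
  }

  override def afterAll(): Unit = {
    // Guard against a context that failed to start.
    if (sc != null) sc.stop()
  }
}
```

Passing an RDD instead of a file path keeps the assertion independent of any file system, so the expected counts can be stated exactly.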

How do I run this WordCountTest on a cluster? Once I know how, I can scale the approach to my production applications.

Is there a better approach to unit testing Scala Spark applications?
Any suggestions or input would be appreciated.