SBT cannot compile .scala file with spark-hive dependency

Hi everyone,

I am trying to compile a Scala script to submit a job through spark-submit. I am using sbt from the Windows command line to compile. The directory structure is as defined by sbt.

Here is my build file:
build.sbt

name := "TestQuery"
version := "1.0"
scalaVersion := "2.11.8"

libraryDependencies ++= {
  val sparkVer = "2.1.0"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-hive" % sparkVer % "provided"
  )
}

My TestQuery.scala file is under ./test/src/main/scala/TestQuery.scala
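For reference, the layout I am assuming (with build.sbt directly under ./test, per the sbt convention) is:

test/
  build.sbt
  src/
    main/
      scala/
        TestQuery.scala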

From the Windows cmd, I switch directory to ./test and run sbt. When I run the compile command, sbt gives the following error:

[error] ./test/src/main/scala/TestQuery.scala:2:29: object hive is not a member of package org.apache.spark.sql

sbt uses the maven2 repository, and spark-hive exists under:
https://repo1.maven.org/maven2/org/apache/spark/spark-hive_2.11/1.2.0/

Also, this import statement works in spark-shell (spark-shell runs Spark 2.1.1 and Scala 2.11.8).

Why can’t it find it?

That’s normal: you’re adding the “provided” scope, which means you’re telling sbt that it shouldn’t download this dependency and that it should expect it to already be on the classpath. Removing “provided” should solve your issue.
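In other words, the dependency block would look something like this (just a sketch of your posted build.sbt with the scope removed, untested):

libraryDependencies ++= {
  val sparkVer = "2.1.0"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer withSources(),
    "org.apache.spark" %% "spark-hive" % sparkVer
  )
}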

Provided dependencies are on the compile classpath; the “provided” scope only excludes them from the runtime classpath.
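You can verify that from the sbt shell (sbt 0.13-style syntax; dependencyClasspath is a standard task key):

show compile:dependencyClasspath

The spark-core and spark-hive jars should appear in that list even though they are marked “provided”.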

Hi, Spark dependencies should always be “provided”, because they will already be on the Spark cluster when you run your job.
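The “provided” scope mainly matters at packaging time: if you build a fat jar (for example with sbt-assembly), the provided dependencies stay out of it and the Spark jars already installed on the cluster are used instead. A minimal sketch, assuming the sbt-assembly plugin (the plugin version is only illustrative):

project/plugins.sbt

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

Then sbt assembly produces the jar you pass to spark-submit.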

Your tests should work perfectly with “provided” libraries; see the example below:
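Here is a sketch of what I mean (the test class name and ScalaTest version are illustrative, not from your project). Because “provided” dependencies are on the test classpath, the test can create its own local SparkSession:

build.sbt (addition)

libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1" % "test"

src/test/scala/TestQuerySpec.scala

import org.apache.spark.sql.SparkSession
import org.scalatest.FunSuite

class TestQuerySpec extends FunSuite {
  test("a local SparkSession starts and runs a query") {
    // Runs entirely in-process; no cluster is needed for the test.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("TestQuerySpec")
      .getOrCreate()
    try {
      assert(spark.sql("SELECT 1 AS x").collect().length == 1)
    } finally {
      spark.stop()
    }
  }
}

Run it with sbt test.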

I think you are missing the “spark-sql” dependency in your build.
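With it added, the dependency block would look roughly like this (keeping your version and the “provided” scope; untested sketch):

libraryDependencies ++= {
  val sparkVer = "2.1.0"
  Seq(
    "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources(),
    "org.apache.spark" %% "spark-sql"  % sparkVer % "provided",
    "org.apache.spark" %% "spark-hive" % sparkVer % "provided"
  )
}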