Using SQL commands in Scala

Hello, I am trying to run a SQL query using DataFrames.

val namesDF = spark.read.json("/user/ashhall1616/bdc_data/lab_6/names.json")

continentDF.createOrReplaceTempView("continents")
namesDF.createOrReplaceTempView("names")

spark.sql("""SELECT continents.countryCode AS CountryCode,  names.name AS Name FROM
continents INNER JOIN names
ON continents.countryCode = names.countryCode
WHERE CountryCode = 'OC'
GROUP BY Name
ORDER BY Name ASC""").show(10)

I get this error:

org.apache.spark.sql.AnalysisException: Reference 'CountryCode' is ambiguous, could be: CountryCode#43, CountryCode#47.; line 4 pos 6

Can someone tell me what's wrong here?

You need to be more explicit in your WHERE clause. Both tables have a countryCode column, so you have to qualify it with the table name: use either continents.countryCode = 'OC' or names.countryCode = 'OC'.

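The AS alias only renames the column in the result of the query. It is not visible inside the WHERE clause, so CountryCode there still matches the countryCode column of both tables (Spark compares column names case-insensitively by default), which is why the reference is ambiguous.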
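
For illustration, here is a minimal sketch of the query with a qualified WHERE clause. The two DataFrames are small made-up stand-ins (the real tables come from the JSON files in the question, whose schemas I can't see), and the local SparkSession is only there so the snippet runs on its own. I also swapped GROUP BY Name for SELECT DISTINCT, since Spark would most likely reject a GROUP BY that leaves out the selected countryCode column; treat that part as my assumption about what the query is meant to return.

```scala
import org.apache.spark.sql.SparkSession

// Local session so the sketch is self-contained; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .appName("qualified-where-sketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in data: both tables have a countryCode column.
val continentDF = Seq("OC", "EU").toDF("countryCode")
val namesDF = Seq(
  ("OC", "Bob"),
  ("OC", "Alice"),
  ("OC", "Alice"),
  ("EU", "Carol")
).toDF("countryCode", "name")

continentDF.createOrReplaceTempView("continents")
namesDF.createOrReplaceTempView("names")

// The WHERE clause names the table explicitly, so the reference is no longer ambiguous.
// DISTINCT stands in for the original GROUP BY Name (an assumption about the intent).
val result = spark.sql("""
  SELECT DISTINCT continents.countryCode AS CountryCode, names.name AS Name
  FROM continents INNER JOIN names
    ON continents.countryCode = names.countryCode
  WHERE continents.countryCode = 'OC'
  ORDER BY Name ASC
""")

result.show(10)
```

If you would rather keep GROUP BY, grouping by both continents.countryCode and names.name should give the same rows.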
