Struct to array or map

Hi.

Completely new to this Scala thing…!

I have a JSON file which has a serialised map in it. When I read it into Scala (via Spark), the map comes through as a struct. I want to explode this map, so I need to convert it into an array or a map.

A couple of questions:

  1. Is there a way of ensuring the data is read into Scala as a map (or array)?
  2. Failing that, how can I convert this struct into a map (or array) so I can “explode” it?

Steps I’m using:

Start the Spark shell, then:
val sqlContext = spark.sqlContext
val dfs = sqlContext.read.json(sc.wholeTextFiles("sample.json").values)
dfs.printSchema()

root
 |-- Books: struct (nullable = true)
 |    |-- 0919191919191919: struct (nullable = true)
 |    |    |-- Id: string (nullable = true)
 |    |    |-- Name: string (nullable = true)
 |    |-- 8181818181818181: struct (nullable = true)
 |    |    |-- Id: string (nullable = true)
 |    |    |-- Name: string (nullable = true)

I want Books to be a map or an array; it is a map when I serialise the data to the JSON file.

Do I have to use a custom UDF here, or can I apply a schema before loading the JSON, or apply some transformation to the data in Scala?
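
To make the "apply a schema before loading" idea concrete, here is a minimal sketch of what that might look like. It assumes Spark 2.2 or later, is meant to be pasted into spark-shell (where `spark` is already defined), and uses a made-up inline JSON record in place of sample.json; the Name values are placeholders, not real data. It is one possible approach, not necessarily the recommended one.

```scala
// Paste into spark-shell, where `spark` is already defined.
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{MapType, StringType, StructType}
import spark.implicits._

// Each book is a struct with Id and Name, matching the printed schema.
val bookType = new StructType().add("Id", StringType).add("Name", StringType)

// Declare Books as map<string, struct> instead of letting Spark infer a struct.
val schema = new StructType().add("Books", MapType(StringType, bookType))

// Hypothetical stand-in for sample.json (placeholder Name values).
val sample = Seq(
  """{"Books": {"0919191919191919": {"Id": "0919191919191919", "Name": "Book A"}, "8181818181818181": {"Id": "8181818181818181", "Name": "Book B"}}}"""
).toDS()

// Read with the explicit schema so no inference happens.
val dfs = spark.read.schema(schema).json(sample)
dfs.printSchema()

// explode on a map column produces one row per entry, in columns `key` and `value`.
val books = dfs
  .select(explode(col("Books")))
  .select(col("key"), col("value.Id").as("Id"), col("value.Name").as("Name"))
books.show(false)
```

If this works as expected, swapping `sample` for the original `sc.wholeTextFiles("sample.json").values` should read the real file the same way, and no UDF would be needed.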

Many thanks
