Hi.
Completely new to this Scala thing…!
I have a JSON file containing a serialised map. When I read it into Scala, the map comes back as a struct. I want to explode this map, so I need to convert it into an array or a map.
A couple of questions:
- Is there a way of ensuring the data is read into Scala as a map (or array)?
- Failing that, how can I convert this struct into a map (or array) so I can "explode" it?
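For the first question, I did wonder whether supplying an explicit schema up front would do it. Something like this untested sketch (field names guessed from my data, and `spark`/`sc` as in the shell session below):

```scala
import org.apache.spark.sql.types._

// Declare Books as a map from the serialised id strings to a (Id, Name) struct,
// instead of letting Spark infer each id as a separate struct field.
val bookSchema = StructType(Seq(
  StructField("Id", StringType),
  StructField("Name", StringType)
))
val schema = StructType(Seq(
  StructField("Books", MapType(StringType, bookSchema))
))

val dfsTyped = spark.read
  .schema(schema)
  .json(sc.wholeTextFiles("sample.json").values)
```

Would that be the right direction, or does the map-keys-as-struct-fields inference get in the way regardless?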
Steps I'm using:
Start the shell, then:
val sqlContext = spark.sqlContext
val dfs = sqlContext.read.json(sc.wholeTextFiles("sample.json").values)
dfs.printSchema()
root
 |-- Books: struct (nullable = true)
| |-- 0919191919191919: struct (nullable = true)
| | |-- Id: string (nullable = true)
| | |-- Name: string (nullable = true)
| |-- 8181818181818181: struct (nullable = true)
| | |-- Id: string (nullable = true)
| | |-- Name: string (nullable = true)
I want Books to be a Map or Array; it is a map when I serialise the data to the JSON file.
Do I have to use a custom UDF here, or can I apply a schema before loading the JSON, or apply some transformation to the data in Scala?
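For context, the kind of post-load transformation I had in mind (an untested sketch, with the key values taken from the schema above) would read the dynamic field names out of the Books struct and rebuild it as an array of key/value structs that explode can work on:

```scala
import org.apache.spark.sql.functions._

// The struct field names are the map keys from the original JSON.
val bookKeys = dfs.select("Books.*").columns

// Rebuild Books as an array of (key, value) structs, then explode to one row per book.
// Backticks guard the numeric field names like 0919191919191919.
val exploded = dfs.select(
  explode(
    array(bookKeys.map(k =>
      struct(lit(k).as("key"), col(s"Books.`$k`").as("value"))
    ): _*)
  ).as("book")
)

exploded.select("book.key", "book.value.Id", "book.value.Name").show()
```

Is this roughly the idiomatic way, or is a UDF unavoidable here?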
Many thanks