Scala Json Parse

I have this string representing JSON:

val query =
"""
{
  "queries" : [ {
    "queryId" : "f7436f4bdd7162d1:c1fac9fc00000000",
    "statement" : "SELECT 'Hello World!'",
    "queryType" : "QUERY",
    "queryState" : "CREATED",
    "startTime" : "2017-08-06T18:11:50.098Z",
    "rowsProduced" : null,
    "attributes" : {
      "thread_storage_wait_time" : "0",
      "admission_result" : "Admitted immediately",
      "session_id" : "f745e3785d36ab82:8ca4a969df63c181",
      "planning_wait_time" : "2543",
      "oom" : "false",
      "stats_missing" : "false",
      "memory_spilled" : "0",
      "network_address" : "192.168.126.130:43005",
      "admission_wait" : "0",
      "file_formats" : "",
      "planning_wait_time_percentage" : "2",
      "client_fetch_wait_time" : "0",
      "client_fetch_wait_time_percentage" : "0",
      "pool" : "root.cloudera",
      "session_type" : "HIVESERVER2",
      "connected_user" : "cloudera",
      "thread_network_receive_wait_time" : "0",
      "thread_network_send_wait_time" : "0",
      "impala_version" : "impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)",
      "estimated_per_node_peak_memory" : "1024",
      "query_status" : "OK"
    },
    "user" : "cloudera",
    "coordinator" : {
      "hostId" : "quickstart.cloudera"
    },
    "detailsAvailable" : true,
    "database" : "default",
    "durationMillis" : 135828
  } ],
  "warnings" : [ ]
}
"""

Basically, the string holds a JSON object with two arrays.
The queries array is an array of JSON objects.

I want to take the queries array and convert it to an array of ImpalaQuery:

    class ImpalaQuery(val queryId: String, startTime: Timestamp, planning_wait_time_percentage: Int) {
    }

Note that planning_wait_time_percentage sits inside attributes in the JSON, and in the class it should be an Int, not a map.

What is the best way to do this?
Thanks.

There are a bunch of JSON parsing libraries for Scala; have you tried one of them? For example, using Circe I would define the following data types and decoders:

import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder

case class ImpalaQuery(queryId: String, startTime: Timestamp, planning_wait_time_percentage: Int)
object ImpalaQuery {
  implicit val decoder: Decoder[ImpalaQuery] = deriveDecoder
}

case class Queries(queries: Seq[ImpalaQuery]) extends AnyVal
object Queries {
  implicit val decoder: Decoder[Queries] = deriveDecoder
}

Then parse the string with:

import io.circe.parser.decode

val impalaQuery = decode[Queries](query)
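
Note that decode does not throw on bad input; it returns an Either. A minimal sketch of handling both outcomes, assuming circe-parser is on the classpath (the DecodeResultExample object and its describe helper are hypothetical names, used only for illustration):

```scala
import io.circe.parser.decode

object DecodeResultExample {
  // decode returns Either[io.circe.Error, A]:
  // Left for malformed JSON or a shape mismatch, Right for a decoded value.
  def describe(json: String): String =
    decode[List[Int]](json) match {
      case Right(numbers) => s"decoded ${numbers.size} numbers"
      case Left(error)    => s"failed: ${error.getMessage}"
    }
}
```

The same pattern match applies to decode[Queries](query) once all the required decoders are in scope.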

It won't work, because query has two arrays (queries and warnings),
and also planning_wait_time_percentage: Int is attributes.planning_wait_time_percentage
in the JSON.
I don't mind parsing it manually in the constructor instead of using as[query]:
this.queryId=query.queryId
.
.
.
Is there a way to do that?

I just tested my idea, and with a few modifications it works. There are two main tricks:

(1) If you have a JSON object { "a": 1, "b": 2 }, then you can decode a subset of its fields into a Scala class like

case class Foo(a: Int)

(2) JSON libraries like Circe decode each component type of your custom types recursively, so if you want to decode case class ImpalaQuery(queryId: String, startTime: Timestamp, attributes: Attributes), you’ll also need to tell Circe how to decode Timestamp and Attributes. These decoders are implicit. For classes you define, the correct place to put them is the companion object; for third-party classes like Timestamp, the correct place is usually somewhere that’s auto-imported, like the package object.
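
Trick (1) can be checked in isolation. This is a minimal sketch, assuming circe-core, circe-generic and circe-parser are on the classpath; the SubsetExample object exists only for illustration:

```scala
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
import io.circe.parser.decode

// Only field "a" is modelled; the derived decoder ignores the extra field "b".
final case class Foo(a: Int)

object Foo {
  implicit val decoder: Decoder[Foo] = deriveDecoder[Foo]
}

object SubsetExample {
  val result: Either[io.circe.Error, Foo] =
    decode[Foo]("""{ "a": 1, "b": 2 }""")
}
```

This is why the extra warnings array in the original JSON is not a problem: fields the case class does not mention are simply skipped.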

Check out the code: https://github.com/yawaramin/impala-query
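
For readers who cannot open the link, here is a rough sketch of one way to apply trick (2) with hand-written decoders. It is not necessarily what the linked repository does. Assumptions: Timestamp means java.sql.Timestamp, startTime is always an ISO-8601 instant, and the percentage arrives as a JSON string (as in the question). ThirdPartyDecoders, intFromString and ImpalaExample are hypothetical names.

```scala
import java.sql.Timestamp
import java.time.Instant
import scala.util.Try
import io.circe.Decoder
import io.circe.parser.decode

// Decoders for types we do not own, kept in one importable place (hypothetical object).
object ThirdPartyDecoders {
  // Assumes an ISO-8601 instant such as "2017-08-06T18:11:50.098Z".
  implicit val timestampDecoder: Decoder[Timestamp] =
    Decoder.decodeString.emap { s =>
      Try(Timestamp.from(Instant.parse(s))).toEither.left.map(_.getMessage)
    }
}

final case class ImpalaQuery(queryId: String, startTime: Timestamp, planning_wait_time_percentage: Int)

object ImpalaQuery {
  import ThirdPartyDecoders._

  // The JSON stores the percentage as a string ("2"), so parse it explicitly.
  private val intFromString: Decoder[Int] =
    Decoder.decodeString.emap(s => Try(s.trim.toInt).toEither.left.map(_.getMessage))

  // Walk the cursor by hand so the nested attribute can be flattened into the class.
  implicit val decoder: Decoder[ImpalaQuery] = Decoder.instance { c =>
    for {
      queryId   <- c.get[String]("queryId")
      startTime <- c.get[Timestamp]("startTime")
      percent   <- c.downField("attributes").downField("planning_wait_time_percentage").as(intFromString)
    } yield ImpalaQuery(queryId, startTime, percent)
  }
}

// Wrapper for the top-level object; "warnings" is ignored.
final case class Queries(queries: List[ImpalaQuery])

object Queries {
  implicit val decoder: Decoder[Queries] =
    Decoder.instance(_.get[List[ImpalaQuery]]("queries").map(Queries(_)))
}

object ImpalaExample {
  // Trimmed-down version of the JSON from the question.
  val json: String =
    """{ "queries": [ { "queryId": "f7436f4bdd7162d1:c1fac9fc00000000", "startTime": "2017-08-06T18:11:50.098Z", "attributes": { "planning_wait_time_percentage": "2" } } ], "warnings": [] }"""

  val result: Either[io.circe.Error, Queries] = decode[Queries](json)
}
```

Writing the ImpalaQuery decoder by hand, rather than deriving it, keeps the flat three-field class from the question while still reading the percentage from inside attributes.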

Thank you for your answer, you really helped me :)
