Can anyone suggest a good way to iterate over all the elements in
rdd_43: org.apache.spark.rdd.RDD[((Int, String, String), Iterable[(Int, Int, Int, Int, Int, Int, Int)])] = ShuffledRDD[203] at groupByKey at <console>:115
and then, for each key, sum every one of the seven Int fields across the tuples in its Iterable?
I built rdd_43 by grouping the following RDD on its 1st, 2nd and 3rd elements:
rdd_42: org.apache.spark.rdd.RDD[(Int, String, String, Int, Int, Int, Int, Int, Int, Int)] = UnionRDD[201] at union at <console>:113
The final output should be an RDD[(Int, String, String, Int, Int, Int, Int, Int, Int, Int)], i.e. one row per key followed by the seven summed values.
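
For reference, here is a minimal sketch of two ways this could be done, assuming a spark-shell session where rdd_42 and rdd_43 have the types shown above. The names Key, Vals, FlatRow, addVals, sumGrouped and sumFlat are hypothetical helpers introduced only for illustration. On Spark versions older than 1.3, reduceByKey probably also needs `import org.apache.spark.SparkContext._` for the pair-RDD implicits.

```scala
import org.apache.spark.rdd.RDD

// Hypothetical aliases for the shapes in the question.
type Key     = (Int, String, String)
type Vals    = (Int, Int, Int, Int, Int, Int, Int)
type FlatRow = (Int, String, String, Int, Int, Int, Int, Int, Int, Int)

// Element-wise sum of two 7-tuples.
def addVals(a: Vals, b: Vals): Vals =
  (a._1 + b._1, a._2 + b._2, a._3 + b._3, a._4 + b._4,
   a._5 + b._5, a._6 + b._6, a._7 + b._7)

// Option 1: start from the grouped RDD (the shape of rdd_43).
// groupByKey never emits an empty Iterable, so reduce is safe here.
def sumGrouped(grouped: RDD[(Key, Iterable[Vals])]): RDD[FlatRow] = {
  grouped.map { case ((k1, k2, k3), vs) =>
    val (a, b, c, d, e, f, g) = vs.reduce(addVals)
    (k1, k2, k3, a, b, c, d, e, f, g)
  }
}

// Option 2: start from the flat RDD (the shape of rdd_42) and skip
// groupByKey. reduceByKey combines values on each partition before the
// shuffle, so it usually moves less data than grouping first.
def sumFlat(flat: RDD[FlatRow]): RDD[FlatRow] = {
  val keyed = flat.map { case (k1, k2, k3, a, b, c, d, e, f, g) =>
    ((k1, k2, k3), (a, b, c, d, e, f, g))
  }
  keyed.reduceByKey(addVals).map { case ((k1, k2, k3), (a, b, c, d, e, f, g)) =>
    (k1, k2, k3, a, b, c, d, e, f, g)
  }
}
```

With these in scope, `sumGrouped(rdd_43)` or `sumFlat(rdd_42)` should each give an RDD of the requested type. If the grouped RDD is not needed for anything else, the second form is generally the one to prefer.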