Thanks a lot, guys.
Here is the piece of code that did it:
scala> val people = sc.parallelize(Array(("Jane", "student", 1000),
| ("Peter", "doctor", 100000),
| ("Mary", "doctor", 200000),
| ("Michael", "student", 1000)))
people: org.apache.spark.rdd.RDD[(String, String, Int)] = ParallelCollectionRDD[25] at parallelize at <console>:24
scala>
scala>
scala> val b =sc.parallelize(a.map{case (_, job, salary) => (job, salary)})
b: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[26] at parallelize at <console>:26
scala> val result =b.reduceByKey(_+_).sortBy(_._1)
result: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[32] at sortBy at <console>:28
scala> result.collect
res12: Array[(String, Int)] = Array((doctor,300000), (student,2000))
And the same thing with tuple accessors (v._2, v._3) instead of the pattern match:
scala> val people = sc.parallelize(Array(("Jane", "student", 1000),
| ("Peter", "doctor", 100000),
| ("Mary", "doctor", 200000),
| ("Michael", "student", 1000)))
people: org.apache.spark.rdd.RDD[(String, String, Int)] = ParallelCollectionRDD[25] at parallelize at <console>:24
scala>
scala>
scala> val b =sc.parallelize(a.map(v => (v._2, v._3)))
b: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[26] at parallelize at <console>:26
scala> val result =b.reduceByKey(_+_).sortBy(_._1)
result: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[32] at sortBy at <console>:28
scala> result.collect
res12: Array[(String, Int)] = Array((doctor,300000), (student,2000))
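One note on the transcript: it calls `a.map`, but `a` is not defined in what I pasted. Here is a minimal sketch of the same computation, assuming `a` held the same rows as `people`. It maps the `people` RDD directly, so the second `sc.parallelize` is not needed. The name `salaryByJob` is just one I picked for illustration, and `sc` is assumed to be the SparkContext that spark-shell provides.

```scala
// Assumes a spark-shell session, where `sc` (the SparkContext) already exists.
val people = sc.parallelize(Seq(
  ("Jane", "student", 1000),
  ("Peter", "doctor", 100000),
  ("Mary", "doctor", 200000),
  ("Michael", "student", 1000)))

// Turn each (name, job, salary) row into a (job, salary) pair.
// `people` is already an RDD, so map it directly.
val salaryByJob = people.map { case (_, job, salary) => (job, salary) }

// Sum the salaries for each job, then sort by job name.
val result = salaryByJob.reduceByKey(_ + _).sortBy(_._1)

// Bring the (small) result back to the driver and print it.
result.collect().foreach(println)
// (doctor,300000)
// (student,2000)
```

Because everything stays an RDD from start to finish, nothing has to be collected to the driver and re-parallelized in between.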
Btw, a user told me to put ``` before and after the code when entering it here. On my keyboard, if I type quotes (''') it comes out different from ```. How can I type it on a normal keyboard? I always have to go back and copy the ``` from the chat. It's still very basic, but helpful.
Thanks