How to group two columns using pivot, agg without deleting duplicate values

kumarraj · December 11, 2017, 11:50am

Hi,
I have two columns, A and B:

    A   B
    a1  b1
    a1  b1
    a1  b2
    a2  b1
    a2  b3
    a3  b3
    a3  b4

I need to group A and B and create new columns in a Scala DataFrame, one column per distinct value of A, so the result should be:

    A   B    =>   a1   a2    a3
    ------------------------------
    a1  b1        b1   b1    b3
    a1  b1        b1   b3    b4
    a1  b2        b2   null  b3
    a2  b1
    a2  b3
    a3  b3
    a3  b4
    a3  b3

The a1, a2 and a3 columns (derived from A and B) should be as below:

    a1   a2    a3
    ----------------
    b1   b1    b3
    b1   b3    b4
    b2   null  b3

I tried pivot with aggregation, but it removes duplicate entries from columns a1, a2 and a3.
How can I achieve the above using Scala Spark?

Thank you.
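
One possible approach, offered as a sketch rather than a confirmed solution: number the rows inside each A group and pivot on that position, so equal B values sitting at different positions are never merged. The helper name `pivotKeepingDuplicates` and the columns `_id` and `_pos` are made up for illustration. The sample data is the eight-row A/B listing from the desired-result table (with the second `a3 b3` row). It also assumes the current row order is the order you want inside each column, which Spark does not guarantee once the data has been shuffled.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{first, monotonically_increasing_id, row_number}

object PivotKeepDuplicates {

  // Hypothetical helper: one output column per distinct value of A,
  // with every B value kept, duplicates included.
  def pivotKeepingDuplicates(df: DataFrame): DataFrame = {
    // Tag each row with an increasing id that follows the current row order
    // (assumption: that order is the one wanted in the output).
    val withId = df.withColumn("_id", monotonically_increasing_id())

    // Number the rows inside each A group: 1, 2, 3, ...
    val byA = Window.partitionBy("A").orderBy("_id")
    val indexed = withId.withColumn("_pos", row_number().over(byA))

    // Group by position instead of by value, then pivot on A.
    // Each (position, A) cell holds exactly one B, so first() just picks it.
    indexed
      .groupBy("_pos")
      .pivot("A")
      .agg(first("B"))
      .orderBy("_pos")
      .drop("_pos")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-keep-duplicates")
      .getOrCreate()
    import spark.implicits._

    // Eight-row listing from the question's desired-result table.
    val df = Seq(
      ("a1", "b1"), ("a1", "b1"), ("a1", "b2"),
      ("a2", "b1"), ("a2", "b3"),
      ("a3", "b3"), ("a3", "b4"), ("a3", "b3")
    ).toDF("A", "B")

    pivotKeepingDuplicates(df).show()
    spark.stop()
  }
}
```

Groups with fewer rows than the longest one are padded with null, which matches the null under a2 in the desired table. If the real data has a column that defines the order (a timestamp, for example), ordering the window by that column instead of `_id` would be more reliable.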