mapPartitions and foreachPartition


#1

Hi
I have a Dataset like the one below, and I want to apply my own logic to each partition. How do I partition/split a Dataset, RDD, or DataFrame? I am running into many issues with foreachPartition and mapPartitions, and I can't find proper documentation or examples to check against. Please help, I'm stuck on this.
Sample data:
{23, "mcr", 10.09}
{23, "Ncr", 1.09}
{24, "Lcr", 10.09}
{24, "Hcr", 133.09}
p1:
{23, "mcr", 10.09}
{23, "Ncr", 1.09}
p2: partition 2
{24, "Lcr", 10.09}
{24, "Hcr", 133.09}

Thanks,
Gopi


#2

Hello. What do you mean by "I have a Dataset"? You present it line by line as text:
{23, "mcr", 10.09}
{23, "Ncr", 1.09}
{24, "Lcr", 10.09}
{24, "Hcr", 133.09}
p1:
{23, "mcr", 10.09}
{23, "Ncr", 1.09}
p2: partition 2
{24, "Lcr", 10.09}
{24, "Hcr", 133.09}

Do you really have a text line "p1:"?


#3

Hi Alex,

This is not text; it is the output of a Dataset, and it has all the rows. Now I want to apply my calculation logic to each partition. That is, divide the whole Dataset or RDD into multiple partitions, with the rows for 23 in one partition and the rows for 24 in another.
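
To make the request concrete, here is a minimal sketch in plain Scala collections. It does not call Spark. The `Record` case class, its field names, and the helpers `partitionById` and `sumPerId` are all made up for illustration, and the "calculation" is just a sum of the third column. The per-partition function has the `Iterator => Iterator` shape that Spark's `mapPartitions` takes, so something like it could presumably be passed there after a `repartition` on the key column, or the per-key work could be done with `groupByKey` followed by `mapGroups`. Treat that mapping as an assumption to check against the Spark docs.

```scala
// Hypothetical row type for the sample data; the field names are assumptions.
final case class Record(id: Int, name: String, value: Double)

object PartitionSketch {
  // The four sample rows from the first post.
  val rows: Seq[Record] = Seq(
    Record(23, "mcr", 10.09),
    Record(23, "Ncr", 1.09),
    Record(24, "Lcr", 10.09),
    Record(24, "Hcr", 133.09)
  )

  // Per-partition logic with an Iterator => Iterator shape.
  // A partition may hold rows for more than one id, so the function
  // groups by id again instead of assuming a single key per partition.
  def sumPerId(partition: Iterator[Record]): Iterator[(Int, Double)] =
    partition.toSeq
      .groupBy(_.id)
      .map { case (id, rs) => id -> rs.map(_.value).sum }
      .iterator

  // Local stand-in for "partition by id": one group of rows per distinct id.
  def partitionById(data: Seq[Record]): Map[Int, Seq[Record]] =
    data.groupBy(_.id)

  def main(args: Array[String]): Unit = {
    // Walk the simulated partitions in id order and apply the logic to each.
    partitionById(rows).toSeq.sortBy(_._1).foreach { case (id, part) =>
      println(s"partition for id $id: ${part.map(_.name).mkString(", ")}")
      sumPerId(part.iterator).foreach { case (k, total) =>
        println(s"  id $k total $total")
      }
    }
  }
}
```

The reason `sumPerId` regroups by id is that a hash-based partitioner can place several keys in the same partition, so logic that assumes exactly one key per partition may silently mix rows.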


#4

This really isn’t a Scala question, it’s a Spark one (I believe) – you may have more success with a Spark-centric forum…