I don't want any null values printed in the dataframe table. Whatever I try, an empty value still prints as "null". If I use dataframe.na, it removes the entire row wherever a column is null. I just want to remove the null values wherever they occur and arrange the column values as in the result table above.
My second question: how do I convert an rdd[string,string] to an rdd[string], with a new line applied? Thank you.
In the above, I need to remove the nulls in column val2.
I tried dataframe.na.isNotNull, but it removes the whole row wherever a null occurs.
What I need is to remove the null values in each column without removing the entire row.
Thank you…
Unfortunately, that is not really possible. There has to be some kind of indicator of whether a specific row in a specific column has a value or not. Think of a dataframe as a 2D matrix of size N x M: every column has the same length.
Why would your data be in this format, though? What is your use case? We can figure something out if you tell us more.
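If the goal is only to stop "null" from being printed, one option is to replace nulls with empty strings before displaying the table. The cells still exist, they just print as blanks. Below is a minimal sketch of that in Scala, plus one possible reading of the rdd question, assuming "with a new line applied" means the key and the value are joined with a newline between them. The object name, the helper names blankNulls and joinPairs, and the sample rows are made up for illustration. Note that na.fill with a string argument only touches string columns.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object BlankNullsSketch {
  // Replace nulls in string columns with "" so show() prints blanks instead of "null".
  // Non-string columns are left unchanged by this call.
  def blankNulls(df: DataFrame): DataFrame = df.na.fill("")

  // Assumed interpretation: turn each (key, value) pair into one string,
  // with a newline between the key and the value.
  def joinPairs(pairs: RDD[(String, String)]): RDD[String] =
    pairs.map { case (k, v) => s"$k\n$v" }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("blank-nulls-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; a None becomes a null cell in the dataframe.
    val df = Seq[(String, Option[String], Option[String])](
      ("1", Some("x"), Some("p")),
      ("2", None, Some("q")),
      ("3", Some("z"), None)
    ).toDF("id", "val1", "val2")

    // All three rows are kept; the null cells print as blanks.
    blankNulls(df).show()

    val pairs = spark.sparkContext.parallelize(Seq(("A", "a1"), ("B", "a2")))
    joinPairs(pairs).collect().foreach(println)

    spark.stop()
  }
}
```

Unlike dropping rows, this keeps every row in place, so it only changes how the table looks and not how many rows it has.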
Hi,
I have two columns holding key and value pairs,
such as:
Column A    Column B
A           a1
B           a2
A           a2
B           a2
C           a3
A           a2
B           a3
I need to collect the A, B and C values as individual columns without deleting duplicate values. But when I use the pivot and agg methods, the duplicate values are deleted and nulls appear.
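One way this could be approached, as a rough sketch and not the only option: number the values within each key first, then pivot on the key while grouping by that number, so duplicates land in separate rows instead of being collapsed. The column names key, value, rid and idx, the object name and the helper name pivotKeepingDuplicates are assumptions for illustration. The sketch also assumes the original row order can be approximated with monotonically_increasing_id, which is not guaranteed for every data source. Keys with fewer values (C here) still end up with nulls in the extra rows, since every column has to have the same length; those are blanked with na.fill for display only.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{first, monotonically_increasing_id, row_number}

object PivotKeepDuplicatesSketch {
  // Expects string columns "key" and "value"; returns one column per distinct key.
  def pivotKeepingDuplicates(df: DataFrame): DataFrame = {
    // Tag each row with an increasing id to approximate the input order.
    val withOrder = df.withColumn("rid", monotonically_increasing_id())

    // Number the values within each key: 1st A, 2nd A, 3rd A, ...
    val perKey = Window.partitionBy("key").orderBy("rid")
    val indexed = withOrder.withColumn("idx", row_number().over(perKey))

    // Group by the per-key position so duplicate values stay in separate rows.
    indexed
      .groupBy("idx")
      .pivot("key")
      .agg(first("value"))
      .orderBy("idx")
      .drop("idx")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-keep-duplicates-sketch")
      .getOrCreate()
    import spark.implicits._

    // The key/value pairs from the post above.
    val df = Seq(
      ("A", "a1"), ("B", "a2"), ("A", "a2"), ("B", "a2"),
      ("C", "a3"), ("A", "a2"), ("B", "a3")
    ).toDF("key", "value")

    // Blank out the remaining nulls for display only.
    pivotKeepingDuplicates(df).na.fill("").show()

    spark.stop()
  }
}
```

With the sample data this gives three rows, because A and B each have three values, and column C is filled only in the first row.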