How would I repeat rows in a DataFrame?

Hi how’s it going?

If I have a Spark DataFrame like this…

  val df = spark.read
    .format("csv")
    .option("sep",",")
    .option("inferSchema","true")
    .option("header","true")
    .load(dbPath+"data"+".csv")

and I want to repeat its rows so that each row appears 7 times in the DataFrame, how would I do that?

Here’s an example of the before and after I’m looking for, except done in Python.

Thank you!

Union the df with itself 6 times.

This does not work. Is there anyone who can give me a response that works? I’ve been reading online about the spark.sql functions repeat and explode, could those be possible solutions? The rows that are repeated need to stay next to each other.

Thanks

Hi,

Here’s how you can do it.

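  import spark.implicits._  // enables the $"col" column syntax
  val df = ...
  // 6 unions on top of the original give 7 copies; the sort puts identical rows next to each other
  val f = (0 to 5).foldLeft(df)((d, _) => d.union(df)).orderBy($"col1", $"col2")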
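
On the explode idea raised earlier in the thread: a minimal sketch of that approach, assuming Spark 2.4 or later (where array_repeat is available) and a spark-shell or notebook session. The two-row df, the column names col1 and col2, and the helper column "copy" are hypothetical stand-ins, not the real CSV data.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array_repeat, explode, lit}

// Reuses the existing session in spark-shell or a notebook, otherwise starts a local one.
val spark = SparkSession.builder().appName("repeat-rows").master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the DataFrame loaded from CSV in the question.
val df = Seq(("a", 1), ("b", 2)).toDF("col1", "col2")

val copies = 7

// array_repeat builds a 7-element array for every row;
// explode then emits one output row per array element.
val repeated = df
  .withColumn("copy", explode(array_repeat(lit(0), copies)))
  .drop("copy") // the helper column is only needed to drive explode

repeated.show()
```

This avoids the sort that the union version needs, since explode typically emits the copies of a row one after another. Spark does not guarantee row order without an explicit orderBy, though, so add one if the adjacency really matters downstream.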