Hi how’s it going?
If I have a DataFrame that looks like this,
and I want to write logic in Scala that says “for the columns that have the word left in them, store them in an array called ‘first_variables’, and for the columns that have the word right in them, store them in an array called ‘second_variables’”, how would I do that?
I’m brand new to Scala; in Python it would be very easy for me, something like first_variables = [val for val in df.columns if 'left' in val], etc.
Thank you!
This kind of thing is generally very easy to accomplish in Scala with higher-order functions like map and filter (similar functions are available in Python as well).
// Simple data structure
case class DataFrame(rows: List[List[String]]) {
val columns = rows.transpose
}
val df = DataFrame(List(
"" :: "left_1" :: "left_2" :: "name" :: "right_1" :: "right_2" :: Nil,
"0" :: "1" :: "4" :: "jim" :: "7" :: "10" :: Nil,
"1" :: "2" :: "5" :: "jim" :: "8" :: "11" :: Nil,
"2" :: "3" :: "6" :: "jim" :: "9" :: "12" :: Nil,
))
val first_variables = df.columns.filter(_.head.contains("left"))
val second_variables = df.columns.filter(_.head.contains("right"))
println(s"first_variables = $first_variables")
println(s"second_variables = $second_variables")
// Prints
// first_variables = List(List(left_1, 1, 2, 3), List(left_2, 4, 5, 6))
// second_variables = List(List(right_1, 7, 8, 9), List(right_2, 10, 11, 12))
Executable example: https://scastie.scala-lang.org/P3FX7mpTQ8WyTHvo6NWOag
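If you only need the column names rather than the whole columns (which is what the Python snippet in the question collects), the same filter can be applied to the names directly. The sketch below assumes the names are available as an Array[String]; as far as I know, that is what df.columns returns on a Spark DataFrame, but the question does not say which library is in use, so treat that as an assumption. The columns array here is a hypothetical stand-in that mirrors the header row of the example above, and the camelCase value names are my own choice.

```scala
// Hypothetical column names, mirroring the header row of the example above.
// With a Spark DataFrame this would presumably come from df.columns (assumption).
val columns: Array[String] =
  Array("", "left_1", "left_2", "name", "right_1", "right_2")

// Counterpart of the Python list comprehension: keep the names containing "left".
val firstVariables: Array[String] = columns.filter(_.contains("left"))
val secondVariables: Array[String] = columns.filter(_.contains("right"))

// The same selection as a for-comprehension, which reads closer to the Python version.
val firstAgain: Array[String] =
  for (c <- columns if c.contains("left")) yield c

println(firstVariables.mkString("first_variables = [", ", ", "]"))
println(secondVariables.mkString("second_variables = [", ", ", "]"))
// Prints
// first_variables = [left_1, left_2]
// second_variables = [right_1, right_2]
```

Both forms give the same result; filter with a lambda is the more common style in Scala, while the for-comprehension may feel more familiar coming from Python.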