Python User Needing Help with Scala Code

Hi how’s it going?

If I have a DataFrame that looks like this,

[Image: Screenshot from 2020-08-14 11-25-42]

and I want to write logic in Scala that says “for the columns that have the word left in them, store them in an array called ‘first_variables’, and for the columns that have the word right in them, store them in an array called ‘second_variables’”, how would I do that?

I’m brand new to Scala. In Python it would be very easy for me, something like first_variables = [val for val in df.columns if 'left' in val], etc.

Thank you!

This kind of thing is generally very easy to accomplish in Scala with higher-order functions like map and filter (similar functions exist in Python as well).

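// Minimal stand-in for a DataFrame: a list of rows, where the first row holds the column names
case class DataFrame(rows: List[List[String]]) {
  // Transposing the rows gives one list per column, with the column name as its head
  val columns: List[List[String]] = rows.transpose
}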

val df = DataFrame(List(
  ""  :: "left_1" :: "left_2" :: "name" :: "right_1" :: "right_2" :: Nil,
  "0" :: "1"      :: "4"      :: "jim"  :: "7"       :: "10"      :: Nil,
  "1" :: "2"      :: "5"      :: "jim"  :: "8"       :: "11"      :: Nil,
  "2" :: "3"      :: "6"      :: "jim"  :: "9"       :: "12"      :: Nil,
))

val first_variables = df.columns.filter(_.head.contains("left"))
val second_variables = df.columns.filter(_.head.contains("right"))
println(s"first_variables = $first_variables")
println(s"second_variables = $second_variables")
// Prints
// first_variables = List(List(left_1, 1, 2, 3), List(left_2, 4, 5, 6))
// second_variables = List(List(right_1, 7, 8, 9), List(right_2, 10, 11, 12))

Executable example: https://scastie.scala-lang.org/P3FX7mpTQ8WyTHvo6NWOag
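
If you only need the column names, as in your Python snippet, the same filter works directly on the names. This is a minimal sketch, assuming the names are available as an Array[String] (Spark's df.columns returns one, but the question doesn't say which DataFrame library is in use). The names are hard-coded from the example above, and ColumnFilter, firstVariables and secondVariables are just illustrative names.

```scala
object ColumnFilter {
  // Column names copied from the example above. With a real DataFrame these
  // would come from the library (assumption: something like df.columns).
  val columns: Array[String] = Array("left_1", "left_2", "name", "right_1", "right_2")

  // Counterpart of [val for val in df.columns if 'left' in val]
  val firstVariables: Array[String] = columns.filter(_.contains("left"))

  // The same idea as a for-comprehension, which reads closer to the Python version
  val secondVariables: Array[String] =
    for (c <- columns if c.contains("right")) yield c

  def main(args: Array[String]): Unit = {
    println(firstVariables.mkString(", "))  // left_1, left_2
    println(secondVariables.mkString(", ")) // right_1, right_2
  }
}
```

Both forms do the same thing; the for-comprehension with an if guard is the closest Scala gets to a Python list comprehension.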
