Help understanding conventional Scala indentation

I don’t really understand the conventional indentation of highly composed functional code written with C/Java syntax, i.e., with “.”, “{}”, “,”, infix operators, etc. Can someone suggest how I might indent the following if the local function count1 were inlined?

  def foo(strs: List[String], rdd: RDD[W]): List[(String, Int)] = {
    def count1 (str:String,w:W):Int = {
      if ( containsLang(w,str)) {1}
      else                     {0}
    }
    rdd.flatMap(w => strs.map(str => (str,count1(str,w)))).reduceByKey(_+_).collect.toList.sortWith((p1,p2)=> p1._2 > p2._2)
  }

There’s no consensus about formatting, but I would probably do it something like this. The initial flatMap with the nested map can be rewritten as a for-comprehension, which is how I usually prefer it. You can inline count1 as-is if you like, but I wouldn’t bother.

def foo(strs: List[String], rdd: RDD[W]): List[(String, Int)] = {
  
  def count1(str: String, w: W): Int =
    if (containsLang(w, str)) 1 else 0

  val ps: RDD[(String, Int)] =
    for {
      w   <- rdd
      str <- strs
    } yield (str, count1(str, w))

  ps.reduceByKey(_ + _)
    .collect
    .toList
    .sortWith { case ((_, a), (_, b)) => a > b }

}
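
For the original question (what this looks like with count1 inlined), here is a minimal sketch of the same layout. To keep it runnable without Spark, it assumes a plain List[W] in place of the RDD, made-up definitions of W and containsLang, and groupMapReduce (Scala 2.13+) standing in for reduceByKey. None of those stand-ins come from the original post.

```scala
object Inlined {

  // Hypothetical stand-ins so the sketch runs without Spark.
  final case class W(langs: Set[String])

  def containsLang(w: W, str: String): Boolean =
    w.langs.contains(str)

  def foo(strs: List[String], ws: List[W]): List[(String, Int)] = {

    // count1 is inlined into the yield.
    val ps: List[(String, Int)] =
      for {
        w   <- ws
        str <- strs
      } yield (str, if (containsLang(w, str)) 1 else 0)

    // groupMapReduce plays the role of reduceByKey on a plain List.
    ps.groupMapReduce(_._1)(_._2)(_ + _)
      .toList
      .sortWith { case ((_, a), (_, b)) => a > b }

  }

}
```

With these stand-ins, Inlined.foo(List("en", "fr"), List(Inlined.W(Set("en")), Inlined.W(Set("en", "fr")))) returns List(("en", 2), ("fr", 1)).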

What tpolcat said, but a bit more explicitly: vertical whitespace is your friend. I’ll nestle simple vals/vars up against the top, but if there’s a def or something like

def foo(): Unit = {

  val thing = {
    val t = init()
    t.thatExtraThing()
    t
  }

  // rest of code
}

then throwing in a blank line helps keep clutter down.

There is no definitive standard. The pattern I use has elements that are found very broadly, but people differ on some of the details (e.g. do dots trail or lead, do you use braces, where do you put a leading parenthesis, etc.).

I would write it like this:

def foo(strs: List[String], rdd: RDD[W]): List[(String, Int)] =
  rdd.
    flatMap(w => strs.map(str => (str, if (containsLang(w, str)) 1 else 0))).
    reduceByKey(_ + _).
    collect.toList.
    sortWith(_._2 > _._2)
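
On the trailing-versus-leading dots detail: the two layouts are the same expression, so the choice is purely visual. A small sketch on a plain List (the object name and the sample words are made up for illustration, and groupBy stands in for the Spark calls):

```scala
object DotStyles {

  val words = List("b", "a", "b", "c", "b", "a")

  // Trailing dots: each line ends with the dot for the next call.
  val trailing: List[(String, Int)] =
    words.
      groupBy(identity).
      map { case (w, ws) => (w, ws.size) }.
      toList.
      sortWith(_._2 > _._2)

  // Leading dots: same calls, dot moved to the start of each line.
  val leading: List[(String, Int)] =
    words
      .groupBy(identity)
      .map { case (w, ws) => (w, ws.size) }
      .toList
      .sortWith(_._2 > _._2)

}
```

Both values come out as List(("b", 3), ("a", 2), ("c", 1)).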

But let’s suppose we want not to wrap at 80 columns but to try to keep it to under 30 columns. Now we have a lot more wrapping to do. And, to be honest, this looks awful, but that is the result of the ridiculous 30 column restriction, not because the nesting is unclear. (Moral: don’t limit your line lengths to something unreasonably short, especially when there’s a lot of nesting!)

def foo(
  strs: List[String],
  rdd: RDD[W]
): List[(String, Int)] = {
  // Braces for clarity!
  rdd.
    flatMap{ w =>
      strs.map{ str =>
        (
          str,
          if (
            containsLang(
              w, str
            )
          ) 1
          else 0
        )
      }
    }.
    reduceByKey(_ + _).
    collect.toList.
    sortWith{ (p1, p2) =>
      p1._2 > p2._2
    }
}

This is pretty silly, but you see the pattern. Basically, you repeatedly apply the following transformation:

foo.bar(baz => quux)
bippy(a, b, c)

becomes

foo.bar{ baz =>
  quux
}
bippy(
  a,
  b,
  c
)

until things are shallow enough for your tastes.
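
To check that the rewrite only changes layout, here is a small sketch on plain collections. The names ShallowDemo, xs, and this particular bippy are made up for illustration.

```scala
object ShallowDemo {

  val xs = List(1, 2, 3)

  def bippy(a: Int, b: Int, c: Int): Int = a + b + c

  // Before: everything on one line.
  val before = bippy(xs.map(x => x * 2).sum, xs.head, xs.last)

  // After: the rule applied once to the lambda and once to the argument list.
  val after =
    bippy(
      xs.map{ x =>
        x * 2
      }.sum,
      xs.head,
      xs.last
    )

}
```

Both before and after evaluate to 16, so each step of the transformation can be applied mechanically without changing behavior.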