Collect may need a way to skip the current element within the case statement

Consider the following:
.collect {
  case o if f(o) => bigFn(o)
}
.collect {
  case p if p.value > 10 => p
}

In this circumstance, for performance reasons, the fewer classes generated the better. bigFn is performance-heavy enough that we don’t want to call it twice. If we were able to return a value from a collect statement to indicate that we wanted to “skip” the current element, we would be able to avoid the performance cost of calling collect twice in a row. For example:
.collect {
  case o if f(o) =>
    val oVal = bigFn(o)
    if (oVal.value > 10) oVal else Nothing
}

What do you think about this?

You can use flatMap:

view.flatMap { o =>
  if (!f(o)) None
  else Some(bigFn(o)).filter(_.value > 10)
}

My main reservation about doing so is the performance penalty that flatMap incurs.

Benchmark                              Mode  Cnt        Score      Error  Units
FilterBenchmark.withCollect           thrpt   10  1359035.689 ± 2749.815  ops/s
FilterBenchmark.withCollectTypeMatch  thrpt   10  1361227.743 ± 2337.850  ops/s
FilterBenchmark.withFlatMap           thrpt   10   113074.826 ±  288.107  ops/s
FilterBenchmark.withFlatMapTypeMatch  thrpt   10   113188.419 ±  262.826  ops/s

flatMap allocates an Option for every item in the sequence, whereas collect filters the items before operating on them. In the benchmark above this is roughly a 12x deficit in throughput, in a circumstance in which the deficit matters.
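To make the comparison concrete, here is a minimal sketch of the two forms side by side. Foo, f, bigFn and xs are hypothetical stand-ins (the thread never defines them), and the call counter exists only for illustration. It checks that both forms return the same elements and call bigFn once per element that passes f; it does not measure allocation or throughput.

```scala
object SkipSketch {
  // Hypothetical stand-ins for the types and functions discussed in the thread.
  final case class Foo(value: Int)

  var bigFnCalls = 0 // illustration only: counts how often bigFn runs

  def f(o: Int): Boolean = o % 2 == 0
  def bigFn(o: Int): Foo = { bigFnCalls += 1; Foo(o * 3) }

  val xs: List[Int] = (1 to 10).toList

  // Two collect passes: the first builds an intermediate list of Foo.
  def withCollect: List[Foo] =
    xs.collect { case o if f(o) => bigFn(o) }
      .collect { case p if p.value > 10 => p }

  // One flatMap pass: wraps each candidate result in an Option.
  def withFlatMap: List[Foo] =
    xs.flatMap { o =>
      if (!f(o)) None
      else Some(bigFn(o)).filter(_.value > 10)
    }
}
```

With these placeholder definitions both methods return the same four elements, so the choice between them is about overhead rather than behavior.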

For high-performance code like this you will see considerably larger gains by writing custom code than you will by hoping that the library happens to hit your exact use-case.

In your particular case you can write a faster version with a builder and an iterator.

val b = Array.newBuilder[Foo]
val i = xs.iterator
while (i.hasNext) {
  val o = i.next()
  if (f(o)) {
    // bigFn runs once per element that passes f
    val temp = bigFn(o)
    if (temp.value > 10) b += temp
  }
}
b.result()

This manually fuses the iterator- or view-based version

xs.iterator.filter(f).map(bigFn).filter(_.value > 10).toArray

The gain from fusing by hand may be significant if you care about the overhead of boxing in Option.

(Note: the filter/map/filter may be faster than the Option boxing.)
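As a rough check that the hand-written loop and the iterator chain agree, here is a self-contained sketch. Foo, f, bigFn and xs are hypothetical placeholders, since the thread does not define them; only the shape of the two versions is taken from the posts above.

```scala
object FusedLoopSketch {
  // Hypothetical stand-ins; the real Foo, f and bigFn are not shown in the thread.
  final case class Foo(value: Int)
  def f(o: Int): Boolean = o % 2 == 0
  def bigFn(o: Int): Foo = Foo(o * 3)

  val xs: Vector[Int] = (1 to 10).toVector

  // Manually fused: one pass, no intermediate collection, no Option.
  def fusedLoop(in: Iterable[Int]): Array[Foo] = {
    val b = Array.newBuilder[Foo]
    val i = in.iterator
    while (i.hasNext) {
      val o = i.next()
      if (f(o)) {
        val temp = bigFn(o)
        if (temp.value > 10) b += temp
      }
    }
    b.result()
  }

  // Lazy iterator chain: also a single traversal, with one wrapper per stage.
  def chained(in: Iterable[Int]): Array[Foo] =
    in.iterator.filter(f).map(bigFn).filter(_.value > 10).toArray
}
```

Both methods should yield the same array for any input; which one is faster in practice is something to benchmark on the real workload.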

Scala 2.13 has overloads in scala.PartialFunction like def andThen[C](k: PartialFunction[B, C]): PartialFunction[A, C] which enable concise composition of partial functions, so you can fuse multiple collects into one. Example:

import scala.{PartialFunction => PF}

val ints = List.tabulate(10)(i => i)

val skipOddNumsAndHalveEven: PF[Int, Int] = {
  case x if x % 2 == 0 => x / 2
}

def skipLowerThanAndSubtract(threshold: Int): PF[Int, Int] = {
  case x if x >= threshold => x - threshold
}

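// unfused partial functions, two 'collect' passes
// (the threshold of 2 is illustrative)
ints.collect(skipOddNumsAndHalveEven).collect(skipLowerThanAndSubtract(2))
// fused partial functions, single 'collect' pass
ints.collect(skipOddNumsAndHalveEven.andThen(skipLowerThanAndSubtract(2)))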

On Scala 2.12 the fused version throws a MatchError, because combinators such as andThen and compose lack PartialFunction-aware overloads there. On Scala 2.13 it works properly.
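If the code has to run on 2.12 as well, one possible workaround is to fuse through lift and Function.unlift, both of which exist in 2.12 and 2.13. The helper name fuse and the threshold of 2 below are assumptions for illustration, not something from the posts above. Note that this route allocates an Option per element, so it trades the second pass for boxing.

```scala
object FuseCompatSketch {
  val ints: List[Int] = List.tabulate(10)(i => i)

  val skipOddNumsAndHalveEven: PartialFunction[Int, Int] = {
    case x if x % 2 == 0 => x / 2
  }

  def skipLowerThanAndSubtract(threshold: Int): PartialFunction[Int, Int] = {
    case x if x >= threshold => x - threshold
  }

  // Hypothetical helper: defined only where both partial functions are defined.
  def fuse[A, B, C](
      first: PartialFunction[A, B],
      second: PartialFunction[B, C]
  ): PartialFunction[A, C] =
    Function.unlift((a: A) => first.lift(a).flatMap(second.lift))

  // Two 'collect' passes, with an intermediate list in between.
  val twoPasses: List[Int] =
    ints.collect(skipOddNumsAndHalveEven).collect(skipLowerThanAndSubtract(2))

  // Single 'collect' pass over the fused partial function.
  val onePass: List[Int] =
    ints.collect(fuse(skipOddNumsAndHalveEven, skipLowerThanAndSubtract(2)))
}
```

On 2.13 the built-in andThen overload is the simpler choice; this sketch is only meant for code that must also compile and behave the same on 2.12.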

@Ichoran You’re right; I’ll refactor in this case.

@tarsa Appreciate the info. That’s clean enough, I’ll definitely use that when I run into this next time in smaller collections.