Emulating `for` of `flatMap`s

While I was constructing the following question, the mere act of explicitly formulating the question allowed me to realize the answer. I’ll ask the question anyway, because I think the answer is pretty cool.

The idiom of for { x <- List(...) ; y <- List( ...) } yield f(x,y) is unrolled as a flatMap of a map. As long as this idiom is what you need then for is your friend, and Bob’s your uncle.

However, missing from the for unrolling is the flatMap of flatMap with if/else idiom. It’s almost there because you can include an if within the first argument of for. But trying to pun an else in an obscure way (in my opinion) makes the code even more unreadable. It’s better to simply use flatMap the old fashioned way.

If I were processing only Lists it’s is easy. But I don’t know how to do it with Maps.

How do I convert the following idiom on List to the analogous idiom on Map?

    def xyzzy(as:List[Int],bs:List[Int]):List[Int] = {
      as.flatMap { a =>
        bs.flatMap { b => 
          if (f1(a,b))
            List(f2(a,b)) // collect one thing
          else if (g1(a,b))
            List() // collect zero things
          else if (h1(a,b))
            List(h2(a,b),h2(b,a)) // collect 2 things
        }
      }
    }

If I want a similar function which takes as and bs of type Map[Int], and I want to accumulate a Map as output rather than List, what goes into the collect zero things case and what goes in the collect two things case?

To answer my own question, here is how to rewrite the code using Map. It’s elegant and cool.

    def xyzzy(as:Map[Int,Int],bs:Map[Int,Int]):Map[Int,Int] = {
      as.flatMap { case (a,x) =>
        bs.flatMap { case (b,y) => 
          if (f1(a,b))
            Map(a -> f2(x,y)) // collect one thing
          else if (g1(a,b))
            Map() // collect zero things. ACTUALLY this does not compile because type explicit type is missing :(
          else if (h1(a,b))
            Map(a -> h2(x,y), b -> h2(y,x)) // collect 2 things
        }
      }
    }

This is missing a final else clause.

I’d prefer moving the “nested” code to a dedicated method/function.

def xyzzy(as:Map[Int,Int], bs:Map[Int,Int]):Map[Int,Int] = {
  def buildMap(a: Int, x: Int, b: Int, y: Int): Map[Int, Int] = {
    if (f1(a,b))
      Map(a -> f2(x,y))
    else if (g1(a,b))
      Map.empty
    else if (h1(a,b))
      Map(a -> h2(x,y), b -> h2(y,x))
    else
      ???
  }
  for {
    (a, x) <- as
    (b, y) <- bs
    res <- buildMap(a, x, b, y)
  } yield res
}

Ahh, I see, res <- buildMap(...) iterates over the zero or more Maps, and then yields them by a final .map(identity).

That’s semantically the same. My gut feel is that it is less efficient. But I wonder whether it really is?

  def xyzzy1[A,M[_]](as:M[A],bs:M[A],t:A=>Boolean,h1:A=>M[A],empty:M[A]):M[A] = {
    as.flatMap { a =>
      bs.flatMap { b =>
        if (t(a))
          h1(a)
        else if (t(b))
          h1(b)
        else
          empty
      }
    }
  }

  def xyzzy2[A,M[_]](as:M[A],bs:M[A],t:A=>Boolean,h1:A=>M[A],empty:M[A]):M[A] = {
    def local(a:A, b:A):M[A] = {
      if (t(a))
        h1(a)
      else if (t(b))
        h1(b)
      else
        empty
    }
    for { a <- as
          b <- bs
          res <- local(a,b)
          } yield res
  }

And yes, you’re right. I missed the final else.

It very likely is - the question is whether it is significantly so. I’d first go with more readable code. If this turns out to be a hotspot, I’d benchmark variants and then act according to the results.

1 Like

I understand the point you’re making, and it is a good one.

Nevertheless, where I’m coming from, CommonLisp, one learns very early when to use the equivalence of flatMap (ie MAPCAN and friends) vs map (ie MAPCAR and friends) vs foreach (ie MAPC and friends). The equivalent of concentric flatMap is fairly common to avoid intermediate values being allocated and GC’ed. There’s a cousin of for{}yield() (ie LOOP), but the programmer controls whether the inner most clause uses flatMap vs map.

There were already proposals to enhance for-comprehensions to avoid the innermost .map(identity), e.g. https://contributors.scala-lang.org/t/pre-sip-improve-for-comprehensions-functionality/3509

IIRC Mr Odersky is generally OK with the idea but only after cleaning current for-comprehensions desugaring. Desugaring is currently pretty complicated, in some cases inefficient and also sometimes problematic with some monads (like unbiased Either).

1 Like