Help understanding compiler errors regarding Option and Map

jimka · July 1, 2019, 2:29pm

I’m getting a compiler error which I don’t understand. I’ve tried to reduce the test case to some code which exhibits the error, but the code is now pretty useless.
Nevertheless, it would be great if someone could help me understand what the error means. It would be great if someone could have me some insight into what the real problem is here.

ScalaFiddle.scala:25: error: type mismatch;
 found   : immutable.this.Iterable[scala.this.Int]
 required: scala.this.Option[?]
          (rectified: Int, cla: Set[Int]) <- rectHashA

I’ve created a ScalaFiddle to show the problem.

The code is also here in a much simpler form than the actual code in my application. If I comment out the line rectHashB: H1 <- lengthHashB.get(length) when the problem goes away.

def foldups(ups:Iterable[Int]):Int = {
  ups.sum
}

type H1 = Map[Int, Set[Int]]
type H2 = Map[Int, H1]
val h1:H1 = Map(1 -> Set(1,2,3))
val h2:H2 = Map(1 -> h1)

val hash:Map[Int,H2] = Map(1 -> h2)

def reducePosCountClauses(posCount: Int): Double = {
  val dummy:Int = 0

  (hash.get(posCount), hash.get(posCount - 1)) match {
    case (None, _) => dummy
    case (_, None) => dummy
    case (Some(lengthHashA: H2), Some(lengthHashB: H2)) =>
      foldups(for {
        (length: Int, rectHashA: H1) <- lengthHashA
        rectHashB: H1 <- lengthHashB.get(length)
        (rectified: Int, cla: Set[Int]) <- rectHashA
        if cla.nonEmpty
      } yield dummy 
              )
  }
}

jimka · July 1, 2019, 2:54pm

Converting the Option to a List seems to fix the problem. Not sure whether this is the correct approach. It seems like a hack.

https://scalafiddle.io/sf/1pLDHwl/2

  def optionToList[A](o:Option[A]):List[A] = {
    o match {
      case None => Nil
      case Some(a) => List(a)
    }
  }

sangamon · July 1, 2019, 4:29pm

It’s what @curoli covered in the other thread.

Assume

val m: Map[Int, Int] = ???
val o: Option[Int] = ???

This works:

for {
  mv <- m
  ov <- o
} yield ov

It desugars to

m.flatMap { mv =>
  o.map { ov =>
    ov
  }
}

…and Map#flatMap() (from TraversableLike) has this signature:

def flatMap[B, That](
  f: A => GenTraversableOnce[B])(
  implicit bf: CanBuildFrom[Repr, B, That]
): That

Somewhere under the hood, the implicit Option#optionToIterable() is triggered, thus That is inferred to be Iterable.

On the other hand…

for {
  ov <- o
  mv <- m
} yield mv

…doesn’t compile. It desugars to

o.flatMap { ov =>
  m.map { mv =>
    mv
  }
}

…with Option#flatMap() having signature

def flatMap[B](f: A => Option[B]): Option[B]

…and there’s no implicit conversion from Map to Option. If you force o to Iterable, it works:

for {
  ov <- o.toIterable
  mv <- m
} yield mv

…because then we’re back to Iterable#flatMap() that’s compatible with Map. (EDIT: #toIterable() explicitly triggers the implicit conversion - it’s actually Option.option2Iterable(o).toIterable(). You could also use Option#toList(), which is native to Option.)

Of course one could ask why the type inference doesn’t “find” the Iterable conversion on its own in the latter case. But to me personally the important takeaway rather is that implicit conversions often add more confusion than they help.

In general I’d recommend to just avoid mixing different types in a for comprehension. And yes, forcing Option to Iterable (and even more so relying on the implicit conversion) feels like a hack to me in general, although it may be justifiable sometimes.

jimka · July 1, 2019, 7:31pm

That’s a shame to avoid using for comprehensions with different types. When I first look saw the for comprehension, I thought it was scary. Then I got used to it, now I’m scared of it again.

I wonder whether it’s not better to use a collector such as the following.

  def withCollector[A](f: (A => Unit) => Unit): List[A] = {
    var objects: List[A] = Nil

    def collect(obj: A): Unit = {
      objects = obj :: objects
    }

    f(collect)
    objects.reverse
  }

Which seems to lift this impedance mismatch between different types in a for comprehension, as shown here.

def reducePosCountClauses(posCount: Int): Int = {
  val dummy:Int = 0

  (hash.get(posCount), hash.get(posCount - 1)) match {
    case (None, _) => dummy
    case (_, None) => dummy
    case (Some(lengthHashA: H2), Some(lengthHashB: H2)) =>
      foldups(withCollector(collect=> for {
        (length: Int, rectHashA: H1) <- lengthHashA
        rectHashB: H1 <- lengthHashB.get(length)
        (rectified: Int, cla: Set[Int]) <- rectHashA
        if cla.nonEmpty
      } collect( dummy)))
  }
}

jducoeur · July 1, 2019, 7:36pm

Don’t be scared of it, but be aware of its limitations. And keep in mind, this has nothing to do with the Scala library itself: at the underlying theoretical level, mixing Monads (such as List and Option) doesn’t actually make sense – that’s really what @sangamon’s note is talking about. The standard library sometimes lets you cheat on this without any boilerplate, but don’t get too used to that: as you’re finding, that often breaks down.

I generally recommend using for comprehensions freely when everything to the right of the arrows are the same Monad – when they are all Option, or all List, or all Future, and so on. You can sometimes use it a bit more flexibly than that, but only because the standard library is bending over backwards to make it sometimes possible. Broadly speaking, though, I don’t count on that: I generally put in an explicit conversion when I notice it, so that it’s clear what my code is actually doing…

sangamon · July 1, 2019, 10:29pm

I’m very sorry, that certainly wasn’t my intention. Looking at your code again, I actually think that using the Option as an Iterable might actually be justifiable - but I’d certainly add the explicit .toList/.toIterable to the Option, whether the compiler requires it or not.

It’s a bit hard to tell, though, since something seems to have gone missing in the reduction of the code (or maybe it’s just me, it’s late here already).

I’d expect both rectHashA and rectHashB to be actually used at some point. Or is the rectHashB line only there to ensure that there exists an entry for length in lengthHashB…? (Overall this code variant just seems to return 0, anyway, for the given data and in general, so I’m really a bit at loss about the originally intended semantics.)

jimka · July 2, 2019, 8:21am

yes, in the code same rectHashB and rectHashA are not used simply because I removed lots of code to minimize the test case.

The sense is that my data is stored in a hash of hashes of hashes of sets. If the data is there (as opposed to missing), I can access the set of clauses by hash(posCount)(length)(rectified). This data structure makes it easy to find the clauses which my algorithm needs to treat. The danger is that hash(posCount) might find no such entry, slimier for hash(posCount)(length), and for hash(posCount)(length)(rectified). It turns out this data structure is amenable to this Option based for comprehension.

To help it make sense, here is the working code after forcing to list.

  def reducePosCountClauses(posCount: Int): REMADD = {
    (hash.get(posCount), hash.get(posCount - 1)) match {
      case (None, _) => (Nil, Nil)
      case (_, None) => (Nil, Nil)
      case (Some(lengthHashA: H2), Some(lengthHashB: H2)) =>
        foldups(for {
          (length, rectHashA) <- lengthHashA
          rectHashB <- lengthHashB.get(length).toList  // marked as .toList as suggested by sangamon and others
          (rectified, cla) <- rectHashA
          if cla.nonEmpty
          clb <- rectHashB.get(rectified)
          if clb.nonEmpty
        } yield crossCompatiblize(cla, clb, posCount, length, rectified)
                )
    }
  }

Frankly though, I find it unsettling that I have to match the adjacent types in the for comprehension in order to make the yield work. Also it is not clear to me how to designate the type I want to compute. I’d love to hear comments about my use of withCollector in the following equivalent code, which DOES NOT need to match adjacent types. withCollector provides its body with a function (which the caller can name as he likes, called collect in the example below) which serves the purpose of yield, so to the human it reads the same way. The collect function may be called as many times as desired within the loop. And, there are different forms of withCollector which collect in different ways: withSetCollector, withMaximizer, withCounter, withSummer…

def reducePosCountClauses(posCount: Int): REMADD = {
    import accumulators.Accumulators.withCollector

    foldups(withCollector(collect =>
                            for {
                              lengthHashA <- hash.get(posCount)
                              lengthHashB <- hash.get(posCount - 1)
                              (length, rectHashA) <- lengthHashA
                              rectHashB <- lengthHashB.get(length) // no need to match adjacent types here
                              (rectified, cla) <- rectHashA
                              if cla.nonEmpty
                              clb <- rectHashB.get(rectified)
                              if clb.nonEmpty
                            } collect(crossCompatiblize(cla, clb, posCount, length, rectified))))
  }

jducoeur · July 2, 2019, 4:10pm

While I think of it: this is getting a smidgeon advanced, but there’s a concept called Monad Transformers that allow you to sometimes use multiple Monad types together. For example, the OptionT transformer specifically allows you to use Option with other types: you basically blend them together into a single Option-plus-other Monad, essentially.

This only works for certain types (OptionT and EitherT are the most common ones), and there’s a bit of runtime overhead involved, but it is a common approach for mixing Monads when you have two that you really need to use together. Here is the page on the Cats implementation of OptionT, which introduces the concept.

There is also a newer and more general approach called Nested, but I haven’t used that myself yet, so can’t say much more than point to it.

sangamon · July 2, 2019, 5:35pm

You could use the #toList() trick on those first hash.get() lookups, as well. Or you could at least simplify the match by coercing the two None cases to a final case _ => (Nil, Nil) default.

…because then you just have nested #foreach() calls that don’t require any of the nested target types to match. Personally I find this pattern terribly convoluted and hard to understand. And it’s certainly not a very functional approach…