Don't understand why the `for{}` iterates like it does

jimka · November 26, 2019, 3:54pm

I can’t understand why the zoom loop gets skipped or delayed. Can someone help me understand the difference?

The code below prints the following output. It basically means the next year is iterated before zoom is iterated at all. Consequently memory fills up and I get an out-of-memory exception, because records and locations are huge structures.

year = 1986
calculated records
calculated locations
year = 1987
calculated records
calculated locations
year = 1988

  def generateFiles(year1:Int,year2:Int): Unit = {
    // To write an image into a PNG file, use the output method.
    // For instance: “myImage.output(new java.io.File("target/some-image.png"))”.
    import Extraction._

    for {year <- (year1 to year2)
         _1 = println(s"year = $year")
         records = locateTemperatures(year, "/stations.csv", "/" + year + ".csv")
         _2 = println(s"calculated records")
         locations: Iterable[(Location, Double)] = locationYearlyAverageRecords(records)
         _3 = println(s"calculated locations")
         zoom <- 0 to 2
         _4 = println(s"zoom=$zoom")
         y <- (0 until 1 << zoom)
         _5 = println(s"y=$y")
         x <- (0 until 1 << zoom)
         _6 = println(s"x=$x")
         dirname = s"target/temperatures/$year/$zoom"
         fname = s"$dirname/$x-$y.png"
         } {
      println(s"generating $fname")
      val image = tile(locations, palette, zoom, x, y)
      val dir = new java.io.File(dirname)
      val file = new java.io.File(fname)
      dir.mkdirs()
      image.output(file)
      println(s"generated $fname")
    }
  }

However, if I break up the for comprehension into several nested for comprehensions, it works, and the output is as follows.

year = 1986
calculated records
calculated locations
zoom=0
zoom=1
zoom=2
y=0
x=0
generating target/temperatures/1986/0/0-0.png
generated target/temperatures/1986/0/0-0.png
y=0
y=1
x=0
x=1
generating target/temperatures/1986/1/0-0.png
generated target/temperatures/1986/1/0-0.png
generating target/temperatures/1986/1/1-0.png

The code that works is the following:

  def generateFiles(year1:Int,year2:Int): Unit = {
    // To write an image into a PNG file, use the output method.
    // For instance: “myImage.output(new java.io.File("target/some-image.png"))”.
    import Extraction._

    for {year <- (year1 to year2)} {
      val _1 = println(s"year = $year")
      val records = locateTemperatures(year, "/stations.csv", "/" + year + ".csv")
      for {
        _2 <- Some(println(s"calculated records"))
        locations: Iterable[(Location, Double)] = locationYearlyAverageRecords(records)
        _3  <- Some(println(s"calculated locations"))
      }
        for {
          zoom <- 0 to 2
          _4 = println(s"zoom=$zoom")
          y <- (0 until 1 << zoom)
          _5 = println(s"y=$y")
          x <- (0 until 1 << zoom)
          _6 = println(s"x=$x")
          dirname = s"target/temperatures/$year/$zoom"
          fname = s"$dirname/$x-$y.png"
        } {
          println(s"generating $fname")
          val image = tile(locations, palette, zoom, x, y)
          val dir = new java.io.File(dirname)
          val file = new java.io.File(fname)
          dir.mkdirs()
          image.output(file)
          println(s"generated $fname")
        }
    }
  }

jducoeur · November 26, 2019, 5:38pm

Okay, that one sent me hunting. I created this simplified example for testing:

for {
  x <- 1 to 3
  _ = println(s"x = $x")
  y <- 5 to 7
  _ = println(s"x = $x; y = $y")
} ()

That confirms the surprising behavior.

To find out what’s going on in a for comprehension, it’s always best to break it down to, “what are the actual functions being called?” – that is, desugar the comprehension. I tried that in IntelliJ and got disappointingly useless results, but Ammonite clarified things (as so often, Haoyi is the god of useful tools):

desugar {
  for {
    x <- 1 to 3
    _ = println(s"x = $x")
    y <- 5 to 7
    _ = println(s"x = $x; y = $y")
  } ()
} 

res0: Desugared = scala.Predef.intWrapper(1).to(3).map[(Int, Unit)](((x: Int) => {
  val x$1 = scala.Predef.println((("x = ".+(x)): String));
  scala.Tuple2.apply[Int, Unit](x, x$1)
})).foreach[Unit](((x$4: (Int, Unit)) => (x$4: @scala.unchecked) match {
  case scala.Tuple2((x @ _), _) => scala.Predef.intWrapper(5).to(7).map[(Int, Unit)](((y: Int) => {
  val x$2 = scala.Predef.println((("x = ".+(x).+("; y = ").+(y)): String));
  scala.Tuple2.apply[Int, Unit](y, x$2)
})).foreach[Unit](((x$3: (Int, Unit)) => ((x$3: @scala.unchecked) match {
    case scala.Tuple2((y @ _), _) => ()
  })))
}))

Deciphering that, it appears that the = expressions following a generator get map'ped over that generator's entire collection before it calls foreach / map / flatMap on the stuff below it.

I assume there’s a reason for that, but I share your surprise at that extra map in there: I would have expected the first println to be inside the foreach, not before it. In pure-functional code, the results are probably always identical (indeed, the inner println shows exactly what I would expect), but when you have side-effects involved (e.g., all these printlns) the difference shows up.
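To make the ordering concrete, here is a minimal sketch that writes out the map / foreach chain by hand and records events in a buffer instead of printing. It assumes the Scala 2.13 translation shown in the desugared output above; `EagerMapDemo`, `record`, and the shortened inner range `5 to 6` are made up for illustration.

```scala
import scala.collection.mutable.ListBuffer

object EagerMapDemo {
  // Hand-written equivalent (under the Scala 2.13 translation) of:
  //   for { x <- 1 to 3; _ = record(s"x=$x"); y <- 5 to 6 } record(s"x=$x,y=$y")
  def run(): List[String] = {
    val log = ListBuffer.empty[String]
    // Hypothetical helper: stores the message instead of printing it.
    def record(msg: String): Unit = { log += msg; () }

    (1 to 3)
      // Range.map is strict: this closure runs for x = 1, 2, 3
      // before the foreach below sees its first element.
      .map { x => val u = record(s"x=$x"); (x, u) }
      .foreach { case (x, _) =>
        (5 to 6).foreach(y => record(s"x=$x,y=$y"))
      }
    log.toList
  }
}
```

Calling `EagerMapDemo.run()` should list all three `x=...` entries first and only then the six `x=...,y=...` entries, which is the same shape as the year / zoom output in the original question.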

(NB: you don’t need the _1, _2, etc – as my reduced version shows, if you plan to throw out the value anyway, it’s generally idiomatic to just use _.)

tarsa · November 26, 2019, 6:20pm

Desugaring is actually documented in the Scala 2.13 language specification (the “Expressions” chapter):

A generator 𝑝 <- 𝑒 followed by a value definition 𝑝′ = 𝑒′ is translated to the following generator of pairs of values, where 𝑥 and 𝑥′ are fresh names:
(𝑝, 𝑝′) <- for (𝑥@𝑝 <- 𝑒) yield { val 𝑥′@𝑝′ = 𝑒′; (𝑥, 𝑥′) }
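As a rough illustration of that rule, the sketch below applies it by hand to a tiny example: the generator `x <- 1 to 3` followed by the definition `sq = square(x)` becomes a single generator over pairs. The names `PairGeneratorSketch`, `square`, `pairs`, and `sums` are hypothetical and only serve the example.

```scala
object PairGeneratorSketch {
  // Hypothetical stand-in for the right-hand side e' of the value definition.
  def square(n: Int): Int = n * n

  // Original clauses:        x <- 1 to 3 ; sq = square(x)
  // Rewritten per the rule:  (x, sq) <- pairs
  // On a strict collection such as a Range, `pairs` is built completely here.
  val pairs: IndexedSeq[(Int, Int)] =
    for (x <- 1 to 3) yield { val sq = square(x); (x, sq) }

  // The following generator then iterates over the finished pairs.
  val sums: IndexedSeq[Int] =
    for { (x, sq) <- pairs; y <- 10 to 11 } yield x + sq + y
}
```

Seen this way, the inner `for ... yield` is an ordinary expression that has to be evaluated before the outer comprehension can start, which is why every side effect in the definition runs up front.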

jducoeur · November 26, 2019, 6:36pm

Yep, that matches the results I found. I suppose it’s at least consistent at the theoretical level, but I can’t say the results are entirely intuitive when you combine that with nested clauses…
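One possible way to sidestep the eager pass, sketched here under the assumption that the first generator can be consumed once, is to make it lazy with `.iterator`, since `Iterator.map` only runs its function when the next element is requested. The sketch writes the map / foreach chain out explicitly; `LazyOuterSketch` and `record` are hypothetical names, and the "prepare" step stands in for expensive per-element work such as loading one year of data.

```scala
import scala.collection.mutable.ListBuffer

object LazyOuterSketch {
  def run(): List[String] = {
    val log = ListBuffer.empty[String]
    // Hypothetical helper: stores the message instead of printing it.
    def record(msg: String): Unit = { log += msg; () }

    (1 to 3).iterator
      // Iterator.map is lazy: x is "prepared" only when foreach pulls it.
      .map { x => record(s"prepare x=$x"); x }
      .foreach { x =>
        (5 to 6).foreach(y => record(s"use x=$x,y=$y"))
      }
    log.toList
  }
}
```

Here the log should interleave: `prepare x=1`, then both `use x=1,...` entries, then `prepare x=2`, and so on. In a for comprehension that would correspond to writing the first generator as `year <- (year1 to year2).iterator`. Another option is to do the expensive per-year work as plain `val`s inside the loop body, as the nested version in the question does for `records`.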

jimka · November 26, 2019, 8:45pm

Yes, you’re right about _1 and _2. That was just something I was trying, as I wasn’t sure I could use two _ with the same name in the same scope. I was just trying to eliminate possible sources of error. It was a red herring.

jimka · November 26, 2019, 9:51pm

Does desugar work in a Scala scratch file, or is it only a Scala REPL feature?


jducoeur · November 26, 2019, 10:55pm

Neither – it’s an Ammonite feature. (Ammonite is a much-better alternate REPL / scripting engine for Scala: https://ammonite.io/ – this feature is described at https://ammonite.io/#desugar )

jimka · November 27, 2019, 8:01am

I’m happy my question wasn’t as stupid as I feared it might be.

Jasper-M · November 27, 2019, 9:51am

The Scala REPL has a “hidden” feature that does the same thing:

scala> for { x <- 1 to 3; _ = println(s"x = $x"); y <- 5 to 7; _ = println(s"x = $x; y = $y") } () //print

scala.Predef.intWrapper(1).to(3).map[(Int, Unit)](((x: Int) => {
  val x$1 = scala.Predef.println(scala.StringContext.apply("x = ", "").s(x));
  scala.Tuple2.apply[Int, Unit](x, x$1)
})).foreach[Unit](((x$4: (Int, Unit)) => (x$4: @scala.unchecked) match {
  case scala.Tuple2((x @ _), _) => scala.Predef.intWrapper(5).to(7).map[(Int, Unit)](((y: Int) => {
  val x$2 = scala.Predef.println(scala.StringContext.apply("x = ", "; y = ", "").s(x, y));
  scala.Tuple2.apply[Int, Unit](y, x$2)
})).foreach[Unit](((x$3: (Int, Unit)) => ((x$3: @scala.unchecked) match {
    case scala.Tuple2((y @ _), _) => ()
  })))
})) // : Unit

After //print you have to press the tab key instead of enter. For some reason it only works reliably when you write everything on one line.

And the scala.reflect.runtime.universe.reify function also does more or less the same thing as desugar.