Side effects in for comprehensions

jimka · November 5, 2019, 8:57am

I still encounter the problem from time to time about how to break out of a loop.
There is often a way to re-think the problem to avoid doing so, but the problem does
seem to keep coming up in my code.

I thought of yet another way of doing this, which happens to work. I wonder whether this works only by accident or whether it is dependable. The idea is that use .takeWhile on an iterator, and have the condition test some side-effected-woooooo-stop-value.

Here is a simple example.

locally {
  var x = 0
  for {
    i <- (1 to 100).toIterator.takeWhile(_ => x < 10)
    _ = (x = i*i)
  } yield i
}.toList

This returns the list List(1,2,3,4).

Of course this simple example could be rewritten easily to something purely functional. But there are cases where it is difficult to calculate the termination condition simply based on the values being iterated over.

tarsa · November 5, 2019, 11:16am

I would say it’s not only unreliable, but also unreadable. If that’s a simple example then I’m frightened what a complex one would be. For-yields are meant to be side effect free, so they are rather poorly tested when it comes to mixing for-yields with side effects. Also, if you’re filtering in for-yield and a type doesn’t implement withFilter or doesn’t obey its contract fully then you will observe incorrect results.

If you want complex termination logic then tail recursion is the way to go. It can be converted to imperative loops by the compiler and it gives you all the power that imperative loops offer. If you are keen to locally use mutable structures to improve performance then explicitly imperative loops are the best fit.

jimka · November 5, 2019, 11:35am

The problem with tail recursion is that the code has to be constructed very differently depending on the container. A function which does tail recursion on a list cannot be used to do tail recursion on an array. The for compression hides this difference.

If I recall the actual case when I encountered, and eventually rewrite completely was a fold. Something like this

generating_stuff.foldLeft(init)(...unfortuantly can't break here...)

After rewriting it as the following, but it was too late

generating_stuff_as_iterator.takeWhile(test_side_effect).foldLeft(init)(...compute_with_side_effect...)

In the case I encountered, it was the generator (feeding into the fold) which was successively generating larger and larger objects, which needed to be combined in the fold. And it was the computation inside the fold which recognized when it was time to stop computing.

I don’t remember anymore how I rewrote the problem to avoid this metacircularity, but on my pillow in the middle of the night I thought: Why didn’t I use takeWhile(...).foldLeft(...)?

jimka · November 7, 2019, 11:12am

Here’s another example where the technique seems too be somewhat useful. This approach is for running a timing analysis for 45 minutes. ...toIterator.takeWhile(timeOutNotReached).

    val t0 = System.nanoTime() // nano-seconds
    val timeOut =  45 * 60  // time in seconds = 45 minutes
    def timeOutNotReached(ignored:Int):Boolean = {
      val now = System.nanoTime()
      val elapsed = (now - t0) / 1.0e9 // time in seconds
      if (elapsed < timeOut)
        true
      else {
        println(s"Timeout reached: elapsed=$elapsed   timeOut=$timeOut")
        false
      }
    }
    val rawDataForPlot = for {
      maxNumTerms <- (10 to 200 by 3).toIterator.takeWhile(timeOutNotReached)
      (text, f) <- foldPairs
      dnf = genDnfFromBitMask(numVars, maxNumBits, maxNumTerms, _ => true).toList
      ms: Double = time(3, s"num-terms=$maxNumTerms algo=$text", {
        withNewBddHash {
          f(dnf)
        }
      })
    } yield (text, maxNumTerms.toDouble, ms)