Can't traverse an iterator more than once

jimka · June 21, 2019, 11:00am

I found a bug in my code which surprises me somewhat.
I’m using a for comprehension to create an iterator over the rows of what might be a very large truth table. (in fact it has 2^n rows). dnf is the value of the iterator. If I print dnf.toList, and then pass dnf to a function which tries to dnf.foldLeft then nothing gets folded, because the iterator is at the end.

If in fact these are expected semantics, then shouldn’t an attempt to iterate a second time over the iterator trigger some sort of exception? Otherwise my program has a Heisenbug, ever time I don’t look at it, it works.

  def genKthBdd(n:Int, k:Long): Bdd = {
    val dnf = for {
      truthTableRow <- (0 to (1<<n) - 1).iterator
      if (k & (1<<truthTableRow)) != 0
    } yield Aux.minTerm(n, truthTableRow)
    println(s"dnf=${dnf.toList}")
    dnfToBdd(dnf)
  }

sangamon · June 21, 2019, 11:11am

How would the ~~compiler~~ runtime decide between attempting to iterate a second time (illegal) and picking up again after partial iteration (legal)?

Welcome to stateful APIs!

jimka · June 21, 2019, 11:22am

If it is indeed stateful, then it could know when it has exceeded the final state, and trigger an error on the next attempt. Right?

Do I need to avoid using iterators if I want to write functional programs? I don’t really need it to maintain state (for the purpose of my application) I just need it to behave well with iteration constructs.

sangamon · June 21, 2019, 11:40am

val i = Iterator(42)
println(i.hasNext)
println(i.next)
println(i.hasNext)
println(i.hasNext) // why should/how could this be illegal?

Well, iterators are not functional, so if you are striving for functional programs, you should at least contain them safely within a function/method/class that exhibits “functional” behavior to the outside.

jimka · June 21, 2019, 11:55am

Yes, I see now. The veil is lifted.

In that case, I really don’t know what kind of object I need. I’d like to avoid creating and passing a very long sequence to dnfToBdd, but rather have the next sequence entry calculated as needed.

sangamon · June 21, 2019, 2:52pm

It depends on the concrete case. Stream might be an alternative. Iterator may be perfectly fine, though. Its usage ideally should just be nicely encapsulated, and within the confines of the capsule you need to be aware of its specifics.

curoli · June 21, 2019, 3:18pm

An iterator is basically something that has methods hasNext() and next(), and all other methods are convenience. If hasNext() is false, there usually is no way of telling whether that’s because it was always empty or whether some one traversed it to the end.

Essential use cases for an iterator are, for example, processing results from a file or database, without the need to have all of them in memory, or even in existence, at the same time.

When you say Heisenbug, do you actually mean it gives different results for different runs?

jimka · June 21, 2019, 4:37pm

No, by heisenbug I mean that once I look at it, it behaves differently. My problem was that dnf was an iterator of objects, and at some point during debugging I needed to see what it was, so I printed it with println(s" dnf=${dnf.toList}"). It took a long time debugging lots of different things until I finally realized that certain code paths gave me an empty dnf, and all those code paths ran through that println.

tyohDeveloper · June 21, 2019, 5:15pm

Functional? Not from the pure math perspective. Can math tell the difference? No. For the same inputs, the same outputs are guaranteed.* It is s Co-function. One who’s results are generated through time.

In the 70s, 80s there were, shall we say, old farts in mathematics. They claimed that computational mathematics was not really math. Fractals weren’t math. Eventually, someone proved the obvious. Numbers like e and pi are no different than a fractal expansion. The halting problem had be around for a few years. They are defined by mathematical functions, qed…

Iterators are no different than computing the next digit of pi. Fortunately, many are non-continuing rationals rather than an alternative. When dealing with process everything has state. The definition of process mean change happens over time.

Yes, iterators are functions.

But we live in a world of computers. Real computers can’t work as functions. Memory has a finite limit. Power can fail. Coffee can be spilled. An interruption the memory buss can intrude. As programmers, we pretend they don’t exist, mostly. As builders of systems, we must account for the last.

We just need to make sure those interrupts don’t disrupt the function. If we use the iterator for tail recursion, it’s easy. There are no other possible interactions. Use iterators. They are no different than immutable lists.

Shared iterators, mutable backing stores, those are problems. For the unwary, unordered sets are problematic. That is what give evangelists the willies. If they’d think it through, they would be fine. State isn’t bad. The computer has a stack or registers, after all. In math, a function that surprises us is still a function. Even functions with inverses can be unpredictable “trap door” functions.

Shared state isn’t even bad. It does make it difficult for good programmers to keep it from being problematic. It encourages bad programmers from ensuring problems. That’s the logic behind state=cooties.

Summary. State that isn’t shared isn’t bad. It’s a requirement.

*[Presuming the source is not designed to change — like I/O.]

tyohDeveloper · June 21, 2019, 5:18pm

The original problem was caused by imperative programming, not the iterator.

jducoeur · June 21, 2019, 6:22pm

A bit concise, but yeah. @jimka, in principle what you really want is a data structure that represents this computation, rather than the computation itself. Doing things that way is pretty much the heart and soul of functional programming, and it makes it much easier to achieve what you’re looking for here: to have a computation that can be re-run correctly as much as you like.

In principle, I suspect that the data structure you really want for your (0 to (1<<n) - 1).iterator is an fs2 Stream, but that’s not something you can just casually drop into imperative-style code: it’s designed to be used in a functional style, which is a significantly different mode of programming. (It can be encapsulated inside of an otherwise imperative program, but finding the right places to draw the lines is often a bit challenging.)

jimka · June 21, 2019, 7:54pm

Is my code imperative style? Do you mean that the use of for comprehensions with yield is imperative style?

jimka · June 21, 2019, 8:04pm

BTW. I’m pretty happy with the current state of the code once it has been debugged. Especially how I was finally able to express generating a Bdd from randomly selected truth table (genRandomBdd) vs generating a Bdd from the k’th truth table (genKthBdd) without having to generate a random 2^n bit integer, which may be larger than Long.MaxVal.


  def clauseToBdd(clause: Clause, ident: BddTerm, op: BinaryOperation): Bdd = {
    // interpreting clause as an OR so that clause is CNF rather than DNF
    def conv(i: Int): Bdd = Bdd(i)

    treeReduce(clause, ident: Bdd, conv, (acc: Bdd, b: Bdd) => op(acc, b))
  }

  def cnfClauseToBdd(cnfClause: Clause): Bdd = clauseToBdd(cnfClause, BddFalse, Or)

  def dnfClauseToBdd(dnfClause: Clause): Bdd = clauseToBdd(dnfClause, BddTrue, And)

  // Interpret a list of Clauses as an And of Ors where each clause is a disjunction
  def cnfToBdd(cnf: TraversableOnce[Clause]): Bdd = {
    treeReduce(cnf, BddTrue, cnfClauseToBdd, { (acc: Bdd, b: Bdd) => And(acc, b) })
  }

  // Interpret a list of Clauses as an Or of Ands where each clause is a conjunction
  def dnfToBdd(dnf: TraversableOnce[Clause]): Bdd = {
    treeReduce(dnf, BddFalse, dnfClauseToBdd, { (acc: Bdd, b: Bdd) => Or(acc, b) })
  }

  def genBddFromBitMask(n: Int, f: Int => Boolean): Bdd = {
    val dnf = for {
      truthTableRow <- (0 until (1 << n)).iterator
      if f(truthTableRow)
    } yield minTerm(n, truthTableRow)
    dnfToBdd(dnf)
  }

  val prng = scala.util.Random

  // Generate randomly selected BDD of n variables.
  // In terms of truth table, this is equivalent to choosing a number between 0 and 2^n-1 where
  // each 1 in k'th position of the binary representation indicates a T in the k'th row
  // of the truth table, but we don't really generate such an integer as it may be larger
  // than Long.MaxValue
  def genRandomBdd(n: Int): Bdd = {
    genBddFromBitMask(n, _ => (0 == prng.nextInt(2)))
  }

  // if the 1 bits of the binary representation of k denote the True
  // rows in the truth table, with the least significant bit representing
  // variable x1. (there is no x0)
  // k is interpreted as a 2^n-bit binary number. 0 <= k < 2^(2^n)
  def genKthBdd(n: Int, k: Long): Bdd = {
    // we interpret k as a 'function' mapping the range 0<=i<2^n to Boolean
    //   whose value is true if there is a 1 in the ith position of k,
    //   considering the 0'th position as the least significant bit.
    genBddFromBitMask(n, truthTableRow => (k & (1 << truthTableRow)) != 0)
  }

jducoeur · June 22, 2019, 3:17am

No, that’s unrelated – after all, a for comprehension is just syntax sugar for map and flatMap. That’s actually extremely common in functional programming.

The distinction here is as opposed to the pure-functional style, where all side-effects (such as the one-off nature of the iterator) are tightly encapsulated. One of the many nice things about that approach is that it’s easier to reason about the results – you tend not to get bitten by side-effects like the Iterator getting “used up”. But it’s a different way of structuring code.

While I’m getting to the point of being comfortable with that style myself, I’m far from expert in it, and I’m not great at describing it. (As is obvious here.) If you want to delve in (and I’m not going to push you in this direction yet – it’s yet more stuff to learn), the standard source is Functional Programming in Scala, aka “The Red Book”. I won’t kid you, it’s fairly heavy going, but you’re about at the point of having enough Scala knowledge to tackle it, at least once you have typeclasses, and it’s hugely educational. You might want to consider delving into it, at least as a background project. Or possibly easier going might be Essential Scala, which I gather has a pretty strong FP focus with a lighter touch.

jimka · June 22, 2019, 6:44am

Yes, I read the Red Book some time ago. Might make sense to read it again. I had several complaints about the book, however. As I recall the main issue I had was that the authors hide very important concepts in exercises for which the answer to the exercise is not in the book itself. This makes for a significant obstacle when reading in isolation-- I read it mostly in the metro during my morning and evening commutes. As I understand there is a companion Blue Book with all the solutions to the exercises. Thus to read it you need to read the two in parallel, with both open at the same time.

jducoeur · June 22, 2019, 4:17pm

Hmm. I don’t think that’s technically true – at least, I’ve never heard that particular complaint before. (I didn’t even know the companion book existed.) But the book does assume that you know Scala pretty well, and it’s very common for folks to pick it up before they know the language well enough to actually understand what the heck is going on, or to be able to answer the exercises; I wonder if you might have hit some of that?

(This is my primary complaint with the book – IMO, it would be well-served to have a section in the introduction that says essentially, “Do not try this book until you are well-practiced in the following language features.” And I wish the community would stop recommending it to new Scala programmers: in too many cases, it’s more frustrating than useful if you’re just starting out. Creative Scala and Essential Scala are aimed at newer Scala programmers; FPiS really isn’t.)

jimka · June 22, 2019, 7:43pm

Oh the other hand the Alvin Alexander book , Functional Programming, Simplified seems to do a good job of explaining function programming without getting lost in the type calculus, and also without requiring knowledge of java. It’s my favorite Scala book that I’ve read so far. Admittedly the Alexander book is very thick, which makes it difficult to lug around. I really think he could have designed the book with half as many pages without loosing too much content, because a huge number of pages are blank, the font is huge, and every chapter has an introduction and conclusions which more or less say the same thing.

jducoeur · June 23, 2019, 3:01pm

Hmm – haven’t read the book, but from the sound of it I suspect it’s a compilation of his relatively popular blog. That would explain why it’s not all that well structured as a book per se: it probably makes more sense as individual posts…

jimka · June 24, 2019, 7:57am

One thing I like about the book, (although I haven’t read his blog), is that he emphasizes which things he found confusing when he was first learning various topics. I often found the same things confusing, so the fact that the author does a good job of putting himself in my place is only a good thing.

ivanopagano · June 24, 2019, 9:13am

The blue book is nothing more than the collection of an existing github repository: GitHub - fpinscala/fpinscala: Code, exercises, answers, and hints to go along with the book "Functional Programming in Scala"

Searching the web would’ve given that out earlier, but sadly it’s not referred to in the book itself, as the github repo was born after the book came out, I guess

In any case it’s quite easy to work through the exercises by cloning the repo and solving the code within… there are separate folders for exercises and full-blown solutions.

Being it live, you also get the advantage of having generally up-to-date solutions