Why is my implementation of Iterable not lazy


#1

I am implementing a wrapper that manages iterators within it. Essentially I keep iterators in the wrapper
that represent columns of data. While testing I found that my wrapper is not lazy. So I made this very
simple test:

    val counter1 = Stream.from(1).toIterable // No need to call toIterable
    val r1 = counter1.map( _ * 2).take(25).toList // terminates
    println(r1)

The infinite stream above can be processed lazily. So now I implement my own Iterable and expect it
to behave lazily too just as above. I have the following minimal example:

    val data = List(Stream.from(1).toIterator)
    val counter3 = new Iterable[List[Int]] {
      override def iterator: Iterator[List[Int]] = new AbstractIterator[List[Int]] {

        override def hasNext: Boolean = data.forall( p => p.hasNext)

        override def next(): List[Int] = data.map { v => v.next() }
      }
    }
    //val r3 = counter3.toIterable.map( _.map(_ * 2) ).take(25).toList // does not terminate
    val r3 = counter3.map( _.map(_ * 2) ).take(25).toList // does not terminate
    //val r3 = counter3.toIterator.map( _.map(_ * 2) ).take(25).toList // terminates
    //val r3 = counter3.iterator.map( _.map(_ * 2) ).take(25).toList // terminates
    println(r3)

If I use the Iterable’s map directly, the behavior is not lazy. I am obligated to access
the iterator first. What am I doing wrong?

TIA


Why can't the Iterator work correctly?
#2

The default implementations of collection methods up to Scala 2.12 are strict. You have to override basically everything if you want to implement a lazy collection. This changes with the new design in 2.13.0-M4 that makes the default implementations lazy and provides mix-in traits for more efficient strict implementations.


#3

Your implementation of the hasNext is not lazy. forall needs to go over all elements, until it finds one that returns false.


#4

If you are specifically referring to the Iterable and Iterator implementations this seems odd. Note that I am using the Stream’s iteratable, so Iterable is lazy. Because I use that Iterable, my Iterable should also be lazy (baring some error I am making). Note that if I access the Stream’s iterator via the very same Iterable, then it is lazy. Must be more to it than meets the eye.


#5

I am iterating over Iterators. In this case we have only one - does not seem to be an issue.


#6

Same behavior on 2.13, maybe because of using List here? Just guessing from browsing. I got lost among the factories.

scala> val data = List(Stream.from(1).toIterator)
                                      ^
       warning: method toIterator in trait IterableOnceOps is deprecated (since 2.13.0): Use .iterator() instead of .toIterator
                       ^
       warning: value Stream in package scala is deprecated (since 2.13.0): Use LazyList instead of Stream
data: List[Iterator[Int]] = List(non-empty iterator)

scala> def vs = new Iterable[List[Int]] { def iterator = new Iterator[List[Int]] { def hasNext = data.forall(_.hasNext); def next() = data.map(_.next()) }}
vs: Iterable[List[Int]]

scala> vs.map(_.map(_*2)).take(25).toList  // bye

Note that what is built by Iterable.map has nothing to do with anything else, though one could imagine that an Iterable[SomethingLazy] might try to be lazy on 2.12. Also guessing.


#8

The list is not an issue, but you are correct - the example is not clear.
I have made the following simpler example:

    case class Wrapper(r:Iterable[Int])
    val counter2 = Wrapper(Stream.from(1))
    val r2 = counter2.r.map( _ * 2).take(25).toList // terminates
    println(r2)

    val data = Wrapper(Stream.from(1))
    val counter3 = new Iterable[Int] {
      override def iterator: Iterator[Int] = new AbstractIterator[Int] {
        val iter = data.r.toIterator

        override def hasNext: Boolean = iter.hasNext
        override def next(): Int = iter.next()
      }
    }
    //val r3 = counter3.toIterable.map( _ * 2 ).take(25).toList // does not terminate
    //val r3 = counter3.map( _ * 2 ).take(25).toList // does not terminate
    val r3 = counter3.toIterator.map( _ * 2 ).take(25).toList // terminates
    //val r3 = counter3.iterator.map( _ * 2 ).take(25).toList // terminates
    println(r3)

In the first “section” I use the Stream’s Iterable directly with no problem.
In the second case I simply wrap this Iterable in my own. Note that it does
nothing useful here - just delegates to the underlying iterator. But even so
it is not lazy when I process the stream via the Iterable and not the underlying
iterator.

Here is the “same” code that you can paste into scastie:

object Main {
  
  case class Wrapper(r:Iterable[Int])
  
  def main(args: Array[String]): Unit = {}
  
  val counter2 = Wrapper(Stream.from(1))
    val r2 = counter2.r.map( _ * 2).take(25).toList // terminates
    println(r2)
    val data = Wrapper(Stream.from(1))
    val counter3 = new Iterable[Int] {
      override def iterator: Iterator[Int] = new Iterator[Int] {
        val iter = data.r.toIterator

        override def hasNext: Boolean = iter.hasNext
        override def next(): Int = iter.next()
      }
    }
  
    //val r3 = counter3.toIterable.map( _ * 2 ).take(25).toList // does not terminate
    val r3 = counter3.map( _ * 2 ).take(25).toList // does not terminate
    //val r3 = counter3.toIterator.map( _ * 2 ).take(25).toList // terminates
    //val r3 = counter3.iterator.map( _ * 2 ).take(25).toList // terminates
    println(r3)
}

Same result, it hangs processing an infinite loop.

Hmm… now how do I cancel the process in scastie’s sever?


#9

I don’t spend my days thinking about collections, but I think my previous comment is still cogent or relevant.

You can’t expect that result to be either strict or lazy unless Stefan Zeiger tells you so. Zeiger points the way.


#10

I see what you did there. But I had to take another look anyway.

The problem is indeed what map builds. In 2.12 map is strict so you’d have to override it with a lazy implementation and provide a sensible CanBuildFrom instance, too. In 2.13 map supports lazy building but the default Iterable, as @som-snytt points out, is List, so it’s not lazy. It works (in 2.13.0-M4) if you override the iterableFactory to build a lazy collection:

val counter3 = new Iterable[Int] {
  override def iterator: Iterator[Int] = ...
  override def iterableFactory = Stream
}

#11

Cogent and relevant. I simply did not understand what you said. @szeiger’s last comment has made it clear. Thanks.


#12

Ok. Finally get the gist of it. I am using 2.12 so I have one last question: the underlying CanBuildFrom generates a List, however the following statement succeeds:

val r3 = counter3.iterator.map( _ * 2 ).take(25).toList

Am I correct in assuming it is lazy? Reason I ask is because I am using this as a kludge to “solve” the problem.

TIA


#13

Yes, you are correct. Iterator goes to great lengths to be lazy.

I see the latest commit on 2.12 Iterator is, “Iterator.scanLeft is lazy at initial value.” Apparently, it was my contribution, but I’d have to get out a napkin and a pencil to verify how it works. Another was, “GroupedIterator is also lazy when padded.” I no longer understand the helpful comments I added to the code four years ago.

Those iterators are like the Monday crossword.

I’m not sure why you see your solution as a kludge. Nothing says lazy like iterator, so you have made it clear that you want laziness, the way par makes it clear you want parallelism.


#14

Thank you very much for the feedback. I will assume Iterator is always lazy irrespective of the operation.

I have function that returns an Iterable. However I have to do a map, but in order to keep it lazy I have to:

  • get the iterator
  • map over the iterator
  • return the Iterable of the map

Feels a little convoluted. Anyway thank you to you and @szeiger for the information.