Views vs iterators

What can I do with views that I cannot do with iterators? What’s the benefit of one versus the other?

Iterators are destructive and not referentially transparent. If I do the following:

List(1,2,3).iterator.foreach(println)
List(1,2,3).iterator.foreach(println)

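I cannot refactor the common expression into a shared value, because the second foreach finds the iterator already exhausted and prints nothing: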

val iter = List(1,2,3).iterator

iter.foreach(println)
iter.foreach(println)

whereas a view is immutable, so I can do the refactor with no issue:

val view = List(1,2,3).view

view.foreach(println)
view.foreach(println)
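
To make the difference visible as values rather than printed output, here is a minimal sketch (assuming Scala 2.13 or later; the val names are made up for illustration) that traverses each one twice:

```scala
val iter = List(1, 2, 3).iterator
val iterFirst  = iter.toList   // List(1, 2, 3)
val iterSecond = iter.toList   // List(): the iterator is already exhausted

val view = List(1, 2, 3).view
val viewFirst  = view.toList   // List(1, 2, 3)
val viewSecond = view.toList   // List(1, 2, 3): each traversal starts over
```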

Other than that they behave pretty similarly: both are lazy, and both may provide a performance benefit when chaining many operations on large collections. For example:

// Each step here eagerly creates an intermediate collection; for large collections this risks running out of memory.
List(1,2,3).map(...).map(...).filter(...).flatMap(...)
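
A rough sketch of the eager versus lazy behaviour, assuming Scala 2.13 or later. The counter `calls` and the function `double` are hypothetical names used only to count how often the mapping function runs:

```scala
var calls = 0
def double(i: Int): Int = { calls += 1; i * 2 }

val xs = List(1, 2, 3, 4, 5)

// Eager: map runs over all five elements and builds a full intermediate List.
val eager = xs.map(double).filter(_ > 2).take(1)
val eagerCalls = calls               // 5

// Lazy: the view pulls elements through the pipeline only until take(1) is satisfied.
val viaView = xs.view.map(double).filter(_ > 2).take(1).toList
val viewCalls = calls - eagerCalls   // 2: only elements 1 and 2 were needed
```

Replacing `.view` with `.iterator` should give the same counts for a single pass like this one.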

That’s my point. For this kind of pattern does it matter if I use an iterator instead of a view? And in those cases where I need to refer to a view twice, can I create two iterators instead? What is the reason the collections framework implements both?

As mrooneytrimble said above, iterators are destructive and views aren’t. Views are good if you want to perform lots of map and filter operations on a collection and then transform the result into, say, a List or Map at the end. An iterator doesn’t need a backing collection held in memory the way a view does, so if you want to iterate over all permutations of a list to do something without generating a ginormous collection, iterators are probably better.
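
As an illustration of the permutations case, here is a small sketch. The list size and the val names are arbitrary choices for the example; `permutations` itself is the standard library method, which returns an Iterator:

```scala
val items = (1 to 12).toList

// permutations returns an Iterator, so the 12! = 479,001,600 orderings
// are produced one at a time rather than stored in a collection.
val firstThree: List[List[Int]] = items.permutations.take(3).toList

// A search stops as soon as a match is found; here that happens
// within the first few permutations.
val lastElementMoved: Option[List[Int]] =
  items.permutations.find(_.last != items.last)
```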

But, again, I can do that with an iterator.

I understand the difference between views and iterators. My question is about usage patterns.

You can do that, but afaik it’s not usually done - at least, I’ve never used an iterator to map Map values or something like that.

Yes, you can. In some cases that’s all the view is doing under the hood anyway. The view might be more efficient, or it might not. It’s probably more convenient to use two views, except that it may be less convenient to have to think about yet another mode of operation for collections.

But if you can do it with iterators, may as well do it with iterators, especially if it’s just a single pass.
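
A sketch of the "two iterators" approach next to the view approach, assuming Scala 2.13 or later. The names are invented for the example:

```scala
val xs = List(1, 2, 3)

// A def yields a fresh iterator on every use, so two passes are possible.
def doubled: Iterator[Int] = xs.iterator.map(_ * 2)
val total   = doubled.sum   // 12
val largest = doubled.max   // 6

// A view gives the same effect through a single reusable value.
val doubledView = xs.view.map(_ * 2)
val viewTotal   = doubledView.sum   // 12
val viewLargest = doubledView.max   // 6
```

In both versions the mapping function runs again on each pass; the difference is mostly in whether you hold a value or a recipe for making one.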

What else is an Iterator good for, though? This is exactly their intended use case.

Oh, I thought they were some kind of hold-over from Java used for imperative algorithms (although I guess .map and other methods probably use an iterator and builder under the hood).

Iterators are great for creating lots of expensive things one at a time and consuming them without the same memory footprint as collection items.

List.fill(10000)(expensiveOperation)

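import scala.collection.AbstractIterator

new AbstractIterator[Op] {
  var count = 10000 // number of elements still to produce
  def hasNext: Boolean = count > 0
  def next(): Op = {
    if (!hasNext)
      Iterator.empty.next() // throws NoSuchElementException
    else {
      count -= 1
      expensiveOperation // evaluated only when the element is requested
    }
  }
}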

It’s a bit of a niche use. I normally just add .view to whatever pipeline I’m building and don’t worry about the details. But there are situations where a custom iterator is a nice approach.

What’s wrong with Iterator.fill(10000)(expensiveOperation)?
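
For what it's worth, `Iterator.fill` takes its element by name, so as far as laziness goes it seems to behave like the hand-written iterator. A quick sketch, where `expensiveOperation` is a hypothetical stand-in that just counts its own evaluations:

```scala
var evaluations = 0
// Hypothetical stand-in for the expensive computation.
def expensiveOperation: Int = { evaluations += 1; 42 }

val ops = Iterator.fill(10000)(expensiveOperation)
val afterCreation = evaluations   // 0: nothing has been computed yet

val firstThree = ops.take(3).toList
val afterTaking = evaluations     // 3: only the consumed elements were computed
```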

Can someone explain the choice of the word view here? How is the meaning of the word supposed to imply lazy? Normally a view is a projection. I.e., I can view the spherical globe either as a cylindrical projection, or as a rectangular view window between two coordinates on the surface of the globe.

Also in a model/view/controller, the view is one of many possible projections of the model to present a particular aspect of the system to the user or to another software component.

Also in a CAD system, a view is a projection of the physical or mathematical system which provides a certain abstraction to the designer, such as the schematic view, or the layout view or a simulation view.

I don’t understand the connection between viewing a sequence and making it lazy.

I think the intuition here is that if you map a View on a List, you create a new projection to the list, rather than creating a new list.

And for a view, is it guaranteed that computed values are memoized? Or does that depend on the particular view? I.e., if I create a view on a sequence whose elements are computationally expensive to compute, is it guaranteed that reading the values a second time is a cache retrieval and not a re-computation?

Views in the stdlib are not memoized. They’re intended to represent staged operations on some underlying collection, as a lens to view the collection through.

View is a trait, so implementations that do memoize could be made, but people do expect Views to have low memory consumption, so users of a memoizing view could be in for an unpleasant surprise.
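
A small sketch of what "not memoized" means in practice, assuming Scala 2.13 or later. `slowSquare` and the counter `calls` are hypothetical names for the example:

```scala
var calls = 0
def slowSquare(i: Int): Int = { calls += 1; i * i }

val squares = List(1, 2, 3).view.map(slowSquare)
val callsBefore = calls        // 0: building the view computes nothing

val first = squares.toList     // List(1, 4, 9)
val callsAfterFirst = calls    // 3

val second = squares.toList    // same result, computed again
val callsAfterSecond = calls   // 6: the second traversal re-ran slowSquare
```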

In terms of unpleasant surprises, this could work both ways, right? I might be unpleasantly surprised that slow computations occur multiple times, or I might be unpleasantly surprised that memory consumption is high.

It seems like what you are saying, however, is that if I want a lazy sequence which does not recompute slow computations, I, the application programmer, am responsible myself for memoizing?

But the first kind of surprise is caused by not knowing how Views work in the Scala standard library, while the second kind of surprise is caused by one particular type of View working differently from all other Views.

For that use case I think a LazyList is best suited.

Yes, that was sort of the reason for my question/comment. I read through the 2.13 views page and I didn’t see whether lazy implies memoizing or not. In some functional languages, lazy does imply memoizing. In my mind that is an important piece of semantics that the user should know about to avoid either of the two surprises mentioned above.

With regard to “first kind of surprise is caused by not knowing how Views work in the scala std library”, I understand now how they work, but only because we discussed it here, not because I was able to find that in the documentation. Please forgive me if I simply missed that point in the documentation. BTW I do see that LazyList is documented as being memoizing. Thanks @Jasper-M for the suggestion; that will be useful.
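
For comparison, a sketch of the same kind of double traversal with a LazyList (Scala 2.13 or later). As before, `slowSquare` and `calls` are hypothetical names:

```scala
var calls = 0
def slowSquare(i: Int): Int = { calls += 1; i * i }

val squares = LazyList(1, 2, 3).map(slowSquare)

val first = squares.toList     // List(1, 4, 9)
val callsAfterFirst = calls    // 3

val second = squares.toList    // served from the cached cells
val callsAfterSecond = calls   // still 3
```

The trade-off is the one discussed above: as long as `squares` is referenced, the computed elements stay in memory.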

I have to agree that the documentation and scaladoc are not 100% clear on this topic.

Which makes sense, as memoization is a concern orthogonal to the eager/lazy divide and might rather be located at the function level. Using scalaz’s Memo:

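import scalaz.Memo

// Prints on every real evaluation, so the output shows when memoization kicks in.
def inc(i: Int): Int = {
  println(s"inc $i")
  i + 1
}

// A def, so each use below starts with a fresh, empty memo table.
def memoInc: Int => Int = Memo.immutableHashMapMemo(inc)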

val is = Vector(1, 2, 1, 3)
println(is.map(inc).take(3))
println(is.map(memoInc).take(3))
println(is.view.map(inc).take(3).to(Vector))
println(is.view.map(memoInc).take(3).to(Vector))

Output:

inc 1
inc 2
inc 1
inc 3
Vector(2, 3, 2)
inc 1
inc 2
inc 3
Vector(2, 3, 2)
inc 1
inc 2
inc 1
Vector(2, 3, 2)
inc 1
inc 2
Vector(2, 3, 2)
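
If pulling in scalaz is not an option, function-level memoization can be approximated with the standard library. This is only a sketch: `memoize` is a hypothetical helper, not a library function, and its mutable cache is not thread-safe.

```scala
import scala.collection.mutable

// Hypothetical helper: wraps f with a cache keyed by its argument.
def memoize[A, B](f: A => B): A => B = {
  val cache = mutable.HashMap.empty[A, B]
  a => cache.getOrElseUpdate(a, f(a))
}

var calls = 0
def inc(i: Int): Int = { calls += 1; i + 1 }

val memoInc = memoize[Int, Int](inc)

// The view stays lazy; the repeated input 1 is computed only once.
val result = Vector(1, 2, 1, 3).view.map(memoInc).toVector   // Vector(2, 3, 2, 4)
val callsMade = calls                                        // 3
```

Because `memoInc` is a val here, the cache is shared by every pipeline that uses it, unlike the `def` in the scalaz example, which gets a fresh memo on each use.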