Why can't the Iterator work correctly?

MikuChaser · September 13, 2018, 12:51pm

According to the book Programming in Scala, an iterator can be used only once. But sometimes it is not empty after I use it. My version of Scala is 2.12.6.
The following picture shows my experiment.

However, the example in the book behaves differently. Why?
————————————————————————————————————————————————
scala> val it = Iterator("a", "number", "of", "words")
it: Iterator[java.lang.String] = non-empty iterator
scala> it.map(_.length)
res1: Iterator[Int] = non-empty iterator
scala> res1 foreach println
1
6
2
5
scala> it.next()
java.util.NoSuchElementException: next on empty iterator

hmf · September 13, 2018, 12:54pm

The book example uses the variable res1, which holds the output of the last expression that was evaluated. These variables have the form resX where X is an integer that increases for each new expression that is evaluated.

HTH

MikuChaser · September 13, 2018, 1:06pm

I retried it and it works correctly this time. However, the book says “As you can see, after the call to map, the it iterator has advanced to its end”. In fact, it is “res1 foreach println” that pushes the iterator to its end, not the call to map. So can I consider this an error in the book's wording? Another example seems to be wrong too.
This is my experiment.
————————————————————————————————————————————————
scala> val it = Iterator("a", "number", "of", "words")
it: Iterator[String] = non-empty iterator

scala> it dropWhile (_.length < 2)
res40: Iterator[String] = non-empty iterator

scala> it.next()
res41: String = of

And this is the example from the book. Why is the result different?
————————————————————————————————————————————————
scala> val it = Iterator("a", "number", "of", "words")
it: Iterator[java.lang.String] = non-empty iterator

scala> it dropWhile (_.length < 2)
res4: Iterator[java.lang.String] = non-empty iterator

scala> it.next()
res5: java.lang.String = number

curoli · September 13, 2018, 2:25pm

map creates a new iterator, exhausting the old one.

MikuChaser · September 13, 2018, 2:36pm

No. When I execute this code:

val it = Iterator("a", "number", "of", "words")
it.map(_.length)

then

it.hasNext is still true

martijnhoekstra · September 13, 2018, 2:53pm

Once you call a method that consumes the iterator, all bets are off, and you shouldn’t use the iterator anymore.

After calling it.map(_.length), it should be regarded as invalidated, and there are no promises on how it behaves anymore.

If you look at the map implementation of this specific Iterator implementation, you can see that map does not directly affect the original iterator, and that the new iterator returned by map shares the underlying iterator.

That behaviour can’t and shouldn’t be relied on. It’s an implementation detail.

Also note that in the example in the book, it.next() throws a NoSuchElementException after iterating over the new Iterator. That’s because they share the underlying iterator.

That’s allowed by the interface, because the behaviour you get when using an iterator after an “unsafe” method is called on it is undefined.

The following transcript may help:

Welcome to Scala 2.12.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161).
Type in expressions for evaluation. Or try :help.

scala> val src = Iterator("a", "number", "of", "words")
src: Iterator[String] = non-empty iterator

scala> val target = src.map(_.length)
target: Iterator[Int] = non-empty iterator

scala> src.hasNext
res0: Boolean = true

scala> target.hasNext
res2: Boolean = true

scala> target foreach println
1
6
2
5

scala> src.hasNext
res4: Boolean = false
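
To see why draining target also drains src, it can help to write a lazy map by hand. The following is only a minimal sketch of the idea, not the standard library's actual code; lazyMap and LazyMapSketch are hypothetical names made up for this illustration.

```scala
object LazyMapSketch {
  // Hypothetical helper, not the library's map: it wraps `source` without copying anything.
  def lazyMap[A, B](source: Iterator[A])(f: A => B): Iterator[B] = new Iterator[B] {
    def hasNext: Boolean = source.hasNext // asks the source directly
    def next(): B = f(source.next())      // pulls one element from the source on demand
  }

  val src: Iterator[String] = Iterator("a", "number", "of", "words")
  val target: Iterator[Int] = lazyMap(src)(_.length)

  val srcHasNextBefore: Boolean = src.hasNext // nothing has been pulled yet
  val lengths: List[Int] = target.toList      // drains target, and src along with it
  val srcHasNextAfter: Boolean = src.hasNext

  def main(args: Array[String]): Unit =
    println(s"before=$srcHasNextBefore lengths=$lengths after=$srcHasNextAfter")
}
```

Because the wrapper holds no elements of its own, every next() on the mapped iterator is a next() on the source. With the library's own map, the safe habit is still to use only the returned iterator.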

ClintGilbert · September 13, 2018, 3:49pm

Mapping does make a new Iterator, but it doesn’t exhaust the old one:

> val i = Iterator(1,2,3,4,5)
i: Iterator[Int] = non-empty iterator

> i.hasNext
res0: Boolean = true

> val plusOnes = i.map(_ + 1)
plusOnes: Iterator[Int] = non-empty iterator

> i.hasNext
res1: Boolean = true

SethTisue · September 13, 2018, 4:49pm

right, but exhausting the new one will exhaust the old one as well. (I’m sure you know that Clint, but it isn’t necessarily clear/obvious when you’re new to this.)

ClintGilbert · September 13, 2018, 5:25pm

Aha, yes, I misunderstood Oliver’s response.

curoli · September 13, 2018, 7:00pm

Actually, I think you understood me right. I didn’t realize that Iterators are always mapped lazily.

Russ · September 13, 2018, 7:45pm

If you intend to use an Iterator more than once, why not just use a List or a Vector instead? And even if you only intend to use it once, what is the advantage of using an Iterator rather than a List or Vector? Am I missing something?

ClintGilbert · September 13, 2018, 8:46pm

If you intend to use an Iterator more than once, why not just use a List or a Vector instead?

You can’t use one more than once, so yeah, a “regular” collection would be the thing in that case.

And even if you only intend to use it once, what is the advantage of using an Iterator rather than a List or Vector? Am I missing something?

Some reasons that I’ve encountered in real life:

You don’t want to make intermediate collections that just get GC’d.

You want to have only one element in memory for processing at any one time instead of keeping around every element in a collection.

You want to expose a “stream” of unknown, possibly-unbounded size with the familiar and handy collections API.

You want something lazy, but don’t want to hang onto every processed/generated element like scala.Stream does.
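
A small sketch of the first two points, assuming a counter is an acceptable way to observe laziness. IteratorPipelineSketch, produced and the other names are hypothetical, chosen only for this example.

```scala
object IteratorPipelineSketch {
  // Counts how many source elements were actually generated.
  var produced = 0

  // A million candidates, but nothing is computed until someone asks for an element.
  val source: Iterator[Int] =
    Iterator.range(1, 1000001).map { n => produced += 1; n }

  // map, filter and take build no intermediate collection; only the final List is materialized.
  val firstThreeEvenSquares: List[Int] =
    source.map(n => n * n).filter(_ % 2 == 0).take(3).toList

  def main(args: Array[String]): Unit = {
    println(firstThreeEvenSquares)
    println(s"source elements generated: $produced")
  }
}
```

The same pipeline on a List or Vector would square and filter all one million elements before take(3) ever ran.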

curoli · September 13, 2018, 8:46pm

Reasons to use Iterator:

(1) There are potentially so many elements, you don’t want to have to keep them all in memory at the same time (e.g. traversing lines of a very large file)

(2) You want to start retrieving the first elements while later ones are not yet available (e.g. results from a database query or webservice)

(3) The elements are expensive to create, and maybe you don’t need them all (e.g. some expensive calculation)

(4) The number of elements is very large or even indefinite, but you only need the first ones (e.g. calculate prime numbers in sequence, keep going until I find one I like)
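
Point (4) could be sketched roughly like this. The trial-division isPrime is a deliberately naive assumption for illustration, and PrimeSearchSketch and its members are hypothetical names.

```scala
object PrimeSearchSketch {
  // Naive trial division; good enough for a small illustration.
  def isPrime(n: Int): Boolean =
    n >= 2 && (2 to math.sqrt(n.toDouble).toInt).forall(n % _ != 0)

  // A def, so every call hands out a fresh, unbounded, lazily evaluated iterator.
  def primes: Iterator[Int] = Iterator.from(2).filter(isPrime)

  // "Keep going until I find one I like": here, the first prime above 100.
  val firstPrimeAbove100: Int = primes.find(_ > 100).get

  // Only the first few elements are ever computed.
  val firstFive: List[Int] = primes.take(5).toList

  def main(args: Array[String]): Unit = {
    println(firstPrimeAbove100)
    println(firstFive)
  }
}
```

Defining primes as a def rather than a val sidesteps the single-use restriction: each search starts from its own iterator.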

curoli · September 13, 2018, 9:23pm

Yes, I was wrong, you are right and the book is wrong.

MikuChaser · September 14, 2018, 5:08am

I think Martin Odersky may not want to create a lazy map method. Is it a bug?

martijnhoekstra · September 14, 2018, 7:50am

What Martin Odersky wants to create or not is not really germane to the discussion.

The behaviour I’m showing is exactly the behaviour promised by the Iterator contract. It’s not a bug.

If you want to be able to iterate over an iterator twice, duplicate it.

Welcome to Scala 2.12.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161).
Type in expressions for evaluation. Or try :help.

scala> val src = Iterator("a", "number", "of", "words")
src: Iterator[String] = non-empty iterator

scala> val (i1, i2) = src.duplicate // invalidates src; get two for the price of one
i1: Iterator[String] = non-empty iterator
i2: Iterator[String] = non-empty iterator

scala> val target = i1.map(_.length) // invalidates i1
target: Iterator[Int] = non-empty iterator

scala> target foreach println
1
6
2
5

scala> i2 foreach println
a
number
of
words
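
If the data comfortably fits in memory, another option (along the lines of the List or Vector suggestion earlier in the thread) is to drain the iterator once into a strict collection and traverse that instead. This is only a sketch; MaterializeSketch and its members are hypothetical names.

```scala
object MaterializeSketch {
  // Drain the iterator exactly once into an immutable collection.
  val words: Vector[String] = Iterator("a", "number", "of", "words").toVector

  // A Vector can be traversed as often as needed.
  val lengths: Vector[Int] = words.map(_.length)
  val upper: Vector[String] = words.map(_.toUpperCase)

  // A fresh, independent iterator is available on demand.
  val firstLong: Option[String] = words.iterator.find(_.length > 2)

  def main(args: Array[String]): Unit = {
    println(lengths)
    println(upper)
    println(firstLong)
  }
}
```

The trade-off is memory: unlike duplicate, this holds every element at once, so it is not suitable for very large or unbounded sources.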

MikuChaser · September 14, 2018, 7:57am

OK. I will remember this usage.
Thanks.

hmf · September 14, 2018, 9:07am

No it is not a bug. For an explanation of why, look at:

It seems this will change in 2.13 and map should retain its laziness.

siddhartha-gadgil · September 14, 2018, 10:26am

One may want to use an iterator only once to save memory, e.g. to iterate over all permutations.

regards,

Siddhartha

MikuChaser · September 14, 2018, 5:15pm

A new problem! After I call the duplicate method on an iterator, the old one is still non-empty. And if I exhaust the old one, the two new ones become empty. Amazing!