According to the book Programming in Scala, an iterator can be used only once. But sometimes it’s not empty after I use it. My version of Scala is 2.12.6.
The following picture is my experiment.
However, the example in the book is different. Why?
————————————————————————————————————————————————
scala> val it = Iterator("a", "number", "of", "words")
it: Iterator[java.lang.String] = non-empty iterator
scala> it.map(_.length)
res1: Iterator[Int] = non-empty iterator
scala> res1 foreach println
1
6
2
5
scala> it.next()
java.util.NoSuchElementException: next on empty iterator
The book example uses the variable res1, which holds the output of the last expression that was evaluated. These variables have the form resX, where X is an integer that increases for each new expression that is evaluated.
HTH
I retried it and it works correctly this time. However, the book says “As you can see, after the call to map, the it iterator has advanced to its end”. In fact, it’s “res1 foreach println” that pushes the iterator to its end, not the call to map. So can I consider that statement an error in the book? Another example is wrong too.
This is my experiment.
————————————————————————————————————————————————
scala> val it = Iterator("a", "number", "of", "words")
it: Iterator[String] = non-empty iterator
scala> it dropWhile (_.length < 2)
res40: Iterator[String] = non-empty iterator
scala> it.next()
res41: String = of
And this is the example from the book. Why is the result different?
————————————————————————————————————————————————
scala> val it = Iterator("a", "number", "of", "words")
it: Iterator[java.lang.String] = non-empty iterator
scala> it dropWhile (_.length < 2)
res4: Iterator[java.lang.String] = non-empty iterator
scala> it.next()
res5: java.lang.String = number
map creates a new iterator, exhausting the old one.
No. When I execute this code:
val it = Iterator("a", "number", "of", "words")
it.map(_.length)
then it.hasNext is still true.
Once you call a method that consumes the iterator, all bets are off, and you shouldn’t use the iterator anymore.
After calling it.map(_.length), it should be regarded as invalidated, and there are no promises about how it behaves anymore.
Knowing how map is implemented for the specific Iterator implementation behind it shows that map does not directly affect the iterator, and that the new iterator returned by map shares the underlying iterator.
That behaviour can’t and shouldn’t be relied on. It’s an implementation detail.
Also note that in the example in the book, it.next() throws a NoSuchElementException after iterating over the new Iterator. That’s because they share the underlying iterator.
That’s allowed by the interface, because the behaviour you get when using an iterator after an “unsafe” method is called on it is undefined.
The following transcript may help:
Welcome to Scala 2.12.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161).
Type in expressions for evaluation. Or try :help.
scala> val src = Iterator("a", "number", "of", "words")
src: Iterator[String] = non-empty iterator
scala> val target = src.map(_.length)
target: Iterator[Int] = non-empty iterator
scala> src.hasNext
res0: Boolean = true
scala> target.hasNext
res2: Boolean = true
scala> target foreach println
1
6
2
5
scala> src.hasNext
res4: Boolean = false
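To make the sharing visible, here is a minimal sketch of how a lazy map over an iterator could be written. It is not the standard library's source; lazyMap and LazyMapSketch are hypothetical names used only for illustration.

```scala
object LazyMapSketch {
  // Hypothetical helper, not the standard library's code:
  // a lazily mapped wrapper around an existing iterator.
  def lazyMap[A, B](underlying: Iterator[A])(f: A => B): Iterator[B] =
    new Iterator[B] {
      // Both methods delegate to the same underlying iterator,
      // so advancing the mapped iterator advances the original too.
      def hasNext: Boolean = underlying.hasNext
      def next(): B = f(underlying.next())
    }

  def main(args: Array[String]): Unit = {
    val src = Iterator("a", "number", "of", "words")
    val target = lazyMap(src)(_.length)
    println(src.hasNext)   // true: building the mapped iterator consumed nothing
    println(target.toList) // List(1, 6, 2, 5)
    println(src.hasNext)   // false: draining target drained src
  }
}
```

With a wrapper like this, creating the mapped iterator touches nothing, and every element pulled from it is pulled from the source, which matches what the transcript shows.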
Mapping does make a new Iterator, but it doesn’t exhaust the old one:
> val i = Iterator(1,2,3,4,5)
i: Iterator[Int] = non-empty iterator
> i.hasNext
res0: Boolean = true
> val plusOnes = i.map(_ + 1)
plusOnes: Iterator[Int] = non-empty iterator
> i.hasNext
res1: Boolean = true
Right, but exhausting the new one will exhaust the old one as well. (I’m sure you know that, Clint, but it isn’t necessarily clear/obvious when you’re new to this.)
Aha, yes, I misunderstood Oliver’s response.
Actually, I think you understood me right. I didn’t realize that Iterators are always mapped lazily.
If you intend to use an Iterator more than once, why not just use a List or a Vector instead? And even if you only intend to use it once, what is the advantage of using an Iterator rather than a List or Vector? Am I missing something?
If you intend to use an Iterator more than once, why not just use a List or a Vector instead?
You can’t use one more than once, so yeah, a “regular” collection would
be the thing in that case.
And even if you only intend to use it once, what is the advantage of using an Iterator rather than a List or Vector? Am I missing something?
Some reasons that I’ve encountered in real life:
You don’t want to make intermediate collections that just get GC’d.
You want to have only one element in memory for processing at any one time instead of keeping around every element in a collection.
You want to expose a “stream” of unknown, possibly-unbounded size with the familiar and handy collections API.
You want something lazy, but don’t want to hang onto every processed/generated element like scala.Stream does.
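A small sketch of the "lazy, one element at a time" point: it counts how often the mapping function actually runs when only the first two results are requested. The names LazinessSketch and firstTwoLengths are hypothetical, chosen for this illustration.

```scala
object LazinessSketch {
  // Hypothetical helper: returns the first two lengths together with
  // the number of times the mapping function was actually invoked.
  def firstTwoLengths(words: Iterator[String]): (List[Int], Int) = {
    var calls = 0
    // Building the mapped iterator runs nothing yet.
    val lengths = words.map { w => calls += 1; w.length }
    // Only the elements that are pulled get processed.
    val firstTwo = lengths.take(2).toList
    (firstTwo, calls)
  }

  def main(args: Array[String]): Unit = {
    val (firstTwo, calls) = firstTwoLengths(Iterator("a", "number", "of", "words"))
    println(firstTwo) // List(1, 6)
    println(calls)    // 2: "of" and "words" were never processed
  }
}
```

Calling map on a List of the same four words would run the function four times and build a full intermediate list before take(2) is applied.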
Reasons to use Iterator:
(1) There are potentially so many elements, you don’t want to have to keep them all in memory at the same time (e.g. traversing lines of a very large file)
(2) You want to start retrieving the first elements while later ones are not yet available (e.g. results from a database query or webservice)
(3) The elements are expensive to create, and maybe you don’t need them all (e.g. some expensive calculation)
(4) The number of elements is very large or even indefinite, but you only need the first ones (e.g. calculate prime numbers in sequence, keep going until I find one I like)
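Reason (4) can be sketched with an unbounded iterator of primes. This is only an illustration with a naive primality test; PrimesSketch, isPrime and primes are hypothetical names.

```scala
object PrimesSketch {
  // Naive trial-division test, good enough for an illustration.
  def isPrime(n: Int): Boolean =
    n >= 2 && (2 to math.sqrt(n.toDouble).toInt).forall(n % _ != 0)

  // An unbounded iterator: each prime is computed only when it is requested.
  // Defined as a def so that every call starts a fresh, unused iterator.
  def primes: Iterator[Int] = Iterator.from(2).filter(isPrime)

  def main(args: Array[String]): Unit = {
    println(primes.take(5).toList) // List(2, 3, 5, 7, 11)
    println(primes.find(_ > 100))  // Some(101)
  }
}
```

A List or Vector could not represent this sequence at all, because it would have to hold every element up front.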
Yes, I was wrong, you are right and the book is wrong.
I think Martin Odersky may not want to create a lazy map method. Is it a bug?
What Martin Odersky wants to create or not is not really germane to the discussion.
The behaviour I’m showing is exactly the behaviour promised by the Iterator contract. It’s not a bug.
If you want to be able to iterate over an iterator twice, duplicate it.
Welcome to Scala 2.12.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_161).
Type in expressions for evaluation. Or try :help.
scala> val src = Iterator("a", "number", "of", "words")
src: Iterator[String] = non-empty iterator
scala> val (i1, i2) = src.duplicate // invalidates src; get two for the price of one
i1: Iterator[String] = non-empty iterator
i2: Iterator[String] = non-empty iterator
scala> val target = i1.map(_.length) // invalidates i1
target: Iterator[Int] = non-empty iterator
scala> target foreach println
1
6
2
5
scala> i2 foreach println
a
number
of
words
OK. I will remember this usage.
Thanks.
No, it is not a bug. For an explanation of why, look at:
It seems this will change in 2.13 and map should retain its laziness.
One may want to use an iterator only once to save memory, e.g. to iterate over all permutations.
regards,
Siddhartha
A new problem! After I call the duplicate method on an iterator, the old one is still non-empty. And if I exhaust the old one, the two new ones become empty. Amazing!
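One way to avoid touching the original by accident is to keep it out of scope, so that only the two duplicates can be used afterwards. This is a sketch of that idea, not a recommendation from the library documentation; DuplicateSketch and words are hypothetical names.

```scala
object DuplicateSketch {
  // Hypothetical helper: the source iterator lives only inside this method,
  // so callers can reach the two duplicates but never the invalidated source.
  def words(): (Iterator[String], Iterator[String]) = {
    val src = Iterator("a", "number", "of", "words")
    src.duplicate
  }

  def main(args: Array[String]): Unit = {
    val (i1, i2) = words()
    println(i1.map(_.length).toList) // List(1, 6, 2, 5)
    println(i2.toList)               // List(a, number, of, words)
  }
}
```

The two duplicates can be traversed independently of each other; it is only the source passed to duplicate that must be left alone.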