Map with updatable state

kwalcock · April 29, 2023, 2:06am

Every now and then I need to use map on a collection in a way that depends on what has been previously observed of the collection. This could be handled with a var outside of the mapping block that gets updated from within the block. I’d rather avoid those vars in regular, non-library code, though. It can also be done recursively, but that doesn’t look like map anymore. A simple solution is

implicit class StatefulMapper[T](list: List[T]) { // Anything that can be mapped?
  def withStateMap[S, U](initialState: S)(f: (S, T) => (S, U)): List[U] = {
    var currentState = initialState

    list.map { value: T =>
      val (newState: S, newValue: U) = f(currentState, value)

      currentState = newState
      newValue
    }
  }
}

List(1, 2, 3, 4, 5, 7, 8).withStateMap(false) { case (oddRun, value) =>
  val newOddRun = value % 2 == 1
  (newOddRun, if (oddRun) -value else value)
} //  List(1, -2, 3, -4, 5, -7, -8)

This works for a List, of course. What is the best way to write it so that it works on just about anything that can be mapped, though? Or is there a completely different and better way to do it? Sometimes a similar foreach would be handy.

spamegg1 · April 29, 2023, 7:33am

It looks like you are doing a fold operation?
foldLeft is available for pretty much any Iterable:
https://scala-lang.org/api/3.2.2/scala/collection/IterableOnceOps.html#foldLeft-fffff9e1

Jasper-M · April 29, 2023, 9:44am

It’s pretty complicated to make it fully generic but not impossible.

import scala.collection.IterableOnce
import scala.collection.BuildFrom
import scala.collection.generic.IsIterableOnce
import scala.language.implicitConversions

implicit def toMapper[Repr](coll: Repr)(implicit it: IsIterableOnce[Repr]): StatefulMapper[Repr, it.type] =
  new StatefulMapper(coll, it)

class StatefulMapper[Repr, I <: IsIterableOnce[Repr]](coll: Repr, val it: I) {
  def withStateMap[S, U, That](
    initialState: S
  )(
    f: (S, it.A) => (S, U)
  )(
    implicit bf: BuildFrom[Repr, U, That]
  ): That = {
    val factory = bf.toFactory(coll)
    val f2: ((S, U), it.A) => (S, U) = {
      case ((s, u), a) => f(s, a)
    } 
    it(coll)
      .iterator
      .scanLeft((initialState, null.asInstanceOf[U]))(f2)
      .drop(1)
      .map(_._2)
      .to(factory)
  }
}

BalmungSan · April 29, 2023, 1:34pm

For the record, cats has mapAccumulate which is essentially what you want.

PS: I also proposed to add it to the future stdlib: Add mapWithState · Issue #122 · scala/scala-library-next · GitHub

kwalcock · April 29, 2023, 4:12pm

Fold seems to result in a single value rather than a collection based on the original collection. Here it would be a U while I’m hoping for a List[U]. I could turn it around and use the list as the state and return the last state instead of the result from the fold, but that doesn’t seem quite right.

BalmungSan · April 29, 2023, 4:19pm

U can be List[U] that is not an issue.
Folding is the natural transformation of collections, it is the more general and flexible combinator that abstracts the concept of traversal (e.g. recursion). You can implement all operations in List using only foldLeft

kwalcock · April 29, 2023, 4:36pm

Cool! There are ~~a few~~ lots of things in there I would never have thought of without first diving deep into the implementation of the standard collections, like IsIterableOnce, BuildFrom, and toFactory and maybe even the null and drop. Writing code like this to get it properly integrated isn’t for mere mortals, I think. In the end, does this iterate through the collection just once despite both the scan and map?

BalmungSan · April 29, 2023, 4:38pm

Yes, because it is using an iterator which is lazy and only performs all the requested operations once you consume it with to(factory)

Cool! There are ~~a few~~ lots of things in there I would never have thought

Yeah, abstracting over collections is not really pleasant, is best for libraries rather than application code.

kwalcock · April 29, 2023, 4:42pm

Thanks for the tip. Most of the time what I’m working on doesn’t have an existing dependency on cats so that using mapAccumulate would be nearly free, but someday that may change. I’ll watch for it being added to the scala library and then being backported to scala-collection-compat.

kwalcock · April 29, 2023, 4:46pm

No doubt, but can isn’t the same as should. It looks like scan might be a better option than fold. Thanks for checking on the iterator!

BalmungSan · April 29, 2023, 4:59pm

I remember when I started to learn Scala, I really didn’t want to add cats as a dependency, yet almost every time I had a question I found an answer using cats. Until one day I said “no more” and just added it, best decision ever.
There is really no reason not to have cats in scope, is relatively small, and is just a bunch of utility methods (you can think of it as an extended stdlib), doesn’t imply any paradigm shift like cats-effect.

No doubt, but can isn’t the same as should.

Well, again, using List[U] as the accumulator of a fold is not only possible, is okay. There is no rule that says that you shouldn’t do that.
But yes, scanLeft may be a better fit in this case, I forgot that one was part of the stdlib.

philipschwarz · April 30, 2023, 11:33am

Hello @kwalcock,

FWIW, I had a go at writing code to map your sample input list to your expected output list:

def isOdd(n: Int): Boolean = n % 2 == 1

val input = List(1, 2, 3, 4, 5, 7, 8)
val expected =  List(1, -2, 3, -4, 5, -7, -8)

assert(
  input.foldLeft((false,List.empty[Int])){ case ((oddRun,ns), n) => 
    (isOdd(n), (if oddRun then -n else n) :: ns)}(1).reverse 
  == expected
)
                                                                                                                                                                                     
assert(
  (0::input zip input).foldLeft(List.empty[Int]){ case (acc,(m,n)) => 
    (if isOdd(m) then -n else n) :: acc }.reverse 
  == expected
)
                                                                                                                                                                                      
assert(
  input.scanLeft((false,0)){ case ((oddRun,_), n) => 
    (isOdd(n), if oddRun then -n else n)}.unzip.tail.head.tail 
  == expected
)
                                                                                                                                                                                      
assert(
  input.tail.scanLeft((true,input.head)){ case ((oddRun,_), n) => 
    (isOdd(n), if oddRun then -n else n)}.unzip.tail.head 
  == expected
)

assert(
  (0::input zip input).map{ case (m,n) => 
    if isOdd(m) then -n else n } 
  == expected
)

kwalcock · May 1, 2023, 2:46am

I really like this construction. It emphasizes that fact that the state is only dependent on the previous input value, at least in this example.

LannyRipple · May 4, 2023, 9:16pm

Iterator.unfold

Iterator.unfold(false -> List(1,2,3,4,5,7,8)){
  case (oddRun, values) =>
    values match {
      case Nil => None
      case value :: rest =>
        val newOddRun = value % 2 == 1
        val a = if (oddRun) -value else value
        val s = (newOddRun, rest)
        Option(a -> s)
    }
}.toList

// List(1, -2, 3, -4, 5, -7, -8)

kwalcock · May 5, 2023, 6:10am

That’s cool. I have to admit not being up to date with Scala 3 possibilities. It is nice to know when one sees things like map, that the incoming and outgoing collections are of the same size. Here I would have to inspect the code to make sure. Can’t have everything, I guess.