What combinator to use? scanLeft but not every iteration

Given a list of integers

Seq(2,4,6,1,2,2,3,7,4,4,4)

produce a new list where all consecutive even numbers are summed, and intervening odd numbers are kept unchanged.

Seq(2,4,6,1,2,2,3,7,4,4,4).fn(???) = Seq(12,1,4,3,7,12)
// ((2 + 4 + 6) = 12, 1, (2 + 2) = 4, 3, 7, (4 + 4 + 4) = 12)

It seems to be a mix of scanning and takeWhile, except the scan combinators take snapshots every iteration while we need to only take one snapshot per group of consecutive even numbers.

1 Like

Here’s a take at it doing the recursion explicitly:

1 Like

I don’t think there’s one specific combinator for this use case. You could probably use foldleft or foldRight, but it will look just like @anqit’s solution without the explicit recursion.

With span you could simplify your custom combinator a bit:

def combine(list: List[Int]): List[Int] = {
  val (evens, rest) = list.span(_ % 2 == 0)
  if (evens.nonEmpty) evens.sum :: combine(rest)
  else if (rest.nonEmpty) rest.head :: combine(rest.tail)
  else Nil
}
2 Likes

TIL List#span, cool!
I tried a solution with foldLeft/Right, but the issue I had was not knowing when you were at the last element of the list (in either direction) and being able to handle adding the last element.

That is indeed a problem. It’s probably not worth the effort to try to fit it in foldLeft.

1 Like

FWIW, had a go at this approach:

def f(list: List[Int]) = LazyList.unfold(list){
  case Nil => None
  case x::xs if x % 2 == 1 => Some(x, xs)
  case xs => xs.span(_ % 2 == 0) match { case (even, rest) => Some(even.sum, rest) }
}.toList

val ints = List(2,4,6,1,2,2,3,7,4,4,4)

assert(f(ints) == List(12,1,4,3,7,12))

1 Like

TIL LazyList.unfold claims to be “significantly simpler than Iterator.unfold”, but I guess that is in the eye of the beholder.

Arranging deck chairs:

def evenSum(xs: List[Int]): List[Int] =
  List.unfold(xs):
    _.span(_ % 2 == 0) match
    case (Nil, x :: xs) => Some(x, xs)
    case (Nil, _)       => None
    case (evens, xs)    => Some(evens.sum, xs)

@main def test = println:
  evenSum:
    List(2,4,6,1,2,2,3,7,4,4,4)

I was curious to see if this compiled, because of recent discussion about PartialFunction synthesis:

def pf: PartialFunction[List[Int], Option[(Int, List[Int])]] =
    _.span(_ % 2 == 0) match
    case (Nil, x :: xs) => Some(x, xs)
    case (Nil, _)       => None
    case (evens, xs)    => Some(evens.sum, xs)

It compiles but (in file even-sums.scala):

âžś  snips scala-cli run --server=false -S 3.7.0 even-sums.scala
Exception in thread "main" java.lang.ClassFormatError: Duplicate method name "even$minussums$package$$$_$_$$anonfun$$anonfun$1" with signature "(I)Z" in class file even$minussums$package$

Oh I bet that is this recent issue.

With a symbol like even$minussums$package$$$_$_$$anonfun$$anonfun$1, it must be “code golfing for dollars”.

3 Likes

Sometimes questions like these appear in a mutable context and just get carried over to an immutable one without really deciding whether it’s the kind of thing one would do.

If Scala had a built-in group-by-equivalence-class method, which it doesn’t, then it would be a simple matter of expressing that even values are equivalent to each other and odd values aren’t equivalent to anything. (e.g. (a % 2) | (b %2) == 0 would be the equivalence criterion).

You can fake it with scan.

val ecs = xs.scan(0){ (c, x) => if x % 2 == 0 then (c + 1) & ~1 else (c + 1) | 1 }.drop(1)

(The odd bit math is there to make sure odd values have odd equivalence class numbers which keep increasing, while even values advance to an even equivalence class number and then stay there.)

Now we can use groupBy to actually gather the elements.

val sums = (ecs zip xs).groupBy(_._1).mapValues(_.map(_._2).sum).toSeq.sortBy(_._1).map(_._2)

So, overall, it’s kind of a two-liner. Not a very pretty two-liner, granted.

But it really ought to be

xs.chunkWith((a, b) => ((a % 2) | (b % 2)) == 0).map(_.sum)

Given that the scan/groupBy version is pretty messy, it’s worth considering

var sum = 0
val ysb = Seq.newBuilder[Int]
for (a, b) <- xs.zipAll(xs.drop(1), 0, 1) do
  if a % 2 == 0 then
    sum += a
    if b % 2 != 0 then ysb += sum
  else
    sum = 0
    ysb += a
val ys = ysb.result

(If you had an indexable sequence, you’d probably index instead of using the zipAll trick.)

2 Likes

The context is a parser, where I am taking a string like "hello world friend " and turning it into the sequence Seq("hello", " ", " ", "world", " ", "friend", " ", " ", " ")

The “even/odd numbers” analogy is:
“even” → consecutive letters (not spaces)
“odd” → the spaces themselves.

So it’s like, string.takeWhile(notSpace) but takeWhile would stop immediately when " " is found and not include that " "; I want it to keep going throughout the entire string, concatenating letters but keeping spaces unchanged.

Seq(
  'h' + 'e' + 'l' + 'l' + 'o',
  " ",
  " ",
  'w' + 'o' + 'r' + 'l' + 'd',
  " ",
  'f' + 'r' + 'i' + 'e' + 'n' + 'd',
  " ", 
  " ",
  " "
)

I can solve this using recursion but prefer a 1-liner.

1 Like

Parsers run like molasses if you’re treating them a character at a time. Now, if you’re not parsing much, it’s okay if they run like molasses.

If you actually have a parsing problem, you probably should use a parsing library like fastparse.

If you want to chop strings into pieces, use the .split method on String.

If you’re just using this to gain experience, well, congrats! You have discovered that it is a bit frustrating to not have a chunk/fragment method.

1 Like

It’s for a compiler. I’m not sure how it would be logically possible to avoid parsing 1 character at a time. Every compiler I’ve seen does this in the lexer, except they usually do it imperatively with i and j vars within a while. Isn’t it definitional to the problem that you have to scan each individual character? I would assume those parse libraries just abstract over that process.

Thanks! I remembered you said on gitter several years ago that you write this all the time, and in fact I looked in kse just in kase.

My suggestion for a pronunciation, however, since you mention on the project page that it has none, would be KerrSE, like an acronym out of a James Bond movie.

Do it in Array[Char] and String, so you’re using primitives (efficiently). Or use a library like fastparse, which is a little slower but a lot easier and uses primitives wherever possible. Trying to use generic immutable sequence operations, where you have to individually box every character, is not the way to do it. For compiler parsers you generally can’t afford a 100x slowdown.

2 Likes

I don’t remember that, but I do have diced on my set of array operations (in basics/src/Data.scala), and I can find the boundaries with whereOp. So when writing it myself, I’d

val boundaries = xs.whereOp: (x, i) =>
  if x % 2 == 1 || (i > 0 && xs(i-1) % 2 == 1) then i else -1
xs.diced(boundaries, "[)").map(_.sum)
1 Like

Just for fun, you can also use a fs2.Stream:

import fs2.Stream
import cats.syntax.all._

val list = List(2,4,6,1,2,2,3,7,4,4,4)

Stream.emits(list).groupAdjacentBy(_ % 2).flatMap {
  case (0, chunk) => Stream.emit(chunk.sumAll)
  case (_, chunk) => Stream.chunk(chunk)
}.toList
4 Likes

You really only need to find the beginning.

xs.foldLeft(List.empty[Int]) {
    case (Nil, n) => List(n)
    case (h :: ts, n) if h % 2 == 0 && n % 2 == 0 => n + h :: ts
    case (h :: ts, n) => n :: h :: ts
}
  .reverse
1 Like