Common idiom with groupBy

It is fairly common that I have a sequence of pairs and I want to collect the _._2 values but grouped by the _._1 values.

E.g.,

val data = List((1,"a"),(2,"ab"),(1,"b"),(3,"abc"),(2,"ba"))

data.groupBy(_._1)

This returns something like: Map(1 -> List((1,"a"),(1,"b")), 2 -> List((2,"ab"),(2,"ba")), 3 -> List((3,"abc")))

The problem is that each accumulated value still carries its key, which is redundant in many cases.

Is there some cousin function of groupBy which I can call like this?

data.groupKeysBy(_._1,_._2)

which will map the 2nd function over the values of the map before collecting them?

Map(1 -> List("a","b"), 2 -> List("ab","ba"), 3 -> List("abc"))

Of course such a function is pretty easy to write for a given set of types. But I think it is exceedingly difficult if you want it to work on any types which groupBy works on.
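For the easy case, here is a minimal sketch fixed to List. The name groupKeysBy is hypothetical (it is the wished-for function from above, not something in the standard library), and the two parameter lists are only there so the element type is inferred before the lambdas are checked.

```scala
// Hypothetical helper, not part of the standard library; fixed to List for simplicity.
def groupKeysBy[A, K, V](xs: List[A])(key: A => K, value: A => V): Map[K, List[V]] =
  xs.groupBy(key).map { case (k, group) => k -> group.map(value) }

val data = List((1, "a"), (2, "ab"), (1, "b"), (3, "abc"), (2, "ba"))

// Group by the first component, keep only the second component.
val grouped: Map[Int, List[String]] = groupKeysBy(data)(_._1, _._2)
```

This only covers List; making it work for every collection that supports groupBy is the hard part.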


I think Map.mapValues is very close to what I want.
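A rough sketch of what I mean, assuming Scala 2.13 or later, where Map.mapValues is deprecated in favour of .view.mapValues (on 2.12 and earlier, plain .mapValues would be the equivalent call). The name viaMapValues is just for illustration.

```scala
val data = List((1, "a"), (2, "ab"), (1, "b"), (3, "abc"), (2, "ba"))

// Group by the key first, then strip the key out of each grouped pair.
// .view.mapValues is lazy, so .toMap is used to force a strict Map.
val viaMapValues: Map[Int, List[String]] =
  data.groupBy(_._1).view.mapValues(_.map(_._2)).toMap
```

It still builds the intermediate lists of pairs before discarding the keys, so it is close to, but not quite, a single-pass solution.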

In Scala 2.13 there is groupMap:

scala> data.groupMap(_._1)(_._2)
res32: scala.collection.immutable.Map[Int,List[String]] = Map(1 -> List(a, b), 2 -> List(ab, ba), 3 -> List(abc))

There’s also a groupMapReduce if you want to take it a step further.
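As a small sketch of groupMapReduce on the same data (Scala 2.13 or later; the val names are only illustrative): the first function picks the key, the second maps each element, and the third combines the mapped values within each group.

```scala
val data = List((1, "a"), (2, "ab"), (1, "b"), (3, "abc"), (2, "ba"))

// Concatenate the strings that share a key.
val concatenated: Map[Int, String] =
  data.groupMapReduce(_._1)(_._2)(_ + _)

// Count how many elements share each key.
val counts: Map[Int, Int] =
  data.groupMapReduce(_._1)(_ => 1)(_ + _)
```

Here concatenated should come out as Map(1 -> "ab", 2 -> "abba", 3 -> "abc") and counts as Map(1 -> 2, 2 -> 2, 3 -> 1), without building the intermediate lists.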


Looks like I’m not the only one who needs this idiom. :)