# Merging Scala Maps Efficiently

I have two Scala Maps and I wanted to merge them. Signature of Maps look like this :

case Person(name: String, dob: Option[Int] = None, ssn: Option[Int] = None)

Map[Int, Person]. Person is Case class object.

1.Map((1->Person(name=“abc”,dob=None, ssn=1234)),(2->Person(name=“pqr”,dob=22900123, ssn=None)) )
2.Map((1->Person(name=“abc”,dob=11111, ssn=None)),(4->Person(name=“mno”,dob=22900123, ssn=None)))

I have these two maps and want to merge them. End map should have missing information from colliding keys.
Result will be :

Map((1->Person(name=“abc”,dob=11111, ssn=1234)), (2->Person(name=“pqr”,dob=22900123, ssn=None)), (4->Person(name=“mno”,dob=22900123, ssn=None)))

As both map has Key =1 and while merging Person information is merged if it is not available in any of the Map
and for any other case I should be adding it into final Map.

To merge the actual `dob` and `ssn` fields on collision, you can just use `orElse`:
`dob1.orElse(dob2)` will do what you want (favoring `dob1` if both are defined).

The harder part is merging the maps efficiently. You could just iterate over the second map and add its elements one-by-one, and I think this has the right asymptotic cost, but there are more efficient alternatives.

I had this problem once, and a good solution is not trivial. I forgot too many details to give a good answer quickly, but I remember that the best efficient answer I found used `HashMap.merged`:

def merged[B1 >: B](that: HashMap[A, B1])(mergef: MergeFunction[A, B1]): HashMap[A, B1]
Creates a new map which is the merge of this and the argument hash map.

Uses the specified collision resolution function if two keys are the same. The collision resolution function will always take the first argument from this hash map and the second from that.

The merged method is on average more performant than doing a traversal and reconstructing a new immutable hash map from scratch, or ++.

That does not work on arbitrary `Map` but you can convert your `Map` to `HashMap` (or change your code to give you `HashMap`, quite possibly more efficient).

That method takes a function that handles collisions… but which, sometimes, will be called with `null` arguments when it shouldn’t (which seems a bug but I never reported it), and then you need a workaround (see my code). In the end, my implementation was the one here—that’s implementing bags so it’s not quite what you need but I hope it helps.

1 Like

`https://github.com/inc-lc/ilc-scala/blob/9a158ae7757eab62b91ef2fd2196adda810f6ad5/src/main/scala/ilc/feature/bags/Library.scala#L11-L37`

And here’s the link to docs: HashMap.merged docs

@Blaisorblade : Hey Thank you for helping. As I’m pretty new to the Scala, Could you please provide actual solution or function (even it is not efficient) using the one you were saying by `dob1.orElse(dob2)`. At least I can learn what you are thinking there.
By the way thank you again.

The piece involving `orElse` is:

``````def mergePersons(p1: Person, p2: Person) =
Person(p1.name, p1.dob.orElse(p2.dob), p1.ssn.orElse(p2.ssn))
``````

1 Like

For anyone else following the docs for HashMap.merged

``````  type HashMap[A, +B]

def merged[B1 >: B](that: HashMap[A, B1])(mergef: MergeFunction[A, B1]): HashMap[A, B1]

// with
type MergeFunction[A1, B1] = ((A1, B1), (A1, B1)) => (A1, B1)
``````

Hello,

I once did something like this (not sure how it performs ):

val mergeValues : (V, V) => V = …
val map1 : Map[K, V] = …
val map2 : Map[K, V] = …
val jointKeys = map1.keys.intersect(map2.keys)
val overlap = jointKeys.map(key => (key, mergeValues(map1(key),
map2(key))).toMap
val merged = overlap ++ (map1 – jointKeys) ++ (map2 – jointKeys)

`` Best, Oliver``