Defaults for map

I worked in a proprietary language a few years back which had a nice feature in its hash table implementation. When I create the hash table, I specify the value which should be returned when accessing a key not in the table. This was often useful. For example, if the hash table values are integers, it often makes sense to have 0 returned as default.

Is there some Map-like container in Scala which has this feature.

The closest thing that I know of is that I can use .getOrElse(key,default) which allows every call-site to specify a default value. It would sometimes make the code cleaner, if I could specify that default value at object construction time.

As I always say, the Scaladoc is your friend.

You can use withDefault or withDefaultValue.
The first takes a function which takes as input the key and has to return the default value.
The second one just takes a constant value, i.e. is the same as the first one with _ => v

3 Likes

Interesting, it’s not another type, it’s just a feature of the normal Map. Cool. I probably wouldn’t have thought of looking int he scaladoc of Map. I’d have tried to search for a type which extends Map.

There’s some discussion about whether Maps with default values should be redesigned: https://github.com/scala/bug/issues/8099

Also note that the default value only affects the apply method, not the get method. apply will return the default instead of throwing an exception. get will still return None if the key is not present in the Map.

3 Likes

Also note that the default value only affects the apply method, not the get method. apply will return the default instead of throwing an exception. get will still return None if the key is not present in the Map .

Because the default breaks this and other invariants it’s no longer a well-behaved map at all, but it’s just fine as a function. So I highly – highly – recommend ascribing the wider type K => V when you do this.

2 Likes

That’s really interesting, after withDefaultValue, apply and get return different values.

Sorry I don’t understand the comment about ascribing a wider type.

Here is an example of my use case.

  def makeAdj[V](edges: Seq[(V, V)]): Map[V, Set[V]] = {
    edges.foldLeft(Map[V, Set[V]]().withDefaultValue(Set())) {
      case (adj, (src, dst)) =>
        adj + (src -> (adj(src) + dst))
    }
  }

The use of withDefaultValue(Set()) prevents me from having to use src.getOrElse(src,Set()) on line 4. So I can add an element to the set, defaulting to adding to the empty set.

I would like to propose this alternative:

def makeAdj[V](edges: Seq[(V, V)]): Map[V, Set[V]] =
  edges.foldLeft(Map.empty[V, Set[V]]) {
    case (adj, (src, dst)) =>
      adj.updatedWith(key = src) {
        case Some(set) => Some(set + dst)
        case None      => Some(Set(dst))
      }
  }

Is that better than just using getOrElse ?

  def makeAdj_11[V](edges: Seq[(V, V)]): Map[V, Set[V]] = {
    edges.foldLeft(Map[V, Set[V]]()) {
      case (adj, (src, dst)) =>
        adj + (src -> (adj.getOrElse(src,Set()) + dst))
    }
  }

And what about the bidirectional graph case?

  def makeAdj_12[V](edges: Seq[(V, V)]): Map[V, Set[V]] = {
    edges.foldLeft(Map[V, Set[V]]().withDefaultValue(Set())) {
      case (adj, (src, dst)) =>
        adj +
          (src -> (adj(src) + dst)) +
          (dst -> (adj(dst) + src))
    }
  }

Right, sorry. My point is that the map you’re returning will be ill-behaved. In this case I would recommend adj.getOrElse(src, Set()) or something like @BalmungSan’s solution rather than withDefaultValue.

You might want to look at Cats at some point. The solution via Foldable is so short it’s hardly worth writing out a new definition.

def makeAdj[F[_]: Foldable, A](fa: F[(A, A)]): Map[A, Set[A]] =
  fa.foldMap { case (a, b) => Map(a -> Set(b)) }
1 Like

That’s a shame. The example is for a classroom of beginner Scala programmers. They have just finished a course in Graph Theory where they coded these same functions in Python. My hope was to show them how to build the adjacency map of a graph easily, using immutable data structures.

In my opinion, using getOrElse will work, but is confusing for beginners. Having a Map with default values is easy to think about.

I’ve never used Cats before. What do I need to do to bring Foldable and foldMap into scope?

You can chain the two updatedWith

def makeAdj[V](edges: Seq[(V, V)]): Map[V, Set[V]] =
  edges.foldLeft(Map.empty[V, Set[V]]) {
    case (adj, (src, dst)) =>
      adj.updatedWith(key = src) {
        case Some(set) => Some(set + dst)
        case None      => Some(Set(dst))
      }.updatedWith(key = dst) {
        case Some(set) => Some(set + src)
        case None      => Some(Set(src))
      }
  }

Or keep using withDefault the thing is, the returned Map will have the default which may not be what you want.

Or use getOrElse, which is less verbose than mine.

Or use the cats alternative proposed by Rob.
(for cats always add import cats.implicits._ and you will have everything in scope)
Also, you do not really need to make it that generic, you just need to:

import cats.data.NonEmptySet
import cats.implicits._
import cats.kernel.Order

def makeAdj[V : Order](edges: List[(V, V)]): Map[V, NonEmptySet[V]] =
  edges.foldMap {
    case (src, dst) =>
      Map(src -> NonEmptySet.one(dst), dst -> NonEmptySet.one(src))
  }

Which you can use like:

makeAdj(List('a' -> 'b', 'b' -> 'c', 'a' -> 'd', 'd' -> 'e')) 

res: Map[Char, NonEmptySet[Char]] = Map(
  'a' -> TreeSet('b', 'd'),
  'b' -> TreeSet('a', 'c'),
  'c' -> TreeSet('b'),
  'd' -> TreeSet('a', 'e'),
  'e' -> TreeSet('d')
)
1 Like

It is? That surprises me — can you expand on that?

1 Like

OK, I admit I don’t really know what is confusing to beginners. It is really hard to remember what it was like to not understand something.

But guessing that an expression like adj(key) to retrieve the value associated with the key should be intuitive. And if the student can assume that adj(key) returns a Set and we add to a Set using +, then the following expression should be pretty easy to understand.

        adj +
          (src -> (adj(src) + dst)) +
          (dst -> (adj(dst) + src))

Now if you say, “wait, adj(dst) doesn’t always return a Set, sometimes it throws an exception”, and to avoid the exception you have use the following, I think it is confusing.

        adj +
          (src -> (adj.getOrElse(src,Set()) + dst)) +
          (dst -> (adj.getOrElse(dst,Set()) + src))

So we have to handle the exception every time we access the Map. To me it is less confusing to specify once: avoid the exception by simply returning an empty set, and call sites can easily add to the set with +.

I’ll give the lecture to my students in two days. I’ll show them both approaches, and see which one they find most intuitive.

Do we know which situations the resulting Map has the correct default and in which cases it has the incorrect default? In my very simple test using + to add to a Map the default is retained correctly.

val m1 = Map("a" -> 1).withDefaultValue(Set())
val m2 = m1 + ("b" -> 2)
m2("") // returns Set()

val m3 = Map("z"-> 3).withDefaultValue(Set())
val m4 = m3 ++ m2
m4("") // returns Set()

(m4 - "a")("") // returns Set()

Including the implicits doesn’t seem to help.

Screenshot 2020-05-13 at 21.41.54

Error:(267, 24) not found: type Foldable
  def makeAdj_14[F[_]: Foldable, A](fa: F[(A, A)]): Map[A, Set[A]] =
Error:(268, 8) value foldMap is not a member of type parameter F[(A, A)]
    fa.foldMap { case (a, b) => Map(a -> Set(b)) }

But guessing that an expression like adj(key) to retrieve the value associated with the key should be intuitive. And if the student can assume that adj(key) returns a Set and we add to a Set using + , then the following expression should be pretty easy to understand.
Now if you say, “wait, adj(dst) doesn’t always return a Set , sometimes it throws an exception”, and to avoid the exception you have use the following, I think it is confusing.

What I did when I was self-learning Scala was just told myself that map.apply didn’t exist.
Given the key for which you are asking may not exists, calling map.get will return an Option, or you can call map.getOrElse as a shortcut for calling map.get(key).getOrElse(default).

Do we know which situations the resulting Map has the correct default and in which cases it has the incorrect default? In my very simple test using + to add to a Map the default is retained correctly.

Sorry if my message wasn’t clear.
The default will be preserved, what I meant was that the map returned by the method will preserve the default, which may not be what you want.
However, for your mental model, it is even a good idea that the returned map will always return an empty set for a bad key.

Including the implicits doesn’t seem to help.

Yeah, foldable is not an implicit, is a typeclass.
You find it on cats.Foldable, again the scaladoc is very useful.

But, as I show in my snippet, you do not need to abstract over any foldable, just ask for a List.
But if you want, just to make it more generic.

import cats.Foldable
import cats.data.NonEmptySet
import cats.implicits._
import cats.kernel.Order

def makeAdj[F[_] : Foldable, A : Order](edges: F[(A, A)]): Map[A, NonEmptySet[A]] =
  edges.foldMap {
    case (src, dst) =>
      Map(src -> NonEmptySet.one(dst), dst -> NonEmptySet.one(src))
  }

Why not just add functions then?

val map: Map[A,B] = ???
def getOrDefault(a: A) = map.getOrElse(a, 0)

I usually write extenders when it came to similar default values (like getOr0 or getOr1 or getOrEmpty). Btw in general using the apply method on a Map is not “safe” because of the exception. So either I use getOrElse, or get. Both of them are garantees an errorless execution. If I would teach others to this language this would be one of the first thing I would tell them (bcs this is one of the biggest power of the FP that the language introduces as a build in). The same with head vs headOption on lists, or fold vs reduce vs cats reduceOption or min vs cats minOption.

From your example I would go something like:

implicit class DefaultToOptionSet[A](o: Option[Set[A]]) {
  def getOrEmpty = o.getOrElse(Set.empty[A])
}

adj +
          (src -> (adj.get(src).getOrEmpty + dst)) +
          (dst -> (adj.get(dst).getOrEmpty + src))

It’s not as clean as a maps default, but it has a lots of adventage like:

  • you explicitly communicate what happens in the “error” case
  • you explicitly communicate that you thought about the “error” case (and that is not a showstopper)
  • the function not depending on the map default value (if you want to you can get that as a param/function)
  • the function will work the same for every Map[A, Set[B]] and will not prerequire a Map which has a default value set somewhere
  • you can use different default values for the same map in different parts of the code (like summing vs producting 0 vs 1 is a good default?)