# Ugliness of implementing Floyd-Warshall in Scala

I’m implementing the Floyd-Warshall multiple-source shortest path algorithm in Scala. I haven’t debugged it yet, so don’t copy my code and expect it to work.

Nevertheless, there are a couple of things I find exceedingly ugly in my implementation, but don’t know of a better way of doing them. Maybe someone can give me some advice???

The algorithm starts with a structure describing the graph: `adj:Map[A,Set[A]]` which specifies that any node of type `A` connects to zero or more other nodes of the same type `A`.
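For concreteness, a tiny (hypothetical, illustrative) instance of such an adjacency map might look like this:

```scala
// Hypothetical example graph with nodes of type String:
// a -> b, a -> c, b -> c, and c with no outgoing edges
val adj: Map[String, Set[String]] =
  Map(
    "a" -> Set("b", "c"),
    "b" -> Set("c"),
    "c" -> Set.empty[String]
  )
```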

The function, `computeDistancePredecessors`, computes two things: 1) `d`, a distance map, which maps each pair `(x,y)` of vertices to the integer distance from `x` to `y`; and 2) `pred`, a predecessor map, which maps each pair `(x,y)` of vertices to the predecessor of `y` along the shortest path from `x` to `y`.

Two things are ugly (in my opinion):

1. I’m modeling infinite distance as simply pairs missing from the Map. This has ugly consequences.

`val d` is effectively the initial value of a matrix indexed by pairs of vertices, implemented as a `Map[(A,A),Int]`. If vertex `x` is adjacent to vertex `y` in the graph, then `d(x -> y)` is the integer distance from `x` to `y`, and `d(x -> x) = 0` for every vertex `x`. In my implementation, if `x` and `y` are not adjacent, then `d.get(x -> y)` returns `None`.

Many implementations of Floyd-Warshall use a special value for INFINITY, which, as I understand it, Scala's `Int` does not support. This special value works with assignment, `+`, `<`, etc. in intuitive ways, e.g., INFINITY + x = INFINITY for all x.

My implementation is ugly because I have 5 cases, each one representing the situation where a different distance is INFINITY. The classic Floyd-Warshall reduces these 5 cases to just 2.
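For reference, a sketch of my own (not part of the implementation below) of how one might recover those intuitive INFINITY semantics by lifting the arithmetic to `Option[Int]`, with `None` playing the role of INFINITY:

```scala
// Sketch: None represents INFINITY.
// Addition: INFINITY + anything = INFINITY
def addInf(a: Option[Int], b: Option[Int]): Option[Int] =
  for { x <- a; y <- b } yield x + y

// Comparison: every finite value is less than INFINITY,
// and INFINITY is never strictly less than anything
def lessInf(a: Option[Int], b: Option[Int]): Boolean = (a, b) match {
  case (Some(x), Some(y)) => x < y
  case (Some(_), None)    => true
  case _                  => false
}
```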

2. I am using two nested `flatMap` calls to immutably derive the next iteration of `d` from the current `d`. There is one iteration per value of `k`, which is realized in my code as `adj.foldLeft(d)(loop)`.

The ugly thing is that I also need to derive a new value of `pred` on each call to `loop`. So I compute `pred = pred + ((row -> col) -> pred(k -> col))`, which modifies the free `var`.

I don’t know how to do this better, as I don’t know how to make two nested calls to `flatMap` take a pair of `Map`s and construct a new pair of `Map`s, whereas normally `flatMap` of `flatMap` takes a `Map` and constructs a new `Map`.

There is a third, smaller issue, described elsewhere: nested calls to `flatMap` unfortunately cannot be abbreviated here using the `for` construct.

```scala
object FloydWarshall {

  val d: Map[(A, A), Int] =
    for {
      (src, dsts) <- adj
      dst <- dsts + src
    } yield (src -> dst) -> (if (src == dst) 0 else weight(Edge(src, dst)))

  var pred: Map[(A, A), A] =
    for {
      (src, dsts) <- adj
      dst <- dsts
    } yield (src, dst) -> src

  def loop(d: Map[(A, A), Int], k: A): Map[(A, A), Int] = {
    val emptyMap: Map[(A, A), Int] = Map()

    adj.flatMap { case (row, _) =>
      adj.flatMap { case (col, _) =>
        if (d.get(row -> col).isEmpty) {
          if (d.get(row -> k).nonEmpty && d.get(k -> col).nonEmpty) {
            pred = pred + ((row -> col) -> pred(k -> col)) // affecting the free var pred
            Map((row -> col) -> (d(row -> k) + d(k -> col)))
          } else { // either row->k or k->col is INFINITY
            emptyMap
          }
        } else { // d(row -> col) is defined, i.e. the distance is finite
          if (d.get(row -> k).isEmpty || d.get(k -> col).isEmpty) { // either row->k or k->col is INFINITY
            Map((row -> col) -> d(row -> col))
          } else if (d(row -> k) + d(k -> col) < d(row -> col)) {
            pred = pred + ((row -> col) -> pred(k -> col)) // affecting the free var pred
            Map((row -> col) -> (d(row -> k) + d(k -> col)))
          } else {
            Map((row -> col) -> d(row -> col))
          }
        }
      }
    }
  }

  def shortestPath[A](adj: ADJ[A], src: A, dst: A)(weight: Edge[A] => Int): Option[List[A]] = {
    ???
  }
}
```

Which language does? IEEE floating point has positive/negative infinity (which for most use cases is an error condition rather than a proper value), but I’m not aware of a language with a (bounded or unbounded) integer type that contains an infinity value. (This may just be due to my lack of knowledge, though.)

I’d rather use pattern matching on the option value pairs than cascading `#isEmpty` comparisons.

Theoretically you could provide `Ordering[Option[Int]]` and `Numeric[Option[Int]]`, where `None` represents infinity, or come up with a custom `InfInt` type and provide `Ordering`/`Numeric` for it. But honestly, this all sounds like a bit of overkill for the use case - I’d go with pattern matching.
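A minimal sketch of such an `Ordering[Option[Int]]` with `None` as infinity (hypothetical, just to illustrate the suggestion) could look like:

```scala
// None is treated as positive infinity: larger than every Some
val infOrdering: Ordering[Option[Int]] = new Ordering[Option[Int]] {
  def compare(a: Option[Int], b: Option[Int]): Int = (a, b) match {
    case (Some(x), Some(y)) => Integer.compare(x, y)
    case (Some(_), None)    => -1 // finite < infinity
    case (None, Some(_))    => 1  // infinity > finite
    case (None, None)       => 0
  }
}
```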


In a dynamically typed language, I’d just come up with a special value and then define local functions `+` and `<` which do the right things for that special value.

I thought about defining local Scala functions here to make `+` and `<` work on `Option[Int]`.

Your suggestion to pattern match on a tuple like `(d.get(row->col), d.get(row->k), d.get(k->col))` is not a bad idea at all. That way I can be sure I didn’t miss a case.

I originally wrote the cascading ifs because I wasn’t 100% sure what all the cases were. I did sort of a binary search for cases. I thought I might also need cases for `pred.get(...)` but in the end I didn’t.

Basically that’s what providing `Ordering`/`Numeric` would do, in a more structured way.

@sangamon, it turned out the suggestion to rewrite the cascading ifs as pattern matching was a good one. I’m not sure whether it was just the rewriting or the fact that I changed it to pattern matching, but I found a bug in the code. I have to test `d(row, k) + d(k, col) < d(row, col)` in the case that they are all numbers.

```scala
def computeDistancePredecessors[A](adj: ADJ[A])(weight: Edge[A] => Int): (Map[(A, A), Int], Map[(A, A), A]) = {

  val d: Map[(A, A), Int] =
    for {
      (src, dsts) <- adj
      dst <- dsts + src
    } yield (src -> dst) -> (if (src == dst) 0 else weight(Edge(src, dst)))

  var pred: Map[(A, A), A] =
    for {
      (src, dsts) <- adj
      dst <- dsts
    } yield (src, dst) -> src

  def loop(d: Map[(A, A), Int], k: A): Map[(A, A), Int] = {
    val emptyMap: Map[(A, A), Int] = Map()

    adj.flatMap { case (row, _) =>
      adj.flatMap { case (col, _) =>
        (d.get(row -> col), d.get(row -> k), d.get(k -> col)) match {
          case (None, Some(rowk), Some(kcol)) =>
            pred = pred + ((row -> col) -> pred(k -> col))
            Map((row -> col) -> (rowk + kcol))

          case (None, _, _) => emptyMap

          // previous version neglected to check < here
          case (Some(rowcol), Some(rowk), Some(kcol)) if rowk + kcol < rowcol =>
            pred = pred + ((row -> col) -> pred(k -> col))
            Map((row -> col) -> (rowk + kcol))

          case (Some(rowcol), _, _) =>
            Map((row -> col) -> rowcol)
        }
      }
    }
  }
}
```

Is there some elegant way to eliminate the two side-effecting assignments?
`pred = pred + ((row -> col) -> pred(k -> col))`

In the Floyd-Warshall algorithm, every time we find a shorter path we have to update the distance mapping (via the `flatMap` here) and also update the predecessor matrix to reflect that we found a better path.

I have made my own version :] No variables. No mutable structures. Only the I/O and input parsing are imperative.

```scala
object FloydWarshall {
  val example: String =
    """% "standard example" from https://www-m9.ma.tum.de/graph-algorithms/spp-floyd-warshall/index_en.html#tab_tg
      |n 120 220
      |n 240 220
      |n 360 220
      |n 480 340
      |n 360 340
      |n 240 340
      |n 120 340
      |n 240 100
      |e 0 1 4
      |e 1 2 9
      |e 0 6 7
      |e 0 7 4
      |e 1 7 1
      |e 7 2 3
      |e 1 6 8
      |e 6 1 4
      |e 6 5 7
      |e 1 5 6
      |e 5 4 6
      |e 4 5 5
      |e 4 3 6
      |e 2 4 10
      |e 4 2 8
      |""".stripMargin

  def main(args: Array[String]): Unit = {
    val edges = example.linesIterator
      .filter(_.startsWith("e "))
      .foldLeft(Map.empty[(Int, Int), (Int, Int)]) { (map, line) =>
        val Array(from, to, weight) = line.split(' ').tail
        map.updated(from.toInt -> to.toInt, weight.toInt -> to.toInt)
      }
    val nodes: Set[Int] =
      edges.keysIterator.flatMap(pair => Seq(pair._1, pair._2)).toSet
    val shortestPaths = {
      val paths0 =
        nodes.foldLeft(edges)((paths0, v) => paths0.updated(v -> v, 0 -> v))
      nodes.foldLeft(paths0)((paths1, k) => {
        nodes.foldLeft(paths1)((paths2, i) => {
          nodes.foldLeft(paths2)((paths3, j) => {
            tryIndirectPath(paths3, k, i, j)
          })
        })
      })
    }
    for (y <- nodes.toList.sorted) {
      for (x <- nodes.toList.sorted) {
        shortestPaths
          .get(y -> x)
          .fold(print("  ~ ")) { case (num, _) => print(f"$num%3d ") }
      }
      println()
    }
    println()
    for (y <- nodes.toList.sorted) {
      for (x <- nodes.toList.sorted) {
        shortestPaths
          .get(y -> x)
          .fold(print("  ~ ")) { case (_, vert) => print(f"$vert%3d ") }
      }
      println()
    }
  }

  def tryIndirectPath(paths: Map[(Int, Int), (Int, Int)],
                      k: Int,
                      i: Int,
                      j: Int): Map[(Int, Int), (Int, Int)] = {
    val directOpt = paths.get(i -> j)
    val indirectOpt = for {
      (distIK, nextIK) <- paths.get(i -> k)
      (distKJ, _)      <- paths.get(k -> j)
    } yield {
      (distIK + distKJ) -> nextIK
    }
    val updatedOpt = (for {
      direct   <- directOpt
      indirect <- indirectOpt
      if direct._1 > indirect._1
    } yield {
      indirect
    }).orElse(directOpt).orElse(indirectOpt)
    updatedOpt.fold(paths)(updated => paths.updated(i -> j, updated))
  }
}
```

What are the n’s and the e’s in your version?

I think they are:

```
n <x coord> <y coord>
e <source node> <target node> <edge weight>
```

I have ignored node descriptions as I don’t need coordinates (because I don’t plot the graph).

Ahh, the n’s just give instructions about how to plot the i’th node?

It seems so. If I move the nodes and download the data again then the numbers in lines starting with "n " change.

So I don’t immediately follow your code. But is it calculating both the distance matrix and also the predecessor matrix?

I’m guessing that `tryIndirectPath` is a mapping from (src,dst) to (weight, predecessor) ?

If so, that’s an interesting idea to eliminate my need for the `var pred`. That only works because, coincidentally, the two matrices are the same size and have the same indices. Lucky coincidence. But that totally won’t work with my `flatMap` approach. At least I don’t see how.

Yes.

`tryIndirectPath` updates the paths matrix. It takes the current paths matrix, checks whether going from `i` to `j` through `k` improves anything, and returns either the original paths matrix (if there’s no improvement) or the updated one.

Why wouldn’t it work? Try doing the same thing as me, i.e. output predecessors together with distances. I haven’t checked my predecessor matrix code, though. The website you’ve posted only shows final distances, not predecessors.

Your `tryIndirectPath` is pretty amazing. Folding over an option hurts my brain.
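For readers with the same brain-ache: `Option#fold(ifEmpty)(f)` returns the default when the option is empty and applies the function to the contained value otherwise. A quick illustration:

```scala
val present: Option[Int] = Some(41)
val absent: Option[Int]  = None

// fold(default)(f): default if None, f(value) if Some
val a = present.fold(0)(_ + 1) // 42
val b = absent.fold(0)(_ + 1)  // 0
```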


I say it wouldn’t work because of the following argument. Suppose `f` is a function `A => M[A]` and `ms` has type `M[A]`; then `ms.flatMap(f)` produces an `M[A]`. Now let `g` be a function `A => N[B]`, and let `mns` be a tuple of type `(M[A], N[A])`; then `mns.flatMap { a => (f(a), g(a)) }` will NOT return an object of type `(M[A], N[A])`; in fact it won’t even compile.

I’ll have to rethink the nested loops to use `foldLeft`, like your code does.

On the other hand, looking at your implementation DOES give me better intuitive insight into how Floyd-Warshall works. I’m pretty sure that with that added insight, and by looking at your code more carefully, I can come up with something that I’m happy with. Ideally I’d like to be able to show the code to someone who is not a Scala programmer.

I’ve updated `computeDistancePredecessors` to use `foldLeft` rather than `flatMap`, and the code becomes much much simpler.

In my opinion, this fulfills my request of removing the ugliness. I would dare to say that the code is almost readable, even to someone who doesn’t know Scala.

```scala
def computeDistancePredecessors[A](adj: ADJ[A])(weight: Edge[A] => Option[Int]): (Map[(A, A), Int], Map[(A, A), A]) = {

  val d: Map[(A, A), Int] =
    for {
      (src, dsts) <- adj
      dst <- dsts + src
      w <- if (src == dst) Some(0) else weight(Edge(src, dst))
    } yield (src -> dst) -> w

  val pred: Map[(A, A), A] =
    for {
      (src, dsts) <- adj
      dst <- dsts
    } yield (src, dst) -> src

  // vertices is the set of all vertices of the graph
  vertices.foldLeft(d -> pred) { case ((d, pred), k) =>
    vertices.foldLeft(d -> pred) { case ((d, pred), row) =>
      vertices.foldLeft(d -> pred) { case ((d, pred), col) =>
        val rc = row -> col
        (d.get(rc), d.get(row -> k), d.get(k -> col)) match {
          case (Some(rowcol), Some(rowk), Some(kcol)) if rowk + kcol < rowcol =>
            // we had an old distance, but we have a better distance via vertex k
            (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))

          case (None, Some(rowk), Some(kcol)) =>
            // old distance was INFINITY, but we have a new distance via vertex k
            (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))

          case (_, _, _) => (d, pred)
        }
      }
    }
  }
}
```
```scala
case (Some(rowcol), Some(rowk), Some(kcol)) if rowk + kcol < rowcol =>
  // we had an old distance, but we have a better distance via vertex k
  (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))

case (None, Some(rowk), Some(kcol)) =>
  // old distance was INFINITY, but we have a new distance via vertex k
  (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))
```

that probably could be optimized into:

```scala
case (rowcolOpt, Some(rowk), Some(kcol)) if rowcolOpt.forall(_ > rowk + kcol) =>
  (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))
```

Also `case (_, _, _) => (d, pred)` can be replaced by `case _ => (d, pred)`

It would also be cool if there were a ready-to-use function that produces iterators of cartesian products, so that code like:

```scala
vertices.foldLeft(d -> pred) { case ((d, pred), k) =>
  vertices.foldLeft(d -> pred) { case ((d, pred), row) =>
    vertices.foldLeft(d -> pred) { case ((d, pred), col) =>
      <body>
    }
  }
}
```

could be turned into

```scala
val triples: Iterator[(A, A, A)] = Cartesian(vertices, vertices, vertices)
triples.foldLeft(d -> pred) { case ((d, pred), (k, row, col)) =>
  <body>
}
// or inlined version
Cartesian(vertices, vertices, vertices).foldLeft(d -> pred) {
  case ((d, pred), (k, row, col)) =>
    <body>
}
```
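As far as I know such a `Cartesian` helper doesn’t exist in the standard library, but a minimal sketch (my own, matching the hypothetical signature used above) could be:

```scala
// Hypothetical helper producing an iterator over the 3-way cartesian product.
// The inner .iterator calls are re-evaluated on each step of the outer
// generators, so the collections are traversed afresh as required.
object Cartesian {
  def apply[A](xs: Iterable[A], ys: Iterable[A], zs: Iterable[A]): Iterator[(A, A, A)] =
    for {
      x <- xs.iterator
      y <- ys.iterator
      z <- zs.iterator
    } yield (x, y, z)
}
```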

With all the improvements the loop instead of being this:

```scala
vertices.foldLeft(d -> pred) { case ((d, pred), k) =>
  vertices.foldLeft(d -> pred) { case ((d, pred), row) =>
    vertices.foldLeft(d -> pred) { case ((d, pred), col) =>
      val rc = row -> col
      (d.get(rc), d.get(row -> k), d.get(k -> col)) match {
        case (Some(rowcol), Some(rowk), Some(kcol)) if rowk + kcol < rowcol =>
          // we had an old distance, but we have a better distance via vertex k
          (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))

        case (None, Some(rowk), Some(kcol)) =>
          // old distance was INFINITY, but we have a new distance via vertex k
          (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))

        case (_, _, _) => (d, pred)
      }
    }
  }
}
```

would be this:

```scala
Cartesian(vertices, vertices, vertices).foldLeft(d -> pred) {
  case ((d, pred), (k, row, col)) =>
    val rc = row -> col
    (d.get(rc), d.get(row -> k), d.get(k -> col)) match {
      case (rowcolOpt, Some(rowk), Some(kcol)) if rowcolOpt.forall(_ > rowk + kcol) =>
        (d + (rc -> (rowk + kcol)), pred + (rc -> pred(k -> col)))
      case _ => (d, pred)
    }
}
```

Also `case (_, _, _) => (d, pred)` can be replaced by `case _ => (d, pred)`

Is that an optimization or simply a style issue? If it is only style, I prefer `(_,_,_)`, as I can see the grouping at a glance, and it emphasizes that the only possibility is a triple rather than some other sort of value.

Is this `Cartesian` idea a common idiom? If not, I prefer (for my application) the three loops, as they tend to look like the pseudocode for Floyd-Warshall which a student would find with any search.

Another part of my motivation is that I’ll be giving a course in graph theory where students learn about single- and multiple-source shortest path search. Most texts they look at will present a mutable, imperative algorithm in pseudocode. I want to enhance their experience by showing a functional approach, even if (especially if) the students are not intimately familiar with functional programming. I don’t expect the students to go away and program it themselves functionally, but I’d like them to have some appreciation for how elegant and understandable it can be when done without mutation.