I’d like to recount an experience where I think I came up with a good solution using Scala. The implementation uses classical tail recursive approach, because that seems most elegant to me. However, I’m open to criticism about whether the code might be better (by some definition of better) using some library. I don’t really see how to do this with fold
or reduce
.
What is the problem to solve?
I have a graph (graph in the mathematical sense). The graph consists of a finite set of vertices and a finite set of directed, labeled edges. However, I don’t yet know the graph. Rather I have a starting vertex, and a function which when given a vertex will return a set of edges, each having a label and a destination vertex.
So in Scala I have a starting vertex v0
of type V
, and I have a function edges
of type V=>Seq[(L,V)]
. So V
is the type of the vertex and L
is type type of the label.
Now, since the edges
function is computationally difficult to compute (for my application), and because I’d like to easily be able to iterate through all the vertices, I’d like to generate a data structure which helps me do that. I.e., a data structure which contains all the vertex/edge information which the edges
function is able to compute.
Furthermore, I would like to use an array with integer indices in the end so I can access any vertex in constant time.
So my data structure should contain an array, vs
of vertices so that vs(n)
is vertex n
. And an array, es
, of edge sequences so that es(n)
is a sequence of all edges with origin n
.
My final code is given below. Several observations about the experience of writing this function.
- I had to think a couple of days how to factor the problem. What made it doable (for me) was forgetting the specific types of in the graph, and the specific edge generation function, and just think about it as an abstract graph.
- I’m using parameterized types, not make the function reusable, but simply to make it more understandable.
- Internal to the function there is a tail recursive helper function which uses
Map
to associate integers with vertices, and vertices with integers. This is easier for debugging, but theyMap[Int,X]
objects are converted toVector[X]
in the final return value - I use a single tail recursive helper function to achieve a loop-within-a-loop iteration. The outer loop recurs on vertices; the inner loop recurs on edges of that vertex.
- the graph is neither traversed in BFS nor DFS order, but rather in random order. Each time we simply traverse into the vertex which is in one
Map
but not in the other. So finally all vertices are in bothMap
s, as they are finitely many such vertices. This code is concise, but sadly not efficient.
def traceGraph[V, L](v0: V, edges: V => Seq[(L, V)]): (Vector[V], Vector[Seq[(L, Int)]]) = {
val s0: Seq[(L, Int)] = Seq()
@tailrec
def recur(v: V,
i: Int,
es: List[(L, V)],
intToV: Map[Int, V],
vToInt: Map[V, Int],
m: Map[Int, Seq[(L, Int)]]): (Map[Int,V], Map[Int,Seq[(L, Int)]]) = {
es match {
case (label, v1) :: lvs if !vToInt.contains(v1) =>
recur(v, i + 1, es, intToV + (i -> v1), vToInt + (v1 -> i), m)
case (label, v1) :: lvs =>
val currentEdges: Seq[(L, Int)] = m.getOrElse(vToInt(v), s0)
recur(v, i, lvs, intToV, vToInt, m + (vToInt(v) -> conj(currentEdges, label -> vToInt(v1))))
case Nil =>
// TODO, certainly, there is a way to compute v2 without a linear search.
// we want to find a vertex which we have learned exists, but whose
// edges have not yet been computed.
intToV.keys.find(k => !m.contains(k)).map(intToV) match {
case Some(v2) => recur(v2, i, edges(v2).toList, intToV, vToInt, m)
case None => (intToV, m)
}
}
}
val (vertexMap,edgeMap) = recur(v0, 1, edges(v0).toList, Map(0 -> v0), Map(v0 -> 0), Map())
(Vector.tabulate(vertexMap.size)(vertexMap),
Vector.tabulate(edgeMap.size)(edgeMap))
}