I feel like there needs to be some summary of the solution to this problem.
In the end the code is very simple, but finding the simple code was a formidable challenge.
The project includes implementing a sort of logic in several different programming languages to compare expressivity. Languages include Python, Clojure, and Scala, but may be extended to include OCaml, Haskel, and Common Lisp.
A short summary of the project in question is a system of decorated Boolean values, which I have called HeavyBool
. These are not simply True
and False
but rather true-because(reasons) and false-because(reasons). HeavyBool
is accompanied by operators such as conjunction, disjunction, inversion which at least preserve the reasons component both in the true case and also in the false case, and at most conjoin additional reasons to the because description. This is related to the concept of Either
when used for error handling, in the sense that errors may accumulate; however it is different because typically Either
is used to accumulate errors (false results) but NOT accumulate true results.
Implementing this logic includes existential and universal quantifiers (existsM
and forallM
). I I want to know whether something is true because there exists a witness and in particular what the witness is and indeed the witness may be the Scala False
value. On the other hand, I would like to know that a universal quantifier fails because of a witness (counter example). Inverting a true-witness becomes a false-counter-example.
The original refactoring problem was related to the existential and universal quantifiers in my code which had been written using LazyList
primarily because I needed to abort an iteration, and in some cases the list of candidates to feed to foreachM
might be extremely long, and laziness helps in these cases.
object HeavyBool {
def forallM[T](tag:String, items: LazyList[T], p: T => HeavyBool): HeavyBool = {
def loop(data: LazyList[T]): HeavyBool = {
if (data.isEmpty)
HTrue
else {
val hb = p(data.head)
if (hb.toBoolean)
loop(data.tail)
else
hb ++ Map("witness" -> data.head,
"tag" -> tag)
}
}
loop(items)
}
def existsM[T](tag:String, items: LazyList[T], p: T => HeavyBool): HeavyBool = {
!(forallM[T](tag, items, x => !(p(x))))
}
...
}
The first refactoring was to use foldM
to eliminate the tail recursion, and make way for accommodating any type for which foldM
works. I cannot hope to use the tail-recursion scheme on non-list-like collections.
So the second iteration of the code contained the following new implementation of forallM
exploiting foldM
, but keeping existsM
exactly as before
object HeavyBool {
def forallM[T](tag:String, items: LazyList[T], p: T => HeavyBool): HeavyBool = {
import cats._
import cats.syntax.all._
def folder(_hb:HeavyBool, item:T):Either[HeavyBool,HeavyBool] = {
val hb = p(item)
if (hb.toBoolean)
Right(HTrue) // not yet finished
else
Left(hb ++ Map("witness" -> item,
"tag" -> tag)) // finished
}
items.foldM(HTrue:HeavyBool)(folder).merge
}
def existsM[T](tag:String, items: LazyList[T], p: T => HeavyBool): HeavyBool = {
!(forallM[T](tag, items, x => !(p(x))))
}
...
}
The next step is to remove the hard dependency on LazyList
in the functions existsM
and foreachM
. I’d like to allow any collection which is supported by foldM
.
BalmungSan suggested the first refactoring, which included splitting the argument list into three parts. (tag, items)
, (predicate)
, and (implicit ...)
.
object HeavyBool {
def forallM[T, C[_]](tag:String, items: C[T])( p: T => HeavyBool)(implicit ev: cats.Foldable[C]): HeavyBool = {
import cats._
import cats.syntax.all._
def folder(_hb:HeavyBool, item:T):Either[HeavyBool,HeavyBool] = {
val hb = p(item)
if (hb.toBoolean)
Right(HTrue) // not yet finished
else
Left(hb ++ Map("witness" -> item,
"tag" -> tag)) // finished
}
items.foldM(HTrue:HeavyBool)(folder).merge
}
def existsM[T, C[_]](tag:String, items: C[T])(p: T => HeavyBool)(implicit ev: cats.Foldable[C]): HeavyBool = {
!(forallM[T,C](tag, items)(x => !(p(x)))(ev))
}
...
}
A nice advantage of this refactoring is that it increases readably of call-sites of forallM
and existsM
. For example:
def isAntisymmetric[T](gen:LazyList[T], rel:(T,T)=>Boolean) = {
// if a relates to b, and b relates to a, then a == b
def hrel(a: T, b: T) = HeavyBool(rel(a, b))
forallM("x", gen) { (x: T) =>
forallM("y", gen) { (y: T) => (hrel(x, y) && hrel(y, x)) ==> HeavyBool(x==y) }
}.annotate("antisymetric")
}
def isStronglyConnected[T](gen:LazyList[T], rel:(T,T)=>Boolean):HeavyBool = {
def hrel(a: T, b: T) = HeavyBool(rel(a, b))
// forall a,b, either a relates to b or b relates to a
forallM("a", gen){ (a:T) =>
forallM("b", gen) {(b:T) =>
hrel(a,b) || hrel(b,a)
}
}
}.annotate("strongly connected")
The next step in the refactoring was to look at classes which are using HeavyBool
. One such class is called Magma
and subclasses such as ModP
which have a method named gen
, responsible for generating a collection of object for which we’d like to make claims about using foreachM
and existsM
.
The original version of Magma
looked like the following
abstract class Magma[T] {
def gen(): LazyList[T]
...
}
abstract class ModP(p: Int) extends Magma[Int] {
override def toString: String = s"ModP($p)"
override def gen(): LazyList[Int] = Magma.genFinite(p - 1)
...
}
The goal at this point was to remove the dependence on LazyList
in the abstract classes, perhaps keeping it in the leaf-level classes where appropriate. Attempts failed to make this change by type parameters on the gen
method. I had to use an additional type parameter on the Magma
class and its subclasses.
A difficult thing to understand is the syntax for creating subclasses of Magma
and declaring and overriding methods in the various classes. When to use C[T]
, vs C[_]
, vs C
, and when to use List
vs List[Int]
etc.
abstract class Magma[T, C[_]: Foldable] {
def gen(): C[T]
def op(a: T, b: T): T
...
}
abstract class ModP(p: Int) extends Magma[Int, List] {
override def gen(): List[Int] = Magma.genListFinite(p - 1)
...
}
class AdditionModP(p: Int) extends ModP(p) {
override def op(a: Int, b: Int): Int = (a + b) % p
...
}
class MultiplicationModP(p: Int) extends ModP(p) {
override def gen(): List[Int] = {
super.gen().filter { a:Int => a != 0 }
}
override def op(a: Int, b: Int): Int = (a * b) % p
}
class GaussianIntModP(p: Int) extends Magma[(Int,Int), LazyList] {
override def gen(): LazyList[(Int,Int)] = {
def loop(u:Int,v:Int):LazyList[(Int,Int)] = {
if (u==p)
LazyList.empty
else if (v==p)
loop(u+1,0)
else
(u,v) #:: loop(u,v+1)
}
loop(0,0)
}
...
}
As I recall this caused lots of problem, especially with warning messages in IntelliJ. I later realized that I was using an old version of cats. In retrospect, it seems all the compiler problems go away if I simply upgrade cats from 2.7.0 to 2 10.0.