Lazy computation in a class

jimka · January 8, 2021, 11:35am

Continuing the discussion from Semantics of lazy vals in a class, I have an optimization case in one of my classes which I don’t know the best way to optimize.

I have a class (hierarchy) which represents a certain type of recursively defined expression. And I have an idempotent operation which converts it to a canonical form. However, my code doesn’t know the operation is idempotent.

It is currently implemented something like the following (with lots of code removed)

abstract class SimpleTypeD {
 ...
  def canonicalizeOnce(nf:Option[NormalForm]=None):SimpleTypeD = this

  def canonicalize(nf:Option[NormalForm]=None):SimpleTypeD = {
    fixedPoint(this,
               (t:SimpleTypeD)=>t.canonicalizeOnce(nf=nf),
               (a:SimpleTypeD, b:SimpleTypeD) => a.getClass == b.getClass && a == b)
  }
...
}

The canonical form is computed by finding a fixed-point of the canonicalizeOnce, and the canonicalizeOnce method is overridden in all the various subclasses.
There are two obstacles as I see it.

I when I construct an object, I don’t yet know that it is canonical; I.e. I have to evaluate the termination test given to the fixedPoint method. So I cannot set a is_already_canonicalized Boolean on construction.
Canonicalization is actually a function of the canonical form. I currently support 3 canonical forms Some(Dnf) (disjunctive normal form), Some(Cnf) (conjunctive normal form), and None another default form which is much easier to compute.

What I’d like to have is

that given an object obj if I call obj.canonicalize(Some(Dnf)) for the first time, then it computes and remembers that value (which is an another object whose class inherits from the same abstract class SimpleTypeD), and calling obj.canonicalize(Some(Dnf)) simply returns the same object…
If I try to re-canonicalize new object, that this just returns itself. I.e.,
obj.canonicalize(Some(Dnf)).canonicalize(Some(Dnf)) never canonicalizes twice. I.e. an object which has already been canonicalized should know it.

I could create hash tables which map input to output, and flags (implemented by ‘var’ set imperatively), but I feel like I’m doing more work than is necessary. Programming without side effect means calling the same function with the same input returns the same result. I feel like I’m trying to implement hacky code to support what the language should be doing for me.

I’d welcome advice.

jimka · January 8, 2021, 12:03pm

Here is my attempt:


  def canonicalizeOnce(nf:Option[NormalForm]=None):SimpleTypeD = this

  val canonicalizedHash:scala.collection.mutable.Map[Option[NormalForm],SimpleTypeD] = scala.collection.mutable.Map()

  def canonicalize(nf:Option[NormalForm]=None):SimpleTypeD = {
    canonicalizedHash
      .getOrElseUpdate(nf,
        // if not already memoized, then perhaps compute a new object, using fixed-point
                       fixedPoint(this,
                                  (t:SimpleTypeD)=>t.canonicalizeOnce(nf=nf),
                                  (a:SimpleTypeD, b:SimpleTypeD) => a.getClass == b.getClass && a == b))
    // imperatively tell the perhaps new object it is already canonicalized
    canonicalizedHash(nf).canonicalizedHash(nf) = canonicalizedHash(nf)
    // return the perhaps new object which knows it is canonicalized
    canonicalizedHash(nf)
  }