Continuing the discussion from Semantics of lazy vals in a class, I have an optimization case in one of my classes which I don’t know the best way to optimize.
I have a class (hierarchy) which represents a certain type of recursively defined expression. And I have an idempotent operation which converts it to a canonical form. However, my code doesn’t know the operation is idempotent.
It is currently implemented something like the following (with lots of code removed)
abstract class SimpleTypeD {
...
def canonicalizeOnce(nf:Option[NormalForm]=None):SimpleTypeD = this
def canonicalize(nf:Option[NormalForm]=None):SimpleTypeD = {
fixedPoint(this,
(t:SimpleTypeD)=>t.canonicalizeOnce(nf=nf),
(a:SimpleTypeD, b:SimpleTypeD) => a.getClass == b.getClass && a == b)
}
...
}
The canonical form is computed by finding a fixed-point of the canonicalizeOnce
, and the canonicalizeOnce
method is overridden in all the various subclasses.
There are two obstacles as I see it.
- I when I construct an object, I don’t yet know that it is canonical; I.e. I have to evaluate the termination test given to the
fixedPoint
method. So I cannot set ais_already_canonicalized
Boolean on construction. - Canonicalization is actually a function of the canonical form. I currently support 3 canonical forms
Some(Dnf)
(disjunctive normal form),Some(Cnf)
(conjunctive normal form), andNone
another default form which is much easier to compute.
What I’d like to have is
- that given an object
obj
if I callobj.canonicalize(Some(Dnf))
for the first time, then it computes and remembers that value (which is an another object whose class inherits from the same abstract classSimpleTypeD
), and callingobj.canonicalize(Some(Dnf))
simply returns the same object… - If I try to re-canonicalize new object, that this just returns itself. I.e.,
obj.canonicalize(Some(Dnf)).canonicalize(Some(Dnf))
never canonicalizes twice. I.e. an object which has already been canonicalized should know it.
I could create hash tables which map input to output, and flags (implemented by ‘var’ set imperatively), but I feel like I’m doing more work than is necessary. Programming without side effect means calling the same function with the same input returns the same result. I feel like I’m trying to implement hacky code to support what the language should be doing for me.
I’d welcome advice.