Emulating CLOS-like :around methods in Scala

One feature of CLOS which I really miss in Scala is method qualifiers.
I wonder what the best approach to working around this missing feature is.

Here is an example.

As an API, I have a method named foo. All users of my library make calls to foo. And foo is well documented in .md files, and source files, and in other documentation inside and outside the project file hierarchy. The method is defined on all classes in a class hierarchy, and the method takes exactly one argument which is of the same type as the top of the class hierarchy.

This means if I have n leaf classes, there are possibly n^2 cases to consider. Usually several cases are considered together, but the unit testing framework needs to make sure that all n^2 cases are thoroughly tested.

Suppose the top class is A and the leaf classes are V, W, X, Y, and Z, perhaps with some intervening abstract classes.

If v.foo(a), where v has class V, is the same algorithm independent of the class of its argument, that code can go into the method defined in the V class; no problem.
But what if a.foo(w), where w has class W, is the same algorithm for all leaf classes? Where should that code go? Moreover, what if it is algorithmically better, or even algorithmically necessary, that whenever the argument has class W the other code is circumvented? How do I make sure that the check for W is made first, regardless of the class of the object on the left-hand side of the dot?

There are two ugly ways that I can think of to implement this in Scala.

  1. Write a function, bar (perhaps a method in A), and call it at the top of every foo method, with program logic such as if a.bar(b).nonEmpty return its value, else .... Remember to add a comment telling the programmer: if you implement a new foo method for a new class, remember to call bar on the first line of the method, including logic to deal with its return value. This has the horrible disadvantage that if foo calls super.foo(...) then bar will be called twice. (A sketch of this option follows the list.)

  2. Rename the foo methods to fooDown, but don’t rename the call-sites. Now re-implement foo as a call to bar followed by a call to fooDown. Now re-read all the documentation and selectively replace foo with fooDown. So the API becomes, call foo but implement fooDown.
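
For concreteness, here is a minimal sketch of option 1 (the names A, V, foo, and bar come from the description above; using Option for bar's result is my own assumption):

trait A {
  // shared pre-check: Some(result) means "short-circuit foo with this value"
  protected def bar(b: A): Option[A] = None

  def foo(b: A): A
}

class V extends A {
  def foo(b: A): A =
    bar(b) match {
      case Some(result) => result  // the pre-check already decided the result
      case None         => ???     // the V-specific algorithm goes here
    }
}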

In CLOS there is a concise feature to handle this. It is called an :around method. Any code which needs to happen as a precondition of calling the method goes in the :around method, and when call-next-method (the equivalent of super.foo()) is called, that precondition code DOES NOT get re-evaluated.

Here is an example which might make this situation more tangible.

Suppose I am implementing an algebra with several operators: *, +, -, and /. Each operator is a binary operator with a lhs and a rhs. Some operators have left identities, some have right identities, and some have identities which are the same on the left and right. Additionally, some operators have annihilators (left and right).

Furthermore suppose that there are many leaf classes representing the objects in the algebra, and these objects interact via the defined operators.

E.g., a - 0 = a regardless of the class of a; a * 0 = 0 regardless of the class of a; a / 1 = a regardless of the class of a; a / 0 throws an exception regardless of the class of a; otherwise 0 / a = 0.

With my understanding of the Scala object system, I need to implement these identities in every class which a might have. In CLOS I only implement the identities once.

This is of course not a make or break feature/short-coming, but around methods do make binary operators much easier and more elegant to implement consistently.

For example, this is how one might implement multiplication for a Galois field of order 5:

import Galois5Element._

sealed trait Galois5Element {
  def *(o: Galois5Element): Galois5Element = {
    o match {
      case Zero => Zero
      case One  => this
      case _    => nonTrivialTimes(o)
    }
  }

  protected def nonTrivialTimes(o: Galois5Element): Galois5Element
}

object Galois5Element {
  object Zero extends Galois5Element {
    override protected def nonTrivialTimes(o: Galois5Element): Galois5Element = Zero
  }

  object One extends Galois5Element {
    override protected def nonTrivialTimes(o: Galois5Element): Galois5Element = o
  }

  object Two extends Galois5Element {
    override protected def nonTrivialTimes(o: Galois5Element): Galois5Element = {
      o match {
        case Two   => Four
        case Three => One
        case Four  => Three
      }
    }
  }

  // ... (Three and Four omitted)
}

@curoli, nice example! Thanks for adding that one.

This looks like my solution #2 above. You’ve named fooDown as nonTrivialTimes.
So two different methods are needed.

As I mentioned above, renaming a function is disruptive. It probably makes the human-readable documentation wrong. All documentation which instructed users to write a method named foo must be updated to say: write a method named fooDown. However, documentation that says: call a method named foo, remains unchanged.

Changing the method name also risks making user code obsolete. Any user who has extended one of the classes provided by the library, and has provided a foo method, must edit his code and rename his method to fooDown.

This is essentially the template method pattern?

4 Likes

@Jasper-M, thanks for the reference to the template method pattern. It is nice to know the names of these idioms which I’ve used many times without knowing they had standard names and descriptions. I know that many Lisp programmers (including myself) don’t know these standard patterns because the language itself obviates the need for many of them.

For example, I’ve never understood the visitor pattern, except to know that multiple dispatch and the ability to define methods outside of classes obviate it. I also believe (maybe I’m wrong) that being able to write generic anonymous functions obviates it. It is an entry in a very long list of things I want someday to learn.

The answer, as so often, is that you don’t do this with inheritance in Scala; you do it with typeclasses. You actually underline that point by accident:

That’s basically The Expression Problem, and it is the standard example for using typeclasses in Scala – a large fraction of intros to typeclasses start with it. There are scads of blog articles on this (try Googling “expression problem scala typeclass”), but here’s a fairly concise StackOverflow answer on it.

2 Likes

It looks like the Expression Problem is not really about implementing expressions which have identities and annihilators, but rather about how to allow users to extend classes without modifying the source code or breaking existing code.

On reading a bit about the Expression Problem, I see that multiple dispatch is one way to solve it, and that’s probably the way I have come to think about it; i.e., using an OO system which supports multiple dispatch. So it makes sense that trying to implement a multi-dispatch-based solution in a single-dispatch OO system is ugly.

Yes and no – fundamentally, both of those are about how to avoid combinatoric explosion in the complexity of implementing this sort of problem. This particular link doesn’t deal with identities and annihilators, but I’ve come across a number of more fully-fleshed-out discussions of the Expression Problem that do so, so I don’t think there’s any limitation there. (Assuming I’m understanding your terminology correctly.)

It’s news to me that type classes lead to fewer cases than inheritance. Could you elaborate why that would be so?

A bit late to the party, just trying to tie a few threads together (as far as I understand the problem)…

The issue of placing the “shared” implementation parts seems to be specific to the design based on OO inheritance. As discussed in another thread, there’s a tradeoff between this and an ADT/pattern matching/type class based design, and this seems to be one aspect in favor of the latter approach. A naive translation of @curoli’s example to this style might look like this:

sealed trait Galois5Element

object Galois5Element {
  case object Zero extends Galois5Element
  case object One extends Galois5Element
  case object Two extends Galois5Element
  case object Three extends Galois5Element
  case object Four extends Galois5Element
}

trait Algebra[T] {
  def mult(a: T, b: T): T
}

implicit val galois5ElementAlgebra: Algebra[Galois5Element] = {
  import Galois5Element._
  (a: Galois5Element, b: Galois5Element) =>
    (a, b) match {
      case (Zero, _) => Zero
      case (_, Zero) => Zero
      case (One, y) => y
      case (x, One) => x
      case (Two, Two) => Four
      case (Two, Three) => One
      case (Two, Four) => Three
      case (_, _) => ???
    }
}
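
A hypothetical usage of that instance (assuming the definitions above sit inside an object or other container so the implicit val is legal):

import Galois5Element._

val product = implicitly[Algebra[Galois5Element]].mult(Two, Three)  // One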

Placing the shared algebra code becomes a non-issue, and subjectively this code looks somewhat cleaner and easier to grasp to me.

But there’s the issue of future extensions: the OO approach in general makes it easy to add new data flavors in isolation, but requires shotgun surgery on existing code when adding new functionality; with the FP/ADT approach it’s exactly the other way round. If you want both, i.e. to add new data cases without changing existing code in the ADT design, you’re facing the expression problem. This can be solved, but it requires a more convoluted initial design.

I’m not sure whether this is an actual drawback in this case. It probably doesn’t make much sense to add a new data case to a Galois 5 field. :slight_smile: As for your actual code, if I understand correctly from the other thread, you want to use pattern matching with exhaustiveness checking internally in your methods, so you’ll have to fix up existing code upon adding new data cases in this “hybrid” approach, anyway. Sounds like a net plus for the ADT/pattern matching style to me - YMMV, of course.

It seems like very simple examples work well for the ADT model. In each case the multiplication table fits in fewer than 10 lines. In my case the code for the zeros and ones of the operations indeed fits in a few lines. However, the other cases take tens or hundreds of lines each, especially after the code is well commented. And in some cases, after those hundreds of lines, the routine still needs to defer to the superclass via super.foo(...).

In my mind the issue of well-factored code seems to have the same effect as allowing applications to extend the class hierarchy.

In my case I’ve given up on letting the Scala language check the ADT. My underlying system is essentially an ADT, but Scala doesn’t know that it is. It is up to me to verify that pattern matching is exhaustive. I accept that as a limitation of Scala not allowing sealed classes to be extended beyond file boundaries.

The original question was about around methods. The way I see to implement this in Scala is to have a method, foo, with the public name, which contains the before and after code. Somewhere in the middle of this small public method is a call to a cryptically named fooInternal method which is implemented for each of the classes in the ADT. My application, and applications from my customers, need to implement the cryptically named fooInternal method for classes which they add; customers need to be careful not to call fooInternal but to call foo instead.
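
A minimal sketch of that arrangement (the names foo and fooInternal are from the description above; marking foo as final is my own choice, to keep subclasses from bypassing the around logic):

abstract class A {
  // public entry point: the shared before/after ("around") code lives here, exactly once
  final def foo(b: A): A = {
    // ... shared before-code, e.g. identity/annihilator checks ...
    val result = fooInternal(b)
    // ... shared after-code ...
    result
  }

  // each class in the hierarchy (including customer-added classes) implements only this
  protected def fooInternal(b: A): A
}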

The unfortunate part is that this boilerplate pattern makes the code less transparent in the usual case where foo does nothing but directly call fooInternal. Yet it must be there so that if refactoring becomes necessary in the future, we won’t have to break the interface and change all the documentation.

In CLOS foo and fooInternal have the same name. Whether the method is the public one or the internal one is handled by a qualifier, a sort of meta-data on the declaration.

Note that Scala has the protected modifier, which is meant for internal methods that can be overridden by subclasses but not called from the outside.

1 Like

I’ve never really understood protected, although it sounds like the correct annotation here. Does it mean that any subclass is allowed to call the method by name? What about super.foo(...)? Is that allowed for subclasses?

To be completely honest the rules for protected and especially protected[xyz] are a bit murky to me too. But I think it’s like this:

class Foo { protected def foo: Int = 1 }

  1. Works:
     class Bar extends Foo { override protected def foo = 3 }
  2. Works:
     class Bar extends Foo { def bar = this.foo }
  3. Works:
     class Bar extends Foo { def bar = super.foo }
  4. Works:
     class Bar extends Foo { def bar(b: Bar) = b.foo }
  5. Doesn't work:
     class Bar extends Foo { def bar(f: Foo) = f.foo }

If you change foo to protected[this] then everything stays the same except that case 4 no longer works.
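
A small sketch of that difference, reusing Foo and Bar from above:

class Foo { protected[this] def foo: Int = 1 }

class Bar extends Foo {
  def bar1 = this.foo          // access via `this` still works
  // def bar2(b: Bar) = b.foo  // case 4: no longer compiles once foo is protected[this]
}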

2 Likes

If I mark a method as protected in the superclass, do I need to also mark the method as protected when I implement the method in each subclass?

You don’t have to, but then that method will be publicly accessible. To use my above example, (new Bar: Foo).foo still won’t compile, because the Bar instance is upcast to Foo, but (new Bar).foo will.
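
In code, that looks roughly like this (the override deliberately drops protected, so foo becomes public on Bar):

class Foo { protected def foo: Int = 1 }
class Bar extends Foo { override def foo: Int = 3 } // no `protected`: foo is public in Bar

(new Bar).foo          // compiles
// (new Bar: Foo).foo  // does not compile: statically a Foo, whose foo is protected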

1 Like

Functional decomposition? The Haskell crowd would be surprised to learn that their language only works for very simple examples. :slight_smile:

As said before, your approach is fine if you’re happy with it. It just feels somewhat odd that you choose to categorically rule out a design approach that’s nicely supported by Scala and then keep discovering flaws and limitations in Scala with your chosen style that would be non-issues with the alternative approach. (Which has its downsides, too, of course - it’s always a tradeoff.)

This is the classical OO-inheritance-based take, as one would do it in Java as well, yes. Having a cryptic name for the method is not actually a prerequisite, though. :slight_smile: And as @Jasper-M already noted, it needn’t be public.

Just for fun, this could alternatively be achieved by (ab)using stackable traits, which is a Scala-specific take on the decorator pattern. However, stacking goes the wrong way round for this use case, so it requires additional “shadow classes” for the actual implementations. This makes it more verbose than the classical approach, and it’s certainly non-idiomatic.

sealed trait Galois5Element {
  def *(o: Galois5Element): Galois5Element
}

object Galois5Element {

  trait Galois5TrivialMult extends Galois5Element {
    abstract override def *(o: Galois5Element): Galois5Element = {
      o match {
        case Zero => Zero
        case One => this
        case _ => super.*(o)
      }
    }
  }

  protected sealed class ZeroImpl extends Galois5Element {
    override def *(o: Galois5Element): Galois5Element = Zero
  }

  protected sealed class OneImpl extends Galois5Element {
    override def *(o: Galois5Element): Galois5Element = o
  }

  protected sealed class TwoImpl extends Galois5Element {
    override def *(o: Galois5Element): Galois5Element =
      o match {
        case Two => Four
        case Three => One
        case Four => Three
      }
  }

  protected sealed class ThreeImpl extends Galois5Element {
    override def *(o: Galois5Element): Galois5Element = ???
  }

  protected sealed class FourImpl extends Galois5Element {
    override def *(o: Galois5Element): Galois5Element = ???
  }

  case object Zero extends ZeroImpl with Galois5TrivialMult
  case object One extends OneImpl with Galois5TrivialMult
  case object Two extends TwoImpl with Galois5TrivialMult
  case object Three extends ThreeImpl with Galois5TrivialMult
  case object Four extends FourImpl with Galois5TrivialMult
}
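
With this stacking, the trivial cases are intercepted once by the mixin; a quick usage sketch:

import Galois5Element._

Two * Zero   // Zero, short-circuited by Galois5TrivialMult
Two * Three  // One, falls through to TwoImpl via super.*(o)
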
1 Like

I rejected the ADT-via-sealed-class approach because I don’t want all my code in a single file. I had that at the beginning, and I abandoned it because it was not manageable as the code grew. Having one file per concept is much more reasonable for me than having one file that mixes everything just to make the compiler happy.

BTW, I don’t know Haskell. Do you know if the Haskell solution also requires the user to put all the code in one single file?

With regard to the type class approach (which fundamentally is independent of the ADT approach): I cannot judge whether this approach would fit my needs or not. I rejected it because I don’t understand it well enough. Several people have suggested small snippets, which I appreciate and find interesting every time, but I don’t see how they compose to solve my larger problem. It may well be a good approach. However, until now, every time I tried to implement something with type classes, I failed to understand how it worked or how to refactor it to do more than originally planned, even though my unit tests all passed. I also had lots of trouble converting the type class code from Scala 2.12 to 2.13, and I have nightmares about trying to eventually convert it to 3.x.

To me the concept of implicits obfuscates type-class-related code. Type class code which I’ve written did not (for me) express the intent, but rather expressed language constraints which looked to me like leaky abstractions. As I understand it, Scala 3 attempts to alleviate some of this obfuscation.
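
For what it’s worth, a rough sketch of the Scala 3 spelling of the earlier implicit val, reusing the Algebra trait and Galois5Element from the example above (the body is elided):

given Algebra[Galois5Element] with
  def mult(a: Galois5Element, b: Galois5Element): Galois5Element =
    ??? // the same (a, b) match as in the implicit val above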

Additionally, in my mind the problem I’m trying to solve really is object oriented, despite the Scala object system’s limitations and Java legacy. There is a hierarchy of equivalence classes of mathematical objects, and a set of unary and binary functions which each object or pair of objects must be capable of computing. Each of the equivalence classes knows how to solve part of the problem. In OO, when less specific classes know how to do a computation, that avoids having to duplicate code in subclasses. When more specific classes have more specific algorithmic information, that can be coded in subclasses, which do their work and then defer to the superclasses for the rest.

You can separate the “bodies” of your leaf classes into separate files: Class, trait, impl in different *.scala files? - #4 by Jasper-M