Refactoring class hierarchy into ADT

jimka · January 2, 2021, 10:26am

This post describes the reason I asked the question Traits have no parameters.

I had an ADT which several (6) leaf level case classes and two leaf level objects.

sealed abstract class SimpleTypeD { ... }
final case class SAnd(tds: SimpleTypeD*) extends SimpleTypeD { ...}
final case class SOr(tds: SimpleTypeD*)  extends SimpleTypeD { ... }
final case class SEql(a:Any)             extends SimpleTypeD { ... }
...etc...

The problem was that as the for all the classes was becoming unmanageable as the code was growing larger as I continued to develop the application. So I refactored it to different files. A well-known and apparently much-loved Scala limitation made it impossible to use a sealed class. (you can’t inherit from a sealed class from a different file.) Thus my ADT was no longer sealed. (so perhaps it should not even be called ADT, not sure). At that point the code was organized into one file containing an abstract class SimpleTypeD and additionally 8 files, each containing one of the leaf level classes or objects (e.g. SAnd, SOr, SEql, etc).

This solution was usable, but not ideal. I had to do run-time checking for pattern matching exhaustiveness, by added a default error case to the pattern matching clauses. So I finally am trying to do a second refactoring to make an sealed ATD.

I tried to define SimpleTypeD as a sealed abstract class, and define implementation classes as traits (SAndI, SOrI, SEqlI) each name ending in I for interface. Then I wished to define the leaf classes as follows.

sealed abstract class SimpleTypeD { ... }
final case class SAnd(tds: SimpleTypeD*) extends SimpleTypeD with SAndI(tds : _*)
final case class SOr(tds: SimpleTypeD*)  extends SimpleTypeD with SOrI(tds : _*)
final case class SEql(a:Any)             extends SimpleTypeD with SEqlI(a)
...etc...

However, this doesn’t work because traits cannot take parameters. So what I did, which seems to accidentally work, but seems convoluted and an abuse is the following.

abstract class SimpleTypeDA { ... } // `A` stands for Abstract

sealed trait SimpleTypeD extends SimpleTypeDA 

final case class SOr(tds: SimpleTypeDA*)
  extends SOrI(tds : _*) with SimpleTypeD

final case class SAnd(tds: SimpleTypeDA*)
  extends SAndI(tds : _*) with SimpleTypeD

final case class SEql(a: Any)
  extends SEqlI(a)  with SimpleTypeD

Now in different files, define SAndI, SOrI, SEqlI, etc each as abstract class which extends SimpleTypeDA . (Yes, SAndI should be renamed to SAndA as it is now abstract rather than an interface.)

abstract class SAndI(tds: SimpleTypeDA*) extends SimpleTypeDA {
   ...
}

Thus the sealed trait SimpleTypeD is primarily used to identify which leaf classes are in the ADT. And the fact that SimpleTypeD extends SimpleTypeDA forces each of the ADT leaf classes to implement a common set of methods.

One big advantage of this approach, even it is is wierd and abusive and obfuscated, is that if SimpleTypeDA defines a method foo then each of the other abstract classes can override the method override def foo, and each such overridden method can call super.foo(...) to reach the next method. And the methods seem to all be called in a usable order even if bizzare, i.e, the leaf level method first, and the SimpleTypeDA next. In fact the order is: SEql -> SimpleTypeD -> SEqlI -> SimpleTypeDA
It is not clear to me why SimpleTypeD should be considered more specific than SEqlI, but this minor weirdness doesn’t hinder the semantics of my program.

BalmungSan · January 2, 2021, 1:43pm

A couple of questions /suggestions.

Why is SimpleTypeD an abstract class instead of a trait?
Do you really need constructors for the implementations? You can juts put all the arguments as abstract val / def that are overridden by the case classes.
About terminology, I would call the traits / abstract classes that contain the implementation of the case classes FooImpl because even if abstract they contain the implementation; also remember to make them private to the package, and use self-types to ensure you do not mix or expose them.

sangamon · January 2, 2021, 3:48pm

Just an observation: This sounds like you want to define an ADT, but then go full OO with it, i.e. have lots of methods on the types themselves…? While this certainly is legitimate, it’d perhaps be more common/idiomatic to only have the raw ADT definition in one file, plus maybe some smart constructor(s) and similar stuff in the top-level companion object, and declare any behavior in other files, using pattern matching over the ADT inside functions or type classes. With this style, having six “leaves”/“constructors” in the file should be easily maintainable…

jimka · January 2, 2021, 5:32pm

I think it would work as a trait or as an abstract class. It is however, the base of the class hierarchy, i.e., the root, and has several default method definitions which I’d like to be at the tail end of the super.foo() call chain.

jimka · January 2, 2021, 5:37pm

Sorry, I don’t understand the issue here. The advantage of having all the code for all the methods of a class in one place is that they are in one place, easy to find, and easy to think about together. For example all the code which implements the semantics of SAnd and the code which distinguishes its behavior for every other SimpleTypeD is in a single file. This is especially helpful for distinguishing SAnd from SOr because much of the code is similar and Boolean duals of each other, true vs false, exists vs forall etc.

jimka · January 2, 2021, 5:40pm

FooImpl, that sounds like good adivse. I really don’t care, semantically, whether they are trait or abstract, the name simply implies that the implementation of the methods are in this class.

I don’t understand the issue with private, can you please elaborate? I also don’t understand your comment about self-types.

BalmungSan · January 2, 2021, 5:50pm

My point is that if you use abstract vals or defs then you do not need constructors.

As such everything can be a trait or a final class / object; which can help you with your linearization issues.
Something like:

// instead of
abstract class FooImpl(bar: Int) {
  final def baz(x: Int): Int = bar * x
} 
final case class Foo(bar: Int) extends FooImpl(bar)

// you can
trait FooImpl {
  def bar: Int

  final def baz(x: Int): Int = bar * x
} 
final case class Foo(override final val bar: Int) extends FooImpl

Also, as @sangamon said, whereas your use case is legitimate. Things like ADTs with a lot of methods and depending on linearization order seems too OOP to me and very complex.
As I have mentioned before, for me everything is either abstract or final, I rarely find myself overriding something concrete and I have never needed to call my parent implementation, I personally would find that a nightmare to understand and change.
But, whatever, do not let my mental limitations to guide you

About the private a self types, I guess you can see it here the reason for both is not exposing those to external users and avoiding mistakes like extending FooImpl from Bar.

jimka · January 2, 2021, 5:51pm

I started off with everything in one file, and I found it really confusing. Some of the methods contain delicate logic, and implementing an ADT which is many hundreds of lines long makes the file difficult to understand, and difficult to check by hand, ie., have I implemented all the methods I need for each of the classes. Additionally, having one class per leaf class, makes testing easier, as I also have one testing file per leaf class. I can easily find the tests, and I can easily figure out whether I’ve fully tested each of the classes.

WRT OO, I’m sure there are many ways to think about the problem, but in my case I have a limited number of primitives, and each of the primitives implements several methods. (I refer to these methods as a calculus, i.e., a set of operations which are available to instances, including how those operations interact) There is some behavior which is common to all the leaf classes, and there is behavior which distinguishes them, and in some cases those overlap. I.e., I have reduction rules which start attempting leaf specific reduction, and finally resort to more general reduction. In my thinking this is OO in nature—thinking from specific to general.

One reason that exhaustive pattern matching is really useful is that I define binary methods on many pairs of leaf classes. Since Scala lacks multiple dispatch, I have to include exhaustive pattern matching in several different classes. So I’d like to be sure that I’ve accounted for every combination of X x Y which X is a leaf class and Y is a leaf class.

jimka · January 2, 2021, 5:57pm

ahh, I see your technique for emulating initialization arguments on traits. Perhaps this is the trick which Scala 3 uses?

So I see how to work around the lack of initialization parameters for traits. But I don’t see how this technique allows me to have a sealed class, and also put the implementations into their individual files. It seems like I’ll still need each leaf case class to inherit from two different classes 1) the …Impl class and also the sealed class. Right?

BalmungSan · January 2, 2021, 6:00pm

Yes, exactly.

You need both, that is my point.
Make everything a trait or a final class object.

And follow everything else mentioned in my other post, that way you can have an ADT splitted in many files.

jimka · January 2, 2021, 6:01pm

So whats the advantage of avoiding abstract classes? I didn’t understand the motivation for doing that.

BalmungSan · January 2, 2021, 6:05pm

Because that seemed to be the root of your problem, that plus linearization order.

If all are traits, you an control the linearization order easily…

Ah no, scratch that, you need to override methods of the parent trait in the implementation traits.

Hum, no idea then, sorry.
It seems Scala tries very hard to avoid your design.

jimka · January 2, 2021, 6:06pm

Yes, you commented earlier about how to split an ADT into separate files. I didn’t understand your concept at that time, as it is also using packages in a way I found confusing. I abandoned the attempt for several weeks until I thought about the scheme which is explained in the original post. I see now that your suggestion here was less complicated than I originally feared when I read it several weeks ago.

Many things in informatics are really complicated and confusing until you understand them, then they are easy.

jimka · January 2, 2021, 6:20pm

I don’t think Scala tries to avoid my design. It is just that not many people understand the class precedence order. In CLOS we write OO code depending on the order of most specific to least specific, and relying on the fact that it works as specified. Two principles, a method on a class is more specific than a method on any of its superclasses. A method on a class mentioned earlier in a multiple inheritance list is more specific than one appearing later. I.e, if A inherits from X and Y in that order, then methods on X are more specific than methods on Y if the object is an A. and if B inherits from Y and X in that order, then methods on Y ore more specific than those on X if the object is a B.

In this case, even though I don’t understand the order Scala implements, it seems to work for my situation. The most specific and the least specific methods are intuitive. Just the intermediate methods are not.

BalmungSan · January 2, 2021, 6:25pm

No I do not mean about linearization, I mean about the fact that you need that, plus sealed guarantees, plus splitting into multiple files.

Your solution while apparently provides what you want, actually leaves the door open to external users to override things which may or may not be desired.
You may control that thought access control though.

sangamon · January 2, 2021, 6:27pm

…but why should an ADT be many hundreds of lines long?

There’s the “OO” way where you have a hierarchy with classes implementing traits through methods. The traits specify the complete API surface of the implementing classes, and code for all responsibilities goes into the concrete class itself.

Then there’s the “FP” way using an ADT. Behavior is implemented by pattern matching inside external functions or type classes. Code is organized by concerns - there’ll be one file for rendering the ADT, one for evaluating, etc., where each file will contain the implementations for all the ADT’s constructors.

So, among other differences, that’s two different ways of organizing code hierarchically - by “dispatchee” or by concern. Which leaves me wondering… If you want the OO style, why have an ADT (in particular: a sealed trait) at all? Whatever concrete class implements the trait(s) in a semantically sound way could be welcome. OTOH, if you want an ADT, why not go all in and implement all (or most) functionality externally via pattern matching, leaving you with a small code base for the core ADT as a side effect?

Of course you can mix and match - Scala is a hybrid language, after all. But this may introduce accidental complexity overhead just for balancing/merging the two approaches and require tradeoffs - e.g. in code organization. (Note that this usually shouldn’t be that much of an issue when mixing OO and FP approaches at different levels/layers of the code base.)

jimka · January 2, 2021, 6:29pm

I don’t yet see how, although I’m not claiming one way or another. How can the behavior of my sealed class be changed by other applications making some classes of abstract classes or traits, especially since I’m not inheriting any of their application code. I.e., allowing someone to inherit from my classes into their on applications, wont have any effect on my sealed ADT’s behavior. Right?

Sorry, I don’t yet see the problem.

jimka · January 2, 2021, 6:34pm

An example let’s assume there are three leaf classes. And suppose I want to specify the behavior of a method foo which for all the classes, and further suppose that foo can take as argument an instance of any of the leaf classes.

A.foo(A), A.foo(B), A.foo(C)
B.foo(A), B.foo(B), B.foo(C)
C.foo(A), B.foo(C), C.foo(C)

So in my implementation in the foo method of the class A, I want to use pattern matching on the argument, and I’d like the compiler to tell me whether I’ve missed a case. Especially if I add another leaf class later.

BalmungSan · January 2, 2021, 6:36pm

The ADT itself no, but the implementation classes and the parent abstract class are open to external overriding. So at the end, it seems that won’t be a problem if you never ask for the parent abstract class but just for the parent sealed trait, and you may hide everything from externals through access control.

So yeah, after a second look, it does seems that your code is correct. It just feels weird, like it feels like the code the compiler of a higher level language would generate (maybe Scala 3 does something similar with trait parameters?).
Actually, it is like my own suggestion just with additional tricks to allow parameters in the implementations and overriding parent methods; I would guess that if someday the compiler would support parameters on traits and multi-file ADTs it could generate code like this.

I would just again said the same as @sangamon your design feels like a complex hybrid, but whatever floats your boat.

jimka · January 2, 2021, 6:37pm

each of the leaf classes implement several methods, some of which are delicate. The code is sometimes lengthy. When multiple classes share such code, I try to move that code up into the common parent class and make use of super.foo()