Lib unification with Match Types + a problem with inheritance

bblfish · October 18, 2022, 6:57pm

I start with some explanation of why solving this problem would bring great good to all.
And then I put forward a puzzle that needs solving.

A little motivational background

There is an incredibly useful application of Match Types: bringing together different code bases under the same API without losing efficiency! This is especially useful for different implementations of the same standard. For example, with Banana-RDF we have a library that unites two Java implementations of W3C standards and one JS implementation written by Tim Berners-Lee. And we should be able to add new ones when needed.

The traditional Java way would be to write a unified interface and then wrap every object with that interface to make that the standard mode of interaction, as done in Apache’s Commons RDF. See their Triple interface, which requires every Triple in a store (and there can be millions) to be wrapped in a new object that in turn has to wrap a subject, a relation and an object, thus creating 4 extra objects in the worst case.

With banana-rdf, on the other hand, you can write your code against one unified interface while directly using the underlying objects of the library you are interested in, without creating any new objects! This can be made secure with opaque types. Then, by changing one line of code, you can switch between the different underlying implementations with no loss of efficiency. The code for each implementation is locked in an Ops[R] type that is passed implicitly as a given. A good example of such code is GraphTest, for which we then have one-line implementations for Jena and similarly one-line implementations for IBM’s RDFlib.
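To make the idea concrete, here is a minimal sketch of the "one API, no wrappers" approach. It is deliberately simplified: it uses a path-dependent abstract type member rather than banana-rdf's `Ops[R]` given and match types, and every name in it (`UriLib`, `JdkLib`, `StringLib`, `schemeOf`) is hypothetical and not part of banana-rdf.

```scala
import java.net.URI as JUri

// Hypothetical, minimal API: one abstract type plus the operations on it.
trait UriLib:
  type Uri
  def parse(s: String): Uri
  def scheme(u: Uri): String

// Backend 1: java.net.URI is used directly, no wrapper object is allocated.
object JdkLib extends UriLib:
  type Uri = JUri
  def parse(s: String): Uri = JUri.create(s)
  def scheme(u: Uri): String = u.getScheme

// Backend 2: a plain String, hidden from clients behind an opaque type.
object StringLib extends UriLib:
  opaque type Uri = String
  def parse(s: String): Uri = s
  def scheme(u: Uri): String = u.takeWhile(_ != ':')

// Client code is written once against the abstract API.
def schemeOf(lib: UriLib)(s: String): String =
  lib.scheme(lib.parse(s))
```

Switching backends is then a matter of passing `JdkLib` or `StringLib` to `schemeOf`. Outside `StringLib` the opaque `Uri` cannot be used as a `String`, which is the kind of safety mentioned above.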

I recently used the same trick to write a little abstraction for Http Messages that allowed me to abstract between Akka and Http4s, to help me reduce the amount of test code for a Signing HTTP Messages implementation, an IETF spec which is in last call.

One could use the same trick to help abstract between all the various Uri implementations, from Akka’s, to Http4s, lemonlabs Uri, the Java Uri class and even the up and coming cats Uri class, or even the usually much less rich URI classes from the RDF frameworks, not counting all the JavaScript Uri classes. See issue 7 on typelevel/cats-uri for more details.

I can see this being extremely useful in many projects where different teams implemented the same spec, and developers would rather not be tied to any one of them, but switch between them as needed easily.

But I have a little problem, which is making things a bit less nice than they should be…

Inheritance Problem of Match Types

How do I get inheritance to work? The RDF library has a lot of types (see RDF.scala), and I can’t work out how to get the inheritance to work correctly. I can perhaps solve the problem with Conversion types, but that really looks ugly, and the complexity is growing very fast.

So here is a simplified version of the code I am using:

trait RDF:
  rdf =>

  type R = rdf.type

  type rNode <: Matchable
  type Node <: rNode
  type URI <: Node
  type BNode <: Node

  given rops: ROps[R]

end RDF

object RDF:

  type rNode[R <: RDF] <: Matchable =
    R match
      case GetRelNode[n] => n

  type Node[R <: RDF] =
    R match
      case GetNode[n] => n

  type BNode[R <: RDF] = R match
    case GetBNode[bn] => bn

  type rURI[R <: RDF] = R match
    case GetRelURI[ru] => ru

  type URI[R <: RDF] = R match
    case GetURI[u] => u

  // for info about Getxxx types see https://github.com/lampepfl/dotty/issues/13416
  private type GetRelURI[U] = RDF { type rURI = U }
  private type GetURI[U] = RDF { type URI = U }
  private type GetRelNode[N <: Matchable] = RDF { type rNode = N }
  private type GetNode[N] = RDF { type Node = N }
  private type GetBNode[N] = RDF { type BNode = N }

end RDF

trait ROps[R <: RDF]:
  def nodeVal(node: RDF.Node[R]): Int

object SomeObject:
  def calculate[R <: RDF](node: RDF.Node[R])(
   using ops: ROps[R]
   ): Int =
    ops.nodeVal(node)

object OtherObject:
  def uriParse[R <: RDF](uri: RDF.URI[R])(using
      ops: ROps[R]
  ): String =
    ops.nodeVal(uri).toString()

The problem is with the last line. The ops.nodeVal function expects an RDF.Node[R]. We give it an RDF.URI[R], which should be a subtype of it, but the compiler rejects it.

I have tried quite a few things to see how to get around this, the last one being described here but with no luck…

This can be seen in this Scastie:

bblfish · October 19, 2022, 5:13am

One way I tried resolving this problem was by specifying an upper bound. We somehow want to say that URI[R] <: Node[R]. That seemed to require specifying an upper bound on the GetURI type definition. But I can’t refer directly to the RDF types, as those will have different instances, so I need to pass the instance of R around. I tried this:

 type URI[R <: RDF] <: Node[R] = R match
    case GetURI[R, u] => u
 private type GetURI[R <: RDF, U <: Node[R]] = RDF { type URI = U }

It does remove the red line on the last line of code, but it creates another error message, namely:

➤  scala scala/RDF.scala
-- [E007] Type Mismatch Error: /Volumes/Dev/Imec/ImecGH/CityFlowsSW/scala/RDF.scala:32:25
32 |    case GetURI[R, u] => u
   |                         ^
   |            Found:    u
   |            Required: RDF.Node[R]
   |
   |            where:    u is a type in type URI with bounds <: RDF.Node[R]
   |
   |
   |            Note: a match type could not be fully reduced:
   |
   |              trying to reduce  RDF.Node[R]
   |              failed since selector  R
   |              does not match  case RDF{Node = n} => n
   |              and cannot be shown to be disjoint from it either.
   |
   | longer explanation available when compiling with `-explain`
1 error found
Errors encountered during compilation

Running the above command with the -explain option gives some very long explanation of what went wrong, which I don’t know how to read yet.

The full code is online here:

bblfish · October 19, 2022, 1:01pm

I found this very recent talk by Matthieu Bovel on Match Types to be very helpful. (We need more like this!)

I think I now understand Neko-Kai’s point in dotty issue 13416: Match type syntax does not allow writing type patterns for type members directly. The idea is that we should be able to use the same pattern as for refinement types:

// a trait, so that the abstract member `size` is allowed
trait Vec:
  val size: Int

// v is an instance of a refinement type
val v: Vec {val size: 2} = new Vec:
  val size: 2 = 2

val vSize: 2 = v.size

So if we could use refinement types like RDF { type URI = u }, then we could perhaps get at the type u without also needing to find a way to pass the subtype R of RDF that we are looking at. So one could have

object RDF:
   ...
   type URI[R <: RDF] <: Node[R] = R match 
       case (RDF { type URI = u }) => u

instead of the code I tried earlier, following the pattern that works (when no type inference is needed)

  type URI[R <: RDF] <: Node[R] = R match
       case GetURI[R, u] => u
  private type GetURI[R <: RDF, U <: Node[R]] = RDF { type URI = U }

bishabosha · October 23, 2022, 4:04pm

I found this type works:

type URI[R <: RDF] <: Node[R] = R match
  case GetURI[u] => R match
    case GetNode[`u`] => Node[R]

Does it work in the bigger case?
Edit: adding an actual implementation of the RDF trait makes it fail

bblfish · October 23, 2022, 8:29pm

@bishabosha I think the code would have also compiled simply with

type URI[R <: RDF] <: Node[R] = R match 
      case GetURI[u]  => Node[R]

What is happening is that this returns the more general type Node[R] and loses the type of URI[R]. So it converts immediately to the super-type, but then we would have trouble working with the specific type.

You can see this in this more developed example which provides an implementation of RDF.
On line 74 the error message makes clear that the type of uri is not the lemonlabs absolute URL which has an authority method, but the much wider io.lemonlabs.uri.AbsoluteUrl | BlankNode | String | Double which does not have such a method.

bishabosha · October 24, 2022, 8:17am

I hope I solved your problem, I used an intersection type in the rhs of URI:

type URI[R <: RDF] <: Node[R] = R match 
  case GetURI[u] => u & Node[R]

and here is a demo app (I extend the sub typing to all the match type):

bblfish · October 24, 2022, 9:38pm

Yep, that does indeed work!

type URI[R <: RDF] <: Node[R] = R match 
  case GetURI[u] => u & Node[R]
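For readers who want to try the intersection trick in isolation, here is a small self-contained sketch. To keep it short it swaps the `RDF { type URI = u }` type-member pattern for a class type constructor pattern, so it is an analogue of the fix rather than the banana-rdf code itself. `Backend`, `NodeOf`, `UriOf`, `Demo` and the helper functions are hypothetical names.

```scala
// Hypothetical phantom type standing in for a backend: N is its node type,
// U its URI type. It is never instantiated.
final class Backend[N, U]

type NodeOf[B] = B match
  case Backend[n, u] => n

// The declared bound lets a UriOf[B] be used as a NodeOf[B] even while B is
// abstract. Intersecting with NodeOf[B] makes the right-hand side conform to
// that bound while keeping the precise type u.
type UriOf[B] <: NodeOf[B] = B match
  case Backend[n, u] => u & NodeOf[B]

def nodeVal[B](node: NodeOf[B]): String = node.toString

// Accepted thanks to the declared bound UriOf[B] <: NodeOf[B].
def uriVal[B](uri: UriOf[B]): String = nodeVal[B](uri)

// A concrete backend whose nodes are String | Int and whose URIs are Strings.
type Demo = Backend[String | Int, String]

// The precise type survives: String methods are available on a UriOf[Demo].
def uriLength(uri: UriOf[Demo]): Int = uri.length
```

With a right-hand side of just `u`, the definition of `UriOf` is rejected because `u` cannot be shown to conform to the bound. With `NodeOf[B]` alone it compiles, but `uriLength` would no longer see a `String`.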

I tried it out on the full code base. The resulting commit is here:

It works for the Jena and the rdflib.js libraries.

But I got stuck with a lot of compilation errors on the rdf4j implementation. The errors were pointing to the types not matching up, even though the code is very similar between both…

After carefully staring at the code for a long time, the only thing I could think could cause that problem is that RDF4J uses interfaces where Jena uses classes for the top level constructs (such as Node). See the rdf4j IRI class for example which is an interface.

So my thinking is: Java Interfaces don’t contain the marker trait Matchable. As a result types that are defined by interfaces cannot ever be of type Matchable, and so pattern matching cannot work for types purely defined on them. That should explain the following errors:

[error] -- [E038] Declaration Error: /Volumes/Dev/hjs/Programming/Scala3/CoSy/banana-rdf/scala3/rdf4j/src/main/scala/org/w3/banana/rdf4j/Rdf4j.scala:410:32
[error] 410 |         override protected def stringVal(uri: RDF.URI[R]): String =
[error]     |                                ^
[error]     |method stringVal has a different signature than the overridden declaration
[error]     |---------------------------------------------------------------------------
[error]     | Explanation (enabled by `-explain`)
[error]     |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[error]     | There must be a non-final field or method with the name stringVal and the
[error]     | same parameter list in a super class of object URI to override it.
[error]     |
[error]     |   protected override def stringVal
[error]     |   (uri: org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value & (
[error]     |     org.eclipse.rdf4j.model.Value
[error]     |    & Matchable)) & (org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value
[error]     |
[error]     |   & Matchable))): String
[error]     |
[error]     | The super classes of object URI contain the following members
[error]     | named stringVal:
[error]     |   protected def stringVal
[error]     |   (uri: org.w3.banana.RDF.URI[org.w3.banana.rdf4j.Rdf4j.R]): String
[error]      ---------------------------------------------------------------------------
[error] Explanation
[error] ===========
[error] There must be a non-final field or method with the name stringVal and the
[error] same parameter list in a super class of object URI to override it.
[error]
[error]   protected override def stringVal
[error]   (uri: org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value & (
[error]     org.eclipse.rdf4j.model.Value
[error]    & Matchable)) & (org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value
[error]
[error]   & Matchable))): String
[error]
[error] The super classes of object URI contain the following members
[error] named stringVal:
[error]   protected def stringVal
[error]   (uri: org.w3.banana.RDF.URI[org.w3.banana.rdf4j.Rdf4j.R]): String

I thought of removing Matchable from the Top in the RDF trait. But that leads me to need to add .asMatchable on every type test in the banana-rdf library and all users would need to do the same. That is very tedious.
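As an illustration of what that would mean for users, here is a minimal sketch of a type test on a value whose type has no Matchable bound. `asMatchable` comes from `scala.compiletime`; the `label` function is a hypothetical example and not part of banana-rdf.

```scala
import scala.compiletime.asMatchable

// T has no Matchable bound, as Node would have none after the change, so the
// value is explicitly converted with asMatchable before the type test.
def label[T](x: T): String =
  x.asMatchable match
    case s: String => s"string:$s"
    case i: Int    => s"int:$i"
    case _         => "other"
```

Every pattern match in the library and in client code would need the same extra call once the compiler enforces the Matchable check.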

So perhaps the only way out for rdf4j is to specify that every instance of those interfaces is also Matchable, which is the case when we are working with objects created by the rdf4j factories, or at least I can make sure they are… But it is a bit tricky, because rdf4j creates objects implementing those interfaces by using factories, and those only promise to return something with the right interface…

I could just use implementation classes. So instead of the IRI trait I could use SimpleIRI (Eclipse RDF4J 4.2.0). That could make my code a lot less general, but it’s a starting point… Perhaps I could just build my own factory, as I do with scala.js…

bblfish · October 25, 2022, 6:43pm

I think I found a way to duplicate the last problem.

github.com/lampepfl/dotty

Should Java Interfaces be Matchable?

opened 07:45AM - 25 Oct 22 UTC

bblfish

stat:needs triage

## Compiler version

3.2.0

## Minimized example

It is a bit difficult to… get a minimized example on this. An example of a specific commit [fff0be9beb2c1edfabb9e385b72f12c21000f93f](https://github.com/bblfish/banana-rdf/commit/fff0be9beb2c1edfabb9e385b72f12c21000f93f) of banana-rdf, [diesel branch](https://github.com/bblfish/banana-rdf/tree/diesel). This was a commit to solve the answer to the problem @bishabosha [came up with in issue 13416](https://github.com/lampepfl/dotty/issues/13416#issuecomment-1288691115) and discussed on scala-users thread [lib unification with match types](https://users.scala-lang.org/t/lib-unification-with-match-types-a-problem-with-inheritance/8880). The answer worked for 2 out of 3 cases (see below). (btw. the pattern discussed in the thread seems so useful it seems to me, it should have a name. Perhaps it does?)

## Output

**Update**: For a simplified version of the problem see the comment below https://github.com/lampepfl/dotty/issues/16247#issuecomment-1290526491

Having made the changes proposed by @bishabosha on that commit, two implementations worked well (Jena RDF and Tim Berners-Lee's rdflib.js) but we had a problem with the Eclipse RDF4J implementation. My best explanation for the otherwise minor difference in code, is that Eclipse is built around interfaces whereas Jena is built on classes. And I guess Interfaces don't implement Matchable by default.

Here is the type of error I was getting:

```scala
[error] -- [E038] Declaration Error: /Volumes/Dev/hjs/Programming/Scala3/CoSy/banana-rdf/scala3/rdf4j/src/main/scala/org/w3/banana/rdf4j/Rdf4j.scala:410:32
[error] 410 |         override protected def stringVal(uri: RDF.URI[R]): String =
[error]     |                                ^
[error]     |method stringVal has a different signature than the overridden declaration
[error]     |---------------------------------------------------------------------------
[error]     | Explanation (enabled by `-explain`)
[error]     |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[error]     | There must be a non-final field or method with the name stringVal and the
[error]     | same parameter list in a super class of object URI to override it.
[error]     |
[error]     |   protected override def stringVal
[error]     |   (uri: org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value & (
[error]     |     org.eclipse.rdf4j.model.Value
[error]     |    & Matchable)) & (org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value
[error]     |
[error]     |   & Matchable))): String
[error]     |
[error]     | The super classes of object URI contain the following members
[error]     | named stringVal:
[error]     |   protected def stringVal
[error]     |   (uri: org.w3.banana.RDF.URI[org.w3.banana.rdf4j.Rdf4j.R]): String
[error]      ---------------------------------------------------------------------------
[error] Explanation
[error] ===========
[error] There must be a non-final field or method with the name stringVal and the
[error] same parameter list in a super class of object URI to override it.
[error]
[error]   protected override def stringVal
[error]   (uri: org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value & (
[error]     org.eclipse.rdf4j.model.Value
[error]    & Matchable)) & (org.eclipse.rdf4j.model.IRI & (org.eclipse.rdf4j.model.Value
[error]
[error]   & Matchable))): String
[error]
[error] The super classes of object URI contain the following members
[error] named stringVal:
[error]   protected def stringVal
[error]   (uri: org.w3.banana.RDF.URI[org.w3.banana.rdf4j.Rdf4j.R]): String
```

## Expectation

It's difficult to have an expectation here. Should Java interfaces be Matchable automatically? If not then I'll need to find a workaround. That is not so easy because all of rdf4j is built around interfaces, and they use factories to create objects that just return the interface. So one would need to change a lot there... Is there a java interface for Matchable that I could add to the top of the hierarchy of rdf4j Node hierarchy to check if this is the problem? I'll try to see if I can find a minimal example code to duplicate the problem... But if you have any thoughts on the main question that would help.