Extension syntax

Russ · November 14, 2024, 11:37pm

Here is the standard example that is used for extension methods:

case class Circle(x: Double, y: Double, radius: Double)

extension (c: Circle)
  def circumference: Double = c.radius * math.Pi * 2

A while ago I suggested that this could be simplified by allowing a no-name option, as in

extension (Circle)
  def circumference: Double = radius * math.Pi * 2

I am still using Scala 3.3.3, so I am wondering if this has been done in later versions. If not, why not? I think it’s a no-brainer (in the positive sense, that is), and I can’t imagine it would be difficult to implement. It would make the extension files consistent with putting the new methods into the original class constructor, and that is desirable for several reasons, including elegance and more convenient refactoring.

alexelcu · November 15, 2024, 7:49am

The problem I’m seeing is that extension methods are not the class’s methods. For instance, they don’t have access to protected members of the class, or to method implementations that were overridden, and their dispatch happens at compile-time instead of runtime.
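A minimal sketch of those differences, using hypothetical Shape and Square classes that are not from this thread. It assumes Scala 3 and only illustrates the compile-time selection of extension methods versus the runtime dispatch of ordinary members:

```scala
class Shape:
  def name: String = "shape"          // ordinary member: resolved at runtime
  protected def scale: Double = 1.0   // visible only to Shape and its subclasses

class Square extends Shape:
  override def name: String = "square"

extension (s: Shape)
  def describe: String = "a shape"
  // def peek: Double = s.scale       // would not compile: scale is protected

extension (s: Square)
  def describe: String = "a square"

@main def dispatchDemo(): Unit =
  val sq: Shape = Square()   // static type Shape, runtime class Square
  println(sq.name)           // prints square: the override wins at runtime
  println(sq.describe)       // prints a shape: chosen from the static type
```

The second describe is never considered for sq, because the compiler only sees the declared type Shape.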

You’re proposing the overriding of this. I’ve seen it in Kotlin, and IMO, it ends up being confusing. Kotlin, however, also supports specifying the receiver of this in its function types, or in other words, it’s a more general concept, instead of being an exception for one language construct. And they tried running with it for “context receivers” as well, but back-tracked after feedback, so the current proposal for “context parameters” no longer does that.

devlaam · November 15, 2024, 2:31pm

Does he? He says:

The way I read this is that it makes things consistent mainly on the syntactic level: leave out the name and interpret every name in the extension as a public field of the class whenever possible. Nothing more; syntactic sugar, so to speak. But I could be wrong, of course. Anyway, imho, there is not much to gain here.

jducoeur · November 15, 2024, 3:11pm

Agreed that it’s really just syntax sugar. And honestly, I like it – for the simpler and more common cases, I think it’s a tad clearer in saying what it means.

But I suspect that there are a lot of edge cases involving shadowing of names and such that would need to be worked through in order to consider it a fully-baked idea, as well as a precise specification of how it would work. I don’t know if it pulls its weight.

Russ · November 15, 2024, 6:25pm

To clarify, yes I was proposing to make this mean the same as it normally means.

My main use case is to conveniently break up my own larger class constructor source files into smaller, more manageable files.

Not being a compiler developer, it did not even occur to me that there could be issues with access to private or protected fields, but now I see that that could be an issue. For my use case I don’t think it is an issue, but certainly the current named “this” should be maintained even if my proposal is accepted as an optional syntax.

What bothers me about the current syntax is that it is more like the explicit Python “self” convention than the implicit Scala/Java “this” convention. It requires that I modify class methods if I want to pull them out of the main constructor source file, which is time consuming and error prone.
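One partial workaround that exists today, sketched here under the assumption that an extra import line per method is acceptable: importing the members of the named receiver lets a method body read as it would inside the class. The area method is a hypothetical addition for illustration.

```scala
case class Circle(x: Double, y: Double, radius: Double)

extension (c: Circle)
  def circumference: Double =
    import c.*             // brings x, y and radius into scope for this body
    radius * math.Pi * 2   // same expression as it would read inside the class

  def area: Double =
    import c.*             // the import has to be repeated in each method
    math.Pi * radius * radius
```

The receiver still needs a name, and `this` still refers to whatever encloses the extension, so this only removes the `c.` prefixes rather than changing any semantics.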

BalmungSan · November 15, 2024, 6:36pm

TBF, splitting managed files into multiple ones has never been the intention of extension.
Also, depending on who you ask, you are actually making your code harder to maintain and read rather than easier.

jducoeur · November 15, 2024, 6:47pm

Yeah – actually splitting class definitions would be a vastly bigger lift, and AFAIK would break one of the invariants that the language was designed around. (And semantically is dramatically different from extension.) I think that’s a non-starter.

I sympathize with the problem, mind; it’s telling that I have several different workarounds to deal with it, since it happens fairly often. But in general, if a class’ file is getting that big, I take that as a hint that it needs some serious refactoring; the result is usually more comprehensible code.

Russ · November 15, 2024, 11:25pm

I’m not sure how it would make code harder to understand. Yes, it could be confusing to someone who is not aware of the additional source files that implement the class, but a simple search for extension(MyClass) should remedy that problem fairly quickly.

Some of my class definition files are fairly large (e.g., over 1,000 lines), but most of my methods are fairly small. I just have many of them in some classes, and it seems desirable to be able to seamlessly put some of them (the ones that do not require access to private and protected fields) in separate files. I realize that I can break up class definition files by using traits, which I have actually done in some cases, but using extensions seems like it could be cleaner if my proposal is accepted.

Ichoran · November 16, 2024, 6:04am

I think it’s too big of a change in behavior at this point.

Consider this code:

class C():
  extension (s: String)
    def p = s + this
  def q = println("Hi ".p)

Straightforward enough. But now:

class C():
  extension (String)
    def p = this + this  // No, that's not right
  def q = println("Hi ".p)

Okay then, how about

class C() { outer =>
  extension (String)
    def p = this + outer
  def q = println("Hi ".p)
}

That would work. But it’s encoded as, effectively

class C():
  @extension
  def p(s: String) = s + this
  def q = println(p("Hi "))

And, in fact, you can call extension methods that way, even with parameter names:

scala> (new C()).p(s = "Hey ")
val res29: String = Hey rs$line$49$C@77fa31b7
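The same point can be sketched with the Circle example from the top of the thread: an extension method is an ordinary method whose first parameter list is the receiver, so both call styles reach the same definition. The value names here are illustrative only.

```scala
case class Circle(x: Double, y: Double, radius: Double)

extension (c: Circle)
  def circumference: Double = c.radius * math.Pi * 2

@main def callStyles(): Unit =
  val unit = Circle(0, 0, 1)
  val viaDot = unit.circumference    // extension-method call
  val direct = circumference(unit)   // same method, called as a plain function
  println(viaDot == direct)          // prints true
```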

So given that it’s either a change in behavior for existing code (where this is the enclosing class) or inconsistent, and that it makes understanding the desugaring harder, I’m not a fan of the idea.

Russ · November 16, 2024, 7:06am

Your examples are interesting but seem a bit contrived. Don’t nested classes pose the same dilemma? Yet they are allowed in Scala.


class A:
  class B:
    def f = this // which "this" is it?

And how can I access the other “this” from inside f? Am I missing something?

alexelcu · November 16, 2024, 8:17am

Note that the receiver of this is obviously B and you can also refer to A easily:

class A:
  class B:
    def f = A.this
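A runnable variant of the same idea, with hypothetical label fields added so the two receivers can be told apart:

```scala
class A:
  val label = "outer"
  class B:
    val label = "inner"
    def own: String = this.label           // this is the B instance
    def enclosing: String = A.this.label   // A.this is the enclosing A instance

@main def nestedThis(): Unit =
  val a = A()
  val b = new a.B
  println(b.own)         // prints inner
  println(b.enclosing)   // prints outer
```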

sageserpent-open · November 16, 2024, 11:39am

This was discussed before: Combining export and extension - #14 by sageserpent-open and in more detail here: Supporting import in extension method blocks - #38 by Russ - Language Design - Scala Contributors.

Is it really the case that your classes can’t be broken down into separate classes? How many methods are we talking about here? Are these mostly public API methods, or are the majority really just implementation detail?

Split them up into separate abstractions at different levels, or if you must have a giant class, use the composition-and-export trick that was mentioned on the first linked discussion.
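A rough sketch of what composition plus export can look like, reusing Circle from the start of the thread. CircleGeometry is a hypothetical helper class invented for this illustration; it is not taken from the linked discussion.

```scala
// Hypothetical helper holding one group of operations; it could live in its own file.
final class CircleGeometry(radius: Double):
  def circumference: Double = radius * math.Pi * 2
  def area: Double = math.Pi * radius * radius

final case class Circle(x: Double, y: Double, radius: Double):
  private val geometry = CircleGeometry(radius)
  export geometry.{circumference, area}   // forwarders become members of Circle

@main def exportDemo(): Unit =
  val unit = Circle(0, 0, 1)
  println(unit.circumference == 2 * math.Pi)   // prints true
  println(unit.area == math.Pi)                // prints true
```

Callers see circumference and area as ordinary members of Circle, while the implementation sits in a separate class that receives only the data it needs.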

Unless you can solicit buy-in from the compiler team (or you submit a PR to them), either way is going to be quicker than getting a language change made for what seems to be a very one-off use case.

A different take on this is found in this core class that is extended here.

CodeMotionAnalysis defines a single almighty factory method in its companion object, and that breaks down internally into a localised universe of classes and functions. It’s big alright, but there is a decomposition into multiple levels. Very 1970s Pascal-style, but with an IDE it’s straightforward to navigate.

The extension adds the ability to work with the analysis to perform merges: it could have been written as a completely separate abstraction, but I felt that the code spends all its time dealing with a CodeMotionAnalysis, albeit only via the public API, so why not make it an extension?

It is another big blob of code, so folding into the core class would have stretched the size even further. As there was a natural fault line between the two, the extension approach seemed natural.

As there is one factory method in the core class and one in the extension, and both are large, the overhead of the extra access syntax isn’t a bother.

som-snytt · November 17, 2024, 12:52am

If this means, “if you own the code, it should be one big class, not a class plus extensions”, I disagree.

Dotty source has Symbol and

class SymUtils:

  extension (self: Symbol)

where the nomenclature recalls self: Symbol => for self or this type.

Similarly,

object ContextOps:

  extension (ctx: Context)

where ctx usually means

inline def ctx(using ctx: Context): Context = ctx

I haven’t looked at whether the extensions mark a boundary between “core functions” and “operations expressed in terms of core functions”, but it appears to be a blessed idiom.

Maybe it’s just a way of managing the namespace, or avoiding field bloat.

One issue with anonymous extension (Circle) is that it’s not callable as a “regular method”, circumference(c). Some other topic recently mentioned anonymous parameters, so maybe that would be a feature.

I just typoed circumfurence, suggesting that when you get out of your favorite armchair covered in cat hair, it’s called transfurence.

Russ · November 17, 2024, 9:07pm

Glad to hear I’m not the only one who thinks that way. As I see it, there are at least two reasons for breaking up a large class definition into multiple files. One is just the unwieldiness of navigating large files, although I guess that is not as much of an issue if you use a good IDE. The other is that a single class may serve different projects, and you may want to keep those projects separated.

Let me give an example from my own work. I have a case class that I call Route, which represents a route to be flown by an aircraft. It is based on discrete waypoints and a specified radius for each turn. It defines a curvilinear coordinate system with coordinates of along-track and cross-track position (not altitude).

I just did a quick check, and it has 165 methods and companion functions on 1014 lines of code (including blank lines). It implements a trait that I call RouteChange, which is dedicated to maneuvers for resolving conflicts. The trait’s file has 57 methods and functions on 1214 lines. I used the RouteChange trait to break up what would have been a file of over 2000 lines into two files.

It seems a bit awkward, however, that the Route class extends the RouteChange trait when in principle it is really the other way around. But that is just a technicality I suppose. I could just use more traits to further break up the files, but I was hoping that extensions might be a more elegant way to do it.

I have used this Route class in research prototype software for two different airspace types, terminal and urban airspace, and it could also apply to other airspace types. Terminal airspace is the airspace within about 30 miles of a major airport, and urban airspace is for the new concept of urban air mobility. In any case, some of the methods of the Route class apply to both of these airspace types, but some apply only to one or the other.

I could use OO inheritance for the different airspace types, but my experience is that that causes all kinds of subtle problems and should only be used as a last resort. But I still want to keep the methods for terminal airspace in separate files from the methods for urban airspace, and that is where I had hoped that extensions might be useful and preferable to traits.
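With the current named-receiver syntax, that separation might look roughly like the sketch below. The single-field Route, the object names, and the methods finalFix and hopCount are all hypothetical stand-ins, not the real class described above.

```scala
final case class Route(waypoints: Vector[String])

// Terminal-airspace operations; these could live in their own file.
object TerminalRouteOps:
  extension (r: Route)
    def finalFix: Option[String] = r.waypoints.lastOption

// Urban-airspace operations; these could live in a different file.
object UrbanRouteOps:
  extension (r: Route)
    def hopCount: Int = math.max(r.waypoints.size - 1, 0)

def terminalReport(r: Route): String =
  import TerminalRouteOps.*   // only the terminal methods are in scope
  r.finalFix.getOrElse("no waypoints")
  // r.hopCount               // would not compile here: UrbanRouteOps is not imported
```

Each project imports only the object it needs, so the two groups of methods stay apart without any inheritance.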

I am always open to suggestions if anyone has any.

jducoeur · November 17, 2024, 10:34pm

If it’s a normal case class, with just public immutable data, the other obvious way to break it down is with type classes – that’s how I’d likely handle it. Look at the various collections of functional behavior, and define each as a distinct type class.

The notion that behavior is bound at the hip to the data is very much the OO view of the world. The FP style tends to separate them, with data types that are more about the data, and behavior pulled out to type classes that operate on top of that data.

sageserpent-open · November 18, 2024, 6:14am

What I’d be asking myself, is roughly…

How many of these methods are public (or couldn’t be made more restrictive based on who calls them)?

Do all of your projects deal with roughly the same public methods or are there distinct groups of methods for differing projects?

How many of these methods are short helpers to other methods?

How many functions are there (either local within method bodies or freestanding)?

How many data fields are there in Route/RouteChange?

Do all the methods work on pretty much all of Route/RouteChange’s data fields, or do different clumps get worked on by groups of different methods?

How many instances of Route do your various projects work with - multiple (presumably), or just one?

Do some of the data fields only have validity for short durations while certain methods execute, but not over the entire lifespan of the Route instance?

Do some of the methods concern themselves much more with the argument and result data than the data fields of Route/RouteChange?

Could a Route compose a RouteChange internally, or just instantiate route changes on the fly? It sounds to me like a route change is a transient thing that happens zero or many times for a given route. Perhaps a route change should take one route and yield another?

Are most if not all of the data fields in Route/RouteChange immutable? If so, could they be passed as implicit context to functions that need them? (Or use Reader if you’re a fan of monadic code).

How many other classes or freestanding functions does a route instance interact with, or rely on in its implementation? I mean, other than the standard library things.
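Two of the questions above can be sketched in code: a route change as a value that takes one route and yields another, and immutable route data passed as a context parameter. The single-field Route, InsertWaypoint, and legCount are hypothetical simplifications, not the design of the real classes.

```scala
final case class Route(waypoints: Vector[String])

// A change is a small value that maps one route to another.
final case class InsertWaypoint(index: Int, name: String):
  def applyTo(route: Route): Route =
    route.copy(waypoints = route.waypoints.patch(index, Vector(name), 0))

// Immutable route data passed as a context parameter rather than as a receiver.
def legCount(using route: Route): Int = math.max(route.waypoints.size - 1, 0)

@main def routeChangeDemo(): Unit =
  val original = Route(Vector("A", "C"))
  val changed = InsertWaypoint(1, "B").applyTo(original)
  println(changed.waypoints)         // prints Vector(A, B, C)
  println(legCount(using changed))   // prints 2
  println(original.waypoints)        // prints Vector(A, C): the original is untouched
```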

I’m reciting the standard OOP 101 here, forgive me if this is pitched at the wrong level, but I can’t help but feel that something is awry here in the design. Of course, working software is precisely that, design notwithstanding, but again, I’d urge you to consider a breakdown by classes and composition / delegation, functions, type classes or whatever.

That’s as much as I can usefully offer, so I’ll bow out of this thread and leave you to the tender mercies of the other participants. Happy hacking…

Russ · November 18, 2024, 5:54pm

Thanks for the feedback. You’ve given me a lot to think about.

Russ · November 18, 2024, 7:54pm

I found this brief description of type classes based on extensions:

Is this what you are suggesting?

jducoeur · November 18, 2024, 10:06pm

I mean, type classes in Scala long predate extension methods – hadn’t even been thinking of them as related since I still mainly work in 2.13. But yes, in Scala 3 they do wind up closely associated this way.

Basically, you can think of a type class as an “interface” collecting related extension methods; that’s a little imprecise, but close enough for thinking about it.
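As a rough illustration of that "interface of extension methods" view, here is a sketch written in the Scala 3.3-era `given ... with` form. TerminalOps, finalFix, arrivalLabel, and the single-field Route are hypothetical names chosen for the example.

```scala
final case class Route(waypoints: Vector[String])

// The type class: an interface whose members are extension methods.
trait TerminalOps[A]:
  extension (a: A) def finalFix: Option[String]

// Instance for Route; it can live in a different file from Route itself.
given TerminalOps[Route] with
  extension (r: Route) def finalFix: Option[String] = r.waypoints.lastOption

// Generic code asks only for the capability, not for a concrete class.
def arrivalLabel[A: TerminalOps](a: A): String =
  a.finalFix.getOrElse("unknown")

@main def typeClassDemo(): Unit =
  println(arrivalLabel(Route(Vector("A", "B"))))   // prints B
```

Each group of related behavior gets its own trait and instance, so the data class itself stays small.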

That’s the orthodox FP way of decomposing code, and I’d likely use a number of them if I was in your situation.

BalmungSan · November 18, 2024, 10:10pm

I once wrote a little bit about what typeclasses are and what problem they solve here: Polymorphism in Scala. · GitHub

Hopefully that is useful; as always, feel free to ping me with questions and/or feedback.