Guarantees of `this.type`

foobar · September 29, 2021, 5:17pm

I wanted to ask for clarification on the return type declaration of methods that return this.type:

Does it guarantee “return value .getClass will be identical to this.getClass, and dear compiler, please use the same static type as for the input” or does it guarantee "return value will be the identical JVM object to this"?

Example that works:

trait MaybeCopy {
  def maybeCopy: this.type
}
trait Other {
  def foo: Int
}
final class Impl extends MaybeCopy with Other {
  def foo: Int = 1
  def maybeCopy: this.type = (new Impl).asInstanceOf[this.type]
}
val x: Other with MaybeCopy = new Impl
x.maybeCopy.foo

Example that does not work:

final class Impl extends MaybeCopy with Other {
  def foo: Int = 1
  def maybeCopy:this.type = (new Impl)
}
cmd11.sc:3: type mismatch;
 found   : ammonite.$sess.cmd11.Impl
 required: Impl.this.type
def maybeCopy:this.type = (new Impl)
                           ^
Compilation Failed

So either scalac is overly cautious when type checking (note that the class was final – Impl can always be safely upcast to whatever static type the user had attached to this) or I am misusing this.type and optimizations could break my code (would it be legal for the Scala compiler to discard the return value of x.maybeCopy and reuse x for the second call)?

sangamon · September 29, 2021, 5:41pm

It’s the (singleton) type of this instance. Types != Classes.

BalmungSan · September 29, 2021, 6:22pm

AFAIK there is only a single value that you can return when the return type is this.type, and that value is this.

tpolecat · September 29, 2021, 9:25pm

It looks like you’re trying to say “my type” or “the implementation type” which Scala doesn’t give you a way to express directly (nor does any other language as far as I know). As noted by @BalmungSan the only term that conforms with this.type is this.

But there are two ways to express what I think you’re getting at: f-bounded types and typeclasses. I discuss these in an extremely tiresome blog post that you might find helpful/irritating.
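For readers who have not seen the f-bounded variant, here is a minimal sketch of what it could look like for the example above. The names Copyable and FBoundedSketch are made up for illustration and are not taken from the blog post; the point is only that the implementation type becomes a type parameter, so no cast is needed.

```scala
object FBoundedSketch {
  // A stands for the implementation type (hypothetical trait, for illustration only).
  // The self-type forces every implementer to pass itself as A.
  trait Copyable[A <: Copyable[A]] { self: A =>
    def maybeCopy: A
  }

  trait Other { def foo: Int }

  final class Impl extends Copyable[Impl] with Other {
    def foo: Int = 1
    // Returns a fresh instance; the declared type is Impl, so no asInstanceOf.
    def maybeCopy: Impl = new Impl
  }

  def main(args: Array[String]): Unit = {
    val x = new Impl
    val y = x.maybeCopy
    println(y.foo)  // 1
    println(x eq y) // false
  }
}
```

The cost is that the type parameter shows up in every signature that mentions the trait, which is one reason the typeclass encoding is often preferred.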

foobar · September 29, 2021, 9:50pm

I mean, it can be done with implicits without any issues:

implicit class MaybeCopyHelper[T <: MaybeCopy](val x: T) {
  def maybeCopy: T = x._maybeCopy.asInstanceOf[T]
}

trait MaybeCopy {
  // could be e.g. package private
  def _maybeCopy: Any
}

final class Impl extends MaybeCopy with Other {
  def foo: Int = 1
  def _maybeCopy: Any = (new Impl)
}
val x: Other with MaybeCopy = new Impl
x.maybeCopy.foo

So the real question would be: Is the this.type or the implicit class MaybeCopyHelper better? Is e.g. cloning of the object a violation of the this.type promise?

Can the next patch release of scalac break all my code because “this.type means this, let’s optimize the code!”? That is, is it “somewhat non-idiomatic” or is it UB?

(I’d really like to be able to express “if this method returns, then it will be the this pointer, on pain of UB / miscompile / complaints when testing with UBSAN” to give the compiler license to discard return values in e.g. fluent builders – is this what this.type means?)

Of course the set of possible return values of a generic implementation of a method that promises to return something that can be cast to the same type is limited – null if one sits below AnyRef; this works always; this.clone() also works always, via JVM intrinsic (but may throw).

The only other way is to limit inheritance (e.g. sealed or final) or inform people that they MUST override the method when subclassing. I am happy to duplicate the implementation and make all non-abstract classes below the interface final.

“As noted by @BalmungSan the only term that conforms with this.type is this.”

null conforms without warnings as well:

 class baz{def x:this.type = null}

Is this then a compiler bug?

BalmungSan · September 29, 2021, 10:38pm

It is probably not; I just never think about null except when interacting with Java code.

this.type is erased to the enclosing class AFAIK, so from a practical point of view returning a different instance of the same underlying class is always valid.
However, IMHO, the point of this.type is to say: “this method will execute some side-effect and then will return itself so you can chain more methods”, so basically it is perfect for a builder.
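A minimal sketch of that builder use, with hypothetical class names (Request, JsonRequest). It shows why this.type is handy here: a method declared on the base class still returns the caller's static type, so subclass methods stay available in the chain.

```scala
object BuilderSketch {
  // Hypothetical mutable builder, for illustration only.
  class Request {
    protected var parts: Vector[String] = Vector.empty

    // Mutates this builder and returns this very instance.
    def header(name: String, value: String): this.type = {
      parts :+= s"$name=$value"
      this
    }

    def render: String = parts.mkString(";")
  }

  class JsonRequest extends Request {
    def body(json: String): this.type = {
      parts :+= s"body=$json"
      this
    }
  }

  def main(args: Array[String]): Unit = {
    // header is declared on Request, yet body is still callable afterwards.
    val r = new JsonRequest().header("Accept", "json").body("{}")
    println(r.render) // Accept=json;body={}
  }
}
```

Had header been declared as returning Request, the call to body would not compile.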

Now, it seems that what you want to do is return the “current” type; as suggested by Rob, the best / safest option is always a typeclass. Although check this SO post for more discussion about it.

Anyway, both of your approaches are unsafe due to the use of asInstanceOf, especially the second one that returns Any; while you may argue that you know all the code and you are sure all usage is correct, my humble opinion is that the idea of having a strong type system is to let the compiler verify that for you instead of having to be sure.

SethTisue · September 29, 2021, 10:53pm

This is covered at Scala FAQ | Scala Documentation

foobar · September 30, 2021, 11:24am

I don’t see how typeclasses solve the problem?

Consider

{
  trait Clonable[+A] { def cloneMe: A }
  trait MixIn { def f: Int = 1 }
  trait Base extends Clonable[Base]
  final class Impl extends Base with MixIn with Clonable[Impl] { def cloneMe: Impl = this }
  val a: Base with MixIn = new Impl
  a.cloneMe.f
}
cmd51.sc:6: value f is not a member of ammonite.$sess.cmd51.Base

The problem is that the typeclass-based cloneMe does not preserve static knowledge about additional mixins above Base. The relevant mixins / traits can be sealed, but I’m not gonna enumerate 2^N mixin combinations in the source.

I don’t particularly care about the compiler type-checking my code – so going with the implicit and checked casting is safe enough for me.

I have been bitten often enough by UB that I’m wary of lying to optimizing compilers, especially since that is so hard to test / repro / debug (in llvm-based languages).

So again, does the following have well-defined semantics (possibly faulty or throwing exceptions, that’s my job)?

final class Foo{def copyMe:this.type = (new Foo).asInstanceOf[this.type]}

Or is it UB and the compiler (or the runtime) will spawn bats in my nose (and importantly behave differently in unit-tests and prod)?

BalmungSan · September 30, 2021, 1:40pm

That is not a typeclass; that is just traditional inheritance. This is what a typeclass would look like.
Although it seems this approach will not work for your use case.

Not sure why you are using Scala then, but whatever.

Not sure what “UB” means? Unacceptable Behaviour maybe? If so, what does that mean?

Not sure what you mean with “well-defined semantics” in this case.

The Scala compiler doesn’t do many optimizations AFAIK; most optimizations will be done by the JVM at runtime, but they should not affect the semantics of your code.

As I said before, AFAIK this.type is erased to whatever class this refers to at that moment, so producing a new value of the same underlying class should be valid at the bytecode level. So if you are happy with your current code with the asInstanceOf casts then go ahead; the compiler will not change your code.
What may go wrong is either a bad asInstanceOf call, which a good unit test should catch, or a user assuming that the method returned this and therefore discarding the output (but from what you have said, I doubt that worries you).
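Both points can be checked with a small experiment. This is only a sketch with a hypothetical Counter class, and it describes the behaviour I would expect on Scala 2, which is what this thread is about: the erased return type is the enclosing class, the cast goes through at runtime, and a caller who trusts this.type silently loses the result.

```scala
object ErasureSketch {
  // Hypothetical class, for illustration only.
  final class Counter(var n: Int) {
    // Lies to the compiler: a fresh instance is passed off as this.type.
    def bumped: this.type = (new Counter(n + 1)).asInstanceOf[this.type]
  }

  def main(args: Array[String]): Unit = {
    val c = new Counter(0)

    // Statically d is "the same instance as c"; at runtime it is not.
    val d: c.type = c.bumped
    println(d eq c) // false
    println(d.n)    // 1

    // A caller who trusts this.type may drop the result and keep using c.
    c.bumped
    println(c.n)    // 0

    // In the bytecode the method simply returns Counter.
    val erased = classOf[Counter].getMethod("bumped").getReturnType
    println(erased == classOf[Counter]) // true
  }
}
```

No exception is thrown anywhere, because the only check the JVM can make is on the class, and the class is right.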

sangamon · September 30, 2021, 2:10pm

You are not using type classes. A type class based approach (as laid out in detail in @tpolecat’s blog post) to your example might rather look like this:

trait Cloneable[A] {
  def cloneMe(a: A): A
}

implicit class CloneableSyntax[A : Cloneable](a: A) {
  def cloneMe(): A = implicitly[Cloneable[A]].cloneMe(a)
}

trait MixIn { def f: Int = 1 }
trait Base

implicit def baseCloneable[A <: Base]: Cloneable[A] = identity

final class Impl extends Base with MixIn
val a: Base with MixIn = new Impl
a.cloneMe().f

Not sure whether this covers your use case.

Scala may not be the right language choice, then.

foobar · September 30, 2021, 4:51pm

Better than java, and not my choice unless I want to look for another job?

But I exaggerated: I don’t care that the compiler typechecks the implementation of the maybe-copy-interface (that’s like 5 lines of code), but I care very much that the compiler type-checks uses of the interface (which are by different people and all over the place).

“Undefined behavior”. In e.g. C,

extern void bar(int);

int foo(){
    for(int i = 1<<30;; i++) {
        if(i < 0) return -1;
        bar(i);
    }
    return 0;
}

the function bar will of course be called with negative arguments, because signed integer overflow is undefined in C – meaning the compiler can assume that it does not happen (“what may not be, cannot be”), and can use that knowledge to optimize out my careful runtime check. If the axioms of the compiler are violated, you get the usual “ex falso quodlibet” consequence that your entire logical edifice crashes down. UB is the technical term for that.

@sangamon so your proposal is ultimately equivalent to my previous implicit-based approach? I.e. I have some dynamically dispatched function that performs the right clone-implementation (either identity or make a copy and wire external refs), and then do a .asInstanceOf cast?

But I still don’t understand whether this.type as a return type means “returns the same type” or means “returns the same instance”. Is final class Foo{def copyMe:this.type =null} valid? Is final class Foo{def copyMe:this.type = (new Foo).asInstanceOf[this.type]} valid?

BalmungSan · September 30, 2021, 5:10pm

It means the same instance; the only valid value of type this.type is this.
You can of course trick the compiler by using null, doing type casts, throwing exceptions, using reflection, etc.
Most Scala developers will just never do that.

Again, if you are sure your code will always be correct and are ok with the cast go ahead.
The problem may be that a user may assume that this.type was used correctly thus would do something like:

val foo = ???
foo.clone() // Since clone returns this.type then this should have only done a side effect and return foo
// Thus I keep using foo
foo.bar()

However, given the name clone it is very clear it will not return the same value.
So ultimately, a method named clone that returns this.type doesn’t make sense on a semantic level.

But it is clear that you want to use it to refer to a value of the same type as this rather than the singleton type of this.
Which again, is “OK” as long as people know that.

foobar · September 30, 2021, 5:47pm

Thanks a lot! I think this answers my question.

To paraphrase:

The Scala 2 compiler won’t assume that this.type means this-instance, and will therefore not produce surprising results after optimization / inlining. If the project ever migrates to Scala 3, we’ll need to revisit the issue.
The language and community and conventions really expect this.type to mean this-instance instead of “same type”.
The implicit-based way of declaring the interface is therefore preferable.

SethTisue · September 30, 2021, 6:12pm

It’s the language. Community and conventions don’t enter into it.

This comes up a lot, so I’ve submitted FAQ: small improvement in 'this type' answer by SethTisue · Pull Request #2195 · scala/docs.scala-lang · GitHub to foreground it in the FAQ answer.

BalmungSan · September 30, 2021, 6:35pm

With that do you mean a typeclass? If so yes… but bear in mind that there is a lot of semantics in place there. Check this for a detailed discussion of the differences.

Great!

foobar · September 30, 2021, 9:07pm

Is final class Foo {def copyMe: this.type = null} ok or is it a compiler bug?

I misspoke.
In the existing implementation of the Scala language / compiler, this.type means “same static type”, while in the spec, it means “same instance or null”?

About the docfix, I’d suggest to:

document that lying to the compiler about the type can cause exceptions, but not undefined behavior in the C sense
be explicit about whether or not null inhabits this.type
document the simple solution (which does not help to implement methods returning the same type, but which does allow writing interfaces that return the right type)

trait PackagePrivateFoo { def doTheThingPrivate: Any }
// static wrapper that does all the unsafe casting
def doTheThing[T <: PackagePrivateFoo](x: T): T = x.doTheThingPrivate.asInstanceOf[T]
// or as implicit, if it makes for nicer syntax
implicit class DoTheThingExt[T <: PackagePrivateFoo](val x: T) extends AnyVal {
  // no parameters here: the receiver x is the wrapped value
  def doTheThing: T = x.doTheThingPrivate.asInstanceOf[T]
}

since people ask about this all the time, maybe one should consider adding language support, like e.g. this.staticType?

SethTisue · September 30, 2021, 10:59pm

No, absolutely not. That is not the right way to think about this or talk about this.

this.type is designed, specified, and implemented as meaning “same instance”, always has been, full stop. Once null enters the picture, perhaps there is some inconsistency — I haven’t studied that. But not otherwise. I will disregard null for the remainder of this post.

And: asInstanceOf is designed, specified, and implemented as allowing you to lie to the compiler and get violations of what would otherwise have been hard compile-time guarantees.

As a general rule, the JVM doesn’t have “undefined behavior in the C sense”.

I believe you are confused about the difference between types and classes, and about the difference between compile-time and run time.

The JVM is a safe runtime, which means that a field declared as holding instances of class A cannot possibly ever hold an instance of class B unless B is a subclass of A (or unless there is a bug in the JVM itself). Attempting to violate this results in an exception at runtime (if the bytecode even makes it past the verifier in the first place).

However, there is no such more general guarantee about types. Types are a compile-time concept, classes are a runtime concept. At runtime, the JVM guarantees that something can’t be the “wrong” class. But it can’t guarantee that something is the “right” type, because types don’t exist at runtime. The compiler is the only thing that can guarantee that something is the right type — unless you lie to it with asInstanceOf.

When I say that this.type means “same instance”, I say it means that at compile time. At runtime, there is no such guardrail. The runtime guardrails that exist are about classes, not types, because types don’t exist at runtime. I can’t say that too many times.
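To make the types-versus-classes distinction concrete, here is a small sketch (object name hypothetical). It assumes nothing beyond the standard library and shows that a lie about a type goes unnoticed until a class check happens to run, while a lie about a class fails right at the cast.

```scala
import scala.util.Try

object TypesVsClasses {
  def main(args: Array[String]): Unit = {
    val ints: List[Int] = List(1, 2, 3)

    // Lie about a type argument: List[Int] and List[String] are the same
    // class at runtime, so there is nothing for the JVM to check.
    val strs: List[String] = ints.asInstanceOf[List[String]]
    println(strs.length) // 3

    // The lie surfaces only where a class check is emitted,
    // here when an element is actually used as a String.
    println(Try(strs.head.length).isFailure) // true

    // Lie about a class: the JVM rejects it at the cast itself.
    val any: Any = "text"
    println(Try(any.asInstanceOf[java.lang.Integer].intValue).isFailure) // true
  }
}
```

A cast to this.type falls into the first category: the class is right, so the runtime has no objection, and only the compile-time guarantee is gone.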

SethTisue · September 30, 2021, 11:29pm

By the way, confusion of this sort is quite common; see this FAQ entry (and linked blog post): Scala FAQ | Scala Documentation

foobar · October 1, 2021, 10:46am

Yeah, but an optimizing compiler for some language targeting JVM might have UB in the C-sense. If you compile C to llvm and run it via sulong in the JVM, then C-style UB will bite.

Scalac is an optimizing compiler targeting the JVM, and it is happy to mess up semantics of its output if some core scala language assumptions are violated, so I’m wary of “hard lies”.

E.g. inlining – that works under the specific assumption that “run-time dependency versions are the same as compile-time dependency versions”. If that assumption gets violated, then mayhem ensues, e.g. because I compile against some dependency and later switch it out in the classpath when running the program. I guess that’s why the inlining config for scalac is as complex as it is.

So my question about UB was: Is the this.type lie benign, like using reflection to mutate final fields in java (well defined), or is it like using Unsafe to update a final field (lol nope, likely to become UB in the future, cf eg https://bugs.openjdk.java.net/browse/JDK-8132243) or like lying to scalac about inlining (won’t blow up the JVM, but will have results that depend on compiler inlining/optimization heuristics, so I’ll call it UB).

I guess you answered that to my satisfaction (it’s a benign lie).

SethTisue · October 1, 2021, 12:40pm

Yeah. Maybe you’ve seen it already, but this blog post is a deep dive into that: Scala Inliner and Optimizer | Lightbend

Agree.

(Fsvo “benign”, anyway. And this is where community/convention does enter into it: you might consider it “benign” technically in the sense of probably having predictable behavior in normal scenarios. But unless it’s a one-person project, I would never use this.type like that, and I would be highly taken aback and object strongly if I encountered it in project code at work.)