Counterintuitive semantics of overridden vals

Bersier · January 30, 2024, 10:25pm

I just found out that the initialization of overridden vals still gets executed. Am I the only one who finds this unexpected or didn’t know about this? It looks like a Scala pitfall to me.

class A:
  val foo: Int =
    println("heavy computation and/or side effect")
    42

class B extends A:
  override val foo: 42 = 42

new B // Prints "heavy computation and/or side effect"

I guess that, unless a val member is abstract or effectively final, it should always be declared as lazy.

Otherwise, an overriding implementation might want to do some computation to set the val, which then itself might be overridden by another implementation, leading to wasted computation (or unwanted side effects).
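A minimal sketch of the lazy variant, assuming Scala 3. The wrapper object `LazyOverrideDemo` and the `log` buffer are hypothetical names that only make the side effect observable. Note that Scala requires a lazy val to be overridden by another lazy val, so the override has to be declared lazy as well.

```scala
import scala.collection.mutable.ArrayBuffer

object LazyOverrideDemo:
  // Records side effects so they can be inspected after construction.
  val log = ArrayBuffer.empty[String]

  class A:
    lazy val foo: Int =
      log += "heavy computation and/or side effect"
      42

  class B extends A:
    // A lazy val may only be overridden by another lazy val.
    override lazy val foo: Int = 42

  // Builds a B, reads foo, and returns the value plus any recorded side effects.
  def run(): (Int, List[String]) =
    log.clear()
    val b = new B
    (b.foo, log.toList)

@main def lazyOverrideDemo(): Unit =
  val (value, events) = LazyOverrideDemo.run()
  println(s"foo = $value, events = $events") // foo = 42, events = List()
```

Since the initializer in A is only reachable through the accessor that B overrides, it never runs for an instance of B.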

som-snytt · January 30, 2024, 11:51pm

Yes, you are the only one who has mentioned this on the forum today.

I think it’s a form of “brittle inheritance”. The FAQ originally had one question about initialization semantics, and in fact I intend to update it with information about interactions with companion objects, but I’ll try to incorporate this behavior as well.

spamegg1 · January 31, 2024, 8:13am

I ran into this issue a lot when solving Advent of Code 2023 (there are many heavy computations there), so I declared all vals to be lazy. But I did not find it surprising or unexpected (maybe I should have?), since I expected all val declarations to always be evaluated no matter where they are (somewhere deep in my memory I vaguely remember Odersky mentioning it in one of the online courses, or maybe it’s left over from my Standard ML years?)

devlaam · January 31, 2024, 1:44pm

No, this behaviour seems perfectly normal to me. If you construct an instance of B, first an instance of A is created:

class A:
  println("construct A")
  val foo: Int =
    println("heavy computation and/or side-effect")
    42

class B extends A:
  println("construct B")
  override val foo: 42 = 
    println("more work")
    42
 
new B

gives:

construct A
heavy computation and/or side-effect
construct B
more work

Since val foo is defined in the constructor of A, it can be expected to be executed. Otherwise, what should you do in B? With just new B we do not call foo either, so should we leave it uninitialised?

But there is also another issue. If you define it as a lazy val by default, you pay a synchronization cost any time you call foo, not only the first time (every access has to check the initialization state in a thread-safe way). So this is overhead as well.

MartinHH · January 31, 2024, 4:52pm

Note that with Scala 3, one can use the @threadUnsafe annotation to opt out of that (which of course introduces the risk that the initialization gets evaluated more than once). See “The @threadUnsafe annotation” in the Scala 3 reference.
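A small sketch of how the annotation is applied, assuming Scala 3. `Report` and `initCount` are hypothetical names used only to count how often the initializer runs. The example is single-threaded, so the initializer runs exactly once here; with concurrent access that is no longer guaranteed.

```scala
import scala.annotation.threadUnsafe

// `Report` and `initCount` are hypothetical names for this sketch.
class Report:
  var initCount = 0

  // Initialized on first access, without the thread-safe initialization protocol.
  @threadUnsafe lazy val foo: Int =
    initCount += 1
    42

@main def threadUnsafeDemo(): Unit =
  val r = new Report
  println(r.foo + r.foo) // 84
  println(r.initCount)   // 1 (single-threaded access)
```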

som-snytt · January 31, 2024, 5:01pm

I have a different notion of normal.

This is the usual question about the interaction of constructors and initialization:

class A:
  val foo: Int =
    println("heavy computation and/or side-effect")
    42
  println(s"constructed A $foo")

class B extends A:
  override val foo: 42 =
    println("more work")
    42
  println(s"constructed B $foo")

@main def test() = new B

produces

scala demo.scala
heavy computation and/or side-effect
constructed A 0
more work
constructed B 42

I would object to the phrase, “first an instance of A is created”. One may say, “first the constructor of A is executed”. But what does that even mean? It doesn’t mean that the code I wrote happens, because my foo is uninitialized.

While it is possible to wrap one’s head around this behavior, it is considered normal only in Stockholm.

A compiler can provide linting support for initialization, but I think an omniscient IDE is best suited for helping here.

MartinHH · January 31, 2024, 5:35pm

I’d agree that this behavior is not what everyone would expect, and there are languages where this works differently.

Here’s a C++ variant of your example:

#include <iostream>

using namespace std;

class A {
public:
    A() {
        foo_ = createFoo();
        cout << "constructed A " << foo() << endl;
    }
    virtual int foo() {
        return foo_;
    }
private:
    int createFoo() {
        cout << "heavy computation and/or side-effect" << endl;
        return 42;
    }
    int foo_;
};


class B : public A {
public:
    B() {
        foo_ = createFoo();
        cout << "constructed B " << foo() << endl;
    }
    int foo() override {
        return foo_;
    }
private:
    int createFoo() {
        cout << "more work" << endl;
        return 42;
    }
    int foo_;
};


int main() {
    B b{};

    return 0;
}

produces

heavy computation and/or side-effect
constructed A 42
more work
constructed B 42

Because in C++, while the constructor of A is running, the object's dynamic type is still A: the B part has not been constructed yet, so the virtual call to foo() from A's constructor resolves to A::foo().

devlaam · January 31, 2024, 6:08pm

I see what you mean. Well, in any case, the code you wrote did “happen”, because the println("heavy computation and/or side-effect") told you so. Only, when you try to read the result with $foo it does not work, because there is an override val foo present in class B. Without the override, there is no problem.

When I think about it a bit more, this is actually the worst you can get. You do not get the value of A.foo, which is (or could be) there, but you cannot reach the value of B.foo yet, so you are presented with the uninitialised value?! For an integer this is relatively harmless; for an object it implies a null pointer exception.

I change my opinion from “perfectly normal” to “is surprising”. But this refers to the part described above, not the work done inside the definition of val foo in class A.
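The object case can be sketched as follows, assuming Scala 3 with default compiler settings. The wrapper object `NullInitDemo` and the members `name` and `nameLength` are hypothetical names for illustration. While A's constructor runs, `name` is read through the accessor that B overrides, and B's field has not been assigned yet.

```scala
import scala.util.{Failure, Success, Try}

object NullInitDemo:
  class A:
    val name: String = "a"
    // `name` dispatches to B's accessor here, which still returns null.
    val nameLength: Int = name.length

  class B extends A:
    override val name: String = "b"

  // Wraps construction so the failure can be inspected instead of crashing.
  def run(): Try[B] = Try(new B)

@main def nullInitDemo(): Unit =
  NullInitDemo.run() match
    case Failure(e) => println(s"construction failed: ${e.getClass.getName}")
    case Success(_) => println("constructed without error")
```

Constructing a plain `A` works fine; only the overriding subclass triggers the failure.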

Bersier · January 31, 2024, 9:06pm

Here is the reasoning that makes it seem counterintuitive to me:

The whole point of vals is to be read. Since the overridden val is not readable, why is its initialization being executed? (Im)purely for the side effects? Then those side effects should not be part of that initialization code in the first place.

Another way to think of it: is there any case where the current semantics is advantageous for the programmer? For properly written code, I don’t think so.

Also notice the override keyword, which to me implies the parent behavior (why would side effects be excluded?) is overridden.

devlaam · January 31, 2024, 10:05pm

This seems a reasonable point of view, since you cannot reach the value of the val foo in class A anymore, not even with super.foo:

class A:
  println("construct A")
  val foo: Int = 42
  
class B extends A:
  println("construct B")
  override val foo: Int = super.foo + 1 // error: super may not be used on value foo

println((new B).foo)

Which, in fact, is also a surprise to me: what’s wrong with super.foo in the case of a val? With def foo in class A it is not a problem.
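For comparison, a minimal sketch of the def variant, assuming Scala 3. The wrapper object `SuperDefDemo` is a hypothetical name. Because foo is a method in A, super.foo is accepted, and the val in B can build on it.

```scala
object SuperDefDemo:
  class A:
    // A method rather than a val, so subclasses may call super.foo.
    def foo: Int = 42

  class B extends A:
    // A val may override a def; the parent's result is still reachable.
    override val foo: Int = super.foo + 1

@main def superDefDemo(): Unit =
  println((new SuperDefDemo.B).foo) // 43
```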

So, in the end, I agree with you, but for a different reason. I thought you’d be able to reach foo in A in some manner, in which case it had better be initialised. But if not … (are we not forgetting some sneaky way?)

Also, the possibility of getting a null pointer exception from a call on a value inside class A that is overridden is yuck! I like the solution in C++ much better.