Why does a Scala class keep references to its constructor parameters?

Recently I started decompiling my .class files because I was curious about the code Scala generates.

To my surprise, Scala keeps references to all the constructor parameters, even if:

  1. it’s a normal class, not a case class;
  2. the parameter is not marked as val;
  3. the parameter is not referenced in any methods or lambdas and is only needed for class construction.

For example:

class A(str: String) {
   val scream = str.toUpperCase + "!"

   def screamTenTimes: Unit = for (_ <- 1 to 10) println(scream)
}

Here, I see str assigned to a class field even though it is not used outside the constructor. I would assume this is not great for garbage collection and memory usage.

Why is it done like this? Is this a bug, or is it specified behaviour?

Are you sure about that? Maybe the snippet is incomplete, but for this snippet no str field is stored.
Under javap, the .class is shown as:

Compiled from "test.scala"
public class A {
  private final java.lang.String scream;
  public A(java.lang.String);
  public java.lang.String scream();
  public void screamTenTimes();
  private final void screamTenTimes$$anonfun$1(int);
  private static java.lang.Object $deserializeLambda$(java.lang.invoke.SerializedLambda);
}

The same behaviour is present in Scala 2.12, 2.13 and Scala 3.

I am sure, but the example is a bit more complex than the demo example in my original post.

If I manage to show this, would that be considered a bug in the compiler? Does the spec guarantee that class parameters are not unnecessarily kept, or is this just an optional thing on the part of the compiler?

Welcome to Scala 3.8.2 (25, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.

scala> import annotation.*

scala> class C(s: String) { val scream = s"$s!" }
// defined class C

scala> class C(@constructorOnly s: String) { val scream = s"$s!" }
// defined class C

scala> class C(@constructorOnly s: String) { def scream = s"$s!" }
-- Error: ----------------------------------------------------------------------
1 |class C(@constructorOnly s: String) { def scream = s"$s!" }
  |                         ^
  |   s is marked `@constructorOnly` but it is retained as a field in class C
1 error found

scala>

Use @constructorOnly, as in the session above, to guarantee it. The spec does not mandate it; it is an implementation detail.

Aliasing in a class hierarchy may incur extra fields if care is not taken.

Thanks, using this annotation I managed to debug the issue. This code fails the @constructorOnly check.

class A(@constructorOnly str: String) {
	val scream = Array.fill(10)(str + "!")

	def screamTenTimes: Unit = scream.foreach(println)
}

I still do not understand why it has to be a class field. Why can't the closure just capture the constructor parameter, rather than having it stored as a class field?

Array.fill takes a by-name parameter to compute values, which is a closure.

That is visible with -Vprint:flat (which shows the tree in a form closer to bytecode):

scala> class C(s: String) { val m = Array.fill(10)(s) }
[[syntax trees at end of MegaPhase{dropOuterAccessors, dropParentRefinements, checkNoSuperThis, flatten, transformWildcards, moveStatic, expandPrivate, restoreScopes, selectStatic, Collect entry points, collectSuperCalls, repeatableAnnotations}]] // rs$line$3
package <empty> {
  class rs$line$3$C extends Object {
    def <init>(s: String): Unit =
      {
        s = s
        super()
        this.m =
          Array.fill(10,
            {
              closure(s | this.$init$$$anonfun$1:Function0)
            },
          scala.reflect.ClassTag.apply(classOf[String])).asInstanceOf[String[]]
        ()
      }
    private val s: String
    private val m: String[]
    def m(): String[] = this.m
    private final def $init$$$anonfun$1(s$1: String): String = s$1
  }
[snip]
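
To see the by-name behaviour of Array.fill in isolation, here is a small sketch. The ByNameDemo object and its counter are made up for illustration. The element expression is re-evaluated for every slot, which is why the compiler has to wrap it in a closure:

```scala
object ByNameDemo {
  // Hypothetical counter, only here to observe how often the argument runs.
  var evaluations = 0

  // The second argument list of Array.fill is by-name (elem: => T),
  // so this block is evaluated once per element, not once overall.
  val filled: Array[Int] = Array.fill(3) {
    evaluations += 1
    evaluations
  }
}
```

Each slot gets a different value because the block runs three times; a strict argument would have been evaluated once and repeated.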

By contrast, taking a by-name class parameter means the closure is already constructed:

scala> class C(@constructorOnly s: => String) { val m = Array.fill(10)(s) }
[[syntax trees at end of MegaPhase{dropOuterAccessors, dropParentRefinements, checkNoSuperThis, flatten, transformWildcards, moveStatic, expandPrivate, restoreScopes, selectStatic, Collect entry points, collectSuperCalls, repeatableAnnotations}]] // rs$line$4
package <empty> {
  @SourceFile("rs$line$4") final module class rs$line$4 extends Object {
    def <init>(): Unit =
      {
        super()
        ()
      }
    private def writeReplace(): Object =
      new scala.runtime.ModuleSerializationProxy(classOf[rs$line$4])
  }
  class rs$line$4$C extends Object {
    def <init>(@constructorOnly s: Function0): Unit =
      {
        super()
        this.m =
          Array.fill(10, s, scala.reflect.ClassTag.apply(classOf[String])).
            asInstanceOf[String[]]
        ()
      }
    private val m: String[]
    def m(): String[] = this.m
  }
  final lazy module val rs$line$4: rs$line$4 = new rs$line$4()
}

// defined class C

I think there is a request for an Array.fill that takes an eager arg, for the simple use case.
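
In the meantime, a strict helper is easy to sketch. Note that fillEager and the EagerFill wrapper are hypothetical names, not standard library API. Because the element is passed by value, str is only read directly in the constructor body and no closure is involved, so I would not expect a field to be retained for it:

```scala
import scala.reflect.ClassTag

object EagerFill {
  // Hypothetical helper (not part of the standard library): takes the
  // element by value, so the caller's argument is evaluated exactly once.
  def fillEager[T: ClassTag](n: Int)(elem: T): Array[T] = {
    val arr = new Array[T](n)
    var i = 0
    while (i < n) { arr(i) = elem; i += 1 }
    arr
  }

  class A(str: String) {
    // str is used only here, in the constructor body, with no closure.
    val scream: Array[String] = fillEager(10)(str + "!")

    def screamTenTimes(): Unit = scream.foreach(println)
  }
}
```

The trade-off is that every slot shares the same instance, which is fine for immutable values such as String but not a drop-in replacement when each element needs to be computed separately.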

What I mean is that Array.fill can close over the constructor parameter, not over a class field.

Why is a class field necessary?

IIRC, that is one of my ignored PRs, after all this time.
