Are case classes like structs in dlang or other languages?

alain · November 8, 2024, 4:58pm

Are they allocated on the heap or the stack?
On assignment, are they passed by reference or by value?

MarkCLewis · November 8, 2024, 5:40pm

The answer could depend on where you run it, but if you are compiling to the JVM or JS you should assume they are heap allocated. They will always be passed by reference, not by value.

They could end up stack allocated as an optimization, but only if escape analysis can verify that the object never escapes the thread or the current call stack. I believe GraalVM does that type of analysis. But your mental model of what is happening should be heap allocation, with all variables storing references.
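
To make that mental model concrete, here is a minimal sketch. `Point` is a made-up type for illustration only. Assignment copies the reference, so both names point at the same object, while `==` on case classes compares field values.

```scala
// Hypothetical type, used only for illustration.
final case class Point(x: Int, y: Int)

object ReferenceDemo {
  def main(args: Array[String]): Unit = {
    val a = Point(1, 2)
    val b = a            // copies the reference, not the object
    val c = Point(1, 2)  // a separate instance with the same field values

    println(a eq b) // reference equality: a and b are the same object
    println(a eq c) // reference equality: a and c are distinct objects
    println(a == c) // structural equality generated for case classes
  }
}
```

Here `eq` checks identity and `==` checks structure, which is why the second and third comparisons disagree.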

jducoeur · November 8, 2024, 5:45pm

One more nuance to that: while they’re usually passed by reference, case classes are generally immutable. So while it is technically possible to mutate them during function calls, that’s rare and requires a somewhat unusually-designed case class.
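
A small sketch of what that unusual design looks like, assuming a hypothetical `Counter` case class with a `var` field. Because the function receives a reference, the caller sees the mutation.

```scala
// Hypothetical, deliberately unusual case class with a mutable field.
final case class Counter(var count: Int)

object MutationDemo {
  // Receives a reference to the caller's object, so the change is visible outside.
  def bump(c: Counter): Unit = c.count += 1

  def main(args: Array[String]): Unit = {
    val counter = Counter(0)
    bump(counter)
    println(counter.count) // the caller observes the incremented value
  }
}
```

With the usual `val` fields this cannot happen, which is why the reference-versus-value distinction rarely matters for case classes in practice.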

alexelcu · November 16, 2024, 8:03am

The JVM doesn’t have value classes yet; those are coming in Project Valhalla.

Where classes are allocated is a complicated question. Semantically, instances of classes have “identity”, so they are usually allocated on the heap. The JVM, however, is also capable of escape analysis, so it can choose to allocate objects on the stack in hot loops. GraalVM has increased this capability, but it’s limited and somewhat unreliable.

What’s interesting about the upcoming Project Valhalla is that marking a class as a “value class” just informs the runtime that the class has no identity, in which case the runtime is free to optimize its allocation in whatever way it sees fit, without having to be too smart about it.

Note that the issue is not actually stack versus heap. That distinction is not very relevant because Java’s GCs are very efficient: allocating an object on the heap is just a pointer increment, as efficient as stack allocation, and short-lived objects are de-allocated in bulk because the GCs are “generational”. Stack allocation will have more predictable behavior, but heap allocation is usually very well-behaved, with the runtime free to optimize it in ways not accessible to lower-level languages.

The bigger issue is being able to allocate “contiguous memory regions”. E.g., traversing an Array[Long] is very efficient, and CPUs are optimized for it, as entire chunks are preloaded into the CPU’s cache. Traversing an Array[Object], on the other hand, requires a lot of indirection because the array is now an array of references to other memory locations. This is why database systems built to run on the JVM have all sorts of tricks for optimizing memory access patterns that you don’t normally see in your average Java code.
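
As a rough illustration of one such trick, here is a sketch of a “struct of arrays” layout. All names (`Sample`, `SampleColumns`, `LayoutDemo`) are hypothetical, and this is only one of several possible approaches, not what any particular database does.

```scala
// Hypothetical record type, used only for illustration.
final case class Sample(id: Long, value: Double)

object LayoutDemo {
  // Array of objects: each element is a reference to a separately allocated Sample.
  def sumBoxed(samples: Array[Sample]): Double = {
    var total = 0.0
    var i = 0
    while (i < samples.length) {
      total += samples(i).value // follows one reference per element
      i += 1
    }
    total
  }

  // "Struct of arrays": one primitive array per field, each stored contiguously.
  final class SampleColumns(val ids: Array[Long], val values: Array[Double]) {
    def sumValues: Double = {
      var total = 0.0
      var i = 0
      while (i < values.length) {
        total += values(i) // sequential reads from a primitive array
        i += 1
      }
      total
    }
  }

  // Splits the records into one primitive array per field.
  def toColumns(samples: Array[Sample]): SampleColumns =
    new SampleColumns(samples.map(_.id), samples.map(_.value))
}
```

Both sums return the same result; the difference is only in how the data is laid out in memory. Whether it pays off depends on the workload, so measure before restructuring real code this way.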

Scala’s case class is just a regular class. Note that the JVM does optimize access to final fields, as these have special properties in its “memory model” (regarding concurrent access). But Project Valhalla is close, and I’m sure Scala will integrate with it, as its benefits are too great. As an example, .copy would become zero cost.
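
For context on why .copy is not free today, here is a minimal sketch with a hypothetical `User` case class. Each call to .copy constructs a new instance, which semantically is a fresh heap object with its own identity.

```scala
// Hypothetical type, used only for illustration.
final case class User(name: String, age: Int)

object CopyDemo {
  def main(args: Array[String]): Unit = {
    val alice = User("Alice", 30)
    val older = alice.copy(age = 31) // constructs a new User; alice is unchanged

    println(alice)          // original instance, still age 30
    println(older)          // new instance with the updated field
    println(alice eq older) // reference comparison between the two instances
  }
}
```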

Interestingly, Java & the JVM have been trending towards more and more immutability, and its “value classes” will be immutable by necessity. This is in contrast with .NET, where a struct can have, and usually does have, vars; this is due to its design, as structs are forced to have a zero-argument constructor. This works in Scala’s favor. But note that it will only work for case class definitions that don’t have any vars declared.