Lazy vals in immutable case classes

#1

I am using lazy vals for virtually all data fields in my immutable case classes. Is there ever a reason to use a “regular” (non-lazy) val in such cases? If not, then would it make sense for the compiler to automatically make them all lazy vals by default? Otherwise, would it make sense to have a shorthand for less code clutter in Scala 3, such as “lval” or something like that?

#2

Lazy vals can decrease performance when they are used for fields that are accessed frequently.
On each access of the field, the runtime has to check whether the right-hand side has already been evaluated. So if your calculation is not too expensive, the cost of the lazy handling can be greater than the savings from not evaluating some values.

See this article for a detailed description of how they work under the hood: https://blog.codecentric.de/en/2016/02/lazy-vals-scala-look-hood/
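
To make the per-access check concrete, here is a hand-written approximation of what a thread-safe lazy val has to do. It is only a simplified sketch, not the compiler's actual output: the names `LazyCell`, `initialized`, `cached`, and `LazyCellDemo` are made up for illustration, and the real encoding differs between Scala versions.

```scala
// Hand-rolled stand-in for `lazy val value = compute()`.
// Hypothetical names; the real compiler encoding is more involved.
final class LazyCell[A](compute: () => A) {
  @volatile private var initialized = false
  private var cached: Option[A] = None

  def value: A = {
    if (!initialized) {            // this check runs on every access
      this.synchronized {
        if (!initialized) {        // re-check while holding the lock
          cached = Some(compute())
          initialized = true
        }
      }
    }
    cached.get
  }
}

object LazyCellDemo {
  var computeCount = 0
  // The right-hand side runs at most once, on first access of `value`.
  val cell = new LazyCell(() => { computeCount += 1; 21 * 2 })
}
```

Even after initialization, every read of `value` still pays for the volatile flag check, which is the overhead described above.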

#3

Lazy vals also require a surprisingly large amount of code under the hood: each lazy val expands to a lot of code that is needed to make it work safely. So if code size is at all relevant, overusing them can have a cost there…

#4

That doesn’t sound good. I hope most of these issues will be resolved or mitigated in Scala 3.

#5

I doubt that Scala 3 will change this. The problem of having to store whether the value has already been calculated is inherent in lazy evaluation; Haskell, for example, has this problem too.
And Haskell compiles to machine code with a runtime optimized for it, while Scala runs on the JVM. It’s questionable whether a better thread-safe lazy val implementation is possible within the limits of Java bytecode.

#6

It’s not a matter of “issues” – lazy val is very, very hard. Keep in mind that you’re trying to make sure that something gets run precisely once, fending off multi-threading problems, deadlocks, and all that.

This isn’t a bug or some such: it’s simply that, every time you say lazy val, you’re demanding a pretty sophisticated piece of software. That requires quite a bit of code, and no small amount of overhead, in order to do that reliably…

#7

And seriously: if you want to be convinced, read SIP-20, which is the deep dive into how to make lazy vals work completely correctly. It’s astonishingly difficult…

#8

I would read it, but… I’m too lazy!

#9

Based on earlier warnings in this thread, I converted many of my lazy vals to defs, but the overall speed of my program decreased. Deciding when to use a lazy val, a regular val, or a def is not a simple matter! I guess it might help if I can get a profiler to provide me with a count of each usage of a val. Is there a reasonably simple way to get that?

#10

Lispers have a saying: “Premature optimization is the mother of all evils.” Or one of its many variants.

That goes for deciding lazy/def/val/eq among many other things. I like using vals for almost everything. It’s mostly aesthetics, but it is founded in old compilers that could do a better job with constants than with variables. Subroutine calls are expensive. Thread-unsafe lazy values and streams are to be preferred for occasional things. Call-by-name for things needed less often, calling a function exactly once, …

But those are all premature optimizations (evil). I don’t know what the Scala compiler does when its optimizer is enabled. HotSpot VMs are even less predictable. Functional languages on a HotSpot compiler are really unpredictable. Even with rigorous profiling and optimization it is very difficult to gain speed-ups in production systems, especially if multiple cores are being used, thread switch times are short, etc. Long-running optimizations involving I/O (including modern supercomputer codes) can be nearly impossible. A simple case in business is optimizing database code only to find that it performs worse in production with multiple consumers of the database.

There still are general rules. Avoiding rare computations by using a thread-safe lazy val might be one. How rare, how long, competition for cache lines, message passing are all big questions. The general rule of thumb is don’t spend much programming time doing tricks. FORTRAN can run numerical codes much faster than other languages. Should I convert my Scala code to FORTRAN? For many problems, human time is more expensive than CPU time. Tricks cost QA time and maintenance time as well as the initial programming or design time.

Keeping it simple is the lisper’s mantra. It’s good advice for most problems.

#11

Another disadvantage of pure parameterless defs and lazy vals compared to simple vals is that defs and lazy vals can close over a large amount of data and thereby cause memory exhaustion. Consider the following code:

// SomethingBig stands for any large element type.
trait MyData {
  def result: Int
}

def reduce(hugeArray: Array[SomethingBig]): Int = ??? // some implementation here

def returnAsDef(hugeArray: Array[SomethingBig]): MyData = new MyData {
  def result: Int = reduce(hugeArray) // closes over "hugeArray"
}

def returnAsLazyVal(hugeArray: Array[SomethingBig]): MyData = new MyData {
  lazy val result: Int = reduce(hugeArray) // closes over "hugeArray"
}

def returnAsPlainVal(hugeArray: Array[SomethingBig]): MyData = new MyData {
  val result: Int = reduce(hugeArray) // doesn't close over "hugeArray"
}

I don’t think blindly converting all defs, vals and lazy vals to a single type of class member makes sense at all. Every member should be considered independently. But if we’re going for rules of thumb then for immutable value classes I would propose the following:

  • use lazy vals for things that are expensive to compute (from members of the same class) but small
  • use defs for other things that can be computed from members of the same class (avoid infinite recursion of course)
  • use plain vals for everything else (i.e. use plain vals most of the time)

I don’t always apply the above rules but they are the first considered when deciding whether some case class member should be a val, lazy val or def.
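
As a rough illustration of those three rules, here is a small made-up class. `Sample`, `size`, `isEmpty`, and `median` are hypothetical names chosen for the example, and whether sorting really counts as "expensive" depends on your data.

```scala
// Hypothetical example class; names are illustrative only.
final case class Sample(values: Vector[Int]) {
  // plain val: cheap and needed almost always
  val size: Int = values.length

  // def: trivially derived from another member, so just recompute it
  def isEmpty: Boolean = size == 0

  // lazy val: needs a sort (comparatively expensive), result is small,
  // and many instances may never ask for it
  lazy val median: Double = {
    val sorted = values.sorted
    if (sorted.isEmpty) Double.NaN
    else if (size % 2 == 1) sorted(size / 2).toDouble
    else (sorted(size / 2 - 1) + sorted(size / 2)) / 2.0
  }
}
```

Note that `median` is computed from members of the same class and holds only a single Double, so it does not keep any extra data alive beyond what the instance already references.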

Another strategy is to have multiple variants of a case class, each suited to a different type of usage and with a different set of preprocessed data.

#12

Good advice. If you make your program convoluted and unreadable for 10% performance, what will you do for the next 10%?

I also hope the designers of Scala never get seduced by silly things like replacing “lazy val” (which is pretty clear in what it does) with “lval”, which would be totally mysterious for the wondrous benefit of saving typing of 4 characters.

#13

If lazy vals were widely used, then an abbreviation like “lval” would be no worse than the term “val” itself. Is the abbreviation “val” a “silly thing” that “would be totally mysterious for the wondrous benefit of saving typing of 4 characters”? I don’t think so. Actually “val” only saves 2 characters over “value”, whereas “lval” saves 4 characters over “lazy val”. And the issue isn’t just typing, but rather code clutter and line wrapping. So please spare me the BS.

But that leaves the question of how widely lazy vals should be used. I was using them extensively in my immutable case classes until I was informed of their pitfalls on this very thread. So I went and replaced many of them with defs – but the speed of my program decreased. So either lazy vals aren’t as bad as some here make them out to be… or I eliminated too many of them.

I don’t need a lecture about the purpose of lazy vals. The problem is that it is not always obvious what percentage of uses of a case class will need a particular computed value that is based only on other data fields of the same class. If I know that 90% of instances of the class will need a particular value, then I can go ahead and just make it a regular (non-lazy) val. But if it is needed only 20% of the time and is costly to compute, I may want to make it a def or lazy val. And if the value is used many times, it should of course be a val or a lazy val rather than a def, to avoid unnecessary recomputation. But those needs could vary from application to application, or even from one run to another of the same application.

This seems like an opportunity for someone to develop an application that keeps track of how many times each val or method is used and how long it takes to compute. Or is something like that available already in existing profilers?

#14

I’m a bit confused what you mean by using lazy vals in case classes, because the following seems to be illegal:

Welcome to Scala 2.13.0 (OpenJDK 64-Bit Server VM, Java 1.8.0_212).
Type in expressions for evaluation. Or try :help.

case class A(lazy val x: Long, lazy val y: Long)
^
error: lazy modifier not allowed here. Use call-by-name parameters instead
^
error: lazy modifier not allowed here. Use call-by-name parameters instead

Or is this new in 2.13.0?

If it was allowed, it would be, IMHO, a design smell, because case classes are meant to be open buckets of data, free to touch, while a lazy val should be something more guarded.

For example, adding a case class instance as an element to a default immutable Set, or as a key to a default immutable Map, so that the collection ends up with more than four elements, will call hashCode and access every primary component of the case class, because at that size those collections are a HashSet and a HashMap, respectively.

Or, interpolating your case class in a String will call toString and access all primary components.

My rule of thumb would be to only use lazy val if there is something special about your design that tells you that this particular value has a high chance of not being needed. If you need a profiler to know, you probably just want a val.

#15

No, I don’t mean lazy vals in the argument list. Why would anyone do that? I am referring to lazy vals as members of the class. As a basic example, I have a class that I call “LineSeg” to represent a finite line segment. It is determined by its end points, which are the constructor arguments. But it also has other derived properties, including its length and direction. These are not particularly costly to compute, but the cost could add up if they appear inside a loop that executes many times. The problem is that it is hard to know in advance how many instances of the class need those properties and how many access them many times. See the code snippet below:

case class LineSeg(point1: Position, point2: Position) {
  // finite 2D line segment

  lazy val length = point1 separation point2
  lazy val dir = point1 directionTo point2

  lazy val alongDir = point1.unitVectorTo(point2) // unit vector in direction of segment
  lazy val crossDir = Position(alongDir.y, -alongDir.x)
}

#16

Thanks for clearing up the confusion. Constructor arguments of case classes are members, too.

This doesn’t include the definition for Position, but it’s probably something like

case class Position(x: Double, y: Double) { … }

Unless you are doing something very unexpected, these are all very cheap computations and not at all candidates for lazy val. You should just make them all defs.

The argument that lazy val might save something because there are so many of them doesn’t really work, because every single lazy val adds overhead, so many lazy vals means lots of overhead. They only make sense if each one of them is expensive.

#17

This is a bit like asking, “Should I always use List to store multiple elements?”

There are big, important differences between val, lazy val, and def. If there weren’t, there wouldn’t be much reason to have all of them.

val is fastest to access, but you always have to create it. def is fastest to compute (not usually noticeably faster than val but it saves memory), but you have to compute it whenever you want it. lazy val has a sizable additive penalty to computation time on first use, and a smaller additive penalty on every access, but you only have to pay those if you need it, and you only have to pay the big one once.

List has O(1) tail but O(n) indexing, while Vector has O(log n) indexing and tail (and with a big constant factor on tail), and Array has O(1) indexing but O(n) tail, so you can’t just choose one to “always” use; there are big tradeoffs. val, lazy val, and def similarly have a tradeoff.

If the amount of time to do an access is 1, and the time to do a computation is c (we can assume c >= 1), then the constant penalty for single-threaded access of a lazy val is roughly 2, and the constant penalty for creation is more like 20 (very roughly–you should benchmark!). So if you use a value k times, then the total runtime cost of each is very roughly

val x         c + k
lazy val x    (k min 1)*(c + 20) + 3*k
def x         c*k

Reality is more complicated, but you can see where each would work best: if you always use it 2+ times and the computation is expensive, go with val; if you almost never use it or the computation is really cheap, use def; and if sometimes you never use it but sometimes you use it a lot, then you use lazy val.

To see this, suppose you have 100 objects; 90 of those objects never use the thing, but 10 of those use it 20 times each. Then we have total costs of

val x       100*c + 200
lazy val x  10*c + 200 + 600 = 10*c + 800
def x       200*c

In this scenario, if c is less than 2, def wins; if c is between 2 and 6 2/3, val wins; and if c is more than 6 2/3, lazy val wins.

If c is extremely large compared to the cost of access or storage, then lazy val wins any time that you could skip work in some cases and you can occasionally reuse work.
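
If you want to play with these numbers, the rough model above can be written down directly. This is only a sketch of the estimate, not a benchmark: `CostModel` and its method names are made up here, the constants 20 and 3 are the very rough guesses from above, and c is restricted to whole numbers to keep the arithmetic exact.

```scala
// Rough cost model: one access costs 1, one computation costs c,
// and k is the number of uses per object. Hypothetical names.
object CostModel {
  def valCost(c: Long, k: Int): Long  = c + k
  def lazyCost(c: Long, k: Int): Long = (k min 1) * (c + 20) + 3 * k
  def defCost(c: Long, k: Int): Long  = c * k

  // 100 objects: 90 never use the value, 10 use it 20 times each
  def total(cost: (Long, Int) => Long, c: Long): Long =
    90 * cost(c, 0) + 10 * cost(c, 20)

  // Name of the cheapest option for a given computation cost c
  def cheapest(c: Long): String =
    Seq(
      "val"      -> total(valCost, c),
      "lazy val" -> total(lazyCost, c),
      "def"      -> total(defCost, c)
    ).minBy(_._2)._1
}
```

For example, `CostModel.cheapest(1)` picks def, `CostModel.cheapest(4)` picks val, and `CostModel.cheapest(10)` picks lazy val, matching the thresholds worked out above.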

#18

I don’t see how you can conclude that they should just be defs. Granted, a root-sum-square computation for length may be simple, but what if it is needed hundreds of times in some heavy-duty numerical algorithm? Recomputing it each time would be inefficient. In that case, a val would be preferable. But then it is computed even in cases where it is not needed at all.

In this case, I guess one unnecessary computation is a lot less to worry about than hundreds, so I guess I’ll just go with val.

#19

Thanks for that analysis. The problem is that I don’t know the values of the parameters you called c and k. Just for kicks, I think I will make the length of my line segment (LineSeg) a def and instrument it to keep track of how many times it is accessed for each instance.

Unfortunately, I don’t have time to do that kind of instrumentation for all my classes! That’s why I suggested that a profiling tool that does it automatically would be nice to have.
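
For a single member, the manual instrumentation could look something like the sketch below. Everything here is an assumption for illustration: `Counted` is a made-up helper, and `Seg` is a simplified stand-in for LineSeg that takes raw coordinates instead of Position values.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical helper: wraps a computation and counts how often it runs.
final class Counted[A](name: String)(compute: => A) {
  private val calls = new AtomicLong(0)
  def apply(): A = { calls.incrementAndGet(); compute }
  def count: Long = calls.get()
  override def toString: String = s"$name: $count call(s)"
}

// Simplified stand-in for LineSeg, using raw coordinates.
final case class Seg(x1: Double, y1: Double, x2: Double, y2: Double) {
  private val lengthCounter =
    new Counted("length")(math.hypot(x2 - x1, y2 - y1))

  def length: Double = lengthCounter() // recomputed and counted on each call
  def lengthCalls: Long = lengthCounter.count
}
```

The counter lives outside the constructor parameters, so it does not affect equals, hashCode, or toString of the case class. It only counts calls; it says nothing about how long each computation takes.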

#20

Lazy vals use lock-based synchronization, for which the JVM has some sophisticated optimizations (e.g. lock elision, biased locking, etc.). Depending on whether an optimization kicks in, you could get vastly different performance characteristics. Here is a sample program that shows the unpredictability of lock optimization (Core i5, 4 cores / 4 threads):

import java.util.concurrent.atomic.AtomicLong

object LockElision {
  def main(args: Array[String]): Unit = {
    timed("atomic 1 thread")(runAtomic(1, 80 * 1000 * 1000))
    timed("locked 1 thread")(runLocked(1, 80 * 1000 * 1000))
    timed("atomic 2 threads")(runAtomic(2, 40 * 1000 * 1000))
    timed("locked 2 threads")(runLocked(2, 40 * 1000 * 1000))
    timed("atomic 4 threads")(runAtomic(4, 20 * 1000 * 1000))
    timed("locked 4 threads")(runLocked(4, 20 * 1000 * 1000))
  }

  def timed[T](description: String)(action: => T): T = {
    val startTime = System.currentTimeMillis()
    val result = action
    val totalTime = System.currentTimeMillis() - startTime
    println(s"$description took $totalTime ms")
    result
  }

  def runAtomic(threadNum: Int, workPerThread: Long): Long = {
    val acc = new AtomicLong(0)
    val threads = (0 until threadNum).map(_ => new Thread {
      override def run(): Unit = {
        (0L until workPerThread).foreach(_ => acc.incrementAndGet())
      }
    })
    threads.foreach(_.start())
    threads.foreach(_.join())
    acc.get()
  }

  def runLocked(threadNum: Int, workPerThread: Long): Long = {
    val acc = new SyncLong(0)
    val threads = (0 until threadNum).map(_ => new Thread {
      override def run(): Unit = {
        (0L until workPerThread).foreach(_ => acc.increment())
      }
    })
    threads.foreach(_.start())
    threads.foreach(_.join())
    acc.raw
  }

  class SyncLong(var raw: Long) {
    def increment(): Unit = synchronized {
      raw += 1
    }
  }
}

Output:

atomic 1 thread took 1017 ms
locked 1 thread took 1659 ms
atomic 2 threads took 1545 ms
locked 2 threads took 6166 ms
atomic 4 threads took 1634 ms
locked 4 threads took 2912 ms