Why am I running out of heap space?

In this line
for ind ← trajList.indices.drop(1) do
Are you constructing a new list of indices on every iteration? Will they only be recognised as unreachable after the loop terminates? I don’t know; only a guess.

Are you constructing a new list of indices on every iteration?

indices returns a Range, which only occupies O(1) storage (the start and end), and drop(1) on a Range also returns a Range. So there’s no issue.
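
A quick way to convince yourself of this (trajList here is just a stand-in for the real trajectory collection):

```scala
@main def rangeDemo(): Unit =
  val trajList = Vector.fill(1000)("traj")  // stand-in for the real trajectory list
  val idx = trajList.indices                // a Range: stores only start, end and step
  val tail = idx.drop(1)                    // still a Range; nothing is materialised
  println(tail.isInstanceOf[Range])         // true
  println(tail.head)                        // 1
```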

:+ is an alias for appended, which makes a copy.

So xs = xs.appended(x) copies the collection before assigning to the reference; if some internal array (of references to elements of a Vector) is copied, at least that much extra memory is in use before it can be collected.

Really? That’s news to me. I had assumed that the original Vector and the new element would somehow be referenced in the new Vector, with no copies made of either.

Are copies made if yield is used to construct the Vector[Encounter]?

This is mostly right. Vector uses structural sharing to greatly reduce copying when appending to a sufficiently long Vector.

Definitely no copying of the elements is taking place. (That’s not how the JVM works; access to objects is always through references.)
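
A small sketch of what that persistence looks like in practice:

```scala
@main def vectorShareDemo(): Unit =
  val xs = Vector(1, 2, 3)
  val ys = xs :+ 4    // same as xs.appended(4): a new Vector, most structure shared
  println(xs)         // Vector(1, 2, 3) -- the original is untouched
  println(ys)         // Vector(1, 2, 3, 4)
  // The elements themselves are never copied; both vectors reference the same objects.
```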

Are copies made if yield is used to construct the Vector[Encounter]?

yield is just syntactic sugar for calls to map, flatMap, and withFilter — all of which you can expect to be implemented efficiently — but they do all return new collections, as they logically must.
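
For instance, the two expressions below compile to the same calls:

```scala
@main def yieldDemo(): Unit =
  val xs = Vector(1, 2, 3, 4)
  val sugared  = for x <- xs if x % 2 == 0 yield x * 10
  val explicit = xs.withFilter(_ % 2 == 0).map(_ * 10)  // what the compiler generates
  println(sugared)              // Vector(20, 40)
  println(sugared == explicit)  // true
```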

In Eclipse MAT and JVisualVM (and probably other heap analysis tools, as well), you can trace how an object is reachable. Look for “GC roots” and/or “dominator” in the docs and try this for a couple of your Traj objects (and instances of other suspicious types, like the XML nodes), perhaps.

I am using visualvm. I stopped my run after only 5 conflict checks (of the 720 that would be done in a full run for this case), so the heap space should be nowhere near full. I got a heap dump, but when I ran “Computing retained Sizes”, it got to 33% and just stopped. It has been sitting there for perhaps 30 minutes now with no sign of progress. What is so hard about this computation? I was really hoping for some useful results, but this is frustrating.

Try the same thing with jmap instead so you can see where all the objects are. The same problem that causes memory starvation while running is apparently also choking visualvm – probably it’s also running out of memory, or maybe it’s a display issue where they aren’t expecting millions of items. But jmap has much, much lower overhead.

The problem might just be that you’re keeping the parsed XML accessible. If you have a ton of XML stuff around, wrap the reading in a method and return the trajectories, without passing back anything related to XML (so the method takes a path and returns trajectories). Then at least visualvm won’t have to worry about that, because it can be GCed.
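
A sketch of that shape, using the JDK’s built-in DOM parser. Traj, the element name, and the parsing itself are stand-ins guessed from this thread, not the actual code:

```scala
import javax.xml.parsers.DocumentBuilderFactory

final case class Traj(data: String)  // stand-in for the real Traj class

// The XML document is confined to this method: once it returns, nothing
// references the parsed tree, so the whole tree becomes collectable.
def loadTrajectories(path: String): Vector[Traj] =
  val doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    .parse(new java.io.File(path))
  val nodes = doc.getElementsByTagName("traj")  // element name is a guess
  Vector.tabulate(nodes.getLength)(i => Traj(nodes.item(i).getTextContent))
```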

If you still have problems, winnow things down more. Drop all but two trajectories, with the XML not preserved. Still doesn’t work? Cut the trajectories in half. And so on–visualvm really should work, so you have to cut your problem down until it’s a size that can be inspected manually and you can figure out where the difficulty actually lies.

OK, I did another run of the program that generates the trajectories, and this time I generated only 5 trajectories, which are then dumped into the XML file.

I then ran the program that has the memory leak on those 5 trajectories. I put the reading of the XML file into a separate function that returns the Vector[Traj] as you suggested, so the XML stuff would be garbage collected.

I set the program to stop part way through and ran jmap on it. I tried -heap and -histo options. The -heap option complains about something (I forgot the message) but provides no results, and the -histo option gives me a dump of object counts but no useful information as far as I can tell.

I then opened visualvm 2.2 on it again, and generated a heap dump again. I then opened the profiler for memory on the Traj and Encounter classes. I see about a half dozen lines of results for each, but again I see nothing that shows any information about what is keeping these things from being garbage collected. I see columns labeled “live bytes”, “live objects”, “allocated objects”, and “generations”. What am I missing here, and what do I need to do to figure out which objects are holding references to these objects and keeping them from being garbage collected?

One more thing. It seems that there should be some function I could call from my own source code, perhaps using reflection, that provides the information I am looking for, namely the objects holding references that prevent garbage collection. Does anything like that exist in the Scala or Java libraries? If not, why not?

As suggested above, look for “paths to GC roots” or similar. In MAT:

  • Open “Histogram” (or other view) and enter a class name (e.g. Traj) in the <Regex> field below the “Class Name” column header.
  • Select the class under investigation in the filtered table output, choose “List objects… with incoming references” in the (right click) context menu.
  • Select an (arbitrary) instance from the resulting list, choose “Path to GC Roots… with all references”.
  • You’ll end up with a tree view of all paths that lead from the instance to the GC roots.
  • Rinse and repeat with other instances of this type and/or other types under suspicion.

Example output for an sttp.apispec.Schema instance (screenshot omitted).

Given that after 30 years of the JVM all these tools feel somewhat inconvenient, there surely is some essential complexity involved. My naive guess is that it’s complex (and still slow) to do for bigger dump sizes without running into out of memory conditions. :wink: The best defense is to code in ways that avoid memory leaks in the first place, and/or to code in ways that allow running on small problem sizes. Which is much easier said than done, I know…

Likely unrelated to the actual problem, but I ran into stack overflows recently (from compiling, though) and solved them by setting -Xss2M, lowering from a previous higher value. You might try this just for fun to see if it affects your problem. I tested various -Xss values, and -Xss2M gave the lowest compile time while causing no more stack overflows.

I still have not yet solved this memory leak. Can someone tell me how to force garbage collection on an object? I know how to tell the garbage collector to run (by calling System.gc()), but I want to force a particular object to be collected. Is that possible? I googled it, but I did not see much. I realize it is risky and not good practice in general, but at this point I am desperate. If it causes a problem somewhere else, that will at least provide a clue about the extraneous reference that is preventing it from being collected.

Here is the relevant code snippet again:

I want to force garbage collection on the value enc after the line

if enc.isaConflict then conflicts :+= enc

How can I do that?

So, first of all, you absolutely definitely positively cannot deallocate objects that are being managed by the GC (which is everything, unless you use esoteric off-heap memory).

You need to use other methods to indirectly get at these things. If you have an object that could be collected, and it’s taking a lot of memory, the GC will deallocate it pretty eagerly, generally, especially if you suggest that the GC runs. But you can’t be absolutely sure; there’s a ton of tuning to make sure that the GC is well-behaved, and there are limits on how unruly you can ask it to be.

That said, you should try the following.

Outside the for loop, get your hands on an instance of the memory pool objects and create a helper method to print them out:

val pools = java.lang.management.ManagementFactory.getMemoryPoolMXBeans
def report(title: String): Unit =
  println(title)
  pools.forEach{ pool =>
    val usage = pool.getUsage
    println(s"Pool ${pool.getName} is using ${usage.getUsed} of ${usage.getMax} bytes")
  }
  println()

Then, at the beginning of your while loop,

report("Before clear")
Runtime.getRuntime.gc()
report("After clear")

(Note that Runtime.getRuntime.gc() is a suggestion – one that is usually followed, but the JVM is free to ignore it if it thinks it knows better than you do. But you will see the usage of some pools go down if it did decide to do something and there was anything to do. It might make some pools go up, e.g. if it moves objects from the young generation to the old generation. Once you know which pools your GC is using – it depends on which GC your runtime employs – you might want to sum old and new generation and get an extra output for that.)

You might want a report("Before while") and report("After while"), and also before/after the for.

This will show you how the memory usage changes as you go through item by item in the loops. If you get a steady increase, you’re hanging on to memory. If it goes up and down, you at the very least might not be hanging on to everything.

Another trick that you can use to tell if a variable has been GCed is to get a WeakReference to it. In this case, you can test whether your enc objects themselves have been collected. This doesn’t guarantee their contents have–maybe something else is pointing at the contents–but you at least know that you need to go into the internals and see what is grabbing what.

To do this, you would do something like this. Outside the for loop,

var refs: List[(Int, Int, java.lang.ref.WeakReference[Encounter])] = Nil
def checkIsThere(item: (Int, Int, java.lang.ref.WeakReference[Encounter])): Boolean =
  if item._3.get() eq null then
    println(s"${item._1} vs ${item._2} cleared")
    false
  else
    println(s"${item._1} vs ${item._2} is still there")
    true
def checks(): Unit =
  refs = refs.filter(checkIsThere)

Now you have a way to monitor when things seem like they will be or have been cleared, because a WeakReference starts returning null once its referent becomes eligible for garbage collection, and the reference itself doesn’t affect that eligibility. You use it by, after creating your encounter in the while loop, doing

refs = (ind1, ind2, new java.lang.ref.WeakReference(enc)) :: refs

and then at some point (first or last statement inside the while loop, maybe) calling checks().

This should give you a variety of ways to monitor the total memory used and the probable GC status of individual objects.

Were you able to follow sangamon’s advice? It seems like very sound advice to me — if you think an object should be eligible for GC but it isn’t, viewing that object’s path to its GC root will tell you exactly why it’s ineligible.

Maybe you need to give visualvm itself more heap…?

Sorry if this is a dumb question, but what is “MAT”?

I did the first part of what you suggested, and I found that Pool G1 Old Gen is huge. Nothing else seems to be of the same order of magnitude. Does that indicate anything significant?

press to continue
340 processed, 0 conflicts found

Before gc:
Pool CodeHeap ‘non-nmethods’ is using 1609728 of 8196096 bytes
Pool Metaspace is using 15780936 of -1 bytes
Pool CodeHeap ‘profiled nmethods’ is using 5050496 of 121729024 bytes
Pool Compressed Class Space is using 1590792 of 1073741824 bytes
Pool G1 Eden Space is using 37748736 of -1 bytes
Pool G1 Old Gen is using 8186601720 of 8329887744 bytes
Pool G1 Survivor Space is using 0 of -1 bytes
Pool CodeHeap ‘non-profiled nmethods’ is using 4386816 of 121733120 bytes

After gc:
Pool CodeHeap ‘non-nmethods’ is using 1609728 of 8196096 bytes
Pool Metaspace is using 15780936 of -1 bytes
Pool CodeHeap ‘profiled nmethods’ is using 5050496 of 121729024 bytes
Pool Compressed Class Space is using 1590792 of 1073741824 bytes
Pool G1 Eden Space is using 0 of -1 bytes
Pool G1 Old Gen is using 8155754072 of 8329887744 bytes
Pool G1 Survivor Space is using 0 of -1 bytes
Pool CodeHeap ‘non-profiled nmethods’ is using 4386816 of 121733120 bytes

press to continue
345 processed, 0 conflicts found

Before gc:
Pool CodeHeap ‘non-nmethods’ is using 1609728 of 8196096 bytes
Pool Metaspace is using 15780936 of -1 bytes
Pool CodeHeap ‘profiled nmethods’ is using 5048064 of 121729024 bytes
Pool Compressed Class Space is using 1590792 of 1073741824 bytes
Pool G1 Eden Space is using 50331648 of -1 bytes
Pool G1 Old Gen is using 8238076232 of 8329887744 bytes
Pool G1 Survivor Space is using 0 of -1 bytes
Pool CodeHeap ‘non-profiled nmethods’ is using 4387712 of 121733120 bytes

After gc:
Pool CodeHeap ‘non-nmethods’ is using 1609728 of 8196096 bytes
Pool Metaspace is using 15780936 of -1 bytes
Pool CodeHeap ‘profiled nmethods’ is using 5048064 of 121729024 bytes
Pool Compressed Class Space is using 1590792 of 1073741824 bytes
Pool G1 Eden Space is using 0 of -1 bytes
Pool G1 Old Gen is using 8238071960 of 8329887744 bytes
Pool G1 Survivor Space is using 0 of -1 bytes
Pool CodeHeap ‘non-profiled nmethods’ is using 4387712 of 121733120 bytes

Old Gen is where long-lived objects are placed by the GC, so really it indicates what you already know: some objects are being held onto.

There is a process that tries to remove objects from Old Gen, but as long as there are still references to those objects, it will continue to grow.
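
The pattern to look for is anything that accumulates across iterations; a minimal illustration (with byte arrays standing in for whatever is actually being retained):

```scala
@main def retentionDemo(): Unit =
  var kept = Vector.empty[Array[Byte]]
  for _ <- 1 to 100 do
    // While `kept` is reachable, so is every chunk appended to it; objects
    // that survive enough GC cycles are promoted to Old Gen.
    kept :+= new Array[Byte](1024)
  println(kept.length)  // 100 -- none of the chunks can ever be collected
```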

You could try to walk through that isaConflict function and identify what memory is being allocated. If you use IntelliJ, you can pretty easily execute a program with a profiler, and it will give you memory allocation information as well.

Eclipse Memory Analyzer. jvisualvm almost certainly has similar features, but I’m even less familiar with it than I am with MAT.

Why does this say 340 or 345 processed? I thought you had only 5 trajectories at this point. I guess you decided on a larger run?

Anyway, you went up from 8156MB to 8238MB between consecutive reports, so you’re picking up ~80MB in old gen per five iterations. If it’s roughly linear, 340 iterations would pick up ~6GB, which is roughly consistent with the total usage.

So, you’re almost certainly holding on to extra memory with every iteration.

Now you need to figure out if it is the Encounter object itself that is being held onto somewhere or something else. For this, use the WeakReference trick and see what it says.

The logic is going to be the same regardless of how many iterations, so you should cut it back to 5-10 and browse it in VisualVM, really. But this will also slowly get you towards finding what the problem is.

I did a run with only 5 trajectories so I could analyze it with visualvm without having to wait forever, but I have yet to get any useful information from that.

However, I also kept the data for the larger run with 720 trajectories. I go back to that case for testing new ideas to see if it starts to bog down at around index 340 – and it always does.

Here’s something I don’t understand. The program that is bogging down and running out of heap space is a fairly simple script that reads trajectories from an XML file and checks them for conflicts. It is actually a verification of correctness for another more complex program (a fast-time simulation) that generates the trajectories, which are supposed to be free of conflicts.

But here is what I don’t understand. The fast-time simulation ultimately does the same check for conflicts (using the Encounter class) that the simpler script does, but it does not seem to have a problem with heap space. How can that be? I don’t understand how the verification script can have a memory leak when the fast-time simulation program apparently doesn’t.

(By the way, I can’t get back to debugging this memory leak for a few days at least.)