Performance issues after upgrading to 2.13 from 2.12.8

Hi all, I’ve been working on a research project implemented in Scala for a long while. Performance is always an issue, so I ported the project to 2.13 hoping for some improvement in that respect.
I have large data in LongMap (up to a million entries), Array (up to half a million), and some smaller data in various mutable/immutable collections like HashMap, ArrayBuffer, Vector etc. The code uses filter, map, and for loops heavily, and partition, groupBy, and sortBy less heavily. I have automated system tests that run every time I check my work in to version control.
After porting to 2.13, my system tests, which previously took about 28 minutes, jumped to 50 minutes. Needless to say, I am in a panic!
I read most of the documentation on the new collections design, and tried to address usages that might create or copy extra collections. I rewrote chained transformations to use iterators, and sometimes views. I changed “.values.filter” calls to “.valuesIterator.filter” calls. Despite all my efforts I could not get anywhere close to 2.12.8 performance. I must still be missing some of the implications of the new design.
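
For anyone following along, here is a minimal sketch of the kind of rewrite described above, assuming Scala 2.13. The object name, the helper names, and the sample data are hypothetical and only illustrate the three styles (strict chain, iterator chain, view chain). It shows that the results agree; it does not measure anything.

```scala
import scala.collection.mutable

object ChainedTransforms {
  // Hypothetical sample data: keys 0 until n, each mapped to its square.
  def build(n: Int): mutable.LongMap[Long] = {
    val m = mutable.LongMap.empty[Long]
    var i = 0L
    while (i < n) { m(i) = i * i; i += 1 }
    m
  }

  // Strict chain: filter and map each build an intermediate collection.
  def strictSum(m: mutable.LongMap[Long]): Long =
    m.values.filter(_ % 2 == 0).map(_ + 1).sum

  // Iterator chain: elements pass through filter and map one at a time,
  // so no intermediate collection is built.
  def iteratorSum(m: mutable.LongMap[Long]): Long =
    m.valuesIterator.filter(_ % 2 == 0).map(_ + 1).sum

  // View chain: also lazy, evaluated only when sum traverses it.
  def viewSum(m: mutable.LongMap[Long]): Long =
    m.values.view.filter(_ % 2 == 0).map(_ + 1).sum
}
```

All three return the same value for the same map, so swapping one style for another should be safe as long as the result is traversed only once.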
One clue might be to understand what was lazy in 2.12.8 that no longer is in 2.13 with respect to iteration and transformation.
I would appreciate any pointers and suggestions on potential problem areas that I should look at.

I suggest using a profiler. Since you have both versions available (fast and slow), and the difference is significant, you have a good chance of quickly finding where the problem is.

You can use JDK Mission Control, which is open source. It seems currently you have to build it from source, see:

If you can run your project on JDK 11, Mission Control can connect to the JVM without any additional flags. On JDK 8, the necessary counterpart in the JDK (the “Flight Recorder”) is not free and needs the -XX:+UnlockCommercialFeatures flag; it was open-sourced later.

I just learned that Zulu provides builds for Mission Control: https://www.azul.com/products/zulu-mission-control

Please excuse my ignorance - could you consider using Spark / data streams?

A random possibility, simply because I happened to come across it when upgrading a little outreach demo and the relative slowdown was similar*:

The re-aliasing of scala.Seq to scala.immutable.Seq can cause some performance losses in programs that make a lot of use of ArrayBuffers, because it also means that .toSeq now converts to an immutable type.

In particular, I had a regularly used function that accepted a supertype (I forget which, but something like IterableOnce) but internally called toSeq. The most common type passed in was an ArrayBuffer. In 2.12, that toSeq call was a no-op. In 2.13, it had work to do to produce an immutable collection.
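
A minimal sketch of that situation, assuming Scala 2.13. The object and method names are hypothetical, not the original function. The first method calls toSeq on a general input, which has to build an immutable copy when given an ArrayBuffer. The second accepts scala.collection.Seq instead, so a mutable buffer can be used as-is.

```scala
import scala.collection.mutable.ArrayBuffer

object ToSeqCost {
  // Hypothetical helper in the style described above: accepts a general
  // type and calls toSeq internally. On 2.13, toSeq produces an
  // immutable.Seq, so a mutable input such as ArrayBuffer is copied.
  def sumViaToSeq(xs: Iterable[Int]): Int = {
    val seq = xs.toSeq
    seq.sum
  }

  // One possible alternative: accept scala.collection.Seq, the common
  // supertype of mutable and immutable sequences, and skip the conversion.
  def sumViaCollectionSeq(xs: scala.collection.Seq[Int]): Int =
    xs.sum

  // True only if toSeq handed back the very same buffer instance.
  def toSeqReusesInstance(buf: ArrayBuffer[Int]): Boolean =
    buf.toSeq eq buf
}
```

On 2.13, toSeqReusesInstance should return false for an ArrayBuffer, which is the extra work described above. Whether that copy matters in a given program is something only a profiler can confirm.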

(* - yes, the similarity in relative slowdown between your application and mine is probably totally coincidental because they are very different applications, but it happened to remind me!)