How to control the number of CPUs used when using par

Dear Scala Group,

I am using Scala 3, and one of my favorite parts is using par (after import scala.collection.parallel._ and import scala.collection.parallel.CollectionConverters._) to automatically turn my iterations into parallel computations. But I have faced one problem for a very long time: I cannot control the number of CPUs it uses.

It seems that it will use all of my available CPUs at once if needed. Following the official documentation, I tried to use

import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.ForkJoinTaskSupport

val taskPar = gz2zstTasks.par
// Limit this collection to a dedicated 2-thread pool.
taskPar.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(2))

Then, in the .jvmopts file right under my project root, I set -Djava.util.concurrent.ForkJoinPool.common.parallelism=2 to control the number of CPUs, but it does not work.

I also tried adding -XX:ActiveProcessorCount=20 in .jvmopts, but that does not work either.

I think I must be missing something here, since it should be easy to control the number of CPUs when using par.

Really appreciate any help on this. Thanks!

Songpeng

I haven’t actually done this myself, but the recommended way to change the parallelism on an individual collection is like so:

import scala.collection.parallel._
val tasksupport = new ForkJoinTaskSupport(new java.util.concurrent.ForkJoinPool(2))

val pc = mutable.ParArray(1, 2, 3)
pc.tasksupport = tasksupport

If you want all your parallel collection operations to run using 2-ish threads, you can set them each to use that tasksupport; if you want each to get its own pool of 2-ish threads, you can create a new one each time, etc.
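For instance, a rough sketch of both variants might look like this (untested; the collections here are just placeholders):

import scala.collection.parallel._
import scala.collection.parallel.CollectionConverters._
import java.util.concurrent.ForkJoinPool

// One shared pool of 2 worker threads, reused by several collections.
val shared = new ForkJoinTaskSupport(new ForkJoinPool(2))

val xs = (1 to 100).toVector.par
val ys = (1 to 100).toVector.par
xs.tasksupport = shared // both collections now draw from the same 2 threads
ys.tasksupport = shared

// Or give a collection its own private 2-thread pool.
val zs = (1 to 100).toVector.par
zs.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(2))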

Parallel collections, unlike the Future stack, do not use an implicit execution context; they read the global one and use it. You can apparently set it via system properties as described in the description of the global field in ExecutionContext.

I also haven’t set system properties in a very, very long time, but that’s where to start. (The odd thing about them is that the values are strings.)
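For reference, the properties documented on the global field are scala.concurrent.context.minThreads, scala.concurrent.context.numThreads, scala.concurrent.context.maxThreads, and scala.concurrent.context.maxExtraThreads. Usually you would pass them as JVM flags (e.g. -Dscala.concurrent.context.maxThreads=2). A hedged sketch of setting them programmatically instead, which should only work if nothing has touched the global context yet (the object name is made up):

object LimitGlobalPool {
  def main(args: Array[String]): Unit = {
    // These must be set before scala.concurrent.ExecutionContext.global is
    // initialized, i.e. before the first parallel operation runs.
    sys.props("scala.concurrent.context.numThreads") = "2"
    sys.props("scala.concurrent.context.maxThreads") = "2"
    // ... the rest of the program goes here.
  }
}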

2 Likes

Thank you! I didn’t know about ExecutionContext before; I will give it a look. It looks like I can set these as Java parameters in .jvmopts. Will try it.

If the global config works, great.

If you need to preserve TaskSupport when forking a task, the first issue expresses the expectation that setting the support on the inner collection should work, and the second issue describes funky interactions when attempting that. (I don’t remember the details or what happened to JDK 9 ForkJoinPool support. Example test leveraging JDK 9: scala/test/files/jvm/scala-concurrent-tck-b.scala at 2.13.x · scala/scala · GitHub)

Ditto.
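For context, the nested case those issues describe looks roughly like this: an outer parallel collection whose tasks build inner parallel collections, with the expectation that setting tasksupport on the inner one pins its parallelism. Untested sketch; the numbers are arbitrary.

import scala.collection.parallel._
import scala.collection.parallel.CollectionConverters._
import java.util.concurrent.ForkJoinPool

// A dedicated 2-thread pool for the inner collections.
val innerSupport = new ForkJoinTaskSupport(new ForkJoinPool(2))

val results = (1 to 8).toVector.par.map { i =>
  val inner = (1 to 1000).toVector.par
  inner.tasksupport = innerSupport // expectation: inner work uses at most 2 threads
  inner.map(_ * i).sum
}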

@som-snytt Thanks for your reply! I don’t know anything about TaskSupport. My aim is simply to let my program use a few CPUs instead of eating all the resources at once.

@Ichoran Following your suggestions, I put

-Dscala.concurrent.context.numThreads=2
-Dscala.concurrent.context.maxThreads=4

in the .jvmopts file right under my project root. I am using sbt for my project.
It seems that I still cannot control the total number of processors; it just uses all the available ones. Any suggestions on this? I think sbt should read the .jvmopts for me.

Thanks!

I tried it with scala-cli and using javaOpt, and also with sbt and .jvmopts, and both just worked.

I suggest you share a minimal project that demonstrates the issue, and what command you run.

It’s worth deleting your ~/.sbt to exclude other factors.
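Something along these lines should be enough as a reproducer; run it and watch CPU usage in top or Activity Monitor (untested sketch, the object name is made up):

import scala.collection.parallel.CollectionConverters._

object MinimalPar {
  def main(args: Array[String]): Unit = {
    // Confirm the property actually reached this JVM.
    println(sys.props.get("scala.concurrent.context.maxThreads"))
    // Busy-spin on every worker thread the global pool hands out.
    List.fill(500)(0).par.map(_ => while (true) {})
  }
}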

2 Likes

Thanks, I don’t know what happened in my configuration (I deleted ~/.sbt, but it still did not work). I will follow your suggestion and create a minimal project for this test.

Are you running your tests in a forked JVM? .jvmopts controls sbt’s own JVM, but it does not control forked JVMs; those are controlled by the javaOptions setting, as per sbt Reference Manual — Forking

1 Like

@SethTisue Thanks! I tried to set javaOptions instead of .jvmopts. But it does not work.

// Build-wide settings
ThisBuild / organization := "io.github.beyondpie"
ThisBuild / organizationName := "zulab"
ThisBuild / scalaVersion := "3.7.1"
ThisBuild / logLevel := Level.Info
ThisBuild / resolvers += "Bioviz".at(
  "https://nexus.bioviz.org/repository/maven-releases/")

ThisBuild / Compile / scalacOptions := List(
  "-encoding",
  "utf8",
  "-feature",
  "-language:implicitConversions",
  "-language:existentials",
  // "-experimental",
  "-unchecked",
  "-explain-types",
  "-explain",
  "-deprecation"
)

// Ref: https://www.scala-sbt.org/1.x/docs/Multi-Project.html
lazy val bioscala = (project in file("bioscala"))
  .settings(
    name := "bioscala",
    version := "0.7.0",
    Test / logBuffered := false,
    javaOptions ++= Seq(
      "Dscala.concurrent.context.numThreads=2",
      "Dscala.concurrent.context.maxThreads=4"
    ),
    libraryDependencies ++= Seq(
      "org.scalatest" %% "scalatest" % "3.2.19" % "test",
      "com.lihaoyi" %% "os-lib" % "0.11.4",
      "org.scala-lang.modules" %% "scala-parallel-collections" % "1.2.0",
      // only works on scala 3.6.3 now
      // "com.lihaoyi" % "ammonite" % "3.0.2" % "test" cross CrossVersion.full,
      "commons-io"           % "commons-io"  % "2.19.0",
      "org.jsoup"            % "jsoup"       % "1.20.1",
      // I removed some other libraries here.
    )
  )
lazy val pt = (project in file("100.project"))
  .dependsOn(bioscala)
  .settings(
    name := "pt",
    version := "0.9"
  )
lazy val DE = (project in file("12.DE"))
  .dependsOn(bioscala, pt)
  .settings(
    name := "DE",
    version := "1.0"
  )

Here is my project’s build.sbt. I put javaOptions inside the bioscala project, while my test is under the DE project. I am not sure if there is something I got wrong when setting up sbt. I have a script under src/test/scala, and I use Test/run in project 12.DE to run this parallel test from sbt.

If I put javaOptions inside 12.DE, the test reports “[warn] Test / run / javaOptions will be ignored, Test / run / fork is set to false”, and it still uses more processors than I expected (I think scala.concurrent.context.numThreads=2 should limit it to 2 CPUs?).

This works for test; per the instructions, use the Test / run scoped settings for run.

//Test / run / fork := true
Test / fork := true

Test / javaOptions ++= Seq(
  "-Dscala.concurrent.context.numThreads=2",
  "-Dscala.concurrent.context.maxThreads=4"
)
2 Likes

Hi @som-snytt
Thanks for correcting me. Yes, I added your suggested configuration to the test settings (under one of my projects with tests, 12.DE). It still uses more CPUs than configured.

It appears you left out the dashes before the Ds.

This can be separated into two parts:

  • Is the system property actually being set in the JVM process in question?
  • Is the system property having the effect you intended?

It’s important to determine which part is the point of failure.

You can distinguish the two by inserting something like println(sys.props("scala.concurrent.context.numThreads")) into the code you’re running.
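If you want a single program that does both checks, a rough sketch like the following would do (the thread-counting part is only a heuristic: it records which worker threads actually execute the tasks, and the count should be on the order of the limit you configured):

import java.util.concurrent.ConcurrentHashMap
import scala.collection.parallel.CollectionConverters._

object CheckParallelism {
  def main(args: Array[String]): Unit = {
    // Part 1: is the system property set in this JVM process?
    println(sys.props.get("scala.concurrent.context.numThreads"))
    println(sys.props.get("scala.concurrent.context.maxThreads"))

    // Part 2: is it having the intended effect? Record the distinct
    // threads that execute the parallel tasks.
    val seen = ConcurrentHashMap.newKeySet[String]()
    (1 to 10000).toVector.par.foreach { _ =>
      seen.add(Thread.currentThread.getName)
      Thread.sleep(1) // give the pool a chance to spread the work around
    }
    println(s"distinct worker threads: ${seen.size}")
  }
}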

@SethTisue Thanks for your reply!

It appears you left out the -s before the Ds.

Yes, I’ve updated it, thanks, but the result has not changed.

println(sys.props("scala.concurrent.context.numThreads"))

It says:

[info] running (fork) controlParallelCollection
[info] 2

So I think the system property actually is set. I am using Scala 3.7.1 and JDK 22.0.1-internal (2024-04-16). Not sure if this affects the results?

My own observation aligns with Som’s. If I do this:

% scala --dep org.scala-lang.modules::scala-parallel-collections:1.2.0 \
    -Dscala.concurrent.context.maxThreads=2
Welcome to Scala 3.7.0 (21.0.7, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.

scala> import scala.collection.parallel.CollectionConverters._

scala> List.fill(500)(0).par.map(_ => while(true) {})

I see 200% CPU usage in macOS’s Activity Monitor, as expected.

And if I leave out the -Dscala.concurrent.context.maxThreads=2, then I get much higher CPU usage — around 1150%.

Note that in my experience it is sufficient to set maxThreads only (presumably because numThreads defaults to the number of available processors and the result is then capped by maxThreads).

I think you need to:

  1. Verify that you are able to reproduce my results yourself, using the reproduction steps shown above
  2. Do some investigating, apply some rigor to the problem, and figure out what it is that you are doing differently

Having me, Rex, and Som all try to guess what might be going on in code we haven’t seen just doesn’t seem to be working. Or rather, we seem to have made some progress, but not all the way to a solution.

@SethTisue Thanks! I can reproduce your results! I will look into my code. Thanks again!

1 Like