I'd like a weak hash table which allows GC when value is otherwise unreferenced

#1

I found an interesting class, WeakHashMap, in the scala library which seems to be a wrapper around the corresponding java implementation. This hash table allows entires to be garbage collected when the key becomes unreferenced.

Does anyone know if the opposite is available? I need the entries to become GCable when the value becomes unreferenced. Or is there a trick I can use to convert the free-on-unreferenced-key hash to a free-on-unreferenced-value hash? If I try to build a parallel table which maps the values to the keys, then all of a sudden nothing will every become unreferenced, because the other table will be referencing it.

#2

I found a java class WeakValueHashMap.java (provided by jboss) which is documented to do what I want. I think it is not wrapped in the Scala library.

Is it possible? recommended? should-avoid? to use this class in Scala? Is there a better alternative?

#3

As a rule of thumb, it’s usually fine to use Java libraries from Scala. They’re not always fully idiomatic for Scala (in particular, beware of nulls, and be aware that they tend to be mutable), but you can sometimes fix that by writing your own wrapper if you care, and most can be used right out of the box so long as you pay attention to their idiosyncracies.

#4

Is there a flow description written of how to use a java class in Scala. I assume, I’ll need to add something to build.sbt, do I need to declare the scala class somehow, e.g., to inherit from the java class?

#5

No, you do not need any special handling for Java classes, Scala is compatible with most Java code directly. In fact, some of the standard library classes, e.g. String, are Java classes.
If you want to compile Java sources together with Scala, for SBT it should be enough to place it under src/main/java. If you add a managed Java dependency, you can do that just like with Scala libraries, just use single % everywhere (the %% are otherwise used for appending the scala version against which a library is compiled to the package name).

In case of this specific class from JBoss, you may have some problems because of it being too old to have generics. This means its methods aren’t really type checked and it will always return Object when looking up elements, which maps to Scala’s AnyRef, so you’ll need casts (with asInstanceOf). It may be possible to change the class to support generics or you can write a wrapper that does the casts and checks incoming values.

#6

In principle you could just nudge the compiler a bit and use the standard collection conversions.

import scala.collection._
import scala.collection.JavaConverters._
import java.{util => ju}

val m: mutable.Map[Key, Value] = 
  new WeakValueHashMap().asInstanceOf[ju.Map[Key, Value]].asScala

I’m not 100% sure that the conversion wrapper doesn’t interfere with the weak references, but I’d be surprised if it were keeping hard references to the values somewhere.

1 Like
#7

@sangamon, in your suggestion where should WeakValueHashMap come from? Do I still need to install it in src/main/java?

#8

Just like @crater2150 described - either that or

libraryDependencies += "org.jboss" % "jboss-common-core" % "2.5.0.Final"

in build.sbt and then

import org.jboss.util.collection._

in your code.

EDIT: Just realized that in 2.5.0, this class is properly generified, so no asInstanceOf required:

val m = new WeakValueHashMap[Key, Value].asScala

1 Like
#9

it seems (If I used your edited suggestion) that I no longer need

import java.{util => ju}

Here is the code I’m using, that seems to work. And Yes I have to check that get returns None or Some(null).

  import scala.collection.mutable

  val hash: mutable.Map[(Int,Bdd,Bdd), BddNode] = {
    // this code was suggested by Patrick Römer
    // https://users.scala-lang.org/t/id-like-a-weak-hash-table-which-allows-gc-when-value-is-otherwise-unreferenced/4681/8?u=jimka
    import scala.collection.JavaConverters._
    import org.jboss.util.collection._
    new WeakValueHashMap[(Int,Bdd,Bdd), BddNode].asScala
  }

  var hashCount:Long = -1

  def apply(label: Int, positive: Bdd, negative: Bdd): Bdd = {
    if (positive == negative)
      positive
    else {
      hash.get((label, positive, negative)) match {
        case Some(bdd: BddNode) => bdd
        case None | Some(null) => { // Some(null) is necessary because the get of WeakValueHashMap sometimes returns Some(null)
          hashCount = 1 + hashCount
          if ( 0 == (hashCount % 100000))
            println(s"bdd hash size=${hash.size} hashCount=$hashCount")

          val bdd = BddNode(label, positive, negative)
          hash((label, positive, negative)) = bdd
          bdd
        }
      }
    }
  }

Here is the output when I run my tests, so I see that the GC does indeed remove some entries from the hash table.

bdd hash size=0 hashCount=0
bdd hash size=81416 hashCount=100000
bdd hash size=165414 hashCount=200000
bdd hash size=261157 hashCount=300000
bdd hash size=361157 hashCount=400000
bdd hash size=456206 hashCount=500000
bdd hash size=556206 hashCount=600000
bdd hash size=656206 hashCount=700000
bdd hash size=736279 hashCount=800000
bdd hash size=836279 hashCount=900000
bdd hash size=936279 hashCount=1000000
bdd hash size=1036279 hashCount=1100000
bdd hash size=1104525 hashCount=1200000
bdd hash size=1204525 hashCount=1300000
...
...
#10

Some(null) should only occur if null has explicitly been put as a value, which we definitely want to avoid in Scala. As far as I can see, you should get None for a purged weak reference…?

If this really occurs, you could at least in principle unify this to None via hash.get(x).flatMap(Option(_)).

#11

Here is the stack trace I get if I take out the | Some(null) from the pattern match. As per your comment about hash.get(x).flatMap(Option(_)), is that really better than just including the Some(null) case in the patter matching cases?

/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java -ea -Didea.test.cyclic.buffer.size=1048576 "-javaagent:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar=53782:/Applications/IntelliJ IDEA CE.app/Contents/bin" -Dfile.encoding=UTF-8 -classpath "/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit-rt.jar:/Applications/IntelliJ IDEA CE.app/Contents/plugins/junit/lib/junit5-rt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/deploy.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/cldrdata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/jaccess.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/jfxrt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/nashorn.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/javaws.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/jfxswt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/management-agent.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/plugin.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/lib/ant-javafx.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/lib/dt.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/lib/javafx-mx.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/lib/jconsole.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/lib/packager.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/lib/sa-jdi.jar:/Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/lib/tools.jar:/Users/jimka/sw/common-lisp/regular-type-expression/cl-robdd/src/cl-robdd-scala/target/scala-2.12/test-classes:/Users/jimka/sw/common-lisp/regular-type-expression/cl-robdd/src/cl-robdd-scala/target/scala-2.12/classes:/Users/jimka/.ivy2/cache/junit/junit/jars/junit-4.10.jar:/Users/jimka/.ivy2/cache/org.scalatest/scalatest_2.12/bundles/scalatest_2.12-3.0.5.jar:/Users/jimka/.ivy2/cache/org.scalactic/scalactic_2.12/bundles/scalactic_2.12-3.0.5.jar:/Users/jimka/.ivy2/cache/org.scala-lang.plugins/scala-continuations-library_2.12/bundles/scala-continuations-library_2.12-1.0.3.jar:/Users/jimka/.ivy2/cache/org.scala-lang.modules/scala-xml_2.12/bundles/scala-xml_2.12-1.0.6.jar:/Users/jimka/.ivy2/cache/org.scala-lang/scala-reflect/jars/scala-reflect-2.12.8.jar:/Users/jimka/.ivy2/cache/org.scala-lang/scala-library/jars/scala-library-2.12.8.jar:/Users/jimka/.ivy2/cache/org.hamcrest/hamcrest-core/jars/hamcrest-core-1.1.jar:/Users/jimka/.ivy2/cache/org.jboss/jboss-common-core/jars/jboss-common-core-2.5.0.Final.jar:/Users/jimka/.ivy2/cache/org.jboss.logging/jboss-logging-spi/jars/jboss-logging-spi-2.1.2.GA.jar:/Users/jimka/.ivy2/cache/org.scala-sbt/test-interface/jars/test-interface-1.0.jar:/Users/jimka/.ivy2/cache/org.scalacheck/scalacheck_2.12/jars/scalacheck_2.12-1.14.0.jar" com.intellij.rt.execution.junit.JUnitStarter -ideVersion5 -junit4 dimacs.DimacsSuite
objc[91154]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java (0x10a1ff4c0) and /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib (0x10a2834e0). One of the two will be used. Which one is undefined.

scala.MatchError: Some(null) (of class scala.Some)

	at bdd.Bdd$.apply(Bdd.scala:251)
	at bdd.Bdd$.bddOp(Bdd.scala:270)
	at bdd.Or$.apply(Bdd.scala:353)
	at bdd.Bdd$.$anonfun$clauseToBdd$1(Bdd.scala:228)
	at bdd.Bdd$.$anonfun$clauseToBdd$1$adapted(Bdd.scala:228)
	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
	at scala.collection.immutable.List.foldLeft(List.scala:89)
	at bdd.Bdd$.clauseToBdd(Bdd.scala:228)
	at bdd.Bdd$.$anonfun$cnfToBdd$1(Bdd.scala:232)
	at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
	at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
	at scala.collection.immutable.List.foldLeft(List.scala:89)
	at bdd.Bdd$.cnfToBdd(Bdd.scala:232)
	at dimacs.dimacsSimplify$.$anonfun$semanticsStressTest$3(dimacs.scala:335)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)

#12

It would be better for my application if rather than null being put into the hash table, if simply the key/value pair were removed from he table. Null would never be accidentally found, and the hash table size would be smaller.

#13

This doesn’t help much - it only shows where the execution stumbles over the null, not where it is introduced.

If I run this code…

import scala.collection._

case class BddNode(id: Int)

val hash: mutable.Map[Int, BddNode] = {
  import scala.collection.JavaConverters._
  import org.jboss.util.collection._
  new WeakValueHashMap[Int, BddNode].asScala
}

var hashCount:Long = -1

def apply(id: Int): BddNode = {
  hash.get(id) match {
    case Some(null) =>
      println("OUCH!")
      sys.exit(-1)
    case Some(bdd: BddNode) if bdd != null => bdd
    case None =>
      hashCount = 1 + hashCount
      if ( 0 == (hashCount % 100000))
        println(s"bdd hash size=${hash.size} hashCount=$hashCount")
      val bdd = BddNode(id)
      hash(id) = bdd
      bdd
  }
}

for(i <- 0 until 50000000) {
  apply(i)
}

…I cannot reproduce the Some(null). Maybe I’m doing something wrong/different here? Otherwise I’d suspect that null explicitly is introduced as a value somewhere in your code. (I really do hope that this is not something esoteric like different implementations of weak references in different JDKs or something like that…)

Not immediately, in particular since instead of matching on None | Some(null) you could just match on _ in the given case. :slight_smile:

However…

implicit class SafeMap[K, V](val m: Map[K, V]) extends AnyVal {
  def safeGet(k: K): Option[V] = m.get(k).flatMap(Option(_))
}

Now you could just use #safeGet() everywhere and have no Some(null) trouble at the call sites.

#14

Indeed. :slight_smile: From the code it is not obvious where you’re producing null (in BddNode#apply()?), but if you’re deliberately using null as a value anywhere, just don’t.

#15

Here is line 251

#16

The only place my code is modifying the hash table is on line 259.

#17

I’m happy to make the entire project available to you if you like.
The git repo is: https://gitlab.lrde.epita.fr/jnewton/regular-type-expression.git
The commit id is d0286174e89541f92d051d9637b62d76fd389908
The scala project starts in regular-type-expression/cl-robdd/src/cl-robdd-scala.
The issue occurs when running the test suite named DimacsSuite.

My suspicion is that the null is introduced by the weak hash map, because when I swap the definition of the hash variable below with the val hash which is commented out, the null issue never occurs.

  import scala.collection.mutable

  //val hash = mutable.Map.empty[(Int, Bdd, Bdd), BddNode] // for strong hash table
  val hash: mutable.Map[(Int,Bdd,Bdd), BddNode] = { // for weak value hash table
    // this code was suggested by Patrick Römer
    // https://users.scala-lang.org/t/id-like-a-weak-hash-table-which-allows-gc-when-value-is-otherwise-unreferenced/4681/8?u=jimka
    import scala.collection.JavaConverters._
    import org.jboss.util.collection._
    new WeakValueHashMap[(Int,Bdd,Bdd), BddNode].asScala
  }
#18

What happens when you run this code? I believe it contains everything necessary to reproduce the error, except for the actual code for WeakValueHashMap. The null problem occurs by running the main method in BddTest defined in the same file.

#19

The nature of my code is the following. A BddNode instance has a positive and negative child, and what is called a label (which is an integer). The positive and negative children are either another BddNode or a BddTerm which has no children. Every time I create a new BddNode I register it in the hash map as the value of the key (label, positive, negative). Thereafter, before allocating a new BddNode with a label/positive/negative, I first check wether such already exists in the hash map, and simply return that if it exists, or allocate a new one otherwise (and register it in the hash). This scheme assures that any two BddNodes which are isomorphic are actually eq to each other.

The weak hash is used so that if nothing is still referencing a BddNode, then I know nothing can be isomorphic to it, so It can be removed from the hash map and GC’ed.

So the hash table has 1000 or tens of 1000s of objects which have interweaved pointers one to another. But if any is every found with no pointers to it, it can be removed, thus perhaps leaving others which can thereafter be removed.

It is important to remove them eventually as memory can fill up with objects which no one cares about any longer.

#20

I see, but I only have one call site. The use of the mutable hash table is isolated to one single area of code. BTW, what is the correct way in scala to make sure that only one function (method) accesses a particular variable? In Lisp, I’d just wrap the variable and method in a let, and then nothing outside the let could reference the variable, and the method would effectively be a closure. As far as I can tell, there is no let-like construct in Scala.