How to check equality of Array[Array[...]]

jimka · April 2, 2021, 7:21am

I know that to check equality of the corresponding elements of two Arrays I need to use the .sameElements method. But when I look into this code, it seems to me that it uses == to check the constituent elements. This won’t work for Array[Array[Int]]. Is there a standard way of doing this? Or do I need to write my own equivalence function with the semantics I want?
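A minimal sketch of the behaviour described above, assuming Scala 2.13 (the object and value names are made up for illustration). For a flat array the elements are compared by value, but for a nested array the elements are themselves arrays, and == on arrays is reference equality:

```scala
object SameElementsDemo {
  val flat1 = Array(1, 2, 3)
  val flat2 = Array(1, 2, 3)
  val nested1 = Array(Array(1, 2), Array(3, 4))
  val nested2 = Array(Array(1, 2), Array(3, 4))

  // Elements are Ints, so the element-wise == compares values.
  val flatSame: Boolean = flat1.sameElements(flat2)

  // Elements are arrays, so the element-wise == compares references,
  // and two separately built inner arrays are never the same reference.
  val nestedSame: Boolean = nested1.sameElements(nested2)
}
```

Here flatSame is true while nestedSame is false, even though both nested arrays hold the same numbers.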

sangamon · April 2, 2021, 11:26am

There used to be a (potentially costly) Array#deep conversion for this purpose, but this has been removed in Scala 2.13. At first glance I can’t find any current builtin solution. Unless I’m missing something (which is not unlikely, though), you’ll have to roll your own (or hook into an existing mechanism like cats’ Eq).
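As a rough sketch of what "rolling your own" could look like, here is a recursive comparison that descends into arrays of any nesting depth. The name deepEq is hypothetical and not part of the standard library:

```scala
object DeepArrayEq {
  // Hypothetical helper: compares arrays element by element, recursing
  // into nested arrays; anything that is not an array falls back to ==.
  def deepEq(a: Any, b: Any): Boolean = (a, b) match {
    case (x: Array[_], y: Array[_]) =>
      x.length == y.length && x.indices.forall(i => deepEq(x(i), y(i)))
    case _ =>
      a == b
  }
}
```

Usage would be along the lines of DeepArrayEq.deepEq(computed, expected). Because the parameters are typed Any, this gives up static type checking of the two arguments, which may or may not be acceptable in a test helper.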

Another question is whether you really need arrays or whether a (perhaps specialized) collection might be more beneficial in the long run, anyway.

jimka · April 2, 2021, 12:45pm

Not a bad question. I’ve never really understood the rules of which methods can be defined on which classes when it involves List of List of List or Seq of Seq of Seq.

What I’m using this for is an alternate constructor for a class.

  object sqMatrix {
    // we can construct an sqMatrix, by providing an Array of Arrays of Double
    def apply(entries: Array[Array[Double]]): sqMatrix = {
      // don't allow arrays of different size in the same sqMatrix,
      //  and force the length of the outer array = length of inner arrays
      //  I.e., the matrix is square.
      assert(entries.forall{a => entries.length == a.length}, s"non-square matrix specified")
      sqMatrix(entries.length, (row, col) => entries(row)(col))
    }
  }

This constructor is mostly used for testing.
The primary constructor, which is the one normally used, is more functional.
Because sqMatrix is a case class, we can construct a new sqMatrix instance,
by specifying a dimension, dim, and a function mapping (Int,Int) to Double.

  case class sqMatrix(dim: Int, tabulate: (Int, Int) => Double) {
    val arr: Array[Double] = Array.tabulate(dim * dim)((i: Int) => tabulate(i / dim, i % dim))
   ...
}

BalmungSan · April 2, 2021, 1:33pm

Why not use ArraySeq?

sangamon · April 2, 2021, 1:58pm

AFAICS you could use Seq[Seq[Double]] (or any specific Seq subtype) as input to the alternative constructor - it’s just used for the “outer” size and as a function, both to be passed to the primary constructor.

I also don’t see where the equality issue enters the picture for this use case…?

sangamon · April 2, 2021, 2:02pm

This would work, but only if all nested arrays are wrapped (and I guess that’s just what the Array#deep extension did). If only the outer layer is wrapped into ArraySeq, you’ll run into the default (identity based) Array#equals() for the nested instances.
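To illustrate the difference, a small sketch assuming Scala 2.13 and the immutable ArraySeq (wrapAll is a made-up helper name):

```scala
import scala.collection.immutable.ArraySeq

object ArraySeqWrapDemo {
  val a = Array(Array(1.0, 2.0), Array(3.0, 4.0))
  val b = Array(Array(1.0, 2.0), Array(3.0, 4.0))

  // Only the outer layer is wrapped: the elements are still raw arrays,
  // so they are compared by reference.
  val outerOnly: Boolean =
    ArraySeq.unsafeWrapArray(a) == ArraySeq.unsafeWrapArray(b)

  // Hypothetical helper that wraps every layer. unsafeWrapArray does not
  // copy, so the wrapped arrays must not be mutated afterwards.
  def wrapAll(m: Array[Array[Double]]): ArraySeq[ArraySeq[Double]] =
    ArraySeq.unsafeWrapArray(m.map(row => ArraySeq.unsafeWrapArray(row)))

  // Both layers are ArraySeq now, so equality is element based throughout.
  val allLayers: Boolean = wrapAll(a) == wrapAll(b)
}
```

Here outerOnly is false and allLayers is true.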

BalmungSan · April 2, 2021, 2:10pm

Yeah, I mean use ArraySeq everywhere instead of Array.

jimka · April 2, 2021, 3:59pm

Ahhh, the equality issue is that in a test case I want to assert that a computed array of arrays is equal to a hard-coded one. For example:

assert(m.inverse =array= Array(Array(1,2,3),Array(2,3,4),Array(3,4,5)))

I’m currently using the following to assert that the distance between two matrices is 0.

assert(m.inverse.dist(sqMatrix(Array(Array(1,2,3),Array(2,3,4),Array(3,4,5)))) == 0.0)

jimka · April 2, 2021, 4:01pm

What’s the advantage of ArraySeq, is it to allow == to test equal components?

sangamon · April 2, 2021, 4:18pm

So #inverse() seems to return the same nested array type and that’s what you want to compare - just changing the alternative factory method param type won’t help, then. Possible options include switching to a different type than raw array throughout (e.g. ArraySeq), “deep conversion” to ArraySeq just for the comparison, writing a custom comparison method or - bazinga! - a custom matcher.

sangamon · April 2, 2021, 4:21pm

It’s a thin wrapper around raw arrays that provides better integration with the collection framework than the implicit extensions for raw arrays alone - including an #equals() implementation based on equality of the elements.

cbley · April 2, 2021, 4:32pm

For this purpose, ScalaTest provides ready-made matchers which work out of the box, so you would not need to roll your own:

array1 should equal(array2)

See ScalaTest

cheapsolutionarchite · April 2, 2021, 4:35pm

This is maybe a bit “dirty”

Welcome to the Ammonite Repl 2.3.8-54-a1dec1cf (Scala 2.13.5 Java 11.0.10)
@ val a = Array( Array("a", "b", "c"), Array("d", "e", "f") )
a: Array[Array[String]] = Array(Array("a", "b", "c"), Array("d", "e", "f"))

@ a.asInstanceOf[Array[Object]]
res1: Array[Object] = Array(Array("a", "b", "c"), Array("d", "e", "f"))

@ java.util.Arrays.deepEquals(res1,res1)
res2: Boolean = true

sangamon · April 2, 2021, 4:45pm

Nice. I assumed that this would only work for flat arrays, but looks like it correctly handles the “deep” comparison, as well.
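The REPL session above compares res1 with itself, which would be true even under reference equality. A sketch with separately built arrays may make the deep comparison more convincing (the object and method names are made up; the cast relies on nested arrays being Object[] on the JVM):

```scala
object DeepEqualsDemo {
  val x = Array(Array(1.0, 2.0), Array(3.0, 4.0))
  val y = Array(Array(1.0, 2.0), Array(3.0, 4.0))
  val z = Array(Array(1.0, 2.0), Array(3.0, 5.0))

  // An Array[Array[Double]] is a double[][] at runtime, and every double[]
  // is an Object, so the cast to Array[Object] succeeds.
  def deep(a: Array[Array[Double]], b: Array[Array[Double]]): Boolean =
    java.util.Arrays.deepEquals(
      a.asInstanceOf[Array[Object]],
      b.asInstanceOf[Array[Object]]
    )
}
```

With these values, deep(x, y) is true and deep(x, z) is false.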

BalmungSan · April 2, 2021, 4:51pm

Correct equality.
Pretty toString
Immutability
Covariance

In general, it is a real collection instead of a JVM primitive.

Plain Arrays should only be used for performance sensitive applications / methods.

jimka · April 2, 2021, 4:51pm

And what would happen if I just define == for my sqMatrix class, which I recall now I’ve already done. This uses sameElements on this.arr and that.arr, which works because .arr is a 1-d array.

    override def equals(that: Any): Boolean = {
      // two sqMatrix instances are considered == if the underlying arrays have the same elements
      // in the corresponding locations.
      that match {
        case that: sqMatrix => (this.dim == that.dim) && (this.arr sameElements that.arr)
        case _ => false
      }
    }
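One thing worth keeping in mind with a custom equals is that hashCode should agree with it, otherwise equal matrices may behave oddly as Set elements or Map keys. A minimal sketch with a stand-in class (SqMatrixSketch is hypothetical and not the class from this thread; it uses java.util.Arrays.equals rather than sameElements so that both methods follow the same rules):

```scala
final class SqMatrixSketch(val dim: Int, tabulate: (Int, Int) => Double) {
  val arr: Array[Double] =
    Array.tabulate(dim * dim)(i => tabulate(i / dim, i % dim))

  // Arrays.equals compares doubles the way java.lang.Double.equals does:
  // NaN equals NaN, and 0.0 differs from -0.0.
  override def equals(that: Any): Boolean = that match {
    case m: SqMatrixSketch => dim == m.dim && java.util.Arrays.equals(arr, m.arr)
    case _                 => false
  }

  // Built from the same data as equals, so equal matrices hash alike.
  override def hashCode: Int = 31 * dim + java.util.Arrays.hashCode(arr)
}
```

Two instances built from different but equivalent functions then compare equal and can be used interchangeably as keys.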

sangamon · April 3, 2021, 12:06pm

It certainly is a good idea in general to have a proper algebra defined over sqMatrix rather than just using it as an overt array wrapper, and then a custom #equals() would be the way to go, indeed, no matter what the actual internal implementation looks like.

Another thing that feels somewhat odd… The only reason for using raw arrays I can imagine is performance concerns. However, if you keep creating new arrays via #arr, performance doesn’t seem to be a primary concern…?

jimka · April 3, 2021, 12:34pm

Ideally, I would like to accept any ordered sequence of ordered sequences of numbers, with the restriction that it designates a square array of numbers. Such objects only serve for initializing the sqMatrix. Thereafter, all the operations are in terms of a more abstract interface to an underlying 1-d array. I decided not to put too much effort into providing constructors for every sensible way of designating the entries.
I think I used arrays because it is easy to check whether the number of rows equals the number of columns. Although, I admit the choice is somewhat arbitrary.
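For what it is worth, the squareness check reads much the same with Seq. A sketch of a Seq-based entry point, assuming Scala 2.13 (fromRows is a hypothetical helper that returns the two arguments the primary constructor expects):

```scala
object SquareInput {
  // Hypothetical helper: validates that the rows form a square and
  // returns the dimension plus a (row, col) lookup function.
  def fromRows(rows: Seq[Seq[Double]]): (Int, (Int, Int) => Double) = {
    val dim = rows.length
    require(rows.forall(_.length == dim), "non-square matrix specified")
    // Copy into indexed sequences so lookups are fast even for Lists.
    val indexed = rows.map(_.toIndexedSeq).toIndexedSeq
    val lookup: (Int, Int) => Double = (row, col) => indexed(row)(col)
    (dim, lookup)
  }
}
```

A call such as SquareInput.fromRows(List(List(1.0, 2.0), List(3.0, 4.0))) works for List, Vector, or ArraySeq input. A raw Array[Array[Double]] would need its inner arrays converted first, for example with .map(_.toSeq).toSeq.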

sangamon · April 4, 2021, 3:47pm

Sounds fine. In this thread I got the impression, however, that a) #inverse returns an array rather than a sqMatrix instance, i.e. that the abstract interface is leaking its internal representation, and that b) the underlying array gets recreated on each usage, which sounds highly inefficient and would outweigh any hypothetical performance benefit from using arrays by orders of magnitude.

Just as easy as with Seq (or any subtype thereof).

jimka · April 4, 2021, 3:56pm

Not sure what you mean by this? Yes, I never modify the internal array. So yes, operations which create a new sqMatrix instance create a new underlying 1-d array. Is that bad?