Lscala.Tuple2;@2856cf17

Can someone help me understand a problem often encountered when working with arrays?
Arrays are sometimes mysteriously printed as

[Lscala.Tuple2;@724a1b65: Array[(String, Int)]

It’s a really curious thing for new students to encounter.
Is this only an artifact of old Scala versions, or is it evidence that the programmer’s approach is wrong?

The following use of mapValues works fine in 2.12. It is easy to understand, and concise.

val tale = Array("it", "was", "the", "best", "of", "times", "it", "was", 
                 "the", "worst", "of", "times", "it", "was", "the", "age", 
                 "of", "wisdom", "it", "was", "the", "age", "of", "foolishness", 
                 "it", "was", "the", "epoch", "of", "belief", "it", "was", "the", 
                 "epoch", "of", "incredulity", "it", "was", "the", "season", "of", 
                 "light", "it", "was", "the", "season", "of", "darkness", "it", 
                 "was", "the", "spring", "of", "hope", "it", "was", "the", 
                 "winter", "of", "despair")
tale.groupBy(identity).mapValues(_.size)

But when trying the same thing in 2.13 we get a deprecation warning.

When I follow the advice issued by the compiler, I get a MapView, which is not computed.

tale.groupBy(identity).view.mapValues(_.size)

And to see the content I added .toArray and got [Lscala.Tuple2;@724a1b65: Array[(String, Int)]

What is the correct way to pretty print such an Lscala object?

BTW I also see that using .toMap rather than .toArray

tale.groupBy(identity).view.mapValues(_.size).toMap

prints a much more reasonable result.

HashMap(belief -> 1, it -> 10, spring -> 1, was -> 10, hope -> 1, times -> 2, foolishness -> 1, light -> 1, of -> 10, the -> 10, worst -> 1, winter -> 1, epoch -> 2, best -> 1, darkness -> 1, despair -> 1, wisdom -> 1, incredulity -> 1, season -> 2, age -> 2): scala.collection.immutable.Map[String,Int]
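For what it's worth, here is a minimal sketch of forcing the lazy view into a strict collection that prints readably. It assumes Scala 2.13 or later; `words` is a shortened, hypothetical stand-in for `tale`, and the sort order is only there to make the output deterministic.

```scala
// Shortened stand-in for `tale` (hypothetical data, for illustration only).
val words = Vector("it", "was", "the", "best", "of", "times",
                   "it", "was", "the", "worst", "of", "times")

// The view is lazy; toMap forces it into a strict immutable Map.
val counts: Map[String, Int] =
  words.groupBy(identity).view.mapValues(_.size).toMap

// A Vector of pairs also has a readable toString, unlike an Array of pairs.
// Sort by descending count, then alphabetically.
val sorted: Vector[(String, Int)] =
  counts.toVector.sortBy { case (word, n) => (-n, word) }

println(sorted)
// Vector((it,2), (of,2), (the,2), (times,2), (was,2), (best,1), (worst,1))
```

Any strict Scala collection (Map, Vector, List) would do here; the point is only to avoid ending on an Array.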

Arrays are a construct of the JVM. When you call toString on an array, it gives you only the JVM type name and a hash code, and that’s it (this can’t be overridden). You are best off converting to another type of collection if you want to conveniently see all of the values (List, Seq, Vector, …).

Alternatively you can use collection.mkString(",") to get a comma separated string of all the values in any collection type.
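A small sketch of both suggestions, using a made-up two-element array (`pairs` is a hypothetical name, not from the question):

```scala
val pairs: Array[(String, Int)] = Array("it" -> 10, "times" -> 2)

// Default JVM rendering: type name plus a hash that varies from run to run.
println(pairs)

// mkString renders the elements themselves.
val joined = pairs.mkString(", ")
println(joined)        // (it,10), (times,2)

// Converting to a Scala collection gives a readable toString as well.
println(pairs.toList)  // List((it,10), (times,2))
```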


Maybe I’m wrong, but my impression is that the offending object is inside some other data structure which is inside some data structure which is … Thus a top-level call to mkString doesn’t really help.
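If the array really is buried inside other structures, one option is a small recursive renderer. The following is only a sketch: `show` is a hypothetical helper, not something from the standard library, and the output format (square brackets for collections, bare parentheses for tuples) is an arbitrary choice.

```scala
// Hypothetical helper: renders a value, descending into arrays wherever
// they appear inside options, collections, tuples, or case classes.
def show(value: Any): String = value match {
  case arr: Array[_] =>
    arr.iterator.map(show).mkString("Array(", ", ", ")")
  case opt: Option[_] =>
    opt.fold("None")(v => s"Some(${show(v)})")
  case it: Iterable[_] =>
    it.iterator.map(show).mkString("[", ", ", "]")
  case p: Product if p.productArity > 0 =>
    // Tuples are printed without their class name; case classes keep theirs.
    val prefix = if (p.productPrefix.startsWith("Tuple")) "" else p.productPrefix
    p.productIterator.map(show).mkString(prefix + "(", ", ", ")")
  case other =>
    String.valueOf(other)
}

println(show(Option(Array(1, 2, 3))))       // Some(Array(1, 2, 3))
println(show(List(("a", Array(1, 2)))))     // [(a, Array(1, 2))]
```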

What does [Lscala.Tuple2 ...] tell me? Is this Scala telling me that it is a tuple, or is it Java trying to print a Scala object of unrecognized type which it doesn’t know how to print?

Object.toString (Java Platform SE 8) returns getClass().getName() + '@' + Integer.toHexString(hashCode()).

Class.getName (Java Platform SE 8) returns, for arrays:

If this class object represents a class of arrays, then the internal form of the name consists of the name of the element type preceded by one or more '[' characters representing the depth of the array nesting. The encoding of element type names is as follows:

Element Type          Encoding
boolean               Z
byte                  B
char                  C
class or interface    Lclassname;
double                D
float                 F
int                   I
long                  J
short                 S

Therefore [Lscala.Tuple2;@7ceec737 means an array of scala.Tuple2 instances where the identity hash code of that array is 7ceec737.
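To see the encoding in action, here is a short sketch that prints the JVM class names of a few arrays and rebuilds the default toString by hand. The value names are made up for illustration.

```scala
val pairs  = Array("it" -> 10)        // array of scala.Tuple2
val ints   = Array(1, 2, 3)           // array of primitive int
val matrix = Array(Array(1.0, 2.0))   // two-dimensional array of double

println(pairs.getClass.getName)   // [Lscala.Tuple2;
println(ints.getClass.getName)    // [I
println(matrix.getClass.getName)  // [[D

// Rebuild what Object.toString produces: class name, '@', hex hash code.
val rebuilt =
  pairs.getClass.getName + "@" + Integer.toHexString(pairs.hashCode())
println(rebuilt == pairs.toString)  // true
```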

Actually, Map.mapValues in Scala 2.12 also returns a view, but one hidden behind a Map type. The following code prints hello world 4 times:

val mapped = Map(1 -> "hello world").mapValues(println)
mapped(1)
mapped(1)
mapped(1)
mapped(1)

This code prints hello world once:

val mapped = Map(1 -> "hello world").map(kv => kv._1 -> println(kv._2))
mapped(1)
mapped(1)
mapped(1)
mapped(1)
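The same effect can be observed in 2.13 with an explicit view. This is a sketch assuming Scala 2.13 or later; the counter `evaluations` and the other names are hypothetical and exist only to make the re-evaluation visible.

```scala
var evaluations = 0

// Lazy: the function runs again on every lookup.
val lazyLengths = Map(1 -> "hello world").view.mapValues { s =>
  evaluations += 1
  s.length
}
lazyLengths(1)
lazyLengths(1)
val afterTwoLookups = evaluations   // 2: one evaluation per lookup

// Strict: toMap runs the function once per entry and stores the results.
val strictLengths = lazyLengths.toMap
val afterForcing = evaluations
strictLengths(1)
strictLengths(1)
val afterStrictLookups = evaluations

println(afterTwoLookups)                     // 2
println(afterStrictLookups == afterForcing)  // true: lookups no longer re-run it
```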

It just tells you that you have an Array of tuples. The string comes from calling toString on the Array[Tuple2], which is implemented by the JVM.


Note that this is one of the many reasons why Arrays are used far less in Scala than in some other languages. They’re extremely rare in my own codebases, because they’re a nuisance: they’re a primitive type that often misbehaves like this in the Scala context. I would strongly recommend focusing on other types, except for cases where you specifically need an Array for performance reasons…


Arrays are not collections; they are a low-level primitive of the JVM. The only valid use case for Arrays is performance-sensitive code.

Please, use any real collection instead (especially if you are teaching Scala to students), like: List, Vector or ArraySeq.

Reasons why Arrays are discouraged:

  • They are mutable.
  • They are invariant.
  • They are not really part of the Scala collections hierarchy, thus all the usual methods (like map) are added via extension methods and implicit conversions.
  • They do not have a pretty toString.
  • They do not have a by-value equals.
  • They can have extremely poor performance due to constant memory allocation and data copying when used like normal Scala collections.
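A few of these points can be checked directly. This is just a sketch assuming Scala 2.13 or later (for immutable ArraySeq); the value names are arbitrary.

```scala
import scala.collection.immutable.ArraySeq

val a = Array(1, 2, 3)
val b = Array(1, 2, 3)

// No by-value equals: == compares references for arrays.
println(a == b)               // false
println(a.sameElements(b))    // true

// Real collections compare by value and print readably.
val v = Vector(1, 2, 3)
println(v == Vector(1, 2, 3)) // true
println(v)                    // Vector(1, 2, 3)
println(ArraySeq(1, 2, 3))    // ArraySeq(1, 2, 3)

// Arrays are mutable in place.
a(0) = 99
println(a.sameElements(b))    // false

// Arrays are invariant: `val anys: Array[Any] = a` would not compile.
```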

It seems like if a function parameter is declared Seq, I can pass an Array. It could be less confusing to just talk about List and Vector if they both work as Seq.

Yes – these (and indeed, most collections) are Seq.

That is another good example of why asking for Seq is usually a bad idea.
We already talked about that the other day, so I won’t go further. Just a reminder that IMHO trying to abstract over collections is not really a good idea.
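On the point about passing an Array where a Seq is expected, here is a sketch of what happens when the conversion is written out explicitly. It assumes Scala 2.13 or later, where Seq means immutable.Seq and toSeq copies the array; `total` and the value names are hypothetical.

```scala
import scala.collection.immutable.ArraySeq

def total(xs: Seq[Int]): Int = xs.sum

val arr = Array(1, 2, 3)

// toSeq copies the array, so later mutation of `arr` is not visible.
val copied: Seq[Int] = arr.toSeq

// unsafeWrapArray shares the underlying array without copying.
val wrapped: Seq[Int] = ArraySeq.unsafeWrapArray(arr)

arr(0) = 100

println(total(copied))   // 6
println(total(wrapped))  // 105
```

The "immutable" wrapped Seq changing underneath you is one concrete way an Array can surprise code that only asked for a Seq.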

Is there a reason to use Vector instead of Seq or IndexedSeq?

It’s a matter of context. Seq is high-level and vague – it doesn’t tell you much about the performance characteristics of the data, so it’s hard to know how to use it appropriately.

I’ve observed a growing consensus: data members should usually be relatively precise (List, Vector, etc), so that you know how your data is going to work in practice. OTOH, it’s generally fine for function parameters to be broader (eg, Seq), if they work reasonably well with all concrete classes that conform to that type’s contract.
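As a rough illustration of that convention (all names here are hypothetical): the stored data uses a precise type, while the function accepts the broader Seq.

```scala
// Data member: precise type, so its performance characteristics are known.
final case class Chapter(words: Vector[String])

// Function parameter: broad type, works with List, Vector, ArraySeq, ...
def longest(words: Seq[String]): Option[String] =
  if (words.isEmpty) None else Some(words.maxBy(_.length))

val chapter = Chapter(Vector("it", "was", "the", "best"))

println(longest(chapter.words))           // Some(best)
println(longest(List("age", "wisdom")))   // Some(wisdom)
println(longest(Nil))                     // None
```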


So as not to repeat myself, please see this comment and the subsequent answers.

@BalmungSan My issue with all those explanations is that most of the time, in many situations, we are talking about collections containing just a few elements. In other words: is there a real performance impact if a collection contains between a few and a few hundred elements? What collection do I use for basic usage, where I just want to return 10, 20 or 100 elements, for example?

My main concern is not really performance, but rather the ability to reason about my code.

So, I just use Lists most of the time.
They are easy to understand, easy to use, allow you to write tail-recursive algorithms when you need to, and have OK performance for most use cases (especially for small numbers of elements, as you say).
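For example, the word count from the original question could be written as a tail-recursive loop over a List. This is only a sketch; `countWords` and `loop` are names made up for illustration.

```scala
import scala.annotation.tailrec

def countWords(words: List[String]): Map[String, Int] = {
  // @tailrec makes the compiler verify that the recursion is a loop.
  @tailrec
  def loop(rest: List[String], acc: Map[String, Int]): Map[String, Int] =
    rest match {
      case Nil          => acc
      case word :: tail => loop(tail, acc.updated(word, acc.getOrElse(word, 0) + 1))
    }
  loop(words, Map.empty)
}

val wordCounts = countWords(List("it", "was", "the", "best", "it", "was"))
println(wordCounts("it"))    // 2
println(wordCounts("best"))  // 1
```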

The Scala REPL includes a hack that prints arrays nicely if the result happens to be an array or one of a handful of data types containing arrays as members (I don’t know which), but in general printing out an array results in the internal JVM toString.

scala> Array(1,2,3)
val res11: Array[Int] = Array(1, 2, 3)

scala> List(Array(1,2,3))
val res12: List[Array[Int]] = List(Array(1, 2, 3))

scala> ("foo", true, Array(1,2,3))
val res13: (String, Boolean, Array[Int]) = (foo,true,Array(1, 2, 3))

scala> Option(Array(1,2,3)) // nope
val res14: Option[Array[Int]] = Some([I@5af08232)

scala> Array(1,2,3).toString // in general
val res16: String = [I@340b7fca

OTOH explicit println bypasses the pretty printing:

scala> Array(1, 2, 3)
res0: Array[Int] = Array(1, 2, 3)

scala> println(Array(1, 2, 3))
[I@2e64c541
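So outside the REPL's own result printing, the array has to be rendered explicitly. A few ways to do that, as a sketch (`xs` is an arbitrary name):

```scala
val xs = Array(1, 2, 3)

// Scala side: mkString with a prefix, separator, and suffix.
val viaMkString = xs.mkString("Array(", ", ", ")")
println(viaMkString)   // Array(1, 2, 3)

// Java side: java.util.Arrays.toString handles primitive arrays.
val viaJava = java.util.Arrays.toString(xs)
println(viaJava)       // [1, 2, 3]

// Or convert to a collection that has a readable toString.
println(xs.toList)     // List(1, 2, 3)
```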