When to use Array and when to use List

Sorry, I have zero experience with collections.
In Python, most of the time we work with lists, unless you are using NumPy, which has arrays.
In Ruby, AFAIK, array and list are the same thing.

ruby:
irb(main):031:0> x=[1,2,3]
=> [1, 2, 3]
irb(main):032:0> y=["a","b","c"]
=> ["a", "b", "c"]
irb(main):033:0> x.class
=> Array
irb(main):034:0> y.class
=> Array

python:
>>> x=["a","b","c"]
>>> y=[1,2,3]
>>> x.__class__
<type 'list'>
>>> y.__class__
<type 'list'>

scala:
scala> val x = List("a","b","c")
val x: List[String] = List(a, b, c)

scala> val y = Array(1,2,3)
val y: Array[Int] = Array(1, 2, 3)

scala> val x = List("a",1)
val x: List[Any] = List(a, 1)

scala> val y = Array("a",1)
val y: Array[Any] = Array(a, 1)

So I am confused about this. When should I use an Array, and when should I use a List?

Thank you.

Never use Array unless you need it for Java interop, or unless you are writing very-high-performance code and you are absolutely convinced that the only way to make it fast enough is to use Array.

Array never appears at all in normal Scala code.

You should prefer to use immutable collections such as List. But even if you decide you need a mutable collection, Array isn’t the first one you reach for. Because it comes directly from the JVM, it isn’t a proper Scala collection and as a result, has multiple peculiarities.


Array

Never
Well, if you need to optimize the performance of some function and you know how to properly use an array to do that, then go ahead. But outside of such a niche case you should never use Array: arrays are mutable, invariant, don’t have a proper toString, and their equals is by reference instead of by value. They are not real collections but a primitive of the JVM / JS / LLVM runtime.
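
To make a couple of those peculiarities concrete, here is a small sketch comparing Array and List. The value names are just for illustration, and the toString remark in the comment assumes you are running on the JVM.

```scala
val a1 = Array(1, 2, 3)
val a2 = Array(1, 2, 3)
val l1 = List(1, 2, 3)
val l2 = List(1, 2, 3)

// == on arrays compares references, so two arrays with equal contents differ.
val arraysEqual = a1 == a2

// == on lists compares the elements.
val listsEqual = l1 == l2

// To compare array contents you have to ask for it explicitly.
val arraysSameElements = a1.sameElements(a2)

// List has a readable toString; on the JVM, a1.toString is an opaque
// type tag plus hash code rather than the elements.
val listText = l1.toString

// Invariance: a List[Int] can be used as a List[Any] ...
val anys: List[Any] = l1
// ... but `val arr: Array[Any] = a1` does not compile.
```

Here arraysEqual is false while listsEqual and arraysSameElements are both true.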

List

While learning, List can be a good default collection. Other collections like Vector, or just using the abstract Seq, may also be good options while learning.

After learning, List is still a great collection if all you need is linear iteration and (tail) recursion, and you are okay with building it through repeated prepends and a final reverse (if required).
For example, I personally use List 99% of the time.
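
The "prepend, then reverse" pattern mentioned above could look roughly like this. The function squaresOfEvens is a made-up example, not something from the standard library.

```scala
import scala.annotation.tailrec

// Hypothetical helper: collect the squares of the even numbers in `input`.
def squaresOfEvens(input: List[Int]): List[Int] = {
  @tailrec
  def loop(remaining: List[Int], acc: List[Int]): List[Int] =
    remaining match {
      // Results were prepended, so they are in reverse order: flip once at the end.
      case Nil => acc.reverse
      case head :: tail =>
        // Prepending with :: is cheap; appending to a List would copy it each time.
        if (head % 2 == 0) loop(tail, (head * head) :: acc)
        else loop(tail, acc)
    }
  loop(input, Nil)
}

val result = squaresOfEvens(List(1, 2, 3, 4)) // List(4, 16)
```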

Vector

Good for having okay-ish performance for most operations.
Very good for constantly appending data, although cats.data.Chain is better for that, but it has the disadvantage of being outside the stdlib.
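
As a rough illustration of the append-heavy case with the stdlib only (Chain is left out because it needs the cats dependency), something like this works; the names are arbitrary.

```scala
// Build a Vector by appending one element at a time with :+.
val appended = (1 to 5).foldLeft(Vector.empty[Int])((acc, n) => acc :+ n)

// Indexed access is also reasonably fast on a Vector.
val third = appended(2) // 3
```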

ArraySeq

The best collection if all you need is fast access by index.
Also, it is very efficient to create when you use methods like tabulate.
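
A minimal sketch of tabulate, assuming Scala 2.13 or later (where the immutable ArraySeq exists):

```scala
import scala.collection.immutable.ArraySeq

// tabulate(n)(f) builds the n elements f(0), f(1), ..., f(n - 1).
val squares = ArraySeq.tabulate(5)(i => i * i) // ArraySeq(0, 1, 4, 9, 16)

// Access by index is a plain array lookup underneath.
val fourth = squares(3) // 9
```

Unlike a raw Array, this is immutable and compares by value like any other Seq.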


May I ask another related question: when to use Map and when to use HashMap?

Thanks.

Map is the common trait for various implementations of maps, while HashMap is a specific one. So unless you require a specific implementation (e.g. if you need a map with a specific element ordering or specific performance characteristics), use Map. The Map(...) factory method currently gives you a HashMap anyway once there are more than four entries; smaller immutable maps use specialized small-map classes.

For both Map and HashMap there is one version in scala.collection.immutable and one in scala.collection.mutable; the immutable Map is the one available without imports. If you need a mutable Map, a good practice is to import scala.collection.mutable and then use mutable.Map, so it stays clear that this isn’t the default immutable one.
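
Here is a small sketch of that convention. The names ages, older and countWords are invented for the example.

```scala
import scala.collection.mutable

// Plain Map is the immutable one; no import needed.
val ages: Map[String, Int] = Map("alice" -> 30, "bob" -> 25)

// "Adding" an entry returns a new map and leaves `ages` untouched.
val older = ages + ("carol" -> 41)

// Hypothetical helper: the mutable.Map prefix makes the mutability obvious.
def countWords(words: List[String]): mutable.Map[String, Int] = {
  val counts = mutable.Map.empty[String, Int]
  for (w <- words) counts(w) = counts.getOrElse(w, 0) + 1 // updates in place
  counts
}

val counts = countWords(List("a", "b", "a"))
```

After this, ages still has two entries, older has three, and counts("a") is 2.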

I’d also recommend the guide on Scala collections for a more detailed overview of the various collections, their hierarchy, and their differences.


Thanks. I appreciate it.