Seq vs List: which should I choose?

I feel like this question has been asked countless times already, but I’m still unsure.

I read this SO answer a long time ago, which led me to mostly choose Seq for class/method parameters.

However, lately I keep getting tripped up by the fact that Stream is also a Seq [1], and I find myself wishing there were a non-lazy Seq counterpart that would abstract over only List and Vector.

Am I overthinking it? Should I just switch to using List where I commonly used Seq, except in those less common cases where I do care about either Vector (not often) or Stream (very rarely)?


[1] Especially when toSeq-ing an Iterator: the laziness of Stream sometimes causes issues to surface much later than I'd like.

iterator.toSeq produces a List, AFAIR. From Scastie:

println(Iterator(1, 2, 3).toSeq)
// prints: List(1, 2, 3)

In order to get a Stream you need to ask for that explicitly. The default Seq implementation is List.

Thanks Tarsa, it’s possible it wasn’t in the context of toSeq-ing an Iterator, but it definitely happened to me in the past, I’ll try and find how it happened again.

But anyway, per Murphy’s Law, the problem remains :)

You could call toVector on your Seq to make sure you got something non-lazy.
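A minimal sketch of that suggestion, assuming Scala 2.13+ (where LazyList plays the role Stream used to). The names ForceStrictness, lazyValues and strictValues are made up for illustration; the counter only exists to show when elements actually get computed.

```scala
object ForceStrictness {
  // Counts how many elements have actually been computed so far.
  var evaluated = 0

  // A lazy Seq: building it computes nothing (assumes Scala 2.13+ LazyList).
  val lazyValues: Seq[Int] =
    LazyList(1, 2, 3).map { n => evaluated += 1; n * 10 }

  // toVector walks the whole sequence, so every element is computed here.
  def strictValues: Vector[Int] = lazyValues.toVector
}
```

Before strictValues is called the counter is still 0; afterwards it is 3, so any failure hidden in the mapping function would show up at the toVector call rather than somewhere downstream.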

This must have changed with the 2.13 collections revamp.

Welcome to Scala 2.12.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_221).
Type in expressions for evaluation. Or try :help.

scala> Iterator(1, 2).toSeq
res0: Seq[Int] = Stream(1, ?)

Welcome to Scala 2.13.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_221).
Type in expressions for evaluation. Or try :help.

scala> Iterator(1, 2).toSeq
res0: Seq[Int] = List(1, 2)

…but then you’d convert List instances without necessity…

@curoli, that tends to be what I do lately, but that seems pretty sub-optimal (I’d rather have a proper abstraction for the non-laziness)

Personally, if this had bitten me too often, I’d probably stick to “programming to the interface” and enforce strictness on a case-by-case basis.

implicit class StrictifyableSeq[T](val s: Seq[T]) extends AnyVal {

  // Returns s unchanged if it is a known strict collection,
  // otherwise copies it into a Vector.
  def toStrict: Seq[T] =
    s match {
      case _: Vector[T] => s
      case _: List[T] => s
      // Stack, Queue,...
      case _ => s.toVector
    }

}

But yes, a StrictSeq trait would be nice… Not sure, though, whether this would fit nicely into the existing hierarchy.

Considering that most of the time, List is indeed what I want (I only really use Vector when I know for a fact I may need to access specific indices), I might just go with using List in my abstractions.

The main downside I can think of is client code that would heavily use Vector and that would therefore have to .toList a lot. I’m less worried about the Stacks/Queues and the like because I think they’re less common and as such, forcing others to .toList them seems less awkward.

The other thing that troubles me though is that the varargs/splat mechanism uses Seq, not List :(

For data types I almost always use List. If I know I’m going to be doing a lot of concatenation I’ll use cats.data.Chain. I find that I basically never need to do indexing, but if I did I would probably use Vector.

In method signatures where I just need something I can map over I’ll use F[_]: Functor; or if I need to combine the contents I’ll use F[_]: Foldable; or if I need to compute an effectful value over the elements I’ll use F[_]: Traverse, and so on (these abstractions are from Cats). I don’t find the superclass-based abstractions like Seq to be meaningful enough to be useful.
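To make the F[_]: Functor style concrete without pulling in a dependency, here is a rough sketch with a hand-rolled Functor trait. It is a stand-in for illustration only, not the Cats definition, and describeAll is a hypothetical library method.

```scala
object FunctorSketch {
  // Hand-rolled stand-in for a Functor type class (not the Cats one).
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  object Functor {
    // Instances live in the companion so they are found implicitly.
    implicit val listFunctor: Functor[List] = new Functor[List] {
      def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
    }
    implicit val vectorFunctor: Functor[Vector] = new Functor[Vector] {
      def map[A, B](fa: Vector[A])(f: A => B): Vector[B] = fa.map(f)
    }
  }

  // Hypothetical library method: it only requires that F can be mapped over.
  def describeAll[F[_], A](values: F[A])(implicit F: Functor[F]): F[String] =
    F.map(values)(a => s"value: $a")
}
```

Calling FunctorSketch.describeAll(List(1, 2)) gives back a List[String], and the same call with a Vector gives back a Vector[String]; the caller always passes a concrete collection type.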


@tpolecat

For data types I almost always use List

I agree, I almost always use List for datatypes

find that I basically never need to do indexing

Likewise, rarely using it, except when parsing TSV data really.

Regarding cats, I’d rather stick to the standard lib as much as possible. Cats is impressive, but I certainly wouldn’t want to use it in every project. I guess I’m one of those “blue sky” Scala people (I heard that term being used before, hopefully employed correctly here).

PS: love your blog

But this approach doesn’t help to avoid unwanted lazy behavior, either, right?

It does, insofar as your code no longer needs to abstract over a data type like Iterator/Seq.

You’re now handling the abstraction with a type class, and when you use that code, you pass in a concrete data type, whose lazy/strict semantics are less ambiguous.


this approach doesn’t help to avoid unwanted lazy behavior

What do you mean by “unwanted lazy behavior”? The presence of Functor[F] means given F[A] and A => B I can compute an F[B]. Whether that F does things eagerly or lazily shouldn’t make any difference to me.


He’s referring to this aspect of my original question

Oh, you’re talking about turning an Iterator into a Stream and passing it to someone? I recommend you don’t do that. Iterators are extremely hard to reason about and should only be used locally (ideally introduced and eliminated in a single expression).
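As a small illustration of "introduced and eliminated in a single expression" (LocalIterator and evenSquares are made-up names, just a sketch):

```scala
object LocalIterator {
  // The Iterator is created, transformed and consumed in one expression;
  // callers only ever see the strict List result.
  def evenSquares(limit: Int): List[Int] =
    Iterator.range(1, limit + 1).map(n => n * n).filter(_ % 2 == 0).toList
}
```

The Iterator never escapes the method, so nobody downstream has to wonder whether it was already consumed or whether the result is lazy.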

If that’s not what you’re referring to then I remain confused.

I mostly mentioned Iterator as an example.

What I mean is this:

def myLibraryMethod(values: Seq[SomeType]): ???

vs

def myLibraryMethod(values: List[SomeType]): ???

I’ve always used the former, but lately I’ve been tempted to switch to the latter in general, mostly because sometimes I receive a Stream, whose semantics are almost never what I want. But Vector is still fine, so I’m hesitant to switch to List and wish there were a common abstraction over List and Vector but not Stream.
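One way such an abstraction could be approximated today is with a small type class that only has instances for the collections you consider strict. This is just a sketch: StrictSeq, its instances and total are hypothetical names, nothing like this exists in the standard library.

```scala
object StrictSeqSketch {
  // Hypothetical evidence that C is a strict (eagerly evaluated) Seq type.
  sealed trait StrictSeq[C[X] <: Seq[X]]

  object StrictSeq {
    // Only List and Vector get instances; lazy sequences deliberately do not.
    implicit val listIsStrict: StrictSeq[List] = new StrictSeq[List] {}
    implicit val vectorIsStrict: StrictSeq[Vector] = new StrictSeq[Vector] {}
  }

  // Hypothetical library method: accepts List or Vector, rejects anything else.
  def total[C[X] <: Seq[X]](values: C[Int])(implicit ev: StrictSeq[C]): Int =
    values.sum
}
```

With this, total(List(1, 2, 3)) and total(Vector(1, 2, 3)) compile, while passing a Stream or LazyList fails at compile time for lack of an instance. The trade-off is that a value statically typed as plain Seq is rejected too, so callers have to commit to a concrete type at the call site.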