I feel like this question has been asked countless times already, but I’m still unsure.
I read this SO answer a long time ago, which led me to mostly choose Seq for class/method parameters.
However, lately I keep getting tripped up by the fact that Stream is also a Seq[1], and I find myself wishing there were a non-lazy Seq counterpart that would abstract over only List and Vector.
Am I overthinking it? Should I just switch to using List where I commonly used Seq, except in those less common cases where I do care about either Vector (not often) or Stream (very rarely)?
[1] Especially when toSeq-ing an Iterator, where the laziness of Stream sometimes causes issues to surface much later than I'd like.
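To make the footnote concrete, here is a minimal sketch of the kind of delayed failure being described. The names (LazySeqDemo, parse, load) are made up for illustration, and it uses Stream directly rather than going through an Iterator. Stream is deprecated in favour of LazyList from 2.13 on, so expect a deprecation warning there.

```scala
object LazySeqDemo {
  // Counts how many elements have actually been parsed so far.
  var parsedCount = 0

  def parse(s: String): Int = {
    parsedCount += 1
    s.toInt // throws NumberFormatException on bad input
  }

  // Callers only see a Seq[Int]; at runtime it is a Stream whose tail is unevaluated.
  def load(): Seq[Int] = {
    val mapped = Stream("1", "2", "oops").map(parse)
    mapped
  }

  def main(args: Array[String]): Unit = {
    val xs = load()
    // Stream evaluates its head eagerly and defers the rest.
    println(s"elements parsed right after map: $parsedCount")
    // The bad element is only hit once something traverses the Seq.
    val result = scala.util.Try(xs.sum)
    println(s"traversal failed: ${result.isFailure}")
  }
}
```

The call to load() succeeds, and the exception only shows up wherever the Seq is first traversed, which may be far from the code that built it.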
Thanks Tarsa. It’s possible it wasn’t in the context of toSeq-ing an Iterator, but it definitely happened to me in the past; I’ll try to find how it happened again.
This must have changed with the 2.13 collections revamp.
Welcome to Scala 2.12.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_221).
Type in expressions for evaluation. Or try :help.
scala> Iterator(1, 2).toSeq
res0: Seq[Int] = Stream(1, ?)
Welcome to Scala 2.13.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_221).
Type in expressions for evaluation. Or try :help.
scala> Iterator(1, 2).toSeq
res0: Seq[Int] = List(1, 2)
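If the goal is to get the same result on both versions, one option is to skip toSeq and name the strict collection you want. A small sketch (the object name is only for illustration):

```scala
object StrictFromIterator {
  // toList and toVector build strict collections regardless of Scala version,
  // so the static type can stay Seq without inviting a Stream.
  def asList: Seq[Int] = Iterator(1, 2).toList
  def asVector: Seq[Int] = Iterator(1, 2).toVector

  def main(args: Array[String]): Unit = {
    println(asList)   // List(1, 2)
    println(asVector) // Vector(1, 2)
  }
}
```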
Personally, if this had bitten me too often, I’d probably stick to “programming to the interface” and enforce strictness on a case-by-case basis.
implicit class StrictifyableSeq[T](val s: Seq[T]) extends AnyVal {
  def toStrict: Seq[T] =
    s match {
      case _: Vector[T] => s
      case _: List[T]   => s
      // Stack, Queue,...
      case _            => s.toVector
    }
}
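For what it’s worth, here is roughly how that helper could be used at a boundary. This is only a sketch: the helper is restated inside a hypothetical StrictSyntax object so the snippet compiles on its own (in Scala 2 an implicit class has to live inside an object, class, or trait), and StrictDemo/normalise are made-up names.

```scala
object StrictSyntax {
  implicit class StrictifyableSeq[T](val s: Seq[T]) extends AnyVal {
    def toStrict: Seq[T] =
      s match {
        case _: Vector[_] => s          // already strict, keep as is
        case _: List[_]   => s          // already strict, keep as is
        case _            => s.toVector // anything else is copied into a Vector
      }
  }
}

object StrictDemo {
  import StrictSyntax._

  // Force strictness once, where the Seq enters code that cares about it.
  def normalise(input: Seq[Int]): Seq[Int] = input.toStrict

  def main(args: Array[String]): Unit = {
    val list: Seq[Int] = List(1, 2, 3)
    println(normalise(list) eq list)       // true: the List is returned unchanged
    println(normalise(Stream(1, 2, 3)))    // Vector(1, 2, 3)
  }
}
```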
But yes, a StrictSeq trait would be nice… Not sure, though, whether this would fit nicely into the existing hierarchy.
Considering that most of the time, List is indeed what I want (I only really use Vector when I know for a fact I may need to access specific indices), I might just go with using List in my abstractions.
The main downside I can think of is client code that would heavily use Vector and that would therefore have to .toList a lot. I’m less worried about the Stacks/Queues and the like because I think they’re less common and as such, forcing others to .toList them seems less awkward.
The other thing that troubles me though is that the varargs/splat mechanism uses Seq, not List.
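To illustrate the varargs point: inside the method the repeated parameter is seen as a Seq, so a List-based API would have to convert at that boundary. A minimal sketch, with made-up names:

```scala
object VarargsDemo {
  // xs is typed as Seq[Int] inside the method, whatever the caller passed.
  def total(xs: Int*): Int = {
    val asList: List[Int] = xs.toList // explicit conversion if List is required
    asList.sum
  }

  def main(args: Array[String]): Unit = {
    println(total(1, 2, 3))          // 6
    println(total(List(4, 5): _*))   // 9, splatting an existing List
  }
}
```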
For data types I almost always use List. If I know I’m going to be doing a lot of concatenation I’ll use cats.data.Chain. I find that I basically never need to do indexing, but if I did I would probably use Vector.
In method signatures where I just need something I can map over I’ll use F[_]: Functor; or if I need to combine the contents I’ll use F[_]: Foldable; or if I need to compute an effectful value over the elements I’ll use F[_]: Traverse, and so on (these abstractions are from Cats). I don’t find the superclass-based abstractions like Seq to be meaningful enough to be useful.
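For readers who haven’t seen this style, here is a rough sketch of the shape of an F[_]: Functor signature. To keep it free of dependencies, the Functor trait below is a hand-rolled stand-in, not Cats’ actual definition, and doubled is a made-up example method.

```scala
object FunctorSketch {
  // Hypothetical minimal type class standing in for cats.Functor.
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  implicit val listFunctor: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }

  implicit val vectorFunctor: Functor[Vector] = new Functor[Vector] {
    def map[A, B](fa: Vector[A])(f: A => B): Vector[B] = fa.map(f)
  }

  // The method only asks for "something I can map over".
  // The caller picks the concrete F, and gets the same F back.
  def doubled[F[_]: Functor](fa: F[Int]): F[Int] =
    implicitly[Functor[F]].map(fa)(_ * 2)

  def main(args: Array[String]): Unit = {
    println(doubled(List(1, 2, 3)))   // List(2, 4, 6)
    println(doubled(Vector(1, 2, 3))) // Vector(2, 4, 6)
  }
}
```

With Cats on the classpath the trait and instances would come from the library instead; the signature of doubled would look the same.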
Likewise, rarely using it, except when parsing TSV data really.
Regarding Cats, I’d rather stick to the standard lib as much as possible. Cats is impressive, but I certainly wouldn’t want to use it in every project. I guess I’m one of those “blue sky” Scala people (I heard that term being used before, hopefully employed correctly here).
It does, insofar as your code no longer needs to abstract over a data type such as Iterator/Seq.
You’re now handling the abstraction with a type class, and when you use that code, you pass in a concrete data type, whose lazy/strict semantics are less ambiguous.
this approach doesn’t help to avoid unwanted lazy behavior
What do you mean by “unwanted lazy behavior”? The presence of Functor[F] means given F[A] and A => B I can compute an F[B]. Whether that F does things eagerly or lazily shouldn’t make any difference to me.
Oh, you’re talking about turning an Iterator into a Stream and passing it to someone? I recommend you don’t do that. Iterators are extremely hard to reason about and should only be used locally (ideally introduced and eliminated in a single expression).
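A small sketch of what “introduced and eliminated in a single expression” can look like, next to the kind of surprise you get when an Iterator escapes. The names and the sample rows are made up.

```scala
object LocalIteratorDemo {
  val rows: List[String] = List("a\t1", "b\t2")

  // The iterator is created and fully consumed within one expression,
  // so nothing single-use leaks out to callers.
  def total: Int = rows.iterator.map(_.split('\t')(1).toInt).sum

  // By contrast, an iterator that is held onto can only be drained once.
  def drainTwice(): (List[Int], List[Int]) = {
    val it = Iterator(1, 2, 3)
    (it.toList, it.toList) // second traversal finds nothing left
  }

  def main(args: Array[String]): Unit = {
    println(total)        // 3
    println(drainTwice()) // (List(1, 2, 3),List())
  }
}
```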
If that’s not what you’re referring to then I remain confused.
I’ve always used the former, but lately I’ve been tempted to switch to the latter in general, mostly because sometimes I receive a Stream, whose semantics are almost never what I want. But Vector is still fine, so I’m hesitant to switch to List and wish there were a common abstraction over List and Vector but not Stream.
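One way such an abstraction could be approximated today, without touching the collections hierarchy, is a marker type class that only has instances for the strict collections you want to accept. This is purely a sketch: Strict, StrictEvidence and total are hypothetical names, nothing like this exists in the standard library.

```scala
object StrictEvidence {
  // Hypothetical marker: evidence that the collection type C is strict.
  sealed trait Strict[C[_]]

  object Strict {
    implicit val listIsStrict: Strict[List] = new Strict[List] {}
    implicit val vectorIsStrict: Strict[Vector] = new Strict[Vector] {}
    // Deliberately no instance for Stream (or LazyList).
  }

  // Accepts any Seq subtype for which Strict evidence exists.
  def total[C[X] <: Seq[X]](xs: C[Int])(implicit ev: Strict[C]): Int = xs.sum

  def main(args: Array[String]): Unit = {
    println(total(List(1, 2, 3)))  // 6
    println(total(Vector(4, 5)))   // 9
    // total(Stream(1, 2, 3))      // does not compile: no Strict[Stream] in scope
  }
}
```

The trade-off is that the argument must be statically known to be a List or a Vector; a value typed as plain Seq[Int] would be rejected as well, which may or may not be what you want.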