Append/prepend to Seq

jimka · November 18, 2020, 10:35am

I see that the methods :+ and +: can be used to append or prepend to a Seq. Is there a method which choses to prepend or append depending on which is easier?

For example if it is a List, I want to prepend, if it is a Vector I want to append. In some algorithms it does not matter semantically which order the Seq is maintained in.

Jasper-M · November 18, 2020, 11:00am

I don’t think there is such a method. I guess you could write it yourself. Anyway I suspect that it only matters in case of List. In all other std library Seqs append and prepend will probably be equally (in)efficient.

jimka · November 18, 2020, 11:01am

But if I understand correctly, a Seq is very often a List, right?

Jasper-M · November 18, 2020, 11:06am

In general, taken over all applications, probably yes. List is currently also the default Seq implementation. But that doesn’t allow one to draw conclusions about any specific application of course.

jimka · November 18, 2020, 12:25pm

Is this what you are referring to about implementing it myself? If not, how should I do it?

  def conj[T](obj:T, seq:Seq[T]):Seq[T] = seq match {
    case Seq() => Seq(obj)
    case l @ List(_::_) => (obj :: l) 
    case _ => seq :+ obj
  }

BTW IntelliJ complains about the type of the 2nd clause.

Screenshot 2020-11-18 at 13.23.56

jimka · November 18, 2020, 12:30pm

If I define it like this, the List[Any] warning goes away.

  def conj[T](obj:T, seq:Seq[T]):Seq[T] = seq match {
    case Seq() => Seq(obj)
    case l @ List(_::_) => obj +: seq
    case _ => seq :+ obj
  }

sangamon · November 18, 2020, 12:54pm

The compiler is still right - don’t mute it, listen. This will only match nested lists, i.e. never. Use either l: List[T] or l @ _ :: _.

The idea feels pretty odd, anyway - with Seq, the order is supposed to matter…

jimka · November 18, 2020, 1:06pm

Ouch, you’re right.

what about this?

  def conj[T](obj:T, seq:Seq[T]):Seq[T] = seq match {
    case Seq() => Seq(obj)
    case l @ _::_ => obj :: l
    case _ => seq :+ obj
  }

Curiously the compiler is happy with this, and my tests pass, but IntellJ marks :: in red.

jimka · November 18, 2020, 1:08pm

Not sure what you mean. Which data structure would you recommend if the order of the sequence doesn’t matter for my application?

sangamon · November 18, 2020, 1:30pm

I don’t think there’s anything in the standard collection hierarchy that is order-agnostic, supports adding elements and doesn’t have additional constraints (like uniqueness), so I might just go with Seq - or, more likely, List, Vector,…, specifically if I worried about performance characteristics.

I’m not saying this helper is bad, it just feels a bit odd. Using a collection and not caring about one of its specific characteristics feels different than deliberately un-forcing this characteristic under the hood…

sangamon · November 18, 2020, 1:39pm

Looks like the type of l is inferred as Seq[T]. If the compiler is happy with this and infers List[T], this might hint at a bug (or a potential improvement) in the presentation compiler…? I’d just go with the l: List[T] variant, then.

BalmungSan · November 18, 2020, 1:44pm

If the order doesn’t matter and you just want to add a bunch of not unique elements, I would just use List everywhere and always prepend. It is the simplest and more efficient solution.

jimka · November 19, 2020, 7:57am

@BalmungSan , It might not be me who is creating the collection. It might be the client who is calling my code.

BalmungSan · November 19, 2020, 1:43pm

Then force your user to pass in a List.

The argument for Seq is that I should let the user decide, but as I have discussed multiple times, you have two big broad sets of operations on a sequence. Operations that will traverse all the sequence once and as such can be abstract to the underlying collection, and operations that depend on the structure of the sequence to be efficient.

For the first group of operations, I have personally found typeclasses to be easier to use and understand; but if you prefer inheritance then fantastic.
For the second group of operations, I have seen three opinions:

Ask for the appropriate type (my opinion).
Ask for a broader type and transform to the appropriate type (I sometimes do this, but usually for small functions the transformation is expensive enough that the benefits are lost).
Ask for a broader type and just use it in any way, the performance would depend on what the user actually gave you (I personally do not like this, but I may see its value for small programs).

jimka · November 20, 2020, 9:45am

I cannot do this, because sometimes it is Scala itself which is passing the sequence to me. For example, var-args function pass their arguments as a Seq if I understand correctly. I cannot force that to change.

I think I understand and sympathize with your discontent with the problems of the Seq abstraction. However, in my way of thinking, the Seq abstraction, for better or worse, is carved into the way Scala works.

sangamon · November 20, 2020, 10:32am

…which likely gives you an Array-based implementation and thus nothing you’d want to call :+ on at all if you’re worried about performance. (Just guessing, haven’t verified this.)

BalmungSan · November 20, 2020, 12:34pm

That is why I told you the other day that I (again, just myself) would not use varargs internally in my code, but rather just an auxiliary apply at the edge of my API.

jimka · November 20, 2020, 3:10pm

Yes, I think I understand your point of view.