Extending versus wrapping Vector types

Russ · December 25, 2024, 10:46pm

As I have discussed in earlier posts, I often find that I would like to add methods to a Vector of some particular type. One approach is to just write functions in the type’s companion object that take a Vector of that type as an argument. As the number of such functions grows, however, that does not seem like the most elegant solution.

The next logical alternative is to wrap the Vector in a case class. For example, here are partial code snippets of two classes that I use to represent flight trajectories:

case class Track(time: Scalar, pos: Position, alt: Scalar=0):
  ...

case class Tracks(tracks: Vector[Track] = Vector()):

  import tracks.*

  export tracks. { isEmpty, nonEmpty, length, head, last }
  export tracks. { sliding, foreach, apply, withFilter, map }

  def :+ (t: Track) = copy(tracks :+ t)
  def +: (t: Track) = copy(t +: tracks)
  def ++ (t: Tracks) = copy(tracks ++ t.tracks)
  def ++ (v: Vector[Track]) = copy(tracks ++ v)

  def reverse = copy(tracks.reverse)
  def negateTime = copy(tracks.map(_.negateTime))
  def reverseTime = negateTime.reverse

  def timeShift(delt: Scalar): Tracks = if delt == 0 then this else
      copy(tracks.map(_.timeShift(delt)))

  def tail = copy(tracks.tail)
  def take(i: Int) = copy(tracks.take(i))
  def drop(i: Int) = copy(tracks.drop(i))
  def takeRight(i: Int) = copy(tracks.takeRight(i))
  def dropRight(i: Int) = copy(tracks.dropRight(i))

  def takeWhile(f: Track => Bool) = copy(tracks.takeWhile(f))
  def dropWhile(f: Track => Bool) = copy(tracks.dropWhile(f))
  def filter(f: Track => Bool) = copy(tracks.filter(f))

  def takeStart(t: Scalar) = copy(tracks.takeWhile(_.time <= startTime + t))
  def takeEnd(t: Scalar) = copy(tracks.dropWhile(_.time < endTime - t))

  ...

As you can see, I am using imports and exports and manually adding Vector methods and using copy to elevate results of type Vector[Track] to class Tracks. This is a pattern that I do with several different types, and it seems that it should be doable without all the boilerplate. I am wondering if there is a cleaner and simpler way to do this.

It seems that I should just be able to do something like

case class Tracks extends Vector[Track]:
  def myMethod1(...) ...
  def myMethod2(...) ...
  ...

and be done with all the boilerplate. Is there a reason this is not allowed? Am I missing something obvious?

spamegg1 · December 26, 2024, 9:36am

I’m not sure what the reason is. It’s a sealed abstract class in the standard library, so there must be a technical reason, probably related to its complex implementation: there are separate implementations, Vector0 through Vector6, for different tree depths, and it uses a radix-balanced tree. Maybe there is another collection class that can be extended? Compiler / library people might enlighten us.

To avoid boilerplate you can use extension methods. But, if you are writing such a “beefy” class with so many methods, maybe it’s a good idea to implement your own from scratch. Otherwise I’d stick to extensions.
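The extension-method route can be sketched as follows. This is a minimal sketch, not code from the thread: Track is reduced to two Double fields, and timeShift and takeStart are stand-ins for the methods in the original post. Defining the extensions in the Track companion should make them available on any Vector[Track] without an import, since the companion is part of the implicit scope of that type.

```scala
// Simplified stand-in for the Track class in the original post.
final case class Track(time: Double, alt: Double = 0.0):
  def timeShift(delt: Double): Track = copy(time = time + delt)

object Track:
  // Extensions on Vector[Track]; results stay plain Vector[Track],
  // so nothing needs to be re-wrapped.
  extension (tracks: Vector[Track])
    def timeShift(delt: Double): Vector[Track] =
      if delt == 0.0 then tracks else tracks.map(_.timeShift(delt))

    // Keep the points within t time units of the first point.
    def takeStart(t: Double): Vector[Track] =
      tracks.headOption.fold(tracks)(h => tracks.takeWhile(_.time <= h.time + t))

val flight = Vector(Track(0.0), Track(10.0), Track(20.0))
val shifted = flight.timeShift(5.0)
val firstTen = flight.takeStart(10.0)
```

The trade-off is that there is no wrapper instance, so there is nowhere to cache per-trajectory results such as lazy vals.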

I’m not really sure why you’re doing the manual exports (just to avoid writing tracks.?), feels like a very strange pattern. At that point I would realize “I shouldn’t be doing this, let’s change the design.”

Just my two cents, could be wrong.

jducoeur · December 26, 2024, 3:16pm

Not part of the team, but I’ve wandered around this code before in order to implement serialization, and yeah, just so: here’s the source code for Vector, showing the subclasses. (Indeed, it’s a pretty complex tree of classes, when you dig into it.)

Russ · December 26, 2024, 6:53pm

Thanks for the replies, but I am still wondering why it has to be “sealed.” This is not a major issue for me, but I tend to be a bit obsessive about avoiding boilerplate when possible. I want my code to be the minimal required to specify the actual functionality, without extraneous junk. That’s a large part of the reason that I use Scala to start with.

I am wondering if I can create a generic wrapper for Vector that would allow me to extend it. If so, I think it could be useful for many potential applications, including time series analysis and signal processing.

BalmungSan · December 26, 2024, 7:57pm

Extending Vector (assuming it were possible) wouldn’t give you what you want. It is mostly an interface, and the implementations live in the subclasses, which are hidden from all of us.
So you would need to extend all of those as well, and that would not save you much TBH.

Maybe you don’t actually need to extend Vector per se, but rather IndexedSeq, and use IndexedSeqOps to get most of the stuff for free.
Your data structure would still be a wrapper over a Vector.
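A rough sketch of that suggestion, modeled on the pattern in the official custom-collections guide rather than on anything posted in this thread. Track is simplified to a single Double field, and duration and from are hypothetical helpers added for illustration. The class wraps a Vector and overrides the three members that tell the collections framework how to rebuild the same type.

```scala
import scala.collection.immutable.{IndexedSeq, IndexedSeqOps}
import scala.collection.mutable

final case class Track(time: Double)

final class Tracks private (underlying: Vector[Track])
    extends IndexedSeq[Track]
    with IndexedSeqOps[Track, IndexedSeq, Tracks]:

  // The two members every IndexedSeq must provide.
  def apply(i: Int): Track = underlying(i)
  def length: Int = underlying.length

  // These overrides make take, drop, filter, tail, etc. return Tracks.
  override protected def fromSpecific(coll: IterableOnce[Track]): Tracks =
    Tracks.from(coll)
  override protected def newSpecificBuilder: mutable.Builder[Track, Tracks] =
    Vector.newBuilder[Track].mapResult(Tracks.from)
  override def empty: Tracks = Tracks.from(Nil)

  // Hypothetical domain method, available on every result of type Tracks.
  def duration: Double = if isEmpty then 0.0 else last.time - head.time

object Tracks:
  def from(coll: IterableOnce[Track]): Tracks = new Tracks(Vector.from(coll))
  def apply(tracks: Track*): Tracks = from(tracks)

val ts = Tracks(Track(0.0), Track(5.0), Track(9.0))
val firstTwo: Tracks = ts.take(2)                  // stays a Tracks
val late: Tracks = ts.filter(_.time > 1.0)         // stays a Tracks
val times: IndexedSeq[Double] = ts.map(_.time)     // element type changed
```

Operations that can change the element type, such as map or :+, still return a plain IndexedSeq in this sketch; the guide's appendix on the "same result type" principle covers the extra overloads needed to keep those as Tracks.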

som-snytt · December 26, 2024, 8:47pm

Apologies in advance if this is the use case that made me read how to use exports in extensions recently.

If the goal is to handle different kinds of vectors, but you’re willing to write out the “interface” for forwarding (once), then opaque types may suffice.

Instead of wrapping the vector, the opaque type conceals it. But you still want an op to receive and return that type (and not a vector).

case class Track(t: Int)

object Tracker:
  opaque type Tracks = Vector[Track]
  object Tracks:
    def apply(tracks: Track*): Tracks = Vector(tracks*)
    extension (tracks: Tracks)
      private def ops: Vex[Track, Tracks] = Vex(tracks)
      export ops.*

class Vex[A, C](c: C)(using C =:= Vector[A]):
  def ++(other: C): C = (c ++ other).asInstanceOf[C]
  def :+(a: A): C = (c :+ a).asInstanceOf[C]

@main def test() = println:
  import Tracker.*
  val tracks = Tracks(Track(42), Track(27))
  assert(tracks ++ Tracks(Track(5)) == tracks :+ Track(5))
  tracks :+ Track(5)

Tracks is in an object to keep it opaque to the main method.

If the wrapping class had other elements, then some machinery to update an element and rewrap would be useful.

tarsa · December 26, 2024, 10:30pm

Russ:

As you can see, I am using imports and exports and manually adding Vector methods and using copy to elevate results of type Vector[Track] to class Tracks. This is a pattern that I do with several different types, and it seems that it should be doable without all the boilerplate. I am wondering if there is a cleaner and simpler way to do this.

It seems that I should just be able to do something like
case class Tracks extends Vector[Track]:
  def myMethod1(...) ...
  def myMethod2(...) ...
  ...
and be done with all the boilerplate. Is there a reason this is not allowed? Am I missing something obvious?

even if it was allowed, the inherited methods would return the type Vector[Track] instead of type Tracks. inheritance doesn’t magically narrow the return types of methods. to return type Tracks, the inherited methods would need to call constructor of Tracks instead of constructor of Vector[Track], but you haven’t taught the inherited methods to do that.
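The same effect can be reproduced with an ordinary open class, which is allowed to be extended. This is a small illustration with made-up names (Samples, Signal, mean), not a claim about how Vector itself is implemented.

```scala
// A base class whose method builds its result with its own constructor.
class Samples(val values: Vector[Double]):
  def tail: Samples = Samples(values.tail)

// A subclass that adds a method but does not override tail.
class Signal(init: Vector[Double]) extends Samples(init):
  def mean: Double = values.sum / values.length

val signal = Signal(Vector(1.0, 2.0, 3.0))

// The inherited tail returns a Samples, both statically and at runtime.
val rest = signal.tail
val restIsSignal = rest.isInstanceOf[Signal]

// rest.mean would not compile: mean is not a member of Samples.
```

To get a Signal back, Signal would have to override tail, and likewise every other inherited method that returns a collection.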

tarsa · December 26, 2024, 11:04pm

here’s my take (on Scastie):

abstract class SeqWrapper[Elem, Wrapper <: SeqWrapper[Elem, _]](elems: Elem*) {
  protected def wrap(rawCollection: Seq[Elem]): Wrapper
  protected def unwrap: Seq[Elem] = elems

  // below methods don't need to be overridden in subclasses
  def ++(other: Wrapper): Wrapper = wrap(unwrap ++ other.unwrap)
  def :+(elem: Elem): Wrapper = wrap(unwrap :+ elem)
}

case class Track(whatever: Int)

case class Tracks(tracks: Vector[Track]) extends SeqWrapper[Track, Tracks](tracks*) {
  def this(tracks: Track*) = this(Vector(tracks*))

  override protected def wrap(rawCollection: Seq[Track]): Tracks =
    Tracks(rawCollection.toVector)

  override protected def unwrap: Vector[Track] =
    tracks
}


@main def test() = {
  val tracks = new Tracks(Track(42), Track(27))
  assert(tracks ++ new Tracks(Track(5)) == tracks :+ Track(5))
  println(tracks :+ Track(5))
}

it would need some extra work to add a base class for companion objects to have the apply method to avoid using new keyword.
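One possible shape for such a companion base class is sketched below. SeqWrapperCompanion and fromSeq are hypothetical names, and Tracks is cut down to a bare case class (without the SeqWrapper parent) to keep the sketch short.

```scala
// Hypothetical base class for companion objects; names are assumptions.
abstract class SeqWrapperCompanion[Elem, Wrapper]:
  // Each companion says how to build its wrapper from a raw sequence.
  protected def fromSeq(elems: Seq[Elem]): Wrapper

  // Varargs factory, so callers can write Tracks(a, b) without new.
  def apply(elems: Elem*): Wrapper = fromSeq(elems)

final case class Track(whatever: Int)

final case class Tracks(tracks: Vector[Track])

object Tracks extends SeqWrapperCompanion[Track, Tracks]:
  protected def fromSeq(elems: Seq[Track]): Tracks = new Tracks(elems.toVector)

// The inherited varargs apply and the case-class apply are overloads.
val viaVarargs = Tracks(Track(42), Track(27))
val viaVector = Tracks(Vector(Track(42), Track(27)))
```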

note that if you carefully manage the underlying sequence, so that it doesn’t change its implementation, then in methods like:

  override protected def wrap(rawCollection: Seq[Track]): Tracks =
    Tracks(rawCollection.toVector)

the .toVector will be a no-op, as the rawCollection will be a Vector already. maybe there’s a way to enforce it statically (on type-level?).

also note that the solution from @som-snytt is more lightweight as it doesn’t re-wrap the underlying sequence constantly.

jducoeur · December 26, 2024, 11:56pm

I totally understand the motivation (I’m also highly boilerplate-allergic), but I suspect the reason is that the various classes involved in the Vector implementation are pretty interdependent. My guess is that, unless you completely reimplement the interface (which sort of defeats your purpose here), you would likely wind up with something that looks reasonable but is broken in practice.

SethTisue · December 27, 2024, 2:21am

In short, the Scala collection types simply aren’t designed to be extended. (The collections design is already rather subtly balanced to achieve a number of other design goals; it’s far from clear that there’s any way this additional design goal could have been accommodated at the same time.)

But (as Luis has mentioned) there is a great deal of support for defining your own custom collection types without having to do everything yourself. See Implementing Custom Collections (Scala 2.13) | Scala Documentation, not omitting the “Methods to overload to support the ‘same result type’ principle” appendix.

Russ · December 27, 2024, 5:53am

Thanks for that idea. It’s an innovative use of opaque types, but it does not seem to be generic. I was thinking of something generic so I would not have to repeat the boilerplate for each use case.

Actually, I had a solution to this problem years ago with implicit classes, but I believe they are now deprecated (and I also had some other problem with them a while back that forced me to abandon them). With implicit classes, I could tell the compiler to automatically convert a Vector[Track] to a Tracks class wherever needed, which is essentially what I am looking for.

This is certainly not something that I need, and in fact I have other more pressing problems that this is distracting me from. Things have slowed down a bit for the holidays though, so I am thinking about this boilerplate issue that has bothered me for a while.

Russ · December 27, 2024, 7:12am

I just came across this:

and I think it may be what I have been looking for. I added an implicit conversion to the companion object for my Tracks class:

case class Tracks(tracks: Vector[Track] = Vector()):
...

object Tracks:

  given Conversion[Vector[Track], Tracks] = Tracks(_)
  given Conversion[Tracks, Vector[Track]] = _.tracks

It’s past my bedtime, so I’ll work out the details tomorrow.

Thanks Baeldung!

tarsa · December 28, 2024, 12:15pm

that solution incurs the same constant re-wrapping overhead as my solution. this is probably not significant, as Vector is immutable anyway and any operation that results in a new vector requires allocating new stuff.

there is an issue with implicit conversions back and forth. if you do some operation that doesn’t result in a Vector[Track], then you’ll just get something else after that operation instead of a compilation error at that exact line. consider e.g.

// ... definitions of Track, Tracks and implicit conversion omitted

// some tracks here
val t: Tracks = Tracks(...)
// that results in Vector[Any] or something similar
// no fail-fast compilation error here, but there should be one
// to make operations on Tracks more predictable, in my opinion
val r1 = t.appended("hello")
// compilation error, because .negateTime is not a method on `Vector[Any]`
// and it cannot be converted to `Tracks` which has that method
r1.negateTime

in my solution you would need something like this:

// ... definitions of Track, Tracks and wrapper base classes omitted

// some tracks here
val t: Tracks = Tracks(...)
// compilation error as the `.append` method accepts only Track
val r1 = t.append("hello")
// this results in Vector[Any] and it's ok and predictable, since we have
// explicit .unwrap to signal that we want to escape the Tracks container
val r2 = t.unwrap.appended("hello")
// this results in Tracks again since we didn't escape the Tracks container
// using  explicit .unwrap without .wrap
val r3 = t.tail.append(t.head)
// this works as expected
r3.negateTime

Russ · December 28, 2024, 7:06pm

I thought the implicit conversion method was exactly what I need, but a problem just occurred to me.

I want the Tracks wrapper and Vector[Track] types to be seamlessly interchangeable, and the implicit conversion seems to achieve that.

The problem is that the Tracks wrapper class contains lazy vals that store the results of moderately costly computations, and I guess those will have to be recomputed every time a Vector[Track] is converted to a Tracks. If so, that is not acceptable.

For example, Tracks has lazy vals to store the results of algorithms for determining the end of climb, the start of descent, the start of steady speed, the start of steady cruise, and other basic parameters of the trajectory. I certainly do not want those to have to be recomputed just to avoid explicit wrapping and unwrapping of the Vector.
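The concern can be made concrete with a small experiment. This is a sketch under simplifying assumptions: Track has only two Double fields, maxAlt stands in for one of the costly analyses, and computations is a hypothetical counter added only to make recomputation visible.

```scala
import scala.language.implicitConversions

final case class Track(time: Double, alt: Double)

// Hypothetical counter, only here to show how often the lazy val runs.
var computations = 0

final case class Tracks(tracks: Vector[Track]):
  // Stand-in for a costly analysis such as finding the end of climb.
  lazy val maxAlt: Double =
    computations += 1
    tracks.map(_.alt).max

object Tracks:
  given Conversion[Vector[Track], Tracks] = Tracks(_)

val raw = Vector(Track(0.0, 100.0), Track(60.0, 200.0))

val a: Tracks = raw  // first conversion: a fresh wrapper
val b: Tracks = raw  // second conversion: another fresh wrapper

// a caches its result after the first access; b has its own separate cache,
// so the analysis runs once per wrapper, twice in total.
val total = a.maxAlt + a.maxAlt + b.maxAlt
```

Each conversion builds a new Tracks instance, and lazy vals are cached per instance, so the cached value is lost whenever the same Vector is converted again.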

So I guess I am back to wanting to be able to extend a Vector to a wrapper class so I can get efficient and seamless interoperability without the boilerplate. (I will also take a closer look at the suggestions in the earlier posts on this thread.)

tarsa · December 28, 2024, 10:44pm

you can easily force the conversion of Vector[Track] to Tracks by using type ascription (see Types | Style Guide | Scala Documentation) or by explicitly typing variables, but it will be easy to forget them anyway. i think being explicit about when wrapping and unwrapping happens (as in my solution with the SeqWrapper base class) brings predictability, which should be a desirable property.

Russ · December 28, 2024, 10:56pm

After thinking about it a bit more, it occurred to me that the situation is not as bad as I thought. If an operation changes the trajectory, then the lazy val parameters need to be recomputed anyway, so there is no loss of efficiency. The only remaining problem is that if an operation does not change the trajectory (and the lazy val has already been computed), then a recomputation would be done unnecessarily. Such a computation would be triggered, for example, by adding an empty Vector to the trajectory, which could occasionally happen. Typically the lazy parameters are computed after the trajectory is fully constructed, so that is not likely to happen often enough to worry about. I don’t know if there is a straightforward way to guard against it.
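One way to guard against it, sketched here as an assumption rather than a tested recommendation, is to apply the same idea as the `if delt == 0 then this` check in timeShift: return this whenever an operation leaves the elements unchanged. It only works for operations defined explicitly on the wrapper; anything that goes through the implicit conversion to Vector and back still produces a new instance. Track is simplified and duration stands in for a cached analysis.

```scala
final case class Track(time: Double)

final case class Tracks(tracks: Vector[Track] = Vector()):
  // Stand-in for a costly cached analysis.
  lazy val duration: Double =
    if tracks.isEmpty then 0.0 else tracks.last.time - tracks.head.time

  // Appending nothing returns this, so already-computed lazy vals survive.
  def ++(v: Vector[Track]): Tracks =
    if v.isEmpty then this else copy(tracks ++ v)

  // A filter that keeps every element also returns this.
  def filter(f: Track => Boolean): Tracks =
    val kept = tracks.filter(f)
    if kept.length == tracks.length then this else copy(kept)

val t = Tracks(Vector(Track(0.0), Track(30.0)))
val same = t ++ Vector.empty[Track]    // the very same instance as t
val longer = t ++ Vector(Track(45.0))  // a new instance with its own cache
```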

Russ · January 1, 2025, 2:56am

In case anyone is interested, I think I have finally figured out how to best handle this Vector wrapping pattern. The idea is to get as close as possible to making the Vector and the wrapped Vector interchangeable. I don’t expect this to be a huge revelation to Scala experts, but it allows me to clean up and standardize my code a bit, and that gives me a good feeling.

The idea is to use an implicit conversion and a few other little tricks. Going back to my original example:


case class Track(time: Scalar, pos: Position, alt: Scalar):
  ...

import trajspec.Tracks.given_Conversion_Vector_Tracks

case class Tracks(tracks: Vector[Track] = Vector()):

  import tracks.*
  export tracks.*

  def negateTime: Tracks = tracks.map(_.negateTime)
  def reverseTime: Tracks = negateTime.reverse
  def timeShift(delt: Scalar): Tracks = if delt==0 then this else
      tracks.map(_.timeShift(delt))

  ...

object Tracks:

  given Conversion[Vector[Track], Tracks] = Tracks(_)
  given Conversion[Tracks, Vector[Track]] = _.tracks

As you can see, the conversion is in the Tracks companion object. Note the wildcard imports and exports.

In case you are wondering what negateTime and reverseTime are for, they are used for constructing the arrival segment of a trajectory in reverse to satisfy terminal boundary conditions. The relevant point here is that declaring the return value as Tracks causes the resulting Vector to be automatically converted to an instance of Tracks.

The import statement for the implicit conversion is something that I would have had a hard time figuring out without the compiler suggestion. I am indebted to whoever decided to add that suggestion to the compiler error output.

sageserpent-open · January 1, 2025, 9:44am

Happy New Year…

I’ve not tried this myself, but if you modify Tracks to extend AnyVal, perhaps it will become a value class? I’m not certain whether or not the import / export clauses have any bearing on that.

Has that idea already been proposed? Forgive me if I’m going over old ground again, I confess to having lost the thread on the overall theme…

If so, that would save the allocation overhead when converting from a plain Vector to an adorned Tracks. You would have something like an opaque type with method forwarding into the bargain.

All of which reminds me of the discussion about import / export clauses in extensions, but I definitely remember that one being talked about before.

Glad you got it sorted out, thanks for sharing the successful outcome.

jducoeur · January 1, 2025, 4:51pm

The above code examples look to be Scala 3 – AnyVal is kind of deprecated there, supported mainly for back compatibility. Opaque types were intended to (among other things) replace it.

The AnyVal trick, while useful, has never been a panacea. Due to the way things actually work under the hood, it sometimes allocates, and the rules for when it does so are a little subtle. Personally, I mostly gave up on it as an optimization years ago, because it was more trouble than it was worth to make sure that the instances never allocate.

More details can be found at Value Classes and Universal Traits | Scala Documentation

None of which is to say that it’s definitely a no-go. But I’m skeptical, given the limitations of value classes.

Russ · January 1, 2025, 10:06pm

The pattern that I presented above is good enough for now, but I would still like to see Scala allow something even simpler and cleaner. I may be beating a dead horse, but I would still like to be able to just write something like

case class Tracks extends Vector[Track]:
  // add methods and lazy val calculations here

After all, Track and therefore Vector[Track] are my own types, so why should I not be allowed to extend the latter? I am no expert on the internals of the Scala compiler or the Scala libraries, but unless there is some fundamental reason for not allowing this, I think it should be allowed.

And yes, there are fairly simple workarounds, one of which I outlined above. Another is to just live with lots of explicit wrapping and unwrapping in your code, but that is not the elegant syntax that drew me to Scala from the start.

As I said before, I think this could be useful for time series analysis and signal processing as well as other applications, including the aircraft trajectory analysis, generation, and deconfliction application that I am working on.