How can I create a custom implementation of SortedSet that only deals with a certain type of element?

I’m trying to create a subclass of SortedSet that only deals with a specific type of generic type, but I’m struggling. I have the following implementation, but as you can see there are a bunch of quite cryptic errors from the compiler as to what I’m doing wrong. Not being super familiar with the (quite complex) type hierarchy of the collections library, I’ve reached my limit of being able to debug this.

Can someone help me out?

Hello, @anna_shaw and welcome!

Your local level one support grug here. :smile:

Let me check I understand you correctly … you want a sorted set of things that are ranges, using a custom range type you’ve written yourself?

(BTW: You could just use Range from the standard library, but I understand if you are doing this from scratch to limber up in Scala in general.)

Just to be clear, do you really want your T to extend a Range[?]? This seems odd in two ways:

  1. One being that I presume you really want a T[Underlying] <: Range[Underlying] for some given generic Underlying; i.e. you don’t want to park ranges of integers side by side with ranges of doubles and ranges of strings in the same set. If you want to keep the ranges over the same type per set instance, you could introduce Underlying as an extra generic type parameter, working in terms of T[Underlying].
  2. The other being that extending case classes isn’t really the done thing in Scala, for the most part they are generally treated as final; so I’d just work with the Range type and stop there.

(I expect the likes of @BalmungSan or @DmytroMitin to be along soon to remind me not to mix wildcard syntax, type constructor syntax and old Scala 2 syntax up, which I do on a regular basis.)

So if you just want a sorted set of ranges of some type, let’s call it Underlying, then how about a simple type lambda in Scala 3?

type SortedRangeSet[Underlying] = [Underlying] =>> SortedSet[Range[Underlying]]

Or more concisely:

type SortedRangeSet = SortedSet[Range[_]]

(Had to edit the above as once again, wildcards versus type constructors, etc…)

That would police client code’s use like this:

val yep: SortedRangeSet[Int] = SortedSet(Range(0, 1)) // That's fine.

val nope: SortedRangeSet[Int] = SortedSet(1) // Does not compile.

val chancer: SortedRangeSet[String] = SortedSet(Range(0, 1)) // Thought we might get away with it, but does not compile either.

I’m not sure whether my response is pitched at the right level, given the sophistication of your Scastie example. Are you trying to solve some much more complex problem? Was an LLM involved?

Anyway, hope that helps. If you really want to roll your own collection, you could look at: Implementing Custom Collections (Scala 2.13) | Scala Documentation for inspiration.

Thanks for your reply @sageserpent-open. Really appreciate you trying to help me out and for taking the time to write something so thoughtful.

Let me clarify how I arrived at the code sample. No LLMs were involved, though as you’ve probably noticed, I’m not the most experienced in Scala :smiling_face_with_tear:

  • The Scastie example is a simplified version of my actual code. I stripped out the domain-specific parts to better highlight the issues I was hitting with the collections library. The Range class in the example is just a placeholder – in my real code, it’s replaced by a more complex interval type.
  • I can’t use SortedSet directly because I need a custom implementation with different internals to support efficient interval queries. Using the standard collection types would require scanning the whole set, which isn’t viable for me as these sets will get very large.

Initially, I wanted to write my RangeSet class like this: class RangeSet[T: Ordering](ranges: Set[Range[T]]) extends SortedSet[Range[T]]. That felt more natural and avoids the issue you pointed out about mixing different Ts. But as far as I can tell, it doesn’t work because the type constructor passed to SortedSetOps expects a different shape. Here’s an example.

Is there a way to make this approach work?

2 Likes

@anna_shaw

Ah, I see - glad no LLMs were involved. The problem is more complex. As long as you’re learning… :grinning_cat_with_smiling_eyes:

Before getting into a load of back-and-forth about the custom collection approach and getting the type signatures sorted out, any chance something off-the-peg would suit you?

For example: Diet

Or maybe: sciss/FingerTree: A Scala implementation of the versatile purely functional data structure of the same name. - Codeberg.org (I use this a fair bit for interval queries).

I wrote something a bit like Diet way back when that you could mess around with and adapt (although I’d recommend using either of the above if it works for you: americium/src/main/scala/com/sageserpent/americium/RangeOfSlots.scala at master · sageserpent-open/americium · GitHub. This is definitely more as inspiration for how to implement an interval search using ranges, the API is intended for quite different purposes.)

I went down the road of looking at that very same Wikipedia page a few years ago, but ended up using the fingertree (in RangedSeq guise instead).

EDIT: something to bear in mind with the finger tree approach is that the API is rather quirky. It isn’t wrong, and is bug-free and performant, but you have to pay attention to whether your intervals are closed, open, or left-closed / right-open or vice versa. IIRC, the API expects left-closed / right-open; I suggest playing around with it in Scastie etc to verify this. The code comments are a bit misleading too, but it’s still a fine piece of work.

I could be wrong, but it seems the stdlib is unable to support this case.