Too difficult to write a function which accepts Array or List

So in Scala 3 can I do the following to prevent the creation of a copy?

  def makeAdj_6(edges:Seq[(Int,Int)] | Array[(Int,Int)]):Map[Int,Set[Int]] = {
    edges.foldLeft(Map[Int,Set[Int]]()){
      case (adj,(src,dst)) =>
        adj + (src -> (adj.getOrElse(src,Set())+dst))
    }
  }

Wait, does it also copy the Array when I call Seq methods on an Array? That would be really terrible.

The reason it creates a copy is that Seq is immutable.Seq. If you take collection.Seq for instance the array will just be wrapped into a mutable.ArraySeq without copying.

I doubt the scala 3 union type will work without jumping through a bunch of hoops.

1 Like

Alas, no. In Scala 3 we get the error
value foldLeft is not a member of Array[(Int, Int)] | Seq[(Int, Int)]

This seems wrong to me. foldLeft is a member of the union of those two types. Right? See the example.

foldLeft is a member of ArrayOps, not of Array itself. I think selections on a union type only work if the member you’re selecting is a member of the LUB of the parts of the union type.

Which type is the LUB of these two types, the error message reports that the method is missing from Array[(Int, Int)] | Seq[(Int, Int)]? Perhaps just the wrong error message.

AnyRef? :sweat_smile:

it seems to me that the LUB is precisely Array[(Int, Int)] | Seq[(Int, Int)]. Isn’t that what it means to be a type lattice? I.e., if A and B are types then their intersection and union are also types. A lattice is closed under union and intersection.

I think you’re right. But I meant the non-union-type LUB.

Not in general. Usually you’ll end up with the more lightweight ArrayOps, ArraySeq,… As @Jasper-M pointed out, the copying happens only if you force an immutable.Seq.

IMHO abstracting over collections is a failed attempt of the language and a design that never makes sense. Just either ask for a concrete type or ask for an iterable once and use its iterator (again a concrete type)

Or, if you want to write code that can be expressed in terms of for comprehensions and work for any collection. Just ask for a Monad.

What do you mean by, “never makes sense”? Do you mean it doesn’t make sense in the presence of static typing? Please elaborate your idea.

I mean what can you do with something that may either be a List or a Vector or a Set or a Map or a infinite lazy Stream?

Better question, what can you sanely do?
Because you can do a lot of things, but all can be or very efficient or very inefficient.

For example:

  • Get the head (first elements), actually efficient on all. But what does it means to get the first element of an unordered collection like a set?
  • Get the tail, forgetting about the ordering problem, this is pretty efficient on List and Stream, but it is pretty inefficient on Vector and Array.
  • Check if it contains an element, straight forward on all, but if you require it multiple times, you really hope you have a Set which provides it as an O(1) operation.
  • Get the length, efficient for many of those, inefficient for List, will never terminate for the infinite Stream.
  • etc.

My point is, for most algorithms you write them with an especial kind of structure in mind. And if you want to reason about the complexity of your whole algorithm, you need the complexity of your collection operations (which you can only do with concretes).

And the few algorithms that can be written for multiple collections fallback to three special cases (IMHO):

  1. Algorithms using map & flatMap; i.e. for Monads.
  2. algorithms that basically iterate the collection with an internal state, e.g. foldLeft, traverse. Again for those, it is better to use a typeclass (Foldable / Traverse in this case) instead of trying to model those with subtyping.
  3. Algorithms that are basically data processing pipelines, which you can do for every collection by requiring a TraversableOnce and asking for its iterator; you then chain all the transformations on that iterator (taking advantage on it being lazy) and finally consume the iterator by transforming to a concrete collection again (you may do black magic to ensure that concrete is the same one it was passed to you, but it has always been extremely complex for me, so I never did that).

PS: For the second use case, I would rather use a proper Stream; e.g. fs2, akka streams, monix observables or zio zstreams.

2 Likes

So do you think that folding is just a one-off special case? To me the point of fold is that it’s an operation which abstracts the implementation away and just lets you compute something iteratively as a function of the content and a starting value.

Right, I forgot the third case (iteration), I have edited my previous post.

While in principle I agree, I think your dictum may be a bit strong, given that Scala as a hybrid language explicitly supports a range of programming styles.

Without checking, I’d guess that the common super type of all these probably is Iterable, which you suggested to use above. :slight_smile: (And note that Iterator is an abstract trait, as well, not a concrete type.)

However, closer to the OP’s problem, the “natural” abstraction for List and Vector would be Seq - an ordered collection. And this usually works just fine, as long as you don’t care about performance specifics, as is often the case for small collections (where big-O characteristics of a concrete collection type can be quite misleading, anyway). Actually, having concrete types specified everywhere might even be detrimental for performance if you are operating multiple libraries that happened to decide on different concrete collection types, forcing you to convert to and fro…

As an additional difficulty, the OP added Array to the mix, which is either a dirty artifact of the JVM heritage, or a deliberate choice for performance reasons. It is not part of the Scala collection hierarchy proper for obvious reasons. One should either try to get rid of it ASAP (and probably pay a conversion fee) or stick with it and accept the pain points.

So I’d argue that Seq is not that bad an abstraction when used cautiously. The alternative you outline as of now requires to buy into a full-fledged eco system on top of Scala, be it scalaz, Cats,… You cannot “just ask for monad” in plain Scala to start with. While I fully agree with you that going “all in” on functional style and avoiding subtyping in the large is worthwhile and probably superior in the long run, one has to concur that it’s still a journey down the rabbit hole and that more OO-ish approaches can still be a perfectly valid choice in the Scala world.

1 Like

Actually I do not consider myself a functional herald, I have always defended Scala for being a mix and I like to use OOP abstractions where it makes sense.
However, I do think that abstracting collections with subtyping doesn’t make sense. (Note, this is different to the fact that the stdlib uses a lot of inheritance, I understand that is done to increase reuse of internal code, I just think those traits are just for internal use and mostly useless for users)

Without checking, I’d guess that the common super type of all these probably is Iterable , which you suggested to use above. :slight_smile: (And note that Iterator is an abstract trait, as well, not a concrete type.)

I suggested to use IterableOnce because it allows you to get an Iterator, nothing else.
Also, I do not care if Iterator is a trait or a class it is a concrete type in the sense that it has a well defined API with a well defined complexity… (this not different at List being a concrete type, despite being a sealed abstract class, or Option being a concrete type where Some and None shouldn’t)

However, closer to the OP’s problem, the “natural” abstraction for List and Vector would be Seq - an ordered collection. And this usually works just fine.

No it doesn’t. Can you get the tail? Can you pattern match? Can you get the length? Can you access by index? Are you even sure it will terminate? (remember both Stream and LazyList are Seqs)

Actually, having concrete types specified everywhere might even be detrimental for performance if you are operating multiple libraries that happened to decide on different concrete collection types, forcing you to convert to and fro…

I actually do not care about the performance being good or bad. I care about being able to reason about it.
If you have to convert that is an explicit and conscious decision.

So I’d argue that Seq is not that bad an abstraction when used cautiously. The alternative you outline as of now requires to buy into a full-fledged eco system on top of Scala, be it scalaz, Cats,… You cannot “just ask for monad” in plain Scala to start wit

Yeah, I agree, and that is really a pain point.
My opinion is that cats is a utility library that most people will use, no matter if they go full pure FP or not. But that doesn’t demerit the fact that you need a library to do this, which is kind of sad.

I didn’t want to insinuate that - you probably wouldn’t have recommended using iterators, if you were. :wink:

My “all in” was rather hinting at my experience that, while vanilla Scala allows rather gradual progress through the paradigm/style space, scalaz/Cats have black hole characteristics - once you pull them in, they start taking over the whole project. I’m not saying that’s a bad thing, and maybe I’ve just been holding it wrong.

This still strikes me as an very strong statement. I certainly do have issues with aspects of the Scala collection API myself, Seq being at the center of quite some of those, but I’d still place it pretty much on the usable side - I’ve seen worse. :wink: Even if I accepted this statement for the standard API, I probably wouldn’t want to generalize this - I still don’t see what’d be wrong with a trait for finite, strict, ordered collections that basically only hides the performance differences between the concrete implementations.

The difference between Iterator and Seq in terms of complexity doesn’t look that striking to me. Both are traits with 2-3 base methods, lots of derived methods, among them some that allow to shoot your own foot.

Apart from length, I think I can - although not necessarily using the “obvious” methods on Seq for this purpose. As for length, again, I don’t see the difference to Iterator. But yes, I get and share your point.

It’s explicit and conscious, but not necessarily a decision. :slight_smile: If I need to interop with these APIs, I need to pass my data with the type they require.

That seems to be the other point where we have different experiences. I certainly like Cats, but I’d think it comes with quite a learning curve (not just the individual concepts, but combining them in a full application), and it feels pretty much all-or-nothing to me. Thus I’d be hesitant to recommend it as a near compulsory alternative to the vanilla Scala API for everyone.

1 Like

Start, maybe, although it can take a fair while. There’s a bunch of “free candy”, in the form of monad transformers and such, that don’t necessarily take over.

That said, as soon as you touch IO, you do wind up sliding down the rabbit hole really fast. (In the process of making that switch at work right now…)

1 Like

I didn’t want to insinuate that.

Right, I also didn’t want to insinuate you insinuate that :sweat_smile:
It was mostly to put like a context.

Once you pull them in, they start taking over the whole project.

That seems to be the other point where we have different experiences. I certainly like Cats, but I’d think it comes with quite a learning curve.

Agree to disagree.
I believe you can use many things from cats even if you believe IO is the worst thing ever (just to make clear this phrase comes for a person I know which still uses cats).
I do believe that cats-effect is a black hole, you go there is you want / like / need pure functional programming.

This still strikes me as a very strong statement.

Yeah, I have to agree it is. That is why I made clear from my first ost it is my opinion, and only that. I am not trying to sold it as the universal truth.

but not necessarily a decision.

Do we really have decision in life or not? :thinking:
Leaving bad philosophical jokes aside, yes it is a trade-off.
I just believe it is better to transform if needed instead of just hoping the library won’t work with the Seq I provided. (but again, my opinion, based on my narrow scope)


Anyways, I believe we distracted the conversation a bit off-topic.
Mostly my fault, so sorry about that.
And I believe the best would be to leave it here. :slight_smile: