How to convert varargs class constructor to Seq[T] constructor

jimka · January 28, 2021, 2:25pm

@BalmungSan, you mentioned in another post (I think it was you) that you prefer using a single parameter which is a List or Array rather than a var-args definition. And then just defining one var-args interface in the companion object.
I have some follow up questions about this approach.

I have several classes and methods in my program which are defined using the same idiom. I’m thinking of refactoring them to clean up, but I don’t know what will happen in several corner cases. Maybe you can forewarn me?

case class SAnd(override val tds: SimpleTypeD*) extends SCombination {
  override def create(tds:SimpleTypeD*):SimpleTypeD = SAnd(tds: _*)
...
}

Since I really don’t understand type erasure, I don’t know whether I can define the class as follows. Of course I can try it and see what happens.

case class SAnd(override val tds: Seq[SimpleTypeD]) extends SCombination { 
  override def create(tds:Seq[SimpleTypeD]):SimpleTypeD = SAnd(tds)
...
}

I was under the impression that the JVM is not able to use the parameterized type of a Seq to determine method applicability.

Suppose I define a var-args constructor for SAnd in the companion object, and then use the syntax SAnd(x) at the call site. Will the compiler be able to determine whether x is a Seq[SimpleTypeD] or simply a SimpleTypeD and call the correct constructor?

Now a more troubling case:

abstract class SMemberImpl(val xs:Any*) extends SimpleTypeD with TerminalType { ... }

case class SMember(override val xs: Any*) extends SMemberImpl(xs :_*)

If I redefined SMember to have parameter xs:Seq[Any], and a var args constructor with parameter xs:Any *, how can the compiler possibly know which one to call when the call site has exactly one argument? Because an argument such as List(1) has both type Any and also type Seq[Any].

BalmungSan · January 28, 2021, 3:25pm

Sorry, I am not sure if I understood the questions.

Why not? Those two are exactly the same at the bytecode level.
Remember, foo: Foo* is exactly the same as foo: Seq[Foo]; the only difference is that the first one can be called like foo1, foo2, foo3 without creating the Seq yourself, but that is a language feature, not a real type. - And again, my point was mostly about using a concrete type like List or ArraySeq rather than Seq.
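
To make that concrete, here is a minimal sketch (the names VarargsSketch and count are made up for illustration, and it assumes Scala 2.13). It shows that a varargs parameter is an ordinary Seq inside the method, and that an existing Seq can be passed with : _* instead of listing the elements:

```scala
object VarargsSketch {
  // Hypothetical method: callers may list the items one by one.
  def count(items: String*): Int = {
    val asSeq: Seq[String] = items // inside the method, the parameter is just a Seq
    asSeq.length
  }

  // Call with individual arguments; the compiler builds the Seq.
  val direct: Int = count("a", "b", "c")

  // Call with an existing Seq by splicing it into the varargs position.
  val existing: Seq[String] = Seq("x", "y")
  val spliced: Int = count(existing: _*)
}
```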

Not sure what you mean with this.

If I understand you correctly you have something like this:

final case class Foo[A](data: List[A])

object Foo {
  def apply[A](as: A*): Foo[A] =
    new Foo(data = as.toList)
}

Then yes, it works.

It seems the compiler is smart enough if the types are precise enough.

jimka · January 28, 2021, 4:58pm

Type erasure is really confusing to me. As I understood, the runtime environment cannot distinguish between X[A] and X[B], because they are both compiled to X. So you can’t have two different methods of the same name, one with an argument of type X[A] and the other with type X[B], right? This is because (as I understand) the runtime cannot verify that an object has type X[A]; it can only verify that the type is X[_] for some _.

If that understanding is correct (maybe it’s completely wrong), then why can I have a method whose argument is Seq[SimpleTypeD] ?

BalmungSan · January 28, 2021, 5:07pm

More or less, yes that is correct.

Yes.

Why? Which other method do you have in scope?
Again, if it used to work with SimpleTypeD*, then it will work with List[SimpleTypeD], because (again) those two are the same thing; Foo* is not a real type.

SethTisue · January 28, 2021, 6:13pm

Because the compiler won’t let you call it unless the compiler knows at compile time that you’re passing it an argument of the right type.

No runtime check is possible, but no runtime check is needed if the compiler has already ruled out ill-typed calls ahead of time.
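
A small sketch of both halves of that statement, assuming Scala 2.13 (ErasureSketch and total are illustrative names, not from the code discussed above):

```scala
object ErasureSketch {
  val ints: List[Int] = List(1, 2, 3)
  val strings: List[String] = List("a", "b")

  // At runtime both values are plain List cells; the element type is not recorded.
  val sameRuntimeClass: Boolean = ints.getClass == strings.getClass

  // Still safe: the compiler only accepts call sites that pass a Seq[Int].
  def total(xs: Seq[Int]): Int = xs.sum

  val sum: Int = total(ints)
  // total(strings)  // rejected at compile time with a type mismatch
}
```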

SethTisue · January 28, 2021, 7:26pm

I would like to add that this is a limitation that comes from the JVM. To facilitate Java interop in both directions, Scala adopts many of Java’s encodings and calling conventions. In particular, we encode method signatures and overloading the same way.

You’re right to be a bit surprised that two overloads that differ only by the type parameter of an argument aren’t allowed. After all, overload resolution happens at compile time, not runtime, so the compiler has the information it needs to disambiguate calls. The problem is that the JVM won’t allow it. (The compiler could use some workaround like giving the overloads different names at the bytecode level, but Scala chose not to impair Java interop by doing that.)

BalmungSan · January 28, 2021, 7:32pm

But it is good to also add that the Scala standard library does provide a (manual) workaround, using the DummyImplicit.
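
A minimal sketch of that workaround, assuming Scala 2.13 (the object and method names are invented for illustration). Without the extra implicit parameter list, the two methods would have the same signature after erasure and the compiler would reject the pair:

```scala
object DummyImplicitSketch {
  // Erases to describe(List)
  def describe(xs: List[Int]): String = "ints"

  // Erases to describe(List, DummyImplicit), so it no longer clashes with the one above.
  // An implicit DummyImplicit value is always available, so callers never pass it.
  def describe(xs: List[String])(implicit d: DummyImplicit): String = "strings"

  val a: String = describe(List(1, 2, 3))
  val b: String = describe(List("x", "y"))
}
```

The compiler picks the overload from the static element type of the argument; the dummy parameter only exists to keep the two bytecode signatures apart.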

ClintGilbert · January 28, 2021, 7:59pm

That’s neat, thanks for posting this. I’ve been using a similar workaround for years:

def foo[A](as: Blah[A]) = ???
def bar[B](as: Blah[B])(implicit discriminator: Int = 42) = ???

but that occasionally runs afoul of restrictions on default args.

jimka · January 29, 2021, 8:02am

That is interesting information. Yes, a Scala programmer who does not know Java is surprised at this seemingly inconsistent limitation. It is an artifact of using an existing framework.

jimka · January 29, 2021, 8:18am

OK, thanks everyone for the enlightening explanations; as I said before, type erasure and its consequences are nonintuitive for the non-Java aficionado.

Another thing I don’t understand, related to the conversion of varargs constructor to Seq[T] constructor, is how this relates to pattern matching. Suppose I have the primary constructor from the class definition case class Foo(val ts:Seq[T]) { ...} and I also have an apply method in the companion object object Foo { def apply[T](ts: T*) = ...}

Then what do obj match { case Foo(_) => ...} and obj match {case Foo(ts@_*) => ...} mean?
In my original design only with abstract class Foo(val ts:T*) {...}, the patterns obj match {case Foo(_) => ...} and obj match {case Foo(ts@_*) => ...} both match the same constructor, the former with exactly 1 arg, the latter with 0 or more args.

curoli · January 29, 2021, 9:05am

To be clear, we are talking about two different kinds of type erasure here. Usually, when we talk about type erasure, we mean that Seq[A] becomes Seq[AnyRef]. Then there is another kind of erasure that turns m(as: A*) into m(as: Seq[A]) (actually, m(as: Seq[AnyRef])).

If you want to overload a method to have both the Seq version and the varargs version, you can do:

def m(as: Seq[A]): R = ...
def m(): R = m(Seq.empty)
def m(a: A, as: A*): R = m(a +: as)

and then you can call it as m(), m(a1), m(a1, a2), m(a1, a2, a3) or m(Seq(a1, a2, a3)).
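
For a version that can be compiled as is, here is the same idiom with A fixed to Int and R fixed to Int (those concrete types and the name OverloadSketch are assumptions made only for this sketch, under Scala 2.13):

```scala
object OverloadSketch {
  // The "real" implementation takes a Seq.
  def m(as: Seq[Int]): Int = as.sum

  // Zero arguments: forwarded to the Seq version.
  def m(): Int = m(Seq.empty[Int])

  // One or more arguments: the mandatory first parameter keeps this overload
  // from erasing to the same signature as m(as: Seq[Int]).
  def m(a: Int, as: Int*): Int = m(a +: as)
}
```

The three overloads erase to m(Seq), m() and m(int, Seq), so they do not collide at the bytecode level.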

jimka · January 29, 2021, 11:41am

Should this still work if m(as: Seq[A]) is the constructor for a case class, and the latter two are apply methods in the companion object?

curoli · January 29, 2021, 12:36pm

Yes, it should work.
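
A sketch of that arrangement, assuming Scala 2.13 (Bag and its fields are hypothetical names). The case class keeps its generated apply(items: Seq[Int]), and the companion adds the zero-argument and one-or-more-argument versions next to it:

```scala
object CaseClassSketch {
  final case class Bag(items: Seq[Int])

  object Bag {
    // Zero arguments.
    def apply(): Bag = new Bag(Seq.empty[Int])

    // One or more arguments; the mandatory first parameter avoids a clash
    // with the generated apply(items: Seq[Int]) after erasure.
    def apply(first: Int, rest: Int*): Bag = new Bag(first +: rest)
  }

  val fromArgs: Bag = Bag(1, 2, 3)
  val fromSeq: Bag = Bag(Seq(1, 2, 3))
  val empty: Bag = Bag()
}
```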

BalmungSan · January 29, 2021, 1:04pm

Thanks for providing another argument for my list of reasons not to use varargs as a real type inside my code; it makes pattern matching more confusing.

Using varargs:

case Foo(_) => Means match a Foo that has only one element and ignore that element.
case Foo(x) => Means match a Foo that has only one element and assign that element to the x variable.
case Foo(ts @ _*) => Means match a Foo and collect in ts all the values inside it. - Note that the type of ts is Seq[T] but uses the same underlying class (and thus I hope same value, thus no copying) that it has inside it.

Using final case class Foo[T](ts: List[T]) then:
(Because, again, the point was not using Seq)

case Foo(_) => means match any Foo and ignore whatever value it has inside.
case Foo(ts @ _*) => doesn’t compile.
case Foo(List(ts @ _*)) => means match any Foo and collect the values inside the List in ts - Note, the type of ts is Seq[T] but it seems to be returning the same underlying value (and thus class) that it has inside it.
case Foo(list) => means match any Foo and bind the List it contains to list; basically the same as above but simpler. - Note, there is a difference in that list is of type List[T], which is better.
case Foo(_ :: Nil) => means match a Foo whose underlying List has only one value and discard that value.
case Foo(x :: Nil) => means match a Foo whose underlying List has only one value and assign the name x to that element.

cbley · January 29, 2021, 1:07pm

Pattern matching does not use the constructors, it uses the unapply or unapplySeq methods of the case class’s companion object.

These are auto-generated from the first argument list of the primary constructor.

Depending on whether you use case class C(x: X*) or case class C(x: Seq[X]) there will be a difference. The former generates an unapplySeq method, the latter an unapply method.

The difference when trying to match is:

stmt match {
  case X(a, b, c)      => ... // case class X(as: T*)
  case Y(Seq(a, b, c)) => ... // case class Y(as: Seq[T])
}

(but of course, you can also define unapply or unapplySeq methods manually)
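
The two matching styles side by side, as a sketch assuming Scala 2.13 pattern syntax (VarFoo, ListFoo and the describe methods are made-up names):

```scala
object PatternSketch {
  final case class VarFoo(ts: Int*)        // companion gets unapplySeq
  final case class ListFoo(ts: List[Int])  // companion gets unapply

  // With varargs, the number of sub-patterns selects by element count.
  def describeVar(v: VarFoo): String = v match {
    case VarFoo()               => "empty"
    case VarFoo(x)              => s"one: $x"
    case VarFoo(x, rest @ _*)   => s"first $x, ${rest.length} more"
  }

  // With a List parameter, there is exactly one sub-pattern: the List itself.
  def describeList(l: ListFoo): String = l match {
    case ListFoo(Nil)       => "empty"
    case ListFoo(x :: Nil)  => s"one: $x"
    case ListFoo(x :: rest) => s"first $x, ${rest.length} more"
  }
}
```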

jimka · January 29, 2021, 1:48pm

Are you sure about this one? I think Foo(x) matches the single argument case, same as Foo(_) except that the argument is bound rather than ignored.

BalmungSan · January 29, 2021, 1:52pm

Yes, I am sure, check my comment again.

What you said is how it works if using varargs but that section is for using List[T].

jimka · January 29, 2021, 2:00pm

Yes I see, you’re right. So for completeness, in the first section you might also include case Foo(x) to match the singleton sequence and bind to the first element of the sequence.

jimka · May 6, 2021, 2:08pm

I don’t understand why, but to have a Seq[...] and varargs apply method, apparently I need an implicit dummy parameter.

case class And(operands:Seq[Rte]) extends Rte{
  override def toLaTeX:String = "(" ++  operands.map(_.toLaTeX).mkString("\\wedge")  ++ ")"
}

object And {
  def apply(operands: Rte*)(implicit ev: DummyImplicit) = new And(operands)
}

BalmungSan · May 6, 2021, 2:20pm

The problem is that Rte* is not a real type, just syntactic sugar. So, at the bytecode level, you have two identical methods, since Rte* is just Seq[Rte].
The dummy implicit gives the final bytecode method an extra parameter, which makes it a valid overload.
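
A self-contained sketch of how both call styles then resolve, assuming Scala 2.13. The Rte trait here is a stripped-down stand-in and Sym is a hypothetical leaf class, neither taken from the real code:

```scala
object RteSketch {
  sealed trait Rte
  final case class Sym(name: String) extends Rte // hypothetical leaf node
  final case class And(operands: Seq[Rte]) extends Rte

  object And {
    // Erases to apply(Seq, DummyImplicit), distinct from the generated apply(Seq).
    def apply(operands: Rte*)(implicit ev: DummyImplicit): And = new And(operands)
  }

  val viaVarargs: And = And(Sym("a"), Sym("b"))
  val viaSeq: And = And(List(Sym("a"), Sym("b")))
}
```

A List[Sym] is not itself an Rte, so And(List(...)) can only resolve to the generated Seq-based apply, while And(Sym("a"), Sym("b")) can only resolve to the varargs one.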