The seemingly flawed ordinality of type bounds and type variances

I’m learning Scala. I see that Scala uses a greater-than/less-than kind of notation to express type inheritance, which seems a bit contradictory and flawed to me.

Co-variance: class Foo[+A] means it can accept A or subtypes of A. The + symbol suggests that subtypes are “greater than” their supertypes in some way.

Type bounds:
A >: B means A must be a supertype of B, or B itself. The > symbol suggests that supertypes are “greater than” subtypes in some way. This contradicts the above.

Things get worse:
C >: A <: B means A and B are the lower and upper bounds for C. One is expected to understand this as C being in a “range from A to B” (A < C < B), yet the expression puts A in the middle, which makes the order seem messed up and hard to read.
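For concreteness, here is how those three notations look together in one small sketch (the class names are made up purely for illustration):

```scala
// Hypothetical hierarchy: Puppy <: Dog <: Animal
class Animal
class Dog extends Animal
class Puppy extends Dog

// A covariant container: Kennel[Dog] is a subtype of Kennel[Animal]
class Kennel[+A]

// A type parameter with both bounds: C must satisfy Puppy <: C <: Animal
def middle[C >: Puppy <: Animal](c: C): C = c

val d: Dog = middle(new Dog) // Dog sits between the bounds, so this compiles
```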

After all, at the end of the day, Scala is mostly used for processing data and streams with its nice collections API, which is not a particularly math-heavy use, compared to how Python’s multi-dimensional numerical processing power is used in data science work. That’s mostly boring enterprise back-end programming done by ordinary programmers, not mathematical scientific work done by researchers and Ph.D. people. So, I hope the language moves closer to its users, instead of asking them to be math specialists.

I’m just trying to express my honest views from the perspective of how a newbie sees the Scala language, hoping that such views would be helpful in improving the language’s adoption.

First of all, that was unnecessary and is just your opinion, not a fact.
Scala can be used for whatever its users want.
Unless you have some statistics to prove your comment, it is just your vision; curiously, mine would be the opposite: most Python users I know do not even know much algebra, yet they are writing neural networks all day, every day, because libraries like TensorFlow are great at hiding those details.
Calling my job boring doesn’t really make me want to help you.
Finally, none of this requires being a math specialist; subtyping and variance are computer science topics that most programmers should know.

I personally believe it would have been better to just ask for help, plus give feedback that the topic is complex.

Variance is not easy, we all know that. However, we cannot just remove it from the language, for multiple reasons; the most important one is that it is very useful: even if you have the impression it is not, you will be using it all the time.
Nevertheless, the community has accepted the feedback of many people, and what we have done is:

  1. A lot of material trying to explain and simplify this topic (I will share some later).
  2. Not really just for this, but we have many channels where people can ask for help (like this one), and some of them like Gitter or the more recent and more active Discord server where you could get NRT help (it was a bad joke, sorry).
  3. Most resources that cover the language have moved the topic of variance to later chapters, because the reality is most people do not need to think about variance until they are designing abstractions rather than using already-defined ones; until then, they get the benefits of variance without thinking about it.

Additionally, you are using terminology that is not standard; what does “greater than” mean?
Also, why do you think the type bounds contradict the above? Care to show a concrete example that got you confused?
Finally, I am pretty sure I have never needed to write a type bound as complex as the third one, nor have I seen many of them during my more than 5 years of writing Scala, so no worries about it.

In any case, hope those two links help you understand this topic.

Hope that helps and feel free to ask any follow-up question you may have.


Anyways, as more of a personal bias, for what you mentioned in your other thread, it seems you will be using mainly Spark. The reality is that all the Scala you need to learn to use Spark is its syntax and basic features like case classes & pattern matching. Anything else is considered unnecessary and even bad practice in the Spark community. This is also one of the reasons PySpark is becoming the main interest of the maintainers: it is a bigger market, and companies are more willing to use Python over Scala.

1 Like

Generalizing a bit on this: it feels like you’re falling into the classic anti-pattern of trying to learn every bit of Scala, stubbing your toe on parts, and deciding that Scala is broken.

It’s worth saying explicitly – some parts of Scala exist primarily for library authors. The thing about Scala is that it is a relatively “fair” language – the language is designed so that many features can be implemented in libraries, rather than in the language itself, so that it can be customized to your needs.

That means it exposes a number of power features that application authors rarely or never need to use directly. Variance is one of those: it is absolutely essential in order to build many library functions that work the way you would intuitively expect them to, but I’m not sure I’ve ever needed to use it in business logic. A typical engineer needs to know roughly what it means when it shows up in documentation, but not often more than that.

So none of this is asking users to be math specialists – indeed, it has little to do with mathematics, but is mostly about building highly-reusable data structures. Really, I think the only bug is that many courses put too much emphasis on variance too early, given that most engineers don’t need much more than knowing that it exists, and vaguely what it is talking about.
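As an illustration of the library-author side, here is a minimal sketch (Box and Full are made-up names, not real library code) of why a container author declares covariance:

```scala
// A minimal covariant container, sketching what a library author writes:
sealed trait Box[+A] { def value: A }
final case class Full[+A](value: A) extends Box[A]

// Thanks to +A, a Box[String] is accepted wherever a Box[Any] is expected:
def describe(b: Box[Any]): String = s"box of ${b.value}"

val msg = describe(Full("hello")) // "box of hello"
```

The application programmer who calls describe never writes a + annotation; they just get the behavior they intuitively expect.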

3 Likes

I don’t understand how the usage of the + symbol ‘indicates that subtypes are “greater than” their super types in some way’. It’s just a notational convention; wouldn’t it be awkward to indicate contravariance with plus and covariance with minus?

I think the very term “supertype” implies “greater” in some sense. After all, the subtype depends on the supertype, but the supertype does not even need to know that the subtype exists! Then again, a subtype can do everything the supertype can do, but not vice versa, so in that sense the subtype is greater, I suppose.

In any case, I agree with the other replies here. Most users, myself included, do not need to know how to use covariance and contravariance directly. I’ve developed many algorithms with Scala, and I’ve never needed to use co/contravariance directly. There were times when I thought I might need them, but I was wrong. I get their benefit indirectly by using Vector and other standard collection classes.
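For example, the covariance baked into Vector[+A] is what makes everyday code like this compile, without the application programmer ever writing a + annotation:

```scala
// Vector[+A] is covariant, so widening "just works":
val ints: Vector[Int] = Vector(1, 2, 3)
val anys: Vector[Any] = ints // ok: Vector[Int] <: Vector[Any]
val mixed: Vector[Any] = anys :+ "four"
```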

I am not an expert on who exactly uses Scala and who uses Python, but I would be surprised to learn that Python users are more “mathematical” on average than Scala users. I switched from Python to Scala many years ago because (native) Python couldn’t keep up with my performance needs, and it couldn’t scale up well in code size either. As your code base increases in size, the lack of argument type declarations (“duck typing”) can become very confusing.

1 Like

One point: Foo[+A] means “Foo is an order-preserving function of the type A”, i.e., if A is bigger than B, then Foo[A] is bigger than Foo[B], and if A is smaller than B then Foo[A] is smaller than Foo[B]. This has nothing to do with the direction of the order (if you flip the order consistently, the notion of order-preserving is unchanged).

The notation + is natural as multiplication by positive numbers is order-preserving (and multiplication by negative numbers is order-reversing, like Foo[-A]).

As for supertypes being bigger - they are bigger sets of objects, so this is natural enough. But my main point is what is the order and what functions are order-preserving are independent, so clearly cannot clash.
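This order-preserving/order-reversing behavior can even be checked mechanically with the compiler’s <:< evidence (Foo and Bar here are throwaway names):

```scala
// Throwaway covariant and contravariant type constructors:
class Foo[+A] // order-preserving
class Bar[-A] // order-reversing

// String <: AnyRef, and the compiler can prove the induced relationships:
implicitly[Foo[String] <:< Foo[AnyRef]] // same direction: compiles
implicitly[Bar[AnyRef] <:< Bar[String]] // reversed direction: compiles
// implicitly[Foo[AnyRef] <:< Foo[String]] // would not compile
```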

regards,
Siddhartha

5 Likes

I find this claim somewhat problematic. I agree that the average developer will only rarely have to add variance to their own parameterized types. Variance isn’t encapsulated within library implementations, though; it’s right there in the API, and so developers using these APIs need to understand it - both the concept and its syntax. And I don’t see a huge step from grasping the concept and syntax to being able to apply it in your own code when/if appropriate.

Of course you can’t learn all language features at once, and variance may be one of the concepts you can push further down the line for a while - but saying that you only ever really need to understand variance by heart if you are a library programmer is taking this a bit too far and verges on recommending programming by coincidence. But maybe I’m just excessively dogmatic here…

I just checked and I found a couple of variance usages (at “application logic” level) in some projects of mine, most of those quite similar to this (contrived/reduced) example:

case class IDItem[T](id: Long, item: T)

def handleIDCharSeqs(is: List[IDItem[CharSequence]]): Unit = ???

val is = List("a", "b").zipWithIndex.map { case (s, i) => IDItem(i, s) }
handleIDCharSeqs(is) // duh! - rejected: with invariant T, IDItem[String] is not an IDItem[CharSequence]
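For the record, a sketch of how the example compiles once the parameter is made covariant (the body of handleIDCharSeqs is filled in hypothetically, just to make the snippet runnable):

```scala
// With +T, IDItem[String] is a subtype of IDItem[CharSequence]:
case class IDItem[+T](id: Long, item: T)

// Body filled in only so the sketch runs; the original used ???
def handleIDCharSeqs(is: List[IDItem[CharSequence]]): Int = is.size

val items = List("a", "b").zipWithIndex.map { case (s, i) => IDItem(i, s) }
val handled = handleIDCharSeqs(items) // now compiles: String <: CharSequence, so the list widens
```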

So while these are comparatively rare occasions, it’s not like this issue never comes up for the average application programmer.

2 Likes

I’m really not sure that’s true in practice – I would bet that 80% of Scala developers do not understand variance, but use it just fine.

The thing is, variance is mostly about “make this do what I expect it to do”. Folks in an OO environment have certain intuitive expectations about what you should be able to do with subtypes and supertypes, including their use in collections. That is how they mostly interact with variance, and I don’t think most of them pay any attention to it: it does what they want, and that’s great.

It does come up in application programming occasionally, sure – mostly when you expect something to compile and it doesn’t – and that is when folks tend to spend the time to really understand it. But honestly, I tend to forget the details between those occasions, because in my experience they’re pretty few and far between.

The above is basically what I’ve been saying to our folks who are learning Scala (it’s a FAQ, because this is a point where many of them get stuck): get a vague sense of what variance is about, and otherwise don’t worry too much about it. So far, I haven’t found any reason to believe that was a bad choice – not one of them has come to me yet to say that they needed a deeper understanding for their work…

At risk of repeating what others said, I’ll summarize variance forms and usefulness.

Variance is needed to describe the relationship between F[A] and F[B] given the relationship between A and B. Without variance annotations we don’t know how to infer the relationship in the F[_] case, so F[A] would always be incompatible with F[B]. That’s not what we want. We want e.g. the following code to work:

val myList: List[AnyRef] = List[String]("hello", "world")

The above code works because List’s element type parameter is covariant (List[+A]).

There are two types of variance annotations: use-site and declaration-site. Scala and C# have declaration-site variance (Scala uses +A and -A, C# uses in A and out A). They are described e.g. here: Covariance and contravariance (computer science) - Wikipedia . Declaration-site variance annotations lead to shorter code and are a bit less powerful than use-site ones, but the main difference is that declaration-site annotations look much simpler (at least comparing Scala and C# to Java - Java’s wildcard syntax is very unreadable).

There is no universal default variance that would suit all cases, so variance should be explicitly stated. If F[A] returns values of type A then it could be covariant (F[+A]). If F[A] accepts values of type A then it could be contravariant (F[-A]). If F[A] both returns and accepts values of type A then it must be invariant in its generic parameter A. If that’s not the case and the compiler doesn’t yell, then we have a problem. A prominent example of such a problem is the behavior of Java and C# arrays. They are covariant, while they should be invariant, as they both accept and return values of the same type. The following code illustrates the problem (copied from Wikipedia):

// a is a single-element array of String
String[] a = new String[1];

// b is an array of Object
Object[] b = a;

// Assign an Integer to b. This would be possible if b really were
// an array of Object, but since it really is an array of String,
// we will get a java.lang.ArrayStoreException.
b[0] = 1;

We don’t want type mismatch errors in code that typechecks without warnings. Scala even makes its arrays invariant, thus fixing this type-system unsoundness.
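A quick sketch of the Scala side of this, for contrast with the Java snippet above:

```scala
// Scala's Array[T] is invariant, so the unsound assignment is rejected at compile time:
val strings: Array[String] = Array("a")
// val objects: Array[AnyRef] = strings  // does not compile: Array is invariant

// The type-safe way to widen is to go through a covariant collection:
val widened: Seq[AnyRef] = strings.toSeq // fine: Seq[+A] and String <: AnyRef
```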

2 Likes

As Siddhartha points out, this is a misinterpretation of the symbol. Foo[A] (invariance) means that there is no relationship between Foo[X] and Foo[Y]. Foo[+A] (covariance) means that any subtyping relationship there is between X and Y is preserved through Foo, in the same direction. So if X is a subtype of Y, then Foo[X] is a subtype of Foo[Y]. Or if X is a supertype of Y then Foo[X] is a supertype of Foo[Y]. What if you want to switch the order? Why, Foo[-A].

Yeah, this is pretty ugly. You can’t write it A <: C <: B because then you wouldn’t know which one you were declaring. But you could write something like C in (A, B) or C: {A < C < B}. That’d be better.

Fortunately, it doesn’t come up very often in practice. May as well just get used to the wrinkle and move on.

3 Likes

I thought Foo is a type, not a function? Again, I think we are getting into too much math for simple concepts. For me, an order-preserving or monotonic function is defined by the order relation between two or more applications of the function. For all X, Y with X >= Y in some domain, if f(X) >= f(Y) always, then f() is monotonic. So I wonder what the 3 entities - X, Y and f() - are in my example of Foo[+A].

My point was really about the choice and consistency in the meaning of “+” and “>” in the examples I gave. There is definitely an order involved (a hierarchical one). An order is unidirectional, as in the “arrow of time”, and the symbols should mean something to do with that order.

But I agree that these things are rarely used. I was only indicating to make the concepts more programmer-friendly.

Thanks everyone for the great clarifications!

Foo[A] is a type that depends on the type A, so Foo[_] is a function from types to types (just as log(x) is a real number depending on the (positive) real number x, so log is a function).

A more accurate symbol for increasing might exist, but using it would cause a bunch of problems, so + is a pragmatic choice. This is in some sense justified, as x => +x is an increasing function and x => -x is a decreasing function.

regards,
Siddhartha

+ means that there is a positive correlation between the relationship of the generic types and the relationship of their parameters. For example: if A <: B and we have F[+T], then F[A] <: F[B]. OTOH, if A <: B and we have F[-T], then F[A] >: F[B].
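Functions are the canonical example of both annotations at once, since Function1 is declared (roughly) as Function1[-A, +B]:

```scala
// Function1 is contravariant in its input and covariant in its output:
//   trait Function1[-A, +B]
val show: Any => String = a => a.toString

// A function handling Any can stand in for one handling only String (input: -A),
// and its String result is fine where a CharSequence is expected (output: +B):
val handler: String => CharSequence = show

val r = handler("hi") // "hi"
```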

The idea here is that in a standard hierarchy visualization (i.e. the directed acyclic graph of types) the subtypes are below their supertypes (i.e. they are lower in the hierarchy). The syntax follows the intuition coming from diagrams like: https://docs.scala-lang.org/resources/images/tour/unified-types-diagram.svg

Bear in mind that in notation like C >: A <: B we’re not only imposing bounds on C but also introducing it. First we have the introduced type parameter name, then we have its bounds. Notation like A <: C <: B is currently rejected by the compiler, but even if it were accepted, it would introduce type A instead of type C. Therefore you would need something else to indicate which type is being introduced, like e.g. extra syntax, perhaps A :< C <: B (notice the swapped :<), but wouldn’t that be even more confusing?
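The standard library uses exactly this introduce-then-bound shape; for instance Option.getOrElse declares a fresh parameter B together with its lower bound:

```scala
// Paraphrasing the standard library's signature:
//   def getOrElse[B >: A](default: => B): B
// B is introduced and bounded in a single place.
val opt: Option[Int] = None
val fallback = opt.getOrElse("none") // B is inferred as Any, the least supertype of Int and String
```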

1 Like

Somewhat confusingly, if you have a parameterized type like Foo[A], Foo is sometimes called a type function, analogous to a “normal” function.

Where a function f: A => B takes a value of type A and returns a value of type B, at compile time Foo can be seen as a construct that takes a type parameter and returns a fully parameterized type. In this parlance, Foo is not a proper type before it gets its type parameter, but a function from a type parameter to a fully parameterized type.

In the same way, it’s sometimes called a type constructor, in that it’s able to take a parameter and construct a proper type from it.

In other words, type Foo[A] declares a parameterized type, type constructor or type function, and you can call it any of those things.
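A small sketch of both views (the Pair alias and Wrapper trait are made-up names for illustration):

```scala
// A type function: Pair takes one type and yields another type.
type Pair[A] = (A, A)
val p: Pair[Int] = (1, 2)

// Abstracting over a type constructor F[_] (F is itself a "type function"):
import scala.language.higherKinds
trait Wrapper[F[_]] { def wrap[A](a: A): F[A] }

val listWrapper: Wrapper[List] = new Wrapper[List] {
  def wrap[A](a: A): List[A] = List(a)
}
val wrapped = listWrapper.wrap(42) // List(42)
```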

1 Like