Beginner questions about type classes

jimka · July 15, 2019, 11:51am

From reading just one (my first) blog post description of type classes, perhaps it is too early to generalize, but it seems to me that type classes basically allow you to define methods on classes you don’t own. For example I don’t own the Int class, and I can’t (and don’t want to) update its source code by adding additional methods. However for my particular application, I’d like to call an application specific method on an it.

val x:Int = 3
x.mymethod(...)

Is that assessment just me drawing a conclusion too early before I understand it?

If Scala supported adding methods a priory to classes I don’t own and don’t have the source code for, and if it further had a way to limit the scope at the class sites, would type classes still be necessary?

Also another thing that is not yet clear: How are type classes limited by type erasure. For example, I suppose I still cannot add a method to a type like List[Set[String=>Int]] which is different than a method on the List[Set[Int=>String]]. Right?

jducoeur · July 15, 2019, 12:54pm

Side note, because I don’t want future readers to get confused: that is not the scalac documentation – it’s an article from a company named scalac. No relation, and it is in no way official documentation.

When you say new with a body like that, you’re creating an anonymous subclass. So this is basically creating a subclass of Show[Int], that fills in the previously-abstract show() method, and then creating an instance of that subclass. This is very, very common for typeclass instances, although it’s a little unusual elsewhere.

It’s not that the instance has a method – it’s that this is defining a new anonymous class (with a method) that only has one instance. Everything’s consistent with what you knew before, save that it’s adding this “anonymous class” concept.

It’s a companion object for the Show trait. Traits can have companions, just like classes; that’s quite common.

That’s incorrect. In the code here, Show[T] is a trait; Show[Int] is a concrete class that inherits from that trait.

That terminology isn’t standard AFAIK, but what they mean is that it holds a concrete class that fills in the abstract member from the Show trait.

There’s no relationship between that definition of show(a) and the definition inside of the Show trait – they’re just demonstrating a way to define a function with different syntax that sits on top of the Show trait. This allows you to say something like:

val v = 3
show(v)

It makes use of the Show[Int] to provide a comfortable-looking syntax, but it’s an unrelated function – it could just as easily have been named foo().

(While the details vary, this technique of defining a more-ergonomic function with the same name as a typeclass function, solely to make it easier to call, is pretty common.)

That’s exactly right – it’s not the only point of typeclasses, but it’s a very important aspect of them.

Yes – a typeclass isn’t just a single method, it’s a coherent interface to a type you don’t own, that happens to contain methods. There are actually easier ways (implicit classes) to define individual extension methods, but they’re less powerful than being able to talk about an interface as a whole.

There will, in fact, be first-class syntax in Scala 3 for extension methods, and Scala 3 typeclasses will make use of that syntax, but there’s lots more you do with typeclasses than just that.

Very good, very important question. Typeclasses are evaluated at compile time (that is, the compiler chooses the typeclass instance to use at compile time) – so they aren’t affected by erasure, which is a runtime problem. They are often the answer to erasure problems, and this is one of the reasons why experienced Scala programmers reach for them fairly often.

So don’t think of it as “add a method to a type” – think of it as “define a typeclass method for a type”. You can do a lot more with the typeclass methods than you can with the routine OO approaches.

philipschwarz · July 15, 2019, 6:15pm

perhaps it is too early to generalize, but it seems to me that type classes basically allow you to define methods on classes you don’t own

Here is a simple example of one very useful thing you can do with a type class

trait Semigroup[A] {
  def combine(l: A, r: A): A
}

implicit val intSemigroup = new Semigroup[Int] {
  def combine(l: Int, r: Int): Int = l + r
}

implicit val stringSemigroup = new Semigroup[String] {
  def combine(l: String, r: String): String = l + r
}

intSemigroup.combine(2,3)
stringSemigroup.combine("2","3")

def combine3[A](x: A, y: A, z: A)(implicit sg:Semigroup[A]): A =
  sg.combine(sg.combine(x,y),z)

combine3(2,3,4)
combine3("2","3","4")

def combineAll[A](as: Seq[A])(implicit sg:Semigroup[A]): A =
  as.reduce(sg.combine(_,_))

combineAll(Seq(2,3,4))
combineAll(Seq("2","3","4"))

implicit def listSemigroup[A] = new Semigroup[List[A]] {
  def combine(l: List[A], r: List[A]): List[A] = l ++ r
}

combineAll(Seq(List(2,3), Nil, List(4)))

implicit def optionSemigroup[A] = new Semigroup[Option[A]] {
  def combine(l: Option[A], r: Option[A]): Option[A] = l orElse r
}

combineAll(Seq(Some(2), None, Some(3)))
combineAll(Seq(None, None, Some(3)))

curoli · July 16, 2019, 10:22am

There are two possible mechanisms for type classes:

(1) You provide implicit conversions to types that have myMethod, e.g.

trait A { … }

trait B { … }

trait RichA extends A {

** def myMethod: B = …**

}

implicit def aToRichA(a: A): RichA = …

val myA: A = …

myA.myMethod // resolves to aToRichA(myA).myMethod

My understanding is that implicit conversions are falling out of favor, because they can interfere with each other in unexpected ways.

(2) You provide a generic method that takes an implicit helper object, e.g.

trait A { … }

trait B { … }

trait Helper[T] {

** def doMyMethod(thing: T): B**

}

implicit val aHelper: Helper[A] = (a: A) => { … }

def myMethod[T](thing: T)(implicit helper: Helper[T]): B = {

** helper.doMyMethod(thing)**

}

val myA: A = …

myMethod(myA) // uses aHelper

This relies on implicit instances, not conversions, which I think is considered preferable.

Regardless of whether you use (1) or (2), the limitation is implicit resolution, which is a compile-time thing, which is not directly related to type erasure, which is a runtime thing. The most direct limitation is that if you want to write generic code with an unknown type T, you cannot satisfy the need for a Helper[T] by providing a bunch of possible Helper instances, because the compiler needs to know which one to choose at compile time. Instead, you typically would provide a matching Helper[T] as an implicit argument along with your object of type T. Somewhere up the call stack, the type T needs to be known and meet its matching Helper[T] instance.

The issue with Set[T] is a different one: if you wanted to provide a fixed (i.e. val) Helper instance for every collection/element type combination, the number of necessary Helper instances would quickly become large or indefinite (e.g. Helper[Set[Set[T]]). But you can create an implicit collection helper from an element helper using an implicit def:

implicit def setHelper[T](implicit elementHelper: Helper[T]): Helper[Set[T]] = …

yawaramin · July 16, 2019, 9:57pm

Are you familiar with Java? Let me translate the code sample you posted to Java, to give you a new perspective on it:

interface Show<A> {
  String show(A a);

  public static Show<Integer> intCanShow = new Show<Integer>() {
    public String show(Integer integer) {
      return "int " + integer;
    }
  };
}

This is really all a typeclass is: a new interface (Show) with implementations that use but are independent of the underlying data type (in this case, Integer or Int).

To answer your specific questions:

This is an anonymous class, i.e. a new class declared and instantiated in a single statement. As you can see, this is a concept that exists in Java too where it was used for a long time as a ‘poor man’s lambda’
object Show is a companion object of trait Show[A]
Traits are (basically) interfaces and in OOP, interfaces can be implemented whereas classes can be extended. This is standard terminology
There is a confusion here, show is not a function at all, it is a method definition, they are different concepts in Scala. See https://stackoverflow.com/q/2529184/20371 for details

jimka · July 17, 2019, 8:06am

As someone already suggested, the resource I’m using is perhaps not very appropriate, as I don’t see how to generalize that example to the the case I’d like to implement.

Perhaps someone can suggest a more extensive example more suitable to my problem?

The example from scalac shows how to add a show method to Int and String. (I hesitate to say “add a method” because it doesn’t really add the method to Int nor String; it just makes the syntax look like a method has been added). The result being that the application programmer can use the syntax 10.show and "hello".show.

In my case I already have a function treeAggregate which only works for TraversableOnce and must be called as a function. I’d like to extend it to work with more types of containers, and to use the method calling syntax.

// current usage
 def clauseToBdd(clause: ClauseAsList, ident: BddTerm, op: BinaryOperation): Bdd = {
    // interpreting clause as an OR so that clause is CNF rather than DNF
    def conv(i: Int): Bdd = Bdd(i)

    treeAggregate(clause, ident: Bdd, conv, (acc: Bdd, b: Bdd) => op(acc, b))
  }

// desired usage
 def clauseToBdd(clause: ClauseAsList, ident: BddTerm, op: BinaryOperation): Bdd = {
    clause.treeAggregate(ident)(Bdd, op)
  }

The current implementation of treeAggregate is as follows.

object Aux {
  def treeAggregate[A, B](objects: TraversableOnce[A], init: B, seqop: A => B, combop: (B, B) => B): B = {
    def consumeStack(stack: List[(Int, B)]): List[(Int, B)] = {
      stack match {
        case (i, a1) :: (j, a2) :: tail if i == j => consumeStack((i + 1, combop(a1, a2)) :: tail)
        case _ => stack
      }
    }

    val stack = objects.foldLeft((1, init) :: Nil) { (stack: List[(Int, B)], ob: A) =>
      (1, seqop(ob)) :: consumeStack(stack)
    }
    // there may be up to log_2 of objects.size many pairs left on the stack
    // but there is not precisely 0 such pairs on the stack.
    assert(stack != Nil)

    stack.tail.map(_._2).fold(stack.head._2)(combop)
  }
...
}

There are at least two obstacles which I don’t understand for generalizing the example.

The example starts by defining a trait Show[A] whose type parameter A represents the type of object we’d like to put on the left hand side of the .show. In my case I would like to put a container object on the left hand side. Set(1,2,3).treeAggregate(...), List("hello","world",...).par.treeAggregate(...) etc.
In the example the function show does not have any type parameters. In my case I have two type parameters, [A,B], one of which A correlates the object on the left hand side of the . to the arguments of the method.
The example defines implicit vars intCanShow:Show[Int] and stringCanShow:Show[String], i.e. having how type parameter variables, rather having fixed types Int and `String. I don’t see how to do that in my case. Do I need a combinatorial explosion of implicit variables?

jimka · July 17, 2019, 9:55am

There seem to be two schools of thought in the literature I’ve read. Some people resist in calling methods functions, other authors do it freely. We are lacking a word to describe something that syntactically looks like a function, acts like a function, walks like a function, and quacks like a function. For example method, anonymous function, object with an apply method, partial function, object inheriting from Function. What should we call something that can be applied to arguments to obtain a result?

The point of my question 4 was that to me (the reader of the blog) it was confusing that one definition of show makes a function-like-entity which when called with one argument returns a String, and another show which when called with one argument returns a function-like-entity which when called returns a String, which is what the syntax def f(...)(...):String = {...} does.

jimka · July 17, 2019, 9:59am

No, I generally try to avoid analogies to Java when possible. I’d rather try to understand the semantics rather than having to learn Java in order to understand Scala. That might indeed be an interesting endeavour, but until now is one I’ve tried to avoid.

curoli · July 17, 2019, 11:00am

A term denoting an object with an apply method? Perhaps an Applicable?

The term function has a long history.

For a long time, programming languages had two types of subroutines: procedure, that don’t return a value, and functions that do return a value.

The came C and unified procedures and functions, so now all subroutines are functions. If they don’t return a value, they have a quasi-type void that syntactically acts like a type.

Then came object-oriented languages like C++. A variable that is member of a class is a field, and a function that is a member is a method.

Now, in a purely OO language like Java, every function is a method, so we can drop the word function and only talk of methods.

Then comes Scala and resurrects the term function, but it now means something else. It is not a subroutine that returns a value, but an object of type FunctionN. Also, Scala replaces quasi-type void with Unit, which is a real type.

jimka · July 17, 2019, 11:08am

Sussman and Abelson (SICP) used the the word function to represent the thing you are calculating, and distinguished it from a procedure which was an algorithmic implementation. For example, the cosine function might have many different imaginable procedures to calculate its value for a given input. In Scheme some people use function to mean a procedure with no side effects.

tarsa · July 17, 2019, 11:25am

It’s best to just stick to terminology that is idiomatic in a specific language. Otherwise we wouldn’t agree on anything. Examples:

typeclasses are denoted by keyword class in Haskell, trait in Rust, protocol in Swift, etc
class in Scala, Java, C++, C#, etc is a classic OOP class containing both state and behaviour
struct in C++ is almost the same thing as class in C++ - only difference is default visibility of members
struct in C# is different from class in C# - struct is for value types, class is for reference types
trait in Scala is not a typeclass as in Rust. trait in Scala is something in between class and interface in Java. trait in Scala can contain both state and behaviour.
sealed class in C# is like a final class in Scala - you can’t extend them at all
sealed class in Scala is different - it can have direct subclasses if they are defined in the same source code file
etc

sangamon · July 17, 2019, 11:46am

I’m not sure I fully understand your problem and whether type classes (or other implicit usages) might be a part of a solution. It might help if you provided minimal, self-contained examples that highlight the structure of the problem instead of reworked extracts from your actual code base.

Is the implementation of #treeAggregate() relevant to the problem?
What other containers would you want to provide #treeAggregate() to, and how - via completely separate implementations? Are the three questions below related to this part at all?
What’s ClauseAsList, BinaryOperation, etc.? Type aliases are nice if they can be looked up and if the corresponding types are used excessively - not in this kind of example.
Just wondering: Couldn’t seqop and A be left out in exchange for an upstream #map() invocation?

It’s showing how to create a type class Show (that incidentally happens to have one method only) with instances for String and Int. The ShowOps wrapper is just sugar on top and not part of the type class concept as such.

This could probably already be written as

treeAggregate(clause, ident, Bdd(_), op)

So far this doesn’t invoke the concept of type classes.

implicit class TraversableTreeAggregate[A](val objects: TraversableOnce[A]) extends AnyVal {
  def treeAggregate[B](init: B, seqop: A => B, combop: (B, B) => B): B =
    Aux.treeAggregate(objects, init, seqop, combop)
}

This one I don’t understand.

jimka · July 17, 2019, 11:49am

What does the # notation mean? I see this often in scala discussions, but I’ve never understood what it means. e.g., #treeAggregate()

sangamon · July 17, 2019, 12:09pm

I just use ‘#’ as a prefix to mark (class/object) members (i.e. methods and fields). Not even sure where I picked up this habit, probably someplace in the Java world.

jducoeur · July 17, 2019, 12:23pm

The Type#member style is Scala type-projection syntax, so a fair number of purists like to use it to signify “member on some instance of Type”. It’s a little pedantic, but accurate. That’s likely where you picked it up.

jducoeur · July 17, 2019, 12:36pm

It’s a bit too early in the morning for me to really engage with this deeply, but roughly speaking:

You should identify the functions that you require the input type to have. (I see foldLeft, not sure if you need anything else.)
Define a typeclass – I’ll call it Aggregateable for the sake of argument – and put foldLeft (and any other functions you need to call on it) in as an abstract member.
For each of the collection types you would like treeAggregate to be able to take, define a typeclass instance of Aggregateable. That is, for (say) List you would define Aggregateable[List], filling in the foldLeft method with one that takes a List as the first parameter and probably just forwards to List.foldLeft. More importantly, you can also define an Aggregateable[T] for any type T for which you can figure out a reasonable definition of foldLeft, regardless of whether T defines that itself.
Adjust the signature of treeAggregate() to something like this:

  def treeAggregate[T[_]: Aggregateable, A, B](objects: T[A], init: B, seqop: A => B, combop: (B, B) => B)(implicit aggregateable: Aggregateable[T]): B = {

and use aggregateable.foldLeft(objects) instead of calling objects.foldLeft directly.

Or something like that, anyway. I’m focusing on foldLeft() because it’s what you are using here, but the nice thing about defining your own typeclass is that, if some other function would work better for your needs, you could define that as you like and use typeclass instances to implement that for each collection.

(If you do just want foldLeft, you could probably just pick up the Foldable typeclass from Cats, which IIRC already has typeclass instances for a bunch of types, and use that instead of defining your own Aggregateable.)

Hopefully that’s coherent enough to be helpful…

jimka · July 17, 2019, 1:05pm

Nice clue, it is not clear to me how to define this trait with all the type parameters. Here is my obviously wrong attempt.

trait Aggregateable[M[_]] {
  def foldLeft[A,B](z:A)(op:(A,B)=>A):M[A]
}

jimka · July 17, 2019, 1:19pm

I’m confused for two reasons

How do I define these instances? the example from scalac defines an instance called intCanShow of type Show[Int]. In my case I want to define an instance for every kind of List. A car cannot have a type parameter in scala so I can’t define var listIsTreeAggregable:Agregatabe[List[A]] = new Aggregatable[List[A]] { ... }
Do I really need to define methods for foldLeft? We can already foldLeft a List of anything. It seems to me I need to define the treeAggregate method for each of the implicit instances. Right?

jducoeur · July 17, 2019, 4:45pm

The key thing is, in a typeclass the methods almost always take what you think of as “this” as an explicit parameter. So more like this:

trait Aggregateable[M[_]] {
  def foldLeft[A,B](m: M)(z:A)(op:(A,B)=>A):M[A]
}

That m parameter is the collection itself.

Hmm. Well, it needs to be parameterized over A, so it has to be a def instead of a val (since vals can’t have type parameters). So something like:

def listAsTreeAggregateable[A]: Aggregateable[List[A]] = new Aggregateable[List[A]] { ... }

Yeah, but that’s irrelevant. Remember, a typeclass is a whole separate wrapper around types. What methods the types themselves implement doesn’t have much to do with that typeclass’ definition, although it sometimes means that the typeclass instance for a particular type might be a one-line forwarder.

No, that’s specifically what we’re trying to avoid. This is all about defining a single typeclass that is sufficient so that treeAggregate can work with any type for which there exists a typeclass instance. The Aggregateable is a parameter to treeAggregate in this design.

sangamon · July 17, 2019, 6:44pm

That doesn’t look entirely right.

trait Aggregateable[M[_]] {
  def foldLeft[A, B](m: M[A])(z: B)(op: (B, A) => B): B
}

We don’t depend on the generic type parameter for M in the signature, so it can be a val or, more convenient, an object:

implicit object ListAggregateable extends Aggregateable[List] {
  override def foldLeft[A, B](m: List[A])(z: B)(op: (B, A) => B): B = m.foldLeft(z)(op)
}

However, this way we’d have to implement a type class instance for each concrete collection. I’ve tried to come up with an instance at TraversableOnce level. (Now requiring a def, indeed.) It kind of seems to work, but…

implicit def traversableOnceAggregateable[T[a] <: TraversableOnce[a]]: Aggregateable[T] =
  new Aggregateable[T] {
    override def foldLeft[A, B](m: T[A])(z: B)(op: (B, A) => B): B = m.foldLeft(z)(op)
  }

…I have no idea how to interpret/explain the effect of the ‘a’ type parameter. I can’t use ‘_’ instead because then T[A] won’t be accepted as a subtype of TraversableOnce[A], and thus op will be rejected as an argument to #foldLeft(). Any explanation about what’s going on there (or just the proper term(s) to throw at a web search) would be appreciated.

Exactly. The signature would become

def treeAggregate[M[_] : Aggregateable, A, B](objects: M[A], init: B, seqop: A => B, combop: (B, B) => B): B

…and the #foldLeft() invocation would be implicitly[Aggregateable[M]].foldLeft(objects)(...).