Comparable Interface Question


#1

I am “trying” to teach myself scala by reading the book, Programming in Scala, 3rd Edition. I am a java developer professionally. I am trying to write a generic scala case class. I guess I am approaching it from a Java Mind Set. So far cannot get it to work properly. Here is my issue and also what I am attempting to do:
I wish to create a generic class that takes type parameters:

case class bob[T,W]( name: T , weight: W) {
I want this part of the class to override hashcode and equals. I also want
this part of the class to implement sorting .
}

Now … this is relatively simple to do in java. But I am having one seriously hard time doing this in Scala. You see, if I create a list of objects of type Bob , I wish for them to be sorted by some other type that sorts them, such as a Binary Trees class, or passing to a sort method. I know with my limited knowledge of scala at this time, I am probably doing everything completely wrong.

For example in java, it is just a simple matter of doing something like this:
class Bob<T extends Comparable<? super T> , W> extends Comparable {
and in here you just override the compareTo method in java. It is very simple.
}
With that I can create objects of type Bob and use different types that are Comparable. I am trying to do the same thing in Scala.
Maybe some one knows how to create a sortable generic class in Scala and can help me with this or point me in the right direction.
Thanks …


#2

Ordered trait may help you.


#3

A case class gives you #equals()/#hashCode() for free, given that its component types properly implement those methods.

In Java, I would choose Comparator over Comparable because it’s more flexible - you can have different orderings in different contexts, they are easier to compose, etc.

In Scala, where there’s Ordered/Ordering built on top of Comparator/Comparable, I would prefer Ordering even more so, because this makes it easier to abstract away the primitive/reference chasm inherited from Java (String is not Ordered, Int isn’t even Comparable,…)

You could start like this:

case class Foo[T, W](name: T, weight: W)

// "natural" ordering
implicit def fooOrdering[T : Ordering, W : Ordering]: Ordering[Foo[T, W]] =
  Ordering.by(foo => (foo.name, foo.weight))

There’s some implicit magic going on here:

  • [T : Ordering, W: Ordering] is sugar for (implicit tOrd: Ordering[T], wOrd: Ordering[W]). It requires orderings to be available for the component types.
  • The implementation through Ordering#by exploits the fact that there’s an implicit Ordering defined for tuples. It compares by name first, and if these are equal according to the given ordering for T, it falls back to comparison by weight.

(Note that this doesn’t require Foo instances and their components to “have” an ordering upon creation.)

Test ride:

println(Seq(Foo("a", 2), Foo("b", 1)).sorted) // List(Foo(a,2), Foo(b,1))
println(Seq(Foo("a", 2), Foo("a", 1)).sorted) // List(Foo(a,1), Foo(a,2))

In cases where the “natural” ordering is not appropriate, you can assemble the desired Ordering and pass it explicitly.

println(Seq(Foo("a", 2), Foo("a", 1)).sorted(fooOrdering[String, Int].reverse)) // List(Foo(a,2), Foo(a,1))
val customOrd: Ordering[Foo[String, Int]] =
  fooOrdering(implicitly[Ordering[String]].reverse, implicitly[Ordering[Int]])
println(Seq(Foo("a", 2), Foo("b", 1)).sorted(customOrd)) // List(Foo(b,1), Foo(a,2))

For interfacing with APIs based on Comparable/Comparator/Ordered, there’s a bunch of implicit and explicit conversions available.

I know this looks weird at first glance from a Java PoV, but it’s definitely worth taking this step IMHO.

If you insist on sticking to the Java Comparable way, you could do it like this:

case class Bar[T <: Comparable[T], W <: Comparable[W]](name: T, weight: W) extends Comparable[Bar[T, W]] {
  override def compareTo(other: Bar[T, W]): Int =
    name.compareTo(other.name) match {
      case 0 => weight.compareTo(other.weight)
      case r => r
    }
}

  println(Seq(Bar("a", Integer.valueOf(2)), Bar("b", Integer.valueOf(1))).sorted) // List(Bar(a,2), Bar(b,1))
  println(Seq(Bar("a", Integer.valueOf(2)), Bar("a", Integer.valueOf(1))).sorted) // List(Bar(a,1), Bar(a,2))

…but this is going to cause you quite some pain in the long run. So just don’t. :wink:


#4

Hey thanks … I will look over your answer carefully later today. However, regarding Comparator versus Comparable , they are used for different reasons within the java eco system. Comparable is considered to implement the natural ordering of an an object and is generally used for that purpose only. Comparator , on the other hand, is used as alternate non-natural ordering of an object and is generally used via a Constructor parameter. These are for the most part silly semantic differences but are considered “Best Practices” within the java world. ( See the book “Java Generics and Collections by Maurice Naftalin, Philip Wadler” for a formal discussion regarding comparable and comparator )
I ended up giving on using the Ordered Trait for the time being. I really do not need to order the object in question yet but I will need to do that if I change the implementation of the Adjacency structure that I am attempting to create using Scala. I will try to better understand how implicits are used and that may give me a better understanding on how to make a generic class comparable within a collection type that requires ordering.
THanks


#5

There’s a key Scala concept that you’re touching on here, which you may want to look into: typeclasses. Ordering is a typeclass; there are many others, and they are a common and popular pattern for Scala code. (Especially when you are working with generics, as in your example.)

Roughly speaking, a typeclass is an “interface” that, instead of implementing it via inheritance, you implement via an external object that you make implicitly available. It winds up looking like your class has the methods of the typeclass, but it’s actually implemented elsewhere.

It takes a little getting used to, but is very powerful, and works in a number of cases where OO-style inheritance has difficulty.

Oh, and more generally: if you’re coming from Java, you may want to look into the book “Scala for the Impatient”, which is specifically intended for folks with Java or C# background.


#6

you are exactly correct. In java , this concept of type classes , is used heavily in generics and that is what I was initially attempting to simulate in Scala … the idea of implementing the Comparable Interface …
Thinking in java-mode , I assumed that there must be some type of Comparable ( or Ordered or Whatever ) trait that would achieve the same functionality. But perhaps it is done differently in Scala and maybe it is not a good idea to try to do it the java way … in scala. I’ll look into this more deeply tonight. In nutshell , here is what I want to achieve, which is a no-brainer in java:
I am implementing an Adjacency Map … So far , my first attempt at doing it in Scala was successful ( though it probably looks a bit silly from the point of view of a real Scala programmer. ) . It is a graph processing algorithm, very simple. Every graph is basically a collection of vertices connected via an edge. I want the vertices to be completely generic. I am able to accomplish that easily. I created a simple class representing a vertex and it works within the graph algorithm:

final case class GraphVertexGen[T, W](name: T, weight: W)
{
  override def equals(o: Any) = o match {
    case that: GraphVertexGen[T,W] => that.name.equals(this.name)
    case _                 => false
  }
  override def hashCode = name.hashCode
  override def toString = name.asInstanceOf[String] + ":" + weight
}

Now the graph itself is designed like this:

trait Graph[T] {
  def addEdge(v: T, w: T) // add edge v-w to this graph
  def adj(v: T): Iterable[T] // vertices adjacent to v
  def V: Int // number of vertices
  def E: Int // number edges
  def printGraph: Unit // print the contents of graph
  def getGraph: mutable.Map[T, mutable.Map[T, T]]
}
Note that the "T" above is a Vertex.  And note also that the graph is a:
mutable.Map[T, mutable.Map[T,T]]

Now herein lies the Ordered or Comparable issue. If I change the implementation here to a Tree instead of a Map … then I most likely would get an error because the Vertex is not comparable right now. As it is, everything works ok. But I am toying with the idea of having the actual ( ADJACENCY LIST ( or map in this case ) become a Tree instead.
My initial implementation of the algorithm, so silly and java looking , can be found at my personal github site if you want to see it:
scala studies


#7

I know this line of reasoning, but fail to find it convincing, given the Java implementation of the concepts. Even if your type has a clear natural ordering, like Int, you still will want to use a different ordering occasionally, often derived from the natural ordering - most obviously the reverse, but there’s more exotic variants like “order by remainder”. So you need orderings, anyway, you want some kind of “algebra” for composing them, and yes, you also want a way to highlight an optional “natural” ordering for a type. Comparable basically duplicates the Comparator API and there is no “algebraic” way of building derived Comparators (or Comparables, naturally) from them - it’s a slightly dysfunctional mirror image of Comparator. That’s a high price to pay for a “natural ordering” marker, IMHO.

On the Scala side, I’d think that an implicit on a type’s companion object suffices for highlighting the intended natural ordering. Due to the available converters, there’s less harm in using Ordered, though, I guess.

“Programming in Scala” certainly covers implicit mechanics, however, it doesn’t seem to frame it into the context of type classes, at least in 1st ed, and indeed it mentions Ordered but not Ordering.

So you can either just move on with the flow of the book and stick to whatever it introduces (that would be Ordered), or put some additional literature in the mix - @jducoeur already mentioned another book, and of course there’s a bunch of blog/SO/whatever posts out there - this one might be a good start.


#8

lol … it will be a while before I get thru the Programming in scala book. It is very difficult for java developers to get a handle on Scala. Usually, we have to spend 8 hours or more a day working in a Java environment and then try to learn a little scala off hours. I am trying to be as discipline as possible to learn enough about the language to use it productively in a professional environment; I AM A LONG WAY FROM THAT RIGHT NOW. I will try to start looking at other books soon. But first I want to at least try and complete as much of the PIS book as I can. I still do not understand a lot of the things the author presents in the book. I guess one could argue about the merits of Comparable versus Comparator … It is trivially unimportant to me. Some guy much smarter than I am came up with that concept so who am I to argue one way or the other. Oh, one logical explanation to the difference between the two in Java could be this: With Comparator you are not required to have your class implement anything for comparison. Sort of like a hard-coded Scala implicit thing.

One thing I find curious about Scala and the Scala eco system is this: Coming from an Enterprise Java background, I find it difficult to understand the reason for using SBT to build your applications. Reasoning in a Java-way, I would deduce that it is best to use the most mature build tool or the one that is used more widely. In this case it would be , MAVEN, as opposed to SBT. So far I just use Eclipse and Maven to study scala. Learning scala is difficult enough without having to throw yet another hurdle in the mix in the form of a yet-to-mature SBT tool. With that said, I might eventually try to learn SBT.


#9

Both approaches shown in my previous post apply here.

The cast in toString is as evil in Scala as it is in Java.

Why a Map[T,T] as values, instead of Seq[T] or Set[T]? I’d also consider using immutable collections.


#10

sangamon … Did you not read the on boarding document for this scala forum. You are not supposed to try to shove your opinions regarding how you would do things down a poster’s throat. Instead , attempt to help the poster in a constructive manner. I do appreciate your opinion, however. But would rather here about things that solve the original question I posted. Whether something is mutable or immutable is not what I am currently interested in. However, immutability versus mutability in the construction of an Adjacency Structure is an interesting topic that deserves its own question-posting. I am not interested it that right now, however. And using seq versus Map is yet another tangential comment; though it is interesting and deserves its own question-posting. In fact, regarding that I can say that the classic implementation of Adjacency Structures presented in most College text uses a sort of seq structure to represent the list of edges in the graph. I deviated from that approach and decided to use a Map instead. But this is off topic.


#11

I should be more overt about this: IMO, Programming in Scala is not a great place to start. It’s not often recommended for starting out nowadays. It’s a useful book to own, but I’ve mostly used it for reference, rather than for education.

Precisely because of where you are coming from and what you already know, and because you’re trying to get productive quickly, I would recommend putting it aside and reading Scala for the Impatient instead – it’s a more appropriate starting point for learning the language.

That’s essentially what scala.math.Ordering is – the typeclass for ordered comparison.

Note that overriding equals and hashCode is not very common in Scala code, and is becoming less common year-by-year – you can do it, but it’s hard to get right (due to subclass issues, among other things), and is generally a pain in the neck.

Just as a data point for you: I’ve been doing Scala full-time for six years, and I don’t believe I have ever needed to implement equals or hashCode. There are occasions when it is necessary, but it’s pretty rare these days – Scala idiom has evolved such that you generally just don’t do that. There are usually better solutions.

Also, note that case classes, used properly, implement equality and hash code correctly automatically. (They’re auto-generated by the compiler.) So it’s extremely rare to override those functions on case classes; it almost always means that you’re using case classes incorrectly. I suspect that it’s strictly unnecessary in your example code, although I haven’t sat down and thought through your code enough to be sure of that.


#12

This is particularly funny given your unsolicited expert advice on “immature” sbt
vs mvn one post ago.

A preference for immutability is not just my opinion but baked into
Scala’s DNA. I genuinely just wanted to make things easier for you. I
apologize, won’t happen again.

I have given you compilable examples, directly applicable to your code,
for both the Comparable and the Ordering approach, along with
explanations and links regarding the latter and my reasoning why it’s
preferable. If that’s not constructive enough for you, I’m at loss. I
hope you’ll find someone to compose that Comparable song, or whatever it
is that you are still longing to hear. Good riddance!


#13

being a newbie for the most part, I have to assume that you are correct regarding the book selection. All I did regarding books was just put the word “scala” in the safari online books search box … and that one came back first. So I picked it :grinning: Have no idea what is or is not a good book. I can say that Programming In Scala is a real pain in the *** to read. And some of the author’s examples are just down-right impractical but hey … what do I know??? As for the overriding of hash-code and equals , I am not sure about that one yet. What I wanted to do was to have the Vertex identified only by the name component of that class. I am not sure what the underlying case class’s implementation of hashcode does. I did not wish for it to also use the weight component of the vertex because it is possible that the weight component will change for a node in some client level
processing of the adjacency structure. Cannot see that far ahead right now. But
I do know for sure that the unique identifier of a vertex is its “name”. And the way hashmaps work , they use the hashcode of the node to find the correct bucket. If
there is a collision, then they use the equals method to find the node in some other collection; be it a linked list, tree or map … Havent looked at the Scala source code yet to actually determine how HashMap is implemented in Scala. I know how it is implemented in Java, however.
I think someone mentions why not use an immutable map instead of a mutable map. I think in scala when you use an immutable collection, it is recreated each time you add elements to it. That would be way too inefficient for this type of algorithm since the number of vertices and edges could easily be in the millions.

Ehhh it will probably take me a long time to actually understand Scala. I am not in a big hurry since I just do it for a hobby and fun right now.


#14

Sangamon …
I am sorry if I upset you. I appreciate your advice and examples very much. And please do continue to give them. I am an older guy. My words regarding your approach are not meant as an insult to you in the least. But as something a father would say to a youngster. And that is not an insult either. I will continue to read your comments because they are very helpful to me. You see, I do not come from the typical background that most software engineers come from. I am both an artist and an ex-serious-musician. So sometimes my words tend to sound as if I am maybe being , mean ? But that is not the case. I want you to continue to help me and to help others also. You are a great Scala resource and your knowledge can be of benefit to many Scala newbies. My suggestion would be to try and be more diplomatic in your approach and to bend over backwards in helping people but refraining from immediately criticizing their silly elementary approaches to the language. Believe me when I tell you, having a soft, non-critical approach to helping and teaching others will take you a long way.


#15

Yeah, and I’m not astonished – it’s essentially the reference manual, so it is arguably the most important book in the Scala ecosystem, as well as one of the first. (Disclaimer: I work for the publisher.) But it’s not really aimed primarily at newcomers, the way that several of the newer books like “Scala for the Impatient” and “Creative Scala” are.

For most use cases, it does the right thing: it computes the hashes of its fields, and combines those. (Similarly, it computes equality as whether the contents of the fields are equal.) Unless you have specialized requirements, it’s usually the right answer.

That said, since you are intentionally avoiding using the weight for hashing and equality, it’s slightly different. That’s a little unusual, and there’s a particular trick that’s often used for those – splitting the parameter list.

Scala allows multiple parameter lists (they’re necessary for many use cases), and one important detail with case classes is that they compute equality and hash based on the contents of the first one. Indeed, the first parameter list is different from the second in a lot of subtle but important ways.

So the more typical Scala-idiomatic way of defining your case class would be to not spell out equals and hashCode, but instead define its signature as:

final case class GraphVertexGen[T, W](name: T)(weight: W)

I believe that will auto-generate equals and hashCode more or less identically to the way you’ve built them by hand.

Here’s a pretty good overview of the interesting aspects of case classes.

Largely the same, so yes – this is an example where you do want to manage equality and hashing. It’s just easier to do so with the multi-param-list trick.

No – absolutely untrue. The key to essentially all of the immutable data structures is that they are “diff”-centric. When you create a “new” immutable map by adding or deleting a record, what you’re actually doing is creating a single record that describes the change, which points to the old version of the map for the rest.

This is dead-critical to idiomatic Scala programming: the standard library is full of immutable data structures that work this way. In most cases, making a change is actually quite cheap, because under the hood it’s just recording what has changed and then pointing to the old structure, without copying anything else.

Yep – and it is totally common to use immutable data structures for such cases. In my own application, the core data structure is often multiple megabytes, made up of thousands of smaller structures, and is entirely immutable all the way down. It’s cheap and fast to “copy” because the immutable structures are smart about only actually creating as much data as is actually changing, and leaving the rest alone.

(This is why, for example, Scala’s immutable Vector type is much, much more efficient than Array for many use cases.)

This is a very common misconception for folks coming to functional-programming languages like Scala for the first time. Suffice it to say, the immutable data structures are built on decades of research into this stuff, and are often smarter than you would guess at first glance. And they’re still evolving and improving.

Caveat: for the most performance-critical code, folks do often use mutable data structures. But the common rule of thumb is to default to immutability, understand the performance implications, and switch to mutability only if it’s actually going to buy you an important win. Depending on usage, immutable data structures often work better than you might expect; many projects eschew mutability entirely except for central, performance-crucial libraries. And even then, the mutability is often “under the hood” – used to quickly initialize data structures that otherwise expose immutable APIs.


#16

Interesting information. I will change my funny little graph application to use immutable maps tonight. And the thing about using the multiple argument list is also very interesting. I’ll change that also and just allow the case class to handle the hashcode and equals method.
Thanks a lot for the information.
I do not recall the Programming In Scala book mentioning that point about the hashcode/equals thing being used in that way. Maybe it did mention it but I was just to overwhelmed or confused to catch that point …


#17

Honestly, I haven’t looked at Programming in Scala in a while, so I’m not sure. The multi-param-list thing is an important detail, but it’s pretty subtle – it’s quite possible that, even if the book mentions it, it may not make it clear about why it’s useful…


#18

The book, Programming Scala, does mention multiple parameter lists. I think it is called currying , or something like that. You can just give it one of the arguments first. And then use the resultant function to apply against different versions of the other parameters … some type of functional programming thing. Very interesting. I do understand that concept and did go thru that part of the book. But he did not mention the fact that case classes would just use the first element for generating the hashcode and equals.

you see … in my opinion, the book, Programming In Scala, 3rd Edition, is very tough to use for a java programmer trying to learn scala. I have been reading it as carefully as possible … AND IT IS A PAINFUL EXPERIENCE.


#19

( ps … and I am not sure why my replies appear in large letters sometimes :face_with_raised_eyebrow: )


#20

More precisely, multiple parameter lists enable currying, which is a hugely useful technique. In some programming languages such as Haskell, currying is absolutely central. In Scala it’s less critical, but still good to know – I use it pretty often.

They use the first parameter list, not the first element – it’s important to distinguish those. You can have as many elements as you like in each parameter list. And it’s by far most common to only have a single parameter list for case classes: this use case of needing to override equals and hashcode is one of the few reasons to use multiples.

Yep, agreed. Like I said, it’s not usually one that gets recommended to beginners. (At least, not any more. In the early days it was the only game in town, but there are better options now.)