DRY: matching generic types of a specific type?


#1

Hello,

I am attempting to express a set of “constraints” that must be processed into a “function” that uses those constraints to generate values. An example will make it easier to understand. Say I have this set of “constraints”:

  sealed trait Param[T]
  final case class Range[T](start: T, end: T, delta: T)(implicit num: Numeric[T]) extends Param[T]
  final case class Options[T](e: Seq[T]) extends Param[T]
  final case class Composite[T1,T2](paramA: Param[T1], paramB: Param[T2]) extends Param[(T1,T2)]

Now I can construct a search space so:

  val p1: Range[Int] = Range(1, 10, 1)
  val p2: Options[String] = Options(List("a", "b", "c"))
  val p3: Composite[Int, String] = Composite(p1, p2)

The idea is then to have several strategies that can be “applied” to these constraints to convert them into a sampler (that generates values according to those constraints). For example I could have:

  def gridSearch[T](p: Param[T]): Sample[T] = p match {
    case g@Range(_, _, delta) =>
      val r: Sample[T] = linear(g, delta)
      r
    case l@Options(_) =>
      val r = list(l)
      r
    case Composite(paramA, paramB) =>
      val a = gridSearch(paramA)
      val b = gridSearch(paramB)
      val c = cartesian(a, b)
      c
  }

I could then use this so:

  val s1: Sample[Int] = gridSearch(p1)
  val s2: Sample[String] = gridSearch(p2)
  val s3: Sample[(Int, String)] = gridSearch(p3)
  val s4: Sample[(Int, (Int, String))] = gridSearch(Composite(p1,p3))

Notice that the return type is a “compound tuple” that will then be used as a function parameter elsewhere.

All seems to be fine and dandy as long as the functions linear, list and cartesian are also generic. However, my problem is that functions like linear and list are not (but cartesian is).

More concretely: in the example above p1 is of type Range[Int] and results in a Sample[Int], but it could be of any other type. The following is a “complete” example showing the issue:

object Exp7c {

  sealed trait Sampler[T]
  case class IntUser(s:Int,e:Int) extends Sampler[Int]
  case class DoubleUser(s:Double,e:Double) extends Sampler[Double]

  sealed trait Param[T]
  final case class Range[T](start: T, end: T, delta: T) extends Param[T]
  final case class Options[T](e: Seq[T]) extends Param[T]
  final case class Composite[T1,T2](paramA: Param[T1], paramB: Param[T2]) extends Param[(T1,T2)]

  val p1: Range[Int] = Range(1, 10, 1)
  val p2: Options[String] = Options(List("a", "b", "c"))
  val p3: Composite[Int, String] = Composite(p1, p2)

  def useInt(p : Range[Int]): Sampler[Int] = IntUser(p.start, p.end)
  def useDouble(p : Range[Double]): Sampler[Double] = DoubleUser(p.start, p.end)

  def process[T](p: Param[T]): Sampler[T] = p match {
    case g@Range[Int](_, _, delta) =>
      val r = useInt(g)
      r
    case g@Range[Double](_, _, delta) =>
      val r = useDouble(g)
      r
  }
}

Of course process does not compile. So my question is: short of creating each Param[T] for a specific type, what is the best way to model this in Scala? Any pointers or examples will be greatly appreciated.


#2

The problem is that you’ve written [T](p: Param[T]): Sampler[T], which means "for all types T, if I have a Param[T] I can yield a Sampler[T]". This is not possible, because Param[α] is inhabited for all types α, and there are infinitely many such types, compared to the two types your Sampler supports.

One thing you could try is to constrain T at the point of process with a typeclass, containing your two currently-supported types and maybe some others. You can pattern-match on that typeclass to determine which of useInt or useDouble ought to be used. With a little fiddling with the pattern match, you can even get it to do so type-safely, with no casts.
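
A minimal sketch of constraining T where process is called, reusing the Sampler, IntUser and DoubleUser definitions from the question. The typeclass name RangeSampler and its toSampler method are hypothetical, and this variant lets each instance carry the conversion itself rather than pattern-matching on the instance, so no cast is needed. It only covers Range, not the whole Param hierarchy.

```scala
object TypeclassSketch {
  sealed trait Sampler[T]
  final case class IntUser(s: Int, e: Int) extends Sampler[Int]
  final case class DoubleUser(s: Double, e: Double) extends Sampler[Double]

  final case class Range[T](start: T, end: T, delta: T)

  // Hypothetical typeclass: one instance per element type that has a sampler.
  trait RangeSampler[T] {
    def toSampler(r: Range[T]): Sampler[T]
  }

  object RangeSampler {
    implicit val intInstance: RangeSampler[Int] = new RangeSampler[Int] {
      def toSampler(r: Range[Int]): Sampler[Int] = IntUser(r.start, r.end)
    }
    implicit val doubleInstance: RangeSampler[Double] = new RangeSampler[Double] {
      def toSampler(r: Range[Double]): Sampler[Double] = DoubleUser(r.start, r.end)
    }
  }

  // T is no longer "any type": a RangeSampler[T] must be found at the call site.
  def process[T](r: Range[T])(implicit rs: RangeSampler[T]): Sampler[T] =
    rs.toSampler(r)

  val s1: Sampler[Int] = process(Range(1, 10, 1))
  val s2: Sampler[Double] = process(Range(0.0, 1.0, 0.1))
  // process(Range("a", "z", "b")) is rejected at compile time:
  // there is no RangeSampler[String] instance.
}
```

The instance is selected at compile time from the static type of the Range, which is why this only helps when that type is still known at the call site.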

Unfortunately, I don’t think that’s going to be good enough, because you seem to have constrained the problem so the supported T is dependent on the case of Param as well. But then you threw that type information away. So you might need to constrain the case class constructors with relevant subsets of your typeclass directly, instead.

I don’t really know enough about your problem to explain further. I have some examples of advanced GADT-typeclass pattern matching somewhere around here; maybe I can find them.


#3

Hi Stephen,

Appreciate the feedback.

I understand this, hence the need to declare each type explicitly. This seems to require duplicating code. I was trying to see if there is a better way to do this in Scala.

Confess I don’t understand. Do you mean use implicits to identify the correct conversion (standard idiom of typeclass use)? This would imply the result is determined at compile time, correct?

Again, I am too obtuse to understand. My apologies. Are you referring to the Param itself or the T in Param[T]? Note that Param and all its subtypes express the set of parameters that should be fed to a specific machine learning algorithm (ranges of values, whatever the type).

The Sampler[T] and its subclasses take the Param[T] and generate an actual set of values that will be used by the machine learning algorithm. So the Param[String]:

ListRange(List(a, b, c))

is converted to a:

 Sampler[Finite, String]:

that produces the values:
[a, b, c]

In the above case I am using strings but can use any other type. So the T is really the type of interest.

My current solution is to have some generic case classes and some case classes with specific types:

  sealed trait Param[T]
  final case class Constant[T](c: T) extends Param[T]
  final case class Composite[T1,T2](paramA: Param[T1], paramB: Param[T2]) extends Param[(T1,T2)]
  final case class PairThese[T1,T2](l: Seq[(Param[T1],Param[T2])]) extends Param[(T1,T2)]
  final case class IntRange(start: Int, end: Int) extends Param[Int]
  final case class DoubleRange(start: Double, end: Double) extends Param[Double]
  final case class Strings(e: List[String]) extends Param[String]

And can match so:

  def gridSearch[L <: LengthSample, T](numSamples: Int)(p: Param[T]): Sampler[Finite, T] = p match {
    case r@Constant(_) => const(r)
    case r@IntRange(_, _) => intRange(r,numSamples)
    case r@DoubleRange(_, _) => doubleRange(r,numSamples)
    case l@Strings(_) => stringRange(l)
    case c@Composite(paramA, paramB) =>
      val a = gridSearch(numSamples)(paramA)
      val b = gridSearch(numSamples)(paramB)
      val c = cartesian(a, b)
      c
    case c@PairThese(prs) =>
      val r1 = prs.map { e =>
        val a = gridSearch(numSamples)(e._1)
        val b = gridSearch(numSamples)(e._2)
        cartesian(a, b)
      }
      val r2 = r1.toVector
      val r3 = AppendB(r2)
      r3
  }

which seems to work:

  val s1: Sampler[Finite, Int] = gridSearch(numSamples)(p1)
  val s2: Sampler[Finite, String] = gridSearch(numSamples)(p2)
  val s3: Sampler[Finite, (Int, String)] = gridSearch(numSamples)(p3)
  val s4: Sampler[Finite, (Int, (Int, String))] = gridSearch(numSamples)(Composite(p1,p3))
  val s5: Sampler[Finite, (String, String)] = gridSearch(numSamples)(q8)

Currently I seem to only require making explicit the cases that use numerical types. So this solution may not need as much boilerplate as I imagined.

Did a quick search. Going to study that.

Thanks once again.


#4

Your last example is a good starting point for factoring, so we can see how "you might need to constrain the case class constructors with relevant subsets of your typeclass directly" would work.

"Do you mean use implicits to identify the correct conversion (standard idiom of typeclass use)? This would imply the result is determined at compile time, correct?"

Well, it’s kind of a mix. If you add a typeclass that describes the possible Range types:

sealed trait TC1[T]

object TC1 {
  implicit object TInt extends TC1[Int]
  implicit object TDouble extends TC1[Double]
}

Then add this to your single Range constructor:

  final case class Range[T](start: T, end: T)(implicit val tc1: TC1[T]) extends Param[T]

You can use tc1 to figure out (at runtime) where you are in the space of possible Ts. This is GADT pattern matching.

    case r@Range(_, _) => r.tc1 match {
      case TC1.TInt => intRange(r, numSamples)
      case TC1.TDouble => doubleRange(r, numSamples)
    }
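
A self-contained sketch of the constructor-evidence idea, under some assumptions. It is a variation that does not rely on the compiler refining T inside each case: the TC1 instance captured when the Range was built carries the conversion itself, so process needs no constraint on T and no cast. The sampler method on TC1 and the Listed sampler are hypothetical additions, IntUser and DoubleUser stand in for intRange and doubleRange, and Composite is left out to keep the example short.

```scala
object ConstructorEvidenceSketch {
  sealed trait Sampler[T]
  final case class IntUser(s: Int, e: Int) extends Sampler[Int]
  final case class DoubleUser(s: Double, e: Double) extends Sampler[Double]
  // Hypothetical sampler for explicit lists of values.
  final case class Listed[T](values: Seq[T]) extends Sampler[T]

  // The typeclass from the post, extended (assumption) with the conversion.
  sealed trait TC1[T] {
    def sampler(start: T, end: T): Sampler[T]
  }

  object TC1 {
    implicit object TInt extends TC1[Int] {
      def sampler(start: Int, end: Int): Sampler[Int] = IntUser(start, end)
    }
    implicit object TDouble extends TC1[Double] {
      def sampler(start: Double, end: Double): Sampler[Double] = DoubleUser(start, end)
    }
  }

  sealed trait Param[T]
  // The evidence is resolved at compile time where the Range is constructed
  // and stored as a field, so it is still available after the static type is lost.
  final case class Range[T](start: T, end: T)(implicit val tc1: TC1[T]) extends Param[T]
  final case class Options[T](e: Seq[T]) extends Param[T]

  // No constraint on T here: the Range carries its own evidence.
  def process[T](p: Param[T]): Sampler[T] = p match {
    case r @ Range(start, end) => r.tc1.sampler(start, end)
    case Options(e)            => Listed(e)
  }
}
```

With this arrangement, Range("a", "z") fails to compile because no TC1[String] exists, while Options remains usable with any element type.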

#5

Example makes it clear. Thank you.