Scala syntax question

arbeiter · July 22, 2019, 8:20pm

I see a method like this:
def mkPerson(name: String, age: Int): Either[String, Person] =
  mkName(name).map2(mkAge(age))(Person(_, _))

Here are mkName and mkAge:

def mkName(name: String): Either[String, Name] =
  if (name == "" || name == null) Left("Name is empty.")
  else Right(new Name(name))

def mkAge(age: Int): Either[String, Age] =
  if (age < 0) Left("Age is out of range.")
  else Right(new Age(age))

And here’s the type signature of map2:

def map2[EE >: E, B, C](b: Either[EE, B])(f: (A, B) => C):
    Either[EE, C]

Let’s assume that it is a method on a sealed trait Either.

1. Is mkName(name).map2(mkAge(age))((a, b) => Person(a, b)) the same as mkName(name).map2(mkAge(age))(Person(_, _))?
2. If so, how does the compiler infer the values of a and b here?

jducoeur · July 23, 2019, 12:30am

Howdy! You really want to put those triple backquotes (```) on a line by themselves – as it is, the formatting is getting badly messed up because you have other things on those lines.

Yes, and the key thing to keep in mind is that _ gets bound once per usage. So the first time it shows up, it gets bound to the first parameter (a), the second time to the second parameter (b). So on a quick casual reading, yes, those look to mean the same thing. (I wouldn’t necessarily recommend this syntax – using _ to mean more than one thing can confuse the casual reader. But it’s legal.)
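To make the "bound once per usage" point concrete, here is a minimal sketch. It assumes a simplified Person that takes a String and an Int directly (not the Name/Age wrappers from the question), and the object and value names are made up for illustration.

```scala
object PlaceholderDemo {
  // Hypothetical simplified Person, only here to keep the snippet self-contained.
  case class Person(name: String, age: Int)

  // Explicit lambda: the parameters are named a and b.
  val explicit: (String, Int) => Person = (a, b) => Person(a, b)

  // Placeholder form: the first _ stands for the first parameter,
  // the second _ for the second parameter.
  val sugared: (String, Int) => Person = Person(_, _)
}
```

Calling PlaceholderDemo.explicit("Ann", 30) and PlaceholderDemo.sugared("Ann", 30) should give equal Person values.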

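As for how the compiler can figure it out: well, let’s walk through it. mkAge produces Either[String, Age]. map2() therefore knows that EE is String and B is Age. A is already known to be Name, because map2 is being called on the Either[String, Name] that mkName returns. The function f therefore expects a Name as the first parameter and an Age as the second, so the compiler can suss that parameter a should be a Name and b an Age.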

I’ll admit – that’s pushing the type inference fairly hard, and I wouldn’t necessarily have assumed that the compiler could figure it out. But it doesn’t astonish me: there’s enough information there that I can see it even without knowing the type signature of Person, so it’s reasonable that the compiler can put it together…

crater2150 · July 23, 2019, 8:35am

The underscores are really only syntactic sugar. Person(_, _) is directly translated to (a, b) => Person(a, b), just with generated names for the parameters, so both are exactly the same. This actually happens before any type checking at all (you can check with scalac -Xshow-phases, which lists phase 1 as "parse source into ASTs, perform simple desugaring", while typing is phase 4).

So the type inference has to happen in both cases, as neither form gives types for the function parameters. The inference always goes from left to right, one parameter list at a time. This is important: putting both parameters of map2 in the same parameter list would reduce the compiler’s ability to infer types, and we might have to specify them for the function.
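A small sketch of the parameter-list point, using Option instead of Either to keep it short. combineOne and combineTwo are hypothetical helpers, not part of any library. The single-list call annotates the lambda parameters because, as far as I know, Scala 2 reports "missing parameter type" without them; newer compilers may be more lenient.

```scala
object ParamListDemo {
  // One parameter list: A and B are still being solved while f is type-checked.
  def combineOne[A, B, C](a: Option[A], b: Option[B], f: (A, B) => C): Option[C] =
    for { x <- a; y <- b } yield f(x, y)

  // Two parameter lists: A and B are fixed by the first list before f is looked at.
  def combineTwo[A, B, C](a: Option[A], b: Option[B])(f: (A, B) => C): Option[C] =
    for { x <- a; y <- b } yield f(x, y)

  // No annotations needed: n is known to be Int and s to be String.
  val viaTwo: Option[String] = combineTwo(Option(2), Option("ab"))((n, s) => s * n)

  // Annotated so that it compiles regardless of how the single list is inferred.
  val viaOne: Option[String] = combineOne(Option(2), Option("ab"), (n: Int, s: String) => s * n)
}
```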

With the multiple parameter lists, it works like @jducoeur says: the compiler looks at the first parameter list, which expects an Either[EE, B], and the argument we give it is an Either[String, Age]. So B is definitely fixed as Age.
For EE, the compiler checks whether String is a supertype of E. E (from mkName) is also String, so no problem. If it wasn’t, the compiler would then look for the narrowest type that is a supertype of both String and E.
Now we know EE and B, and A was already set to Name by the Either we call map2 on, so only C is left. This means all types for the input to our function are known. The return type C can be inferred from the body of the function: the last expression is of type Person, so that’s what C is.
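The walk-through above can be sketched as runnable code. This is an illustration under assumptions, not the original source: the Either below is a minimal stand-in with one possible map2 body, Name, Age and Person are assumed to be simple case classes, and ValidationError, BadName, BadAge and mixed are invented to show the "narrowest common supertype" case.

```scala
object Map2Demo {
  // Minimal stand-in for the Either in the question (shadows scala.Either inside this object).
  sealed trait Either[+E, +A] {
    def map2[EE >: E, B, C](b: Either[EE, B])(f: (A, B) => C): Either[EE, C] =
      (this, b) match {
        case (Left(e), _)          => Left(e)        // first error wins
        case (_, Left(e))          => Left(e)
        case (Right(a), Right(bb)) => Right(f(a, bb))
      }
  }
  case class Left[+E](value: E) extends Either[E, Nothing]
  case class Right[+A](value: A) extends Either[Nothing, A]

  // Assumed shapes for the types used in the question.
  case class Name(value: String)
  case class Age(value: Int)
  case class Person(name: Name, age: Age)

  def mkName(name: String): Either[String, Name] =
    if (name == null || name == "") Left("Name is empty.") else Right(Name(name))

  def mkAge(age: Int): Either[String, Age] =
    if (age < 0) Left("Age is out of range.") else Right(Age(age))

  // Everything inferred: EE = String, B = Age, C = Person; a: Name, b: Age.
  def mkPerson(name: String, age: Int): Either[String, Person] =
    mkName(name).map2(mkAge(age))(Person(_, _))

  // The same call with every inferred piece written out by hand.
  def mkPersonExplicit(name: String, age: Int): Either[String, Person] =
    mkName(name).map2[String, Age, Person](mkAge(age))((a: Name, b: Age) => Person(a, b))

  // Different error types on each side: EE has to be a common supertype of both.
  sealed trait ValidationError
  case class BadName(msg: String) extends ValidationError
  case class BadAge(msg: String) extends ValidationError

  val mixed: Either[ValidationError, Person] =
    (Left(BadName("empty")): Either[BadName, Name])
      .map2(Right(Age(3)): Either[BadAge, Age])(Person(_, _))
}
```

mkPerson and mkPersonExplicit should behave identically; the second only spells out what the compiler works out in the first.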

I don’t feel like this is pushing inference hard; most higher-order functions rely on this. I rarely see lambdas with specified parameter types.

To better remember how the underscore syntax works, you can compare it with a fill-in-the-blanks text: you give an expression with blanks _ and the compiler fills them with parameters from left to right, one parameter per blank.
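A few fill-in-the-blanks cases as a quick sketch (the names are made up for illustration). The last one shows the limit of the analogy: a blank cannot be reused, so a parameter that appears twice needs a name.

```scala
object BlanksDemo {
  // Two blanks, two parameters: same as (x, y) => x - y.
  val subtract: (Int, Int) => Int = _ - _

  // One blank, one parameter: same as s => s.toUpperCase.
  val shout: String => String = _.toUpperCase

  // Using one parameter twice needs a name; `_ * _` would mean two parameters.
  val square: Int => Int = x => x * x
}
```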