Strange "nested" Partial Function behavior

Tomassino · September 25, 2024, 10:35am

Hi all, I’ve stumbled upon a strange behavior regarding “nested” partial functions. I have the following code:

package example

object TestPF extends App {
  val innerPf: PartialFunction[Any, Option[Unit]] = { case x: String => Some(()) }
  val outerPf_1: PartialFunction[Any, Any]        = { p =>
    innerPf(p) match { case Some(_) => println("Something") }
  }
  val outerPf_2: PartialFunction[Any, Any]        = { p =>
    val something = p

    // Only here compiler warning: "match may not be exhaustive. It would fail on the following input: None"
    innerPf(something) match { case Some(_) => println("Something") }
  }

  println(s"innerPf.isDefinedAt(10): ${innerPf.isDefinedAt(10)}")
  println(s"innerPf.isDefinedAt(\"abc\"): ${innerPf.isDefinedAt("abc")}")

  try {
    println(s"outerPf_1.isDefinedAt(10): ${outerPf_1.isDefinedAt(10)}")
  } catch {
    case _: MatchError => println("outerPf_1.isDefinedAt(10): MatchError")
  }
  try {
    println(s"outerPf_1.isDefinedAt(\"abc\"): ${outerPf_1.isDefinedAt("abc")}")
  } catch {
    case _: MatchError => println("outerPf_1.isDefinedAt(\"abc\"): MatchError")
  }

  try {
    println(s"outerPf_2.isDefinedAt(10): ${outerPf_2.isDefinedAt(10)}")
  } catch {
    case _: MatchError => println("outerPf_2.isDefinedAt(10): MatchError")
  }
  try {
    println(s"outerPf_2.isDefinedAt(\"abc\"): ${outerPf_2.isDefinedAt("abc")}")
  } catch {
    case _: MatchError => println("outerPf_2.isDefinedAt(\"abc\"): MatchError")
  }

  // OUTPUT:
  // innerPf.isDefinedAt(10): false
  // innerPf.isDefinedAt("abc"): true
  // outerPf_1.isDefinedAt(10): MatchError
  // outerPf_1.isDefinedAt("abc"): true
  // outerPf_2.isDefinedAt(10): true
  // outerPf_2.isDefinedAt("abc"): true
}

As you can see I define three partial functions:

innerPf which is a simple partial function only defined on strings;
outerPf_1 which is a partial function that internally calls innerPf. Notice that the body is just the call to innerPf and a match statement, nothing else;
outerPf_2 which is similar to outerPf_1 but has an additional instruction before the call to innerPf and the match, i.e. val something = p

If you look at the output, there’s something odd. In particular, outerPf_1 and outerPf_2 behave slightly differently, even if I would expect them to be basically identical:

of the two, I would say that outerPf_2 is the “normal” one, since it’s defined on all possible inputs (the fact that is will fail later on anything except strings is expected and isDefinedAt cannot know). In fact, the compiler even warns about the non exhaustive internal pattern match;
outerPf_1, instead is a bit strange: the compiler gives no warning and isDefinedAt throws a strange MatchError, which I suppose should never be thrown by that function.

I tried that same exact code with scala 2.13.11, 3.3.3 and 3.5.1 and the outcome is exactly the same. Is this expected? Why are outerPf_1 and outerPf_2 behaving differently?

jrlemieux · September 25, 2024, 2:33pm

Hello,

Your outerPf_1 and outerPf_2 partial functions are in fact behaving like normal functions having a parameter (p: Any), as your “p =>” notation means “case p: Any =>”, meaning that the .isDefinedAt(x) method will always return true for any value of x of any type. So, in your code, either .isDefinedAt() returns true, or either it throws an exception, but it will never return false.

But you are right that there is something really fishy in this code.

In outerPf_2, if you replace the method call “innerPf(something)” by “innerPf(p)”, the code will behave exactly the same, as expected. But then, the code does not behave the same if you then comment the statement “val something = p”. This is very strange, as the existence/absence of this useless statement should not alter the behavior of the code.

So these two implementations of outerPf_2 bring the same behavior:

val outerPf_2: PartialFunction[Any, Any]        = { p =>
    val something = p
    innerPf(something) match { case Some(_) => println("Something") }
  }

val outerPf_2: PartialFunction[Any, Any]        = { p =>
    val something = p
    innerPf(p) match { case Some(_) => println("Something") }
  }

But now, this implementation changes the behavior:

val outerPf_2: PartialFunction[Any, Any]        = { p =>
    //val something = p
    innerPf(p) match { case Some(_) => println("Something") }
  }

This is the real mystery in this code. Maybe is it a compiler bug?

I tested using Scala 3.5.0.

sageserpent-open · September 25, 2024, 3:45pm

Try this for fun:

val outerPf_1: PartialFunction[Any, Any] = (p: Any) => {
  val ooble = innerPf(p)
  ooble match
    case Some(_) => println("Something")
}
val outerPf_2: PartialFunction[Any, Any] = (p: Any) =>
  val something = p

  // Only here compiler warning: "match may not be exhaustive. It would fail on
  // the following input: None"
  val booble = innerPf(something)
  booble match
    case Some(_) => println("Something")
  end match

outerPf_1(10) // scala.MatchError: 10 (of class java.lang.Integer)
outerPf_1("abc") // Something
outerPf_2(10) // scala.MatchError: 10 (of class java.lang.Integer)
outerPf_2("abc") // Something

Here, I’ve out-of-lined the call to the inner partial function. That hides the potential match error going on in the inner partial function call, so both of the outer partial function calls are essentially written as total functions with a potential runtime hole in the subsequent pattern match - and that is warned about in both cases.

The heavy mob will weigh in on this better than I, but my guess is that the inlining of the call to the inner partial function in the original code was somehow defeated by the use of the function-local temporary, thus hiding the match error in the second outer partial function. The first one did inline the call, and it got picked up as part of the ahem, partiality, of the first outer partial function.

Either that, or there is some type inference subtlety that means the call to the inner partial function can have a different type depending on whether its a subexpression or simply used as-is to initialize a variable.

BTW: why do you want to define a partial function is returning an Option? I can understand unlifting a total function returning an Option into a partial one, but that feels odd…

jrlemieux · September 25, 2024, 5:02pm

I tried few things in order to understand the code, and I am very puzzled.

I created a small Scala3 project at

I created two classes Test1 and Test2 that inherit common code, each one however implementing its own flavor of the outerPf_2 partial function.

The only difference between the two outerPf_2 partial function implementations is the commented “val something = p” statement.


class Test1() extends TestTrait:

  val outerPf_2: PartialFunction[Any, Any] = p =>
    val something = p
    innerPf(p) match
      case Some(_) => println("Something")

class Test2() extends TestTrait:

  val outerPf_2: PartialFunction[Any, Any] = p =>
    // val something = p
    innerPf(p) match
      case Some(_) => println("Something")

This simple commenting of a useless statement causes different output:

Running Test1
innerPf.isDefinedAt(10): false
innerPf.isDefinedAt("abc"): true
outerPf_1.isDefinedAt(10): MatchError
outerPf_1.isDefinedAt("abc"): true
outerPf_2.isDefinedAt(10): true
outerPf_2.isDefinedAt("abc"): true

Running Test2
innerPf.isDefinedAt(10): false
innerPf.isDefinedAt("abc"): true
outerPf_1.isDefinedAt(10): MatchError
outerPf_1.isDefinedAt("abc"): true
outerPf_2.isDefinedAt(10): MatchError
outerPf_2.isDefinedAt("abc"): true

Running the script “check.sh” in the project root folder will clean build the classes, run the program and decompile the various classes for Test1 and Test2, and compare them.

We end up with very different code generation for the isDefinedAt() method of each outerPf_2 implementations:

For Test1:

  public final boolean isDefinedAt(java.lang.Object);
    Code:
       0: aload_1
       1: astore_2
       2: iconst_1
       3: ireturn

For Test2:


  public final boolean isDefinedAt(java.lang.Object);
    Code:
       0: aload_0
       1: getfield      #24                 // Field $outer:Lcom/playzone/some-common-name;
       4: invokevirtual #33                 // Method com/playzone/some-common-name.innerPf:()Lscala/PartialFunction;
       7: aload_1
       8: invokeinterface #39,  2           // InterfaceMethod scala/PartialFunction.apply:(Ljava/lang/Object;)Ljava/lang/Object;
      13: checkcast     #41                 // class scala/Option
      16: astore_2
      17: aload_2
      18: instanceof    #43                 // class scala/Some
      21: ifeq          26
      24: iconst_1
      25: ireturn
      26: iconst_0
      27: ireturn

I double-checked my code, all seems right, but the compiler behavior seems wrong.

Any idea what might go wrong?

Tomassino · September 26, 2024, 2:23pm

Hi, thanks for the replies and the interesting discussion :-). First, let me say that the code I posted is really just an example, what I was writing wasn’t even using isDefinedAt and the “outer” partial function was an Akka receive handlers (thus had type Receive, i.e. PartialFunction[Any, Unit]). Since it was using Akka and I don’t really know how the receive handler is called internally and what members of PartialFunction are used, I tried to find a simpler example in plain scala.

As @jrlemieux pointed out, it’s actually the presence/absence of an instruction before the call to the inner partial function that makes a difference. If you replace val something = p with e.g. a call to println, the code behaves the same (by the way, that’s how I noticed that something was not right in the first place).

Regarding the experiment of @jrlemieux I’m not experienced enough to read the decompiled bytecode, but it seems to me that the code for Test1 is correct (as noted in a previous comment, it’s actually a total function, so it always return true), while for Test2 it seems to actually invoke the internal partial function (do I read line 4 correctly?), which seems completely wrong to me…

At this point, should I maybe open a bug report? If this is the case, can I link your repository?

som-snytt · September 27, 2024, 12:05am

There is already a new ticket: Adding a useless assignment statement alters program behavior. · Issue #21649 · scala/scala3 · GitHub

I commented and linked to the old scala 2 ticket with discussion.

I agree that it is a “gotcha”; if it is not a bug, it deserves a “lint”.

I think it is not a bug, but I also agree with the Martin Oderskyism, “No one writes code like that.” There should be more warnings with those words.

The nice example from the ancient ticket via Jasper Moeys in 2020 is:

scala> List("1","two","3").collect { s => util.Try(s.toInt) match { case util.Success(i) => i }}
val res4: List[Int] = List(1, 3)

That ticket was from 2011. Everyone is 13 years older now.

sageserpent-open · September 27, 2024, 5:32am

What perturbs me is not the behaviour of the ‘extreme adaptation’ - I read through the discussion on the older ticket as well as the new one. It is the lack of referential transparency when for example, the inner partial function call is or is not extracted into a local variable binding. That is very confusing. Even more so when there is some leading dead code, which was what started this discussion.

I confess, I wouldn’t lose any sleep over this if the compiler or linter confessed as to what it is doing in the ~~two situations~~ situation where the subexpression reveals its partiality, and how the resulting behaviour might vary.

DISCLAIMER: I rarely read compiler warnings, and I don’t wear the FP zealot outfit that often. Treat me not as a uniformed agitator, rather an uninformed agitator.

Tomassino · September 27, 2024, 10:47am

I replied to the bug report adding more experiments. I’m still convinced that this is a bug (of course not a serious one), because I’m not expecting isDefinedAt to throw an exception and for sure not to call the inner partial function (see my message for more details). If this cannot be easily fixed because e.g. it would make it impossible to apply a very important optimization, then I would at least reject code and fail to compile it.

Regarding “No one writes code like that”: it sounds to me similar to saying “nobody is called Robert'); DROP TABLE Students; --, so no need to sanitize database inputs” (here) .

Odersky · September 27, 2024, 1:20pm

I added a reply to the issue thread.

som-snytt · September 30, 2024, 3:40pm

There is a new ticket on HEAD that reminded me that some syntactic idioms make you look twice (if not askance).

The first line is from the ticket, the second line is the topic of this topic (where you can’t factor out the expression), and the third line is the expectation that it should just work.

scala> List(42, 27).collect { var n = 0; { case i if i > 30 => i+n } }
val res0: List[Int] = List(42)

scala> List(42, 27).collect { x => { x+1 match { case i if i > 30 => i*2 } } }
val res1: List[Int] = List(86)

scala> List(42, 27).collect { x => { var n = 0; x match { case i if i > 30 => i+n } } }
scala.MatchError: 27 (of class java.lang.Integer)
  at $anonfun$1.applyOrElse(<console>:1)
  at $anonfun$1.applyOrElse(<console>:1)
  at scala.collection.immutable.List.collect(List.scala:276)
  ... 30 elided

The expectation is due to the sense that for a “pattern matching anonymous function”, the x => x match is just boilerplate syntax.

scala> List(42, 27).collect { var n = 0; x => x match { case i if i > 30 => i+n } }
val res3: List[Int] = List(42)

scala> List(42, 27).collect { x => { var n = 0; { case i if i > 30 => i+n } } }
                                                ^
       error: missing parameter type for expanded function
       The argument types of an anonymous function must be fully known. (SLS 8.5)
       Expected type was: ?

The urgency to disallow beloved res1 syntax is to justify why the other two forms don’t work. The argument to make them work might be: a match or matchless cases in result position of a function literal (which has a PF expected) does the obvious, safe thing.

sageserpent-open · October 1, 2024, 9:46am

@som-snytt That seems to make sense to me.

To recapitulate my education so far:

I thought only matchless cases defined anonymous partial functions, so expected full-fat lambdas to define potentially runtime-unsafe total functions.
I’ve learned that there’s been a need to workaround lack of type-inference for matchless cases, so lambdas that are simply eta-expanded matches are treated as partial functions too. OK, cool.
Somewhere along the line the boundary between full-fat lambdas and partial functions got fuzzy, and now all kinds of lambdas may or may not be partial functions in disguise.
You are proposing that as long as the questionable lambda culminates in cases and the context wants a partial function, then it will be a partial function. That sounds good, I could just look at the lambda and know what it means.

As a footnote…

Going back to the original example, the call to the named (‘inner’) partial function binding was presumbly a red-herring - following what @jrlemieux posted, when the outer lambda compiled as full-fat, it said “Trust me, I’m a total function” and when as an anonymous partial function, it simply blew up when making the naked call to the inner partial function as a prelude to doing the pattern match dry-run. Not sure how that ClassCastException got reported as a MatchError, I presume there is a catch-all handler in the compiled match expression?

My head hurts!

Odersky · October 1, 2024, 10:15am

We have to keep in mind that ultimately we need a spec for this. It’s precisely this play loose and hack the compiler without a spec that got us into these troubles. I believe by far the simplest thing to spec is the original behavior with the x => x match ... addition. The alternative of admitting anything that ends in a match statement would be super messy to spec, and I have an aversion against messy specs for features that are in the end marginal.

jducoeur · October 1, 2024, 10:39am

Strong agreement. And honestly, I find anything else confusing as heck – in particular, these examples that include vars inside the body just leave me scratching my head trying to understand their behavior, which is usually a sign that we really shouldn’t be doing that. (When these questions started showing up, my first reaction was, “Surely that can’t possibly compile”, and I was a little dismayed to find that I was wrong.)

I’m not sure how much code in the wild would be broken by tightening this up, but unless it’s substantial, I’d love to see some clarification.

sageserpent-open · October 1, 2024, 11:29am

@jducoeur , @Odersky, @som-snytt Truth be told, I’ll take whatever I can get - I’m a low-effort arguer on this topic, and this kind of code probably won’t come up for me.

Having this documented somewhere, regardless of how it is resolved, would be nice.

Incidentially, even if nothing was changed from the current behaviour, I can imagine that putting in a compiler warning stating that “You think you’re getting a partial function, but you’re not” could be a help - and that sort of exists in the original example in the sense that the match expression is reported as missing patterns when a full-fat lambda is generated.

I would still be mystified as to the lack of referential transparency / dead code having an influence, but to paraphrase: “Compiler says no.” OK, I’ll go with the compiler.

Not everyone reads the spec, so perhaps issuing a warning might be just as useful if either of the two resolutions discussed are implemented.

Again, I don’t pay enough attention to compiler warnings either - but that’s my fault, PICNIC.

Tomassino · October 1, 2024, 5:51pm

My 2 cents: I would be even more strict and say that x => ... is a total function, while { cases } is a partial function. This would mean also excluding x => x match { ... } from being interpreted as a partial function. However it’s probably too strict, so the proposal of @Odersky seems the best one, to me.

jducoeur · October 2, 2024, 12:57pm

I actually agree with this – I was kind of surprised when I first discovered that the x match form worked, and find it a tad weird. But I suspect that would be too breaking a change to be worth it, and it’s at least clear what the code means.

Jasper-M · October 3, 2024, 1:27pm

som-snytt:

I agree that it is a “gotcha”; if it is not a bug, it deserves a “lint”.

I think it is not a bug, but I also agree with the Martin Oderskyism, “No one writes code like that.” There should be more warnings with those words.

The nice example from the ancient ticket via Jasper Moeys in 2020 is:
scala> List("1","two","3").collect { s => util.Try(s.toInt) match { case util.Success(i) => i }}
val res4: List[Int] = List(1, 3)
That ticket was from 2011. Everyone is 13 years older now.

I’m sure that whoever wrote that crazy code still agrees that it’s a gotcha.