Has a loop/recurs comprehension been considered for scala

I don't think Clojure is a good tool for discovering Java semantics at all. IMO it's just as bad as Jython or JRuby in this regard. You can implement very exotic semantics on the JVM if you don't care about performance, bytecode size, and compatibility with other languages, but Scala wants to be performant and compatible. Often Scala classes can be extended normally from Java, Kotlin, etc. (i.e. it's not hard to write a class in Scala that will be extensible in Java and Kotlin).

trait Bar {
  def m(p: String): Unit
}

class Foo extends Bar {
  val f: String => Unit =
    (p: String) => println("function")

  def m(s: String): Unit =
    println("function")

  val fn = m _
}

The above code is semantically equivalent to this:

trait Bar {
  def m(p: String): Unit
}

class Foo extends Bar {
  val f = new (String => Unit) { // or (equivalently): new Function1[String, Unit] {
    def apply(p: String): Unit =
      println("function")
  }
  
  def m(s: String): Unit =
    println("function")
  
  val fn = new (String => Unit) { // or (equivalently): new Function1[String, Unit] {
    def apply(s: String): Unit =
      m(s)
  }
}

It was desugared this way in Scala 2.11 and earlier. Scala 2.12 uses the Java 8 lambda machinery (invokedynamic with LambdaMetafactory) to create lightweight lambdas, but they have the same semantics as the anonymous inner classes shown above.

There's a little catch in this desugaring, though: this in the sugared form refers to the outer class, while this in the desugared form refers to the function object itself. The same difference exists in the Java 8 language (lambda vs. anonymous class syntax). Anyway, that doesn't matter for our discussion, because the outer/inner relationship is completely different from the parent/child relationship.
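This difference is directly observable in Scala; a minimal sketch (the class and object names here are made up for illustration):

```scala
class Outer {
  // a function literal: at the source level, `this` refers to the
  // enclosing Outer instance, even though the bytecode is a lambda
  val f: () => AnyRef = () => this

  // an explicit anonymous Function0 subclass: `this` refers to the
  // anonymous function object itself
  val g: () => AnyRef = new Function0[AnyRef] {
    def apply(): AnyRef = this
  }
}

object ThisDemo extends App {
  val o = new Outer
  println(o.f() eq o)    // true:  the literal closes over the outer `this`
  println(o.g() eq o)    // false: `this` is the anonymous function instance
}
```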

The code and data structures the Scala compiler creates are pretty big for this forum tool, more than I think I can type in correctly. Here is a pointer to the definition.

https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-5.html#jvms-5.1

Scala compiles directly to JVM bytecode. Java would likely generate different code for the same logic, but the class-file data structures and opcodes are the same ones defined there.

If you mean tail recursion as an optimization, my description above specifies the difference between a tail-recursive function call and an invokevirtual method call. I don't know if Scala optimizes tail-recursive "val" functions.

The pseudocode for tail recursion:

def f(sx: List[String]): Unit = {
  if (sx.isEmpty) println("done") else f(sx.tail)
}

is referentially identical to:

def f(sx: List[String]): Unit = {
  var mysx = sx
  while (mysx.nonEmpty) mysx = mysx.tail
  println("done")
}
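The equivalence can be checked with the compiler itself: scala.annotation.tailrec makes scalac fail compilation if it cannot rewrite the self-call into a jump. A sketch (the object name is arbitrary):

```scala
import scala.annotation.tailrec

object TailDemo extends App {
  // methods on an object cannot be overridden, so scalac accepts the
  // annotation and rewrites the recursion into a while-style loop
  @tailrec
  def f(sx: List[String]): Unit =
    if (sx.isEmpty) println("done") else f(sx.tail)

  // deep enough that an unoptimized recursive call would overflow the stack
  f(List.fill(1000000)("x"))  // prints "done"
}
```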

Both the calling code and the compiled method generate identical method reference blocks.

	invokevirtual C.m:(A*)T

is identical for the caller and for the method being compiled. The method being compiled has no reference to the caller's code (unless it uses .getClass). But the concrete method can only be called if the "invokevirtual C.m:(A*)T" is identical. Therefore the compiler knows how to generate the same "invokevirtual C.m:(A*)T". If the caller's invokevirtual resolved to the executing method, the method's own invokevirtual will also resolve to the executing method.

super.f()

And
new Foo().f()

Are not the same code as would be generated for the executing method. They clearly are invoking different logical methods. Those can’t be tail recursive even for final methods.

Because both invokevirtual and tail recursion will invoke the currently executing bytecode, the code is the same.

The HotSpot compiler (which compiles bytecode to machine code) does additional optimizations, but the compiled code must behave exactly like evaluating the bytecode. The JVM only invokes the HotSpot compiler once it determines the bytecode is a hot spot (i.e. executed a lot); what you are calling the JIT compiler is the HotSpot compiler. So the bytecode is interpreted until the JVM decides it needs to be optimized to machine code, and after it is compiled it executes natively. Because the JVM always starts by interpreting the bytecode and compiles it iff the code is "hot", the results of executing the method in both cases must be identical.

The code run is obviously different (bytecode vs. machine code), but the result of both must be identical. If/when the HotSpot compiler implements tail recursion, or uses another method to preserve the stack, it might choose to optimize only final methods, or it might optimize non-final methods as well. The Scala compiler can always optimize both. So it is a good idea for the Scala compiler to optimize all tail recursion rather than wait for a HotSpot implementation.

BTW, there is always bytecode executed after the tail-recursion-optimized code. The return value of the method must be set up before the return can be executed. I think that means recursive functions can be inlined even if they aren't in the tail position. I need to think more on that. It probably requires creating "final private[this] methods". I haven't thought through that.

It's not, as long as f can be overridden. Period. That's how method invocation in Java works, and I've shown it so many times in this topic…

invokevirtual needs an instance of an object and a method signature (and call arguments of course). invokevirtual follows the object reference, then follows the class reference residing in object header and then invokes a method from the vtable of that class. The type of object reference (semantically) doesn’t matter during the search in vtable. It also doesn’t matter from which method or object that invokevirtual was invoked, as both object reference and method signature are always provided explicitly to invokevirtual.

I used Clojure to define what I mean by "function" and "method". Clojure doesn't do the type calculus of Scala, but it doesn't need to. Neither Scala nor Clojure has the characteristics of Idris, so one isn't superior to the other. Good programmers create good programs in both languages; mediocre programmers will create mediocre programs in both languages. The only real difference is whether you like prefix or infix notation.

That's not true. If the method reference is identical, the method call is always the same, even if the function is overridden. If it would call the overridden method for the initial caller, it will also call the same overridden method when called from within the method itself.

In other words, the currently executing concrete method can only be executing because the invokevirtual resolved to it. If that weren't true, different calls on the identical object would run different concrete methods. That would be very bad; I can't even imagine how invokevirtual would pick one concrete method over another.

If invokevirtual resolved to the executing concrete method once, it will always resolve to the executing method. The block that describes which virtual function is used in the lookup phase is the same whether the call comes from another function or from the currently executing function. It might be the same block or it might be a different block, but it will describe the same method even if the caller has a different class loader.

If the compiler knows enough to do an invokevirtual for any object, it knows how to do an invokevirtual for the current object. It (should) know that the invokevirtual will resolve to the executing method. If it doesn't, it's because the compiler developers decided not to implement the feature. It is easy to implement, though it is more complex than it is for a final method. I'm sure the compiler developers decided to punt rather than implement the general case. invokevirtual will never select a different concrete method.

Calling it a bug is too strong, but it must have been a decision not to do it. A new version of the compiler could implement the optimization without changing the exact results of the method. They could decide to do it only when the "-optimize" flag is used; the code would then differ from what is executed in the IDE, but that's true of everything generated by "-optimize".

No, because you can invoke a particular implementation with invokespecial instruction. super.method(args) compiles to invokespecial instead of invokevirtual. You cannot know whether a subclass will use invokespecial in its own implementation of a method and you must not break semantics during optimization.

Again, super is telling the JVM to call another concrete method. Tail recursion only works if it is the same object and method being called. If another object is being called, or a super method is being called, it can never be tail recursive, even if the method is final. It is no different than calling an overloaded method: the method has the same name, but that's all.

If a method m in class C is declared final, then invoking this.m(args) in class C will always call the implementation of m that belongs to class C (because you cannot override final methods). Therefore it’s safe to do tail recursion optimization on final methods. Or to put it differently: if we have a class C with final method m and reference ref of type C then calling ref.m(args) is statically known to resolve to implementation of m in class C. Without final modifier there’s no guarantee. this in class C is of type C (but the object pointed to by this could be an instance of some subclass of C). Therefore final is crucial for correctness of tail recursion optimization.
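A minimal sketch of why this matters (classes A and B are made up for illustration): the self-call in A.f goes through invokevirtual, so on a B instance it dispatches back into the override.

```scala
class A {
  def f(n: Int): Int =
    if (n == 0) 0 else 1 + f(n - 1)  // virtual self-call: may hit an override
}

class B extends A {
  override def f(n: Int): Int =
    if (n == 0) 100 else super.f(n)  // re-enters A.f via invokespecial
}

object DispatchDemo extends App {
  println(new A().f(3))  // 3:   plain recursion inside A
  println(new B().f(3))  // 103: every self-call in A.f lands in B.f again
}
```

If A.f were blindly rewritten into a loop, new B().f(3) would print 3 instead of 103, silently changing behavior.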

If you think otherwise, show a counterexample.

Another description:

class SomeClass {
  def someMethod(a: Int, b: String, c: Any): Unit = {
    // this can be converted to a jump only if we can
    //   be sure `someMethod` cannot be overridden
    // if it can be overridden then implementation from subclass can use
    //   super.someMethod(Int, String, Any) to call into this implementation
    // it cannot be overridden if it's final
    this.someMethod(a, b, c)
  }
}

As I outlined in the previous posts, a method need not be final for the compiler to know, without question, that the currently executing code is the code that will/would be invoked by invokevirtual. The compiler must create the data structures that allow the lookup to be performed, and that also requires that only the currently executing method will be dispatched.

In short, if invokevirtual dispatched to the currently executing method, it will also invoke the exact same method as is currently running. The scala compiler has all of the required information to make that determination. A final method just makes it easier to determine.

No. scalac doesn’t know if you’ve entered a method implementation using invokevirtual or invokespecial.

Perhaps this will be more transparent.

final class Consumer {
  def consumer: Unit = {
    val fval = new Foo()
    fval.f()  // first call
    fval.f()  // second call
    consumer(fval)
  }

  def consumer(myfval: Foo): Unit = {
    val boo = myfval
    boo.f()   // third call
  }
}

class Foo extends Whatever {
  def f(): Unit = println("hello world")
}

Will the invokevirtual called by the consumer execute the exact same concrete method in the first call and the second call? Yes. Will the code executed by the third call be the same as the first two calls? Yes.

Is the value of fval and boo identical to "this" in the code being executed by the invokevirtual? Yes.

So we’ve established that the same code called by every pointer to the same object is the same code.* The pointer “this” is the same as fval and boo. All dispatch to the method f() via the vals fval, boo, and this must always call the same concrete method. A dispatch on that val will call the same concrete method regardless of the class calling the method.

  • .getClass will return a pointer to the same class structure with this as well as fval.
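These identity claims are directly checkable; a sketch (Foo redeclared standalone here, without the Whatever parent, for self-containment):

```scala
class Foo {
  def f(): Unit = println("hello world")
}

object IdentityDemo extends App {
  val fval = new Foo()
  val boo: Foo = fval
  println(fval eq boo)                    // true: same object
  println(fval.getClass eq boo.getClass)  // true: same runtime class structure
}
```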

If invokespecial didn't exist, you would be right. But you just ignored its existence.

The whole point of invokespecial is to be able to select different method implementation than invokevirtual would do. Thus in general you can’t be sure if current (currently executing) method implementation was selected by invokevirtual or invokespecial.

In your case I could do:

class Quux extends Foo {
  def f(): Unit = { println("hey"); super.f() } 
}

therefore the f body in Foo could also be entered via an invokespecial instruction from the f body in Quux.

invokespecial will resolve to the same function. Besides, the executing function will never use an invokespecial to call the currently executing method on the variable this.

I think I misunderstood your point on invokespecial. The executing method does not care who/how it was called.

The executing method's behavior depends solely on its inputs (including instance and class fields), so it does not need to know how it was called. If invokespecial was used, the lookup phase of that opcode does all of the function/concrete-method resolution, then checks whether the method is private/reachable. That happens as the invokespecial bytecode is evaluated (or its compiled equivalent).

The naïve to-bytecode compiler will use an invokevirtual or a jump to execute the next iteration. In a final method it can always use a jump. In a non-final method it must use the execution frame to determine the class associated with itself, then compare that to the class of this. If they are the same, jump; if they are different, the code must invokevirtual to call the correct method.

That determination can be made once in the preamble of the method, or it can happen at the recursion call site. Which technique should be used depends on the expected number of times it will be executed. Tail recursion is an optimization of stack usage; it normally isn't meant to optimize speed. When a compiler generates tail recursion for a non-final method, it usually puts the check at the jump/call site. It is also possible to write two different sets of bytecode, one with tail recursion and one with normal recursion, then select which codepath to execute when the method is first entered.

There are a few other options, such as always using a call but unwinding the stack if it gets too deep, kind of like an exception. I know that is/was a serious contender for the HotSpot compiler.
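That stack-unwinding idea is essentially a trampoline, and the Scala standard library already offers one in scala.util.control.TailCalls; it even handles mutual recursion, which @tailrec cannot. A sketch:

```scala
import scala.util.control.TailCalls._

object TrampolineDemo extends App {
  // each step returns a TailRec value instead of growing the call stack;
  // result then runs the suspended steps in a loop
  def even(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(odd(n - 1))

  def odd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(even(n - 1))

  println(even(1000001).result)  // false, with constant stack usage
}
```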

You’ve got it backwards. I’m not talking about using invokespecial in optimized method. I’m talking about invokespecial calling into optimized method.

What the Scala compiler could do is to change this:

class Parent {
  def method(n: Int): Unit = {
    println("Entering Parent.method")
    if (n > 0) this.method(n - 1) // explicit "this." !!!
  }
}

class Child extends Parent {
  override def method(n: Int): Unit = {
    println("Entering Child.method")
    super.method(n)
  }
}

object Main extends App {
  println("Checking Parent")
  new Parent().method(3)
  println(".")
  println("Checking Child")
  new Child().method(3)
}

into this:

class Parent {
  def method(n: Int): Unit = {
    // ensure that invokevirtual will call back to this implementation
    if (getClass
          .getDeclaredMethod("method", classOf[Int])
          .getDeclaringClass eq classOf[Parent])
      return restricted_method(n)
    println("Entering Parent.method")
    if (n > 0) this.method(n - 1) // explicit "this." !!!
  }

  // implementation copied from original method and optimized with the
  // assumption of invokevirtual behaviour checked in original method
  @annotation.tailrec
  private def restricted_method(n: Int): Unit = {
    println("Entering Parent.method")
    if (n > 0) this.restricted_method(n - 1) // explicit "this." !!!
  }
}

class Child extends Parent {
  override def method(n: Int): Unit = {
    println("Entering Child.method")
    super.method(n)
  }
}

object Main extends App {
  println("Checking Parent")
  new Parent().method(3)
  println(".")
  println("Checking Child")
  new Child().method(3)
}

Now it can do tail recursion optimization on non-final methods, but the overhead is high: not only do we have method duplication, but the reflective check on method entry is costly.

If the method is non-overridable (i.e. final) then compiler doesn’t have to do reflective check nor does it need to duplicate method body before optimization.
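scalac makes exactly this distinction visible through @tailrec; a sketch (the class and method names are made up):

```scala
import scala.annotation.tailrec

class Counter {
  // accepted: final, so the self-call is statically bound and becomes a jump
  @tailrec
  final def countDown(n: Int): Int =
    if (n <= 0) n else countDown(n - 1)

  // rejected if uncommented: scalac reports the method "is neither
  // private nor final so can be overridden"
  // @tailrec
  // def open(n: Int): Int =
  //   if (n <= 0) n else open(n - 1)
}

object TailrecDemo extends App {
  println(new Counter().countDown(1000000))  // 0, without a stack overflow
}
```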

If you have this class Foo then foo1 and foo2 should be equivalent, right?

class Foo { 
  def foo1(a: Int): Int = 
    if (a < 10) foo1(a * 2) else a

  def foo2(a: Int): Int = {
    var r = a
    while (r < 10) {
      r = r * 2
    }
    r
  }
}

However, if a user of your class defines this class Bar, foo1 and foo2 will no longer produce the same results. This proves that you cannot transform foo1 into foo2 when there's a possibility that the method will ever be overridden. This is something that cannot reasonably be changed.

class Bar extends Foo {
  override def foo1(a: Int) = 
    if (a < 10) super.foo1(a - 1) else a

  override def foo2(a: Int) = 
    if (a < 10) super.foo2(a - 1) else a
}
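Tracing the dispatch makes the divergence concrete; a sketch with the Foo and Bar definitions repeated for self-containment:

```scala
class Foo {
  def foo1(a: Int): Int =
    if (a < 10) foo1(a * 2) else a  // virtual self-call

  def foo2(a: Int): Int = {
    var r = a
    while (r < 10) r = r * 2
    r
  }
}

class Bar extends Foo {
  override def foo1(a: Int): Int =
    if (a < 10) super.foo1(a - 1) else a

  override def foo2(a: Int): Int =
    if (a < 10) super.foo2(a - 1) else a
}

object DivergeDemo extends App {
  println(new Bar().foo1(5))  // 14: Foo.foo1's self-call dispatches back into Bar.foo1
  println(new Bar().foo2(5))  // 16: Foo.foo2's loop never re-enters Bar
}
```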