Breaking out of loop/fold/etc

jimka · June 5, 2019, 9:33am

I discovered a way to break out of a loop such as fold. The compiler does not complain, and it appears to give the correct results. I’d love to hear anyone’s comments.

  def testSS(nums:List[Int]):Int = {
    nums.fold(1) { (acc, num) => {
      if (num == 0)
        return 0
      else {
        println(s"num=$num")
        num * acc
      }
    }
    }
  }

It is a pattern which only works in the special case where there’s a named function to return from. For example, I don’t think it could work if I have concentric looks and I want to break out of one, but not the other. And I also suspect, this pattern prevents the compiler from inlining the function testSS. Nevertheless, when I test it, it gives the correct results.

    println("result = "+testSS(List(1,2,3,0,4,5,6,7)))
    println("result = "+testSS(List(1,2,3,4,5,6,7)))

outputs the following.

num=1
num=2
num=3
result = 0
num=1
num=2
num=3
num=4
num=5
num=6
num=7
result = 5040

martijnhoekstra · June 5, 2019, 9:55am

return immediately returns the enclosing method. This may be implemented by throwing a special exception (that may nevertheless be caught by user code).

The inimitable tpolecat writes at tpolecat (paraphrased)

using return is bad, and you should feel bad.

OK, he doesn’t say that, but it’s remarkably close.

jimka · June 5, 2019, 10:03am

I read the Scala specification Section 6.20 for return. It says that return does not necessarily return from the enclosing method, but rather returns from the “inner-most enclosing named method or function”. I.e., return cannot be used to accidentally or intentionally return only from an unnamed block or anonymous function.

jimka · June 5, 2019, 10:16am

@martijnhoekstra, I took a look at the link you recommended. It is somewhat curious. His argument is basically that return can always be avoided even by making the code more obfuscated. He basically recommends, converting fold based code to tail-recursion.

And he even goes so far as to give an example of a use of return which is very readable, and replaces it with an obfuscated tail call which is about 2x the amount of code.

Funny enough, my experience is that trying to rewrite folds as tail recursion makes code hard to read.

It is difficult for me to understand that a well-defined, supported, dependable, specified feature of the language, should never be used, and developers should go through great lengths to avoid it.

That being said, the blogger does indeed make a good and valid point that users new to Scala have a desire to overuse or misuse the feature. I admit that I don’t have a counter argument to this point.

In my opinion, a good programmer struggles to balance between performance, readability, maintainability, scalability etc.

martijnhoekstra · June 5, 2019, 10:33am

The spec is rather vague on what, exactly, is a method or a function, either named or not.

Common usage is converging on methods being any def, and functions being any value with type Function, regardless of what/how they are named.

The spec doesn’t exactly align with that, but what it does exactly align with, I don’t think anyone knows for sure.

If I have val x = (i: Int) => return i.toString I would argue that x is the name for that function, and it’s a named function.

I don’t think the author would vehemently disagree with saying that return sometimes has uses, but that it’s an advanced language feature used in edge cases, and not for the faint-hearted. The post is aimed mostly at newcomers who are used to return being the only way to return a value/control.

There is some related discussion at scala - Abort early in a fold - Stack Overflow

That discuss the several techniques: Writing out the recursion, using return, and the possibility of lazy evaluation of a right fold (that the scala stdlib doesn’t have)

PeterE · June 5, 2019, 11:50am

Also consider

exists, find, collectFirst - these break out for you
scala.util.control.Breaks
…

jimka · June 5, 2019, 11:52am

I’m tempted (motivated) to learn scalaz simply to have access to lazy fold. At least I’ve heard the claim that such is supported in scalaz.

jimka · June 5, 2019, 11:54am

yes, but unless I’m mistaken, those examples find return the domain value which satisfies a condition and exists simply returns a Boolean. These could of course be used in conjunction with a var which is set as a side effect. Not convinced that’s better.

Sorry, what’s findFirst? I didn’t find it, except in the java library.

martijnhoekstra · June 5, 2019, 12:10pm

It is supported in scalaz, and also in cats.

Both have good learning resources, for example https://leanpub.com/fpmortals or https://underscore.io/books/scala-with-cats/

curoli · June 5, 2019, 5:13pm

Why not just use a while loop?

jimka · June 5, 2019, 5:24pm

You mean a while loop and a var ?

curoli · June 5, 2019, 7:09pm

Yes.

Ichoran · June 5, 2019, 7:38pm

The bottom line is that return is fine if you’re careful with it, but you shouldn’t use it just 'cause you can. Have some reason.

The problem with return is that it is nonlocal control flow, and it implements the nonlocality by throwing stackless exceptions and it will try even if the result is absurd.

Take, for instance:

def makeSomething(foo: Foo): Future[Thing] =
  Future {
    complicatedThing.fold(b){ (acc, x) =>
      if (canStopEarly(x)) return Future{ simple(x) }
      acc = complex(acc, x)
    }
  }

The intent seems clear.

This is completely broken.

When you call the method, it will generate the future and happily schedule it for execution later. The method has a hook to catch the exception generated to pass back the return at the time it’s called, but that’s not when the return happens. It happens later, in some other random thread, while the Future is executing. The thread, sometime later, dies a gruesome death with a completely uninformative error message since it doesn’t know how to deal with the exception and, with no stack, there isn’t any information to give.

Okay, so, return can get you into trouble. But what if you’re single-threaded, not catching all exceptions (i.e. no try { stuff() } catch { case t: Throwable => default() }), and you can really simplify things by using a nonlocal return?

Then do, absolutely! Unless you’re working in a code base where the style is to simply never use them. Like with most things that have observable side-effects, it’s generally safest to use it in contexts where the caller cannot in principle detect what’s happening.

So for instance,

def sumTo100(xs: Seq[Int]): Int =
  xs.fold(0){ (sum, x) => if (sum > 100) return 100 else sum+x }

is fine.

You can use breakable/break, but it is implemented using the same mechanism, so it has the same dangers. The advantage is only that you don’t have to define a method to break out of, and since you can define a method anywhere, I don’t really see the point of it.

Note that if you’re trying to stop early, there are other options also that are faster, like

def sumTo100(xs: Seq[Int]): Int = {
  def to100(it: Iterator[Int], sum: Int = 0): Int =
    if (sum > 100) 100
    else if (it.hasNext) to100(it, sum + it.next)
    else sum
  to100(xs.iterator)
}

These things sometimes avoid the return also; sometimes they don’t. They’re often worth writing when speed not just logic is important. For instance, the non-recursive indexed version is

def sumTo100(xs: Array[Int]): Int = {
  var sum, i = 0
  while (i < xs.length) {
    sum += xs(i)
    if (sum > 100) return 100
  }
  sum
}

which I would consider eminently good code, albeit not very high-level.

SethTisue · June 6, 2019, 2:51pm

Nonlocal returns are about to be deprecated in Dotty; reference: https://github.com/lampepfl/dotty/pull/6623

Ichoran · June 6, 2019, 10:16pm

Even if they’re deprecated, there’s the same issue with breakable/break and the new library equivalent that carries a value (i.e. is just like return).

It will be harder to do them by accident, though, which is perhaps a good thing.

jimka · June 7, 2019, 6:51am

deprecating non-local returns? That’s a shame. Are they being replaced with first class continuations? If so I’m all for it.

PeterE · June 7, 2019, 12:28pm

Hi, you asked for alternatives to “Breaking out of loop/fold/etc” that is what I replied to. I did not advise you to use a var.
Sorry I meant collectFirst - just corrected.

Ichoran · June 7, 2019, 5:07pm

They’re just being moved from a language feature to a library feature. Mechanistically,

def foo(): Foo = xs.fold(z)((a, b) => if (p(a)) return y else a ~ b)

is just doing

def foo(): Foo =
  try { xs.fold(z)((a, b) => if (p(a)) throw Return(y) else a ~ b }
  catch { case Return(y) => y }

where Return is a stackless exception, so you may as well switch it to a library method to make it more obvious without requiring quite so much boilerplate:

def foo() = returnable[Foo]{
  xs.fold(z)((a, b) => if (p(a)) returnThrow(y) else a ~ b
}

The danger in this approach is that nesting might be handled wrong (and it needs to be inline or it will add overhead), but otherwise it’s equivalent.

jimka · July 13, 2019, 2:10pm

It is not clear whether delimited continuations will be available in scala 3, but if they are then there is no need for block/return because it can be accomplished with reset/shift as shown here.

  def reduceWithEscape[A](objects:TraversableOnce[A], init:A, nullify:A, f:(A,A)=>A):A = {
    import scala.util.continuations._

    reset {
      shift { cc: (A => A) =>
        objects.fold(init) { (acc: A, obj: A) =>
          if (acc == nullify)
            cc(nullify) // effectively return the value of nullify from shift, aborting the fold
          else
            f(acc, obj)
        }
      }
    }
  }

tyohDeveloper · July 13, 2019, 4:52pm

Poisoning the loop (above) is often/usually the best approach. Especially if it will stop downstream computation in a large process or ensure a result of a larger process is likewise dead. Like a non-signaling NaN in numerical programming.

The general mechanism is a try/catch and throw. It that sound like too much overhead for your problem, the implementation is likely too complex for the problem at hand. Rewriting the code, using small methods rather than inline code might expose the solution. Also using recursive calls to loop will help. Scala doesn’t implement tail recursion properly*, so you might need to write a private method to achieve the general loop and call that from the general method.

Signal the hate. Tail recursion is on “this”, it doesn’t change the external contract of the method, so the whole “only final method can be tail-recursive” is a bug.