Breaking out of loop/fold/etc

#1

I discovered a way to break out of a loop such as fold. The compiler does not complain, and it appears to give the correct results. I’d love to hear anyone’s comments.

  def testSS(nums:List[Int]):Int = {
    nums.fold(1) { (acc, num) => {
      if (num == 0)
        return 0
      else {
        println(s"num=$num")
        num * acc
      }
    }
    }
  }

It is a pattern which only works in the special case where there’s a named function to return from. For example, I don’t think it could work if I have concentric looks and I want to break out of one, but not the other. And I also suspect, this pattern prevents the compiler from inlining the function testSS. Nevertheless, when I test it, it gives the correct results.

    println("result = "+testSS(List(1,2,3,0,4,5,6,7)))
    println("result = "+testSS(List(1,2,3,4,5,6,7)))

outputs the following.

num=1
num=2
num=3
result = 0
num=1
num=2
num=3
num=4
num=5
num=6
num=7
result = 5040
#2

return immediately returns the enclosing method. This may be implemented by throwing a special exception (that may nevertheless be caught by user code).

The inimitable tpolecat writes at https://tpolecat.github.io/2014/05/09/return.html (paraphrased)

using return is bad, and you should feel bad.

OK, he doesn’t say that, but it’s remarkably close.

1 Like
#3

I read the Scala specification Section 6.20 for return. It says that return does not necessarily return from the enclosing method, but rather returns from the “inner-most enclosing named method or function”. I.e., return cannot be used to accidentally or intentionally return only from an unnamed block or anonymous function.

#4

@martijnhoekstra, I took a look at the link you recommended. It is somewhat curious. His argument is basically that return can always be avoided even by making the code more obfuscated. He basically recommends, converting fold based code to tail-recursion.

And he even goes so far as to give an example of a use of return which is very readable, and replaces it with an obfuscated tail call which is about 2x the amount of code.

Funny enough, my experience is that trying to rewrite folds as tail recursion makes code hard to read.

It is difficult for me to understand that a well-defined, supported, dependable, specified feature of the language, should never be used, and developers should go through great lengths to avoid it.

That being said, the blogger does indeed make a good and valid point that users new to Scala have a desire to overuse or misuse the feature. I admit that I don’t have a counter argument to this point.

In my opinion, a good programmer struggles to balance between performance, readability, maintainability, scalability etc.

#5

The spec is rather vague on what, exactly, is a method or a function, either named or not.

Common usage is converging on methods being any def, and functions being any value with type Function, regardless of what/how they are named.

The spec doesn’t exactly align with that, but what it does exactly align with, I don’t think anyone knows for sure.

If I have val x = (i: Int) => return i.toString I would argue that x is the name for that function, and it’s a named function.

I don’t think the author would vehemently disagree with saying that return sometimes has uses, but that it’s an advanced language feature used in edge cases, and not for the faint-hearted. The post is aimed mostly at newcomers who are used to return being the only way to return a value/control.

There is some related discussion at https://stackoverflow.com/questions/12892701/abort-early-in-a-fold

That discuss the several techniques: Writing out the recursion, using return, and the possibility of lazy evaluation of a right fold (that the scala stdlib doesn’t have)

#6

Also consider

  • exists, find, collectFirst - these break out for you
  • scala.util.control.Breaks
#7

I’m tempted (motivated) to learn scalaz simply to have access to lazy fold. At least I’ve heard the claim that such is supported in scalaz.

#8

yes, but unless I’m mistaken, those examples find return the domain value which satisfies a condition and exists simply returns a Boolean. These could of course be used in conjunction with a var which is set as a side effect. Not convinced that’s better.

Sorry, what’s findFirst? I didn’t find it, except in the java library.

#9

It is supported in scalaz, and also in cats.

Both have good learning resources, for example https://leanpub.com/fpmortals or https://underscore.io/books/scala-with-cats/

1 Like
#10

Why not just use a while loop?

#11

You mean a while loop and a var ?

#12

Yes.

#13

The bottom line is that return is fine if you’re careful with it, but you shouldn’t use it just 'cause you can. Have some reason.

The problem with return is that it is nonlocal control flow, and it implements the nonlocality by throwing stackless exceptions and it will try even if the result is absurd.

Take, for instance:

def makeSomething(foo: Foo): Future[Thing] =
  Future {
    complicatedThing.fold(b){ (acc, x) =>
      if (canStopEarly(x)) return Future{ simple(x) }
      acc = complex(acc, x)
    }
  }

The intent seems clear.

This is completely broken.

When you call the method, it will generate the future and happily schedule it for execution later. The method has a hook to catch the exception generated to pass back the return at the time it’s called, but that’s not when the return happens. It happens later, in some other random thread, while the Future is executing. The thread, sometime later, dies a gruesome death with a completely uninformative error message since it doesn’t know how to deal with the exception and, with no stack, there isn’t any information to give.

Okay, so, return can get you into trouble. But what if you’re single-threaded, not catching all exceptions (i.e. no try { stuff() } catch { case t: Throwable => default() }), and you can really simplify things by using a nonlocal return?

Then do, absolutely! Unless you’re working in a code base where the style is to simply never use them. Like with most things that have observable side-effects, it’s generally safest to use it in contexts where the caller cannot in principle detect what’s happening.

So for instance,

def sumTo100(xs: Seq[Int]): Int =
  xs.fold(0){ (sum, x) => if (sum > 100) return 100 else sum+x }

is fine.

You can use breakable/break, but it is implemented using the same mechanism, so it has the same dangers. The advantage is only that you don’t have to define a method to break out of, and since you can define a method anywhere, I don’t really see the point of it.

Note that if you’re trying to stop early, there are other options also that are faster, like

def sumTo100(xs: Seq[Int]): Int = {
  def to100(it: Iterator[Int], sum: Int = 0): Int =
    if (sum > 100) 100
    else if (it.hasNext) to100(it, sum + it.next)
    else sum
  to100(xs.iterator)
}

These things sometimes avoid the return also; sometimes they don’t. They’re often worth writing when speed not just logic is important. For instance, the non-recursive indexed version is

def sumTo100(xs: Array[Int]): Int = {
  var sum, i = 0
  while (i < xs.length) {
    sum += xs(i)
    if (sum > 100) return 100
  }
  sum
}

which I would consider eminently good code, albeit not very high-level.

1 Like
#14

Nonlocal returns are about to be deprecated in Dotty; reference: https://github.com/lampepfl/dotty/pull/6623

#15

Even if they’re deprecated, there’s the same issue with breakable/break and the new library equivalent that carries a value (i.e. is just like return).

It will be harder to do them by accident, though, which is perhaps a good thing.

#16

deprecating non-local returns? That’s a shame. Are they being replaced with first class continuations? If so I’m all for it.

#17

Hi, you asked for alternatives to “Breaking out of loop/fold/etc” that is what I replied to. I did not advise you to use a var.
Sorry I meant collectFirst - just corrected.

#18

They’re just being moved from a language feature to a library feature. Mechanistically,

def foo(): Foo = xs.fold(z)((a, b) => if (p(a)) return y else a ~ b)

is just doing

def foo(): Foo =
  try { xs.fold(z)((a, b) => if (p(a)) throw Return(y) else a ~ b }
  catch { case Return(y) => y }

where Return is a stackless exception, so you may as well switch it to a library method to make it more obvious without requiring quite so much boilerplate:

def foo() = returnable[Foo]{
  xs.fold(z)((a, b) => if (p(a)) returnThrow(y) else a ~ b
}

The danger in this approach is that nesting might be handled wrong (and it needs to be inline or it will add overhead), but otherwise it’s equivalent.