Motivating example for extending `for` comprehension

jimka · May 12, 2023, 6:42am

The school where I work has asked me to make a single lecture in a class about Category Theory for Programmers (CT4P). My idea is to first talk about Python. Why? Not because we love Python, but rather because Python is the lingua franca—all the students have already programmed in that language. I’ll use the example here A monda in Python. After that simple example, I’d like to talk about extending the for comprehension in Scala by defining a monad.

The problem is that I don’t know of any motivating examples of why you might want to do this. There are certainly useful cases of non-trivial for comprehension extensions, but they are all already built into the standard library. List, LazyList, Vector, Option, Either, etc. Therefore, there is no motivation for the programmer to duplicate this effort.

I’d like to have a simple enough example that the CT4P student can understand even though it is the first time the student has seen Scala syntax, but the example should be something where the student will respond, “Yea, that is indeed useful”.

Any suggestions?

sangamon · May 12, 2023, 10:31am

You may find some inspiration in the Cats documentation, e.g. the book.

Besides collections/containers, the natural domain of monads is computation plans: plain functions, Eval, Reader/Kleisli, Writer, State, IO,… (Some might argue that containers are just a subset of this group, and at least as far as lazily evaluated collections are concerned, they would have a point.)

Note that just adding #map()/#flatMap() methods to a type in order to “duck type” into for expression usage does not unlock the full power of monads. This happens once you have a proper Monad type class so you can write generic code that applies to all types that “have” a monad (and integrates with other abstractions from this stack).

Shameless plug: Some time ago I’ve written a short series of posts that tries to naturally evolve this kind of abstraction while writing a naive CSV parser from scratch. It doesn’t cover monad (other than in passing for Either based error handling), as it’s entirely applicative based. However, one could imagine a requirement that a column parser needs to be dynamically selected depending on the parsed content of a previous column, triggering introduction of monad.

crater2150 · May 12, 2023, 12:09pm

In our FP lecture, we also use Parser as a motivational example for using the Monad abstraction. It’s not builtin and also not a collection, so it may be suited for your use case too. The typical for-comprehension for a monadic parser is IMO more readable than other parser code, so you can directly see the benefits of comprehensions. Also, writing a Parser monad is relatively straigth forward compared to other non-builtin monads like IO.

This is a part of our example code (from a real project), which parses configuration entries that turn on or off lamps at set times (e.g. + MO 08:00).

  def entry: Parser[Entry] = for
    dir     <- direction
    _       <- whitespace
    day     <- weekDay
    _       <- whitespace
    hours   <- int 
    _       <- string(":")
    minutes <- int
  yield Entry(dir, day, hours, minutes)

(this example of course wouldn’t require a monad, but without flatMap you can’t use for )

sangamon · May 12, 2023, 1:59pm

Another possible example, inspired by the other recent thread, might be a monadic generator, just like scalacheck’s Gen. For motivation, this probably already requires the insight that mutable state is an issue, but it may be even easier to grasp than the parser.

Note that quite some of the “concrete” use cases mentioned so far are variants of the State monad at their core, with different state types and some bells and whistles added: parser (remaining input/position), generator (RNG/seed) and IO (“the state of the universe”).

sangamon · May 12, 2023, 2:41pm

Addendum: …and the generator example more directly motivates the need for monadic behavior than a parser. A language would need to be somewhat context dependent to actually require a monadic parser, whereas monadic chaining will be required quite naturally with generators, e.g.: Generate a number in the range x, then generate a list of strings with this exact size.

jducoeur · May 14, 2023, 3:58pm

There are scads of examples (mine is RequestM, a Future-like monad for use with Akka), but honestly – for a single-lecture introduction, I think it’s too deep in the weeds to implement a Monad. I mean, in ten years of full-time, high-level Scala work I’ve done that exactly twice, and it took days of work to get it right each time. It’s pretty rare to implement a Monad (and often tricky), but extremely common to use them, so I focus on the usage side.

When I teach introductory CT4P as a single talk (which I do occasionally, and will be doing again in a few months at work), I pretty much always follow roughly this outline:

Introduce Functors: the ‘map()’ function, and some examples of why it is helpful to be able to abstract over Functor.
Introduce Monoids (which are slightly less common, but really easy to explain, so they’re good to cover early), and an example or two of generalizing there.
Describe ‘List’, ‘Option’, and ‘Future’. Show each in a nested ‘for’ comprehension. Start with ‘List’, since people are usually familiar with that from other languages. Finish with ‘Future’, because it isn’t a container, and thus shows the breadth of the concept. (Yes, ‘Future’ is ill-behaved, but for an introductory class it’s usually appropriate.)
Introduce Monads as the concept of “nested” above, illustrating that the ability to “squash” nested instances (the ‘flatten’ function) is the defining characteristic of a monad.
Desugar those ‘for’ comprehensions, showing the underlying ‘map’/‘flatMap’ functions and how they combine to make this tick. (I usually don’t bother with ‘withFilter’ - it tends to be beside the point.)
If I have time and a sufficiently knowledgeable class, show a higher-level example of an operation that is context-bounded on the Cats ‘Monad’ type class, to show how you can abstract over Monads to write generally-useful utility functions.

That’s a pretty chewy lecture all by itself, just covering the usage side. And IMO it’s helpful to cover Functor and Monoid, not just Monads, to emphasize that Monads are just one detail of CT, and not the only important bit.

(ETA: if I have time, I sometimes also talk about ‘Applicative’, as a nice compare-and-contrast with ‘Monad’.)

sangamon · May 16, 2023, 1:13pm

Fully agree. I somehow missed the “single lecture” constraint - there’s only so much you can do in this time window…

Just to note, subjectively and with hindsight, a focus on the “flatten” aspect and the omission of Applicative from the discussion have rather been an obstacle for me - YMMV. What eventually made it click for me was to view the whole stack as computations over an opaque context.

Functor is for adding computation steps to isolated instances of the context.
Applicative is for merging independent instances.
Monad is for chaining dependent instances.

jimka · May 16, 2023, 2:51pm

Actually, another motivating example which I should have thought of earlier is how to use foldM to iterate over an infinite list until some condition is satisfied. Everyone who has done numerical computation has wanted to do this, and foldM gives an exception-free solution.

It is a problem which is easy to motivate, and everyone will see how the solution works. It might be difficult to explain what this has to do with a monad though.

sangamon · May 17, 2023, 2:37pm

foldM is interop between Monad and Foldable, and short-circuiting in the infinite case hinges on “the foldRight implementation for F and the tailRecM implementation for G [being] sufficiently lazy”. The underlying mechanics feels way too complex for a motivational Monad example.

jimka · May 17, 2023, 2:39pm

Is there a way I can convert this function to be tail recursive? I know there are techniques such as trampolining, but I don’t know how to trampoline in this case.

  def tailRecWithShortCircuit[T](seq: Seq[T], 
                                 f: (T, T) => Either[T,T], 
                                 z: T
                                ): Either[T,T] = {
    def loop(it: Iterator[T], x: Either[T,T]): Either[T,T] = {
      if (it.hasNext) {
        val n = it.next
        x.flatMap(a => loop(it, f(a, n)))
      }
      else
        x
    }

    loop(seq.iterator, Right(z))
  }

BalmungSan · May 17, 2023, 2:50pm

Just pattern match instead of using flatMap
Also, no need to use the Iterator, just pattern match on the List as well.

jimka · May 17, 2023, 2:52pm

yes that would work but that avoids the goal to compose using flatMap and won’t generalize to other implementations of flatMap, which would be the next iteration of the code.

jimka · May 17, 2023, 2:55pm

Yes, List is just a temporary implementation. Hope to change it to something more general later.