Using flatMap to chain conditional operations

dubaut · June 10, 2020, 7:41am

Let’s say I write a validation that turns user input into a case class.

final case class User(name: String, email: String, age: Int)

def validateUser(name: String, email: String, age: Int): Either[String, User] = {
  {
    if (name == null) {
      Left("name is missing")
    } else {
      Right(User(name, _, _))
    }
  } flatMap {user => 
    if (email == null) {
      Left("email is missing")
    } else {
      Right(user(email, _))
    }
  } flatMap { user => 
    if (age < 18) {
      Left("invalid age value")
    } else {
      Right(user(age))
    }
  }
}

In validateUser I chain the validation steps using flatMap. To me this is very convenient, but I am wondering if this is a good style or if this is an “abuse” of flatMap?

martijnhoekstra · June 10, 2020, 8:39am

This is as intended.

As you get more flatMaps, for (which is syntactic sugar for a sequence of flatMap/map calls) becomes more and more attractive. Rewritten with for:

def validateUser(name: String, email: String, age: Int): Either[String, User] = 
    for {
      validName <- if (name == null) Left("name is missing") else Right(name)
      validEmail <- if (email == null) Left("email is missing") else Right(email)
      validAge <- if (age < 18) Left("invalid age") else Right(age)
    } yield User(validName, validEmail, validAge)

As a side note, in Scala null isn’t commonly used, and null checks are not normally done. Optional information is encoded as Option[A], but in this case you should just have an A that is never null. That is exactly what you validate here, and you no longer have to validate it if you don’t use null.
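A minimal sketch of that idea, assuming the nulls only come from some outside source (the fromLegacyApi helper below is hypothetical, made up for illustration): wrap the value in Option once at the boundary, and the rest of the code never has to check for null.

```scala
// Hypothetical boundary: stands in for a Java-style API that may return null.
def fromLegacyApi(key: String): String =
  if (key == "name") "Alice" else null

// Option.apply turns null into None and any other value into Some(value).
val name: Option[String] = Option(fromLegacyApi("name"))
val email: Option[String] = Option(fromLegacyApi("email"))

// Downstream code works with Option and needs no null checks.
val greeting: String = name.map(n => s"Hello, $n").getOrElse("Hello, stranger")
```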

Jasper-M · June 10, 2020, 8:42am

I would say it’s a perfectly valid use of flatMap, though usually you’d use a for-comprehension for this as it makes the code a bit cleaner. And the use of the partially applied function is a little unconventional.

I think usually you’d see it written more or less like the following. Depending on the complexity of the validation, each step could perhaps be factored out into its own function.

def validateUser(name: String, email: String, age: Int): Either[String, User] =
  for {
    n <- Option(name).toRight("name is missing")
    e <- Option(email).toRight("email is missing")
    a <- Either.cond(age >= 18, age, "invalid age")
  } yield User(n, e, a)
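Factoring each step out into its own function, as mentioned above, might look roughly like this. The names validateName, validateEmail and validateAge are only illustrative, and the age rule (18 or older) is taken from the original question.

```scala
final case class User(name: String, email: String, age: Int)

// Each validation step is a small function returning Either[String, A].
def validateName(name: String): Either[String, String] =
  Option(name).toRight("name is missing")

def validateEmail(email: String): Either[String, String] =
  Option(email).toRight("email is missing")

// Either.cond(test, right, left) yields Right(right) when test is true.
def validateAge(age: Int): Either[String, Int] =
  Either.cond(age >= 18, age, "invalid age")

// The for-comprehension stops at the first Left it encounters.
def validateUser(name: String, email: String, age: Int): Either[String, User] =
  for {
    n <- validateName(name)
    e <- validateEmail(email)
    a <- validateAge(age)
  } yield User(n, e, a)
```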

You might also want to look into a data type like cats.data.Validated which can accumulate all errors instead of bailing out on the first error.
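To illustrate what "accumulate all errors" means, here is a hand-rolled stand-in that uses only the standard library. It is not the cats API, just a sketch of the behaviour: every check runs, and all error messages are collected instead of only the first. The name validateUserAll is made up for this example.

```scala
final case class User(name: String, email: String, age: Int)

// All three checks run independently; nothing short-circuits.
def validateUserAll(name: String, email: String, age: Int): Either[List[String], User] = {
  val n = Option(name).toRight("name is missing")
  val e = Option(email).toRight("email is missing")
  val a = Either.cond(age >= 18, age, "invalid age")

  (n, e, a) match {
    case (Right(vn), Right(ve), Right(va)) => Right(User(vn, ve, va))
    case _ =>
      // Keep the error of every failed check, in field order.
      Left(List(n, e, a).collect { case Left(err) => err })
  }
}
```

With Either in a for-comprehension, the same bad input would only report "name is missing".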

dubaut · June 10, 2020, 8:43am

To me, for comprehensions are black magic. I just cannot get my brain around what they are and how they work.

Regarding null: I know that, I used it just for the sake of a simple example.

dubaut · June 10, 2020, 8:48am

I came up with this because it is an easy way to propagate the results through the flatMap chain. Is there some deeper problem I did not see or is it just unconventional?

Jasper-M · June 10, 2020, 9:11am

I don’t think there’s any problem with it. But once you write it as a for-comprehension it loses most of its appeal. And if you’d want to validate in parallel with Validated instead of sequentially with Either you definitely don’t want to use the partially applied functions anymore.

martijnhoekstra · June 10, 2020, 9:41am

for just translates to flatMap (and a final map in the yield)

What I wrote gets rewritten by the compiler to

def validateUser(name: String, email: String, age: Int): Either[String, User] =
   (if (name == null) Left("name is missing") else Right(name)).flatMap(
   validName => (if (email == null) Left("email is missing") else Right(email)).flatMap(
   validEmail => (if (age < 18) Left("invalid age") else Right(age)).map(
   validAge => User(validName, validEmail, validAge))))

For the longest time I preferred the flatMaps because I had a hard time trusting myself to actually understand what was going on if I wrote it as for. That disappeared far quicker than I expected when as an experiment I decided to just practice writing everything as for.

It’s entirely up to you of course, but at least practising writing for when you have 3 or more generators may quickly put you in a situation where you prefer it.

sangamon · June 10, 2020, 11:56am

Just as an aside: Threading partially applied functions this way would be rather idiomatic when using Applicative. In Haskell this is straightforward:

validateUser :: String -> String -> Int -> Either String User
validateUser n e a =
  User
    <$> eitherIf "name is missing" (not . null) n   -- Either String (String -> Int -> User)
    <*> eitherIf "email is missing" (not . null) e  -- Either String (Int -> User)
    <*> eitherIf "invalid age" (> 17) a             -- Either String User

When using cats’ Applicative, one would rather resort to the convenience #mapN() method:

import cats.instances.either._
import cats.syntax.apply._

(
  Option(name).toRight("name is missing"),
  Option(email).toRight("email is missing"),
  Either.cond(age > 17, age, "invalid age")
).mapN(User)

Using monadic #flatMap() (with or without cats) is perfectly fine, of course, but then I’d rather use it via a for expression (and without partially applied functions), as others have suggested already.

philipschwarz · June 10, 2020, 8:01pm

I hope this will help: https://www.slideshare.net/pjschwarz/for-and-flatmap-a-close-look (download for best quality).

philipschwarz · June 10, 2020, 8:22pm

If anyone who is relatively new to Haskell is interested in knowing more about @sangamon’s validation example using map (<$>) and apply (<*>), then this might be of interest: https://www.slideshare.net/pjschwarz/applicative-functor-part-2

philipschwarz · June 10, 2020, 8:30pm

hello @sangamon, where is eitherIf defined?

sangamon · June 10, 2020, 9:40pm

I couldn’t find any matching signature on Hoogle, so I rolled my own. The missing parts for the above snippet:

data User = User { name :: String, email :: String, age :: Int } deriving Show

eitherIf :: b -> (a -> Bool) -> a -> Either b a
eitherIf b p a = if p a then Right a else Left b

…and in case anybody is wondering: The Haskell code doesn’t check for null (which doesn’t exist in Haskell), but for the empty string instead.
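For anyone who wants to play with the same helper in Scala, a rough counterpart might look like the following. This is only a sketch: it uses a for expression rather than <*>, the type arguments are spelled out to keep inference simple, and like the Haskell version it checks for the empty string rather than null.

```scala
final case class User(name: String, email: String, age: Int)

// Right(a) if the predicate holds for a, otherwise Left(error).
def eitherIf[E, A](error: E)(p: A => Boolean)(a: A): Either[E, A] =
  if (p(a)) Right(a) else Left(error)

def validateUser(name: String, email: String, age: Int): Either[String, User] =
  for {
    n <- eitherIf[String, String]("name is missing")(_.nonEmpty)(name)
    e <- eitherIf[String, String]("email is missing")(_.nonEmpty)(email)
    a <- eitherIf[String, Int]("invalid age")(_ > 17)(age)
  } yield User(n, e, a)
```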

philipschwarz · June 10, 2020, 10:34pm

thanks

charpov · June 11, 2020, 3:19pm

If you use IntelliJ IDEA, there’s a “desugar for comprehension” action that shows the map/flatMap equivalent of a for expression. It can help figure out what’s really going on.

bjornregnell · June 11, 2020, 3:48pm

To me, for comprehensions are black magic

Here is an in-depth explanation of the de-sugaring of for comprehensions by Odersky et al.:
https://www.artima.com/pins1ed/for-expressions-revisited.html#23.4

dubaut · June 11, 2020, 5:18pm

Thank you, I will look into that!

I believe my problem with understanding for comprehensions is that I don’t get the naming. Neither do I understand why flatMap is called flatMap, nor why for comprehensions are called that.

I mean, I get the map part, but what is flat? Flat in what sense?

And regarding the other: What does for have to do with all that? I understand the for as in for loop and the for in forEach but I don’t get what those all have to do with for comprehensions.

philipschwarz · June 11, 2020, 5:42pm

flatMap is called that way because it first maps and then flattens

scala> assert( Some(Some(3)).flatten == Some(3) )

scala> assert( "3".toIntOption == Some(3) )

scala> assert( Some("3").map(_.toIntOption) == Some(Some(3)) )

scala> assert( Some("3").map(_.toIntOption).flatten == Some(3) )

scala> assert( Some("3").flatMap(_.toIntOption) == Some(3) )
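The same "map, then flatten" reading applies to collections. A small sketch with List (the value names are arbitrary):

```scala
val words = List("ab", "cd")

// map alone produces a nested list...
val nested: List[List[Char]] = words.map(_.toList)

// ...which flatten collapses by one level.
val flat: List[Char] = nested.flatten

// flatMap does both steps in one call.
val viaFlatMap: List[Char] = words.flatMap(_.toList)
```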

BalmungSan · June 11, 2020, 5:49pm

I mean, I get the map part, but what is flat? Flat in what sense?

Let’s start with map.
The map function has the following signature:

def map[F[_], A, B](fa: F[A])(f: A => B): F[B]

Which we can read like:
For a given effect F (also called a context or a container), if we have an effectual value fa of type F[A] and a function from A to B (A => B), map returns a new effectual value of type F[B].

So, the idea of the map function is to apply a normal (also called plain) function to an effectual value.
In other words, the map function allows us to forget about managing the effect and just focus on the values. The function will take care of the unwrapping and wrapping that has to be done.

So a typical example would be the following.
We will use one of the most basic effects, Option (which represents the possible absence of a value), to model a safe division.

def safeDivision(x: Int, y: Int): Option[Int] =
  if (y != 0) Some(x / y) else None

Now, if we have to compute the following arithmetic expression: y = a + b / c
We could do something like this:

def foo(a: Int, b: Int, c: Int): Option[Int] =
  safeDivision(b, c) match {
    case Some(temp) => Some(temp + a)
    case None => None
  }

However, look at all that boilerplate; we have to manually unwrap and wrap again all the time.
A more complex expression would be a nightmare and we could easily make mistakes.
Enter map

def map[A, B](oa: Option[A])(f: A => B): Option[B] = oa match {
  case Some(a) => Some(f(a))
  case None => None
}

def foo(a: Int, b: Int, c: Int): Option[Int] =
  map(safeDivision(b, c))(temp => temp + a)

Great!
Now, what happens if we want to compute this new expression: y = (a / b) / c
Well, we could try to do the same as before…

def bar(a: Int, b: Int, c: Int): Option[Option[Int]] =
  map(safeDivision(a, b))(temp => safeDivision(temp, c))

We could say that it works… but having to handle that nesting is not really nice.
And again a complex expression would result in a very nested structure.
Also, we actually do not care if the first division failed or the second, we just care that the complete expression failed.

So, it is time to meet another helper function, flatten.

def flatten[F[_], A](ffa: F[F[A]]): F[A]

Which again, reads as follows:
For a nested effectual value, return a non-nested effectual value. We may call that process flattening the value; we flatten the two Fs into one F.

Applying that to our previous function we get the following:

def flatten[A](ooa: Option[Option[A]]): Option[A] = ooa match {
  case Some(Some(a)) => Some(a)
  case Some(None) => None
  case None => None
}

def bar(a: Int, b: Int, c: Int): Option[Int] =
  flatten(map(safeDivision(a, b))(temp => safeDivision(temp, c)))

But, as you may have already guessed, that process of mapping and then flattening is pretty common, so we may create a helper, flatMap:

def flatMap[F[_], A, B](fa: F[A])(f: A => F[B]): F[B] =
  flatten(map(fa)(f))

Which we can read as:
Given an effectual value and a function that returns a new effectual value, map the function and then flatten the result.
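The generic signature above is only schematic, since there is no map or flatten for an arbitrary F without more machinery. As a concrete sketch for Option, assuming free-standing map and flatten helpers like the ones shown earlier, it could be written as:

```scala
def map[A, B](oa: Option[A])(f: A => B): Option[B] = oa match {
  case Some(a) => Some(f(a))
  case None    => None
}

def flatten[A](ooa: Option[Option[A]]): Option[A] = ooa match {
  case Some(inner) => inner
  case None        => None
}

// flatMap for Option, built only from the two helpers above.
def flatMap[A, B](oa: Option[A])(f: A => Option[B]): Option[B] =
  flatten(map(oa)(f))

def safeDivision(x: Int, y: Int): Option[Int] =
  if (y != 0) Some(x / y) else None
```

A failure in either division collapses the whole result to None, which is exactly the "we only care that the complete expression failed" behaviour described above.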

So we can again refactor our previous example as:

def bar(a: Int, b: Int, c: Int): Option[Int] =
  flatMap(safeDivision(a, b))(temp => safeDivision(temp, c))

  // Or with normal method like syntax
  safeDivision(a, b).flatMap(temp => safeDivision(temp, c))

  // Or with for:
  for {
    temp <- safeDivision(a, b)
    result <- safeDivision(temp, c)
  } yield result

Hope this helps

philipschwarz · June 11, 2020, 5:52pm

TIL

dubaut · June 11, 2020, 7:47pm

First of all: Thank you so much for your detailed explanations, they really help to understand what’s happening under the hood!

But there are things I (still) cannot follow:

Which map function has this signature? If I look at Either’s map function I see the following signature:

def map[B1](f: B => B1): Either[A, B1]

Is this equal to the signature you mentioned? Plus: Until now I believed that map does nothing else than applying a function to a value to transform it into another value. So there is more to map than I thought?