Structured logging context with cats-effect

sangamon · January 22, 2024, 10:36am

Any suggestions for structured logging with a cats-effect stack (http4s, fs2, doobie, circe and more)? Currently I’m using log4cats and the logstash logback encoder for JSON formatted log lines.

Now I’d like to add custom structured data to the log messages. AFAIU the standard procedure is to pass this data through the MDC…? But how to aggregate and manage these data items? Possible scenarios:

global scope: Data such as request ID (from header) and/or user ID (from auth) will be available from the very beginning.
local scope: When e.g. calling a 3rd party HTTP service, information like target URL, response status code, (string) body, etc. become successively available as request handling progresses.

Global scope may nicely be covered using Kleisli (and #local()). For local scope it’d be nice to have for expressions instead of nesting, though. I don’t see any obvious solution for this - StateT might cover this, but scoping semantics are somewhat off and there be dragons when it gets to constraints beyond Sync…

Does anybody have suggestions for implementing this - be it libraries supporting this OOTB or a custom design, based on the cats[-effect] stack or something different (that plays nice with it, though)? Any input appreciated.

sageserpent-open · January 22, 2024, 11:36am

I didn’t know about log4cats, that’s good to know.

Have you considered using Treelog?

It treats logging as an effect in its own right, rather like using WriterT with a logging monoid - and last time I looked, that what is does under the covers, but with some special sauce added to get nesting of logs into a tree structure.

I used it myself a while back for the functional parts of an application, mixed it with straight imperative use of SLF4J. The codebase stripped down the resulting trees so we could get a traceback of how things went wrong in production; much better than trawling through random mutterings from imperative calls direct to SLF4J.

I wasn’t so keen about having to layer its effect monad with other monads, so I didn’t push it too hard for the next project, but I’m still considering it as an option as I’m slightly more comfortable with the thought of stacking monads / free monads / tagless final etc these days.

sangamon · January 24, 2024, 11:42am

Thanks a lot for the reply!

Hadn’t heard of it, just took a quick look. This doesn’t match my use case. It seems to aim at an isolated, pure layer for aggregating logging information in a specific way. This can be piggy-backed on effects, but it comes with the known downsides of pure logging, such as plain WriterT: No logging for failed computations.

What I’m looking for is support for scenarios like this, ideally on top of a standard JVM/Scala/cats-effect logging stack…

Upon receiving a request, a user ID is extracted and registered with the log context.
The logic for handling this request requires sending a request to a 3rd party service. The URL is registered with the log context.
Upon receiving the response, the HTTP status code is registered with the log context.

Now, if parsing of the body fails with some exception and I do

log.error("parsing failed", exc)

The corresponding log line should be something like

{
  "level": "ERROR",
  "msg": "parsing failed",
  "exc": ...,
  "ctx": {
    "user_id": "usr123",
    "url": "https://foo.example/",
    "status": "200"
  }
}

I’m looking for a way to make this as convenient as possible within the cats-effect stack, ideally already proven “in the wild”.

sageserpent-open · January 24, 2024, 12:09pm

It may be worth you taking a second look, @sangamon - it definitely logs through failures, that was what we used when things went wrong - we had a nice pared down traceback to the point of failure, along with all the good computations that led up the the failing one.

Did you look at this example?

From what I understand, you are looking to generalise what is done with imperative logging using an MDC to correlate disparate computations together, but you may find that the pure-functional tree approach might serve you better if you take a step back - it did for me, anyway.

I’ll leave it with you to mull, thanks for responding.

sangamon · January 24, 2024, 3:02pm

I should’ve been more precise - it doesn’t log through failed computations in the underlying effect. The failure case in the Future example is about reporting a “fail” to treelog itself - a failure in the Future would terminate without logging output.

treelog may be an interesting idea in its own right, but I’m looking for something that supports structured logging with structured arguments in the cats-effect stack, and treelog doesn’t seem to be part of such a solution.

sageserpent-open · January 24, 2024, 4:10pm

Ah, I see.

I’m not sure what other effects you intend to use - IO, Either(T)? Maybe you plan on letting these raise errors at their own level and then wrap them into a log, or could you raise errors directly at say, a DescribedComputationT, using that to wrap your effect?

If the latter, maybe writing an instance of ApplicativeError for DescribedComputationT[YourEffect] might be enough to express your error handling and integrate into the Cats stack?

Than again, you could raise an issue with the Treelog folk on GitHub.

(As an aside, I liked the idea that perhaps Treelog could be shoehorned into jobs running across Spark or whatever distributed computation engine one uses; I’ve used it in a production environment with Scalaz Streams, but nothing as ambitious as that.)

It would be interesting to know what you or anyone else comes up with…

sangamon · January 25, 2024, 9:58am

It’s running in IO - the code is a REST service that’s mostly glueing together results from DB and HTTP queries/requests. If parts of the code that can’t be fully pure/effect-free can at least in tests run in something like EitherT, that’s a plus. At the end of the day, logging goes to something like Sumo or Elastic.

DescribedComputationT[F, *] is an F[DescribedComputation, *], so at first glance I don’t see how it should gracefully cover failure in F. But even if this issue were solved, treelog still is a solution to a quite different requirement than mine.

sangamon · January 26, 2024, 3:02pm

JFTR, for now I’m trying to make do with some sugar on top of plain Kleisli. With this, code may look like this…

class Program[F[_] : CtxLogAware : MonadCancelThrow](client: Client[F]) {

  private def parseBody(body: String): F[String] =
    if (body.startsWith("!"))
      body.tail.pure[F]
    else
      new IllegalStateException("body parse error").raiseError[F, String]

  private def handleResp(status: Int, body: String): F[String] = {
    val res =
      if (status == 200) {
        parseBody(body)
      } else
        new IllegalStateException("unexpected status").raiseError[F, String]
    res
      .logOnFail("error handling request")
      .addLogCtx("status" -> status)
      .addLogCtx("body" -> body)
  }

  def run(url: String): F[String] =
    client
      .runRequest(url)
      .use((handleResp _).tupled)
      .logAround("running request")
      .addLogCtx("url" -> url)
}

override def run: IO[Unit] = {
  val client: Client[IO] = (_: String) => Resource.eval[IO, (Int, String)]((200, "body").pure[IO])
  val prog =
    new Program[StructLog[IO, *]](client.mapK(liftToStructLogFunctionK))
      .run("http://foo.test/")
      .addLogCtx("user" -> "usr-xyz")
  prog(Map.empty)
    .flatMap(IO.println)
}

Logging output like this:

{"level":"INFO","ctx":{"user":"usr-xyz","url":"http://foo.test/"},"message":"started: running request"}
{"level":"WARN","ctx":{"user":"usr-xyz","url":"http://foo.test/","body":"body","status":200},"message":"error handling request","stack_trace":"java.lang.IllegalStateException: body parse error [...]"}
{"level":"WARN","ctx":{"user":"usr-xyz","url":"http://foo.test/"},"message":"failed: running request","stack_trace":"java.lang.IllegalStateException: body parse error [...]"}

Not sure how happy I am with this…