How to produce ScalaDoc in machine readable form?

I want to produce ScalaDoc in machine-readable form (JSON or YAML preferably, though even XML would do). I thought I could use a doclet for this, but doclets do not seem to be supported in Scala 3: at least, Standarize scaladoc parameters between Scala2 and Scala3 · Issue #11907 · scala/scala3 · GitHub lists the corresponding option as not implemented, and I have found no trace of any doclet in the Scaladoc sources.

Can I produce ScalaDoc in some structured format, or do I have to parse HTML?

Scala 3 ScalaDoc has an option, -Ygenerate-inkuire, that produces an Inkuire database, which is stored as JSON. The format is not documented, but the models and the Circe decoder configuration are available in the library sources under the Apache 2.0 license.
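For anyone who wants to try this from sbt, a minimal sketch of how the flag could be passed is shown here. It assumes a standard sbt build in which Scaladoc options are taken from the `doc` task's `scalacOptions`; where the resulting JSON file ends up in the generated site is not covered here.

```scala
// build.sbt (sketch): pass the flag to Scaladoc only, not to the regular compiler run.
Compile / doc / scalacOptions += "-Ygenerate-inkuire"
```

Running `sbt doc` afterwards should generate the documentation together with the Inkuire database.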

The DB structure is a bit different from the produced HTML (one table for types, another table for all functions, including type members), but it seems complete enough, and processing it should be much easier and more robust than processing HTML files.

The whole thing is sparsely documented, but it powers the search of the main Scala 3 ScalaDoc site at Scala 3, so it is likely to stay supported in the near future.

From what I know, the people behind Inkuire are very sloppy when it comes to documentation. But overall the InkuireDB isn’t the most reliable source of source information. It doesn’t support some language constructs and can’t encode correct type bounds in the current form. And it’s not being actively developed now. (I started working on a reimplementation that would allow for encoding the full type system, but I don’t have time to work on it now)

Some decent approaches/hacks that you can try to generate your own representation:

  • fork the scaladoc and generate your custom format either similarly to Inkuire or as a custom renderer
  • (the most obvious way) use tasty-inspector and translate it into your own representation
  • fork the compiler and derive json encoders for all the trees, then output json around the pickling phase

Can I read the content of the ScalaDoc comments with tasty inspector?

Docs sound promising:

TASTy files contain the full typed tree of a class including source positions and documentation.

Let us see where this road leads me.

Yes, it uses the Quotes API, so you can use the Symbol#docstring method. (scala3/library/src/scala/quoted/Quotes.scala at main · scala/scala3 · GitHub)
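A rough sketch of what that could look like, in case it helps. It assumes the `scala3-tasty-inspector` artifact is on the classpath (in sbt: `"org.scala-lang" %% "scala3-tasty-inspector" % scalaVersion.value`). The names `DocInspector`, `cleanDoc` and `dumpDocs` are made up for this example, and the tab-separated output is only a placeholder for whatever JSON or YAML encoding you prefer.

```scala
import scala.quoted.*
import scala.tasty.inspector.*

// Hypothetical helper: strips the comment delimiters and leading asterisks
// from a raw doc comment and joins the remaining lines with spaces.
def cleanDoc(raw: String): String =
  raw.linesIterator
    .map(_.trim.stripPrefix("/**").stripSuffix("*/").stripPrefix("*").trim)
    .filter(_.nonEmpty)
    .mkString(" ")

// Hypothetical inspector: prints "fullName<TAB>doc" for every documented definition.
class DocInspector extends Inspector:
  def inspect(using Quotes)(tastys: List[Tasty[quotes.type]]): Unit =
    import quotes.reflect.*

    object Collector extends TreeTraverser:
      override def traverseTree(tree: Tree)(owner: Symbol): Unit =
        tree match
          case d: Definition =>
            // docstring returns the raw comment text, if the symbol has one
            d.symbol.docstring.foreach { doc =>
              println(s"${d.symbol.fullName}\t${cleanDoc(doc)}")
            }
          case _ => ()
        // keep descending into nested definitions
        traverseTreeChildren(tree)(owner)

    tastys.foreach(t => Collector.traverseTree(t.ast)(t.ast.symbol))

// Usage (sketch): pass paths to .tasty files as program arguments.
@main def dumpDocs(tastyFiles: String*): Unit =
  TastyInspector.inspectTastyFiles(tastyFiles.toList)(new DocInspector)
  ()
```

Keeping the comment clean-up in a separate pure function makes it easy to check on its own, independently of any TASTy files.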

It seems that also means I can get the docstring from a macro. As I already use macros to process and enumerate those method signatures, this is by far the most straightforward method for me. It did not occur to me that it could be this easy.
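For reference, a minimal sketch of the macro variant. The names `memberDocs` and `memberDocsImpl` are made up for this example. One assumption to verify: for types coming from already compiled dependencies, the docstrings may only be available when the compiler is asked to read them from TASTy (I believe via the `-Yread-docs` option), so an empty result does not necessarily mean the type is undocumented.

```scala
import scala.quoted.*

// Hypothetical macro: expands to (method name, raw doc comment) pairs
// for the documented methods declared directly in T.
inline def memberDocs[T]: List[(String, String)] = ${ memberDocsImpl[T] }

def memberDocsImpl[T: Type](using Quotes): Expr[List[(String, String)]] =
  import quotes.reflect.*
  val docs: List[(String, String)] =
    TypeRepr.of[T].typeSymbol.declaredMethods.flatMap { m =>
      // docstring is an Option, so undocumented methods are simply skipped
      m.docstring.map(doc => (m.name, doc)).toList
    }
  // Lift the compile-time list into the generated code
  Expr(docs)

// Usage (must live in a different source file than the macro definition):
//   val docs = memberDocs[MyService]   // MyService is a placeholder type
```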

It is a pity that autocomplete and ScalaDoc do not work in IntelliJ for most quotes types; discovering such methods would be much easier then (https://youtrack.jetbrains.com/issue/SCL-22297/Improve-completion-for-Scala-3-TypeRepr).