I want to produce ScalaDoc in machine-readable form (JSON or YAML preferably, though even XML would do). I thought I could use a doclet for this, but doclets do not seem to be supported in Scala 3: the GitHub issue "Standarize scaladoc parameters between Scala2 and Scala3" (Issue #11907, scala/scala3) lists the corresponding option as not implemented, and I have found no trace of any doclet in the Scaladoc sources.
Can I produce ScalaDoc in some structured format, or do I have to parse HTML?
Scala 3 ScalaDoc has an option, -Ygenerate-inkuire, that produces an Inkuire database, which is stored as JSON. The format is not documented, but the models and the Circe decoder configuration are available in the library sources under the Apache 2.0 license.
The database structure is a bit different from the produced HTML (one table for types, another table for all functions, including type members), but it seems complete enough, and processing it should be much easier and more robust than processing HTML files.
The whole thing is sparsely documented, but it powers the search on the main Scala 3 ScalaDoc site, so it is likely to stay supported in the near future.
From what I know, the people behind Inkuire are very sloppy when it comes to documentation. More importantly, InkuireDB isn't the most reliable source of information about the source code. It doesn't support some language constructs and can't encode correct type bounds in its current form, and it isn't being actively developed now. (I started working on a reimplementation that would allow encoding the full type system, but I don't have time to work on it now.)
Some decent approaches/hacks that you can try to generate your own representation:
- fork scaladoc and generate your custom format, either similarly to Inkuire or as a custom renderer
- (the most obvious way) use tasty-inspector and translate its output into your own representation
- fork the compiler and derive JSON encoders for all the trees, then output JSON around the pickling phase
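The tasty-inspector route could look roughly like the sketch below. It assumes the `scala3-tasty-inspector` artifact (`"org.scala-lang" %% "scala3-tasty-inspector" % scalaVersion.value`) is on the classpath; `DocEntry`, `DocJson`, `DocInspector` and `dumpDocs` are hypothetical names made up for this example, and the JSON layout is arbitrary. Whether `docstring` is actually populated for symbols loaded from TASTy may depend on compiler settings (for example `-Yread-docs`), so treat the `doc` field as best-effort.

```scala
import scala.quoted.*
import scala.tasty.inspector.*
import scala.collection.mutable.ListBuffer

// Hypothetical record for one definition found in a TASTy file.
final case class DocEntry(fullName: String, kind: String, doc: Option[String])

object DocJson:
  // Minimal JSON string escaping; a real project would likely use a JSON library.
  def quote(s: String): String =
    val sb = new StringBuilder("\"")
    s.foreach {
      case '"'          => sb ++= "\\\""
      case '\\'         => sb ++= "\\\\"
      case '\n'         => sb ++= "\\n"
      case '\r'         => sb ++= "\\r"
      case '\t'         => sb ++= "\\t"
      case c if c < ' ' => sb ++= "\\u" + "%04x".format(c.toInt)
      case c            => sb += c
    }
    sb += '"'
    sb.toString

  // Renders one entry as a JSON object; a missing docstring becomes null.
  def render(e: DocEntry): String =
    val doc = e.doc.fold("null")(quote)
    s"""{"name":${quote(e.fullName)},"kind":${quote(e.kind)},"doc":$doc}"""

class DocInspector extends Inspector:
  val entries: ListBuffer[DocEntry] = ListBuffer.empty

  def inspect(using Quotes)(tastys: List[Tasty[quotes.type]]): Unit =
    import quotes.reflect.*
    val traverser = new TreeTraverser:
      override def traverseTree(tree: Tree)(owner: Symbol): Unit =
        tree match
          case d: Definition =>
            val kind = d match
              case _: ClassDef => "class"
              case _: DefDef   => "def"
              case _: ValDef   => "val"
              case _: TypeDef  => "type"
              case _           => "other"
            entries += DocEntry(d.symbol.fullName, kind, d.symbol.docstring)
          case _ => ()
        // Keep descending so nested members are visited as well.
        super.traverseTree(tree)(owner)
    tastys.foreach(t => traverser.traverseTree(t.ast)(t.ast.symbol))

// Pass paths to .tasty files as arguments; prints a JSON array to stdout.
@main def dumpDocs(tastyFiles: String*): Unit =
  val inspector = new DocInspector
  TastyInspector.inspectTastyFiles(tastyFiles.toList)(inspector)
  println(inspector.entries.map(DocJson.render).mkString("[\n  ", ",\n  ", "\n]"))
```

This records every definition the traverser reaches, including synthetic and local ones, so a real tool would probably filter on symbol flags and add signatures to the record.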
It seems that also means I can get the docstring from a macro. As I already use macros to process and enumerate those method signatures, this is by far the most straightforward method for me. It had not occurred to me that it could be this easy.
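For illustration, reading docstrings from a macro might look something like the sketch below. `DocMacros` and `methodDocs` are hypothetical names invented for this example; the only reflection calls used are `declaredMethods` and `docstring` on `Symbol`. I would expect docstrings to be present for code compiled in the same run, while for classes loaded from already compiled TASTy they may be missing unless the compiler is told to read them (possibly via `-Yread-docs`), so this is an assumption to verify in your setup.

```scala
import scala.quoted.*

object DocMacros:
  /** Hypothetical helper: method name -> docstring, for methods declared directly in T. */
  inline def methodDocs[T]: Map[String, String] = ${ methodDocsImpl[T] }

  def methodDocsImpl[T: Type](using Quotes): Expr[Map[String, String]] =
    import quotes.reflect.*
    val cls = TypeRepr.of[T].typeSymbol
    // Keep only methods that carry a docstring; overloads collapse to one map key.
    val pairs: List[Expr[(String, String)]] =
      for
        m   <- cls.declaredMethods
        doc <- m.docstring.toList
      yield Expr((m.name, doc))
    '{ Map(${ Varargs(pairs) }*) }
```

As with any Scala 3 macro, the call site has to live in a different source file from the definition, for example `DocMacros.methodDocs[MyService]` where `MyService` is whichever type you already enumerate. Keying by name alone loses overloads, so combining this with the signatures you already extract is probably the better key.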