I am an artificial intelligence researcher and inventor of Code Building Genetic Programming. I am improving the Scala implementation and I am hoping to get some advice on something that has me stuck.
Without going into too many details, here is the gist of CBGP:
1. Generate a few hundred random type-safe Scala ASTs (see the paper if you are curious how).
2. Put each AST into a method definition of a class definition and use the toolbox to define the class at runtime.
3. Use reflection to instantiate the new classes.
4. Evaluate the behavior/correctness of the method on a dataset of input-output pairs. (Think “loss functions” from other forms of AI.)
5. “Breed” the ASTs that perform better than average to create new ASTs.
6. Repeat from step 2 until a correct program evolves.
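For anyone who hasn't used the toolbox, the define-and-instantiate steps look roughly like this. This is a minimal sketch, not my actual pipeline: the hand-written source string and the `Candidate`/`run` names stand in for an evolved AST, which in CBGP is built programmatically.

```scala
import scala.reflect.runtime.{currentMirror => cm}
import scala.tools.reflect.ToolBox

object ToolboxSketch {
  // One toolbox instance; each compile() call below is the expensive part.
  val tb = cm.mkToolBox()

  // Stand-in for an evolved AST: a class with one candidate method.
  val tree = tb.parse(
    """class Candidate { def run(x: Int): Int = x * 2 }
      |new Candidate
      |""".stripMargin)

  // Compile the tree and get an instance of the freshly defined class.
  val instance: Any = tb.compile(tree)()

  // Evaluate the candidate method on one input via Java reflection.
  val run = instance.getClass.getMethod("run", classOf[Int])
  val out = run.invoke(instance, Int.box(21))
}
```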
Scala’s reflection and toolbox utilities have been great to work with, and it wasn’t hard to get an implementation working. The problem is that it is incredibly slow. Evolution is always expensive, especially with tens of thousands of calls to the Scala toolbox to define new classes. Profiling shows that >99% of the time is spent compiling ASTs and defining new classes.
I built a simple LRU cache so that identical ASTs are never re-compiled, which improved the runtime significantly. Unfortunately, this isn’t enough to make the overall system viable.
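The cache itself is nothing exotic; a minimal sketch of the idea is below. The `LruCache` name is mine for illustration; in the real system the key is a pretty-printed AST and the value is the compiled class.

```scala
import scala.collection.mutable

// Sketch of an LRU cache guarding an expensive computation (here: a
// toolbox compile). On a hit, the entry is moved to most-recently-used;
// on a miss at capacity, the least-recently-used entry is evicted.
class LruCache[K, V](capacity: Int) {
  private val entries = mutable.LinkedHashMap.empty[K, V]

  def getOrElseUpdate(key: K)(compute: => V): V = synchronized {
    entries.remove(key) match {
      case Some(v) =>
        entries.put(key, v) // re-insert so iteration order tracks recency
        v
      case None =>
        val v = compute
        if (entries.size >= capacity)
          entries.remove(entries.head._1) // evict least-recently-used
        entries.put(key, v)
        v
    }
  }
}
```

Keyed on the pretty-printed AST, identical individuals (which breeding produces constantly) hit the cache instead of the compiler.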
I am wondering if anyone has experience building frameworks/applications that perform huge numbers of runtime class definitions.
- Are there any strategies for compiling faster at runtime?
- Can the toolbox be configured to trade off the amount of work it does against other features (i.e., optimizations)?
- Are there any tools that would aid in communicating between multiple JVMs to unlock compiling in parallel?
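On the second question, the only configuration hook I have found so far is the scalac-options string that `mkToolBox` accepts; the sketch below just demonstrates the mechanism (with a harmless `-nowarn`), since I have not yet found a flag that meaningfully reduces compile time.

```scala
import scala.reflect.runtime.{currentMirror => cm}
import scala.tools.reflect.ToolBox

object OptionsSketch {
  // mkToolBox takes a string of ordinary scalac options. Whether any
  // combination of flags skips enough work to matter is exactly my question.
  val tb = cm.mkToolBox(options = "-nowarn")
  val result = tb.eval(tb.parse("1 + 1"))
}
```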
Any thoughts would be much appreciated. Thanks in advance!