Is it possible to save functions in database and load from it if needed

IlmarsCirulis · October 26, 2024, 5:50pm

Let’s suppose I want to write many functions of type Int => Int and then save them to a database for later use (either as text or maybe even in compiled form).
What’s the best way to do such a thing? Is it even possible? Reasonable?

(All my apologies for this weird question, as I’m complete beginner.)

coreyoconnor · October 26, 2024, 7:08pm

It is not a weird question. Warning: here be dragons (aka subtle challenges).

What you are asking about is serialization/persistence of functions. Or, in more accurate terms: closure serialization.

This is supported on the JVM and is typically called lambda serialization.

One article on this: Serialize a Lambda in Java | Baeldung

So, in general, one can serialize a lambda, which has a closure over its scope. The tricky part is picking where in this closure to draw the line between what is serialized (and thus deserialized) and what is assumed to already exist on the deserializing side.

Spark, for instance, “cleans” a closure when serializing it, dropping references that will be provided by the node deserializing the closure. E.g., Spark walks the closure's captures but does not descend into references to things like the Spark context and broadcast variables.
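To make the "where to draw the line" point concrete, here is a minimal sketch (not Spark's mechanism, just plain Java serialization) of how what a lambda captures decides whether it can be serialized. The names `ClosureCapture`, `Settings`, and `canSerialize` are made up for illustration. A lambda that closes over a non-serializable object should be rejected, while one that closes over only a plain `Int` copied out of that object should be accepted.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCapture {
  // Hypothetical stand-in for an environment object; deliberately NOT Serializable.
  final class Settings(val offset: Int)

  // True if Java serialization accepts the function together with everything it captured.
  def canSerialize(f: Int => Int): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream)
    try { out.writeObject(f); true }
    catch { case _: NotSerializableException => false }
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    val settings = new Settings(10)

    // Captures the whole Settings object, so serialization has to write it too.
    val capturesObject: Int => Int = x => x + settings.offset

    // Captures only an Int copied out of the object beforehand.
    val offset = settings.offset
    val capturesInt: Int => Int = x => x + offset

    println(canSerialize(capturesObject))
    println(canSerialize(capturesInt))
  }
}
```

Copying the needed value into a local before building the lambda is one manual way of choosing what crosses the serialization boundary.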

coreyoconnor · October 26, 2024, 10:00pm

While writing up some sample code, I noticed that the test in the Scala compiler is a pretty good overview of the details:

scala3/tests/run/lambda-serialization.scala at 91ef92159c628eaeab8311dc82bed7ed4fe03c63 · scala/scala3 · GitHub

Still, here’s a scala-cli session showing serialization:

scala> val foo: Int => Int = _ + 1
val foo: Int => Int = Lambda/0x00007fa494581800@37227aa7

scala> import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream, PrintWriter, StringWriter}

scala> val buffer = new ByteArrayOutputStream
val buffer: java.io.ByteArrayOutputStream =

scala> val out = new ObjectOutputStream(buffer)
val out: java.io.ObjectOutputStream = java.io.ObjectOutputStream@529b8e41

scala> out.writeObject(foo)

scala> out.close()

scala> buffer.toByteArray
val res0: Array[Byte] = Array(-84, -19, 0, 5, ...

Which shows the bytes that foo is serialized to.

In the scala-cli repl, I could not get deserialization to work right for that example. One of those dragons mentioned above.
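For comparison, here is a sketch of the full round trip as a compiled program rather than a REPL session, on the assumption that the REPL trouble comes from how REPL-defined classes are loaded (that cause is a guess, not something confirmed here). `LambdaRoundTrip`, `serialize`, and `deserialize` are illustrative names.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object LambdaRoundTrip {
  // Write the function with plain Java serialization and return the bytes.
  def serialize(f: Int => Int): Array[Byte] = {
    val buffer = new ByteArrayOutputStream
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(f) finally out.close()
    buffer.toByteArray
  }

  // Read the function back; the class that defined the lambda must be on the classpath.
  def deserialize(bytes: Array[Byte]): Int => Int = {
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try in.readObject().asInstanceOf[Int => Int] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val foo: Int => Int = _ + 1
    val restored = deserialize(serialize(foo))
    println(restored(41)) // applies the restored function to 41
  }
}
```

One thing to keep in mind for the database use case: the serialized form of a JVM lambda refers to the class and method that implement it and does not carry the bytecode itself, so bytes stored this way can only be read back by a program that still contains a compatible version of the defining class.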

IlmarsCirulis · October 27, 2024, 5:00pm

Thank you!

Looks/sounds complex, but it shouldn’t be too complex. Will take some time to understand all this stuff and then ask some more questions.

BalmungSan · October 27, 2024, 5:09pm

May we ask what is the meta-problem you are trying to solve here?
While it is possible and doesn’t sound too unreasonable, it does sound like something that mostly only very complex problems would need.
Which feels a bit contradictory to your statement of being a beginner.

IlmarsCirulis · October 27, 2024, 6:06pm

OK, will try, as this could quite possibly be an XY problem.
(Although I was interested in the question itself too (curiosity etc).)

Text blob incoming…

So, there are many study websites that let you study theory and solve exercises (usually randomized at least a bit) and get better at math or some other subject. A few examples: Khan Academy, IXL, etc.

I don’t know how most of them make practice exercises (how teachers and programmers interact to produce them, etc.), but I worked at a local version of one as an editor for some 13 years (from inception until I got kinda bored). There, practice exercises and theory were represented as XML, created and edited through a special GUI, usually by teachers. Math formulas were stored as Content MathML and Presentation MathML, and other elements (answer input fields, radio button inputs, etc.) as XML too. Each XML document was saved in a database.

Imagine a very limited programming language with variables and arrays/lists (to randomly choose from), some built-in math, and no conditionals or loops. Even such a simple approach allows making good enough educational materials and practice exercises, given some patience and skill from the teacher/editor.

My idea (nothing special, but it doesn’t leave me alone) is to replace this limited language with a proper Turing-complete programming language with some libraries (existing or to be made). More specifically, with some beautiful functional language with a strict type system.

Next year I start studying math at university and want to make an educational website for fellow students of this study program (slowly advancing as I study further and further). I want to do it with a Scala backend, with a Turing-complete underlying structure for these exercises, which would allow me to do the stuff I want without any unnecessary pain and to make and use libraries, etc.

Maybe I’m missing something and am too fixated on a database for educational content, dunno. In many ways I am a beginner. I have problems working with large, complex codebases with many tools etc. (so no professional work experience or career), but I find it much easier to tinker with smaller, more self-contained portions of programming if there’s motivation. I “float” somewhere in the place between math and relatively tiny programming; that’s the stuff I like.

I could do everything as static webpages (JS, HTML, MathML/MathJax, SVG; it’s possible and not that hard, I already tinkered with that, and in this case there are just lots of HTML pages and other resources), but then I can’t have users with saved progress and probably some other useful stuff. The backend approach is harder, but I have about a year until the next academic year starts in September. It’s doable.

… Anyway, my apologies for this long rant.

jducoeur · October 27, 2024, 7:11pm

Suggestion: that all being the case, I would recommend not trying to serialize Scala functions per se to your database (which is a pretty big and not-simple project), but instead define your own DSL for the problem and serialize that (which is relatively common and straightforward, there are tons of libraries to help with it, and it would be 100% under your control).

That would avoid the serialization turning into a huge yak-shaving distraction, letting you instead focus on the application itself.
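As a rough illustration of the DSL route, here is a minimal sketch of a tiny expression language for functions of type Int => Int, with an evaluator and a plain-text encoding that could be stored in a database column. Everything in it (the `Expr` type, its cases, and the prefix text format) is a made-up example, not an existing library or a recommendation of a particular design.

```scala
// Hypothetical mini-DSL for Int => Int functions; all names are illustrative.
sealed trait Expr
case object Arg extends Expr                               // the function's input
final case class Const(value: Int) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr
final case class Mul(left: Expr, right: Expr) extends Expr

object Expr {
  // Interpret an expression for a given input value.
  def eval(e: Expr, x: Int): Int = e match {
    case Arg       => x
    case Const(v)  => v
    case Add(l, r) => eval(l, x) + eval(r, x)
    case Mul(l, r) => eval(l, x) * eval(r, x)
  }

  // Turn a stored expression back into an ordinary Scala function.
  def toFunction(e: Expr): Int => Int = x => eval(e, x)

  // Encode as space-separated prefix tokens, e.g. "+ * 2 x 1".
  def encode(e: Expr): String = e match {
    case Arg       => "x"
    case Const(v)  => v.toString
    case Add(l, r) => s"+ ${encode(l)} ${encode(r)}"
    case Mul(l, r) => s"* ${encode(l)} ${encode(r)}"
  }

  // Decode prefix tokens; returns None for malformed or incomplete input.
  def decode(text: String): Option[Expr] = {
    def parse(tokens: List[String]): Option[(Expr, List[String])] = tokens match {
      case "x" :: rest => Some((Arg, rest))
      case "+" :: rest => binary(rest)(Add(_, _))
      case "*" :: rest => binary(rest)(Mul(_, _))
      case n :: rest   => n.toIntOption.map(v => (Const(v), rest))
      case Nil         => None
    }

    // Parse two operands in sequence and combine them with `make`.
    def binary(tokens: List[String])(make: (Expr, Expr) => Expr): Option[(Expr, List[String])] =
      parse(tokens).flatMap { case (l, afterLeft) =>
        parse(afterLeft).map { case (r, afterRight) => (make(l, r), afterRight) }
      }

    parse(text.trim.split("\\s+").toList) match {
      case Some((expr, Nil)) => Some(expr) // all tokens consumed
      case _                 => None
    }
  }
}
```

For example, `Expr.encode(Add(Mul(Const(2), Arg), Const(1)))` gives the string `+ * 2 x 1`, and `Expr.decode` on that string gives the same expression back, which `Expr.toFunction` turns into the function x => 2 * x + 1. Because the stored form is just data, what it can do is limited to the cases you define, which is the "100% under your control" part.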

sageserpent-open · October 27, 2024, 8:58pm

Does this mean you have two lots of users - those who devise course content and those who work through the courses? Do you want a rich DSL of some sort to set up courses?

Your webserver has a well-defined classpath, so why don’t you allow your course content authors to write snippets of Scala that access DSL primitives implemented in Scala that you supply as part of your webserver’s classpath, filtered right down so that malicious people can’t interact with the rest of the webserver code / execute commands in the OS etc?

~~A sandbox, in other words.~~

The likes of Scastie, Jupyter and Zeppelin allow snippets of code to be run, and the Scala compiler can be embedded in another application, so rather than write a DSL interpreter (and the DSL language definition) from scratch, you let Scala do the heavy lifting…

EDIT: I’m thinking of the old presentation compiler, and to be honest I recall having to mess around with process forking to work around some classpath subtlety, so perhaps this is a bad idea. It seems Scastie spawns SBT in some Docker container or suchlike. Maybe you could go down that road, or even just spam Scastie with the code snippets from your server.

charpov · October 28, 2024, 2:12pm

Here’s an old writing of mine on serialization and lambdas (in Java), if you decide to go that route: On Lambdas, Anonymous Classes and Serialization in Java | by Michel Charpentier | Level Up Coding. It might help you understand serialization and some of its pitfalls.

I’d agree with the previous reply, though, that a DSL might be the better choice, especially if you use Scala, not Java. Just the vision of shaving a yak is enough to distract me…

BalmungSan · October 28, 2024, 3:49pm

Sorry for taking too long to reply. But, thankfully, the community has already provided great alternatives.

IMHO, I would say you have two main options.

1. You do want to expose Scala to your users. As such, the best may be to just receive plain code and compile it, applying all security constraints like running in Docker.
   As others have said, you may also wonder if you really need to take care of processing the snippets at all, or could simply let an existing platform like Scastie or maybe Replit manage that, and just focus on the content.
2. Create a very simplistic programming language and write your own parser for it. That is basically what the ADT idea ends up looking like.