Hi everyone,
New user here. I’m hoping you all can recommend learning resources for the Scala / JVM ecosystem to a Pythonista with experience in data analysis & numerical computing, very little experience with Java, and with some exposure to functional programming.
Here’s a much longer description of my background:
I’m well-versed in Python-centric tools for prototyping code and exploring data, i.e. Jupyter notebooks and IPython, and I’m very familiar with Python as a glue language and as a shell-scripting replacement. I have some experience doing functional-style programming in Python; iterators, lambdas, list comprehensions, and passing functions as objects (I’m looking at you, map()) are all very useful in certain circumstances, as is some of the functional-style operation chaining you can do on pandas DataFrames. I also have experience using Python (or Python modules) to do numerical computing, often in a parallel (multi-core) and/or distributed (multi-node) environment.
Unfortunately, this also means I’ve run headlong into some of Python’s limitations: the GIL’s restrictions on shared-memory multithreading have caused me headaches, as has the occasional slowness of pure Python code.
So for the last few years, I’ve been on the hunt for a Python alternative or companion that shares many of its strengths but not its major weaknesses. In particular, a large ecosystem and a low barrier to entry are important to me. As an example of the importance of ecosystem, this study out of UC Berkeley & Princeton found that “existing code, existing expertise, and open source libraries are the dominant drivers of adoption” of a programming language. In other words, having access to lots of high-quality libraries is really useful to someone who wants to be productive, as is the ability to find solutions (online) to problems that someone else has inevitably had at some point. As for the low barrier to entry, I think one of the reasons Python is so popular is that, as Li Haoyi pointed out in a recent blog post about Scala’s future, Python is unmatched in its ease of getting started. It’s accessible. Developers and non-developers alike can start doing productive stuff pretty quickly in Python. I want that in a language, given that I’m not really a developer!
Over the course of my search, I’ve evaluated a number of languages but had issues with them all. Systems programming languages like C++ and Rust have high barriers to entry (I’m not very good at explicitly managing memory, and pointers make my brain hurt) and/or issues with ecosystem (small or fragmented). Cython and Numba are interesting projects in the Python ecosystem, but the former requires good knowledge of C and the latter only shows large performance increases in pretty specific cases. Julia is pretty easy to pick up and has an energetic community, but it still has a fairly small number of libraries, and its learning curve becomes much steeper when performance becomes important (also, not being able to produce a stand-alone executable can be problematic). C# and Java have huge ecosystems, but they’re both really verbose and they seem to push users toward an object-oriented style (and C# is still pretty Windows-centric). Kotlin looks rather interesting, but it seems to me like it was designed to be a Java replacement: most of the books and other learning resources for it assume a working knowledge of Java, which I don’t have. It’s also not clear to me how large the Kotlin community is. Golang looks fast and simple, but it seems to me like the domains it plays in don’t really overlap with the ones I’m interested in.
That brings me to Scala. It looks interesting. I’ve grabbed a copy of Odersky’s “Programming in Scala,” which I think will teach me what I need to know about the language. I can pick up a copy of “Hands-on Scala Programming” if I want exercises. A book on doing numerical computing or data analysis using Scala would be useful if there’s a good one.
But I think my big struggle is going to be figuring out how to take advantage of the JVM’s huge ecosystem, since Scala plays so nicely with it. If I want to educate myself on the ecosystem (what libraries are available, where to find them, etc.), what’s the best way to do that?
Thanks!