Making scala easier and more popular, with compiler level switches


#1

Short version: With some compiler flags, the “high learning curve” objection to scala could be easily (I hope) overcome and it could be seen instead as: easy to start, grow in place when ready, never unduly limited.

Specifically, it is hard to persuade people to use Scala because of the perceived learning curve. Scala can be seen as 1) a better java, and 2) as a way to use rich language features on the JVM. When choosing a language for someone’s next project, those two are in conflict. On one existing project, scala code seemed to me fairly hard to read, even with experience, and some developers thought “not for me”.

I want to suggest that such a concern can be removed, to the benefit of smiling programmers, by defining levels of language complexity, enforced by the compiler, to make using scala in a project always potentially at least as easy as java, kotlin, etc, with just ~1 hour of scala training, and that a project could then specify when to allow other language features when the project developers are ready and want that. That could completely change the perception of the learning curve and how it can be managed by projects or organizations.

Would this be best brought up in the “contributors” forum, something for dotty, or elsewhere? It seems easy to implement right now without waiting for dotty, but maybe it just seems that way due to my ignorance.

Longer version / more thoughts:
I’ve been using scala at a non-expert level for some years, thinking of it as a “better java today, with headroom for the future”, and would like to encourage others to use it. But in the attempts I found either someone was put off by the learning curve on the one hand, or they wanted to use functional or advanced idioms and features that discourage other new developers from joining the project. I wish I could more easily explain that in about 1 hour of scala study or training, scala can be successfully used at a basic level that is better than java, and which allows one to build on that knowledge to grow greatly as a programmer in the future.

Or, can you recommend a language like that? I realize that scheme/lisp, C, or many languages can be so adapted, just like C can be used for some OOP with structs, but not as easily as if the compiler could help enforce clarity and simplicity and let projects choose their level of preferred complexity.

I think it would help that effort if there were a strictly-defined “introductory scala” level, enforced at a compiler level, and then 1-2 or more other levels that can be chosen by a project when they are ready to use more language features. This could be done with compiler flags and/or some very simple project-level configuration, even specifying language features to allow at a granular level if someone wanted to be very incremental or specific.

Thus scala becomes a viable option in practice for any level of developer or any project that wants to preserve maintainability by a particular level of developer, and make conscious per-project choices of when to change that level.

Examples: some people like java, kotlin, golang, python, etc for the reason that more developers can understand them. But it is possible to reach a point where one regrets not having the features that those languages lack. Similarly, I sometimes like to use the jedit text editor because it is ~“easy like notepad, powerful like emacs”.

I often think in terms of maturity models (details buried in my web site at http://onemodel.org ) where one who wants to work and grow concurrently, can do so in a planned way without throwing away the knowledge of the initial learning steps. Scala could become like that: a better java, with room to grow (all the headroom you will probably ever need).

I don’t know, but maybe this could even affect compiler speed further for many projects.

The “simpler” scala could be the default, and specified to contain all the “better java” features (maybe something like java + immutability, Option type, better for loops, and traits), ideally a level that is “guessable” or fully usable by a java5 programmer with 1-2 hours max of scala learning, a fully usable language as strong as java is today (possibly excepting features few use, if necessary to achieve that defined simplicity): a full-fledged programming language. Other levels then add scala features that would be used as one learns more about types, functional programming, etc, all the way up to unlimited/open.

The “simpler” scala could probably be ~“marketed” as “new/improved” (catch some buzz) if it had a new name: maybe dotty is that, or maybe “filbert” or whatever, pitched as a kotlin, easy to learn, with terrific headroom.

Would this question be better on the “contributors” forum, or in some dotty-related forum?


#2

ps: using a style enforcer add-on might be feasible but wouldn’t be as easy to use, as persuasive, or receive the same notice as announcing a “new” thing. It would also be slower and less portable than being able to move among projects that are at a clearly labeled level, the same as each other. “I’m a Filbert programmer”, or “I like scala level 1 plus those underscore fill-in-the-blank name thingies.”


#3

I think newer versions of the compiler already provide this. For example, in a default project you can’t use implicit conversions or postfix operators without enabling them, either via compiler flags or by importing the flag in your code; each feature has its own flag and can be enabled as needed.
The documentation on implicit conversions states that you need to import scala.language.implicitConversions for the compiler to accept them, so when you try to write an implicit conversion it basically asks “Are you really sure? If so, sign here (by using the flag)”. So at the start it’s just Scala with no magic, and you explicitly acknowledge each more advanced feature as you come to need it. But the problem, AFAIK, is that if you import a library that makes use of these features, your project accepts them automatically without you enabling the flags. The flags for that are under the -language: prefix, and there are docs for them.
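A minimal sketch of the “sign here” mechanism described above. The object and type names (FeatureImportDemo, Meters) are made up for illustration. As far as I know, leaving out the import gives a feature warning rather than a hard error for implicit conversions, unless the build turns warnings into errors.

```scala
// Opting in to one language feature by import.
// The equivalent compiler option is -language:implicitConversions.
import scala.language.implicitConversions

object FeatureImportDemo {
  // Hypothetical type, used only for this illustration.
  final case class Meters(value: Double)

  // An implicit conversion: this is the definition the compiler
  // asks you to acknowledge via the import above.
  implicit def doubleToMeters(d: Double): Meters = Meters(d)

  def describe(m: Meters): String = s"${m.value} m"

  // A plain Double is passed; the conversion turns it into Meters.
  val example: String = describe(2.5)
}
```

The import acts per file or scope, while the compiler option applies to the whole build, so a project can choose how broadly to “sign”.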


#4

Thanks for the comments. I haven’t explored that change in detail, but I think there are still many other language and library features that can still fill a book, confuse a new programmer with 1 hr of training, and would be removed in the “onboarding” level I envision. Also library use would have to be addressed as you mention.

Are there compiler flags that prevent using map (and such unfamiliar-to-many functional idioms), case classes, match, type features, traits (maybe), and basically everything else complex that a java5 programmer couldn’t learn (or be able to guess) after about 1 hour of learning? I imagine one could come up with quite a list.


#5

I’m thinking just train them on for loops, immutability (val/var), and some scoping syntax and public/private differences, and almost nothing else. Oh yeah, also basic class/method syntax, default return values and a few very simple things–just how to write code they have already been writing, with scala syntax.
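To make that concrete, here is a rough guess at what code restricted to such a subset might look like. The names (LevelOneSample, Account, sumUpTo) are invented for the example, and nothing here is an agreed definition of a level.

```scala
object LevelOneSample {
  // Basic class syntax: constructor parameters double as fields.
  // "val" is immutable and public; "private var" is mutable and hidden.
  class Account(val owner: String, private var balance: Int) {

    def deposit(amount: Int): Unit = {
      balance = balance + amount
    }

    // No "return" keyword: the last expression is the result.
    def currentBalance: Int = balance
  }

  // A plain for loop over a range, accumulating into a local var.
  def sumUpTo(n: Int): Int = {
    var total = 0
    for (i <- 1 to n) {
      total = total + i
    }
    total
  }
}
```

A java programmer can probably read this without any training, which is the point of the exercise.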

Probably 2 (3?) defined full levels below “unlimited”, with ability to add individual features, maybe in something like a reverse .gitignore file. Maybe you have to put yourself in the shoes of someone who has zero interest in functional programming, and has to deal with a scala language enthusiast, but wants to keep things very simple and almost not learn a new language, with no complexity creep in the project. This would make it easier for the scala enthusiast to sell the idea of using scala. (And room to grow: the ability to use scala for more kinds of peers, therefore more projects.)


#6

I think the problem in real-life projects arrives once you use external libraries, e.g. for simple tasks like I/O and database persistence. Understanding even a trivial Slick example requires knowledge of advanced Scala syntax and features, and beginners have a hard time even reading the code. Not to speak of stuff like Cats and Scalaz, both of which are more than bizarre (and totally useless) for any moderate developer coming from the Java world.

Anyway, I‘m afraid it‘s already too late and Scala has found its niche but will not become the better Java. For that it‘s far too difficult to learn.


#7

Thanks for the comments.

My idea of levels is intended exactly to fix that kind of complexity, so scala can be used with more kinds of peers, thus more kinds of projects. The learning and library materials could specify which level they are using.

Otherwise one has to learn and use multiple languages for multiple kinds of peers and projects: and that sounds good in a way (it’s hard to replace shell scripts with scala, for example), but for me (and I think others), frankly, it is not realistic to try to stay current in strong knowledge of best practices, idioms, quirks, gotchas, libraries, etc etc, across multiple languages with nearly the level of productivity as could be done if I could use one language for simple procedural/OOP tasks (collaborating on it with developers who will only learn that far), and the same language for more complex stuff, with hope of learning in-place to get to the really advanced concepts without learning a brand new language over again (and all its best practices etc).

I have been bit hard by the lack of this, more than once. I want to be a programmer without devoting my life to staying up on all those environments. Scala seems like a good candidate to do it, and it seems like it might be at low cost.


#8

I’m not sure this is possible at all using just compiler switches. Have you seen the signatures of the Scala collection methods (the CanBuildFrom stuff)? No matter how “pared down” the language is, trying to grok the type signature of List.filter, for example, coming from Java or most other languages is bracing to say the least. I recall when I was learning Scala (coming from Java) it was the collection APIs that were, simultaneously, the first off-putting feature of Scala, as well as the best entrée/motivation for learning functional programming and type gymnastics.

I don’t see the purpose of the language without these features, other than as a slightly-better Java. And a slightly-better java is not worth the re-tooling and learning all the new standard APIs.

That was my experience. I’m not sure a Scala-with-training-wheels approach would have really made the learning curve much better in the end for me.

Brian Maso


#9

Thanks. Your and the other comments have been interesting, but yours comes closest to talking me out of it, so far.

You were motivated and able to learn it. Some would not be (and would not care about those details, to that level). I would like to work with them and with you, with a lower total burden to all of us. I wonder what the best way is, if there is one, that doesn’t mean everyone learning 3x as many languages. (I have used many, and want to simplify now, grow, and also work with others well.)

Maybe a sacrifice to make it work would be assuming that some people, while they are at level 1 or 2, will be willing to use sample code and not always worry about fully understanding library signature details. Maybe that will motivate some, be tolerable for others, and … kumbaya-ness for the rest who just want some sample code to mimic, for now. :wink: But still one language (or fewer total!).


#10

“Are there compiler flags that prevent using map (and such unfamiliar-to-many functional idioms), case classes, match, type features, traits (maybe), and basically everything else complex that a java5 programmer couldn’t learn (or be able to guess) after about 1 hour of learning?”

If you think that case classes are too complicated for beginners, then I think you have missed the point. Case classes simplify code for everyone, including beginners. In fact, I contend that the vast majority of the classes that you define and instantiate yourself should be immutable case classes. If you are not using case classes, you are missing out on some of the main benefits of using Scala. The “copy” method alone is indispensable.
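A small sketch of what is meant here, with an invented User class: the case class gets equality, a readable toString, and copy for free, and copy returns a changed version while leaving the original untouched.

```scala
object CaseClassDemo {
  // One line replaces the fields, constructor, equals, hashCode
  // and toString that a comparable Java class would spell out.
  final case class User(name: String, email: String, active: Boolean = true)

  val original: User = User("Ada", "ada@example.org")

  // copy builds a new instance with only the named field changed.
  val deactivated: User = original.copy(active = false)
}
```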


#11

Yes, there would be more thought than what I expressed, in choosing the features of the levels. But the main question is whether it is possible to have the benefits of defined levels. I don’t know how else to solve the tradeoff between high cost to learn & become proficient with the best practices & gotchas of each new language, vs. lack of headroom after you do.

Just as I don’t have to learn assembly, to use a computer, or go to med. school to perform first aid, but I could learn a little something now, and then build on those, not starting from scratch, if I wanted to learn more. I think there are benefits in some consolidation and simplification of the many languages, and scala is the best candidate I’ve thought of so far (scheme, maybe? different issues.)

Some benefits of defining some kind of levels that I see are:

  • easier entry for beginners who might not become advanced, ever (but still have the option, more easily than switching languages).
  • less boilerplate code than java, immutables, etc. even for beginners.
  • easier to persuade managers to allow scala, because you can easily train java-level programmers (1 hour is the goal for level 1) to work on some existing projects which belong at that level (scala is no longer always “too hard”), and they can move up better when ready, to the advanced library/concurrent/whatever work.
  • thus transitions between projects are clearer and easier with different types (programming levels) of peers and different kinds of projects (vs. having to fully switch languages as often)
  • more support between teams, because there are fewer different languages
  • summary: less likely to be told “scala is too hard”, or “this language can’t do that, you are stuck with boilerplate and no advanced paradigms”, or “we are a ___ shop so that limits our options and hiring potential”: you get both goods, with lower cost overall.

As a possible arrangement, just for illustration:

  1. A “basic” or “level 1” that is minimally different from what they are used to (by some useful definition of minimal), and which they could use permanently as a java replacement for the benefits listed above (minimum benefit: less boilerplate code, and the opportunity to go to level 2 with less work than if they came straight from java);
  2. a “level 2” for those comfortable with level 1 and want to move to more, which adds some of the key scala goodness, like maybe case classes are here or in level 1, but we can all think of things we would put here, that would have to be decided, to optimize tradeoffs between usability and power, without burdening intermediate programmers with full complexity;
  3. “level 3”, if it exists, has stuff that is still more complicated, but omits some of the hardest-to-read stuff, or maybe this is the unlimited level. Maybe level 3 is not needed because it is, or level 2 is, a set of areas (functional idioms, type magic, etc etc) that can each be included orthogonally.
  4. (or 3 is) unlimited.
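Purely as an illustration of the arrangement above, the levels could be thought of as sets of allowed features, with a project picking a level and optionally allowing a few individual extras. Everything in this sketch is hypothetical: the feature names, the contents of each level, and the ProjectConfig type are assumptions, and no such check exists in the compiler today.

```scala
object LevelsSketch {
  // Hypothetical feature names; the real list would have to be decided.
  sealed trait Feature
  case object ForLoops extends Feature
  case object ValVar extends Feature
  case object CaseClasses extends Feature
  case object PatternMatching extends Feature
  case object Implicits extends Feature

  // Each level is just a set of allowed features (contents are a guess).
  val level1: Set[Feature] = Set(ForLoops, ValVar)
  val level2: Set[Feature] = level1 + CaseClasses + PatternMatching

  // A project picks a level and may allow individual extra features.
  final case class ProjectConfig(level: Set[Feature],
                                 extraAllowed: Set[Feature] = Set.empty) {
    def allows(f: Feature): Boolean =
      level.contains(f) || extraAllowed.contains(f)
  }

  // Features used by some code that the project config does not permit.
  def violations(config: ProjectConfig, used: Set[Feature]): Set[Feature] =
    used.filterNot(config.allows)
}
```

A project at level 1 that also allows case classes would then be ProjectConfig(level1, Set(CaseClasses)), and any use of implicits would show up as a violation.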

Details aside, I hope to persuade that the concept has merit. I am not saying “one language for everything”, but that we can swing back from the extreme of a new language every time you want to be a different kind & level of programmer. “Scala can be for beginners too” seems like it would help solve problems I have had at work, and in a personal project, and it might remove some arguments for kotlin or corporate java, and be easier to agree on for some FLOSS projects.

I think it could work, but I’m not a compiler writer.


#12

(NB: I do Scala training, among other things, professionally.)

Some thoughts…

Note that this idea has been around for a long time – I and others were pushing for it something like 8-9 years ago, and Martin responded with what have become informally known as “the Odersky levels”. They’re worth reading, since they are very much to your point.

That definition is pretty old and partly obsolete; I’ve been toying with coming up with a strawman new definition for us to debate, but haven’t had time yet.

Little of it is compiler-enforced yet. I suspect that it would be somewhat challenging and hacky to build in the old compiler, but it seems like it might be more practical in the new TASTy-based world. Or possibly through Scalafix? It feels like you could probably build at least some of this simply as Scalafix rules. That’s a bit post-facto, but if you plug it into your sbt build, it seems like it would be almost like being compiler-enforced.

Keep in mind that there is no single sensible definition of “introductory Scala”. We thought that made sense in the old days, when we were strictly focused on people coming in from Java. But I’ve found that folks coming from Java often have very different hurdles from those coming in from JavaScript or Python – they actually call for different “introductory” subsets.

Also, many of the really advanced features are already slightly “hidden”, in that they are only enabled via explicit language imports. (Whether that is helpful or not is a topic of some debate.)

All that said, it’s worth noting that, yes, the amount folks can grasp in the first hour is pretty limited. But I’ve found that, with three days of training, we can usually teach a large swathe of the main concepts to most experienced programmers. So IMO the issue is real, but is often overestimated – it can be a barrier to casual adoption, but less so for companies that want to train people up.

And carrying this too far is counter-productive, IMO: teaching people pure Java-in-Scala (which is what you tend to get if you remove everything like case classes and pattern matching) winds up inculcating bad habits that can be really painful to unlearn later. Idiomatic Scala is different, and I’ve had much more success teaching these basic tools and why they are cool and helpful, as quickly as possible, rather than coddling folks too much and pretending it doesn’t exist. Most programmers, in my experience, are happy to learn a new concept if you can show, really concretely, why it will make their life easier. So in general, my first hour is similar to what you’re describing, but by the end of the first day it’s getting rather different, and folks are starting to have more fun…


#13

Thanks. If it is not compiler-enforced, there will be inevitable bleed or creeping complexity, and the problems come back or benefits are lost, I think.

Is TASTy-based something dotty-related?


#14

I hope my last longer comment (shown to me on this page at the moment as 11/13) could help reignite the discussion, or suggest a viable alternative to meet the same goals, perhaps. Thanks again.


#15

ps: part of the point of level 1 is to possibly have almost no training cost, but it helps you move forward, or leaves the door less closed, for other levels. I.e., one web page being enough for most maintainers, and the scala advocates can do the conversions and that minimal amount of training themselves, and those who want to move on can self-train in most cases. I.e., no “big-bang” disruption to the org, hopefully. I think that helps attain some of the benefits I listed. A big argument against scala in one workplace was … training cost.

[EDITed to say “almost no” not “no” training cost, & some clarifying.]


#16

I’m a bit unclear on the concept. As an individual, you can always restrict yourself to the features you are comfortable with and need no compiler to tell you. So is this meant to apply to entire projects to prevent one engineer from writing something that their teammate might find challenging? Then what happens over time, when engineers get more comfortable - are they assigned to projects with higher levels? Or is the entire project advanced to a higher level? Either sounds like an organizational challenge to me.

Typically, teams are mixed in terms of experience, some learn from others, and projects have different parts of varying difficulty, and engineers pick parts according to their experience.


#17

The team can choose a higher level, but this makes it explicit so nobody slips something in that nobody wants. I’ve done a lot of maintenance of unclear code, and if you are introducing a language like scala it is complex. That way management can know what to train to, and team members can know what to expect. I also suggest considering allowing individual language features to be added [EDIT: mixing levels slightly where needed via a simple configuration, or maybe annotations], but again explicitly.

(From what I’ve read & experienced, maintenance can be the most costly part of software ownership.)


#18

I used to take the Scala-is-complex argument as somewhat valid. That was before ES6 was a thing. Now somehow the whole javascript ecosystem loves as many features as they can ship. Have you seen pushback against adding the |> operator to javascript? Then think about typescript – have you seen the Advanced Types section? Far more powerful and cryptic than Scala in many ways. Have you heard a drop of negative talk about it? Personally I’ve only heard talk of how wonderful and great it is. Meanwhile C# keeps coming out with new versions that add more and more new features, often very ad-hoc versions of things we’ve had in Scala, like nested methods, and something that resembles type classes. Does anyone complain? All I hear from the C# corner is people putting in effort to master whatever is needed.

Well, that was when Go was all the hype. Now the common sentiment is that it’s too simple and therefore painful.

Meanwhile, Kotlin, which is Scala with a few uninteresting changes, some features left out, and some features left out until they add them, has become beyond ubiquitous. And that’s besides the popularity of Rust and Swift, also very comparable to Scala in terms of cognitive burden and language complexity. All of this happened since then.

That’s why my current theory is that it’s mainly a PR issue. That includes documentation, although it has gotten a lot better.

What I mean by that is that in my opinion, a huge amount of what leads a person to perceive something as complex, is the cues they pick up. If the vibes you pick up are that it’s complex, then every little thing that seems surprising is going to create a lot of confirmation bias. And if you don’t have documentation written so that the words practically jump off the page, every second you can’t find the answer to something odd is proving the point to you.

On the other hand these other languages first of all have the messaging and conversation done very well. For instance everyone knows that typescript is solving the huge problem in the javascript ecosystem for people that want types. Or that Kotlin makes Android programming much less painful. Etc etc. That’s the first thing people associate with them, not the particulars of syntax.

Secondly, in many cases the documentation makes it feel like (1) we’re holding your hand firmly, we’re going to make everything within your grasp, and (2) the navigation is very clear so that you can see that the amount of content is very finite. There’s this and that and that to go through, but that’s it. Typescript docs are a good example of this.

There are probably a lot of things like this. For example, to my mind iptables is this arcane, weird, complex system. I mean I’ve worked with it a drop but typically as part of some unrelated tutorial that required doing something with it and didn’t really explain what all the terms meant. I’m sure that given the right resource, that perception could change in about 20 minutes to something more along the lines of “wow, this is a really simple and elegant way of solving an important problem, and yes naming is hard.”

So in short, yes Scala will take effort to master but the same is true of any modern language. However the real problem to my mind is perception, and we need to do a better job of fixing it somehow.


#19

It’s not about how many advanced features a language has, but how quickly you run into these features. No one complains about a complicated feature if you can ignore it until later.

For example, SQL.

In most languages, basic use is this: you create a connection, submit a piece of SQL, get back an iterator of results, and each result is represented as a map with column names as key. Super easy! Show me a new language and I can do it almost immediately.

In Scala, what’s the preferred way to do SQL? Oh, we want to be typesafe, so let’s first define a table. And let’s not just write SQL as a string literal, no, use some super-fancy interpolation. And import tons of implicits to make it work.

For example, send a very simple HTTP GET.

In most languages, we create a client (e.g. new HttpClient()), give it a URL, and get back, say, a byte stream. Very easy.
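For comparison, that plain style can also be written in Scala. This sketch uses the JDK's own java.net.http client (Java 11 or later) rather than a Scala library; the object and method names (PlainHttpGet, buildGet, fetchBytes) are made up for the example.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

object PlainHttpGet {
  // Build a GET request for the given URL (no network access yet).
  def buildGet(url: String): HttpRequest =
    HttpRequest.newBuilder(URI.create(url)).GET().build()

  // Create a client, send the request, and return the body as bytes.
  // This blocks until the response arrives and throws on I/O errors.
  def fetchBytes(url: String): Array[Byte] = {
    val client = HttpClient.newHttpClient()
    val response = client.send(buildGet(url), HttpResponse.BodyHandlers.ofByteArray())
    response.body()
  }
}
```

Nothing here needs implicits or effect types, at the cost of blocking calls and exceptions for errors.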

In Scala, what should we use? How about http4s? They say it is “minimalist”, so that should be super easy, right? So let’s create a client. In the docs, they say they are “forced to use”

Http1ClientIO.unsafeRunSync

but they say in production, we should use

Http1Client.stream[F[_]: Effect]: Stream[F, Http1Client]

Oh dear, what should I do? Should I do something unsafe? Is my VM going to blow up? Or should I use the stream variant? But why, I don’t want a whole stream of clients, I just want a single one! Is there no safe way to create a single client? And what on Earth is IO? Or F? Or Effect? Why do I need that?

(For the record, I think http4s is pretty decent, I’m just saying it’s pretty tough for beginners. For those not familiar with effect monads, by “unsafe” they just mean it potentially has side effects, because we are creating the client now and are not just making plans to potentially create one later when we decide to do so. But don’t ask me why I would want a stream of clients.)
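The difference between “making plans” and “doing it now” can be shown with a toy effect type. Task below is a hypothetical stand-in written for this example; it is not the http4s or cats-effect API, only the same idea in a few lines.

```scala
object DeferredEffectDemo {
  // A tiny, hypothetical effect type: it only describes a computation.
  final class Task[A] private (thunk: () => A) {
    def map[B](f: A => B): Task[B] = Task(f(thunk()))

    // "unsafe" because calling it actually performs the side effects.
    def unsafeRun(): A = thunk()
  }

  object Task {
    // The body is passed by name, so it is not evaluated here.
    def apply[A](body: => A): Task[A] = new Task(() => body)
  }

  // Returns (clients created before running, client name, clients created after).
  def demo(): (Int, String, Int) = {
    var created = 0
    val plan: Task[String] = Task { created += 1; s"client-$created" }
    val before = created          // nothing has run yet
    val client = plan.unsafeRun() // the side effect happens here
    (before, client, created)
  }
}
```

Building the plan changes nothing; only unsafeRun creates the “client”, which is why the docs attach the word unsafe to the call that runs things immediately.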


#20

I saw maintainability as a huge concrete issue in my workplace, and different kinds of programmers who needed to be able to work with each other in a predictable way.

I also saw scala complexity as the reason some developers wanted it and some did not: and the “nots” won.

With the changes I suggest (which seem like they could be easy/simple at some level), scala would be that much better than those complex examples you cite, where people use things even though they are overly complex for some purposes. I think the benefits I listed earlier (given the compiler switches) would make scala that much better than all those languages, because it would be far more manageable for teams and for moving across teams, easy to start, everything.

From the business perspective, one also wants to avoid having systems that only one person knows how to maintain (i.e., being a reliable team, vs. a cowboy).