Converting shell scripts to Scala: how to do it?

I see that Alvin Alexander recently updated some of his blog pages. In particular:
How to use Scala as a scripting language

I have a set of spaghetti shell scripts which I use to grade my students' Scala homework exercises. The shell scripts copy around student submission files, together with other required files that were provided in the lecture, and also copy the test suite files. The scripts build a directory structure in /tmp with a build.sbt file and launch sbt in batch mode to compile and run the tests. The scripts then look at the exit status and log files to decide whether the student code was correct.
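To make the staging step above concrete, here is a minimal sketch of how it might look using only the JDK's java.nio.file API. The object name StageSubmission, the method stage, and the src/main/scala and src/test/scala layout are assumptions for illustration, not the actual grading scripts.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, StandardCopyOption}

object StageSubmission {
  // Hypothetical helper: builds a throwaway sbt project containing one
  // student file and one test suite file, and returns the project directory.
  def stage(submission: Path, testSuite: Path, scalaVersion: String): Path = {
    val workDir = Files.createTempDirectory("grading-")
    val mainDir = Files.createDirectories(workDir.resolve("src/main/scala"))
    val testDir = Files.createDirectories(workDir.resolve("src/test/scala"))

    // Paths are passed around as objects, so spaces in names need no quoting.
    Files.copy(submission, mainDir.resolve(submission.getFileName),
      StandardCopyOption.REPLACE_EXISTING)
    Files.copy(testSuite, testDir.resolve(testSuite.getFileName),
      StandardCopyOption.REPLACE_EXISTING)

    // Minimal build definition; a real one would also list test dependencies.
    val buildSbt = "scalaVersion := \"" + scalaVersion + "\"\n"
    Files.write(workDir.resolve("build.sbt"), buildSbt.getBytes(StandardCharsets.UTF_8))
    workDir
  }
}
```

The returned directory could then be used as the working directory when launching sbt as a subprocess.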

I’d love to convert these scripts to a coherent program, because the scripts are plagued with problems such as spaces in file names and special characters (my students often have names including French special characters: accents, circumflexes, etc.). There is also a huge amount of duplication in the scripts, which would be much easier to eliminate in Scala than in a shell scripting language.

Unfortunately, I don’t really know how to develop Scala code which is intended to be run from the UNIX command line. I only know how to develop code in IntelliJ, which involves the HUGE boilerplate Maven directory structure, and run it by pressing the little green rectangles provided in IntelliJ.

When I say boilerplate: I have 106 Scala files which I have created for my project, but IntelliJ has created 830 directories and 4139 additional files in the project.

QUESTION: To develop the Scala shell-script replacement, should I abandon IntelliJ and use Emacs to develop the code, debugging it by just running it from the shell and inserting println calls to figure out what’s happening? Or is there a better way?

Take a look at Ammonite, perhaps? There seems to be IntelliJ IDEA support for Ammonite as well. I wouldn’t worry too much about temporary cruft the IDE is generating under the hood, as long as it supports the coding process well and the result can be run from the command line without any ceremony.

(I haven’t really used the scripting/shell parts, though. I’ve only been using Ammonite-REPL for embedded custom REPLs in full-fledged projects and can wholeheartedly recommend it for this purpose.)


A little scary: the Ammonite support in IntelliJ seems to be all or nothing. It seems I have to agree that all worksheets are Ammonite, or none at all.

This is per project, no? Committing to a single worksheet flavor within a project doesn’t seem that constraining to me, but I haven’t used worksheets at all so far.

In my project I use worksheets a lot.

I see. Can’t the scripts be factored out into their own sbt/IntelliJ project, then? (Could probably even live in a nested folder inside the main project, if need be.)

Yes, something like that. I was wondering whether I could use a module for this. I don’t know how to use modules. BTW, I posted another question to this forum to try to understand modules.

I’d assume that IntelliJ worksheet flavor configuration is project scope rather than module scope, so that probably won’t help.

Ammonite is more featureful, but note that sbt also supports scripts with dependencies. (You can set any sbt setting; setting libraryDependencies is just a special case of that.) Here’s the beginning of one of my scripts:

#!/usr/bin/env sbt -Dsbt.version=1.3.12 -Dsbt.main.class=sbt.ScriptMain -Dsbt.supershell=false -error

scalaVersion := "2.13.2"
onLoadMessage := ""
scalacOptions ++= ...
libraryDependencies ++= Seq(...)

and then it goes into the Scala code from there.


You might also want to have a look at os-lib, which really simplifies IO / process handling in a “pythonesque” way.

Otherwise, I cannot (fully) recommend using Scala instead of shell scripts for simple tasks like creating directories, copying files and running some programs. That’s a lot easier to do with Bash, and I also do not see a large benefit in having a type-safe language: it’s all Strings anyway, so it does not buy you much. Using a reasonably recent Bash 4 and adhering to some best practices (basically starting the script with set -eEuCo pipefail and proper quoting) is a lot less work, and usually much faster too.


I can’t say that I agree.

The reason for using Scala is that I did indeed first implement this as shell scripts, but it turned out to be a huge set of spaghetti scripts with lots of repetition, lots of dead code paths, and lots of errors. I wanted to rewrite it as a single program so I could understand it better and debug it. The original scripts are/were plagued with problems of spaces in file names and Unicode characters. These problems go away when using a programming language as opposed to a scripting language.

The problem is that some things are indeed easier to do in the shell, e.g., dealing with environment variables and the exit status of subprocesses. These cases are frustrating, but the conversion to Scala is a big win overall, despite these few difficulties.
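For the two pain points mentioned here, a minimal sketch with the standard scala.sys.process package might look like the following. The object name RunWithEnv and the method run are hypothetical; Process, ProcessLogger and the ! method are part of the standard library.

```scala
import java.io.File
import scala.collection.mutable.ListBuffer
import scala.sys.process._

object RunWithEnv {
  // Hypothetical helper: runs `cmd` in directory `cwd` with extra environment
  // variables and returns (exit status, stdout lines, stderr lines).
  def run(cmd: Seq[String], cwd: File, env: (String, String)*): (Int, List[String], List[String]) = {
    // Separate buffers, because stdout and stderr are read on different threads.
    val out = ListBuffer.empty[String]
    val err = ListBuffer.empty[String]
    val logger = ProcessLogger(
      (line: String) => { out += line; () },
      (line: String) => { err += line; () }
    )
    // `!` blocks until the process ends and returns its exit status
    // instead of throwing on a non-zero value.
    val exit = Process(cmd, cwd, env: _*).!(logger)
    (exit, out.toList, err.toList)
  }
}
```

For example, on a UNIX system, RunWithEnv.run(Seq("sh", "-c", "echo $GRADER; exit 3"), new File("."), "GRADER" -> "hello world") should yield exit status 3 with the single stdout line "hello world".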


I’ve basically switched to Ammonite for all of my scripting (admittedly a modest amount), and find it a godsend – it makes so much more sense to me than shell scripts do, that I find it a far easier way to deal. (I have a long-overdue blog article on this subject, that I really need to get out the door.)

I will admit, though, that I gave up on trying to do it in IntelliJ a while ago. Things may have improved since then, but the support when I tried before was just too weak to be worthwhile. My scripts so far are mostly simple enough (< 200 lines) that I’ve wound up finding it sufficient to just work in a text editor, and let the Ammonite compiler tell me if I do anything wrong.


My script in Scala is 585 lines and replaces 1881 lines of shell script.
Plus it’s blazingly faster.
So it’s a win, even with the limitations of the tools.

This is just not true. If you just say

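import scala.sys.process._

val file = "file name with spaces.txt"
s"ls $file".!!  // the command string is split on whitespace, so ls sees four separate arguments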

it’s the same problem you have when doing this stuff in a shell script. You have to use proper quoting or, in this case, the Seq(...) program builder method.
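To illustrate the difference between the two forms, here is a small sketch that compares the exit status of ls for the same path. The object and method names are hypothetical; it assumes a UNIX system with ls on the PATH.

```scala
import scala.sys.process._

object QuotingDemo {
  // Discards all output so only the exit status matters.
  private val silent = ProcessLogger((_: String) => (), (_: String) => ())

  // The path is one element of the Seq, so it reaches ls as a single argument.
  def lsWithSeq(path: String): Int = Seq("ls", path).!(silent)

  // The path is spliced into one command string, which is then split on
  // whitespace, so a path containing spaces becomes several arguments.
  def lsWithString(path: String): Int = s"ls $path".!(silent)
}
```

For an existing file whose name contains spaces, lsWithSeq should return 0, while lsWithString should return a non-zero status because ls is asked for pieces of the name that do not exist.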

It’s the same with Unicode characters. If you improperly configure your locale, it will break.

This really doesn’t have anything to do with the choice of language, but rather with how you use it. You can even debug shell scripts (at least for bash), if you want.

I never use that syntax. I always use Seq("ls","file").! which avoids the string interpolation completely. That’s one of the advantages of the programming language as opposed to a scripting language.

Where is the advantage? In a shell script you have to quote, in Scala you have to use Seq(...) (or quote properly yourself). There’s nothing stopping you from doing the wrong thing in either language.

If you would always quote variable expansions in your shell scripts, I claim that you would not have had any problems with spaces in filenames. But, you have to do it (like always using Seq(...) in Scala) – and there’s shellcheck to enforce this.

Since this is a bit off-topic already, I just want to say that you should use the right tool for the job. That can mean coding something in Scala or in a shell script, and for various reasons one could be better than the other, be it that you need to implement some logic that would be hard to realize with a shell script, or that you just don’t have enough experience in writing shell scripts and feel more comfortable in Scala (yay!)

In a programming language like Scala, expressions are composable. In the shell, things are not. For example you can use single quotes within backquotes within double quotes within backquotes within double quotes.

I recall one problem which I never solved in the shell. I had a variable representing a file name (a file which I didn’t create but was given to me by the system). The file name encoded the student’s name, which I had to extract. Sometimes the student names had French characters like è, é, ô, etc., and sometimes the student name had double spaces. If I had a variable named path containing a path with a double space, and I tried to break it into head and tail using $path:h and $path:t, the shell would replace the double space with a single space, and sometimes it would change the encoding of the special characters. So when I’d grep in the files for the student names, it would not find them because the spacing or encoding had changed. This was a nightmare.
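As a rough Scala counterpart of the :h and :t modifiers, a sketch using java.nio.file.Paths could look like this. The object PathParts and its method names are made up for illustration, and the behaviour described assumes a UNIX file system.

```scala
import java.nio.file.Paths

object PathParts {
  // Rough equivalent of csh's :h modifier: everything before the last component.
  def head(path: String): String =
    Option(Paths.get(path).getParent).map(_.toString).getOrElse("")

  // Rough equivalent of csh's :t modifier: the last path component.
  // The string is never word-split, so runs of spaces are kept as they are.
  def tail(path: String): String =
    Option(Paths.get(path).getFileName).map(_.toString).getOrElse("")
}
```

With this, PathParts.tail("/abc/foo bar   bza") gives "foo bar   bza" with all three spaces intact, and PathParts.head of the same path gives "/abc".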

This problem completely went away using Seq.

There’s nothing stopping you from doing the wrong thing

No, there’s nothing stopping you from writing bad code. But with the shell you simply don’t have the option of treating file names and strings as objects without the shell manipulating them.

Another reason I didn’t mention above for fear of being ridiculed is that my shell scripts were all using csh, not bash. I suspect (but never verified) that my encoding problems were really due to csh bugs. When I considered converting them to bash I decided it’s probably easier to convert them to Scala, especially since I don’t know bash, and I’d have to learn it to do the conversion.

Probably a prime example of improper quoting:

$ set f="/abc/foo bar   bza"
$ echo $f:t
foo bar bza
$ echo "$f:t"
foo bar   bza

Yes, indeed it was improper quoting. But the problem goes away using Seq().! because there’s no quoting. Right?