Need guidance on testing

Russ · January 19, 2021, 8:28am

I work independently and have been doing my testing in an unconventional way for years. It has worked well for me, but it is now causing problems in migrating to Scala 3. (Those problems may go away once Scala 3 is officially released, but I don’t know that for sure.)

My “main” directory currently has 66 source files in it, but it actually has no “main” function in it. It is really a library of methods. Having a main function would make no sense.

My “test” directory has approximately 50 source files in it, and most of them have a main function. These are not just unit tests. Most of them generate plots that have to be inspected for correctness, which would be impossible to automate.

I also have a directory that I call “plot”, which has approximately 100 source files, most of which have a main function that also generates plots of various kinds. Some of these source files are just scripts less than a page long, but some of them are several pages long.

In addition, I have a directory that I call “sim”, which contains four source files that I use for fast-time simulations of various kinds. Each of those has a main function as well. They generate data and plots.

I have a bash script that I use to compile and run. I compile using sbt, and I run using the “scala” command with a main entry point as the argument. One problem with this setup is that the Scala version used to run can be different from the version used by sbt to compile. With Scala 3, I am also having problems getting the scala run command to find the specified main function, for some reason that is baffling me. The CLASSPATH structure seems to be different than it is for Scala 2.

I am thinking that my “main” directory should perhaps be a library since it has no main function. But what should I do with my test and other directories with many main functions? Should each of them be a separate sbt “project”? That doesn’t make sense to me, but am I missing something? Can a project have many main functions? Guidance needed.

curoli · January 19, 2021, 9:35am

Projects can have as many main methods as you like. In SBT, you can run one with “runMain full.name.of.MyClass”.

However, if you have multiple main methods, in some cases you may want to specify a default main method (or more precisely, specify the class it is in) in the build definition (e.g. build.sbt), for example, if you want to package the project and enable a default entry point, or if you want to be able to simply say “run” in SBT.
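As a rough illustration of the default entry point mentioned above, a build.sbt fragment might look like the following. This is a sketch in sbt 1.x slash syntax, and the class name trajspec.sim.TerminalSim is a made-up placeholder, not something taken from the actual project.

```scala
// build.sbt (sbt 1.x slash syntax)
// "trajspec.sim.TerminalSim" is a hypothetical class name; substitute your own.

// Entry point used when you simply type `run` in sbt
Compile / run / mainClass := Some("trajspec.sim.TerminalSim")

// Entry point recorded as Main-Class in the manifest of the packaged jar
Compile / packageBin / mainClass := Some("trajspec.sim.TerminalSim")
```

Any other main method in the project stays reachable through runMain with its fully qualified class name.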

The other problem is that in a normal setup:

For regular deployment, SBT will only compile the main branch and ignore the test branch
Only for (unit) testing, SBT will compile and make available both the main and the test branch

AFAIK, the problem is that tests are not normally entered via main methods.

In your case, I would do one of these:

(1) You may decide that your tests are not really unit tests, and therefore for SBT, they are just regular apps, not tests. In that case, just move them to the main branch and invoke them as you would invoke any regular app.

(2) You may decide that you do want unit tests. Pick a framework (e.g. ScalaTest) and write your unit tests in the test branch. Those unit tests can call main methods, if desired, and those main methods then can be anywhere - although, I don’t see a reason to have main methods unless as regular apps in the main branch.
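To make option (2) a little more concrete, here is a minimal sketch of a ScalaTest suite that calls an existing main method with fixed arguments. It assumes a ScalaTest 3.1 or later dependency in the Test configuration; TerminalSim and the argument "route1.dat" are hypothetical stand-ins for one of the existing drivers and its input.

```scala
import org.scalatest.funsuite.AnyFunSuite

// Hypothetical stand-in for an existing driver that has a main method.
object TerminalSim {
  def main(args: Array[String]): Unit =
    println(s"running with arguments: ${args.mkString(" ")}")
}

// Lives under src/test/scala; `sbt test` discovers and runs it.
class TerminalSimSpec extends AnyFunSuite {
  test("terminalSim driver completes without throwing") {
    // Arguments are fixed in the test rather than read from a command line.
    TerminalSim.main(Array("route1.dat"))
  }
}
```

A test like this only checks that the driver runs to completion; any plots it produces would still need to be inspected or compared separately.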

And a project is a library if and only if you use it as a dependency of another project.

Jasper-M · January 19, 2021, 9:53am

I would move your test source directory to the main source directory of a separate subproject that depends on your “main” subproject. If you insist on using bash scripts you could run something like sbt "show testProject/discoveredMainClasses", parse the output, then loop over those main classes with sbt "testProject/runMain $mainClass". Or you could write an sbt task that does all that, which avoids all the parsing and is more performant.
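A custom task along those lines could be sketched roughly as below. This assumes sbt 1.x; the task name runAllMains and the subproject name testProject are illustrative only, and the sketch runs every main class with no arguments.

```scala
// build.sbt (sbt 1.x). runAllMains and testProject are hypothetical names.
lazy val runAllMains = taskKey[Unit]("Run every discovered main class in turn")

lazy val testProject = (project in file("testProject"))
  .settings(
    runAllMains := {
      val mains = (Compile / discoveredMainClasses).value // main classes found by the compiler
      val cp    = (Compile / fullClasspath).value.files   // classpath including dependencies
      val r     = (Compile / run / runner).value          // same runner that `run` uses
      val log   = streams.value.log
      mains.sorted.foreach { mainClass =>
        log.info(s"running $mainClass")
        // No arguments are passed here; .get rethrows if the run failed.
        r.run(mainClass, cp, Seq.empty, log).get
      }
    }
  )
```

With this in place, sbt testProject/runAllMains would compile once and then run the drivers one after another, stopping at the first failure.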

Sbt is pretty intimidating but it’s actually really easy to work with a multi-project setup. And staying in the build tool as much as possible for all tasks related to the build will probably avoid lots of future headaches. It should already take care of things like classpaths for you.

hmf · January 19, 2021, 11:06am

I am curious, are you using a testing framework? If you are not, I would suggest doing so. It has several advantages:

no need to create and explicitly run main functions
you can run all tests - the system will find and execute them
you can execute a single or a set of tests via filters - frameworks provide these
you can run testQuick in SBT to rerun only the tests that failed previously, were not yet run, or are affected by recent changes (don’t forget to force all tests when those are ok)

I beg to differ on plots being impossible to check automatically. Some time ago I was working on a wrapper for Plotly (I tried to compile the JSON to objects/classes, but it was not consistent enough). In the tests I had to check whether my output was the same as the expected Plotly output. I generated a set of plots, checked them manually and saved them as the “gold” standard, or used existing output from Plotly’s site. In the tests I would generate the new plot and compare it with the “gold” standard. I would fail the test on large differences (I was using the original Plotly output). Generated images were always rewritten.
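The gold-standard comparison could be sketched like this, assuming the plotted data is available as a sequence of numbers before rendering. GoldCheck, maxAbsDiff, and the tolerance are hypothetical names and choices for illustration, not the code from the Plotly wrapper.

```scala
// Hypothetical helper: compares freshly generated plot data with a saved "gold" series.
object GoldCheck {
  /** Largest absolute pointwise difference; infinite if the lengths differ. */
  def maxAbsDiff(actual: Seq[Double], gold: Seq[Double]): Double =
    if (actual.length != gold.length) Double.PositiveInfinity
    else actual.zip(gold)
      .map { case (a, g) => math.abs(a - g) }
      .foldLeft(0.0)((m, d) => math.max(m, d))

  /** True when every point is within `tol` of the gold standard. */
  def matchesGold(actual: Seq[Double], gold: Seq[Double], tol: Double): Boolean =
    maxAbsDiff(actual, gold) <= tol
}
```

A test would then load the saved gold series, regenerate the data, and assert matchesGold with a tolerance loose enough to ignore harmless numerical noise.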

SBT does this for you (as does Mill and any other build system). But you have to set up the project according to the tool’s conventions (you can change these, but avoid that). I would suggest you set up the structure first. Separate the main classes into src/main and the tests into src/test. Then create a test or two. Run those. Once you have that going, keep adding your sources and tests.

It will take time and effort but I believe it will be well worth it. Especially when it comes to cross-compiling.

HTHs

Russ · January 19, 2021, 8:10pm

Thanks for the replies. I’ll have to think about it a bit, but I am leaning toward curoli’s idea of leaving my “main” testing methods in place and using ScalaTest as a sort of shell to invoke them. That would allow me to continue more or less the same approach, and it would also allow me to eventually add conventional unit testing if and when I decide to do so.

But there is one potential gotcha. I need to be able to pass command-line arguments to my main methods. Will I be able to do that with ScalaTest if I run one test at a time? I looked briefly at the docs, and that is not clear to me. Thanks.

sangamon · January 19, 2021, 9:53pm

To me it feels like the easiest path would be to go with @Jasper-M’s suggestion: Create a multiproject with subprojects core, test, plot, etc., each having a src/main/scala (and no src/test/scala) and make all other projects depend on core. This way you don’t have to change anything about your code and its invocation patterns, and this layout better reflects what these modules are (in the sbt context).

With the multiproject approach, you could seamlessly start with framework-based unit testing, just by adding a src/test/scala in the corresponding subproject(s) and creating/moving code there.
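A build.sbt for such a layout might look roughly like the sketch below (sbt 1.x). The val names and directory names are assumptions based on the directories described in this thread; the test-driver subproject is named testDrivers because a val called test would clash with sbt's built-in test task.

```scala
// build.sbt (sbt 1.x). Directory and val names are illustrative assumptions.

// The library code: sources live in core/src/main/scala
lazy val core = (project in file("core"))

// Each driver collection is its own subproject with sources in <dir>/src/main/scala
lazy val testDrivers = (project in file("test")).dependsOn(core)
lazy val plot        = (project in file("plot")).dependsOn(core)
lazy val sim         = (project in file("sim")).dependsOn(core)

// Root project: `sbt compile` at the top level compiles everything
lazy val root = (project in file("."))
  .aggregate(core, testDrivers, plot, sim)
```

A driver is then started with something like sbt "sim/runMain full.name.of.MainClass", and sbt takes care of putting core on the classpath.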

I am not aware of any official command line argument support in/for Scalatest (or the sbt test task), but there may be workarounds. Perhaps this could also be accomplished through a custom test runner; I never looked into this. Alternatively, you could read system properties passed to the sbt runner (or defined from within sbt by manipulating sys.props) instead.
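The system-property route could be sketched as follows. The property name trajspec.dataDir and the object RunConfig are hypothetical; the only library call involved is sys.props from the Scala standard library.

```scala
// Hypothetical helper that reads a setting from a JVM system property
// instead of from command-line arguments.
object RunConfig {
  // "trajspec.dataDir" is an illustrative property name, not an established one.
  def dataDir(default: String = "."): String =
    sys.props.getOrElse("trajspec.dataDir", default)
}
```

Starting sbt as sbt -Dtrajspec.dataDir=/path/to/data test sets the property on the sbt JVM, so tests that run inside that JVM can see it. Forked tests run in a separate JVM and would need the property passed on through javaOptions instead.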

However, all of this somewhat feels like working around/against the build system - personally I’d think that the time would be better spent integrating with it.

Russ · January 20, 2021, 12:30am

The sbt multi-project approach makes sense. One question. Does the sbt run command allow for command-line arguments to be passed to the underlying application to be run? If not, that is a serious problem for me. Thanks.

Russ · January 20, 2021, 1:36am

OK, I did a little test and verified that the sbt run command passes command-line arguments to the application. That’s good news.

Now I have another question. How do I run from a different directory than the project directory? I typically run from a directory that contains some kind of input data (air traffic route or trajectory data, for example), which can be completely unrelated to the project directory. But when I start sbt from that directory, it tries to create some new project based on the name of the current working directory.

sangamon · January 20, 2021, 1:41am

Yes, with runMain like that:

$ sbt 'subprojectName/runMain com.example.MainClass arg1 arg2'

sangamon · January 20, 2021, 1:48am

Seems to be possible in forking mode. (In the long run, I’d think about making the data directory another command line argument, anyway, though.)

Russ · January 20, 2021, 4:03am

I’m experimenting with sbt, and I must say that it is driving me nuts. It does everything but mow the lawn, yet the simplest operations seem to be impossible.

Suppose I want to change the working directory while in sbt. Is that possible? It doesn’t seem to be, although it would be incredibly useful.

Suppose I want to run in a data directory that has the input data and will have the output data. Is it possible to cd to that directory, then start sbt for the project, which might be in another directory? I thought that perhaps “sbt project” might do that. No such luck. Is there a way to do it?

Is there a way to just specify a build.sbt file from a different directory?

SethTisue · January 20, 2021, 6:11am

It’s a JVM limitation

Russ · January 20, 2021, 7:00am

I just remembered what sangamon said about changing directories in sbt with forking. There is hope! I think I can make it work after all.

https://www.scala-sbt.org/1.x/docs/Forking.html#Change+working+directory
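Based on that page, the relevant settings would look something like this minimal sketch. The path is a placeholder, and the working directory setting only takes effect when the run is forked into a separate JVM.

```scala
// build.sbt (sbt 1.x)

// Run the application in a separate JVM instead of inside sbt's own JVM
fork := true

// Working directory for the forked run; "/path/to/data" is a placeholder
run / baseDirectory := file("/path/to/data")
```

Without forking, the application shares sbt's JVM and therefore sbt's working directory, which is why the directory cannot be changed in that mode.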

BalmungSan · January 20, 2021, 1:42pm

The more I read about your problems, the more I am convinced that you simply shouldn’t be using a build tool to run a project; rather, you want to install a program on your system and use it.

A couple of ideas:

Use sbt-assembly to create a fat jar with all your dependencies that you can run using just java -jar followed by the jar path; you could use a bash script to copy that jar to a general location and another bash script to launch it.
Use sbt-native-packager to create an installer of your project and install it in your machine.
Use coursier to install it in your machine.

sangamon · January 20, 2021, 1:58pm

Agreed in principle. However, the status quo is that there’s dozens of test cases in the form of main classes. Building full-fledged assemblies in order to run tests feels awkward, and you’d need to add some custom scripting in order to selectively run testcases from the jar afterwards. Just getting to selectively run the test main classes from within a multiproject setup still sounds like a good (first!) step toward integrating with sbt concepts.

Russ · January 21, 2021, 2:54am

In sbt, the run / baseDirectory can be set in build.sbt as follows:

run / baseDirectory := file("path")

Is it possible to set this path from the command line? If not, it would be a very useful feature to add.

This question was also posted on Stack Overflow. Thanks.

cbley · January 21, 2021, 7:08am

You can change any setting in sbt’s console or from the command line using the set command:

$ sbt 'set run / baseDirectory := file("path")' package 'runMain ...'

Ichthyostega · January 28, 2021, 9:13pm

After reading this thread, I am under the impression that your notion or concept of “Test” is different from what is customary in contemporary software development. Please note, I am in no way implying that your approach is “wrong”. My suspicion is just that your understanding of “Test” might create some kind of “impedance mismatch”, which then leads to the impression that the tools are working against you or getting in the way.

The typical / common meaning of software testing is to set up some kind of formal specification, and then to verify your actual software automatically against that specification. What also is very common is the idea to decompose the tests. Most tests would then just serve to cover the basic building blocks in your software, and only a small number of typically rather contrived tests would be dedicated to verify a proper integration. Following this approach, in a system producing graphics (as “plots” in your case), or sound or video, or 3D models, you would rarely ever run that media production itself in your test suite. Rather you would cover the underlying functions producing that data (graphic, colour, sound, statistics) and verify the results of function invocation mathematically.

However, from your description I could imagine that what you are building is rather a collection or framework of probe / analysis / data “testing” functions or modules. These in turn are implemented on top of a library of basic functionality, which is in your “main” section. In most cases, you would then rather use these tools and apply them to some data you are working on. While, from time to time, you’d improve your code and then check everything is sane by applying your tooling to a known data set and just visually verify the results still do make sense.

So maybe what you are looking for is rather to build some kind of shell to launch your tools from. Maybe you’d want to use a clever bash script, which defines some shell functions, aliases and environment variables and then drops you off into interactive mode again. You would then cd to your data, and launch your analysis tools via this shell binding. The build tool would then just play the role of recompiling your sources and producing an invokable binary, which is automatically installed into a location from where it can be invoked by your working shell.

Russ · January 29, 2021, 8:53am

Thanks for the reply. Yes, my testing methods are unconventional. I work in a research environment on air traffic control automation concepts and methods. The code that I write is prototype research code that is not intended for operational use, but I try to make it as close as possible to operationally usable. My hope is that someday it will be used as a starting point for actual operational software.

For that to happen, it would need a lot more testing, of course. As a more or less independent developer, I don’t have time for extensive unit testing, but that is not to say that I don’t do much testing. It’s just a different kind of testing, mostly based on generating plots of trajectories (and many other things) and examining them. I am obsessive about tracking down every anomaly that I see in the plots.

My “main” directory is the software that would actually be deployed in the field, but as I said, it contains no actual “main” method. I use many different “main” methods in my test and other subdirectories to test it, to run fast-time traffic simulations, and visualize the results.

I use a bash script to orchestrate the running of test drivers. I am using sbt to compile, but I was using a locally installed version of scala to run. When those versions were different, I had problems. I usually run out of a test directory that contains input traffic data and stores the output data. I was baffled as to how to get sbt to change directories from the top-level project directory to the test directory that I need to run in. But sangamon pointed out that sbt can effectively change directories by “forking”, so I am now using that feature.

The bash script that I wrote follows in case anyone is interested. It is called “run” and it allows me to go to a test directory and type something like

run terminalSim

The default project directory is “~/trajspec”, but the run script allows me to specify another project directory (in a file called projectDir in the test directory) so I can use the same script on another project without changing it.

#!/usr/bin/env bash

set -o nounset # exit when reading any unset variable
set -o errexit # exit when any command returns an error
#set -x # for debugging

projectDir=~/trajspec # default project directory (modify as desired)

# a file named projectDir in the run directory overrides the default
if [ -f projectDir ]; then read -r projectDir < projectDir; fi
projectDir=$(eval echo "$projectDir") # expand ~ or .. if necessary

project=$(basename "$projectDir") # project name
runDir=$(pwd) # current working directory

echo -e "\nrunning $1 in project $project ($projectDir)\n"

cd "$projectDir"

# make the forked run use the directory this script was started from
com1="set run/baseDirectory := sbt.file(\"$runDir\")"
# $1 is the main object to run; any remaining arguments are passed through to it
com2="test:runMain $project.test.$1 ${*:2}"

sbt "$com1; ${com2}" | tee "$runDir/$1.out"

curoli · January 29, 2021, 9:31am

I think the main argument for following conventions is not that it is the path to production quality, but that it probably will make your life a lot easier in the long run, because most of your tools are optimized to be used according to conventions.