I’d like to put some autogenerated reference data (not input to any program, just example output) somewhere inside the project area. This will be useful for documentation purposes: when I tell someone which git repo to clone, I also want to tell him where in the cloned repo to find example data that was generated by the program. Normally the code generates it in /tmp.
What’s the correct place in the directory structure to put this non-input reference data?
Build tools only care about files they process and generate, and they only prescribe folder structure for those files - with sbt that’d be `target`. Everything else is up to you. I’m not aware of any authoritative convention for Java/Scala projects beyond the build.
If these files are generated once by you and checked into VCS, I’d put them in some `reference` folder (and refer to them in `README.md`). If the user is supposed to run the program themselves locally in order to generate the files, you could hardwire it to some `out` folder below the project root and exclude that in `.gitignore`, but I’d prefer to ask the user to provide a target directory outside the project root when invoking the program.
This assumes that your audience consists of developers for whom checking out from VCS is the natural way of accessing the program. If it’s users who are only supposed to run the program, this becomes an entirely different question of assembly/deployment, and the original project structure doesn’t really matter.
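For the second option above (letting the user pass a target directory when invoking the program), a minimal Scala sketch could look like the following. The object name `OutputDir`, the file name `example.csv`, and the placeholder rows are all hypothetical and only serve as illustration; falling back to a temporary directory when no argument is given is an assumption that mirrors the current /tmp behaviour.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

object OutputDir {
  // Resolve where generated files go: the first command-line argument if
  // given, otherwise a fresh temporary directory (assumed default).
  def resolve(args: Array[String]): Path = {
    val dir = args.headOption
      .map(a => Paths.get(a))
      .getOrElse(Files.createTempDirectory("example-output"))
    Files.createDirectories(dir) // does nothing if the directory already exists
    dir.toAbsolutePath
  }

  // Write rows as a comma-separated file into `dir` and return the file's path.
  def writeCsv(dir: Path, name: String, rows: Seq[Seq[String]]): Path = {
    val text = rows.map(_.mkString(",")).mkString("", "\n", "\n")
    Files.write(dir.resolve(name), text.getBytes(StandardCharsets.UTF_8))
  }

  def main(args: Array[String]): Unit = {
    val dir = resolve(args)
    // Placeholder rows, not real measurements.
    val file = writeCsv(dir, "example.csv", Seq(Seq("x", "y"), Seq("1", "2")))
    println(s"wrote $file")
  }
}
```

If this were the project's only main class, something like `sbt "run /some/dir/outside/the/repo"` should pass the directory through as the first argument.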
Thanks. No, the user should not regenerate the files in this special directory.
This is for a paper I’m submitting to the symposium on Trends in Functional Programming.
The files are there so that users who do not care to run the program (perhaps they are not Scala experts) can still see the raw data I used for the plots in my paper. I’m including the gnuplot files, the .csv files (containing parallel information), and the .png files, which are the output of gnuplot. If the user cares to, he can run the Scala program in the same git repo to regenerate the .csv and gnuplot files in his own environment (CPU, OS, etc.) and verify or falsify my results.
I really prefer keeping this sample data with the code, because I can change the two together as the program progresses while the revision history remains intact.