Thank you, Seth. This is a great tool, but for now I think it may have some bugs. For the same script, scala-cli takes about 2 minutes 7 seconds, while scala takes about 5.5 seconds:
$ time scala-cli -q scala-set.sc
(send,21835)
(message,19172)
(unsubscribe,16515)
(2021,15340)
(list,14130)
(mailing,13025)
(file,12323)
(mail,12052)
(jan,11759)
(email,10633)
(flink,10238)
(pm,9855)
(group,9496)
(code,9482)
(problem,9458)
(data,9291)
(received,8714)
(2022,8621)
(return,8505)
(2020,8409)
real 2m6.948s
user 2m7.423s
sys 0m0.425s
$ time scala scala-set.sc
(send,21835)
(message,19172)
(unsubscribe,16515)
(2021,15340)
(list,14130)
(mailing,13025)
(file,12323)
(mail,12052)
(jan,11759)
(email,10633)
(flink,10238)
(pm,9855)
(group,9496)
(code,9482)
(problem,9458)
(data,9291)
(received,8714)
(2022,8621)
(return,8505)
(2020,8409)
real 0m5.503s
user 0m7.662s
sys 0m0.234s
$ cat scala-set.sc
import scala.io.Source
val li = Source.fromFile("words.txt").getLines()
val set_sw = Source.fromFile("stopwords.txt").getLines().toSet
val hash = scala.collection.mutable.Map[String,Int]()
for (x <- li) {
  if ( ! set_sw.contains(x) ) {
    if (hash.contains(x)) hash(x) += 1 else hash(x) = 1
  }
}
val sorted = hash.toList.sortBy(-_._2)
sorted.take(20).foreach {println}
This is certainly not the first time I have run scala-cli.
The test data sets can be found here.
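To help narrow down where the time goes, one option might be to time the counting loop separately from startup and compilation. Below is a minimal sketch of that idea, not a diagnosis. The helpers `countWords` and `timed` are hypothetical names that are not in my original script, and a small in-memory sample stands in for words.txt and stopwords.txt so the snippet runs on its own.

```scala
import scala.collection.mutable

// Hypothetical helper (not in the original script):
// counts every word that is not in the stop-word set.
def countWords(words: Iterator[String], stopWords: Set[String]): mutable.Map[String, Int] = {
  val counts = mutable.Map[String, Int]()
  for (w <- words if !stopWords.contains(w)) {
    counts(w) = counts.getOrElse(w, 0) + 1
  }
  counts
}

// Hypothetical helper: evaluates a block and returns its result
// together with the elapsed wall-clock time in milliseconds.
def timed[A](body: => A): (A, Long) = {
  val start = System.nanoTime()
  val result = body
  (result, (System.nanoTime() - start) / 1000000L)
}

// Assumption: in-memory sample data standing in for words.txt and stopwords.txt.
val sampleWords = List("send", "the", "message", "send", "a", "list", "send", "message")
val sampleStopWords = Set("the", "a")

val (counts, elapsedMs) = timed(countWords(sampleWords.iterator, sampleStopWords))
val top = counts.toList.sortBy(-_._2).take(2)
top.foreach(println)
println(s"counting took $elapsedMs ms")
```

With the real files, the sample values would be replaced by the `Source.fromFile(...).getLines()` iterators from the script. Comparing the printed time under both launchers should show whether the difference is in the loop itself or elsewhere.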
Thanks.