Which of these two approaches is faster and more efficient for a word count, and why?

val file = Source.fromFile("/Users/xhsoft/document/scala.txt")
1:
file.getLines().flatMap(_.split("[.,?\\s]+")).toList.
groupBy(x => x).map(x => (x._1, x._2.length)).foreach(println)

2:
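import scala.collection.mutable

val res = new mutable.HashMap[String, Int]
// reduce is used here only to walk the words; the counting happens as a side effect on res
val tmp = file.getLines().flatMap(_.split("[.,?\\s]+")).reduce((a, b) => {
  if (res.get(a) == None) res(a) = 1 else res(a) += 1
  b
})
// reduce returns the last word, which has not been counted yet
if (res.get(tmp) == None) res(tmp) = 1 else res(tmp) += 1
res.foreach(println)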

You can do everything in one iteration and still be declarative like this:

import scala.collection.View
import scala.io.Source
import scala.util.Using

val result =
  Using(Source.fromFile("/Users/xhsoft/document/scala.txt")) { file =>
    // A lazy view over the lines, so nothing is materialized before grouping
    val lines =
      View.fromIteratorProvider(() => file.getLines())

    lines
      .flatMap(_.split("[.,?\\s]+"))
      .groupMapReduce(identity)(_ => 1)(_ + _)
  }

This is also safer because Using ensures the file is closed and captures any non-fatal exception, so result is a Try[Map[String, Int]] rather than a bare Map.
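
Since Using hands back a Try, the caller still has to unwrap it. Here is a minimal sketch of one way to do that. The countWords helper and the temporary sample file are made up for illustration and are not part of the original snippet; the empty-token filter is also an addition, since split can produce an empty first element when a line starts with a separator.

```scala
import java.nio.file.Files
import scala.collection.View
import scala.io.Source
import scala.util.{Failure, Success, Try, Using}

// Hypothetical helper (not from the thread): counts the words in the file at `path`.
def countWords(path: String): Try[Map[String, Int]] =
  Using(Source.fromFile(path)) { file =>
    View
      .fromIteratorProvider(() => file.getLines())
      .flatMap(_.split("[.,?\\s]+"))
      .filter(_.nonEmpty) // split can yield an empty leading token
      .groupMapReduce(identity)(_ => 1)(_ + _) // strict, so it runs before the file is closed
  }

// Write a small sample file so the sketch can run anywhere.
val sample = Files.createTempFile("wordcount", ".txt")
Files.write(sample, "a b a\nb a".getBytes("UTF-8"))

countWords(sample.toString) match {
  case Success(counts) => counts.foreach(println)
  case Failure(error)  => println(s"Could not count words: ${error.getMessage}")
}
```

A missing file ends up in the Failure branch instead of throwing, because Using also evaluates the resource expression inside the Try.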

@BalmungSan

It is better to use the "\\W+" regex rather than "[.,?\\s]+". Compare:

val line="my: My home"
scala> line.split("\\W+")
val res1: Array[String] = Array(my, My, home)
scala> line.split("[.,?\\s]+")
val res2: Array[String] = Array(my:, My, home)
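
To see how the choice of pattern changes the counts themselves, here is a small sketch. The sample lines and the countWith helper are invented for illustration; it only assumes the same groupMapReduce counting used earlier in the thread.

```scala
// Hypothetical sample input, not from the thread.
val lines = List("my: My home", "my home")

// Hypothetical helper: counts words after splitting each line on `pattern`.
def countWith(pattern: String): Map[String, Int] =
  lines
    .flatMap(_.split(pattern))
    .filter(_.nonEmpty) // drop empty tokens produced by leading separators
    .groupMapReduce(identity)(_ => 1)(_ + _)

val byPunctuation = countWith("[.,?\\s]+") // "my:" and "my" are counted separately
val byNonWord     = countWith("\\W+")      // the colon is stripped, so both count as "my"

println(byPunctuation)
println(byNonWord)
```

Neither pattern folds case, so "My" and "my" still count as different words in both results.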

I mean, probably; no idea. I didn't want to go into details about the regex; I'd rather just focus on the algorithm at hand.

Thanks very much.