How to count number of regular expression matches in a column?

Hi, if I have a dataframe like the one below:

[Screenshot of the dataframe]

and I want to count how many values in a column match an email regular expression. For simplicity, let's say the pattern I want to match is:

".*@.*"

I would like the return value for the email column to be 3, since the @ symbol is in 3 of the entries.

How would I do this in Spark with Scala?

In spark-shell:

val data = List(
    "[email protected]",
    "[email protected]",
    "null",
    "tom smith",
    "[email protected]"
)



val df = data.zipWithIndex.toDF("email", "id")
df.printSchema()
df.show()

// rlike matches anywhere in the string, so the pattern "@" alone would give the same count
df.filter($"email" rlike ".*@.*").count() // 3

see: https://stackoverflow.com/questions/33964957/filter-dataframe-with-regex-with-spark-in-scala
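If the count is needed as an aggregate (for example, to count several columns in a single pass), a conditional sum is one alternative to filter + count. The following is only a sketch under a few assumptions: it runs as a standalone program rather than in spark-shell, so it creates its own local SparkSession; the object name RegexMatchCount, the helper countMatches, and the sample addresses are all made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, lit, sum, when}

object RegexMatchCount {
  // Hypothetical helper: number of rows whose `column` value contains a match for `pattern`.
  // Rows where the column is null count as non-matching; an empty DataFrame yields 0.
  def countMatches(df: DataFrame, column: String, pattern: String): Long =
    df.agg(coalesce(sum(when(col(column).rlike(pattern), 1).otherwise(0)), lit(0L)))
      .first()
      .getLong(0)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("regex-match-count")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample values standing in for the email column.
    val df = Seq("a@example.com", "b@example.com", "null", "tom smith", "c@example.com")
      .zipWithIndex
      .toDF("email", "id")

    println(countMatches(df, "email", ".*@.*")) // 3 for the sample above

    spark.stop()
  }
}
```

Because the count is an ordinary aggregate expression, the same when/sum pair can be repeated for other columns inside one agg call instead of running a separate filter per column.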