Uneven behavior of split function on String

Can someone please help me understand what is happening with the split function on String?

"Word1XXWord2XXWord3".split("XX")
results in: res18: Array[String] = Array(Word1, Word2, Word3)

Whereas,

"Word1||Word2||Word3".split("||")
results in: res15: Array[String] = Array(W, o, r, d, 1, |, |, W, o, r, d, 2, |, |, W, o, r, d, 3)

I have tried the options below:

"Word1||Word2||Word3".split("""||""")
"Word1||Word2||Word3".split(raw"||")
"""Word1||Word2||Word3""".split("""||""")

But the results are the same.
Note: I am using the Spark shell console to run my code.

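split takes a regex, and in a regex | is the alternation operator. The pattern || therefore consists only of empty alternatives, so it matches the empty string at every position, which is why the input is split at every character.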

You should use .split(raw"\|\|")
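
For illustration, here is a minimal sketch of two ways to split on a literal || delimiter. The object name SplitDemo and its field names are made up for this example; Pattern.quote is an alternative not mentioned above that avoids manual escaping.

```scala
import java.util.regex.Pattern

// Hypothetical demo object; the names here are for illustration only.
object SplitDemo {
  val input = "Word1||Word2||Word3"

  // Escape each pipe so the regex matches two literal '|' characters.
  val escaped: Array[String] = input.split(raw"\|\|")

  // Pattern.quote turns the delimiter into a literal pattern, so no manual escaping is needed.
  val quoted: Array[String] = input.split(Pattern.quote("||"))

  def main(args: Array[String]): Unit = {
    println(escaped.mkString(", ")) // Word1, Word2, Word3
    println(quoted.mkString(", "))  // Word1, Word2, Word3
  }
}
```

Pattern.quote can be handy when the delimiter comes from a variable and may contain regex metacharacters.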


@BalmungSan: Thanks for the quick reply. It worked perfectly.