How to determine a character class

Everything about this is wrong.

  1. Never use eq for integers and characters. The compiler is normally too smart to let you, and eq uses reference equality, which is wrong for larger boxed integers and characters.

  2. One can get into trouble assuming that floating-point math gives exact values, but “don’t compare unless 0.0” isn’t even the right way to solve that: 1.5 == 1.5f gives the correct result, while 0.3 - (0.2 + 0.1) == 0.0 doesn’t, even though mathematically the difference is exactly 0.0.
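
To make the floating-point point concrete, here is a small sketch. The object name FloatCompare, the helper approxEq, and the tolerance 1e-9 are made up for illustration; the right tolerance depends on your data.

```scala
object FloatCompare {
  // 1.5 is exactly representable as both a Float and a Double,
  // so widening the Float to Double loses nothing.
  val exact: Boolean = 1.5 == 1.5f

  // 0.2 + 0.1 rounds to a Double slightly above 0.3,
  // so the difference is tiny but not zero.
  val diff: Double = 0.3 - (0.2 + 0.1)
  val isZero: Boolean = diff == 0.0

  // Hypothetical helper: treat two values as equal when they are within eps.
  def approxEq(a: Double, b: Double, eps: Double = 1e-9): Boolean =
    math.abs(a - b) <= eps
}
```

Here exact is true and isZero is false, while approxEq(0.3, 0.2 + 0.1) is true. Comparing against a tolerance is one common way to deal with rounding, not the only one.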

No, that’s not right either. == translates to scala.runtime.BoxesRunTime.equals(Object, Object), which first tries reference equality but then defers to code that properly unboxes numbers and compares their values. Using eq on boxed primitives can give you the wrong answer.
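
A minimal sketch of the difference, assuming a JVM with the default boxed-integer cache (normally -128 to 127). The object name BoxedEquality and the value 100000 are just for illustration.

```scala
object BoxedEquality {
  // Ascribing the type Any forces the Int to be boxed as java.lang.Integer.
  val a: Any = 100000
  val b: Any = 100000

  // == on boxed values compares the numbers themselves.
  val sameValue: Boolean = a == b

  // It also compares numerically across boxed numeric types.
  val crossType: Boolean = (1: Any) == (1L: Any)

  // eq compares object identity. Outside the small-value cache the two
  // boxes are normally distinct objects, so this is typically false.
  val sameReference: Boolean =
    a.asInstanceOf[AnyRef] eq b.asInstanceOf[AnyRef]
}
```

sameValue and crossType are true. sameReference depends on how the JVM caches boxed values, which is exactly why eq is the wrong tool here.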

Yes, the methods available on characters include some but not all of those:

c.isControl
c.isDigit
c.isHighSurrogate
c.isLetter
c.isLetterOrDigit
c.isLowSurrogate
c.isLower
c.isUpper
c.isWhitespace

You can find these (and other methods) in the API documentation for Char.
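
As a quick illustration of these methods, here is a sketch of a classifier. The object name CharChecks, the function describe, and the label strings are invented for the example; the order of the tests is a choice, since some characters (a tab, say) satisfy more than one predicate.

```scala
object CharChecks {
  // Returns a rough label for a character using the predicates listed above.
  def describe(c: Char): String =
    if (c.isDigit) "digit"
    else if (c.isUpper) "upper-case letter"
    else if (c.isLower) "lower-case letter"
    else if (c.isLetter) "other letter"
    else if (c.isWhitespace) "whitespace" // checked before isControl, so '\t' lands here
    else if (c.isControl) "control"
    else "other"
}
```

For example, CharChecks.describe('7') gives "digit" and CharChecks.describe('\t') gives "whitespace".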

Set lookups are generally really slow compared with direct comparisons or array indexing. You shouldn’t use them for efficiency, only for clarity, even if the set isn’t constructed each time but is stored in a companion object.

There are additional methods for testing classes of characters in java.lang.Character as @tyohDeveloper says. (That part is correct.)

In particular, the often-not-very-helpful Unicode classes can be inspected awkwardly by calling .getType on the character and comparing the result to the sea of Unicode constants. (Look up getType in java.lang.Character to know which constants might be returned.)
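
A sketch of what that looks like, using Character.getType and a handful of the constants from java.lang.Character. The object name UnicodeCategory, the function category, and the two-letter labels are my own choices; only five of the many categories are handled.

```scala
object UnicodeCategory {
  // Maps a few Unicode general categories to their conventional abbreviations.
  // The constants are bytes and getType returns an Int, hence the .toInt calls.
  def category(c: Char): String = {
    val t = Character.getType(c)
    if (t == Character.UPPERCASE_LETTER.toInt) "Lu"
    else if (t == Character.LOWERCASE_LETTER.toInt) "Ll"
    else if (t == Character.DECIMAL_DIGIT_NUMBER.toInt) "Nd"
    else if (t == Character.MATH_SYMBOL.toInt) "Sm"
    else if (t == Character.CURRENCY_SYMBOL.toInt) "Sc"
    else s"other ($t)"
  }
}
```

For example, UnicodeCategory.category('+') gives "Sm" and UnicodeCategory.category('$') gives "Sc".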

If you want things to be efficient, for short lists use chains of equality, and for long lists use precomputed bitsets and/or boolean arrays of length 65536.
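
Roughly, the two approaches might look like this. The names (CharMembership, isVowel, isOperator, and so on) and the particular character lists are invented for the example, and java.util.BitSet is just one possible bitset; I haven't benchmarked this exact code.

```scala
object CharMembership {
  // Short list: a plain chain of equality tests.
  def isVowel(c: Char): Boolean =
    c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u'

  // Longer list (hypothetical): the characters we want to recognise.
  private val operatorChars = "+-*/%<>=!&|^~?:"

  // Precomputed once: one Boolean per possible Char value.
  private val operatorTable: Array[Boolean] = {
    val table = new Array[Boolean](65536)
    operatorChars.foreach(c => table(c.toInt) = true)
    table
  }

  // Lookup is a single array index.
  def isOperator(c: Char): Boolean = operatorTable(c.toInt)

  // The same idea with a bitset, which uses one bit per Char instead of one byte.
  private val operatorBits: java.util.BitSet = {
    val bits = new java.util.BitSet(65536)
    operatorChars.foreach(c => bits.set(c.toInt))
    bits
  }

  def isOperatorBits(c: Char): Boolean = operatorBits.get(c.toInt)
}
```

Both tables are built once when the object is initialized, so each test afterwards is a constant-time lookup with no hashing or boxing.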
