Raw strings and interpolation

This works:

"""\("""

And so does this:

val x = 42
raw"\($x"

But this doesn’t:

s"""\($x"""

Scala 3.2.2 gives me an invalid escape error.

Is that the expected behavior? And if so, what (other) differences are there between """ and s""" and between s" and s""" ?

I mean, raw"""...""" and s"""...""" are very different – specifically in that raw does no escaping. So it seems reasonable that the invalid escape error doesn’t show up when you are using raw. Of the examples you give, s"""...""" is the only one that is doing both escaping and interpolation, so AFAIK it’s the only one where the compiler cares about that particular combination of characters – it makes sense that it’s where the error would appear.
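
For the original example, here is a small sketch of two ways to keep a literal backslash before the parenthesis while still interpolating x. The names withRaw and withS are only illustrative.

```scala
val x = 42

// raw does no escape processing, so \( stays as two characters.
val withRaw: String = raw"""\($x"""

// s processes escapes, so the backslash itself has to be escaped.
val withS: String = s"""\\($x"""

@main def showBackslashParen(): Unit =
  println(withRaw) // \(42
  println(withS)   // \(42
```

Both values are the same four characters: a backslash, an opening parenthesis, and 42.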

The parsing behavior is specified in this section.

There are ordinary strings with escapes: "a\tc" has a tab.

Triple-quoted strings preserve arbitrary content, including newlines, so no escaping. """a\tc""" has no tab.

Interpolators are free to return arbitrary values for their inputs. Of the standard interpolators, s and f choose to process escapes (and f also has other behavior); raw does no escaping.

Interpolated strings can be triple-quoted, for multilines and embedded quote chars, but otherwise the quote style does not alter behavior.
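
The rules above can be checked by comparing string lengths. This is only a sketch, and the value names are made up for the illustration.

```scala
val plain        = "a\tc"          // ordinary string: the compiler turns \t into a TAB
val tripleQuoted = """a\tc"""      // triple-quoted: backslash and t are kept as written
val sSingle      = s"a\tc"         // s processes the escape itself
val sTriple      = s"""a\tc"""     // same interpolator, so same result as sSingle
val rawSingle    = raw"a\tc"       // raw leaves the backslash alone
val rawTriple    = raw"""a\tc"""   // quote style makes no difference here either

@main def showLengths(): Unit =
  println(plain.length)        // 3
  println(tripleQuoted.length) // 4
  println(sSingle == sTriple)     // true
  println(rawSingle == rawTriple) // true
```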

The overview seems quite regular because really the odd duck is the ordinary string. Really, opinionated Scala 3 ought to just remove ordinary strings from the language. Escaped strings which do not interpolate should be written esc"hello,\nworld" in this view.
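
To make the esc idea concrete: no such interpolator exists in the standard library, but a user-defined one could look roughly like this. The name esc and the choice to reject arguments are assumptions for the sketch; StringContext.processEscapes is the standard library helper for escape processing.

```scala
extension (sc: StringContext)
  // Hypothetical esc interpolator: processes escapes and refuses to interpolate.
  def esc(args: Any*): String =
    require(args.isEmpty, "esc does not interpolate")
    StringContext.processEscapes(sc.parts.mkString)

val greeting: String = esc"hello,\nworld"

@main def showEsc(): Unit =
  println(greeting == "hello,\nworld") // true
```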

Thanks for the explanation. The source of my confusion might be that I’m not thinking in two stages—first, string literal, then, interpolation—like you do.

Still the two-stage view somewhat breaks down because the standard interpolators are implemented as macros. Since there’s a TAB in "a\tc", a two-stage view would expect it to be still there in raw"a\tc$x" but it’s not. Conceptually, the raw interpolator prevented the quotes from expanding \t into a TAB, so the two stages are not fully separated.

The StringContext documentation tells me that raw"a\tc$x" is compiled into StringContext("a\tc", "").raw(x), which is very much two-stage, but could mislead one into thinking that there’ll be a TAB in the final output (since there’s one in one of the arguments).

The language specification section you point to states:

Inside an interpolated string none of the usual escape characters are interpreted

which doesn’t help, given that you say: “Of the standard interpolators, s and f choose to process escapes.”

Anyway, all I need to remember is that s"""\(""" has s unhappy about \( (which is fair), and that raw does what its name suggests (even if it doesn’t quite square with a two-stage view).

The line you quote should read:

Inside an interpolated string none of the usual escape characters are interpreted by the compiler.

(Perhaps “by the compiler” is implicit in the spec for the language.)

That is why your expectation that raw receives a tab instead of "\t" is incorrect.

The interpolator receives "\t" and it is up to the interpolator how to interpret those characters. Similarly, it is up to the interpolator to decide how to interpolate an arbitrary value. (As the f interpolator shows, it need not be toString that is interpolated.)
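
A tiny user-defined interpolator makes this visible. The name inspect is hypothetical; it ignores its arguments and just hands back the parts it was given.

```scala
extension (sc: StringContext)
  // Hypothetical interpolator: returns the literal parts exactly as received.
  def inspect(args: Any*): Seq[String] = sc.parts

val y = 42
val received: Seq[String] = inspect"a\tc${y}d"

@main def showParts(): Unit =
  // The first part has four characters: a, backslash, t, c. No TAB.
  println(received.head.length)    // 4
  println(received.head.contains('\t')) // false
```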

In particular, raw does not “undo” the escaping.

This is obvious after one has written an interpolator (which not everyone has had the pleasure of doing). The weird case, as I suggested, is that escapes are processed for “natural” strings. I previously proposed that “natural” strings should be processed as though written apply"text"; and people have asked for an API that informs the interpolator whether it is processing x"text" or x"""text""". Then at least the syntax would be more regular.

What’s a bit confusing is that the documentation seems to suggest that raw"a\tc$x" is rewritten to be StringContext("a\tc", "").raw(x), but it’s actually equivalent to StringContext("a\\tc", "").raw(x). Maybe the illustrative example in StringContext should include escaped characters.
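
The difference between the two readings can be checked directly. This is a sketch with illustrative value names, written against the explicit StringContext calls rather than anything the compiler is guaranteed to emit.

```scala
val n = 42

val interpolated  = raw"a\tc$n"
// What the interpolated form is equivalent to: the part holds a backslash and a t.
val escapedPart   = StringContext("a\\tc", "").raw(n)
// What the documentation example looks like when read as ordinary Scala:
// here the compiler has already turned \t into a TAB before raw sees it.
val unescapedPart = StringContext("a\tc", "").raw(n)

@main def compareRewrites(): Unit =
  println(interpolated == escapedPart)   // true
  println(interpolated == unescapedPart) // false
```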