This proposal adjusts how escaped code points are handled by Sass outside of string contexts. It's intended to bring Sass's semantics more in line with how CSS handles escapes.
This section is non-normative.
At time of writing, while Sass recognizes escaped code points in identifiers and
other names, it doesn't resolve them into the code points they represent. This
means that, for example, Sass considers the selector `.\!foo` and the selector
`.\21 foo` to be distinct. This is contrary to CSS Syntax Level 3, which says
that the value of an escaped code point should be included in the name rather
than the syntax of the escape.
However, the current behavior works well for unquoted strings in SassScript.
These strings need to distinguish between escaped code points and the literal
characters they represent, because unquoted strings can represent more than just
identifiers. For example, the SassScript expression `unquote("@x")` should be
rendered to CSS as `@x`, whereas the expression `\@x` should be rendered as
`\@x` (or `\40 x`). Any proposal for parsing escapes properly should preserve
this distinction.
This section is non-normative.
As identifiers are parsed, escapes will be normalized into a canonical form.
This preserves the benefits of the existing behavior, where `\@x` and
`unquote("@x")` are different SassScript expressions, while ensuring that
`.\!foo` and `.\21 foo` are considered the same selector.
The canonical form of a code point is:
- the literal code point if it's a valid identifier character; or
- a backslash followed by the code point's lowercase hex code followed by a space if it's non-printable or a newline; or
- a backslash followed by the code point's lowercase hex code followed by a space if it's a digit at the beginning of an identifier; or
- a backslash followed by the literal code point.
For example, in SassScript:
- `ax`, `\61x`, and `\61 x` all parse to the unquoted string `ax`;
- `\7f x`, `\7fx`, and `\7Fx` all parse to the unquoted string `\7f x`;
- `\31 x` and `\31x` parse to the unquoted string `\31 x`; and
- `\@x`, `\40x`, and `\0040x` all parse to the unquoted string `\@x`.
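The four canonical-form rules can be sketched as a small Python function. This is only an illustration, not code from any Sass implementation: the names `canonical_form`, `is_name_start`, `is_name`, and `is_non_printable` are hypothetical, and the character classes assume the CSS Syntax Level 3 definitions of name-start, name, and non-printable code points.

```python
def is_name_start(c: str) -> bool:
    # Assumed CSS Syntax Level 3 definition: a letter, "_", or any non-ASCII code point.
    return (c.isascii() and c.isalpha()) or c == "_" or ord(c) >= 0x80


def is_name(c: str) -> bool:
    # A name-start code point, an ASCII digit, or "-".
    return is_name_start(c) or c in "0123456789-"


def is_non_printable(c: str) -> bool:
    cp = ord(c)
    return cp <= 0x08 or cp == 0x0B or 0x0E <= cp <= 0x1F or cp == 0x7F


def canonical_form(c: str, start: bool = False) -> str:
    """Return the canonical form of the single code point `c`.

    `start` says whether `c` is the first code point of an identifier.
    """
    if is_name_start(c) or (is_name(c) and not start):
        return c  # valid identifier character: keep it literal
    if is_non_printable(c) or c in "\n\r\f" or (c.isdigit() and start):
        return "\\" + format(ord(c), "x") + " "  # lowercase hex plus a space
    return "\\" + c  # any other code point: backslash plus the literal


print(canonical_form("a"))              # a
print(canonical_form("\x7f"))           # \7f followed by a space
print(canonical_form("1", start=True))  # \31 followed by a space
print(canonical_form("@"))              # \@
```

Note that a digit is only hex-escaped when `start` is true; in any later position it is a valid name code point and stays literal.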
The proposed change affects existing observable behavior. It's theoretically
possible that an existing user is, for example, using `\@x` and `\40 x` as
distinct map keys; or that they're relying on `str-length(\40 x)` returning `5`
rather than `3`. However, the chances of this seem extremely low, and it would
be very difficult to produce actionable deprecation warnings without
compromising efficiency.
Given that, and given that this is arguably a bug fix (in that we're moving towards interpreting plain CSS text following the CSS spec, which we hadn't been before), I propose that we don't consider this a breaking change and release it with only a minor version bump.
This proposal defines new algorithms for consuming an identifier and an interpolated identifier. These are intended to replace the existing algorithms.
Other than modifying the way escaped code points are handled, these algorithms are designed to accurately capture the current behavior of all Sass implementations.
This algorithm consumes input from a stream of code points and returns a string.
This production has the same grammar as `<ident-token>`.
- Let `string` be an empty string.
- If the stream starts with `--`, consume it and append it to `string`.
- Otherwise:
  - If the stream starts with `-`, consume it and append it to `string`.
  - If the stream starts with `\`, consume an escaped code point with the `start` flag set and append it to `string`.
  - Otherwise, if the stream starts with a name-start code point, consume it and append it to `string`.
  - Otherwise, throw an error.
- Consume a name and append it to `string`.
- Return `string`.
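As a rough illustration, the identifier algorithm might look like this in Python. Everything here is a sketch under stated assumptions: the function names and the `(string, new_position)` return convention are hypothetical, the regex-based `consume_escape` is a compact stand-in for the escaped-code-point algorithm in this proposal, `consume_name` mirrors the "consume a name" loop, and input preprocessing (such as CRLF handling) and an escape at end of input are not modelled.

```python
import re

# One escape: 1-6 hex digits plus one optional whitespace, or any single
# code point other than a newline.
ESCAPE = re.compile(r"\\(?:([0-9a-fA-F]{1,6})[ \t\n\r\f]?|([^\n\r\f]))")


def is_name_start(c):
    return (c.isascii() and c.isalpha()) or c == "_" or ord(c) >= 0x80


def is_name(c):
    return is_name_start(c) or (c.isascii() and c.isdigit()) or c == "-"


def consume_escape(text, pos, start=False):
    m = ESCAPE.match(text, pos)
    if m is None:
        raise ValueError("expected a valid escape")
    if m.group(1) is not None:
        value = int(m.group(1), 16)
        bad = value == 0 or 0xD800 <= value <= 0xDFFF or value > 0x10FFFF
        c = "\ufffd" if bad else chr(value)
    else:
        c = m.group(2)
    if is_name_start(c) or (is_name(c) and not start):
        out = c
    elif (ord(c) < 0x20 and c != "\t") or ord(c) == 0x7F or (c.isdigit() and start):
        out = "\\" + format(ord(c), "x") + " "
    else:
        out = "\\" + c
    return out, m.end()


def consume_name(text, pos):
    out = []
    while pos < len(text) and (is_name(text[pos]) or text[pos] == "\\"):
        if is_name(text[pos]):
            out.append(text[pos])
            pos += 1
        else:
            piece, pos = consume_escape(text, pos)
            out.append(piece)
    return "".join(out), pos


def consume_identifier(text, pos=0):
    string = ""
    if text.startswith("--", pos):
        string += "--"
        pos += 2
    else:
        if text.startswith("-", pos):
            string += "-"
            pos += 1
        if text.startswith("\\", pos):
            piece, pos = consume_escape(text, pos, start=True)
            string += piece
        elif pos < len(text) and is_name_start(text[pos]):
            string += text[pos]
            pos += 1
        else:
            raise ValueError("expected an identifier")
    name, pos = consume_name(text, pos)
    return string + name, pos


# Both spellings of the same selector name normalize to the same string.
print(consume_identifier("\\!foo")[0] == consume_identifier("\\21 foo")[0])  # True
```

The `start` flag is only passed for the first escape, which is why a leading `\31` stays hex-escaped while the same escape later in the identifier becomes a literal `1`.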
This algorithm consumes input from a stream of code points and returns a sequence of strings and/or expressions.
The grammar for this production is:
InterpolatedIdentifier ::= (<ident-token> | '-'? Interpolation) (Name | Interpolation)*
No whitespace is allowed between components of an `InterpolatedIdentifier`.
- Let `components` be an empty list of strings and/or expressions.
- If the input starts with `-#{`, consume a single code point and add `"-"` to `components`.
- If the input starts with `#{`, consume an interpolation and add its expression to `components`.
- Otherwise, consume an identifier and add its string to `components`.
- While the input starts with `#{`, a name code point, or `\`:
  - If the input starts with `#{`, consume an interpolation and add its expression to `components`.
  - Otherwise, consume a name and add its string to `components`.
- Return `components`.
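The control flow of this algorithm can be sketched in Python as follows. To keep the focus on the loop, the three sub-consumers are deliberately simplified, hypothetical stand-ins: `consume_identifier` and `consume_name` match with regular expressions and leave escapes verbatim rather than normalizing them, and `consume_interpolation` simply takes the text up to the next `}` and wraps it in a made-up `Expression` class. None of them is the real algorithm.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    """Hypothetical placeholder holding the source text of an interpolation."""
    source: str


# Stand-in patterns: name code points or raw (un-normalized) escapes.
NAME = re.compile(r"(?:[A-Za-z0-9_-]|[^\x00-\x7f]|\\[^\n\r\f])+")
IDENT = re.compile(
    r"(?:--|-?(?:[A-Za-z_]|[^\x00-\x7f]|\\[^\n\r\f]))"
    r"(?:[A-Za-z0-9_-]|[^\x00-\x7f]|\\[^\n\r\f])*"
)
NAME_CODE_POINT = re.compile(r"[A-Za-z0-9_-]|[^\x00-\x7f]")


def consume_with(pattern, text, pos, what):
    m = pattern.match(text, pos)
    if m is None:
        raise ValueError("expected " + what)
    return m.group(), m.end()


def consume_identifier(text, pos):
    return consume_with(IDENT, text, pos, "an identifier")


def consume_name(text, pos):
    return consume_with(NAME, text, pos, "a name")


def consume_interpolation(text, pos):
    end = text.find("}", pos)
    if end == -1:
        raise ValueError("unterminated interpolation")
    return Expression(text[pos + 2:end]), end + 1


def consume_interpolated_identifier(text, pos=0):
    components = []
    if text.startswith("-#{", pos):
        components.append("-")
        pos += 1
    if text.startswith("#{", pos):
        expr, pos = consume_interpolation(text, pos)
        components.append(expr)
    else:
        ident, pos = consume_identifier(text, pos)
        components.append(ident)
    while pos < len(text):
        if text.startswith("#{", pos):
            expr, pos = consume_interpolation(text, pos)
            components.append(expr)
        elif text[pos] == "\\" or NAME_CODE_POINT.match(text, pos):
            name, pos = consume_name(text, pos)
            components.append(name)
        else:
            break
    return components, pos


print(consume_interpolated_identifier("foo#{1}bar")[0])
# ['foo', Expression(source='1'), 'bar']
```

Returning a flat list of strings and expressions leaves it to the caller to evaluate each expression and concatenate the results.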
This algorithm consumes input from a stream of code points and returns a string.
The grammar for this production is:
Name ::= (name code point | escape)+
- Let `string` be an empty string.
- While the input starts with a name code point or `\`:
  - If the input starts with a name code point, consume it and append it to `string`.
  - Otherwise, consume an escaped code point and append it to `string`.
- Return `string`.
This algorithm consumes input from a stream of code points. It takes an
optional boolean flag, `start`, which indicates whether it's at the beginning of
an identifier and defaults to false. It returns a string.
This production has the same grammar as `escape` in CSS Syntax Level 3.
- If the stream doesn't start with a valid escape, throw an error.
- Let `codepoint` be the result of consuming an escaped code point as defined in CSS Syntax Level 3.
- Let `character` be the string containing only `codepoint`.
- If `codepoint` is a name-start code point, return `character`.
- Otherwise, if `codepoint` is a name code point and the `start` flag is not set, return `character`.
- Otherwise, if `codepoint` is a non-printable code point, U+000A LINE FEED, U+000D CARRIAGE RETURN, or U+000C FORM FEED; or if `codepoint` is a digit and the `start` flag is set:
  - Let `code` be the lowercase hexadecimal representation of `codepoint`, with no leading `0`s.
  - Return `"\" + code + " "`.
- Otherwise, return `"\" + character`.
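The steps above, working directly on a stream position, might be sketched in Python like this. The function name, the `(string, new_position)` return convention, and the helper predicates are assumptions made for illustration; the decoding of the raw escape follows the CSS Syntax Level 3 rules as understood here (up to six hex digits, one optional trailing whitespace, U+FFFD for zero, surrogates, out-of-range values, and end of input), without modelling input preprocessing.

```python
from typing import Tuple

HEX_DIGITS = "0123456789abcdefABCDEF"
NEWLINES = "\n\r\f"
WHITESPACE = " \t" + NEWLINES


def is_name_start(c: str) -> bool:
    return (c.isascii() and c.isalpha()) or c == "_" or ord(c) >= 0x80


def is_name(c: str) -> bool:
    return is_name_start(c) or c in "0123456789-"


def is_non_printable(c: str) -> bool:
    cp = ord(c)
    return cp <= 0x08 or cp == 0x0B or 0x0E <= cp <= 0x1F or cp == 0x7F


def consume_escaped_code_point(text: str, pos: int, start: bool = False) -> Tuple[str, int]:
    # Step 1: a valid escape is a backslash not followed by a newline.
    nxt = text[pos + 1:pos + 2]
    if not text.startswith("\\", pos) or (nxt != "" and nxt in NEWLINES):
        raise ValueError("expected a valid escape")
    pos += 1  # skip the backslash

    # Step 2: decode the raw escape into a single code point.
    if nxt == "":
        character = "\ufffd"  # escape at end of input
    elif nxt in HEX_DIGITS:
        end = pos
        while end < len(text) and end - pos < 6 and text[end] in HEX_DIGITS:
            end += 1
        value = int(text[pos:end], 16)
        pos = end
        if pos < len(text) and text[pos] in WHITESPACE:
            pos += 1  # one whitespace terminates a hex escape
        invalid = value == 0 or 0xD800 <= value <= 0xDFFF or value > 0x10FFFF
        character = "\ufffd" if invalid else chr(value)
    else:
        character = nxt
        pos += 1

    # Remaining steps: choose the canonical spelling.
    if is_name_start(character):
        return character, pos
    if is_name(character) and not start:
        return character, pos
    if (is_non_printable(character) or character in NEWLINES
            or (character.isdigit() and start)):
        return "\\" + format(ord(character), "x") + " ", pos
    return "\\" + character, pos


print(consume_escaped_code_point("\\0040x", 0))  # ('\\@', 5)
```

The second element of the result is the position just past the escape, so in the call above the remaining input is `x`.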