This section describes a grammar (and forward-compatible
parsing rules) common to any version of CSS (including
CSS 2.1). Future versions of CSS will adhere to this core syntax,
although they may add additional syntactic constraints.
These descriptions are normative. They are also
complemented by the normative grammar rules presented in Appendix G.
All levels of CSS — level 1, level 2, and any future levels — use
the same core syntax. This allows UAs to parse (though not completely
understand) style sheets written in levels of CSS that didn't exist at
the time the UAs were created. Designers can use this feature to
create style sheets that work with older user agents, while also
exercising the possibilities of the latest levels of CSS.
At the lexical level, CSS style sheets consist of a sequence of tokens.
The list of tokens for CSS 2.1 is as follows. The definitions use Lex-style
regular expressions. Octal codes refer to ISO 10646 ([ISO10646]). As in
Lex, in case of multiple matches, the longest match determines the token.
any other character not matched by
the above rules, and neither a single nor a double quote
The macros in curly braces ({}) above are defined as follows:
Macro       Definition
ident       [-]?{nmstart}{nmchar}*
name        {nmchar}+
nmstart     [_a-zA-Z]|{nonascii}|{escape}
nonascii    [^\0-\177]
unicode     \\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?
escape      {unicode}|\\[^\n\r\f0-9a-f]
nmchar      [_a-zA-Z0-9-]|{nonascii}|{escape}
num         [0-9]+|[0-9]*\.[0-9]+
string      {string1}|{string2}
string1     \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
string2     \'([^\n\r\f\\']|\\{nl}|{escape})*\'
invalid     {invalid1}|{invalid2}
invalid1    \"([^\n\r\f\\"]|\\{nl}|{escape})*
invalid2    \'([^\n\r\f\\']|\\{nl}|{escape})*
nl          \n|\r\n|\r|\f
w           [ \t\r\n\f]*
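As an informal illustration, some of the macros above can be transcribed into Python regular expressions. This is only a sketch and not a conforming tokenizer: the variable names and the is_ident helper are assumptions made for this example, hexadecimal digits are accepted in both cases on the assumption that the scanner is case-insensitive, and the alternatives of num are reordered because Python alternation takes the first match rather than the longest.

```python
import re

# Transcription of a few Lex-style macros into Python regex fragments.
# All names below are illustrative; they are not defined by CSS 2.1.
NONASCII = r"[^\x00-\x7f]"  # octal \0-\177 is hexadecimal 00-7F
UNICODE = r"\\[0-9a-fA-F]{1,6}(?:\r\n|[ \n\r\t\f])?"
ESCAPE = rf"(?:{UNICODE}|\\[^\n\r\f0-9a-fA-F])"
NMSTART = rf"(?:[_a-zA-Z]|{NONASCII}|{ESCAPE})"
NMCHAR = rf"(?:[_a-zA-Z0-9-]|{NONASCII}|{ESCAPE})"
IDENT = rf"-?{NMSTART}{NMCHAR}*"
# Decimal form first, so that "1.5" is not cut short at "1".
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"

IDENT_RE = re.compile(IDENT)
NUM_RE = re.compile(NUM)


def is_ident(text: str) -> bool:
    """Return True if the whole text matches the ident macro."""
    return IDENT_RE.fullmatch(text) is not None
```

For instance, is_ident("-xyz-border-east-color") holds, while is_ident("2abc") does not.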
Below is the core syntax for CSS. The sections that follow describe
how to use it. Appendix G describes a
more restrictive grammar that is closer to the CSS level 2 language.
COMMENT tokens do not occur
in the grammar (to keep it readable), but any number of these tokens
may appear anywhere between other tokens.
The token S in the grammar above stands for whitespace. Only the characters "space" (U+0020), "tab" (U+0009), "line feed" (U+000A), "carriage return" (U+000D), and
"form feed" (U+000C) can occur in whitespace. Other space-like characters,
such as "em-space" (U+2003) and "ideographic space" (U+3000), are never part of whitespace.
The meaning of input that cannot be tokenized or parsed is
undefined in CSS 2.1.
In CSS 2.1, identifiers may begin with '-' (dash) or '_' (underscore). Keywords and property names beginning with '-' or '_' are reserved for vendor-specific extensions. Such vendor-specific extensions should have one of the following formats:
'-' + vendor identifier + '-' + meaningful name
'_' + vendor identifier + '-' + meaningful name
Example(s):
For example, if XYZ organization added a property to describe the color of the
border on the East side of the display, they might call it -xyz-border-east-color.
Other known examples:
-moz-box-sizing
-moz-border-radius
-wap-accesskey
An initial dash or underscore is guaranteed never to be used in a property or keyword by any current or future level of CSS. Thus typical CSS implementations may not
recognize such properties and may ignore them according to the rules for handling parsing errors. However, because the initial dash or underscore is part of the grammar, CSS2.1 implementers should always be able to use a CSS-conforming parser, whether or not they support any vendor-specific extensions.
All CSS style sheets are case-insensitive, except for parts that are
not under the control of CSS. For example, the case-sensitivity of
values of the HTML attributes "id" and "class", of font names, and
of URIs lies outside the scope of this specification. Note in
particular that element names are case-insensitive in HTML, but
case-sensitive in XML.
In CSS 2.1, identifiers
(including element names, classes, and IDs in selectors) can contain only the
characters [A-Za-z0-9] and ISO 10646 characters U+00A1 and higher,
plus the hyphen (-) and the underscore (_); they cannot start with
a digit, or a hyphen followed by a digit.
Only properties, values, units, pseudo-classes,
pseudo-elements, and at-rules may start with a hyphen (-); other
identifiers (e.g. element names, classes, or IDs) may not.
Identifiers can also contain escaped characters and any ISO 10646
character as a numeric code (see next item).
For instance, the identifier "B&W?" may
be written as "B\&W\?" or "B\26 W\3F".
Note that Unicode is code-by-code equivalent to ISO 10646 (see
[UNICODE] and [ISO10646]).
In CSS 2.1, a backslash (\) character indicates three types of
character escapes.
First, inside a string, a backslash
followed by a newline is ignored (i.e., the string is deemed not
to contain either the backslash or the newline).
Second, it cancels the meaning of special CSS characters.
Any character (except a hexadecimal digit) can be escaped
with a backslash to remove its special meaning.
For example, "\"" is a string consisting of one
double quote. Style sheet preprocessors must not remove
these backslashes from a style sheet since that would
change the style sheet's meaning.
Third, backslash escapes allow authors to refer to characters
they can't easily put in a document. In this case, the backslash
is followed by at most six hexadecimal digits (0..9A..F), which
stand for the ISO 10646 ([ISO10646])
character with that number, which must not be zero.
(It is undefined in CSS 2.1 what happens if a style sheet
does contain a zero.)
If a character in the range [0-9a-fA-F] follows the hexadecimal number,
the end of the number needs to be made clear. There are two ways
to do that:
with a space (or other whitespace character): "\26 B" ("&B").
In this case, user agents should treat a "CR/LF" pair
(U+000D/U+000A) as a single whitespace character.
by providing exactly 6 hexadecimal digits: "\000026B" ("&B")
In fact, these two methods may be combined. Only one whitespace
character is ignored after a hexadecimal escape. Note that this means
that a "real" space after the escape sequence must itself either be
escaped or doubled.
Backslash escapes are always considered to be part of an identifier or a string (i.e.,
"\7B" is not punctuation, even though "{" is, and "\32" is allowed
at the start of a class name, even though "2" is not).
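The second and third kinds of escape can be sketched in Python as follows. The function name unescape_css is hypothetical, and the sketch makes some simplifying assumptions: it does not handle the backslash-newline case inside strings, it does not treat the code point zero specially (which CSS 2.1 leaves undefined), and code points above U+10FFFF would raise ValueError.

```python
import re

# Either one to six hexadecimal digits, optionally followed by a single
# whitespace character (a CR/LF pair counts as one), or a backslash
# followed by any other single character except a newline.
_ESCAPE_RE = re.compile(
    r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\r\n\f])?|\\([^\r\n\f])"
)


def unescape_css(text: str) -> str:
    """Decode backslash escapes in an identifier (illustrative helper)."""

    def replace(match: re.Match) -> str:
        hex_digits, literal = match.groups()
        if hex_digits is not None:
            # Numeric escape: the digits name an ISO 10646 code point.
            return chr(int(hex_digits, 16))
        # Simple escape: the character stands for itself.
        return literal

    return _ESCAPE_RE.sub(replace, text)
```

Under these assumptions, both "B\&W\?" and "B\26 W\3F" decode to "B&W?".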
A CSS style sheet, for any version of CSS, consists of a list of
statements
(see the grammar above). There are two kinds of statements: at-rules
and rule
sets. There may be whitespace
around the statements.
In this specification, the expressions "immediately before" or
"immediately after" mean with no intervening whitespace or comments.
A block
starts with a left curly brace ({) and ends with the matching right
curly brace (}). In between there may be any characters, except that
parentheses (( )), brackets ([ ]) and braces ({ }) must
always occur in
matching pairs and may be nested. Single (') and double quotes (")
must also occur in matching pairs, and characters between them
are parsed as a string.
See Tokenization above for the definition
of a string.
Illegal example(s):
Here is an example of a block. Note that the right brace between
the double quotes does not match the opening brace of the block, and that the
second single quote is an escaped
character, and thus doesn't match the first single quote:
{ causta: "}" + ({7} * '\'') }
Note that the above rule is not valid CSS 2.1, but it is still
a block as defined above.
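A minimal sketch of how the end of a block might be located is given below. The helper find_block_end is an assumption of this example, not part of the specification; it tracks nested pairs on a stack, skips the contents of strings, and steps over escaped characters. It does not model the rule that an unescaped newline ends a string.

```python
PAIRS = {"{": "}", "(": ")", "[": "]"}


def find_block_end(text: str, start: int) -> int:
    """Return the index of the brace closing the block opened at start.

    Illustrative helper: raises ValueError on mismatched input.
    """
    if text[start] != "{":
        raise ValueError("a block must start with '{'")
    stack = []
    i = start
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the escaped character
            continue
        if ch in "\"'":
            # Scan to the matching quote, stepping over escapes.
            i += 1
            while i < len(text) and text[i] != ch:
                i += 2 if text[i] == "\\" else 1
        elif ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in ")]}":
            if not stack or stack.pop() != ch:
                raise ValueError(f"unmatched {ch!r} at index {i}")
            if not stack:
                return i  # the opening brace has been closed
        i += 1
    raise ValueError("block is not closed")
```

Applied to the example block above, the function returns the index of the final right brace, because the brace inside the double quotes and the escaped single quote are both skipped.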
A rule set (also called "rule") consists of a selector followed by
a declaration block.
A declaration-block (also
called a {}-block in the following text) starts with a left curly
brace ({) and ends with the matching right curly brace (}). In between
there must be a list of zero or more semicolon-separated (;)
declarations.
The selector (see also the section on selectors) consists of everything up to (but
not including) the first left curly brace ({). A selector always goes
together with a {}-block. When a user agent can't parse the selector (i.e., it
is not valid CSS 2.1), it must ignore the {}-block as well.
CSS 2.1 gives a special meaning to the comma (,) in
selectors. However, since it is not known if the comma may acquire
other meanings in future versions of CSS, the whole statement should
be ignored if there is an error anywhere in the
selector, even though the rest of the selector may look reasonable in
CSS 2.1.
Illegal example(s):
For example, since the "&" is not a valid token in a CSS 2.1
selector, a CSS 2.1 user agent must
ignore
the whole second line, and not set the color of H3 to red:
h1, h2 {color: green }
h3, h4 & h5 {color: red }
h6 {color: black }
Example(s):
Here is a more complex example. The first two pairs of curly braces
are inside a string, and do not mark the end of the selector. This is
a valid CSS 2.1 rule.
p[example="public class foo\
{\
private int x;\
\
foo(int x) {\
this.x = x;\
}\
\
}"] { color: red }
A property is an identifier. Any character may occur
in the value. Parentheses ("( )"), brackets ("[ ]"),
braces ("{ }"), single
quotes (') and double quotes (") must come in matching
pairs, and semicolons not in strings must be escaped. Parentheses, brackets, and
braces may be nested. Inside the quotes, characters are parsed as a
string.
The syntax of values
is specified separately for each property, but in any case, values are
built from identifiers, strings, numbers, lengths, percentages, URIs, and
colors.
A user agent must ignore a declaration with an invalid property
name or an invalid value. Every CSS 2.1 property has its own syntactic
and semantic restrictions on the values it accepts.
Illegal example(s):
For example, assume a CSS 2.1 parser encounters this style sheet:
h1 { color: red; font-style: 12pt } /* Invalid value: 12pt */
p { color: blue; font-vendor: any; /* Invalid prop.: font-vendor */
font-variant: small-caps }
em em { font-style: normal }
The second declaration on the first line has an invalid value
'12pt'. The second declaration on the second line contains an
undefined property 'font-vendor'. The CSS 2.1 parser will ignore these
declarations, effectively reducing the style sheet to:
h1 { color: red; }
p { color: blue; font-variant: small-caps }
em em { font-style: normal }
Comments begin
with the characters "/*" and end with the characters "*/". They may
occur anywhere between tokens,
and their contents have no influence on the rendering. Comments may
not be nested.
CSS also allows the SGML comment delimiters ("<!--" and
"-->") in certain places defined by the grammar, but they do not
delimit CSS
comments. They are permitted so that style rules appearing in an HTML
source document (in the STYLE element) may be hidden from pre-HTML 3.2
user agents. See the HTML 4.0 specification ([HTML40]) for more information.
In some cases, user agents must ignore part of an illegal style
sheet. This specification defines ignore to mean
that the user agent parses the illegal part (in order to find its
beginning and end), but otherwise acts as if it had not been there.
CSS2.1 reserves for future versions of CSS all property:value combinations
and @-keywords that do not contain an identifier beginning with dash or
underscore. Implementations must ignore such combinations (other than those
introduced by future versions of CSS).
To ensure that new properties and new values for existing
properties can be added in the future, user agents are required to
obey the following rules when they encounter the following
scenarios:
Unknown properties. User agents must ignore a declaration with an unknown
property. For example, if the style sheet is:
h1 { color: red; rotation: 70minutes }
the user agent will treat this as if the style sheet had been
h1 { color: red }
Illegal values. User agents must ignore a
declaration with an illegal value. For example:
img { float: left } /* correct CSS 2.1 */
img { float: left here } /* "here" is not a value of 'float' */
img { background: "red" } /* keywords cannot be quoted */
img { border-width: 3 } /* a unit must be specified for length values */
A CSS 2.1 parser would honor the first rule and
ignore
the rest, as if the style sheet had been:
img { float: left }
img { }
img { }
img { }
A user agent conforming to a future CSS specification may accept one or
more of the other rules as well.
Malformed declarations. User agents must handle unexpected tokens encountered while parsing a declaration by reading until the end of the declaration, while observing the rules for matching pairs of (), [], {}, "", and '', and correctly handling escapes. For example, a malformed declaration may be missing a property, colon (:) or value. The following are all equivalent:
p { color:green }
p { color:green; color } /* malformed declaration missing ':', value */
p { color:red; color; color:green } /* same with expected recovery */
p { color:green; color: } /* malformed declaration missing value */
p { color:red; color:; color:green } /* same with expected recovery */
p { color:green; color{;color:maroon} } /* unexpected tokens { } */
p { color:red; color{;color:maroon}; color:green } /* same with recovery */
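The recovery rule for malformed declarations can be sketched as follows. The helpers split_top_level and parse_declarations are hypothetical names for this example; the sketch splits the contents of a declaration block at semicolons that are not inside strings or nested pairs, then keeps only chunks of the shape "property: value". It does not validate values against any property definition.

```python
import re

CLOSERS = {"(": ")", "[": "]", "{": "}"}
PROPERTY_RE = re.compile(r"-?[_a-zA-Z][_a-zA-Z0-9-]*")  # simplified ident


def split_top_level(text: str, sep: str) -> list[str]:
    """Split at sep, ignoring separators in strings or nested pairs."""
    parts, current, stack, i = [], [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape intact
            i += 2
            continue
        if ch in "\"'":
            end = i + 1
            while end < len(text) and text[end] != ch:
                end += 2 if text[end] == "\\" else 1
            current.append(text[i:end + 1])  # copy the whole string
            i = end + 1
            continue
        if ch in CLOSERS:
            stack.append(CLOSERS[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
        if ch == sep and not stack:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


def parse_declarations(block: str) -> dict[str, str]:
    """Map property to value, dropping malformed declarations."""
    declarations = {}
    for chunk in split_top_level(block, ";"):
        pieces = split_top_level(chunk, ":")
        if len(pieces) < 2:
            continue  # no colon outside strings and nested pairs
        name = pieces[0].strip()
        value = ":".join(pieces[1:]).strip()
        if PROPERTY_RE.fullmatch(name) and value:
            declarations[name.lower()] = value  # later ones win
    return declarations
```

With this sketch, each of the seven declaration blocks listed above reduces to the single declaration color: green.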
Invalid at-keywords. User agents must ignore an
invalid at-keyword together with everything following it, up to and
including the next semicolon (;) or block ({...}), whichever
comes first. For example, consider the following:
@three-dee {
@background-lighting {
azimuth: 30deg;
elevation: 190deg;
}
h1 { color: red }
}
h1 { color: blue }
The '@three-dee' at-rule is not part of CSS 2.1. Therefore, the whole
at-rule (up to, and including, the third right curly brace) is ignored. A
CSS 2.1 user agent ignores it, effectively reducing the style sheet
to:
h1 { color: blue }
Something inside an at-rule that is ignored because it is invalid,
such as an invalid declaration within an @media-rule, does not make
the entire at-rule invalid.
Unexpected end of style sheet.
User agents must close all open constructs (for example: blocks, parentheses, brackets, rules, strings, and comments) at the end of the
stylesheet. For example:
@media screen {
p:before { content: 'Hello
would be treated the same as:
@media screen {
p:before { content: 'Hello'; }
}
in a conformant UA.
Unexpected end of string.
User agents must close strings upon reaching the end of a line, but
then drop the construct (declaration or rule) in which the string
was found. For example:
p {
color: green;
font-family: 'Courier New Times
color: red;
color: green;
}
...would be treated the same as:
p { color: green; color: green; }
...because the second declaration (from 'font-family' to the
semicolon after 'color: red') is invalid and is dropped.
Some value types may have integer values (denoted by <integer>)
or real number values (denoted by <number>). Real numbers and
integers are specified in decimal notation only. An <integer>
consists of one or more digits "0" to "9". A <number> can either
be an <integer>, or it can be zero or more digits followed by a
dot (.) followed by one or more digits. Both integers and real numbers
may be preceded by a "-" or "+" to indicate the sign.
Note that many properties that allow an integer or real number as a
value actually restrict the value to some range, often to a
non-negative value.
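The two notations can be checked with regular expressions along the following lines. The names INTEGER_RE, NUMBER_RE, is_integer, and is_number are assumptions of this sketch; range restrictions imposed by individual properties are not modeled.

```python
import re

# An optional sign followed by one or more decimal digits.
INTEGER_RE = re.compile(r"[-+]?[0-9]+")
# An integer, or zero or more digits, a dot, and one or more digits.
NUMBER_RE = re.compile(r"[-+]?(?:[0-9]*\.[0-9]+|[0-9]+)")


def is_integer(text: str) -> bool:
    return INTEGER_RE.fullmatch(text) is not None


def is_number(text: str) -> bool:
    return NUMBER_RE.fullmatch(text) is not None
```

Note that "5." and "1e3" are rejected, since a dot must be followed by at least one digit and only decimal notation is allowed.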
Lengths refer to horizontal or vertical measurements.
The format of a length value (denoted by <length> in this specification) is
a <number> (with or without a
decimal point) immediately followed by a unit identifier (e.g., px,
em, etc.). After a zero length, the unit identifier is optional.
Some properties allow negative length values, but this may
complicate the formatting model and there may be
implementation-specific limits. If a negative length value cannot be
supported, it should be converted to the nearest value that can be
supported.
If a negative length value is set on a property that does not allow
negative length values, the declaration is ignored.
h1 { margin: 0.5em } /* em */
h1 { margin: 1ex } /* ex */
p { font-size: 12px } /* px */
The 'em' unit is equal to the computed value of
the 'font-size' property of
the element on which it is used. The exception is when 'em' occurs in
the value of the 'font-size' property itself, in which case it refers
to the font size of the parent element. It may be used for vertical or
horizontal measurement. (This unit is also sometimes called the
quad-width in typographic texts.)
The 'ex' unit is defined by the font's 'x-height'. The x-height is so called
because it is often equal to the height of the lowercase "x". However,
an 'ex' is defined even for fonts that don't contain an "x".
Example(s):
The rule:
h1 { line-height: 1.2em }
means that the line height of "h1" elements will be 20% greater
than the font size of the "h1" elements. On the other hand:
h1 { font-size: 1.2em }
means that the font-size of "h1" elements will be 20% greater than
the font size inherited by "h1" elements.
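The difference between the two rules can be written out as a small calculation. The function resolve_em and the pixel values used with it are hypothetical; the sketch only encodes the reference-size rule described above.

```python
def resolve_em(factor: float, property_name: str,
               element_font_size_px: float,
               parent_font_size_px: float) -> float:
    """Convert an 'em' value to pixels (illustrative helper).

    On 'font-size' itself, 'em' refers to the parent's font size; on
    any other property it refers to the element's own font size.
    """
    if property_name == "font-size":
        reference = parent_font_size_px
    else:
        reference = element_font_size_px
    return factor * reference
```

Assuming an "h1" whose own font size is 20px and whose parent's font size is 10px, line-height: 1.2em resolves to about 24px, while font-size: 1.2em resolves to about 12px.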
When specified for the root of the
document tree (e.g., "HTML" in HTML), 'em' and 'ex' refer to
the property's initial value.
Pixel units are relative to the
resolution of the viewing device, i.e., most often a computer
display. If the pixel density of the output device is very different
from that of a typical computer display, the user agent should rescale
pixel values. It is recommended that the reference pixel be the
visual angle of one pixel on a device with a pixel density of 96dpi
and a distance from the reader of an arm's length. For a nominal arm's
length of 28 inches, the visual angle is therefore about 0.0213
degrees.
For reading at arm's length, 1px thus corresponds to about 0.26 mm
(1/96 inch). When printed on a laser printer, meant for reading at a
little less than arm's length (55 cm, 21 inches), 1px is about
0.20 mm. On a 300 dots-per-inch (dpi) printer, that may be
rounded up to 3 dots (0.25 mm); on a 600 dpi printer, it can
be rounded to 5 dots.
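The figures above can be reproduced with a short calculation. This is a sketch under the stated assumptions (96dpi, a 28 inch arm's length); the function names are invented for this example.

```python
import math

REFERENCE_DPI = 96    # pixel density of the reference device
ARM_LENGTH_IN = 28    # nominal arm's length in inches
MM_PER_INCH = 25.4


def visual_angle_deg() -> float:
    """Visual angle of one reference pixel, in degrees."""
    return math.degrees(math.atan((1 / REFERENCE_DPI) / ARM_LENGTH_IN))


def px_size_mm(distance_mm: float) -> float:
    """Size of 1px, in millimetres, at the given reading distance."""
    return distance_mm * math.tan(math.radians(visual_angle_deg()))


def dots(size_mm: float, dpi: int) -> float:
    """Number of device dots that span size_mm at the given density."""
    return size_mm / (MM_PER_INCH / dpi)
```

At 55 cm the computed size is about 0.20 mm, which spans roughly 2.4 dots at 300 dpi and roughly 4.8 dots at 600 dpi, consistent with the rounding to 3 and 5 dots mentioned above.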
The two images below illustrate the effect of viewing distance on
the size of a pixel and the effect of a device's resolution. In the
first image, a reading distance of 71 cm (28 inch) results
in a px of 0.26 mm, while a reading distance of 3.5 m
(12 feet) requires a px of 1.3 mm.
In the second image, an
area of 1px by 1px is covered by a single dot in a low-resolution
device (a computer screen), while the same area is covered by 16 dots
in a higher resolution device (such as a 400 dpi laser printer).
The format of a percentage value (denoted by <percentage> in this specification)
is a <number> immediately
followed by '%'.
Percentage values are always relative to another value, for
example a length. Each property that allows percentages also defines
the value to which the percentage refers. The value may be that of
another property for the same element, a property for an ancestor
element, or a value of the formatting context (e.g., the width of a containing block). When a
percentage value is set for a property of the root element and the percentage is
defined as referring to the inherited value of some property, the
resultant value is the percentage times the initial value of that property.
Example(s):
Since child elements (generally) inherit the computed values of their parent, in
the following example, the children of the P element will inherit a
value of 12px for 'line-height', not the percentage
value (120%):
p { font-size: 10px }
p { line-height: 120% } /* 120% of 'font-size' */
URLs (Uniform Resource Locators)
provide the address of a resource on
the Web. Another way of identifying resources is called URN (Uniform Resource Name). Together they are
called URIs (Uniform Resource
Identifiers, see [RFC3986]). This specification uses the term URI.
URI values in this specification are denoted by <uri>. The
functional notation used to designate URIs in property values is
"url()", as in:
Example(s):
body { background: url("http://www.example.com/pinkish.png") }
The format of a URI value is 'url(' followed by optional whitespace followed by an optional single quote
(') or double quote (") character followed by the URI
itself, followed by an optional single quote (') or double quote (")
character followed by optional whitespace followed by
')'. The two quote characters must be the same.
Example(s):
An example without quotes:
li { list-style: url(http://www.example.com/redball.png) disc }
Some characters appearing in an unquoted URI, such as parentheses,
commas, whitespace characters, single quotes (') and double quotes
("), must be escaped with a backslash: '\(', '\)', '\,'.
Depending on the type of URI, it might also be possible to write
the above characters as URI-escapes (where "(" = %28, ")" = %29, etc.)
as described in [URI].
In order to create modular style sheets that are not dependent on
the absolute location of a resource, authors may use relative URIs.
Relative URIs (as defined in [RFC3986]) are resolved to full URIs
using a base URI. RFC 3986, section 5, defines the normative
algorithm for this process. For CSS style sheets, the base URI is that
of the style sheet, not that of the source document.
Example(s):
For example, suppose the following rule:
body { background: url("yellow") }
is located in a style sheet designated by the URI:
http://www.example.org/style/basic.css
The background of the source document's BODY will be tiled with
whatever image is described by the resource designated
by the URI
http://www.example.org/style/yellow
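The resolution step in this example can be reproduced with the Python standard library, whose urljoin function resolves a reference against a base URI. The variable names are chosen for this sketch only.

```python
from urllib.parse import urljoin

# The base URI is that of the style sheet, not of the source document.
style_sheet_uri = "http://www.example.org/style/basic.css"
relative_uri = "yellow"

resolved = urljoin(style_sheet_uri, relative_uri)
```

Here resolved is "http://www.example.org/style/yellow", matching the URI given above.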
User agents may vary in how they handle invalid URIs or URIs that
designate
unavailable or inapplicable resources.
To refer to a sequence of nested counters of the same name, the
notation is 'counters(<identifier>, <string>)' or
'counters(<identifier>, <string>,
<list-style-type>)'. See "Nested
counters and scope" in the chapter on generated content.
In CSS2, the values of counters can
only be referred to from the 'content' property. Note that 'none'
is a possible <list-style-type>: 'counter(x,
none)' yields an empty string.
Example(s):
Here is a style sheet that numbers paragraphs (p) for each chapter
(h1). The paragraphs are numbered with roman numerals, followed by a
period and a space:
A <color>
is either a keyword or a numerical RGB specification.
The list of keyword color names is: aqua, black, blue, fuchsia,
gray, green, lime, maroon, navy, olive, orange, purple, red, silver, teal,
white, and yellow. These 17 colors have the following values:
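maroon #800000
red #ff0000
orange #ffA500
yellow #ffff00
olive #808000
purple #800080
fuchsia #ff00ff
white #ffffff
lime #00ff00
green #008000
navy #000080
blue #0000ff
aqua #00ffff
teal #008080
black #000000
silver #c0c0c0
gray #808080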
In addition to these color keywords, users may specify
keywords that correspond to the colors used by certain objects in the
user's environment. Please consult the section on system colors for more information.
Example(s):
body {color: black; background: white }
h1 { color: maroon }
h2 { color: olive }
The RGB color model is used in numerical color
specifications. These examples all specify the same color:
Example(s):
em { color: #f00 } /* #rgb */
em { color: #ff0000 } /* #rrggbb */
em { color: rgb(255,0,0) }
em { color: rgb(100%, 0%, 0%) }
The format of an RGB value in hexadecimal notation is a '#'
immediately followed by either three or six hexadecimal
characters. The three-digit RGB notation (#rgb) is converted into
six-digit form (#rrggbb) by replicating digits, not by adding
zeros. For example, #fb0 expands to #ffbb00. This ensures that
white (#ffffff) can be specified with the short notation (#fff) and
removes any dependencies on the color depth of the display.
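The digit-replication rule can be sketched as follows. The function names expand_hex_color and hex_to_rgb are assumptions of this example, and lowercasing the result is a choice made here for consistency, not a requirement.

```python
import re

_HEX_COLOR_RE = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")


def expand_hex_color(color: str) -> str:
    """Return the six-digit form of a #rgb or #rrggbb color."""
    if not _HEX_COLOR_RE.fullmatch(color):
        raise ValueError(f"not a hexadecimal color: {color!r}")
    digits = color[1:]
    if len(digits) == 3:
        # Replicate each digit; do not pad with zeros.
        digits = "".join(d * 2 for d in digits)
    return "#" + digits.lower()


def hex_to_rgb(color: str) -> tuple[int, int, int]:
    """Return the red, green, and blue components as integers."""
    full = expand_hex_color(color)
    return tuple(int(full[i:i + 2], 16) for i in (1, 3, 5))
```

For example, expand_hex_color("#fb0") gives "#ffbb00", and hex_to_rgb("#fff") gives (255, 255, 255).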
The format of an RGB value in the functional notation is 'rgb('
followed by a comma-separated list of three numerical values (either
three integer values or three percentage values) followed by ')'.
The integer value 255 corresponds to 100%, and to F or FF in the
hexadecimal notation: rgb(255,255,255) = rgb(100%,100%,100%) =
#FFF. Whitespace characters are allowed
around the numerical values.
All RGB colors are specified in the sRGB color space (see
[SRGB]). User agents may vary in the fidelity with which they
represent these colors, but using sRGB provides an unambiguous and
objectively measurable definition of what the color should be, which
can be related to international standards (see [COLORIMETRY]).
Conforming user agents may limit their color-displaying efforts to
performing a gamma-correction on them. sRGB specifies a display gamma
of 2.2 under specified viewing conditions. User agents should adjust the
colors given in CSS such that, in combination with an output device's
"natural" display gamma, an effective display gamma of 2.2 is
produced. See the section on gamma correction for further
details. Note that only colors specified in CSS are affected; e.g.,
images are expected to carry their own color information.
Values outside the device gamut should be clipped: the red, green,
and blue values must be changed to fall within the range supported by
the device. For a typical CRT monitor, whose device gamut is the same
as sRGB, the four rules below are equivalent:
Example(s):
em { color: rgb(255,0,0) } /* integer range 0 - 255 */
em { color: rgb(300,0,0) } /* clipped to rgb(255,0,0) */
em { color: rgb(255,-10,0) } /* clipped to rgb(255,0,0) */
em { color: rgb(110%, 0%, 0%) } /* clipped to rgb(100%,0%,0%) */
Other devices, such as printers, have different gamuts than sRGB;
some colors outside the 0..255 sRGB range will be representable
(inside the device gamut), while other colors inside the 0..255 sRGB
range will be outside the device gamut and will thus be clipped.
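For a device whose gamut is assumed to coincide with sRGB, clipping amounts to clamping each component, as in the sketch below. The function names are hypothetical, and rounding percentages to the nearest integer is an assumption of this example rather than something the text above prescribes.

```python
def clip(value: float, low: float, high: float) -> float:
    """Clamp value to the closed range [low, high]."""
    return max(low, min(high, value))


def clip_rgb_integers(r: int, g: int, b: int) -> tuple:
    """Clip integer components to the sRGB range 0..255."""
    return tuple(clip(c, 0, 255) for c in (r, g, b))


def clip_rgb_percentages(r: float, g: float, b: float) -> tuple:
    """Clip percentage components to the range 0%..100%."""
    return tuple(clip(c, 0.0, 100.0) for c in (r, g, b))


def percentages_to_integers(r: float, g: float, b: float) -> tuple:
    """Map clipped percentages onto 0..255 (100% corresponds to 255)."""
    return tuple(round(c * 255 / 100) for c in clip_rgb_percentages(r, g, b))
```

Under that assumption, rgb(300,0,0), rgb(255,-10,0), and rgb(110%, 0%, 0%) all end up as (255, 0, 0).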
Strings can either be written
with double quotes or with single quotes. Double quotes cannot occur
inside double quotes, unless escaped (e.g., as '\"' or as
'\22'). Analogously for single quotes (e.g., "\'" or "\27").
Example(s):
"this is a 'string'"
"this is a \"string\""
'this is a "string"'
'this is a \'string\''
A string cannot directly contain a newline.
To include a newline in a string, use an escape representing the line feed
character in Unicode (U+000A), such as "\A" or "\00000a".
This character represents the generic notion of "newline" in CSS.
See the 'content' property for an example.
It is possible to break strings over several lines, for esthetic
or other reasons, but in such a case the newline itself has to be
escaped with a backslash (\). For instance, the following two
selectors are exactly the same:
Example(s):
a[title="a not s\
o very long title"] {/*...*/}
a[title="a not so very long title"] {/*...*/}
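The equivalence of the two selectors follows from dropping each backslash-newline pair. A minimal sketch, assuming the text passed in is the content between the quotes (the function name is invented for this example):

```python
import re

# First alternative: a backslash followed by a newline, which is removed.
# Second alternative: any other escape, which is kept unchanged so that
# an escaped backslash is never mistaken for a line continuation.
_CONTINUATION_RE = re.compile(r"\\(?:\r\n|[\n\r\f])|(\\.)")


def join_continued_string(text: str) -> str:
    """Remove escaped newlines from the content of a string."""
    return _CONTINUATION_RE.sub(lambda m: m.group(1) or "", text)
```

Applied to the broken title above, the function yields "a not so very long title".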
If a UA does not support a particular value, it should ignore that
value when parsing stylesheets, as if that value was an
illegal value. For example:
Example(s):
h3 {
display: inline;
display: run-in;
}
A UA that supports the 'run-in' value for the 'display' property will
accept the first display declaration and then "write over" that value with
the second display declaration. A UA that does not support the 'run-in'
value will process the first display declaration and ignore the second
display declaration.
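The behavior of the two kinds of UA can be modeled in a few lines. The function winning_value and the sets of supported values are hypothetical and deliberately incomplete; the sketch only shows that unsupported values are skipped and the last supported declaration wins.

```python
def winning_value(declared_values, supported_values):
    """Return the last declared value the UA supports, or None."""
    result = None
    for value in declared_values:
        if value in supported_values:
            result = value  # a later valid declaration overrides
        # unsupported values are ignored, as if they were illegal
    return result


declared = ["inline", "run-in"]
with_run_in = winning_value(declared, {"inline", "block", "run-in"})
without_run_in = winning_value(declared, {"inline", "block"})
```

Here with_run_in is "run-in" and without_run_in is "inline".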
A CSS style sheet is a sequence of characters from the Universal
Character Set (see [ISO10646]). For transmission and
storage, these characters must be encoded by a character encoding that
supports the set of characters available in US-ASCII (e.g., UTF-8, ISO
8859-x, SHIFT JIS, etc.). For a good introduction to character sets
and character encodings, please consult the HTML 4.0
specification ([HTML40], chapter 5). See also the XML 1.0
specification ([XML10], sections 2.2 and 4.3.3, and Appendix F).
When a style sheet is embedded in another document, such as in the
STYLE element or "style" attribute of HTML, the style sheet shares the
character encoding of the whole document.
When a style sheet resides in a separate file, user agents must
observe the following priorities when
determining a style sheet's character
encoding (from highest priority to lowest):
1. An HTTP "charset" parameter in a "Content-Type" field
(or similar parameters in other protocols)
2. BOM and/or @charset (see below)
3. <link charset=""> or other metadata from the linking mechanism (if any)
4. charset of referring stylesheet or document (if any)
5. Assume UTF-8
Authors using an @charset rule must
place the rule at the very beginning of the style sheet, preceded by
no characters. (If a byte order mark is appropriate for the encoding
used, it may precede the @charset rule.)
After "@charset", authors specify
the name of a character encoding (in quotes). For example:
@charset "ISO-8859-1";
@charset must be written literally, i.e., the 10 characters
'@charset "' (lowercase, no backslash escapes), followed by the
encoding name, followed by '";'.
The name must be a charset name as described in the IANA registry.
See [CHARSETS] for a complete list of charsets. Authors should use
the charset names marked as "preferred MIME name" in the IANA
registry.
User agents must support at least the UTF-8 encoding.
User agents must ignore any @charset rule not at the beginning of the
style sheet. When user agents detect the character encoding using the
BOM and/or the @charset rule, they should follow the following rules:
Except as specified in these rules, all @charset rules are ignored.
The encoding is detected based on the stream of bytes that begins
the stylesheet. The following table gives a set of possibilities for
initial byte sequences (written in hexadecimal). The first row that
matches the beginning of the stylesheet gives the result of encoding
detection based on the BOM and/or @charset rule. If no rows match, the
encoding cannot be detected based on the BOM and/or @charset rule. The
notation (...)* refers to repetition for which the best match is the one
that repeats as few times as possible. The bytes marked "XX" are those
used to determine the name of the encoding, by treating them, in the
order given, as a sequence of ASCII characters. Bytes marked "YY" are
similar, but need to be transcoded into ASCII as noted. User agents may
ignore entries in the table if they do not support any encodings
relevant to the entry.
as specified (with LE endianness if not specified)
00 00 FE FF                                  UTF-32-BE
FF FE 00 00                                  UTF-32-LE
00 00 FF FE                                  UTF-32-2143
FE FF 00 00                                  UTF-32-3412
FE FF                                        UTF-16-BE
FF FE                                        UTF-16-LE
7C 83 88 81 99 A2 85 A3 40 7F (YY)* 7F 5E    as specified, transcoded from EBCDIC to ASCII
AE 83 88 81 99 A2 85 A3 40 FC (YY)* FC 5E    as specified, transcoded from IBM1026 to ASCII
00 63 68 61 72 73 65 74 20 22 (YY)* 22 3B    as specified, transcoded from GSM 03.38 to ASCII
analogous patterns                           User agents may support additional, analogous, patterns if they support encodings that are not handled by the patterns here
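The "first matching row" rule can be sketched for the rows above that consist only of a byte order mark. The table and the function sniff_bom are assumptions of this example; rows that depend on an @charset rule are not covered. The order of the entries matters, because FF FE 00 00 also begins with FF FE.

```python
# Byte order mark rows, in table order; the first match wins.
BOM_TABLE = [
    (b"\x00\x00\xfe\xff", "UTF-32-BE"),
    (b"\xff\xfe\x00\x00", "UTF-32-LE"),
    (b"\x00\x00\xff\xfe", "UTF-32-2143"),
    (b"\xfe\xff\x00\x00", "UTF-32-3412"),
    (b"\xfe\xff", "UTF-16-BE"),
    (b"\xff\xfe", "UTF-16-LE"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None."""
    for prefix, encoding in BOM_TABLE:
        if data.startswith(prefix):
            return encoding
    return None  # not detectable from these rows
```

A stream beginning FF FE 68 00 is reported as UTF-16-LE, whereas one beginning FF FE 00 00 is reported as UTF-32-LE.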
If the encoding is detected based on one of the entries in the table
above marked "as specified", the user agent ignores the stylesheet if it
does not parse an appropriate @charset rule at the beginning of the
stream of characters resulting from decoding in the chosen @charset.
This ensures that:
@charset rules should only function if they are in the
encoding of the stylesheet,
byte order marks are ignored only
in encodings that support a byte order mark, and
encoding names cannot contain newlines.
User agents must ignore style sheets in unknown encodings.
A style sheet may have to refer to characters that cannot be
represented in the current character encoding. These characters must
be written as escaped references to
ISO 10646 characters. These escapes serve the same purpose as numeric
character references in HTML or XML documents (see [HTML40],
chapters 5 and 25).
The character escape mechanism should be used when only a few
characters must be represented this way. If most of a style sheet
requires escaping, authors should encode it with a more appropriate
encoding (e.g., if the style sheet contains a lot of Greek characters,
authors might use "ISO-8859-7" or "UTF-8").
Intermediate processors using a different character encoding may
translate these escaped sequences into byte sequences of that
encoding. Intermediate processors must not, on
the other hand, alter escape sequences that cancel the special meaning
of an ASCII character.
Conforming user agents must
correctly map to Unicode all characters in any character encodings
that they recognize (or they must behave as if they did).
For example, a style sheet transmitted as ISO-8859-1
(Latin-1) cannot contain Greek letters directly:
"κουρος" (Greek: "kouros") has to be
written as "\3BA\3BF\3C5\3C1\3BF\3C2".
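A tool that rewrites such characters automatically might look like the sketch below. The function escape_for_encoding is hypothetical; it is meant for the content of an identifier or string, and it appends a space after an escape only when the next character is a hexadecimal digit or whitespace, so that the end of the number stays clear.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"
WHITESPACE = " \t\r\n\f"


def escape_for_encoding(text: str, encoding: str) -> str:
    """Escape characters that the target encoding cannot represent."""
    out = []
    for index, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            escape = f"\\{ord(ch):X}"
            following = text[index + 1:index + 2]
            # Terminate the escape if the next character would otherwise
            # be read as part of it or be swallowed as its terminator.
            if following and (following in HEX_DIGITS
                              or following in WHITESPACE):
                escape += " "
            out.append(escape)
        else:
            out.append(ch)  # representable: keep as is
    return "".join(out)
```

For the example above, escape_for_encoding("κουρος", "iso-8859-1") returns "\3BA\3BF\3C5\3C1\3BF\3C2".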
Note.
In HTML 4.0,
numeric character references are interpreted in "style" attribute
values but not in the content of the STYLE element. Because of this
asymmetry, we recommend that authors use the CSS character
escape mechanism rather than numeric character references
for both the "style" attribute and the STYLE element.
For example, we recommend: