Character reference
An HTML character reference is a formatted pattern of characters that is used to represent another character in the rendered web page.
Character references are used as replacements for characters that are reserved in HTML, such as the less-than (`<`) and greater-than (`>`) symbols, which the HTML parser uses to identify element tags, or the `"` and `'` characters within attributes, whose values may be enclosed by those characters.
They can also be used for invisible characters that would otherwise be impossible to type, such as non-breaking spaces and control characters like the left-to-right and right-to-left marks, and for characters that are hard to type on a standard keyboard.
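As a minimal sketch of replacing reserved characters, the example below uses the `html` module from Python's standard library. The sample string `raw` is made up for illustration, and Python is only one of many environments that offer such a helper.

```python
import html

# Hypothetical text containing characters that are reserved in HTML.
raw = 'Tom & Jerry say "1 < 2" and \'3 > 2\''

# html.escape() replaces &, <, and > with named references. With quote=True
# (the default) it also replaces " with &quot; and ' with &#x27;.
escaped = html.escape(raw)
print(escaped)

# html.unescape() converts the references back to the original characters.
assert html.unescape(escaped) == raw
```

Note that this helper emits a hexadecimal numeric reference for the apostrophe rather than a named one; either form represents the same character.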
There are three types of character references:
- Named character references
  - These use a name string between an ampersand (`&`) and a semicolon (`;`) to refer to the corresponding character. For example, `&lt;` is used for the less-than (`<`) symbol, and `&copy;` for the copyright symbol (`©`). The string used for the reference is often a camel-cased initialism or contraction of the character name.
- Decimal numeric character references
  - These references start with `&#`, followed by one or more ASCII digits representing the base-ten integer that corresponds to the character's Unicode code point, and end with `;`. For example, the decimal character reference for `<` is `&#60;`, because the Unicode code point for the symbol is `U+0003C`, and `3C` hexadecimal is 60 in decimal.
- Hexadecimal numeric character references
  - These references start with `&#x` or `&#X`, followed by one or more ASCII hex digits representing the hexadecimal integer that corresponds to the character's Unicode code point, and end with `;`. For example, the hexadecimal character reference for `<` is `&#x3C;` or `&#X3C;`, because the Unicode code point for the symbol is `U+0003C`.
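The relationship between the three forms can be sketched in Python. The helper names `decimal_reference` and `hex_reference` are hypothetical and exist only for this illustration; `html.unescape` is the standard-library decoder.

```python
import html


def decimal_reference(char: str) -> str:
    """Hypothetical helper: build the decimal reference for one character."""
    return f"&#{ord(char)};"


def hex_reference(char: str) -> str:
    """Hypothetical helper: build the hexadecimal reference for one character."""
    return f"&#x{ord(char):X};"


# The code point of "<" is U+003C, which is 60 in decimal.
print(decimal_reference("<"))  # &#60;
print(hex_reference("<"))      # &#x3C;

# The named, decimal, and both hexadecimal spellings decode to the same character.
for reference in ("&lt;", "&#60;", "&#x3C;", "&#X3C;"):
    assert html.unescape(reference) == "<"
```

Numeric references work for any code point, whereas named references exist only for characters that have been assigned a name.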
A very small subset of useful named character references, along with their Unicode code points, is listed below.
Character | Named reference | Unicode code point |
---|---|---|
& | `&amp;` | U+00026 |
< | `&lt;` | U+0003C |
> | `&gt;` | U+0003E |
" | `&quot;` | U+00022 |
' | `&apos;` | U+00027 |
(non-breaking space) | `&nbsp;` | U+000A0 |
– | `&ndash;` | U+02013 |
— | `&mdash;` | U+02014 |
© | `&copy;` | U+000A9 |
® | `&reg;` | U+000AE |
™ | `&trade;` | U+02122 |
≈ | `&asymp;` | U+02248 |
≠ | `&ne;` | U+02260 |
£ | `&pound;` | U+000A3 |
€ | `&euro;` | U+020AC |
° | `&deg;` | U+000B0 |
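One way to check a table like this is to decode each named reference and compare the result with the listed code point. The sketch below assumes Python's `html.unescape`, which recognizes the HTML5 named references. The code points are copied from the table, and the names are the standard HTML names for those characters.

```python
import html

# Reference name -> Unicode code point, for the characters in the table.
TABLE = {
    "amp": 0x26, "lt": 0x3C, "gt": 0x3E, "quot": 0x22, "apos": 0x27,
    "nbsp": 0xA0, "ndash": 0x2013, "mdash": 0x2014, "copy": 0xA9,
    "reg": 0xAE, "trade": 0x2122, "asymp": 0x2248, "ne": 0x2260,
    "pound": 0xA3, "euro": 0x20AC, "deg": 0xB0,
}

for name, code_point in TABLE.items():
    # Decode the named reference and confirm it yields the expected character.
    decoded = html.unescape(f"&{name};")
    assert decoded == chr(code_point), name
    print(f"&{name}; -> U+{code_point:04X}")
```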
The full list of HTML named character references can be found in the HTML specification.