
Numerical Character Reference Entities... Nomenclature

It used to be so simple. Or so I thought. nbsp is an entity; &nbsp; is, therefore, an entity reference (a reference to an entity); &#160; is a character reference (a referen

Solution 1:

I am basing this answer on the HTML5 specification, which I usually treat as trustworthy, although it is a working draft so subject to change.

nbsp is a "character reference name" (but the spec also calls it an "entity name")

&nbsp; is a "named character reference"

&#160; is a "decimal numeric character reference"

There is another option too:

&#x2020; is a "hexadecimal numeric character reference"
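To see that the three forms are just different spellings of the same character, here is a small sketch using Python's standard `html` module. Python is my choice for illustration, not something the spec prescribes, and the variable names are made up for this example.

```python
import html
from html.entities import name2codepoint

# The bare name "nbsp" maps to a code point (U+00A0, decimal 160).
code_point = name2codepoint["nbsp"]

named = html.unescape("&nbsp;")        # named character reference
decimal = html.unescape("&#160;")      # decimal numeric character reference
hexadecimal = html.unescape("&#xA0;")  # hexadecimal numeric character reference

# All three references resolve to the same single character.
print(code_point, named == decimal == hexadecimal == chr(code_point))
# prints: 160 True
```

The name on its own is only a lookup key; it becomes a reference once it is wrapped in `&` and `;`.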

Solution 2:

You are correct except that nbsp is not an entity but an entity name. The entity is the thing that the entity reference refers to, in this case the no-break space character.

The entity reference can also be called named entity reference (since SGML in general allows other types of entity reference, too). Similarly, the character reference can be called numeric character reference (to distinguish it from certain SGML concepts that never applied in HTML).

This is the SGML (ISO 8879) terminology, which HTML specifications up to and including HTML 4.01 nominally adhere to by their formal references to the SGML standard.

(Even HTML specifications use SGML terms sloppily, though. And in fact, HTML was never implemented as SGML-based, though some features of SGML are reflected in implementations.)

XHTML is based on XML, which is a simplification of SGML and formally defined as standalone. XML uses the terms entity reference and character reference, like SGML, but it does not use the longer names "named entity reference" and "numeric character reference".
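The difference between the two kinds of reference is visible in practice in XML. A character reference needs no declaration, while an entity reference only works if the entity is declared (XML itself predefines just five: amp, lt, gt, apos, quot). A rough sketch, assuming Python's `xml.etree.ElementTree` as a stand-in for an XML parser reading a document with no DTD; the helper name `parses` is hypothetical:

```python
import xml.etree.ElementTree as ET

def parses(document: str) -> bool:
    """Return True if the string is accepted as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

char_ref_ok = parses("<p>&#160;</p>")    # character reference: no declaration needed
predefined_ok = parses("<p>&amp;</p>")   # entity reference to a predefined entity
undeclared_ok = parses("<p>&nbsp;</p>")  # entity reference to an undeclared entity

print(char_ref_ok, predefined_ok, undeclared_ok)
# prints: True True False
```

XHTML documents get nbsp and the other names from the XHTML DTDs, which is why the same reference is accepted there.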

HTML5 is something different: designed to be independent of SGML and XML. It also introduces its own terminology.
