Parse HTML character references: fast, spec-compliant, positional information.
This package is ESM only: Node 12+ is needed to use it and it must be `import`ed instead of `require`d.
npm:
npm install parse-entities
import {parseEntities} from 'parse-entities'
parseEntities('alpha &amp bravo')
// => alpha & bravo
parseEntities('charlie &copycat; delta')
// => charlie ©cat; delta
parseEntities('echo &copy; foxtrot &#8800; golf &#x1D306; hotel')
// => echo © foxtrot ≠ golf 𝌆 hotel
This package exports the following identifier: `parseEntities`.
There is no default export.
Additional character to accept (`string?`, default: `''`).
This allows other characters, without error, when following an ampersand.
Whether to parse `value` as an attribute value (`boolean?`, default: `false`).
Whether to allow non-terminated entities (`boolean`, default: `true`).
For example, `&copycat` is decoded as `©cat`.
This behavior is spec-compliant but can lead to unexpected results.
Error handler (`Function?`).
Text handler (`Function?`).
Reference handler (`Function?`).
Context used when invoking `warning` (`'*'`, optional).
Context used when invoking `text` (`'*'`, optional).
Context used when invoking `reference` (`'*'`, optional).
Starting `position` of `value` (`Position` or `Point`, optional).
Useful when dealing with values nested in some sort of syntax tree.
The default is:
{line: 1, column: 1, offset: 0}
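As a rough illustration of the nested-value case, the sketch below derives a starting position from a syntax tree node. The node shape (`position.start` with `line`, `column`, and `offset`), the helper name `optionsForNode`, and the option key `position` are assumptions made for this example. Nothing here calls the package itself.

```javascript
// Default starting position, as documented above.
const defaultStart = {line: 1, column: 1, offset: 0}

// Hypothetical helper: build an options object for a value found in a node.
// Assumes the node carries `position.start` as a point; falls back to the
// documented default when it does not.
function optionsForNode(node) {
  const start = node && node.position ? node.position.start : undefined
  return {position: start ? {...start} : {...defaultStart}}
}

// A made-up text node that starts on line 3, column 5 of its document.
const node = {
  type: 'text',
  value: 'echo &copy; foxtrot',
  position: {
    start: {line: 3, column: 5, offset: 24},
    end: {line: 3, column: 24, offset: 43}
  }
}

const options = optionsForNode(node)
console.log(options.position)
```

With the package installed, the result could then be handed over as the second argument, for instance `parseEntities(node.value, optionsForNode(node))`, so that reported places are relative to the whole document rather than to the value alone.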
Returns the decoded value (`string`).
Error handler.
`this` refers to `warningContext` when given to `parseEntities`.
Human-readable reason for the error (`string`).
Place at which the parse error occurred (`Point`).
Machine-readable code for the error (`number`).
The following codes are used:
Code | Example | Note |
---|---|---|
1 | `foo &amp bar` | Missing semicolon (named) |
2 | `foo &#123 bar` | Missing semicolon (numeric) |
3 | `Foo &bar baz` | Ampersand did not start a reference |
4 | `Foo &#` | Empty reference |
5 | `Foo &bar; baz` | Unknown entity |
6 | `Foo &#128; baz` | Disallowed reference |
7 | `Foo &#xD800; baz` | Prohibited: outside permissible unicode range |
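To make the handler signature and the codes concrete, here is a minimal sketch of an error handler that records warnings on its context object. The handler name `onWarning`, the `messages` field, and the arguments in the two calls are made up for illustration. The handler is invoked by hand rather than by the package, with arguments in the order described above (reason, place, code).

```javascript
// Labels copied from the table of codes above.
const labels = {
  1: 'Missing semicolon (named)',
  2: 'Missing semicolon (numeric)',
  3: 'Ampersand did not start a reference',
  4: 'Empty reference',
  5: 'Unknown entity',
  6: 'Disallowed reference',
  7: 'Prohibited: outside permissible unicode range'
}

// Hypothetical handler: formats each warning and stores it on `this`,
// which is expected to be the warning context object.
function onWarning(reason, point, code) {
  const label = labels[code] || reason
  this.messages.push(
    point.line + ':' + point.column + ': ' + label + ' [' + code + ']'
  )
}

const context = {messages: []}

// Made-up calls, shaped like the parameters described above.
onWarning.call(context, 'example reason', {line: 1, column: 9, offset: 8}, 1)
onWarning.call(context, 'example reason', {line: 2, column: 4, offset: 20}, 99)

console.log(context.messages)
```

In real use the function would be given as the `warning` option and the object as `warningContext`. Falling back to `reason` keeps the handler usable if a code outside the table ever shows up.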
Text handler.
`this` refers to `textContext` when given to `parseEntities`.
String of content (`string`).
Location at which `value` starts and ends (`Position`).
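A text handler along these lines could gather the plain-text runs together with their offsets. This is only a sketch: `onText`, the `chunks` field, and the two calls are invented for the example, and the positions are written out by hand for the input `echo &copy; foxtrot` instead of coming from the package.

```javascript
// Hypothetical text handler: stores each run of text with its start and end
// offsets on `this`, which is expected to be the text context object.
function onText(value, position) {
  this.chunks.push({
    value,
    start: position.start.offset,
    end: position.end.offset
  })
}

const textContext = {chunks: []}

// Made-up calls for the two text runs around `&copy;`.
onText.call(textContext, 'echo ', {
  start: {line: 1, column: 1, offset: 0},
  end: {line: 1, column: 6, offset: 5}
})
onText.call(textContext, ' foxtrot', {
  start: {line: 1, column: 12, offset: 11},
  end: {line: 1, column: 20, offset: 19}
})

// Everything that was not a character reference, joined back together.
const plain = textContext.chunks.map((chunk) => chunk.value).join('')
console.log(plain)
```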
Character reference handler.
`this` refers to `referenceContext` when given to `parseEntities`.
Decoded character reference (`string`).
Location at which `value` starts and ends (`Position`).
Source of character reference (`string`).
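The three parameters can be put to work in a small tally of which references occur in a value. The sketch below assumes that `value` is the decoded character and `source` is the raw text such as `&copy;`. The names `onReference`, `seen`, and `span` are hypothetical, and the calls are written by hand for the input `echo &copy; &copy; &#8800;`.

```javascript
// Hypothetical reference handler: counts occurrences per raw source and
// remembers where each one was first seen. `this` is expected to be the
// reference context object.
function onReference(value, position, source) {
  const entry = this.seen.get(source) || {value, count: 0, first: position.start}
  entry.count++
  this.seen.set(source, entry)
}

// Helper for this example only: a single-line position from two offsets.
function span(startOffset, endOffset) {
  return {
    start: {line: 1, column: startOffset + 1, offset: startOffset},
    end: {line: 1, column: endOffset + 1, offset: endOffset}
  }
}

const referenceContext = {seen: new Map()}

// Made-up calls, one per reference in the example input.
onReference.call(referenceContext, '©', span(5, 11), '&copy;')
onReference.call(referenceContext, '©', span(12, 18), '&copy;')
onReference.call(referenceContext, '≠', span(19, 26), '&#8800;')

for (const [source, entry] of referenceContext.seen) {
  console.log(source, entry.value, entry.count, entry.first.offset)
}
```

Keying on `source` rather than `value` keeps differently written references to the same character apart, which may or may not be what a given tool wants.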
`stringify-entities`: Encode HTML character references
`character-entities`: Info on character entities
`character-entities-html4`: Info on HTML4 character entities
`character-entities-legacy`: Info on legacy character entities
`character-reference-invalid`: Info on invalid numeric character references