Entities are variables that can be used in XML content to maintain consistency. For example:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
<!ENTITY nbsp "&#xA0;">
<!ENTITY writer "Writer: Donald Duck.">
<!ENTITY copyright "Copyright: W3Schools.">
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body attr="&writer;">Don't forget me this weekend!</body>
<footer>&writer;&nbsp;&copyright;</footer>
</note>
You can define your own entities using DOCTYPE. By default, FXP supports the following XML entities:
Entity name | Character | Decimal reference | Hexadecimal reference |
---|---|---|---|
quot | " | &#34; | &#x22; |
amp | & | &#38; | &#x26; |
apos | ' | &#39; | &#x27; |
lt | < | &#60; | &#x3C; |
gt | > | &#62; | &#x3E; |
However, since entity processing can drastically impact the parser's performance, you can set processEntities: false to disable it.
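XML Builder replaces the default entity characters in values with their entity references. For example:
const { XMLBuilder } = require("fast-xml-parser");
const jsObj = {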
"note": {
"@heading": "Reminder > \"Alert",
"body": {
"#text": " 3 < 4",
"attr": "Writer: Donald Duck."
},
}
};
const options = {
attributeNamePrefix: "@",
ignoreAttributes: false,
// processEntities: false
};
const builder = new XMLBuilder(options);
const output = builder.build(jsObj);
Output:
<note heading="Reminder &gt; &quot;Alert">
<body>
3 &lt; 4
<attr>Writer: Donald Duck.</attr>
</body>
</note>
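The replacement shown in the output above can be sketched in plain JavaScript. This is only an illustration, not FXP's actual implementation: `escapeXml` is a hypothetical helper, and it assumes the five default entities are substituted one after another, with `&` handled first so that the ampersands introduced by later replacements are not escaped a second time.

```javascript
// Hypothetical helper (not part of FXP): escapes the five default XML entities.
// "&" must be replaced first, otherwise the "&" in "&gt;" etc. would be escaped again.
function escapeXml(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

const escapedHeading = escapeXml('Reminder > "Alert');
const escapedText = escapeXml(" 3 < 4");

console.log(escapedHeading); // Reminder &gt; &quot;Alert
console.log(escapedText);    //  3 &lt; 4
```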
Though FXP silently ignores entities with & in their values, the following side effect is still possible:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
<!ENTITY nbsp "writer;">
<!ENTITY writer "Writer: Donald Duck.">
<!ENTITY copyright "Copyright: W3Schools.">
]>
<note>
<heading>Reminder</heading>
<body attr="&writer;">Don't forget me this weekend!</body>
<footer>&writer;&&nbsp;&copyright;</footer>
</note>
Output:
{
"note": {
"heading": "Reminder",
"body": {
"#text": "Don't forget me this weekend!",
"attr": "Writer: Donald Duck."
},
"footer": "Writer: Donald Duck.Writer: Donald Duck.Copyright: W3Schools."
}
}
To avoid this situation, use &amp; instead of a bare & in the XML document.
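A small sketch may help explain where the duplicated text comes from. It is not FXP's code: `expandEntities` is a hypothetical helper that assumes each entity is applied as a plain global string substitution, in declaration order, with `amp` decoded last. Under that assumption, replacing `&nbsp;` with `writer;` right after a bare `&` produces a new `&writer;` reference, which is then expanded in turn.

```javascript
// Hypothetical helper (not part of FXP): substitutes each entity in order.
function expandEntities(text, entities) {
  let result = text;
  for (const [name, value] of entities) {
    // split/join performs a literal, global replacement of "&name;"
    result = result.split(`&${name};`).join(value);
  }
  return result;
}

// Same declarations as in the DOCTYPE above; "amp" is assumed to be decoded last.
const entities = [
  ["nbsp", "writer;"],
  ["writer", "Writer: Donald Duck."],
  ["copyright", "Copyright: W3Schools."],
  ["amp", "&"],
];

// A bare "&" followed by "&nbsp;" turns into a second "&writer;" reference.
const unescaped = expandEntities("&writer;&&nbsp;&copyright;", entities);
// Writing "&amp;" keeps the ampersand literal, so no new reference is formed.
const escaped = expandEntities("&writer;&amp;&nbsp;&copyright;", entities);

console.log(unescaped); // Writer: Donald Duck.Writer: Donald Duck.Copyright: W3Schools.
console.log(escaped);   // Writer: Donald Duck.&writer;Copyright: W3Schools.
```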
The following attacks are possible due to entity processing:
- Denial-of-Service Attacks
- Classic XXE
- Advanced XXE
- Server-Side Request Forgery (SSRF)
- XInclude
- XSLT
Since FXP doesn't allow entities with & in their values, the above attacks should not work.
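To illustrate why this restriction helps, here is a minimal sketch of the idea applied to a nested-entity ("billion laughs" style) denial-of-service payload. It is an assumption-laden illustration rather than FXP's implementation: `collectSafeEntities` is a hypothetical helper that simply drops every declaration whose value contains `&`, so no entity can refer to another one and expansion cannot grow exponentially.

```javascript
// Hypothetical helper (not part of FXP): keeps only entities whose value has no "&".
function collectSafeEntities(declarations) {
  const accepted = new Map();
  for (const [name, value] of declarations) {
    if (!value.includes("&")) {
      accepted.set(name, value);
    }
  }
  return accepted;
}

// Each level references the previous one several times, which is what makes
// naive recursive expansion blow up.
const declarations = [
  ["lol", "lol"],
  ["lol2", "&lol;&lol;&lol;&lol;"],
  ["lol3", "&lol2;&lol2;&lol2;&lol2;"],
];

const accepted = collectSafeEntities(declarations);
const acceptedNames = [...accepted.keys()];

console.log(acceptedNames); // [ 'lol' ]
```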
The following HTML entities are supported by the parser when htmlEntities: true is set.
Result | Description | Entity Name | Entity Number |
---|---|---|---|
 | non-breaking space | &nbsp; | &#160; |
< | less than | &lt; | &#60; |
> | greater than | &gt; | &#62; |
& | ampersand | &amp; | &#38; |
" | double quotation mark | &quot; | &#34; |
' | single quotation mark (apostrophe) | &apos; | &#39; |
¢ | cent | &cent; | &#162; |
£ | pound | &pound; | &#163; |
¥ | yen | &yen; | &#165; |
€ | euro | &euro; | &#8364; |
© | copyright | &copy; | &#169; |
® | registered trademark | &reg; | &#174; |
₹ | Indian Rupee | &inr; | &#8377; |
In addition, numeric character references are supported, both decimal (num_dec) and hexadecimal (num_hex).
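As a rough sketch of what decoding numeric character references involves, the snippet below converts both forms to characters. `decodeNumericReferences` is a hypothetical helper written for illustration only; FXP's own patterns and limits may differ, and this version does not guard against code points above U+10FFFF, for which `String.fromCodePoint` throws a `RangeError`.

```javascript
// Hypothetical helper (not part of FXP): decodes &#NNN; and &#xHHH; references.
function decodeNumericReferences(text) {
  return text
    // Hexadecimal form, e.g. &#x20B9;
    .replace(/&#x([0-9a-fA-F]+);/g, (match, hex) =>
      String.fromCodePoint(parseInt(hex, 16)))
    // Decimal form, e.g. &#8377;
    .replace(/&#([0-9]+);/g, (match, dec) =>
      String.fromCodePoint(parseInt(dec, 10)));
}

const decoded = decodeNumericReferences("&#8377; &#x20B9; &#169;");

console.log(decoded); // ₹ ₹ ©
```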
In a future version of FXP, we'll support more DOCTYPE features such as ELEMENT, reading the content of an entity from a file, etc.
You can set external entities without using DOCTYPE.
const { XMLParser } = require("fast-xml-parser");
const xmlData = `<note>&unknown;&#xD;last</note> `;
const parser = new XMLParser();
parser.addEntity("#xD", "\r"); // &unknown;\rlast
let result = parser.parse(xmlData);
This way, you can also override the default entities.