Configuring Woodstox I ‐ Basic Stax Properties
Beyond standard Stax and SAX configurability, Woodstox offers a much wider variety of configuration settings. When using the Stax API, all configuration goes through the same standard Stax calls, regardless of whether the setting itself is defined as part of Stax:
XMLInputFactory inputF = XMLInputFactory.newFactory();
inputF.setProperty(<property-to-set>, <value>);
XMLOutputFactory outputF = XMLOutputFactory.newFactory();
outputF.setProperty(<property-to-set>, <value>);
where property-to-set is a String with one of the pre-defined constant values. The value is often of type Boolean, but not always: this depends on the configuration setting in question.
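As a concrete illustration of the pattern above, the following minimal sketch sets one standard Boolean property on each factory and reads it back. The class and method names (StaxFactoryConfig, newCoalescingInputFactory, newRepairingOutputFactory) are made up for this example; only the javax.xml.stream calls and constants come from the Stax API.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;

// Hypothetical helper class, used only for illustration.
final class StaxFactoryConfig {

    /** Creates an input factory with the standard IS_COALESCING property enabled. */
    static XMLInputFactory newCoalescingInputFactory() {
        XMLInputFactory inputF = XMLInputFactory.newFactory();
        inputF.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        return inputF;
    }

    /** Creates an output factory with the standard IS_REPAIRING_NAMESPACES property enabled. */
    static XMLOutputFactory newRepairingOutputFactory() {
        XMLOutputFactory outputF = XMLOutputFactory.newFactory();
        outputF.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, Boolean.TRUE);
        return outputF;
    }

    public static void main(String[] args) {
        // getProperty() uses the same String constants as setProperty()
        System.out.println(newCoalescingInputFactory()
                .getProperty(XMLInputFactory.IS_COALESCING));
        System.out.println(newRepairingOutputFactory()
                .getProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES));
    }
}
```

Because only standard constants are used, the same code should work with whichever Stax implementation XMLInputFactory.newFactory() and XMLOutputFactory.newFactory() happen to locate, Woodstox included.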
These constants can be divided into 3 different groups:
- Standard Stax properties (see XMLInputFactory and XMLOutputFactory for constants): implemented by all compliant Stax implementations
- Stax2 extension API (XMLInputFactory2 and XMLOutputFactory2): implemented by all Stax2-compliant parsers (currently this means Woodstox and Aalto)
- Woodstox-specific properties (WstxInputProperties, WstxOutputProperties), supported only by Woodstox itself
In the following, these categories are described in more detail.
The set of standard properties is covered by the JDK Javadocs (and I link entries below). Most are Boolean valued: I only mention the type if it is something different.
XMLInputFactory defines a few settings; the most important are:
- IS_COALESCING: if enabled, the parser will ensure that all adjacent text (“cdata”) segments are combined into a single CHARACTERS event. If disabled, text segments may be returned in an arbitrary number of events (of type CHARACTERS and CDATA), often split at places where entities are used
- IS_NAMESPACE_AWARE: whether namespace processing is enabled or not. If disabled, namespace binding does not occur, and the full element/attribute name is reported as the “local name” (for example: xml:space would have a local name of “xml:space”, and no namespace prefix or URI). If enabled, namespace declarations are handled and prefix/namespace binding is applied as expected
- SUPPORT_DTD: whether DTD subset (definition) processing is enabled or not. If enabled, DTD definitions are read (both internal and external), and parsed entities are expanded. If disabled, the internal DTD subset is skipped and the external subset is not read.
  NOTE: if disabled, no DTD validation occurs, regardless of other settings
- IS_VALIDATING: whether DTD validation is enabled or not (note: does not affect XML Schema, Relax NG, or other validation settings).
  NOTE: only takes effect if SUPPORT_DTD is also enabled
- RESOLVER: unlike other options, NOT of type Boolean but XMLResolver. Allows overriding the reading of external DTD subsets (and parsed external entities defined from there), to (for example) add caching, or allow rewriting, replacing or just removing external DTD definitions. Often used for security purposes to simply prevent external reads
- IS_SUPPORTING_EXTERNAL_ENTITIES: if DTD processing is enabled (see SUPPORT_DTD), external entities (references to external resources outside of the XML document or DTD subset itself) are recognized and processed. However, their expansion can be turned off by disabling this setting. This is typically done for security reasons: if XML content comes from untrusted sources, enabling expansion is not a good idea.
  If disabled, entities are only reported as entity references; if enabled, entities are expanded as per the XML specification and reported as XML tokens.
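The input settings listed above could be combined roughly as follows for content from untrusted sources. This is only a sketch under stated assumptions: the class StaxInputConfig, its methods and the denyAll resolver are hypothetical names invented here, and throwing from the resolver is just one possible way to prevent external reads.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLResolver;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, used only for illustration.
final class StaxInputConfig {

    /** Builds a factory that merges adjacent text and refuses DTD and external entity processing. */
    static XMLInputFactory newHardenedFactory() {
        XMLInputFactory inputF = XMLInputFactory.newFactory();
        inputF.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        inputF.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.TRUE);
        inputF.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        inputF.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        // RESOLVER takes an XMLResolver rather than a Boolean.
        // Assumption: rejecting every lookup is acceptable for this use case.
        XMLResolver denyAll = (publicId, systemId, baseUri, namespace) -> {
            throw new XMLStreamException("External entity blocked: " + systemId);
        };
        inputF.setProperty(XMLInputFactory.RESOLVER, denyAll);
        return inputF;
    }

    /** Collects the text of every CHARACTERS or CDATA event, one list entry per event. */
    static List<String> textEvents(XMLInputFactory inputF, String xml) {
        List<String> texts = new ArrayList<>();
        try {
            XMLStreamReader reader = inputF.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA) {
                    texts.add(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return texts;
    }

    public static void main(String[] args) {
        // With IS_COALESCING enabled, the plain text and the CDATA section
        // are expected to arrive as one combined CHARACTERS event.
        System.out.println(textEvents(newHardenedFactory(), "<a>x<![CDATA[y]]>z</a>"));
    }
}
```

Disabling IS_COALESCING in the same sketch would let the parser report the text in several events, which is why callers that do not coalesce should be prepared to concatenate segments themselves.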
XMLOutputFactory has only one configuration setting:
- IS_REPAIRING_NAMESPACES: more properly called “automatic namespaces”; enabling it removes the need to declare namespace bindings before use. If enabled, passing namespace prefixes is optional, and all namespace declarations are automatically written by XMLStreamWriter. If disabled, the caller must explicitly write namespace declarations. Note that in the latter case the output may not be namespace-compliant if namespace declarations (bindings) are missing or misplaced.
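To make the difference tangible, this small sketch writes the same namespaced element with the setting on and off, deliberately never calling writeNamespace(). The class name RepairingNamespacesDemo, the prefix "ex" and the URI "urn:example" are placeholders chosen for this example.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical demo class, used only for illustration.
final class RepairingNamespacesDemo {

    /** Writes one namespaced element without an explicit namespace declaration. */
    static String writeRoot(boolean repairing) {
        XMLOutputFactory outputF = XMLOutputFactory.newFactory();
        outputF.setProperty(XMLOutputFactory.IS_REPAIRING_NAMESPACES, Boolean.valueOf(repairing));
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter writer = outputF.createXMLStreamWriter(out);
            // No writeNamespace() call: in repairing mode the writer is expected
            // to add the missing xmlns declaration on its own.
            writer.writeStartElement("ex", "root", "urn:example");
            writer.writeCharacters("hi");
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Writing failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println("repairing on:  " + writeRoot(true));
        System.out.println("repairing off: " + writeRoot(false));
    }
}
```

With the setting off, the second result uses the "ex" prefix without binding it, which is exactly the kind of non-namespace-compliant output the note above warns about.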
Continue reading for the wider sets of configuration beyond basic Stax configuration settings: