Canonical XML
This document defines a subset of XML called canonical XML.
The intended use of canonical XML is in testing XML processors,
as a representation of the result of parsing an XML document.
Every well-formed XML document has a unique structurally equivalent
canonical XML document. Two structurally equivalent XML
documents have a byte-for-byte identical canonical XML document.
Canonicalizing an XML document requires only information that an XML
processor is required to make available to an application.
A canonical XML document conforms to the following grammar:
CanonXML ::= Pi* element Pi*
element ::= Stag (Datachar | Pi | element)* Etag
Stag ::= '<' Name Atts '>'
Etag ::= '</' Name '>'
Pi ::= '<?' Name ' ' (((Char - S) Char*)? - (Char* '?>' Char*)) '?>'
Atts ::= (' ' Name '=' '"' Datachar* '"')*
Datachar ::= '&amp;' | '&lt;' | '&gt;' | '&quot;'
| '&#9;' | '&#10;' | '&#13;'
| (Char - ('&' | '<' | '>' | '"' | #x9 | #xA | #xD))
Name ::= (see XML spec)
Char ::= (see XML spec)
S ::= (see XML spec)
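The Datachar production can be read as an escaping rule: the four markup characters and the three white space characters tab, line feed and carriage return appear only as references, and every other Char appears literally. The following is a minimal Python sketch of that rule; the names `DATACHAR_ESCAPES` and `escape_datachar` are hypothetical and not part of this document.

```python
# Hypothetical helper illustrating the Datachar production.
# Maps each character that may not appear literally to its escaped form.
DATACHAR_ESCAPES = {
    ord("&"): "&amp;",
    ord("<"): "&lt;",
    ord(">"): "&gt;",
    ord('"'): "&quot;",
    0x9: "&#9;",    # tab
    0xA: "&#10;",   # line feed
    0xD: "&#13;",   # carriage return
}


def escape_datachar(text: str) -> str:
    """Return text rewritten as a sequence of Datachar tokens."""
    return text.translate(DATACHAR_ESCAPES)
```

The same escaping applies to character data and to attribute values, since Atts is defined in terms of Datachar.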
Attributes are in lexicographical order (in Unicode bit order).
A canonical XML document is encoded in UTF-8.
Ignorable white space is considered significant and is treated equivalently
to data.
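Taken together, the grammar and the rules above suggest a straightforward canonicalizer driven by parser events. The following is a minimal sketch built on Python's standard `xml.sax` module, not a reference implementation. The names `CanonHandler` and `canonicalize` are hypothetical, and the sketch assumes that Python's default string ordering (by code point) is an acceptable reading of "Unicode bit order" for attribute names.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Characters that Datachar allows only in escaped form.
ESCAPES = {
    ord("&"): "&amp;", ord("<"): "&lt;", ord(">"): "&gt;",
    ord('"'): "&quot;", 0x9: "&#9;", 0xA: "&#10;", 0xD: "&#13;",
}


class CanonHandler(ContentHandler):
    """Hypothetical SAX handler that collects canonical XML output."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def startElement(self, name, attrs):
        self.parts.append("<" + name)
        # Assumption: sorting names by code point matches the required order.
        for key in sorted(attrs.getNames()):
            value = attrs.getValue(key).translate(ESCAPES)
            self.parts.append(' %s="%s"' % (key, value))
        self.parts.append(">")

    def endElement(self, name):
        # Empty elements are always written as a start tag and an end tag.
        self.parts.append("</" + name + ">")

    def characters(self, content):
        self.parts.append(content.translate(ESCAPES))

    def ignorableWhitespace(self, whitespace):
        # Ignorable white space is treated the same as data.
        self.parts.append(whitespace.translate(ESCAPES))

    def processingInstruction(self, target, data):
        self.parts.append("<?%s %s?>" % (target, data))


def canonicalize(document: bytes) -> bytes:
    """Parse a well-formed document and return its canonical form in UTF-8."""
    handler = CanonHandler()
    xml.sax.parseString(document, handler)
    return "".join(handler.parts).encode("utf-8")
```

Under these assumptions, two inputs that differ only in attribute order, quoting style, empty-element syntax or the choice of character references should produce identical bytes, for example `canonicalize(b'<doc b="2" a="1"><e/></doc>')` and `canonicalize(b"<doc a='1' b='2'><e></e></doc>")`.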
James Clark