com.caucho.xml
XML parsing and printing package.
Parsing
Applications that need strict XML parsing will typically create their
parsers through the JAXP API or the SAX API. Applications that need
to parse HTML will generally instantiate the parser classes directly.
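For the strict XML case, the JAXP route can be sketched with the standard JDK classes alone. This is a minimal illustration, not code from this package: `JaxpDomSketch` and `rootName` are hypothetical names, and the parser that actually runs is whichever JAXP implementation is configured for the JVM.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of com.caucho.xml.
class JaxpDomSketch {
  // Parses a strict XML string through JAXP and returns the root element name.
  // Malformed input is reported as an IllegalArgumentException.
  static String rootName(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      DocumentBuilder builder = factory.newDocumentBuilder();
      Document doc = builder.parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getNodeName();
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("could not parse input as XML", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(rootName("<em>test xml doc</em>"));
  }
}
```

A strict parser rejects input that is not well-formed, so a call such as `rootName("<em>unclosed")` ends in the exception branch rather than returning a partial tree.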
There are four parser flavors with the same API:
Xml, LooseXml, Html and LooseHtml. The core of the API is in AbstractParser.
- Xml is a strict XML parser.
- LooseXml is a forgiving XML parser.
- Html is a strict HTML parser.
- LooseHtml is a forgiving HTML parser.
You can parse a document into a DOM tree, or you can use the SAX
callback API. The methods for both are documented in AbstractParser.
DOM parsing looks like:
Document doc = new Html().parseDocument("test.html");
Parsing directly from a string looks like:
String str = "<em>test html doc</em>";
Document doc = new Html().parseDocumentString(str);
SAX parsing looks like:
Html html = new Html();
html.setContentHandler(myContentHandler);
html.parse("test.html");
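The `myContentHandler` in the SAX example is not defined on this page. As a rough sketch of what such a handler could look like, the class below extends the standard `org.xml.sax.helpers.DefaultHandler` and records element names as they open. `ElementNameHandler` and `SaxSketch` are hypothetical names, and the sketch drives the handler with the JDK's own SAX parser so that it runs without this package on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records the qualified name of every element as it opens.
class ElementNameHandler extends DefaultHandler {
  final List<String> names = new ArrayList<>();

  @Override
  public void startElement(String uri, String localName, String qName, Attributes attributes) {
    names.add(qName);
  }
}

// Hypothetical driver that feeds a string to the JDK SAX parser.
class SaxSketch {
  static List<String> elementNames(String xml) {
    try {
      XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
      ElementNameHandler handler = new ElementNameHandler();
      reader.setContentHandler(handler);
      reader.parse(new InputSource(new StringReader(xml)));
      return handler.names;
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalArgumentException("could not parse input", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(elementNames("<p><em>test</em> doc</p>"));
  }
}
```

Because the handler only depends on the standard SAX interfaces, an instance of it could presumably be passed to `setContentHandler` on one of the parsers described above in the same way.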