

Jericho HTML Parser is a simple but powerful Java library for analysing and manipulating parts of an HTML document, including some common server-side tags, while reproducing verbatim any unrecognised or invalid HTML. It also provides high-level HTML form manipulation functions.



Jericho HTML Parser (jericho-html) release 1.0

A simple but powerful Java library for parsing and modifying HTML documents, including analysis of arbitrary HTML forms to determine the structure of submitted data.

The Jericho HTML Parser is an open source library released under the GNU Lesser General Public License (LGPL). You are therefore free to use it in commercial applications subject to the terms detailed in the licence document.

For downloads, support and updates visit the SourceForge.net project page at http://sourceforge.net/projects/jerichohtml/

For a summary of features and a comparison with some other Java HTML parsers, visit the homepage at http://jerichohtml.sourceforge.net

Modifying an HTML Document

The typical method for modifying a document is as follows. See the description of the {@link au.id.jericho.lib.html.OutputDocument} class for sample code.

  1. Create a {@link au.id.jericho.lib.html.Source} object from the source text
  2. Find the required segments by calling methods on the Source object and other segments
  3. Create an {@link au.id.jericho.lib.html.OutputDocument} object from the source text
  4. Add an {@link au.id.jericho.lib.html.IOutputSegment} to the OutputDocument for each segment of the document that is to be replaced with other text
  5. Call the {@link au.id.jericho.lib.html.OutputDocument#toString()} method to get the final output
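
The idea behind steps 3 to 5 (record a replacement for each located segment, then splice the replacements into the otherwise untouched source text) can be illustrated with a minimal sketch that uses only the JDK. It deliberately does not call the Jericho API: the names SegmentReplaceSketch, Replacement, Output and replaceTitle are hypothetical stand-ins invented for this illustration, and the naive indexOf lookup stands in for the library's real segment search. It also assumes the recorded replacements do not overlap.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SegmentReplaceSketch {

    /** Hypothetical stand-in for an output segment: source[begin, end) is replaced by text. */
    static final class Replacement {
        final int begin;
        final int end;
        final String text;

        Replacement(int begin, int end, String text) {
            this.begin = begin;
            this.end = end;
            this.text = text;
        }
    }

    /** Hypothetical stand-in for an output document built from the source text. */
    static final class Output {
        private final String source;
        private final List<Replacement> replacements = new ArrayList<>();

        Output(String source) {
            this.source = source;
        }

        void add(Replacement replacement) {
            replacements.add(replacement);
        }

        /** Copies the source verbatim, except for the registered (non-overlapping) replacements. */
        @Override
        public String toString() {
            List<Replacement> sorted = new ArrayList<>(replacements);
            sorted.sort(Comparator.comparingInt(r -> r.begin));
            StringBuilder sb = new StringBuilder();
            int pos = 0;
            for (Replacement r : sorted) {
                sb.append(source, pos, r.begin); // untouched text before the segment
                sb.append(r.text);               // replacement text
                pos = r.end;
            }
            sb.append(source, pos, source.length()); // untouched remainder
            return sb.toString();
        }
    }

    /** Replaces the content of the first title element, leaving all other text as-is. */
    public static String replaceTitle(String html, String newTitle) {
        // Steps 1-2: take the source text and locate the segment of interest.
        int open = html.indexOf("<title>");
        int close = html.indexOf("</title>");
        // Step 3: create the output document from the source text.
        Output out = new Output(html);
        // Step 4: register a replacement for the located segment, if any.
        if (open >= 0 && close > open) {
            out.add(new Replacement(open + "<title>".length(), close, newTitle));
        }
        // Step 5: render the final output.
        return out.toString();
    }

    public static void main(String[] args) {
        // Invalid markup outside the replaced segment is reproduced verbatim.
        System.out.println(replaceTitle("<html><title>Old</title><p>bad <b>markup</html>", "New"));
        // prints: <html><title>New</title><p>bad <b>markup</html>
    }
}
```

Because the output is assembled from the original text plus the recorded replacements, everything outside the replaced segments, including unrecognised or invalid HTML, passes through unchanged. For the library's real classes and signatures, refer to the OutputDocument class description mentioned above.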

Analysing or Extracting Information from an HTML Document

If the document only needs to be analysed instead of modified, only the first two steps listed above are required. See the description of the {@link au.id.jericho.lib.html.FormFields} class for sample code.




