HTML Parser is the high level syntactical analyzer.
HTML Parser Libraries Overview
The HTML Parser Libraries.
These Java libraries provide programmatic access to the contents of local or
remote HTML resources.
Components
The HTML Parser distribution is composed of:
- a low-level {@link org.htmlparser.lexer.Lexer lexer} that converts characters from an HTML page into a linear sequence of nodes
- a high-level {@link org.htmlparser.Parser parser} that provides a hierarchical document model of an HTML page
- source code in the src.zip file
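The difference between the two levels can be illustrated with a small, self-contained sketch that does not depend on the library. Everything in it (the class LexerVsParserSketch and the methods lex and outline) is hypothetical and is not the HTML Parser API; it only shows, under simplifying assumptions, what "a linear sequence of nodes" and "a hierarchical document model" mean for the same input.

```java
import java.util.ArrayList;
import java.util.List;

/** Conceptual sketch only: hypothetical names, not the HTML Parser API. */
class LexerVsParserSketch {

    /** Low level: split markup into a flat, ordered list of tag and text nodes. */
    static List<String> lex(String html) {
        List<String> nodes = new ArrayList<>();
        int i = 0;
        while (i < html.length()) {
            if (html.charAt(i) == '<') {
                // Tag node: everything up to and including the next '>'.
                int end = html.indexOf('>', i);
                if (end < 0) {
                    end = html.length() - 1; // unterminated tag: take the rest
                }
                nodes.add(html.substring(i, end + 1));
                i = end + 1;
            } else {
                // Text node: everything up to the next '<'.
                int end = html.indexOf('<', i);
                if (end < 0) {
                    end = html.length();
                }
                nodes.add(html.substring(i, end));
                i = end;
            }
        }
        return nodes;
    }

    /**
     * High level: derive nesting from the flat nodes by matching start and
     * end tags, rendered here as an indented outline. Assumes every start
     * tag has an end tag (no void elements such as br, no comments).
     */
    static String outline(List<String> nodes) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (String node : nodes) {
            boolean endTag = node.startsWith("</");
            boolean startTag = node.startsWith("<") && !endTag;
            if (endTag) {
                depth = Math.max(0, depth - 1);
            }
            for (int d = 0; d < depth; d++) {
                out.append("  ");
            }
            out.append(node).append('\n');
            if (startTag) {
                depth++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> nodes = lex("<p>Hi <b>there</b></p>");
        System.out.println(nodes);          // flat sequence, in document order
        System.out.print(outline(nodes));   // same nodes, nested by tag
    }
}
```

The real lexer and parser handle far more than this sketch (attributes, comments, character encodings, malformed markup); refer to the linked Lexer and Parser classes for the actual interfaces.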
Getting Started
For novice users, an introductory guide on how to set up your environment to
use the HTML Parser is provided in HTML Parser for Dummies.
Building
To build the HTML Parser, you'll need to get the sources from the HTML Parser
project on SourceForge, if you haven't already, and then follow the build
instructions.
Outstanding Issues
Bugs are, by far, the highest-priority issues. Reports of bugs related to the
HTML Parser are available from the Bug Tracker on SourceForge. Issues related
to incorrect behaviour of the current parser should be logged and tracked using
this mechanism. Please use task lists and enhancement requests for issues that
would not be considered bugs.
Several task lists are used to track the items that are not perceived as bugs,
but are viewed by developers as things that need attention. The following list
summarizes the purpose and target issues for each list.
- Applications - Work associated with the sample applications included with
  the HTML Parser download is tracked by this list. This also includes
  proposals for other example applications.
- Release - Work to be done before a major release is tracked by this list.
  Items included here must be resolved before the major release is considered
  complete. This can include refactoring, code clean-up, out-of-the-box
  experience work, build process fixes, platform (JDK) issues, performance
  or scalability enhancements, memory usage issues and other 'quality' issues
  that are not associated with a specific bug.
- API - Work needed to enhance or fix the parser API is tracked by this list.
  Standards compliance, additional classes, method signatures, changes to
  parameter types, refactoring, deprecation, new or enhanced constructors, and
  other programmatic interface issues fall into this category. This list
  should be limited to those changes that could impact the developer community
  that relies on existing behaviour from the parser.
- Documentation - Work associated with documenting the parser and its
  example code and sample applications is tracked by this list. Javadocs, the
  web site, SourceForge site maintenance, mailing lists, forums,
  project documentation and other developer-visible reference material all
  fall under this category.
The Request For Enhancement list contains items that are proposed for future
versions of the parser. Users may add to this list anything they feel is an
extension beyond simple bug fixing. Some user-entered bugs are also transferred
to this list if the scope of the fix would be too significant a change for the
current version, or would involve API changes that need to be vetted against
the current user community.
Mailing Lists
If you want to be notified when new releases of HTML Parser are available, join the
HTML Parser Announcement List.
If you have questions about the usage of the parser, join the
HTML Parser User List.
If you want to join as a developer, please sign up on the
HTML Parser Developer List.