org.attoparser.select.package-info

/**
 * Handlers for filtering a part or several parts of markup during parsing
 * in a fast and efficient way.
 *
 * Handler Implementations
 *
 * There are two main handlers (implementations of {@link org.attoparser.IMarkupHandler}) for
 * markup selection in this package:
 *
 *   {@link org.attoparser.select.BlockSelectorMarkupHandler}
 * For selecting entire blocks of markup (i.e. elements and all the nodes in their subtrees).
 * This can be used, for example, to extract fragments of markup while the document is being
 * parsed, so that discarded markup never reaches higher layers of the document processing
 * infrastructure.
 *
 *   {@link org.attoparser.select.NodeSelectorMarkupHandler}
 * For selecting only specific nodes in markup (i.e. not including their subtrees). This can be
 * used for modifying specific tags in markup during parsing, for example by adding attributes
 * to them that are not present in the original parsed markup.
 *
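The difference between block and node selection can be pictured with a toy model. The sketch below does not use the AttoParser API: it assumes a document flattened into a list of event strings, and the class `SelectionModes` and its methods are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: not the AttoParser implementation.
final class SelectionModes {

    // Block selection: keep every matching element together with its whole subtree.
    static List<String> selectBlocks(List<String> events, String name) {
        String open = "<" + name + ">";
        String close = "</" + name + ">";
        List<String> selected = new ArrayList<>();
        int depth = 0; // number of currently open elements called `name`
        for (String event : events) {
            if (event.equals(open)) {
                depth++;
            }
            if (depth > 0) {
                selected.add(event);
            }
            if (event.equals(close) && depth > 0) {
                depth--;
            }
        }
        return selected;
    }

    // Node selection: keep only the matching tags themselves, not their contents.
    static List<String> selectNodes(List<String> events, String name) {
        String open = "<" + name + ">";
        String close = "</" + name + ">";
        List<String> selected = new ArrayList<>();
        for (String event : events) {
            if (event.equals(open) || event.equals(close)) {
                selected.add(event);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "<html>", "<div>", "<p>", "hi", "</p>", "</div>", "<span>", "</span>", "</html>");
        System.out.println(selectBlocks(events, "div")); // [<div>, <p>, hi, </p>, </div>]
        System.out.println(selectNodes(events, "div"));  // [<div>, </div>]
    }
}
```

In this model, everything outside the selected output corresponds to the markup that a block selector would keep away from higher processing layers.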

 * Markup Selector Syntax
 *
 * Markup selectors used by handlers in this package use a specific syntax with features borrowed
 * from XPath, CSS and jQuery selectors, in order to provide ease of use for most users. There are
 * often several ways to express the same selector, depending on the user's preferences.
 *
 * For example, all the following equivalent selectors will select every {@code <div>} with class
 * {@code content}, in any position in markup:
 *
 *   //div[class='content']
 *   //div[@class='content']
 *   div[class='content']
 *   div[@class='content']
 *   //div.content
 *   div.content
 * 
 * These are the different operations this syntax allows:
 *
 * Basic selectors
 *
 *   x
 *   //x
 * Both are equivalent, and mean descendants of the current node with name {@code x}, at any
 * depth in markup. If a reference resolver is being used, they will also be equivalent to
 * {@code %x} (see below).
 *
 *   /x
 * Means direct children of the current node with name {@code x}.
 *
 *   x/y
 * Means direct children with name {@code y} of elements with name {@code x}, where the parent
 * {@code x} elements can be at any level in markup.
 *
 *   x//y
 * Means descendants (at any level) with name {@code y} of elements with name {@code x}, where
 * the parent {@code x} elements can also be at any level in markup.
 *
 *   text()
 *   comment()
 *   cdata()
 *   doctype()
 *   xmldecl()
 *   procinstr()
 * These can be used like {@code x} (in the same places) but, instead of selecting elements
 * (i.e. tags), they select, respectively: text nodes, comments, CDATA sections, DOCTYPE clauses,
 * XML declarations and processing instructions.
 *
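The path operators above can be read as simple predicates over the chain of element names leading from the current node down to a candidate element. The following is a minimal sketch under that assumption. It is not how AttoParser evaluates selectors, and `PathSelectors` and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only: `path` lists element names from the first level
// below the current node down to the candidate element itself.
final class PathSelectors {

    // x or //x : the candidate is called x, at any depth.
    static boolean anyDepth(List<String> path, String x) {
        return !path.isEmpty() && path.get(path.size() - 1).equals(x);
    }

    // /x : the candidate is called x and is a direct child of the current node.
    static boolean directChild(List<String> path, String x) {
        return path.size() == 1 && path.get(0).equals(x);
    }

    // x/y : the candidate is called y and its parent is called x.
    static boolean childOf(List<String> path, String x, String y) {
        int n = path.size();
        return n >= 2 && path.get(n - 1).equals(y) && path.get(n - 2).equals(x);
    }

    // x//y : the candidate is called y and some ancestor is called x.
    static boolean descendantOf(List<String> path, String x, String y) {
        int n = path.size();
        return n >= 2 && path.get(n - 1).equals(y) && path.subList(0, n - 1).contains(x);
    }

    public static void main(String[] args) {
        List<String> path = Arrays.asList("html", "body", "div", "p");
        System.out.println(anyDepth(path, "p"));             // true
        System.out.println(directChild(path, "p"));          // false
        System.out.println(childOf(path, "div", "p"));       // true
        System.out.println(childOf(path, "body", "p"));      // false
        System.out.println(descendantOf(path, "body", "p")); // true
    }
}
```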

 * Attribute matching
 *
 *   x[z='v']
 *   x[z="v"]
 *   x[@z='v']
 *   x[@z="v"]
 * All four are equivalent, and mean elements with name {@code x} and an attribute called
 * {@code z} with value {@code v}. Note that attribute values can be surrounded by single or
 * double quotes, and attribute names can be specified with a leading {@code @} (as in XPath)
 * or without it (more similar to jQuery). For the sake of simplicity, only the single-quoted,
 * no-{@code @} syntax will be used for the rest of the examples below.
 *
 *   [z='v']
 *   //[z='v']
 * Means any elements with an attribute called {@code z} with value {@code v}.
 *
 *   x[z]
 * Means elements with name {@code x} and an attribute called {@code z}, with any value.
 *
 *   x[!z]
 * Means elements with name {@code x} and no attribute called {@code z}.
 *
 *   x[z1='v1' and z2='v2']
 * Means elements with name {@code x} and attributes {@code z1} and {@code z2} with values
 * {@code v1} and {@code v2}, respectively.
 *
 *   x[z1='v1' or z2='v2']
 * Means elements with name {@code x} and either an attribute {@code z1} with value
 * {@code v1} or an attribute {@code z2} with value {@code v2}.
 *
 *   x[z1='v1' and (z2='v2' or z3='v3')]
 * Selects according to the specified complex attribute expression. As can be seen, these
 * expressions can be parenthesized to force a specific evaluation order.
 *
 *   x[z!='v']
 *   x[z^='v']
 *   x[z$='v']
 *   x[z*='v']
 * Similar to {@code x[z='v']} but applying different operators to attribute matching instead
 * of equality ({@code =}). Respectively: not equal ({@code !=}), starts with ({@code ^=}),
 * ends with ({@code $=}) and contains ({@code *=}).
 *
 *   x.z
 *   x[class='z']
 * When parsing in HTML mode (and only then), these two selectors are completely equivalent.
 * In addition, in this case the selector looks for an {@code x} element that has the
 * {@code z} class, taking into account that HTML's {@code class} attribute allows several
 * classes separated by white space. So something like {@code <x class="z y w">} will be
 * matched by this selector.
 *
 *   x#z
 *   x[id='z']
 * When parsing in HTML mode (and only then), these two selectors are completely equivalent,
 * matching those {@code x} elements that have an ID with value {@code z}.
 *
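As a rough illustration of the attribute operators and of the HTML-mode class matching described above, the sketch below evaluates each operator against an attribute value that is assumed to be present (how the library treats a missing attribute is not covered here). `AttributeOperators`, `matches` and `hasClass` are hypothetical names and not part of AttoParser.

```java
import java.util.Arrays;

// Illustrative sketch only: `actual` is the (non-null) value found in markup,
// `expected` is the value written in the selector.
final class AttributeOperators {

    static boolean matches(String actual, String operator, String expected) {
        switch (operator) {
            case "=":  return actual.equals(expected);
            case "!=": return !actual.equals(expected);
            case "^=": return actual.startsWith(expected);
            case "$=": return actual.endsWith(expected);
            case "*=": return actual.contains(expected);
            default:   throw new IllegalArgumentException("Unknown operator: " + operator);
        }
    }

    // HTML-mode class matching: the class attribute is a white-space-separated list.
    static boolean hasClass(String classAttribute, String className) {
        return Arrays.asList(classAttribute.trim().split("\\s+")).contains(className);
    }

    public static void main(String[] args) {
        System.out.println(matches("content-main", "^=", "content")); // true
        System.out.println(matches("main-content", "$=", "content")); // true
        System.out.println(matches("z y w", "=", "z"));               // false
        System.out.println(hasClass("z y w", "z"));                   // true
    }
}
```

The last two calls show why token-based class matching differs from plain equality on the raw attribute value.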

 * Index-based matching
 *
 *   x[i]
 * Means elements with name {@code x} positioned at index {@code i} among their siblings.
 * Sibling here means a child node of the same parent element that matches the same
 * conditions (in this case, "having {@code x} as name"). Note that indexes start at
 * {@code 0}.
 *
 *   x[z='v'][i]
 * Means elements with name {@code x}, attribute {@code z} with value {@code v} and positioned
 * at index {@code i} among their siblings (same name, same attribute with that value).
 *
 *   x[even()]
 *   x[odd()]
 * Means elements with name {@code x} positioned at an even (or odd) index among their
 * siblings. Note that even includes index {@code 0}.
 *
 *   x[>i]
 *   x[<i]
 * Mean elements with name {@code x} positioned at an index greater than (or less than)
 * {@code i} among their siblings.
 *
 *   text()[i]
 *   comment()[>i]
 * Apply the specified index-based matching operations to node types other than elements:
 * text nodes, comments, CDATA sections, etc.
 *
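The key point of index-based matching is that the index is counted only among siblings that satisfy the same conditions, starting at 0. A minimal sketch of that counting rule follows, assuming the children of one parent are given as a list of element names. `IndexSelectors` and `select` are hypothetical names, not AttoParser API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

// Illustrative sketch only.
final class IndexSelectors {

    // Returns the positions (within `siblings`) of the elements called `name`
    // whose 0-based index among same-named siblings satisfies `indexTest`.
    static List<Integer> select(List<String> siblings, String name, IntPredicate indexTest) {
        List<Integer> positions = new ArrayList<>();
        int index = 0; // index among the siblings that match `name`
        for (int pos = 0; pos < siblings.size(); pos++) {
            if (siblings.get(pos).equals(name)) {
                if (indexTest.test(index)) {
                    positions.add(pos);
                }
                index++;
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        List<String> siblings = Arrays.asList("p", "div", "p", "p", "span", "p");
        System.out.println(select(siblings, "p", i -> i == 1));     // p[1]      -> [2]
        System.out.println(select(siblings, "p", i -> i % 2 == 0)); // p[even()] -> [0, 3]
        System.out.println(select(siblings, "p", i -> i % 2 == 1)); // p[odd()]  -> [2, 5]
        System.out.println(select(siblings, "p", i -> i > 1));      // p[>1]     -> [3, 5]
        System.out.println(select(siblings, "p", i -> i < 1));      // p[<1]     -> [0]
    }
}
```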

 * Reference-based matching
 *
 *   x%ref
 * Means elements with name {@code x} that match the markup selector reference with value
 * {@code ref}. These markup selector references usually have a user-defined meaning and are
 * resolved to a markup selector without references by means of an instance of the
 * {@link org.attoparser.select.IMarkupSelectorReferenceResolver} interface passed to the
 * selecting markup handlers ({@link org.attoparser.select.BlockSelectorMarkupHandler} and
 * {@link org.attoparser.select.NodeSelectorMarkupHandler}) during construction.
 * For example, a reference resolver could be configured that converts (resolves)
 * {@code %someref} into {@code div[class='someref' or id='someref']}. Also, the Thymeleaf
 * template engine uses this mechanism for resolving {@code %fragmentName} (or simply
 * {@code fragmentName}, as explained below) into
 * {@code //[th:fragment='fragmentName' or data-th-fragment='fragmentName']}.
 *
 *   %ref
 * Means any elements (whatever their name) matching the reference with value {@code ref}.
 *
 *   ref
 * Equivalent to {@code %ref}. When a markup selector reference resolver has been configured,
 * {@code ref} can mean both "element with name {@code ref}" and "element matching reference
 * {@code ref}" (both will match).
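The two resolutions mentioned above can be sketched as plain string functions. This is only an illustration of the idea: it uses `java.util.function.UnaryOperator` as a stand-in for the real `IMarkupSelectorReferenceResolver` interface, handles only the bare `%ref` form, and the names `ReferenceResolution`, `CLASS_OR_ID`, `FRAGMENT` and `expand` are hypothetical.

```java
import java.util.function.UnaryOperator;

// Illustrative sketch only: a resolver is modelled as "reference name in,
// reference-free selector out".
final class ReferenceResolution {

    // Resolves a reference into a class-or-id selector on div elements.
    static final UnaryOperator<String> CLASS_OR_ID =
            ref -> "div[class='" + ref + "' or id='" + ref + "']";

    // Resolves a reference the way the text describes for Thymeleaf fragments.
    static final UnaryOperator<String> FRAGMENT =
            ref -> "//[th:fragment='" + ref + "' or data-th-fragment='" + ref + "']";

    // Expands a "%ref" selector with the given resolver; other selectors are returned unchanged.
    static String expand(String selector, UnaryOperator<String> resolver) {
        if (selector.startsWith("%")) {
            return resolver.apply(selector.substring(1));
        }
        return selector;
    }

    public static void main(String[] args) {
        System.out.println(expand("%someref", CLASS_OR_ID));
        // div[class='someref' or id='someref']
        System.out.println(expand("%fragmentName", FRAGMENT));
        // //[th:fragment='fragmentName' or data-th-fragment='fragmentName']
    }
}
```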
 */
package org.attoparser.select;



