Jericho HTML Parser is a simple but powerful java library allowing analysis and manipulation of parts of an HTML document, including some common server-side tags, while reproducing verbatim any unrecognised or invalid HTML. It also provides high-level HTML form manipulation functions.

There is a newer version: 2.3

Attributes (Jericho HTML Parser 1.5-dev1)

au.id.jericho.lib.html
Class Attributes

java.lang.Object
  extended by au.id.jericho.lib.html.Segment
      extended by au.id.jericho.lib.html.internal.SequentialListSegment
          extended by au.id.jericho.lib.html.Attributes
All Implemented Interfaces:
java.lang.CharSequence, java.util.Collection, java.lang.Comparable, java.util.List

public final class Attributes
extends au.id.jericho.lib.html.internal.SequentialListSegment

Represents the list of Attribute objects present within a particular StartTag.

The attributes in this list are a representation of those found in the source document and are not modifiable. The AttributesOutputSegment class provides a means to add, delete or modify attributes and their values for inclusion in an OutputDocument.

This segment starts at the end of the StartTag's name and ends at the end of the last attribute.

Note that before version 1.5 the segment ended just before the closing '/', '?' or '>' character of the StartTag instead of at the end of the last attribute.

Created using the StartTag.getAttributes() method, or explicitly using the Source.parseAttributes(int pos, int maxEnd) method.

It is possible (and common) for instances of this class to contain no attributes.

See also the XML 1.0 specification for attributes.

See Also:
StartTag, Attribute

Method Summary
static java.lang.String generateHTML(java.util.Map attributesMap)
          Returns the contents of the specified attributes map as HTML attribute name/value pairs.
 Attribute get(java.lang.String name)
          Returns the Attribute with the specified name (case insensitive).
 int getCount()
          Returns the number of attributes.
 java.lang.String getDebugInfo()
          Returns a string representation of this object useful for debugging purposes.
static int getDefaultMaxErrorCount()
          Returns the default maximum error count allowed when parsing attributes.
 java.util.List getList()
          Deprecated. Use this instance instead.
 java.lang.String getValue(java.lang.String name)
          Returns the decoded value of the attribute with the specified name (case insensitive).
 java.util.Iterator iterator()
          Returns an iterator over the Attribute objects in this list in proper sequence.
 java.util.ListIterator listIterator(int index)
          Returns a list iterator of the Attribute objects in this list (in proper sequence), starting at the specified position in the list.
 java.util.Map populateMap(java.util.Map attributesMap, boolean convertNamesToLowerCase)
          Populates the specified Map with the name/value pairs from these attributes.
static void setDefaultMaxErrorCount(int value)
          Sets the default maximum error count allowed when parsing attributes.
 
Methods inherited from class au.id.jericho.lib.html.internal.SequentialListSegment
add, add, addAll, addAll, clear, contains, containsAll, get, indexOf, isEmpty, lastIndexOf, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray
 
Methods inherited from class au.id.jericho.lib.html.Segment
charAt, compareTo, encloses, encloses, equals, findAllCharacterReferences, findAllComments, findAllElements, findAllElements, findAllStartTags, findAllStartTags, findAllStartTags, findFormControls, findFormFields, findWords, getBegin, getEnd, getSourceText, getSourceTextNoWhitespace, hashCode, ignoreWhenParsing, isComment, isWhiteSpace, length, parseAttributes, subSequence, toString
 
Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.List
equals, hashCode
 

Method Detail

get

public Attribute get(java.lang.String name)
Returns the Attribute with the specified name (case insensitive).

If more than one attribute exists with the specified name (which is technically illegal HTML), the first is returned.

Parameters:
name - the name of the attribute to get.
Returns:
the attribute with the specified name, or null if no attribute with the specified name exists.
See Also:
getValue(String name)

getValue

public java.lang.String getValue(java.lang.String name)
Returns the decoded value of the attribute with the specified name (case insensitive).

Returns null if no attribute with the specified name exists or no value has been assigned to the attribute.

This is equivalent to get(name).getValue(), although it will return null if no attribute with the specified name exists instead of throwing a NullPointerException.

Note that before version 1.5 this method returned the raw value of the attribute, without decoding.

Parameters:
name - the name of the attribute to get.
Returns:
the decoded value of the attribute with the specified name, or null if the attribute does not exist or has no value.
See Also:
get(String name)
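
The relationship between get(name) and getValue(name) can be illustrated with a small stand-alone sketch. It does not use the Jericho classes: Attr and AttributeLookupSketch are hypothetical stand-ins that model only the documented behaviour (case-insensitive match, first match wins, null for a missing attribute or a missing value). Decoding of values is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an attribute: a name and an optional (nullable) value.
final class Attr {
    final String name;
    final String value; // null when the attribute has no value, e.g. <input disabled>

    Attr(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// Illustration only; not the library's implementation.
final class AttributeLookupSketch {
    // Models the documented get(name): first case-insensitive match, or null.
    static Attr get(List<Attr> attributes, String name) {
        for (Attr a : attributes) {
            if (a.name.equalsIgnoreCase(name)) return a;
        }
        return null;
    }

    // Models the documented getValue(name): null-safe, so a missing
    // attribute yields null instead of a NullPointerException.
    static String getValue(List<Attr> attributes, String name) {
        Attr a = get(attributes, name);
        return a == null ? null : a.value;
    }

    public static void main(String[] args) {
        List<Attr> attrs = Arrays.asList(
            new Attr("HREF", "a.html"),
            new Attr("href", "b.html"),   // duplicate name: never reached by lookup
            new Attr("disabled", null));
        System.out.println(getValue(attrs, "href"));     // prints "a.html"
        System.out.println(getValue(attrs, "disabled")); // prints "null" (no value)
        System.out.println(getValue(attrs, "title"));    // prints "null" (no such attribute)
    }
}
```

A null result therefore does not distinguish "attribute absent" from "attribute present without a value"; a caller that needs the distinction would check the result of get(name) first.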

getCount

public int getCount()
Returns the number of attributes.

This is equivalent to calling the size() method specified in the List interface.

Returns:
the number of attributes.

iterator

public java.util.Iterator iterator()
Returns an iterator over the Attribute objects in this list in proper sequence.

Returns:
an iterator over the Attribute objects in this list in proper sequence.

listIterator

public java.util.ListIterator listIterator(int index)
Returns a list iterator of the Attribute objects in this list (in proper sequence), starting at the specified position in the list.

The specified index indicates the first item that would be returned by an initial call to the next() method. An initial call to the previous() method would return the item with the specified index minus one.

Parameters:
index - the index of the first item to be returned from the list iterator (by a call to the next() method).
Returns:
a list iterator of the items in this list (in proper sequence), starting at the specified position in the list.
Throws:
java.lang.IndexOutOfBoundsException - if the specified index is out of range (index < 0 || index > size()).
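
The index semantics are those of java.util.List.listIterator(int). The sketch below uses a plain ArrayList of made-up attribute names in place of an Attributes instance, so it runs without the library; the class and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class ListIteratorIndexSketch {
    // Returns the item that an initial next() call yields for the given start index.
    static String firstNext(List<String> items, int index) {
        ListIterator<String> it = items.listIterator(index);
        return it.next();
    }

    // Returns the item that an initial previous() call yields for the given start index.
    static String firstPrevious(List<String> items, int index) {
        ListIterator<String> it = items.listIterator(index);
        return it.previous();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("id", "class", "href"));
        System.out.println(firstNext(names, 1));     // prints "class" (the item at index 1)
        System.out.println(firstPrevious(names, 1)); // prints "id" (the item at index 0)
        // index == size() is in range: the iterator starts past the last item.
        System.out.println(names.listIterator(names.size()).hasNext()); // prints "false"
    }
}
```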

populateMap

public java.util.Map populateMap(java.util.Map attributesMap,
                                 boolean convertNamesToLowerCase)
Populates the specified Map with the name/value pairs from these attributes.

Both names and values are stored as String objects.

The entries are added in order of appearance in the source document.

An attribute with no value is represented by a map entry with a null value.

Attribute values are automatically decoded before storage in the map.

Parameters:
attributesMap - the map to populate, must not be null.
convertNamesToLowerCase - specifies whether all attribute names are converted to lower case in the map.
Returns:
the same map specified as the attributesMap argument, populated with the name/value pairs from these attributes.
See Also:
generateHTML(Map attributesMap)
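
The documented contract can be sketched without the library as follows. PopulateMapSketch and its String[][] input are hypothetical; the values are assumed to be already decoded, and the handling of duplicate names (a later entry overwrites an earlier one here) is an assumption, since the text above does not specify it.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustration only; not the library's implementation.
final class PopulateMapSketch {
    // Each pair is {name, value}; value is null for an attribute with no value.
    static Map<String, String> populateMap(Map<String, String> attributesMap,
                                           String[][] pairs,
                                           boolean convertNamesToLowerCase) {
        for (String[] pair : pairs) {
            String name = convertNamesToLowerCase
                ? pair[0].toLowerCase(Locale.ROOT)
                : pair[0];
            attributesMap.put(name, pair[1]);
        }
        return attributesMap; // the same map instance that was passed in
    }

    public static void main(String[] args) {
        String[][] pairs = { {"ID", "main"}, {"Class", "a b"}, {"hidden", null} };
        Map<String, String> map = populateMap(new LinkedHashMap<>(), pairs, true);
        System.out.println(map); // prints {id=main, class=a b, hidden=null}
    }
}
```

The order of appearance is only observable if the supplied map preserves insertion order, as LinkedHashMap does; a HashMap accepts the same entries but iterates them in an unspecified order.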

getDebugInfo

public java.lang.String getDebugInfo()
Description copied from class: Segment
Returns a string representation of this object useful for debugging purposes.

Overrides:
getDebugInfo in class Segment
Returns:
a string representation of this object useful for debugging purposes.

getDefaultMaxErrorCount

public static int getDefaultMaxErrorCount()
Returns the default maximum error count allowed when parsing attributes.

The system default value is 1.

Returns:
the default maximum error count allowed when parsing attributes.
See Also:
setDefaultMaxErrorCount(int value), Source.parseAttributes(int pos, int maxEnd, int maxErrorCount)

setDefaultMaxErrorCount

public static void setDefaultMaxErrorCount(int value)
Sets the default maximum error count allowed when parsing attributes.

When searching for start tags, the parser can find the end of the start tag only by parsing the attributes, as it is valid HTML for attribute values to contain '>' characters (see section 5.3.2 of the HTML spec).

If the source text being parsed does not follow the syntax of an attribute list at all, the parser assumes that the text which was originally identified as the beginning of a start tag is in fact some other text, such as an invalid '<' character in the middle of some text, or part of a script element. In this case the entire start tag is rejected.

On the other hand, it is quite common for attributes to contain minor syntactical errors, such as an invalid character in an attribute name, or a couple of special characters in server tags that otherwise contain only attributes. For this reason the parser allows a certain number of minor errors to occur while parsing an attribute list before the entire start tag or attribute list is rejected. This method sets the number of minor errors allowed.

Major syntactical errors will cause the start tag or attribute list to be rejected immediately, regardless of the maximum error count setting.

Some errors are considered too minor to count at all (ignorable), such as missing whitespace between the end of a quoted attribute value and the start of the next attribute name.

The classification of particular syntax errors in attribute lists into major, minor, and ignorable is not part of the specification and may change in future versions.

To track errors as they occur, use the Source.setLogWriter(Writer writer) method to set the destination of the error log.

Parameters:
value - the default maximum error count allowed when parsing attributes.
See Also:
getDefaultMaxErrorCount(), Source.parseAttributes(int pos, int maxEnd, int maxErrorCount), Source.setLogWriter(Writer writer)
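
The interaction between the three error categories and the maximum error count can be pictured with the toy model below. It is not the parser's code: the Severity enum, the accept method, and the exact threshold rule (rejection once the number of minor errors exceeds the maximum) are assumptions made for illustration, and the real classification of errors may differ between versions as noted above.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the tolerance rule; not the library's implementation.
final class ErrorToleranceSketch {
    // Hypothetical severity levels matching the three categories in the text.
    enum Severity { IGNORABLE, MINOR, MAJOR }

    // Returns true if an attribute list that produced these errors would be accepted.
    // Assumption: rejection happens once the minor-error count exceeds maxErrorCount.
    static boolean accept(List<Severity> errors, int maxErrorCount) {
        int minorCount = 0;
        for (Severity s : errors) {
            switch (s) {
                case MAJOR:
                    return false; // rejected immediately, whatever the limit
                case MINOR:
                    minorCount++;
                    if (minorCount > maxErrorCount) return false;
                    break;
                default:
                    break; // ignorable errors are never counted
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Severity> oneMinor = Arrays.asList(Severity.IGNORABLE, Severity.MINOR);
        List<Severity> twoMinor = Arrays.asList(Severity.MINOR, Severity.MINOR);
        System.out.println(accept(oneMinor, 1)); // prints "true"
        System.out.println(accept(twoMinor, 1)); // prints "false"
        System.out.println(accept(twoMinor, 2)); // prints "true"
    }
}
```

Under this model, raising the default makes the parser more forgiving of sloppy markup at the cost of occasionally treating stray '<' characters as start tags.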

generateHTML

public static java.lang.String generateHTML(java.util.Map attributesMap)
Returns the contents of the specified attributes map as HTML attribute name/value pairs.

Each attribute (including the first) is preceded by a single space, and all values are encoded and enclosed in double quotes.

The map keys must be of type String and values must be objects that implement the CharSequence interface.

A null value represents an attribute with no value.

Parameters:
attributesMap - a map containing attribute name/value pairs.
Returns:
the contents of the specified attributes map as HTML attribute name/value pairs.
See Also:
StartTag.generateHTML(String tagName, Map attributesMap, boolean emptyElementTag)
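
The output format described above can be sketched as follows. GenerateHtmlSketch is a hypothetical stand-alone class, not the library's implementation. Two points are assumptions: the encode helper escapes only &, <, > and the double quote (the library's own encoding may cover more characters), and a null value is rendered as the bare attribute name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only; not the library's implementation.
final class GenerateHtmlSketch {
    // Minimal encoding, assumed for illustration: only &, <, > and " are escaped.
    static String encode(CharSequence value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Every attribute, including the first, is preceded by a single space.
    static String generateHTML(Map<String, ? extends CharSequence> attributesMap) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, ? extends CharSequence> e : attributesMap.entrySet()) {
            sb.append(' ').append(e.getKey());
            CharSequence value = e.getValue();
            if (value != null) { // null means the attribute has no value
                sb.append("=\"").append(encode(value)).append('"');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("href", "a.html?x=1&y=2");
        attrs.put("disabled", null);
        System.out.println("<a" + generateHTML(attrs) + ">");
    }
}
```

Because the result already starts with a space, it can be appended directly after a tag name, as the main method does.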

getList

public java.util.List getList()
Deprecated. Use this instance instead.

Returns this instance.

This method has been deprecated as of version 1.5 as the Attributes class now implements the List interface, so the instance itself can be used instead.

Returns:
this instance.





