All Downloads are FREE. Search and download functionalities are using the official Maven repository.

info.codesaway.util.regex.Matcher Maven / Gradle / Ivy

Go to download

Extends Java's regular expression syntax by adding support for additional Perl and .NET syntax.

The newest version!
package info.codesaway.util.regex;

// TODO: replace doesn't correctly handle references with named subgroups
// (if case of duplicates, use the first matching one - not necessarily the
// first occurrence)

// specifying the empty string for the group will find the first non-null group
// (same functionality as getUsedGroup in the (extended) Pattern class)

// TODO: overload matched method to allow checking for matched group without throwing error if it doesn't exist
import static info.codesaway.util.regex.Pattern.getMappingName;
import static info.codesaway.util.regex.RefactorUtility.fullGroupName;

import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.NoSuchElementException;

import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;

/**
 * An engine that performs match operations on a {@link java.lang.CharSequence
 * character sequence} by interpreting a {@link Pattern}.
 *
 * 

This class is an extension * of Java's {@link java.util.regex.Matcher} class. Javadocs were copied and * appended with the added functionality.

* *

A matcher is created from a pattern by invoking the pattern's {@link Pattern#matcher matcher} method. Once * created, a matcher can be used * to perform three different kinds of match operations:

* *
    * *
  • The {@link #matches matches} method attempts to match the entire * input sequence against the pattern.

  • * *
  • The {@link #lookingAt lookingAt} method attempts to match the * input sequence, starting at the beginning, against the pattern.

  • * *
  • The {@link #find find} method scans the input sequence looking for * the next subsequence that matches the pattern.

  • * *
* *

Each of these methods returns a boolean indicating success or failure. * More information about a successful match can be obtained by querying the * state of the matcher.

* *

A matcher finds matches in a subset of its input called the * region. By default, the region contains all of the matcher's input. * The region can be modified via the{@link #region region} method and queried * via the {@link #regionStart regionStart} and {@link #regionEnd regionEnd} methods. The way that the region boundaries * interact with some pattern * constructs can be changed. See {@link #useAnchoringBounds * useAnchoringBounds} and {@link #useTransparentBounds useTransparentBounds} for more details.

* *

This class also defines methods for replacing matched subsequences with * new strings whose contents can, if desired, be computed from the match * result. The {@link #appendReplacement appendReplacement} and {@link #appendTail appendTail} methods can be used in * tandem in order to * collect the result into an existing string buffer, or the more convenient {@link #replaceAll replaceAll} method can * be used to create a string in which * every matching subsequence in the input sequence is replaced.

* *

The explicit state of a matcher includes the start and end indices of * the most recent successful match. It also includes the start and end * indices of the input subsequence captured by each capturing group in the pattern as well as a total * count of such subsequences. As a convenience, methods are also provided for * returning these captured subsequences in string form.

* *

The explicit state of a matcher is initially undefined; attempting to * query any part of it before a successful match will cause an {@link IllegalStateException} to be thrown. The explicit * state of a matcher * is recomputed by every match operation.

* *

The implicit state of a matcher includes the input character sequence as * well as the append position, which is initially zero and is updated * by the {@link #appendReplacement appendReplacement} method.

* *

A matcher may be reset explicitly by invoking its {@link #reset()} method * or, if a new input sequence is desired, its {@link #reset(java.lang.CharSequence) reset(CharSequence)} method. * Resetting * a matcher discards its explicit state information and sets the append * position to zero.

* *

Instances of this class are not safe for use by multiple concurrent * threads.

*/ public final class Matcher implements MatchResult, Iterable { /** * The internal {@link java.util.regex.Matcher} object for this matcher. */ private final java.util.regex.Matcher internalMatcher; /** * The matcher current being used to perform matches. * *

Sometimes the current matcher is a different matcher than the internal * one, so this acts as a pointer to the matcher used to perform * matches.

* *

For example, the {@link #proceed()} method makes use of a separate * matcher to perform its matches than the internal one. In this case, * this object points to that temporary matcher, instead of the internal * matcher.

*/ private java.util.regex.Matcher usedMatcher; /** * The Pattern object that created this Matcher. */ private Pattern parentPattern; /** * Whether group names (for named capture groups) are case-insensitive. */ // private boolean caseInsensitiveGroupNames; /** * The index of the last position appended in a substitution. */ private int lastAppendPosition = 0; /** * The original string being matched. */ private CharSequence text; /** * @since 0.2 */ private boolean treatNullAsEmptyString = false; /** * @param matcher the Java Matcher * @since 0.2 */ public Matcher(final java.util.regex.Matcher matcher) { this(matcher, new Pattern(matcher.pattern()), getText(matcher)); } /** * Gets the text field from a Java matcher. * * @param matcher * @return * @since 0.2 */ private static CharSequence getText(final java.util.regex.Matcher matcher) { try { // Uses reflection Field field = matcher.getClass().getDeclaredField("text"); // Necessary, since private field (doen't affect actual field) field.setAccessible(true); // Return CharSequence text field return (CharSequence) field.get(matcher); } catch (RuntimeException | NoSuchFieldException | IllegalAccessException e) { throw new AssertionError(e); } } // public Matcher(CharSequence text, java.util.regex.Pattern pattern) // { // this(pattern, text); // } // // public Matcher(java.util.regex.Pattern pattern, CharSequence text) // { // this(pattern.matcher(text), new Pattern(pattern), text); // } /** * All matchers have the state used by Pattern during a match. * * @param matcher * the internal {@link java.util.regex.Matcher} * @param parent * the parent pattern * @param text * the input text */ Matcher(final java.util.regex.Matcher matcher, final Pattern parent, final CharSequence text) { this.internalMatcher = matcher; this.parentPattern = parent; this.text = text; // this.caseInsensitiveGroupNames = parent.has(CASE_INSENSITIVE_NAMES); // Put fields into initial states this.resetPrivate(); } /** * Returns whether this matcher uses case-insensitive group names. * * @return if this matches uses case-insensitive group * names */ // private void setCaseInsensitiveGroupNames(boolean // caseInsensitiveGroupNames) // { // this.caseInsensitiveGroupNames = caseInsensitiveGroupNames; // } /** * @return the internal {@link java.util.regex.Matcher} used by this * matcher. */ // public java.util.regex.Matcher getInternalMatcher() // { // return internalMatcher; // } /** * {@inheritDoc} */ @Override public Pattern pattern() { return this.parentPattern; } /** * Returns the match state of this matcher as a {@link MatchResult}. * The result is unaffected by subsequent operations performed upon this * matcher. * * @return a MatchResult with the state of this matcher */ public MatchResult toMatchResult() { return this.toMatchResult(this.text.toString()); } private MatchResult toMatchResult(final String text) { return new ImmutableMatchResult( this.pattern(), this.usedMatcher.toMatchResult(), text, this.treatNullAsEmptyString()); } /** * Changes the Pattern that this Matcher uses to find * matches with. * *

This method causes this matcher to lose information about the groups * of the last match that occurred. The matcher's position in the input is * maintained and its last append position is unaffected.

* * @param newPattern * The new pattern used by this matcher * @return This matcher * @throws IllegalArgumentException * If newPattern is null */ public Matcher usePattern(final Pattern newPattern) { if (newPattern == null) { throw new IllegalArgumentException("Pattern cannot be null"); } java.util.regex.Pattern pattern = newPattern.getInternalPattern(); this.internalMatcher.usePattern(pattern); if (this.usedMatcher != this.internalMatcher) { this.usedMatcher.usePattern(pattern); } this.parentPattern = newPattern; // this.caseInsensitiveGroupNames = // newPattern.has(CASE_INSENSITIVE_NAMES); return this; } /** * Resets this matcher. * *

Resetting a matcher discards all of its explicit state information and * sets its append position to zero. The matcher's region is set to the * default region, which is its entire character sequence. The anchoring and * transparency of this matcher's region boundaries are unaffected.

* * @return This matcher */ public Matcher reset() { this.internalMatcher.reset(); return this.resetPrivate(); } /** * Resets this matcher with a new input sequence. * *

Resetting a matcher discards all of its explicit state information and * sets its append position to zero. The matcher's region is set to the * default region, which is its entire character sequence. The anchoring and * transparency of this matcher's region boundaries are unaffected.

* * @param input * The new input character sequence * * @return This matcher */ public Matcher reset(final CharSequence input) { this.internalMatcher.reset(input); this.text = input; return this.resetPrivate(); } /** * Resets the fields in this class. * * @return this matcher */ private Matcher resetPrivate() { this.lastAppendPosition = 0; this.usedMatcher = this.internalMatcher; return this; } /** * Returns the start index of the previous match. * * @return The index of the first character matched * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed */ @Override public int start() { return this.usedMatcher.start(); } /** * Returns the start index of the subsequence captured by the given group * during the previous match operation. * *

Capturing groups are indexed * from left to right, starting at one. Group zero denotes the entire * pattern, so the expression m.start(0) is equivalent to * m.start().

* * @param group * The index of a capturing group in this matcher's pattern * * @return The index of the first character captured by the group, or * -1 if the match was successful but the group itself did * not match anything * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IndexOutOfBoundsException * If there is no capturing group in the pattern * with the given index */ @Override public int start(final int group) { return this.usedMatcher.start(this.getGroupIndex(group)); } /** * Returns the start index of the subsequence captured by the given * group during the previous match * operation. * * @param group * A capturing group in this matcher's pattern * * @return The index of the first character captured by the group, or * -1 if the match was successful but the group itself did * not match anything * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public int start(final String group) { return this.usedMatcher.start(this.getGroupIndex(group)); } /** * Returns the start index of the subsequence captured by the given * group during the previous match * operation. * *

An invocation of this convenience method of the form

* *
	 * m.start(groupName, occurrence)
* *

behaves in exactly the same way as

* *
	 * m.start(groupName + "[" + occurrence + "]")
* * @param groupName * The group name for a capturing group in this matcher's pattern * * @param occurrence * The occurrence of the specified group name * * @return The index of the first character captured by the group, or * -1 if the match was successful but the group itself did * not match anything * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public int start(final String groupName, final int occurrence) { return this.usedMatcher.start(this.getGroupIndex(groupName, occurrence)); } /** * Returns the offset after the last character matched. * * @return The offset after the last character matched * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed */ @Override public int end() { return this.usedMatcher.end(); } /** * Returns the offset after the last character of the subsequence captured * by the given group during the previous match operation. * *

Capturing groups are indexed * from left to right, starting at one. Group zero denotes the entire * pattern, so the expression m.end(0) is equivalent to * m.end().

* * @param group * The index of a capturing group in this matcher's pattern * * @return The offset after the last character captured by the group, or * -1 if the match was successful but the group itself did * not match anything * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IndexOutOfBoundsException * If there is no capturing group in the pattern * with the given index */ @Override public int end(final int group) { return this.usedMatcher.end(this.getGroupIndex(group)); } /** * Returns the offset after the last character of the subsequence captured * by the given group during the previous * match operation. * * @param group * A capturing group in this matcher's pattern * * @return The offset after the last character captured by the group, or * -1 if the match was successful but the group itself did * not match anything * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public int end(final String group) { return this.usedMatcher.end(this.getGroupIndex(group)); } /** * Returns the offset after the last character of the subsequence captured * by the given group during the previous * match operation. * *

An invocation of this convenience method of the form

* *
	 * m.end(groupName, occurrence)
* *

behaves in exactly the same way as

* *
	 * m.end(groupName + "[" + occurrence + "]")
* * @param groupName * The group name for a capturing group in this matcher's pattern * * @param occurrence * The occurrence of the specified group name * * @return The offset after the last character captured by the group, or * -1 if the match was successful but the group itself did * not match anything * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public int end(final String groupName, final int occurrence) { return this.usedMatcher.end(this.getGroupIndex(groupName, occurrence)); } /** * Returns the occurrence of the first matched group with the given * name. * *

Branch reset patterns and the {@link Pattern#DUPLICATE_NAMES} flag * allow multiple capture groups with the same group name to exist. * This method offers a way to determine which occurrence matched.

* * @param groupName * the name of the group * * @return the occurrence of the first matched group with the given * name. * */ @Override public int occurrence(final String groupName) { return MatcherHelper.occurrence(this, this.usedMatcher, groupName); } /** * Returns the input subsequence matched by the previous match. * *

For a matcher m with input sequence s, the expressions * m.group() and s.substring(m. * start(), m.end()) are equivalent.

* *

Note that some patterns, for example a*, match the empty * string. This method will return the empty string when the pattern * successfully matches the empty string in the input.

* * @return The (possibly empty) subsequence matched by the previous match, * in string form * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed */ @Override public String group() { return this.groupPrivate(0); } /** * Returns the input subsequence captured by the given group during the * previous match operation. * *

For a matcher m, input sequence s, and group index * g, the expressions m.group(g) and * s.substring(m.start(g), *  m.end(g)) are equivalent.

* *

Capturing groups are indexed * from left to right, starting at one. Group zero denotes the entire * pattern, so the expression m.group(0) is equivalent to * m.group().

* *

If the match was successful but the group specified failed to match * any part of the input sequence, then null is returned. Note that * some groups, for example (a*), match the empty string. This * method will return the empty string when such a group successfully * matches the empty string in the input.

* * @param group * The index of a capturing group in this matcher's pattern * * @return The (possibly empty) subsequence captured by the group during the * previous match, or null if the group failed to match * part of the input * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IndexOutOfBoundsException * If there is no capturing group in the pattern * with the given index */ @Override public String group(final int group) { return this.groupPrivate(this.getGroupIndex(group)); } /** * Returns the input subsequence captured by the given * group during the previous match * operation. * *

Capturing groups are indexed * from left to right, starting at one. Group zero denotes the entire * pattern, so the expression m.group(0, null) is equivalent to * m.group().

* *

If the match was successful but the group specified failed to match * any part of the input sequence, then defaultValue is returned, * whereas {@link #group(int)} would return null. Otherwise, * the return is equivalent to that of m.group(group).

* *

As a note,

* *
m.group(group, null)
* *

behaves in exactly the same way as

* *
m.group(group)
* * @param group * The index of a capturing group in this matcher's pattern * * @param defaultValue * The string to return if {@link #group(int)} would return * null * * @return The (possibly empty) subsequence captured by the group during the * previous match, or defaultValue if the group failed to * match part of the input * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IndexOutOfBoundsException * If there is no capturing group in the pattern * with the given index * * @see #group(int) */ @Override public String group(final int group, final String defaultValue) { return this.groupPrivate(this.getGroupIndex(group), defaultValue); } /** * Returns the input subsequence captured by the given * group during the previous match * operation. * *

If the match was successful but the group specified failed to match * any part of the input sequence, then null is returned. Note that * some groups, for example (a*), match the empty string. This * method will return the empty string when such a group successfully * matches the empty string in the input.

* * @param group * A capturing group in this matcher's pattern * * @return The (possibly empty) subsequence captured by the group during the * previous match, or null if the group failed to match * part of the input * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public String group(final String group) { return this.groupPrivate(this.getGroupIndex(group)); } /** * Returns the input subsequence captured by the given * group during the previous match * operation. * *

If the match was successful but the group specified failed to match * any part of the input sequence, then defaultValue is returned, * whereas {@link #group(String)} would return null. Otherwise, * the return is equivalent to that of m.group(group).

* *

As a note,

* *
m.group(group, null)
* *

behaves in exactly the same way as

* *
m.group(group)
* * @param group * A capturing group in this matcher's pattern * * @param defaultValue * The string to return if {@link #group(String)} would return * null * * @return The (possibly empty) subsequence captured by the group during the * previous match, or defaultValue if the group failed to * match part of the input * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group * * @see #group(String) */ @Override public String group(final String group, final String defaultValue) { return this.groupPrivate(this.getGroupIndex(group), defaultValue); } /** * Returns the input subsequence captured by the given * group during the previous match * operation. * *

If the match was successful but the group specified failed to match * any part of the input sequence, then null is returned. Note that * some groups, for example (a*), match the empty string. This * method will return the empty string when such a group successfully * matches the empty string in the input.

* *

An invocation of this convenience method of the form

* *
m.group(groupName, occurrence)
* *

behaves in exactly the same way as

* *
	 * m.group(groupName + "[" + occurrence + "]")
* * @param groupName * The group name for a capturing group in this matcher's pattern * * @param occurrence * The occurrence of the specified group name * * @return The (possibly empty) subsequence captured by the group during the * previous match, or null if the group failed to match * part of the input * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public String group(final String groupName, final int occurrence) { return this.groupPrivate(this.getGroupIndex(groupName, occurrence)); } /** * Returns the input subsequence captured by the given * group during the previous match * operation. * *

If the match was successful but the group specified failed to match * any part of the input sequence, then defaultValue is returned, * whereas {@link #group(String, int)} would return null. * Otherwise, the return is equivalent to that of * m.group(groupName, occurrence).

* *

As a note,

* *
m.group(groupName, occurrence, null)
* *

behaves in exactly the same way as

* *
m.group(groupName, occurrence)

* *

An invocation of this convenience method of the form

* *
	 * m.group(groupName, occurrence, defaultValue)
* *

behaves in exactly the same way as

* *
	 * m.group(groupName + "[" + occurrence + "]", defaultValue)
*
* * @param groupName * The group name for a capturing group in this matcher's pattern * * @param occurrence * The occurrence of the specified group name * * @param defaultValue * The string to return if {@link #group(String, int)} would * return null * * @return The (possibly empty) subsequence captured by the group during the * previous match, or defaultValue if the group failed to * match part of the input * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group * * @see #group(String, int) */ @Override public String group(final String groupName, final int occurrence, final String defaultValue) { return this.groupPrivate(this.getGroupIndex(groupName, occurrence), defaultValue); } /** * Returns the captured contents of the group with the given index. * * @param mappedIndex * the index for the group (in the internal matcher) whose * contents are returned * @return the captured contents of the group with the given index * @see java.util.regex.Matcher#group(int) */ private String groupPrivate(final int mappedIndex) { String value = this.usedMatcher.group(mappedIndex); return value == null && this.treatNullAsEmptyString ? "" : value; } private String groupPrivate(final int mappedIndex, final String defaultValue) { boolean matched = this.usedMatcher.start(mappedIndex) != -1; return (matched ? this.usedMatcher.group(mappedIndex) : defaultValue); } /** * Attempts to match the entire region against the pattern. * *

If the match succeeds then more information can be obtained via the * start, end, and group methods.

* * @return true if, and only if, the entire region sequence matches * this matcher's pattern */ public boolean matches() { // TODO: use internalMatcher or useMatcher ?? return this.internalMatcher.matches(); } /** * {@inheritDoc} * * @throws IllegalStateException * If no match has yet been attempted, * or if the previous match operation failed */ @Override public boolean isEmpty() { return this.isEmptyPrivate(0); } /** * {@inheritDoc} * * @throws IllegalStateException * If no match has yet been attempted, * or if the previous match operation failed * * @throws IndexOutOfBoundsException * If there is no capturing group in the pattern * with the given index */ @Override public boolean isEmpty(final int group) { return this.isEmptyPrivate(this.getGroupIndex(group)); } /** * {@inheritDoc} * * @throws IllegalStateException * If no match has yet been attempted, * or if the previous match operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public boolean isEmpty(final String group) { return this.isEmptyPrivate(this.getGroupIndex(group)); } /** * {@inheritDoc} * * @throws IllegalStateException * If no match has yet been attempted, * or if the previous match operation failed * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ @Override public boolean isEmpty(final String groupName, final int occurrence) { return this.isEmptyPrivate(this.getGroupIndex(groupName, occurrence)); } private boolean isEmptyPrivate(final int mappedIndex) { int start = this.usedMatcher.start(mappedIndex); int end = this.usedMatcher.end(mappedIndex); // length zero match, and group actually matched (start != -1) return start == end && start != -1; } /** * Attempts to find the next subsequence of the input sequence that matches * the pattern. * *

This method starts at the beginning of this matcher's region, or, if a * previous invocation of the method was successful and the matcher has not * since been reset, at the first character not matched by the previous * match.

* *

If the match succeeds then more information can be obtained via the * start, end, and group methods.

* * @return true if, and only if, a subsequence of the input * sequence matches this matcher's pattern */ public boolean find() { // if (originalMatch != null) // { // internalMatcher.region(originalMatch.regionStart(), originalMatch // .regionEnd()); // // originalMatch = null; // } boolean found; if (this.usedMatcher != this.internalMatcher) { // System.out.println("Test: " + useMatcher.end()); found = this.internalMatcher.find(this.usedMatcher.end()); this.usedMatcher = this.internalMatcher; } else { found = this.internalMatcher.find(); } // boolean found = useMatcher.find(); // lastResult = (java.util.regex.Matcher) // internalMatcher.toMatchResult(); // lastResult = (java.util.regex.Matcher) useMatcher.toMatchResult(); // originalResult = lastResult; // checkShorter = true; return found; } // private java.util.regex.Matcher originalResult; // private java.util.regex.Matcher lastResult = null; // private boolean checkShorter = true; // TODO: can find each matching string // cannot find each combination of group captures, for the same string // public boolean proceed() // { // // TODO: optimize finding of next match // // Ensure original matcher's region is obeyed // // // TODO: use JRegex as tester // // Note: sometimes it doesn't mimic Java, so use best judgement // int start; // // if (lastResult != null) { // if (useMatcher == internalMatcher) // useMatcher = parentPattern.getInternalPattern().matcher(text); // // if (checkShorter) { // /* Proceed to find all shorter matches */ // int matchStart = lastResult.start(); // int matchEnd = lastResult.end(); // // while (matchStart != matchEnd) { // useMatcher.region(matchStart, --matchEnd); // // if (useMatcher.matches()) { // lastResult = (java.util.regex.Matcher) useMatcher.toMatchResult(); // return true; // } // } // // checkShorter = false; // lastResult = originalResult; // } // // if (!checkShorter) {/* Proceed to find all longer matches */ // int matchStart = lastResult.start(); // int matchEnd = lastResult.end(); // int regionEnd = originalResult.regionEnd(); // // while (matchEnd != regionEnd) { // useMatcher.region(matchStart, ++matchEnd); // // if (useMatcher.matches()) { // lastResult = (java.util.regex.Matcher) useMatcher.toMatchResult(); // return true; // } // } // } // // // internalMatcher.region(originalResult.regionStart(), originalResult // // .regionEnd()); // // internalMatcher.region(internalMatcher.regionStart(), internalMatcher // // .regionEnd()); // start = originalResult.start() + 1; // useMatcher = internalMatcher; // } else // start = 0; // // if (start > text.length()) // return false; // // // No other match available, find next // // // TODO: move position to , don't reset matcher // boolean found = useMatcher.find(start); // // lastResult = (java.util.regex.Matcher) useMatcher.toMatchResult(); // originalResult = lastResult; // checkShorter = true; // // return found; // } /** * Resets this matcher and then attempts to find the next subsequence of the * input sequence that matches the pattern, starting at the specified index. * *

If the match succeeds then more information can be obtained via the * start, end, and group methods, and subsequent * invocations of the {@link #find()} method will start at the first * character not matched by this match.

* * @param start * 0-based start index * * @throws IndexOutOfBoundsException * If start is less than zero or if start is greater than the * length of the input sequence. * * @return true if, and only if, a subsequence of the input * sequence starting at the given index matches this matcher's * pattern */ public boolean find(final int start) { // TODO: handle case useMatcher != internalMatcher // boolean find = this.internalMatcher.find(start); boolean find = this.usedMatcher.find(start); this.resetPrivate(); // TODO: before or after resetPrivate() ?? // lastMatch = (Matcher) toMatchResult(); // originalMatch = lastMatch; // checkShorter = true; return find; } /** * Attempts to match the input sequence, starting at the beginning of the * region, against the pattern. * *

Like the {@link #matches matches} method, this method always starts at * the beginning of the region; unlike that method, it does not require that * the entire region be matched.

* *

If the match succeeds then more information can be obtained via the * start, end, and group methods.

* * @return true if, and only if, a prefix of the input sequence * matches this matcher's pattern */ public boolean lookingAt() { return this.internalMatcher.lookingAt(); } /** * Returns a literal replacement String for the specified * String. * *

This method produces a String that will work as a literal * replacement s in the appendReplacement method * of the {@link Matcher} class. The String produced will match * the sequence of characters in s treated as a literal * sequence. Slashes ('\') and dollar signs ('$') will be given no special * meaning.

* * @param s * The string to be literalized * @return A literal string replacement */ public static String quoteReplacement(final String s) { return java.util.regex.Matcher.quoteReplacement(s); } private final java.util.regex.Pattern replacementPart = java.util.regex.Pattern .compile("\\G(?:(-?\\d++)|" + fullGroupName + ")\\}"); /** *

Gets the replacement string, replacing any group references with their actual value

* *

The replacement string may contain references to subsequences captured * during the previous match: Each occurrence of $g * will be replaced by the result of evaluating * * {@link #group(int) group}(g). The first number * after the $ is always treated as part of the group reference. * Subsequent numbers are incorporated into g if they would form a legal * group reference. Only the numerals '0' through '9' are considered as * potential components of the group reference. If the second group matched * the string "foo", for example, then passing the replacement * string "$2bar" would cause "foobar" to be appended to * the string buffer. A dollar sign ($) may be included as a * literal in the replacement string by preceding it with a backslash * (\$).

* *

Note that backslashes (\) and dollar signs ($) in * the replacement string may cause the results to be different than if it * were being treated as a literal replacement string. Dollar signs may be * treated as references to captured subsequences as described above, and * backslashes are used to escape literal characters in the replacement * string.

* * @param replacement * The replacement string * * @return the replacement string replacing any group references with their group value from this Matcher * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If the replacement string refers to a named-capturing * group that does not exist in the pattern * * @throws IndexOutOfBoundsException * If the replacement string refers to a capturing group * that does not exist in the pattern */ public String getReplacement(final String replacement) { int cursor = 0; StringBuilder result = new StringBuilder(); while (cursor < replacement.length()) { char nextChar = replacement.charAt(cursor); if (nextChar == '\\') { cursor++; nextChar = replacement.charAt(cursor); result.append(nextChar); cursor++; } else if (nextChar == '$') { // Skip past $ cursor++; // A StringIndexOutOfBoundsException is thrown if // this "$" is the last character in replacement // string in current implementation, a IAE might be // more appropriate. nextChar = replacement.charAt(cursor); int mappedIndex; if (nextChar == '<') { // e.g. $ // (Java's named group reference) cursor++; StringBuilder gsb = new StringBuilder(); while (cursor < replacement.length()) { nextChar = replacement.charAt(cursor); if (nextChar >= 'a' && nextChar <= 'z' || nextChar >= 'A' && nextChar <= 'Z' || nextChar >= '0' && nextChar <= '9') { gsb.append(nextChar); cursor++; } else { break; } } if (gsb.length() == 0) { throw new IllegalArgumentException( "named capturing group has 0 length name"); } if (nextChar != '>') { throw new IllegalArgumentException( "named capturing group is missing trailing '>'"); } String gname = gsb.toString(); String mappingName = getMappingName(gname, 1); if (!this.pattern() .getGroupMapping() .containsKey(mappingName)) { throw new IllegalArgumentException( "No group with name <" + gname + ">"); } mappedIndex = this.pattern() .getGroupMapping() .get(mappingName); cursor++; } else if (nextChar == '{') { // TODO: don't create a matcher - performance hit java.util.regex.Matcher matcher = this.replacementPart .matcher(replacement); if (!matcher.find(++cursor)) { throw new IllegalArgumentException( "Illegal group reference"); } String numberGroup = matcher.group(1); String match = matcher.group(); if (!match.endsWith("}")) { throw new IllegalArgumentException( "named capturing group is missing trailing '}'"); } cursor += match.length(); // Get mapped index // String refGroupName = refGroup.toString(); // if (false) { // if (isNum && refGroupName.length() != 0) { // int refNum; // try { // refNum = Integer.parseInt(refGroupName); // } catch (NumberFormatException e) { // throw new IllegalArgumentException( // "Illegal group reference"); // } // // mappedIndex = getGroupIndex(refNum); // } else { // mappedIndex = getGroupIndex(refGroupName); if (numberGroup != null) { int groupIndex = Integer.parseInt(numberGroup); mappedIndex = this.getGroupIndex(groupIndex); } else { String groupName = matcher.group(2); if (groupName.length() == 0) { throw new IllegalArgumentException( "named capturing group has 0 length name"); } String groupOccurrence = matcher.group(3); if (groupOccurrence == null) { mappedIndex = this.getGroupIndex(groupName); } else { int occurrence = Integer.parseInt(groupOccurrence); mappedIndex = this.getGroupIndex(groupName, occurrence); } } // } } // else if (nextChar == '$'){ // result.append('$'); // cursor++; // continue; // } else { // e.g $123 // (original functionality) // The first number is always a group int refNum = nextChar - '0'; if ((refNum < 0) || (refNum > 9)) { throw new IllegalArgumentException( "Illegal group reference"); } cursor++; // Capture the largest legal group string boolean done = false; while (!done) { if (cursor >= replacement.length()) { break; } int nextDigit = replacement.charAt(cursor) - '0'; if ((nextDigit < 0) || (nextDigit > 9)) { // not a // number break; } int newRefNum = (refNum * 10) + nextDigit; if (this.groupCount() < newRefNum) { done = true; } else { refNum = newRefNum; cursor++; } } mappedIndex = this.getGroupIndex(refNum); } // Append group String group = this.groupPrivate(mappedIndex); if (group != null) { result.append(group); } } else { result.append(nextChar); cursor++; } } return result.toString(); } /** * Implements a non-terminal append-and-replace step. * *

This method performs the following actions:

* *
    * *
  1. *

    It reads characters from the input sequence, starting at the append * position, and appends them to the given string buffer. It stops after * reading the last character preceding the previous match, that is, the * character at index {@link #start()} - 1.

    *
  2. * *
  3. *

    It appends the given replacement string to the string buffer.

    *
  4. * *
  5. *

    It sets the append position of this matcher to the index of the last * character matched, plus one, that is, to {@link #end()}.

    *
  6. * *
* *

The replacement string may contain references to subsequences captured * during the previous match: Each occurrence of $g * will be replaced by the result of evaluating * * {@link #group(int) group}(g). The first number * after the $ is always treated as part of the group reference. * Subsequent numbers are incorporated into g if they would form a legal * group reference. Only the numerals '0' through '9' are considered as * potential components of the group reference. If the second group matched * the string "foo", for example, then passing the replacement * string "$2bar" would cause "foobar" to be appended to * the string buffer. A dollar sign ($) may be included as a * literal in the replacement string by preceding it with a backslash * (\$).

* *

Note that backslashes (\) and dollar signs ($) in * the replacement string may cause the results to be different than if it * were being treated as a literal replacement string. Dollar signs may be * treated as references to captured subsequences as described above, and * backslashes are used to escape literal characters in the replacement * string.

* *

This method is intended to be used in a loop together with the {@link #appendTail appendTail} and * {@link #find find} methods. The * following code, for example, writes one dog two dogs in the * yard to the standard-outputSyntax stream:

* *
* *
	 * Pattern p = Pattern.compile("cat");
	 * Matcher m = p.matcher("one cat two cats in the yard");
	 * StringBuffer sb = new StringBuffer();
	 * while (m.find()) {
	 *     m.appendReplacement(sb, "dog");
	 * }
	 * m.appendTail(sb);
	 * System.out.println(sb.toString());
	 * 
* *
* * @param sb * The target string buffer * * @param replacement * The replacement string * * @return This matcher * * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed * * @throws IllegalArgumentException * If the replacement string refers to a named-capturing * group that does not exist in the pattern * * @throws IndexOutOfBoundsException * If the replacement string refers to a capturing group * that does not exist in the pattern */ public Matcher appendReplacement(final StringBuffer sb, final String replacement) { int first = this.start(); int last = this.end(); // Append the intervening text sb.append(this.getSubSequence(this.lastAppendPosition, first)); // Append the match substitution sb.append(this.getReplacement(replacement)); this.lastAppendPosition = last; return this; } /** * Implements a terminal append-and-replace step. * *

This method reads characters from the input sequence, starting at the * append position, and appends them to the given string buffer. It is * intended to be invoked after one or more invocations of the {@link #appendReplacement appendReplacement} method * in order to copy the * remainder of the input sequence.

* * @param sb * The target string buffer * * @return The target string buffer */ public StringBuffer appendTail(final StringBuffer sb) { sb.append(this.text, this.lastAppendPosition, this.getTextLength()); return sb; } /** * Replaces every subsequence of the input sequence that matches the pattern * with the given replacement string. * *

This method first resets this matcher. It then scans the input * sequence * looking for matches of the pattern. Characters that are not part of any * match are appended directly to the result string; each match is replaced * in the result by the replacement string. The replacement string may * contain references to captured subsequences as in the {@link #appendReplacement appendReplacement} method.

* *

Note that backslashes (\) and dollar signs ($) in * the replacement string may cause the results to be different than if it * were being treated as a literal replacement string. Dollar signs may be * treated as references to captured subsequences as described above, and * backslashes are used to escape literal characters in the replacement * string.

* *

Given the regular expression a*b, the input * "aabfooaabfooabfoob", and the replacement string "-", * an invocation of this method on a matcher for that expression would yield * the string "-foo-foo-foo-".

* *

Invoking this method changes this matcher's state. If the matcher * is to be used in further matching operations then it should first be * reset.

* * @param replacement * The replacement string * * @return The string constructed by replacing each matching subsequence by * the replacement string, substituting captured subsequences as * needed */ public String replaceAll(final String replacement) { this.reset(); boolean result = this.find(); if (result) { StringBuffer sb = new StringBuffer(); do { this.appendReplacement(sb, replacement); result = this.find(); } while (result); this.appendTail(sb); return sb.toString(); } return this.text.toString(); } /** * Replaces the first subsequence of the input sequence that matches the * pattern with the given replacement string. * *

This method first resets this matcher. It then scans the input * sequence looking for a match of the pattern. Characters that are not part * of the match are appended directly to the result string; the match is * replaced in the result by the replacement string. The replacement string * may contain references to captured subsequences as in the {@link #appendReplacement appendReplacement} method. * *

Note that backslashes (\) and dollar signs ($) in * the replacement string may cause the results to be different than if it * were being treated as a literal replacement string. Dollar signs may be * treated as references to captured subsequences as described above, and * backslashes are used to escape literal characters in the replacement * string.

* *

Given the regular expression dog, the input * "zzzdogzzzdogzzz", and the replacement string "cat", an * invocation of this method on a matcher for that expression would yield * the string "zzzcatzzzdogzzz".

* *

Invoking this method changes this matcher's state. If the matcher * is to be used in further matching operations then it should first be * reset.

* * @param replacement * The replacement string * @return The string constructed by replacing the first matching * subsequence by the replacement string, substituting captured * subsequences as needed */ public String replaceFirst(final String replacement) { if (replacement == null) { throw new NullPointerException("replacement"); } this.reset(); if (!this.find()) { return this.text.toString(); } StringBuffer sb = new StringBuffer(); this.appendReplacement(sb, replacement); this.appendTail(sb); return sb.toString(); } /** * Sets the limits of this matcher's region. The region is the part of the * input sequence that will be searched to find a match. Invoking this * method resets the matcher, and then sets the region to start at the index * specified by the start parameter and end at the index * specified by the end parameter. * *

Depending on the transparency and anchoring being used (see {@link #useTransparentBounds useTransparentBounds} * and {@link #useAnchoringBounds useAnchoringBounds}), certain constructs such * as anchors may behave differently at or around the boundaries of the * region.

* * @param start * The index to start searching at (inclusive) * @param end * The index to end searching at (exclusive) * @throws IndexOutOfBoundsException * If start or end is less than zero, if start is greater than * the length of the input sequence, if end is greater than the * length of the input sequence, or if start is greater than * end. * @return this matcher */ public Matcher region(final int start, final int end) { this.internalMatcher.region(start, end); return this.resetPrivate(); } /** * Reports the start index of this matcher's region. The searches this * matcher conducts are limited to finding matches within {@link #regionStart regionStart} (inclusive) and * {@link #regionEnd * regionEnd} (exclusive). * * @return The starting point of this matcher's region */ public int regionStart() { return this.internalMatcher.regionStart(); } /** * Reports the end index (exclusive) of this matcher's region. The searches * this matcher conducts are limited to finding matches within {@link #regionStart regionStart} (inclusive) and * {@link #regionEnd * regionEnd} (exclusive). * * @return the ending point of this matcher's region */ public int regionEnd() { return this.internalMatcher.regionEnd(); } /** * Queries the transparency of region bounds for this matcher. * *

This method returns true if this matcher uses * transparent bounds, false if it uses opaque * bounds.

* *

See {@link #useTransparentBounds useTransparentBounds} for a * description of transparent and opaque bounds.

* *

By default, a matcher uses opaque region boundaries.

* * @return true iff this matcher is using transparent bounds, * false otherwise. * @see Matcher#useTransparentBounds(boolean) */ public boolean hasTransparentBounds() { return this.internalMatcher.hasTransparentBounds(); } /** * Sets the transparency of region bounds for this matcher. * *

Invoking this method with an argument of true will set this * matcher to use transparent bounds. If the boolean argument is * false, then opaque bounds will be used.

* *

Using transparent bounds, the boundaries of this matcher's region are * transparent to lookahead, lookbehind, and boundary matching constructs. * Those constructs can see beyond the boundaries of the region to see if a * match is appropriate.

* *

Using opaque bounds, the boundaries of this matcher's region are * opaque to lookahead, lookbehind, and boundary matching constructs that * may try to see beyond them. Those constructs cannot look past the * boundaries so they will fail to match anything outside of the region.

* *

By default, a matcher uses opaque bounds.

* * @param b * a boolean indicating whether to use opaque or transparent * regions * @return this matcher * @see Matcher#hasTransparentBounds */ public Matcher useTransparentBounds(final boolean b) { this.internalMatcher.useTransparentBounds(b); return this; } /** * Queries the anchoring of region bounds for this matcher. * *

This method returns true if this matcher uses * anchoring bounds, false otherwise.

* *

See {@link #useAnchoringBounds useAnchoringBounds} for a description * of anchoring bounds.

* *

By default, a matcher uses anchoring region boundaries.

* * @return true iff this matcher is using anchoring bounds, * false otherwise. * @see Matcher#useAnchoringBounds(boolean) */ public boolean hasAnchoringBounds() { return this.internalMatcher.hasAnchoringBounds(); } /** * Sets the anchoring of region bounds for this matcher. * *

Invoking this method with an argument of true will set this * matcher to use anchoring bounds. If the boolean argument is * false, then non-anchoring bounds will be used.

* *

Using anchoring bounds, the boundaries of this matcher's region match * anchors such as ^ and $.

* *

Without anchoring bounds, the boundaries of this matcher's region will * not match anchors such as ^ and $.

* *

By default, a matcher uses anchoring region boundaries.

* * @param b * a boolean indicating whether or not to use anchoring bounds. * @return this matcher * @see Matcher#hasAnchoringBounds */ public Matcher useAnchoringBounds(final boolean b) { this.internalMatcher.useAnchoringBounds(b); return this; } /** *

Returns the string representation of this matcher. The string * representation of a Matcher contains information that may be * useful for debugging. The exact format is unspecified.

* * @return The string representation of this matcher */ @Override public String toString() { String group; try { group = this.group(); } catch (IllegalStateException e) { group = null; } StringBuffer sb = new StringBuffer(); sb.append("info.codesaway.util.regex.Matcher"); sb.append("[pattern=" + this.pattern()); sb.append(" region="); sb.append(this.regionStart() + "," + this.regionEnd()); sb.append(" lastmatch="); if (group != null) { sb.append(group); } sb.append("]"); return sb.toString(); } /** * Returns true if the end of input was hit by the search engine in the last * match operation performed by this matcher. * *

When this method returns true, then it is possible that more input * would have changed the result of the last search.

* * @return true iff the end of input was hit in the last match; false * otherwise */ public boolean hitEnd() { // TOOD: which should I use ?? // return this.internalMatcher.hitEnd(); return this.usedMatcher.hitEnd(); } /** * Returns true if more input could change a positive match into a negative * one. * *

If this method returns true, and a match was found, then more input * could cause the match to be lost. If this method returns false and a * match was found, then more input might change the match but the match * won't be lost. If a match was not found, then requireEnd has no * meaning.

* * @return true iff more input could change a positive match into a negative * one. */ public boolean requireEnd() { // TODO: which should I use ?? // return this.internalMatcher.requireEnd(); return this.usedMatcher.requireEnd(); } /** * Returns the end index of the text. * * @return the index after the last character in the text */ int getTextLength() { return this.text.length(); } /** * Generates a String from this Matcher's input in the specified range. * * @param beginIndex * the beginning index, inclusive * @param endIndex * the ending index, exclusive * @return A String generated from this Matcher's input */ CharSequence getSubSequence(final int beginIndex, final int endIndex) { return this.text.subSequence(beginIndex, endIndex); } /** * Returns the string being matched. * * @return the string being matched */ @Override public String text() { return this.text.toString(); } /** * Returns the group name (if any) for the specified group. * *

If a match has been successful, this function's return is the group * name associated with the given group for this match. This group * name is match independent, except, possibly, when the group is part of a * "branch reset" pattern.

* *

If there is no successful match, this function's return is the group * name, only in the case that it is match independent. The only case where * the group name is not match independent is when the group is part of a * "branch reset" subpattern, and there are at least two groups with the * given number.

* *

For example, in the pattern * (?|(?<group1a>1a)|(?<group1b>1b), the group name * for group 1 depends on whether the pattern matches "1a" or "1b". In this * case, an IllegalStateException is thrown, because a match is required to * determine the group name.

* *

If there is more than one occurrence of the group, the returned group * name includes the occurrence - for example, myGroup[1]. If * there is only one occurrence of the group, only the group name is * returned - for example, myGroup.

* * @param group * The index of a capturing group in this matcher's pattern * @return the group name for the specified group, or null if * the group has no associated group name * * @throws IllegalStateException * If the group name is match dependent, and no match has yet * been attempted, or if the previous match operation failed */ @Override public String getGroupName(final int group) { Integer groupIndex = this.getGroupIndex(group); return MatcherHelper.getGroupName(this, groupIndex); } /* * Returns the given group name, adjusting the case, if necessary, based on * the {@link #CASE_INSENSITIVE_NAMES} flag. * * @param groupName * the group name * * @param groupOccurrence * the group occurrence * * @param totalOccurrences * the total occurrences * * @return the absolute occurrence */ // String handleCase(String groupName) // { // return parentPattern.handleCase(groupName); // } /** * Handles errors related to specifying a non-existent group for a * parameter. * * @param group * the index for the non-existent group */ static IndexOutOfBoundsException noGroup(final int group) { return new IndexOutOfBoundsException("No group " + group); } /** * Handles errors related to specifying a non-existing group as parameter. * * @param group * the non-existent group */ static IllegalArgumentException noGroup(final String group) { return new IllegalArgumentException("No group <" + group + ">"); } static IllegalArgumentException noGroup(final String groupName, final int occurence) { return noGroup(getMappingName(groupName, occurence)); } /** * Handles errors related to specifying a non-existent (named) group as a * parameter. * * @param group * the non-existent group */ static IllegalArgumentException noNamedGroup(final String group) { return new IllegalArgumentException("No group with name <" + group + ">"); } /** * Returns the absolute group for the given group. * *

If the input is negative (a relative group), then it is converted to * an absolute group index.

* * @param index * the index * @param groupCount * the group count * @return the absolute group index * @throws ArrayIndexOutOfBoundsException * If group is a relative group (that is, negative), which * doesn't refer to a valid group. * @throws IndexOutOfBoundsException * If index in greater than groupCount */ static int getAbsoluteGroupIndex(int index, final int groupCount) { if (index < 0) { // group is relative // e.g. -1 with groupCount == 5 -> 5 index += groupCount + 1; if (index <= 0) { throw new ArrayIndexOutOfBoundsException( "Index: " + index + ", Size: " + groupCount); } } else if (index > groupCount) { throw new IndexOutOfBoundsException( "Index: " + index + ", Size: " + groupCount); } return index; } /** * Returns the mapped index for the specified group. * * @throws IndexOutOfBoundsException * If there is no capturing group in the pattern * of the given group * @throws IllegalStateException * If no match has yet been attempted, or if the previous match * operation failed; only thrown if {@link #groupCount(String) * groupCount("[" + group + "]")} >= 2. */ Integer getGroupIndex(final int group) { return MatcherHelper.getGroupIndex(this, this.usedMatcher, group); } /** * Returns the mapped index for the specified group. * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ Integer getGroupIndex(final String group) { return MatcherHelper.getGroupIndex(this, this.usedMatcher, group); } /** * Returns the mapped index for the specified group. * * @throws IllegalArgumentException * If there is no capturing group in the pattern * of the given group */ Integer getGroupIndex(final String groupName, final int occurrence) { return MatcherHelper.getGroupIndex(this, this.usedMatcher, groupName, occurrence); } /** * @since 0.2 */ @Override public boolean treatNullAsEmptyString() { return this.treatNullAsEmptyString; } /** * @param treatNullAsEmptyString whether this Matcher should treat null as the empty string * @return true if should treat null as the empty string * @since 0.2 */ public Matcher treatNullAsEmptyString(final boolean treatNullAsEmptyString) { this.treatNullAsEmptyString = treatNullAsEmptyString; return this; } // public boolean containsKey(Object key) // { // if (key instanceof CharSequence) { // try { // getGroupIndex(key.toString()); // return true; // } catch (IllegalArgumentException e) { // return false; // } // } else if (key instanceof Number) { // Number index = (Number) key; // // try { // getAbsoluteGroupIndex(index.intValue(), // groupCount()); // return true; // } catch (IndexOutOfBoundsException e) { // return false; // } // } // // throw new IllegalArgumentException("Invalid group name/index: " // + key); // } // }; /** * @throws IllegalArgumentException * If key is not a CharSequence or * Number * @since 0.2 */ @Override @SuppressFBWarnings("RV_RETURN_VALUE_IGNORED_NO_SIDE_EFFECT") public boolean containsKey(final Object key) { return MatcherHelper.containsKey(this, this.usedMatcher, key); } /** * @since 0.2 */ // private static final Comparator> // entryComparator = new Comparator>() { // // public int compare(Entry o1, // Entry o2) // { // Integer index1 = o1.getKey(); // Integer index2 = o2.getKey(); // // return index1.compareTo(index2); // } // }; /** * @since 0.2 */ private final Map> entryCache = new HashMap<>( 2); /** * * @since 0.2 */ @Override public Entry getEntry(final int group) { return this.entryCache.computeIfAbsent(group, MatchResult.super::getEntry); } private Matcher cloneReset() { Matcher matcher = this.pattern().matcher(this.text()); matcher.useAnchoringBounds(this.hasAnchoringBounds()); matcher.useTransparentBounds(this.hasTransparentBounds()); return matcher; } /** * Maintains compatibility with Groovy regular expressions * * @since 0.2 */ /** * Returns an {@link java.util.Iterator} which traverses each match. * * @return an Iterator for a Matcher * @since 0.2 */ // Copied from Groovy // public Iterator> iterator() @Override public Iterator iterator() { // Iterator doesn't modify this Matcher final Matcher matcher = this.cloneReset(); return new MatchResultIterator(matcher); } // suggestion from SpotBugs private static class MatchResultIterator implements Iterator { private boolean found /* = false */; private boolean done /* = false */; private final Matcher matcher; public MatchResultIterator(final Matcher matcher) { this.matcher = matcher; } @Override public boolean hasNext() { if (this.done) { return false; } if (!this.found) { this.found = this.matcher.find(); if (!this.found) { this.done = true; } } return this.found; } // public List next() @Override public MatchResult next() { if (!this.found) { if (!this.hasNext()) { throw new NoSuchElementException(); } } this.found = false; // List list = new ArrayList(matcher.groupCount() + 1); // // for (int i = 0; i <= matcher.groupCount(); i++) { // list.add(matcher.group(i)); // } // // return list; return this.matcher.toMatchResult(); // return matcher; // return matcher.toMatchResult(); // if (matcher.groupCount() > 0) { // // are we using groups? // // yes, so return the specified group as list // List list = new ArrayList(matcher.groupCount()); // for (int i = 0; i <= matcher.groupCount(); i++) { // list.add(matcher.group(i)); // } // return list; // } else { // // not using groups, so return the nth // // occurrence of the pattern // return matcher.group(); // } } @Override public void remove() { throw new UnsupportedOperationException(); } } /** * Gets each match as a MatchResult * @return the results */ public List getResults() { List results = new ArrayList<>(); Iterator iterator = this.iterator(); while (iterator.hasNext()) { results.add(iterator.next()); } return results; } /* Groovy methods - makes RegExPlus groovier */ /** * Alias for {@link #find()}. * *

Coerces a Matcher instance to a boolean value, for use in Groovy truth

* * @since 0.2 */ @Override public boolean asBoolean() { RegExPlusSupport.setLastMatcher(this); return this.find(); } /** * {@inheritDoc} */ @Override public Object getAt(int idx) { // TODO: optimize code // NOTE: can produce different results than Groovy, since this code doesn't affect this Matcher // (iterator does not affect this matcher) List results = this.getResults(); int count = results.size(); if (idx < -count || idx >= count) { throw new IndexOutOfBoundsException("index is out of range " + (-count) + ".." + (count - 1) + " (index = " + idx + ")"); } if (idx < 0) { idx += count; } return results.get(idx); } /** * * @param indexes the indexes * @return for a MatchResult, the specified group; otherwise, the specified MatchResult */ public List getAt(final Collection indexes) { List matches = new ArrayList(); // TODO: optimize code // NOTE: can produce different results than Groovy, since this code doesn't affect this Matcher // (iterator does not affect this matcher) List indexList = flatten(indexes); List results = this.getResults(); int count = results.size(); for (int idx : indexList) { if (idx < -count || idx >= count) { throw new IndexOutOfBoundsException("index is out of range " + (-count) + ".." + (count - 1) + " (index = " + idx + ")"); } if (idx < 0) { idx += count; } matches.add(results.get(idx)); } return matches; } /* * Slightly modified from DefaultGroovyMethods.getAt(Collection), * and modified to run in Java */ private List flattenGroupIndexes(final Collection indices) { List answer = new ArrayList<>(indices.size()); for (Object value : indices) { if (value instanceof List) { answer.addAll(this.flattenGroupIndexes((List) value)); } else if (value instanceof CharSequence) { CharSequence group = (CharSequence) value; // System.out.println(group + ": " + getGroupIndex(group.toString())); answer.add(this.getGroupIndex(group.toString())); } else { int idx = intUnbox(value); answer.add(this.getGroupIndex(idx)); } } return answer; } /* Slightly modified from DefaultGroovyMethods.getAt(Collection), and modified to run in Java */ private static List flatten(final Collection indices) { List answer = new ArrayList<>(indices.size()); for (Object value : indices) { if (value instanceof List) { answer.addAll(flatten((List) value)); } else { int idx = intUnbox(value); answer.add(idx); } } return answer; } /* Groovy helper methods, required by the getAt method (copied from Groovy code) */ private static int intUnbox(final Object value) { Number n = castToNumber(value); return n.intValue(); } private static Number castToNumber(final Object object) { // default to Number class in exception details, else use the specified Number subtype. return castToNumber(object, Number.class); } private static Number castToNumber(final Object object, final Class type) { if (object instanceof Number) { return (Number) object; } if (object instanceof Character) { return Integer.valueOf(((Character) object).charValue()); } if (object instanceof String) { String c = (String) object; if (c.length() == 1) { return Integer.valueOf(c.charAt(0)); } else { throw new ClassCastException(makeMessage(c, type)); } } throw new ClassCastException(makeMessage(object, type)); } static String makeMessage(Object objectToCast, final Class classToCastTo) { String classToCastFrom; if (objectToCast != null) { classToCastFrom = objectToCast.getClass().getName(); } else { objectToCast = "null"; classToCastFrom = "null"; } return "Cannot cast object '" + objectToCast + "' " + "with class '" + classToCastFrom + "' " + "to class '" + classToCastTo.getName() + "'"; } public MatchResult next() { MatchResult result = this.toMatchResult(); this.find(); return result; } // public Matcher next() // { // // find(); // System.out.println("Next: " + this); // return this; // } // public String getProperty(String name) // { // return group(name); // } /** * Returns this Matcher. * *

Added for consistency for use in Groovy, since +javaMatcher is also supported. * This method ensures that the 'positive' operator will return a RegExPlus Matcher, for both cases:

* *
    *
  1. Promoting a Java Matcher: +javaMatcher
  2. *
  3. When used on an existing RegExPlus Matcher: +regexPlusMatcher
  4. *
* * @return this Matcher. */ public Matcher positive() { return this; } private static class ImmutableMatchResult implements MatchResult { private final Pattern pattern; private final java.util.regex.MatchResult usedMatcher; private final String text; private final boolean treatNullAsEmptyString; ImmutableMatchResult(final Pattern pattern, final java.util.regex.MatchResult internalMatchResult, final String text, final boolean treatNullAsEmptyString) { this.pattern = pattern; this.usedMatcher = internalMatchResult; this.text = text; this.treatNullAsEmptyString = treatNullAsEmptyString; } @Override public int start() { return this.usedMatcher.start(); } @Override public int end() { return this.usedMatcher.end(); } Integer getGroupIndex(final int group) { return MatcherHelper.getGroupIndex(this, this.usedMatcher, group); } Integer getGroupIndex(final String group) { return MatcherHelper.getGroupIndex(this, this.usedMatcher, group); } Integer getGroupIndex(final String group, final int occurrence) { return MatcherHelper.getGroupIndex(this, this.usedMatcher, group, occurrence); } @Override public int start(final int group) { return this.usedMatcher.start(this.getGroupIndex(group)); } @Override public int start(final String group) { return this.usedMatcher.start(this.getGroupIndex(group)); } @Override public int start(final String groupName, final int occurrence) { return this.usedMatcher.start(this.getGroupIndex(groupName, occurrence)); } @Override public int end(final int group) { return this.usedMatcher.end(this.getGroupIndex(group)); } @Override public int end(final String group) { return this.usedMatcher.end(this.getGroupIndex(group)); } @Override public int end(final String groupName, final int occurrence) { return this.usedMatcher.end(this.getGroupIndex(groupName, occurrence)); } @Override public int occurrence(final String groupName) { return MatcherHelper.occurrence(this, this.usedMatcher, groupName); } /** * Returns the captured contents of the group with the given index. * * @param mappedIndex * the index for the group (in the internal matcher) whose * contents are returned * @return the captured contents of the group with the given index * @see java.util.regex.Matcher#group(int) */ private String groupPrivate(final int mappedIndex) { String value = this.usedMatcher.group(mappedIndex); return value == null && this.treatNullAsEmptyString ? "" : value; } private String groupPrivate(final int mappedIndex, final String defaultValue) { boolean matched = this.usedMatcher.start(mappedIndex) != -1; return (matched ? this.usedMatcher.group(mappedIndex) : defaultValue); } private boolean isEmptyPrivate(final int mappedIndex) { int start = this.usedMatcher.start(mappedIndex); int end = this.usedMatcher.end(mappedIndex); // length zero match, and group actually matched (start != -1) return start == end && start != -1; } @Override public String group() { return this.groupPrivate(0); } @Override public String group(final int group) { return this.groupPrivate(this.getGroupIndex(group)); } @Override public String group(final int group, final String defaultValue) { return this.groupPrivate(this.getGroupIndex(group), defaultValue); } @Override public String group(final String group) { return this.groupPrivate(this.getGroupIndex(group)); } @Override public String group(final String group, final String defaultValue) { return this.groupPrivate(this.getGroupIndex(group), defaultValue); } @Override public String group(final String groupName, final int occurrence) { return this.groupPrivate(this.getGroupIndex(groupName, occurrence)); } @Override public String group(final String groupName, final int occurrence, final String defaultValue) { return this.groupPrivate(this.getGroupIndex(groupName, occurrence), defaultValue); } @Override public boolean isEmpty() { return this.isEmptyPrivate(0); } @Override public boolean isEmpty(final int group) { return this.isEmptyPrivate(this.getGroupIndex(group)); } @Override public boolean isEmpty(final String group) { return this.isEmptyPrivate(this.getGroupIndex(group)); } @Override public boolean isEmpty(final String groupName, final int occurrence) { return this.isEmptyPrivate(this.getGroupIndex(groupName, occurrence)); } @Override public Pattern pattern() { return this.pattern; } @Override public String text() { return this.text; } @Override public String getGroupName(final int group) { Integer groupIndex = this.getGroupIndex(group); return MatcherHelper.getGroupName(this, groupIndex); } @Override public boolean treatNullAsEmptyString() { return this.treatNullAsEmptyString; } @Override public boolean containsKey(final Object key) { return MatcherHelper.containsKey(this, this.usedMatcher, key); } } }




© 2015 - 2024 Weber Informatics LLC | Privacy Policy