/**
* This package contains a library, TokensRegex, for matching regular expressions over
* tokens. TokensRegex is incorporated into the
* {@link edu.stanford.nlp.pipeline.TokensRegexAnnotator}
* and {@link edu.stanford.nlp.pipeline.TokensRegexNERAnnotator}.
*
* Rules for extracting expressions using TokensRegex
* TokensRegex provides a language for specifying rules to extract expressions over token sequences.
* {@link edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor} and {@link edu.stanford.nlp.ling.tokensregex.SequenceMatchRules} describe
* the language and how the extraction rules are created.
*
* Core classes for token sequence matching using TokensRegex
* At the core of TokensRegex are the
* {@link edu.stanford.nlp.ling.tokensregex.TokenSequenceMatcher} and
* {@link edu.stanford.nlp.ling.tokensregex.TokenSequencePattern} classes which
* can be used to match patterns over a sequence of tokens.
* The usage is designed to follow the paradigm of the Java regular expression library
* {@code java.util.regex}. The usage is similar except that matches are done
* over {@code List<CoreMap>} instead of over {@code String}.
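*
* As a point of comparison, the {@code java.util.regex} paradigm mentioned here is the
* compile / matcher / find / group cycle. The sketch below is a minimal illustration that uses
* only the standard library. The class {@code RegexParadigmSketch} and its {@code findAll}
* helper are hypothetical names introduced for this example, and the comments naming
* TokensRegex counterparts reflect the analogy drawn in this package description rather
* than a definitive mapping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only; it is not part of TokensRegex.
class RegexParadigmSketch {

    // Collects every match of regex in input using the compile / matcher / find / group cycle.
    static List<String> findAll(String regex, String input) {
        // Assumed counterpart in this package: TokenSequencePattern.compile(...)
        Pattern pattern = Pattern.compile(regex);
        // Assumed counterpart in this package: pattern.getMatcher(tokens)
        Matcher matcher = pattern.matcher(input);
        List<String> matches = new ArrayList<>();
        // find() advances to the next match; group() returns the text of that match.
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("New York|Boston", "From Boston to New York by train"));
    }
}
```

* Running {@code main} prints {@code [Boston, New York]}. With TokensRegex the same cycle
* is applied to a list of tokens rather than to a {@code String}.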
*
* Example:
*
*
* List<CoreLabel> tokens = ...;
* TokenSequencePattern pattern = TokenSequencePattern.compile(...);
* TokenSequenceMatcher matcher = pattern.getMatcher(tokens);
*
*
* The classes {@link edu.stanford.nlp.ling.tokensregex.SequenceMatcher} and {@link edu.stanford.nlp.ling.tokensregex.SequencePattern} can be used to build
* classes for recognizing regular expressions over sequences of arbitrary types.
*
* Utility classes
* TokensRegex also offers a group of utility classes.
*
* {@link edu.stanford.nlp.ling.tokensregex.MultiPatternMatcher} provides utility functions for finding expressions with multiple patterns.
* For instance, using {@link edu.stanford.nlp.ling.tokensregex.MultiPatternMatcher#findNonOverlapping}
* you can find all nonoverlapping subsequences for a given set of patterns.
*
* To find character offsets of multi-word expressions in a {@code String},
* you can also use {@link edu.stanford.nlp.ling.tokensregex.MultiWordStringMatcher#findTargetStringOffsets}.
*
* @author Angel Chang
*/
package edu.stanford.nlp.ling.tokensregex;