resources.pipelines.regexp.unwantedText4LDA.txt
A GATE plugin that provides many different machine learning
algorithms for a wide range of NLP-related machine learning tasks like
text classification, tagging, or chunking.
// URLs
|(https?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|])
1 => Unwanted type="url"
// (some) email addresses
|(\b[a-zA-Z0-9!#$%&'*+/=\?^_`{|}~-]{1,64}(?:\.[a-zA-Z0-9!#$%&'*+/=\?^_`{|}~-]{1,64}){0,32})@([a-zA-Z0-9-]{1,63}(?:\.[a-zA-Z0-9-]{1,63}){1,32}\b)
1 => Unwanted type="email"
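
The two rules above can be tried outside GATE with a minimal sketch like the following. It assumes that a leading `|` marks a pattern line and that `1 => Unwanted type="..."` annotates capture group 1, as the layout of the rules suggests. It uses Python's `re` module in place of the Java regex engine that GATE runs on, which should behave the same way for these particular patterns. The names `find_unwanted`, `URL_PATTERN`, `EMAIL_PATTERN` and the sample text are hypothetical and exist only for illustration.

```python
import re

# Patterns copied from the rules above, without the leading "|" rule marker.
URL_PATTERN = re.compile(
    r"(https?://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|])"
)
EMAIL_PATTERN = re.compile(
    r"(\b[a-zA-Z0-9!#$%&'*+/=\?^_`{|}~-]{1,64}"
    r"(?:\.[a-zA-Z0-9!#$%&'*+/=\?^_`{|}~-]{1,64}){0,32})"
    r"@([a-zA-Z0-9-]{1,63}(?:\.[a-zA-Z0-9-]{1,63}){1,32}\b)"
)


def find_unwanted(text):
    """Return (type, text of capture group 1) pairs for every match.

    Hypothetical helper: it only imitates the assumed meaning of
    '1 => Unwanted type=...' and is not part of GATE.
    """
    spans = []
    for kind, pattern in (("url", URL_PATTERN), ("email", EMAIL_PATTERN)):
        for match in pattern.finditer(text):
            spans.append((kind, match.group(1)))
    return spans


# Hypothetical sample text.
sample = "See https://gate.ac.uk/userguide, or mail someone@example.org today."
result = find_unwanted(sample)
print(result)
```

In the URL pattern the final character class leaves out `!:,.;?`, so trailing sentence punctuation such as the comma in the sample is not included in the match. In the email pattern, group 1 covers only the part before the `@` and the domain is captured separately as group 2, so under the assumption above a rule that refers to group 1 would mark just the local part.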