ANNIE is a general purpose information extraction system that provides the building blocks of many other GATE applications.

/*
*  first.jape
*
* Copyright (c) 1998-2004, The University of Sheffield.
*
*  This file is part of GATE (see http://gate.ac.uk/), and is free
*  software, licenced under the GNU Library General Public License,
*  Version 2, June 1991 (in the distribution as file licence.html,
*  and also available at http://gate.ac.uk/gate/licence.html).
*
*  Diana Maynard, 10 Sep 2001
* 
*  $Id: first.jape 18909 2015-09-15 11:37:47Z dgmaynard $
*/

Phase:	First
Input: Token NumberLetter
Options: control = appelt

// this has to be run first of all 

//////////////////////////////////////////////////////////////
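Macro: SPACE
// matches a short run of whitespace tokens, in one of two forms:
//   space, optionally followed by up to two control tokens
//   control, optionally followed by a control and then a space
// note: SpaceToken is not listed in this phase's Input, and no rule in this file uses this macro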

( 
 ({SpaceToken.kind == space}
  ({SpaceToken.kind == control})?
  ({SpaceToken.kind == control})?
 )
|
 ({SpaceToken.kind == control}
  ({SpaceToken.kind == control})?
  ({SpaceToken.kind == space})?
 )
)




///////////////////////////////////////////////////////////////

Rule: ClosedClass
// closed class words should not be part of names generally, so let's identify them
Priority: 100

(
 {Token.category == DT}|
 {Token.category == PRP}|
 {Token.category == RB}|
 {Token.category == IN}
):tag
-->
:tag.ClosedClass = {rule = "ClosedClass"}

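Rule: NumberLetter
// the empty RHS creates no annotation; under appelt control the match still consumes the
// NumberLetter span, so the Upper rules below do not fire on it unless they match a longer span
Priority: 100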
( 
 {NumberLetter}
):tag
-->
{} 


Rule: UpperAllCaps
Priority: 100
// separate proper nouns that are in all caps, as they're more ambiguous
(
 {Token.category == NNP, Token.orth == allCaps}
 ({Token.string == "-"}
  {Token.category == NNP, Token.orth == allCaps}
 )?
):tag
-->
:tag.Upper = {kind = "allCaps", rule = "Upper"}

Rule: Upper
// define what can be a possible proper noun - cater for the fact that POS tag might not be correct
(
 ({Token.category == NNP}|
  {Token.orth == upperInitial}|
  {Token.orth == mixedCaps}
 )
 ({Token.string == "-"}
  {Token.category == NNP}
 )?
):tag
-->
:tag.Upper = {rule = "Upper"}








