
gate.DocumentContent Maven / Gradle / Ivy


GATE (General Architecture for Text Engineering) is open source software capable of solving almost any text processing problem. This artifact lets you embed the core of GATE Embedded with its essential dependencies. You will be able to use the GATE Embedded API and to load and store GATE XML documents. This artifact is a suitable dependency for CREOLE plugins, or for applications that need to customize the GATE dependencies, either because of conflicts with their own dependencies or to reduce their footprint.

/*
 *  DocumentContent.java
 *
 *  Copyright (c) 1995-2012, The University of Sheffield. See the file
 *  COPYRIGHT.txt in the software or at http://gate.ac.uk/gate/COPYRIGHT.txt
 *
 *  This file is part of GATE (see http://gate.ac.uk/), and is free
 *  software, licenced under the GNU Library General Public License,
 *  Version 2, June 1991 (in the distribution as file licence.html,
 *  and also available at http://gate.ac.uk/gate/licence.html).
 *
 *  Hamish Cunningham, 15/Feb/2000
 *
 *  $Id: DocumentContent.java 15333 2012-02-07 13:18:33Z ian_roberts $
 */

package gate;

import java.io.Serializable;

import gate.util.InvalidOffsetException;

/** The content of Documents.
  */
public interface DocumentContent extends Serializable {

  /**
   * Return the contents under a particular span.
   *
   * <p>
   * Conceptually the annotation offsets are defined as falling in between
   * characters, with "0" pointing before the first character.
   * Because of that, the offsets where an annotation ends and the space after
   * it starts are the same.
   * </p>
   * <p>
   * So this is what the "abcde" string looks like with the offsets explicitly
   * included: 0a1b2c3d4e5
   * </p>
   * <p>
   * "ab cd" would then look like this: 0a1b2 3c4d5
   * </p>
   * <p>
   * with the following annotations: <br>
   * Token "ab" [0,2] <br>
   * SpaceToken " " [2,3] <br>
   * Token "cd" [3,5] <br>
   * </p>
   *
   * @param start the beginning index, inclusive.
   * @param end the ending index, exclusive.
   * @return the specified substring for the document.
   * @throws gate.util.InvalidOffsetException if the
   *     start is negative, or
   *     end is larger than the length of
   *     this DocumentContent object, or
   *     start is larger than
   *     end.
   */
  public DocumentContent getContent(Long start, Long end)
    throws InvalidOffsetException;

  /** The size of this content (e.g. character length for textual
    * content).
    */
  public Long size();

} // interface DocumentContent
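
The offset convention described in the getContent comment can be tried out without GATE on the classpath. The following is a minimal sketch, not GATE's implementation: the class OffsetSketch and its span method are hypothetical names introduced here, the content is modelled as a plain String, and IllegalArgumentException stands in for gate.util.InvalidOffsetException so that the example stays self-contained. It applies the same three validity checks listed in the @throws clause and treats start as inclusive and end as exclusive.

```java
// Illustrative sketch only. OffsetSketch and span are hypothetical names
// and are NOT part of the GATE API.
final class OffsetSketch {

  private OffsetSketch() {
    // static helper, not meant to be instantiated
  }

  /**
   * Returns the text between two offsets, where offsets fall in between
   * characters and offset 0 points before the first character.
   *
   * Assumption: IllegalArgumentException replaces
   * gate.util.InvalidOffsetException to avoid a dependency on GATE.
   */
  static String span(String text, long start, long end) {
    // The same conditions the getContent comment lists as invalid:
    // negative start, end past the content length, or start after end.
    if (start < 0 || end > text.length() || start > end) {
      throw new IllegalArgumentException(
          "Invalid span [" + start + "," + end + "] for content of length "
              + text.length());
    }
    // Both values are now within [0, text.length()], so the casts are safe.
    return text.substring((int) start, (int) end);
  }

  public static void main(String[] args) {
    String text = "ab cd"; // with offsets written out: 0a1b2 3c4d5

    System.out.println("Token      [0,2]: '" + span(text, 0, 2) + "'");
    System.out.println("SpaceToken [2,3]: '" + span(text, 2, 3) + "'");
    System.out.println("Token      [3,5]: '" + span(text, 3, 5) + "'");
  }
}
```

Because an annotation's end offset equals the start offset of whatever follows it, adjacent spans such as [0,2] and [2,3] share a boundary without overlapping, and a span whose start equals its end is valid and empty.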




