package it.unimi.dsi.big.mg4j.index;
/*
* MG4J: Managing Gigabytes for Java (big)
*
* Copyright (C) 2005-2011 Paolo Boldi and Sebastiano Vigna
*
* This library is free software; you can redistribute it and/or modify it
* under the terms of the GNU Lesser General Public License as published by the Free
* Software Foundation; either version 3 of the License, or (at your option)
* any later version.
*
* This library is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
* or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License
* for more details.
*
* You should have received a copy of the GNU Lesser General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*
*/
import it.unimi.dsi.io.SafelyCloseable;
import it.unimi.dsi.big.util.StringMap;
import java.io.IOException;
/** Provides access to an inverted index.
*
* An {@link it.unimi.dsi.big.mg4j.index.Index} contains global read-only metadata. To get actual data
* from an index, you need to get an index reader via a call to {@link Index#getReader()}. Once
* you have an index reader, you can ask for the {@linkplain #documents(CharSequence) documents matching a term}.
*
* <p>Alternatively, you can perform a read-once scan of the index by calling {@link #nextIterator()},
* which will return in order the {@linkplain IndexIterator index iterators} of all terms of the underlying index.
* More generally, {@link #nextIterator()} returns an iterator positioned at the start of the inverted
* list of the term after the current one. When called just after the reader creation, it returns an
* index iterator for the first term.
*
* <p><strong>Warning:</strong> An index reader is exactly what it looks like: a reader. It
* cannot be used by many threads at the same time, and all its access methods are exclusive: if you
* obtain a {@linkplain #documents(long) document iterator}, the previous one is no longer valid. However,
* you can generate many readers, and use them concurrently.
*
* <p><strong>Warning:</strong> Invoking the {@link it.unimi.dsi.big.mg4j.search.DocumentIterator#dispose()} method
* on iterators returned by an instance of this class will invoke {@link #close()} on the instance, thus
* making the instance no longer accessible. This behaviour is necessary to handle cases in which a
* reader is created on-the-fly just to create an iterator.
*
* @author Paolo Boldi
* @author Sebastiano Vigna
* @since 1.0
*/
public interface IndexReader extends SafelyCloseable {
/** Returns an index iterator over the documents containing a term.
*
* <p>Note that the index iterator returned by this method will
* return <code>null</code> on a call to {@link IndexIterator#term() term()}.
*
* <p>Note that it is always possible
* to call this method with argument 0, even if the underlying index
* does not provide random access.
*
* @param termNumber the number of a term.
* @throws UnsupportedOperationException if this index reader is not accessible by term
* number.
*/
public IndexIterator documents( long termNumber ) throws IOException;
/** Returns an index iterator over the documents containing a term; the term is
* given explicitly.
*
* <p>Unless the {@linkplain Index#termProcessor term processor} of
* the associated index is <code>null</code>, words coming from a query will
* have to be processed before being used with this method.
*
* <p>Note that the index iterator returned by this method will
* return <code>term</code> on a call to {@link IndexIterator#term() term()}.
*
* @param term a term (the term will be downcased if the index is case insensitive).
* @throws UnsupportedOperationException if the {@linkplain StringMap term map} is not available for the underlying index.
*/
public IndexIterator documents( CharSequence term ) throws IOException;
/** Returns an {@link IndexIterator} on the term after the current one (optional operation).
*
* <p>Note that after creation there is no current term. Thus, the first call to this
* method will return an {@link IndexIterator} on the first term. As a consequence, repeated
* calls to this method provide a way to scan an index sequentially.
*
* @return the index iterator of the next term, or <code>null</code> if there are no more terms
* after the current one.
*/
public IndexIterator nextIterator() throws IOException;
}
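
The sequential-scan contract documented for `nextIterator()` (no current term right after creation, each call moving to the next term, `null` once the terms are exhausted) can be illustrated with a small self-contained sketch. It does not use MG4J: `ToyIterator` and `ToyReader` are hypothetical stand-ins that mimic only that documented behavior over an in-memory map. With the real library, the reader would presumably be obtained from `Index#getReader()` and closed when done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NextIteratorSketch {
    /** Hypothetical stand-in for IndexIterator: a term and its document pointers. */
    public static final class ToyIterator {
        final String term;
        final long[] documents;

        ToyIterator(String term, long[] documents) {
            this.term = term;
            this.documents = documents;
        }
    }

    /** Hypothetical stand-in for IndexReader, covering only nextIterator(). */
    public static final class ToyReader {
        private final List<Map.Entry<String, long[]>> terms;
        private int next = 0; // right after creation there is no current term

        public ToyReader(TreeMap<String, long[]> postings) {
            // TreeMap iteration order gives the terms in sorted order.
            this.terms = new ArrayList<>(postings.entrySet());
        }

        /** Returns an iterator for the term after the current one, or null at the end. */
        public ToyIterator nextIterator() {
            if (next >= terms.size()) return null;
            Map.Entry<String, long[]> entry = terms.get(next++);
            return new ToyIterator(entry.getKey(), entry.getValue());
        }
    }

    /** Scans every term once, in order, recording "term:frequency" for each. */
    public static List<String> scan(ToyReader reader) {
        List<String> summary = new ArrayList<>();
        ToyIterator iterator;
        // Repeated calls walk the whole index; null signals that no terms remain.
        while ((iterator = reader.nextIterator()) != null) {
            summary.add(iterator.term + ":" + iterator.documents.length);
        }
        return summary;
    }

    public static void main(String[] args) {
        TreeMap<String, long[]> postings = new TreeMap<>();
        postings.put("pear", new long[] { 3 });
        postings.put("apple", new long[] { 0, 2 });
        System.out.println(scan(new ToyReader(postings))); // prints [apple:2, pear:1]
    }
}
```

The loop keeps a single iterator variable on purpose: the documentation above states that obtaining a new iterator from a reader invalidates the previous one, so each iterator should be fully consumed before the next call.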