MG4J (Managing Gigabytes for Java) is a free full-text search engine for large document collections written in Java.

MG4J: Managing Gigabytes for Java

Visitors for composite {@linkplain it.unimi.di.mg4j.search.DocumentIterator document iterators}.

Composites and visitors

A {@link it.unimi.di.mg4j.search.DocumentIterator} (in particular, any of those provided by MG4J in the package {@link it.unimi.di.mg4j.search}) is usually structured as a composite, with operators as internal nodes and {@link it.unimi.di.mg4j.index.IndexIterator}s as leaves. A composite can be explored using a visitor: to this end, the {@link it.unimi.di.mg4j.search.DocumentIterator} interface provides two methods, {@link it.unimi.di.mg4j.search.DocumentIterator#accept(it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor) accept(DocumentIteratorVisitor)} and {@link it.unimi.di.mg4j.search.DocumentIterator#acceptOnTruePaths(it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor) acceptOnTruePaths(DocumentIteratorVisitor)}, that let a {@link it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor} visit the composite structure.

A {@link it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor} provides methods for visiting all internal nodes in {@linkplain it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor#visitPre(it.unimi.di.mg4j.search.DocumentIterator) preorder} and in {@linkplain it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor#visitPost(it.unimi.di.mg4j.search.DocumentIterator,Object[]) postorder}. Leaves have two visit methods, {@link it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor#visit(it.unimi.di.mg4j.index.IndexIterator)} and {@link it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor#visit(it.unimi.di.mg4j.index.MultiTermIndexIterator)}.
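
The interplay between the composite and the preorder, postorder and leaf visits can be illustrated with a small self-contained model. This is only a sketch: the types Node, Leaf, Operator, Visitor and ToStringVisitor are hypothetical stand-ins invented for the example, they are not MG4J classes, and the signatures are deliberately simplified (for instance, a single leaf visit method and a List instead of an array of results).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: every type here is a hypothetical stand-in, not an MG4J class. */
class CompositeVisitSketch {
    /** Simplified visitor: one preorder hook, one postorder hook, one leaf hook. */
    interface Visitor<T> {
        boolean visitPre(Operator operator);              // before the children; false aborts the visit
        T visitPost(Operator operator, List<T> results);  // after the children, with their results
        T visit(Leaf leaf);                               // on a leaf
    }

    interface Node {
        <T> T accept(Visitor<T> visitor);
    }

    static final class Leaf implements Node {
        final String term;
        Leaf(String term) { this.term = term; }
        @Override public <T> T accept(Visitor<T> visitor) { return visitor.visit(this); }
    }

    static final class Operator implements Node {
        final String name;
        final List<Node> children;
        Operator(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
        @Override public <T> T accept(Visitor<T> visitor) {
            if (!visitor.visitPre(this)) return null;
            List<T> results = new ArrayList<>();
            for (Node child : children) {
                T result = child.accept(visitor);
                if (result == null) return null;  // propagate an aborted visit
                results.add(result);
            }
            return visitor.visitPost(this, results);
        }
    }

    /** Rebuilds a textual form of the query bottom-up from the postorder results. */
    static final class ToStringVisitor implements Visitor<String> {
        @Override public boolean visitPre(Operator operator) { return true; }
        @Override public String visitPost(Operator operator, List<String> results) {
            return operator.name + "(" + String.join(", ", results) + ")";
        }
        @Override public String visit(Leaf leaf) { return leaf.term; }
    }

    public static void main(String[] args) {
        Node query = new Operator("AND", new Leaf("java"),
                new Operator("OR", new Leaf("search"), new Leaf("index")));
        System.out.println(query.accept(new ToStringVisitor()));  // prints AND(java, OR(search, index))
    }
}
```

In this sketch the value returned by the postorder visit of an internal node is built from the values returned by its children, which is why the postorder method receives the sub-node results.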

Note that a {@link it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor} must be (re)usable after each call to {@link it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor#prepare() prepare()}.

The abstract class {@link it.unimi.di.mg4j.search.visitor.AbstractDocumentIteratorVisitor} provides stubs implementing internal visits and {@link it.unimi.di.mg4j.search.visitor.DocumentIteratorVisitor#prepare() prepare()} as no-ops for visitors that do not return values.
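
The idea of an abstract base class with no-op stubs, combined with the requirement that a visitor be reusable after prepare(), might look roughly as follows. All names (Visitor, AbstractVisitor, LeafCounter and the node types) are hypothetical and the signatures are simplified; the sketch only mirrors the pattern and does not reproduce the actual MG4J classes.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model only: every type here is a hypothetical stand-in, not an MG4J class. */
class AbstractVisitorSketch {
    interface Visitor {
        Visitor prepare();                    // must leave the visitor ready for a new visit
        boolean visitPre(Operator operator);  // before the children
        void visitPost(Operator operator);    // after the children
        void visit(Leaf leaf);                // on a leaf
    }

    /** No-op stubs for prepare() and the internal visits; only the leaf visit is left abstract. */
    abstract static class AbstractVisitor implements Visitor {
        @Override public Visitor prepare() { return this; }
        @Override public boolean visitPre(Operator operator) { return true; }
        @Override public void visitPost(Operator operator) { }
    }

    interface Node {
        void accept(Visitor visitor);
    }

    static final class Leaf implements Node {
        final String term;
        Leaf(String term) { this.term = term; }
        @Override public void accept(Visitor visitor) { visitor.visit(this); }
    }

    static final class Operator implements Node {
        final List<Node> children;
        Operator(Node... children) { this.children = Arrays.asList(children); }
        @Override public void accept(Visitor visitor) {
            if (!visitor.visitPre(this)) return;
            for (Node child : children) child.accept(visitor);
            visitor.visitPost(this);
        }
    }

    /** A stateful visitor: it overrides prepare() to reset its state between visits. */
    static final class LeafCounter extends AbstractVisitor {
        int leaves;
        @Override public LeafCounter prepare() { leaves = 0; return this; }
        @Override public void visit(Leaf leaf) { leaves++; }
    }

    public static void main(String[] args) {
        Node tree = new Operator(new Leaf("a"), new Operator(new Leaf("b"), new Leaf("c")));
        LeafCounter counter = new LeafCounter();
        tree.accept(counter.prepare());
        System.out.println(counter.leaves);  // prints 3
        tree.accept(counter.prepare());      // reuse after prepare(): the count starts again from zero
        System.out.println(counter.leaves);  // prints 3
    }
}
```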

Computing true terms

A simple example of a visitor is {@link it.unimi.di.mg4j.search.visitor.TrueTermsCollectionVisitor}, which just collects all terms that make a query true.
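
As a rough illustration of collecting true terms, the sketch below models a document as a set of terms and visits only the branches that hold for that document. It assumes that a "true path" is a branch of the composite that matches the current document; the types Query, Term, And, Or and TrueTermsCollector are hypothetical and are not part of MG4J.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: every type here is a hypothetical stand-in, not an MG4J class. */
class TrueTermsSketch {
    interface Visitor {
        Visitor prepare();      // reset state so that the visitor can be reused
        void visit(Term term);  // leaf visit
    }

    interface Query {
        boolean matches(Set<String> document);
        /** Visits only the subtrees that are true for the given document. */
        void acceptOnTruePaths(Visitor visitor, Set<String> document);
    }

    static final class Term implements Query {
        final String term;
        Term(String term) { this.term = term; }
        @Override public boolean matches(Set<String> document) { return document.contains(term); }
        @Override public void acceptOnTruePaths(Visitor visitor, Set<String> document) {
            if (matches(document)) visitor.visit(this);
        }
    }

    static final class And implements Query {
        final List<Query> children;
        And(Query... children) { this.children = Arrays.asList(children); }
        @Override public boolean matches(Set<String> document) {
            return children.stream().allMatch(c -> c.matches(document));
        }
        @Override public void acceptOnTruePaths(Visitor visitor, Set<String> document) {
            if (!matches(document)) return;
            for (Query child : children) child.acceptOnTruePaths(visitor, document);
        }
    }

    static final class Or implements Query {
        final List<Query> children;
        Or(Query... children) { this.children = Arrays.asList(children); }
        @Override public boolean matches(Set<String> document) {
            return children.stream().anyMatch(c -> c.matches(document));
        }
        @Override public void acceptOnTruePaths(Visitor visitor, Set<String> document) {
            if (!matches(document)) return;
            // Each child checks for itself whether it is true, so false branches are skipped.
            for (Query child : children) child.acceptOnTruePaths(visitor, document);
        }
    }

    /** Collects, in sorted order, the terms found on true paths. */
    static final class TrueTermsCollector implements Visitor {
        final Set<String> terms = new TreeSet<>();
        @Override public TrueTermsCollector prepare() { terms.clear(); return this; }
        @Override public void visit(Term term) { terms.add(term.term); }
    }

    public static void main(String[] args) {
        Query query = new And(new Term("java"), new Or(new Term("search"), new Term("index")));
        Set<String> document = new HashSet<>(Arrays.asList("java", "index", "engine"));
        TrueTermsCollector collector = new TrueTermsCollector();
        query.acceptOnTruePaths(collector.prepare(), document);
        System.out.println(collector.terms);  // prints [index, java]
    }
}
```

Note that "search" is not collected: it appears in the query, but the branch containing it is false for this document.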

Counting term occurrences

Term counting is another example of the utility of visitors for document iterators: using a number of coordinated visitors, it is possible to compute a count for each term appearing in a query, no matter how complex the query is. The counts can be used as input for counting-based scoring schemes, such as BM25 or cosine-based measures. For more information, see the documentation of {@link it.unimi.di.mg4j.search.visitor.CounterCollectionVisitor}.
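
The general idea of per-term counting can be sketched with a single visitor over a toy composite. This is a strong simplification and an assumption of the example: it uses one visitor rather than several coordinated ones, models a document as a list of tokens, and records for each matching term its number of occurrences in that document. The types Query, Term, Operator and CountCollector are hypothetical and are not MG4J classes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: hypothetical stand-ins, not MG4J's classes or its coordinated visitors. */
class TermCountSketch {
    interface Visitor {
        Visitor prepare();                  // reset state so that the visitor can be reused
        void visit(Term term, int count);   // leaf visit, with the term's count in the document
    }

    interface Query {
        boolean matches(List<String> tokens);
        void acceptOnTruePaths(Visitor visitor, List<String> tokens);
    }

    static final class Term implements Query {
        final String term;
        Term(String term) { this.term = term; }
        @Override public boolean matches(List<String> tokens) { return tokens.contains(term); }
        @Override public void acceptOnTruePaths(Visitor visitor, List<String> tokens) {
            if (matches(tokens)) visitor.visit(this, Collections.frequency(tokens, term));
        }
    }

    /** An internal node: conjunctive (AND) or disjunctive (OR). */
    static final class Operator implements Query {
        final boolean conjunctive;
        final List<Query> children;
        Operator(boolean conjunctive, Query... children) {
            this.conjunctive = conjunctive;
            this.children = Arrays.asList(children);
        }
        @Override public boolean matches(List<String> tokens) {
            return conjunctive
                    ? children.stream().allMatch(c -> c.matches(tokens))
                    : children.stream().anyMatch(c -> c.matches(tokens));
        }
        @Override public void acceptOnTruePaths(Visitor visitor, List<String> tokens) {
            if (!matches(tokens)) return;
            for (Query child : children) child.acceptOnTruePaths(visitor, tokens);
        }
    }

    /** Stores one count per term; a term repeated in the query is recorded once, not summed. */
    static final class CountCollector implements Visitor {
        final Map<String, Integer> counts = new TreeMap<>();
        @Override public CountCollector prepare() { counts.clear(); return this; }
        @Override public void visit(Term term, int count) { counts.put(term.term, count); }
    }

    public static void main(String[] args) {
        Query query = new Operator(true, new Term("java"),
                new Operator(false, new Term("search"), new Term("index")));
        List<String> tokens = Arrays.asList("java", "search", "java", "engine", "java");
        CountCollector collector = new CountCollector();
        query.acceptOnTruePaths(collector.prepare(), tokens);
        System.out.println(collector.counts);  // prints {java=3, search=1}
    }
}
```

A scorer could then read the resulting map as the per-term frequencies it needs; how such counts are actually gathered and exposed in MG4J is described in the class documentation referenced above.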




