

package edu.byu.hbll.box;

/**
 * Harvests documents or at least the document ids from a source system. Given an initial state, the
 * client will load documents from a source system paging through the results until an end is
 * reached. The client should return the resulting documents after each page.
 *
 * <p>If paging is not possible and streaming is necessary, the harvester may use {@link
 * Source#save(java.util.Collection)} to stream in the resulting documents rather than returning
 * them in the {@link HarvestResult}.
 *
 * <p>For Box's purposes, there will only be one instance of the harvester per source and only one
 * thread will call harvest at a time. So a harvester does not need to be thread safe. Also for
 * performance or other reasons, state may be maintained for subsequent calls of {@link
 * #harvest(HarvestContext)}. In order to pick back up where left off in the case of application
 * redeployments, a "cursor" will be saved to a database and offered back to the client letting the
 * client know where it left off.
 *
 * <p>The harvester can also be responsible for just gathering ids to be processed by the processor
 * rather than documents to be saved. This is done by returning a list of unprocessed documents
 * where only the id is set.
 *
 * @author Charles Draper
 */
public interface Harvester extends BoxConfigurable {

  /**
   * Returns the next set or page of documents from the source system. The client knows what the
   * "next" set is by observing the cursor object inside the context and determining which documents
   * to return next. The cursor is an object that is created by the client and returned as part of
   * the {@link HarvestResult} after each set.
   *
   * @param context context informing the harvester what to do next
   * @return a result containing resulting documents and other information for Box
   */
  HarvestResult harvest(HarvestContext context);
}
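
The cursor contract described in the Javadoc can be illustrated with a small, self-contained sketch. Everything in it is a hypothetical stand-in written for illustration: `PageContext`, `PageResult`, `PagingHarvester`, `ListHarvester`, and `HarvestLoop` are not the real Box types, and the `more` flag and the string-encoded offset cursor are assumptions, not the actual shape of `HarvestContext` or `HarvestResult`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a harvest context: carries the last saved cursor (null on first run).
final class PageContext {
  final String cursor;

  PageContext(String cursor) {
    this.cursor = cursor;
  }
}

// Hypothetical stand-in for a harvest result: one page of ids, the next cursor, and a "more" flag.
final class PageResult {
  final List<String> ids;
  final String cursor;
  final boolean more;

  PageResult(List<String> ids, String cursor, boolean more) {
    this.ids = ids;
    this.cursor = cursor;
    this.more = more;
  }
}

// Hypothetical single-method interface mirroring the shape of harvest(context).
interface PagingHarvester {
  PageResult harvest(PageContext context);
}

// Pages through an in-memory list; the cursor is simply the next offset encoded as a string.
final class ListHarvester implements PagingHarvester {
  private final List<String> source;
  private final int pageSize;

  ListHarvester(List<String> source, int pageSize) {
    this.source = source;
    this.pageSize = pageSize;
  }

  @Override
  public PageResult harvest(PageContext context) {
    // Decide what "next" means by inspecting the cursor handed back by the caller.
    int requested = context.cursor == null ? 0 : Integer.parseInt(context.cursor);
    int start = Math.max(0, Math.min(requested, source.size()));
    int end = Math.min(start + pageSize, source.size());
    List<String> page = new ArrayList<>(source.subList(start, end));
    return new PageResult(page, Integer.toString(end), end < source.size());
  }
}

// Drives a harvester from a saved cursor until it reports that no more pages remain.
final class HarvestLoop {
  static List<String> harvestAll(PagingHarvester harvester, String savedCursor) {
    List<String> all = new ArrayList<>();
    String cursor = savedCursor;
    boolean more = true;
    while (more) {
      PageResult result = harvester.harvest(new PageContext(cursor));
      all.addAll(result.ids);
      cursor = result.cursor; // a real system would persist this between calls
      more = result.more;
    }
    return all;
  }
}
```

Because the harvester derives its position only from the cursor it is given, a caller that stored the cursor after an earlier page can resume from that point after a restart, for example with `HarvestLoop.harvestAll(harvester, "2")`.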




