All Downloads are FREE. Search and download functionalities are using the official Maven repository.

org.archive.crawler.postprocessor.LinksScoper_en.utf8 Maven / Gradle / Ivy

The newest version!
description:
LinksScoper. Rules on which extracted links are within configured scope.

preference-depth-hops-description:
Number of hops (of any sort) from a seed up to which a URI has higher 
priority scheduling than any remaining seed. For example, if set to 1 
items one hop (link, embed, redirect, etc.) away from a seed will be 
scheduled with HIGH priority. If set to -1, no preferencing will occur, 
and a breadth-first search with seeds processed before discovered links 
will proceed. If set to zero, a purely depth-first search will proceed, 
with all discovered links processed before remaining seeds. Seed 
redirects are treated as one hop from a seed. 


log-rejects-rules-description:
DecideRules applied to URIs that have been rejected by normal scoping. 
If the rules return ACCEPT, the URI is logged (if the logging level is 
INFO). Depends on override-logger being enabled. 


seed-redirects-new-seeds-description:
If enabled, any URL found because a seed redirected to it (original seed 
returned 301 or 302), will also be treated as a seed. 






© 2015 - 2024 Weber Informatics LLC | Privacy Policy