All Downloads are FREE. Search and download functionalities are using the official Maven repository.

org.archive.crawler.processor.HashCrawlMapper_en.utf8 Maven / Gradle / Ivy

The newest version!
description:
HashCrawlMapper. Maps URIs to a numerically named crawler by hashing the URI's 
(possibly transfored) classKey to one of the specified number of buckets.


crawler-count-description:
Number of crawlers among which to split up the URIs. Their names are 
assumed to be 0..N-1. 


reduce-prefix-regex-description:
A regex pattern to apply to the classKey, using the first match as the 
mapping key. If empty (the default), use the full classKey. 


use-publicsuffixes-regex-description:
Whether to use a built-in regular expression, built from
the 'public suffix' list at publicsuffix.org, for
reducing classKeys to mapping keys. If true, the default,
then the reduce-prefix-pattern setting is ignored.


frontier-description:
The frontier.




© 2015 - 2024 Weber Informatics LLC | Privacy Policy