org.archive.crawler.prefetch.Preselector_en.utf8
description:
Preselector. Performs a final check that the current URI should be
fetched.
allow-by-regexp-description:
Allow only URIs matching the regular expression to be processed.
block-all-description:
Block all URIs from being processed. This is most useful in overrides,
where it offers an easy way to reject particular hosts.
block-by-regexp-description:
Block all URIs matching the regular expression from being processed.
recheck-scope-description:
Recheck whether the URI is in scope. This is meaningful only if the scope
is altered during a crawl. URIs are checked against the scope when they are
added to queues. Setting this value to true forces each URI to be checked
against the scope again when it comes out of the queue, by which time the
scope may have been altered.
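
The four settings above can be read as one decision made per URI. The following is a minimal sketch of that decision, not the Heritrix implementation. The class name PreselectorSketch, the Decision enum, the field names, and the scope predicate are all hypothetical. The order of the checks, the treatment of an empty expression as "not set", and the use of a whole-string regular expression match are assumptions.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

/** Illustrative only: models the settings described above, not Heritrix code. */
public class PreselectorSketch {

    /** Hypothetical outcome labels; these are not Heritrix fetch status codes. */
    enum Decision { PROCEED, BLOCKED_BY_USER, OUT_OF_SCOPE }

    // Hypothetical fields standing in for the four settings.
    boolean blockAll = false;          // block-all
    String allowByRegexp = "";         // allow-by-regexp (empty means "not set", an assumption)
    String blockByRegexp = "";         // block-by-regexp (empty means "not set", an assumption)
    boolean recheckScope = false;      // recheck-scope

    // Stand-in for the crawl scope; by default every URI is in scope.
    Predicate<String> scope = uri -> true;

    Decision check(String uri) {
        // block-all: reject everything, typically set in a per-host override.
        if (blockAll) {
            return Decision.BLOCKED_BY_USER;
        }
        // allow-by-regexp: only URIs matching the expression may continue.
        if (!allowByRegexp.isEmpty() && !Pattern.matches(allowByRegexp, uri)) {
            return Decision.BLOCKED_BY_USER;
        }
        // block-by-regexp: URIs matching the expression are rejected.
        if (!blockByRegexp.isEmpty() && Pattern.matches(blockByRegexp, uri)) {
            return Decision.BLOCKED_BY_USER;
        }
        // recheck-scope: test the URI against the (possibly altered) scope again.
        if (recheckScope && !scope.test(uri)) {
            return Decision.OUT_OF_SCOPE;
        }
        return Decision.PROCEED;
    }

    public static void main(String[] args) {
        PreselectorSketch p = new PreselectorSketch();
        p.blockByRegexp = ".*\\.iso";
        System.out.println(p.check("http://example.org/index.html"));
        System.out.println(p.check("http://example.org/disk.iso"));
    }
}
```

In this sketch a URI that was in scope when queued is rejected at dequeue time only if recheckScope is true and the scope predicate has since been replaced with a narrower one. With recheckScope left false, a scope change does not affect URIs already in the queues.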