org.archive.modules.BeanShellProcessor_en.utf8
This project (heritrix-modules) contains some of the configurable modules
that the Heritrix application uses to crawl the web. The modules can also
be used in applications other than Heritrix.
description:
BeanShellProcessor. Runs the BeanShell script source (supplied directly or via
a file path) against the current URI. Source should define a script method
'process(curi)' which will be passed the current CrawlURI. The script may also
access this BeanShellProcessor via the 'self' variable and the CrawlController
via the 'controller' variable.
isolate-threads-description:
Whether each ToeThread should get its own independent script context, or
whether all threads should share synchronized access to one context. Default
is true, meaning each thread gets its own isolated context.
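
The difference between the two settings can be illustrated with a minimal Java sketch. It is not Heritrix code: the class `ScriptContextSketch`, its nested `Context` class, and the method `seenByCaller` are hypothetical names introduced here, and a plain map stands in for the script interpreter's variables. The sketch assumes only what the description states: one context per thread when isolation is on, one synchronized shared context when it is off.

```java
import java.util.HashMap;
import java.util.Map;

public class ScriptContextSketch {
    // Hypothetical stand-in for a script interpreter's variable namespace.
    static final class Context {
        private final Map<String, Object> vars = new HashMap<>();
        synchronized void set(String name, Object value) { vars.put(name, value); }
        synchronized Object get(String name) { return vars.get(name); }
    }

    private final boolean isolateThreads;
    // Used when isolateThreads is false: every thread sees the same context.
    private final Context shared = new Context();
    // Used when isolateThreads is true: each thread lazily gets its own context.
    private final ThreadLocal<Context> perThread = ThreadLocal.withInitial(Context::new);

    ScriptContextSketch(boolean isolateThreads) {
        this.isolateThreads = isolateThreads;
    }

    Context context() {
        return isolateThreads ? perThread.get() : shared;
    }

    // Sets the variable "seen" from a worker thread, then returns the value
    // of "seen" as observed by the calling thread.
    static Object seenByCaller(boolean isolateThreads) {
        ScriptContextSketch sketch = new ScriptContextSketch(isolateThreads);
        Thread worker = new Thread(() -> sketch.context().set("seen", "worker"));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sketch.context().get("seen");
    }

    public static void main(String[] args) {
        System.out.println("isolated: " + seenByCaller(true));   // isolated: null
        System.out.println("shared:   " + seenByCaller(false));  // shared:   worker
    }
}
```

With isolation on, a variable set by one thread is invisible to the others, so a script cannot accumulate state across threads. With isolation off, state is shared, at the cost of threads contending for the one context.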
script-file-description:
Path to the BeanShell script file.
manager-description:
The SheetManager used to configure this crawl. Can be used to look up any
other module based on its settings path. The value you specify here will be
made available to the BeanShell script as the variable "manager".