Download all versions of SiteCrawler JAR files with all dependencies
SiteCrawler from group io.github.jasperroel (version 1.0.0)
This project provides a simple web crawler with retry capabilities and the ability to distinguish between http and https sites.
Its biggest feature is support for plugins (CrawlerActions), which let you hook your own scripts into the crawling process.
It also allows setting "blocked" URLs. Those URLs or patterns will not be crawled.
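The idea of blocked URLs or patterns can be illustrated with a minimal sketch. This is not SiteCrawler's own API: the class name BlockedUrlFilter, its constructor, and the example patterns are hypothetical, and the sketch assumes that blocked entries are treated as regular expressions matched anywhere in the URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of SiteCrawler: decides whether a crawler should skip a URL.
class BlockedUrlFilter {
    private final List<Pattern> blocked = new ArrayList<>();

    BlockedUrlFilter(List<String> regexes) {
        // Compile each pattern once so that repeated checks stay cheap.
        for (String regex : regexes) {
            blocked.add(Pattern.compile(regex));
        }
    }

    // Returns true when any blocked pattern occurs somewhere in the URL.
    boolean isBlocked(String url) {
        for (Pattern p : blocked) {
            if (p.matcher(url).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Example patterns are assumptions chosen for illustration.
        BlockedUrlFilter filter = new BlockedUrlFilter(List.of("/logout", "\\.pdf$"));
        System.out.println(filter.isBlocked("https://example.com/logout"));
        System.out.println(filter.isBlocked("https://example.com/docs/index.html"));
    }
}
```

A crawler built along these lines would call isBlocked before fetching each discovered link and skip the link when it returns true.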
Artifact SiteCrawler
Group io.github.jasperroel
Version 1.0.0
Last update 30. July 2018
Tags: allow plugins simple crawleractions biggest webcrawler setting scripts allows feature blocked crawling crawled between https sites functionality hook those with patterns urls your into provides process that project which http will distinguish capabilities retry this also
Organization Salesforce.com
URL https://github.com/forcedotcom/SiteCrawler
License The BSD 2-Clause License
Dependencies amount 3
Dependencies jcl-over-slf4j, htmlunit, commons-lang
There may be transitive dependencies!