Download JAR files tagged by webcrawler with all dependencies


mule-web-crawler-connector from group cloud.anypoint (version 0.1.0)

The MAC WebCrawler Connector enables a Mule application to crawl websites and retrieve content, potentially for creating vector embeddings for structured knowledge extraction.

Group: cloud.anypoint Artifact: mule-web-crawler-connector
No JAR file has been uploaded for this version, so a download is not possible.
0 downloads
Artifact mule-web-crawler-connector
Group cloud.anypoint
Version 0.1.0
Last update 15. November 2024
Organization not specified
URL https://mac-project.ai/docs/mac-webcrawler/connector-overview
License MIT License
Dependencies amount 2
Dependencies jsoup, jackson-databind
There may be transitive dependencies.

SiteCrawler from group io.github.jasperroel (version 1.0.0)

This project provides a simple web crawler with retry capabilities and the ability to distinguish between http and https sites. Its biggest feature is support for plugins (CrawlerActions), which let you hook your own scripts into the crawling process. It also allows setting "blocked" URLs; those URLs or patterns will not be crawled.

Group: io.github.jasperroel Artifact: SiteCrawler
0 downloads
Artifact SiteCrawler
Group io.github.jasperroel
Version 1.0.0
Last update 30. July 2018
Organization Salesforce.com
URL https://github.com/forcedotcom/SiteCrawler
License The BSD 2-Clause License
Dependencies amount 3
Dependencies jcl-over-slf4j, htmlunit, commons-lang
There may be transitive dependencies.
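
As a minimal sketch, the coordinates listed above (group io.github.jasperroel, artifact SiteCrawler, version 1.0.0) could be declared in a Maven pom.xml as shown below. This assumes the artifact resolves from the repository your build is configured to use; the listing itself does not confirm that.

```xml
<!-- Coordinates copied from the listing above. That they resolve in your
     configured repository is an assumption, not something the listing states. -->
<dependency>
  <groupId>io.github.jasperroel</groupId>
  <artifactId>SiteCrawler</artifactId>
  <version>1.0.0</version>
</dependency>
```

With a declaration like this, Maven would normally pull in the listed dependencies (jcl-over-slf4j, htmlunit, commons-lang) and any transitive ones automatically.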





© 2015 - 2024 Weber Informatics LLC