Download boilerpipe JAR 1.1.0 with all dependencies


The boilerpipe library provides algorithms to detect and remove the surplus "clutter" (boilerplate, templates) around the main textual content of a web page. It ships with ready-made strategies for common tasks (for example, news article extraction) and can be extended for individual problem settings. Extraction is fast (on the order of milliseconds), needs only the input document (no global or site-level information), and is usually quite accurate. Boilerpipe is a Java library written by Christian Kohlschütter and released under the Apache License 2.0. Its algorithms are based on, and extend, concepts from the paper "Boilerplate Detection using Shallow Text Features" by Christian Kohlschütter et al., presented at WSDM 2010, the Third ACM International Conference on Web Search and Data Mining, New York City, NY, USA.
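To give a rough feel for what "shallow text features" can mean, here is a minimal, self-contained Java sketch that keeps or drops text blocks using only two per-block measurements: word count and link density. It is not boilerpipe's code or its actual rules. The class name `ShallowFeatureSketch`, the `Block` type, and both thresholds are hypothetical and chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: NOT boilerpipe's implementation or API. */
public class ShallowFeatureSketch {

    // Hypothetical thresholds, picked for the example rather than taken from the library or the paper.
    static final int MIN_WORDS = 10;
    static final double MAX_LINK_DENSITY = 0.33;

    /** A block of page text plus the number of its words that appear inside links. */
    static final class Block {
        final String text;
        final int linkedWords;

        Block(String text, int linkedWords) {
            this.text = text;
            this.linkedWords = linkedWords;
        }
    }

    /** Counts whitespace-separated tokens; blank text counts as zero words. */
    static int wordCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Fraction of the block's words that are link text (0.0 for an empty block). */
    static double linkDensity(Block block) {
        int words = wordCount(block.text);
        return words == 0 ? 0.0 : (double) block.linkedWords / words;
    }

    /** Treats long blocks with few links as content and everything else as boilerplate. */
    static boolean looksLikeContent(Block block) {
        return wordCount(block.text) >= MIN_WORDS
                && linkDensity(block) <= MAX_LINK_DENSITY;
    }

    /** Returns the text of every block classified as content, in document order. */
    static List<String> extract(List<Block> blocks) {
        List<String> kept = new ArrayList<>();
        for (Block block : blocks) {
            if (looksLikeContent(block)) {
                kept.add(block.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Block> page = Arrays.asList(
                new Block("Home News Sports", 3), // navigation bar: short, all links
                new Block("The council approved the new budget on Tuesday "
                        + "after a long public debate", 1), // article text
                new Block("Privacy Policy", 2)); // footer: short, all links
        for (String text : extract(page)) {
            System.out.println(text);
        }
    }
}
```

Both features are computed from a single block of a single document, which matches the description above: no global or site-level information is needed to make the decision.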

Files of the artifact boilerpipe version 1.1.0 from the group de.l3s.boilerpipe.

Download boilerpipe (1.1.0)
Artifact: boilerpipe
Group: de.l3s.boilerpipe
Version: 1.1.0
Last update: 3 November 2010
Organization: not specified
URL: http://code.google.com/p/boilerpipe/
License: Apache License 2.0
Number of dependencies: 0
Dependencies: none
There may be transitive dependencies.
The newest version!

Source code of boilerpipe version 1.1.0

META-INF
  MANIFEST.MF
de.l3s.boilerpipe
  BoilerpipeExtractor
  BoilerpipeFilter
  BoilerpipeInput
  BoilerpipeProcessingException
  package.html
de.l3s.boilerpipe.conditions
  TextBlockCondition
de.l3s.boilerpipe.demo
  HTMLHighlightDemo
  Oneliner
  UsingSAX
  package.html
de.l3s.boilerpipe.document
  TextBlock
  TextDocument
  TextDocumentStatistics
  package.html
de.l3s.boilerpipe.estimators
  SimpleEstimator
de.l3s.boilerpipe.extractors
  ArticleExtractor
  ArticleSentencesExtractor
  CanolaExtractor
  CommonExtractors
  DefaultExtractor
  ExtractorBase
  KeepEverythingExtractor
  KeepEverythingWithMinKWordsExtractor
  LargestContentExtractor
  NumWordsRulesExtractor
  package.html
de.l3s.boilerpipe.filters.english
  DensityRulesClassifier
  HeuristicFilterBase
  IgnoreBlocksAfterContentFilter
  KeepLargestFulltextBlockFilter
  MinFulltextWordsFilter
  NumWordsRulesClassifier
  TerminatingBlocksFinder
  package.html
de.l3s.boilerpipe.filters.heuristics
  BlockProximityFusion
  DocumentTitleMatchClassifier
  ExpandTitleToContentFilter
  KeepLargestBlockFilter
  SimpleBlockFusionProcessor
  package.html
de.l3s.boilerpipe.filters.simple
  BoilerplateBlockFilter
  InvertedFilter
  LabelToBoilerplateFilter
  LabelToContentFilter
  MarkEverythingContentFilter
  MinClauseWordsFilter
  MinWordsFilter
  SplitParagraphBlocksFilter
  package.html
de.l3s.boilerpipe.labels
  ConditionalLabelAction
  DefaultLabels
  LabelAction
de.l3s.boilerpipe.sax
  BoilerpipeHTMLContentHandler
  BoilerpipeHTMLParser
  BoilerpipeSAXInput
  CommonTagActions
  DefaultTagActionMap
  HTMLDocument
  HTMLFetcher
  HTMLHighlighter
  InputSourceable
  TagAction
  TagActionMap
  package.html
de.l3s.boilerpipe.util
  UnicodeTokenizer
  package.html
org.cyberneko.html
  HTMLElements
  HTMLTagBalancer