Uses of Interface
org.apache.nutch.net.URLFilter
Packages that use URLFilter

Package: org.apache.nutch.collection
Description: Subcollection is a subset of an index.

Package: org.apache.nutch.net
Description: Web-related interfaces: URL filters and normalizers.

Package: org.apache.nutch.urlfilter.api
Description: Generic URL filter library, abstracting away from regular expression implementations.

Package: org.apache.nutch.urlfilter.automaton
Description: URL filter plugin based on dk.brics.automaton Finite-State Automata for Java™.

Package: org.apache.nutch.urlfilter.domain
Description: URL filter plugin to include only URLs which match an element in a given list of domain suffixes, domain names, and/or host names.

Package: org.apache.nutch.urlfilter.domaindenylist
Description: URL filter plugin to exclude URLs by domain suffixes, domain names, and/or host names.

Package: org.apache.nutch.urlfilter.fast
Description: URL filter plugin that first does fast exact suffix matches on host/domain names before applying regular expressions to the path component of a URL.

Package: org.apache.nutch.urlfilter.ignoreexempt
Description: URL filter plugin which identifies exemptions to external URLs when external URLs are set to be ignored.

Package: org.apache.nutch.urlfilter.prefix
Description: URL filter plugin to include only URLs which match one of a given list of URL prefixes.

Package: org.apache.nutch.urlfilter.regex
Description: URL filter plugin to include and/or exclude URLs matching Java regular expressions.

Package: org.apache.nutch.urlfilter.suffix
Description: URL filter plugin to either exclude or include only URLs which match one of the given (path) suffixes.

Package: org.apache.nutch.urlfilter.validator
Description: URL filter plugin that validates given URLs.

Uses of URLFilter in org.apache.nutch.collection
Classes in org.apache.nutch.collection that implement URLFilter
Modifier and Type: class
Class: Subcollection
Description: A Subcollection represents a subset of an index. You can define URL patterns that indicate that a particular page (URL) is part of the subcollection.

Uses of URLFilter in org.apache.nutch.net
Methods in org.apache.nutch.net that return URLFilter
Modifier and Type: URLFilter[]
Method: URLFilters.getFilters()

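As a rough illustration of how an array of filters such as the one returned by getFilters() might be applied, the sketch below passes a URL through each filter in turn and stops at the first rejection. It assumes the URLFilter contract in which filter(String) returns the URL to keep it and null to reject it. The SimpleFilter interface and the FilterChainSketch class are hypothetical stand-ins so the example compiles without Nutch on the classpath; they are not Nutch classes.

```java
import java.util.List;

class FilterChainSketch {
  // Hypothetical stand-in for the URLFilter contract:
  // return the URL to accept it, or null to reject it.
  interface SimpleFilter {
    String filter(String url);
  }

  // Applies each filter in order; the first null result rejects the URL.
  static String applyAll(List<SimpleFilter> filters, String url) {
    String current = url;
    for (SimpleFilter f : filters) {
      current = f.filter(current);
      if (current == null) {
        return null;
      }
    }
    return current;
  }

  public static void main(String[] args) {
    SimpleFilter httpOnly = u -> u.startsWith("http") ? u : null;
    SimpleFilter noPng = u -> u.endsWith(".png") ? null : u;
    List<SimpleFilter> filters = List.of(httpOnly, noPng);
    System.out.println(applyAll(filters, "https://example.org/index.html"));
    System.out.println(applyAll(filters, "https://example.org/logo.png"));
  }
}
```

Because a rejection short-circuits the loop, ordering cheap filters before expensive ones can reduce work in a chain like this.
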
Uses of URLFilter in org.apache.nutch.urlfilter.api
Classes in org.apache.nutch.urlfilter.api that implement URLFilter
Modifier and Type: class
Class: RegexURLFilterBase
Description: Generic URL filter based on regular expressions.

Uses of URLFilter in org.apache.nutch.urlfilter.automaton
Classes in org.apache.nutch.urlfilter.automaton that implement URLFilter
Modifier and Type: class
Class: AutomatonURLFilter
Description: RegexURLFilterBase implementation based on the dk.brics.automaton Finite-State Automata for Java™.

Uses of URLFilter in org.apache.nutch.urlfilter.domain
Classes in org.apache.nutch.urlfilter.domain that implement URLFilter
Modifier and Type: class
Class: DomainURLFilter
Description: Filters URLs based on a file containing domain suffixes, domain names, and hostnames.

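The matching idea behind a domain-based include filter can be sketched as follows. This is a minimal illustration, not the plugin's implementation: the DomainFilterSketch class and its in-memory allow set are hypothetical, and the sketch assumes that a URL is kept when its host name, or any parent domain or suffix obtained by dropping leading labels, appears in the set.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

class DomainFilterSketch {
  // Returns the URL if its host, a parent domain, or a domain suffix
  // is in the allow set; returns null otherwise.
  static String filter(Set<String> allowed, String url) {
    String host;
    try {
      host = URI.create(url).getHost();
    } catch (IllegalArgumentException e) {
      return null; // unparsable URL: reject
    }
    if (host == null) {
      return null;
    }
    String candidate = host.toLowerCase(Locale.ROOT);
    while (true) {
      if (allowed.contains(candidate)) {
        return url;
      }
      // Drop the leftmost label: www.apache.org -> apache.org -> org
      int dot = candidate.indexOf('.');
      if (dot < 0) {
        return null;
      }
      candidate = candidate.substring(dot + 1);
    }
  }

  public static void main(String[] args) {
    Set<String> allowed = Set.of("apache.org");
    System.out.println(filter(allowed, "https://nutch.apache.org/docs"));
    System.out.println(filter(allowed, "https://example.com/"));
  }
}
```

Matching on whole labels means that an entry such as apache.org does not accidentally admit a host like notapache.org. A deny-list variant would invert the return values.
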
Uses of URLFilter in org.apache.nutch.urlfilter.domaindenylist
Classes in org.apache.nutch.urlfilter.domaindenylist that implement URLFilter
Modifier and Type: class
Class: DomainDenylistURLFilter
Description: Filters URLs based on a file containing domain suffixes, domain names, and hostnames.

Uses of URLFilter in org.apache.nutch.urlfilter.fast
Classes in org.apache.nutch.urlfilter.fast that implement URLFilter
Modifier and Type: class
Class: FastURLFilter
Description: Filters URLs based on a file of regular expressions, matching on host and domain names first.

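The "host first, regular expression second" idea can be sketched as below. This is only an illustration under stated assumptions: the HostFirstFilterSketch class is hypothetical, and the rule model (a map from host or domain name to path patterns that reject a URL) is assumed for the example rather than taken from the plugin's rule file format.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

class HostFirstFilterSketch {
  // denyByDomain maps a host or domain name to path patterns that
  // reject a URL (an assumed rule model for this sketch).
  static String filter(Map<String, List<Pattern>> denyByDomain, String url) {
    URI uri;
    try {
      uri = URI.create(url);
    } catch (IllegalArgumentException e) {
      return null; // unparsable URL: reject
    }
    String host = uri.getHost();
    if (host == null) {
      return null;
    }
    String path = uri.getPath() == null ? "" : uri.getPath();
    String candidate = host.toLowerCase(Locale.ROOT);
    while (true) {
      // Cheap exact map lookup before any regular expression is evaluated.
      List<Pattern> patterns = denyByDomain.get(candidate);
      if (patterns != null) {
        for (Pattern p : patterns) {
          if (p.matcher(path).find()) {
            return null;
          }
        }
      }
      int dot = candidate.indexOf('.');
      if (dot < 0) {
        return url; // no rule rejected the URL
      }
      candidate = candidate.substring(dot + 1);
    }
  }

  public static void main(String[] args) {
    Map<String, List<Pattern>> rules =
        Map.of("example.org", List.of(Pattern.compile("^/private/")));
    System.out.println(filter(rules, "https://www.example.org/private/x"));
    System.out.println(filter(rules, "https://www.example.org/public/x"));
  }
}
```

URLs on hosts with no rules never reach the regular expression step, which is where the speed-up in this kind of design would come from.
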
Uses of URLFilter in org.apache.nutch.urlfilter.ignoreexempt
Classes in org.apache.nutch.urlfilter.ignoreexempt that implement URLFilter
Modifier and Type: class
Class: ExemptionUrlFilter
Description: This implementation of URLExemptionFilter uses regex configuration to check whether a URL is eligible for exemption from 'db.ignore.external'.

Uses of URLFilter in org.apache.nutch.urlfilter.prefix
Classes in org.apache.nutch.urlfilter.prefix that implement URLFilter
Modifier and Type: class
Class: PrefixURLFilter
Description: Filters URLs based on a file of URL prefixes.

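Prefix filtering can be illustrated with a few lines of Java. The PrefixFilterSketch class below is hypothetical and uses a plain linear scan over an in-memory list; a real filter reading many prefixes from a file would likely use a trie or similar structure instead.

```java
import java.util.List;

class PrefixFilterSketch {
  // Returns the URL if it starts with any listed prefix, null otherwise.
  static String filter(List<String> prefixes, String url) {
    for (String prefix : prefixes) {
      if (url.startsWith(prefix)) {
        return url;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    List<String> prefixes = List.of("https://nutch.apache.org/", "http://localhost/");
    System.out.println(filter(prefixes, "https://nutch.apache.org/download"));
    System.out.println(filter(prefixes, "https://example.com/"));
  }
}
```
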
Uses of URLFilter in org.apache.nutch.urlfilter.regex
Classes in org.apache.nutch.urlfilter.regex that implement URLFilter
Modifier and Type: class
Class: RegexURLFilter
Description: Filters URLs based on a file of regular expressions using the Java Regex implementation.

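A rule-based regular expression filter built on java.util.regex might look like the sketch below. It assumes a rule format in which each non-comment line starts with + (include) or - (exclude) followed by a regular expression, that the first matching rule decides, and that a URL matching no rule is rejected. The RegexRuleSketch class and its methods are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class RegexRuleSketch {
  // One parsed rule: a sign (accept or reject) and a compiled pattern.
  static final class Rule {
    final boolean accept;
    final Pattern pattern;

    Rule(boolean accept, Pattern pattern) {
      this.accept = accept;
      this.pattern = pattern;
    }
  }

  // Parses rule lines, skipping blank lines and lines starting with '#'.
  static List<Rule> parse(List<String> lines) {
    List<Rule> rules = new ArrayList<>();
    for (String raw : lines) {
      String line = raw.trim();
      if (line.isEmpty() || line.startsWith("#")) {
        continue;
      }
      char sign = line.charAt(0);
      if (sign != '+' && sign != '-') {
        throw new IllegalArgumentException("Rule must start with + or -: " + line);
      }
      rules.add(new Rule(sign == '+', Pattern.compile(line.substring(1))));
    }
    return rules;
  }

  // The first rule whose pattern is found in the URL decides; no match rejects.
  static String filter(List<Rule> rules, String url) {
    for (Rule rule : rules) {
      if (rule.pattern.matcher(url).find()) {
        return rule.accept ? url : null;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    List<Rule> rules = parse(List.of("# skip images", "-\\.(gif|jpg|png)$", "+^https?://"));
    System.out.println(filter(rules, "https://example.org/a.html"));
    System.out.println(filter(rules, "https://example.org/a.png"));
  }
}
```

With first-match-wins semantics, rule order matters: the exclusion for image suffixes has to come before the broad include rule to take effect.
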
Uses of URLFilter in org.apache.nutch.urlfilter.suffix
Classes in org.apache.nutch.urlfilter.suffix that implement URLFilter
Modifier and Type: class
Class: SuffixURLFilter
Description: Filters URLs based on a file of URL suffixes.

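Suffix filtering in either "exclude" or "include only" mode can be sketched as follows. The SuffixFilterSketch class is hypothetical, and the choices to match case-insensitively and against the URL path only (ignoring any query string) are assumptions made for this example.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;

class SuffixFilterSketch {
  // includeMode = true:  keep only URLs whose path ends with a listed suffix.
  // includeMode = false: drop URLs whose path ends with a listed suffix.
  static String filter(List<String> suffixes, boolean includeMode, String url) {
    String path;
    try {
      path = URI.create(url).getPath();
    } catch (IllegalArgumentException e) {
      return null; // unparsable URL: reject
    }
    if (path == null) {
      path = "";
    }
    String lowerPath = path.toLowerCase(Locale.ROOT);
    boolean matches = false;
    for (String suffix : suffixes) {
      if (lowerPath.endsWith(suffix.toLowerCase(Locale.ROOT))) {
        matches = true;
        break;
      }
    }
    return matches == includeMode ? url : null;
  }

  public static void main(String[] args) {
    List<String> images = List.of(".png", ".jpg");
    System.out.println(filter(images, false, "https://example.org/index.html"));
    System.out.println(filter(images, false, "https://example.org/img/logo.PNG?size=2"));
  }
}
```
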
Uses of URLFilter in org.apache.nutch.urlfilter.validator
Classes in org.apache.nutch.urlfilter.validator that implement URLFilter
Modifier and Type: class
Class: UrlValidator
Description: Validates URLs.