Class BasicIndexingFilter

  • All Implemented Interfaces:
    Configurable, IndexingFilter, Pluggable

    public class BasicIndexingFilter
    extends Object
    implements IndexingFilter
    Adds basic searchable fields to a document. The fields added are: domain, host, url, content, title, cache, tstamp.
    domain is included depending on indexer.add.domain in nutch-default.xml.
    title is truncated as per indexer.max.title.length in nutch-default.xml. (As per NUTCH-1004, a zero-length title is not added.)
    content is truncated as per indexer.max.content.length in nutch-default.xml.
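
The field rules described above can be illustrated with a small, standalone sketch. It is not the Nutch implementation and does not use the Nutch API: the class BasicFieldsSketch, its methods, and the use of a plain Map in place of a Nutch document are all hypothetical. The three parameters addDomain, maxTitleLength, and maxContentLength stand in for the values of indexer.add.domain, indexer.max.title.length, and indexer.max.content.length. Treating a negative limit as "no truncation" is an assumption made for this sketch, and the cache and tstamp fields are left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the field rules; not part of Nutch.
class BasicFieldsSketch {

    // Cuts text to at most maxLength characters.
    // Assumption for this sketch: a negative limit means "do not truncate".
    static String truncate(String text, int maxLength) {
        if (maxLength >= 0 && text.length() > maxLength) {
            return text.substring(0, maxLength);
        }
        return text;
    }

    // Builds the searchable fields for one page, using a Map as a stand-in
    // for the indexed document.
    static Map<String, String> buildFields(String host, String domain, String url,
                                           String title, String content,
                                           boolean addDomain,
                                           int maxTitleLength,
                                           int maxContentLength) {
        Map<String, String> fields = new LinkedHashMap<>();

        // domain is optional (stands in for indexer.add.domain).
        if (addDomain) {
            fields.put("domain", domain);
        }
        fields.put("host", host);
        fields.put("url", url);

        // content is truncated (stands in for indexer.max.content.length).
        fields.put("content", truncate(content, maxContentLength));

        // title is truncated (stands in for indexer.max.title.length);
        // a zero-length title is not added (NUTCH-1004).
        String shortTitle = truncate(title, maxTitleLength);
        if (!shortTitle.isEmpty()) {
            fields.put("title", shortTitle);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields = buildFields(
                "www.example.com", "example.com", "http://www.example.com/",
                "Hello World", "Some page text", true, 5, 9);
        // Print each field that would be added to the document.
        fields.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In an actual deployment these limits would be set as configuration properties rather than passed as method arguments; the parameters here only make the effect of each property visible in isolation.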