com.norconex.importer.filter
Interface IDocumentFilter

All Superinterfaces:
IImportHandler, Serializable
All Known Implementing Classes:
RegexMetadataFilter

public interface IDocumentFilter
extends IImportHandler

Filters documents. Before import, only a limited set of properties is available (e.g., HTTP headers when the document comes from the HTTP Collector). After import, all document properties should be available.
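
As an illustration of filtering on the limited pre-import properties, here is a minimal sketch of a metadata-only filter. It is not taken from the library: DocumentFilterSketch is a hypothetical stand-in that mirrors the documented method signature so the example compiles on its own, java.util.Properties is assumed in place of whatever Properties type the library actually uses, and the "Content-Type" key is an assumed property name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical stand-in for IDocumentFilter. It copies only the documented
// method signature; the real interface also extends IImportHandler, and its
// Properties type may not be java.util.Properties (an assumption made here).
interface DocumentFilterSketch {
    boolean acceptDocument(InputStream document, Properties metadata, boolean parsed)
            throws IOException;
}

// Accepts a document only when its "Content-Type" property (assumed key)
// starts with the configured prefix. The document stream is never read,
// so this filter works the same before and after import.
class ContentTypePrefixFilter implements DocumentFilterSketch {
    private final String prefix;

    ContentTypePrefixFilter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public boolean acceptDocument(InputStream document, Properties metadata, boolean parsed) {
        String contentType = metadata.getProperty("Content-Type");
        return contentType != null && contentType.startsWith(prefix);
    }
}

class ContentTypePrefixFilterDemo {
    public static void main(String[] args) {
        Properties metadata = new Properties();
        metadata.setProperty("Content-Type", "text/html; charset=UTF-8");
        InputStream empty = new ByteArrayInputStream(new byte[0]);

        ContentTypePrefixFilter filter = new ContentTypePrefixFilter("text/");
        System.out.println(filter.acceptDocument(empty, metadata, false));
    }
}
```

Because the decision depends only on metadata, a filter like this can reject a document early, before any parsing cost is paid.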

Author:
Pascal Essiembre

Method Summary
 boolean acceptDocument(InputStream document, Properties metadata, boolean parsed)
          Whether to accept a document.
 

Method Detail

acceptDocument

boolean acceptDocument(InputStream document,
                       Properties metadata,
                       boolean parsed)
                       throws IOException
Whether to accept a document.

Parameters:
document - the document to evaluate
metadata - document metadata
parsed - whether the document has been parsed already or not (a parsed document should normally be text-based)
Returns:
true if document is accepted
Throws:
IOException - problem reading the document
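
The role of the parsed flag can be sketched with a content-based filter. This is an illustrative example under stated assumptions, not the library's implementation: ParsedAwareFilter is a hypothetical stand-in for the interface, java.util.Properties is assumed for the metadata type, parsed content is assumed to be UTF-8 text, and accepting unparsed documents (to defer the decision until text is available) is a design choice of this sketch only.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical stand-in mirroring the documented acceptDocument signature.
interface ParsedAwareFilter {
    boolean acceptDocument(InputStream document, Properties metadata, boolean parsed)
            throws IOException;
}

// Accepts parsed documents that contain a keyword. Unparsed documents may be
// binary, so this sketch accepts them and leaves the decision to a later pass.
class KeywordFilter implements ParsedAwareFilter {
    private final String keyword;

    KeywordFilter(String keyword) {
        this.keyword = keyword;
    }

    @Override
    public boolean acceptDocument(InputStream document, Properties metadata, boolean parsed)
            throws IOException {
        if (!parsed) {
            return true; // raw content: do not attempt to read it as text
        }
        // Assumes parsed content is UTF-8 text. The stream is not closed
        // here because the caller owns it.
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(document, StandardCharsets.UTF_8));
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.contains(keyword)) {
                return true;
            }
        }
        return false;
    }
}

class KeywordFilterDemo {
    public static void main(String[] args) throws IOException {
        byte[] text = "Quarterly report\nRevenue grew.".getBytes(StandardCharsets.UTF_8);
        KeywordFilter filter = new KeywordFilter("Revenue");
        System.out.println(
                filter.acceptDocument(new ByteArrayInputStream(text), new Properties(), true));
    }
}
```

An IOException raised while reading the stream propagates to the caller, matching the documented Throws clause.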


Copyright © 2009-2013 Norconex Inc. All Rights Reserved.