com.norconex.importer.transformer
Class AbstractStringTransformer
java.lang.Object
com.norconex.importer.transformer.AbstractRestrictiveTransformer
com.norconex.importer.transformer.AbstractCharStreamTransformer
com.norconex.importer.transformer.AbstractStringTransformer
- All Implemented Interfaces:
- IImportHandler, IDocumentTransformer, Serializable
- Direct Known Subclasses:
- StripAfterTransformer, StripBeforeTransformer, StripBetweenTransformer
public abstract class AbstractStringTransformer
- extends AbstractCharStreamTransformer
Base class that facilitates creating transformers on text content. It loads
the text into a StringBuilder for in-memory processing, which also allows
more options (such as advanced regular expressions). This class checks for
free memory every 10KB of text read. If there is enough memory, it keeps
reading for another 10KB, until either all the content is read or the
buffer size reaches half the available memory. In either case, it passes
the buffer content so far for transformation (all of it for small enough
content, and in several chunks for large content).
Implementors should be mindful of memory usage when working with the string
builder.
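The buffering strategy described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual Norconex implementation: the ChunkedBufferSketch class, the StringTransform callback (a stand-in for transformStringDocument), and the injectable availableBytes supplier are all hypothetical. Treating 10KB as 10,240 characters and counting two bytes per char are also assumptions.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.function.LongSupplier;

final class ChunkedBufferSketch {

    /** Hypothetical stand-in for transformStringDocument(...). */
    interface StringTransform {
        void apply(StringBuilder content, boolean partialContent);
    }

    /** Assumed interpretation of "every 10KB of text read". */
    static final int CHECK_INTERVAL = 10 * 1024;

    /** Rough estimate of the memory the JVM can still allocate. */
    static long availableMemory() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }

    /**
     * Reads the input in blocks, transforming and flushing the buffer
     * whenever it reaches half of the reported available memory.
     * Returns the number of times the transform was invoked.
     */
    static int process(Reader input, Writer output, StringTransform transform,
            LongSupplier availableBytes) throws IOException {
        StringBuilder buffer = new StringBuilder();
        char[] block = new char[CHECK_INTERVAL];
        int calls = 0;
        int read;
        while ((read = input.read(block)) != -1) {
            buffer.append(block, 0, read);
            // Assume two bytes per char when estimating the buffer size.
            if (buffer.length() * 2L >= availableBytes.getAsLong() / 2) {
                transform.apply(buffer, true);   // more content may follow
                output.append(buffer);
                buffer.setLength(0);
                calls++;
            }
        }
        if (buffer.length() > 0) {
            transform.apply(buffer, false);      // last (or only) chunk
            output.append(buffer);
            calls++;
        }
        return calls;
    }
}
```

Passing ChunkedBufferSketch::availableMemory as the supplier uses the JVM's own estimate, while a fixed supplier makes the chunking behaviour easy to exercise in isolation. Note that in this sketch, content ending exactly on a flush boundary never receives a call with partialContent set to false.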
Subclasses implementing IXMLConfigurable should allow this inner
configuration:
<contentTypeRegex>
    (regex to identify text content-types, overriding default)
</contentTypeRegex>
<restrictTo
        caseSensitive="[false|true]"
        property="(name of header/metadata field to match)">
    (regular expression of value to match)
</restrictTo>
- Author:
- Pascal Essiembre
- See Also:
- Serialized Form
AbstractStringTransformer
public AbstractStringTransformer()
transformTextDocument
protected final void transformTextDocument(String reference,
Reader input,
Writer output,
Properties metadata,
boolean parsed)
throws IOException
- Specified by:
transformTextDocument
in class AbstractCharStreamTransformer
- Throws:
IOException
transformStringDocument
protected abstract void transformStringDocument(String reference,
StringBuilder content,
Properties metadata,
boolean parsed,
boolean partialContent)
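As an illustration of the kind of work a transformStringDocument implementation might do, the sketch below edits the StringBuilder in place rather than copying it to a String, which keeps memory use down. It is not taken from any of the known subclasses: the StripAfterSketch class and its stripAfter helper are hypothetical, and the method takes only the buffer and a pattern so that it runs without the Norconex library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StripAfterSketch {

    /**
     * Deletes everything from the first match of the marker to the end
     * of the buffer. Returns true if the content was modified.
     */
    static boolean stripAfter(StringBuilder content, Pattern marker) {
        // Matcher accepts any CharSequence, so no String copy is needed.
        Matcher m = marker.matcher(content);
        if (m.find()) {
            content.delete(m.start(), content.length());
            return true;
        }
        return false;
    }
}
```

If partialContent is true, the buffer presumably holds only one slice of a large document, so a marker that straddles two chunks would be missed by a simple search like this one.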
equals
public boolean equals(Object obj)
- Overrides:
equals
in class AbstractCharStreamTransformer
hashCode
public int hashCode()
- Overrides:
hashCode
in class AbstractCharStreamTransformer
toString
public String toString()
- Overrides:
toString
in class AbstractCharStreamTransformer
Copyright © 2009-2013 Norconex Inc. All Rights Reserved.