com.norconex.importer.transformer.impl
Class StripBetweenTransformer
java.lang.Object
com.norconex.importer.transformer.AbstractRestrictiveTransformer
com.norconex.importer.transformer.AbstractCharStreamTransformer
com.norconex.importer.transformer.AbstractStringTransformer
com.norconex.importer.transformer.impl.StripBetweenTransformer
- All Implemented Interfaces:
- IXMLConfigurable, IImportHandler, IDocumentTransformer, Serializable
public class StripBetweenTransformer
- extends AbstractStringTransformer
- implements IXMLConfigurable
Strips any content found between matching start and end strings. The
matching strings are defined in pairs, and multiple pairs can be specified
at once.
This class can be used as a pre-parsing (text content-types only)
or post-parsing handler.
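The stripping behavior described above can be illustrated with a small standalone sketch. It is not this class's implementation: the class name StripBetweenSketch and the method stripAll are hypothetical, it relies only on java.util.regex, and it assumes that delimiters are removed along with the content between them and that each span ends at the first end match after a start match.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Norconex Importer API.
final class StripBetweenSketch {

    // Removes every span that begins at a match of a start regex and ends at
    // the nearest following match of its paired end regex (delimiters included).
    // Pairs are applied one after another, in map iteration order.
    static String stripAll(String content, Map<String, String> pairs) {
        String result = content;
        for (Map.Entry<String, String> pair : pairs.entrySet()) {
            // Non-capturing groups keep each supplied regex intact; ".*?" is
            // reluctant, and DOTALL lets a span cross line breaks.
            Pattern p = Pattern.compile(
                    "(?:" + pair.getKey() + ").*?(?:" + pair.getValue() + ")",
                    Pattern.DOTALL);
            result = p.matcher(result).replaceAll("");
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> pairs = new LinkedHashMap<>();
        pairs.put("<!-- nav -->", "<!-- /nav -->");
        pairs.put("\\[ad\\]", "\\[/ad\\]");
        String in = "Title|<!-- nav -->menu<!-- /nav -->|body|[ad]buy[/ad]|end";
        System.out.println(stripAll(in, pairs));
    }
}
```

Applying the pairs in sequence means a later pair sees the output of an earlier one, which is one plausible reading of "multiple ones can be specified at once".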
XML configuration usage:
<transformer class="com.norconex.importer.transformer.impl.StripBetweenTransformer"
inclusive="[false|true]"
caseSensitive="[false|true]" >
<contentTypeRegex>
(regex to identify text content-types for pre-import,
overriding default)
</contentTypeRegex>
<restrictTo
caseSensitive="[false|true]"
property="(name of header/metadata field to match)" >
(regular expression of value to match)
</restrictTo>
<stripBetween>
<start>(regex)</start>
<end>(regex)</end>
</stripBetween>
<!-- multiple stripBetween tags allowed -->
</transformer>
- Author:
- Pascal Essiembre
- See Also:
- Serialized Form
StripBetweenTransformer
public StripBetweenTransformer()
transformStringDocument
protected void transformStringDocument(String reference,
StringBuilder content,
Properties metadata,
boolean parsed,
boolean partialContent)
- Specified by:
transformStringDocument in class AbstractStringTransformer
isInclusive
public boolean isInclusive()
setInclusive
public void setInclusive(boolean inclusive)
- Sets whether the start and end text of each pair should themselves be
stripped.
- Parameters:
inclusive - true to strip start and end text
isCaseSensitive
public boolean isCaseSensitive()
setCaseSensitive
public void setCaseSensitive(boolean caseSensitive)
- Sets whether character case is considered when matching start and end text.
- Parameters:
caseSensitive - true to consider character case
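As a rough illustration of how the inclusive and caseSensitive settings could affect matching, the sketch below maps them onto java.util.regex. It is an assumption-laden stand-in, not the transformer's actual code: StripFlagsSketch and strip are hypothetical names, and the named groups "start" and "end" are introduced here only so the delimiters can be restored when inclusive is false.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Norconex Importer API.
final class StripFlagsSketch {

    static String strip(String content, String startRegex, String endRegex,
            boolean inclusive, boolean caseSensitive) {
        // DOTALL lets the stripped span cross line breaks.
        int flags = Pattern.DOTALL;
        if (!caseSensitive) {
            flags |= Pattern.CASE_INSENSITIVE;
        }
        // Named groups capture the delimiters. The supplied regexes must not
        // define groups named "start" or "end" themselves.
        Pattern p = Pattern.compile(
                "(?<start>" + startRegex + ").*?(?<end>" + endRegex + ")",
                flags);
        // inclusive: drop delimiters and content; otherwise keep the
        // delimiters and drop only what lies between them.
        return p.matcher(content).replaceAll(
                inclusive ? "" : "${start}${end}");
    }

    public static void main(String[] args) {
        String in = "a<B>x</B>c";
        System.out.println(strip(in, "<b>", "</b>", true, false));
        System.out.println(strip(in, "<b>", "</b>", false, false));
        System.out.println(strip(in, "<b>", "</b>", true, true));
    }
}
```

With a case-sensitive match, the lowercase patterns in main do not match the uppercase tags, so the third call leaves the input unchanged.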
addStripEndpoints
public void addStripEndpoints(String fromText,
String toText)
getStripEndpoints
public List<Pair<String,String>> getStripEndpoints()
loadFromXML
public void loadFromXML(Reader in)
throws IOException
- Specified by:
loadFromXML in interface IXMLConfigurable
- Throws:
IOException
saveToXML
public void saveToXML(Writer out)
throws IOException
- Specified by:
saveToXML in interface IXMLConfigurable
- Throws:
IOException
toString
public String toString()
- Overrides:
toString in class AbstractStringTransformer
hashCode
public int hashCode()
- Overrides:
hashCode in class AbstractStringTransformer
equals
public boolean equals(Object obj)
- Overrides:
equals in class AbstractStringTransformer
Copyright © 2009-2013 Norconex Inc. All Rights Reserved.