Class RemoveValuesDataProcessor

  • All Implemented Interfaces:
    DocumentPreProcessor

    public class RemoveValuesDataProcessor
    extends ConfigureableDataprocessor<PatternConfiguration>
    DocumentPreProcessor implementation which removes values from a field's value based on a regular expression. It is auto-configured and can be further configured as described below:
      data-processor-configuration: 
       processors:
         - RemoveValuesDataProcessor
       configuration:
         RemoveValuesDataProcessor:
           someFieldName: ".*\\d+.*"
           someFieldName_destination: "someDestinationField"
           # Optional configuration:
           # RegEx used to split the value into chunks, \\s+ if omitted
           someFieldName_wordSplitRegEx: "/"
           # join separator used when recombining the remaining chunks, a single space " " by default
           someFieldName_wordJoinSeparator: "/"
     
    This would remove every token that contains a digit from the field named 'someFieldName' and write the result into the field 'someDestinationField'. If no destination is specified, the result is written back to the source field. The implementation splits the value into separate tokens and checks the regular expression against each token. If the regular expression matches a token, that token is removed.
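
    The split, match, and join behaviour described above can be illustrated with a minimal standalone sketch. The class RemoveValuesSketch and the method removeMatchingTokens are hypothetical names, not part of this API, and the use of a full match (Matcher.matches()) against each token is an assumption inferred from the example pattern ".*\\d+.*". The actual implementation may differ.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration only; not the library's implementation.
public class RemoveValuesSketch {

    /**
     * Splits the value into tokens, drops every token that the removal
     * pattern matches completely, and joins the remaining tokens again.
     */
    static String removeMatchingTokens(String value, String removeRegEx,
                                       String wordSplitRegEx, String wordJoinSeparator) {
        Pattern removePattern = Pattern.compile(removeRegEx);
        return Arrays.stream(value.split(wordSplitRegEx))
                // assumption: a token is removed only on a full match
                .filter(token -> !removePattern.matcher(token).matches())
                .collect(Collectors.joining(wordJoinSeparator));
    }

    public static void main(String[] args) {
        // Defaults from the description: split on whitespace, join with a space.
        System.out.println(removeMatchingTokens("red shirt 42 size10", ".*\\d+.*", "\\s+", " "));
        // prints "red shirt"

        // Settings from the sample configuration: split and join on "/".
        System.out.println(removeMatchingTokens("shoes/2021/sale", ".*\\d+.*", "/", "/"));
        // prints "shoes/sale"
    }
}
```

    Note that with a full match the pattern needs the leading and trailing ".*" to catch tokens such as "size10"; a bare "\\d+" would only remove tokens consisting entirely of digits.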