Character Replacement
The character replacement module sanitizes messages passing through a scheme pack by applying configurable character set rules.
Example usage
The module uses Spring configuration to define its rules. The CharacterReplacer interface has a default implementation supplied through Spring, so to use it, add the csm-character-replacement module as a Maven dependency and dependency injection will pull the implementation through.
Then pass the message (as a string) to the characterReplacer; it returns a sanitized string.
System.out.println(characterReplacer.replaceCharacters("ÀÁÂ"));
// prints AAA
If you wish to replace only part of the message, you can pass optional startFrom and endAfter strings. The replacer finds the first instance of each string within the message and replaces characters only within those boundaries, including the startFrom and endAfter strings themselves.
System.out.println(characterReplacer.replaceCharacters("ÀÁÂ íîï éèê", Optional.of("íîï"), Optional.empty()));
// prints ÀÁÂ iii eee
(everything after and including íîï is replaced)
System.out.println(characterReplacer.replaceCharacters("ÀÁÂ íîï éèê", Optional.empty(), Optional.of("íîï")));
// prints AAA iii éèê
(everything before and including íîï is replaced)
System.out.println(characterReplacer.replaceCharacters("ÀÁÂ íîï éèê", Optional.of("íîï"), Optional.of("íîï")));
// prints ÀÁÂ iii éèê
(only íîï is replaced)
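The boundary behaviour described above can be sketched in plain Java. This is not the module's implementation: the class name BoundedReplacementSketch is hypothetical, accent stripping via java.text.Normalizer stands in for the configured rules, and the handling of a boundary string that is absent from the message (treated here as "no boundary") is an assumption. Non-ASCII characters are written as Unicode escapes so the file compiles under any source encoding.

```java
import java.text.Normalizer;
import java.util.Optional;

final class BoundedReplacementSketch {

    // Stand-in for the configured rules (assumption): decompose each
    // character and drop the combining marks, so accented letters
    // become their base letters.
    static String stripAccents(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    static String replaceCharacters(String message,
                                    Optional<String> startFrom,
                                    Optional<String> endAfter) {
        // Start at the first occurrence of startFrom, inclusive.
        // Assumption: an empty or unmatched startFrom means "from the beginning".
        int start = 0;
        if (startFrom.isPresent()) {
            int idx = message.indexOf(startFrom.get());
            if (idx >= 0) {
                start = idx;
            }
        }

        // End after the first occurrence of endAfter, inclusive.
        // Assumption: the search begins at the start boundary, and an empty
        // or unmatched endAfter means "to the end of the message".
        int end = message.length();
        if (endAfter.isPresent()) {
            int idx = message.indexOf(endAfter.get(), start);
            if (idx >= 0) {
                end = idx + endAfter.get().length();
            }
        }

        // Only the slice between the two boundaries is sanitized.
        return message.substring(0, start)
                + stripAccents(message.substring(start, end))
                + message.substring(end);
    }

    public static void main(String[] args) {
        // The sample message and marker from the examples, as escapes.
        String message = "\u00C0\u00C1\u00C2 \u00ED\u00EE\u00EF \u00E9\u00E8\u00EA";
        String marker = "\u00ED\u00EE\u00EF";
        System.out.println(replaceCharacters(message, Optional.of(marker), Optional.empty()));
        System.out.println(replaceCharacters(message, Optional.empty(), Optional.of(marker)));
        System.out.println(replaceCharacters(message, Optional.of(marker), Optional.of(marker)));
    }
}
```

With the sample message ÀÁÂ íîï éèê and the marker íîï, the three calls in main reproduce the three results shown in the examples, in the same order.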
Configuration
There are 3 types of configuration available:
- char-to-char - replace a single character with another
- list-to-char - replace any character in a list with a defined character
- regex-to-char - replace any character matching a regular expression with a defined character
Any combination of these 3 is possible if required.
character-replacements {
  char-to-char-replacements = [
    {character = À, replaceWith = A, replaceInDomOnly = true/false},
    {character = ï, replaceWith = i, replaceInDomOnly = true/false},
  ]
  list-to-char-replacements = [
    {list = [È,É,Ê,Ë], replaceWith = E, replaceInDomOnly = true/false}
  ]
  regex-to-char-replacements = [
    {regex = "[\\p{InLatin-1Supplement}]", replaceWith = ., replaceInDomOnly = true/false}
  ]
}
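To illustrate how the three rule types relate to one another, the following sketch models each rule as a function from string to string and applies the rules in sequence. The class RuleSketch and its method names are hypothetical and not part of the module. The application order (char-to-char, then list-to-char, then regex-to-char) is an assumption, and replaceInDomOnly is left out. Characters are written as Unicode escapes so the file compiles under any source encoding.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RuleSketch {

    // char-to-char: one character becomes another.
    static UnaryOperator<String> charToChar(char character, char replaceWith) {
        return text -> text.replace(character, replaceWith);
    }

    // list-to-char: every character in the list becomes the same character.
    static UnaryOperator<String> listToChar(List<Character> list, char replaceWith) {
        return text -> {
            String result = text;
            for (char c : list) {
                result = result.replace(c, replaceWith);
            }
            return result;
        };
    }

    // regex-to-char: every match of the expression becomes the character.
    static UnaryOperator<String> regexToChar(String regex, char replaceWith) {
        Pattern pattern = Pattern.compile(regex);
        // quoteReplacement keeps characters such as '$' literal in the output.
        String replacement = Matcher.quoteReplacement(String.valueOf(replaceWith));
        return text -> pattern.matcher(text).replaceAll(replacement);
    }

    // Rules mirroring the sample configuration, in an assumed order.
    static List<UnaryOperator<String>> exampleRules() {
        return Arrays.asList(
                charToChar('\u00C0', 'A'),   // A with grave
                charToChar('\u00EF', 'i'),   // i with diaeresis
                listToChar(Arrays.asList('\u00C8', '\u00C9', '\u00CA', '\u00CB'), 'E'),
                regexToChar("[\\p{InLatin-1Supplement}]", '.'));
    }

    // Apply each rule to the output of the previous one.
    static String applyAll(String text, List<UnaryOperator<String>> rules) {
        String result = text;
        for (UnaryOperator<String> rule : rules) {
            result = rule.apply(result);
        }
        return result;
    }
}
```

Under these assumptions the order matters: the regex rule in the sample configuration matches the whole Latin-1 Supplement block, so if it ran first it would turn À and È into a full stop before the more specific rules had a chance to apply.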
| Config | Type | Description |
|---|---|---|
| character-replacements.char-to-char-replacements.character | Character | Character to be replaced |
| character-replacements.char-to-char-replacements.replaceWith | Character | Character replacement |
| character-replacements.char-to-char-replacements.replaceInDomOnly | Boolean | Flag indicating whether the replacement should happen only in the text nodes of the DOM |
| character-replacements.list-to-char-replacements.list | List<Character> | List of characters to be replaced |
| character-replacements.list-to-char-replacements.replaceWith | Character | Character replacement for list |
| character-replacements.list-to-char-replacements.replaceInDomOnly | Boolean | Flag indicating whether the replacement should happen only in the text nodes of the DOM |
| character-replacements.regex-to-char-replacements.regex | String | Regular expression for replaced characters |
| character-replacements.regex-to-char-replacements.replaceWith | Character | Character replacement for regex |
| character-replacements.regex-to-char-replacements.replaceInDomOnly | Boolean | Flag indicating whether the replacement should happen only in the text nodes of the DOM |
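The replaceInDomOnly flag restricts a rule to the text nodes of the DOM. As a rough illustration of what that could mean for an XML message, the sketch below parses the message with the JDK's DOM parser and applies a replacement function to text nodes only, leaving element names and attribute values alone. The class DomOnlySketch is hypothetical. That the message is XML, and that attributes and CDATA sections are left untouched, are assumptions rather than documented behaviour of the module.

```java
import java.io.StringReader;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DomOnlySketch {

    // Parse an XML message held in a string; checked exceptions are
    // wrapped so callers do not have to declare them.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Message is not well-formed XML", e);
        }
    }

    // Walk the tree depth-first and rewrite only TEXT_NODE values.
    // Attributes are not child nodes, so they are never visited.
    static void replaceInTextNodes(Node node, UnaryOperator<String> replacer) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            node.setNodeValue(replacer.apply(node.getNodeValue()));
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            replaceInTextNodes(child, replacer);
        }
    }
}
```

For example, given an element whose text content and name attribute both contain À, and a replacer that maps À to A, only the text content changes after the call; the attribute keeps its original value.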