Class AlphabetConverter

java.lang.Object
org.apache.commons.text.AlphabetConverter

public final class AlphabetConverter extends Object

Convert from one alphabet to another, with the possibility of leaving certain characters unencoded.

The target and 'do not encode' alphabets must be in the Unicode BMP, but the source alphabet need not be.

Every encoded letter has the same fixed length, except for the 'do not encode' chars, which always have length 1.
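
The BMP restriction matters because a BMP code point occupies exactly one Java char, while a supplementary code point occupies two. The following is a minimal sketch that uses only JDK methods to show the difference; the class name BmpCheck and the method allInBmp are hypothetical and not part of Commons Text.

```java
final class BmpCheck {

    /** Returns true if every code point in the array lies in the Basic Multilingual Plane. */
    static boolean allInBmp(int[] codePoints) {
        for (int cp : codePoints) {
            if (!Character.isBmpCodePoint(cp)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] target = {'0', '1', 'd'};  // usable as a target alphabet
        int[] source = {'a', 0x1F600};   // U+1F600 is outside the BMP: acceptable only in a source alphabet
        System.out.println(allInBmp(target));              // true
        System.out.println(allInBmp(source));              // false
        System.out.println(Character.charCount(0x1F600));  // 2
    }
}
```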

Sample usage

 Character[] originals = {'a', 'b', 'c', 'd'};
 Character[] encoding = {'0', '1', 'd'};
 Character[] doNotEncode = {'d'};

 AlphabetConverter ac = AlphabetConverter.createConverterFromChars(originals,
 encoding, doNotEncode);

 ac.encode("a");    // 00
 ac.encode("b");    // 01
 ac.encode("c");    // 0d
 ac.encode("d");    // d
 ac.encode("abcd"); // 00010dd
 

#ThreadSafe# AlphabetConverter class methods are thread-safe as they do not change internal state.

Since:
1.0
  • Field Details

    • ARROW

      private static final String ARROW
      Arrow constant, used for converting the object into a string.
    • originalToEncoded

      private final Map<Integer,String> originalToEncoded
      Maps each code point of the original alphabet to its encoded string.
    • encodedToOriginal

      private final Map<String,String> encodedToOriginal
      Maps each encoded string back to the original letter it represents.
    • encodedLetterLength

      private final int encodedLetterLength
      Length of the encoded letter.
  • Constructor Details

    • AlphabetConverter

      private AlphabetConverter(Map<Integer,String> originalToEncoded, Map<String,String> encodedToOriginal, int encodedLetterLength)
      Hidden constructor for alphabet converter. Used by static helper methods.
      Parameters:
      originalToEncoded - map from each code point of the original alphabet to its encoded string
      encodedToOriginal - map from each encoded string back to its original letter
      encodedLetterLength - length of the encoded letter
  • Method Details

    • codePointToString

      private static String codePointToString(int i)
      Creates new String that contains just the given code point.
      Parameters:
      i - code point
      Returns:
      a new string containing only the given code point
      See Also:
      • "http://www.oracle.com/us/technologies/java/supplementary-142654.html"
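
      One way to build such a string with the JDK alone is shown below. This is a sketch, not the library's private implementation; the class name CodePointStrings is hypothetical.

```java
final class CodePointStrings {

    /** Builds a String holding exactly one code point (one char in the BMP, two otherwise). */
    static String codePointToString(int codePoint) {
        // Character.toChars returns a single char for BMP code points
        // and a surrogate pair for supplementary ones.
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(codePointToString(0x61));              // prints a
        System.out.println(codePointToString(0x1F600).length());  // prints 2
    }
}
```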
    • convertCharsToIntegers

      private static Integer[] convertCharsToIntegers(Character[] chars)
      Converts characters to integers.
      Parameters:
      chars - array of characters
      Returns:
      an equivalent array of integers
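
      The conversion amounts to widening each char to its code point value. A minimal sketch, assuming a null or empty input should yield an empty array (the class and method names are hypothetical):

```java
final class CharArrays {

    /** Widens each Character to the Integer code point it represents. */
    static Integer[] toCodePoints(Character[] chars) {
        if (chars == null || chars.length == 0) {
            return new Integer[0];  // assumption: empty result for null or empty input
        }
        Integer[] result = new Integer[chars.length];
        for (int i = 0; i < chars.length; i++) {
            // A single Character is always a BMP value, so the widening is lossless.
            result[i] = Integer.valueOf((int) chars[i].charValue());
        }
        return result;
    }
}
```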
    • createConverter

      public static AlphabetConverter createConverter(Integer[] original, Integer[] encoding, Integer[] doNotEncode)
      Creates an alphabet converter for converting from the original alphabet to the encoded alphabet, while leaving the characters in doNotEncode as they are (if possible).

      Duplicate letters in either original or encoding will be ignored.

      Parameters:
      original - an array of ints representing the original alphabet in code points
      encoding - an array of ints representing the alphabet to be used for encoding, in code points
      doNotEncode - an array of ints representing, in code points, the chars to be left unencoded; every char here must appear in both original and encoding
      Returns:
      The AlphabetConverter
      Throws:
      IllegalArgumentException - if an AlphabetConverter cannot be constructed
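
      Because this factory takes code points rather than chars, a source alphabet with supplementary characters has to be split by code point, not by char. The helper below is a hypothetical convenience (it is not part of Commons Text) that produces arrays of the shape the three parameters expect.

```java
final class AlphabetArrays {

    /** Splits a string into its code points, boxed as Integer[]. */
    static Integer[] codePointsOf(String alphabet) {
        return alphabet.codePoints().boxed().toArray(Integer[]::new);
    }

    public static void main(String[] args) {
        Integer[] original = codePointsOf("abcd");
        Integer[] encoding = codePointsOf("01d");
        Integer[] doNotEncode = codePointsOf("d");
        // With commons-text on the classpath, these arrays could be passed as:
        // AlphabetConverter.createConverter(original, encoding, doNotEncode);
        System.out.println(original.length + " " + encoding.length + " " + doNotEncode.length);
    }
}
```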
    • createConverterFromChars

      public static AlphabetConverter createConverterFromChars(Character[] original, Character[] encoding, Character[] doNotEncode)
      Creates an alphabet converter for converting from the original alphabet to the encoded alphabet, while leaving the characters in doNotEncode as they are (if possible).

      Duplicate letters in either original or encoding will be ignored.

      Parameters:
      original - an array of chars representing the original alphabet
      encoding - an array of chars representing the alphabet to be used for encoding
      doNotEncode - an array of chars to be left unencoded; every char here must appear in both original and encoding
      Returns:
      The AlphabetConverter
      Throws:
      IllegalArgumentException - if an AlphabetConverter cannot be constructed
    • createConverterFromMap

      public static AlphabetConverter createConverterFromMap(Map<Integer,String> originalToEncoded)
      Creates a new converter from a map.
      Parameters:
      originalToEncoded - a map returned from getOriginalToEncoded()
      Returns:
      The reconstructed AlphabetConverter
    • addSingleEncoding

      private void addSingleEncoding(int level, String currentEncoding, Collection<Integer> encoding, Iterator<Integer> originals, Map<Integer,String> doNotEncodeMap)
      Recursive method used when creating encoder/decoder.
      Parameters:
      level - at which point it should add a single encoding
      currentEncoding - current encoding
      encoding - letters encoding
      originals - original values
      doNotEncodeMap - map of values that should not be encoded
    • decode

      public String decode(String encoded) throws UnsupportedEncodingException
      Decodes a given string.
      Parameters:
      encoded - a string that has been encoded using this AlphabetConverter
      Returns:
      The decoded string, or null if the given string is null
      Throws:
      UnsupportedEncodingException - if unexpected characters that cannot be handled are encountered
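
      Decoding relies on the fixed encoded length: a 'do not encode' char is consumed alone, and anything else is consumed as a group of the encoded letter length. The sketch below illustrates that idea with a hand-built table matching the sample usage (a=00, b=01, c=0d, d=d). It is not the library's code; the class DecodeSketch, its sampleTable helper, and the use of IllegalArgumentException in place of UnsupportedEncodingException are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class DecodeSketch {

    /** Hypothetical table matching the sample usage: 00=a, 01=b, 0d=c, d=d. */
    static Map<String, String> sampleTable() {
        Map<String, String> table = new HashMap<>();
        table.put("00", "a");
        table.put("01", "b");
        table.put("0d", "c");
        table.put("d", "d");
        return table;
    }

    static String decode(String encoded, Map<String, String> encodedToOriginal, int letterLength) {
        if (encoded == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        int pos = 0;
        while (pos < encoded.length()) {
            String single = encoded.substring(pos, pos + 1);
            // A 'do not encode' char maps to itself and always has length 1.
            if (single.equals(encodedToOriginal.get(single))) {
                result.append(single);
                pos++;
                continue;
            }
            if (pos + letterLength > encoded.length()) {
                throw new IllegalArgumentException("Truncated group at index " + pos);
            }
            // Everything else is read as one fixed-length group.
            String group = encoded.substring(pos, pos + letterLength);
            String original = encodedToOriginal.get(group);
            if (original == null) {
                throw new IllegalArgumentException("Unknown group: " + group);
            }
            result.append(original);
            pos += letterLength;
        }
        return result.toString();
    }
}
```

      With this table and a letter length of 2, decode("00010dd", sampleTable(), 2) yields "abcd".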
    • encode

      public String encode(String original) throws UnsupportedEncodingException
      Encodes a given string.
      Parameters:
      original - the string to be encoded
      Returns:
      The encoded string, or null if the given string is null
      Throws:
      UnsupportedEncodingException - if chars that are not supported are encountered
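
      Conceptually, encoding walks the input one code point at a time and looks each one up in the original-to-encoded map. The following sketch shows that loop with a hand-built map matching the sample usage; it is an illustration under stated assumptions, not the library's implementation. The class EncodeSketch and its sampleTable helper are hypothetical, and IllegalArgumentException stands in for UnsupportedEncodingException to keep the example free of checked exceptions.

```java
import java.util.HashMap;
import java.util.Map;

final class EncodeSketch {

    /** Hypothetical table matching the sample usage: a=00, b=01, c=0d, d=d. */
    static Map<Integer, String> sampleTable() {
        Map<Integer, String> table = new HashMap<>();
        table.put((int) 'a', "00");
        table.put((int) 'b', "01");
        table.put((int) 'c', "0d");
        table.put((int) 'd', "d");
        return table;
    }

    static String encode(String original, Map<Integer, String> originalToEncoded) {
        if (original == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < original.length();) {
            // Read a whole code point so supplementary characters are not split.
            int codePoint = original.codePointAt(i);
            String encodedLetter = originalToEncoded.get(codePoint);
            if (encodedLetter == null) {
                throw new IllegalArgumentException("Unsupported code point at index " + i);
            }
            result.append(encodedLetter);
            i += Character.charCount(codePoint);
        }
        return result.toString();
    }
}
```

      With this table, encode("abcd", sampleTable()) yields "00010dd", matching the sample usage.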
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • getEncodedCharLength

      public int getEncodedCharLength()
      Gets the number of characters of the encoded alphabet needed to represent each character of the original alphabet.
      Returns:
      The length of the encoded char
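
      Since 'do not encode' chars keep length 1 and every other letter takes the encoded char length, the size of an encoded string can be predicted from the input. A small sketch of that arithmetic, assuming every letter of the input belongs to the original alphabet (the class and method names are hypothetical):

```java
final class EncodedLength {

    /**
     * Predicts the length of an encoded string.
     *
     * @param original          the text to be encoded
     * @param doNotEncode       the chars left unencoded, given as a string
     * @param encodedCharLength the value reported by getEncodedCharLength()
     */
    static int encodedLength(String original, String doNotEncode, int encodedCharLength) {
        int total = 0;
        for (int i = 0; i < original.length();) {
            int codePoint = original.codePointAt(i);
            // Unencoded chars contribute 1; all other letters contribute the fixed length.
            total += doNotEncode.indexOf(codePoint) >= 0 ? 1 : encodedCharLength;
            i += Character.charCount(codePoint);
        }
        return total;
    }
}
```

      For the sample usage, encodedLength("abcd", "d", 2) is 3 * 2 + 1 = 7, the length of "00010dd".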
    • getOriginalToEncoded

      public Map<Integer,String> getOriginalToEncoded()
      Gets the mapping from each integer code point of the source alphabet to its encoded string. Use it to reconstruct a converter from a serialized map.
      Returns:
      The map from original code points to encoded strings
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object