Class UnescapeTransliterator

java.lang.Object
com.ibm.icu.text.Transliterator
com.ibm.icu.text.UnescapeTransliterator
All Implemented Interfaces:
StringTransform, Transform<String,String>

class UnescapeTransliterator extends Transliterator
A transliterator that converts Unicode escape forms to the characters they represent. Escape forms have a prefix, a suffix, a radix, and minimum and maximum digit counts.

This class is package private. It registers several standard variants with the system, which are then accessed via their IDs.

  • Field Details

    • spec

      private char[] spec
      The encoded pattern specification. The pattern consists of zero or more forms. Each form consists of a prefix, suffix, radix, minimum digit count, and maximum digit count. Each form begins with a five-character header holding the prefix length, suffix length, radix, minimum digit count, and maximum digit count, with each numeric value cast to a 16-bit character. The prefix characters and then the suffix characters follow the header. Each form thus takes n+5 characters, where n is the total length of the prefix and suffix. The end of the array is marked by a header of length one consisting of the character END.
    • END

      private static final char END
      Special character marking the end of the spec[] array.
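      The layout of spec[] described above can be sketched as follows. This is only an illustration, not the class's own code: SpecSketch, Form, encode, and decode are hypothetical names, the value 0xFFFF for END is an assumption (the documentation only says END is a special character), and the two forms are illustrative rather than a list of the registered variants.

```java
import java.util.ArrayList;
import java.util.List;

final class SpecSketch {
    // Assumed sentinel value; the documentation does not state it.
    static final char END = 0xFFFF;

    /** One escape form (hypothetical helper type). */
    static final class Form {
        final String prefix;
        final String suffix;
        final int radix;
        final int minDigits;
        final int maxDigits;

        Form(String prefix, String suffix, int radix, int minDigits, int maxDigits) {
            this.prefix = prefix;
            this.suffix = suffix;
            this.radix = radix;
            this.minDigits = minDigits;
            this.maxDigits = maxDigits;
        }
    }

    /** Packs each form as a 5-char header, then prefix chars, then suffix chars; END closes the array. */
    static char[] encode(List<Form> forms) {
        StringBuilder sb = new StringBuilder();
        for (Form f : forms) {
            sb.append((char) f.prefix.length())
              .append((char) f.suffix.length())
              .append((char) f.radix)
              .append((char) f.minDigits)
              .append((char) f.maxDigits)
              .append(f.prefix)
              .append(f.suffix);
        }
        sb.append(END);
        return sb.toString().toCharArray();
    }

    /** Walks the array form by form until the one-character END header is reached. */
    static List<Form> decode(char[] spec) {
        List<Form> forms = new ArrayList<>();
        int i = 0;
        while (spec[i] != END) {
            int prefixLen = spec[i++];
            int suffixLen = spec[i++];
            int radix = spec[i++];
            int minDigits = spec[i++];
            int maxDigits = spec[i++];
            String prefix = new String(spec, i, prefixLen);
            i += prefixLen;
            String suffix = new String(spec, i, suffixLen);
            i += suffixLen;
            forms.add(new Form(prefix, suffix, radix, minDigits, maxDigits));
        }
        return forms;
    }

    public static void main(String[] args) {
        List<Form> forms = new ArrayList<>();
        forms.add(new Form("\\u", "", 16, 4, 4));   // backslash-u prefix, exactly 4 hex digits
        forms.add(new Form("&#x", ";", 16, 1, 6));  // XML-style form, 1 to 6 hex digits
        char[] spec = encode(forms);
        System.out.println("spec length: " + spec.length); // (5+2) + (5+4) + 1 = 17
        for (Form f : decode(spec)) {
            System.out.println(f.prefix + "..." + f.suffix + " radix=" + f.radix);
        }
    }
}
```

      Storing the lengths in the header is what lets a reader skip from one form to the next without any delimiter between the prefix and suffix characters.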
  • Constructor Details

    • UnescapeTransliterator

      UnescapeTransliterator(String ID, char[] spec)
      Package private constructor. Takes the encoded spec array.
  • Method Details

    • register

      static void register()
      Registers standard variants with the system. Called by Transliterator during initialization.
    • handleTransliterate

      protected void handleTransliterate(Replaceable text, Transliterator.Position pos, boolean isIncremental)
      Specified by:
      handleTransliterate in class Transliterator
      Parameters:
      text - the buffer holding transliterated and untransliterated text
      pos - the indices indicating the start, limit, context start, and context limit of the text.
      isIncremental - if true, assume more text may be inserted at pos.limit and act accordingly. Otherwise, transliterate all text between pos.start and pos.limit and move pos.start up to pos.limit.
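      To make the role of the prefix, suffix, radix, and digit counts concrete, here is a simplified sketch of unescaping with a single form. It is an assumption-laden illustration rather than this method's implementation: it works on a plain String instead of a Replaceable, ignores pos and isIncremental entirely, handles one form only, and UnescapeSketch and unescape are hypothetical names.

```java
final class UnescapeSketch {
    /**
     * Replaces every occurrence of prefix + digits + suffix with the code point
     * the digits denote. Text that does not match the form is copied unchanged.
     * Assumes maxDigits is small enough that the value fits in a long.
     */
    static String unescape(String text, String prefix, String suffix,
                           int radix, int minDigits, int maxDigits) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith(prefix, i)) {
                int p = i + prefix.length();
                long value = 0;
                int digits = 0;
                // Consume up to maxDigits digits in the given radix.
                while (p < text.length() && digits < maxDigits) {
                    int d = Character.digit(text.charAt(p), radix);
                    if (d < 0) {
                        break;
                    }
                    value = value * radix + d;
                    digits++;
                    p++;
                }
                boolean matched = digits >= minDigits
                        && text.startsWith(suffix, p)
                        && value <= Character.MAX_CODE_POINT;
                if (matched) {
                    out.appendCodePoint((int) value);
                    i = p + suffix.length();
                    continue;
                }
            }
            // No complete escape starts here: copy one char and move on.
            out.append(text.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Backslash-u form with exactly four hex digits.
        System.out.println(unescape("a\\u0041b", "\\u", "", 16, 4, 4)); // prints "aAb"
        // Too few digits: the text is left as it is.
        System.out.println(unescape("\\u12", "\\u", "", 16, 4, 4));
    }
}
```

      An incremental implementation would additionally have to stop before a partial escape at the end of the text when more input may still arrive, which this sketch does not attempt.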
    • addSourceTargetSet

      public void addSourceTargetSet(UnicodeSet inputFilter, UnicodeSet sourceSet, UnicodeSet targetSet)
      Description copied from class: Transliterator
      Returns the set of all characters that may be generated as replacement text by this transliterator, filtered by BOTH the input filter and the current getFilter().

      SHOULD BE OVERRIDDEN BY SUBCLASSES. It is probably an error for any transliterator to NOT override this, but we can't force them to for backwards compatibility.

      Other methods vector through this.

      When gathering the information on source and target, the compound transliterator makes things complicated. For example, suppose we have:

       Global FILTER = [ax]
       a > b;
       :: NULL;
       b > c;
       x > d;
       
      While the filter just allows a and x, b is an intermediate result, which could produce c. So the source and target sets cannot be gathered independently. What we have to do is filter the sources for the first transliterator according to the global filter, intersected with that transliterator's own filter. From those sources we get the target set. The next transliterator then receives (global + last target) as its global filter, and so on.

      There is another complication:

       Global FILTER = [ax]
       a >|b;
       b >c;
       
      Even though b would be filtered from the input, the rule a >|b backs the cursor up to before the b it produces, so b can become part of the input again. So ideally we will change the global filter as we go.
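      The first example above (widening the filter with each step's targets) can be sketched with ordinary sets. This is a toy model under stated assumptions, not the library's code: characters stand in for UnicodeSet, each step is a map of single-character rules, a null own filter means "no filter", and FilterPropagationSketch, Step, and gather are hypothetical names. The cursor backup case is not modeled.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class FilterPropagationSketch {
    /** One transliterator in the chain: single-character rules plus an optional own filter. */
    static final class Step {
        final Map<Character, Character> rules;
        final Set<Character> ownFilter; // null means the step has no filter of its own

        Step(Map<Character, Character> rules, Set<Character> ownFilter) {
            this.rules = rules;
            this.ownFilter = ownFilter;
        }
    }

    /** Accumulates sources and targets, adding each step's targets to the filter for the next step. */
    static void gather(Set<Character> globalFilter, List<Step> steps,
                       Set<Character> sourceSet, Set<Character> targetSet) {
        Set<Character> filter = new TreeSet<>(globalFilter);
        for (Step step : steps) {
            Set<Character> stepTargets = new TreeSet<>();
            for (Map.Entry<Character, Character> rule : step.rules.entrySet()) {
                char src = rule.getKey();
                boolean allowed = filter.contains(src)
                        && (step.ownFilter == null || step.ownFilter.contains(src));
                if (allowed) {
                    sourceSet.add(src);
                    stepTargets.add(rule.getValue());
                }
            }
            targetSet.addAll(stepTargets);
            filter.addAll(stepTargets); // global + last target
        }
    }

    public static void main(String[] args) {
        Map<Character, Character> first = new HashMap<>();
        first.put('a', 'b');
        Map<Character, Character> second = new HashMap<>();
        second.put('b', 'c');
        second.put('x', 'd');

        Set<Character> sources = new TreeSet<>();
        Set<Character> targets = new TreeSet<>();
        gather(new TreeSet<>(Arrays.asList('a', 'x')),
               Arrays.asList(new Step(first, null), new Step(second, null)),
               sources, targets);
        System.out.println("sources=" + sources); // prints sources=[a, b, x]
        System.out.println("targets=" + targets); // prints targets=[b, c, d]
    }
}
```

      Without the widening step, b would never pass the [ax] filter in the second step and c would be missing from the targets.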
      Overrides:
      addSourceTargetSet in class Transliterator
      Parameters:
      targetSet - TODO