Class CJKWidthCharFilter

All Implemented Interfaces:
Closeable, AutoCloseable, Readable

public class CJKWidthCharFilter extends BaseCharFilter
A CharFilter that normalizes CJK width differences:
  • Folds fullwidth ASCII variants into the equivalent Basic Latin characters
  • Folds halfwidth Katakana variants into the equivalent kana

NOTE: this char filter is the exact counterpart of CJKWidthFilter.
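
As a rough illustration of the two folds listed above, the following is a minimal sketch in plain Java and is not the filter's actual implementation. It relies on two Unicode facts: fullwidth ASCII variants occupy U+FF01..U+FF5E at a fixed offset of 0xFEE0 from Basic Latin, and halfwidth Katakana variants occupy U+FF65..U+FF9F. The class name WidthFoldSketch and the method foldWidth are hypothetical, and the per-character NFKC lookup is a stand-in for a mapping table such as KANA_NORM, whose contents are not shown on this page. Combining of voiced and semi-voiced marks is left out here.

```java
import java.text.Normalizer;

public class WidthFoldSketch {

    /**
     * Hypothetical helper: folds fullwidth ASCII variants to Basic Latin and
     * halfwidth Katakana variants to regular kana, one char at a time.
     */
    public static String foldWidth(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0xFF01 && c <= 0xFF5E) {
                // Fullwidth ASCII variant: fixed offset from Basic Latin.
                sb.append((char) (c - 0xFEE0));
            } else if (c >= 0xFF65 && c <= 0xFF9F) {
                // Halfwidth Katakana variant: NFKC is used here as a stand-in
                // for a lookup table (an assumption of this sketch).
                sb.append(Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFKC));
            } else {
                // Everything else passes through unchanged.
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Fullwidth "ABC123" given as escapes so the source stays ASCII-only.
        System.out.println(foldWidth("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13")); // prints "ABC123"
    }
}
```

With this sketch, the halfwidth input ｶﾀｶﾅ (U+FF76 U+FF80 U+FF76 U+FF85) folds to カタカナ (U+30AB U+30BF U+30AB U+30CA).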

  • Field Details

    • KANA_NORM

      private static final char[] KANA_NORM
    • KANA_COMBINE_VOICED

      private static final byte[] KANA_COMBINE_VOICED
    • KANA_COMBINE_SEMI_VOICED

      private static final byte[] KANA_COMBINE_SEMI_VOICED
    • HW_KATAKANA_VOICED_MARK

      private static final int HW_KATAKANA_VOICED_MARK
    • HW_KATAKANA_SEMI_VOICED_MARK

      private static final int HW_KATAKANA_SEMI_VOICED_MARK
    • prevChar

      private int prevChar
    • inputOff

      private int inputOff
  • Constructor Details

    • CJKWidthCharFilter

      public CJKWidthCharFilter(Reader in)
      Creates a CJKWidthCharFilter that reads from the given Reader.
  • Method Details

    • read

      public int read() throws IOException
      Overrides:
      read in class Reader
      Throws:
      IOException
    • combineVoiceMark

      private int combineVoiceMark(int ch, int voiceMark)
      Returns the combined character if the voice mark was successfully combined; otherwise returns the original character.
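
The contract described for combineVoiceMark can be sketched as follows. This is an illustration only, not the private method itself: the class VoiceMarkSketch is hypothetical, the mark constants are assumed to be the Unicode halfwidth voiced and semi-voiced sound marks (U+FF9E and U+FF9F), the input character is assumed to be already folded to regular kana, and canonical composition (NFC) is swapped in for the KANA_COMBINE_VOICED and KANA_COMBINE_SEMI_VOICED tables, whose contents are not shown on this page.

```java
import java.text.Normalizer;

public class VoiceMarkSketch {
    // Assumed values: Unicode halfwidth (semi-)voiced sound marks.
    static final int HW_VOICED_MARK = 0xFF9E;
    static final int HW_SEMI_VOICED_MARK = 0xFF9F;

    /**
     * Returns the combined character if ch and the mark compose into a single
     * code point; otherwise returns ch unchanged.
     */
    public static int combineVoiceMark(int ch, int voiceMark) {
        int combining;
        if (voiceMark == HW_VOICED_MARK) {
            combining = 0x3099; // COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK
        } else if (voiceMark == HW_SEMI_VOICED_MARK) {
            combining = 0x309A; // COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
        } else {
            return ch; // not a voice mark: nothing to combine
        }
        String pair = new StringBuilder()
                .appendCodePoint(ch)
                .appendCodePoint(combining)
                .toString();
        String composed = Normalizer.normalize(pair, Normalizer.Form.NFC);
        // A successful combination collapses the pair into one code point.
        return composed.codePointCount(0, composed.length()) == 1
                ? composed.codePointAt(0)
                : ch;
    }
}
```

For example, U+30AB (カ) with the voiced mark yields U+30AC (ガ), while U+30A2 (ア), which has no voiced form, comes back unchanged.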
    • read

      public int read(char[] cbuf, int off, int len) throws IOException
      Specified by:
      read in class Reader
      Throws:
      IOException
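
Because the class is a Reader, callers consume it through the two read overloads above. The sketch below shows that calling pattern with a self-contained stand-in: FullwidthFoldingReader is a hypothetical java.io.FilterReader that only folds fullwidth ASCII, a one-to-one mapping. It does not reproduce the halfwidth Katakana handling or any offset bookkeeping inherited from BaseCharFilter, and drain is a hypothetical helper.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

public class FullwidthFoldingReader extends FilterReader {

    public FullwidthFoldingReader(Reader in) {
        super(in);
    }

    // Fullwidth ASCII variants (U+FF01..U+FF5E) sit 0xFEE0 above Basic Latin.
    private static int fold(int ch) {
        return (ch >= 0xFF01 && ch <= 0xFF5E) ? ch - 0xFEE0 : ch;
    }

    @Override
    public int read() throws IOException {
        int ch = in.read();
        return ch == -1 ? -1 : fold(ch);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = in.read(cbuf, off, len);
        // When n is -1 (end of stream) the loop body never runs.
        for (int i = off; i < off + n; i++) {
            cbuf[i] = (char) fold(cbuf[i]);
        }
        return n;
    }

    /** Hypothetical helper: reads a Reader to the end through the bulk overload. */
    public static String drain(Reader r) {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8];
        try {
            int n;
            while ((n = r.read(buf, 0, buf.length)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Wrapping a java.io.StringReader that holds fullwidth text in this stand-in and passing it to drain returns the Basic Latin form of that text.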