JIS encoding

In computing, JIS encoding refers to several Japanese Industrial Standards for encoding the Japanese language. Strictly speaking, the term means either:

  • A set of standard character sets for Japanese, notably:
    • JIS X 0201, the Japanese version of ISO 646 (ASCII), containing the base 7-bit ASCII characters (with some modifications) and 63 half-width katakana characters (code positions 0xA1 to 0xDF in its 8-bit form).
    • JIS X 0208, the most common kanji character set, containing 6,879 characters: 6,355 kanji and 524 other characters.
    • JIS X 0212, an extension of JIS X 0208 which adds 5,801 kanji, for a total of 12,156 kanji.
    • JIS X 0213, which extends JIS X 0208
  • JIS X 0202 (the Japanese counterpart of ISO/IEC 2022, and the basis of the encoding commonly known as ISO-2022-JP), a set of encoding mechanisms for sending JIS data over transmission media that support only 7-bit data.

In practice, "JIS encoding" usually refers to JIS X 0208 data encoded with JIS X 0202.
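As a rough illustration of JIS X 0208 data carried in a 7-bit stream, the sketch below uses Python's built-in iso2022_jp codec, which implements the ISO-2022-JP profile. The helper name encode_jis and the two constants are introduced here for illustration only and are not taken from any standard.

```python
# Sketch: inspect the 7-bit byte stream produced by Python's iso2022_jp codec.
# The names below are illustrative assumptions, not part of any standard API.

ESC_TO_JISX0208 = b"\x1b$B"  # escape sequence that switches to JIS X 0208
ESC_TO_ASCII = b"\x1b(B"     # escape sequence that switches back to ASCII


def encode_jis(text):
    """Encode text as ISO-2022-JP, the usual meaning of "JIS encoding"."""
    return text.encode("iso2022_jp")


encoded = encode_jis("Aあ")

# The ASCII letter is sent as-is; the hiragana is bracketed by escape sequences.
print(encoded)
print(ESC_TO_JISX0208 in encoded, encoded.endswith(ESC_TO_ASCII))

# Every byte fits in 7 bits, which is what makes the stream safe for 7-bit media.
print(all(byte < 0x80 for byte in encoded))
```

Because the active character set is signalled by escape sequences rather than by the high bit, a decoder has to track state while reading the stream.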

There is also the Shift JIS encoding, which adds the kanji, full-width hiragana and full-width katakana from JIS X 0208 to JIS X 0201 in a compatible way. Shift JIS is perhaps the most widely used encoding in Japan. Its compatibility with the single-byte JIS X 0201 character set allowed electronic equipment manufacturers (such as cash register manufacturers) to offer an upgrade from older, cheaper equipment that could not display kanji to newer equipment, while retaining character-set compatibility.
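The compatibility described above can be observed with Python's built-in shift_jis codec: characters from JIS X 0201 stay single-byte, while JIS X 0208 characters take two bytes. This is a minimal sketch; the function name sjis_byte_lengths is a hypothetical helper introduced for this example.

```python
# Sketch: count how many Shift JIS bytes each character occupies.
# sjis_byte_lengths is an illustrative helper, not a library function.


def sjis_byte_lengths(text):
    """Return the number of Shift JIS bytes used by each character of text."""
    return [len(ch.encode("shift_jis")) for ch in text]


# ASCII letter, half-width katakana (U+FF71), full-width hiragana, kanji
sample = "A\uff71あ漢"

for ch, size in zip(sample, sjis_byte_lengths(sample)):
    print(ch, ch.encode("shift_jis").hex(), size)
```

The first two characters come from JIS X 0201 and encode to one byte each, so equipment that only understands JIS X 0201 still reads them correctly.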

The main alternatives to JIS encoding are EUC (used on UNIX systems, where the JIS encodings are incompatible with POSIX standards) and, more recently, Unicode, particularly in the form of UTF-8.
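Since the same text has a different byte representation in each of these encodings, data usually has to be transcoded when it moves between systems. The following sketch assumes Python's built-in euc_jp and utf-8 codecs; the transcode helper is a hypothetical name used only for this example.

```python
# Sketch: convert bytes from one Japanese encoding to another via Unicode.
# transcode is an illustrative helper, not a standard library function.


def transcode(data, source, target="utf-8"):
    """Decode data from the source encoding and re-encode it in the target."""
    return data.decode(source).encode(target)


text = "あ"

# Show the byte representation of the same character in each encoding.
for codec in ("iso2022_jp", "shift_jis", "euc_jp", "utf-8"):
    print(codec, text.encode(codec).hex())

# Convert EUC-JP bytes, as might come from an older UNIX system, to UTF-8.
legacy = text.encode("euc_jp")
print(transcode(legacy, "euc_jp").hex())
```

Going through Unicode as an intermediate form avoids writing a separate conversion table for every pair of encodings.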

See also