Unicode System


Unicode is an international character encoding standard. It assigns a unique code to every character, so text written in most of the world's languages can be represented with a single scheme.


Why does Java use the Unicode system?

Before Unicode, there were many character encoding standards, each designed for a particular language or region. Some of the well-known ones are given below:

  1. ASCII (American Standard Code for Information Interchange) was introduced for the United States.
  2. ISO 8859-1 was introduced for Western European languages.
  3. KOI-8 was introduced for Russian.
  4. GB18030 and BIG-5 were introduced for Chinese.



These encoding standards caused two problems:

  1. The same code value can stand for different letters in different encoding standards, so a piece of text cannot be read correctly without knowing which standard was used to write it.
  2. Languages with large character sets need variable-length encodings: some characters are encoded as a single byte and others as two or more bytes.
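
Both problems can be observed from Java itself. The following is a minimal sketch, not part of the original text: the class and method names are hypothetical, and it assumes the JDK in use ships the optional KOI8-R and Big5 charsets (only a handful of charsets, such as ISO-8859-1, are guaranteed on every Java platform), so those parts are guarded with Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are chosen for illustration only.
class LegacyEncodingDemo {

    // Decode a single byte with the given charset and return the resulting character.
    static char decodeByte(int value, Charset charset) {
        return new String(new byte[] {(byte) value}, charset).charAt(0);
    }

    // Number of bytes the given charset uses to encode the text.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Problem 1: the same byte value means different letters.
        char western = decodeByte(0xE4, StandardCharsets.ISO_8859_1);
        System.out.printf("ISO-8859-1: U+%04X%n", (int) western); // U+00E4, a with diaeresis

        // Assumption: KOI8-R is an optional charset, so check before using it.
        if (Charset.isSupported("KOI8-R")) {
            char russian = decodeByte(0xE4, Charset.forName("KOI8-R"));
            System.out.printf("KOI8-R:     U+%04X%n", (int) russian); // U+0414, Cyrillic capital De
        }

        // Problem 2: a variable-length encoding.
        // Assumption: Big5 is an optional charset as well.
        if (Charset.isSupported("Big5")) {
            Charset big5 = Charset.forName("Big5");
            System.out.println(encodedLength("A", big5));      // 1 byte for a Latin letter
            System.out.println(encodedLength("\u4E2D", big5)); // 2 bytes for a Chinese character
        }
    }
}
```

The demo prints code points rather than the characters themselves, so the output does not depend on the encoding of the console.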



Unicode was introduced to overcome these problems: it gives every character one code that is the same in every language and on every platform. When Java was designed, Unicode was a fixed-width 16-bit encoding, so the Java char type is also two bytes wide. The lowest char value is \u0000 and the highest is \uFFFF. Unicode has since grown beyond 16 bits (code points now run up to U+10FFFF), and Java represents a character above \uFFFF as a pair of char values, called a surrogate pair.
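
The size and range of char can be checked directly. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes Java 8 or later for Character.BYTES. It also shows that a code point above U+FFFF, which current versions of Unicode define, occupies two char values in a Java String.

```java
// Hypothetical demo class; the names are chosen for illustration only.
class CharRangeDemo {

    // Number of 16-bit char values Java needs to store one Unicode code point.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        // A char occupies two bytes and ranges from 0 to 65535.
        System.out.println(Character.BYTES);           // 2
        System.out.println((int) Character.MIN_VALUE); // 0
        System.out.println((int) Character.MAX_VALUE); // 65535

        // A Unicode escape in a char literal names the character by its code.
        char letterA = '\u0041';
        System.out.println(letterA == 'A');            // true

        // Code points up to U+FFFF fit in one char; higher ones need two.
        System.out.println(charsNeeded(0x0041));       // 1
        System.out.println(charsNeeded(0x1F600));      // 2

        // The String therefore has length 2 but holds a single code point.
        String supplementary = new String(Character.toChars(0x1F600));
        System.out.println(supplementary.length());                                  // 2
        System.out.println(supplementary.codePointCount(0, supplementary.length())); // 1
    }
}
```

Code that must handle arbitrary text is usually safer iterating over code points (for example with String.codePoints()) than over individual char values.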