Chinese character classification


All Chinese characters are logograms, but several different types can be identified, based on the manner in which they are formed or derived. There are a handful which derive from pictographs (象形 pinyin: xiàngxíng) and a number which are ideographic (指事 zhǐshì) in origin, including compound ideographs (會意 huìyì), but the vast majority originated as phono-semantic compounds (形聲 xíngshēng). The other categories in the traditional system of classification are rebus or phonetic loan characters (假借 jiǎjiè) and "derivative cognates" (轉注 zhuǎn zhù). Modern scholars have proposed various revised systems, rejecting some of the traditional categories.

In older literature, Chinese characters in general may be referred to as ideograms, due to the misconception that characters represented ideas directly, whereas in fact they do so only through association with the spoken word.[1]

Traditional classification

Traditional Chinese lexicography divided characters into six categories (六書 liùshū "Six Writings"). This classification is known from Xu Shen's second-century dictionary Shuowen Jiezi, but did not originate there. The phrase first appeared in the Rites of Zhou, though it may not have originally referred to methods of creating characters. When Liu Xin (d. 23 CE) edited the Rites, he glossed the term with a list of six types without examples.[2] Slightly different lists of six types are given in the Book of Han of the first century CE, and by Zheng Zhong (first century CE), as quoted by Zheng Xuan in his commentary on the Rites of Zhou. Xu Shen illustrated each of Liu's six types with a pair of characters in the postface to the Shuowen Jiezi.[3]

The traditional classification is still taught but is no longer the focus of modern lexicographic practice. Some categories are not clearly defined, nor are they mutually exclusive: the first four refer to structural composition, while the last two refer to usage. For this reason, some modern scholars view them as six principles of character formation rather than six types of characters.

The earliest significant, extant corpus of Chinese characters is found on turtle shells and the bones of livestock, chiefly the scapula of oxen, for use in pyromancy, a form of divination. These ancient characters are called oracle bone script. Roughly a quarter of these characters are pictograms while the rest are either phono-semantic compounds or compound ideograms. Despite millennia of change in shape, usage and meaning, a few of these characters remain recognizable to the modern reader of Chinese.

At present, more than 90%[citation needed] of Chinese characters are phono-semantic compounds, constructed out of elements intended to provide clues to both the meaning and the pronunciation. However, as both the meanings and pronunciations of the characters have changed over time, these components are no longer reliable guides to either. Failure to recognize the historical and etymological role of these components often leads to misclassification and folk etymology. A study of the earliest sources (the oracle bone script and the Zhou-dynasty bronze script) is often necessary for an understanding of the true composition and etymology of any particular character. Reconstructing Middle and Old Chinese phonology from the clues present in characters is part of Chinese historical linguistics. In Chinese, it is called Yinyunxue (音韻學 "Studies of sounds and rimes")[citation needed].


Pictograms

Roughly 600[citation needed] Chinese characters are pictograms (象形 xiàng xíng, "form imitation"): stylised drawings of the objects they represent. These are generally among the oldest characters. A few, indicated below with their earliest forms, date back to oracle bones from the twelfth century BCE.

These pictograms became progressively more stylized and lost their pictographic flavor, especially as they made the transition from the oracle bone script to the Seal Script of the Eastern Zhou, but also to a lesser extent in the transition to the clerical script of the Han Dynasty. The table below summarises the evolution of a few Chinese pictographic characters. Where no modern simplified form is provided, it is identical to the traditional character.

(The oracle bone, seal, clerical, semi-cursive and cursive forms were shown as images and are not reproduced here.)

Regular Script (Traditional)   Regular Script (Simplified)   Pinyin   Meaning
日   日   rì     Sun
月   月   yuè    Moon
山   山   shān   Mountain
水   水   shuǐ   Water
雨   雨   yǔ     Rain
木   木   mù     Wood
禾   禾   hé     Rice Plant
人   人   rén    Person
女   女   nǚ     Woman
母   母   mǔ     Mother
目   目   mù     Eye
牛   牛   niú    Cow
羊   羊   yáng   Goat
馬   马   mǎ     Horse
鳥   鸟   niǎo   Bird
龜   龟   guī    Tortoise
龍   龙   lóng   Chinese Dragon
鳳   凤   fèng   Chinese Phoenix


  • 水 shuǐ "water" represents the lines of a flowing river.

Simple ideograms

Ideograms (指事 zhǐ shì, "indication") express an abstract idea through an iconic form, including iconic modification of pictographic characters. In the examples below, low numerals are represented by the appropriate number of strokes, directions by an iconic indication above and below a line, and the parts of a tree by marking the appropriate part of a pictogram of a tree.


Character   一     二     三      上      下      本     末
Pinyin      yī    èr    sān    shàng  xià    běn   mò
Gloss       one   two   three  up     below  root  apex

  • 本 běn "root" - a tree (木 mù) with the base indicated by an extra stroke.
  • 末 mò "apex" - the reverse of 本 (běn), a tree with the top highlighted by an extra stroke.

Compound ideographs

Compound ideographs (會意 huì yì, "joined meaning"), also called associative compounds or logical aggregates, are compounds of two or more pictographic or ideographic characters to suggest the meaning of the word to be represented. In the postface to the Shuowen Jiezi, Xu Shen gave two examples:[3]

  • 武 "military", formed from 戈 "dagger-axe" and 止 "foot"
  • 信 "truthful", formed from 人 "person" (later reduced to 亻) and 言 "speech"

Other characters commonly explained as compound ideographs include:

  • 林 lín "grove", composed of two trees[4]
  • 森 sēn "forest", composed of three trees[5]
  • 休 xiū "shade, rest", depicting a man by a tree[6]
  • 采 cǎi "harvest", depicting a hand on a bush (later written 採)[7]

Many characters formerly classed as compound ideographs are now believed to have been mistakenly identified. For example, Xu Shen's example 信, representing the word xìn < *snjins "truthful", is now usually considered a phono-semantic compound, with 人 rén < *njin as phonetic and 言 "speech" as signific.[2][8] In many cases, reduction of a character has obscured its original phono-semantic nature. For example, the character 明 "bright" is often presented as a compound of 日 "sun" and 月 "moon". However, this form is probably a simplification of an attested alternative form 朙, which can be viewed as a phono-semantic compound.[9]

Peter Boodberg and William Boltz have argued that no ancient characters were compound ideographs. Boltz accounts for the remaining cases by suggesting that some characters could represent multiple unrelated words with different pronunciations, as in Sumerian cuneiform and Egyptian hieroglyphs, and that the compound characters are actually phono-semantic compounds based on an alternative reading that has since been lost. For example, the character 安 ān < *ʔan "peace" is often cited as a compound of 宀 "roof" and 女 "woman". Boltz speculates that the character 女 could represent both the word nǚ < *nrjaʔ "woman" and the word ān < *ʔan "settled", and that the roof signific was later added to disambiguate the latter usage. In support of this second reading, he points to other characters with the same 女 component that had similar Old Chinese pronunciations: 妟 yàn < *ʔrans "tranquil", 奻 nuán < *nruan "to quarrel" and 姦 jiān < *kran "licentious".[10] Other scholars reject these arguments for alternative readings and consider other explanations of the data more likely, for example viewing 妟 as a reduced form of 晏, which can be analysed as a phono-semantic compound with 安 as phonetic. They consider the characters 奻 and 姦 to be implausible phonetic compounds, both because the proposed phonetic and semantic elements are identical and because the widely differing initial consonants *ʔ- and *n- would not normally be accepted in a phonetic compound.[11] Notably, Christopher Button has shown how more sophisticated palaeographical and phonological analyses can account for Boodberg's and Boltz's proposed examples without relying on polyphony.[12]

While compound ideographs are a limited source of Chinese characters, they form many of the kokuji created in Japan to represent native words. Examples include:

  • 働 hatara(ku) "to work", formed from 人 "person" and 動 "move"
  • 峠 tōge "mountain pass", formed from 山 "mountain", 上 "up" and 下 "down"

As Japanese creations, such characters had no Chinese or Sino-Japanese readings, but a few have been assigned invented Sino-Japanese readings. For example, the common character 働 has been given the reading dō (taken from 動), and has even been borrowed into written Chinese in the 20th century with the reading dòng.[13]

Rebus (phonetic loan) characters

Jiajie (假借 jiǎjiè, "borrowing; making use of") are characters that are "borrowed" to write another homophonous or near-homophonous morpheme. For example, the character 來 was originally a pictogram of a wheat plant and meant *mlək "wheat". As this was pronounced similarly to the Old Chinese word *lai "to come", 來 was also used to write this verb. Eventually the more common usage, the verb "to come", became established as the default reading of the character 來, and a new character 麥 was devised for "wheat". (The modern pronunciations are lái and mài.) When a character is used as a rebus in this way, it is called a jiajiezi 假借字 (lit. "loaned and borrowed character"; in Wade-Giles "chia-chie" or "chia-chieh"), translatable as "phonetic loan character" or "rebus character".

As in Egyptian hieroglyphs and Sumerian cuneiform, early Chinese characters were used as rebuses to express abstract meanings that were not easily depicted. Thus many characters stood for more than one word. In some cases the extended use would take over completely, and a new character would be created for the original meaning, usually by modifying the original character with a radical (determinative). For instance, 又 yòu originally meant "right hand; right" but was borrowed to write the abstract word yòu "again; moreover". In modern usage, the character 又 exclusively represents yòu "again" while 右, which adds the "mouth radical" 口 to 又, represents yòu "right". This process of graphic disambiguation is a common source of phono-semantic compound characters.

Examples of jiajie

Pictograph or ideograph   Rebus word         Original word                New character for original word
四                        sì "four"          sì "nostrils"                泗 (mucous; sniffle)
枼                        yè "flat, thin"    yè "leaf"                    葉
北                        běi "north"        bèi "back (of the body)"     背
要                        yào "to want"      yāo "waist"                  腰
少                        shǎo "few"         shā "sand"                   沙 and 砂
永                        yǒng "forever"     yǒng "swim"                  泳

While the word jiajie dates from the Han Dynasty, the related term tongjia (通假 tōngjiǎ "interchangeable borrowing") is first attested from the Ming Dynasty. The two terms are commonly used as synonyms, but there is a linguistic distinction: a jiajiezi is a phonetic loan character for a word that did not originally have a character, such as using 東 "a bag tied at both ends" for dōng "east", whereas a tongjia is an interchangeable character used for an existing homophonous character, such as using 蚤 zǎo "flea" for 早 zǎo "early".

According to Bernhard Karlgren, "One of the most dangerous stumbling-blocks in the interpretation of pre-Han texts is the frequent occurrence of [jiajie], loan characters."[14]

Phono-semantic compound characters

  • 形聲 xíng shēng "form and sound" or 諧聲 xié shēng "sound agreement"

These are often called radical-phonetic characters. They form by far the majority of Chinese characters (over 90%) and were created by combining a rebus with a determinative: that is, a character with approximately the correct pronunciation (the phonetic element, similar to a phonetic complement) was combined with one of a limited number of determinative characters that supplied an element of meaning (the semantic element, which in most cases is also the radical under which the character is listed in a dictionary). As in ancient Egyptian writing, such compounds eliminated the ambiguity caused by phonetic loans (above).

Most often, the radical is on one side (often the left), while the phonetic is on the other side (often the right), as in 沐 = 氵 "water" + 木 mù. Also common is for the semantic and phonetic elements to be stacked on top of each other, as in 菜 = 艹 "plant" + 采 cǎi. More rarely, the phonetic may be placed inside the semantic, as in 園 = 囗 "enclosure" + 袁, or 街 = 行 "go, movement" + 圭. More complicated combinations also exist, such as 勝 = 力 "strength" + 朕, where the semantic is in the lower-right quadrant, and the phonetic is the other three quadrants.

This process can be repeated, with a phono-semantic compound character itself being used as a phonetic in a further compound, which can result in quite complex characters, such as 劇 (豦 = 虍 + 豕, 劇 = 刂 + 豦).
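The nesting described above can be sketched as a small recursive data structure. This is only an illustrative model, not an established tool: the class name `Char` and its methods are hypothetical, and the components are left unlabelled because the text gives only the breakdown 劇 = 刂 + 豦 and 豦 = 虍 + 豕.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Char:
    """Hypothetical model: a glyph plus the components it is built from."""
    glyph: str
    parts: Tuple["Char", ...] = ()  # empty tuple means a simple character

    def expand(self) -> str:
        """Write out the full breakdown, nesting compound components."""
        if not self.parts:
            return self.glyph
        inner = " + ".join(part.expand() for part in self.parts)
        return f"{self.glyph}[{inner}]"

    def depth(self) -> int:
        """Number of compounding steps needed to reach this character."""
        if not self.parts:
            return 0
        return 1 + max(part.depth() for part in self.parts)


# The example from the text: 豦 = 虍 + 豕, then 劇 = 刂 + 豦.
inner_compound = Char("豦", (Char("虍"), Char("豕")))
outer_compound = Char("劇", (Char("刂"), inner_compound))

print(outer_compound.expand())
print(outer_compound.depth())
```

A tuple of parts rather than fixed "semantic" and "phonetic" fields keeps the sketch neutral about the role of each component at the inner level, which the text does not specify.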


As an example, a verb meaning "to wash oneself" is pronounced mù. The action is difficult to depict, but the word happens to sound the same as mù "tree", which was written with the simple pictograph 木. The verb could simply have been written 木, like "tree", but to disambiguate, it was combined with the character for "water", giving some idea of the meaning. The resulting character eventually came to be written 沐 "to wash one's hair". Similarly, the water determinative was combined with 林 lín "woods" to produce the water-related homophone 淋 lín "to pour".

Determinative   Rebus    Compound
氵 "water"      木 mù    沐 mù "to wash oneself"
氵 "water"      林 lín   淋 lín "to pour"

However, the phonetic component is not always as meaningless as this example would suggest. Rebuses were sometimes chosen that were compatible semantically as well as phonetically. It was also often the case that the determinative merely constrained the meaning of a word which already had several. 菜 cài "vegetable" is a case in point. The determinative 艹 for plants was combined with 采 cǎi "harvest". However, 采 cǎi does not merely provide the pronunciation. In classical texts it was also used to mean "vegetable". That is, 采 underwent semantic extension from "harvest" to "vegetable", and the addition of 艹 merely specified that the latter meaning was to be understood.

Determinative   Rebus                          Compound
艹 "plant"      采 cǎi "harvest, vegetable"    菜 cài "vegetable"

Some additional examples:

Determinative       Rebus     Compound
扌 "hand"           白 bái    拍 pāi "to clap, to hit"
穴 "to dig into"    九 jiǔ    究 jiū "to investigate"
日 "sun"            央 yāng   映 yìng "reflection"

Sound change

Originally characters sharing the same phonetic had similar readings, though they have now diverged substantially. Linguists rely heavily on this fact to reconstruct the sounds of Old Chinese – see historical Chinese phonology. Contemporary foreign pronunciations (Sino-Xenic pronunciations) of characters are also used to reconstruct older Chinese, chiefly Middle Chinese.

When people try to read a two-part character of which they are ignorant, they will typically follow the folk wisdom of you bian du bian (有邊讀邊) "read the side" and take one component to be a phonetic, which often results in errors.
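How often the "read the side" guess fails can be illustrated with a minimal sketch. It takes the phonetic component's modern Mandarin reading as the guess and compares it with the compound's actual reading. The function names `strip_tones` and `judge` and the small `EXAMPLES` list are assumptions made for illustration; the readings are standard Mandarin pinyin for compounds discussed in this article.

```python
import unicodedata

# Combining macron, acute, caron and grave: the four pinyin tone marks.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}

# (compound, actual reading, phonetic component, reading of the phonetic)
EXAMPLES = [
    ("沐", "mù", "木", "mù"),
    ("淋", "lín", "林", "lín"),
    ("菜", "cài", "采", "cǎi"),
    ("拍", "pāi", "白", "bái"),
    ("究", "jiū", "九", "jiǔ"),
    ("映", "yìng", "央", "yāng"),
]


def strip_tones(pinyin: str) -> str:
    """Remove tone marks but keep other diacritics such as the dots on ü."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def judge(actual: str, guess: str) -> str:
    """Classify a 'read the side' guess against the actual reading."""
    actual = unicodedata.normalize("NFC", actual)
    guess = unicodedata.normalize("NFC", guess)
    if actual == guess:
        return "exact"
    if strip_tones(actual) == strip_tones(guess):
        return "tone differs"
    return "wrong"


for compound, actual, phonetic, guess in EXAMPLES:
    print(compound, actual, "guessed from", phonetic, guess, "->", judge(actual, guess))
```

Even in this hand-picked list, only two of the six guesses are exact, two miss the tone, and two give a different syllable altogether.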


Since the phonetic elements of many characters no longer accurately represent their pronunciations, when the People's Republic of China simplified characters, they often substituted a phonetic that was not only simpler to write, but more accurate for a modern reading in Mandarin as well.[citation needed] This has sometimes resulted in forms which are less phonetic than the original ones in varieties of Chinese other than Mandarin. (Note for the example that many determinatives were simplified as well, usually by standardizing cursive forms.)

Traditional character
Determinative   Rebus     Compound
金 "gold"       童 tóng   鐘 zhōng "bell"

Simplified character
New rebus   New compound
中 zhōng    钟 zhōng "bell"

Derivative cognates

The derivative cognate (轉注 zhuǎn zhù, "reciprocal meaning") is the smallest category and also the least understood.[15] In the postface to the Shuowen Jiezi, Xu Shen gave as an example the characters 考 kǎo "to verify" and 老 lǎo "old", which had similar Old Chinese pronunciations (*khuʔ and *C-ruʔ respectively[16]) and may have had the same etymological root, meaning "elderly person", but became lexicalized into two separate words. The term does not appear in the body of the dictionary, and may have been included in the postface out of deference to Liu Xin.[17] It is often omitted from modern systems.

Modern classifications

The liushu had been the standard classification scheme for Chinese characters since Xu Shen's time. Generations of scholars modified it without challenging the basic concepts. Tang Lan (唐蘭) (1902–1979) was the first to dismiss liushu, offering his own sanshu (三書 "Three Principles of Character Formation"), namely xiangxing (象形 "form-representing"), xiangyi (象意 "meaning-representing") and xingsheng (形聲 "meaning-sound"). This classification was later criticised by Chen Mengjia (1911–1966) and Qiu Xigui. Both Chen and Qiu offered their own sanshu.[18]

References


  1. Hansen 1993.
  2. Sampson & Chen 2013, p. 261.
  3. Wilkinson 2013, p. 35.
  4. Qiu 2000, pp. 54, 198.
  5. Qiu 2000, p. 198.
  6. Qiu 2000, pp. 209–211.
  7. Qiu 2000, pp. 188, 226, 255.
  8. Qiu 2000, p. 155.
  9. Sampson & Chen 2013, p. 264.
  10. Boltz 1994, pp. 106–110.
  11. Sampson & Chen 2013, pp. 266–267.
  12. Button 2010.
  13. Seeley 1991, p. 203.
  14. Karlgren 1968, p. 1.
  15. Norman 1988, p. 69.
  16. Baxter 1992, pp. 771, 772.
  17. Sampson & Chen 2013, pp. 260–261.
  18. Qiu 2000, ch. 6.3.
  • This page draws heavily on the French Wikipedia page Classification des sinogrammes, retrieved 12 April 2005.
  • Baxter, William H. (1992), A Handbook of Old Chinese Phonology, Berlin: Mouton de Gruyter, ISBN 978-3-11-012324-1.
  • Boltz, William G. (1994), The origin and early development of the Chinese writing system, New Haven: American Oriental Society, ISBN 978-0-940490-78-9.
  • Button, Christopher (2010), Phonetic Ambiguity in the Chinese Script: A Palaeographical and Phonological Analysis, Munich: Lincom Europa, ISBN 978-3-89586-632-6.
  • DeFrancis, John (1984), The Chinese Language: Fact and Fantasy, Honolulu: University of Hawaii Press, ISBN 978-0-8248-1068-9.
  • —— (1989), Visible Speech: The Diverse Oneness of Writing Systems, Honolulu: University of Hawaii Press, ISBN 978-0-8248-1207-2.
  • Karlgren, Bernhard (1968), Loan Characters in Pre-Han Texts, Stockholm: Museum of Far Eastern Antiquities.
  • Norman, Jerry (1988), Chinese, Cambridge: Cambridge University Press, ISBN 978-0-521-29653-3.
  • Qiu, Xigui (2000), Chinese writing, trans. by Gilbert L. Mattos and Jerry Norman, Berkeley: Society for the Study of Early China and The Institute of East Asian Studies, University of California, ISBN 978-1-55729-071-7. (English translation of Wénzìxué Gàiyào 文字學概要, Shangwu, 1988.)
  • Sampson, Geoffrey; Chen, Zhiqun (2013), "The reality of compound ideographs" (PDF), Journal of Chinese Linguistics, 41: 255–272.
  • Seeley, Christopher (1991), A History of Writing in Japan, BRILL, ISBN 978-90-04-09081-1.
  • Wang, Hongyuan 王宏源 (1993), The Origins of Chinese characters, Beijing: Sinolingua, ISBN 978-7-80052-243-7.
  • Wilkinson, Endymion (2013), Chinese History: A New Manual, Harvard-Yenching Institute Monograph Series, Cambridge, MA: Harvard University Asia Center, ISBN 978-0-674-06715-8.
  • Woon, Wee Lee 雲惟利 (1987), 漢字的原始和演變, Macau: University of Macau.
