Xiao’erjing—Writing Chinese with Arabic Letters: An Introduction

Written by Iskandar Ding

A writing system is not merely a means of recording sounds and ideas, but itself a carrier of religious and cultural values.

In Europe, for example, the Latin script is strongly associated with Catholic and Protestant cultures, while Cyrillic most often appears in societies where Orthodox Christianity is predominant. Jewish communities in Eurasia, for their part, are known for writing their various mother tongues—whether of Germanic, Romance, Slavic, or Iranian derivation—in an ancestral Hebrew script. Moreover, a community’s religious conversion is frequently accompanied by a corresponding change of script. During the Christianization of Scandinavia between the 8th and 12th centuries, runes were gradually relinquished and replaced by a Latin alphabet affiliated with the new faith.

Indeed, this practice is near-universal, with the Islamic world no exception. The conquest of the Sasanian Empire by Arab Muslims in the 7th century gave rise to the first non-Arabic literary language to be written in the Arabic script: New Persian. Since then, Muslim communities in many parts of the world have developed Arabic-based scripts for their native tongues, among which the most famous are those of Iranian languages such as Kurdish and Pashto, Turkic languages including Ottoman and Chaghatay, and Indo-Aryan languages like Urdu and Punjabi, not to mention Classical Malay. Perso-Arabic scripts less familiar to the general public include, to name just a few, those of Mozarabic (a medieval variety of Spanish), Bosnian, the Belarusian of Lipka Tatars, Chechen, and Wolof.

Even less familiar to many is the Arabic script used to write Chinese.

The Sino-Arabic script in question is usually called 小儿经 xiǎo’érjīng, literally meaning ‘the small children’s canon,’ although the near-homophonous 小儿锦 xiǎo’érjǐn ‘small children’s brocade,’ as well as 小经 xiǎojīng ‘small canon’ and 消经 xiāojīng, can also be found. The last of these, 消经 xiāojīng, is thought to be the script’s original moniker, where 消 xiāo, literally ‘to digest,’ suggests using Chinese as an expository aid in one’s study of the Qurʾān, or religious canon (经 jīng). The rhotic element ér in Xiao’erjing would thus simply be a phonological addition of the sort common to northern Chinese dialects, i.e., one not intended to convey the semantic content of the character 儿 ér ‘child.’

This etymology notwithstanding, the educational imperative to explain the Islamic scriptures via Chinese should not be understood as the sole motivation for writing the Chinese language with the Arabic script.

Chinese-language poem written in Xiao’erjing, as found in the Rashḥāt (热什哈尔), a hagiography of the 18th-century Hui notable Wiqāyatullāh Ibrāhīm Ma, also known as Ma Mingxin (马明心).

The primary users of Xiao’erjing are the Hui, who, according to the latest official Chinese census results from 2020, are the second most populous Muslim ethnicity (minzu) in China after the Uyghurs. In previous statistics, they were found to be the single largest. A Sinophone ethnic group in our own day, the Hui are an exogenous people whose ancestors came to China from lands further west, mainly during the Mongol Yuan Dynasty (1271-1368), and spoke a variety of languages, the most prominent of which was Persian. These predominantly Central and West Asian Muslims gradually Sinicized, especially during the post-Mongol Ming Dynasty (1368-1644), to become the Hui of today.

Despite this Sinicization, a strong ethno-religious Hui identity distinct from that of the broader Chinese population meant that literacy among these Sino-Muslim communities continued to be associated with the (Perso-)Arabic rather than the Chinese script, even after their shift to using the Chinese language. Any Hui educated in Chinese schools would, of course, also have mastered the Chinese script, but a majority of Hui in pre-modern China acquired literacy—if at all—through a religiously infused madrasa education (经堂教育 jīngtáng jiàoyù) and would therefore have been more familiar with the Arabic writing system than with Chinese characters. The Dungans, or Chinese-speaking Muslims of ex-Soviet Central Asia, are also known to have used this Sino-Arabic script for written communication before transitioning to the Cyrillic they use today.

Xiao’erjing is, at its core, a phonetic writing system which represents the phonemes of Chinese using adapted Arabic letters. In its phonetic aspect, Xiao’erjing is thus akin to pinyin, the system commonly used to write Mandarin Chinese in Latin letters, though differing notably in that tones are not explicitly marked. This lack of tone markings may be cause for confusion, given the vast repertoire of homophones in Chinese. In a given semantic context, however, native users of this writing system rarely encounter ambiguity, just as an experienced reader of Arabic or Persian has little difficulty inferring the short vowels of a given word despite the absence of diacritics from most texts.

Like other Arabic-based writing systems, Xiao’erjing has additional letters to represent sounds unique to Chinese and uses extant Arabic letters in idiosyncratic ways. The Persian inventions پ and چ are unsurprisingly used for the phonemes [IPA: pʰ] (pinyin: p) and [ʈʂʰ] (ch) respectively, while ژ, which is [ʒ] in Persian, is certainly a more faithful representation of the Mandarin consonant [ʐ] than the r used in pinyin. The Persian گ [g] is absent, and ق is used instead for the sound represented by the pinyin g, as one might expect given that the use of گ was rare in pre-modern Persian manuscripts and standardized only much later.

Xiao’erjing innovations include the letter ݣ for [tɕ] (j) and ڞ for [ʦʰ] (c). Some manuscripts employ an additional letter ٿ for [tɕʰ] (q), although this phoneme is more commonly represented by کِ. Examples of idiosyncratic usage of extant Arabic letters include ثِ for [ɕ] (x), ح for [x] (h) before the vowel e, خ for [x] (h) before the vowel u, ص for [s] (s) before the vowel u, ط for [tʰ] (t) before the vowel u, ظ for [ts] (z) before the vowel u, and ء for the initial semi-vowel [j] (y).
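The correspondences just listed can be collected into a small lookup table. The following is only a minimal sketch: the names GRAPHEMES and grapheme_for are hypothetical, the table covers just the letters mentioned in this article, and it is in no way a complete Xiao’erjing orthography.

```python
from typing import Optional

# (pinyin initial, following vowel) -> Xiao'erjing grapheme.
# A vowel of None means "no vowel condition is stated in the article".
# Only correspondences named in the article are included (an assumption
# of this sketch: every unlisted combination is simply treated as unknown).
GRAPHEMES = {
    ("p", None): "پ",
    ("ch", None): "چ",
    ("r", None): "ژ",
    ("g", None): "ق",
    ("j", None): "ݣ",
    ("c", None): "ڞ",
    ("q", None): "کِ",  # the more common choice; some manuscripts use a separate letter
    ("x", None): "ثِ",
    ("h", "e"): "ح",
    ("h", "u"): "خ",
    ("s", "u"): "ص",
    ("t", "u"): "ط",
    ("z", "u"): "ظ",
    ("y", None): "ء",
}


def grapheme_for(initial: str, vowel: str) -> Optional[str]:
    """Return the grapheme for a pinyin initial, preferring a vowel-specific entry.

    Returns None when the article gives no correspondence for this combination.
    """
    specific = GRAPHEMES.get((initial, vowel))
    if specific is not None:
        return specific
    return GRAPHEMES.get((initial, None))
```

Keying the table on both the initial and the following vowel reflects the point that several letters are chosen according to the vowel that follows, so a lookup such as grapheme_for("h", "a") deliberately returns None rather than guessing.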

The Arabic ḥarakāt or vowel markings are almost always present in Xiao’erjing texts, in order to distinguish among the rich inventory of Chinese vowels. It is worth mentioning that the velar nasal [ŋ] shares the same graphemes (either the tanwīn or the letter ن) with the dental nasal [n]. This reflects the merger of the two phonemes in the Northwestern Mandarin dialect communities that invented Xiao’erjing, and explains why xiao(’er)jing, as mentioned above, may also be called xiao’erjin. Apart from these basic points, Xiao’erjing has many orthographical rules and conventions which exceed the scope of this article.

Hand-written notes on quotations from Mao Zedong, by 20th-century Hui ahong (imam) He Dejiang, incorporating a mix of Xiao’erjing and Chinese characters.

Excerpt from a bilingual Qurʾān (Surah al-Ikhlas), featuring a translation of the original Arabic text into Chinese via both Xiao’erjing and Chinese characters.

Beyond these phonological adaptations and idiosyncrasies, it must be noted that Xiao’erjing is not merely a phonetic writing system. In addition to its readily apparent phonetic function, the script also actively employs Perso-Arabic heterograms, i.e., Persian or Arabic words written just as they are in their languages of origin, but read out as would be their Chinese equivalents, much like the kunyomi reading of Sino-Japanese characters (kanji).

For example, the translation of the Qurʾān into Xiao’erjing by the Hui scholar Ma Zhenwu (马振武) renders مَالِكِ يَوْمِ الدِّينِ māliki yawmi-d-dīn (1:4) ‘Sovereign of the Day of Recompense’ as جیِ جِاۤنْ خُوَنْ بَوْ يَوْمُ زِ دِ جُو (jī jān ẖuwan baw yawmu zi di jū). This corresponds to the Chinese 执掌还报日子的主 zhízhǎng huánbào rìzi de zhǔ, where the Arabic یوم yawm ‘day’ is written out just as it would be in Arabic but represents, and should be read as, the Chinese ‘day.’

The form of Chinese used by the Hui also incorporates Persian and Arabic loanwords, especially for concepts specific to Islamic culture, such as 穆民 mùmín for the Arabic مؤمن muʾmin ‘believer’ and 乃麻子 nǎimázi for the Persian نماز namāz ‘prayer.’ However, we know that the heterograms referenced above are not themselves loanwords, because Arabic or Persian terms employed as heterograms may also be read merely for the phonetic value of their Chinese translations, rather than for both their semantic and phonetic components.

In other words, homophones in Chinese may share the same Arabic or Persian heterogram. For example, once the Arabic تمام tamām ‘complete’ has been used for its Chinese translation 全 quán ‘complete,’ it becomes the graphemic representation of the syllable quán regardless of its meaning. It can thus, among other things, be used for the homophonous 泉 quán ‘spring, stream,’ whence تمام ياً tamām yan for the reading 泉眼 quányǎn ‘source’ (lit. ‘spring-eye’).
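The homophone principle can be illustrated with a toy lookup. This is a sketch under stated assumptions: the names HETEROGRAM_SYLLABLE, HOMOPHONES, and possible_characters are hypothetical, and the data contain only the single example given above rather than any real dictionary.

```python
from typing import Dict, List

# Heterogram -> the Chinese syllable it has come to stand for.
# Only the article's one example is included.
HETEROGRAM_SYLLABLE: Dict[str, str] = {"تمام": "quán"}

# Chinese words sharing that syllable; a tiny illustrative sample.
HOMOPHONES: Dict[str, List[str]] = {"quán": ["全", "泉"]}


def possible_characters(heterogram: str) -> List[str]:
    """Return every Chinese character the heterogram may stand for.

    The lookup goes through the syllable, not the meaning, which is why
    a word meaning 'complete' can also write the word for 'spring'.
    """
    syllable = HETEROGRAM_SYLLABLE[heterogram]
    return HOMOPHONES[syllable]
```

Routing the lookup through the syllable rather than through the meaning is the whole point of the sketch: once a heterogram is tied to a sound, the reader resolves the intended word from context.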

Even more peculiarly, an Arabic or Persian heterogram may also substitute for just one part of a Chinese syllable-word. In the example هِ ماه, which is used for 靴 xuē ‘boot,’ the heterogram ماه (māh, Persian for ‘moon’) represents the vowel component of xuē, i.e., ue, which is nearly homophonous with the Chinese word for moon, yuè, while the syllable’s initial consonant x is, for its part, represented phonetically by the letter هِ.

Xiao’erjing in practice
The following poem is found in the Rashḥāt (热什哈尔), a hagiography of 18th-century Hui notable Wiqāyatullāh Ibrāhīm Ma, also known as Ma Mingxin (马明心), founder of the Jahriyya Sufi order (menhuan), a branch of the Naqshbandiyya order.

The original Xiao’erjing text is presented below, along with its transliteration into Chinese characters and a translation into English:

فِ وقت خُوُآنْ ءٍ آمد شَنْ سِی
تِی کِه لِیآنْ جُوَا بُ جَنْ نِی
بُ پَا جِنْ جُو ژِیَ نَوْ نُو
لِیآنْ بَآنْ شُوِی شَ وًا اسب تِ

非时黄鹰来陕西
提起两爪不粘泥
不怕真主惹恼怒
两膀摔折万马蹄

Untimely, the golden eagle comes to Shaanxi,
Lifting its two claws so as not to stain them with mud.
The eagle fears not to provoke the wrath of Allah;
Its two wings are trampled by ten thousand horse-hooves.

Explanation

The Xiao’erjing text above contains three heterograms, i.e., Arabic or Persian words which are meant to be pronounced according to their Chinese translations rather than their own ostensible phonetic values:

  • وقت = waqt (Ar., time) -> 时 = shí (Ch., time)
  • آمد = āmad (Fa., come) -> 来 = lái (Ch., come)
  • اسب = asb (Fa., horse) -> 马 = mǎ (Ch., horse)

The poem can be transliterated phonetically as follows, with the Perso-Arabic heterograms appearing in bold:

fi waqt ẖuwuʾān ʾin āmad shansi
tī kih liyʾān juwā bu jan nī
bu pā jin jū zhiya naw nū
liyʾān baʾān shuwī sha wan asb ti

According to Xiao’erjing usage, the entire text would be read in Chinese, including the Perso-Arabic heterograms. In standard contemporary Mandarin pronunciation, this can be rendered via pinyin as:

Fēi shí huáng yīng lái Shǎnxī
Tí qǐ liǎng zhǎo bù zhān ní
Bú pà Zhēnzhǔ rě nǎo nù
Liǎng bǎng shuāi zhé wàn mǎ tí
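The reading step described above, in which heterograms are swapped for their Chinese readings while phonetically written syllables are left alone, can be sketched as a simple token substitution over the transliteration. The names HETEROGRAM_READINGS and read_line are hypothetical, and the sketch makes no attempt to convert the remaining romanized syllables into pinyin.

```python
import unicodedata

# Romanized heterogram -> Chinese reading, from the three heterograms
# identified in the explanation of the poem.
HETEROGRAM_READINGS = {
    "waqt": "shí",  # Arabic 'time'   -> 时
    "āmad": "lái",  # Persian 'come'  -> 来
    "asb": "mǎ",    # Persian 'horse' -> 马
}

# Normalize keys so precomposed and combining diacritics compare equal.
_READINGS_NFC = {
    unicodedata.normalize("NFC", key): value
    for key, value in HETEROGRAM_READINGS.items()
}


def read_line(transliteration: str) -> str:
    """Replace each heterogram token with its Chinese reading.

    Tokens that are not heterograms are returned unchanged.
    """
    tokens = unicodedata.normalize("NFC", transliteration).split()
    return " ".join(_READINGS_NFC.get(token, token) for token in tokens)


print(read_line("fi waqt ẖuwuʾān ʾin āmad shansi"))
```

Applied to the first line of the poem, the substitution turns waqt into shí and āmad into lái, which matches the corresponding positions in the pinyin rendering.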

Xiao’erjing’s liberal interpretation of Chinese phonology and heavy reliance on Perso-Arabic loanwords and heterograms are telltale signs of a historical reality hardly imaginable today: many Hui communities were, in certain key respects, substantively closer to the Perso-Arabic literary-cultural space than to the Chinese, all while communicating amongst themselves in a Sinitic language.

The script is a vital piece of cultural heritage that testifies to the flow of population, and, along with it, culture, from West and Central Asia to China. However, the categorization of the available textual evidence and the research on it are still at an early stage. Currently, the three-volume Xiao’erjin Yanjiu 小儿锦研究 (‘A Study in Xiao’erjin,’ 2013, Lanzhou University Press) by the Chinese scholar Liu Yingsheng (刘迎胜) is the most authoritative reference on this subject and should be consulted by those interested in learning more.

It is the author’s hope that this unique writing system will attract wider attention.


Disclaimer: The views expressed in this article are solely those of the author and do not purport to reflect the position of the Sino-Arabica Project.

Editor: Raphael Angieri

Special thanks to Sama Aziz, Fourat El Khoury, and Mateja Lazarević.

All images belong to the author himself.

Xiao’erjing poem translated into English by Raphael Angieri.

