important book on Pinyin to be excerpted on this site

Xīnhuá Pīnxiě Cídiǎn (《新华拼写词典》 / 《新華拼寫詞典》) is the second of Yin Binyong’s two books on Pinyin orthography. The first, Chinese Romanization: Pronunciation and Orthography, is in English and Mandarin; much of it is already available here on Pinyin.Info.

Although Xinhua Pinxie Cidian is only in Mandarin, the large number of examples makes it easy to get the point even if you do not read Mandarin in Chinese characters very well.

This week I will begin posting some excerpts from this invaluable work. What’s more, I have made a version in traditional Chinese characters, which I hope that readers in Taiwan, Hong Kong, and elsewhere will take advantage of. So those not used to reading simplified Chinese characters will have a choice (which is more than the government of Taiwan is providing these days).

I’m extremely happy to be able to bring you this information and wish to acknowledge the generosity of the Commercial Press. Stay tuned.

ɑ vs. a

[Image: the rounded 'ɑ' and the normal 'a', illustrated with the word 'Hanyu' (with tone marks)]

About a year ago (which is roughly how overdue this post is), a commenter noted that some Chinese publishers “are convinced that Pinyin must be printed with ɑ (single-story „Latin alpha“, as opposed to double-story a), and with ɡ (single story; not double story g).”

But does Hanyu Pinyin in fact call for this longstanding Chinese habit of bad typography? This was one of the first questions I asked of Zhou Youguang, the father of Hanyu Pinyin, when I met with him: Are those who insist upon the ɑ-style letter correct?

“Oh, no,” Zhou replied. “That ‘ɑ’ is just for babies!” And he laughed that wonderful laugh of his that no doubt has contributed to his remarkable longevity.

Zhou was referring to the fact that the “ɑ” style of letter is usually found specifically in books for infants … and that this style generally does not belong elsewhere. In fact, ɑ and ɡ (written thusly, as opposed to g) are often referred to as infant characters. A variant of the letter y is sometimes included in this set.

Letters in that style are also found in the West — but almost always in books for toddlers, and often not even in those. Furthermore, even in those cases the use of such letters appears to have no positive effect on children’s reading.

The correct-style letters for Pinyin are the same as those for English, Zhou stated.

I hope that anyone who has been using “ɑ” will both officially and in practice switch to “a”. It’s long past time that the supposed rule calling for “ɑ” was treated as a dead letter.

Long live good typography!

-r endings, their pronunciation, and Pinyin spelling

Arrr! In recognition of International Talk Like a Beijinger Pirate Day, here are the rules for how to spell those -r endings in Hanyu Pinyin and how those endings affect the pronunciation of syllables. In many cases, it’s more complicated than just adding an -r sound at the end of the standard syllable.

This information is from Yin Binyong’s Chinese Romanization: Pronunciation and Orthography. The full section from this book is available in PDF form: r- Suffixed Syllables.

Written form | Actual pronunciation
-ar (mǎr, horse) | -ar (mǎr)
-air (gàir, lid) | -ar (gàr)
-anr (pánr, plate) | -ar (pár)
-aor (bāor, bundle) | -aor (bāor)
-angr (gāngr, jar) | -ãr (gãr)
-or (mòr, dust) | -or (mòr)
-our (hóur, monkey) | -our (hóur)
-ongr (chóngr, insect) | -õr (chõr)
-er (gēr, song) | -er (gēr)
-eir (bèir, back) | -er (bèr)
-enr (ménr, door) | -er (mér)
-engr (dēngr, lamp) | -ẽr (dẽr)
-ir* (zìr, Chinese character) | -er (zèr)
-ir (mǐr, rice) | -ier (mǐer)
-iar (xiár, box) | -iar (xiár)
-ier (diér, saucer) | -ier (diér)
-iaor (niǎor, bird) | -iaor (niǎor)
-iur (qiúr, ball) | -iour (qióur)
-ianr (diǎnr, bit) | -iar (diǎr)
-iangr (qiāngr, tune) | -iãr (qiãr)
-inr (xīnr, core) | -ier (xīer)
-ingr (língr, bell) | -iẽr (liẽr)
-iongr (xióngr, bear) | -iõr (xiõr)
-ur (tùr, rabbit) | -ur (tùr)
-uar (huār, flower) | -uar (huār)
-uor (huór, work) | -uor (huór)
-uair (kuàir, piece) | -uar (kuàr)
-uir (shuǐr, water) | -uer (shuěr)
-uanr (wánr, to play) | -uar (wár)
-uangr (kuāngr, basket) | -uãr (kuãr)
-unr (lúnr, wheel) | -uer (luér)
-ür (qǔr, song) | -üer (qǔer)
-üer (juér, peg) | -üer (juér)
-üanr (quānr, loop) | -üar (quār)
-ünr (qúnr, skirt) | -üer (quér)


  • ã, õ, ẽ indicate nasalized a, o, e.
  • The -i marked with an asterisk indicates either of the apical vowels that follow zh, ch, sh, r and z, c, s.
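
For readers who want to experiment with these correspondences, here is a minimal sketch of the table as a lookup. It is only an illustration: the names `ERHUA_FINALS`, `APICAL_INITIALS`, and `spoken_erhua_final` are made up for this example, tones are left out, and finals are assumed to be given in their underlying form (so ü is always spelled out, even after j, q, and x).

```python
# Written -r final -> spoken final, transcribed from the table above (tones omitted).
# Assumption: finals are passed in their underlying form, with "ü" spelled out.
ERHUA_FINALS = {
    "ar": "ar", "air": "ar", "anr": "ar", "aor": "aor", "angr": "ãr",
    "or": "or", "our": "our", "ongr": "õr",
    "er": "er", "eir": "er", "enr": "er", "engr": "ẽr",
    "ir": "ier", "iar": "iar", "ier": "ier", "iaor": "iaor", "iur": "iour",
    "ianr": "iar", "iangr": "iãr", "inr": "ier", "ingr": "iẽr", "iongr": "iõr",
    "ur": "ur", "uar": "uar", "uor": "uor", "uair": "uar", "uir": "uer",
    "uanr": "uar", "uangr": "uãr", "unr": "uer",
    "ür": "üer", "üer": "üer", "üanr": "üar", "ünr": "üer",
}

# Initials after which -i is one of the apical vowels (the asterisked row).
APICAL_INITIALS = {"zh", "ch", "sh", "r", "z", "c", "s"}


def spoken_erhua_final(initial, written_final):
    """Return the spoken form of a written -r final, per the table."""
    if written_final == "ir" and initial in APICAL_INITIALS:
        return "er"  # the asterisked -ir row (zìr -> zèr)
    return ERHUA_FINALS[written_final]


print(spoken_erhua_final("g", "air"))  # gàir is pronounced with -ar
print(spoken_erhua_final("z", "ir"))   # apical -ir
print(spoken_erhua_final("m", "ir"))   # ordinary -ir
```

The only row that needs more than a plain lookup is -ir, because the written form is the same for the apical vowels and for ordinary i.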

onomatopoeia in Mandarin and how to write it in Hanyu Pinyin

Today’s selection from Yin Binyong’s Chinese Romanization: Pronunciation and Orthography is onomatopoeic words (340 KB PDF).

Yin Binyong makes a distinction between onomatopoeic words that originate in Literary Sinitic (which thus generally have fixed forms in Chinese characters) and those from Modern Mandarin. The former can be written with tone marks; the latter are written without them.

In practice that distinction may well be more trouble than it’s worth. But I was happy to learn a new expression from his examples: shūshēnglǎnglǎng (the sound of reading aloud), which YBY writes as two words and the ABC Chinese-English Comprehensive Dictionary writes solid.

OK, for some niceties and examples:

Some of these words can be stretched out for auditory effect; to express this lengthening in writing, a dash is added after the syllable:

  • Du — , qìdi xiǎng le. (Toot went the steam whistle.)
  • Dà gōngjī, o — o — tí. (The rooster crowed cock-a-doodle-do.)

Reduplication is of course quite common in Mandarin.

  • huahua (sound of water or rain)
  • huhu (sound of wind)
  • wawa (sound of calling or crying)
  • dongdong (sound of beating drums)
  • wangwang (sound of a dog barking)
  • miaomiao (sound of a cat meowing)
  • jiji (sound of insects buzzing or chirping)
  • zizi (sound of a mouse squeaking)
  • gugu (sound of a pigeon cooing)
  • wengweng (sound of bees or flies buzzing)
  • gaga (sound of a duck quacking)
  • haha (sound of laughter)
  • heihei (sound of bitter or sardonic laughter)
  • xixi (sound of giggling)
  • gege (sound of guffawing)

All of those could also be written tripled instead of doubled, e.g., wangwangwang, miaomiaomiao, hahaha.

Yin provides some orthographic rules based on the patterns of the onomatopoeic words. The sound of a ticking clock, for example, could take various forms, such as

  • dida
  • dida dida
  • didi-dada

Note spacing, hyphens, and lack thereof. See the PDF for all the details.
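
As a rough illustration of the patterns mentioned so far (doubling, tripling, and the three clock-ticking forms), here is a small sketch. The function names are invented for this example, and it covers only the forms shown in this post, not the full set of rules in the PDF.

```python
def reduplicate(syllable, times=2):
    """Write a repeated onomatopoeic syllable solid, e.g. wang -> wangwang."""
    return syllable * times


def two_syllable_forms(first, second):
    """Return the AB, 'AB AB', and AA-BB spellings shown above for dida."""
    pair = first + second
    return [
        pair,                          # dida
        f"{pair} {pair}",              # dida dida (space between repetitions)
        f"{first * 2}-{second * 2}",   # didi-dada (hyphen between doubled halves)
    ]


print(reduplicate("wang", 3))
print(two_syllable_forms("di", "da"))
```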

Still, don’t sweat the stylistic niceties of these too much. It’s onomatopoeia, so have fun!


mood particles in Mandarin — and how to write them in Hanyu Pinyin

Today’s post is on the “mood particles” of Mandarin (426 KB PDF), e.g., a, ba, la, ma, ne.

Mood-indicating particles are used to add various moods, spirits and tones to an utterance. “Mood” includes such diverse qualities as interrogation, request, command, emphasis, and exclamation. Some Chinese grammatologists classify mood particles as an independent part of speech, calling them “mood words.” Two distinctive features of mood particles are their position, typically at the end of a sentence or phrase, and their tone — they are usually read in the neutral tone. (In Hanyu Pinyin, consequently, they are never marked with tones.) The intonation of a sentence, which in Putonghua usually rests largely on the final syllable of an utterance, is in the case of a particle-final sentence transferred to the penultimate syllable. Mood particles are always written separately from other components of a sentence.

Again: They’re always written separately and never with tone marks. So the orthography of these is easy.

OK, well, maybe the orthography is a little trickier than that. First, the examples give “bàle” (罷了/罢了), which sure looks to me like it has a tone mark. And then there’s the case of “a” (啊), which is an extremely common particle “used to express emotion, affirmation, interrogation, and other moods.”

In speech, its pronunciation is partially determined by the final of the syllable preceding it. After -a, -e, -i, -o, or -ü, a 啊 is pronounced “ya” 呀; after -u, “wa” 哇; and after -n, “na” 哪. These different pronunciations are conventionally represented by the different characters seen here; in Hanyu Pinyin, however, a single “a” is used to represent them all.

That certainly complicates matters if you’re trying to get a Chinese-characters-to-Pinyin converter to work properly. Note that when Yin Binyong is writing above about finals, he’s referring to sounds, not spellings. Thus, what’s written “hǎo a” is pronounced “hǎo wa,” not “hǎo ya” (and not “hǎo a” either, of course). If you’re still wondering about this, say –ao very slowly to notice the -u final. (Y.R. Chao and George Kennedy had good reasons for choosing -au in their romanization systems rather than what is -ao in Hanyu Pinyin.) Also, the distinction between a/ya isn’t absolute.
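
To make the sound-versus-spelling point concrete, here is a rough sketch of how a converter might guess the spoken form of 啊 from the Pinyin word before it. It is a sketch under stated assumptions, not a complete rule set: the function names are invented for this example, only the environments named in the quotation are handled, anything else (such as a preceding -ng) is simply returned as “a”, and real speech is less tidy than this.

```python
import unicodedata

# Combining marks used for the four tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def strip_tones(text):
    """Remove tone marks but keep the diaeresis of ü."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    return unicodedata.normalize("NFC", kept)


def spoken_a(previous_word):
    """Guess how the particle 'a' is pronounced after previous_word."""
    s = strip_tones(previous_word).lower()
    if s.endswith("ng"):
        return "a"    # not covered by the quoted passage; left as-is here
    if s.endswith("n"):
        return "na"
    if s[-2:] in ("ju", "qu", "xu", "yu"):
        return "ya"   # spelled u, but the sound is ü
    if s.endswith(("ao", "u")):
        return "wa"   # -ao ends in a u sound, whatever the spelling says
    if s.endswith(("a", "e", "i", "o", "\u00fc")):
        return "ya"
    return "a"


for word in ("hǎo", "máfan", "qù", "lái"):
    print(word, "a ->", spoken_a(word))
```

The written form stays “a” in every case; the function only models pronunciation, which is exactly why the orthography is the easy part.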

But the practice of just using “a” makes life easy if you’re writing something in Pinyin, which I’m grateful for, given that particles are, for people trying to learn how to employ them, zhēn de hǎo máfan a! So beginning and intermediate students of Mandarin should definitely read this selection.

le redux

No, I’m not switching to French. I just wanted to get back to the matter of the particle le (了), which was discussed previously in How to write verbs in Hanyu Pinyin. Le is so frequently used that it deserves its own section.

Because today’s selection on this from Chinese Romanization: Pronunciation and Orthography is just a few pages long, for this post I typed out all of it — other than most Chinese characters, which can be seen in the PDF of the original: Tense-Marking Particles (le/了) (240 KB PDF).


9.2. Tense-Marking Particles

Tense-marking particles have already been discussed in some detail in Chapter 5, Verbs. It was noted there that the tense markers zhe (indicating an action in progress) and guo (indicating a past experience) are always written as a single unit with the verb they follow. The particle le 了 (indicating a completed action) is sometimes, but not always, written as a single unit with its verb. This is because le, unlike zhe and guo, may be separated from its verb by other elements; and also because le itself can act as a mood particle as well as a tense particle. (For details on le as a mood particle, see Section 3 of chapter 9.)

This section is devoted to a discussion of orthography specifically as it relates to the tense particle le. Three rules are laid out to help the student master the written forms of this particle.

  1. When le occurs in the middle of a sentence or phrase, and immediately follows a verb or verb construction written as a single unit, le is written together with that verb or verb construction:
    • kànle yī chǎng diànyǐng (saw a movie)
    • tǎolùnle xǔduō wèntí (discussed many issues)
    • chīwánle píngguǒ hé xiāngjiāo (finished off the apples and bananas)
    • dǎsǐle sān zhī tùzi (shot three rabbits)
  2. When le occurs in the middle of a sentence or phrase, and follows a verb phrase written as two or more units, then le is written separately:
    • zǒu jìnlai le yī wèi jiāngjūn (a general came in)
    • shōushi hǎo le zìjǐ de xíngli (gathered up one’s luggage)
    • dǎsǎo gānjìng le zhè jiān shūfáng (cleaned up the study)
    • yánjiū bìng jiějué le huánjìng wūrǎn de wèntí (researched and solved the problem of environmental pollution)
      • Note that le here applies to both verbs, so that the meaning is equivalent to yánjiūle bìng jiějuéle.
  3. When le occurs at the end of a phrase or sentence (that is, immediately before any form of punctuation), it is written separately from other elements:
    • Xiàtiān lái le. (Summer is here.)
    • Wǒmen fàngle jià le. (Our vacation has begun.)
    • Kělián de xiǎoyáng, bèi láng gěi chīdiào le. (The poor little lamb was eaten up by the wolf.)
    • Tiān kuài liàng le, wǒmen gāi dòngshēn le. (It’s almost dawn; we should get moving.)
    • Hǎo le, hǎo le, nǐmen zài bùyào zhēnglùn le. (All right, stop arguing, all of you.)
    • Nǐ bù shì chīguo fàn le ma? (Haven’t you eaten already?)
      • Note that le is here treated as if it occupied the sentence-final position, despite the presence of another particle (ma) following it.


OK, it’s me again. In closing I want to draw attention to that final note, because it’s important: If le is followed by ma, le is still treated as if it came at the end of the sentence and thus is written separately from its verb.
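
The three rules can be sketched as a small function. This is a simplification for illustration only: the names `write_le` and `FINAL_PARTICLES` are invented here, the caller is assumed to already know how the verb or verb construction before le is divided into written units, and only ma is listed as a following particle because that is the only one the excerpt mentions.

```python
# Particles that may follow le without changing its "sentence-final" status.
# Assumption: only "ma" is listed, since it is the one case the excerpt gives.
FINAL_PARTICLES = {"ma"}


def write_le(preceding_units, following):
    """Join a verb construction, le, and the rest of the phrase.

    preceding_units: the verb or verb construction before le, as written units.
    following: the words that come after le in the same phrase (may be empty).
    """
    # Rule 3: nothing (or only a final particle) follows, so le stands alone.
    at_end = all(word in FINAL_PARTICLES for word in following)
    if not at_end and len(preceding_units) == 1:
        # Rule 1: mid-phrase after a single written unit, so le is attached.
        head = [preceding_units[0] + "le"]
    else:
        # Rule 2 (multi-unit verb phrase) or rule 3 (phrase-final).
        head = list(preceding_units) + ["le"]
    return " ".join(head + list(following))


print(write_le(["kàn"], ["yī", "chǎng", "diànyǐng"]))
print(write_le(["zǒu", "jìnlai"], ["yī", "wèi", "jiāngjūn"]))
print(write_le(["lái"], []))
print(write_le(["lái"], ["ma"]))
```

The hard part in practice is not this joining step but deciding what counts as a single written unit, which depends on the verb rules discussed in the earlier post.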

Mandarin interjections in Pinyin

Ah, interjections! Such flavor they can add! With a few of the many interjections in today’s reading on Mandarin interjections (325 KB PDF) you’ll sound a lot more like a native speaker. But don’t overdo it unless you also want to sound like a drama queen.

Here’s the introduction:

Interjections, sometimes also called exclamations, are a type of function word used in calling out, to express strong emotions, or to indicate agreement. Interjections may form complete utterances on their own, or function as part of a larger utterance. When they form a part of a larger sentence, they most usually appear at the beginning. They are separated from the rest of the sentence by a comma or exclamation point in writing.

Interjections can tolerate a wide degree of variation in tone and intonation in order to better express the emotions they indicate. This makes it difficult to set a fixed Chinese-character form for each different interjection. To better suit this variability, interjections are permitted to go without tone markers in HP.

Interjections, as function words, are written separately from the words around them. Most interjections are monosyllabic, though there are a number of polysyllabic ones, like haiyo, heihei, aiya, and aiyaya. Some interjections are composed wholly of consonants: ng, hm, hng. These too are treated as ordinary syllables.

Thus, when it comes to writing interjections in Hanyu Pinyin, the rules are simple. Pinyin’s greater flexibility than Chinese characters could also open up all sorts of possibilities.

Here are some standard examples from the reading:

  1. a 啊
    • A? Nǐ shuō shénme? (Eh? What did you say?) [INQUIRY]
    • A? Yǒu zhèyàng de shìr? (What? Is such a thing possible?) [SURPRISE]
    • A, wǒ míngbai le. (Oh, I get it.) [AGREEMENT, COMPREHENSION]
  2. ai 唉 噯
    • Ai, wǒ lái le. (Here I am.) [RESPONSE]
    • Ai, bù shì nàme huí shìr. (No, it’s not like that at all.) [DISAGREEMENT]
    • Ai, yīqiè dōu wán le. (Oh dear, it’s all over.) [SADNESS]
  3. aiya 哎呀
    • Aiya, zhè nánguā zhēn dà! (My, what a big pumpkin!) [SURPRISE]
  4. aiyo 哎喲; also aiyao, aiyou
    • Aiyo, wǒ dùzi hǎo téng! (Oh, how my stomach aches!) [PAIN]
    • aiyo may also be used to express alarm or pleased surprise.
  5. e, ei
    • Ei, nǐ kuài lái! (Hey, come quick.) [USED IN CALLING SOMEONE]
    • Ei, tā zěnme pǎo le? (Hey, where did he run off to?) [SURPRISE]
    • Ei, bù shì zhèyàng ba. (That can’t be right.) [DISAGREEMENT, DISAPPROVAL]
    • Ei, wǒ jiù lái le. (I’m coming.) [USED IN REPLYING TO A CALL OR SUMMONS]
  6. haha
    • Haha, wǒ cāiduì le. (Ha, I guessed right.) [HAPPINESS OR SMUGNESS]

I’m tempted to keep typing all of these out. There’s not much point in that, though, since everyone can just turn to the PDF. But I’d like to point out a few outside examples.

Y.R. Chao’s translation into Mandarin of Humpty Dumpty has plenty of interjections: hng, ng, a, o, etc.

And remember Crouching Tiger, Hidden Dragon (Wòhǔcánglóng)? After Zhang Ziyi’s character wakes up in Xiao Hu’s cave in Xinjiang, she gives us a good example of the contemptuous interjection pei.

Xiǎo Hǔ: Gàosu wǒ nǐ de míngzi. [Tell me your name.]

Xiǎo Lóng: Pei!

Xiǎo Hǔ: Pei? Hànrén méiyǒu zhèzhǒng míngzi de. [Pei? I didn’t think the Hans had names like that.]

Also, the very first word in Crouching Tiger, Hidden Dragon is “Yo!” — just the Mandarin one, not the English one. (“Yo! Lǐ yé lái la.”)

de de de — d di de

What’s the most commonly used morpheme in Mandarin? It isn’t the word for is (shì/是). And it’s not the one for not (bù/不). And the number one (yī/一) is only number two — in frequency, that is. (Even some of that is because Hanzi frequency counts include 一 used as a dash.) Nope, it’s that little grammatical particle de (的).

Today’s selection from Chinese Romanization: Pronunciation and Orthography is all about de (800 KB PDF).

So, whaddaya do with de in Pinyin? Simple: It’s almost always written separately from the words around it.

  • māma de ài (mother’s love)
  • zhàopiàn de bèimiàn (back of a photograph)
  • lìshǐ de jīngyàn (the experience of history)
  • dàmén wài de shíshīzi (the stone lions outside the gate)
  • nǐ de yǔsǎn (your umbrella)
  • zhèyàng de rén (people of that sort)
  • tā zìjǐ de cuòwu (his own mistake)
  • jìlái de xìn (the letter that was sent)
  • chī chóngzi de zhíwù (insectivorous plants)
  • Chī de, chuān de, yòng de, yàngyàng dōu yǒu. (They have all kinds of food, clothing, and other items of use.)
  • hǎo de bànfǎ (a good solution)
  • wǒ xǐhuān de xiézi (the shoes I like)

So, yeah, that means if you want to write down a common Mandarin obscenity, it’s tāmā de (他媽的), not tāmāde — though I wouldn’t be surprised if that becomes treated as one word over time.

There are just a few exceptions. This particular de is written together with the component it follows only in the following cases:

  • yǒude 有的 (some): Yǒude rén tànxi, yǒude rén liúlèi. (Some people were sighing, while others wept.)
  • shìde 是的 (yes, certainly): Shìde, wǒ jiù qù. (Certainly, I’ll go right away.)
  • shìde 似的 (like, as): Xiàng hóuzi shìde, tiàolái tiàoqù. (Jumping around, just like a monkey.)

But 的 isn’t Mandarin’s only common de. Let’s not forget de (地, the 20th most commonly used Hanzi) and de (得, 35th).

These three homophonous particles are represented by three different characters in writing; would it perhaps be useful to create three different Hanyu Pinyin forms to differentiate them in Hanyu Pinyin writing? The basic principle of Hanyu Pinyin orthography is to take the language’s sound system as the basis for spelling, and, by this standard, the three particles 的, 地, and 得 should all be written identically as “de.” But it may be desirable in certain situations (such as Chinese-language word processing and other computer applications, and in machine translation) to differentiate the three. In this case, they may be assigned different written forms: 的, the most commonly used, as “d”; 地 as “di” (an alternate pronunciation of this character); and the third, 得, as “de.”


  • 的 = d (pronounced de)
  • 地 = di (pronounced de)
  • 得 = de (pronounced de)*

(* Yes, I know those all have other readings. But we’re not talking here about Chinese characters with multiple pronunciations.)
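
For anyone building a converter, the optional scheme boils down to a tiny lookup. Here is a minimal sketch; the names `DIFFERENTIATED` and `spell_particle` are made up for this example, and it assumes the caller has already established that the character is being used as the particle rather than in one of its other readings.

```python
# Optional differentiated spellings from the passage above.
DIFFERENTIATED = {"的": "d", "地": "di", "得": "de"}


def spell_particle(hanzi, differentiate=False):
    """Return the Pinyin spelling for one of the three de particles.

    By default all three are written "de", following the sound-based
    principle; differentiate=True switches to the d / di / de variants.
    """
    if hanzi not in DIFFERENTIATED:
        raise ValueError(f"not one of the three de particles: {hanzi!r}")
    return DIFFERENTIATED[hanzi] if differentiate else "de"


print(spell_particle("的"))                      # standard spelling
print(spell_particle("的", differentiate=True))  # optional variant
print(spell_particle("地", differentiate=True))
```

Whichever spelling is chosen, the pronunciation is still de.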

But you don’t have to use those orthographic variants if you don’t want to. For an example of a text that does use d and de, see this lovely story: Dàshuǐ Guòhòu (After the Flood).

OK, let’s get back to those other de‘s.

de 地

The principal function of this particle is to link an adverbial modifier to the verb or adjective it modifies. de 地 is always written separately from the elements preceding and following it.


  • suíbiàn de kàn (look over casually)
  • mànmàn de zǒu (walk slowly)
  • yī kǒu yī kǒu de chī (eat bite by bite)

de 得

The principal function of this particle is to link a verb or adjective with its complement. The complement expresses possibility, degree, or result, and may be composed of a single word or a phrase. The verb or adjective preceding de 得 may only be a single word, never a phrase. de 得 is in principle written separately from the elements preceding and following it. The bù 不 that negates a de 得 expressing possibility is also written separately from the elements around it.


  • hǎo de hěn (very good)
  • duō de duō (much more)
  • lěng de yàomìng (freezing cold)
  • hēi de kànbujiàn rén (so dark one can’t see the people around one)
  • gāoxìng de jǐnjǐn wòzhu ta de shǒu shuō: “Xièxie! Xièxie!” (so happy I could only grasp his hand and say, “Thank you! Thank you!”)

There are two main situations in which de 得 should be written as one unit with the component that precedes or follows it. Let us take a look at these:
(1) de 得 sometimes joins together with the verb that precedes it to form a single word. Sometimes a bù 不 is interposed between the verb and de 得 to indicate negation. In either case, all elements are written as one unit.

  • dǒngde (to understand)
  • jìde (to remember)
  • jiànde (to seem)
  • juéde (to feel)
  • láide (to be competent (to do something))
  • láibude (impermissible)
  • liǎode (terrible)
  • liǎobude (terrific)

(2) In certain trisyllabic verb-complement constructions in which de 得 (or the negative marker bù 不) forms the middle syllable, the meaning of the complement has altered and the whole has come to express a single concept. In this case all three syllables should be written as one unit.

  • láidejí (there’s still time; to be in time)
  • láibují (there’s no time; to be too late)
  • chīdekāi (to be popular)
  • chībukāi (be unpopular)
  • duìdeqǐ (not let somebody down)
  • duìbuqǐ (let somebody down; also, “excuse me”)
  • chīdexiāo (be able to bear)
  • chībuxiāo (be unable to bear)