As most people reading this blog know, Mandarin has about 1,300 syllables (interjections and loan words complicate the count a little). If tones, a basic part of the language, are disregarded, the number drops to 400-something syllables.

Given 410 or so basic syllables and 4 tones (one of these days I need to write something more on the wrongful neglect of the so-called neutral tone), some people might expect there to be more like 1,640 syllables (410 × 4) instead of about 1,300. The reason for the lower number is that not all syllables exist in all four tones. For example, quite clearly the official language of Zhōngguó does not lack zhōng … or zhǒng or zhòng. But zhóng is another matter.

So not all possible tonal variations of those 400-something syllables appear in modern standard Mandarin. But what about letters?

If you look at the official alphabet for Hanyu Pinyin, it’s exactly the same as that for English (other than in pronunciation, of course), which is a bit odd, especially considering that Pinyin doesn’t use the letter v (or at least isn’t supposed to for Mandarin words).

So in this case, I’m excluding v but otherwise being expansionist about the glyphs I’m calling letters. To be specific: I’m referring to a-z, minus v, but including ā, á, ǎ, à, ē, é, ě, è, ī, í, ǐ, ì, ō, ó, ǒ, ò, ū, ú, ǔ, ù, ü, ǖ, ǘ, ǚ, and ǜ. (Even though Ī, Í, Ǐ, Ì, Ū, Ú, Ǔ, Ù, Ü, Ǖ, Ǘ, Ǚ, and Ǜ never come at the beginning of a word, let’s not automatically eliminate them, because there is an occasional need for ALL CAPS.)
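For anyone who wants to reproduce that inventory, here is a minimal Python sketch. It assumes the toned letters can be built by attaching a combining tone mark to each vowel and normalizing to NFC; the name `pinyin_letters` is made up for illustration.

```python
import string
import unicodedata

# Combining marks for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = ["\u0304", "\u0301", "\u030c", "\u0300"]
U_UMLAUT = "\u00fc"  # the letter ü

def pinyin_letters():
    """Return the lowercase letters counted in this post (hypothetical helper)."""
    plain = [c for c in string.ascii_lowercase if c != "v"]  # a-z minus v
    vowels = ["a", "e", "i", "o", "u", U_UMLAUT]
    toned = [
        unicodedata.normalize("NFC", vowel + mark)  # e.g. ü + macron -> ǖ
        for vowel in vowels
        for mark in TONE_MARKS
    ]
    return plain + [U_UMLAUT] + toned

letters = pinyin_letters()
print(len(letters))           # 25 plain letters + ü + 24 toned vowels
print(" ".join(letters[-4:])) # the four toned forms of ü
```

Calling `.upper()` on each entry should give the ALL CAPS forms mentioned above.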

Are there any of those possible glyphs that don’t appear at all — at least as given in the large ABC Comprehensive Chinese-English Dictionary?

The answer, perhaps surprisingly, is yes.

Which letter is it?

a. ǖ b. ǘ c. ǚ d. ǜ

Have you made your choice?

It doesn’t take much thought to eliminate C as the answer. “Nǚ” (woman) is one of those first-couple-of-Mandarin-lessons vocabulary terms. And the word for green (lǜsè) is hardly obscure either. It might be harder to think of a word with the letter ǘ, but there are some. Donkey (lǘ) is probably the most common. So the answer is A: ǖ.

It’s important to note that the lack of ǖ is in appearance only. The sound ǖ occurs in plenty of Mandarin words; it’s just that Pinyin’s simplified orthography calls for writing “u” instead of “ü” after j, q, x, or y, so the sound is spelled ū there (as in xū).
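That spelling rule is mechanical enough to sketch in a few lines of Python. This is only an illustration, assuming a syllable is first written with ü wherever it is pronounced; the function name `standard_spelling` is hypothetical.

```python
# Map ü ǖ ǘ ǚ ǜ to u ū ú ǔ ù (written as escapes to avoid encoding trouble).
UMLAUT_TO_PLAIN = str.maketrans(
    "\u00fc\u01d6\u01d8\u01da\u01dc",
    "u\u016b\u00fa\u01d4\u00f9",
)

def standard_spelling(syllable):
    """Drop the umlaut after j, q, x, or y; leave other syllables alone."""
    if syllable[:1].lower() in {"j", "q", "x", "y"}:
        return syllable.translate(UMLAUT_TO_PLAIN)
    return syllable

print(standard_spelling("x\u01d6"))  # xǖ is written xū
print(standard_spelling("l\u01d8"))  # lǘ keeps its umlaut
```

After l and n the umlaut has to stay, because both lu and lü (and nu and nü) exist.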

But even though I didn’t find an example of ǖ, I’d encourage font designers not to scratch it from their list of must-have glyphs for Pinyin faces, especially since teachers will no doubt want to continue giving tone-pattern drills based on four tones for all vowels, regardless. Also, someone with a searchable edition of the Hanyu Da Cidian or maybe the new Oxford online edition is probably about to use the comments to point me to some obscure entry there….
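The kind of check described in this post can be sketched in Python as follows. The four-word list is a stand-in built from the examples above, not the ABC dictionary's data, and the helper name `unused_glyphs` is invented for illustration.

```python
import unicodedata

def unused_glyphs(candidates, words):
    """Return the candidate letters that occur in none of the words."""
    seen = set()
    for word in words:
        # NFC makes sure a vowel plus combining marks counts as one letter.
        seen.update(unicodedata.normalize("NFC", word).lower())
    normalized = [unicodedata.normalize("NFC", c) for c in candidates]
    return [c for c in normalized if c not in seen]

# Toy word list taken from the examples in this post.
sample_words = ["nǚ", "lǜsè", "lǘ", "xū"]
print(unused_glyphs(["ǖ", "ǘ", "ǚ", "ǜ"], sample_words))  # ['ǖ']
```

Run against a real headword list, the same function would flag any glyph a font designer might be tempted to drop.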

  1. I have never understood why writing, e.g., xu instead of xü should be a “simplification”. If anything, it makes things more complicated: why not write ü everywhere it is actually pronounced?
    (And please don’t tell me that ü is a special letter that is hard to type or write. Computers should adapt to the alphabet, not the other way round. All the more so since there were no computers when Pinyin was created.)

  2. Xu instead of xü is a simplification because there is no “xu”, just “xü”, so there shouldn’t be any confusion. The same goes for j, q & y. Only after l & n can the vowel be either u or ü.

  3. But ü and u are completely different sounds. If you have to remember rules for the writing, it makes it more complicated.
    If ü were _always_ written ü, one could skip the rule that after x, j, q, y you have to write u, even though you say ü.
    The same goes the other way round: you wouldn’t need to know that u after x is not u, but ü. If a foreigner without any knowledge of Chinese reads “xu”, he will pronounce it xu (OK, he will most likely not know how to pronounce x, but that is a different issue). If it were written xü, he would just pronounce it xü, and no problem there.
    Why should writing u be simpler than always writing ü?

  4. Why should writing u be simpler than always writing ü?

    Keep in mind that Pinyin was designed during the same period as the “simplification” of Chinese characters. A major focus of Hanzi simplification was a reduction in the number of strokes it takes to write something. (This proved to be a relatively minor improvement for all of the change it involved. But oh well. I don’t want to get into all that here.) So, from that standpoint, u is simpler than ü … by two strokes.

    Pinyin has a number of such orthographic conventions aimed at brevity. Some are well known (e.g., uei –> ui, and iou –> iu), others much less so (e.g., uen –> un). That should probably be the subject of a separate post.

  5. @Gerrit – No one is denying that it’s inconsistent to not use ü for xu, ju, qu and yu. It also isn’t that difficult to remember that when it comes after x, j, q & y, u is always ü. I currently tutor 3 Mandarin students in my city and it took them no time to memorize that. I wish everything were completely logical and consistent, but if that were true, the English spelling system would be very different than it is today :)

  7. Ah, keep this up and you’ll become the second .

    Fun items:
    品 is -n everywhere except Korean, where it is -m just like ?.
    ? at least is still -p in Cantonese and Korean, not -t as in Taiwanese.
    At least Taiwanese preserves the -p in ?? Yes, all these are explainable by rules.

    What bugs me is how some folks squeeze a han out of ?. Never did figure that one out.

  8. 品 is -n everywhere except Korean, where it is -m just like ?.

    Also excepting Vietnamese, where the pronunciation is phẩm. Of course the character isn’t used in modern Vietnamese, but the word/morpheme itself appears in compounds like sản phẩm (產品), dược phẩm (藥品), nhu yếu phẩm (需要品), etc.

  9. As I understand the Mandarin pronunciation system, Pinyin ‘ü’ is actually shorthand for ‘iu’ and ‘x’, ‘j’, and ‘q’ have an implied ‘i’ after them. Thus ‘xü’, for example, would be redundant, equivalent to ‘xiiu’.

  10. As a Cantonese speaker struggling with PTH (learning it by myself), I found the inconsistency regarding ü or u strange. Once I realized the so-called “reason,” I recognized the “beauty” of the Pinyin system. But when I was introduced to the BPMF and Romatzyh systems years later, I realized that perhaps it’s a matter of choice between straightforwardness and simplicity.

    BTW: ? is still -t in Cantonese.

