OK, so here’s what I’m gonna do

The encoding problem caused by the hack still isn’t fixed. This means that Chinese characters and Pinyin with tone marks still don’t appear properly on this blog (but they’re fine on pages in the rest of Pinyin.info). But, still, there are some things I’d like to let people know about, including an important announcement coming up soon. So I’m going to start posting some things, even though that means no Hanzi or tonal Pinyin for at least the near future. (Don’t forget: That means Hanzi won’t work in your comments here.) Fortunately, most of the time Pinyin doesn’t really need tone marks.

Without Hanzi and tone marks it’s more difficult to write about Chinese characters and Pinyin, which are, er, only the main topics of the site. But I’ll do what I can. Anyway, why let Victor Mair have all the fun?

So until the encoding issue is resolved y’all can expect a relatively large number of posts catching up on Pinyin-friendly fonts, a few posts covering news and announcements, and probably at least a little of the bile that you’ve come to expect from this site — unless, of course, during the years I’ve let this blog go fallow public signage has all been fixed, the authorities are finally using Pinyin correctly, and people who ought to know better have stopped spouting complete nonsense about Chinese characters. Heh. We’ll see.

Gift ideas for Mandarin learners

Here are some books I recommend. You may still have time to buy some of these for others (or persuade others to buy for you) before Christmas.

In a departure from my usual practice, all of the images below are linked to Amazon — in part to make things easier for most readers of this site but also because I’m a bit curious to see if the potential kickbacks from that site would ever add up to enough to buy myself some books I’ve been wanting. Mainly, though, I’d like to see these books make it into the hands of more readers. This isn’t meant to be a complete list; but it’s a good start.

One of these days I’ll post about the works below I haven’t written about previously.


ABC English-Chinese, Chinese-English Dictionary, edited by John DeFrancis and Zhang Yanyin. This is the Mandarin-English/English-Mandarin dictionary that every student needs. Suitable for all ages and levels. It’s small enough to carry with you. And at US$20 or even less it’s a bargain too. For an e-edition, get Wenlin (see below). Description. Excerpt from Mandarin –> English half. Excerpt from English –> Mandarin half.
ABC Chinese-English Comprehensive Dictionary, edited by John DeFrancis. The best large Mandarin-English dictionary. Entries are arranged alphabetically, by words, rather than head Chinese characters. Note: This is a Mandarin –> English dictionary and does not offer an English –> Mandarin section. For an e-edition (which does allow for the lookup of English words), get Wenlin (see below). Sample of what entries in this dictionary look like.
Chinese Biographies: Lang Lang, by Grace Wu. Pinyin-annotated biography of pianist Lang Lang, with English notes. There’s also a helpful Web site with additional resources. Ideal for beginning and intermediate students.
Chinese Biographies: Yao Ming, by Grace Wu. Pinyin-annotated biography of basketball star Yao Ming, with English notes. There’s also a helpful Web site with additional resources. Ideal for beginning and intermediate students.
The Besieged City (Abridged Chinese Classic Series), by Qian Zhongshu. Pinyin-annotated abridged version of a terrific Chinese novel. With notes in English, proper word-parsed Hanyu Pinyin for the entire text, simplified Chinese characters, and a CD with MP3 files of the entire book being read aloud. Excerpt.
Family (abridged and annotated edition, with full Hanyu Pinyin), by Ba Jin. With notes in English, proper word-parsed Hanyu Pinyin for the entire text, simplified Chinese characters, and a CD with MP3 files of the entire book being read aloud. Excerpt.
Spring (abridged and annotated edition, with full Hanyu Pinyin), by Ba Jin. With notes in English, proper word-parsed Hanyu Pinyin for the entire text, simplified Chinese characters, and a CD with MP3 files of the entire book being read aloud. Excerpt.
Autumn (abridged and annotated edition, with full Hanyu Pinyin), by Ba Jin. With notes in English, proper word-parsed Hanyu Pinyin for the entire text, simplified Chinese characters, and a CD with MP3 files of the entire book being read aloud. Excerpts.
Basic Spoken Chinese: An Introduction to Speaking and Listening for Beginners, by Cornelius C. Kubler. Although this book is not in orthographically standard Pinyin, it’s nonetheless strong. A practical, real-world textbook that focuses on learning the language, not getting beginners bogged down memorizing character after character.
Fundamental Spoken Chinese, by Robert Sanders. Another excellent textbook. Audio files are available online. Excerpt.
The Chinese Language: Fact and Fantasy, by John DeFrancis. Essential reading. This book will inoculate you against the absolute nonsense that many people — including all too many teachers — believe about Chinese characters. Excerpt.
Asia’s Orthographic Dilemma, by William C. Hannas. A wide-ranging, detailed book that discusses some of the drawbacks of the continued use of Chinese characters. Excerpt.
Mandarin Chinese: A Functional Reference Grammar, by Li and Thompson. Good for the linguistically inclined. Just about the only Chinese characters in this book are on the cover, which, yes, I consider to be a good thing.
Wenlin software for learning Chinese, version 4. I use this on a daily basis. This incorporates both dictionaries listed above.

Beyond This List

Here are some things not listed above, in most cases because Amazon doesn’t stock them.

  • Pinyin Riji Duanwen, by Zhang Liqing. A book of largely autobiographical short stories, written entirely in Hanyu Pinyin (except for one brief letter in English). For intermediate and advanced learners — and for native speakers of Mandarin as well. At just US$5, plus shipping, this is the least expensive work on this list. The complete text is also available for free online, though a URL just doesn’t have that same Christmas feeling as a physical book, does it?
  • Any or all of the three volumes in Y.R. Chao’s Sayable Chinese series. For intermediate and advanced learners — and for native speakers of Mandarin as well. Note: These books are in Chinese characters and Gwoyeu Romatzyh, not Hanyu Pinyin, so for most people the learning curve is steeper than for reading something in Hanyu Pinyin. With some notes in English. Excerpt (Gwoyeu Romatzyh column only).
  • Other works on my recommended readings list, which may be available at Amazon but which may or may not fit well on a list for Mandarin learners.
  • KEY5 2011 Multimedia — a different sort of software than Wenlin but one that offers excellent Pinyin support.

A clang on the Taipei MRT announcements

[photo of a sign at the Zhongxiao Xinsheng MRT station]

People generally don’t listen carefully to the announcements on the Taipei MRT, a subway/elevated train mass-transit system. With four languages to get through (Mandarin, Taiwanese, Hakka, and English), that’s a lot of talking. And the cars can be so full that it’s hard to hear such things clearly over all the background noise anyway. Still, you’d think that at least the people who make the recordings would be paying attention.

Below is a link to a recording of a relatively new announcement, advising people on the Danshui line that Minquan West Road is the place to change trains for the Luzhou line, which opened late last year: Mínquán West Road Station. Attention: passengers transferring to Sānchóng, Lúzhōu, or Zhōngxiào-Xīnshēng please change trains at this station.

Or at least what I typed above is what the announcement is supposed to give. As you may have noticed, however, “Zhōngxiào-Xīnshēng” is rendered “Zhongxiao-Xinshang,” with a very un-Mandarin shang that rhymes with the English words clang, pang, hang, and sang. And that’s without getting into the matter of tones.

I pointed out this error to Taipei City Hall and the authorities in charge of the MRT. As usual, I had to spend some time repeatedly explaining: “No, Xinshang is not the English pronunciation of Xīnshēng. Xīnshēng isn’t English. It’s Mandarin. What the announcement gives is simply an error….” I was pleasantly surprised, however, that the main person I spoke to at TRTS did not require the usual explanations. He understood the problem and said it would be fixed.

This, however, was a couple of months ago. The recordings have not yet been changed. I haven’t been holding my breath over this, though, because the official with the MRT system warned that it would take time to run a public bid notice for a new recording, make the new recording, and then install the recording in the front and back cars of some 100 trains. Still, the system has been known to move fairly quickly; unfortunately, this usually happens only when the change is for the worse, such as renaming Xindian City Hall as Xindian City Office (now Xindian District Office), or renaming the whole Muzha line because some superstitious nitwits thought that a joking, non-official nickname was bringing the system bad luck.

For longtime residents of Taipei, the shang mispronunciation will likely bring back memories of the bad old days when the MRT system first opened. Back then the signage was predominantly in bastardized Wade-Giles, with the pronunciations in the English announcements matching what a clueless Westerner might say when shown names like Kuting and Nanking (properly: Gǔtíng and Nánjīng, respectively). Perhaps the most offensive pronunciation on the system then was given to Dànshuǐ, which at the time was [mis]spelled Tamshui on the MRT system. This was pronounced as three syllables: Tam (rhymes with the English word “dam”) + shu (“shoe”) + i (as in “machine”).

By the way, the Xinbei City Government has been changing signs around Danshui from Danshui to the old Taiwanese spelling of Tamsui (note: not Tamshui). But more about that in a different post.

Ni neibian ji dian?*

Here are some photos of a large, elaborate, and no-doubt expensive sundial outside the Nangang high-speed rail station (next door to the Nangang train station and Nangang MRT station).

These were taken at 11 a.m. (The one of the sundial itself was taken on a different day.) But as you can see below, the sundial certainly isn’t indicating the time is 11:00. Rather, it’s pointing toward 9:20 or so.

The disc labeled IX is actually XI (11). I took the photo from a reverse vantage point, so the number is upside down in the photo.

Perhaps whoever erected the main part of the sundial doesn’t know Roman numerals. (Sorry: that’s about as close as this post gets to talking about scripts.) But that wouldn’t account for the dial indicating 9:20 instead of 9:00.

I contacted the Taipei City Government about this. They said to contact the Taiwan High Speed Rail Corporation, which I did. They, in turn, responded that I’d reached the wrong office and should write a different office; but they didn’t forward the message or provide me with the correct e-mail address. Once I’d tracked down another office I e-mailed the folk there. That was more than a week ago. There has been no response.

I spoke with someone at the site who appeared to be in a position of authority. He told me that the sundial hadn’t been adjusted yet and that they would get to it next year. He was too busy to answer any more questions though, such as “Next year?” Also, I suspect that it won’t be easy to rotate that huge thingamajig, so why didn’t they get it right the first time?

Still, at least someone in authority seems to understand there’s a problem.

*For anyone who doesn’t recognize the title of this post, it’s an allusion to the 2001 movie Nǐ nèibiān jǐ diǎn (《你那边几点》 / What Time Is It There?).

misc. links

I’m feeling guilty that I haven’t posted in over a month. But since I still don’t have anything ready I’ll make do for now with mention of just a few relatively recent items elsewhere:

My parents speak Taiwanese better than I do. agree: 77%; disagree: 9%; no opinion: 14%

and for lagniappe:

software for Shanghainese

Professor Qián Nǎiróng (Qian Nairong / 錢乃榮) of Shanghai University has just issued free software to help with the writing of Shanghainese (上海话). People may now download the 1.3 MB zip file of the program.

Some examples:

shanghe 上海
shanghehhehho 上海闲/言话(上海话)
whangpugang 黄浦江
shyti 事体(事情)
makshy 物事(东西)
bhakxiang 白相(玩)
dangbhang 打朋(开玩笑)
ghakbhangyhou 轧朋友(交朋友)
cakyhangxiang 出洋相(闹笑话,出丑)
linfhakqin 拎勿清(不能领会)
dhaojiangwhu 淘浆糊(混)
aoshaoxhin 拗造型(有意塑造姿态形象)
ghe 隑(靠)
kang 囥(藏)
yin 瀴(凉、冷)
dia 嗲
whakji 滑稽

The program offers two flavors of romanization. Here are some examples of the differences between the two styles:

New Folk        Old Timers
makshy          mekshy          物事(东西)
bhakxiang       bhekxian        白相(玩)
dangbhang       danbhan         打朋(开玩笑)
ghakbhangyhou   ghakbhanyhou    轧朋友(交朋友)
cakyhangxiang   cekyhanxian     出洋相(闹笑话,出丑)
linfhakqin      linfhekqin      拎勿清(不能领会)

Here’s a brief story on this:

Xiànzài, wǒmen zài wǎngluò zhōng liáotiān de shíhou yuèláiyuè duō de péngyou dōu kāishǐ xǐhuan yòng Shànghǎihuà. Dànshì yǒushíhou shìbushì juéde xiǎng biǎodá dehuà bùzhīdào zěnme dǎ, nòng de yǒudiǎn bùlúnbùlèi ne? Xiànzài, yī ge kěyǐ qīngsōng dǎchū Shànghǎihuà de chéngxù chūlai le.

Jīngguò liǎng nián nǔlì, Shànghǎi dàxué Zhōngwénxì Qián Nǎiróng jiàoshòu jí tā de yánjiūshēng hé dādàng zhōngyú yú běnyuè wánchéng le Shànghǎihuà shūrùfǎ de zhìzuò. Zhíde guānzhù de shì, zhè tào shūrùfǎ hái bāokuò xīn-lǎo liǎng ge bǎnběn, 45 suì yǐshàng de lǎo Shànghǎirén hé niánqīng yī dài de Shànghǎirén dōu kěyǐ zhǎodào zìjǐ de “dǎfǎ.”

Háishi tóngyàng 26 ge zìmǔ de jiànpán, 8 yuè 1 rì qǐ xiàzài le Shànghǎihuà shūrùfǎ zhīhòu, nín jiù kěyǐ tōngguò shūrù “linfhakqin” dǎchū “līn wù qīng,” shūrù “dhaojiangwhu” dǎchū “táo jiànghu” děng yuánzhī yuán wèi de Shànghǎihuà le. Zuótiān, jìzhě tíqián xiàzài dào gāi ruǎnjiàn. Ànzhào shǐyòng shuōmíng, yòng quánpīn de fāngshì chángshì shūrù “laoselaosy” zhèxiē zìmǔ, píngmù shàng, lìjí chūxiàn le “lǎo sānlǎo sì” (Shànghǎihuà, yìsi shì “màilǎo, chōng lǎochéng de yàngzi”).

Jùxī, yóuyú Shànghǎihuà yǔ Pǔtōnghuà de dúfǎ yǒusuǒbùtóng, suǒyǐ zài pīnyīn pīnxiě fāngshì shàng háishi xūyào shǐyòng shuōmíng de bāngzhù. Bǐrú jìzhě fāxiàn, fánshì yǔ Pǔtōnghuà shēngmǔ, yùnmǔ xiāngtóng de zì, zài Shànghǎihuà shūrùfǎ zhōng zuìzhōng yòng de háishi Pǔtōnghuà pīnyīn, bùtóng de zé cǎiyòng Shànghǎihuà shūrùfǎ de pīnxiě fāngshì. Rú “chénguāng” de “chén,” “huātou” de “tóu” dōu fāchéng zhuóyīn, Shànghǎihuà pīnyīn shūrùfǎ zhōng yào zài shēngmǔ zhōng jiā yī ge zìmǔ h, pīnchéng “shen,” “dhou”; fánshì rùshēng zì, zé zài pīnyīn hòu jiā zìmǔ k, rú “báixiāng” de “bái” jiù pīnchéng bhek.

Bùguò, dàjiā bùyào juéde tài nán. Jìzhě fāxiàn, Shànghǎihuà shūrùfǎ yǔ Pǔtōnghuà de shūrùfǎ zuìdà xiāngtóng zhī chù zàiyú, zhǐyào liánxù shūrù shēngmǔ hé yùnmǔ jiù kěyǐ, bùxū shūrù shēngdiào. Cǐwài, Shànghǎihuà pīnyīn shūrù xìtǒng háiyǒu lèisì “zhìnéng” yōudiǎn, kěyòng suōlüè fāngshì bǎ cíyǔ pīnxiě chūlai.

Zhǔchí Shànghǎihuà shūrùfǎ kāifā de Shànghǎi dàxué Zhōngwénxì Qián Nǎiróng jiàoshòu gàosu jìzhě, zhè tào shūrùfǎ bùjǐn néng dǎchū Shànghǎihuà dà cídiǎn zhōng 15,000 duō ge cítiáo, érqiě hái néng yòng Shànghǎihuà pīnyīn dǎchū Shànghǎihuà zhōng shǐyòng zhe de, yǔ Pǔtōnghuà cíyì xiāngtóng dàn yǔyīn bùtóng de chángyòng cíyǔ. Rú “Huángpǔ Jiāng” shūrù “whangpugang,” “lǐxiǎng” zéshì “lixiang” děng, gòngjì 10,000 duō ge cítiáo.


separating Pinyin syllables: PHP code

A few weeks ago I had someone write to ask if I had a script that can divide Pinyin texts into their individual syllables. It so happens that I do have something that does just that. Since I sent out that bit of code, I might as well make it available to everyone (GNU GPL, and links back to Pinyin.Info are always appreciated).

It has lots of regular expressions, to make the code nice and compact. I’ve added comments for clarity.

// In the lines below, \s means whitespace
// This program assumes that ü is written as v
// The i after each pattern's closing delimiter means case insensitive
// \W is a single, non-word character (e.g., punctuation)
// $document holds the Pinyin text to be divided; the result ends up in $usertext

$search = array ("'([aeiouv])([^aeiounr\W\s])'i", // This line does most of the work
"'(\w)([csz]h)'i", // double-consonant initials
"'(n)([^aeiouvg\W\s])'i", // cleans up most n compounds
"'([aeiuov])([^aeiou\W\s])([aeiuov])'i", // assumes correct Pinyin (i.e., no missing apostrophes)
"'([aeiouv])(n)(g)([aeiouv])'i", // assumes correct Pinyin, i.e. changan = chan + gan
"'([gr])([^aeiou\W\s])'i", // fixes -ng and -r finals not followed by vowels
"'([^e\W\s])(r)'i", // r an initial, except in er
);

$replace = array ("\\1 \\2",
"\\1 \\2",
"\\1 \\2",
"\\1 \\2\\3",
"\\1\\2 \\3\\4",
"\\1 \\2",
"\\1 \\2",
);

$usertext = preg_replace($search, $replace, $document);
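
For anyone who wants to try this out, here is a rough usage sketch that wraps the same patterns in a function and runs a few words through it. The function name split_pinyin_syllables is just a placeholder of mine, not something from the original script, and the input is assumed to be correctly spelled Pinyin without tone marks, with ü typed as v.

```php
<?php
// Hypothetical wrapper around the search/replace patterns shown above.
// Assumes toneless, correctly spelled Pinyin with ü written as v.
function split_pinyin_syllables(string $text): string
{
    $search = array(
        "'([aeiouv])([^aeiounr\W\s])'i",          // does most of the work
        "'(\w)([csz]h)'i",                        // double-consonant initials
        "'(n)([^aeiouvg\W\s])'i",                 // most n compounds
        "'([aeiuov])([^aeiou\W\s])([aeiuov])'i",  // vowel + consonant + vowel
        "'([aeiouv])(n)(g)([aeiouv])'i",          // e.g. changan = chan + gan
        "'([gr])([^aeiou\W\s])'i",                // -ng and -r finals
        "'([^e\W\s])(r)'i",                       // r as an initial, except in er
    );
    $replace = array(
        "\\1 \\2",
        "\\1 \\2",
        "\\1 \\2",
        "\\1 \\2\\3",
        "\\1\\2 \\3\\4",
        "\\1 \\2",
        "\\1 \\2",
    );
    // With array arguments, preg_replace applies the patterns in order,
    // each one working on the result of the previous one.
    return preg_replace($search, $replace, $text);
}

foreach (array('pinyin', 'Zhongguo', 'Shanghai', 'daren') as $word) {
    echo $word, ' -> ', split_pinyin_syllables($word), "\n";
}
// pinyin -> pin yin
// Zhongguo -> Zhong guo
// Shanghai -> Shang hai
// daren -> da ren
```

Note that the order of the patterns matters, since each pass sees the spaces inserted by the earlier ones.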


Since I’m always going on about the need for word parsing and not separating Pinyin into single syllables, some of you are probably wondering just why I of all people would have ever written such code. The answer is that it’s part of my Pinyin spell-checker, which is only a very basic utility in that it functions by checking for theoretically correct groups of syllables rather than real words (i.e., anything composed of correctly spelled groups of syllables, minus tone marks, will pass even if that word isn’t found in a dictionary).
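
To give a rough idea of what that kind of syllable-level check might look like, here is a minimal sketch. It is not the spell-checker's actual code: the function name find_invalid_syllables and the tiny syllable list are made up for illustration, and a real checker would need the full table of valid toneless Pinyin syllables. It expects text that has already been separated into syllables.

```php
<?php
// Sketch only: reports syllables that are not in the supplied list.
// The function name and the sample list are hypothetical.
function find_invalid_syllables(string $separated, array $validSyllables): array
{
    $valid = array_flip($validSyllables); // syllable => index, for fast lookup
    $bad = array();
    // Split on anything that is not a letter (spaces, apostrophes, punctuation).
    $syllables = preg_split('/[^a-z]+/', strtolower($separated), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($syllables as $syllable) {
        if (!isset($valid[$syllable])) {
            $bad[] = $syllable;
        }
    }
    return $bad;
}

// Illustrative sample, nowhere near the complete syllable inventory.
$sample = array('pin', 'yin', 'zhong', 'guo', 'shang', 'hai');

print_r(find_invalid_syllables('pin yin', $sample)); // nothing flagged
print_r(find_invalid_syllables('pin yim', $sample)); // flags "yim"
print_r(find_invalid_syllables('guo pin', $sample)); // nothing flagged, though "guopin" is no word
```

The last call shows the limitation mentioned above: anything built from valid syllables passes, whether or not it is a real word.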

Suggestions for improvements are always welcome.