Attitudes in Hong Kong toward Mandarin: survey

Mandarin is less well loved in Hong Kong than most other “icons” of China, according to the results of a survey there.

Although the percentage of those who described themselves as “averse” (kàngjù) to Mandarin is quite small (in the single digits), it has quadrupled since 2006 (1.8% to 7.3%). (I’m using the English and Mandarin terms given in the source material.)

Meanwhile, the percentage of those who are “affectionate” (qīnqiè) toward Mandarin has dropped, though not to an all-time low. And the percentage of those who are “proud” (zìháo) of Mandarin is also down, though it remains much higher than it was in 1994 when the survey began.

Affection toward, pride in, and averseness to Mandarin in Hong Kong, 1994-2010
graph showing affection toward Mandarin in the 27-35% range, pride in Mandarin rising from 19% to 34% but dropping since 2006, and aversion to Mandarin at around 3% until the climb to about 8% in 2010

Interestingly, averseness to Mandarin has been growing, while averseness to most other mainland icons has been dropping.

In the graphs below I have omitted some surveyed icons — Hong Kong’s regional flag/emblem, the night view of Victoria Harbor, the Legislative Council building, the Hong Kong and Shanghai Bank Building, and the Bank of China Building — to keep the graphs from getting too busy looking and because those are within Hong Kong itself.

The lines for Mandarin are in dark red. Click to enlarge the images to a useful size.

Percentage of respondents feeling “averse to” Mandarin (“Putonghua”) and other Chinese icons

Percentage of respondents feeling “affectionate towards” Mandarin (“Putonghua”) and other Chinese icons

Percentage of respondents feeling “proud of” Mandarin (“Putonghua”) and other Chinese icons

But even though Mandarin hasn’t gained much affection recently from the people of Hong Kong, it’s still far more liked than the least popular of the PRC’s institutions: the police (gōng’ān).

sources and further reading:

Google improves its maps of Taiwan

Two years ago when Google switched to Hanyu Pinyin in its maps of Taiwan, it did a poor job … despite the welcome use of tone marks.

Here are some of the problems I noted at the time:

  • The Hanyu Pinyin is given as Bro Ken Syl La Bles. (Terrible! Also, this is a new style for Google Maps. Street names in Tongyong were styled properly: e.g., Minsheng, not Min Sheng.)
  • The names of MRT stations remain incorrectly presented. For example, what is referred to in all MRT stations and on all MRT maps as “NTU Hospital” is instead referred to in broken Pinyin as “Tái Dà Yī Yuàn” (in proper Pinyin this would be Tái-Dà Yīyuàn); and “Xindian City Hall” (or “Office” — bleah) is marked as Xīn Diàn Shì Gōng Suǒ (in proper Pinyin: “Xīndiàn Shìgōngsuǒ” or perhaps “Xīndiàn Shì Gōngsuǒ”). Most but not all MRT stations were already presented in this incorrect way (in Hanyu Pinyin rather than Tongyong) in Google Maps.
  • Errors in romanization point to sloppy conversions. For example, an MRT station in Banqiao is labeled Xīn Bù rather than Xīnpǔ. (埔 is one of those many Chinese characters with multiple Mandarin pronunciations.)
  • Tongyong Pinyin is still used in the names of most cities and townships (e.g., Banciao, not Banqiao).

I’m pleased to report that Google Maps has recently made substantial improvements.

First, and of fundamental importance, word parsing has finally been implemented for the most part. No more Bro Ken Syl La Bles. Hallelujah!

Here’s what this section of a map of Tainan looked like two years ago:

And here’s how it is now:

Oddly, “Jiànxīng Jr High School” has been changed to “Tainan Municipal Chien-Shing Jr High School Library” — which is wordy, misleading (library?), and in bastardized Wade-Giles (misspelled bastardized Wade-Giles, at that). And “Girl High School” still hasn’t been corrected to “Girls’ High School”. (We’ll also see that problem in the maps for Taipei.)

But for the most part things are much better, including — at last! — a correct apostrophe: Yǒu’ài St.

As these examples from Taipei show, the apostrophe isn’t just a one-off. Someone finally got this right.

Rén’ài, not Renai.
screenshot from Google Maps, showing how the correct Rén'ài (rather than the incorrect Renai) is used

Cháng’ān, not Changan.
screenshot from Google Maps, showing how the correct Cháng'ān is used

Well, for the most part right. Here we have the correct Dà’ān (and correct Ruì’ān) but also the incorrect Daan and Ta-An. But at least the street names are correct.
click for larger screenshot from Google Maps, showing how the correct Dà'ān (and correct Ruì'ān) is used but also the incorrect Daan and Ta-An

Second, MRT station names have been fixed … mostly. Most all MRT station names are now in the mixture of romanization and English that Taipei uses, with Google Maps also unfortunately following even the incorrect ones. A lot of this was fixed long ago. The stops along the relatively new Luzhou line, however, are all written wrong, as one long string of Pinyin.

To match the style used for other stations, this should be MRT Songjiang Nanjing, not Jieyunsongjiangnanjing.
screenshot from Google Maps, showing how the Songjiang-Nanjing MRT station is labeled 'Jieyunsongjiangnanjing Station' (with tone marks)

Third, misreadings of poyinzi (pòyīnzì/破音字) have largely been corrected.

Chéngdū, not Chéng Dōu.
screenshot from Google Maps, showing how the correct 'Chéngdū Rd' is used

Like I said: have largely been corrected. Here we have the correct Chéngdū and Chóngqìng (rather than the previous maps’ Chéng Dōu and Zhòng Qìng) but also the incorrect Houbu instead of the correct Houpu.
screenshot from Google Maps, showing how the correct Chóngqìng Rd and Chéngdū St are used but also how the incorrect Houbu (instead of Houpu) is shown

But at least the major ones are correct.

Unfortunately, the fourth point I raised two years ago (Tongyong Pinyin instead of Hanyu Pinyin at the district and city levels) has still not been addressed. So Google is still providing Tongyong Pinyin rather than the official Hanyu Pinyin at some levels. Most of the names in this map, for example, are distinctly in Tongyong Pinyin (e.g., Lujhou, Sinjhuang, and Banciao, rather than Luzhou, Xinzhuang, and Banqiao).

Google did go in and change the labels on some places from city to district when Taiwan revised their names; but, oddly enough, the company didn’t fix the romanization at the same time. But with any luck we won’t have to wait so long before Google finally takes care of that too.

Or perhaps we’ll have a new president who will revive Tongyong Pinyin and Google will throw out all its good work.

Zhou Youguang on NPR

Louisa Lim had a story on National Public Radio yesterday about Zhou Youguang (周有光 / Zhōu Yǒuguāng), who’s often referred to as the father of Pinyin.

Most stories in the mass media about him focus on just two things, which might be summarized as “pinyin” and “wow, he’s really old.” This story, however, draws welcome notice to some other things about him, as the title reveals: At 105, Chinese Linguist Now A Government Critic. (There’s a link to the audio version near the top of the page. Zhou can be heard in the background speaking Mandarin — though his English is excellent.)

The article also provides a link to his blog: Bǎisuì xuérén Zhōu Yǒuguāng de bó kè (百岁学人周有光的博客).

Further reading:

Hat tip to John Rohsenow.

photo of Zhou Youguang signing a book for me

Google Translate’s Pinyin converter: now with apostrophes

Google has taken another major step toward making Google Translate’s Pinyin converter decent. Finally, apostrophes.

Not long ago “阿尔巴尼亚然而仁爱莲藕普洱茶” would have yielded “āěrbāníyǎ ránér rénài liánǒu pǔěr chá.” But now Google produces the correct “ā’ěrbāníyǎ rán’ér rén’ài lián’ǒu pǔ’ěr chá.” (Well, one could debate whether that last one should be pǔ’ěr chá, pǔ’ěrchá, Pǔ’ěr chá, Pǔ’ěr Chá, or Pǔ’ěrchá. But the apostrophe is undoubtedly correct regardless.)
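The apostrophe rule at work in these examples can be sketched in a few lines of Python. This is only an illustration, not a description of how Google does it: the function names are hypothetical, the input is assumed to be a word already split into syllables, and the rule applied is the standard one that a syllable beginning with a, o, or e takes an apostrophe when it follows another syllable in the same word.

```python
import unicodedata


def starts_with_a_o_e(syllable):
    """Return True if the syllable's first letter is a, o, or e (ignoring tone marks)."""
    # NFD splits a letter such as "à" into "a" plus a combining mark,
    # so the first code point is the bare base letter.
    base = unicodedata.normalize("NFD", syllable.lower())[:1]
    return base in ("a", "o", "e")


def join_syllables(syllables, apostrophe="\u2019"):
    """Join the syllables of one word, adding apostrophes where Pinyin needs them."""
    word = ""
    for index, syllable in enumerate(syllables):
        # No apostrophe before the first syllable of a word.
        if index > 0 and starts_with_a_o_e(syllable):
            word += apostrophe
        word += syllable
    return word


print(join_syllables(["rén", "ài"]))
print(join_syllables(["Cháng", "ān"]))
print(join_syllables(["Mín", "shēng"]))
```

The hard part, which this sketch skips entirely, is deciding where the word boundaries are in the first place.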

Also, the -men suffix is now solid with words (e.g., 朋友们 –> péngyoumen and 孩子们 –> háizimen). This is a small thing but nonetheless welcome.

The most significant remaining fundamental problem is the capitalization and parsing of proper nouns.

And numbers are still wrong, with everything being written separately. For example, “七千九百四十三万五千六百五十八” should be rendered as “qīqiān jiǔbǎi sìshísān wàn wǔqiān liùbǎi wǔshíbā.” But Google is still giving this as “qī qiān jiǔ bǎi sì shí sān wàn wǔ qiān liù bǎi wǔ shí bā.”
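To make the difference concrete, here is a rough Python sketch that regroups syllable-by-syllable output into the parsed form. It is an illustration under stated assumptions, not Google's method: the function name and word lists are hypothetical, the grouping for hundreds, thousands, and tens is taken from the example above, and the handling of a lone digit before wàn goes beyond that example and follows the standard orthography rules as I understand them. Zeros (líng) and larger units such as yì are not treated specially.

```python
# Single digits (plus liǎng) as they appear in spoken numbers.
DIGITS = {"yī", "èr", "liǎng", "sān", "sì", "wǔ", "liù", "qī", "bā", "jiǔ"}
# Units written solid with a single digit directly before them.
# Including "wàn" here is an assumption beyond the example in the text.
SOLID_UNITS = {"bǎi", "qiān", "wàn"}


def regroup_number(syllables):
    """Regroup separately written number syllables into orthographic words."""
    words = []
    i = 0
    while i < len(syllables):
        current = syllables[i]
        following = syllables[i + 1] if i + 1 < len(syllables) else None
        if current in DIGITS and following in SOLID_UNITS:
            # e.g. qī + qiān -> qīqiān
            words.append(current + following)
            i += 2
        elif current in DIGITS and following == "shí":
            # e.g. sì + shí (+ sān) -> sìshísān
            word = current + following
            i += 2
            if i < len(syllables) and syllables[i] in DIGITS:
                word += syllables[i]
                i += 1
            words.append(word)
        elif current == "shí" and following in DIGITS:
            # e.g. shí + wǔ -> shíwǔ
            words.append(current + following)
            i += 2
        else:
            # Anything else (such as wàn after a multi-digit number) stands alone.
            words.append(current)
            i += 1
    return " ".join(words)


separated = "qī qiān jiǔ bǎi sì shí sān wàn wǔ qiān liù bǎi wǔ shí bā"
print(regroup_number(separated.split()))
```

Run on the separated string, this yields the parsed form given in the text.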

On the other hand, Google is starting to deal with “le”, with it being appended to verbs. This is a relatively tricky thing to get right, so I’m not surprised Google doesn’t have the details down yet.

So there’s still a lot of work to be done. But at least progress is being made in areas of fundamental importance. I’m heartened by the progress.

Related posts:

The current state:
screen shot of what Google Translate's Pinyin converter produces as of late September 2011

Kindles and Pinyin

Sure, Amazon Kindles can store thousands of books, play mp3 files, provide Web access, and allow one to spend money on books with alarming ease. But can they handle Pinyin?

photo of a Kindle 3 displaying the opening of 'Muqin Chujia' -- showing that all tone marks appear correctly


This test was made on a Kindle 3 purchased at a U.S. retail store. All three typefaces — regular, condensed, and sans serif — worked well.

Yes, Kindles can display Hanzi as well — though there may be some problems with those appearing correctly in book titles in the device’s index.

Below are links to my files, in case you want to test this yourself. I’d appreciate hearing about how Nook and other devices handle this. Thanks.

Google Web fonts and Hanyu Pinyin

Back in the last century, getting Web browsers to correctly display Pinyin was such a troublesome task that I remember once even employing GIFs of first- and third-tone letters to get those to look right. So there were a whole lotta IMG tags in my text. Sure, I put the necessary info in ALT tags (e.g., “alt='a3'”), just in case. But, still, I shudder to recall having to resort to that particular hack.

Things are better now, though still far from ideal. One thing that promises to help considerably with the problem of site visitors not all having the font you may wish to use is CSS3’s @font-face, which allows those creating Web pages to employ fonts that are provided online. Google is helping with this through its Google Web Fonts. (Current count: 252 font families.)

But is anything in Google’s collection capable of dealing with Hanyu Pinyin? Armed with a handy-dandy Pinyin pangram, I had a look at what Google has made available.

Not surprisingly, most of the 29 font families marked as offering the “Latin Extended” character set failed to handle the entire Hanyu Pinyin set. The ǘǚǜ group is the most likely to be unsupported at present, with third-tone vowels also frequently missing.
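For anyone who wants to run this sort of check mechanically rather than by eye, here is a minimal Python sketch. It assumes you already have the set of code points a font covers (getting that out of a font file is outside the sketch), it looks only at lowercase letters, and it leaves out first-tone ü on the assumption that it is not needed. The function names are hypothetical.

```python
import unicodedata

# Combining marks for the four tones: macron, acute, caron, grave.
TONES = {"first": "\u0304", "second": "\u0301", "third": "\u030c", "fourth": "\u0300"}


def pinyin_tone_letters():
    """Build the lowercase tone-marked vowels used in Hanyu Pinyin."""
    letters = []
    for vowel in "aeiouü":
        for tone, mark in TONES.items():
            if vowel == "ü" and tone == "first":
                continue  # assumption: first-tone ü is not needed
            # NFC turns vowel + combining mark into one precomposed letter.
            letters.append(unicodedata.normalize("NFC", vowel + mark))
    return letters


def missing_pinyin_letters(supported_code_points):
    """Return the Pinyin letters that a font's code-point set does not cover."""
    required = ["ü"] + pinyin_tone_letters()
    return [ch for ch in required if ord(ch) not in supported_code_points]


# Hypothetical font covering only U+0020 through U+00FF.
basic_font = set(range(0x20, 0x100))
print("".join(missing_pinyin_letters(basic_font)))
```

With that hypothetical basic font, the letters reported missing are the first- and third-tone vowels plus the tone-marked forms of ü, which matches the pattern of failures described above.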

Here are the Google Web fonts that do support Hanyu Pinyin with tone marks:

Serifs

  • EB Garamond (227 KB)
  • Gentium Basic (263 KB — and about the same for each of the three accompanying styles: italic, bold, bold italic)
  • Gentium Book Basic (267 KB — and about the same for each of the three accompanying styles: italic, bold, bold italic)
  • Neuton (56 KB — and about the same for each of the five accompanying styles: italic, bold, light, extra light, extra bold)

screenshot of the Pinyin fonts above


  • Neuton has relatively weak tone marks, so I wouldn’t recommend it for Web pages aimed at beginning students of Mandarin.

Sans Serifs

  • Andika (1.4 MB)
  • Ubuntu (350 KB) — available in eight styles

screenshot of the Pinyin fonts above

Some Ubuntu sample PDFs: Ubuntu regular, Ubuntu italic, Ubuntu bold, Ubuntu bold italic, Ubuntu light, Ubuntu light italic, Ubuntu medium, Ubuntu medium italic.

Andika sample PDF.


  • Andika’s relatively large size (1.4 MB) makes it unsuitable for @font-face use because of download time. (Its license, however, would permit someone with the time and energy to crack it open and remove lots of the glyphs not needed for Pinyin, thus reducing the size.) More fundamentally, though, I don’t much like the look of it; but YMMV.

Since Google is likely to expand the number of fonts it offers, I’m including the list of all 29 faces I tried for this experiment, which should make it easier for those wanting to test only new fonts. (It is possible, however, that Pinyin support will be added later to some fonts that fail in this area now. If anyone hears of any such changes, please let me know.) Use of bold indicates Pinyin support; everything else failed.

Display Faces with Latin Extended (all fail)

  • Abril Fatface
  • Forum
  • Kelly Slab
  • Lobster
  • MedievalSharp
  • Modern Antiqua
  • Ruslan Display
  • Tenor Sans

Handwriting Faces with Latin Extended (all fail)

  • Patrick Hand

Serif Faces with Latin Extended

  • Cardo
  • Caudex
  • EB Garamond
  • Gentium Basic
  • Gentium Book Basic
  • Neuton
  • Playfair Display
  • Sorts Mill Goudy

Sans Serif Faces with Latin Extended

  • Andika
  • Anonymous Pro
  • Anton
  • Didact Gothic
  • Francois One
  • Istok Web
  • Jura
  • Open Sans
  • Open Sans Condensed
  • Play
  • Ubuntu
  • Varela

Additional resource: SIL Fonts for downloading (including the full versions of Andika and Gentium).

Pinyin pangram challenge

One of the many things I plan to do eventually is to put up some graphics of how Pinyin looks in various font faces. A Pinyin pangram would do nicely for a sample text. You know: a short Mandarin sentence in Hanyu Pinyin that uses all of the following 26 letters: abcdefghijklmnopqrstuüwxyz (i.e., the English alphabet’s a-z, minus v but plus ü).

But then I couldn’t find one. So I put the question out to some people I know and quickly got back two Pinyin pangrams.

Ruanwo bushi yingzuo; putongfan bushi xican; maibuqi lüde kan jusede. (57 letters)


Zuotian wo bang wo de pengyou Lü Xisheng qu chengli mai yi wan doufuru he ban zhi kaoji. (70 letters)

from Robert Sanders and Cynthia Ning, respectively.

James Dew weighed in with some helpful advice. And, with some additional help from the original two contributors and my wife, I made some additional modifications, eventually resulting in a variant reduced to 48 letters:

Zuotian wo bang nü’er qu yi jia chaoshi mai kele, xifan, doupi.

With tone marks, that’s “Zuótiān wǒ bāng nǚ’ér qù yī jiā chāoshì mǎi kělè, xīfàn, dòupí.”

I suppose xīfàn is not really the sort of thing one buys at a chāoshì. On the other hand, people probably don’t worry much about whether jackdaws really do love someone’s big sphinx of quartz, so I think we’re OK. Still, something shorter than 48 letters should be possible — though pangram-friendly brevity is more easily accomplished in English than in Mandarin as spelled in Hanyu Pinyin. As one correspondent noted:

Most of the “excess” letters are vowels. Trouble is that Chinese doesn’t pile up the consonants much. Brown, for example, takes care of b, r, w, and n, while only expending one little o…. There’s no word like string in Chinese (5 consonants; one vowel). Chinese piles up vowels: zuotian and chaoshi and doufu and kaoji all use more vowels than consonants.

I’m challenging readers to come up with more Pinyin pangrams.
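A candidate can be checked mechanically. The following Python sketch is one way to do it, with hypothetical function names: it strips the four tone marks but keeps the diaeresis, so ü counts as its own letter, then counts the letters and reports any of the 26 that are missing.

```python
import unicodedata

PINYIN_ALPHABET = set("abcdefghijklmnopqrstuüwxyz")
# Combining macron, acute, caron, grave (the four tone marks).
# The diaeresis (U+0308) is deliberately not in this set.
TONE_MARKS = {"\u0304", "\u0301", "\u030c", "\u0300"}


def pinyin_letters(text):
    """Return the Pinyin letters in the text, lowercased and with tones removed."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    toneless = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # Recompose so that u + diaeresis becomes the single letter ü again.
    recomposed = unicodedata.normalize("NFC", toneless)
    return [ch for ch in recomposed if ch in PINYIN_ALPHABET]


def check_pangram(text):
    """Return (letter count, sorted list of alphabet letters not used)."""
    letters = pinyin_letters(text)
    missing = sorted(PINYIN_ALPHABET - set(letters))
    return len(letters), missing


candidate = "Zuotian wo bang nü’er qu yi jia chaoshi mai kele, xifan, doupi."
count, missing = check_pangram(candidate)
print(count, missing)
```

An empty list of missing letters means the sentence is a pangram. Spaces, punctuation, apostrophes, and the letter v are ignored in the count.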

But I don’t want this to be a reversed shi shi shi stunt, so let’s stay away from Literary Sinitic. And I’d prefer the equivalent of “The quick brown fox jumps over the lazy dog” to that of “Cwm fjord veg balks nth pyx quiz.” In other words, wherever possible this should be in real-world, sayable Mandarin.

One possible variant on this would be to use “abcdefghijklmnopqrstuüwxyz” plus all the forms with diacritics “āáǎàēéěèīíǐìōóǒòūúǔùǘǚǜ.” (No ǖ — first-tone ü, that is — is necessary.) But that would be even more work.

Those who devise good pangrams will be covered in róngyào — or something like that.

Happy hunting.

DPP position on romanization

(BTW, this is my 7 KB JPG version of the 442 KB(!) BMP(!) file used on the DPP's site.)

With Taiwan’s presidential election less than six months away and various position papers being issued, perhaps it’s time to take a look at where the opposition stands on romanization.

Sure, various politicians rant from time to time. But they may or may not be taken seriously. What about the party itself and its candidate?

Google doesn’t find any instances of “拼音” (“pīnyīn”) on the official Web site of the Democratic Progressive Party’s presidential candidate, Tsai Ing-wen (Cài Yīngwén / 蔡英文). But searching for “拼音” on the DPP’s official Web site does yield at least a few results. (See the “sources” at the end of this piece.) It’s probably no surprise that none of them contain anything but bad news for those who support Taiwan’s continued use of Hanyu Pinyin.

Typical is the “e-paper” piece from 2008 that states the change to Hanyu Pinyin will cost NT$7 billion (about US$240 million). (If the DPP candidate wins, will the DPP follow its own assertions and logic and say that it would be far too expensive for Taiwan to change from the existing Hanyu Pinyin to Tongyong Pinyin?) I have no more faith in that inflated figure than I have in the other claims there, such as that the use of Hanyu Pinyin would not be convenient for foreigners and that there is no relationship between internationalization and using the world’s one and only significant romanization system for Mandarin (Hanyu Pinyin).

Then there’s the delicious irony that the image of a Tongyong Pinyin street sign the DPP chose to use in that anti-Hanyu Pinyin message has a typo! The sign, shown at top right, should read Guancian, not Guanciao. (In Hanyu Pinyin it would be “Guanqian.”) That’s right: The DPP says Taiwan needs to use Tongyong — but the supposed expert who put together that very argument apparently doesn’t know the difference between Tongyong Pinyin and a hole in the wall.

That document is a few years old, though. What about something more recent? Just three months ago the DPP spokesman, Chen Qimai (Chen Chi-mai / 陳其邁), complained that the Ma Ying-jeou administration had replaced Tongyong Pinyin with Hanyu Pinyin, calling this an example of removing Taiwan culture and abandoning Taiwan’s sovereignty. So there’s nothing to indicate a change in position over time.

It’s worth remembering that there’s a lot of blame to go around for the inconsistencies and sloppiness that characterize Taiwan’s romanization situation. Historically speaking, the KMT is certainly responsible for much of the mess. And the Ma administration’s willingness to go along with “New Taipei City” instead of “Xinbei,” “Tamsui” instead of “Danshui,” and “Lukang” instead of “Lugang” demonstrates that it is OK with cutting back its own policy in favor of Hanyu Pinyin. Nevertheless, it’s now the DPP — or at least some very loud and opinionated people within it — that represents the main force for screwing up perfectly good signage, etc.

Back when I was more often around DPP politicians, I would occasionally ask them privately about their opinions of Hanyu Pinyin. For the most part, they had no opposition to Taiwan’s use of it, regarding this as simply a practical matter. But they would not say so publicly because President Chen Shui-bian’s dumping of Ovid Tzeng made it clear what fate would meet those who opposed Chen on this issue.

Even though Chen is no longer in the picture, I fear that many in the DPP have come to believe their own propaganda on this issue.

I urge individuals (esp. those with known pro-green sentiments) and organizations (Hey, ECCT and AmCham: that means you especially!) that want to avoid a return to the national embarrassment that is Tongyong Pinyin to tell Cai Yingwen and the DPP now that Taiwan’s continued use of Hanyu Pinyin is simply good policy and is supported by the vast majority of the foreign community here, including pro-green foreigners.