Today I’d like to talk about a sign at a stand that sells guabao, a quintessential Taiwanese snack.

I took my own photo, but it didn’t make the guabao look particularly appetizing, so I’m using a public-domain image instead so you can see what one looks like if you don’t already know. But when I buy one I have them leave off the cilantro/xiāngcài. I hate that stuff.

Here’s the sign.

guabao sign, as described below



(NT$50 is about US$1.50.)

The sign uses some Taiwanese, specifically “a刈包.” If the whole thing were in romanized Taiwanese, it would be

Su-pâng ê

To̍k-ka kháu-bī
50 îⁿ

But parts of that are unidiomatic, as Taiwanese expert Michael Cannings informs me. (Alas, my Taiwanese sucks.) So this is a sign in both Taiwanese and Mandarin, which isn’t particularly surprising given that guabao is a Taiwanese food but most people in northern Taiwan use Mandarin most of the time. (I’m using the spelling “guabao” rather than “koah-pau” in most of this post because this is a Pinyin site.)

Something about this sign did surprise me a lot. Can you guess?

  • It’s not the use of a Roman letter — I should probably say “English letter” in this case, since here the letter is meant to be pronounced much like the “A” in “ABC” — though regular readers know that’s certainly more than enough to get me interested.
  • It’s not that the sign has “刈包” rather than “割包” for guabao. In searches restricted to .tw domains, Google returns 181,000 results for “刈包” and just 41,900 results for “割包”, even though Taiwan’s Ministry of Education prefers the latter form. Even on government Web pages “刈包” beats “割包” by a ratio of more than two to one.
  • It’s not the style in which “刈包” is written by hand, though I kinda like that.
  • And it’s not even that “a” was used instead of a different Roman letter: “ê”.

What seems to me most distinctive about this sign is that the Roman letter appears in lowercase rather than as “A”.

A single letter being used to represent a Sinitic morpheme in a text otherwise in Chinese characters is almost always written in upper case, e.g., A菜, 宮保G丁, K書. (Oh, that reminds me: I really need to answer that e-mail message about K. Sorry, Steven.)

In other words, if a sign is going to have the Roman letter “a” stand in for the Taiwanese possessive particle (the equivalent of Mandarin’s de/的), I would expect in this particular case for the sign to have “私房A” rather than “私房a”. I’m pleased by the use of lowercase; capital letters should be mainly for proper nouns and the beginnings of sentences.

It’s probably a one-off. But just in case, I’ll be on the lookout to see if there’s a trend toward greater use of lowercase.

The text also presents a challenge: How should this be written in Pinyin? The last part (獨家口味 / 50元) is easy, because it’s just straight modern standard Mandarin:

dújiā kǒuwèi
50 yuán

But what to do with this?


Probably this:

Sīfáng ê

Most Common Taiwanese Given Names

Below are the most common given names for Taiwanese, as of June 2016. For the numbers of people with any of these given names, see the graph below. Note that there are more Taiwanese with even the tenth-most-popular name for girls than the most popular name for boys.

If you would like a chart of such names for Taiwanese in their twenties and thirties (specifically, those born 1976–1994), see Common Taiwanese given names. For the most common family names in Taiwan, see Taiwan personal names: a frequency list.

For the most likely spelling, bastardized Wade-Giles is given.

Most popular given names for Taiwanese males

No. Hanzi Pinyin Spelling Likely Used by Someone with This Name
1 家豪 Jiāháo Chia-hao
2 志明 Zhìmíng Chih-ming
3 俊傑 Jùnjié Chun-chieh
4 建宏 Jiànhóng Chien-hung
5 俊宏 Jùnhóng Chun-hung
6 志豪 Zhìháo Chih-hao
7 志偉 Zhìwěi Chih-wei
8 文雄 Wénxióng Wen-hsiung
9 金龍 Jīnlóng Chin-lung
10 志強 Zhìqiáng Chih-chiang

Most popular given names for Taiwanese females

No. Hanzi Pinyin Spelling Likely Used by Someone with This Name
1 淑芬 Shūfēn Shu-fen
2 淑惠 Shūhuì Shu-hui
3 美玲 Měilíng Mei-ling
4 雅婷 Yǎtíng Ya-ting
5 美惠 Měihuì Mei-hui
6 麗華 Lìhuá Li-hua
7 淑娟 Shūjuān Shu-chuan
8 淑貞 Shūzhēn Shu-chen
9 怡君 Yíjūn Yi-chun
10 淑華 Shūhuá Shu-hua

Graph, in Mandarin, of the most common male and female names in Taiwan

Note: Although I refer to these as “Taiwanese” names, I give the Mandarin forms (since Hanyu Pinyin is a system for writing Mandarin), not names in Hoklo/Hokkien (the language often referred to as Taiwanese).

Source: ROC Ministry of the Interior.

Gwoyeu Romatzyh in the wild

Although Gwoyeu Romatzyh was technically the ROC’s official romanization system for most of the twentieth century (through 1986), it’s very seldom seen in Taiwan. The most common place for it to appear is on the side of coach buses. But here’s an example of Gwoyeu Romatzyh on a shipping box for thousand-year-old eggs:



Gwoyeu Romatzyh is often most easily identified by the doubled vowel in most (but not all) third-tone syllables. But this example doesn’t have any of those. The y indicates second tone (except when it doesn’t). And the doubled final n is a marker of fourth tone. (Have I ever mentioned that Gwoyeu Romatzyh often reminds me of “The Name Game”?)

In Hanyu Pinyin, songhua pyidann is sōnghuā pídàn.

Another technical point: this photo wasn’t taken in Taiwan proper but rather on Kinmen (金門), which provides an example of a romanization system older than Gwoyeu Romatzyh, older than Wade-Giles even. It’s postal romanization, which I regard as too mixed up to properly be called a system. In Hanyu Pinyin, Kinmen is Jinmen. The island is also known as Quemoy.

Tai vs Tai

Taipei’s MRT system, wonderful though it is, continues to find new ways to irritate me. Today I present the case of

台 vs. 臺

Semantically, there is no difference between these two characters. They both represent the tái in Taipei/Taibei and Taiwan. But the 台 form is more common in Taiwan, where it is seen as a variant form and thus not as one of the “simplified” characters used in China.

So why is the MRT’s new airport line using a huge “臺” on its signs when a normal “台” would do just as well? In fact, the regular 台 form is found six times on the same sign, with the fourteen-stroke “臺” seen just once.

To show that this isn’t just a one-off, I’m providing photos of a few more signs in a station along the “purple” (airport) line.

So, in the first sign alone, we have:

  • 臺北 (×1),
  • 台北 (×4),
  • 月台 (×1), for yuetai (platform), and
  • 台鐵 (×1), for Tai-Tie, Taiwan’s railroad company, and thus any ordinary train line.

I blame Ma Ying-jeou.

Zhou Youguang, 1906-2017


Zhou Youguang, who is often called the “father of Hanyu Pinyin,” died earlier today.

He lived to the age of 111. He was “the man God forgot,” he liked to joke. And he did like to laugh. His sense of humor, which he kept despite some of the trials he suffered, no doubt helped him flourish so long.

He was most remarkable, however, not for his longevity but for his monumental contribution to literacy, his dedication to helping others, and his sense of justice.

I’ll add more information later.


How to find Chinese characters in an MS Word document

Recently someone wrote me with a problem. She had a book-length manuscript, most of which was in English. It also had some Chinese characters interspersed throughout the text. She needed to make some alterations to just the parts in Chinese characters and was hoping to avoid going through the entire Microsoft Word document line by line and changing the Chinese characters phrase by phrase. That could have taken hours or even days.

Fortunately, there’s a much easier and much faster way. So here’s how to search for Chinese characters inside a Microsoft Word document.

First, the simplest and easiest way. Copy the following line:

[⺀-￭]{1,}

In Microsoft Word, use Ctrl+H to bring up the Find and Replace box.

  1. Paste the text you just copied in the Find what box.
  2. Click on the More >> button to reveal additional options.
  3. Select Use wildcards.



Then Find away. That’s all there is to it. You can alter all the Chinese characters you find at the same time if you so desire.

Pro tip: If you want to change something about the Chinese characters, you might be better off in the long run making a new Word style and changing all the relevant characters to that style and then adjusting the style to meet your needs. Use Replace → Format → Style....


Now comes a longer explanation, which you can safely ignore if the above worked fine for you.

But in case the special code above didn’t work for you or if you’d like to understand this a little better, here’s some more information on how to enter [⺀-￭]{1,} yourself and why it works.

Basically, what you’re searching for is a range of characters, such as everything from A to Z. But in this case you’re going to be looking for everything from the start to the finish of Unicode’s set of graphs related to Chinese characters. Word calls this a wildcard search. Others refer to the use of wildcards as “regular expressions,” or “regex” for short.

Searches for ranges go in square brackets, with a hyphen between the first character and the last one, e.g. [A-Z].

The part at the end, {1,}, just tells Microsoft Word to look for one or more of the previous expression, so it will locate entire sections in Chinese characters, not just one character at a time. That will save you a lot of time and trouble.
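
For anyone who would rather test the idea outside Word, here is a minimal Python sketch of the same kind of range search. It assumes the range runs from U+2E80 (CJK Radical Repeat) to U+FFED (halfwidth black square), the two endpoints described in this post; the function name find_cjk_runs and the sample sentence are mine, purely for illustration. Python’s + stands in for Word’s {1,}.

```python
import re

# Same span as the Word wildcard range: U+2E80 (CJK Radical Repeat)
# through U+FFED (halfwidth black square). "+" plays the role of {1,}.
CJK_RUN = re.compile("[\u2E80-\uFFED]+")


def find_cjk_runs(text):
    """Return every unbroken run of characters that falls inside the range."""
    return CJK_RUN.findall(text)


sample = "The sign reads 私房a刈包 and costs 50元."
print(find_cjk_runs(sample))  # ['私房', '刈包', '元']
```

Note that a range this wide also takes in kana, hangul, and fullwidth punctuation, so in a document that mixes those scripts it will match more than just Chinese characters.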

OK, to get those special characters in a Word document, use

  1. Insert
  2. Symbol
  3. More Symbols



  1. Under Font, select (Asian text).
  2. Under Subset, scroll down until you can select the CJK Radicals Supplement.
  3. Word should have already selected ⺀ (CJK Radical Repeat) for you. If not, you can click on it.
  4. Click the Insert button.


If needed, repeat Insert → Symbol → More Symbols.


This time, with Font still set at (Asian text):

  1. Under Subset, scroll all the way down until you can select the Halfwidth and Fullwidth Forms.
  2. Scroll all the way down the selection of glyphs and select the very last one.
  3. Click the Insert button.


On my system at the moment that final character is a “halfwidth black square.” But as Unicode — and fonts — expand, the final character may be something else. Just use whatever is last and you should be fine. Just be sure to type in the square brackets, the hyphen, and the {1,} to complete the expression:

[⺀-￭]{1,}

In case anyone’s wondering, no, you can’t just enter Unicode code numbers, because searches for those (^u + number) won’t work when “Use wildcards” is on. So you have to enter the characters themselves.

This method can be easily adapted to suit searches for Greek letters, Cyrillic, etc.
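
As a rough illustration of that adaptation, here is a Python sketch that swaps in other ranges. The endpoints are my assumption: I am using the basic Greek and Coptic block (U+0370 to U+03FF) and the basic Cyrillic block (U+0400 to U+04FF), which leaves out the extended blocks. The names SCRIPT_RANGES and find_script_runs are hypothetical.

```python
import re

# Assumed ranges: the basic Unicode blocks only, not the extended ones.
SCRIPT_RANGES = {
    "greek": "\u0370-\u03FF",     # Greek and Coptic
    "cyrillic": "\u0400-\u04FF",  # Cyrillic
}


def find_script_runs(text, script):
    """Return every unbroken run of characters in the chosen script's range."""
    pattern = re.compile("[" + SCRIPT_RANGES[script] + "]+")
    return pattern.findall(text)


sample = "Привет, мир! The constant π and the letter λ."
print(find_script_runs(sample, "cyrillic"))  # ['Привет', 'мир']
print(find_script_runs(sample, "greek"))     # ['π', 'λ']
```

In Word itself the equivalent would be to insert the first and last characters of the relevant subset, just as with the Chinese range.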

I hope you find it useful.

Taipei to spend NT$300 million making MRT signage worse

Taipei MRT station
Commonwealth Magazine (Tiānxià zázhì) recently interviewed me for a Mandarin-language piece related to the signage on Taipei’s MRT system.

As anyone who has looked at Pinyin News more than a couple of times over the years should be able to guess, I had a lot to say about that — most of which understandably didn’t make it into the article. For example, I recall making liberal use of the word “bèn” (“stupid”) to describe the situation and the city’s approach. But the reporter — Yen Pei-hua (Yán Pèihuá / 嚴珮華), who is perhaps Taiwan’s top business journalist — diplomatically omitted that.

Since the article discusses the nicknumbering system Taipei is determined to implement “for the foreigners,” even though most foreigners are at best indifferent to this, but doesn’t include my remarks on it, I’ll refer you to my post on this from last year: Taipei MRT moves to adopt nicknumbering system. Back then, though, I didn’t know the staggering amount of money the city is going to spend on screwing up the MRT system’s signs: NT$300 million (about US$10 million)! The main reason given for this is the sports event Taipei will host next summer. That’s supposed to last for about ten days, which would put the cost for the signs alone at about US$1 million per day.

On the other hand, the city does not plan to fix the real problems with the Taipei MRT’s station names, specifically the lack of apostrophes in what should be written Qili’an (not Qilian), Da’an (not Daan) (twice!), Jing’an (not Jingan), and Yong’an (not Yongan) — in Chinese characters: 唭哩岸, 大安, 景安, and 永安, respectively. And then there’s the problem of wordy English names.
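
The apostrophe rule behind those corrections is simple enough to sketch in a few lines of Python: within a word, any syllable after the first that begins with a, e, or o gets an apostrophe in front of it. This is only an illustration, not anything the MRT or the city uses; the function name join_pinyin is hypothetical, it expects a non-empty list of syllables that have already been split, and it emits a plain straight apostrophe.

```python
# Vowels that trigger an apostrophe, with and without tone marks.
APOSTROPHE_VOWELS = "aeoāáǎàēéěèōóǒò"


def join_pinyin(syllables):
    """Join pre-split Pinyin syllables into one word, putting an apostrophe
    before any non-initial syllable that begins with a, e, or o."""
    word = syllables[0]
    for syllable in syllables[1:]:
        if syllable[0].lower() in APOSTROPHE_VOWELS:
            word += "'"
        word += syllable
    return word


for name in (["Qi", "li", "an"], ["Da", "an"], ["Jing", "an"], ["Yong", "an"]):
    print(join_pinyin(name))
```

Names with no such syllable, such as Taibei, come out unchanged.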

Well, take a look and comment — here, or better still, on the Facebook page. (Links below.) I’m grateful to Ms. Yen and Commonwealth for discussing the issue.


Aiyo! OED fails to use Pinyin for some new entries

The Oxford English Dictionary has just added some new entries, including several from Sinitic languages.

A lot of these come by way of Singapore and so reflect the Hokkien language. For example, among the new entries is “ang pow,” which is Hokkien’s equivalent of Mandarin’s “hongbao,” which also made the list.

A few of the entries, however, come from Mandarin, for example two common interjections for surprise. Oddly, though, the OED uses “aiyoh” and “aiyah” instead of their proper Pinyin spellings of “aiyo” and “aiya.”

“Ah,” you say, “but maybe the aiyoh and aiyah spellings are more common in English.”


Even in Singapore domains (.sg), the Pinyin spellings are more common than those the OED calls for. As the tables below show, in every instance the Pinyin spellings are also more common in Hong Kong, China, and Taiwan. Throughout the world, the Pinyin spellings are more common — the vast majority of the time by a factor of at least two.

Google search results for “aiyo” (Pinyin) and “aiyoh” (spelling used in the OED)

Search scope                aiyo        aiyoh
.sg                         12,200      5,680
.hk                         2,570       187
.cn                         6,040       984
.tw                         4,690       196
all domains                 1,250,000   137,000
all domains + “chinese”     97,700      77,100
all domains + “mandarin”    51,800      14,100

Google search results for “aiya” (Pinyin) and “aiyah” (spelling used in the OED)

Search scope                aiya        aiyah
.sg                         17,600      8,310
.hk                         6,400       2,360
.cn                         13,200      1,860
.tw                         5,910       1,710
all domains                 3,370,000   332,000
all domains + “chinese”     238,000     63,200
all domains + “mandarin”    36,500      22,800
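
For anyone who wants to check the “factor of at least two” claim against the two tables above, here is a small Python sketch. The counts are copied straight from the tables; the names AIYO, AIYA, ratios, and below_factor are just labels I made up for the example.

```python
# Counts copied from the two tables: (Pinyin spelling, OED spelling).
AIYO = {
    ".sg": (12200, 5680),
    ".hk": (2570, 187),
    ".cn": (6040, 984),
    ".tw": (4690, 196),
    "all domains": (1250000, 137000),
    'all domains + "chinese"': (97700, 77100),
    'all domains + "mandarin"': (51800, 14100),
}
AIYA = {
    ".sg": (17600, 8310),
    ".hk": (6400, 2360),
    ".cn": (13200, 1860),
    ".tw": (5910, 1710),
    "all domains": (3370000, 332000),
    'all domains + "chinese"': (238000, 63200),
    'all domains + "mandarin"': (36500, 22800),
}


def ratios(table):
    """Pinyin-to-OED ratio for each search scope, rounded to one decimal."""
    return {scope: round(p / o, 1) for scope, (p, o) in table.items()}


def below_factor(table, factor=2.0):
    """Search scopes where the Pinyin spelling leads by less than `factor`."""
    return [scope for scope, (p, o) in table.items() if p / o < factor]


print(ratios(AIYO))
print(ratios(AIYA))
print(below_factor(AIYO), below_factor(AIYA))
```

Only one row in each table comes in under a factor of two: the “chinese” row for aiyo and the “mandarin” row for aiya. The Pinyin spelling still leads in both.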

Searching Google Books also reveals that the Pinyin forms are more common.

In short, I do not see any good reason for the OED to have adopted ad hoc spellings rather than the Pinyin standard. They must have their reasons, but it looks like they botched this.