Tai vs Tai

Taipei’s MRT system, wonderful though it is, continues to find new ways to irritate me. Today I present the case of

台 vs. 臺

Semantically, there is no difference between these two characters. They both represent the tái in Taipei/Taibei and Taiwan. But the 台 form is more common in Taiwan, where it is seen as a variant form and thus not as one of the “simplified” characters used in China.

So why is the MRT’s new airport line using a huge “臺” on its signs when a normal “台” would do just as well? In fact, the regular 台 form is found six times on the same sign, with the fourteen-stroke “臺” seen just once.

To show that this isn’t just a one-off, I’m providing photos of a few more signs in a station along the “purple” (airport) line.

So, in the first sign alone, we have:

  • 臺北 (×1),
  • 台北 (×4),
  • 月台 (×1) (yuetai, platform), and
  • 台鐵 (×1), for Tai-Tie, Taiwan’s railroad company, and thus any ordinary train line.

I blame Ma Ying-jeou.

How to find Chinese characters in an MS Word document

Recently someone wrote me with a problem. She had a book-length manuscript, most of which was in English. It also had some Chinese characters interspersed throughout the text. She needed to make some alterations to just the parts in Chinese characters and was hoping to avoid going through the entire Microsoft Word document line by line and changing the Chinese characters phrase by phrase. That could have taken hours or even days.

Fortunately, there’s a much easier and much faster way. So here’s how to search for Chinese characters inside a Microsoft Word document.

First, the simplest and easiest way. Copy the following line:
[⺀-￭]{1,}

In Microsoft Word, use Ctrl+H to bring up the Find and Replace box.

  1. Paste the text you just copied in the Find what box.
  2. Click on the More >> button to reveal additional options.
  3. Select Use wildcards.

Then Find away. That’s all there is to it. You can alter all the Chinese characters you find at the same time if you so desire.

Pro tip: If you want to change something about the Chinese characters, you might be better off in the long run making a new Word style, changing all the relevant characters to that style, and then adjusting the style to meet your needs. Use Replace → Format → Style....

———–

Now comes a longer explanation, which you can safely ignore if the above worked fine for you.

But in case the special code above didn’t work for you or if you’d like to understand this a little better, here’s some more information on how to enter [⺀-￭]{1,} yourself and why it works.

Basically, what you’re searching for is a range of characters, such as everything from A to Z. But in this case you’re going to be looking for everything from the start to the finish of Unicode’s set of graphs related to Chinese characters. Word calls this a wildcard search. Others refer to the use of wildcards as “regular expressions,” or “regex” for short.

Searches for ranges go in square brackets, with a hyphen between the first character and the last one, e.g. [A-Z].

The part at the end, {1,}, just tells Microsoft Word to look for one or more of the previous expression, so it will locate entire sections in Chinese characters, not just one character at a time. That will save you a lot of time and trouble.
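For anyone who would rather try the same idea outside Word, here is a minimal Python sketch of an equivalent search. It assumes the manuscript's text has already been extracted to a plain string, and the name find_cjk_runs is made up for illustration. The range endpoints are U+2E80 (⺀) and U+FFED (the halfwidth black square). Keep in mind that this range is broad: it also sweeps in kana, hangul, and CJK punctuation, not just Chinese characters.

```python
import re

# Same range as the Word search: U+2E80 (⺀) through U+FFED (halfwidth black square).
# The trailing "+" plays the role of Word's {1,}: one or more in a row.
CJK_RUN = re.compile("[\u2E80-\uFFED]+")


def find_cjk_runs(text):
    """Return (start_index, run) pairs for each unbroken run in the range."""
    return [(m.start(), m.group()) for m in CJK_RUN.finditer(text)]


if __name__ == "__main__":
    sample = "Taipei (台北) and Taiwan (臺灣)"
    for start, run in find_cjk_runs(sample):
        print(start, run)
```

Because each match is a whole run, a phrase in Chinese characters comes back as one item rather than character by character, which is the same time-saver that {1,} provides in Word.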

OK, to get those special characters in a Word document, use

  1. Insert
  2. Symbol
  3. More Symbols

Next,

  1. Under Font, select (Asian text).
  2. Under Subset, scroll down until you can select the CJK Radicals Supplement.
  3. Word should have already selected ⺀ (CJK Radical Repeat) for you. If not, you can click on it.
  4. Click the Insert button.

If needed, repeat Insert → Symbol → More Symbols.

This time, with Font still set to (Asian text):

  1. Under Subset, scroll all the way down until you can select the Halfwidth and Fullwidth Forms.
  2. Scroll all the way down the selection of glyphs and select the very last one.
  3. Click the Insert button.

On my system at the moment that final character is a “halfwidth black square.” But as Unicode — and fonts — expand, the final character may be something else. Just use whatever is last and you should be fine. Just be sure to type in the square brackets, the hyphen, and the {1,} to complete the expression:
[⺀-￭]{1,}

In case anyone’s wondering, no, you can’t just enter Unicode code numbers, because searches for those (^u + number) won’t work when “Use wildcards” is on. So you have to enter the characters themselves.

This method can be easily adapted to suit searches for Greek letters, Cyrillic, etc.
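Since the characters themselves have to be entered, one way to avoid hunting through the Symbol dialog is to generate the search string from code points and paste it in. The following is only a sketch: the function name word_wildcard_range is hypothetical, the Greek and Cyrillic endpoints are simply the boundaries of those Unicode blocks, and I haven't verified that Word accepts every such pair.

```python
def word_wildcard_range(first_cp, last_cp):
    """Build a Word-style wildcard pattern from two Unicode code points."""
    if first_cp > last_cp:
        raise ValueError("first code point must not come after the last")
    # chr() turns a code point into the character itself, which is what
    # needs to be pasted into the Find what box.
    return "[{}-{}]{{1,}}".format(chr(first_cp), chr(last_cp))


if __name__ == "__main__":
    print(word_wildcard_range(0x2E80, 0xFFED))  # the range used in this post
    print(word_wildcard_range(0x0370, 0x03FF))  # Greek and Coptic block
    print(word_wildcard_range(0x0400, 0x04FF))  # Cyrillic block
```

Copy whichever line of output you need and paste it into Find what with Use wildcards selected.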

I hope you find it useful.

Taipei to spend NT$300 million making MRT signage worse

Commonwealth Magazine (Tiānxià zázhì) recently interviewed me for a Mandarin-language piece related to the signage on Taipei’s MRT system.

As anyone who has looked at Pinyin News more than a couple of times over the years should be able to guess, I had a lot to say about that — most of which understandably didn’t make it into the article. For example, I recall making liberal use of the word “bèn” (“stupid”) to describe the situation and the city’s approach. But the reporter — Yen Pei-hua (Yán Pèihuá / 嚴珮華), who is perhaps Taiwan’s top business journalist — diplomatically omitted that.

The article discusses the nicknumbering system Taipei is determined to implement “for the foreigners,” even though most foreigners are at best indifferent to it, but it doesn’t include my remarks on the subject. So I’ll refer you to my post on this from last year: Taipei MRT moves to adopt nicknumbering system. Back then, though, I didn’t know the staggering amount of money the city is going to spend on screwing up the MRT system’s signs: NT$300 million (about US$10 million)! The main reason given for this is the sports event Taipei will host next summer. That’s supposed to last for about ten days, which would put the cost for the signs alone at about US$1 million per day.

On the other hand, the city does not plan to fix the real problems with the Taipei MRT’s station names, specifically the lack of apostrophes in what should be written Qili’an (not Qilian), Da’an (not Daan) (twice!), Jing’an (not Jingan), and Yong’an (not Yongan) — in Chinese characters: 唭哩岸, 大安, 景安, and 永安, respectively. And then there’s the problem of wordy English names.
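The rule behind those apostrophes is simple enough to sketch in a few lines of Python: in Hanyu Pinyin, a syllable beginning with a, o, or e gets an apostrophe in front of it when it follows another syllable in the same word. The sketch below assumes the name has already been split into syllables (the hard part, which it does not attempt), and join_pinyin is a made-up name. It uses a plain ASCII apostrophe.

```python
import unicodedata


def _base_letter(ch):
    """Strip any tone mark and lowercase, so that 'ā' and 'A' both give 'a'."""
    return unicodedata.normalize("NFD", ch)[0].lower()


def join_pinyin(syllables):
    """Join Pinyin syllables into one word, adding apostrophes where needed."""
    if not syllables:
        return ""
    word = syllables[0]
    for syllable in syllables[1:]:
        # A non-initial syllable starting with a, o, or e needs an apostrophe.
        if _base_letter(syllable[0]) in "aoe":
            word += "'"
        word += syllable
    return word


if __name__ == "__main__":
    for name in (["Qi", "li", "an"], ["Da", "an"], ["Jing", "an"], ["Yong", "an"]):
        print(join_pinyin(name))
```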

Well, take a look and comment — here, or better still, on the Facebook page. (Links below.) I’m grateful to Ms. Yen and Commonwealth for discussing the issue.

References:

Aiyo! OED fails to use Pinyin for some new entries

The Oxford English Dictionary has just added some new entries, including several from Sinitic languages.

A lot of these come by way of Singapore and so reflect the Hokkien language. For example, among the new entries is “ang pow,” which is Hokkien’s equivalent of Mandarin’s “hongbao,” which also made the list.

A few of the entries, however, come from Mandarin, for example, two common interjections expressing surprise. Oddly, though, the OED uses “aiyoh” and “aiyah” instead of their proper Pinyin spellings of “aiyo” and “aiya.”

“Ah,” you say, “but maybe the aiyoh and aiyah spellings are more common in English.”

Nope.

Even in Singapore domains (.sg), the Pinyin spellings are more common than those the OED calls for. As the tables below show, in every instance the Pinyin spellings are also more common in Hong Kong, China, and Taiwan. Throughout the world, the Pinyin spellings are more common — the vast majority of the time by a factor of at least two.

Google search results for “aiyo” (Pinyin) and “aiyoh” (spelling used in the OED)

domain                      aiyo       aiyoh
.sg                         12,200     5,680
.hk                         2,570      187
.cn                         6,040      984
.tw                         4,690      196
all domains                 1,250,000  137,000
all domains + “chinese”     97,700     77,100
all domains + “mandarin”    51,800     14,100

Google search results for “aiya” (Pinyin) and “aiyah” (spelling used in the OED)

domain                      aiya       aiyah
.sg                         17,600     8,310
.hk                         6,400      2,360
.cn                         13,200     1,860
.tw                         5,910      1,710
all domains                 3,370,000  332,000
all domains + “chinese”     238,000    63,200
all domains + “mandarin”    36,500     22,800
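To check the “factor of at least two” claim against the numbers in the two tables above, here is a small Python sketch. The figures are copied straight from the tables; the variable and function names are just for illustration.

```python
# (Pinyin spelling hits, OED spelling hits), copied from the tables above.
AIYO = {
    ".sg": (12_200, 5_680),
    ".hk": (2_570, 187),
    ".cn": (6_040, 984),
    ".tw": (4_690, 196),
    "all domains": (1_250_000, 137_000),
    'all domains + "chinese"': (97_700, 77_100),
    'all domains + "mandarin"': (51_800, 14_100),
}
AIYA = {
    ".sg": (17_600, 8_310),
    ".hk": (6_400, 2_360),
    ".cn": (13_200, 1_860),
    ".tw": (5_910, 1_710),
    "all domains": (3_370_000, 332_000),
    'all domains + "chinese"': (238_000, 63_200),
    'all domains + "mandarin"': (36_500, 22_800),
}


def ratios(table):
    """Map each search scope to (Pinyin hits) / (OED-spelling hits)."""
    return {scope: pinyin / oed for scope, (pinyin, oed) in table.items()}


def summarize(*tables):
    """Return (rows with ratio >= 2, total rows, smallest ratio)."""
    all_ratios = [r for table in tables for r in ratios(table).values()]
    return sum(r >= 2 for r in all_ratios), len(all_ratios), min(all_ratios)


if __name__ == "__main__":
    doubled, total, smallest = summarize(AIYO, AIYA)
    print(f"{doubled} of {total} rows favor Pinyin by 2x or more")
    print(f"smallest ratio: {smallest:.2f}")
```

By this count the Pinyin spelling leads in all 14 rows and by a factor of two or more in 12 of them. The two exceptions are “aiyo” with “chinese” and “aiya” with “mandarin.”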

Searching Google Books also reveals that the Pinyin forms are more common.

In short, I do not see any good reason for the OED to have adopted ad hoc spellings rather than the Pinyin standard. They must have their reasons, but it looks like they botched this.

Shanghai considers deleting Pinyin from street signs

The Shanghai Road Administration Bureau is considering removing Hanyu Pinyin from street signs in the city.

Typically, the bureau’s division chief, Wang Weifeng, seems to be confused about the difference between Pinyin and English. He also justifies the move by claiming that larger Chinese characters would benefit Chinese citizens, ignoring the high number of people in China who are largely illiterate.

“Of course we will keep the English-Chinese traffic signs around some special areas, such as the tourism spots, CBD areas and some transport hubs,” Wang said.

A German newspaper article notes:

Ob sie die Umschrift wortwörtlich „aus dem Verkehr“ zieht, will Schanghai angeblich von einer „Umfrage“ unter „Anwohnern“ abhängig machen, ebenso vom Urteil nicht näher genannter „Experten“. Dies ist eine gängige Formulierung, wenn chinesische Regierungsstellen ihren einsamen Entscheidungen einen basisdemokratischen Anstrich geben wollen.

[Translation: Whether it literally takes the romanization “out of circulation” is something Shanghai supposedly intends to make dependent on a “survey” of “residents,” as well as on the verdict of unspecified “experts.” This is a common formulation when Chinese government bodies want to give their unilateral decisions a veneer of grassroots democracy.]

This is a situation all too common in Taiwan as well, such as in Taipei’s misguided move to apply nicknumbering to subway stops. “Experts” — ha!

Shanghai’s survey on Pinyin use and signage is of course in Mandarin only, with no English. The poll ends on August 30 (next week!), so add your views to that soon.

So far, public opinion seems to be largely against removing Hanyu Pinyin from signs. But that doesn’t mean it won’t happen anyway. After all: Shanghai has its “experts” on the case. Heh.

If Shanghai really wanted to help the legibility of its signs, it should consider using word parsing even with text in Chinese characters. For example:

  • use 陕西 南路, not 陕西南路
  • use 斜土 路, not 斜土路
  • use 建国 西路, not 建国西路

That would also permit the use of superscript on the generic parts of names (e.g., “南路”) to save space. This could also be done with the Pinyin/English, with the Pinyin in large letters and the English “Rd” etc. in superscript.
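As a rough illustration of the word parsing suggested above, here is a Python sketch that splits a street name into its specific part and its generic part (an optional direction plus a word such as 路). The list of generic terms and the name parse_road_name are my own assumptions for the example, not anything Shanghai uses. It is also naive: a name like 山东路 would come out as “山 东路,” so a real system would need a list of known place names to settle such cases.

```python
import re

# Assumed generic endings: an optional direction character, then 路, 街, or 大道.
# The lazy ".+?" keeps the specific part as short as possible.
ROAD_NAME = re.compile(r"^(?P<specific>.+?)(?P<generic>[东西南北中]?(?:路|街|大道))$")


def parse_road_name(name):
    """Insert a space between the specific and generic parts of a street name."""
    match = ROAD_NAME.match(name)
    if match is None:
        return name  # no recognizable generic ending; leave it alone
    return "{} {}".format(match.group("specific"), match.group("generic"))


if __name__ == "__main__":
    for name in ("陕西南路", "斜土路", "建国西路"):
        print(parse_road_name(name))
```

With the two parts separated, a sign could set the generic part smaller or in superscript, as described above.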

Thanks to Michael Cannings for the tip.

Mind the line
break

Line breaks are an interesting but little-discussed aspect of typography. That’s a shame, because they can matter, especially in signage.

Book covers are another place where line breaks can matter. I’m especially concerned with those because I’m involved in a company that publishes books about Taiwan, China, and other places in East Asia. I wish I could take credit for Camphor Press’s book covers; alas, though, I have no talent in that area.

Here’s a good example of a line break making a difference in a sign. This ends up being not unlike a typographical crash blossom. I took this photo last week at a Costco in metropolitan Taipei.

sign in a Costco seafood section that reads 'HOKKAIDO COOKED HAIR [line break] CRAB'

For those who are curious, NT$987 is about US$29.60.

Anyway, here’s the Mandarin text:
北海道熟凍毛蟹(冷凍)
Běihǎidào shú dòng máoxiè (lěngdòng)

(I don’t know what that first “dòng” is doing there, given that this ends with “lěngdòng.”)

For maoxie, the ABC Chinese-English Dictionary gives “small crab; baby crab.” But I’m not sure that’s quite right.

If the translator had gone with the more common form of “hairy crab” instead of “hair crab,” the adjective would have alerted readers that they needed to keep going. On the other hand, use of another common translation, “mitten crab,” wouldn’t have helped much, though I suppose that

HOKKAIDO COOKED MITTEN
CRAB

is slightly more palatable sounding than

HOKKAIDO COOKED HAIR
CRAB

And at least they didn’t use the sometimes seen translation of “hair crabs,” which could conjure up altogether the wrong image.
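For signs laid out by software, one way to avoid this kind of break is to join the words that belong together with a non-breaking space before wrapping. Here is a minimal Python sketch using the standard textwrap module, which breaks only at ordinary ASCII whitespace. The 20-character line width is my guess at what the sign allowed, and break_lines is a made-up name.

```python
import textwrap

NBSP = "\u00a0"  # non-breaking space


def break_lines(text, width, keep_together=()):
    """Wrap text to width, never breaking inside the phrases in keep_together."""
    for phrase in keep_together:
        # Glue the phrase's words together so the wrapper sees one unit.
        text = text.replace(phrase, phrase.replace(" ", NBSP))
    lines = textwrap.wrap(text, width=width)
    # Turn the non-breaking spaces back into ordinary ones for display.
    return [line.replace(NBSP, " ") for line in lines]


if __name__ == "__main__":
    sign = "HOKKAIDO COOKED HAIR CRAB"
    print(break_lines(sign, 20))
    print(break_lines(sign, 20, keep_together=["HAIR CRAB"]))
```

The first call reproduces the break on the Costco sign; the second moves “HAIR” down to join “CRAB” on the second line.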

Languages, scripts, and signs: a walk around Taipei’s Shixin University

Recently I took some trails through the mountains in Taipei and ended up at Shih Hsin University (Shìxīn Dàxué / 世新大學). Near the school are some interesting signs. Rather than giving individual posts for each of these, I’m keeping the signs together in this one, as this is better testimony to the increasing and often playful diversity of languages and scripts in Taiwan.

Cǎo Chuàn

Here’s a restaurant whose name is given in Pinyin with tone marks! That’s quite a rarity here, though I suspect we’ll be seeing more of this in the future. The name in Chinese characters (草串) can be found, much smaller, on a separate sign below.

二哥の牛肉麵

Right by Cao Chuan is Èrgē de Niúròumiàn (Second Brother’s Beef Noodle Soup). Note the use of the Japanese の rather than Mandarin’s 的; this is quite common in Taiwan.

芭樂ㄟ店

This store has an ㄟ, which serves as a marker of the Taiwanese language. Here, ㄟ is the equivalent of 的 — and of の.

Bālè ei diàn

A’Woo Tea Bar

I couldn’t find a name in Chinese characters for this place. The name is probably onomatopoeia, as in “Werewolves of London — awoo!”

https://www.youtube.com/watch?v=iDpYBT0XyvA