variant Chinese characters and Unicode

A submission to the Unicode Consortium’s Ideographic [sic] Variation Database for the “Combined registration of the Adobe-Japan1 collection and of sequences in that collection” is available for review through November 25. This submission, PRI 108, is a revision of PRI 98.

This set “enumerates 23,058 glyphs” and contains 14,664 tetragraphs (Chinese characters / kanji). About three quarters of Unicode pertains to Chinese characters.

Two sets of charts are available: the complete one (4.4 MB PDF), which shows all the submitted sequences, and the partial one (776 KB PDF), which shows “only the characters for which multiple sequences are submitted.”
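For readers who have not met variation sequences before, here is a rough Python sketch of what one looks like at the code-point level. It relies only on the fact that an ideographic variation sequence is a base character followed by a selector from the Variation Selectors Supplement block (U+E0100 through U+E01EF). The function names are my own, and the base character 葛 (U+845B) and the selector index are picked purely for illustration; whether any particular sequence is actually registered has to be checked against the database itself.

```python
# Variation Selectors Supplement: VS17 (U+E0100) through VS256 (U+E01EF).
IVS_FIRST = 0xE0100
IVS_LAST = 0xE01EF


def make_variation_sequence(base_char, selector_index):
    """Return base_char followed by a supplementary variation selector.

    selector_index 0 maps to U+E0100, 1 to U+E0101, and so on.
    This only builds the sequence; it does not check registration.
    """
    code_point = IVS_FIRST + selector_index
    if not IVS_FIRST <= code_point <= IVS_LAST:
        raise ValueError("selector index out of range")
    return base_char + chr(code_point)


def split_variation_sequences(text):
    """Split text into (character, selector code point or None) pairs."""
    pairs = []
    for char in text:
        code_point = ord(char)
        is_selector = IVS_FIRST <= code_point <= IVS_LAST
        if is_selector and pairs and pairs[-1][1] is None:
            # Attach the selector to the character just before it.
            pairs[-1] = (pairs[-1][0], code_point)
        else:
            pairs.append((char, None))
    return pairs


sequence = make_variation_sequence("葛", 1)  # illustrative choice only
print([hex(ord(c)) for c in sequence])
print(split_variation_sequences("a" + sequence))
```

Software that knows nothing about variation sequences should simply show the base character, which is why the selector is tacked on after it rather than replacing it.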

Below is a more or less random sample of some of the tetragraphs.

Initially I was going to combine this announcement with a rant against Unicode’s continued misuse of the term “ideographic.” But I’ve decided to save that for a separate post.

sample image of some of the kanji variants in the proposal

MOI and Tongyong Pinyin: update

I have spent many hours over the past few days trying to find out exactly what is behind the recent news story about the Ministry of Education and moves to expand Tongyong Pinyin by the end of the year.

I have sent out no fewer than five e-mail messages to various government officials but have received no responses. I have also made more than a dozen phone calls to various ministries and government-information lines. But nobody I spoke with knows what is going on. My wife helped by making some calls on her own. She was eventually able to get through to someone at the Ministry of the Interior who does have a clue about all this.

Here is basically what is happening.

On October 30, Taiwan’s Ministry of the Interior promulgated the government’s guidelines for writing place names (including not just town names but physical features, such as rivers, mountains, temples, bridges, etc.) in English and romanization: Yùgào dìngdìng “biāozhǔn dìmíng yì xiě zhǔnzé” (預告訂定「標準地名譯寫準則」) (MS Word document).

Most of the pages of this document are simply a list of townships and districts throughout Taiwan, as given in Tongyong Pinyin. But it also contains a few pages of general guidelines. Local governments and interested individuals (yes, that could include you, o reader) who wish to comment on these guidelines may do so before the deadline of Thursday, November 8. The question of Tongyong Pinyin vs. Hanyu Pinyin, however, is supposedly off the table, as the Ministry of the Interior must follow the administration in this — though I encourage anyone who writes the ministry to bring up the issue anyway. I will post contact information as soon as I get it.

To return to the matter of the promulgated document, these are the guidelines that Taiwan’s local governments are ordered to use, with local governments’ offices of land administration compiling lists of place names to be standardized within their jurisdictions and submitting these lists to the MOI’s Department of Land Administration (dìzhèng sī fāngyù kē / 地政司方域科).

If local governments reject Tongyong Pinyin and use a different romanization system, the MOI does not have the authority to compel them to switch to Tongyong. But the central government can and almost certainly will exert pressure on them to toe the line.

Making matters worse for advocates of Hanyu Pinyin, the international standard romanization system for Mandarin, is the fact that many local officials — even in “blue” regions — do not believe they have autonomy in this matter, as I know from having spoken with several of them about precisely this topic. Nor, unsurprisingly, do they take the word of a foreigner over what they “know” to be “correct”: that they must use Tongyong whether they like it or not. As an example, the city of Jilong (“Keelung”), which is controlled by the anti-Tongyong “blues,” instituted a plan to standardize street names there with Tongyong Pinyin. Nor will most officials bother to look up the rule they are supposedly following — and which, BTW, I can’t show them because it doesn’t exist.

The recently promulgated proposal has extremely limited guidelines. These are most certainly inferior to the fuller guidelines for Hanyu Pinyin — to say nothing of the book-length supplementary guidelines for Hanyu Pinyin (Chinese Romanization: Pronunciation and Orthography and the Xinhua Pinxie Cidian) and carefully produced dictionaries in Hanyu Pinyin.

Probably the best thing I could say about the guidelines is a negative: At least they didn’t adopt Taipei’s StuPid, StuPid PolICy Of InTerCapITaLiZaTion.

The problem that is likely to affect more names than other deficiencies — other than the fundamental matter of Tongyong Pinyin, that is — is the recommended use of the hyphen. Basically, the guidelines call for a hyphen where Hanyu Pinyin would use an apostrophe: before any syllable that begins with a, e, or o, unless that syllable comes at the beginning of a word or immediately follows a hyphen or other dash.

The reason that is a big problem, beyond the failure to follow the standard of Hanyu Pinyin, is that hyphens cannot then be put to the good use they have in Hanyu Pinyin. Hyphens are often needed in signage because they are used in short forms of proper nouns. For example, the correct short form of Taiwan Daxue (National Taiwan University) is “Tai-Da.”

Hyphens can thus help clarify names a great deal because they often indicate an abbreviation. Consider bridge names, in which the hyphen helps indicate the reason for the name:

  • not Huazhong but Hua-Zhong (for [Wan]hua to Zhong[he])
  • not Huajiang but Hua-Jiang (for [Wan]hua to Jiang[zicui])

Or the case given in the guidelines of 嘉南大圳. The recommendation there is for “Jianan dazun.” But giving “Jia-Nan” instead of “Jianan” would help clarify that this is something in Jiayi and Tainan counties.

The government guidelines’ failure to employ the hyphen in the same manner as Hanyu Pinyin is a major deficiency.
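To make the difference concrete, here is a minimal Python sketch of the joining rule as I have described it above: a separator goes before any non-initial syllable beginning with a, e, or o, and the only thing that changes between the two styles is whether that separator is a hyphen or an apostrophe. The function names are mine, not anything from the guidelines, and the sketch assumes the syllables are already romanized, toneless, and split up by hand. It also ignores the "immediately follows a hyphen or other dash" case.

```python
def join_syllables(syllables, separator):
    """Join toneless romanized syllables into one capitalized word.

    The separator is inserted before every non-initial syllable
    that begins with a, e, or o.
    """
    parts = []
    for index, syllable in enumerate(syllables):
        if index > 0 and syllable[:1].lower() in ("a", "e", "o"):
            parts.append(separator)
        parts.append(syllable.lower())
    word = "".join(parts)
    return word[:1].upper() + word[1:]


def guideline_style(syllables):
    """Hyphen as separator, as in the MOI draft guidelines."""
    return join_syllables(syllables, "-")


def hanyu_pinyin_style(syllables):
    """Apostrophe as separator, as in Hanyu Pinyin."""
    return join_syllables(syllables, "'")


print(guideline_style(["ren", "ai"]))       # Ren-ai
print(hanyu_pinyin_style(["ren", "ai"]))    # Ren'ai
print(guideline_style(["ci", "li", "an"]))  # Cili-an
print(guideline_style(["jia", "nan"]))      # Jianan
```

Note what happens with the last example: because the hyphen has been spent on syllable boundaries, nothing in "Jianan" can signal that the name abbreviates two place names, which is exactly the job "Jia-Nan" would do.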

Taiwan should have Tongyong Pinyin’s orthography follow the well-established guidelines for Hanyu Pinyin. But the administration’s petty difference-for-the-sake-of-difference policy will likely rule out that course.

more on Taiwan’s new Tongyong move

This morning all three of Taiwan’s English-language newspapers ran the AP story on the Ministry of the Interior’s plan to expand the use of Tongyong Pinyin. (Bonus points to the copy editor at the Taipei Times who changed the original article’s sloppy “Taiwan will standardize the English transliterations of its Chinese Mandarin place names by the end of the year” to “The Romanization of Mandarin place names will be standardized by the end of this year.”)

I have made a few calls about this, but to little effect so far. Unfortunately, I haven’t had the time today to track down someone at the Ministry of the Interior who can give some definitive information about this.

Meanwhile, here’s another article. It gives a little more information: no intercapping (good), hyphens instead of apostrophes (bad), some screwed-up word parsing (bad).

But all of this sounds like old news. How this will be any different in implementation is still unclear.

Wàijí rénshì lái Táiwān gōngzuò huò lǚyóu, zǒng bèi Táiwān de dìmíng yì xiě gǎo de “wù shàsha,” jiéjú cháng yǐ mílù shōuchǎng. Nèizhèngbù 30 rì gōng bù “biāozhǔn dìmíng yì xiě zhǔnzé” cǎo’àn, míng dìng dìmíng yì xiě yǐ “yīnyì” wèi yuánzé, bìng cǎi “Tōngyòng Pīnyīn” wèi jīzhǔn, ruò dìmíng yǒu lìshǐ, yǔyán, guójì guànyòng, shùzì děng tèxìng, zé yǐ dìmíng xìngzhì fānyì, rú Rìyuè Tán yì wéi “Sun Moon Lake;” 306 gāodì yì wéi “Highland 306.”

Gāi cǎo’àn shì yījù “guótǔ cèhuì fǎ” dìngdìng, bìng nàrù Jiàoyùbù zhìdìng de “Zhōngwén yìyīn shǐyòng yuánzé” zuòwéi yì xiě biāozhǔn, dìmíng yì xiě fāngshì yóu dìmíng zhǔguǎn jīguān zìxíng juédìng.

Cǎo’àn zhǐchū, wèi bìmiǎn yì xiě zhě duì wényì rènzhī bùtóng, chǎnshēng yì xiě chāyì, tǒngyī xíngzhèng qūyù de biāozhǔn yì xiě fāngshì, shěng “Province,” shì “City,” xiàn “County,” xiāng-zhèn “Township,” qū “District,” cūnli “Village.” Jiēdào míngchēng yě tǒngyī yì xiě, dàdào “Boulevard,” lù “Road,” jiē “Street,” xiàng “lane,” nòng “Alley.” Lìrú Kǎidágélán Dàdào wéi “Kaidagelan Boulevard.”

Cǎo’àn míng dìng, biāozhǔn dìmíng de yì xiě cǎi tōngyòng pīnyīn, dàn dìmíng hányǒu “shǔxìng míngchēng” shí, yǐ shǔxìng míngchēng yìyì fāngshì yì xiě, rú Dōng Fēng zhíyì wéi “East Peak.”

Ruò shǔxìng míngchēng yǔ biāozhǔn dìmíng zhěngtǐ shìwéi yī ge zhuānyǒu míngchēng shí, bù lìng yǐ yìyì fāngshì fēnkāi yì xiě, rú “Jiā-Nán dà zùn [zhèn?]” yì wéi “Jianan dazun;” Yángmíng Shān yì wéi “Yangmingshan;” Zhúzi Hú yì wéi “Jhuzihhu.”

Lìngwài dìmíng yǒu dāngdì lìshǐ, yǔyán, fēngsúxíguàn, zōngjiào xìnyǎng, guójì guànyòng huò qítā tèshū yuányīn, jīng zhǔguǎn jīguān bào zhōngyāng zhǔguǎn jīguān hédìng hòu, bù shòu “shǔxìng míngchēng” xiànzhì, rú Yù Shān zhíyì wéi Jade Mountain; zhōngyāng shānmài yì wéi “Central Mountains.”

Cǎo’àn guīdìng, biāozhǔn dìmíng yì xiě shūxiě fāngshì, dì-yī ge zìmǔ dàxiě, qíyú zìmǔ xiǎoxiě, rú bǎnqiáo yì wéi “Banciao,” ér fēi “Ban Ciao” huò “Ban-ciao.” Dàn dìmíng de dì-yī ge zì yǐhòu de pīnyīn zìmǔ, chūxiàn a, o, e shí, yǔ qián dānzì jiān yǐ duǎnxiàn liánjiē, rú Qīlǐ’àn yì wéi “Cili-an,” Rén’ài Xiāng wéi “Ren-ai Township.”

Cǐwài, cǎo’àn yě tǒngyī zìrán dìlǐ shítǐ shǔxìng míngchēng, rú píngyuán, péndì, dǎoyǔ, qúndǎo, liè yǔ, jiāo, tān, shāzhōu, jiǎjiǎo, shān, shānmài, fēng, hé xī, hú, tán děng shíwǔ zhǒng yì xiě fāngshì. Lìrú, Dōngshā Qúndǎo yì wéi “Dongsha Islands;” Diàoyútái liè yǔ “Diaoyutai Archipelago;” Běiwèi Tān “Beiwei Bank;” “Ālǐ Shān shānmài” yì wéi “Alishan Mountains;” zhǔfēng yì wéi “Main Peak;” Shānhútán zhíyì wéi “Shanhu Pond.”

source: Yīngyì yǒu “zhǔn” — lǎowài zhǎo lù bùzài wù shàsha (英譯有「準」 老外找路不再霧煞煞), China Times, October 31, 2007

Taiwan to expand use of Tongyong Pinyin?

The Associated Press is reporting what appears to be an expansion of the Taiwan government’s monumentally misguided promotion of its Tongyong Pinyin romanization system.

No one is answering the phones at the Ministry of the Interior now, and I haven’t been able to find out more information on the Web site yet. But I’ll be following this closely.

The story follows, with a few of my notes in brackets.

Taiwan will standardize the English transliterations of its Chinese Mandarin place names by the end of the year, an official said Wednesday, after years of confusion stemming from multiple spellings.

An official from the Ministry of Interior said the island would use the locally developed “Tongyong,” system in its transliterations, rejecting use of mainland China’s [Hanyu] Pinyin system, and the once common Wade-Giles system, introduced by two Englishmen in the late 19th century.

Over the past decade [Hanyu] Pinyin has gained wide acceptance among foreign students of Chinese, even as Wade-Giles and other foreign systems have diminished in importance.

Taiwan’s Tongyong system is virtually unknown outside the island.

But the Interior Ministry official insisted that Tongyong was still a good choice for a standard transliteration system.

“In the past, diverse spellings have caused confusion, so we have decided to remedy the situation,” he said, speaking on condition of anonymity because he was not authorized to talk to the press.

Multiple transliterations of place names have often caused confusion for non-Chinese-literate visitors to Taiwan.

For example, a busy shopping street in Taipei is variously rendered as Chunghsiao [in bastardized Wade-Giles — but no official signs on this street in Taipei use this system], Zhongxiao [in Hanyu Pinyin] and Jhongsiao [in Tongyong Pinyin — but no official signs on this street in Taipei use this system].

According to Ministry of Interior’s Web site, exceptions to the Tongyong system will still be allowed for some well known tourist attractions, including Jade Mountain in central Taiwan and Taipei’s Yangmingshan [Yangmingshan is the same in Tongyong Pinyin and Hanyu Pinyin, though it is properly written Yangming Shan].

source: Taiwan to standarize English [sic] spellings of place names, AP, via the International Herald Tribune, October 31, 2007

Questions on the origin of writing: SPP 26

a cross potent, which looks like a plus sign with perpendicular stems on the end of each of the four lines, but not so long as to make a cross in a square; image copied from Wikipedia

Sino-Platonic Papers has rereleased another issue related to the history of writing: Questions on the Origins of Writing Raised by the Silk Road (1.0 MB PDF), by Jao Tsung-i (Ráo Zōngyí, 饒宗頤) of the Chinese University of Hong Kong.

This work focuses especially on the use of two symbols, shown at right, in China and elsewhere.

This is issue no. 26 of Sino-Platonic Papers. It was first published in September 1991.

The Tao of semiotics. Zen and etymology.

Sino-Platonic Papers has rereleased for free Tracks of the Tao, Semantics of Zen (950 KB PDF), by Victor H. Mair.

After a brief introduction, Mair, who has translated more than one classic Taoist text, asks, “How did Tao and Zen enter our vocabulary? And what do these two extraordinarily powerful words really mean?”

He then enters into a “somewhat lengthy excursion into the neglected realm of philology” but keeps to his word to “try to make it as painless and entertaining as possible.”

It’s a fascinating and wide-ranging essay, especially for those interested in historical linguistics.

This is issue no. 17 of Sino-Platonic Papers. It was originally released in April 1990.

Street names in English translation: trend or error?

Taipei street sign reading '園區街 Park St.'

Ah, Park Street: Taipei’s lovely tree-lined boulevard next to a wonderful oasis of well-manicured nature.

Nope.

Here, “park” refers to Nangang Software Park (Nángǎng Ruǎntǐ Yuánqū, 南港軟體園區), an area in eastern Taipei of new buildings housing mainly software-development and biomedical companies. The software park itself is a pretty nice place and looks fine; its surrounding area, however, is anything but green and leafy, comprising mainly dreary brick buildings and vacant lots.

But what’s odder than the name itself is that it appears in English rather than in the mix of Hanyu Pinyin (with StuPid, StuPid InTerCapITaLiZaTion) and English (e.g., St., Rd.) that has become standard in Taipei. Also odd is that the signs at one end of the street read “Park St.” but those at the other end read “YuanQu St.” This is a fairly new street name, as the software park is only a few years old.

Taipei street sign reading '園區街 YuanQu St.'

The flash on my camera helps reveal that the part of the sign reading “YuanQu St.” is pasted on top of something else — quite possibly “Park St.”

I spent about 15 minutes today getting my phone call to the Taipei City Government transferred from one desk to another before I was able to speak with someone who knew what she was doing. She stated that the Park Street version is in error and would be corrected to Yuanqu Street.

I really wish I’d asked for her extension number, because I’m certain to be making similar calls in the future.

Tailingua.com: an introduction to Taiwanese

My friend Michael Cannings has just unveiled his new Web site on the Taiwanese language, Tailingua. Here is how he introduces it:

Taiwanese is a Chinese language spoken by two-thirds of the population of Taiwan. It forms one dialect of the group known as Southern Min, which has a total of around forty-nine million native speakers, making it the twenty-first most widely-spoken language in the world.

However, there is very little information in English available on the internet (or in print, for that matter) about Southern Min in general, and Taiwanese in particular – a lack that Tailingua is designed to remedy, at least in part.

The site provides concise summaries of romanization and other methods for writing Taiwanese. It also offers fonts, input methods, a list of useful books, and more.

A very promising beginning!