updating Karlgren: a forthcoming reference book

The University of Hawai`i Press will be releasing another work in its groundbreaking ABC Chinese Dictionary Series, which is responsible for my favorite Mandarin-English dictionary, the Pinyin-ordered ABC Chinese-English Comprehensive Dictionary, edited by John DeFrancis.

The new work, which will be released in December 2008, is Minimal Old Chinese and Later Han Chinese: A Companion to Grammata Serica Recensa, by Axel Schuessler.

Here’s the publisher’s description:

Although long out of date, Bernard Karlgren’s Grammata Serica Recensa (1957) remains the most convenient work for looking up Middle Chinese (ca. A.D. 600) and Old Chinese (before 200 B.C.) reconstructions of all graphs that occur in literature from the beginning of writing (ca. 1250 B.C.) down to the third century B.C. In the present volume, Axel Schuessler provides a more current reconstruction of Old Chinese, limiting it, as far as possible, to those post-Karlgrenian phonological features of Old Chinese that enjoy some consensus among today’s investigators. At the same time, the updating of the material disregards more speculative theories and proposals. Schuessler refers to these minimal forms as “Minimal Old Chinese” (OCM). He bases OCM on Baxter’s 1992 reconstructions but with some changes, mostly notational. In keeping with its minimal aspect, the OCM forms are kept as simple as possible and transcribed in an equally simple notation. Some issues in Old Chinese phonology still await clarification; hence interpolations and proposals of limited currency appear in this update.

Karlgren’s Middle Chinese reconstructions, as emended by Li Fang-kuei, are widely cited as points of reference for historical forms of Chinese as well as dialects. This emended Middle Chinese is also supplied by Schuessler. Another important addition to Karlgren’s work is an intermediate layer midway between the Old and Middle Chinese periods known as “Later Han Chinese” (ca. second century A.D.). The additional layer makes this volume a useful resource for those working on Han sources, especially poetry.

This book is intended as a “companion” to the original Grammata Serica Recensa and therefore does not repeat other information provided there. Matters such as English glosses and references to the earliest occurrence of a graph can be looked up in Grammata Serica Recensa itself or in other relevant dictionaries. The great accomplishment of this companion volume is to update an essential reference and thereby fulfill the need for an accessible and user-friendly source for citing the various historically reconstructed stages of Chinese.

software for Shanghainese

Professor Qián Nǎiróng (Qian Nairong / 錢乃榮) of Shanghai University has just issued free software to help with the writing of Shanghainese (上海话). People may now download the 1.3 MB zip file of the program.

Some examples:

shanghe 上海
shanghehhehho 上海闲/言话(上海话)
whangpugang 黄浦江
suzouhhu 苏州河
shyti 事体(事情)
makshy 物事(东西)
bhakxiang 白相(玩)
dangbhang 打朋(开玩笑)
ghakbhangyhou 轧朋友(交朋友)
cakyhangxiang 出洋相(闹笑话,出丑)
linfhakqin 拎勿清(不能领会)
dhaojiangwhu 淘浆糊(混)
aoshaoxhin 拗造型(有意塑造姿态形象)
ghe 隑(靠)
kang 囥(藏)
yin 瀴(凉、冷)
dia 嗲
whakji 滑稽

The program offers two flavors of romanization. Here are some examples of the differences between the two styles:

New Folk | Old Timers | Hanzi (Mandarin gloss)
makshy | mekshy | 物事(东西)
bhakxiang | bhekxian | 白相(玩)
dangbhang | danbhan | 打朋(开玩笑)
ghakbhangyhou | ghakbhanyhou | 轧朋友(交朋友)
cakyhangxiang | cekyhanxian | 出洋相(闹笑话,出丑)
linfhakqin | linfhekqin | 拎勿清(不能领会)

Here’s a brief story on this:

These days, more and more people like to use Shanghainese when chatting online. But do you sometimes find that you don’t know how to type what you want to say, so that it comes out as neither one thing nor the other? Now a program that makes it easy to type Shanghainese has arrived.

After two years of work, Professor Qián Nǎiróng of Shanghai University’s Chinese department, together with his graduate students and collaborators, finally completed a Shanghainese input method this month. Notably, the input method comes in two versions, new and old, so that both older Shanghainese aged 45 and up and the younger generation of Shanghainese can find their own “way of typing.”

The keyboard is still the same 26 letters. Starting August 1, once you have downloaded the Shanghainese input method, you will be able to type authentic Shanghainese, for example by entering “linfhakqin” to get “līn wù qīng” or “dhaojiangwhu” to get “táo jiànghu.” Yesterday this reporter downloaded the software ahead of time. Following the instructions, the reporter tried typing the letters “laoselaosy” in full-spelling mode, and “lǎo sān lǎo sì” immediately appeared on the screen (a Shanghainese expression meaning “putting on the airs of an old hand, acting more seasoned than one is”).

Reportedly, because Shanghainese and Putonghua are pronounced differently, users will still need the help of the instructions when it comes to spelling. For instance, this reporter found that syllables whose initials and finals are the same as in Putonghua are still spelled as in Putonghua Pinyin, while those that differ use the input method’s own Shanghainese spellings. For example, the “chén” of “chénguāng” and the “tóu” of “huātou” are both pronounced with voiced initials, so in the Shanghainese input method the letter h must be added to the initial, giving “shen” and “dhou”; every entering-tone (rùshēng) syllable takes the letter k after its spelling, so that the “bái” of “báixiāng” is spelled bhek.

But no one should think this is too difficult. This reporter found that the biggest similarity between the Shanghainese input method and Putonghua input methods is that you simply type the initials and finals one after another, with no need to enter tones. In addition, the Shanghainese input system has a “smart” feature of sorts: words can be typed in abbreviated form.

Professor Qián Nǎiróng of Shanghai University’s Chinese department, who led the development of the input method, told this reporter that it can produce not only the more than 15,000 entries in the Shànghǎihuà dà cídiǎn but also commonly used words that have the same meaning as in Putonghua but are pronounced differently in Shanghainese. Examples include “whangpugang” for “Huángpǔ Jiāng” and “lixiang” for “lǐxiǎng”; these come to more than 10,000 entries in all.

separating Pinyin syllables: PHP code

A few weeks ago I had someone write to ask if I had a script that can divide Pinyin texts into their individual syllables. It so happens that I do have something that does just that. Since I sent out that bit of code, I might as well make it available to everyone (GNU GPL, and links back to Pinyin.Info are always appreciated).

It has lots of regular expressions, to make the code nice and compact. I’ve added comments for clarity.

##############################
### SEPARATE THE SYLLABLES
##############################
// In the lines below, \s means space
// This program assumes that ü is written as v
// The i at the end of a line means case insensitive
// \W is a single, non-word character (e.g., punctuation)

$search = array ("'([aeiouv])([^aeiounr\W\s])'i", // This line does most of the work
"'(\w)([csz]h)'i", // double-consonant initials
"'(n)([^aeiouvg\W\s])'i", // cleans up most n compounds
"'([aeiuov])([^aeiou\W\s])([aeiuov])'i", // assumes correct Pinyin (i.e., no missing apostrophes)
"'([aeiouv])(n)(g)([aeiouv])'i", // assumes correct Pinyin, i.e. changan = chan + gan
"'([gr])([^aeiou\W\s])'i", // fixes -ng and -r finals not followed by vowels
"'([^e\W\s])(r)'i", // r an initial, except in er
);

$replace = array ("\\1 \\2",
"\\1 \\2",
"\\1 \\2",
"\\1 \\2\\3",
"\\1\\2 \\3\\4",
"\\1 \\2",
"\\1 \\2",
);

$usertext = preg_replace($search, $replace, $document);

##############################
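
To see how the patterns interact, it can help to apply them one at a time. The following is a minimal sketch along those lines: the patterns and replacements are copied unchanged from the listing above, while the helper name traceSyllableSplit and the sample inputs are hypothetical, chosen only for illustration. It assumes PHP 7 or later.

```php
<?php
// Patterns and replacements copied unchanged from the listing above.
$search = array ("'([aeiouv])([^aeiounr\W\s])'i",
"'(\w)([csz]h)'i",
"'(n)([^aeiouvg\W\s])'i",
"'([aeiuov])([^aeiou\W\s])([aeiuov])'i",
"'([aeiouv])(n)(g)([aeiouv])'i",
"'([gr])([^aeiou\W\s])'i",
"'([^e\W\s])(r)'i",
);

$replace = array ("\\1 \\2",
"\\1 \\2",
"\\1 \\2",
"\\1 \\2\\3",
"\\1\\2 \\3\\4",
"\\1 \\2",
"\\1 \\2",
);

// Hypothetical helper (not part of the original script): applies the
// patterns in order, as preg_replace() does when given arrays, and
// records the text after each step so the effect of each pattern is visible.
function traceSyllableSplit(string $text, array $search, array $replace): array
{
    $steps = array($text); // element 0 is the untouched input
    foreach ($search as $i => $pattern) {
        $text = preg_replace($pattern, $replace[$i], $text);
        $steps[] = $text;
    }
    return $steps;
}

foreach (array('changan', 'zhongguoren', 'hanyu pinyin') as $sample) {
    $steps = traceSyllableSplit($sample, $search, $replace);
    echo $sample, ' => ', end($steps), "\n";
}
```

Run as written, this should print "changan => chan gan", "zhongguoren => zhong guo ren", and "hanyu pinyin => han yu pin yin". Inspecting the intermediate entries in $steps shows which pattern made each split; for "changan", nothing changes until the fifth pattern.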

Since I’m always going on about the need for word parsing and not separating Pinyin into single syllables, some of you are probably wondering just why I of all people would have ever written such code. The answer is that it’s part of my Pinyin spell-checker, which is only a very basic utility in that it functions by checking for theoretically correct groups of syllables rather than real words (i.e., anything composed of correctly spelled groups of syllables, minus tone marks, will pass even if that word isn’t found in a dictionary).
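
The kind of check described above can be sketched roughly as follows. This is not the spell-checker's actual code, which isn't shown here; it is a minimal illustration that assumes the text has already been separated into toneless syllables. The syllable list is a tiny hypothetical sample (a real checker would need the full inventory of valid Pinyin syllables), and the function name findInvalidSyllables is likewise made up for this sketch.

```php
<?php
// Hypothetical sample only: a real checker needs every valid toneless syllable.
$validSyllables = array('zhong', 'guo', 'ren', 'han', 'yu', 'pin', 'yin', 'chan', 'gan');

// Returns the syllables that are not in the list of valid syllables.
// The input is assumed to be already separated into syllables,
// with no tone marks and with ü written as v.
function findInvalidSyllables(string $separated, array $validSyllables): array
{
    $lookup = array_flip($validSyllables); // syllable => index, for fast isset() checks
    $bad = array();
    // Lowercase the text, then split on anything that is not a letter a-z.
    $syllables = preg_split('/[^a-z]+/', strtolower($separated), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($syllables as $syllable) {
        if (!isset($lookup[$syllable])) {
            $bad[] = $syllable;
        }
    }
    return $bad;
}

print_r(findInvalidSyllables('Zhong guo ren', $validSyllables)); // nothing flagged
print_r(findInvalidSyllables('zhong guo rne', $validSyllables)); // flags "rne"
print_r(findInvalidSyllables('ren pin guo', $validSyllables));   // nothing flagged
```

The last call shows the limitation mentioned above: "ren pin guo" passes because each piece is a well-formed syllable, even though the sequence was not checked against any dictionary of real words.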

Suggestions for improvements are always welcome.

Ovid Tzeng reiterates backing for Hanyu Pinyin

Earlier this week Ovid Tzeng, a former minister of education and current minister without portfolio, reaffirmed his support for Taiwan adopting Hanyu Pinyin and said that this is an important issue the government will need to deal with sooner or later.

During his tenure as minister of education, Ovid Tzeng (Zēng Zhìlǎng) insisted on adopting Hanyu Pinyin, which was one of the main reasons he was replaced. Yesterday he still held to that position, stressing that abroad, whether in Chinese-language teaching or in academic journals, many have already switched to Hanyu Pinyin, and that Taiwan cannot turn a blind eye to this. Although this is not among the new government’s top policy priorities, it has nonetheless been listed as a major matter for future review.

Most of the source article for this discusses poet and academic Zheng Chouyu’s backing for Hanyu Pinyin. He stresses his view that this is a practical matter, not a political one.

source: Zhèng Chóuyǔ jiànyì: Zhōngwén yìyīn kěyǐ cǎi Hànyǔ Pīnyīn (鄭愁予建議:中文譯音 可採漢語拼音), United Daily News, June 9, 2008

further reading: Hanyu Pinyin backer to return to Taiwan’s Cabinet, Pinyin News, April 29, 2008

Whither Taiwan’s English renamings?

Those working in the new administration of President Ma Ying-jeou (Mǎ Yīngjiǔ) are people with priorities. For example, they certainly didn’t waste any time removing the Chinese characters for “Taiwan” from the Web site of the presidential office, as this happened on his first day in office. On the other hand, they didn’t bother with other things, like having the current year be 2008 instead of “108.”

From a screen shot taken a couple of nights ago:
screenshot from the website of the Office of the President, showing that the date script *still* hasn't been fixed (with the year given as '108' instead of '2008')

From a screen shot taken about two-and-a-half years ago:
screenshot from the website of the Office of the President, showing the date as '106-01-02' for January 2, 2006

(FWIW, I told a meeting of government webmasters three years ago that the date script needed fixing — or, better still, deletion. Are they really under the impression that lots of people visit the presidential office’s Web site or that of any other Taiwan governmental agency to check the date and time?)
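
Both screenshots are consistent with a script that reports the year as an offset from 1900 (2008 - 1900 = 108; 2006 - 1900 = 106). That is only a guess about the cause, since the site's actual script isn't shown here. As a hedged illustration, the sketch below uses PHP's localtime(), whose tm_year field counts years since 1900, to show how such a value can arise and how asking for a four-digit year avoids it. The variable names and the fixed time zone are assumptions made for the example.

```php
<?php
// Fixed time zone so the example does not depend on server settings.
date_default_timezone_set('Asia/Taipei');

$timestamp = mktime(0, 0, 0, 6, 12, 2008); // June 12, 2008

// localtime() follows the C convention: tm_year is "years since 1900".
$parts = localtime($timestamp, true);
$rawYear = $parts['tm_year'];          // 108
$fixedYear = $parts['tm_year'] + 1900; // 2008

// tm_mon is zero-based, so 1 is added for display.
printf("unfixed: %d-%02d-%02d\n", $rawYear, $parts['tm_mon'] + 1, $parts['tm_mday']);

// date('Y') returns the full four-digit year directly.
printf("fixed:   %s\n", date('Y-m-d', $timestamp));
```

This prints "unfixed: 108-06-12" followed by "fixed:   2008-06-12".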

Also, given what the head of the ruling party recently said in the glorious motherland China, perhaps they might want to replace “Office of the President” with “Office of Mr. Ma.”

At any rate, how things are named is a concern of the current administration, just as it was for the previous one. I’ve given up trying to follow the twists and turns of the name of Revere the Bloody Dictator Shrine / Chiang Kai-shek Memorial Hall / Taiwan Democracy Memorial Hall. Someone let me know when the dust finally settles.

And then there’s the airport. The last time I was on a highway in Taoyuan I noticed that the signs that previously said “CKS Airport” had the “CKS” covered, so they read simply “Airport”. Maybe the new administration can live with that, regardless of what it does about the signage of the airport itself.

But what is to become of the official names that weren’t changed in Mandarin but only in English? Please note that I’m not talking about romanizations but about real English names. I’m referring to how the English names of several ministries and other government agencies were changed during President Chen Shui-bian’s two terms in office, though the Mandarin names remained the same.

For example:

Mandarin name | English name, pre-DPP | English name, current (March 2008)
Yuánzhùmín Wěiyuánhuì | Council of Aboriginal Affairs | Council of Indigenous Peoples
Guóyǔhuì | Mandarin Promotion Council | National Languages Committee
Zhōnghuá Mínguó Duìwài Màoyì Fāzhǎn Xiéhuì | China External Trade Development Council (CETRA) | Taiwan External Trade Development Council (TAITRA)
Qiáowù Wěiyuánhuì | Overseas Chinese Affairs Commission | Overseas Compatriot Affairs Commission

None of the above revised names have been revoked or changed as of today (June 12, 2008 — or 108-06-12, as the Presidential Office would have it).

What about the addresses of the Web sites of these ministries and agencies?

name | URL | comments
Council of Indigenous Peoples | www.apc.gov.tw | APC? According to someone I spoke with at the council, this stands for “Aboriginal People’s Commission” (or maybe “Aboriginal Peoples’ Commission”), a name that dates back to 1996. But I can’t find any search results for that name within .tw domains. Also, neither www.cip.gov.tw nor www.cip.gov.tw leads to anything. But lately the APC site has often been unresponsive. I mentioned to the council that they might want an updated URL; the person I spoke with said she’d look into it.
National Languages Committee | www.edu.tw/MANDR/ | This is under the Ministry of Education, which has changed the URL a few times over the years but has yet to revise the focus in the address on Mandarin (i.e., “MANDR”). Not even under the DPP was this address subject to rectification (zhèngmíng, 正名).
Taiwan External Trade Development Council (TAITRA) | www.taitra.org.tw | The old URL of www.cetra.org.tw leads to nothing, not even a redirect. www.taitra.com.tw mirrors the .org.tw address. This doesn’t have a .gov.tw address because it’s a semi-governmental organization.
Overseas Compatriot Affairs Commission | www.ocac.gov.tw | “Overseas Chinese Affairs Commission” and “Overseas Compatriot Affairs Commission” share the same abbreviation. One URL fits all.

Thus, so far the new English names have survived.

early Chinese tattoos

As my friend Tian of Hanzi Smatter continues to document, some people, Westerners especially, remain keen on having themselves tattooed with Chinese characters — even if they can’t read them. I doubt, though, that many are aware of China’s historical traditions in tattooing. As Carrie E. Reed notes in Early Chinese Tattoo (2.9 MB PDF), which is the latest reissue from Sino-Platonic Papers, “it appears that the practice of tattoo (other than the penal use) never achieved any level of general acceptance or widespread use among most parts of ancient Chinese society of any era.”

Yes, penal use: In early China tattooing was a common way of branding criminals. Often such tattoos were standard designs, such as circles. But sometimes they contained text.

Here’s something from Reed’s discussion of the Yuan dynasty’s legal code:

In the section on illicit sexual relationships we read that, in general, on the first offense the adulterous couple will be separated, but if they are “caught in the act” a second time, the man (it is not clear if the woman is tattooed as well) will be tattooed on the face with the words “committed licentious acts two times” (犯姦二度) and banished. Numerous examples are given to illustrate this type of punishment.

Reed examines and translates many texts describing tattoos.

Some of the terms encountered in these early texts are (with a literal translation given in parentheses) qing 黥 (to brand, tattoo), mo 墨 (to ink), ci qing 刺青 (to pierce [and make] blue-green), wen shen 文身 (to pattern the body), diao qing 雕青 (to carve and [make] blue-green), ju yan 沮顏 (to injure the countenance), wen mian 文面 (to pattern the face), li mian 剺面 (to cut the face), hua mian 畫面 (to mark the face), lou shen 鏤身 (to engrave the body), lou ti 鏤體 (same), xiu mian 繡面 (to embroider [or ornament] the face), ke nie 刻涅 (to cut [and] blacken), nie zi 涅字 (to blacken characters), ci zi 刺字 (to pierce characters), and so on. These terms are sometimes used together, and there are numerous further variations. In general, if the tattooing of characters (字) appears in the term, it refers to punishment, but this is certainly not true in every case. Likewise, if a term literally meaning “to ornament” or “decorate” is used, it does not necessarily mean that the tattoo was done voluntarily or for decorative purposes.

All of the types of tattoo, except perhaps for the figurative and textual, are usually described as inherently opprobrious; people bearing them are stigmatized as impure, defiled, shameful or uncivilized. There does not ever seem to have been a widespread acceptance of tattoo of any type by the “mainstream” society; this was inevitable, partly due to the early and long-lasting association of body marking with peoples perceived as barbaric, or with punishment and the inevitable subsequent ostracism from the society of law-abiding people. Another reason, of course, is the Confucian belief that the body of a filial person is meant to be maintained as it was given to one by one’s parents.

This was first published in June 2000 as issue no. 103 of Sino-Platonic Papers. Although the work contains no illustrations, it does feature copious translations of texts describing tattoos or relating tales about them.

Gaoxiong street signs

Sinle St

During an extremely brief trip a few weeks ago to Gāoxióng, Taiwan’s second-largest city, I was able to grab a few photos of signage there. Most of these were taken from a moving taxi; thus the poor quality and lack of much diversity. But these are the best I could do under the circumstances.

First, a few basic points:

  • they’re in Tongyong Pinyin (bleah — but at least they’re consistent)
  • they don’t use InTerCaPiTaLiZaTion (This lack is, of course, a good thing. If only Taipei hadn’t screwed this up!)
  • in most cases the text in romanization is large enough to read even at a distance (Very good, unlike all too many relatively recent signs elsewhere, such as those in Taipei County.)

In short, other than the choice of romanization most of these signs aren’t all that bad. They’re certainly much better (and more consistent) than the ones that Taipei County put up in Tongyong Pinyin a few years ago. (Although Taipei County’s current magistrate said more than two years ago that he was in favor of switching to Hanyu Pinyin, as far as I can see he has done absolutely nothing about this. Of course, some might say that he’s done absolutely nothing about anything; but I’ll leave discussion of that to the political blogs.)

Here’s another Gāoxióng sign with romanization that isn’t too small.
Dacheng St.

I’m not a fan of the practice of force-justifying the Chinese characters and romanization/English to the same width. This style can be seen in many of these signs. Sometimes this results in the romanized/English words being spaced too far apart; more often, though, the Chinese characters are left with lots of space between them — so much space that it would be easy to have spaces indicate word divisions for the texts in Hanzi (something Y.R. Chao recommended nearly a century ago), which might be an interesting thing to try on signs. I wonder if anyone has ever performed any experiments on this.

The full Mandarin name of the school indicated by the blue sign on the left is rather long:

Gāoxióng shìlì Gāoxióng nǚzǐ gāojí zhōngxué
(高雄市立高雄女子高級中學)

Whoever made the sign wisely decided to cut that down to 高雄女中 (Gāoxióng nǚ zhōng). If only someone had realized that it would have been better to use something shorter than the full English name, too. “Kaohsiung Municipal Girls’ Senior High School” is a lot to fit on one small sign. “Kaohsiung Girls’ High School”, “Girls’ Municipal High School”, or something even shorter would have been much better.

Here are some more signs.

And finally an address plate on a building. This style could certainly be better.
Dayi St.