Penghu street signs

My wife and I recently spent a weekend in Penghu, a beautiful, stark archipelago between the main island of Taiwan and China.

Since Penghu is under KMT rule, I expected to find street signs in Magong, the capital, in some old system (e.g., MPS2 or perhaps bastardized Wade-Giles) or perhaps even Hanyu Pinyin. (Highway signs, however, are a different matter. They’re put up by the central government, which means that relatively recent ones are in Tongyong Pinyin, regardless of which party might control the area.)

This first street sign, however, is unmistakably in Tongyong Pinyin, giving “Wunsyue” (for what in Hanyu Pinyin would be “Wenxue”).
street sign reading 'Wunsyue Rd.' (Wenxue Road)

But I looked around some more and saw signs in Hanyu Pinyin, such as “Huimin” for what in Tongyong would be “Hueimin” and “Hui[’]an” for what in Tongyong would be “Huei[-]an.”
street sign reading 'Huimin Road'

street sign reading 'Huian first Road'

So were there some signs in Hanyu Pinyin after all? Apparently only coincidentally. The previous two hui signs were probably just a mistake, the result of Taiwan’s standard, sloppy chabuduo jiu keyi approach to signage. Here’s a sign on the same street as above; but in this case “惠” is romanized huei and not hui. (And “first” is missing, from both the Hanzi and romanization.)
street sign reading 'Hueian Rd.'

Most signs were in Tongyong, such as these. (Note that Penghu, too, has a Hot Milk Road.)
street signs: 'Jhongjheng Road' (Zhongzheng Road) and 'Renai Road' (Ren'ai Road)

So, Tongyong after all. Well, at least they don’t have InTerCaPiTaLiZaTion … or do they?
street signs reading 'JhongShan Rd.' -- note InTerCaPiTaLiZaTion -- (Zhongshan Road) and 'Jhongjheng Rd.' (Zhongzheng Road) -- no intercapping

Fortunately, that sign was a one-off. I didn’t spot InTerCaPiTaLiZaTion elsewhere. Here’s another sign from the same road:
street sign reading 'Jhongshan Rd.' (Zhongshan Road)

So, in short, Penghu’s street signs are in Tongyong Pinyin, but with plenty of mistakes and inconsistencies (e.g., missing apostrophes/hyphens, “first” rather than “1st”, and both “Road” and “Rd.”). It’s especially ridiculous that KMT-administered Penghu bothered with Tongyong, given that it was free to adopt Hanyu Pinyin. Now it’s going to have to change its signs over to Hanyu Pinyin. But some of the signs would need to be updated anyway, since many already show signs of age, with letters missing. (My guess is that Penghu put up such low-quality signs that in the annual windy season some of the letters just get blown away.)

Here’s a sign in little danger of having its writing blow away any time soon. This is what a much older Magong street sign looks like. Note that it must be read from right to left: 復國路 (Fuguo Road — “Recover Atlantis the Lost Country Road”).
old concrete street sign reading, right to left, '復國路' (Fuguo Road)

Finally, here’s something that isn’t a street sign at all. But it is nonetheless a sign of historic importance, since it’s a stela that commemorates the Ming Chinese official Shen Yourong telling the red-haired barbarians (i.e., Westerners — in this case, the Dutch) to get the hell out of Penghu. (The Dutch were told they could instead go to Taiwan, since back then China didn’t care about it in the least.) The composite photo shows both the 400-year-old stone original and a modern reproduction in wood.

photos of the original stone stela and a modern reproduction in wood

The text reads “Shěn Yǒuróng yù tuì hóngmáo fān[zi] Wéimálàng děng” (「沈有容諭退紅毛番韋麻郎等」): “Shen Yourong orders the red-haired foreigners under [Dutch commander] Wybrand van Warwijck to withdraw.”

gov’t unveils online Taiwanese dictionary

Taiwan’s Ministry of Education has put online its new Taiwanese (Hoklo) dictionary, the Táiwān Mǐnnányǔ chángyòngcí cídiǎn (giving the Mandarin name) (臺灣閩南語常用詞辭典). The preliminary version, which is to be amended in six months, contains 16,000 entries.

I especially welcome the section on Taiwan place-names.

further reading: MOE launches first Hoklo-language online dictionary, Taipei Times, October 20, 2008 [Note: The headline’s use of “first” is almost certainly incorrect.]

convert Chinese characters to Unicode character references: javascript

I’ve had a spate of requests recently for the code for Pinyin.info’s tool that converts Chinese characters to Unicode numeric character references (i.e., something that converts, say, “漢語拼音” into “&#28450;&#35486;&#25340;&#38899;”). Since I’m a believer in open-source work, and since people could find the code anyway if they looked carefully enough in the Web page’s source code, I might as well publish it.

This tool can be very handy when making Web pages that use a variety of scripts. (It works on Cyrillic, etc., as well.) I often employ it myself.

Here’s the heart of the code:


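function convertToEntities() {
  // Read the text to convert from the form's "unicode" field.
  var tstr = document.form.unicode.value;
  var bstr = '';
  // Declare the counter locally so it does not leak into the global scope.
  for (var i = 0; i < tstr.length; i++) {
    if (tstr.charCodeAt(i) > 127) {
      // Non-ASCII: emit a decimal numeric character reference.
      bstr += '&#' + tstr.charCodeAt(i) + ';';
    } else {
      // ASCII passes through unchanged.
      bstr += tstr.charAt(i);
    }
  }
  // Write the result to the form's "entity" field.
  document.form.entity.value = bstr;
}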

This sleek little bit of JavaScript is originally by Steve Minutillo and used here on Pinyin.info with his permission. I may have tweaked the code a little myself; but that was so long ago I don’t remember well. (I’ve had the converter here for about five years.) Anyway, if you use this please acknowledge Steve’s authorship; and of course I always greatly appreciate links back to Pinyin.info.
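
The function above is tied to a particular form and works one UTF-16 code unit at a time, so a character outside the Basic Multilingual Plane (some rare Hanzi, for instance) would come out as two surrogate references rather than one. Here is a minimal sketch of a standalone variant that takes a string and returns a string. The name toNumericReferences is my own invention for illustration, not part of Steve's code, and the sketch assumes a browser or Node version recent enough to support for...of and codePointAt.

```javascript
// Hypothetical helper (not part of the original tool): converts a string
// directly instead of reading from and writing to form fields.
function toNumericReferences(text) {
  var out = '';
  // for...of walks the string by code point, so a character outside the
  // Basic Multilingual Plane is handled as one unit, not two surrogates.
  for (var ch of text) {
    var codePoint = ch.codePointAt(0);
    // ASCII passes through; everything else becomes a decimal reference.
    out += codePoint > 127 ? '&#' + codePoint + ';' : ch;
  }
  return out;
}
```

For example, toNumericReferences('Hanyu Pinyin: 漢語拼音') returns 'Hanyu Pinyin: &#28450;&#35486;&#25340;&#38899;'.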

If anyone knows how to do the same thing in PHP, preferably with no more code than used above, please let me know.

See also: separating Pinyin syllables: PHP code.

Hanyu Pinyin and common nouns: the rules

cover of Chinese Romanization: Pronunciation and Orthography

I’ve just added another long section of Yin Binyong’s book on the detailed rules for Hanyu Pinyin. This part (pp. 78-138) covers common nouns (2.4 MB PDF).

I should have mentioned earlier that this book isn’t useful just for those who want to know more about Pinyin. It can also serve as an excellent work for those learning Mandarin, since it tends to group like ideas together and gives many examples of how combinations form other words.

All that, and it’s absolutely free. So go ahead and download it now.

Here are the main divisions:

  1. Introduction
  2. Simple Nouns
  3. Nouns with Prefixes
  4. Nouns with Suffixes
  5. Reduplicated Nouns
  6. Nouns of Modifier-Modified Construction
  7. Nouns of Coordinate Construction
  8. Nouns of Verb-Object and Subject-Predicate Construction
  9. Locational Nouns
  10. Nouns of Time
  11. Noun Phrases that Express a Single Concept

John DeFrancis video

John DeFrancis

Ten years ago John DeFrancis was awarded the Chinese Language Teachers Association’s first lifetime achievement award. Since he could not be present at the association’s annual conference to receive the award, he sent a videotape of a 12-minute acceptance speech. The video was recently edited down to 6:27 and uploaded to YouTube: John DeFrancis remarks.

Here’s my summary of the main points:

0:00 — While working on what he intended to be a largely political study of Chinese nationalism, DeFrancis encountered references to people who wanted China to adopt an alphabetic writing system, an idea which he initially dismissed. But discovering Lu Xun’s interest in romanization led him to investigate the matter further. [I’m frustrated by the cut away from this discussion. Perhaps a fuller version of the video will be posted later.]
1:30 — Emphasizes he’s not in favor of completely abandoning Chinese characters. Rather, he favors digraphia.
2:30 — “I’d like to mention three aspects of the Chinese field which have interested me.”

  1. pedagogy (2:50) — lots of advancements
  2. linguistic aspect (3:20) — that’s also progressing well
  3. socio-linguistics (3:52) — the field isn’t doing as well as it should be

5:00 — computers and Chinese characters. DeFrancis tears into the Chinese government for its emphasis on shape-based character-input methods rather than Pinyin.

Ma administration still undecided on how to teach Taiwanese

Under the new administration of President Ma Ying-jeou, Taiwan’s Ministry of Education has worked out its plan for teaching pretty much everything … except for Hoklo (the language better known in these parts as “Taiwanese”). There have been a lot of arguments. How early to start teaching the language? How much should be taught? Use romanization? Use zhuyin? May teachers use any kind of soap or only special kinds when washing out the mouths of students speaking the language? (OK, they don’t do that last one anymore.)

So the ministry has decided to appoint a new committee to review such questions. Decisions on these issues are expected in six months or so.

My guess would be that the ministry is going to pack the new committee with conservatives who will see to it that romanization is avoided or at least belittled, that little of the language will actually be taught, and that students will not be tested seriously on the subject. But I’ll be happy if I’m wrong.

updating Karlgren: a forthcoming reference book

The University of Hawai`i Press will be releasing another work in its groundbreaking ABC Chinese Dictionary Series, which is responsible for my favorite Mandarin-English dictionary, the Pinyin-ordered ABC Chinese-English Comprehensive Dictionary, edited by John DeFrancis.

The new work, which will be released in December 2008, is Minimal Old Chinese and Later Han Chinese: A Companion to Grammata Serica Recensa, by Axel Schuessler.

Here’s the publisher’s description:

Although long out of date, Bernhard Karlgren’s Grammata Serica Recensa (1957) remains the most convenient work for looking up Middle Chinese (ca. A.D. 600) and Old Chinese (before 200 B.C.) reconstructions of all graphs that occur in literature from the beginning of writing (ca. 1250 B.C.) down to the third century B.C. In the present volume, Axel Schuessler provides a more current reconstruction of Old Chinese, limiting it, as far as possible, to those post-Karlgrenian phonological features of Old Chinese that enjoy some consensus among today’s investigators. At the same time, the updating of the material disregards more speculative theories and proposals. Schuessler refers to these minimal forms as “Minimal Old Chinese” (OCM). He bases OCM on Baxter’s 1992 reconstructions but with some changes, mostly notational. In keeping with its minimal aspect, the OCM forms are kept as simple as possible and transcribed in an equally simple notation. Some issues in Old Chinese phonology still await clarification; hence interpolations and proposals of limited currency appear in this update.

Karlgren’s Middle Chinese reconstructions, as emended by Li Fang-kuei, are widely cited as points of reference for historical forms of Chinese as well as dialects. This emended Middle Chinese is also supplied by Schuessler. Another important addition to Karlgren’s work is an intermediate layer midway between the Old and Middle Chinese periods known as “Later Han Chinese” (ca. second century A.D.). The additional layer makes this volume a useful resource for those working on Han sources, especially poetry.

This book is intended as a “companion” to the original Grammata Serica Recensa and therefore does not repeat other information provided there. Matters such as English glosses and references to the earliest occurrence of a graph can be looked up in Grammata Serica Recensa itself or in other relevant dictionaries. The great accomplishment of this companion volume is to update an essential reference and thereby fulfill the need for an accessible and user-friendly source for citing the various historically reconstructed stages of Chinese.

85 percent of Han in China have two-syllable given names: report

Just how common are monosyllabic given names in China? I’ve seen lots of wild guesses, which generally range from about one-quarter to one-half (?!) of the population. Zhang et al., however, give the following figures:

91.06% Chinese have three-character names and only 8.34% have two-character names. People with four characters or more only constitute 0.6% of the population.

This was based on a database of 1,644,911 names in China.

According to a larger survey last year in the PRC, however, 14.22 percent of Han people in China have given names that are monosyllabic … and thus are written with a single Chinese character. On the other hand, 85.61 percent of Han people in China have full names written with exactly three Chinese characters, according to the report released by the National Citizen Identity Information Center, an organization with ties to China’s Ministry of Public Security. (It thus seems likely they have access to especially good data.)

Since the source material is unclear on what is meant by names written with three Chinese characters, it’s possible that some people in the second group have disyllabic family names and monosyllabic given names; but that number is likely to be close to statistically insignificant, given the relative paucity of monosyllabic given names and the outright rarity of disyllabic family names. (Only 0.02 percent of those in Zhang et al.’s name list had disyllabic family names.)

The sum of 14.22 and 85.61 is 99.83, which leaves 0.17 percent of those in China classified as Han having names that are at least four syllables long and so take at least four Chinese characters to write.

According to a report published last December but which I’m just now getting around to writing about, nearly one thousand names in China are written with at least ten Chinese characters. The news story, alas, does not give any of these names; but it does provide a breakdown of the numbers:

10 characters: 594 names
11 characters: 272 names
12 characters: 94 names
13 characters: 33 names
14 characters: 5 names
15 characters: 1 name

A total of 97 percent of those 999 people live in the predominantly non-Han Chinese region of Xinjiang, which likely indicates that they have non-Han names that are being forced into forms that fit procrustean Mandarinized syllables.
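
For anyone who wants to check the arithmetic in the figures above, here is a small sketch. The function and variable names are mine, purely for illustration. Percentages are handled as whole hundredths of a percent so that floating-point rounding doesn't muddy the result.

```javascript
// Percentages expressed as integer hundredths of a percent (14.22% -> 1422).
function remainderHundredths(parts) {
  var total = parts.reduce(function (sum, p) { return sum + p; }, 0);
  return 10000 - total; // whatever is left out of 100.00 percent
}

// 14.22 percent (monosyllabic given names) plus 85.61 percent
// (three-character full names) leaves 17 hundredths, i.e., 0.17 percent.
var leftover = remainderHundredths([1422, 8561]);

// Breakdown of names by number of characters, as listed in the news story.
var longNameCounts = { 10: 594, 11: 272, 12: 94, 13: 33, 14: 5, 15: 1 };
var totalLongNames = Object.keys(longNameCounts).reduce(function (sum, len) {
  return sum + longNameCounts[len];
}, 0); // 999
```

The same helper shows that the Zhang et al. figures (91.06, 8.34, and 0.6 percent) add up to exactly 100 percent: remainderHundredths([9106, 834, 60]) returns 0.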

A report from Nanjing states that 309 of the city’s 6 million people have names that take more than four Chinese characters to write.

PRC authorities have proposed limiting given names to two syllables and family names to four syllables (for rare cases in which a child receives a disyllabic family name from both parents).

As for Taiwan, monosyllabic given names are much rarer here than in China. My guess would be about 2 percent. This could probably be worked out from Chih-Hao Tsai’s list of Chinese names; but right now I don’t have the time.

On the other hand, China’s public is being urged to embrace new disyllabic family names, largely because the relative paucity of surnames ensures many, many people in China share common names.

Recent demographic surveys indicate there are about 1,600 surnames, with only 100 or so being frequently used, among Chinese nationals, which means many people share a name. For example, nearly 300,000 people, male and female, use the same common name of Zhang Wei, the statistics show.

The top 1,600 U.S. surnames don’t even cover half of the population, according to the U.S. Census Bureau, whose list of surnames found in the United States contains more than 88,000 entries.
