Zhou Enlai and others on script reform

New on Pinyin Info is the nearly complete text of Reform of the Chinese Written Language, a booklet from the PRC that dates back to 1958. Most of the essays, however, contain misconceptions about Chinese characters, romanization, and the nature of script reform, so this work is placed on this site as a historical reference, not as recommended reading. With that in mind, here are the essays:

some recent posts elsewhere

Although many notable stories have been in the news lately, I haven’t had time yet to comment on any of them. So for now I’d like to draw everyone’s attention to two recent posts elsewhere:

Pinyin, mispronounced Mandarin linked: Malaysian official

Although announcements in Mandarin are being mispronounced in Kuala Lumpur International Airport, that’s only to be expected because the announcers are paid little and must use Hanyu Pinyin, according to Malaysian Deputy Tourism Minister Datuk Donald Lim Siang Chai.

Bah. Pinyin doesn’t take long to learn. Moreover, it’s simple and accurate. The problem is simply a lack of training. Hanyu Pinyin is probably more closely phonetic than the spelling systems of any of the other languages the airport personnel would have to deliver announcements in.

Here’s the article:

Announcements in Mandarin pronounced wrongly at KL International Airport should be tolerated if the information is accurate, said Deputy Tourism Minister Datuk Donald Lim Siang Chai.

He said information should include time of flight arrivals and departures and gate numbers.

Lim attributed the wrong pronunciations to the announcers, who relied on hanyu pinyin (romanised Chinese).

“It is not easy to get good announcers given the low pay and long working hours,” he told reporters after opening a workshop organised by the Malaysia Mental Literacy Movement here yesterday.

Lim said RTM also has a similar problem in getting newsreaders fluent in dialects.

Sin Chew Daily reported last week that wrong pronunciations at KLIA had not only drawn laughter but also made some tourists irritated.

source: Info more important than how you say it, Star, May 14, 2006

via justnice.org ver 3.0

China’s Cultural Revolution, Pinyin, and other romanizations

Some people have the idea that because during the Cultural Revolution the Red Guards went about destroying much of China’s cultural heritage, they must have attacked Chinese characters and supported Pinyin. This idea is wrong. During that terrible time Pinyin was attacked, like so much else that was good in China.

With the fortieth anniversary of the beginning of the Cultural Revolution upon us, this might be a good time to bring out this selection from The Chinese Language: Fact and Fantasy, by John DeFrancis:

In view of the fact that separate alphabetic treatment for the regionalects has been a virtually tabooed subject since 1949, it comes as a surprise that among the revelations following the downfall of the Gang of Four is an account by Prof. Huang Diancheng of Amoy University of the adaptation of Pinyin to the Southern Min speech of Amoy and its use in the production of anti-illiteracy textbooks and other activities. Huang reports that during the Cultural Revolution people possessing materials in Min alphabetic writing were denounced as “foreign lackeys” and were forced to take the material out to the street, kneel down alongside them, set them afire, and reduce them to ashes. Elsewhere repression of Pinyin in any form was undertaken by xenophobic Red Guards, themselves staunch supporters of character simplification, who tore down street signs written in Pinyin as evidence of subservience to foreigners.

The Nazi-like book-burning episode and other acts against the use of Pinyin are fitting testimony of the repression exercised against activities concerned with fundamental issues in Chinese writing reform. In these actions the positive idea that China should stand on its own feet without demeaning reliance on foreign aid was expressed in its most xenophobic form as a sort of anti-intellectual blood-and-soil nativism that constitutes a danger, still present, of a Chinese-style fascism. The young student storm troopers who sought to humble the old-time intellectuals, far from following Lu Xun in embracing the one system of writing that would have done the most to equalize things between illiterates and all those who had received an education, supported instead the lesser reform of character simplification that might enhance their own position relative to the older generation.

evolution of simplified Chinese characters: dissertation

Stockholm University’s Department of Oriental Languages has just released Long Story of Short Forms: The Evolution of Simplified Chinese Characters (10.4 MB PDF), a Ph.D. dissertation by Roar Bökset.

Here is the abstract:

A script reform was carried out in China between 1955 and 1964 by simplifying the shape of a number of characters. Most of the simplified forms adopted had already been in popular use for a long time before this reform, while a few were invented for the occasion.

One objective of this dissertation is to estimate the proportion of invented forms. To this end, use of simplified variants before 1955 was surveyed. Pre-reform writing turned out to be more heterogeneous than expected. In fact, already Han dynasty (206 BC-AD 220) handwriting differed considerably from the norms set up by contemporary dictionaries and model texts.

One aim of the script reform was to unify writing habits and make them conform better with established norms. To evaluate the Script Reform Committee’s success in this field, this dissertation surveys the use of different unofficial short forms even after the reform. Success turned out to be moderate. Many pre-1955 short variants survived, and, what was worse, new ones emerged after the reform. Particularly confusing was the use of different unofficial short forms in different parts of China. The existence of such local variants was confirmed by extensive reading of signs, advertisements, price tags and wall newspapers in twenty-one provinces, and by interviews with informants at four hundred localities. Results of that survey are displayed on twenty-four maps.

A few years earlier, even Japanese characters had gone through a reform which made many simplified forms official. Some of the new official Japanese forms differed from those which came to be official in China, creating a discrepancy which has at times been lamented. However, this dissertation compares the short forms used in pre-reform Japan with those of pre-reform China, and shows that most of the present discrepancies have roots in differences in Chinese and Japanese writing traditions, which bound the hands of reformers in both countries and enforced the decisions which were eventually made.

OCR and Pinyin texts

[This entry is largely for my own reference. But feel free to read on, especially if you’re interested in OCR or if you somehow happen to have a lot of Pinyin texts lying around.]

What’s the best way to run optical character recognition (OCR) on texts written in Pinyin with tone marks? Adobe Acrobat 7.0 Standard, the most advanced such software I have on my computer, doesn’t have a “Pinyin” setting. I’d be surprised if any OCR software currently does.

Getting second tones, fourth tones, and umlauts to be read correctly shouldn’t be a big problem, given how the same marks are standard in the orthographies of many European languages. But first tones and third tones are a different matter. The best that can probably be hoped for at present is a more-or-less regular rendering of vowels with first- and third-tone marks as something else that can be fixed quickly through a search-and-replace procedure.
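A search-and-replace pass of that kind might look something like the following Python sketch. The mapping table is purely hypothetical: it assumes the OCR engine consistently renders third-tone vowels as umlauted vowels and first-tone vowels as tilde vowels, which real output may well not do. The names `OCR_FIXES` and `fix_ocr` are made up for illustration.

```python
import unicodedata

# Hypothetical mapping from characters an OCR engine might emit to the
# Pinyin vowels they stand for. Adjust it to match what the engine really does.
# "ü" is deliberately left out because it is a legitimate Pinyin letter.
OCR_FIXES = {
    "ä": "ǎ", "ë": "ě", "ï": "ǐ", "ö": "ǒ",  # assumed: umlaut = third tone
    "Ä": "Ǎ", "Ë": "Ě", "Ï": "Ǐ", "Ö": "Ǒ",
    "ã": "ā", "õ": "ō",                      # assumed: tilde = first tone
    "Ã": "Ā", "Õ": "Ō",
}


def _nfc(text: str) -> str:
    """Normalize to precomposed characters so each accented vowel is one code point."""
    return unicodedata.normalize("NFC", text)


# str.maketrans needs single-character keys, hence the normalization.
_TABLE = str.maketrans({_nfc(k): _nfc(v) for k, v in OCR_FIXES.items()})


def fix_ocr(text: str) -> str:
    """Replace each mapped character; everything else passes through unchanged."""
    return _nfc(text).translate(_TABLE)


if __name__ == "__main__":
    print(fix_ocr("wö yöu"))
```

A one-to-one table like this only helps if the OCR errors are regular. Where the engine uses the same substitute for both a first-tone and a third-tone vowel, the results would still need to be checked by hand.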

Here’s an image, slightly reduced, of what was being scanned:
[image: scan of sentences in Pinyin]

Here’s the text:
Wǒ bù shì xuézhě, bù néng yǐnjīng jùdiǎn. Dànshì wǒ yǒu cóng zìjǐ shēnghuó lǐ délái de wǔ ge zhēnshí lìzi, dōu biǎomíng Hànzì bìng bù tèbié biǎoyì.

Here are the results of OCR, with various language settings applied:

DUTCH
WÖ bu shi xuézhë, bù néng yinjing jùdiän. Danshi wö yöu cóng zip shënghuó
li délái de wü ge zhënshí Iizi, döu biäomíng Hanzì bing bu tebié biäoyì.

CATALAN
W6 bu shi xuezh8, bir neng yinjing jhdisn. Danshi w6 y5u cong ziji shenghuó
li delai de wü ge zhenshí lizi, d6u bibmíng Hanzi bing bu tebie bigoyi.

DANISH
W6 bu shi xuezhe, bU neng yinjing jhdian. Danshi w6 y5u cong ziji shGnghu6
li delai de wii ge zhenshi Iizi, dóu bigoming Hanzi bing bu tebie biaoyi.

FINNISH
WÖ bu shi xuezhe, bU neng yinjing jiidiän. Danshi wö yöu cong ziji shGnghu6
Ii delai de wii ge zhenshi Iizi, döu biäoming Hanzi bing bu tebie biäoyi.

FRENCH
W6 bù shi xuézhë, bù néng yinjing jùdian. Dànshi wO y5u cong ziji shënghu6
li délai de wü ge zhënshi Iizi, dou bigoming Hànzi bing bu tèbié biaoyi.

GERMAN
WÖ bu shi xuezhe, bU neng yinjing jiidiän. Danshi wö yöu cong ziji shGnghu6
li delai de wü ge zhenshi Iizi, döu biäoming Hanzi bing bu tebie biäoyi.

GERMAN (SWISS)
WÖ bu shi xuezhe, bU neng yinjing jiidiän. Danshi wö yöu cong ziji shGnghu6
li delai de wü ge zhenshi Iizi, döu biäoming Hanzi bing bu tebie biäoyi.

ITALIAN
W6 bu shì xuézhe, bù néng yinjing jùdian. Dànshì w6 y5u cong ziji shènghu6
li délai de wii ge zhenshi Iizi, dou bigoming Hànzì bing bu tèbié biaoyì.

NYNORSK
W6 bu shi xuezhe, bU neng yinjing jhdian. Danshi wO y5u cong ziji shGnghu6
li delai de wii ge zhenshi Iizi, dou biaoming Hanzi bing bu tebie biaoyi.

PORTUGUESE (BRAZILIAN)
WÕ bu shi xuézhe, bU néng yinjing jùdiãn. Danshi wõ yõu cóng ziji shènghuó
li délái de wü ge zhenshí Iizi, dõu biãomíng Hanzi bing bu tèbié biãoyi.

PORTUGUESE
WÕ bu shi xuézhe, bU néng yinjing jùdiãn. Danshi wõ yõu cóng ziji shènghuó
li délái de wü ge zhenshí Iizi, dõu biãomíng Hanzi bing bu tèbié biãoyi.

SPANISH
W6 bu shi xuézhe, bU néng yinjing jhdian. Danshi wO y5u cóng ziji shenghuó
li délái de wü ge zhenshí Iizi, dóu bigomíng Hanzi bing bu tebié biaoyi.

There’s no clear winner. The best results, such as they are, appear to come from the Dutch and Portuguese (Brazilian or standard) settings.
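For a rougher but less subjective comparison, each OCR result could be scored against the correct text. The sketch below uses Python's standard `difflib.SequenceMatcher` to get a character-level similarity ratio. The function names `similarity` and `rank` are made up for illustration, and the reference sentence in the example is a reconstruction of the first scanned sentence, so it should be treated as an assumption.

```python
import difflib
import unicodedata


def similarity(reference: str, ocr_output: str) -> float:
    """Return a character-level similarity ratio between 0 and 1."""
    # Normalize so precomposed and decomposed accents compare equal,
    # and collapse line breaks and runs of spaces into single spaces.
    ref = " ".join(unicodedata.normalize("NFC", reference).split())
    out = " ".join(unicodedata.normalize("NFC", ocr_output).split())
    return difflib.SequenceMatcher(None, ref, out, autojunk=False).ratio()


def rank(reference: str, results: dict[str, str]) -> list[tuple[str, float]]:
    """Sort language settings from most to least similar to the reference."""
    scores = {lang: similarity(reference, text) for lang, text in results.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # First sentence only, to keep the sample short.
    reference = "Wǒ bù shì xuézhě, bù néng yǐnjīng jùdiǎn."
    results = {
        "DUTCH": "WÖ bu shi xuézhë, bù néng yinjing jùdiän.",
        "SPANISH": "W6 bu shi xuézhe, bU néng yinjing jhdian.",
    }
    for lang, score in rank(reference, results):
        print(f"{lang}: {score:.3f}")
```

A score like this measures raw accuracy only. It says nothing about whether the errors are regular enough to be cleaned up with search-and-replace, which may matter more in practice.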

Aborigine legislators should use original names: activist

Aborigine politicians should use their original names, not Han Chinese names, or explain to their constituents why they don’t, the head of an aboriginal group called the Vine Cultural Association stated on Tuesday.

All eight of Taiwan’s legislators holding the seats reserved for Aborigines — Chen Ying, Liao Kuo-tung, Lin Cheng-er, Yang Jen-fu, Kao Chin Su-mei, Kung Wen-chi, Lin Chung-te, Tseng Hua-te — currently officially use “Chinese” names rather than Aborigine ones.

The head of Taiwan’s Council of Indigenous Peoples, however, does use his original name: Walis Pelin.

I’m waiting for someone to get on TV and talk about how few legislators who are Hoklo use Taiwanese rather than Mandarin forms for the romanizations of their names. (I could probably count them all on one hand, even though Taiwan has some 225 legislators.) Same thing for legislators who are Hakka but who don’t use the Hakka forms of their names in romanization.


Cantonese input method for Chinese characters

There’s a new Unicode-based phonetic input method for entering Chinese characters … using Cantonese: CantoInput.

Here’s the author’s description:

What is it?
CantoInput is a freely available, Unicode-based Chinese input method (IME) which allows you to type both traditional and simplified characters using Cantonese romanization. Both the Yale and Jyutping methods are supported. A Mandarin Pinyin mode is also available.

Why does the world need another Chinese input method?
While there already exist excellent phonetic input methods based on Mandarin Pinyin pronunciation, there is a general lack of support for Cantonese. As a Cantonese learner, I was frustrated by the difficulty of typing Chinese, especially Cantonese-specific colloquial characters. Most existing Cantonese input methods require a Chinese version of Windows and operate using non-Unicode encodings such as BIG5 or GB, while non-phonetic methods such as Cangjie have a very steep learning curve. I originally wrote this program for my own personal use but decided to make it freely available since I felt that other Cantonese speakers and learners might also find it useful. It’s still really basic at this time, but hopefully I’ll have time to improve the interface and add more features in the future.

Those interested in trying this out might find the comments on Chinese Forums useful.