Wenlin: ‘software for learning Chinese’

I get a lot of questions about how to do some sort of conversion involving Chinese characters. Most of the time, my answer is something like, “Get Wenlin. Even the free, non-expiring demo version (4 MB) will do what you need — and a lot more.”

For those of you who aren’t familiar with Wenlin, Random Stuff That Matters has posted a five-minute movie (with sound) of Wenlin in action (14.5 MB).

The range of what Wenlin can do extends far beyond what the movie shows. A lot of people might not notice that even in the demo a wide range of options is available under

  • Edit → Make Transformed Copy

My favorite, which is available only with the full version, is

  • Edit → Make Transformed Copy → Pinyin Transcription

Oh, it is a thing of beauty.

For those of you who have the full version, I thought I’d share a little-known feature of Wenlin: its ability to search for regular expressions.

Let’s say you are trying to remember a chengyu (set phrase) about studying, but all you can recall is that it contains the sound “rubu.” You’re not sure of the characters. You’re not even sure of the tones. First you look up entries beginning with “rubu” in Wenlin’s electronic edition of the ABC Chinese-English Comprehensive Dictionary:

  • List → Words by Pinyin
  • Then enter rubu and hit OK.

This will take you to rùbùfūchū and rúbùshèngyī. But neither of those is what you’re looking for. Now what? Here’s where regular expressions come in handy.

Hit Ctrl+F to search for something within the current page.

In the Find box, enter

  • re=r(u|ū|ú|ǔ|ù)b(u|ū|ú|ǔ|ù)

This will yield:

  • chǒngrǔbùjīng 寵辱不驚[宠–惊] f.e. unmoved by honors/disgrace
  • lèirúbùgān 淚濡不乾[泪–干] f.e. be drowned in tears
  • nièrúbùyán 囁嚅不言[嗫—] f.e. 〈wr.〉 move the mouth without speaking
  • xuérúbùjí 學如不及[学—] f.e. study as if one could never learn enough

Bingo!
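For readers who want to try the same idea outside Wenlin, here is a minimal sketch in Python. It is not Wenlin's implementation; the `HEADWORDS` list and the `find_matches` helper are hypothetical and simply reuse the entries quoted in this post. The pattern is the one from the Find box without Wenlin's `re=` prefix.

```python
import re
import unicodedata

# Tone variants of "u": toneless plus the four Mandarin tones.
U_VARIANTS = "u|ū|ú|ǔ|ù"

# Same expression as in the Find box, minus Wenlin's "re=" prefix.
# NFC normalization keeps each toned vowel as a single code point.
PATTERN = re.compile(
    unicodedata.normalize("NFC", f"r({U_VARIANTS})b({U_VARIANTS})")
)

# Hypothetical sample data: headwords quoted in this post, plus one non-match.
HEADWORDS = [
    "rùbùfūchū",
    "rúbùshèngyī",
    "chǒngrǔbùjīng",
    "lèirúbùgān",
    "nièrúbùyán",
    "xuérúbùjí",
    "jiěfàng",
]


def find_matches(words):
    """Return the words that contain r + u + b + u, in any tone."""
    hits = []
    for word in words:
        if PATTERN.search(unicodedata.normalize("NFC", word)):
            hits.append(word)
    return hits


if __name__ == "__main__":
    for word in find_matches(HEADWORDS):
        print(word)
```

Run as a script, this prints every headword in the sample except jiěfàng, which has no "rubu" in it.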

The reason for using OR pipes to separate the possibilities instead of putting them together in a character class (i.e., the reason for writing (u|ū|ú|ǔ|ù) instead of [uūúǔù]) is that the regex library treats each non-ASCII character as a string of UTF-8 bytes. Inside square brackets those bytes are matched one at a time, so without the pipes you could end up with extra garbage or fail to find what you intend at all. This might be fixed in the next version.
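The byte problem can be reproduced in Python by matching on UTF-8 bytes instead of text. This is only an illustration of the general issue, assuming a byte-oriented regex engine; it says nothing certain about the library Wenlin actually uses. The helper names and the test string "rubà" are made up for the demonstration.

```python
import re
import unicodedata

# NFC keeps each toned vowel as one code point (two UTF-8 bytes, except plain "u").
VOWELS = [unicodedata.normalize("NFC", v) for v in ["u", "ū", "ú", "ǔ", "ù"]]


def byte_pattern(vowel_expr):
    """Compile r<vowel>b<vowel> as a pattern over UTF-8 bytes."""
    return re.compile(f"r{vowel_expr}b{vowel_expr}".encode("utf-8"))


# [uūúǔù] as bytes: a set of single bytes, not a set of characters.
CLASS_RE = byte_pattern("[" + "".join(VOWELS) + "]")

# (u|ū|ú|ǔ|ù) as bytes: each alternative is a complete byte sequence.
ALT_RE = byte_pattern("(" + "|".join(VOWELS) + ")")


def matches(pattern, text):
    """Search the UTF-8 encoding of text with a bytes pattern."""
    data = unicodedata.normalize("NFC", text).encode("utf-8")
    return pattern.search(data) is not None


if __name__ == "__main__":
    for word in ["xuérúbùjí", "rubu", "rubà"]:
        print(word, "class:", matches(CLASS_RE, word), "pipes:", matches(ALT_RE, word))
```

With the bracket form, xuérúbùjí is missed because only the first byte of ú is consumed before the pattern expects "b". The made-up string "rubà" is wrongly accepted because à shares its first byte with ú and ù, and the match stops in the middle of that character. The piped form handles both cases correctly.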

surname-spelling scrap

Danwei has picked up on a story of someone in China with the surname of Xiè being issued an air ticket under the name Jiě. The reason behind the mixup is that the character used for this woman’s name, 解, is most often pronounced “jie,” as in jiěfàng (liberate; emancipate), jiějué (solve; resolve; settle), liǎojiě (understand; comprehend; find out; acquaint oneself with), and jiěshì (expound; interpret; analyze). It is but one of the many Chinese characters that have more than one pronunciation.

When she and some of her relatives went to the travel agency to get the matter cleared up, however, an argument broke out. Before long, people from the travel agency were using poles to beat the family.

(Maybe not my strongest entry, but there was no way I was going to pass up a chance to post on a story titled “Is personal safety another argument for Chinese romanization?”)


many Taipei sixth graders can’t use traditional dictionaries

The Taipei City Government has released the results of a Mandarin proficiency exam administered to 31,145 sixth-grade students.

According to the results, more than 40 percent of those tested are unable to use so-called radicals (bùshǒu, 部首) to find Chinese characters in dictionaries. This, of course, comes as no great surprise to me. Ah, for the wisdom of the alphabetical arrangement of the ABC Chinese-English Comprehensive Dictionary!

Furthermore, the Taipei Times reports that the person in charge of the testing, Datong Elementary School Principal Chen Qin-yin, said that although most students received good grades, the essay test revealed weaknesses in writing ability, including a limited use of adjectives.

Reading that sort of thing sets off all sorts of alarms in my head. First, adjectives are the junk food of writing. Even worse, though, I suspect that Chen is talking not about any ol’ adjectives but rather stock phrases either in or reminiscent of Literary Sinitic (Classical Chinese). Larding a text with clichés is the sort of thing that passes for good writing here. And if, for example, students don’t throw in a zhi in the place of a de often enough their grades will suffer.

The language reforms springing from the May 4 movement have been tremendously important. But more than eighty years later the job still isn’t finished!


Taiwanese and alphabetical abbreviations

I’d been working on a post about the cards and miniature magnets given away at Family Mart (Quánjiā / 全家) convenience stores with purchases of at least NT$75 (about US$2). But Jason at Wandering to Tamshui beat me to it yesterday with a post showing all of the cards, so I’ll keep this short.

These are particularly interesting because of the use of Taiwanese as well as several other languages, though everything here is labeled “Yingwen” (English). As Jason wrote, “That faint sound you hear is a thousand foreign English teachers slapping their foreheads in despair.”

The series, labeled Quánmín pīn Yīngwén (全民拼英文), is probably meant to counter rival 7-Eleven’s popular Hello Kitty button series. Although few take on Hello Kitty and live to tell the tale, I think the alphabet cards are doing fairly well.

Below is an example. On the left is the wrapper (pun not intended). Top right shows the front and back of the magnet that comes with this particular card. And at bottom right is the card itself. (I say card; but it’s really just glossy paper.)

Here MG is meant to stand for mai3 ke2 sian1 (as always, help with my spelling would be appreciated), which, despite the use of Chinese characters (嘜假仙), is Taiwanese, not Mandarin. Reading 嘜假仙 as Mandarin yields only nonsense. (So much for the “universality” and “ideographic” myths of Chinese characters.)

[Photo: promotional item from a convenience store, using the Roman alphabet to indicate abbreviations of phrases in Taiwanese]


UN has been using only simplified characters for years: Taiwan foreign ministry

In my earlier post on a report that the United Nations would drop the use of traditional Chinese characters, I wrote, “I hadn’t known the U.N. was still using traditional characters at all.”

According to a release from Taiwan’s Ministry of Foreign Affairs (MOFA) yesterday, the U.N. has not used traditional characters for years. The story led the Taiwan News today:

When Taiwan’s representative office in New York checked on the report with the U.N., officials from the Department of the U.N. Secretariat said they were not informed of the report and felt puzzled by it, the [MOFA] statement said.

Although the U.N. uses Chinese, English, French, Spanish, Arabic, and Russian as its official languages, the decision has not deterred the development of other languages, such as Japanese, German, or Portuguese, the statement added.

The conservation of culture in countries using these languages was also unaffected by the U.N.’s language policy, the statement said.

Overseas Chinese Affairs Commission Vice Minister Cheng Tong-hsing said yesterday at the Legislature that the government has plans to call press conferences and various publicity campaigns to boost public awareness of the significance of using traditional Chinese characters among Taiwanese and overseas Chinese.

Minister of Education Tu Cheng-sheng (杜正勝) also said that due to the language’s historical and cultural significance, the MOE is firm in its stance that traditional Chinese characters will continue to be taught in local educational institutions regardless of the U.N.’s decision.

The Taipei Times’ report was more cautious:

Tu said that the education ministry was in the process of verifying the UN’s plans.

It appears there’s something fishy (xīqiāo) going on, as the foreign ministry put it.


Chinese characters and left-handers

I came across an article earlier on myths about left-handedness. The section labeled “oppressing the left” notes that “lefties have long suffered.” One of the statements made in support of this, however, is that “Chinese characters prove extremely difficult to write with the left hand.” I’ve heard this assertion about Chinese characters before, many times.

Certainly there’s been a great deal of discrimination against left-handed people in China and Taiwan, where they are often forced to switch. This happens even more frequently in those two countries than in the West, where it almost certainly continues to occur. (When I was in second grade my teacher tried to force me to use my right hand. Fortunately for me, my left-handed father came to school to set her straight on this.)

Oddly enough, people in Taiwan and China have often remarked to me that left-handed people are especially smart.

I have none-too-beautiful handwriting when it comes to Chinese characters. My handwriting in the Roman alphabet, however, is pretty good when I’m writing for someone other than myself. But I doubt the difference has anything to do with me being left-handed. I didn’t grow up endlessly practicing how to write Chinese characters; also, I simply don’t care.

I’d like to note a few things.

  • For thousands of years, until well into the twentieth century, the standard order for Chinese texts was top to bottom and right to left, which, if it benefits anyone, would seem to benefit left-handed people.
  • Throughout most of their history, Chinese characters have most often been written with a calligraphy brush (maobi). And in calligraphy the brush is held perpendicular to the paper, so there’s no slant beneficial to people writing with one hand or the other.
  • Most writing with a brush is still done top to bottom and right to left.
  • Since pencils and pens produce lines of even thickness, there doesn’t seem to be anything inherently different in writing Chinese characters with these than writing the Roman alphabet, something left-handed people can do just fine.

So what, other than prejudice, is the source of the contention that left-handed people are at a significant disadvantage when it comes to writing Chinese characters?

Before anyone mentions stroke order, however, I’d like to note that stroke order is also largely a convention, not something inherent in the final appearance of the character. Otherwise, there wouldn’t be variations in stroke order, even today, especially between China and Japan.

I’m inclined to believe that this is just another of the many erroneous claims about Chinese characters, but I’d certainly be interested in hearing any evidence to the contrary.

source: What Makes a Lefty: Myths and Mysteries Persist, Live Science, March 21, 2006

Taipei County signage and romanization systems

Speaking yesterday on topics related to signage and romanization, Taipei County Magistrate Zhou Xi-wei said that Taipei County should have its systems match those of Taipei City:

Táiběi Xiànzhǎng Zhōu Xīwěi xīwàng yǐ shēnghuóquān wéi kǎoliáng, yào hé Táiběi Shì zhěnghé yīzhì.

One of the implications of this is that for Taipei County, Taiwan’s most populous area, Tongyong Pinyin is out and Hanyu Pinyin is in.

This is no surprise, given that Zhou

  • is a member of the Kuomintang, whose chairman, Ma Ying-jeou, has backed Hanyu Pinyin and implemented it in Taipei in his role as mayor of the capital
  • campaigned for integration (whatever that’s supposed to mean) of Taipei County with Taipei City.

As an advocate of Hanyu Pinyin and resident of Taipei County, I’m pleased by the change. But as someone who has lived in Taiwan for ten years, I know all too well how likely it is that the new signage will be botched. Taiwan has a poor record of correct implementation of romanization — in any system. Moreover, there are aspects of Taipei City’s signage that Taipei County should certainly not copy, namely InTerCaPiTaLiZaTion (unnecessary and counterproductive) and “nicknumbering” (putting a number on a street does nothing to aid communication if nobody knows what the number refers to). So if this doesn’t end up another SNAFU, I’ll be pleasantly surprised. (Does anyone have any good contacts within the Taipei County government? I’d like to be able to talk with some people in charge well before this gets beyond the planning stage.)

Until late last year Taipei County was under a DPP administration, so its romanization policy, such as it was, was to use Tongyong Pinyin. But implementation has been spotty and often sloppy. Most street signs in Taipei County remain in MPS2. Banqiao has seen more signs in Tongyong Pinyin; but most of those have the romanization in such relatively tiny letters that it’s nearly useless for drivers.

Turning back for a moment to the news reports that prompted this post, an additional item of interest is the headline of one of the stories: Pīnyīn fāngshì「qiao」bùdìng Yīngwén dìmíng busasa (拼音方式「喬」不定 英文地名霧煞煞). Here, both qiao and busasa are Taiwanese, not Mandarin. (A-giâu or somebody else, help me out on the spelling here!)

Here’s one of the stories:

Táiběi jiéyùn Bǎnqiáo-Tǔchéng xiàn jiāng yú wǔ yuèfèn tōngchē zhì Tǔchéng yǒng nìng zhàn, yīnyīng zhuǎnchéng lǚkè xūyào, Tái-Tiě Bǎnqiáo chēzhàn jiāng shèzhì línshíxìng zhǐshì pái. Bùguò, Yīngyǔ yìyīn hùnluàn, xiànzhǎng Zhōu Xī-wěi biǎoshì, gāi cǎiyòng Tōngyòng Pīnyīn huòshì Hànyǔ Pīnyīn, jiāng huì yǐ shēnghuóquān de gàiniàn wèi qiántí, yǔ Táiběi Shì zhěnghé.

So the additional MRT stations are opening in May after all. As a Banqiao resident who has waited long for that day, I’m happy to hear it. But since the stations are opening so soon, I’d be willing to bet that they’ll reproduce the mistakes already in the system instead of correcting them.

Zhōu Xī-wěi biǎoshì, wèilái yě jiāng tuīdòng yī piào fúwù dàodǐ wèi mùbiāo, rú mínzhòng chíyǒu yōu yóu kǎ huò qítā piàozhèng, jíkě zhuǎnchéng jiéyùn, gāo tiě huò Tái-Tiě, dāchéng dàzhòng yùnshū gōngjù jiāng gèng biànlì.

Zhōu Xī-wěi jīntiān xiàwǔ xúnshì Bǎnqiáo huǒchēzhàn rénxíng tōngdào, duìyú Tái-Tiě, gāo tiě jí jiéyùn sān tiě gòng gòu, zhàn pái, lù míng Yīngwén biāoshì què wǔhuābāmén, yǒude yòng Táiwān Tōngyòng Pīnyīn, yǒude yǐ Zhōngguó dàlù Hànyǔ Pīnyīn, érqiě biāoshì shífēn bù míngxiǎn, dēngguāng bùgòu míngliàng, Zhōu Xī-wěi xīwàng gè dānwèi xiétiáo gǎishàn.

Zhōu Xī-wěi rènwéi, Bǎnqiáo chēzhàn jiānglái shì quánguó zuìdà de jiāotōng zhuǎnyùnzhàn, měirì fúwù wúshù mínzhòng, biāoshì yīng yǐ jiǎndān fāngshì, qīngchu gàosu shǐyòng rénshēn yú héchù, gāi wǎng héchù qù.


Windows computer systems and Pinyin input of Chinese characters

I often get messages from people asking how to use Hanyu Pinyin to input Chinese characters on their English-language Windows systems. But the most I’ve ever added to my site on this topic is a brief page on using Pinyin to type Chinese characters on a U.S. English Windows 2000 system. Fortunately for everyone, now there’s Pinyin Joe’s Chinese computing resources, which explains in user-friendly detail how to set up Western-language Windows XP computers to input Chinese characters using Pinyin and even zhuyin fuhao. I certainly don’t recommend using zhuyin; but it’s nice to know the information on how to type it (both by itself and for character input) is available and put forward so clearly.

The site covers a few other areas as well. Check it out. Pinyin Joe’s also promises to cover Vista once Microsoft finally releases it.

Another good place to ask related questions is Forumosa’s technology forum, especially within the thread on Hanyu Pinyin input for XP.