I get a lot of questions about how to do some sort of conversion involving Chinese characters. Most of the time, my answer is something like, “Get Wenlin. Even the free, non-expiring demo version (4 MB) will do what you need — and a lot more.”
For those of you who aren’t familiar with Wenlin, Random Stuff That Matters has posted a five-minute movie (with sound) of Wenlin in action (14.5 MB).
Wenlin can do far more than the movie shows. A lot of people might not notice that even the demo offers a wide range of options under
Make Transformed Copy
My favorite, which is available only with the full version, is
Make Transformed Copy→
Oh, it is a thing of beauty.
For those of you who have the full version, I thought I’d share a little-known feature of Wenlin: its ability to search for regular expressions.
Let’s say you are trying to remember a chengyu (set phrase) about studying, but all you can recall is that it contains the sound “rubu.” You’re not sure of the characters. You’re not even sure of the tones. First you look up entries beginning with “rubu” in Wenlin’s electronic edition of the ABC Chinese-English Comprehensive Dictionary:
Words by Pinyin
- Then enter
This will take you to rùbùfūchū and rúbùshèngyī. But neither of those is what you’re looking for. Now what? Here’s where regular expressions come in handy.
Press Ctrl+F to search for something within the current page.
In the Find box, enter
This will yield:
- chǒngrǔbùjīng 寵辱不驚[宠–惊] f.e. unmoved by honors/disgrace
- lèirúbùgān 淚濡不乾[泪–干] f.e. be drowned in tears
- nièrúbùyán 囁嚅不言[嗫—] f.e. 〈wr.〉 move the mouth without speaking
- xuérúbùjí 學如不及[学—] f.e. study as if one could never learn enough
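For readers who want to try the same idea outside Wenlin, here is a minimal sketch in Python. It is not Wenlin's own search. The headword list is just the six entries quoted in this post, and the pattern, the names `ANY_U` and `RUBU_ANY_TONE`, and the two helper functions are all made up for illustration.

```python
import re
import unicodedata

def nfc(text):
    """Use precomposed characters so that each toned vowel is one code point."""
    return unicodedata.normalize("NFC", text)

# Sample headwords taken from the entries quoted in this post; a real
# dictionary would hold many thousands of them.
ENTRIES = [
    "rùbùfūchū",
    "rúbùshèngyī",
    "chǒngrǔbùjīng",
    "lèirúbùgān",
    "nièrúbùyán",
    "xuérúbùjí",
]

# Every tone variant of "u", separated by OR pipes.
ANY_U = nfc("(u|ū|ú|ǔ|ù)")

# Hypothetical pattern: "rubu" in any tone, anywhere in the word.
RUBU_ANY_TONE = re.compile("r" + ANY_U + "b" + ANY_U)

def find_rubu(entries):
    """Return the entries that contain 'rubu' regardless of tone."""
    return [e for e in entries if RUBU_ANY_TONE.search(nfc(e))]

def find_rubu_not_initial(entries):
    """Same, but start looking at the second character, which skips
    entries whose only 'rubu' is at the very beginning."""
    return [e for e in entries if RUBU_ANY_TONE.search(nfc(e), 1)]

for word in find_rubu_not_initial(ENTRIES):
    print(word)
```

The second helper mirrors the situation in the walkthrough: the words that begin with "rubu" have already been ruled out, so only the ones with "rubu" in the middle are of interest.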
The reason for using OR pipes to separate the possibilities instead of putting them together in brackets (that is, the reason for writing (u|ū|ú|ǔ|ù) instead of [uūúǔù]) is that the regex library sees non-ASCII characters as strings of UTF-8 bytes. Inside brackets, each of those bytes is treated as a separate alternative rather than as part of one character, so without the pipes you could end up with extra garbage or not find what you intend to at all. This might be fixed in the next version.
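The byte problem can be reproduced in Python, with the caveat that this is only a stand-in: I am assuming that Python's `re` module, when given bytes patterns, behaves roughly like a byte-oriented regex engine, and I have not checked it against Wenlin's actual library. The helper `byte_search` is hypothetical.

```python
import re
import unicodedata

def nfc(text):
    """Use precomposed characters so that each toned vowel is one code point."""
    return unicodedata.normalize("NFC", text)

ALTERNATION = nfc("(u|ū|ú|ǔ|ù)")
BRACKETS = nfc("[uūúǔù]")

def byte_search(pattern, text):
    """Search the raw UTF-8 bytes, as a byte-oriented engine would."""
    return re.search(pattern.encode("utf-8"), nfc(text).encode("utf-8"))

# "é" is not a form of "u", but its UTF-8 encoding (C3 A9) shares the
# lead byte C3 with "ú" (C3 BA) and "ù" (C3 B9).
print(byte_search(ALTERNATION, "é"))  # None: no alternative matches
print(byte_search(BRACKETS, "é"))     # a match on the single byte C3

# Even on a genuine "ú", the bracket form matches only the first byte.
print(byte_search(BRACKETS, "ú").group())     # b'\xc3'
print(byte_search(ALTERNATION, "ú").group())  # b'\xc3\xba'
```

With the pipes, each alternative is a complete byte sequence, so only whole characters match. With the brackets, the set is really a grab bag of individual bytes, which explains both the false hits and the half-character "garbage."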