I get a lot of questions about how to do some sort of conversion involving Chinese characters. Most of the time, my answer is something like, “Get Wenlin. Even the free, non-expiring demo version (4 MB) will do what you need — and a lot more.”
For those of you who aren’t familiar with Wenlin, Random Stuff That Matters has posted a five-minute movie (with sound) of Wenlin in action (14.5 MB).
The range of what Wenlin can do extends far beyond what the movie shows. Many people might not notice that, even in the demo, a wide range of options is available under
Make Transformed Copy
My favorite, which is available only with the full version, is
Make Transformed Copy?
Oh, it is a thing of beauty.
For those of you who have the full version, I thought I’d share a little-known feature of Wenlin: its ability to search for regular expressions.
Let’s say you are trying to remember a chengyu (set phrase) about studying, but all you can recall is that it contains the sound “rubu.” You’re not sure of the characters. You’re not even sure of the tones. First you look up entries beginning with “rubu” in Wenlin’s electronic edition of the ABC Chinese-English Comprehensive Dictionary:
Words by Pinyin
- Then enter
This will take you to rùbùfūchū and rúbùshèngyī. But neither of those is what you’re looking for. Now what? Here’s where regular expressions come in handy.
Press Ctrl+F to search for something within the current page.
In the Find box, enter
This will yield:
- chǒngrǔbùjīng 宠辱不惊[寵–驚] f.e. unmoved by honors/disgrace
- lèirúbùgān 泪如不干[淚–乾] f.e. be drowned in tears
- nièrúbùyán 嗫嚅不言[囁—] f.e. 〈wr.〉 move the mouth without speaking
- xuérúbùjí 学如不及[學—] f.e. study as if one could never learn enough
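For readers who want to try the same idea outside Wenlin, here is a minimal sketch in Python. It is not Wenlin's own syntax or engine; the pattern, the variable names, and the short list of headwords (with tone marks as I read them) are assumptions made purely for illustration. It shows how an alternation of tone-marked vowels can find "rubu" anywhere in an entry, whatever the tones.

```python
import re

# Toneless "u" plus its four tone-marked forms, separated by OR pipes.
U_TONES = "(u|ū|ú|ǔ|ù)"

# Hypothetical pattern: "r", any-tone "u", "b", any-tone "u", anywhere in the entry.
RUBU = re.compile("r" + U_TONES + "b" + U_TONES)

# Sample pinyin headwords (illustrative data, not read from Wenlin),
# plus one entry that should not match.
headwords = [
    "rùbùfūchū",
    "chǒngrǔbùjīng",
    "lèirúbùgān",
    "nièrúbùyán",
    "xuérúbùjí",
    "bùkěsīyì",
]

# Keep only the entries that contain the "rubu" sound sequence.
matches = [w for w in headwords if RUBU.search(w)]
print(matches)
```

Because the pattern is not anchored to the start of the entry, it catches phrases where "rubu" sits in the middle, which is exactly what a lookup by initial letters cannot do.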
The reason for using OR pipes to separate the possibilities instead of putting them together (i.e., the reason for writing (u|ū|ú|ǔ|ù) instead of [uūúǔù]) is that the regex library sees non-ASCII characters as strings of bytes (UTF-8). Inside square brackets, each of those bytes is treated as a separate possibility, so without the pipes you could end up with extra garbage or not find what you intend to at all. This might be fixed in the next version.
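The byte-level behavior described here can be reproduced with a rough sketch in Python, using its `re` module on UTF-8 bytes as a stand-in for a byte-oriented regex library. I don't know which library Wenlin actually uses, so treat this as an analogy under that assumption; the variable names are mine.

```python
import re

entry = "xuérúbùjí"
raw = entry.encode("utf-8")  # the entry as a byte-oriented engine would see it

# Bracket form over bytes: the class holds the individual bytes of each vowel,
# so it can match only ONE byte, never a whole two-byte character like "ú".
class_bytes = re.compile("r[uūúǔù]b".encode("utf-8"))

# Pipe form over bytes: each alternative is a complete byte sequence.
alt_bytes = re.compile("r(u|ū|ú|ǔ|ù)b".encode("utf-8"))

missed = class_bytes.search(raw)  # fails: "ú" is two bytes, the class eats one
found = alt_bytes.search(raw)     # succeeds: matches the bytes of "rúb"

# "Extra garbage": the bracket form matches the first byte of "é",
# because "é" and "ú" share the lead byte 0xC3.
stray = re.search("[uūúǔù]".encode("utf-8"), "é".encode("utf-8"))

# For comparison, an engine that works on characters handles brackets fine.
class_text = re.search("r[uūúǔù]b", entry)

print(missed, found.group(0).decode("utf-8"), stray.group(0), class_text.group(0))
```

The pipe form is more to type, but it keeps each multi-byte character intact, which is why it is the safer choice when the engine works on bytes.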