If you ever find yourself stuck on how to pronounce English

It’s times like this I especially miss John DeFrancis. How he would have loved this! It’s partially an example of what he dubbed “Singlish” — not Singapore English but Sino-English, the tortured attempt to use Chinese characters to write English. He details this in “The Singlish Affair,” a shaggy dog story that serves as the introduction to his essential work: The Chinese Language: Fact and Fantasy. (And I really do mean essential. If you don’t have this book yet, buy it and read it.)

Here are some lyrics from a popular song, “Count on Me,” by Bruno Mars, with a Mandarin translation. The interesting part is that a Taiwanese third-grader has penciled in some phonetic guides for him or herself, using a combination of zhuyin fuhao (aka bopo mofo) (sometimes with tone marks!), English (as a gloss for English! and English pronunciation of some letters and numbers), and Chinese characters (albeit not always correctly written Chinese characters — not that I could do any better myself). Again, this is a Taiwanese third-grader and so is someone unlikely to know Hanyu Pinyin.

lyric sheet, as described in this post

“If you ever find yourself stuck”


If          ㄧˊㄈㄨˊ          yífú
you
ever        ㄟㄈㄦ            ei-f’er
find        5                 five
yourself    Uㄦㄒㄧㄦㄈㄨ      U’er xi’erfu
stuck       ㄙ打可            s-dake

“I’ll be the light to guide you.”


I’ll        ㄞㄦ              ài’er
be          ㄅㄧ              bi
the                           l[e]
light       賴特*             laite
to                            tu
guide                         gai
you         you               you

“Find out what we’re made of”


Find        ㄈㄞˋ             fài
out         ㄠㄊㄜ            ao-t’e
what        花得              huade
we’re       ㄨㄧㄚ            wi’a
made        妹的              meide
of          歐福              oufu

“When we are called to help our friends in need”


What when                     hua
we          ㄨㄧ              wi
are                           a
called                        kou
to                            tu
help        嘿ㄜㄆ            hei’e-p[e]
our         ㄠㄦ              ao’er
friends     ㄈㄨㄌㄣˇ的ㄙ     fulen-de-s
in                            ying
need        [?]               [?]

Pinyin sort order

The standard for alphabetically sorting Hanyu Pinyin is given in the ABC dictionary series edited by John DeFrancis and issued by the University of Hawaii Press.

Here’s the basic idea:

The ordering is primarily simply alphabetical. Diacritical marks, punctuation, juncture and capitalization are only taken into account when the strings being compared are otherwise identical. For example, píng’ān sorts before pīnyīn, because pingan sorts before pinyin, because g precedes y alphabetically.

Only when two strings are alphabetically identical is non-alphabetical information taken into account.
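
To make that primary comparison concrete, here is a minimal Python sketch. The helper name letters_only is hypothetical, not something from the ABC series. It strips tone marks, apostrophes, hyphens, spaces, and capitalization so that only the bare letters are compared. It also folds ü into u, which is a simplifying assumption on my part.

```python
import unicodedata


def letters_only(pinyin: str) -> str:
    """Return just the lowercase base letters of a Pinyin string."""
    # NFD splits each accented vowel into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", pinyin)
    # Keep letters only: combining marks, apostrophes, hyphens, and spaces drop out.
    return "".join(ch.lower() for ch in decomposed if ch.isalpha())


words = ["pīnyīn", "píng’ān"]
print(sorted(words, key=letters_only))
```

Sorting with this key puts píng’ān ahead of pīnyīn, since the bare letters "pingan" precede "pinyin".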

The series’ Reader’s Guide presents the specifics of the sort order. Since I don’t have to worry about how much space this takes up on my site, I have reformatted the information slightly to give the examples as numbered lists.

Head entry transcriptions with the same sequence of letters are ordered first strictly by letter sequence regardless of tones, then by initial syllable tone in the sequence 0 1 2 3 4. For entries with the same initial tone, arrangement is by the tone of the second syllable, again in the order 0 1 2 3 4. For example:

  • shīshi
  • shīshī
  • shīshí
  • shīshǐ
  • shīshì
  • shíshī
  • shíshì
  • shǐshī
  • shìshī

    Irrespective of tones, entries with the vowel u precede those with ü.
    For example:

    Entries without apostrophe precede those with apostrophe. For example:

    1. biàn argue
    2. bǐ’àn the other shore

    Lower-case entries precede upper-case entries. For example:

    1. hòujìn aftereffect
    2. Hòu Jìn Later Jin dynasty

    For entries with identical spelling, including tones, arrangement is by order of frequency….

    For most users, the most important thing to note is that the neutral tone is regarded as 0, not as 5. Thus, the order is not “ā á ǎ à a,” but “a ā á ǎ à.” And, because lowercase comes before uppercase, not “A a Ā ā Á á Ǎ ǎ À à” but “a A ā Ā á Á ǎ Ǎ à À.”

    One can see this in action in the A entries for the ABC English-Chinese, Chinese-English Dictionary. And here are some sample pages from an earlier ABC dictionary.
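
The rules above can be approximated with a single sort key. What follows is only a sketch in Python, not the code behind the ABC dictionaries. The name pinyin_sort_key is hypothetical, and two points are my own assumptions: tones are recorded per letter rather than per syllable (which gives the same result whenever two entries share letters and syllable breaks), and the u/ü check is ranked ahead of the apostrophe check. The final rule, ordering by frequency, needs outside data and is left out.

```python
import unicodedata

# Combining marks that NFD normalization produces for the four marked tones.
TONE_MARKS = {
    "\u0304": 1,  # macron
    "\u0301": 2,  # acute
    "\u030C": 3,  # caron
    "\u0300": 4,  # grave
}
DIAERESIS = "\u0308"
APOSTROPHES = {"'", "\u2019"}


def pinyin_sort_key(entry: str):
    """Build a sort key: letters, then u/ü, apostrophes, tones, and case."""
    letters, umlauts, tones, cases = [], [], [], []
    apostrophes = 0
    for ch in unicodedata.normalize("NFD", entry):
        if ch.isalpha():
            letters.append(ch.lower())
            umlauts.append(0)
            tones.append(0)  # 0 = neutral tone (no mark) until a mark follows
            cases.append(1 if ch.isupper() else 0)
        elif ch == DIAERESIS and letters:
            umlauts[-1] = 1  # the preceding u is really ü
        elif ch in TONE_MARKS and letters:
            tones[-1] = TONE_MARKS[ch]
        elif ch in APOSTROPHES:
            apostrophes += 1
        # Hyphens and spaces are ignored in this simplified version.
    return ("".join(letters), umlauts, apostrophes, tones, cases)


entries = ["shìshī", "shīshí", "shīshi", "shíshī", "shīshì"]
print(sorted(entries, key=pinyin_sort_key))
```

Because Python compares tuples left to right, the later elements only come into play when everything before them is identical, which mirrors the idea that non-alphabetical information matters only for otherwise identical strings.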

    The ABC series follows the example of the Hanyu Pinyin Cihui (汉语拼音词汇 / 漢語拼音詞彙 / Hànyǔ Pīnyīn Cíhuì) (example), with only one minor difference, as noted by Tom Bishop:

    HPC [Hanyu Pinyin Cihui] gave hyphens and spaces the same priority as apostrophes, so that lìgōng sorted before lǐ-gōng, in spite of the tones. Usage of hyphens and spaces in pinyin is still far from being fully standardized. (The same is true in English orthography.) Consequently, for collation it makes sense to give less weight to hyphens and spaces, and more weight to tones, thus sorting lǐ-gōng before lìgōng. In ABC, hyphens and spaces don’t affect the sort order unless they change the pronunciation in the same way that apostrophe would; for example, ¹míng-àn 明暗 and ²míng’àn 冥暗 are treated as homophones, and they sort after mǐngǎn 敏感.
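
One way to read Bishop's description: a hyphen or space counts for sorting only where an apostrophe would otherwise be required, which in Hanyu Pinyin means before a syllable beginning with a, o, or e. The Python sketch below encodes that reading. The function name juncture_count and the exact test it applies are my assumptions, not something taken from the ABC series or Wenlin.

```python
import unicodedata

APOSTROPHES = {"'", "\u2019"}
SOFT_BREAKS = {"-", " "}


def juncture_count(entry: str) -> int:
    """Count junctures that matter for sorting under the assumed rule."""
    # NFD puts the bare vowel directly after the break, ahead of its tone mark.
    text = unicodedata.normalize("NFD", entry)
    count = 0
    for i, ch in enumerate(text):
        if ch in APOSTROPHES:
            count += 1
        elif ch in SOFT_BREAKS and 0 < i < len(text) - 1:
            # A hyphen or space acts like an apostrophe only before a, e, or o.
            if text[i + 1].lower() in "aeo":
                count += 1
    return count


for word in ["míng-àn", "míng’àn", "mǐngǎn", "lǐ-gōng"]:
    print(word, juncture_count(word))
```

Under this rule míng-àn and míng’àn get the same count, while lǐ-gōng counts as having no juncture, so its position relative to lìgōng is left to the tones.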

    Wenlin releases major upgrade (4.0)

    Wenlin logo

    One of my favorite programs, Wenlin (which bills itself as “software for learning Chinese”), has just released a major upgrade for both Mac and Windows versions. This doesn’t happen often; it has been three-and-a-half years since the most recent big change was issued (Wenlin 3.4) and heaven only knows how long since 3.0 came out. So, yes, this release has many substantial improvements.

    One of the changes nearest and dearest to my heart is Wenlin 4.0’s greatly improved handling of Pinyin. I was among the field testers for the new version, so I’ve already spent a lot of time examining it. Here are a few important aspects:

    • Conversions from Chinese characters follow Hanyu Pinyin orthography much more closely than before. This is a major change for the better. (There’s still some room for improvement. But I don’t think we’ll have to wait years for this.)
    • In the past, using Wenlin to convert long texts in Chinese characters into Pinyin could be a real chore, with users having to examine example after example of Chinese characters with multiple pronunciations in order to select the proper pronunciation for that particular context. But now users may, if they so desire, tell Wenlin not to ask users for disambiguation input. Of course, that doesn’t mean that Wenlin will always guess right; but many users will be happy that this trade-off allows them to skip the frustration of, for example, having to tell the program over and over and over that, yes, in this case 說 is pronounced shuō rather than shuì.
    • Relative newcomers to Mandarin may appreciate that for common words tone sandhi is indicated in Wenlin with additional marks (a dot or line below the vowel). This feature can also be turned off, for those who want standard Pinyin.

    There are, of course, many improvements beyond the area of Pinyin. Here are a few:

    • One limitation of Wenlin 3.x was that its English dictionary wasn’t very large. But Wenlin 4.0 includes not only the ABC Chinese-English Comprehensive Dictionary but also the excellent new ABC English-Chinese, Chinese-English Dictionary (now finally in stock in the printed version).
    • The flashcards are now set up to handle not just individual characters but polysyllabic words.
    • There’s full Unicode Unihan 6.0 support for more than 75,000 Chinese characters.
    • And for those who think 75,000 just isn’t enough, users can now access Wenlin’s CDL technology. Through this, users can create new, variant, and rare characters; moreover, these can be published and shared with other Wenlin users or CDL-friendly devices.
    • Seal script versions of more than 11,000 characters are provided.
    • Wenlin contains an e-edition of the Shuowen Jiezi (Shuōwén Jiězì / 說文解字 / 说文解字).
    • Coders will be interested to know that Wenlin appears to be headed toward becoming open-source.
    • Both Mandarin and English entries are marked with grade levels, which aids learners by indicating relative frequency of use. The levels for Mandarin words are based on the Hanyu Shuiping Kaoshi (Hànyǔ Shuǐpíng Kǎoshì / 汉语水平考试 / 漢語水平考試 / HSK).

    The full boxed version (i.e., a CD with the program, likely packaged with a hard copy of the manual) is US$199, or US$179 if you download it from the Wenlin Web store. Upgrades from 3.x cost US$49.

    For more information, see the summary of features and outline of what’s new in Wenlin 4.0.

    screenshot from Wenlin 4.0

    ABC English-Chinese, Chinese-English Dictionary out soon

    front cover of the ABC English-Chinese, Chinese-English Dictionary

    The ABC Chinese-English Dictionary was published ten years ago. It was revolutionary in that, for the first time, a Mandarin-English dictionary was ordered entirely by the headwords’ pronunciation as written in pinyin. (Stroke and radical indexes are also there to aid finding a character when its shape is known but not its pronunciation.) Other dictionaries in the DeFrancis ABC series have followed. But up to now there has been no ABC dictionary with an English to Mandarin section as well as a Mandarin to English one.

    At the end of this month the University of Hawai`i Press is releasing the ABC English-Chinese, Chinese-English Dictionary. The new dictionary, which is 1,252 pages long, has 29,670 entries in its English-Mandarin section and 37,963 entries for Mandarin-English (total 67,633 entries). (The much larger ABC Chinese-English Comprehensive Dictionary has some 196,000 entries — all Mandarin-English).

    This is a big year for Mandarin-English dictionaries, with the forthcoming release of the ABC ECCE and the release three months ago of the massive Oxford Chinese Dictionary. From the standpoint of Pinyin, however, the Oxford dictionary is a disappointment. For example, the Oxford dictionary has no Pinyin in the English-Mandarin section, just Chinese characters; in some other places tone marks are missing from some of the Pinyin, where it appears at all. Perhaps this will be rectified in the online edition, which has yet to appear. At the moment, though, the Oxford looks like a fairly traditional dictionary — albeit a huge one — aimed mainly at English learners in China, which isn’t necessarily a bad thing if you happen to be among that very large group of people. For more on the Oxford, see the video at Danwei and the entries at Chinese Forums (with some images) and Language Log.

    Unlike the Oxford dictionary, the ABC ECCE offers both Pinyin and Chinese characters for all entries and sample sentences. (See samples below. Click on those for more extensive examples in PDF files.)

    From what I’ve seen so far of the ABC English-Chinese, Chinese-English Dictionary, I expect it to become the dictionary for English-speaking students of Mandarin. I’ll write more about this once I’m able to see a hard copy.

    The ABC English-Chinese, Chinese-English Dictionary retails for only US$20, compared to US$75 for the Oxford.

    From the Mandarin-English section. But don’t expect the text in the printed edition to be this large. I’ve enlarged the image to make it easier to read on the Web.
    examples of entries in the Mandarin-English section of the ABC English-Chinese, Chinese-English Dictionary

    From the English-Mandarin section:
    examples of entries in the English-Mandarin section of the ABC English-Chinese, Chinese-English Dictionary

    (ISBN-10: 0824834852; ISBN-13: 978-0824834852)

    Xin Tang 6

    cover of Xin Tang, no. 6

    My previous post linked to a new HTML version of Homographobia, an essay by John DeFrancis. The work was first published in November 1985, in the sixth issue of Xin Tang (New China).

    Xin Tang (Xīn Táng) is an especially interesting journal in that it is primarily in Mandarin written in romanization. A variety of romanization systems and methods are employed in its pages. Indeed, over the course of its run one can see many questions of systems and orthographies being worked out.

    I want to stress, though, that the journal does not restrict itself to material of interest only to romanization specialists. It also features poetry, illustrated stories, philosophy, letters to the editor, children’s material, and much more.

    English and a few Chinese characters are also found; and there are even articles in languages such as Turkish (with Mandarin and English translations).

    Most of what appears in English is also translated into Mandarin — romanized Mandarin, of course. So DeFrancis’s essay also appears, appropriately, in Pinyin:

    Homographobia is a disorder characterized by an irrational fear of ambiguity when individual lexical items which are now distinguished graphically lose their distinctive features and become identical if written phonemically. The seriousness of the disorder appears to be in direct proportion to the increase in number of items with identical spelling that phonemic rendering might bring about….

    Tongyinci-kongjuzheng shi yi zhong xinli shang d shichang, tezheng shi huluande haipa yong pinyin zhuanxie dangqing kao zixing fende hen qingchu d cir hui shiqu tamend bianbiexing. Kan qilai, zhei ge bing d yanzhongxing gen pinyin shuxie keneng zaocheng d tongxing pinshi shuliang d zengjia cheng zhengbi….

    All of the issue with the DeFrancis essay is now online: Xin Tang no. 6.

    illustration of a dragon reading a copy of Xin Tang, from an illustrated story
    Note the occasional employment of a tonal spelling (shuui).

    Homographobia

    Twenty-five years ago, John DeFrancis wrote a terrific essay on what he aptly dubbed homographobia (in Mandarin: tóngyīncí-kǒngjùzhèng). It’s a word that deserves wider currency, as the irrational fear he describes still affects a great many people.

    Homographobia is a disorder characterized by an irrational fear of ambiguity when individual lexical items which are now distinguished graphically lose their distinctive features and become identical if written phonemically. The seriousness of the disorder appears to be in direct proportion to the increase in number of items with identical spelling that phonemic rendering might bring about. The aberration may not exist at all among people favored by writing systems that are already closely phonemic, such as Spanish and German. It exists to a mild degree among readers of a poorly phonemic (actually morphophonemic) writing system such as English, some of whom suffer anxiety reactions at the thought of the confusion that might arise if, for example, rain, rein, and reign were all written as rane. It exists in its most virulent form among those exposed to Chinese characters, which, among all the writing systems ever created, are unique in their ability to convey meaning under extreme conditions of isolation.

    That the fear is a genuine phobia, that is an irrational fear, is attested to by the fact that it is confined only to those cases in which lexical items that are now distinguished in writing would lose their distinctiveness if written phonemically, as in the case of the three English homophones mentioned above. Quite irrationally, the fear is not provoked by lexical items which are not now distinguished in writing, even though the amount of already existing homography might be considerably greater than in projected cases, such as the mere three English words pronounced rane. The English graphic form can, for example, has at least ten different meanings which to a normal mind might appear as ten different words. But no one, either in or out of his right mind in such matters, suffers any anxiety from the problems which in theory should exist in such extensive homography.

    The uncritical acceptance of current written forms as an immutable given ignores the accidents in the history of writing that have resulted in current graphic differentiation for some homophones and not for others. Such methodological myopia cannot lead to any useful consideration of ambiguity….

    The complete essay is now online: Homographobia.