Popularity of Chinese character country code TLDs

Yesterday we looked at the popularity of the Chinese character TLD for Singapore Internet domains. Today we’re going to examine the Chinese character ccTLDs (country code top-level domains) for those places that use Chinese characters and compare the figures with those for the respective Roman alphabet TLDs.

In other words, how, for example, does the use of .台灣 domains compare with the use of .tw domains?

Since, unlike the case with Singapore, I don’t have the registration figures, I’m having to make do with Google hits, which is a different measure. For this purpose, Google is unfortunately a bit of a blunt instrument. But at least it should be a fairly evenhanded blunt instrument and will be useful in establishing baselines for later comparisons.

A few notes before we get started:

  • Japan has yet to bother with completing the process for its own name in kanji (Japan, as written in kanji / Chinese characters), so it is omitted here.
  • Macau only recently asked for .澳门 and .澳門, so those figures are still at zero.
  • Oddly enough, there’s no .臺灣 ccTLD, even though the Ma administration, which was in power when Taiwan’s ccTLDs went into effect, officially prefers the more complex form .臺灣 to .台灣, not to mention preferring it to .台湾.
TLD              Google Hits    Percent of Total
MACAU
.mo                 18400000              100.00
.澳门                      0                0.00
.澳門                      0                0.00
TAIWAN
.tw                206000000               99.86
.台湾                  67600                0.03
.臺灣                      0                0.00
.台灣                 230000                0.11
HONG KONG
.hk                193000000               99.94
.香港                 118000                0.06
SINGAPORE
.sg                 97800000              100.00
.新加坡                    2                0.00
CHINA
.cn                315000000               99.61
.中国                 973000                0.31
.中國                 251000                0.08

So in no instance does the Chinese character ccTLD reach even one half of one percent of the total for any given place.
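For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The hit counts are copied from the table above; the dictionary layout and the function names are my own illustrative choices, and it assumes that any TLD containing non-ASCII characters counts as a Chinese character ccTLD.

```python
# Google hit counts copied from the table above. The data layout and
# helper names are illustrative assumptions, not part of the original post.
HITS = {
    "Macau": {".mo": 18_400_000, ".澳门": 0, ".澳門": 0},
    "Taiwan": {".tw": 206_000_000, ".台湾": 67_600, ".臺灣": 0, ".台灣": 230_000},
    "Hong Kong": {".hk": 193_000_000, ".香港": 118_000},
    "Singapore": {".sg": 97_800_000, ".新加坡": 2},
    "China": {".cn": 315_000_000, ".中国": 973_000, ".中國": 251_000},
}


def percent_shares(hits):
    """Return each TLD's share of the place's total hits, in percent."""
    total = sum(hits.values())
    return {tld: round(100 * n / total, 2) for tld, n in hits.items()}


def hanzi_share(hits):
    """Return the combined percent share of all non-ASCII (Hanzi) TLDs."""
    total = sum(hits.values())
    hanzi = sum(n for tld, n in hits.items() if not tld.isascii())
    return 100 * hanzi / total


for place, hits in HITS.items():
    print(place, percent_shares(hits), f"Hanzi total: {hanzi_share(hits):.2f}%")
```

Even with the simplified and traditional forms added together, the combined Hanzi share stays under half a percent for every place in the table.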

Here are the results in a chart.

Graph showing that although China leads in domains in Chinese characters, they do not reach even one half of one percent of the total for China

Note that the ratios of simplified to traditional forms in China and Taiwan are roughly mirror images of each other, as is perhaps to be expected.
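The “mirror image” can be made concrete with a quick calculation. This is only a sketch using the hit counts from the table; the variable names are illustrative.

```python
# Hit counts from the table; variable names are illustrative.
china_simplified, china_traditional = 973_000, 251_000    # .中国, .中國
taiwan_simplified, taiwan_traditional = 67_600, 230_000   # .台湾, .台灣

# China: simplified hits per traditional hit.
china_ratio = china_simplified / china_traditional
# Taiwan: traditional hits per simplified hit (the inverse comparison).
taiwan_ratio = taiwan_traditional / taiwan_simplified

print(f"China  simplified:traditional = {china_ratio:.1f}:1")
print(f"Taiwan traditional:simplified = {taiwan_ratio:.1f}:1")
```

That works out to a bit under four to one in favor of simplified forms in China and a bit under three and a half to one in favor of traditional forms in Taiwan.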

See also Platform on Tai, Pinyin News, December 30, 2011

China down slightly as destination for U.S. study abroad students

Rapid growth in U.S. students going to China to study has not been seen since around 2008. In fact, in the most recent school year for which we have data (2012–2013), the total fell to 14,413, down slightly from the 14,887 U.S. students studying in China during the 2011–2012 school year.

Chart of U.S. study abroad students in China

Meanwhile, the number of students from China studying in the United States is back on the rise.

Note that the chart below shows not the absolute number of Chinese students in the United States but the ratio of Chinese students in the United States to U.S. students in China, just because I thought it might be more interesting. If you’d like to see the numbers for the former, check the source document.

Students from the People's Republic of China in the United States per U.S. student in China

China is the leading place of origin for students coming to the United States, with Chinese students comprising 31% of international students in the United States. They’re about evenly divided between undergrad and grad students.

Source: Open Doors Fact Sheet: China.

PRC’s official rules for Pinyin: 2012 revision — in traditional Chinese characters

Last week I put online China’s official rules for Hanyu Pinyin, the 2012 revision (GB/T 16159-2012). I’ve now made a traditional-Chinese-character version of those rules for Pinyin.

Eventually I’ll also issue versions in Pinyin and English.

Cover of GB/T 16159-2012, altered to show traditional Chinese characters

(Note: The image above is of course Photoshopped. I altered the cover of the PRC standard simply to provide an illustration in traditional Chinese characters for this post.)

PRC’s official rules for Pinyin: 2012 revision

In 2012 China revised its official guidelines for writing Pinyin.

These are the Hanyu Pinyin Zhengcifa Jiben Guize (official translation: “Basic Rules of the Chinese Phonetic Alphabet Orthography”), promulgated as GB/T 16159-2012.

Among the changes is that some alternate forms are now allowed; for example, “wo de” (my) may also be written as “wode”. I’m not thrilled about that, but I know some people will welcome it.

I’ve added a few notes, such as for errors in the original document.

So far I have made only a version in so-called simplified Chinese characters. But eventually I’ll add one in traditional Chinese characters and an English translation.

front cover of GB/T 16159-2012 Pinyin guidelines

Zhou Youguang on politics

The New York Times has just published a profile of Zhou Youguang, who is often called “the father of Pinyin” (though he modestly prefers to stress that others worked with him): A Chinese Voice of Dissent That Took Its Time.

This profile focuses not only on Zhou’s role in the creation of Hanyu Pinyin but also on his political views, about which he has become increasingly public.

About Mao, he said in an interview: “I deny he did any good.” About the 1989 Tiananmen Square massacre: “I am sure one day justice will be done.” About popular support for the Communist Party: “The people have no freedom to express themselves, so we cannot know.”

As for fostering creativity in the Communist system, Mr. Zhou had this to say, in a 2010 book of essays: “Inventions are flowers that grow out of the soil of freedom. Innovation and invention don’t grow out of the government’s orders.”

No sooner had the first batch of copies been printed than the book was banned in China.

Although the reporter’s assertion, following the PRC’s official figures, that “China all but stamp[ed] out illiteracy” is well wide of the mark, there is no denying Pinyin’s crucial role in this area. I recommend reading the whole article.

Zhou Youguang

Remembering Hu Shih: 1891-1962

black and white photo of the face of Hu Shih (胡適)

Hú Shì
17 December 1891 — 24 February 1962

Today, on the fiftieth anniversary of the death of Hu Shih (Hú Shì/胡適/胡适), I’d like to say a few things in his memory. This is, after all, someone I regard as a hero in many ways. I even keep a photo of him in my office.

The opening of the preface to a splendid new biography of Hu Shih covers the basics:

Hu Shi (1891–1962), “the Father of the Chinese Renaissance,” towered over China’s intellectual landscape in the first half of the twentieth century. Among other achievements, he is credited with having made everyday speech respectable as a medium of written communication. Groomed as a traditional scholar-bureaucrat in his father’s footsteps, he had already turned into an iconoclastic renegade by the time he left Shanghai at the age of eighteen to study in the United States. In John Dewey, whose approach to philosophy was to treat all doctrines as working hypotheses, Hu felt he found “the proper way to think.” He and his associates who studied with Dewey at Columbia University established the framework of China’s modern educational system. A dedicated humanist, social reformer and promoter of women rights, he was, at different periods of his life, president of Peking University, president of the Academia Sinica, and ambassador to Washington.

To return to the most important point, at least in terms of the focus of this site, it was he, more than anyone else, who helped break the stranglehold of Literary Sinitic (a.k.a. classical Chinese). The vernacular movement he spearheaded is of far greater significance and has had a much greater impact on Chinese culture and people’s lives than so-called character simplification. Yet it receives relatively little attention, perhaps because many do not understand — or do not want to admit — how very different Literary Sinitic is from modern standard Mandarin.

Hu Shih is also the one who, more than anyone else, popularized the use of modern punctuation in Chinese texts, such as through his book Zhōngguó Zhéxuéshǐ Dàgāng and his editions of earlier works. That alone should be enough to earn him the eternal gratitude of all who read texts written in Chinese characters.

There’s so much more to the man than this, though most of it falls outside the bounds of this site. So rather than go into it here I will just encourage people to read more by and about him.

Shortly after Hu Shih’s death his son wrote:

father passed away during a cocktail party in honor of the members of the Academia Sinica after the completion of the members’ meeting. He passed away without any pain, and from every one present at the party, I gathered that he died happy, for the last words he said was, “Let’s have some drinks!”

I lift my glass.

dàdǎn jiǎshè
xiǎoxīn qiúzhèng

Nǐ bùnéng zuò wǒ de shī,
zhèngrú wǒ bùnéng zuò nǐ de mèng.

—Hú Shì
from “Mèng yǔ Shī” (夢與詩)

New database of cross-strait differences in Mandarin goes online

Last week, on the same day President Ma Ying-jeou accepted the resignation of a minister who made some drunken lewd remarks at a wěiyá (year-end office party), Ma was joking to the media about blow jobs.

Classy.

screenshot from a video of a news story on this

But it was all for a good cause, of course. You see, the Mandarin expression chuī lǎba, when not referring to the literal playing of a trumpet, is usually taken in Taiwan to refer to a blow job. But in China, Ma explained, chuī lǎba means the same thing as the idiom pāi mǎpì (pat/kiss the horse’s ass — i.e., flatter). And now that we have the handy-dandy Zhōnghuá Yǔwén Zhīshikù (Chinese Language Database), which Ma was announcing, we can look up how Mandarin differs in Taiwan and China, and thus not get tripped up by such misunderstandings. Or at least that’s supposed to be the idea.

The database, which is the result of cross-strait cooperation, can be accessed via two sites: one in Taiwan, the other in China.

It’s clear that a lot of money has been spent on this. For example, many entries are accompanied by well-documented, precise explanations by distinguished lexicographers. Ha! Just kidding! Many entries are really accompanied by videos (some two hundred of them) of cutesy puppets gabbing about cross-strait differences in Mandarin expressions. But if there’s a video in there of the panda in the skirt explaining to the sheep in the vest that a useful skill for getting ahead in Chinese society is chuī lǎba, I haven’t found it yet. Will NMA take up the challenge?

Much of the site emphasizes not so much language as Chinese characters. For example, another expensively produced video feeds the ideographic myth by showing off obscure Hanzi, such as the one for chěng.

WARNING: The screenshot below links to a video that contains scenes with intense wawa-ing and thus may not be suitable for anyone who thinks it’s not really cute for grown women to try to sound like they’re only thwee-and-a-half years old.

cheng3

In a welcome bit of synchronicity, Victor Mair posted on Language Log earlier the same week on the unpredictability of Chinese character formation and pronunciation, briefly discussing just such patterns of duplication, triplication, etc.

Mair notes:

Most of these characters are of relatively low frequency and, except for a few of them, neither their meanings nor their pronunciations are known by persons of average literacy.

Many more such characters consisting of two, three, or four repetitions of the same character exist, and their sounds and meanings are in most cases equally or more opaque.

The Hanzi for chěng (which looks like 馬馬馬 run together as one character) in the video above is sufficiently obscure that it likely won’t be shown correctly in many browsers on most systems when written in real text: 𩧢. But never fear: It’s already in Unicode and so should be appearing one of these years in a massively bloated system font.

Further reinforcing the impression that the focus is on Chinese characters, Liú Zhàoxuán, who is the head of the association in charge of the project on the Taiwan side, equated traditional Chinese characters with Chinese culture itself and declared that getting the masses in China to recognize them is an important mission. (Liu really needs to read Lü Shuxiang’s “Comparing Chinese Characters and a Chinese Spelling Script — an evening conversation on the reform of Chinese characters.”)

Then he went on about how Chinese characters are a great system because, supposedly, they have a one-to-one correspondence with language that other scripts cannot match and people can know what they mean by looking at them (!) and that they therefore have a high degree of artistic quality (gāodù de yìshùxìng). Basically, the person in charge of this project seems to have a bad case of the Like Wow syndrome, which is not a reassuring trait for someone in charge of producing a dictionary.

The same cooperation that built the Web sites led to a new book, Liǎng’àn Měirì Yī Cí (《兩岸每日一詞》 / Roughly: Cross-Strait Term-a-Day Book), which was also touted at the press conference.

The book contains Hanyu Pinyin, as well as zhuyin fuhao. But, alas, the book makes the Pinyin look ugly and fails completely at the first rule of Pinyin: use word parsing. (In the online images from the book, such as the one below, all of the words are se pa ra ted in to syl la bles.)

The Web site also has ugly Pinyin, with the CSS file for the Taiwan site calling for Pinyin to be shown in SimSun, which is one of the fonts it’s better not to use for Pinyin. But the word parsing on the Web site is at least not always wrong. Here are a few examples.

  • “跑神兒” is given as pǎoshénr (good).
  • And apostrophes appear to be used correctly: e.g., fàn’ān (販安), chūn’ān (春安), and fēi’ān (飛安).
  • But “第二春” is run together as “dìèrchūn” (no hyphen) rather than being shown correctly as dì-èr chūn.
  • And “一個頭兩個大” is given as yíɡe tóu liǎnɡɡe dà (for Taiwan) and yīɡe tóu liǎnɡɡe dà (for China). But ge is supposed to be written separately. (The variation of tone for yi is in this case useful.)
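The apostrophe examples in the list above follow the usual Pinyin convention: within a word, a non-initial syllable beginning with a, o, or e is set off with an apostrophe. Here is a minimal Python sketch of that one convention. The function names are my own, the input is assumed to be a non-empty list of tone-marked syllables that belong to a single word, and the sketch does not attempt word parsing or the hyphen in forms like dì-èr.

```python
import unicodedata

APOSTROPHE = "\u2019"  # the curly apostrophe used in the examples above


def starts_with_a_o_e(syllable):
    """Check the first letter with its tone mark stripped (e.g. ā -> a)."""
    # NFD decomposition splits "ā" into "a" plus a combining macron,
    # so the first code point is the bare base letter.
    base = unicodedata.normalize("NFD", syllable)[0].lower()
    return base in "aoe"


def join_syllables(syllables):
    """Join the syllables of one word, adding apostrophes where needed."""
    word = syllables[0]
    for syllable in syllables[1:]:
        if starts_with_a_o_e(syllable):
            word += APOSTROPHE
        word += syllable
    return word


print(join_syllables(["fàn", "ān"]))     # syllable starts with a: apostrophe
print(join_syllables(["pǎo", "shénr"]))  # no apostrophe needed
```

The hard part, deciding which syllables belong together in a word in the first place, is exactly what the book gets wrong, and it is not something a few lines of code can settle.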

Still, my general impression from this is that we should not expect the forthcoming cross-strait dictionary to be very good.

Early instances of misunderstandings of biblical proportions

old-style Hanzi for 來

From time to time I come across references by the credulous to the supposed biblical roots of some Chinese characters. I was surprised to learn, however, that that manner of interpretation has been around for many years.

In his 1902 book China and the Chinese, Herbert A. Giles (of Wade-Giles fame) pointed out the flaw he had seen in some earlier work.

Even the early Jesuit Fathers of the seventeenth and eighteenth centuries, to whom we owe so much for pioneer work in the domain of Sinology, were not without occasional lapses of the kind, due no doubt to a laudable if excessive zeal. Finding the character 船, which is the common word for “a ship,” as indicated by 舟, the earlier picture-character for “boat” seen on the left-hand side, one ingenious Father proceeded to analyse it as follows: —

舟 “ship,” 八 “eight,” 口 “mouth” = eight mouths on a ship—“the Ark.”

But the right-hand portion is merely the phonetic of the character; it was originally 铅 “lead,” which gave the sound required; then the indicator “boat” was substituted for “metal.”

So with the word 禁 “to prohibit.” Because it could be analysed into two 木木 “trees” and 示 “a divine proclamation,” an allusion was discovered therein to the two trees and the proclamation of the Garden of Eden; whereas again the proper analysis is into indicator and phonetic.

Nor is such misplaced ingenuity confined to the Roman Catholic Church. In 1892 a Protestant missionary published and circulated broadcast what he said was “evidence in favour of the Gospels,” being nothing less than a prophecy of Christ’s coming hidden in the Chinese character 來 “to come.” He pointed out that this was composed of “a cross,” with two 人人 ‘men,’ one on each side, and a ‘greater man’ 人 in the middle.

That analysis is all very well for the character as it stands now; but before the Christian era this same character was written [old-style form of 來] and was a picture, not of men and of a cross, but of a sheaf of corn. It came to mean “come,” says the Chinese etymologist, “because corn comes from heaven.”

Even if not all of the character etymologies Giles cites are in keeping with modern scholarship, his principles here are correct.