OMG, another rabbit pun!

photo detailing ad described in the post, with cartoon rabbits and other cutesy animals outside a school

Another pun for the Year of the Rabbit

Back to Schoo!! [sic]

This is notable mainly for writing the word “tūrán” (“suddenly”) as “兔然” rather than properly as “突然”.

The key is that “兔” is the character used in writing “tùzi” (rabbit), as in the Year of the Rabbit. (I took the photo last month.) Thus, here we have a Mandarin-Mandarin pun rather than a Mandarin-English one like the one I posted earlier.

So the ad is basically saying, “Oh my god! A new year / school semester is suddenly upon us [so we’d better update our computers and get Microsoft 365].”

I wish they’d bothered to get “school” correct, though.

Turkey, Türkiye, and Chinese characters

Turkish flag

Victor Mair’s recent post at Language Log on Transcription vs. transliteration vs. translation in cartography brought to mind last year’s Turkey/Türkiye situation, which I meant to write about at the time but never did. Briefly, the Turkish government basically said, “We’d like the world to stop calling the country ‘Turkey’ and use ‘Türkiye’ instead.” (As far as I know, the government didn’t call for a revision of “Turkish.”)

A lot of countries agreed to go along with the switch. Last month the United States officially jumped on board as well — sort of. The U.S. State Department’s web page on this currently states, “The official conventional long-form and short-form names remain “Republic of Turkey” and “Turkey”, respectively. “Republic of Türkiye” should be used in formal and diplomatic contexts. The conventional names may be used in place of or alongside “Türkiye” in appropriate instances, including U.S. government cartographic products, as it is more widely understood by the American public.”

But, this being a site that focuses mainly on matters related to Mandarin, I’m more interested in what China and Taiwan did.

It turns out that both China and Taiwan agreed to adopt the form “Türkiye.” In practice, though, that relates mainly to those governments issuing releases in English. But what about the Mandarin name of the country, which has been “Tǔ’ěrqí” (written “土耳其” in Chinese characters)?

As Yin Binyong, who was the main force in the orthography of Hanyu Pinyin, noted in “Transliteration of Foreign Place Names and Personal Names”:

A small number of foreign names are translated into Putonghua according to meaning, or a combination of meaning and pronunciation; the great majority are transliterated, i.e. translated according to pronunciation.

(Following Mair, though, we should read “transcription” for “transliteration.” The language of the original publication was English, which is why the quote appears as such.)

“Tǔ’ěrqí” belongs to the third category; it is just a phonetic approximation of “Turkey.” (For those unfamiliar with Pinyin or Mandarin, Tu’erqi is pronounced very roughly like “to” + “her” (minus the h sound) + the “chee” in cheese.) Among Mandarin’s 410 or so syllable sounds (not counting tones), there is nothing much like key. But the ye in Türkiye would not be a problem for Mandarin speakers.

If the governments of China and Taiwan really wanted to show their respect for the change from Turkey to Türkiye, they could come up with new Mandarin names that would do a better job of matching the pronunciation of Türkiye than Tu’erqi. But they haven’t. Tu’erqi/土耳其 remains, and this is unlikely to change. Note, for example, how the Xinhua article listed below calls “Türkiye” not the name but the foreign-language name (waiwen) of Tu’erqi.

Of course, how much respect the government of Taiwan owes the government of Turkey — er, Türkiye, which has become somewhat cozy with the PRC, might be worth considering as well. But that’s heading off-topic.

Pinyin, US trademark law, and myths about Chinese characters

芝麻 vs. ZHIMA

The Mandarin word for “sesame” is zhīma (written “芝麻” in Chinese characters). That’s all the Mandarin anyone will need to know for this post. But if any of you non-Mandarin speakers are curious, an approximate pronunciation would be the je in jerk + ma (with the a as in father).

OK, let’s get into it now.

Everyone knows open sesame from the story of Ali Baba and the forty thieves, thought Jack Ma, when he was deciding upon a name for his new company. Alibaba Group Holding Limited is now one of China’s and indeed one of the world’s largest companies. So it’s no surprise that “open sesame” and just plain ol’ “sesame” are still very much associated with the company. And yet the company was acting as if this were not so, at least when it comes to Pinyin.

The U.S. Patent and Trademark Office’s Trademark Trial and Appeal Board recently issued a final ruling against a trademark application by Advanced New Technologies Co. (hereafter “Applicant”), which was acting on behalf of Alibaba. The mark applied for was “ZHIMA” (as such). The application (serial no. 86832288) was originally filed on November 25, 2015; Applicant requested reconsideration after earlier rejections.

The trademark office has a longstanding rule that trademark applications must, “if the mark includes non-English wording,” include “an English translation of that wording.” But Alibaba didn’t want to do that. The U.S. trademark board ruling lists some of the claims put forth by those arguing for Alibaba.

Applicant refused to submit the required statement for the following reasons:

  1. There are no Chinese characters (or other non-Latin characters) in Applicant’s Mark;
  2. A purported meaning of Chinese characters (or any nonLatin characters of even designs or stylizations) cannot be attached to a mark that does not contain such characters);
  3. Even if similar lettering is used as a transliteration of Chinese characters, Applicant’s Mark, ZHIMA – the only wording at issue – is not a transliteration of Chinese characters;
  4. Applicant’s Mark ZHIMA is not a translation of Chinese characters;
  5. Applicant’s Mark does not mean “sesame” in English;
  6. There is no logical or acceptable reason to ascribe the meaning of any Chinese characters to Applicant’s Mark. Applicant’s Latin-character Mark is a coined word with no translation in a foreign language or meaning which can be attributed.

Applicant concludes that ZHIMA is a coined term, not a foreign word; therefore, a translation/transliteration statement is not necessary.

Although I’m not a lawyer, I do know a thing or two about Pinyin, Chinese characters, and the difference between languages (e.g., Mandarin, English, Swahili, Hebrew) and scripts (the means of writing those languages, e.g., Chinese characters, the Roman alphabet, the Hebrew alphabet). So I feel confident in stating that Alibaba’s claims were risible.

The ruling also quotes the Applicant as claiming that “it is the Chinese characters which translate to ‘sesame’ and that ‘zhima’ is merely a transliteration/pronunciation of these Chinese characters.”

The ruling sums that up as follows: “In other words, according to Applicant the Chinese characters 芝麻 pronounced ZHIMA mean ‘sesame,’ but ‘Zhima’ itself has no meaning.” Elsewhere in the ruling there is this:

Applicant argues, in essence, that while the Chinese characters pronounced ZHIMA means “sesame,” ZHIMA, in and of itself, has no meaning. This is because “the Latin characters ‘zhima’ or ‘zhi ma’ merely represent the transliteration/sounds of particular Chinese characters that are not part of the mark as filed” (i.e., ZHIMA). Without the Chinese characters, ZHIMA has no meaning.

I believe most people would have no trouble laughing at the claim that zhima (the way to write in Pinyin the Mandarin word for sesame) has “no meaning” but is merely something coined by the company. Would anyone believe that this was just some sort of coincidence?

The authorities at the Patent and Trademark Office of course had no trouble finding plenty of examples of zhima being used as such to write the Mandarin word for sesame, including by Alibaba itself. And so the application for a U.S. trademark on “ZHIMA” as a coined word that was supposedly not Mandarin at all but merely something without meaning was rejected once and for all. Importantly, this decision sets a precedent, which should help stop such claims in the future.

Although I’m pleased that the correct decision was reached, I don’t think the decision was necessarily a foregone conclusion, however obviously absurd the claims of Alibaba were. The problem is that a lot of people — including many who really should know better — actually believe nonsense like Chinese characters are necessary to convey the meaning of Mandarin words. The truth is that Mandarin is a language, and Chinese characters and Hanyu Pinyin are scripts (means for writing that language). Chinese characters are not some sort of über language. And, by extension, no matter how many times such claims are repeated, even in what would normally be considered reputable sources, there is no such thing as an “ideographic language” or a “logographic language.”

Speech is primary, not secondary, to the existence of a living language. If by some sort of quirk in the universe every single Chinese character vanished from the face of the Earth, Mandarin would still exist, hundreds of millions of people would still be speaking it with one another, and the Mandarin word for sesame would still be zhima, regardless of how one might write it or what the lawyers for a huge company claim.

North Korea cracking down on wussy given names that don’t end in consonants

Korean consonants

North Korea is a scary, scary, scary place. Fortunately, at least for those of us not living in that People’s Paradise, every so often the country also provides important linguistic tips, which I am duty-bound to pass along to you.

For example, did you know that names without final consonants are “anti-socialist”? The wise authorities in North Korea have reportedly come to that conclusion and are presently dedicated to the task of cleansing that evil. Since October, “notices have been constantly issued at the neighborhood-watch unit’s residents’ meeting to correct all names without final consonants. People with names that don’t have a final consonant have until the end of the year to add political meanings to their name to meet revolutionary standards,” a resident of North Korea’s North Hamgyong province told Radio Free Asia.

In meetings and public notices, officials have gone so far as to instruct adults and children to change their names if they are deemed too soft or simple …, another source said….

The government has threatened to fine anyone who does not use names with political meanings, a resident in the northern province of Ryanggang told RFA on condition of anonymity to speak freely.

Naturally, it would be unwise to adopt any sort of name that reminds the government authorities of names in South Korea or elsewhere.

In the past, North Koreans were encouraged to give their children patriotic names that held some ideological or even militaristic meaning, such as Chung Sim (loyalty), Chong Il (gun), Pok Il (bomb) or Ui Song (satellite).

In recent years, though, as the country has become more open to the outside world, North Koreans have been naming their children gentler, more uplifting names that are easier to say, such as A Ri (loved one), So Ra (conch shell) and Su Mi (super beauty), sources inside the country say.

Instead of names that end on harder sounding consonants, children are being given names that end in softer vowels, which is more like names given to children in South Korea.

But recently, North Korean authorities are clamping down on this trend, requiring citizens with the softer names to change to more ideological ones, and even their children’s names, if they aren’t “revolutionary” enough, the sources say.

Grammar shrammar

The following is a guest post by Victor H. Mair.


How do we learn languages, after all? By following rules, whether hard-wired or learned? Or by acquiring and absorbing principles and patterns through massive amounts of repetition?

“AI is changing scientists’ understanding of language learning — and raising questions about innate grammar,” a stimulating new article by Morten Christiansen and Pablo Contreras Kallens that first appeared in The Conversation (10/19/2022) and later in Ars Technica and elsewhere, begins thus:

Unlike the carefully scripted dialogue found in most books and movies, the language of everyday interaction tends to be messy and incomplete, full of false starts, interruptions and people talking over each other. From casual conversations between friends, to bickering between siblings, to formal discussions in a boardroom, authentic conversation is chaotic. It seems miraculous that anyone can learn language at all given the haphazard nature of the linguistic experience.

I must say that I am in profound agreement with this scenario. In many university and college departments, which consist entirely of learned professors, you’d think that discussions and deliberations would be governed by regulations and rationality. Such, however, is not the case. Instead, people constantly talk over and past each other, barely listening to what their colleagues are saying. They interrupt one another and engage in aggressive behavior, or erupt in mindless laughter over who knows what. I’m not saying that all the members of these departments are like this nor that all departments are like this, but far too many do converse in this fashion. The individuals who are more sedate and civilized tend to remain silent for hours on end because, as the saying goes, they can’t get a word in edgewise. It’s a wonder that departments can accomplish anything.

For this reason, many language scientists – including Noam Chomsky, a founder of modern linguistics – believe that language learners require a kind of glue to rein in the unruly nature of everyday language. And that glue is grammar: a system of rules for generating grammatical sentences.

Everybody knows these things — or knew them decades ago — but now they are indubitably passé.

Children must have a grammar template wired into their brains to help them overcome the limitations of their language experience – or so the thinking goes.

This template, for example, might contain a “super-rule” that dictates how new pieces are added to existing phrases. Children then only need to learn whether their native language is one, like English, where the verb goes before the object (as in “I eat sushi”), or one like Japanese, where the verb goes after the object (in Japanese, the same sentence is structured as “I sushi eat”).

But new insights into language learning are coming from an unlikely source: artificial intelligence. A new breed of large AI language models can write newspaper articles, poetry and computer code and answer questions truthfully after being exposed to vast amounts of language input. And even more astonishingly, they all do it without the help of grammar.

Now, however, the authors make an astonishing claim. They assert that AI language models produce language that is grammatically correct, but they do so without a grammar!

Even if their choice of words is sometimes strange, nonsensical or contains racist, sexist and other harmful biases, one thing is very clear: the overwhelming majority of the output of these AI language models is grammatically correct. And yet, there are no grammar templates or rules hardwired into them – they rely on linguistic experience alone, messy as it may be.

GPT-3, arguably the most well-known of these models, is a gigantic deep-learning neural network with 175 billion parameters. It was trained to predict the next word in a sentence given what came before across hundreds of billions of words from the internet, books and Wikipedia. When it made a wrong prediction, its parameters were adjusted using an automatic learning algorithm.

Remarkably, GPT-3 can generate believable text reacting to prompts such as “A summary of the last ‘Fast and Furious’ movie is…” or “Write a poem in the style of Emily Dickinson.” Moreover, GPT-3 can respond to SAT level analogies, reading comprehension questions and even solve simple arithmetic problems – all from learning how to predict the next word.

The authors delve more deeply into comparisons of AI models and human brains, not without raising some significant problems:

A possible concern is that these new AI language models are fed a lot of input: GPT-3 was trained on linguistic experience equivalent to 20,000 human years. But a preliminary study that has not yet been peer-reviewed found that GPT-2 [a “little brother” of GPT-3] can still model human next-word predictions and brain activations even when trained on just 100 million words. That’s well within the amount of linguistic input that an average child might hear during the first 10 years of life.

In conclusion, Christiansen and Kallens call for a rethinking of language learning:

“Children should be seen, not heard” goes the old saying, but the latest AI language models suggest that nothing could be further from the truth. Instead, children need to be engaged in the back-and-forth of conversation as much as possible to help them develop their language skills. Linguistic experience – not grammar – is key to becoming a competent language user.

By all means, talk at the table, but respectfully, and not too loudly.

Atomic Enema Gwoyeu Romatzyh

box for a product with the English name of Atomic Enema

I know what you’re thinking: “Man, look at the weird romanization in that address!” ;-)

Say what you will against the Gwoyeu Romatzyh romanization system for Mandarin (or “GR” for short) — its quirkiness, its unnecessary complications, its counter-intuitiveness for those who don’t know its rules (much more so than with Hanyu Pinyin). But at least in the few instances where it’s still seen in the wild, it’s usually spelled correctly.

That’s not the case here.

The address for the manufacturer, the Health Chemical Pharmaceutical Co., Ltd., is given as

No.12, Yeou-4th Rd., Ta-Chia Yowshy Ind. Dist.

  • yeou = Hanyu Pinyin yǒu — misspelled GR (should be “yow,” which is “yòu” in HP); this is all the more strange given that the company gets “yow” correct elsewhere in the same line
  • ta = HP dà — essentially correct Wade-Giles (not GR)
  • chia = HP jiǎ — essentially correct Wade-Giles (not GR)
  • yow = HP yòu — correct GR
  • shy = HP shī — correct GR

This is definitely misspelled Gwoyeu Romatzyh rather than a different system (such as MPS2, which is often seen in the boondocks of Taiwan).

And the city name is given as “Taichung,” which is bastardized Wade-Giles (for what would be spelled “Taizhong” in Hanyu Pinyin). But since that is the standard spelling in Taiwan, one can’t blame the company for this.

And at least the company didn’t get “4th” wrong, which is more than can be said for the Taichung City Government, as shown by a sign near the factory. (From Google Street View.)

The source of the other misspellings will likely remain enema-migmatic.

Street sign reading 'You 4rd Rd.'

Big Pinyin on Chengdu Storefronts

Fan Yiying and Gu Peng have posted a story at Sixth Tone that is both surprising and not surprising at all: State Media Criticizes Chengdu Shop Signs in Romanized Chinese.

The main points I’d like to make about this are:

  • Word-parsing matters.
  • Hundreds of millions of people in China use Hanyu Pinyin on a daily basis but still do not know how Pinyin is meant to work as an orthographic system.
  • The government of China, though it needs Pinyin, is in many ways hostile to it.
  • The fonts available for writing the Roman alphabet (and thus Pinyin) far outnumber those for writing Chinese characters, so there is nothing in the least artistically limiting about Pinyin per se. (Whether Chinese characters are intrinsically more beautiful than the Roman alphabet is another matter.)

Here are some screenshots from the video mentioned in the article. Note: This isn’t the loveliest voice ever….

Sorry about the triangles on the photos, which make the shots look like videos. I wasn’t good at capturing screenshots without pausing the video, which made the triangles appear.

signs reading DIAN XIAN DIAN LAN, etc.

signs reading HONG DA TU WEN and MIAN DAO



ER LIANG WAN ZA MIAN sign in Chinese characters