Emoji, language, and translation

A couple of days ago the New York Times ran a small piece, “How Emojis Find Their Way to Phones.” It contains the sort of nonsense about Chinese characters and language that often sets me off.

Fortunately, Victor Mair quickly posted something on this. J. Marshall Unger (Ideogram: Chinese Characters and the Myth of Disembodied Meaning, The Fifth Generation Fallacy, and Literacy and Script Reform in Occupation Japan) and S. Robert Ramsey (The Languages of China) soon followed. But since these responses are in the comments to a Language Log post and thus may not be seen as much as they should be, I thought I’d link to them here.

The Language Log post itself is on Emoji Dick, which is billed as a translation of Moby Dick into emoji. As long as I’m writing, I might as well offer up a sample for you. See if you can determine the original English.


Did you try “Call me Ishmael”? Sorry. That’s not it. But if you guessed that I would choose the passage from Moby Dick that mentions Taiwan, give yourself bonus points.

Here’s what the above emoji supposedly translate:

Hereby the casks are sought to be kept damply tight; while by the changed character of the withdrawn water, the mariners readily detect any serious leakage in the precious cargo.

Now, from the South and West the Pequod was drawing nigh to Formosa and the Bashee Isles, between which lies one of the tropical outlets from the China waters into the Pacific.

Ah, of course. It’s all so clear now.

The next time you hear someone use “pictorial language,” “ideographs,” or the like in all seriousness, perhaps ask them for their own English translation of the above string of images.

Actually, Emoji Dick screwed this up some, as part belongs to the main text and part to a footnote.


Pinyin font: Noto

I shouldn’t go too long without mentioning Google’s ambitious Noto project, which offers both serif and sans-serif versions: Noto Serif and Noto Sans.

When text is rendered by a computer, sometimes there will be characters in the text that can not be displayed, because no font that supports them is available to the computer. When this occurs, small boxes are shown to represent the characters. We call those small boxes “tofu,” and we want to remove tofu from the Web. This is how the Noto font families got their name.

Noto helps to make the web more beautiful across platforms for all languages. Currently, Noto covers over 30 scripts, and will cover all of Unicode in the future. This is the Sans Latin, Greek and Cyrillic family. It has Regular, Bold, Italic and Bold Italic styles and is hinted. It is derived from Droid, and like Droid it has a serif sister family, Noto Serif.

Noto fonts for many other languages are available as web fonts from the Google Web Fonts Early Access page.

Noto fonts are intended to be visually harmonious across multiple languages, with compatible heights and stroke thicknesses.

(Emphasis added.)

And it’s free, of course.
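For anyone wondering how tofu comes about in practice, here is a minimal sketch in Python. It assumes you already have the set of code points a font covers (in a real font that information lives in its character-to-glyph mapping table); the function name find_tofu and the Latin-only coverage set are made up for illustration and are not part of Noto or any Google tool.

```python
def find_tofu(text: str, supported: set[int]) -> list[str]:
    """Return the distinct characters in `text` that fall outside `supported`,
    in order of first appearance. Whitespace is ignored."""
    missing: list[str] = []
    for ch in text:
        if ch.isspace():
            continue
        if ord(ch) not in supported and ch not in missing:
            missing.append(ch)
    return missing


# Hypothetical coverage: printable Basic Latin only (U+0020 to U+007E).
latin_only = set(range(0x20, 0x7F))

# "Dǐng Dōng Bus", written with escapes so the precomposed letters are unambiguous.
sample = "D\u01d0ng D\u014dng Bus"
print(find_tofu(sample, latin_only))  # ['ǐ', 'ō']
```

A font limited to Basic Latin would show boxes for the two tone-marked vowels, which is exactly the sort of gap a Pinyin-friendly font needs to fill.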



Diing Dong

A doubled vowel is a sure sign of the Gwoyeu Romatzyh romanization system — except when it’s a sign of someone wrongly omitting an apostrophe in Hanyu Pinyin or simply making a typo. But today’s example is certainly Gwoyeu Romatzyh, as, oddly enough, the side of a coach bus is one of the most likely places in Taiwan to spot an example of that romanization system. I’m seeing it less and less as the years go by, though, which saddens me.

Here, however, is a nice example that looks fairly new. I took the photo along Taidong’s lovely coastline a couple of weeks ago.

Diing Dong Bus (Pinyin: Ding3 Dong1; lit. ancient three-legged round cauldron, east)

Note, too, the mixing of Mandarin and English (“Bus” rather than the loanword bashi), and those hideously misplaced g’s.

photo of a coach bus, with 'Diing Dong Bus' in large letters on the side, with the bottom of the descenders on the g's sitting on the baseline
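Since I write tones with numbers here (Ding3 Dong1), a rough sketch of how tone numbers map to tone marks may be useful. It follows the usual placement rule: the mark goes on a or e if the syllable has one, on the o in ou, and otherwise on the last vowel. The function names are my own, and the sketch assumes each syllable is a run of letters followed directly by a digit from 1 to 5; it does nothing about apostrophes or about v used as a stand-in for ü.

```python
import re

# Tone-marked forms of each vowel, indexed by tone number minus one.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def mark_syllable(syllable: str, tone: int) -> str:
    """Put the tone mark for `tone` (1-4) on one syllable; 5 means neutral."""
    if tone not in (1, 2, 3, 4):
        return syllable  # the neutral tone carries no mark
    lower = syllable.lower()
    if "a" in lower:
        idx = lower.index("a")
    elif "e" in lower:
        idx = lower.index("e")
    elif "ou" in lower:
        idx = lower.index("o")
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in TONE_MARKS]
        if not vowels:
            return syllable  # nothing to mark
        idx = vowels[-1]
    marked = TONE_MARKS[lower[idx]][tone - 1]
    if syllable[idx].isupper():
        marked = marked.upper()
    return syllable[:idx] + marked + syllable[idx + 1:]


def numbered_to_marked(text: str) -> str:
    """Convert every letters-plus-digit syllable in `text` to tone-marked form."""
    return re.sub(
        r"([A-Za-zÜü]+)([1-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )


print(numbered_to_marked("Ding3 Dong1"))  # Dǐng Dōng
```

Syllables written without a digit pass through unchanged, so mixed text such as “cui4 de” comes out as “cuì de”.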

Milk Shop

Here’s another in my series of photos of English with Chinese character(istic)s, that is, Chinese characters being used to write English (sort of). I want to stress that these aren’t loanwords, just approximate phonetic renderings of the English.

Today’s entry — which was taken a few weeks ago in Xinzhu (usually spelled “Hsinchu”), Taiwan — is Mi2ke4 Xia4 (lit. “lost guest summer”).

sign for a drinks store, labeled 'milk shop' in English and 'mi ke xia' in Chinese characters

PRC’s official rules for Pinyin: 2012 revision — in traditional Chinese characters

Last week I put online China’s official rules for Hanyu Pinyin, the 2012 revision (GB/T 16159-2012). I’ve now made a traditional-Chinese-character version of those rules for Pinyin.

Eventually I’ll also issue versions in Pinyin and English.

(Note: The image above is of course Photoshopped. I altered the cover of the PRC standard simply to provide an illustration in traditional Chinese characters for this post.)


I tend to think of Hanzi being used to write English words as “Singlish,” after John DeFrancis’s classic spoof, “The Singlish Affair,” which is the opening chapter of his essential book The Chinese Language: Fact and Fantasy. But these days the word is mainly used for Singaporean English. So now I usually go with something like “English with Chinese character(istic)s.”

For a few earlier examples, see my photos of the dog and the butterfly businesses.

Today’s example is “Crunchy,” written as ke3 lang3 qi2 (can bright strange). Kelangqi, however, isn’t how to say “crunchy” in Mandarin (cui4 de is); it’s just an attempt to render the English word using Chinese characters, probably in an attempt to look different and cool.

Sign advertising a store named 'Crunchy' in English and 'ke lang qi' (in Chinese characters) in Mandarin

Crunchy, which is now out of business, was just a block away from the Dog (dou4 ge2) store, which is still around.