Adso now available for download

David Lancashire’s wonderful Adso — which I tend to use primarily for conversions into Pinyin (under Style, select Pinyin) but which can handle much, much more — is now available for download as a Unix binary. A Windows version is expected soon.

This is fully featured non-crippleware and should run on most modern Linux distributions. To my knowledge, it is also the first reasonably functional and freely downloadable machine translation and NLP engine in the world.

If I were even half the programmer I ought to be, I’d snap this up in an instant.

Do Chinese characters save paper?

A common claim about Chinese characters (Hanzi) is that they take less space than alphabetic systems and so using them “saves paper.” After all, there aren’t spaces between words when writing in Chinese characters, and Chinese characters handle entire syllables rather than having to spell them out letter by letter. So this claim would seem to be self-evident. But things don’t always work out as expected.

cover of 'Did Adam and Eve Have Navels?' by Martin Gardner
cover of the Mandarin translation of 'Did Adam and Eve Have Navels?': 愛迪生,你被騙了!:你必須打破的27個科學迷思

A few weeks ago I was browsing the shelves of the enormous, wonderful Eslite bookstore near Taipei City Hall. (Nobody seems quite sure how the so-called English name of this chain is supposed to be pronounced, so many foreigners here prefer the Mandarin name: Chéngpǐn (誠品).) In many of the store’s sections, English-language originals and their translations into Mandarin are shelved right next to each other. So, after looking at a science book in English I pulled out the Mandarin Chinese translation of the same work and browsed through it. While I was doing so, I noticed something unexpected: the Mandarin version was longer than the English-language original.

This sparked my interest, so I pulled out some more paired titles, more or less at random, off the shelves for the purpose of comparison.

I did my best to keep the comparisons fair. In almost all of the cases I compared pairs of trade paperbacks: standard trade paperbacks in English with standard trade paperbacks in Mandarin.

Also, I didn’t count the pages taken up by indexes, since none of the translations into Mandarin had indexes. (Alphabets win hands down over Chinese characters when it comes to creating and using indexes, and I saw no reason to penalize the English books for this by counting pages that the ones in Chinese characters didn’t have the equivalent of.)

In addition, I avoided old books, since I wanted to be fairly sure the Mandarin Chinese translations were from the same English text as I was looking at. (I do, however, have one book written in German and translated into English. I didn’t check to see if the Mandarin version was done from the German original or the English translation.)

Of course, comparing across scripts and languages is certainly not the same as comparing simply across scripts (Hanzi vs. Hanyu Pinyin); but one does what one can.

Later, while supplementing my survey at the Eslite bookstore on Dunhua South Road, I noticed an error in my original method: I had forgotten to check where in the book page 1 fell. Many (but not all) English-language books mark the first page of the first chapter as page 1; many (but not all) books printed in Taiwan, however, include the front matter in their pagination, which leads to the first page of the first chapter being page 10 or so. So to help compensate for my oversight, it might be fair to subtract 10 pages from the Mandarin versions of those titles below followed by an asterisk. (The ones without an asterisk are those I examined most recently, and more carefully.)

Here are the results of my admittedly brief and unscientific survey:

Chronicles, Vol. 1, by Bob Dylan
English: 291 pp.
Mandarin in Hanzi: 295 pp.

Collapse, by Jared Diamond
English: 560 pp.
Mandarin in Hanzi: 609 pp.

The Death of Vishnu, by Manil Suri
English: 283 pp.
Mandarin in Hanzi: 287 pp.

Deep Simplicity: Bringing Order to Chaos and Complexity*, by John Gribbin
English: 235 pp.
Mandarin in Hanzi: 255 pp.

Did Adam and Eve Have Navels?: Debunking Pseudoscience*, by Martin Gardner
English: 310 pp.
Mandarin in Hanzi: 367 pp.

The Elegant Universe*, by Brian Greene
English: 428 pp.
Mandarin in Hanzi: 463 pp.

The Enigma of Arrival, by V.S. Naipaul
English: 350 pp.
Mandarin in Hanzi: 422 pp.

Harry Potter and the Half-Blood Prince, by J.K. Rowling
English: 607 pp. (hardback)
Mandarin in Hanzi: 716 pp.

Laboratory Earth*, by Stephen H. Schneider
English: 169 pp.
Mandarin in Hanzi: 227 pp.

The Long Tail, by Chris Anderson
English: 226 pp. (hardback, slightly larger than the Mandarin trade paperback)
Mandarin in Hanzi: 313 pp. (written left to right)

Perfume*, by Patrick Süskind
English: 255 pp. (translation from German)
Mandarin in Hanzi: 278 pp.

Tough Choices, by Carly Fiorina
English: 309 pp.
Mandarin in Hanzi: 341 pp.

Vernon God Little, by D.B.C. Pierre
English: 275 pp. (mass market paperback)
Mandarin in Hanzi: 325 pp.

In every instance, the books in Chinese characters are longer than those in English. Moreover, the pages in the Mandarin-language trade paperbacks are somewhat larger than those in the English-language trade paperbacks. So that’s even more paper consumed by the books written in Chinese characters.
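For anyone who wants to play with the numbers, here is a rough Python sketch of the arithmetic. The page counts are copied from the list above; the 10-page deduction for asterisked titles is the approximate correction described earlier, not a measured figure, and the function names are just illustrative.

```python
# Page counts copied from the survey above:
# (title, English pp., Mandarin pp., marked with an asterisk?)
SURVEY = [
    ("Chronicles, Vol. 1", 291, 295, False),
    ("Collapse", 560, 609, False),
    ("The Death of Vishnu", 283, 287, False),
    ("Deep Simplicity", 235, 255, True),
    ("Did Adam and Eve Have Navels?", 310, 367, True),
    ("The Elegant Universe", 428, 463, True),
    ("The Enigma of Arrival", 350, 422, False),
    ("Harry Potter and the Half-Blood Prince", 607, 716, False),
    ("Laboratory Earth", 169, 227, True),
    ("The Long Tail", 226, 313, False),
    ("Perfume", 255, 278, True),
    ("Tough Choices", 309, 341, False),
    ("Vernon God Little", 275, 325, False),
]

# Assumption: a flat 10 pages of front matter for asterisked titles,
# per the rough correction suggested in the text.
FRONT_MATTER_PAGES = 10


def adjusted_mandarin_pages(mandarin, starred):
    """Subtract the front-matter estimate from asterisked titles only."""
    return mandarin - FRONT_MATTER_PAGES if starred else mandarin


def percent_longer(english, mandarin):
    """How much longer the Mandarin edition is, as a percentage of the English."""
    return 100.0 * (mandarin - english) / english


def summarize(rows):
    """Return (title, English pp., adjusted Mandarin pp., percent longer) per row."""
    results = []
    for title, english, mandarin, starred in rows:
        adjusted = adjusted_mandarin_pages(mandarin, starred)
        results.append((title, english, adjusted, percent_longer(english, adjusted)))
    return results


if __name__ == "__main__":
    for title, english, adjusted, pct in summarize(SURVEY):
        print(f"{title}: {english} vs. {adjusted} pp. ({pct:+.1f}%)")
```

Even with the 10-page deduction applied, every Mandarin edition in the list still comes out longer than its English counterpart. Note that this compares page counts only; it does not account for the larger page size of the Mandarin trade paperbacks.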

Although I certainly do not believe that all pairs of books in English and Mandarin translation follow this pattern, a pattern this very much appears to be.

My guess would be that books printed in China would have fewer pages than those printed in Taiwan. (Anyone want to check some of the above titles? Or does anyone have pairs of other titles in unexpurgated editions?) In general, books in China simply aren’t designed and printed with the same degrees of competency, attention, and concern for the reader as books in Taiwan — not to mention books in the United States and Britain. (Or have things changed very much in this regard since I lived in China?) So, among other factors, the characters tend to be smaller, along with the leading and the margins.

And then there’s the fact that translations in China sometimes omit sentences or entire sections, especially if they are deemed “sensitive.” (I doubt, however, that the books I examined suffered from Beijing’s censors.)

Also, China’s left-to-right format might have an advantage over Taiwan’s predominant top-to-bottom style in terms of space.

rice pizza = ‘mizza’

advertising photo of Pizza Hut's rice pizza; the copy reads '米zza 超ㄏㄤ美味新鮮fun'

Something written with three different scripts (Chinese characters, zhuyin, and the roman alphabet) is very much the sort of thing that attracts my attention, as is a product that mixes scripts in its name. So this ad for a new product from Taiwan’s Pizza Hut definitely caught my eye, though it did not inspire me to actually taste the item being touted, which is a rice pizza. (Generally, I do not care for pizzas with Taiwanese characteristics, such as those with peas, corn, or squid. For that matter, I don’t even like pineapple on pizza.)

The name for this rice pizza, “米zza” (mǐzza), is a portmanteau, using two different languages and two different scripts, no less. 米 is the Chinese character for mǐ, which is used mainly in rice- and other grain-associated words. The second part of the word comes, of course, from “pizza.”

Let’s move on to the slogan:

米zza 超ㄏㄤ美味 新鮮fun

In romanization, this is

mǐzza: chāo hāng měiwèi — xīnxiān fun

Here we have Chinese characters (米, 超, 美味, 新鮮), zhuyin (ㄏㄤ), and the Roman alphabet (zza, fun). Three scripts in just one line! (Yes, yes, I know that a line in written Japanese will often have just as many scripts, if not more; but this is Mandarin.)
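The three-way split can also be checked mechanically. Below is a small Python sketch that sorts each character of the slogan into a script by looking at its Unicode character name. Matching on name prefixes is a shortcut I am assuming is good enough for this one line; it is not a general-purpose script detector, and the labels are my own.

```python
import unicodedata


def script_of(ch):
    """Rough script label for one character, based on its Unicode name."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "Hanzi"
    if name.startswith("BOPOMOFO"):
        return "zhuyin"
    if name.startswith("LATIN"):
        return "Roman"
    return None  # spaces, punctuation, and anything else


def split_by_script(text):
    """Group consecutive characters of the same script into runs.

    Spaces and other unlabelled characters end the current run.
    """
    runs = []
    previous = None
    for ch in text:
        label = script_of(ch)
        if label is None:
            previous = None
            continue
        if label == previous:
            runs[-1] = (label, runs[-1][1] + ch)
        else:
            runs.append((label, ch))
        previous = label
    return runs


if __name__ == "__main__":
    slogan = "米zza 超ㄏㄤ美味 新鮮fun"
    for label, run in split_by_script(slogan):
        print(f"{label}: {run}")
```

Zhuyin is encoded in Unicode under the name Bopomofo, which is why the check looks for that word.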

The zhuyin, ㄏㄤ, represent hāng, a new slang word that, according to several people I have asked, has appeared within the last five years at most. It means “hot” in the sense of “extremely popular right now.”

Also, there’s a possibility that the English word “fun” is meant to echo the Mandarin fàn (飯 / 饭 / “rice”). Such puns across languages are not uncommon here, especially in local Internet slang.

So, the whole slogan might be translated as “Rice pizza: the super-‘hot’ delicious food; fresh, new fun.” Sorry, that’s not a very good translation; it works better in Mandarin.

I predict such portmanteaux and mixing will be increasingly common here in Taiwan, where code switching is a way of life for many people. “Mǐzza” could be the wave of the future — just not the culinary future, I hope.

source: Taiwan Pizza Hut menu page, accessed January 30, 2007

Gaoxiong receives funding to upgrade the city’s English

The government of Gaoxiong (Kaohsiung) has recently secured funding from the Executive Yuan to

  • waste on so-called translation agencies that wouldn’t know real English if it bit them on the ass,
  • print up some signs on which the English is so small as to be almost unusable,
  • put up even more signs in a romanization system few people know but many think is ridiculous at best,
  • um, create an “English-friendly environment” in advance of the World Games, which will be held in the city in 2009.

The stories didn’t mention how much money will be involved in this. The project will be headed by the recently promoted Xǔ Lì-míng (許立明 / Xu Liming / Hsu Li-ming).

Let’s all hope the city does a much better job than is to be expected from past experience throughout Taiwan.


Shanghainese are overusing English, says PRC academic

From the China Daily a few months ago.

A linguistics expert has claimed Shanghainese are overusing the English language.

“It’s a blind worship of the English language,” said Pan Wenguo, dean of Chinese as a Foreign Language School at East China Normal University, at a conference held Monday to commemorate the 20th anniversary of promoting Putonghua, or Mandarin.

He added the business sector was particularly responsible for the trend, claiming many people used English “more for following others blindly than for practical needs.”

Pan said up to one-third of Chinese are studying or have studied English, while the number of English learners in Shanghai is even higher.

“English is not bad in itself, but the present mania of learning English is really too much,” said Pan.

Last Sunday, more than 50,000 Shanghai locals sat the English Interpreter Test of middle to high levels, an increase of 20 per cent on last year.

The time set aside for English learning has been on the rise for students at various levels….

In the increasingly competitive job market, the English Certificate has become one of the most important qualifications employees look for, ranking only behind diplomas.

Many employers, especially in the business sector, tend to hire only people with good English communication abilities….

source: Linguist criticizes ‘blind worship’ of English, China Daily, September 23, 2006

Taiwan license plates and English

Taipei City councilors holding up signs resembling license plates with funny English: PIG-456 and EGG-008

It seems that ridding Taiwan license plates of the dreaded number 4 wasn’t enough. A Taipei city councilor, Tim Chang (Cháng Zhōngtiān / 常中天) of the New Party, suggested last year that “drivers are making an ass of themselves” if they drive around with license plates that spell out something that is insulting, ill-omened, or funny in English. He called for such unfortunate combinations to be filtered out in advance and for motorists to be allowed to change their plate numbers.

As the Taipei Times article on this notes, “License plates in Taiwan are made up of two alphabetic letters and four digits for cars, while license plates on scooters have three letters and three digits.”

People in Taiwan can change to another random plate number for NT$1,250 (approx. US$38), while personalized license plates cost at least NT$3,000.
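The sort of advance filtering the councilor called for is simple enough to sketch. What follows is a hypothetical Python illustration, not anything the motor vehicle authorities are known to use. The two plate patterns follow the Taipei Times description quoted above; the optional hyphen and the tiny blocklist (taken from the signs in the photo and the article's headline) are my own assumptions.

```python
import re

# Plate formats as described in the Taipei Times quotation:
# cars have two letters and four digits; scooters have three letters and three digits.
# Assumption: an optional hyphen between letters and digits, as on the councilors' signs.
CAR_PLATE = re.compile(r"^[A-Z]{2}-?\d{4}$")
SCOOTER_PLATE = re.compile(r"^[A-Z]{3}-?\d{3}$")

# Hypothetical blocklist, using only examples that appear in this post.
BLOCKED_WORDS = {"ASS", "PIG", "EGG"}


def plate_kind(plate):
    """Return "car", "scooter", or None if the plate fits neither pattern."""
    plate = plate.upper()
    if CAR_PLATE.match(plate):
        return "car"
    if SCOOTER_PLATE.match(plate):
        return "scooter"
    return None


def is_unfortunate(plate, blocked=BLOCKED_WORDS):
    """True if the letters on the plate spell a blocked word."""
    letters = "".join(ch for ch in plate.upper() if ch.isalpha())
    return letters in blocked


if __name__ == "__main__":
    for plate in ["PIG-456", "EGG-008", "AB-1234", "XYZ-123"]:
        verdict = "filter out" if is_unfortunate(plate) else "ok"
        print(plate, plate_kind(plate), verdict)
```

A real blocklist would of course be far longer, and deciding what counts as insulting, ill-omened, or funny is the hard part that no regular expression will settle.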

source: Lucky number plate? Not for this ASS, Taipei Times, August 23, 2006

Y.R. Chao works being reissued

cover of the book 'Linguistic Essays, by Yuenren Chao'

The Commercial Press has begun issuing a set of the complete works of Y.R. Chao (Zhao Yuanren / 趙元任 / 赵元任). This project, which will comprise some twenty volumes, will contain works in both English and Mandarin Chinese. All of the many fields Chao wrote about will be covered. Letters and journals will also be included, as will sound recordings. Wonderful!

For those who don’t want to wait for the whole series or don’t feel the need to buy all of them, the Commercial Press has also issued two volumes of Chao’s selected essays on linguistics: one in English and one in Mandarin. These are, respectively, Linguistic Essays by Yuenren Chao (ISBN: 7-100-03385-3/H·860) and Zhào Yuánrèn yǔyánxué lùnwénjí (赵元任语言学论文集) (ISBN: 7-100-03127-3/H·789).

cover of the book '赵元任语言学论文集 Zhao Yuanren Yuyanxue Lunwenji'

Note how the cover of Linguistic Essays, a book printed just last year in China, uses “Yuenren Chao,” the traditional spelling and Western order of his name, rather than “Zhao Yuanren,” the spelling used in Hanyu Pinyin. Also note how the Mandarin title is given in traditional, not simplified, characters: 趙元任語言學論文集, not 赵元任语言学论文集. A nice surprise, on both counts. On the other hand, the botched romanization on the cover of the Mandarin-language collection, which gives “ZHAOYUANREN YUYANXUELUNWENJI” instead of “Zhào Yuánrèn yǔyánxué lùnwénjí,” is particularly inappropriate and painful to look at on a collection of the works of this brilliant linguist. But don’t judge this book by its cover.

Here are links to all the volumes in the complete works that I’ve been able to locate information on:

cover of the first volume of Y.R. Chao's collected works

Beijing subway signage — some photos

Sonarchic sent in photos of some signs in the Beijing subway system.

The typography for English and Pinyin is generally poor, as is common in China.

There are several things in general I’d like to draw attention to:

  • Everything is in a boring sans-serif.
  • The letters are often set too close together and occasionally too far apart.
  • EVERYTHING IS IN CAPITAL LETTERS.
  • The size of the English/romanization relative to the Chinese characters varies, with the English text often too small. (The latter is increasingly a problem in Taiwan.)

OK, now to the photos.

column-mounted list of station names along one particular Beijing subway line
Above:

  • very tight tracking between most of the Roman letters, except around the letter I
  • enormous (and incorrect) space after the apostrophes in “Tian’anmen” (which is, correctly, written with an apostrophe, BTW) and “Yong’anli” (which should perhaps be written “Yong’an Li”)
  • yet the apostrophes in the time-to-station markings are not followed by enormous spaces
  • failure to parse many words correctly, e.g., “Lù” (“Road” / 路) should be written apart from the name of the road: Gǔchéng Lù (古城路), not GUCHENGLU, etc.
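The apostrophe in “Tian’anmen” follows a simple Hanyu Pinyin rule: within a word, a syllable that begins with a, o, or e and follows another syllable is set off with an apostrophe. Here is a minimal Python sketch of that rule. It assumes the word has already been divided into syllables, and the function names are my own.

```python
import unicodedata

APOSTROPHE = "’"


def starts_with_a_o_e(syllable):
    """True if the syllable begins with a, o, or e, ignoring tone marks and case."""
    # NFD splits a letter such as "ā" into "a" plus a combining mark,
    # so the first code point is the bare vowel.
    first = unicodedata.normalize("NFD", syllable)[0].lower()
    return first in "aoe"


def join_syllables(syllables):
    """Join Pinyin syllables into one word, adding apostrophes where the rule requires."""
    word = ""
    for index, syllable in enumerate(syllables):
        if index > 0 and starts_with_a_o_e(syllable):
            word += APOSTROPHE
        word += syllable
    return word


if __name__ == "__main__":
    print(join_syllables(["Tian", "an", "men"]))
    print(join_syllables(["Yong", "an"]))
    print(join_syllables(["Gu", "cheng"]))
```

Note that the apostrophe is inserted with no space on either side, which is exactly what the signs get wrong. Deciding where one word ends and the next begins (Gǔchéng Lù rather than GUCHENGLU) is a separate question that this sketch does not attempt.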

sign hanging from the ceiling of a Beijing subway station, with arrows showing which way to different lines
Above:

  • note different word spacing between “TO” and “LINE” than between “LINE” and the number

photo of a Beijing subway sign reading “CHANGCHUNJIE”
Above:

  • This should almost certainly be “Changchun Jie” (jie means “street”), not “CHANGCHUNJIE”.

photo of a sign at a Beijing subway station exit
Above:

  • only Chinese characters identify this as the northeast exit (东北口 Dōngběi kǒu)
  • uneven left margin for the English/romanization
  • very small English in relation to the Chinese characters
  • clumsy letterspacing around capital I’s
  • too much space after the period in JRJ.COM
  • uneven spacing, as can be seen in the two uses of the word “insurance”

comparison of the sizes of the word 'insurance' on the same sign

stylized image of a person sitting on a stair, with the caption 'no loitering' in English and Mandarin
