iOS app for writing Pinyin with tone marks

Those of you who, unlike me, own an iPhone, an iPad, or an iPod Touch may find the new Pinyin Typist application of use.

Taffy of Tailingua had a look at this for me.

I’ve had a play with the Pinyin application and I’m generally quite positive about it. It’s clean, unfussy, and gets the job done. The automatic positioning looks to be flawless (i.e., typing zhuang1 gives you zhuāng, not zhūang)…. Overall though I like it, as it does what it set out to do without any showboating or unnecessary steps (excepting apostrophes).

Although I wish the apostrophe and hyphen were right there on the main screen instead of on a secondary one, the program allows people to do what they need to do: type Pinyin with tone marks.
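The tone-mark positioning praised above (zhuang1 becoming zhuāng) follows a small set of rules. Here is a minimal sketch of those rules in Python. It is not Pinyin Typist's actual code. The function name `add_tone_mark`, the use of "v" as a stand-in for ü, and the lowercase-only handling are assumptions made for this illustration.

```python
import re

# Marked forms of each vowel, ordered by tone 1 through 4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def add_tone_mark(syllable):
    """Convert one numbered syllable, e.g. 'zhuang1', to 'zhuāng'.

    Assumptions for this sketch: input is a single syllable, output is
    lowercased, and 'v' is accepted as a way of typing 'ü'.
    """
    match = re.fullmatch(r"([a-zü]+)([1-5])", syllable.lower().replace("v", "ü"))
    if match is None:
        return syllable  # not a numbered syllable; leave it untouched
    letters, tone = match.group(1), int(match.group(2))
    if tone == 5:
        return letters  # neutral tone carries no mark

    # Placement: a or e always takes the mark; in "ou" the o takes it;
    # otherwise the mark goes on the last vowel.
    if "a" in letters:
        index = letters.index("a")
    elif "e" in letters:
        index = letters.index("e")
    elif "ou" in letters:
        index = letters.index("o")
    else:
        vowel_positions = [i for i, ch in enumerate(letters) if ch in TONE_MARKS]
        if not vowel_positions:
            return syllable  # no vowel to mark; leave it untouched
        index = vowel_positions[-1]

    marked = TONE_MARKS[letters[index]][tone - 1]
    return letters[:index] + marked + letters[index + 1:]


print(add_tone_mark("zhuang1"))
print(add_tone_mark("gui4"))
```

The "last vowel" fallback is what puts the mark on the i of gui4 and the u of liu2, while the a/e rule keeps zhuang1 from coming out as zhūang.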

It sells for US$2.99 (corrected from US$3.99).

[Headline changed from “Mac app for writing Pinyin with tone marks”]

Simplified Chinese characters being purged from Taiwan government sites

Taiwan’s government Web sites have begun removing versions of their content in simplified Chinese characters at the instruction of President Ma Ying-jeou (Mǎ Yīngjiǔ).

This isn’t just a matter of, say, writing “臺灣” (Taiwan) instead of “台灣” (which, yes, the government here is encouraging). This is much bigger. Entire pages, entire Web sites even, written in simplified Chinese characters are being eliminated.

The Tourism Bureau, for example, removed the version of its site in simplified Chinese characters from the Web on Wednesday. This comes at a time when the government is further lifting restrictions on individual Chinese tourists in order to bring in more travelers from China.

The Presidential Office’s spokesman quoted Ma as saying “To maintain our role as the pioneer in Chinese culture, all government bodies should use traditional Chinese in official documents and on their Web sites, so that people around the world can learn about the beauty of traditional characters.” (Is that what pioneers do? I’ll try to find the original Mandarin-language quote later if I get a chance.)

It’s one thing to urge businesses not to remove traditional Chinese characters and replace them with simplified Chinese characters (as the government did on Tuesday). It’s quite another to remove alternate versions in another script — one that a very sizable target audience would have an easier time with.

During the administration of President Chen Shui-bian the government began adding versions in simplified Chinese characters of the Mandarin texts of official Web sites. The Office of the President was one such site. Now the simplified version is gone. That’s happening across government sites.

Here, for example, are some screen shots I took.

This was the language/script selection at the National Palace Museum’s Web site as of Thursday morning. (Click to see an image of the entire front page.)
[Screenshot: click to see image of entire front page]
“简体中文” (jiǎntǐ Zhōngwén) is brighter because I had my mouse over it to highlight that text.

And here is the language/script selection at the National Palace Museum’s Web site as of Thursday evening:
[Screenshot: click to see image of entire front page]
As you can see, the choice of viewing the site in simplified Chinese characters has been removed.

Here at Pinyin.Info I often have material in Hanyu Pinyin. So I’m certainly not unsympathetic to the idea that sometimes the medium really is a major part of the message. But I doubt that President Ma’s tough-love approach in this area will accomplish anything useful for Taiwan or the survival of traditional Chinese characters; indeed, I believe it will be counter-productive.

To be more blunt about this, this seems like a really, really bad idea.


Google Translate and romaji revisited

OK, Google has improved its Pinyin converter some, though it still fails in important areas. So that’s the present situation for Google and Mandarin.

How about for Google and Japanese?

Professor J. Marshall Unger of the Ohio State University’s Department of East Asian Languages and Literatures generously agreed to reexamine Google’s performance in conversions to rōmaji (Japanese written in romanization).

Below is his latest evaluation.

For his initial analysis (in December 2009), see Google Translate and rōmaji.

I ran the test passage through Google Translate again. There’s some improvement, but it’s still pretty mediocre.

Original: [Japanese text not recoverable]
Google Translate: 6-Nichi gogo 4-ji 35-fun-goro, Tōkyō-to Chiyoda-ku Kōkyogaien no todō (uchibori-dōri) no Nijūbashi zen kōsaten de, Chūgoku kara no kankō kyaku no 40-dai no dansei ga jōyōsha ni hane rare, zenshin o tsuyoku Utte mamonaku shibō shita. Kuruma wa hodō ni noriagete aruite ita dansei (69) mo hane, dansei wa atama o tsuyoku utte ishiki fumei no jūtai. Marunouchi-sho wa, unten shite ita Tōkyō-to Minato-ku hakkin 3-chōme, kaisha yakuin Takahashi nobe Tsubuse yōgi-sha (24) o jidōsha unten kashitsu shōgai no utagai de genkō-han taiho shi, yōgi o dō chishi ni kirikaete shirabete iru.

Original: [Japanese text not recoverable]
Google Translate: Dōsho ni yoru to, shibō shita dansei wa ōdan hodō o aruite watatte ita tokoro o chokushin shite kita kuruma ni hane rareta. Kuruma wa hidari ni kyū handoru o kiri, shadō to hodō no sakai ni oka reta kasetsu no saku o haneage, hodō ni noriageta toyuu. Saku wa hodō de ran’ningu o shite ita dansei (34) niatari, dansei wa ryōashi ni karui kega.

Original: [Japanese text not recoverable]
Google Translate: Dōsho wa, shibō shita dansei no mimoto kakunin o susumeru totomoni, tōji no kōsaten no shingō no jōkyō o shirabete iru.

Original: [Japanese text not recoverable]
Google Translate: Genba shūhen wa Tōkyō kankō no supotto no hitotsudaga, saikin wa jogingu o tanoshimu hito mo fuete iru.

Notes:

  • The use of numerals dodges a plethora of errors, but “6-Nichi” is still wrong for Muika.
  • Lots of correct capitalizations have been added, but “uchibori” was missed and “Utte” capitalized by mistake.
  • Some false spaces or lack of spaces persist: “hane rare”, “oka reta”; “hitotsudaga” and “niatari” were correctly hitotsu da ga and ni atari in the original test.
  • Names still get butchered (“hakkin” for Shirogane, “nobe Tsubuse” for Nobuhiro).
  • The needless apostrophe in “ran’ningu” is still there.
  • Interestingly, “toyuu” is a new error: it should be to iu.
  • There’s evidence of some attempt to use hyphens, but why not in “kankō kyaku” or “Nijūbashi zen”?

So, to update: Google gets kudos for conscientiousness, but I stick by my original comments.

For more by Prof. Unger, see Pinyin.info’s recommended readings, which include selections from The Fifth Generation Fallacy: Why Japan Is Betting Its Future on Artificial Intelligence, Literacy and Script Reform in Occupation Japan: Reading Between the Lines, and Ideogram: Chinese Characters and the Myth of Disembodied Meaning.

Banqiao — the Xinbei ways

Xinbei, formerly known as Taipei County and now officially bearing the atrocious English name of “New Taipei City,” has made available an online map of its territory.

Interestingly, the map is available not just in Mandarin with traditional Chinese characters and English with Hanyu Pinyin (most of the time — but more on that soon) but also in Mandarin with simplified Chinese characters. A Japanese interface is also available.

The interface for all versions opens to a map centered on Xinbei City Hall. What struck me upon seeing this for the first time was that, in just one small section, Banqiao is spelled four different ways:

  • Banqiao (Hanyu Pinyin)
  • Panchiao (bastardized Wade-Giles)
  • Ban-Chiau (MPS2, with an added hyphen)
  • Banciao (Tongyong Pinyin)

Click the map to see an enlargement.
[Map: click for larger version]

I want to stress that these are not typos. These are the result of an inattention to detail that is all too common here.
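A simple consistency check could catch this kind of drift before a map is published. The sketch below uses only the four spellings listed above. The sample labels and the function name `find_nonstandard` are hypothetical, made up for illustration and not taken from the Xinbei map.

```python
# The four spellings seen on the map, keyed to their romanization systems.
VARIANTS = {
    "Banqiao": "Hanyu Pinyin",
    "Panchiao": "bastardized Wade-Giles",
    "Ban-Chiau": "MPS2, with an added hyphen",
    "Banciao": "Tongyong Pinyin",
}
STANDARD = "Banqiao"


def find_nonstandard(labels):
    """Return (label, variant, system) for each label that uses a
    spelling other than the Hanyu Pinyin one."""
    hits = []
    for label in labels:
        for variant, system in VARIANTS.items():
            if variant != STANDARD and variant.lower() in label.lower():
                hits.append((label, variant, system))
    return hits


# Hypothetical labels, for illustration only.
sample_labels = ["Banqiao Station", "Panchiao Rd.", "Banciao District"]
for label, variant, system in find_nonstandard(sample_labels):
    print(f"{label!r} uses {variant!r} ({system}); expected {STANDARD!r}")
```

A lookup table like this only catches spellings someone has already noticed; it does not replace generating every label from a single romanization source.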

The spelling for the city, er, district is also wrong in the interface, with Tongyong used. Since Banqiao is the seat of the Xinbei City Government and has more than half a million inhabitants,* it’s not exactly so obscure that spelling its name correctly should be much of a challenge. Tongyong and other systems also crop up in some other names outside the interface.

It should be admitted, however, that the Xinbei map’s romanization is still better overall than the error-filled mess issued by GooGle.

*: including me

Wenlin releases major upgrade (4.0)

One of my favorite programs, Wenlin (which bills itself as “software for learning Chinese”), has just released a major upgrade for both Mac and Windows versions. This doesn’t happen often; it has been three-and-a-half years since the most recent big change was issued (Wenlin 3.4) and heaven only knows how long since 3.0 came out. So, yes, this release has many substantial improvements.

The change nearest and dearest to my heart is Wenlin 4.0’s greatly improved handling of Pinyin. I was among the field testers for the new version, so I’ve already spent a lot of time examining this feature. Here are a few important aspects of this:

  • Conversions from Chinese characters follow Hanyu Pinyin orthography much more closely than before. This is a major change for the better. (There’s still some room for improvement. But I don’t think we’ll have to wait years for this.)
  • In the past, using Wenlin to convert long texts in Chinese characters into Pinyin could be a real chore, with users having to examine example after example of Chinese characters with multiple pronunciations in order to select the proper pronunciation for that particular context. But now users may, if they so desire, tell Wenlin not to ask for disambiguation input. Of course, that doesn’t mean that Wenlin will always guess right; but many users will be happy that this trade-off allows them to skip the frustration of, for example, having to tell the program over and over and over that, yes, in this case 说 is pronounced shuō rather than shuì.
  • Relative newcomers to Mandarin may appreciate that for common words tone sandhi is indicated in Wenlin with additional marks (a dot or line below the vowel). This feature can also be turned off, for those who want standard Pinyin.
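To make the trade-off in unattended conversion concrete, here is a minimal sketch of one common way to guess readings without asking the user: look for known multi-character words first, and fall back to each character's most frequent reading. This is not Wenlin's method, which the source does not describe. The tiny dictionaries and the function name `to_pinyin` are assumptions for illustration.

```python
# Most common reading for each character (tiny illustrative sample).
DEFAULT_READINGS = {"说": "shuō", "游": "yóu", "我": "wǒ"}

# Word-level entries override the per-character defaults.
WORD_READINGS = {"游说": "yóushuì"}


def to_pinyin(text):
    """Greedy longest-match conversion: known words first, then
    single characters. Unknown characters pass through unchanged."""
    pieces = []
    i = 0
    longest = max(len(word) for word in WORD_READINGS)
    while i < len(text):
        # Try the longest candidate words first, down to two characters.
        for size in range(longest, 1, -1):
            word = text[i:i + size]
            if word in WORD_READINGS:
                pieces.append(WORD_READINGS[word])
                i += size
                break
        else:
            # No word matched here; use the default single-character reading.
            ch = text[i]
            pieces.append(DEFAULT_READINGS.get(ch, ch))
            i += 1
    return " ".join(pieces)


print(to_pinyin("我说"))
print(to_pinyin("游说"))
```

With this approach 说 comes out as shuō on its own and as shuì inside 游说. The guess is only as good as the word list, which is why such a converter will not always be right.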

There are, of course, many improvements beyond the area of Pinyin. Here are a few:

  • One limitation of Wenlin 3.x was that its English dictionary wasn’t very large. But Wenlin 4.0 includes not only the ABC Chinese-English Comprehensive Dictionary but also the excellent new ABC English-Chinese, Chinese-English Dictionary (now finally in stock in the printed version).
  • The flashcards are now set up to handle not just individual characters but polysyllabic words.
  • There’s full Unicode Unihan 6.0 support for more than 75,000 Chinese characters.
  • And for those who think 75,000 just isn’t enough, users can now access Wenlin’s CDL technology. Through this, users can create new, variant, and rare characters; moreover, these can be published and shared with other Wenlin users or CDL-friendly devices.
  • Seal script versions of more than 11,000 characters are provided.
  • Wenlin contains an e-edition of the Shuowen Jiezi (Shuōwén Jiězì / 說文解字 / 说文解字).
  • Coders will be interested to know that Wenlin appears to be headed toward becoming open-source.
  • Both Mandarin and English entries are marked with grade levels, which aids learners by indicating relative frequency of use. The levels for Mandarin words are based on the Hanyu Shuiping Kaoshi (Hànyǔ Shuǐpíng Kǎoshì / 漢語水平考試 / 汉语水平考试 / HSK).

The full version (i.e., a boxed CD with the program, likely packaged with a hard copy of the manual) is US$199, or US$179 if you download it from the Wenlin Web store. Upgrades from 3.x cost US$49.

For more information, see the summary of features and outline of what’s new in Wenlin 4.0.

[Screenshot from Wenlin 4.0: click for larger version]

.sg domain names in Chinese characters lag

Between November 23, 2009, when Singapore first began registering .sg names in Chinese characters, and June 10, 2010, when registrations of Chinese-character .sg domain names opened to all without any additional fee, only 1,024 such names were registered, or just 0.88 percent of all .sg domain names. This apparently includes not just second-level domains (e.g., ??.sg) but also third-level domains (e.g., ??.com.sg).

The percentage will likely rise in the coming months, as the process has only recently opened to everyone on a first-come, first-served basis. But, still, demand for such names in Singapore has so far been underwhelming.
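For a sense of scale, the two figures above (1,024 names, 0.88 percent) imply a total of roughly 116,000 .sg domain names. A quick back-of-the-envelope check follows. The variable names are mine, and the range assumes that 0.88 was rounded to two decimal places.

```python
registered = 1024       # Chinese-character .sg names, per the report
share_percent = 0.88    # their share of all .sg names, per the report

# Total number of .sg names implied by the two reported figures.
implied_total = registered / (share_percent / 100)

# If 0.88 is a rounded value, the true share lies between 0.875 and 0.885.
low = registered / (0.885 / 100)
high = registered / (0.875 / 100)

print(f"Implied total: about {implied_total:,.0f} .sg names")
print(f"Plausible range: {low:,.0f} to {high:,.0f}")
```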

A bit more information:

Registrations were accepted in phases, with registrations for government organizations starting on Nov. 23, 2009. Beginning in January, SGNIC began accepting domain name registrations from trademark holders.

During the third phase, the general public was allowed to register domain names starting on March 25, but applicants were charged a “priority fee” of S$100 (US$72) for each domain name, with domain names sought by several applicants awarded to the highest bidder.

In all three phases, applicants could apply for a domain name made up of Chinese numbers or a name with just one Chinese character for a fee of S$500 [US$360]….

The fourth and final phase began on June 10, with SGNIC accepting domain name applications on a first-come, first-served basis. The S$100 priority fee is no longer required, but applicants are no longer allowed to register domain names using Chinese numbers or names with just one Chinese character….

When IDA announced the introduction of Chinese-language domain names last year, SGNIC said the effort was partly intended to help Singaporean businesses target the Chinese market.

source: Singapore registers 1,000 Chinese-language domain names, IDG News Service, June 23, 2010

Baidu adds handwriting input

Baidu has just added a function that allows people to use their mouse to write Chinese characters for searches.

On the Baidu home page, click on “手写” (shǒuxiě/手寫/handwrite).

This will bring up a pop-up box in which you can use your mouse to write Hanzi. This functions in basically the same way as the mouse-writing tool that Nciku added about two years ago.

source: Baidu.com’s Search Box Now Supports Chinese Handwriting Input, China Tech News, June 16, 2010