Today’s Pinyin-friendly font is Sorority, by Kreative Korporation. It’s free for personal use.
Dissolving Pinyin
Late last week, Victor Mair — with some assistance from Matt Anderson, David Moser, me, and others — wrote in “Lobsters”: a perplexing stop motion film about a short 1959 film from China that gives some Pinyin. In some cases, the Pinyin is presented for a second and then is quickly dissolved into Chinese characters. Since Victor’s post supplies only the text, I thought that I’d supplement that here with images from the film.
See the original post for translations and discussion.
https://www.youtube.com/watch?v=HKYMO73hLRY
The film often shows a newspaper. The headline (at 7:57) reads (or rather should read, since the first word is misspelled):
QICHE GUPIAO MENGDIE
DAPI LONGXIA ZHIXIAO
But since the image above doesn’t show the name of the paper, I’m also offering this rotated and cropped photo, which allows us to see that this is the “JIN YUAN DIGUO RI-BAO.”
Elsewhere, there are again some g’s for q’s. For the first example of text dissolving from Pinyin to Chinese characters (at 2:11), I’m offering screenshots of the text in Pinyin, the text during the dissolve, and the text in Chinese characters. Later I’ll give just the Pinyin and Chinese characters.
Hongdang Louwang
Yipi hongdang zai daogi [sic] jiudian jihui buxing guanbu [sic] louwang
Soon thereafter (at 2:44), we get a handwritten note.
At 3:39 we’re shown the printed notice in the newspaper of the above text.
A brief glance at the newspaper at 3:23 gives us FA CHOU, which is probably referring to the stink the bad lobsters are giving off.
Here a man is carrying a copy of Zibenlun (Das Kapital), by Makesi (Marx).
Actually, it’s not really Das Kapital, just the cover of the book; inside is a stack of decadent Western material. “MEI NE” is probably supposed to be “MEINÜ” (beautiful women).
I imagine that, in the PRC of 1959, the artists for this film must have inwardly rejoiced at the chance to draw something like that for a change, and that is also why there’s a nude on the wall in one scene.
Pinyin font: Alegreya
Alegreya is a large font family available for free through Google Fonts. It’s by Juan Pablo del Peral of Huerta Tipográfica. Unfortunately, the serif version of Alegreya isn’t yet ready to handle third tones; but at least the sans serif is.
It also comes with complete small caps.
UTF-8 Unicode vs. other encodings over time
Some eight years ago UTF-8 (Unicode) became the most used encoding on Web pages. At the time, though, it was used on only about 26% of Web pages, so it had a plurality but not an absolute majority.
By the beginning of 2010 Unicode was rapidly approaching use on half of Web pages.
In 2012 the trends were holding up.
Note that the 2008 crossover point appears different in the latter two Google graphs, which is why I’m showing all three graphs rather than just the third.
A different source (with slightly different figures) provides us with a look at the situation up to the present, with UTF-8 now on 85% of Web pages. Expansion of UTF-8 is slowing somewhat. But that may be due largely to the continuing presence of older websites in non-Unicode encodings rather than lots of new sites going up in encodings other than UTF-8.
Here’s the same chart, but focusing on encodings (other than UTF-8) that use Chinese characters, so the percentages are relatively low.
And here’s the same as the above, but with the results for individual languages combined.
By the way, Pinyin.info has been in UTF-8 since the site began way back in 2001. The reason that Chinese characters and Pinyin with tone marks appear scrambled within Pinyin News is that a hack caused the WordPress database to be set to Swedish (latin1_swedish_ci), of all things. And I haven’t been able to get it fixed; so just for the time being I’ve given up trying. One of these days….
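For anyone curious what that kind of scrambling looks like, here is a minimal sketch in Python. It assumes the problem is the classic one of UTF-8 bytes being read back as Latin-1; the sample string and function names are illustrative only, and Python's "latin-1" codec is merely a stand-in for the database setting (MySQL's latin1 is closer to Windows-1252), so this is not a recipe for repairing an actual WordPress database.

```python
# Sketch: UTF-8 text misread as Latin-1, and the reverse operation.
# Assumption: Python's "latin-1" codec stands in for the database's
# latin1 setting; the sample text is made up for illustration.

def scramble(text: str) -> str:
    """Encode as UTF-8, then misread those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def unscramble(garbled: str) -> str:
    """Recover the original bytes, then decode them properly as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


sample = "Hànyǔ Pīnyīn"
garbled = scramble(sample)

# Each tone-marked letter takes two bytes in UTF-8, so it turns into
# two wrong characters when read as Latin-1.
print(len(sample), len(garbled))
print(unscramble(garbled) == sample)
```

The round trip only works when nothing else has altered the bytes along the way, which is often not the case with a database that has been written to under the wrong setting.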
Sources:
- Unicode tops other encodings on Web pages: Google, May 7, 2008, Pinyin News
- Unicode nearing 50% of the web, January 28, 2010, Google Official Blog
- Unicode over 60 percent of the web, February 3, 2012, Google Official Blog
- Historical yearly trends in the usage of character encodings for websites, accessed October 27, 2015
Pinyin font: Skarpa
Today’s Pinyin-friendly font is Skarpa, by Aga Silva of Poland. It’s a bit quirky (e.g., second-tone o’s and lowercase q’s) but still sharp.
Skarpa was later modified into Skarpa 2, which is not free but which comes in several weights and types.
Most of Silva’s other fonts can also handle Pinyin with tone marks. Those are all commercial rather than free.
Popularity of Chinese character country code TLDs
Yesterday we looked at the popularity of the Chinese character TLD for Singapore Internet domains. Today we’re going to examine the Chinese character ccTLDs (country code top-level domains) for those places that use Chinese characters and compare the figures with those for the respective Roman alphabet TLDs.
In other words, how, for example, does the use of [image] domains compare with the use of .tw domains?
Since, unlike the case with Singapore, I don’t have the registration figures, I’m having to make do with Google hits, which is a different measure. For this purpose, Google is unfortunately a bit of a blunt instrument. But at least it should be a fairly evenhanded blunt instrument and will be useful in establishing baselines for later comparisons.
A few notes before we get started:
- Japan has yet to bother with completing the process for its own name in kanji ([image]), so it is omitted here.
- Macau only recently asked for [image] and [image], so those figures are still at zero.
- Oddly enough, there’s no [image] ccTLD, even though the Ma administration, which was in power when Taiwan’s ccTLDs went into effect, officially prefers the more complex form of [image] to [image], not to mention preferring it to [image].
TLD | Google Hits | Percent of Total |
---|---|---|
MACAU | | |
.mo | 18,400,000 | 100.00 |
[image] | 0 | 0.00 |
[image] | 0 | 0.00 |
TAIWAN | | |
.tw | 206,000,000 | 99.86 |
[image] | 67,600 | 0.03 |
[image] | 0 | 0.00 |
[image] | 230,000 | 0.11 |
HONG KONG | | |
.hk | 193,000,000 | 99.94 |
[image] | 118,000 | 0.06 |
SINGAPORE | | |
.sg | 97,800,000 | 100.00 |
[image] | 2 | 0.00 |
CHINA | | |
.cn | 315,000,000 | 99.61 |
[image] | 973,000 | 0.31 |
[image] | 251,000 | 0.08 |
So in no instance does the Chinese character ccTLD reach even one half of one percent of the total for any given place.
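The percentages can be rechecked with a few lines of Python. This is only a sketch: the hit counts are copied from the table, but the labels for the Chinese character TLDs ("hanzi TLD 1" and so on) are placeholders I've made up, since the originals appear only as images, and the function names are likewise illustrative.

```python
# Google hit counts copied from the table. Labels such as "hanzi TLD 1"
# are hypothetical placeholders for the Chinese character TLDs.
hits = {
    "MACAU": {".mo": 18_400_000, "hanzi TLD 1": 0, "hanzi TLD 2": 0},
    "TAIWAN": {".tw": 206_000_000, "hanzi TLD 1": 67_600,
               "hanzi TLD 2": 0, "hanzi TLD 3": 230_000},
    "HONG KONG": {".hk": 193_000_000, "hanzi TLD 1": 118_000},
    "SINGAPORE": {".sg": 97_800_000, "hanzi TLD 1": 2},
    "CHINA": {".cn": 315_000_000, "hanzi TLD 1": 973_000,
              "hanzi TLD 2": 251_000},
}


def shares(counts):
    """Percent of a place's total hits for each TLD, rounded to 2 places."""
    total = sum(counts.values())
    return {tld: round(100 * n / total, 2) for tld, n in counts.items()}


def hanzi_share(counts):
    """Combined percent for everything except the Roman alphabet TLD."""
    total = sum(counts.values())
    hanzi = sum(n for tld, n in counts.items() if not tld.startswith("."))
    return 100 * hanzi / total


for place, counts in hits.items():
    print(place, shares(counts), f"combined: {hanzi_share(counts):.2f}%")
```

Even with a place's Chinese character TLDs added together, the combined share stays under half of one percent everywhere.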
Here are the results in a chart.
Note that the ratios of simplified to traditional forms in China and Taiwan are roughly mirror images of each other, as is perhaps to be expected.
See also Platform on Tai, Pinyin News, December 30, 2011
Popularity of the Chinese character TLD for Singapore Internet domains
For quite a few years Singapore has had several choices for those wishing to register Singapore-specific domain names, including .com.sg, .net.sg, .org.sg, .edu.sg, .gov.sg, .per.sg, and just .sg.
Of those, .sg is a top-level domain (TLD), whereas .com.sg, .net.sg, .org.sg, .edu.sg, .gov.sg, and .per.sg are second-level domains. This post is mainly concerned with TLDs; but when I’m giving totals I also include those second-level domains but exclude specific domains such as groupon.sg. OK, now back to the post.
Although English is the dominant language of Singapore, it is but one of four official languages there, along with Mandarin, Malay, and Tamil, with Mandarin (along with other Sinitic languages) being the most common of the latter three. Some three-quarters of the city-state’s population is ethnic Chinese, and around half of that group speak Mandarin as the main language in their homes. In addition, for decades Singapore has promoted its campaign to ~~Strike Hard Against Hoklo, Cantonese, and Other Languages that Your Government Says Are Puny and Insignificant Because They Have Only Tens of Millions of Speakers Apiece~~ Speak Mandarin.
So you might think that four years ago, when Singapore introduced its name in Chinese characters ([image]) as a top-level Internet domain (TLD), many in that multilingual society would have jumped at the chance to pick up some domain names ending with “Singapore” in Chinese characters. (Oh, it hurts me to use images instead of real text there; but until I get the hack fixed, that’s what I’m stuck with.)
Let’s take a look at what happened when the gates opened.
In September 2011, the first month that dot-Xinjiapo (.[image]) domains became available, a total of 86 were registered. That’s not much of a land rush. The next month and the month after that saw no new registrations. But, OK, maybe they had a sunrise period limiting things. What happened later?
In December 2011 the number jumped to 218. This figure grew over the year 2012 to an all-time high that October of … 247 domains using the .[image] TLD. Just 247. During the same month, Singapore had 143,887 registered domains, meaning that at the high point those with the Chinese character TLD were less than one fifth of one percent of the total. Since then, the number has fallen to a mere 210, with the percentage dropping to less than one eighth of one percent of the total.
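As a quick check on those fractions, here is a small Python sketch using only the figures quoted above. The variable names are mine, and since the current total of Singapore domains isn't given in this post, the last step only works out how large that total would have to be for the “one eighth of one percent” figure to hold.

```python
from fractions import Fraction

peak_hanzi = 247       # Chinese character TLD registrations, October 2012
peak_total = 143_887   # all registered Singapore domains that same month

peak_share = Fraction(peak_hanzi, peak_total)
one_fifth_of_one_percent = Fraction(1, 5) / 100

print(f"{float(peak_share):.4%}")
print(peak_share < one_fifth_of_one_percent)

# The later total isn't stated, so work backwards instead: for 210
# domains to be under one eighth of one percent, the total number of
# registered domains must exceed this figure.
later_hanzi = 210
one_eighth_of_one_percent = Fraction(1, 8) / 100
min_total = later_hanzi / one_eighth_of_one_percent
print(min_total)
```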
A Google search for the .[image] domains reveals that those domains are even less used than the already astonishingly low registration numbers might indicate.
So that’s a total of two active dot-Xinjiapo domains, one of which is for sale. In other words, basically there’s just one being used. Ouch. That’s about as close to utter insignificance as a Singapore TLD can get.
Indeed, the only sort of Singapore-related domain that is of even less interest to the netizens of Singapore is one within the dot-Cinkappur TLD, with Singapore written in the Tamil script:
Dot-Cinkappur (.[image]) domains have been available since December 2011, which is just a few months after the introduction of dot-Xinjiapo domains. The middle of 2015 saw the all-time record high in dot-Cinkappur domain registrations: sixteen. Since then the number has dropped to just fifteen.
A search on Google for dot-Cinkappur domains reveals zero active sites.
source: Registration Statistics, Singapore Network Information Centre (SGNIC), accessed October 27, 2015
See also: sg domain names in Chinese characters lag, Pinyin News, June 23, 2010.
Pinyin font: Sherbrooke
Sherbrooke is a free Pinyin-friendly font by Eyad Al-Samman.