The interlinear version of the Scriptures is the prototype or ideal of all translation.
— Walter Benjamin
Interlinear texts are probably familiar to most who have studied a foreign language. Interlinear texts on the Web, however, tend to be in the form of tables. And, like most other fans of CSS, I tend to cringe at the word “table.” Moreover, text within tables doesn’t wrap to different window sizes.
I am generally opposed to the practice of displaying texts in both pinyin and Chinese characters interlinearly as opposed to en face. Pinyin was not designed to be an annotation system for Chinese characters but to be a full writing system (orthography) for modern Mandarin. Many if not most people, however, are misinformed about this basic point. Consequently, I try to avoid presenting pinyin in a way that could reinforce the mistaken notion that it is a supplement to characters rather than an independent system. Nevertheless, I recognize that interlinear texts can be useful in some circumstances. Moreover, perhaps others can make less problematic use of an interlinear technique for displaying other languages and scripts.
About six months ago I started to work out a standards-compliant, table-free method for displaying Chinese characters and pinyin interlinearly on Web pages. As is so often the case, once I figured out the basics I became distracted by something else and never finished. A recent request for a way to display ruby text with pinyin, however, has prompted me to present some of my ideas on this in case others might find them useful and produce something with them. And, at any rate, CSS3’s ruby text feature isn’t likely to be implemented by the major browsers anytime soon.
The fundamental approach of the method I recommend is to put individual words/phrases and their pinyin/character equivalent in floated div tags and use CSS to make everything look right. Unfortunately, the method isn’t semantically correct because it uses div and p tags for individual words rather than true blocks of text; but I don’t see that as a big enough problem to resort to the trouble of putting all this into XML. YMMV.
This is adapted from a thumbnail-captioning method detailed on A List Apart.
Floated elements, of course, need to have declared widths. But this gets tricky because words are of various widths. It’s not enough, either, to set widths based on the number of letters or Chinese characters within a block, because individual letters themselves differ in width.
The five-letter syllable “chong,” for example, is wider than the five-letter “liang” because the letters l and i are thinner than any of the letters in chong — at least in most fonts. And the widths of pinyin elements do not correspond to the widths of Chinese characters.
With Chinese characters the situation is for the most part different. Note that 哩哩啦啦 and 爽爽快快 take the same amount of horizontal space to write:
哩哩啦啦
爽爽快快
The same, however, is not true of their Pinyin equivalents:
līlīlālā
shuǎngshuǎngkuàikuài
One way to deal with this is “headline counting,” which is an old method copy editors use to help make headlines fit within allotted spaces. Under this system, letters, numbers, and punctuation marks are given different values, based on their approximate width. Here are the values under one headline-counting method:
count value    applicable letters, numbers, punctuation marks
0.5            flitj.,:;!
1.0            abcdeghknopqrsuvxyz[space]I1-[vowels, including i, with tone marks]
1.5            mwABCDEFGHJKLNOPQRSTUVXYZ234567890$?
2.0            MW[em dash]
Thus, “pinyin” would have a count of 5, but “Pinyin” would have a count of 5.5. And “Hanyu Pinyin” would have a count of 12.
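As a rough illustration of how such a count might be automated, here is a minimal Python sketch. The character sets are copied from the table above; the function name headline_count, the 1.0 fallback for unlisted characters, and the treatment of capital vowels with tone marks as ordinary capitals are all assumptions of this sketch, not part of the counting method itself.

```python
import unicodedata

# Count values copied from the table above.
HALF = set("flitj.,:;!")
ONE = set("abcdeghknopqrsuvxyz I1-")
ONE_AND_HALF = set("mwABCDEFGHJKLNOPQRSTUVXYZ234567890$?")
TWO = set("MW\u2014")  # \u2014 is the em dash


def char_value(ch: str) -> float:
    """Return the headline-count value of a single character."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = decomposed[0]
    # Lowercase vowels carrying a diacritic (including i) count as 1.0.
    if len(decomposed) > 1 and base in "aeiou":
        return 1.0
    if base in HALF:
        return 0.5
    if base in ONE:
        return 1.0
    if base in ONE_AND_HALF:
        return 1.5
    if base in TWO:
        return 2.0
    return 1.0  # assumption: unlisted characters get a middle value


def headline_count(text: str) -> float:
    """Sum the count values of every character in text."""
    total = 0.0
    for ch in unicodedata.normalize("NFC", text):
        if unicodedata.combining(ch):
            continue  # ignore stray combining marks
        total += char_value(ch)
    return total


print(headline_count("pinyin"))        # 5.0
print(headline_count("Pinyin"))        # 5.5
print(headline_count("Hanyu Pinyin"))  # 12.0
```

The three printed values match the worked examples in the paragraph above.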
To have the text spaced as attractively as possible, counts would also need to be performed for the Chinese characters and then checked against the count for the romanized text to make sure the larger value is used. This is because counts for Pinyin words could result in widths being set smaller than required, such as in the case of lí’è, which is thinner than 罹厄 unless the characters are made to be unusually small relative to the romanization. Deriving a count for the width of Chinese characters, however, is easy, because in most cases they can safely be treated as if they all took the same amount of horizontal space. The value assigned for the counting of Chinese characters would depend on how large you want to make them in relation to the pinyin.
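The comparison described above could be sketched in Python as follows. The function name block_count and the default of 1.5 count units per Chinese character are hypothetical; as noted, the right value depends on how large the characters are relative to the pinyin. The sketch also treats any punctuation in the character text as a full-width character, which is a simplification.

```python
def block_count(romanized_count: float, characters: str,
                character_value: float = 1.5) -> float:
    """Return the larger of the romanized count and the character count.

    character_value is an assumed width per Chinese character, expressed
    in the same units as the headline count for the romanization.
    """
    character_count = len(characters) * character_value
    return max(romanized_count, character_count)


# A short romanization over two characters: the characters set the width.
print(block_count(2.5, "罹厄"))    # 3.0
# A long romanization over three characters: the pinyin sets the width.
print(block_count(9.5, "為什麼"))  # 9.5
```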
Next, assign a CSS class to the relevant div. I’ve named the classes according to the counts (multiplied by 10). The base text goes inside a paragraph tag. Thus, to put “wèishénme” over “為什麼” would require the following code:
<div class="count95">
wèishénme<p>為什麼</p>
</div>
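Markup of this shape could be generated rather than typed by hand. The following Python sketch assumes the count has already been worked out; the function name interlinear_block is hypothetical, and the class name simply follows the count-times-ten convention described above.

```python
from html import escape


def interlinear_block(top: str, base: str, count: float) -> str:
    """Build one floated block: top text in the div, base text in a p tag."""
    class_name = f"count{round(count * 10)}"
    return (
        f'<div class="{class_name}">\n'
        f"{escape(top)}<p>{escape(base)}</p>\n"
        f"</div>"
    )


print(interlinear_block("wèishénme", "為什麼", 9.5))
```

Swapping the first two arguments would put the Chinese characters on top and the pinyin underneath.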
The main thing requiring attention is coming up with the correct width for each class. In the CSS for this example, I’ve rounded up counts so that two different classes can have the same width. In a finished version, perhaps they should be given separate widths or the pairs of classes should be combined to make for simpler code.
Here’s the CSS:
.interlinear div {
  margin-right: 0.2em; /* FOR THE SPACES BETWEEN WORDS */
  height: 4.0em; /* TO KEEP LINES FROM OVERLAPPING */
}
.count20, .count25 { width: 1.5em; }
.count30, .count35 { width: 2.0em; }
.count40, .count45 { width: 2.5em; }
.count50, .count55 { width: 3.0em; }
.count60, .count65 { width: 3.2em; }
.count70, .count75 { width: 3.5em; }
.count80, .count85 { width: 4.0em; }
.count90, .count95 { width: 4.5em; }
.interlinear p {
  font-size: 100%;
  margin-top: 0.3em;
  line-height: 1em;
}
/* ++++++++++++++++++++++ */
/* the CSS below this point probably does not need to be adjusted */
/* except to add more 'countXX' classes for longer words */
/* ++++++++++++++++++++++ */
.interlinear div.spacer {
  clear: both;
  height: 0;
}
.count20, .count25, .count30, .count35,
.count40, .count45, .count50, .count55,
.count60, .count65, .count70, .count75,
.count80, .count85, .count90, .count95 {
  float: left;
  text-align: center;
}
.interlinear p {
  text-align: center;
  font-family: serif;
  font-size: 100%;
}
.interlinear {
  font-family: serif;
  font-size: 100%;
}
Note the unfortunate but likely necessary use of spacer divs to separate paragraphs by clearing the floated elements. In the HTML these divs take the following form:
<div class="spacer">
</div>
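Putting the pieces together, a page fragment might be assembled along these lines. This is only a sketch: the function name interlinear_section is hypothetical, and the outer div with class "interlinear" is inferred from the selectors in the CSS rather than shown anywhere in the markup above.

```python
SPACER = '<div class="spacer">\n</div>'


def interlinear_section(paragraphs):
    """Join word blocks into paragraphs, closing each with a spacer div.

    paragraphs is a list of paragraphs; each paragraph is a list of
    ready-made word blocks (HTML strings for the floated divs).
    """
    parts = ['<div class="interlinear">']  # assumed outer container
    for blocks in paragraphs:
        parts.extend(blocks)
        parts.append(SPACER)  # clears the floats before the next paragraph
    parts.append("</div>")
    return "\n".join(parts)


word = '<div class="count20">\nde<p>的</p>\n</div>'
print(interlinear_section([[word], [word]]))
```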
Here’s some of this in action:
Here’s some interlinear text with Pinyin above Chinese characters
對面
的
女孩
看
過來,
看
過來,
看
過來.
這裡
的
表演
很
精彩.
請
不要
假裝
不理不睬.
Here’s some interlinear text with Chinese characters above Pinyin
Duìmiàn
de
nǚhái
kàn
guòlai,
kàn
guòlai,
kàn
guòlai.
Zhèlǐ
de
biǎoyǎn
hěn
jīngcǎi.
Qǐng
bùyào
jiǎzhuāng
bùlǐbùcǎi.
So, does anyone have suggestions for improving this or know how to program a way to automate the process as much as possible?
I’ve often thought the best way to do this is via tooltips (the little popup text that appears when your mouse goes over some text – like your ‘some other sites of interest’ which appears when you are over ‘links’ on this page).
You could do this either way: Have the Chinese characters, and the pinyin as a tooltip for each character(word?) – or the pinyin text and the characters as a tooltip for each word.
Doing a tooltip is simple – but automating the whole process when you’re writing an article would be a bit more tricky …
I agree that, say, Adso has done a nice job with its mouseover popups. I worry, though, that that method would be even closer to what I’m trying to avoid: Pinyin as a mere crutch or “annotation” system for Chinese characters rather than a full writing system in its own right.
Another person wrote to remind me that I neglected to mention the Ruby Annotation feature of XHTML. Although it has been part of XHTML 1.1 since May 2001, it is still not particularly supported by the major browsers, so it’s not something webmasters can make use of yet. But I should have mentioned this in the original post.
As far as I know, IE6 supports XHTML’s Ruby Annotation (or some semblance of it) and Firefox can give the appearance of support with this extension (http://piro.sakura.ne.jp/xul/_rubysupport.html.en). I’ve tested the latter on this page (http://www.i18nguy.com/unicode/unicode-example-ruby.html) and it seems to work nicely.
Yeah, IE’s ruby support is actually pretty nice. I’ve only seen it in action for Japanese text, though.
“I am generally opposed to the practice of displaying texts in both pinyin and Chinese characters interlinearly as opposed to en face. Pinyin was not designed to be an annotation system for Chinese characters but to be a full writing system (orthography) for modern Mandarin.”
Do you still maintain this puristic attitude? There are times when it’s useful to have the text in Chinese characters and the pronunciation given above in pinyin. For instance, if you would prefer to have the text in Chinese characters for those who can read them, and pinyin for the benefit of those who can’t. Pinyin might position itself as an “alternative writing system” but it’s hard to read for ordinary (literate) Chinese, meaning that in most cases it doesn’t really cut it as an alternative writing system. Linguists, of course, just use pinyin, which simply serves to distance their profession from anyone but other linguists. As for ruby in HTML, it’s very narrowly defined, including an inability to manipulate the size of the annotation. Sometimes it would be nice if the pinyin could be made a bit larger.
I’m unsure why you want to label this as “puristic,” given that the whole point of my post — nearly nineteen years ago — was to provide a way for people to make interlinear Pinyin texts rather than en face ones. Regardless, I see nothing I would fundamentally alter.
The only difficulties “ordinary (literate) Chinese” — at least those who are Mandarin speakers who know Pinyin (the majority) — have in reading Pinyin (real Pinyin, as it is meant to be written, with word parsing, etc.) are simply psychological barriers from misinformation about the nature of Pinyin and a lack of practice. When someone spends basically their whole life reading texts in one script, getting used to something else isn’t automatic. But in the case of Pinyin, it is neither especially difficult nor particularly time-consuming (especially not in comparison to the time and effort that must be devoted to the study of Chinese characters).
As for ruby text, as can be seen in the post, the ruby effect of my CSS method does not produce small text (though it could be made to do so if so desired). It’s the same point size as the Chinese characters.
Now that HTML’s ruby tags are supported by probably all browsers, such workarounds are probably unnecessary. As for annotation size, it’s easy to adjust the size of ruby text relative to the main text through CSS. It can be made larger or indeed whatever size the webmaster might want.