stroke counts: Taiwan vs. China

One of the myths about Chinese characters is that for each character there is One True Way and One True Way Only for it to be written, with a specific number of specific strokes in a certain specific and invariable order. Generally speaking, characters are indeed taught with standard stroke orders with certain numbers of strokes (the patterns help make it less difficult to remember how characters are written) — but these can vary from place to place, though the characters may look the same. Moreover, people often write characters in their own fashion, though they may not always be aware of this.

Michael Kaplan of Microsoft recently examined the stroke data from standards bodies in China for all 70,195 “ideographs” [sic] in Unicode 5.0 and compared it against “the 54,195 ideographs for which stroke count data was provided by Taiwan standards bodies” to see how much of a difference there was in the stroke counts for the characters that both sides provided data for.

(I’m a bit surprised the two sides have compiled such extensive lists, and I’d love to see them. But that’s another matter.)

He found that 9,768 of these characters (18 percent) have different stroke counts between the two standards, with 9,045 characters differing by 1 stroke, 675 characters by 2 strokes, 44 characters by 3 strokes, 2 characters by 4 strokes, 1 character by 5 strokes, and 1 character by 6 strokes.
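As a quick sanity check on those figures, here is a minimal Python sketch that adds up the per-difference tallies and works out the percentage against the 54,195 characters Taiwan supplied data for. The helper `tally_differences` and the tiny sample tables are hypothetical, made up purely for illustration; this is not Kaplan's code or data, just one way such a comparison might be done.

```python
# Characters at each stroke-count difference, as reported above.
diff_counts = {1: 9045, 2: 675, 3: 44, 4: 2, 5: 1, 6: 1}

# Characters for which Taiwan provided stroke data (the comparison set).
shared_total = 54195

differing = sum(diff_counts.values())
share = differing / shared_total


def tally_differences(china, taiwan):
    """Hypothetical helper: bucket the characters present in both tables
    by the absolute difference between their stroke counts."""
    buckets = {}
    for char in china.keys() & taiwan.keys():
        delta = abs(china[char] - taiwan[char])
        if delta:  # identical counts are not tallied
            buckets[delta] = buckets.get(delta, 0) + 1
    return buckets


# Made-up sample tables (placeholder keys, not real stroke data).
sample_china = {"a": 5, "b": 7, "c": 3}
sample_taiwan = {"a": 5, "b": 8, "d": 2}

print(differing)                                       # 9768
print(f"{share:.1%}")                                  # 18.0%
print(tally_differences(sample_china, sample_taiwan))  # {1: 1}
```

The tallies do add up to 9,768, and 9,768 out of 54,195 is about 18 percent, so the numbers are at least internally consistent.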

Note: This is about stroke counts of matching characters, not about differing stroke counts for traditional and “simplified” characters — e.g., not 國 (11 strokes) vs 国 (8 strokes).

So, is this a case of chabuduoism, or of truly differing standards? The answer is not yet fully clear; but be sure to read Kaplan’s post and the comments there.


9 thoughts on “stroke counts: Taiwan vs. China”

  1. Pingback: links for 2007-12-03 | bent

  2. Interesting. I just looked at one character: U+25F22 and tried to see how it could possibly be the number of strokes assigned to it by either system (20 for Taiwan and 16 for China). I was able to get 16 by cutting some corners, but there was no way I could come up with 20. Either I’m doing something wrong, or this is a case of “chabuduoism”.

  3. I was able to come up with 19 strokes by dividing the character 𥼢 into the following separate pieces:
    米 on top (6 strokes)
    十 two times (4 strokes)
    申 in the middle (5 strokes)
    木 on the bottom (4 strokes)

    Is there any way to squeeze one more stroke out of this?

    It seems unlikely that anyone would actually write the character this way, rather than using a single vertical line down the middle. (On the other hand, how the character is written may be an open question. I wonder how many times this character has ever been handwritten by human beings alive today.)

  4. Zev: 19 was the most I could come up with as well. 16 comes from not splitting the horizontal bar between the two 十, and from drawing one single line down the middle. The only thing I could think of is that they don’t properly count the top and right sides of the box in 申 as one stroke, but that would be a mistake.

  5. What you have to realise is that it is no use trying to get the correct stroke count from the character you can see in the Unicode code charts or that your font has, because the stroke counts are derived from a Taiwan reference font that has some unfortunately muddled glyphs. Looking at the draft multi-column source glyph chart for CJK-B (IRG N1381) that has just been released for review, we can see that:

    U+25F22 𥼢 is 20 strokes instead of whatever you expect because its Taiwan source glyph is actually the same as U+25F52 𥽒 (under 米 plus 14 in the Kangxi Dictionary)

    U+272F0 𧋰 is 19 strokes instead of the expected 13 strokes because its Taiwan source glyph is actually the same as U+27499 𧒙 (under 虫 plus 13 in the Kangxi Dictionary)

    U+28F71 𨽱 is 24 strokes instead of the expected 29 strokes because its Taiwan source glyph is actually the same as U+28F70 𨽰 (under 阝 plus 21 in the Kangxi Dictionary)

    I discuss the case of U+272F0 in more detail in my latest blog post.

  6. Pingback: Pinyin news » Web site for stroke-order practice
