{"id":5660,"date":"2012-03-01T17:27:13","date_gmt":"2012-03-01T09:27:13","guid":{"rendered":"https:\/\/pinyin.info\/news\/?p=5660"},"modified":"2016-06-24T23:13:02","modified_gmt":"2016-06-24T15:13:02","slug":"pinyin-sort-order","status":"publish","type":"post","link":"https:\/\/pinyin.info\/news\/2012\/pinyin-sort-order\/","title":{"rendered":"Pinyin sort order"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/pinyin.info\/news\/news_photos\/2012\/01\/pinyin_sort_order.gif\" alt=\"\" title=\"pinyin_sort_order\" style=\"float: right; width: 80px; height: 364px;\" \/>The standard for alphabetically sorting Hanyu Pinyin is given in the <em>ABC<\/em> dictionary series edited by John DeFrancis and issued by the University of Hawaii Press.<\/p>\n<p>Here&#8217;s <a href=\"http:\/\/wenlin.com\/pysort\">the basic idea<\/a>: <\/p>\n<blockquote><p>The ordering is primarily simply alphabetical. Diacritical marks, punctuation, juncture and capitalization are only taken into account when the strings being compared are otherwise identical. For example, p&#237;ng&#8217;&#257;n sorts before p&#299;ny&#299;n, because <span class=\"py\">pingan<\/span> sorts before <span class=\"py\">pinyin<\/span>, because <em>g<\/em> precedes <em>y<\/em> alphabetically.<\/p>\n<p>Only when two strings are alphabetically identical is non-alphabetical information taken into account.<\/p><\/blockquote>\n<p>The series&#8217; <a href=\"http:\/\/www.chinesestudies.hawaii.edu\/media\/pdf\/abc\/guide.pdf\">Reader&#8217;s Guide<\/a> presents the specifics of the sort order. Since I don&#8217;t have to worry about how much space this takes up on my site, I have reformatted the information slightly to give the examples as numbered lists. <\/p>\n<blockquote style=\"font-size: 1em;\"><p>Head entry transcriptions with the same sequence of letters are ordered first strictly by letter sequence regardless of tones, then by initial syllable tone in the sequence 0 1 2 3 4. 
For entries with the same initial tone, arrangement is by the tone of the second syllable, again in the order 0 1 2 3 4. For example:<\/p>\n<ol class=\"py\">\n<li>sh&#299;shi<\/li>\n<li>sh&#299;sh&#299;<\/li>\n<li>sh&#299;sh&#237;<\/li>\n<li>sh&#299;sh&#464;<\/li>\n<li>sh&#299;sh&#236;<\/li>\n<li>sh&#237;sh&#299;<\/li>\n<li>sh&#237;sh&#236;<\/li>\n<li>sh&#464;sh&#299;<\/li>\n<li>sh&#236;sh&#299;<\/li>\n<\/ol>\n<p>Irrespective of tones, entries with the vowel <span class=\"py\">u<\/span> precede those with <span class=\"py\">\u00fc<\/span>.<br \/>\nFor example:<\/p>\n<ol class=\"py\">\n<li>l&#250;<\/li>\n<li>l&#468;<\/li>\n<li>l&#249;<\/li>\n<li>l&#472;<\/li>\n<li>l&#474;<\/li>\n<li>l&#476;<\/li>\n<\/ol>\n<ol class=\"py\">\n<li>n&#249;<\/li>\n<li>n&#474;<\/li>\n<\/ol>\n<p>Entries without apostrophe precede those with apostrophe. For example:<\/p>\n<ol>\n<li><span class=\"py\">bi\u00e0n<\/span> &#8212; <em>argue<\/em><\/li>\n<li><span class=\"py\">b&#464;&rsquo;&#224;n<\/span> &#8212; <em>the other shore<\/em><\/li>\n<\/ol>\n<p>Lower-case entries precede upper-case entries. For example:<\/p>\n<ol>\n<li><span class=\"py\">h\u00f2uj\u00ecn<\/span> &#8212; <em>aftereffect<\/em><\/li>\n<li><span class=\"py\">H\u00f2u J\u00ecn<\/span> &#8212; <em>Later Jin dynasty<\/em><\/li>\n<\/ol>\n<p>For entries with identical spelling, including tones, arrangement is by order of frequency&#8230;.<\/p><\/blockquote>\n<p>For most users, the most important thing to note is that <strong>the neutral tone is regarded as 0, not as 5<\/strong>. 
Thus, the order is <em>not<\/em> &ldquo;<span class=\"py\">&#257; &#225; &#462; &#224; a,<\/span>&rdquo; but &ldquo;<strong><span class=\"py\">a &#257; &#225; &#462; &#224;.<\/span><\/strong>&rdquo; And, because <strong>lowercase comes before uppercase<\/strong>, <em>not<\/em> &ldquo;<span class=\"py\">A a &#256; &#257; &#193; &#225; &#461; &#462; &#192; &#224;<\/span>&rdquo; but &ldquo;<strong><span class=\"py\">a A &#257; &#256; &#225; &#193; &#462; &#461; &#224; &#192;<\/span><\/strong>.&rdquo;<\/p>\n<p>One can see this in action in the <a href=\"http:\/\/www.uhpress.hawaii.edu\/books\/defrancisChinese.pdf\"><em>A<\/em> entries for the <em>ABC English-Chinese, Chinese-English Dictionary<\/em><\/a>. And here are some <a href=\"http:\/\/www.chinesestudies.hawaii.edu\/abc\/excerpts.html\">sample pages from an earlier <em>ABC<\/em> dictionary<\/a>.<\/p>\n<p>The <em>ABC<\/em> series follows the example of the <a href=\"https:\/\/pinyin.info\/news\/2010\/hanyu-pinyin-cihui\/\"><em>Hanyu Pinyin Cihui<\/em> (&#27721;&#35821;&#25340;&#38899;&#35789;&#27719; \/ &#28450;&#35486;&#25340;&#38899;&#35422;&#24409; \/ <span class=\"py\">H&#224;ny&#468; P&#299;ny&#299;n C&#237;hu&#236;<\/span>)<\/a> (<a href=\"https:\/\/pinyin.info\/readings\/hanyu_pinyin_cihui.pdf\" title=\"Hanyu Pinyin Cihui\">example<\/a>), with only one minor difference, as noted by Tom Bishop: <\/p>\n<blockquote><p>HPC [<em>Hanyu Pinyin Cihui<\/em>] gave hyphens and spaces the same priority as apostrophes, so that l&#236;g&#333;ng sorted before l&#464;-g&#333;ng, in spite of the tones. Usage of hyphens and spaces in pinyin is still far from being fully standardized. (The same is true in English orthography.) Consequently, for collation it makes sense to give less weight to hyphens and spaces, and more weight to tones, thus sorting l&#464;-g&#333;ng before l&#236;g&#333;ng. 
In ABC, hyphens and spaces don&#8217;t affect the sort order unless they change the pronunciation in the same way that apostrophe would; for example, &#185;m&#237;ng-&#224;n &#26126;&#26263; and &#178;m&#237;ng&rsquo;&#224;n &#20901;&#26263; are treated as homophones, and they sort after m&#464;ng&#462;n &#25935;&#24863;.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>The standard for alphabetically sorting Hanyu Pinyin is given in the ABC dictionary series edited by John DeFrancis and issued by the University of Hawaii Press. Here&#8217;s the basic idea: The ordering is primarily simply alphabetical. Diacritical marks, punctuation, juncture &hellip; <a href=\"https:\/\/pinyin.info\/news\/2012\/pinyin-sort-order\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[104,12,92,106,97,28,32,20,600,19,127,105],"tags":[378],"class_list":["post-5660","post","type-post","status-publish","format-standard","hentry","category-alphabet","category-chinese","category-dictionary","category-hanyu","category-john-defrancis","category-languages","category-mandarin","category-pinyin","category-pinyin-rules","category-romanization","category-tonal-languages","category-tone-marks","tag-chinese-dictionary"],"_links":{"self":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/5660","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/comments?post=5660"}],"version-history":[{"count":39,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/566
0\/revisions"}],"predecessor-version":[{"id":7247,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/5660\/revisions\/7247"}],"wp:attachment":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/media?parent=5660"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/categories?post=5660"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/tags?post=5660"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}