{"id":6640,"date":"2015-11-02T23:28:56","date_gmt":"2015-11-02T15:28:56","guid":{"rendered":"https:\/\/pinyin.info\/news\/?p=6640"},"modified":"2015-11-02T23:28:56","modified_gmt":"2015-11-02T15:28:56","slug":"utf-8-unicode-vs-other-encodings-over-time","status":"publish","type":"post","link":"https:\/\/pinyin.info\/news\/2015\/utf-8-unicode-vs-other-encodings-over-time\/","title":{"rendered":"UTF-8 Unicode vs. other encodings over time"},"content":{"rendered":"<p>Some eight years ago UTF-8 (Unicode) became the most used encoding on Web pages. At the time, though, it was used on only about 26% of Web pages, so it had a plurality but not an absolute majority. <\/p>\n<p><img decoding=\"async\" src=\"https:\/\/pinyin.info\/news\/news_photos\/2008\/05\/unicode_vs_ascii.gif\" alt=\"Graph showing growth of the UTF-8 encoding\" \/><\/p>\n<p>By the beginning of 2010 Unicode was rapidly approaching use on half of Web pages.<br \/>\n<a href=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2010.png\"><img decoding=\"async\" src=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2010.png\" alt=\"graph showing a steep rise in the use of UTF-8 and a steep decline in other major encodings\" width=\"500\" class=\"aligncenter size-full wp-image-6643\" srcset=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2010.png 800w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2010-300x225.png 300w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2010-400x300.png 400w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/a><\/p>\n<p>In 2012 the trends were holding up.<br \/>\n<a href=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2012.png\"><img decoding=\"async\" src=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2012.png\" alt=\"UTF-8_website_use_2001-2012\" width=\"500\" class=\"aligncenter 
size-full wp-image-6645\" srcset=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2012.png 834w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2012-300x149.png 300w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2001-2012-500x249.png 500w\" sizes=\"(max-width: 834px) 100vw, 834px\" \/><\/a><\/p>\n<p>Note that the 2008 crossover point appears different in the latter two Google graphs, which is why I&#8217;m showing all three graphs rather than just the third. <\/p>\n<p>A different source (with slightly different figures) provides us with a look at the situation up to the present, with <strong>UTF-8 now on 85% of Web pages<\/strong>. Expansion of UTF-8 is slowing somewhat. But that may be due largely to the continuing presence of older websites in non-Unicode encodings rather than lots of new sites going up in encodings other than UTF-8.<br \/>\n<a href=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2010-2015.png\"><img decoding=\"async\" src=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2010-2015.png\" alt=\"growth in Unicode UTF-8 encoding on Web pages, 2010-2015\" width=\"500\" class=\"aligncenter size-full wp-image-6647\" srcset=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2010-2015.png 1849w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2010-2015-300x128.png 300w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2010-2015-1024x436.png 1024w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/UTF-8_website_use_2010-2015-500x213.png 500w\" sizes=\"(max-width: 1849px) 100vw, 1849px\" \/><\/a><\/p>\n<p>Here&#8217;s the same chart, but focusing on encodings (other than UTF-8) that use Chinese characters, so the percentages are relatively low.<br \/>\n<a href=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015.png\"><img 
decoding=\"async\" src=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015.png\" alt=\"asian_language_encodings_2010-2015\" width=\"500\" class=\"aligncenter size-full wp-image-6650\" srcset=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015.png 1849w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015-300x128.png 300w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015-1024x436.png 1024w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015-500x213.png 500w\" sizes=\"(max-width: 1849px) 100vw, 1849px\" \/><\/a><\/p>\n<p>And here&#8217;s the same as the above, but with the results for individual languages combined.<br \/>\n<a href=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015_by_language.png\"><img decoding=\"async\" src=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015_by_language.png\" alt=\"asian_language_encodings_2010-2015_by_language\" width=\"500\" class=\"aligncenter size-full wp-image-6651\" srcset=\"https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015_by_language.png 1849w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015_by_language-300x128.png 300w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015_by_language-1024x436.png 1024w, https:\/\/pinyin.info\/news\/news_photos\/2015\/10\/asian_language_encodings_2010-2015_by_language-500x213.png 500w\" sizes=\"(max-width: 1849px) 100vw, 1849px\" \/><\/a><\/p>\n<p>By the way, Pinyin.info has been in UTF-8 since the site began way back in 2001. 
The reason that Chinese characters and Pinyin with tone marks appear scrambled within Pinyin News is that a hack caused the WordPress database to be set to <em>Swedish<\/em> (the latin1_swedish_ci collation), of all things. I haven&#8217;t been able to get it fixed, so for the time being I&#8217;ve given up trying. One of these days&#8230;.<\/p>\n<p>Sources: <\/p>\n<ul>\n<li><a href=\"https:\/\/pinyin.info\/news\/2008\/unicode-tops-other-encodings-on-web-pages-google\/\">Unicode tops other encodings on Web pages: Google<\/a>, May 7, 2008, Pinyin News<\/li>\n<li><a href=\"https:\/\/googleblog.blogspot.tw\/2010\/01\/unicode-nearing-50-of-web.html\">Unicode nearing 50% of the web<\/a>, January 28, 2010, Google Official Blog<\/li>\n<li><a href=\"https:\/\/googleblog.blogspot.ch\/2012\/02\/unicode-over-60-percent-of-web.html\">Unicode over 60 percent of the web<\/a>, February 3, 2012, Google Official Blog<\/li>\n<li><a href=\"http:\/\/w3techs.com\/technologies\/history_overview\/character_encoding\/ms\/y\">Historical yearly trends in the usage of character encodings for websites<\/a>, accessed October 27, 2015, W3Techs<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Some eight years ago UTF-8 (Unicode) became the most used encoding on Web pages. At the time, though, it was used on only about 26% of Web pages, so it had a plurality but not an absolute majority. 
By the &hellip; <a href=\"https:\/\/pinyin.info\/news\/2015\/utf-8-unicode-vs-other-encodings-over-time\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12,15,29,13,594],"tags":[879],"class_list":["post-6640","post","type-post","status-publish","format-standard","hentry","category-chinese","category-chinese-characters","category-japanese","category-kanji","category-unicode-writing-systems","tag-utf-8"],"_links":{"self":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/6640","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/comments?post=6640"}],"version-history":[{"count":9,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/6640\/revisions"}],"predecessor-version":[{"id":6654,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/6640\/revisions\/6654"}],"wp:attachment":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/media?parent=6640"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/categories?post=6640"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/tags?post=6640"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}