{"id":3295,"date":"2010-03-31T19:00:01","date_gmt":"2010-03-31T11:00:01","guid":{"rendered":"https:\/\/pinyin.info\/news\/?p=3295"},"modified":"2010-03-31T14:30:55","modified_gmt":"2010-03-31T06:30:55","slug":"how-to-strip-subtitle-files-down-to-text","status":"publish","type":"post","link":"https:\/\/pinyin.info\/news\/2010\/how-to-strip-subtitle-files-down-to-text\/","title":{"rendered":"How to strip subtitle files down to text"},"content":{"rendered":"<p>Subtitle files are wonderful things. But for those times when you want to just read the text by itself and not bother with the movie (for example, if you want to prepare a script), they can look a little cluttered, what with all of that extra timing information. <\/p>\n<blockquote><p>\n1<br \/>\n00:00:49,000 --&gt; 00:00:51,500<br \/>\nYo! Li ye lai la<\/p>\n<p>2<br \/>\n00:00:52,200 --&gt; 00:00:53,600<br \/>\nLi ye lai la<\/p>\n<p>3<br \/>\n00:01:06,900 --&gt; 00:01:08,400<br \/>\nXiulian<\/p><\/blockquote>\n<p>The directions below for how to remove all of the extra numbers, etc., refer to Microsoft Word, since most people already have that tool. <\/p>\n<p>To strip out everything except for the text of the subtitles, run the following wildcard search (CTRL+H --&gt; More --&gt; Use wildcards). <\/p>\n<p>Find what:<br \/>\n<code>^13[0-9:\\,\\-\\> ]{1,}^13<\/code><\/p>\n<p>Replace with:<br \/>\n<code>^p<\/code><\/p>\n<p>Replace all. <\/p>\n<p>Note: You may need to run the above \u201creplace all\u201d twice, since a paragraph mark consumed by one match cannot also begin the next match. Also, unless you add an extra return at the top of the document you&#8217;ll need to clean up the first entry by hand. <\/p>\n<p>The above search-and-replace will yield: <\/p>\n<blockquote><p>\nYo! Li ye lai la<\/p>\n<p>Li ye lai la<\/p>\n<p>Xiulian<\/p><\/blockquote>\n<p>If, however, you want to at least temporarily keep the basic timing information (such as to help you identify scene boundaries more quickly), you can do so as follows. 
<\/p>\n<p>Find what (wildcards):<br \/>\n<code>^13[0-9]{1,}^13([0-9\\:]{1,})([0-9\\:\\-\\> \\,]{1,})^13<\/code><\/p>\n<p>Replace with:<br \/>\n<code>^p\\1^p<\/code><\/p>\n<p>Again, unless you add an extra return at the top of the document you&#8217;ll need to clean up the first entry by hand. <\/p>\n<p>This will result in the document looking like this: <\/p>\n<blockquote><p>00:00:49<br \/>\nYo! Li ye lai la<\/p>\n<p>00:00:52<br \/>\nLi ye lai la<\/p>\n<p>00:01:06<br \/>\nXiulian<\/p><\/blockquote>\n<p>Once you&#8217;re through with the timing information, you can strip it out using the first search-and-replace above. <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Subtitle files are wonderful things. But for those times when you want to just read the text by itself and not bother with the movie (for example, if you want to prepare a script), they can look a little cluttered &hellip; <a href=\"https:\/\/pinyin.info\/news\/2010\/how-to-strip-subtitle-files-down-to-text\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38,73],"tags":[],"class_list":["post-3295","post","type-post","status-publish","format-standard","hentry","category-software","category-subtitles"],"_links":{"self":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/3295","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/comments?post=3295"}],"version-history":[{"count":15,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/3295\/revisions"}],"predecessor-version":[{"id":3311,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/posts\/3295\/revisions\/3311"}],"wp:attachment":[{"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/media?parent=3295"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/categories?post=3295"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/pinyin.info\/news\/wp-json\/wp\/v2\/tags?post=3295"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}