Subtitle files are wonderful things. But for those times when you want to just read the text by itself and not bother with the movie (for example, if you want to prepare a script), they can look a little cluttered — what with all of that extra timing information.
1
00:00:49,000 --> 00:00:51,500
Yo! Li ye lai la
2
00:00:52,200 --> 00:00:53,600
Li ye lai la
3
00:01:06,900 --> 00:01:08,400
Xiulian
The directions below for how to remove all of the extra numbers, etc., refer to Microsoft Word, since most people already have that tool.
To strip out everything except for the text of the subtitles, run the following wildcard search (Ctrl+H > More > Use wildcards).
Find what:
^13[0-9:\,\-\> ]{1,}^13
Replace with:
^p
Replace all.
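Note: You may need to run the above “replace all” twice. Each match uses up the paragraph mark that the following line would need in order to match, so a single pass removes the sequence numbers but can leave the timing lines behind. Also, unless you add an extra return at the top of the document you’ll need to clean up the first entry by hand.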
The above search-and-replace will yield
Yo! Li ye lai la
Li ye lai la
Xiulian
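If you would rather script this than use Word, the same idea can be sketched in Python. This is only an illustration, not part of the instructions above: the function name `strip_timing` is made up for the example, the regular expression is a rough translation of the wildcard pattern, and the sample assumes the usual .srt layout with `-->` arrows and a blank line between entries.

```python
import re

# Rough regex translation of the Word wildcard pattern above: a newline,
# a run of digits / colons / commas / hyphens / ">" / spaces, then a newline.
TIMING_LINE = re.compile(r"\n[0-9:,\-> ]+\n")


def strip_timing(srt_text: str) -> str:
    """Remove sequence-number and timing lines (hypothetical helper)."""
    # Normalize line endings and add a leading newline so the first entry
    # matches too (the "extra return at the top" trick).
    text = "\n" + srt_text.replace("\r\n", "\n")
    # Each match consumes the newline the next line would need, so keep
    # repeating the replacement until a pass changes nothing.
    while True:
        stripped = TIMING_LINE.sub("\n", text)
        if stripped == text:
            break
        text = stripped
    return text.strip("\n")


sample = (
    "1\n00:00:49,000 --> 00:00:51,500\nYo! Li ye lai la\n\n"
    "2\n00:00:52,200 --> 00:00:53,600\nLi ye lai la\n\n"
    "3\n00:01:06,900 --> 00:01:08,400\nXiulian\n"
)
print(strip_timing(sample))
```

Any blank lines that separate entries in the original file are left in place. As with the Word pattern, a subtitle line made up only of digits and punctuation from the pattern would be stripped as well, so check the result.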
If, however, you want to keep the basic timing information at least temporarily (for example, to help you identify scene boundaries more quickly), you can do so as follows.
Find what (wildcards):
^13[0-9]{1,}^13([0-9\:]{1,})([0-9\:\-\> \,]{1,})^13
Replace with:
^p\1^p
Again, unless you add an extra return at the top of the document you’ll need to clean up the first entry by hand.
This will result in the document looking like this:
00:00:49
Yo! Li ye lai la
00:00:52
Li ye lai la
00:01:06
Xiulian
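A scripted version of this second replacement might look like the following Python sketch. It is an assumption-laden illustration rather than a translation of the Word pattern: `keep_start_times` is a hypothetical name, and the regular expression anchors on line starts and ends instead of consuming paragraph marks, so one pass is enough and the first entry needs no special handling.

```python
import re

# A sequence-number line followed by a timing line. Only the start time
# (HH:MM:SS) is captured; milliseconds, arrow and end time are discarded.
ENTRY_HEADER = re.compile(
    r"^[0-9]+\n([0-9]{2}:[0-9]{2}:[0-9]{2})[0-9:,\-> ]*$",
    re.MULTILINE,
)


def keep_start_times(srt_text: str) -> str:
    """Replace each number + timing pair with its start time (hypothetical helper)."""
    text = srt_text.replace("\r\n", "\n")
    return ENTRY_HEADER.sub(r"\1", text)


sample = (
    "1\n00:00:49,000 --> 00:00:51,500\nYo! Li ye lai la\n\n"
    "2\n00:00:52,200 --> 00:00:53,600\nLi ye lai la\n\n"
    "3\n00:01:06,900 --> 00:01:08,400\nXiulian\n"
)
print(keep_start_times(sample))
```

The sample again assumes a blank line between entries, as is usual in .srt files; those blank lines are preserved in the output.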
Once you’re through with the timing information, you can strip it out using the first search-and-replace above.