Talk:Wycliffe Bible

From CrossWire Bible Society
Revision as of 19:59, 21 March 2011 by David Haslam (talk | contribs) (WikiSource Book Application: == WikiSource Conversion to USFM ==)


What prompted me to do this?

This exercise was prompted by something one of my friends posted on Facebook. He had pasted a verse from our existing Wycliffe module, which contains only the Pentateuch and the Gospels and was derived from Sergej Fedosov's Slavic Bible. IMHO, it should be fairly straightforward to create a SWORD module for the entire Wycliffe Bible, although we would need to use av11n to include the 10 Deuterocanonical books. David Haslam 15:58, 17 March 2011 (UTC)

Is the Slavic Bible website now defunct? I got a gateway timeout yesterday. David Haslam 15:36, 19 March 2011 (UTC)

Collaboration

I have uploaded the USFM files to a private folder in my box.net account. If anyone in CrossWire would like to collaborate further on this work, please contact me. I can easily make the folder suitable for collaboration. David Haslam 16:16, 17 March 2011 (UTC)

TextPipe filter details

Here is an exported view of the bespoke TextPipe filter I used to convert the data to USFM:

TextPipe Single User Edition
Purchased by: David Haslam, David Haslam

Filter Title: C:\Users\David\TextPipe Filters\Special filter to convert Wycliffe Bible from the Wesley Center Online to rudimentary USFM.fll

Filter List
-----------
Filter options
|  [ ] Log to file
|  [X] Append to logfile
|  Log filename: textpipe.log
|  Threshold 500
|
|--Input from file(s)
|     [ ] Confirm before processing each file
|     [ ] Confirm before processing read/only files
|     [ ] Delete input files after processing
|     Process binary files
|   
|--Comment...
|  |  Special filter to convert Wycliffe Bible from the Wesley Center Online to rudimentary USFM
|  |
|  |--Comment...
|  |  |  Convert ANSI to UTF-8
|  |  |
|  |  +--Convert from ANSI to UTF-8
|  |      
|  |--Comment...
|  |  |  Convert chapter numbers to tags
|  |  |
|  |  |--Comment...
|  |  |  |  Deal first with various exceptions
|  |  |  |
|  |  |  |--Perl pattern [^\QCAP. I.\E] with [CAP 1]
|  |  |  |     [ ] Match case
|  |  |  |     [ ] Whole words only
|  |  |  |     [ ] Case sensitive replace
|  |  |  |     [ ] Prompt on replace
|  |  |  |     [ ] Skip prompt if identical
|  |  |  |     [ ] First only
|  |  |  |     [ ] Extract matches
|  |  |  |     Maximum text buffer size 4096
|  |  |  |     [ ] Maximum match (greedy)
|  |  |  |     [ ] Allow comments
|  |  |  |     [X] '.' matches newline
|  |  |  |     [ ] UTF-8 Support
|  |  |  |   
|  |  |  |--Perl pattern [^\QPALM\E] with [CAP]
|  |  |  |     [ ] Match case
|  |  |  |     [ ] Whole words only
|  |  |  |     [ ] Case sensitive replace
|  |  |  |     [ ] Prompt on replace
|  |  |  |     [ ] Skip prompt if identical
|  |  |  |     [ ] First only
|  |  |  |     [ ] Extract matches
|  |  |  |     Maximum text buffer size 4096
|  |  |  |     [ ] Maximum match (greedy)
|  |  |  |     [ ] Allow comments
|  |  |  |     [X] '.' matches newline
|  |  |  |     [ ] UTF-8 Support
|  |  |  |   
|  |  |  +--Perl pattern [^\QPSALM\E] with [CAP]
|  |  |        [ ] Match case
|  |  |        [ ] Whole words only
|  |  |        [ ] Case sensitive replace
|  |  |        [ ] Prompt on replace
|  |  |        [ ] Skip prompt if identical
|  |  |        [ ] First only
|  |  |        [ ] Extract matches
|  |  |        Maximum text buffer size 4096
|  |  |        [ ] Maximum match (greedy)
|  |  |        [ ] Allow comments
|  |  |        [X] '.' matches newline
|  |  |        [ ] UTF-8 Support
|  |  |      
|  |  +--Perl pattern [^CAP (\d+)$] with []
|  |     |  [ ] Match case
|  |     |  [ ] Whole words only
|  |     |  [ ] Case sensitive replace
|  |     |  [ ] Prompt on replace
|  |     |  [ ] Skip prompt if identical
|  |     |  [ ] First only
|  |     |  [ ] Extract matches
|  |     |  Maximum text buffer size 4096
|  |     |  [ ] Maximum match (greedy)
|  |     |  [ ] Allow comments
|  |     |  [ ] '.' matches newline
|  |     |  [X] UTF-8 Support
|  |     |
|  |     +--Replace [CAP] with [\\c]
|  |           [ ] Match case
|  |           [ ] Whole words only
|  |           [ ] Case sensitive replace
|  |           [ ] Prompt on replace
|  |           [ ] Skip prompt if identical
|  |           [ ] First only
|  |           [ ] Extract matches
|  |         
|  |--Comment...
|  |  |  Convert verse numbers to tags
|  |  |
|  |  +--Perl pattern [^(\d+) ] with []
|  |     |  [ ] Match case
|  |     |  [ ] Whole words only
|  |     |  [ ] Case sensitive replace
|  |     |  [ ] Prompt on replace
|  |     |  [ ] Skip prompt if identical
|  |     |  [ ] First only
|  |     |  [ ] Extract matches
|  |     |  Maximum text buffer size 4096
|  |     |  [X] Maximum match (greedy)
|  |     |  [ ] Allow comments
|  |     |  [ ] '.' matches newline
|  |     |  [X] UTF-8 Support
|  |     |
|  |     +--Perl pattern [^(\d+)] with [\\v $1]
|  |           [ ] Match case
|  |           [ ] Whole words only
|  |           [ ] Case sensitive replace
|  |           [ ] Prompt on replace
|  |           [ ] Skip prompt if identical
|  |           [ ] First only
|  |           [ ] Extract matches
|  |           Maximum text buffer size 4096
|  |           [X] Maximum match (greedy)
|  |           [ ] Allow comments
|  |           [ ] '.' matches newline
|  |           [X] UTF-8 Support
|  |         
|  +--Comment...
|     |  Add remarks, filenames and convert to ID and Header tags
|     |
|     |--Add file header [\\rem John Wycliffe's Translation of the Bible]
|     |   
|     |--Add file header [@inputFilename]
|     |   
|     |--Restrict lines:Line 1 .. line 1
|     |  |
|     |  +--Perl pattern [^(\d+)_(...)\Q.TXT\E] with [\\id $2\r\n\\h $2]
|     |        [ ] Match case
|     |        [ ] Whole words only
|     |        [ ] Case sensitive replace
|     |        [ ] Prompt on replace
|     |        [ ] Skip prompt if identical
|     |        [ ] First only
|     |        [ ] Extract matches
|     |        Maximum text buffer size 4096
|     |        [X] Maximum match (greedy)
|     |        [ ] Allow comments
|     |        [ ] '.' matches newline
|     |        [X] UTF-8 Support
|     |      
|     +--Restrict lines:Line 2 .. line 2
|        |
|        +--Replace list: D:\Download\Java\GoBibleCreator\Download Other\Wesley Center Online\ID to Book.csv Perl pattern
|              [X] Match case
|              [X] Whole words only
|              [ ] Case sensitive replace
|              [ ] Prompt on replace
|              [ ] Skip prompt if identical
|              [ ] First only
|              [ ] Extract matches
|              Maximum text buffer size 4096
|              [ ] Maximum match (greedy)
|              [ ] Allow comments
|              [X] '.' matches newline
|              [ ] UTF-8 Support
|            
+--Output to file(s)
      [ ] Only update date on changed files
      [ ] Append mode
      [X] Change extension to: .usfm
      [ ] Open output file
      Only output modified files
      Output folder: D:\Download\Java\GoBibleCreator\Download Other\Wesley Center Online\USFM
      [ ] Maintain folder structure
      [ ] Remove empty output files    

Files List
----------
D:\Download\Java\GoBibleCreator\Download Other\Wesley Center Online\wycbible\AP\*.txt
D:\Download\Java\GoBibleCreator\Download Other\Wesley Center Online\wycbible\NT\*.txt
D:\Download\Java\GoBibleCreator\Download Other\Wesley Center Online\wycbible\OT\*.txt
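
For anyone without TextPipe, the chapter, verse and header steps of the filter above can be approximated in Python. This is only a sketch under stated assumptions, not the filter itself. The names to_rudimentary_usfm and header_lines are made up for illustration, the book_names mapping stands in for ID to Book.csv (whose contents are not shown here), and the order of the header lines is a guess. It matches the chapter and verse patterns case-sensitively and assumes \n line endings, whereas the export shows "Match case" unticked.

```python
import re

# Exceptions handled first, mirroring the three patterns in the filter export.
EXCEPTIONS = [
    (re.compile(r"^CAP\. I\.", re.MULTILINE), "CAP 1"),
    (re.compile(r"^PALM", re.MULTILINE), "CAP"),
    (re.compile(r"^PSALM", re.MULTILINE), "CAP"),
]
# A chapter heading is a whole line of the form "CAP <number>".
CHAPTER = re.compile(r"^CAP (\d+)$", re.MULTILINE)
# A verse is a line that starts with a number followed by a space.
VERSE = re.compile(r"^(\d+) ", re.MULTILINE)
# File names are assumed to look like "<digits>_<3-letter ID>.TXT".
FILENAME = re.compile(r"^(\d+)_(...)\.TXT", re.IGNORECASE)


def to_rudimentary_usfm(text):
    """Turn chapter headings into \\c tags and verse numbers into \\v tags."""
    for pattern, replacement in EXCEPTIONS:
        text = pattern.sub(replacement, text)
    text = CHAPTER.sub(r"\\c \1", text)
    text = VERSE.sub(r"\\v \1 ", text)
    return text


def header_lines(filename, book_names):
    """Build \\rem, \\id and \\h lines from a file name (line order assumed)."""
    match = FILENAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    book_id = match.group(2)
    return [
        r"\rem John Wycliffe's Translation of the Bible",
        rf"\id {book_id}",
        # Fall back to the bare ID when the mapping has no entry for it.
        rf"\h {book_names.get(book_id, book_id)}",
    ]
```

Adding re.IGNORECASE to the chapter and verse patterns would bring the sketch closer to the unticked "Match case" option, at the cost of also matching lower-case lines that happen to begin with "palm" or "psalm".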

WikiSource Book Application

For the first time in my life, I've just tried the WikiSource Book Application. I added all 76 pages for the Wycliffe Bible to a special book, then downloaded it in OpenDocument Text format (*.odt), which can be opened using WordPad. The complete text is thus available for further processing, and potentially for comparison with the Wesley Center Online text from which it was derived. David Haslam 17:47, 18 March 2011 (UTC)

One doesn't even need to have an account to use the Book Creator. The completed work may even be shareable. Here's a short URL for the one I just made. http://bit.ly/eFYSbl David Haslam 17:51, 18 March 2011 (UTC)

WikiSource Conversion to USFM

I made a separate TextPipe filter to process the WikiSource file, after first converting the downloaded ODT file to Unicode using WordPad. The single text file was split automatically into 76 USFM files. The split was made at a pattern match, so the file numbers do not match the numbers I assigned manually to the Wesley Center files.
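
The actual split pattern is not recorded here, so the following Python sketch only illustrates the general idea of splitting one text file into numbered files at a pattern. It assumes, purely for illustration, that every book begins with a line starting with \id. The function names split_books and write_books and the NN.usfm naming are hypothetical.

```python
import re
from pathlib import Path


def split_books(text, marker=r"^\\id "):
    """Split text into chunks, each starting at a line that matches marker.

    Anything before the first match is discarded. The default marker
    (a line beginning with \\id) is an assumption, not the real pattern.
    """
    starts = [m.start() for m in re.finditer(marker, text, flags=re.MULTILINE)]
    if not starts:
        return []
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


def write_books(chunks, out_dir):
    """Write each chunk to out_dir as 01.usfm, 02.usfm, ... in split order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, chunk in enumerate(chunks, start=1):
        path = out / f"{number:02d}.usfm"
        path.write_text(chunk, encoding="utf-8")
        paths.append(path)
    return paths
```

Because the files are numbered in the order the pattern is found, the numbering follows the order of books in the source text rather than any manually assigned scheme.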

Versification errors in the WikiSource edition

The only correction in this edition was to the verse numbers for Psalm 9.

The following additional errors were detected:

File	Chapter	Verse
04_Numbers.usfm	31	15
54_Romaynes.usfm	3	17
04_Numbers.usfm	31	34

The text for these three verses is missing. David Haslam 19:59, 21 March 2011 (UTC)
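
How these gaps were found is not stated above. One way to look for similar problems is a simple scan of each USFM file, sketched below in Python. It assumes every verse sits on a single line beginning with \v and reports both verse numbers that are skipped and verses that have no text. The function name find_missing_verses is hypothetical.

```python
import re

CHAPTER = re.compile(r"^\\c (\d+)")
VERSE = re.compile(r"^\\v (\d+)\s*(.*)$")


def find_missing_verses(usfm):
    """Return (chapter, verse) pairs that are skipped or have empty text."""
    problems = []
    chapter = 0
    expected = 1
    for line in usfm.splitlines():
        match = CHAPTER.match(line)
        if match:
            # A new chapter restarts the verse count at 1.
            chapter = int(match.group(1))
            expected = 1
            continue
        match = VERSE.match(line)
        if not match:
            continue
        number = int(match.group(1))
        text = match.group(2).strip()
        # Any number between the expected verse and this one was skipped.
        for skipped in range(expected, number):
            problems.append((chapter, skipped))
        # A verse marker with nothing after it counts as missing text.
        if not text:
            problems.append((chapter, number))
        expected = max(expected, number + 1)
    return problems
```

A check like this only catches gaps within a chapter. It cannot tell whether a chapter ends too early, since it has no table of expected verse counts.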