Difference between revisions of "Osis2mod"
David Haslam (talk | contribs) (→Transformations: :''This section may need further updating''.) |
(→Module and Testament Introductions: Making progress toward having Module and Testament Introductions) |
||
(32 intermediate revisions by 3 users not shown) | |||
Line 7: | Line 7: | ||
== Current status == | == Current status == | ||
− | Software bugs relating to osis2mod should be reported in | + | Software bugs relating to osis2mod should be reported in https://tracker.crosswire.org/browse/MODTOOLS |
:''Please describe current status of osis2mod, including a list of any outstanding issues or unsolved difficulties''. | :''Please describe current status of osis2mod, including a list of any outstanding issues or unsolved difficulties''. | ||
Line 18: | Line 18: | ||
!Revision | !Revision | ||
!Feature | !Feature | ||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2025-07-07''' | ||
+ | | | ||
+ | r3907 | ||
+ | | | ||
+ | * MODTOOLS-25 give location in input for messages. | ||
+ | |||
+ | Created identifyMsg which creates a standardized message for reporting and used it for all messages to standard out. | ||
+ | |||
+ | The message is of the form: type(kind)[linePos,charPos] osisID=osisID: message | ||
+ | |||
+ | Where: | ||
+ | type The message type (e.g., "ERROR", "WARNING", "INFO"). | ||
+ | kind The message category or kind (e.g., "REF", "PARSE"). | ||
+ | linePos The position in the file of the last line that was read. | ||
+ | charPos The position in the line of the last character that was read. | ||
+ | osisID (Optional) The current OSIS ID to include. May be nullptr or empty. | ||
+ | message event description with details | ||
+ | |||
+ | * If linePos is 0, the position ([linePos,charPos]) is omitted. | ||
+ | * If osisID is nullptr or empty, the osisID part is omitted. | ||
+ | * The returned string always ends with a colon and a trailing space (": "). | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2025-06-25''' | ||
+ | | | ||
+ | r3906 | ||
+ | | | ||
+ | * MODTOOLS-76: Enhance -v argument handling in osis2mod with case-insensitive and prefix matching, and improved error reporting. | ||
+ | ** Outputs the name of the versification system being used at program start | ||
+ | ** Adds layered resolution of versifications: | ||
+ | ## Case-sensitive exact match | ||
+ | ## Case-insensitive exact match | ||
+ | ## Case-insensitive prefix match | ||
+ | ** Shows detailed error messages when input is invalid or ambiguous: | ||
+ | *** Lists all matches in case of ambiguity | ||
+ | *** Lists input and valid options if no match is found | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2025-06-19''' | ||
+ | | | ||
+ | r3903 | ||
+ | | | ||
+ | * MODTOOLS-108 fix. Properly append verses beyond the last chapter of a book at the end of the book in the last verse of the chapter, which caused an infinite loop. | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2020-07-26''' | ||
+ | | | ||
+ | r3769 | ||
+ | | | ||
+ | * Fixed linking bug with ranges in osis2mod | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2020-05-08''' | ||
+ | | | ||
+ | r3737 | ||
+ | | | ||
+ | * Added support putting reponumber into nano position of SWORD version string and output current SWORD version | ||
+ | * standardize on (SWORD: {SWVersion::currentVersion.getText()}) for all reported SWORD versions in utilities | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2016-08-16''' | ||
+ | | | ||
+ | r3431 | ||
+ | | | ||
+ | * commented out a set of lines handling "div type="majorSection" " - see comments in code. | ||
+ | |||
|-valign="top" | |-valign="top" | ||
| | | | ||
Line 24: | Line 97: | ||
r3401 | r3401 | ||
| | | | ||
− | * | + | * Added entity handling. |
− | * | + | * div type='colophon' now not changed to milestones. |
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2015-02-16''' | ||
+ | | | ||
+ | r3322 | ||
+ | | | ||
+ | * Correct casing of COUT | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2015-02-16''' | ||
+ | | | ||
+ | r3321 | ||
+ | | | ||
+ | * cleaned up whitespace in osis2mod | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2014-12-17''' | ||
+ | | | ||
+ | r3310 | ||
+ | | | ||
+ | * MODTOOLS-55 endless loop fix. | ||
+ | |||
+ | |-valign="top" | ||
+ | | | ||
+ | '''2014-12-15''' | ||
+ | | | ||
+ | r3307 | ||
+ | | | ||
+ | * Greg Helling's patch to osis2mod for majorSections | ||
|-valign="top" | |-valign="top" | ||
Line 253: | Line 358: | ||
Osis2mod performs the following transformations<ref>These transformations are all performed "under the hood" as it were. Tweaking OSIS XML files to fix problems with pre-verse titles, etc., was never intended to be done by module developers as part of the preprocessing before using osis2mod.</ref>: | Osis2mod performs the following transformations<ref>These transformations are all performed "under the hood" as it were. Tweaking OSIS XML files to fix problems with pre-verse titles, etc., was never intended to be done by module developers as part of the preprocessing before using osis2mod.</ref>: | ||
+ | |||
* '''Whitespace''' -- Allows for human-readable OSIS files. | * '''Whitespace''' -- Allows for human-readable OSIS files. | ||
**Leading whitespace on books, chapters and verses is removed | **Leading whitespace on books, chapters and verses is removed | ||
**Whitespace is normalized into blanks | **Whitespace is normalized into blanks | ||
**Multiple adjacent whitespace is reduced to a single space | **Multiple adjacent whitespace is reduced to a single space | ||
− | * '''Unicode handling''' - All modules should be UTF-8, NFC | + | |
+ | * '''Unicode handling''' - All modules should be UTF-8, NFC.<ref>With the possible exception of Biblical Hebrew and some Indic scripts that should not be normalized.</ref> | ||
**Latin-1 (cp1252 and iso8859-1) are converted into UTF-8 | **Latin-1 (cp1252 and iso8859-1) are converted into UTF-8 | ||
**UTF-8 is normalized into NFC, unless specified otherwise (i.e. by using the -N option). | **UTF-8 is normalized into NFC, unless specified otherwise (i.e. by using the -N option). | ||
− | * '''Milestone conversion''' - necessary for frontends to show a verse at a time. | + | |
− | ** <q ...>...</q> | + | * '''Verse ranges''' - also known as linked verses. |
− | ** <p ...>...</p> becomes <div | + | ** Verses within a specified OSIS verse range are each linked to the first verse of the range. |
− | ** <chapter ...>...</chapter> becomes <chapter sID=" | + | |
− | ** <closer ...>...</closer> becomes <closer sID=" | + | * '''Milestone conversion''' - necessary for frontends to show a verse at a time. |
− | ** <div ...>...</div> becomes <div sID=" | + | ** <tt><q ...>...</q></tt> becomes <tt><q sID="gen#" .../>...<q eID="gen#" .../></tt><ref>'''gen#''' is unique for an sID/eID pair, where '''#''' is a number.</ref><ref>Quotes with who="Jesus" are not transformed at this stage.</ref> |
− | ** <l ...>...</l> becomes <l sID=" | + | ** <tt><p ...>...</p></tt> becomes <tt><div sID="gen#" type="x-p" .../>...<div eID="gen#" type="x-p" .../></tt> | ||
− | ** <lg ...>...</lg> becomes <lg sID=" | + | ** <tt><chapter ...>...</chapter></tt> becomes <tt><chapter sID="gen#" .../>...<chapter eID="gen#" .../></tt> |
− | ** <salute ...>...</salute> becomes <salute sID=" | + | ** <tt><closer ...>...</closer></tt> becomes <tt><closer sID="gen#" .../>...<closer eID="gen#" .../></tt> |
− | ** <signed ...>...</signed> becomes <signed sID=" | + | ** <tt><div ...>...</div></tt> becomes <tt><div sID="gen#" .../>...<div eID="gen#" .../></tt><ref>As of r3401, <div type="colophon"...> is unchanged.</ref> |
− | ** <speech ...>...</speech> becomes <speech sID=" | + | ** <tt><l ...>...</l></tt> becomes <tt><l sID="gen#" .../>...<l eID="gen#" .../></tt> |
− | ** <verse ...>...</verse> becomes (when using -d 2 for debugging.) <milestone resp="v" sID=" | + | ** <tt><lg ...>...</lg></tt> becomes <tt><lg sID="gen#" .../>...<lg eID="gen#" .../></tt> |
− | * '''Words of Christ''' - necessary for front-ends to appropriately highlight the WOC, a verse at a time. | + | ** <tt><salute ...>...</salute></tt> becomes <tt><salute sID="gen#" .../>...<salute eID="gen#" .../></tt> |
− | ** <q sID=" | + | ** <tt><signed ...>...</signed></tt> becomes <tt><signed sID="gen#" .../>...<signed eID="gen#" .../></tt> |
− | ** <q who="Jesus" ...>...</q> becomes <q who="Jesus" marker=""><q sID=" | + | ** <tt><speech ...>...</speech></tt> becomes <tt><speech sID="gen#" .../>...<speech eID="gen#" .../></tt> |
− | ** Within the following construct, <q who="Jesus" marker="">...</q> will surround verse text. | + | ** <tt><verse ...>...</verse></tt> becomes (when using -d 2 for debugging.)<BR><tt><milestone resp="v" sID="gen#" ... />...<milestone resp="v" eID="gen#" ... /></tt> |
− | * '''Pre-Verse Titles''' | + | |
− | ** Titles immediately preceeding a verse are converted into <title type=" | + | * '''Words of Christ''' - necessary for front-ends to appropriately highlight the WOC, a verse at a time.<ref>The examples given have the null string as the quotation marks in the '''marker''' attribute (as in the KJV module), but proper quotation marks are also supported.</ref> |
− | + | ** <tt><q sID="gen#" who="Jesus" .../>...<q eID="gen#" who="Jesus" .../></tt> becomes<BR><tt><q who="Jesus" marker=""><q sID="gen#" .../>...<q eID="gen#" .../></q></tt> | |
− | * '''InterVerse Content''' | + | ** <tt><q who="Jesus" ...>...</q></tt> becomes<BR><tt><q who="Jesus" marker=""><q sID="gen#" .../>...<q eID="gen#" .../></q></tt> |
− | ** InterVerse Content refers to all content not contained by the verse element.<ref>For OSIS files derived from USFM files, the implication of this requirement is the following rule:< | + | ** Within the following construct, <tt><q who="Jesus" marker="">...</q></tt> will surround verse text. |
− | except where the translation places the title somewhere within the verse text in the USFM file.</ref> | + | |
+ | * '''Pre-Verse Titles'''<ref>The older method that fixed titles only became obsolete with SVN revision 2358 for the SWORD 1.6.0 release and replaced with '''InterVerse Content'''.</ref> | ||
+ | ** InterVerse or postVerse<ref>For example, the '''colophon''' div as found at the end of some Epistles.</ref> content not in titles is appended to the prior verse. | ||
+ | ** Titles immediately preceding a verse are converted into either<BR><tt><title canonical="true" type="psalm">...</title></tt> or<BR><tt><title canonical="false" type="sub">...</title></tt><ref>type="sub" is used irrespective of whether the non-canonical title is a section title or a subSection title.</ref><BR>and the whole '''title''' element is then wrapped within a special preverse '''div''' element. ''See below''. | ||
+ | |||
+ | * '''InterVerse Content'''<ref>Introduced with SVN revision 2358 for the SWORD 1.6.0 release.</ref> | ||
+ | ** InterVerse Content refers to all content not contained by the verse element.<ref>i.e. Content between the '''eID''' milestone of a verse and the '''sID''' milestone of the next verse, or before the first verse of the chapter.</ref><ref>For OSIS files derived from USFM files, the implication of this requirement is the following rule: | ||
+ | <blockquote>Do not place a title (or similar element) between a matching pair of verse milestones<BR> | ||
+ | except where the translation places the title somewhere within the verse text in the USFM file.</blockquote></ref> | ||
** Such content is divided between the prior and the current verse. | ** Such content is divided between the prior and the current verse. | ||
** Content appended to the prior verse is not marked in any special way. | ** Content appended to the prior verse is not marked in any special way. | ||
− | ** Content prepended to the current verse is marked with <div | + | ** Content prepended<ref>This ensures that when the verse is called the prepended content is also displayed, subject to whatever filters are in place.</ref> to the current verse is marked with<BR><tt><div type="x-milestone" subType="x-preverse" sID="pv#"/>...<div type="x-milestone" subType="x-preverse" eID="pv#"/>.</tt> |
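The three rules listed under '''Whitespace''' above can be sketched as follows. This is only an illustration under stated assumptions: <tt>normalizeWhitespace</tt> is a hypothetical helper, not a function from the osis2mod source, and it treats only space, tab, newline and carriage return as whitespace.

```cpp
#include <string>

// Hypothetical helper (not taken from osis2mod): applies the three
// whitespace rules from the list above to a single text fragment.
std::string normalizeWhitespace(const std::string &in) {
    std::string out;
    bool pendingSpace = false;  // a run of whitespace is waiting to be emitted
    for (unsigned char c : in) {
        const bool isSpace = (c == ' ' || c == '\t' || c == '\n' || c == '\r');
        if (isSpace) {
            // Rule 1: leading whitespace is dropped, so a run is only
            // remembered once some non-whitespace text has been written.
            if (!out.empty()) pendingSpace = true;
            continue;
        }
        if (pendingSpace) {
            // Rules 2 and 3: any run of whitespace becomes one blank.
            out += ' ';
            pendingSpace = false;
        }
        out += static_cast<char>(c);
    }
    // A trailing run is also reduced to a single blank; the list above
    // only says that leading whitespace is removed.
    if (pendingSpace) out += ' ';
    return out;
}
```

Calling <tt>normalizeWhitespace</tt> on text that was indented and wrapped for readability gives a single line with single blanks between words, which is why OSIS files can be kept human-readable.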
'''Notes:''' | '''Notes:''' | ||
Line 296: | Line 411: | ||
# In the following, the effects of the above transformations are not shown. The tagging of the pre-verse material is also not shown. | # In the following, the effects of the above transformations are not shown. The tagging of the pre-verse material is also not shown. | ||
− | ===Module and Testament Introductions=== | + | ===Module Introduction=== |
+ | |||
+ | At this time, osis2mod does not fully support module introductions. It is intended that the following will work. All material after the </header> and before a div with a type of book or bookGroup is a module heading. | ||
+ | |||
+ | A module introduction should be placed into testament 0, book 0, chapter 0, verse 0. | ||
+ | In SWORD, module introductions have the special ID: | ||
+ | *<nowiki>[ Module Heading ]</nowiki> | ||
+ | and can be accessed by setting a VerseKey's testament as | ||
+ | // In order to access headings, aka intros, one has to set intros to true. | ||
+ | vk.setIntros(true); | ||
+ | // Setting the testament will also set book, chapter and verse to 0 when intros is set to true | ||
+ | vk.setTestament(0); | ||
+ | |||
+ | ===Testament Introductions=== | ||
+ | At this time, osis2mod does not fully support testament introductions. It is the intention that the following will work. | ||
− | + | A testament introduction contains all the material that comes after the module introduction and before the first book of the Old Testament (for the Old Testament introduction), or before the first book of the New Testament (for the New Testament introduction).<br/> | |
+ | Note: here the first OT or NT book means the first one that occurs in the OSIS XML. | ||
+ | ||
+ | The books must be contained in <div type="bookGroup"> elements: either one group holding all the books, or several groups holding sets of books (e.g. OT, NT, Apocrypha, Torah, History, Major Prophets, Minor Prophets), with groups nested as desired. | ||
+ | |||
+ | Minimal example of a module consisting of 2 books, Psalms and John: | ||
+ | <div type="bookGroup"> | ||
+ | ... Old Testament introductory material ... | ||
+ | <div type="book" osisID="Ps"> | ||
+ | ... Chapters of Psalms ... | ||
+ | </div> <!-- end of Psalms --> | ||
+ | ... New Testament introductory material ... | ||
+ | <div type="book" osisID="Jn"> | ||
+ | </div> <!-- end of John --> | ||
+ | </div> <!-- end of the Bible --> | ||
+ | |||
+ | Example of testaments in separate book groups. Again using a module of 2 books. | ||
+ | <div type="bookGroup" subType="x-OT"> | ||
+ | ... Old Testament introductory material ... | ||
+ | <div type="book" osisID="Ps"> | ||
+ | ... Chapters of Psalms ... | ||
+ | </div> <!-- end of Psalms --> | ||
+ | </div> <!-- end of the OT --> | ||
+ | <div type="bookGroup" subType="x-NT"> | ||
+ | ... New Testament introductory material ... | ||
+ | <div type="book" osisID="Jn"> | ||
+ | </div> <!-- end of John --> | ||
+ | </div> <!-- end of the NT --> | ||
+ | |||
+ | Example of testaments in separate book groups with the Apocrypha book group. | ||
+ | <div type="bookGroup" subType="x-Old"> | ||
+ | ... Old Testament introductory material ... | ||
+ | <div type="book" osisID="Ps"> | ||
+ | ... Chapters of Psalms ... | ||
+ | </div> <!-- end of Psalms --> | ||
+ | </div> <!-- end of the OT --> | ||
+ | <div type="bookGroup" subType="x-non-canon"> | ||
+ | ... Books of the Apocrypha ... | ||
+ | </div> <!-- end of the Deuterocanon --> | ||
+ | <div type="bookGroup" subType="x-New"> | ||
+ | ... New Testament introductory material ... | ||
+ | <div type="book" osisID="Jn"> | ||
+ | </div> <!-- end of John --> | ||
+ | </div> <!-- end of the New --> | ||
+ | |||
+ | A testament introduction should be placed into testament 1 or 2, book 0, chapter 0, verse 0. | ||
+ | In SWORD, testament introductions have the special IDs: | ||
+ | *<nowiki>[ Testament 1 Heading ]</nowiki> | ||
+ | *<nowiki>[ Testament 2 Heading ]</nowiki> | ||
+ | and can be accessed by setting a VerseKey's testament as | ||
+ | // In order to access headings, aka intros, one has to set intros to true. | ||
+ | vk.setIntros(true); | ||
+ | // Setting the testament will also set book, chapter and verse to 0 when intros is set to true | ||
+ | vk.setTestament(1); | ||
===Book Introductions and Titles=== | ===Book Introductions and Titles=== | ||
Line 521: | Line 703: | ||
Osis2mod has robust, mind-boggling messages. These are provided here in the hope that they will help with problem diagnosis. | Osis2mod has robust, mind-boggling messages. These are provided here in the hope that they will help with problem diagnosis. | ||
− | + | Messages are output with a standard format: | |
− | + | TYPE(KIND)[line,column'](osisID'): Message | |
− | + | ||
− | + | '''TYPE''' is one of | |
− | + | * FATAL - Usually accompanied by an immediate exit. The problem should be fixed and osis2mod rerun. | |
− | + | * ERROR - A non-fatal problem that should be fixed before the module is used. | |
− | + | * WARNING - A problem with the input that probably should be fixed. | |
+ | * INFO - Information about what the program is doing. | ||
+ | * DEBUG - Managed by the -d flag. | ||
+ | |||
+ | '''KIND''' is one of: | ||
+ | * UTF8 - Deals with conversion from Latin-1 to UTF8. | ||
+ | * V11N - Messages related to Versification. | ||
+ | * WRITE - Messages related to writing to the module. | ||
+ | * LINK - Messages related to linked verses. | ||
+ | * REF - Messages related to the normalization of osis references to SWORD references. | ||
+ | * NESTING - Messages related to improper overlapping of BCV and BSP | ||
+ | * COMMENTS - Messages related to XML comment processing. | ||
+ | * PARSE - Messages related to XML entity processing. | ||
+ | * QUOTE - Handling of quotes, especially the Words of Christ (WoC) | ||
+ | * TITLE - Handling of Introductions and Titles | ||
+ | * INTERVERSE - Handling of material between verses and before verse 1. | ||
+ | * FOUND - Diagnostics related to finding of Books, Chapters and Verses. | ||
+ | * ARGS - Summary of command line arguments. | ||
+ | Some of these are described more fully below. | ||
+ | |||
+ | '''Line''' and '''Column''' give the position of the last line and column read from the file. Because processing happens after reading, the location is approximate. For messages issued before the file is read, such as during the parsing of command line arguments, the location is not shown. | ||
+ | |||
+ | '''osisID''' gives the osisID being processed or N/A. It also is optionally output early in the execution of the program. | ||
+ | |||
+ | In the following, example values are given in '''{...}'''. The brackets do not actually appear in the message. Also, the messages are a bit prettier here than in reality. | ||
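As a rough illustration of the format described above, the sketch below assembles a message prefix of the form <tt>TYPE(KIND)[line,column](osisID): </tt>. It assumes the rules given in the r3907 changelog entry (the position is omitted when the line is 0, the osisID part is omitted when empty, and the prefix ends with a colon and a space). <tt>formatPrefix</tt> is a hypothetical name; this is not the actual <tt>identifyMsg</tt> code, and the real program may print N/A rather than omit the osisID.

```cpp
#include <string>

// Hypothetical sketch of a message prefix builder; not the osis2mod source.
std::string formatPrefix(const std::string &type, const std::string &kind,
                         unsigned long line, unsigned long col,
                         const std::string &osisID) {
    std::string out = type + "(" + kind + ")";
    if (line != 0) {
        // Assumption: a line of 0 means nothing has been read yet,
        // so no position is available.
        out += "[" + std::to_string(line) + "," + std::to_string(col) + "]";
    }
    if (!osisID.empty()) {
        out += "(" + osisID + ")";
    }
    out += ": ";  // the prefix always ends with a colon and a space
    return out;
}
```

For example, <tt>formatPrefix("WARNING", "UTF8", 12, 40, "Gen.1.1")</tt> yields <tt>WARNING(UTF8)[12,40](Gen.1.1): </tt>, to which the message text is appended.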
+ | |||
===Diagnostic Messages=== | ===Diagnostic Messages=== | ||
− | + | ||
− | + | WARNING(UTF8)['''{ line }''','''{ col }''']('''{ osisID }'''): Should be converted to UTF-8 ('''{ text }''') | |
The program will always check for text that is not UTF-8. | The program will always check for text that is not UTF-8. | ||
− | INFO(UTF8) | + | INFO(UTF8)['''{ line }''','''{ col }''']('''{ osisID }'''): Converting to UTF-8 ('''{ text before conversion }''') |
Text that is converted to UTF-8 is noted. | Text that is converted to UTF-8 is noted. | ||
− | ERROR(UTF8) | + | ERROR(UTF8)['''{ line }''','''{ col }''']('''{ osisID }'''): Converting to UTF-8 ('''{ text after first conversion }''') |
It is an error if after a conversion it still is not UTF-8. | It is an error if after a conversion it still is not UTF-8. | ||
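The conversion reported by these messages can be pictured with the following sketch. It is an assumption-laden illustration, not the osis2mod implementation: <tt>latin1ToUtf8</tt> is a hypothetical helper that handles ISO-8859-1 only, where every byte maps directly to the Unicode code point of the same value. The cp1252 characters in the range 0x80 to 0x9F would need a lookup table that is not shown here.

```cpp
#include <string>

// Hypothetical helper: converts ISO-8859-1 bytes to UTF-8.
// cp1252-specific characters (0x80-0x9F) are NOT handled specially.
std::string latin1ToUtf8(const std::string &in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte becomes two
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII is identical in Latin-1 and UTF-8.
            out += static_cast<char>(c);
        } else {
            // Code points U+0080..U+00FF take two bytes in UTF-8:
            // 110000xx 10xxxxxx
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Running such a conversion on text that is already UTF-8 would corrupt it, which is one reason to check the encoding first and to report an error if the result is still not valid UTF-8.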
− | WARNING(UTF8): osis2mod is not compiled with support for ICU. Ignoring -n flag. | + | WARNING(UTF8)['''{ line }''','''{ col }''']('''{ osisID }'''): osis2mod is not compiled with support for ICU. Ignoring -n flag. |
Normalization was requested, but since osis2mod was not compiled for it, it cannot honor the default request. | Normalization was requested, but since osis2mod was not compiled for it, it cannot honor the default request. | ||
− | INFO(V11N) | + | INFO(V11N)['''{ line }''','''{ col }''']('''{ osisID }'''): is not in the '''{ v11n }''' versification. |
Indicates that a verse is not in the versification. | Indicates that a verse is not in the versification. | ||
− | INFO(V11N) | + | INFO(V11N)['''{ line }''','''{ col }''']('''{ osisID }'''): is not in the '''{ v11n }''' versification. Appending content to '''{ osisID }''' |
This, like the previous message, indicates a versification problem, but it also shows where the text will be found. Osis2mod preserves all module content for supported books. | This, like the previous message, indicates a versification problem, but it also shows where the text will be found. Osis2mod preserves all module content for supported books. | ||
− | WARNING(V11N): New book is ''' | + | WARNING(V11N)['''{ line }''','''{ col }''']('''{ osisID }'''): New book is '''{ name }''' and is not in '''{ v11n }''' versification, ignoring |
The name of the book was not recognized as belonging to the chosen versification; the book and all of its content are ignored. | The name of the book was not recognized as belonging to the chosen versification; the book and all of its content are ignored. | ||
− | INFO(WRITE): Appending entry: ''' | + | INFO(WRITE)['''{ line }''','''{ col }''']('''{ osisID }'''): Appending entry: '''{ osisID }''': '''{ text so far }''' |
Osis2mod encountered text that needs to be appended to a verse that is already in the module. This could indicate that | Osis2mod encountered text that needs to be appended to a verse that is already in the module. This could indicate that | ||
* the reference is in the input twice. This typically indicates a problem. | * the reference is in the input twice. This typically indicates a problem. | ||
Line 558: | Line 765: | ||
* osis2mod is being run in append mode to fix a verse in the module. | * osis2mod is being run in append mode to fix a verse in the module. | ||
− | INFO(LINK): Linking ''' | + | INFO(LINK)['''{ line }''','''{ col }''']('''{ osisID }'''): Linking '''{ osisID }''' to '''{ osisID }''' |
An osisID such as "Gen.1.1 Gen.1.2 Gen.1.3" was used and the latter are linked to the first. | An osisID such as "Gen.1.1 Gen.1.2 Gen.1.3" was used and the latter are linked to the first. | ||
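The linking described here can be sketched as follows: the osisID attribute value is split on whitespace, and every ID after the first is paired with the first. <tt>linkPairs</tt> is a hypothetical helper written for this illustration; it is not taken from osis2mod and does not validate the IDs against a versification.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: for an osisID value holding several IDs, returns
// one (linked verse, target verse) pair for each ID after the first.
std::vector<std::pair<std::string, std::string>>
linkPairs(const std::string &osisIDAttr) {
    std::istringstream in(osisIDAttr);
    std::vector<std::pair<std::string, std::string>> links;
    std::string first;
    if (!(in >> first)) return links;  // empty attribute: nothing to link
    std::string id;
    while (in >> id) {
        links.emplace_back(id, first);  // each later ID links to the first
    }
    return links;
}
```

For "Gen.1.1 Gen.1.2 Gen.1.3" this gives two pairs, Gen.1.2 to Gen.1.1 and Gen.1.3 to Gen.1.1, which corresponds to two INFO(LINK) messages.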
− | ERROR(REF): Invalid osisID/annotateRef: ''' | + | ERROR(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): Invalid osisID/annotateRef: '''{ invalid attribute value }''' |
This indicates that the SWORD library was unable to parse the osisID or annotateRef. | This indicates that the SWORD library was unable to parse the osisID or annotateRef. | ||
− | FATAL(NESTING) | + | FATAL(NESTING)['''{ line }''','''{ col }''']('''{ osisID }'''): tag expected |
This indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. Typically, this indicates an end tag that did not have a matching begin tag and all tags before it were properly paired. | This indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. Typically, this indicates an end tag that did not have a matching begin tag and all tags before it were properly paired. | ||
− | FATAL(NESTING) | + | FATAL(NESTING)['''{ line }''','''{ col }''']('''{ osisID }'''): Expected '''{ topToken.getName() }''' found '''{ tokenName }''' |
This also indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. It could be either a begin or an end tag problem. | This also indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. It could be either a begin or an end tag problem. | ||
− | WARNING(NESTING): verse ''' | + | WARNING(NESTING)['''{ line }''','''{ col }''']('''{ osisID }'''): verse '''{ currentOsisID }''' is not well formed:('''{ verseDepth }''','''{ tagDepth }''') |
This indicates that the verse probably will not show properly in some front-ends in some circumstances. Typically, it shows the problem if the verse is shown in isolation. | This indicates that the verse probably will not show properly in some front-ends in some circumstances. Typically, it shows the problem if the verse is shown in isolation. | ||
− | ERROR(NESTING): improper nesting ''' | + | ERROR(NESTING)['''{ line }''','''{ col }''']('''{ osisID }'''): improper nesting '''{ currentOsisID }''': matching (sID,eID) not found. Looking at ('''{ sID }''','''{ eID }''') |
OSIS specifies that every sID has a matching eID. Osis2mod is checking that BSP elements are properly nested. | OSIS specifies that every sID has a matching eID. Osis2mod is checking that BSP elements are properly nested. | ||
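The kind of check behind the NESTING messages can be illustrated with a simple stack of open tags. This is a minimal sketch under assumptions: the <tt>Tag</tt> struct and <tt>checkNesting</tt> function are hypothetical, the input is assumed to be already tokenized, and the real osis2mod logic (which also tracks verse and tag depths and sID/eID pairs) is more involved.

```cpp
#include <string>
#include <vector>

// Hypothetical token: an element name plus whether it is an end tag.
struct Tag {
    std::string name;
    bool isEnd;
};

// Returns an empty string when every begin tag is closed in order,
// otherwise a short description of the first problem found.
std::string checkNesting(const std::vector<Tag> &tags) {
    std::vector<std::string> open;  // stack of currently open elements
    for (const Tag &t : tags) {
        if (!t.isEnd) {
            open.push_back(t.name);
            continue;
        }
        if (open.empty()) {
            return "end tag " + t.name + " has no matching begin tag";
        }
        if (open.back() != t.name) {
            // The innermost open element must be the one being closed.
            return "Expected " + open.back() + " found " + t.name;
        }
        open.pop_back();
    }
    if (!open.empty()) {
        return "begin tag " + open.back() + " was never closed";
    }
    return "";
}
```

Pairing begin and end tags by hand in a raw text module, as suggested above, is the manual equivalent of walking this stack.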
− | FATAL(COMMENTS): unknown commentstate on comment start: | + | FATAL(COMMENTS)['''{ line }''','''{ col }''']('''{ osisID }'''): unknown commentstate on comment start: { comment state } |
This indicates that the comment is not of the form <!-- ... -->. | This indicates that the comment is not of the form <!-- ... -->. | ||
− | FATAL(COMMENTS): unknown commentstate on comment end: | + | FATAL(COMMENTS)['''{ line }''','''{ col }''']('''{ osisID }'''): unknown commentstate on comment end: { comment state } |
This indicates that the comment is not of the form <!-- ... -->. | This indicates that the comment is not of the form <!-- ... -->. | ||
− | WARNING(PARSE): malformed entity, replacing &''' | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): malformed entity, replacing &'''{ malformed entity }''' with &amp;'''{ malformed entity }''' |
This means it found a & that starts an entity but it wasn't terminated by a ; and changed it to something that probably isn't appropriate. | This means it found a & that starts an entity but it wasn't terminated by a ; and changed it to something that probably isn't appropriate. | ||
− | WARNING(PARSE): HEX entity must begin with &x, found ''' | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): HEX entity must begin with &x, found '''{ entity }''' |
This indicates &X, which is not valid for xml. It is not changed. | This indicates &X, which is not valid for xml. It is not changed. | ||
− | WARNING(PARSE): SWORD does not search HEX entities, found ''' | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): SWORD does not search HEX entities, found '''{ entity }''' |
− | WARNING(PARSE): SWORD does not search numeric entities, found ''' | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): SWORD does not search numeric entities, found '''{ entity }''' |
Since we don't transform HEX or numeric entities to their equivalent UTF-8 value, this will be literal text that cannot be searched. | Since we don't transform HEX or numeric entities to their equivalent UTF-8 value, this will be literal text that cannot be searched. | ||
− | WARNING(PARSE): XML only supports 5 Character entities &amp;, &lt;, &gt;, &quot; and &apos;, found ''' | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): XML only supports 5 Character entities &amp;, &lt;, &gt;, &quot; and &apos;, found '''{ entity }''' |
XML predefines only these five character entities. Any other named entity is passed unchanged to the SWORD module. This frustrates search, and the entity may display properly only in a front-end using HTML rendering. | XML predefines only these five character entities. Any other named entity is passed unchanged to the SWORD module. This frustrates search, and the entity may display properly only in a front-end using HTML rendering. | ||
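The entity categories behind the PARSE warnings above can be summarized in a small classifier. This is a sketch only: <tt>EntityKind</tt> and <tt>classifyEntity</tt> are hypothetical names, the digits of numeric references are not validated, and osis2mod's own parsing may draw the lines differently.

```cpp
#include <string>

enum class EntityKind { Predefined, Hex, Numeric, OtherNamed, Malformed };

// Hypothetical classifier. 'entity' is the text from '&' up to and
// including the terminating ';' (if there is one).
EntityKind classifyEntity(const std::string &entity) {
    if (entity.size() < 3 || entity.front() != '&' || entity.back() != ';') {
        return EntityKind::Malformed;  // e.g. a '&' with no closing ';'
    }
    const std::string body = entity.substr(1, entity.size() - 2);
    if (body == "amp" || body == "lt" || body == "gt" ||
        body == "quot" || body == "apos") {
        return EntityKind::Predefined;  // the five entities XML defines
    }
    if (body[0] == '#') {
        if (body.size() < 2) return EntityKind::Malformed;
        if (body[1] == 'x') return EntityKind::Hex;
        if (body[1] == 'X') return EntityKind::Malformed;  // upper-case X is not valid XML
        return EntityKind::Numeric;
    }
    return EntityKind::OtherNamed;  // e.g. HTML names such as nbsp
}
```

In terms of the messages above, <tt>Hex</tt> and <tt>Numeric</tt> correspond to the "does not search" warnings, <tt>OtherNamed</tt> to the "only supports 5 Character entities" warning, and <tt>Malformed</tt> to the malformed-entity and upper-case X cases.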
− | WARNING(PARSE): &quot; is unnecessary outside of attribute values. Replace with ". | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): &quot; is unnecessary outside of attribute values. Replace with ". |
− | WARNING(PARSE): &quot; is unnecessary inside single quoted attribute values. Replace with ". | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): &quot; is unnecessary inside single quoted attribute values. Replace with ". |
− | WARNING(PARSE): &quot; is only needed within double quoted attribute values. Considering using single quoted attribute and replacing with ". | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): &quot; is only needed within double quoted attribute values. Considering using single quoted attribute and replacing with ". |
− | WARNING(PARSE): &apos; is only needed within single quoted attribute values. Considering using double quoted attribute and replacing with '. | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): &apos; is only needed within single quoted attribute values. Considering using double quoted attribute and replacing with '. |
− | WARNING(PARSE): &apos; is unnecessary inside double quoted attribute values. Replacing with '. | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): &apos; is unnecessary inside double quoted attribute values. Replacing with '. |
− | WARNING(PARSE): &apos; is unnecessary outside of attribute values. Replacing with '. | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): &apos; is unnecessary outside of attribute values. Replacing with '. |
The &apos; or &quot; was found outside of an attribute value and is being substituted with a straight single apostrophe, ', or double quote, ", respectively. | The &apos; or &quot; was found outside of an attribute value and is being substituted with a straight single apostrophe, ', or double quote, ", respectively. | ||
− | WARNING(PARSE): While valid for XML, &quot; is only needed within double quoted attribute values. | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): While valid for XML, &quot; is only needed within double quoted attribute values. |
This warning indicates that the entity is present where it does not need to be. | This warning indicates that the entity is present where it does not need to be. | ||
− | WARNING(PARSE): While valid for XML, XHTML does not support &apos;. | + | WARNING(PARSE)['''{ line }''','''{ col }''']('''{ osisID }'''): While valid for XML, XHTML does not support &apos;. |
This warning indicates that the &apos; may not display properly in all front-ends. It was found in an attribute value. | This warning indicates that the &apos; may not display properly in all front-ends. It was found in an attribute value. | ||
Line 613: | Line 820: | ||
'''-d 1'''<br/> | '''-d 1'''<br/> | ||
− | Output of what is being written to the module | + | Output of what is being written to the module. |
− | DEBUG(WRITE) | + | DEBUG(WRITE)['''{ line }''','''{ col }''']('''{ osisID }'''): '''{ text so far }''' |
+ | |||
+ | '''-d 2'''<br/> | ||
+ | Adds milestones for the <verse> and </verse> of the form <milestone resp="v" ... with attributes of the verse element ... /><br/> | ||
+ | This is especially useful in viewing the dat file of an uncompressed module. | ||
'''-d 4'''<br/> | '''-d 4'''<br/> | ||
A stack is maintained to represent the Words of Christ on a per-verse basis. This is an internal diagnostic of that stack. | A stack is maintained to represent the Words of Christ on a per-verse basis. This is an internal diagnostic of that stack. | ||
− | DEBUG(QUOTE) | + | DEBUG(QUOTE)['''{ line }''','''{ col }''']('''{ osisID }'''): quote top('''{ quote stack size }''') '''{ token }''' |
− | DEBUG(QUOTE) | + | DEBUG(QUOTE)['''{ line }''','''{ col }''']('''{ osisID }'''): quote pop('''{ quote stack size }''') '''{ topToken }''' -- '''{ token }''' |
− | DEBUG(QUOTE) | + | DEBUG(QUOTE)['''{ line }''','''{ col }''']('''{ osisID }'''): ('''{ quote stack size }''') '''{ topToken }''' -- '''{ token }''' |
'''-d 8'''<br/> | '''-d 8'''<br/> | ||
Identifies when book and chapter introductions are being determined. | Identifies when book and chapter introductions are being determined. | ||
− | DEBUG(TITLE) | + | DEBUG(TITLE)['''{ line }''','''{ col }''']('''{ osisID }'''): OOPS INTRO |
− | inChapterIntro = ''' | + | inChapterIntro = '''{ inChapterIntro }''' |
− | inBookIntro = ''' | + | inBookIntro = '''{ inBookIntro }''' |
− | DEBUG(TITLE) | + | DEBUG(TITLE)['''{ line }''','''{ col }''']('''{ osisID }'''): Looking for book introduction |
− | DEBUG(TITLE) | + | DEBUG(TITLE)['''{ line }''','''{ col }''']('''{ osisID }'''): Done looking for book introduction |
− | DEBUG(TITLE) | + | DEBUG(TITLE)['''{ line }''','''{ col }''']('''{ osisID }'''): BOOK INTRO '''{ heading }''' |
− | DEBUG(TITLE) | + | DEBUG(TITLE)['''{ line }''','''{ col }''']('''{ osisID }'''): Looking for chapter introduction |
− | DEBUG(TITLE) | + | DEBUG(TITLE)['''{ line }''','''{ col }''']('''{ osisID }'''): Done looking for chapter introduction |
− | DEBUG(TITLE) | + | DEBUG(TITLE)['''{ line }''','''{ col }''']('''{ osisID }'''): CHAPTER INTRO '''{ heading }''' |
'''-d 16'''<br/> | '''-d 16'''<br/> | ||
Inter-verse material either goes with the prior "verse" or the next. This helps diagnose problems related to that split. | Inter-verse material either goes with the prior "verse" or the next. This helps diagnose problems related to that split. | ||
− | DEBUG(INTERVERSE) | + | DEBUG(INTERVERSE)['''{ line }''','''{ col }''']('''{ osisID }'''): Interverse start token '''{ token }''':'''{ text }''' |
− | DEBUG(INTERVERSE) | + | DEBUG(INTERVERSE)['''{ line }''','''{ col }''']('''{ osisID }'''): Interverse end tag: '''{ tokenName }'''('''{ tagDepth }''','''{ chapterDepth }''','''{ bookDepth }''') |
− | DEBUG(INTERVERSE) | + | DEBUG(INTERVERSE)['''{ line }''','''{ col }''']('''{ osisID }'''): Appending interverse end tag: '''{tokenName }'''('''{ tagDepth }''','''{ chapterDepth }''','''{ bookDepth }''') |
'''-d 32'''<br/> | '''-d 32'''<br/> | ||
The following messages relate to the transformations of containers to milestones. | The following messages relate to the transformations of containers to milestones. | ||
− | DEBUG(XFORM) | + | DEBUG(XFORM)['''{ line }''','''{ col }''']('''{ osisID }'''): Transform start tag from '''{ orig }''' to '''{ transformed }''' |
− | + | DEBUG(XFORM)['''{ line }''','''{ col }''']('''{ osisID }'''): Transform end tag from '''{ orig }''' to '''{ transformed }''' | |
− | DEBUG(XFORM) | ||
− | |||
'''-d 64'''<br/> | '''-d 64'''<br/> | ||
Occasionally a verse reference is outside of the chosen versification. These messages help to understand difficulties that osis2mod has in storing extra-canonical material in the module. | Occasionally a verse reference is outside of the chosen versification. These messages help to understand difficulties that osis2mod has in storing extra-canonical material in the module. | ||
− | DEBUG(V11N): ''' | + | DEBUG(V11N)['''{ line }''','''{ col }''']('''{ osisID }'''): {'''{ caller }'''} normalizes to '''{ after }''' |
− | DEBUG(V11N): Chapter max:''' | + | DEBUG(V11N)['''{ line }''','''{ col }''']('''{ osisID }'''): Chapter max:'''{ chapterMax }''', Verse Max:'''{ verseMax }''' |
'''-d 128'''<br/> | '''-d 128'''<br/> | ||
OSIS ids and references can be of a form that SWORD cannot parse. Osis2mod contains a routine that munges these into a form that SWORD can understand. | OSIS ids and references can be of a form that SWORD cannot parse. Osis2mod contains a routine that munges these into a form that SWORD can understand. | ||
− | DEBUG(REF): | + | DEBUG(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): VerseKey can parse this as is. |
− | DEBUG(REF): Found a work prefix ''' | + | DEBUG(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): Found a range marker. Progress: '''{ progress }''' Remaining: '''{ remaining }''' |
− | DEBUG(REF): | + | DEBUG(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): Found a work prefix '''{ workPrefix }''' Progress: '''{ progress }''' Remaining: '''{ remaining }''' |
− | DEBUG(REF): Found a grain suffix ''' | + | DEBUG(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): Found an osisID:'''{ osisID }''' Progress: '''{ progress }''' Remaining: '''{ remaining }''' |
− | DEBUG(REF) | + | DEBUG(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): Found a grain suffix '''{ grain }''' Progress: '''{ progress }''' Remaining: '''{ remaining }''' |
− | + | DEBUG(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): Replacing space with ; Progress: '''{ progress }''' Remaining: '''{ remaining }''' | |
− | DEBUG(REF): | + | DEBUG(REF)['''{ line }''','''{ col }''']('''{ osisID }'''): Parseable VerseKey -- '''{ osisRef }''' |
'''-d 256'''<br/> | '''-d 256'''<br/> | ||
Osis2mod contains two stacks to validate proper nesting of BSP and BCV, respectively. This is an internal representation of the BCV stacks. It provides additional information to understand the diagnostic nesting messages. | Osis2mod contains two stacks to validate proper nesting of BSP and BCV, respectively. This is an internal representation of the BCV stacks. It provides additional information to understand the diagnostic nesting messages. | ||
− | DEBUG(STACK) | + | DEBUG(STACK)['''{ line }''','''{ col }''']('''{ osisID }'''): push ('''{ stack size}''') '''{ tokenName }''' |
− | DEBUG(STACK) | + | DEBUG(STACK)['''{ line }''','''{ col }''']('''{ osisID }'''): pop('''{ tagDepth }''') '''{ topToken.getName() }''' |
'''-d 512'''<br/> | '''-d 512'''<br/> | ||
These are general debug messages. | These are general debug messages. | ||
− | DEBUG(FOUND): Found first div and pitching prior material: ''' | + | DEBUG(FOUND)['''{ line }''','''{ col }''']('''{ osisID }'''): Found first div and pitching prior material: '''{ text }''' |
− | DEBUG(FOUND) | + | DEBUG(FOUND)['''{ line }''','''{ col }''']('''{ osisID }'''): Found new book. |
− | DEBUG(FOUND): Current chapter is '''[ | + | DEBUG(FOUND)['''{ line }''','''{ col }''']('''{ osisID }'''): Current chapter is '''{ currentOsisID }''' |
− | + | DEBUG(FOUND)['''{ line }''','''{ col }''']('''{ osisID }'''): Entering verse | |
− | DEBUG(FOUND) | + | DEBUG(FOUND)['''{ line }''','''{ col }''']('''{ osisID }'''): New current verse |
− | DEBUG(FOUND) | + | DEBUG(FOUND)['''{ line }''','''{ col }''']('''{ osisID }'''): End of header found. |
− | DEBUG(COMMENTS): in comment | + | DEBUG(COMMENTS)['''{ line }''','''{ col }''']('''{ osisID }'''): in comment |
− | DEBUG(COMMENTS): out of comment | + | DEBUG(COMMENTS)['''{ line }''','''{ col }''']('''{ osisID }'''): out of comment |
+ | DEBUG(ARGS): | ||
+ | path: '''{ path }''' | ||
+ | osisDoc: '''{ osisDoc }''' | ||
+ | create: '''{ append }''' | ||
+ | compressType: '''{ compType }''' | ||
+ | blockType: '''{ iType }''' | ||
+ | cipherKey: '''{ cipherKey }''' | ||
+ | normalize: '''{ normalize }''' | ||
− | + | ===Exit Status=== | |
− | + | When an error occurs that causes osis2mod to exit without processing the entire input file, a non-zero exit status is supplied to the caller. Here are the codes that osis2mod uses: | |
− | + | const int EXIT_BAD_ARG = 1; // Bad parameter given for program | |
− | + | const int EXIT_NO_WRITE = 2; // Could not open the module for writing | |
− | + | const int EXIT_NO_CREATE = 3; // Could not create the module | |
− | + | const int EXIT_NO_READ = 4; // Could not open the input file for reading. | |
− | + | const int EXIT_BAD_NESTING = 5; // BSP or BCV nesting is bad or improper XML comment | |
− | |||
− | |||
− | |||
== Future Roadmap == | == Future Roadmap == | ||
Line 702: | Line 916: | ||
== See also == | == See also == | ||
− | |||
* [[OSIS]] – a partial list of other OSIS related pages and external links. | * [[OSIS]] – a partial list of other OSIS related pages and external links. | ||
− | |||
* [[Mod2zmod]] | * [[Mod2zmod]] | ||
Latest revision as of 16:50, 14 July 2025
Introduction
osis2mod transforms an OSIS encoded Bible or commentary into a SWORD module.
Where to get it
Current status
Software bugs relating to osis2mod should be reported in https://tracker.crosswire.org/browse/MODTOOLS
- Please describe current status of osis2mod, including a list of any outstanding issues or unsolved difficulties.
History of Changes
The following outlines in reverse chronological order the major changes to osis2mod. When several changes were made over the span of a few days, they are lumped into the most recent date. Bug fixes are not mentioned.
Date | Revision | Feature |
---|---|---|
2025-07-07 | r3907 | MODTOOLS-25: give location in input for messages. Created identifyMsg, which creates a standardized message for reporting, and used it for all messages to standard out. The message is of the form: type(kind)[linePos,charPos] osisID=osisID: message, where type is the message type (e.g., "ERROR", "WARNING", "INFO"); kind is the message category or kind (e.g., "REF", "PARSE"); linePos is the position in the file of the last line that was read; charPos is the position in the line of the last character that was read; osisID (optional) is the current OSIS ID to include, and may be nullptr or empty; message is the event description with details. If linePos is 0, the position ([linePos,charPos]) is omitted. If osisID is nullptr or empty, the osisID part is omitted. The returned string always ends with a colon and a trailing space (": "). |
2025-06-25 | r3906 | |
2025-06-19 | r3903 | |
2020-07-26 | r3769 | |
2020-05-08 | r3737 | |
2016-08-16 | r3431 | |
2016-02-06 | r3401 | |
2015-02-16 | r3322 | |
2015-02-16 | r3321 | |
2014-12-17 | r3310 | |
2014-12-15 | r3307 | |
2014-03-17 | r3139 | |
2014-01-21 | r3011 | |
2012-03-24 | r2693 | |
2011-11-12 | r2671 | |
2010-06-04 | r2519 | |
2009-06-06 | r2435 | |
2009-05-30 | r2421 | |
2009-05-14 | r2413 | |
2009-04-28 | r2358 | |
2009-04-26 | r2345 | |
2008-09-11 | r2196 | |
2008-02-29 | r2141 | |
2007-09-27 | r2090 | |
2007-05-13 | r2050 | |
2007-05-01 | r2044 | |
2007-04-24 | r2038 | |
2006-07-15 | r1948 | |
2006-07-04 | r1914 | |
2005-12-22 | r1876 | |
2005-05-02 | r1790 | |
2005-02-03 | r1707 | |
2004-06-12 | r1620 | |
2004-05-19 | r1597 | |
2003-11-21 | r1407 | |
2003-05-26 | r1183 | |
Transformations
- This section may need further updating.
Osis2mod performs the following transformations[1]:
- Whitespace -- Allows for human-readable OSIS files.
- Leading whitespace on books, chapters and verses is removed
- Whitespace is normalized into blanks
- Multiple adjacent whitespace is reduced to a single space
- Unicode handling - All modules should be UTF-8, NFC.[2]
- Latin-1 (cp1252 and iso8859-1) are converted into UTF-8
- UTF-8 is normalized into NFC, unless specified otherwise (i.e. by using the -N option).
- Verse ranges - also known as linked verses.
- Verses within a specified OSIS verse range are each linked to the first verse of the range.
- Milestone conversion - necessary for frontends to show a verse at a time.
- <q ...>...</q> becomes <q sID="gen#" .../>...<q eID="gen#" .../>[3][4]
- <p ...>...</p> becomes <div sID="gen#" type="x-p" .../>...<div eID="gen#" type="x-p" .../>
- <chapter ...>...</chapter> becomes <chapter sID="gen#" .../>...<chapter eID="gen#" .../>
- <closer ...>...</closer> becomes <closer sID="gen#" .../>...<closer eID="gen#" .../>
- <div ...>...</div> becomes <div sID="gen#" .../>...<div eID="gen#" .../>[5]
- <l ...>...</l> becomes <l sID="gen#" .../>...<l eID="gen#" .../>
- <lg ...>...</lg> becomes <lg sID="gen#" .../>...<lg eID="gen#" .../>
- <salute ...>...</salute> becomes <salute sID="gen#" .../>...<salute eID="gen#" .../>
- <signed ...>...</signed> becomes <signed sID="gen#" .../>...<signed eID="gen#" .../>
- <speech ...>...</speech> becomes <speech sID="gen#" .../>...<speech eID="gen#" .../>
- <verse ...>...</verse> becomes (when using -d 2 for debugging.)
<milestone resp="v" sID="gen#" ... />...<milestone resp="v" eID="gen#" ... />
- Words of Christ - necessary for front-ends to appropriately highlight the WOC, a verse at a time.[6]
- <q sID="gen#" who="Jesus" .../>...<q eID="gen#" who="Jesus" .../> becomes
<q who="Jesus" marker=""><q sID="gen#" .../>...<q eID="gen#" .../></q>
- <q who="Jesus" ...>...</q> becomes
<q who="Jesus" marker=""><q sID="gen#" .../>...<q eID="gen#" .../></q>
- Within the following construct, <q who="Jesus" marker="">...</q> will surround verse text.
- <q sID="gen#" who="Jesus" .../>...<q eID="gen#" who="Jesus" .../> becomes
- Pre-Verse Titles[7]
- InterVerse or postVerse[8] content not in titles is appended to the prior verse.
- Titles immediately preceding a verse are converted into either
<title canonical="true" type="psalm">...</title> or
<title canonical="false" type="sub">...</title>[9]
and the whole such title element is wrapped within a special preverse div element. See below.
- InterVerse Content[10]
- InterVerse Content refers to all content not contained by the verse element.[11][12]
- Such content is divided between the prior and the current verse.
- Content appended to the prior verse is not marked in any special way.
- Content prepended[13] to the current verse is marked with
<div type="x-milestone" subType="x-preverse" sID="pv#"/>...<div type="x-milestone" subType="x-preverse" eID="pv#"/>.
Notes:
1. These transformations are all performed "under the hood" as it were. Tweaking OSIS XML files to fix problems with pre-verse titles, etc., was never intended to be done by module developers as part of the preprocessing before using osis2mod.
2. With the possible exception of Biblical Hebrew and some Indic scripts that should not be normalized.
3. gen# is unique for an sID/eID pair, where # is a number.
4. Quotes with who="Jesus" are not transformed at this stage.
5. As of r3401, <div type="colophon"...> is unchanged.
6. The examples given have the null string as the quotation marks in the marker attribute (as in the KJV module), but proper quotation marks are also supported.
7. The older method that fixed titles only became obsolete with SVN revision 2358 for the SWORD 1.6.0 release and was replaced with InterVerse Content.
8. For example, the colophon div as found at the end of some Epistles.
9. type="sub" is used irrespective of whether the non-canonical title is a section title or a subSection title.
10. Introduced with SVN revision 2358 for the SWORD 1.6.0 release.
11. i.e. Content between the eID milestone of a verse and the sID milestone of the next verse, or before the first verse of the chapter.
12. For OSIS files derived from USFM files, the implication of this requirement is the following rule: Do not place a title (or similar element) between a matching pair of verse milestones except where the translation places the title somewhere within the verse text in the USFM file.
13. This ensures that when the verse is called the prepended content is also displayed, subject to whatever filters are in place.
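The whitespace rules listed under Transformations can be pictured with a small sketch. This is not osis2mod's code: normalizeWhitespace is a hypothetical helper, it only looks at ASCII whitespace, and stripping the leading whitespace of one string stands in for the per-book, per-chapter and per-verse stripping described above.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of osis2mod: drops leading whitespace and
// collapses every run of whitespace (spaces, tabs, newlines) to one blank.
std::string normalizeWhitespace(const std::string& in) {
    std::string out;
    bool pendingSpace = false;
    for (unsigned char c : in) {
        if (std::isspace(c)) {
            // Remember the run but emit nothing yet; leading runs are dropped.
            pendingSpace = !out.empty();
        } else {
            if (pendingSpace) out += ' ';
            pendingSpace = false;
            out += static_cast<char>(c);
        }
    }
    // A trailing run is kept as a single blank, since only leading
    // whitespace is described as removed.
    if (pendingSpace) out += ' ';
    return out;
}
```

Keeping one trailing blank is a choice made for this sketch; the text above only states what happens to leading and repeated whitespace.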
Handling of Introductions, Titles and Inter-Verse Material
SWORD looks for module, testament, book and chapter introductory material. Those introductions can have appropriate titles as well. In SWORD 1.6.0 the handling of this material has changed.
Note:
- In the following, the effects of the above transformations are not shown. The tagging of the pre-verse material is also not shown.
Module Introduction
At this time, osis2mod does not fully support module introductions. It is intended that the following will work. All material after the </header> and before a div with a type of book or bookGroup is a module heading.
A module introduction should be placed into testament 0, book 0, chapter 0, verse 0. In SWORD, module introductions have the special id of:
- [ Module Heading ]
and can be accessed by setting a VerseKey's testament as
// In order to access headings, aka intros, one has to set intros to true.
vk.setIntros(true);
// Setting the testament will also set book, chapter and verse to 0 when intros is set to true
vk.setTestament(0);
Testament Introductions
At this time, osis2mod does not fully support testament introductions. It is the intention that the following will work.
A testament introduction contains all the material that follows the module introduction and precedes the first book of the Old Testament (for the Old Testament introduction), or that precedes the first book of the New Testament (for the New Testament introduction).
Note: here first OT or NT book means the first that occurs in the OSIS xml.
The books must be contained in <div type="bookGroup">, either as all the books or as sets of books (e.g. OT, NT, Apocrypha, Torah, History, Major Prophets, Minor Prophets), with nesting groups as desired.
Minimal example of a module consisting of 2 books, Psalms and John:
<div type="bookGroup">
  ... Old Testament introductory material ...
  <div type="book" osisID="Ps">
    ... Chapters of Psalms ...
  </div> <!-- end of Psalms -->
  ... New Testament introductory material ...
  <div type="book" osisID="Jn">
  </div> <!-- end of John -->
</div> <!-- end of the Bible -->
Example of testaments in separate book groups. Again using a module of 2 books.
<div type="bookGroup" subType="x-OT">
  ... Old Testament introductory material ...
  <div type="book" osisID="Ps">
    ... Chapters of Psalms ...
  </div> <!-- end of Psalms -->
</div> <!-- end of the OT -->
<div type="bookGroup" subType="x-NT">
  ... New Testament introductory material ...
  <div type="book" osisID="Jn">
  </div> <!-- end of John -->
</div> <!-- end of the NT -->
Example of testaments in separate book groups with the Apocrypha book group.
<div type="bookGroup" subType="x-Old">
  ... Old Testament introductory material ...
  <div type="book" osisID="Ps">
    ... Chapters of Psalms ...
  </div> <!-- end of Psalms -->
</div> <!-- end of the OT -->
<div type="bookGroup" subType="x-non-canon">
  ... Books of the Apocrypha ...
</div> <!-- end of the Deuterocanon -->
<div type="bookGroup" subType="x-New">
  ... New Testament introductory material ...
  <div type="book" osisID="Jn">
  </div> <!-- end of John -->
</div> <!-- end of the New -->
A testament introduction should be placed into testament 1 or 2, book 0, chapter 0, verse 0. In SWORD, testament introductions have the special ids of:
- [ Testament 1 Heading ]
- [ Testament 2 Heading ]
but can be accessed by setting a VerseKey's testament as
// In order to access headings, aka intros, one has to set intros to true.
vk.setIntros(true);
// Setting the testament will also set book, chapter and verse to 0 when intros is set to true
vk.setTestament(1);
Book Introductions and Titles
Book introductions and titles are straightforward. A book introduction includes the start of the book and everything following it up to, but not including, the start of the first chapter. See OSIS Bibles for best practices in marking up titles and introductions.
For example:
<div type="book" ...>
... introductory material ...
<chapter ...>
will put the following into the book introduction:
<div type="book" ...>
... introductory material ...
Chapter Introductions
Chapter introductions and titles are a bit problematic. Between the start of a chapter and its first verse, we could have a chapter title, a chapter introduction and/or a start of a section of verses or a titled verse. Osis2mod now handles this in a predictable fashion. From the start of the chapter up to and not including a section div or a title that has a type that is not main, chapter or sub, the content is chapter introduction. After that, it is part of the verse.
Specifically, the following list gives the possible first elements following the chapter introduction:
- <div type="section" ...>
- <title type="yyy" ...> where yyy is not main, chapter or sub.
For example,
<chapter ...>
<title>Chapter Title</title>
<title type="sub">Chapter Subtitle</title>
<div type="introduction">... intro ...</div>
<div type="section">
<title>A section title</title>
<p>
<verse ...>...verse content...</verse>
will put the following into the chapter introduction:
<chapter ...>
<title>Chapter Title</title>
<title type="sub">Chapter Subtitle</title>
<div type="introduction">... intro ...</div>
and will put the following into verse 1 of that chapter:
<div type="section">
<title>A section title</title>
<p>
<verse ...>...verse content...</verse>
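The boundary rule for chapter introductions can be written as a small predicate. This is a sketch under stated assumptions, not osis2mod's implementation: endsChapterIntro is a hypothetical name, and a title with no type attribute is assumed to stay in the introduction, as the untyped chapter title in the example above does.

```cpp
#include <string>

// Hypothetical predicate, not osis2mod's actual code. Returns true when an
// element marks the end of the chapter introduction, following the rule in
// the text: a section div, or a title whose type is not main, chapter or sub.
// Assumption: a title without a type (empty string here) remains part of
// the introduction.
bool endsChapterIntro(const std::string& element, const std::string& type) {
    if (element == "div") return type == "section";
    if (element == "title") {
        return !type.empty() && type != "main" && type != "chapter" && type != "sub";
    }
    return false;
}
```

Under this reading, a div of type introduction and a title of type sub both stay in the chapter introduction, while a div of type section moves everything from that point on into verse 1.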
Between Verses
Between verses we may have closing tags to finish off what was started earlier, structural opening tags (e.g. line groups, divisions, paragraphs, ...), titles and/or introductory material.
Upon finding the close of a verse, osis2mod will append all adjacent closing tags to it. Once it finds a start tag, it will attach that to the following verse.
For example, suppose the following is between </verse> and <verse ...>:
</lg>
</p>
</div>
<div type="section">
<title>Section title</title>
<p>
<lg>
then it will append the following to the prior verse:
</lg>
</p>
</div>
and will prepend the following to the current verse:
<div type="section">
<title>Section title</title>
<p>
<lg>
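The split described in this section can be sketched as follows. This is an illustration only: splitInterverse and the Tags alias are hypothetical names, tokens are assumed to arrive already separated, and any token that is not a closing tag (including text) is treated as the point where material starts going to the following verse.

```cpp
#include <string>
#include <utility>
#include <vector>

using Tags = std::vector<std::string>;

// Hypothetical sketch, not osis2mod's code. Splits the tokens found between
// </verse> and the next <verse ...>: the leading run of closing tags goes to
// the prior verse, and everything from the first other token onward goes to
// the following verse.
std::pair<Tags, Tags> splitInterverse(const Tags& tokens) {
    Tags toPrior, toNext;
    bool seenStart = false;
    for (const std::string& t : tokens) {
        bool isClose = t.size() > 1 && t[0] == '<' && t[1] == '/';
        if (!seenStart && isClose) {
            toPrior.push_back(t);
        } else {
            seenStart = true;
            toNext.push_back(t);
        }
    }
    return {toPrior, toNext};
}
```

Applied to the example above, the three closing tags land in the first list and the section div, its title, and the opening p and lg land in the second.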
Last Verse
The material following the last verse of a chapter is appended to that verse. You might find:
... verse content ...
</chapter>
<div type="colophon">... colophon text ...</div>
</div>
</div>
Exclusions
Only content from the first <div> to the last </div> is retained. Everything else is excluded. From a practical perspective, this excludes the OSIS header information.
Usage
It is always preferable to use the most recent version of osis2mod; compiling it from SVN is best.
Note:
- After the SWORD 1.5.9 release, osis2mod was changed to take flags rather than positional arguments.
You are running osis2mod: $Rev: 3431 $
OSIS Bible/commentary module creation tool for The SWORD Project
usage: ./osis2mod <output/path> <osisDoc> [OPTIONS]
  <output/path>    an existing folder that the module will be written
  <osisDoc>        path to the validated OSIS document, or '-' to read from standard input
  -a               augment module if exists (default is to create new)
  -z <l|z|b|x>     compression type (default: none)
                     l - LZSS; z - ZIP; b - bzip2; x - xz
  -b <2|3|4>       compression block size (default: 4)
                     2 - verse; 3 - chapter; 4 - book
  -l <1-9>         compression level (default varies by compression type)
  -c <cipher_key>  encipher module using supplied key (default no enciphering)
  -e <1|2|s>       convert Unicode encoding (default: 1)
                     1 - UTF-8 ; 2 - UTF-16 ; s - SCSU
  -N               do not normalize to NFC
  -s <2|4>         bytes used to store entry size (default is 2).
  -v <v11n>        specify a versification scheme to use (default is KJV)
                     Note: The following are valid values for v11n:
                     Calvin Catholic Catholic2 DarbyFr German KJV KJVA LXX Leningrad
                     Luther MT NRSV NRSVA Orthodox Segond Synodal SynodalProt Vulg
  -h               print verbose usage text
See http://www.crosswire.org/wiki/osis2mod for more details.
Earlier builds (e.g. $Rev: 2893 $) also included the debug option help: See Debug Messages.
  -d <flags>       turn on debugging (default is 0)
                     Note: This flag may change in the future.
                     Flags: The following are valid values:
                       0   - no debugging
                       1   - writes to module, very verbose
                       2   - verse start and end
                       4   - quotes, esp. Words of Christ
                       8   - titles
                       16  - inter-verse material
                       32  - BSP to BCV transformations
                       64  - v11n exceptions
                       128 - parsing of osisID and osisRef
                       256 - internal stack
                       512 - miscellaneous
                     This argument can be used more than once. (Or the flags may be added together.)
See http://www.crosswire.org/wiki/osis2mod for more details.
Parameters and Options
<output/path>
This is a path to any existing directory. It is best for it to be empty.
<osisDoc>
This is a single, well-formed, valid OSIS document.
If - is used instead of a file name, the document will be read from standard input. This allows for two constructs:
- Redirection
osis2mod ./modules/texts/ztext/KJV - < kjv.xml
- Piping
cat kjv.xml | osis2mod ./modules/texts/ztext/KJV -
-a
Osis2mod can create a Bible all at once or incrementally, depending on the presence of the -a flag. This provides for two abilities:
- Assembling a Bible from book files:
mkdir /tmp/mymodule
osis2mod /tmp/mymodule matt.xml
osis2mod /tmp/mymodule mark.xml -a
...
osis2mod /tmp/mymodule rev.xml -a
Note: The book files can be in any order. SWORD will order them correctly in the index.
- Adding corrections to a Bible:
osis2mod /tmp/mymodule fixes.xml -a
Note: When fixes are put into the module they are appended to the data file and do not actually replace the verses. The index file is adjusted to point to the new place in the data file.
-z <l|z|b|x>
A SWORD Bible can be compressed with Zip (-z z), LZSS (-z l), BZip2 (-z b) or XZ (-z x). All of CrossWire's compressed Bible and commentary modules are compressed with Zip. This saves significant space over an uncompressed module. Uncompressed modules are useful for debugging.
-b <2|3|4>
This setting is only useful for a compressed module. The choice as to whether to use Verse (2), Chapter (3) or Book (4, the default) level compression depends upon the amount of data in the block. A typical Bible is best compressed book by book. A commentary, chapter by chapter. If the commentary is very robust and the amount of text per verse is really huge, then verse compression might make sense.
All of SWORD's compressed Bible modules are compressed by book. Basically, all of the verses in a block are compressed and appended to the data file. For this reason, the datafile cannot be uncompressed by anything other than the SWORD and JSword libraries.
When creating the module by appending it is important to do so by whole compression block. That is, if blockType is Chapter, then the osisDoc needs to contain one or more whole chapters.
-l [1-9]
This setting is useful for a compressed module and provides the level of compression to be used. The default varies depending on the compressor used. The default is typically used.
-c cipherKey
This is typically 16 characters in length, having no leading or trailing spaces, consisting of alternating sets of 4 alpha and 4 numeric characters, such as Aduf0274PjNq0328. The key is case-sensitive.
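The typical key shape described above can be checked mechanically. The sketch below is only an illustration: looksLikeTypicalCipherKey is a hypothetical helper, it assumes the letter group comes first as in the example key, and a key that fails this check may still be acceptable, since the text only says what a key typically looks like.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical checker for the typical key shape: 16 characters, in groups
// of four that alternate letters and digits (letters first, as in the
// example key). Other key shapes may be valid.
bool looksLikeTypicalCipherKey(const std::string& key) {
    if (key.size() != 16) return false;
    for (std::size_t i = 0; i < key.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(key[i]);
        bool wantAlpha = (i / 4) % 2 == 0;  // groups 0 and 2 are letters
        if (wantAlpha ? !std::isalpha(c) : !std::isdigit(c)) return false;
    }
    return true;
}
```

Because spaces are neither letters nor digits, a key with leading or trailing spaces is rejected by the same loop.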
-e <1|2|s>
This flag specifies whether the input is to be converted to UTF-8, UTF-16, or SCSU. By default the encoding is UTF-8. Using UTF-16 is reasonable for non-Latin texts with little markup. SCSU will compress the Unicode, but it has not been sufficiently tested, and it requires ICU support both when compiling osis2mod and at run time (not all frontends have it).
-N
All OSIS modules should be UTF-8 and all that are UTF-8 are also to be NFC. The default is to automatically detect the presence of Latin-1 (either cp1252 or iso8859-1) and convert it to UTF-8 and to normalize UTF-8 to NFC. This flag will turn off this behavior and is useful for creating Latin-1 modules or for modules for which the source text is already UTF-8 and NFC. It is also advised for Biblical Hebrew modules for which the source text (with accents and cantillation points) is intentionally not normalized, as "Unicode normalization can easily break Biblical Hebrew". Quoted from page 9 of the SBL Hebrew Font Manual.[1]
Note: this was added late Feb 2008 and requires ICU support when compiling.
-s <2|4>
A value of 2, the default, restricts modules to 64K bytes per entry. A value of 4 breaks this barrier. This is needed for Bibles having large introductory materials and for commentaries with large entries.
Note: 4 was added late Apr 2009 for raw, uncompressed modules and will be part of the SWORD 1.6.0 release (formerly known as 1.5.11). A later release extended support to compressed modules.
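The 64K limit follows from the width of the size field. A minimal sketch of the arithmetic, assuming the entry size is stored as an unsigned integer of the given width (maxEntryBytes is a hypothetical name, not a SWORD function):

```cpp
#include <cstdint>

// Largest entry length that fits in an unsigned size field of the given
// width in bytes. Intended for the two widths the -s flag accepts, 2 and 4.
std::uint64_t maxEntryBytes(int sizeFieldBytes) {
    return (std::uint64_t{1} << (8 * sizeFieldBytes)) - 1;
}
```

With these assumptions a 2-byte field tops out at 65535 bytes, which is the 64K barrier mentioned above, and a 4-byte field at 4294967295 bytes.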
-v v11n
By default, osis2mod uses the KJV versification. The practical implication of this is that only books in the KJV canon are allowed and any text in an allowed book is retained. However, if the verse reference of a supported book falls outside of the versification, it is appended to the prior verse in the canon. This flag allows for an alternate versification.
Note: This option was added in April 2009 as part of the SWORD 1.6.0 release.
-d flags
The flag can be used more than once or the flags can be added together. For example,
-d 2 -d 4
is the same as
-d 6
To do verbose debugging use:
-d -1
For the most part these flags are not intended for debugging modules, but rather for debugging problems in osis2mod.
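Since every flag value is a distinct power of two, adding the values and repeating the option come to the same thing. The sketch below shows that reading, assuming the values are combined bitwise; the constant and function names are hypothetical and chosen only for illustration.

```cpp
#include <initializer_list>

// Flag values as listed in the -d help text; the names are hypothetical.
constexpr int kDbgWrite      = 1;
constexpr int kDbgVerse      = 2;
constexpr int kDbgQuote      = 4;
constexpr int kDbgTitle      = 8;
constexpr int kDbgInterverse = 16;
constexpr int kDbgXform      = 32;
constexpr int kDbgV11n       = 64;
constexpr int kDbgRef        = 128;
constexpr int kDbgStack      = 256;
constexpr int kDbgMisc       = 512;

// Combines repeated -d values: each value is OR-ed into the running mask,
// so "-d 2 -d 4" and "-d 6" give the same mask.
int combineDebugFlags(std::initializer_list<int> values) {
    int mask = 0;
    for (int v : values) mask |= v;
    return mask;
}

// True when the given flag is switched on in the mask.
bool debugEnabled(int mask, int flag) { return (mask & flag) != 0; }
```

A value of -1 has every bit set on two's-complement machines, which is why it switches on all of the categories at once.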
The -d 2 flag produces no output but puts milestones into the module where verses start and end. The form of the milestone is:
<milestone resp="v" [attributes from verse] />
The milestone will contain the osisID from the verse and also a valid sID or eID. The sID/eID indicates the start or end of the verse.
Note: the -d 2 flag might change at any time, or may even be removed.
Messages
Osis2mod has robust, mind-boggling messages. These are provided here in hopes that it will help problem diagnosis.
Messages are output with a standard format: TYPE(KIND)[line,column](osisID): Message
TYPE is one of
- FATAL - Usually accompanied by an immediate exit. The problem should be fixed and osis2mod rerun.
- ERROR - A non-fatal problem that should be fixed before the module is used.
- WARNING - A problem with the input that probably should be fixed.
- INFO - Information about what the program is doing.
- DEBUG - Managed by the -d flag.
KIND is one of:
- UTF8 - Deals with conversion from Latin-1 to UTF8.
- V11N - Messages related to Versification.
- WRITE - Messages related to writing to the module.
- LINK - Messages related to linked verses.
- REF - Messages related to the normalization of osis references to SWORD references.
- NESTING - Messages related to improper overlapping of BCV and BSP
- COMMENTS - Messages related to XML comment processing.
- PARSE - Messages related to XML entity processing.
- QUOTE - Handling of quotes, especially the Words of Christ (WoC)
- TITLE - Handling of Introductions and Titles
- INTERVERSE - Handling of material between verses and before verse 1.
- FOUND - Diagnostics related to finding of Books, Chapters and Verses.
- ARGS - Summary of command line arguments.
Some of these are described more fully below
Line and column give the location of the last line and column read in the file. Since processing is handled after the reading, the location is approximate. If a message is issued before the file is read, such as during the reading of command line arguments, the location is not shown.
osisID gives the osisID being processed or N/A. It also is optionally output early in the execution of the program.
In the following, example values are given in {...}. The brackets do not actually appear in the message. Also, the messages are a bit prettier here than in reality.
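For anyone who wants to sort or filter osis2mod output, the format above is regular enough to parse. The following is a sketch, not part of osis2mod: Osis2modMessage and parseMessage are hypothetical names, the pattern follows the TYPE(KIND)[line,column](osisID): Message form given in this section, and the position and osisID parts are treated as optional.

```cpp
#include <regex>
#include <string>

// Fields of one message line, per the format described above.
struct Osis2modMessage {
    std::string type;    // FATAL, ERROR, WARNING, INFO or DEBUG
    std::string kind;    // UTF8, V11N, WRITE, ...
    int line = 0;        // 0 when no position was printed
    int column = 0;
    std::string osisID;  // empty when not printed
    std::string text;
};

// Hypothetical parser. Returns false when the line does not look like a
// message in the assumed format.
bool parseMessage(const std::string& s, Osis2modMessage& out) {
    static const std::regex re(
        R"re(^([A-Z]+)\(([A-Z0-9]+)\)(?:\[(\d+),(\d+)\])?(?:\(([^)]*)\))?: ?(.*)$)re");
    std::smatch m;
    if (!std::regex_match(s, m, re)) return false;
    out = Osis2modMessage{};
    out.type = m[1].str();
    out.kind = m[2].str();
    if (m[3].matched) {
        out.line = std::stoi(m[3].str());
        out.column = std::stoi(m[4].str());
    }
    if (m[5].matched) out.osisID = m[5].str();
    out.text = m[6].str();
    return true;
}
```

Multi-line messages, such as the DEBUG(ARGS) summary, would need their continuation lines handled separately; this sketch only looks at one line at a time.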
Diagnostic Messages
WARNING(UTF8)[{ line },{ col }]({ osisID }): Should be converted to UTF-8 ({text })
The program will always check for text that is not UTF-8.
INFO(UTF8)[{ line },{ col }]({ osisID }): Converting to UTF-8 ({ text before conversion })
Text that is converted to UTF-8 is noted.
ERROR(UTF8)[{ line },{ col }]({ osisID }): Converting to UTF-8 ({ text after first conversion })
It is an error if after a conversion it still is not UTF-8.
WARNING(UTF8)[{ line },{ col }]({ osisID }): osis2mod is not compiled with support for ICU. Ignoring -n flag.
Normalization was requested, but since osis2mod was not compiled for it, it cannot honor the default request.
INFO(V11N)[{ line },{ col }]({ osisID }): is not in the { v11n } versification.
Indicates that a verse is not in the versification.
INFO(V11N)[{ line },{ col }]({ osisID }): is not in the { v11n } versification. Appending content to { osisID }
This, like the previous message, indicates a versification problem, but shows where the text will be found. Osis2mod preserves all module content for supported books.
WARNING(V11N)[{ line },{ col }]({ osisID }): New book is { name } and is not in { v11n } versification, ignoring
The name of the book was not recognized as belonging to the chosen versification; it and all of its content are ignored.
INFO(WRITE)[{ line },{ col }]({ osisID }): Appending entry: { osisID }: { text so far }
Osis2mod encountered text that needs to be appended to a verse that is already in the module. This could indicate that
- the reference is in the input twice. This typically indicates a problem.
- more text was found that needs to be added to the prior verse.
- osis2mod is being run in append mode to fix a verse in the module.
INFO(LINK)[{ line },{ col }]({ osisID }): Linking { osisID } to { osisID }
An osisID with several values, such as "Gen.1.1 Gen.1.2 Gen.1.3", was used, and the later IDs are linked to the first.
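The split between the stored entry and its links can be illustrated with a short sketch. It assumes only what is stated above, namely that the values are separated by whitespace and that the first one receives the text. The function name `split_linked_ids` is hypothetical.

```python
def split_linked_ids(osis_id_attr):
    """Split a multi-valued osisID into (first ID, list of IDs linked to it)."""
    ids = osis_id_attr.split()
    if not ids:
        return None, []
    return ids[0], ids[1:]
```

For the example above, the first ID is "Gen.1.1" and the other two are the ones reported in the INFO(LINK) messages.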
ERROR(REF)[{ line },{ col }]({ osisID }): Invalid osisID/annotateRef: { invalid attribute value }
This indicates that the SWORD library was unable to parse the osisID or annotateRef.
FATAL(NESTING)[{ line },{ col }]({ osisID }): tag expected
This indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. Typically, this indicates an end tag that did not have a matching begin tag and all tags before it were properly paired.
FATAL(NESTING)[{ line },{ col }]({ osisID }): Expected { topToken.getName() } found { tokenName }
This also indicates that the specified verse is not balanced with regard to its tags. Building a raw text module, looking in the module for the verse and pairing begin/end tags will help find the problem. It could be either a begin or an end tag problem.
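When pairing begin and end tags by hand in a verse taken from a raw text module, a stack-based check can do the bookkeeping. The following is a simplified sketch, not osis2mod's own code: it assumes that attribute values contain no `<` or `>`, and the names `TAG_RE` and `check_nesting` are hypothetical. The returned strings only imitate the wording of the two messages above.

```python
import re

# Matches a begin tag, an end tag, or an empty (self-closing) element.
TAG_RE = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")

def check_nesting(fragment):
    """Return None if the tags in `fragment` are balanced, else a short description."""
    stack = []  # names of the begin tags that are still open
    for m in TAG_RE.finditer(fragment):
        closing, name, self_closing = m.group(1), m.group(2), m.group(3)
        if self_closing:
            continue                      # empty elements need no partner
        if not closing:
            stack.append(name)            # begin tag: remember it
        elif not stack:
            return "tag expected"         # end tag with nothing open
        elif stack[-1] != name:
            return "Expected %s found %s" % (stack[-1], name)
        else:
            stack.pop()                   # properly paired
    if stack:
        return "unclosed %s" % stack[-1]
    return None
```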
WARNING(NESTING)[{ line },{ col }]({ osisID }): verse { currentOsisID } is not well formed:({ verseDepth },{ tagDepth })
This indicates that the verse probably will not show properly in some front-ends in some circumstances. Typically, it shows the problem if the verse is shown in isolation.
ERROR(NESTING)[{ line },{ col }]({ osisID }): improper nesting { currentOsisID }: matching (sID,eID) not found. Looking at ({ sID },{ eID })
OSIS specifies that every sID has a matching eID. Osis2mod is checking that BSP elements are properly nested.
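The pairing rule can be sketched in the same way for milestoned elements. This is an illustration only, under the assumption that the fragment contains just the milestones to be checked against each other (verse and chapter milestones may legitimately overlap with BSP elements and would need to be filtered out first). The names `MILESTONE_RE` and `check_milestones` are hypothetical.

```python
import re

# Finds sID="..." and eID="..." attributes; the \b keeps osisID from matching.
MILESTONE_RE = re.compile(r'\b(sID|eID)\s*=\s*"([^"]*)"')

def check_milestones(fragment):
    """Return a list of (expected sID, found eID) problems, in document order.

    An entry (sID, None) at the end means that the sID never found its eID.
    """
    problems = []
    open_ids = []  # stack of sID values still waiting for their eID
    for kind, value in MILESTONE_RE.findall(fragment):
        if kind == "sID":
            open_ids.append(value)
        elif open_ids and open_ids[-1] == value:
            open_ids.pop()
        else:
            expected = open_ids[-1] if open_ids else None
            problems.append((expected, value))
    problems.extend((sid, None) for sid in open_ids)
    return problems
```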
FATAL(COMMENTS)[{ line },{ col }]({ osisID }): unknown commentstate on comment start: { comment state }
This indicates that the comment is not of the form <!-- ... -->.
FATAL(COMMENTS)[{ line },{ col }]({ osisID }): unknown commentstate on comment end: { comment state }
This indicates that the comment is not of the form <!-- ... -->.
WARNING(PARSE)[{ line },{ col }]({ osisID }): malformed entity, replacing &{ malformed entity } with &amp;{ malformed entity }
This means that an & was found that starts an entity but is not terminated by a ;. The & is escaped, which gives a result that probably isn't what was intended.
WARNING(PARSE)[{ line },{ col }]({ osisID }): HEX entity must begin with &#x, found { entity }
This indicates that the entity begins with &#X, which is not valid for XML. It is not changed.
WARNING(PARSE)[{ line },{ col }]({ osisID }): SWORD does not search HEX entities, found { entity }
WARNING(PARSE)[{ line },{ col }]({ osisID }): SWORD does not search numeric entities, found { entity }
Since we don't transform HEX or numeric entities to their equivalent UTF-8 value, this will be literal text that cannot be searched.
WARNING(PARSE)[{ line },{ col }]({ osisID }): XML only supports 5 Character entities &amp;, &lt;, &gt;, &quot; and &apos;, found { entity }
XML predefines only these five named character entities. Any other is passed unchanged to the SWORD module. This frustrates search, and the entity may only display properly in a front-end using HTML rendering.
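The entity categories behind these warnings can be sketched as a small classifier. This is an illustration of the XML rules (five predefined named entities, decimal references of the form `&#...;`, hexadecimal references starting with a lowercase `&#x`), not a copy of osis2mod's parser. The names `XML_PREDEFINED` and `classify_entity` and the returned labels are hypothetical.

```python
import re

# The five named entities that XML predefines.
XML_PREDEFINED = {"&amp;", "&lt;", "&gt;", "&quot;", "&apos;"}

def classify_entity(entity):
    """Classify an entity-like token taken from the input text."""
    if entity in XML_PREDEFINED:
        return "predefined"
    if re.fullmatch(r"&#x[0-9A-Fa-f]+;", entity):
        return "hex"          # valid XML, but not searchable as text
    if re.fullmatch(r"&#X[0-9A-Fa-f]+;", entity):
        return "bad-hex"      # uppercase X is not valid XML
    if re.fullmatch(r"&#[0-9]+;", entity):
        return "numeric"      # valid XML, but not searchable as text
    if re.fullmatch(r"&[A-Za-z][A-Za-z0-9]*;", entity):
        return "other"        # named entity that XML does not predefine
    return "malformed"        # e.g. missing the terminating ;
```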
WARNING(PARSE)[{ line },{ col }]({ osisID }): &quot; is unnecessary outside of attribute values. Replace with ".
WARNING(PARSE)[{ line },{ col }]({ osisID }): &quot; is unnecessary inside single quoted attribute values. Replace with ".
WARNING(PARSE)[{ line },{ col }]({ osisID }): &quot; is only needed within double quoted attribute values. Considering using single quoted attribute and replacing with ".
WARNING(PARSE)[{ line },{ col }]({ osisID }): &apos; is only needed within single quoted attribute values. Considering using double quoted attribute and replacing with '.
WARNING(PARSE)[{ line },{ col }]({ osisID }): &apos; is unnecessary inside double quoted attribute values. Replacing with '.
WARNING(PARSE)[{ line },{ col }]({ osisID }): &apos; is unnecessary outside of attribute values. Replacing with '.
The &apos; or &quot; was found outside of an attribute value and is being substituted with a straight single apostrophe, ', or double quote, ", respectively.
WARNING(PARSE)[{ line },{ col }]({ osisID }): While valid for XML, &quot; is only needed within double quoted attribute values.
This warning indicates that the entity is present where it does not need to be.
WARNING(PARSE)[{ line },{ col }]({ osisID }): While valid for XML, XHTML does not support &apos;.
This warning indicates that the &apos; may not display properly in all front-ends. It was found in an attribute value.
Debug Messages
The following are shown in the same form as the diagnostic messages above. They are given without comment.
-d 1
Output of what is being written to the module.
DEBUG(WRITE)[{ line },{ col }]({ osisID }): { text so far }
-d 2
Adds milestones for the <verse> and </verse> of the form <milestone resp="v" ... with attributes of the verse element ... />
This is especially useful in viewing the dat file of an uncompressed module.
-d 4
A stack is maintained to represent the Words of Christ on a per verse basis. This is an internal diagnostic of that stack.
DEBUG(QUOTE)[{ line },{ col }]({ osisID }): quote top({ quote stack size }) { token }
DEBUG(QUOTE)[{ line },{ col }]({ osisID }): quote pop({ quote stack size }) { topToken } -- { token }
DEBUG(QUOTE)[{ line },{ col }]({ osisID }): ({ quote stack size }) { topToken } -- { token }
-d 8
Identifies when book and chapter introductions are being determined.
DEBUG(TITLE)[{ line },{ col }]({ osisID }): OOPS INTRO inChapterIntro = { inChapterIntro } inBookIntro = { inBookIntro }
DEBUG(TITLE)[{ line },{ col }]({ osisID }): Looking for book introduction
DEBUG(TITLE)[{ line },{ col }]({ osisID }): Done looking for book introduction
DEBUG(TITLE)[{ line },{ col }]({ osisID }): BOOK INTRO { heading }
DEBUG(TITLE)[{ line },{ col }]({ osisID }): Looking for chapter introduction
DEBUG(TITLE)[{ line },{ col }]({ osisID }): Done looking for chapter introduction
DEBUG(TITLE)[{ line },{ col }]({ osisID }): CHAPTER INTRO { heading }
-d 16
Inter-verse material goes either with the prior "verse" or with the next. These messages help diagnose problems related to that split.
DEBUG(INTERVERSE)[{ line },{ col }]({ osisID }): Interverse start token { token }:{ text }
DEBUG(INTERVERSE)[{ line },{ col }]({ osisID }): Interverse end tag: { tokenName }({ tagDepth },{ chapterDepth },{ bookDepth })
DEBUG(INTERVERSE)[{ line },{ col }]({ osisID }): Appending interverse end tag: { tokenName }({ tagDepth },{ chapterDepth },{ bookDepth })
-d 32
The following messages relate to the transformations of containers to milestones.
DEBUG(XFORM)[{ line },{ col }]({ osisID }): Transform start tag from { orig } to { transformed }
DEBUG(XFORM)[{ line },{ col }]({ osisID }): Transform end tag from { orig } to { transformed }
-d 64
Occasionally a verse reference is outside of the chosen versification. These messages help to understand difficulties that osis2mod has in storing extra-canonical material in the module.
DEBUG(V11N)[{ line },{ col }]({ osisID }): {{ caller }} normalizes to { after }
DEBUG(V11N)[{ line },{ col }]({ osisID }): Chapter max:{ chapterMax }, Verse Max:{ verseMax }
-d 128
OSIS ids and references can be of a form that SWORD cannot parse. Osis2mod contains a routine that munges these into a form that SWORD can understand.
DEBUG(REF)[{ line },{ col }]({ osisID }): VerseKey can parse this as is.
DEBUG(REF)[{ line },{ col }]({ osisID }): Found a range marker. Progress: { progress } Remaining: { remaining }
DEBUG(REF)[{ line },{ col }]({ osisID }): Found a work prefix { workPrefix } Progress: { progress } Remaining: { remaining }
DEBUG(REF)[{ line },{ col }]({ osisID }): Found an osisID:{ osisID } Progress: { progress } Remaining: { remaining }
DEBUG(REF)[{ line },{ col }]({ osisID }): Found a grain suffix { grain } Progress: { progress } Remaining: { remaining }
DEBUG(REF)[{ line },{ col }]({ osisID }): Replacing space with ; Progress: { progress } Remaining: { remaining }
DEBUG(REF)[{ line },{ col }]({ osisID }): Parseable VerseKey -- { osisRef }
-d 256
Osis2mod contains two stacks to validate proper nesting of BSP and BCV, respectively. This is an internal representation of the BCV stacks. It provides additional information to understand the diagnostic nesting messages.
DEBUG(STACK)[{ line },{ col }]({ osisID }): push ({ stack size }) { tokenName }
DEBUG(STACK)[{ line },{ col }]({ osisID }): pop({ tagDepth }) { topToken.getName() }
-d 512
These are general debug messages.
DEBUG(FOUND)[{ line },{ col }]({ osisID }): Found first div and pitching prior material: { text }
DEBUG(FOUND)[{ line },{ col }]({ osisID }): Found new book.
DEBUG(FOUND)[{ line },{ col }]({ osisID }): Current chapter is { currentOsisID }
DEBUG(FOUND)[{ line },{ col }]({ osisID }): Entering verse
DEBUG(FOUND)[{ line },{ col }]({ osisID }): New current verse
DEBUG(FOUND)[{ line },{ col }]({ osisID }): End of header found.
DEBUG(COMMENTS)[{ line },{ col }]({ osisID }): in comment
DEBUG(COMMENTS)[{ line },{ col }]({ osisID }): out of comment
DEBUG(ARGS): path: { path } osisDoc: { osisDoc } create: { append } compressType: { compType } blockType: { iType } cipherKey: { cipherKey } normalize: { normalize }
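The -d values listed above are all powers of two, which suggests that they can be added together to enable several kinds of debug output at once. This page does not state that explicitly, so the following sketch treats it as an assumption. The names `DEBUG_FLAGS` and `decode_debug_flags` are hypothetical, and the labels are the message kinds used in this section.

```python
# Debug categories as listed on this page, keyed by their -d value.
DEBUG_FLAGS = {
    1: "WRITE",
    2: "verse milestones",
    4: "QUOTE",
    8: "TITLE",
    16: "INTERVERSE",
    32: "XFORM",
    64: "V11N",
    128: "REF",
    256: "STACK",
    512: "general",
}

def decode_debug_flags(value):
    """List the categories selected by a -d value (assumed to be a bitmask)."""
    return [name for bit, name in sorted(DEBUG_FLAGS.items()) if value & bit]
```

Under that assumption, a value of 193 (1 + 64 + 128) would select the WRITE, V11N and REF messages.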
Exit Status
When an error occurs that causes osis2mod to exit without processing the entire input file, a non-zero exit status is supplied to the caller. Here are the codes that osis2mod uses:
const int EXIT_BAD_ARG = 1; // Bad parameter given for program
const int EXIT_NO_WRITE = 2; // Could not open the module for writing
const int EXIT_NO_CREATE = 3; // Could not create the module
const int EXIT_NO_READ = 4; // Could not open the input file for reading.
const int EXIT_BAD_NESTING = 5; // BSP or BCV nesting is bad or improper XML comment
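A build script can use these codes to report why a run stopped. The following is a minimal sketch in Python. The table copies the names and comments given above, while `describe_exit` and `run_osis2mod` are hypothetical helpers, and the arguments passed to osis2mod are left entirely to the caller.

```python
import subprocess

# Exit codes and descriptions as listed on this page.
EXIT_CODES = {
    1: ("EXIT_BAD_ARG", "Bad parameter given for program"),
    2: ("EXIT_NO_WRITE", "Could not open the module for writing"),
    3: ("EXIT_NO_CREATE", "Could not create the module"),
    4: ("EXIT_NO_READ", "Could not open the input file for reading"),
    5: ("EXIT_BAD_NESTING", "BSP or BCV nesting is bad or improper XML comment"),
}

def describe_exit(code):
    """Turn an osis2mod exit status into a readable description."""
    if code == 0:
        return "0: the entire input file was processed"
    name, text = EXIT_CODES.get(code, ("UNKNOWN", "Unrecognized exit status"))
    return "%s (%d): %s" % (name, code, text)

def run_osis2mod(args):
    """Run osis2mod with the caller's argument list; return (status, description)."""
    result = subprocess.run(["osis2mod"] + list(args))
    return result.returncode, describe_exit(result.returncode)
```

Note that a zero status only says that the input was read to the end. Warnings and errors in the diagnostic output still need to be reviewed.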
Future Roadmap
- List here suggestions for useful enhancements to or improvements for osis2mod.
- Add a command line flag -x to exclude all OSIS elements with the global attribute editions.[2]
osis2mod would exclude those elements that match the specified attribute value.
Note:
- [1] See https://www.sbl-site.org/Fonts/SBLHebrewUserManual1.5x.pdf
- [2] The editions attribute is used to specify edition-specific content, such as material omitted in Protestant versions of the Bible.