Talk:OSIS Bibles

Display outside <verse>

Dowens just added the following comment: "Also, a quirk of the SWORD compilation process is that the only kind of content which reliably displays outside of <verse> elements are titles."

If this was ever the case, it may no longer be so for osis2mod. I'm not aware of a specific instance where this has been a problem, since I believe we preserve everything except the <verse> elements themselves. That said, if there is an issue, my guess is that it lies in a frontend and/or filter, not in the compilation process. This probably needs more investigation (and bugs filed as appropriate). --Osk 00:50, 29 November 2008 (UTC)

Late response. osis2mod takes all interverse material and either appends it to the prior verse or puts it into an x-preverse title. In doing this, it rearranges the tags if needed. I am working on the implementation of a <div subType="x-preverse" sID="xxx"/>...<div subType="x-preverse" eID="xxx"/> to replace it. I hope to be done within the next week.--Dmsmith 11:33, 24 April 2009 (UTC)
If this was implemented, please update the main page to describe the syntax. David Haslam 19:33, 23 July 2009 (UTC)
Please document the correct method to tag the sentence after the "Amen." at the end of each Pauline epistle. This is post-verse content, describing where each epistle was written, etc. Not all Bible translations include these attributions, but the KJV does. For convenience, here is a list of such verses:
Romans 16:27
1 Corinthians 16:24
2 Corinthians 13:14
Galatians 6:18
Ephesians 6:24
Philippians 4:23
Colossians 4:18
1 Thessalonians 5:28
2 Thessalonians 3:18
1 Timothy 6:21
2 Timothy 4:22
Titus 3:15
Philemon 25
Hebrews 13:25
For use with BD, this list can be pasted into a text file and saved as
%Application Data%\JSword\bookmarks\Last verse of every Pauline epistle.lst 

David Haslam 20:00, 14 August 2009 (UTC)

For colophons, use <div type="colophon">....</div>. --Osk 22:22, 14 August 2009 (UTC)
A colophon division must not be placed inside a container-form <verse> element; it should come after it. E.g.
<verse osisID="Rom.16.27">.....</verse>
<div type="colophon">.....</div>

David Haslam 21:29, 2 December 2010 (UTC)

Milestoned version of <chapter>

The section about milestones states, "There are a couple of instances where the milestoned version of <chapter> must be used. eg: when the paragraph is spanning over a chapter." IMHO, it would be helpful to cite both such instances, and give examples of Bible versions where this occurs. David Haslam 12:15, 22 April 2009 (UTC)

I gave the example of the Sermon on the Mount having a quote that spans several chapters. Osk removed it. One can start the quote naturally in chapter 5, end it artificially at the end of chapter 5, restart it artificially at the beginning of chapter 6, end it artificially again at the end of that chapter, restart it artificially at the beginning of chapter 7, and finally end it at its natural place in chapter 7. This is unnecessary, and current osis2mod does not require it. Just put the start of the quote in Matt 5 and the end in Matt 7, using milestoned chapters in between. Once milestoned chapters are used anywhere, OSIS "requires" that they be used everywhere. If quotes are not marked up then, of course, this example does not apply.--Dmsmith 01:18, 24 April 2009 (UTC)

I take that back. One can have a milestoned quote start in chapter 5 and a matching milestoned end in chapter 7. This was Osk's comment in removing it. Even chapter-spanning paragraphs are not a necessary requirement for milestoned chapters. There is a milestoned version of the paragraph: <div type="paragraph" sID="xxx"/> ... <div type="paragraph" eID="xxx"/>. I was hesitant to mention this, but now SWORD supports this and osis2mod uses it instead of its proprietary <lb> hack.--Dmsmith 11:15, 24 April 2009 (UTC)
Regarding milestoned paragraphs: Not quite.... There are a few instances of unfortunate homonymy in OSIS. I think that the difference between <chapter> and <div type="chapter"> is fairly well understood. The former is meant for Bible chapters and the latter for any other type of chapter in any other type of book. Similarly <p> and <div type="paragraph"> have different meanings. A <p> is the normal sense of paragraph (a block of one or more sentences without any linebreaks, probably relating to a single topic). A <div type="paragraph"> is a text division equivalent to a section, sub-section, or chapter. It could very well contain one or more <p> elements. Or it might start with a <title> followed by <p>s or other <div>s for sub-paragraphs. It was specifically intended for documents that have "paragraphs" as a named division type. Law texts would be one example of a text using paragraphs in this sense. I believe the Catechism of the Catholic Church also uses paragraphs in this sense. So <p> really wasn't intended ever to be milestoned. --Osk 18:46, 24 April 2009 (UTC)
It is not at all clear to me from the OSIS manual that <chapter> and <div type="chapter"> have any semantic difference. The manual defines the div type of chapter as:
chapter - Standard chapter as is found in a textbook.
I have always seen the chapters of a book of the Bible as equivalent to those of a textbook.
Likewise for paragraph:
paragraph - Should be used only where the paragraphs in question do not correspond to normal use of the p element of OSIS.
If the text is marked up with BCV, then the "normal use" of the p element does not work. So, I see it as an appropriate transformation inside a module or as a way of marking paragraphs that span chapters. The filters now support this mechanism in a fashion similar to paragraphs, using line breaks. I hope this is appropriate for the proper semantic usage you described, which allows for nested paragraphing.
Small comment on my removal of your <q> content from the article: I don't disagree that this is a good case for using milestoned chapters. That's quite possibly a better way to encode the Sermon on the Mount. But the text of the article, as worded, specifies instances where "milestoned version of <chapter> must be used" (emphasis mine), which this isn't. --Osk 18:29, 24 April 2009 (UTC)

Making a colophon start on a new line after the last verse of the epistle?

In OSIS, what is the preferred way to make the colophon start on a new line, rather than being a continuation of the verse text? David Haslam 12:34, 3 December 2010 (UTC)

Still awaiting a response. ∴ Raised the question in the SWORD Devel mailing list. David Haslam 08:40, 21 October 2011 (MDT)
The preferred way is given above: put it in a div with a type of colophon. The SWORD engine should be changed to output a newline in such cases, if it does not do so already. Until that happens you can use <lb/>. But note that this is not preferred, just a temporary workaround. If or when the SWORD engine puts the div on a new line, there will be extra vertical whitespace. --Dmsmith 13:25, 21 October 2011 (MDT)

Counting milestones

For a milestoned OSIS document, the following may be useful as an adjunct to successful XML validation. Some text editors such as Notepad++ have a count button in the search dialog. If this is used to count the occurrences of both " sID" and " eID", the two results should be the same if sID/eID are all matched pairs. This is a necessary, though not a sufficient, requirement which milestones must satisfy. Observe that the search pattern must start with a space, in order to distinguish "osisID" from "sID". David Haslam 20:26, 3 December 2010 (UTC)

Checking that the ID attributes match and are unique is a more difficult task. David Haslam 20:27, 3 December 2010 (UTC)
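
The matching and uniqueness check can be sketched in a few lines of Python. This is only a rough aid under stated assumptions: the function name check_milestones is hypothetical, the file is scanned as plain text with a regular expression rather than parsed as XML, and neither the order of the milestones nor the element names they sit on are checked.

```python
import re
from collections import Counter

# Matches sID="..." or eID='...' only when preceded by whitespace,
# so that the "sID" inside "osisID" is never counted.
MILESTONE_ATTR = re.compile(r"""\s([se])ID\s*=\s*(["'])(.*?)\2""")


def check_milestones(osis_text):
    """Return a list of problems found among sID/eID attribute values.

    Hypothetical helper: it reports values that occur more than once
    and values that have a start without an end (or the reverse).
    """
    starts = Counter()
    ends = Counter()
    for kind, _quote, value in MILESTONE_ATTR.findall(osis_text):
        (starts if kind == "s" else ends)[value] += 1

    problems = []
    for value in sorted(set(starts) | set(ends)):
        s, e = starts[value], ends[value]
        if s > 1 or e > 1:
            problems.append(f"{value}: not unique (sID x{s}, eID x{e})")
        elif s != e:
            missing = "eID" if s else "sID"
            problems.append(f"{value}: missing {missing}")
    return problems
```

An empty list means every sID value has exactly one eID partner. It does not replace schema validation.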

XML Tools plugin for Notepad++

I have reported the problem about schema location syntax to the author of the XML Tools plugin. David Haslam 02:14, 11 July 2011 (MDT)

This issue was solved a long time ago. I use the plugin regularly to validate XML files. David Haslam 13:18, 25 December 2014 (MST)
Version 2.4.7 alpha (Unicode) of the XML Tools plugin supports a proxy, which is useful for those using it at their place of work. David Haslam 04:39, 17 June 2015 (MDT)

Linked verses

In the OSIS Reference Manual, the example given for linked verses is for only two successive verses:

<verse sID="Esth.1.1-Esth.1.2" osisID="Esth.1.1 Esth.1.2" n="1-2"/>
King Xerxes of Persia lived in his capital city of Susa and ruled one
hundred twenty-seven provinces from India to Ethiopia.
<verse eID="Esth.1.1-Esth.1.2"/>

It's my observation that this can easily be misunderstood when more than two successive verses are linked.

Here's a recent example of an incorrectly marked passage: (taken from the Hakha version at Myanmar Bibles)

<verse sID="Exod.22.2-Exod.22.4" osisID="Exod.22.2 Exod.22.4"/>A liam hrimhrim awk a si lai. Zeihmanh ngeihmi a ngeih
lo ahcun, a firmi chamnak ah khan amah cu zuar a si lai. A firmi saṭil kha, cawtum siseh, laa siseh, tuu siseh, 
a kut chungah a rak nun ko rih ahcun, a let in a liam lai. </p>
“Mifir pakhat nih inn a bauh i a bauhnak ah an thah ahcun athattu cu lainawng a si lo. 
Asinain ni chuah hnuah a si ahcun lainawng a si ṭhiamṭhiam. </p>
<verse eID='Exod.22.2-Exod.22.4'/>

"Why is this wrong?", you might ask.

Because verses 2 to 4 are linked, three successive verses are covered, so the osisID attribute must list all three references, including the intermediate Exod.22.3.

Here is the corrected version:

<verse sID="Exod.22.2-Exod.22.4" osisID="Exod.22.2 Exod.22.3 Exod.22.4"/>A liam hrimhrim awk a si lai. Zeihmanh ngeihmi a ngeih
lo ahcun, a firmi chamnak ah khan amah cu zuar a si lai. A firmi saṭil kha, cawtum siseh, laa siseh, tuu siseh, 
a kut chungah a rak nun ko rih ahcun, a let in a liam lai. </p>
“Mifir pakhat nih inn a bauh i a bauhnak ah an thah ahcun athattu cu lainawng a si lo. 
Asinain ni chuah hnuah a si ahcun lainawng a si ṭhiamṭhiam. </p>
<verse eID='Exod.22.2-Exod.22.4'/>

Thus corrected, the log output for osis2mod will include:

INFO(LINK): Linking Exod.22.3 to Exod.22.2
INFO(LINK): Linking Exod.22.4 to Exod.22.2

David Haslam 08:17, 14 November 2011 (MST)
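
For anyone generating such markup from a script, the osisID value for a run of linked verses can be built rather than typed by hand. The following is a minimal sketch; the function name expand_linked_osis_id is hypothetical, and it assumes both references have the form Book.Chapter.Verse and lie within the same chapter.

```python
def expand_linked_osis_id(start_ref, end_ref):
    """Build the space-separated osisID value for linked verses.

    Hypothetical helper: only handles a run of verses inside one
    chapter, with references written as Book.Chapter.Verse.
    """
    book_s, chap_s, verse_s = start_ref.split(".")
    book_e, chap_e, verse_e = end_ref.split(".")
    if (book_s, chap_s) != (book_e, chap_e):
        raise ValueError("this sketch only handles ranges within one chapter")
    first, last = int(verse_s), int(verse_e)
    if last < first:
        raise ValueError("end verse precedes start verse")
    # Every verse in the run is listed, not just the first and last.
    return " ".join(f"{book_s}.{chap_s}.{v}" for v in range(first, last + 1))


osis_id = expand_linked_osis_id("Exod.22.2", "Exod.22.4")
```

For the passage above, osis_id holds the three references used in the corrected version.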

Milestone form for chapters?

It would be helpful to module developers to also provide an example of using the milestone form for chapters. David Haslam 00:31, 26 December 2011 (MST)

Marking variants

I have expanded the section about marking variants.
I am now puzzled why SWORD uses the <seg> element rather than the <rdg> element as documented in the OSIS 2.1 User Manual. viz.

The rdg element has several pre-defined value for type:
  • alternate: A reading of approximately equal probability compared to others.
  • variant: A reading that varies from the accepted tradition.
If the enumerated values for type on the rdg element are insufficient, users should create their own, but prepended with the ‘x-’ extension.

David Haslam 13:16, 7 January 2014 (MST)

<rdg> can only appear within <note>. You're free to use it to note alternate readings within notes, but that has nothing to do with toggling between variants. --Osk 05:06, 22 January 2014 (MST)
Thanks. David Haslam 07:14, 22 January 2014 (MST)

I suppose that the filter value is

'false' for showing the primary reading
'true' for showing the secondary reading instead.

Please advise. David Haslam 13:28, 7 January 2014 (MST)

The values are not true/false. The values are "Primary Reading", "Secondary Reading", and "All Readings". We can potentially add additional readings in the future, but don't presently have content with more than two readings. --Osk 05:06, 22 January 2014 (MST)
Thanks. David Haslam 07:14, 22 January 2014 (MST)

Marking lemmas?

In the recently updated five Greek NT modules (TR, WHNU, Byz, Elzevir, Antoniades), lemmas occur in the following form: (example)

<w lemma="strong:G1080" morph="robinson:V-AAI-3S strongsMorph:5656">εγεννησεν</w>

Now I know what the first two main attributes are (Strong and Robinson), but what on earth is the second part in the latter?


Is this a new feature? David Haslam 02:37, 21 January 2014 (MST)

The lemma part is just the Strong's number, so this heading is a misnomer. The morph attribute indicates morphological coding (not the lemma). The second value, strongsMorph:5656, should have been encoded as strongsMorph:G5656. This will go out in the next update, but I won't update for at least a week to avoid repeated updates in quick succession, in case you're still playing around with these modules. This is not even remotely new. It has been in place in these modules in exactly this form since at least 2009, and the numbers have appeared in various modules since the 1990s. The number is also known as a TVM number (but it explicitly excludes any proprietary data from the TVM database). Lookups can be done using the Robinson module. --Osk 04:57, 22 January 2014 (MST)
Thanks for the explanation. Essentially you're saying that there was an old bug in osis2mod r2893 which caused the "G" to be omitted. I look forward to seeing this fixed, and the modules that were affected being updated. David Haslam 07:05, 22 January 2014 (MST)
PS. Is there a publicly documented definition for a TVM number? David Haslam 07:06, 22 January 2014 (MST)
Just found what the TVM acronym denotes: TVM = Tense Voice Mood. David Haslam 07:09, 22 January 2014 (MST)

Strong's markup with splits

Not documented in the main page is the markup for when a single Strong's number is split by how the verse is translated. e.g.

   <w lemma="strong:G1752" type="x-split-418">for</w>
   <w lemma="strong:G1343">righteousness’</w>
   <w lemma="strong:G1752" type="x-split-418">sake</w>:

In this example from KJV Matthew 5:10, the Greek word ενεκεν corresponding to G1752 was split in the English translation. David Haslam 04:11, 2 September 2014 (MDT)
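
To see which words belong to the same split, the <w> elements can be grouped by their type value. This is a sketch only: the function name split_groups is hypothetical, it assumes (from the single example above, not from any documentation) that all parts of one split carry the same x-split-N type, and it ignores the OSIS namespace that a real source file would declare.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_groups(fragment):
    """Map each x-split-* type to the (lemma, text) pairs that carry it.

    Hypothetical helper: the fragment is wrapped in a dummy root
    element so that it parses as well-formed XML.
    """
    root = ET.fromstring(f"<root>{fragment}</root>")
    groups = defaultdict(list)
    for w in root.iter("w"):
        split_type = w.get("type", "")
        if split_type.startswith("x-split-"):
            groups[split_type].append((w.get("lemma"), w.text))
    return dict(groups)


fragment = (
    '<w lemma="strong:G1752" type="x-split-418">for</w> '
    '<w lemma="strong:G1343">righteousness’</w> '
    '<w lemma="strong:G1752" type="x-split-418">sake</w>:'
)
groups = split_groups(fragment)
```

A group with only one member, or with differing lemma values, would be worth inspecting by hand.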

Marking mid-verse acrostic poetry headings?

What is the correct method to mark mid-verse acrostic poetry headings?

In some translations, these are used in Psalms 111 and 112 (110 and 111 in Synodal v11n). See [1].

Likewise, there is a single mid-verse acrostic in Psalm 34 (33 in Synodal v11n). See [2].

David Haslam 09:29, 4 December 2014 (MST)

Marking enumerated words?

To work with the GlobalOptionFilter=OSISEnum, which attribute should be used?

Under marking morphology, it says: The src attribute is used here to indicate the word position in the original Greek.

cf. In the SP module for the Samaritan Pentateuch, the n attribute was used.

David Haslam 05:05, 19 December 2014 (MST)

Until I find an answer to this question, I will defer adding a sub-section for this. David Haslam 09:31, 19 December 2014 (MST)
Section 13.17 (w) of the OSIS 2.1 User Manual has "src Use to record origin of the word."
A word origin is not the same as an enumerated position. David Haslam 09:45, 19 December 2014 (MST)
In the NT, the KJV source file uses the src attribute to record the position of the word[s] in the original Greek text. David Haslam 14:16, 19 December 2014 (MST)
Added the sub-section now that it's clearer to me. David Haslam 14:25, 19 December 2014 (MST)

Charset conversion

This sub-section has just been moved to Fonts#Charset_conversion. David Haslam 08:29, 18 February 2015 (MST)

Misleading advice about the canonical attribute?

Note 3 in the Body section reads:

Any div element defaults canonical to false. You need to set it to true on elements representing the structure of the original text.

This ignores the fact that the value of the canonical attribute is inherited.

If one includes canonical="true" in the osisText element, it will be inherited by the div elements of type bookGroup and book.

--David Haslam (talk) 02:58, 6 December 2017 (MST)

Of course, it would require marking any non-Biblical divisions (e.g. Front Matter) with canonical="false" if this approach were to be generally adopted. In practice, Bible modules with such additional content are few and far between.
--David Haslam(talk) 11:41, 10 December 2017 (MST)

Marking references in Right to Left scripts : Technical details

I was about to add a screenshot to enhance the example, but there's a problem with uploading files at present. This will have to wait until the problem is resolved. David Haslam (talk) 05:46, 11 December 2017 (MST)

Marking morpheme segmentation

I have not come across any clear documentation of how to mark up morpheme segmentation. This is a feature of some Biblical Hebrew modules, but it currently seems not to function, even in front-ends such as Xiphos that do offer it as a module option. --David Haslam (talk) 13:38, 24 December 2017 (MST)

Marking references in Right to Left scripts

This subsection of the page gives an example from the module UrduGeo. Unfortunately, my module submission in 2018 was not accepted by the modules team, even though I had done all this valuable research. Not because the research was in any way invalid, but simply because I had made use of TextPipe to develop the method of getting the source text fixed as regards the RtoL vernacular cross-references. David Haslam (talk) 12:19, 5 February 2019 (UTC)