Difference between revisions of "User:Dmsmith/KJV 2.6"
David Haslam (talk | contribs) (→Before or after the keyword?: pasted) |
David Haslam (talk | contribs) (→Punctuation and Strongs, Future: :This issue is a symptom of an underlying one that was there from when Strong's markup was first added. It was done with "greedy backwards matching" - viz. The ''start'' of a '''w''' element is just after the) |
||
(11 intermediate revisions by the same user not shown) | |||
Line 261: | Line 261: | ||
This issue was also reported by email on 2015-09-10. [[User:David Haslam|David Haslam]] | This issue was also reported by email on 2015-09-10. [[User:David Haslam|David Haslam]] | ||
+ | |||
+ | :This issue is a symptom of an underlying one that was there from when Strong's markup was first added. It was done with "greedy backwards matching" - viz. The ''start'' of a '''w''' element is just after the ''end'' of the ''previous'' '''w''' element in the verse text. [[User:David Haslam|David Haslam]] ([[User talk:David Haslam|talk]]) | ||
== Parsing the study notes == | == Parsing the study notes == | ||
Line 320: | Line 322: | ||
----- | ----- | ||
− | === catchWord osisRef, 2. | + | === catchWord osisRef, 2.12 === |
We might further enhance the study notes by adding a suitable '''osisRef''' to each '''catchWord''' element.<ref>Unless we can use (e.g.) <code>osisRef="Ps.40.0"</code> to point to a Psalm title, notes appertaining to such titles would need to be excluded.</ref> '''Fine grain references''' could be used to locate the catchWord text.<ref>See page 91 in the '''OSIS 2.1 User Manual'''</ref> | We might further enhance the study notes by adding a suitable '''osisRef''' to each '''catchWord''' element.<ref>Unless we can use (e.g.) <code>osisRef="Ps.40.0"</code> to point to a Psalm title, notes appertaining to such titles would need to be excluded.</ref> '''Fine grain references''' could be used to locate the catchWord text.<ref>See page 91 in the '''OSIS 2.1 User Manual'''</ref> | ||
* For a single keyword<ref>Including words 'hyphenated' using the en dash and words containing the right single quotation mark used as an apostrophe.</ref><ref>Care would be required when the catchWord text ends with an ellipsis and when the text includes the divineName element.</ref><ref>With some ingenuity, this could be extended to a word preceded only by the definite or indefinite article, etc.</ref>, we could use the <code>@s[word]</code> operator to search for the keyword in the verse text.<ref>This will only find the first occurrence of the word.</ref> | * For a single keyword<ref>Including words 'hyphenated' using the en dash and words containing the right single quotation mark used as an apostrophe.</ref><ref>Care would be required when the catchWord text ends with an ellipsis and when the text includes the divineName element.</ref><ref>With some ingenuity, this could be extended to a word preceded only by the definite or indefinite article, etc.</ref>, we could use the <code>@s[word]</code> operator to search for the keyword in the verse text.<ref>This will only find the first occurrence of the word.</ref> | ||
Line 330: | Line 332: | ||
[[User:David Haslam|David Haslam]] | [[User:David Haslam|David Haslam]] | ||
− | === Note punctuation, 2. | + | === Note punctuation, 2.12 === |
Every margin note in the Blayney 1769 Oxford Edition ends in a full stop. Compare this to what we have. We are missing the full stop for most notes. [[User:David Haslam|David Haslam]] | Every margin note in the Blayney 1769 Oxford Edition ends in a full stop. Compare this to what we have. We are missing the full stop for most notes. [[User:David Haslam|David Haslam]] | ||
Line 342: | Line 344: | ||
[[User:David Haslam|David Haslam]] | [[User:David Haslam|David Haslam]] | ||
− | + | === Unmatched catchWord text, 2.12 (if possible) === | |
− | === Unmatched catchWord text, 2. | ||
A systematic analysis of catchWord text has enabled us to identify 10 notes where the catchWord does not properly match to the verse text. Here are the results: | A systematic analysis of catchWord text has enabled us to identify 10 notes where the catchWord does not properly match to the verse text. Here are the results: | ||
{| class="wikitable" | {| class="wikitable" | ||
Line 443: | Line 414: | ||
# the catchWord text '''I will''' is at the start of the next verse 13. The note was misplaced! | # the catchWord text '''I will''' is at the start of the next verse 13. The note was misplaced! | ||
# is another special case, a bit like #1, but with more comma separated words.<BR>The note implies that for each instance of 'from', the alternative is 'between'. | # is another special case, a bit like #1, but with more comma separated words.<BR>The note implies that for each instance of 'from', the alternative is 'between'. | ||
+ | |||
+ | == Note tag placement == | ||
+ | === Notes appertaining to Psalm titles, 2.12 (if possible) === | ||
+ | 116 Psalms have a canonical title. The file '''kjv.xml''' contains 71 lines that include <code><title.+</note></code>. Within these lines the number of note elements is 101.<ref>Preparing for 2.12 will insert a line feed before the verse '''sID''' element in the XML file. This enables verse 1 for such Psalms to be processed in the same way as all other verses.</ref> | ||
+ | |||
+ | Study notes appertaining to text within a Psalm title are currently placed at the end of verse 1, just like any other note. To prevent these notes being orphaned when headings are hidden, it is proposed to move these notes to within the title element. As some Psalms also have one or more notes appertaining to text within verse 1, this change will require careful manual editing rather than automation by a script. | ||
+ | |||
+ | '''Notes:''' | ||
+ | <references/> | ||
+ | [[User:David Haslam|David Haslam]] | ||
+ | |||
+ | === Before or after the keyword? === | ||
+ | My late mother's 1936 Collins edition of the KJV has centre margins with notes and cross-references. Of particular interest is that the cross-reference tags are italicised lowercase superscript letters and the note tags are superscripted integers. These are positioned at the '''start''' of each word being referenced. This practice differs from how many of our modules are marked up, where the tag is often placed at the '''end''' of the word being referenced. In the KJV module, however, all the study note tags are at the end of the verse. | ||
+ | [[User:David Haslam|David Haslam]] | ||
+ | |||
+ | === Note tag order === | ||
+ | There are 972 lines in the KJV source XML that contain more than one note element. This accounts for 2087 of the notes. Are these notes in the same order as the catchWord text is in the verse text? Some verses have up to four notes. One example is Esther 1:19 | ||
+ | <pre> | ||
+ | <note type="study"><catchWord>If it…</catchWord>: Heb. <rdg type="x-literal">If it be good with the king</rdg></note> | ||
+ | <note type="study"><catchWord>unto…</catchWord>: Heb. <rdg type="x-literal">unto her companion</rdg></note> | ||
+ | <note type="study"><catchWord>from him</catchWord>: Heb. <rdg type="x-literal">from before him</rdg></note> | ||
+ | <note type="study"><catchWord>be not…</catchWord>: Heb. <rdg type="x-literal">pass not away</rdg></note> | ||
+ | </pre> | ||
+ | In this example, it looks as though the notes should be re-ordered. The verse reads (without markup, emphasis mine): | ||
+ | |||
+ | Esther 1:19: '''If it''' please the king, let there go a royal commandment '''from him''', and let it be written among the laws of the Persians and the Medes, that it '''be not''' altered, That Vashti come no more before king Ahasuerus; and let the king give her royal estate '''unto''' another that is better than she. | ||
+ | |||
+ | [[User:David Haslam|David Haslam]] | ||
+ | |||
+ | :Now that you are adding osisRef to catchWord, you can probably also reposition the note to the proper location in the verse. That'd be far better than end notes. I'd like to see the 1769 facsimile to determine whether it is before the phrase or after the phrase. The ... of many of the catch words indicates that positioning at the end of the phrase will be difficult. --[[User:Dmsmith|Dmsmith]] 07:54, 24 January 2016 (MST) | ||
+ | ::The frequent use of the horizontal ellipsis does present problems, which can only be resolved by detailed examination of the verse text for each such note. [[User:David Haslam|David Haslam]] | ||
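The order question above could be checked mechanically. What follows is only a minimal sketch in Python, not something taken from the KJV tooling: the function names are hypothetical, the verse is assumed to be available as plain text without markup, and a trailing horizontal ellipsis is simply stripped from each catchWord before searching. Like the <code>@s[word]</code> operator, `str.find` only locates the first occurrence, so repeated words would still need manual review.

```python
def catchword_positions(verse_text, catchwords):
    """Return the offset of each catchWord's first occurrence in the verse (-1 if absent)."""
    positions = []
    for cw in catchwords:
        key = cw.rstrip("…").strip()  # drop a trailing horizontal ellipsis
        positions.append(verse_text.find(key))
    return positions


def notes_in_verse_order(verse_text, catchwords):
    """True when every catchWord is found and the notes follow the verse order."""
    positions = catchword_positions(verse_text, catchwords)
    return all(p >= 0 for p in positions) and positions == sorted(positions)


def reorder_notes(verse_text, catchwords):
    """Return the catchWords sorted by where they first occur in the verse."""
    positions = catchword_positions(verse_text, catchwords)
    return [cw for _, cw in sorted(zip(positions, catchwords))]


esther_1_19 = (
    "If it please the king, let there go a royal commandment from him, "
    "and let it be written among the laws of the Persians and the Medes, "
    "that it be not altered, That Vashti come no more before king Ahasuerus; "
    "and let the king give her royal estate unto another that is better than she."
)
esther_notes = ["If it…", "unto…", "from him", "be not…"]
```

Applied to the Esther 1:19 example, `notes_in_verse_order(esther_1_19, esther_notes)` reports that the notes are out of order, and `reorder_notes` suggests moving the "unto…" note to the end.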
== XML cosmetic == | == XML cosmetic == | ||
− | For version 2. | + | For version 2.12 the order of the '''osisID''' and '''sID''' attributes in verse elements will be swapped. This will improve the readability of the OSIS XML file, and conforms to the example given in [[OSIS Bibles]]. |
Similar changes: | Similar changes: | ||
Line 506: | Line 508: | ||
[[User:David Haslam|David Haslam]] | [[User:David Haslam|David Haslam]] | ||
− | == Pilcrow signs, 2. | + | == Pilcrow signs, 2.12 == |
:''This section was moved from [[User:Dmsmith/KJV2011]] – it having never been actioned''. [[User:David Haslam|David Haslam]] | :''This section was moved from [[User:Dmsmith/KJV2011]] – it having never been actioned''. [[User:David Haslam|David Haslam]] | ||
* In printed editions of the KJV, there is normally a space immediately after the ¶. When viewed with PocketSword or Xiphos, there is no such space.<BR>This is not an artifact of how the SWORD engine handles the OSIS markup. Example: | * In printed editions of the KJV, there is normally a space immediately after the ¶. When viewed with PocketSword or Xiphos, there is no such space.<BR>This is not an artifact of how the SWORD engine handles the OSIS markup. Example: | ||
Line 528: | Line 530: | ||
Inserting a space within the Pilcrow marker has been agreed. It should be implemented before the next release. [[User:David Haslam|David Haslam]] 14:28, 24 January 2016 (MST) | Inserting a space within the Pilcrow marker has been agreed. It should be implemented before the next release. [[User:David Haslam|David Haslam]] 14:28, 24 January 2016 (MST) | ||
− | === Missing Pilcrows, 2. | + | === Missing Pilcrows, 2.12 === |
In modern editions of the KJV there are no pilcrows in the NT from Romans through Revelation. The generally accepted view is that the printers ran out of moveable type and just omitted them. Based on the observation that verses with a pilcrow are the same as those that are numbered in chapter descriptions, it's feasible to restore the aforementioned missing pilcrows. The only exception to this rule is that verse 1 in a chapter never has a pilcrow, because the printed editions generally have a large drop capital on the first verse. | In modern editions of the KJV there are no pilcrows in the NT from Romans through Revelation. The generally accepted view is that the printers ran out of moveable type and just omitted them. Based on the observation that verses with a pilcrow are the same as those that are numbered in chapter descriptions, it's feasible to restore the aforementioned missing pilcrows. The only exception to this rule is that verse 1 in a chapter never has a pilcrow, because the printed editions generally have a large drop capital on the first verse. | ||
Line 544: | Line 546: | ||
The '''description''' should explain all the '''x-prefix''' attribute values used in the body. | The '''description''' should explain all the '''x-prefix''' attribute values used in the body. | ||
+ | |||
+ | We should record file history in the OSIS header using the '''revisionDesc''' element. | ||
== See also == | == See also == |
Latest revision as of 20:08, 22 June 2023
This page is for recommended changes to the KJV module version 2.6 (or later).
Contents
- 1 Reference text policy
- 2 Revision History
- 3 Punctuation, 2.7
- 4 Words of Christ, 2.7
- 5 Added words
- 6 Tagging the Divine Name
- 7 Hebrew words, Future
- 8 Psalm 119 Acrostic Stanza Titles, 2.8
- 9 Multiple whitespace, 2.8
- 10 Missing punctuation in notes, 2.8
- 11 Add language identifier for foreign element, 2.8 and Ongoing
- 12 Selah, 2.9
- 13 Punctuation and Strongs, Future
- 14 Parsing the study notes
- 14.1 OSIS catchWord element, 2.9
- 14.2 OSIS rdg element, 2.9
- 14.3 transChange type="amplified", 2.9
- 14.4 Punctuation & typos, Ongoing
- 14.5 catchWord osisRef, 2.12
- 14.6 Note punctuation, 2.12
- 14.7 Omissions from and errors within note text, Ongoing, released as found
- 14.8 Unmatched catchWord text, 2.12 (if possible)
- 15 Note tag placement
- 16 XML cosmetic
- 17 Cross references?, Future
- 18 NT margin notes, Future
- 19 Hyphens, Discussion
- 20 Chapter descriptions, Future
- 21 Pilcrow signs, 2.12
- 22 Drop capitals
- 23 OSIS header
- 24 See also
- 25 External links
Reference text policy
- See also CrossWire KJV#Reference_Text.
A facsimile of the 1769 KJV (aka Blayney's KJV) will be the reference text for the KJV text starting Feb 2016. Previously, it was the Old Scofield. The Old Scofield will continue to be the reference text for red letter markup of the Words of Christ.
It may be sensible to review whether we chose the most suitable published text as our reference standard. The most widely accepted one is the Cambridge University Press - Concord Reference Bible.
- Need two things: an e-text and permission for the text. (I think the "crown" claims copyright.) --Dmsmith 19:36, 18 February 2014 (MST)
- Crown Copyright applies to the Authorised Version per se, not just to those printed by CUP, who are merely one of the licensed printers for all the works that come under Royal Letters Patent. Refer to our Copyright page David Haslam 05:08, 19 February 2014 (MST)
Since when did the Old Scofield become our reference text for this purpose? cf. [1]. David Haslam 05:50, 19 February 2014 (MST)
- During my earlier efforts, I found that there were lots of variations on the texts. I was wanting one that the KJV Only adherents felt was more accurate 1769 text. I found various listings of differences between current and "true" and examined dozens of dead tree texts in several stores. Both combined to come up with the Old Scofield. Also, it was important to avoid copyright claims based upon minor changes in the text. Additionally, I worked with Tim Lanfear who was doing the CCEL KJV to compare various eTexts. Those comparisons yielded differences that needed to be verified in an independent text. Thus the need for *a* dead tree text. While I was working on this, several websites that were dedicated to producing a "true" text were abandoned with chagrin that it is not a doable task apart from having a facsimile of the 1769, which is not known.
- Regarding Red Letters, I don't have a better resource. I'd love the "original" by the person who first did it, or a greater authority. What I have is a consistent authority. In doing the verification of the markup, I flipped through every page of the Old Scofield and compared it visually to BibleDesktop's display of the module. I also reviewed lists which various people provided me. Most were regarding words attributed to Jesus.
- --Dmsmith 06:29, 19 February 2014 (MST)
Revision History
Revision | Date | Description |
---|---|---|
2.9 | 2016-01-21 | Added markup to notes. Improved markup of Selah. |
2.8 | 2015-12-20 | Moved Ps 119 acrostic titles before verse number. Added Feature for no paragraphs. |
2.7 | 2015-08-09 | Fixed bugs preventing the display of some Strong's Numbers. |
2.6.1 | 2014-02-15 | Added GlobalOptionFilter for OSISLemma |
2.6 | 2013-10-05 | Fixed bugs. Added Greek from TR. |
2.5 | 2013-02-02 | Fixed bugs. |
2.4 | 2009-05-29 | Fixed bugs. |
2.3 | 2006-10-09 | Fixed bugs. |
2.2 | 2004-01-21 | Updated to 20040121 snapshot of KJV2003. |
2.1 | 2003-06-24 | Changed Old Testament to use OSIS tags, removing the last of the GBF markup. Also updated to 20030624 snapshot of KJV2003. Compressed. |
2.0 | early 2003 | Changed New Testament to use a snapshot of the KJV2003 Project |
Punctuation, 2.7
1 Cor 15:27 The comma at "him, it is" should not be italicized. There should be a comma between "excepted which". --Dmsmith 11:19, 16 February 2014 (MST)
- Done --Dmsmith 17:54, 18 February 2014 (MST)
Words of Christ, 2.7
The Old Scofield only highlights Words of Christ (WoC) as they come directly from his mouth: not what others say he said, and not translations of what he said, such as translation from Aramaic. In 2.6, there are 3 known errors in markup (there may be more). Red is what should be WoC, black is currently red, but shouldn't be:
Mat 8:26 And he saith unto them, Why are ye fearful, O ye of little faith? Then he arose, and rebuked the winds and the sea; and there was a great calm.
Mat 19:18 He saith unto him, Which? Jesus said, Thou shalt do no murder, Thou shalt not commit adultery, Thou shalt not steal, Thou shalt not bear false witness,
Act 1:4 And, being assembled together with them, commanded them that they should not depart from Jerusalem, but wait for the promise of the Father, which, saith he, ye have heard of me.
- Done--Dmsmith 06:56, 19 February 2014 (MST)
Added words
Added words & punctuation, 2.7
Split all transChange elements that contain punctuation marks, so that the punctuation and following space is normal text. 1 Cor 15:27 is one example of many such occurrences.
- Found and fixed:
- 67 ,
- 10 ;
- 6 :
- 1 ?
--Dmsmith 18:38, 18 February 2014 (MST)
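The split described above can be sketched as follows. This is a minimal illustration in Python, not the script actually used for 2.7: the function name is hypothetical, only transChange elements with plain text content and type="added" are handled, and the punctuation set is limited to the four marks counted above.

```python
import re

# Matches an "added" transChange element that contains plain text only.
TRANS = re.compile(r'<transChange type="added">([^<]*)</transChange>')
# A punctuation mark plus an optional following space, kept as its own piece.
SPLIT = re.compile(r'([,;:?] ?)')


def split_added_words(xml):
    """Move punctuation (and the space after it) out of transChange elements."""
    def repl(match):
        pieces = SPLIT.split(match.group(1))
        out = []
        for index, piece in enumerate(pieces):
            if not piece:
                continue
            if index % 2 == 1:
                out.append(piece)  # punctuation and space become normal text
            else:
                out.append('<transChange type="added">' + piece + '</transChange>')
        return ''.join(out)
    return TRANS.sub(repl, xml)
```

For a hypothetical fragment such as `<transChange type="added">him, it is</transChange>`, this yields two transChange elements with the comma and space between them as normal text.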
Added words & Strong's, No Change
Review and correct each instance in which a w element for Strong's & Morph is found within a transChange element. The markup probably belongs to the preceding word. The following are the instances and word(s) that precede that are not contained by a <w> element.
Gen 14.10 <transChange type="added"><w lemma="strong:H0875">was full of</w></transChange>
Exod 15.12 <transChange type="added"><w lemma="strong:H02098">which</w></transChange>
Exod 15.16 <transChange type="added"><w lemma="strong:H02098">which</w></transChange>
Exod 34.19 <transChange type="added"><w lemma="strong:H02142" morph="strongMorph:TH8735">that is male</w></transChange>
Num 1.16 These <transChange type="added"><w lemma="strong:H07148">were</w></transChange>
Num 3.19 These <transChange type="added"><w lemma="strong:H01992">are</w></transChange>
Num 10.28 Thus <transChange type="added"><w lemma="strong:H0428">were</w></transChange>
Num 13.3 <transChange type="added"><w lemma="strong:H01992">were</w></transChange>
Num 14.28 unto them, <transChange type="added"><w lemma="strong:H03808">As truly as</w></transChange>
Num 20.13 This <transChange type="added"><w lemma="strong:H01992">is</w></transChange>
1Sam 30.27 To <transChange type="added"><w lemma="strong:H0834">them</w></transChange>
2Kgs 19.31 <transChange type="added"><w lemma="strong:H06635" morph="strongMorph:TH8675">of hosts</w></transChange>
2Chr.10.16 <transChange type="added"><w lemma="strong:H07200" morph="strongMorph:TH8804">saw</w></transChange>
Ezra 2.65 and <transChange type="added"><w lemma="strong:H0428">there were</w></transChange>
Ps.17.6 unto me <transChange type="added"><w lemma="strong:H08085" morph="strongMorph:TH8798">and hear</w></transChange>
Ps 39.3 <transChange type="added"><w lemma="strong:H0227">then</w></transChange>
Jer 6.14 <transChange type="added"><w lemma="strong:H01323" morph="strongMorph:TH8676">of the daughter</w></transChange>
Jer 28.9 <transChange type="added"><w lemma="strong:H0227">then</w></transChange>
Jer.51.53 <transChange type="added"><w lemma="strong:H0227">yet</w></transChange>
- I'll need some help determining how these should be changed, if at all. It may be that the KJV uses italics for a purpose other than "added" words. --Dmsmith 19:33, 18 February 2014 (MST)
- Italics is presentational formatting. The transChange element is semantic. David Haslam 05:19, 19 February 2014 (MST)
- The 1611 and 1769 editions of the KJV didn't have an eText with semantic markup. Any semantic markup we have deduces the intention of the authors/printers from the orthographic representation. --Dmsmith 07:00, 19 February 2014 (MST)
- We submitted this list to David Instone-Brewer and received a detailed reply. David Haslam 06:37, 17 September 2015 (MDT)
- No action required following D I-B's explanation. David Haslam
Added words & the Divine Name, No Change
We found one instance of the Divine Name element within a transChange element. This is probably inappropriate.
- The one example is in 2 Chronicles 17:4. It is rendered in italics and small caps. So accordingly it is an added word representing the tetragrammaton. It is represented properly in OSIS. --Dmsmith 17:49, 18 February 2014 (MST)
Tagging the Divine Name
This is much more complicated than we thought. Observations:
- The divine name is also tagged within some study notes (even twice within the same note a few times).
- More than one Strong's number is involved.
- Five instances are in the NT where the Greek word κυριος is tagged.
- There is one instance where two Strong's numbers are joined to the divine name.
- In many places, there are some English words between the Strong's tag and the divine name tag.
- There are places where the divine name is tagged, even though it is within a transChange element (see previous subsection).
- The three hyphenated forms of the divine name (Jehovah–jireh, Jehovah–nissi, Jehovah–shalom) are not tagged in the main text, only in the study notes.
- The other two hyphenated forms of the divine name (Jehovah–shammah, Jehovah–tsidkenu) occur only in the study notes, where the English word (Lord) is tagged.
- Divine Name tagging in the KJV follows the "small caps" orthographic representation of Lord, God, Yah.
As such, it is found in added words and notes not being associated with Strong's Numbers.
- The Strong's Numbers tagged are H3068, H3069, H3072 and H3050. The first is the tetragrammaton. The second and third are variations of it. The last is Yah.
- In the NT, the orthographic representation of Lord as the divine name is backed by Greek, not Hebrew.
- The instance of two Strong's Numbers being associated with divine name needs to be reviewed. The leading word is "face", often translated "before" or "presence".
Jer.26.19 <w morph="strongMorph:TH8762" lemma="strong:H02470">and besought</w> <w lemma="strong:H06440 strong:H03068">the <divineName>Lord</divineName></w>
- Here the leading word is left untranslated.
Hebrew words, Future
Following the addition of Greek words in the NT from the TR in version 2.6, is it planned to do likewise for Hebrew (& Aramaic) words in the OT from the MT ?
- It would be wonderful. However, the tagging of Strong's numbers provided a map to the TR in the src="x y" attribute, where that gave the position of the word in the TR. So, the addition of the Greek was trivial. We have nothing like that for the MT. It won't be trivial. Also, the Strong's tagging in the OT is not comprehensive. In any given verse only some of the words from the MT are tagged. In the NT all the TR words were present in the tagging, even if empty (i.e. untranslated.)
- We are more likely to update the morphology of the OT first for those that have some kind of morphology today.
- --Dmsmith 09:31, 25 February 2014 (MST)
Psalm 119 Acrostic Stanza Titles, 2.8
The 22 Hebrew letter acrostic titles in Psalm 119 should be displayed before the first verse of each eight-verse stanza. Currently, the next verse tag is displayed before each stanza title. This is incorrect when compared to the KJV printed edition. The mod2imp output for the first such title is:
$$$Psalms 119:1 <title canonical="true" type="acrostic"><foreign n="א">ALEPH.</foreign></title> <w lemma="strong:H0835">Blessed</w> <transChange type="added">are</transChange> <w lemma="strong:H08549">the undefiled</w> <w lemma="strong:H01870">in the way</w>, <w lemma="strong:H01980" morph="strongMorph:TH8802">who walk</w> <w lemma="strong:H08451">in the law</w> <w lemma="strong:H03068">of the <divineName>Lord</divineName></w>. <note type="study">undefiled: or, perfect, or, sincere</note>
Do we need to wrap each stanza title between suitably constructed milestone preverse div elements?
- The preverse div should never be constructed in xml. It is created by osis2mod.
- Done in version 2.8----Dmsmith 05:33, 20 December 2015 (MST)
- Did you just move the titles to before the stanza? David Haslam 05:43, 20 December 2015 (MST)
- So far, yes. I'm testing it. I may have to put it in a section div or change osis2mod. If a div, that would be the first instance of a section div, which may add vertical whitespace that is not present elsewhere in the KJV.--Dmsmith 05:46, 20 December 2015 (MST)
Multiple whitespace, 2.8
Within the text source for 2.7 (kjvfull.xml) there are 39 instances of double spaces (outside the header):
- 36 are immediately after the "w" in a w element
- 2 are after a closing " but within a w element
- 1 is between two w elements; in Phil.4.2, and is displayed by SWORD as
that they be of the same mind
The latter should be corrected.
- Done in version 2.8----Dmsmith 05:31, 20 December 2015 (MST)
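A count like the one above can be reproduced with a short script. The following is only a sketch in Python under stated assumptions: the function name is hypothetical, the header is assumed to have been excluded beforehand, and a run of spaces is classed as "inside a tag" when the nearest preceding angle bracket is an opening one.

```python
import re


def find_double_spaces(xml):
    """Return (offset, inside_tag) for every run of two or more spaces."""
    results = []
    for match in re.finditer(r' {2,}', xml):
        before = xml[:match.start()]
        # Inside a tag if the last '<' comes after the last '>'.
        inside_tag = before.rfind('<') > before.rfind('>')
        results.append((match.start(), inside_tag))
    return results
```

Runs reported with `inside_tag` set to False are the ones that affect what SWORD displays, such as the Phil.4.2 case; runs inside a start tag are harmless to the rendered text.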
Missing punctuation in notes, 2.8
- There are 3 study notes that contain the abbreviation "Heb" with no full-stop after the abbreviation. The locations are 2Chr.2.16, Isa.9.20, Jer.13.21
- Done in version 2.8----Dmsmith 05:36, 20 December 2015 (MST)
Add language identifier for foreign element, 2.8 and Ongoing
Suggest add the following attribute to the foreign element in each acrostic title:
xml:lang="hbo"
Refer to OSIS Reference Manual.
- Done in version 2.8--Dmsmith 06:18, 20 December 2015 (MST)
Should also identify and add other foreign elements.--Dmsmith 06:18, 20 December 2015 (MST)
- MENE, MENE, TEKEL, UPHARSIN (what language code?) David Haslam 10:00, 20 December 2015 (MST)
Selah, 2.9
There are 75 instances of the whole word "Selah" in the KJV.[1] The first is in II Kings 14:7. The rest are found in Psalms (71) and Habakkuk (3).
Of those in Psalms, these 13 locations have the peculiarity that the Strong's markup includes other words besides Selah.
<w lemma="strong:H05542">thereof. Selah</w>.
<w lemma="strong:H05542">me. Selah</w>.
<w lemma="strong:H05542">himself. Selah</w>.
<w lemma="strong:H05542">before them. Selah</w>.
<w lemma="strong:H05542">for us. Selah</w>.
<w lemma="strong:H05542">themselves. Selah</w>.
<w lemma="strong:H05542">upon us; Selah</w>.
<w lemma="strong:H05542">of it. Selah</w>.
<w lemma="strong:H05542">thee. Selah</w>.
<w lemma="strong:H05542">there. Selah</w>.
<w lemma="strong:H05542">thee? Selah</w>.
<w lemma="strong:H05542">for me. Selah</w>.
<w lemma="strong:H05542">themselves. Selah</w>.
These words & punctuation do not belong properly to the "Selah" but are part of the preceding sentence or phrase. It may therefore be sensible to convert them like this:
thereof. <w lemma="strong:H05542">Selah</w>.
me. <w lemma="strong:H05542">Selah</w>.
himself. <w lemma="strong:H05542">Selah</w>.
before them. <w lemma="strong:H05542">Selah</w>.
for us. <w lemma="strong:H05542">Selah</w>.
themselves. <w lemma="strong:H05542">Selah</w>.
upon us; <w lemma="strong:H05542">Selah</w>.
of it. <w lemma="strong:H05542">Selah</w>.
thee. <w lemma="strong:H05542">Selah</w>.
there. <w lemma="strong:H05542">Selah</w>.
thee? <w lemma="strong:H05542">Selah</w>.
for me. <w lemma="strong:H05542">Selah</w>.
themselves. <w lemma="strong:H05542">Selah</w>.
This issue was already communicated by email on 2015-09-09. David Haslam
It can be readily fixed by a simple Perl replacement, thus:
Perl pattern [(<w lemma="strong:H05542">)(.+)(Selah</w>)] with [$2$$1$$3]
[X] Match case
Maximum text buffer size 4096
[ ] Maximum match (greedy)
[ ] Allow comments
[ ] '.' matches newline
[X] UTF-8 Support
- Done in 2.9. --Dmsmith 19:15, 23 January 2016 (MST)
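For anyone without the original replacement tool, the same fix can be sketched in Python. This is an illustration only, not the script used for 2.9: the function name is hypothetical, and the middle group is written as a non-greedy run of non-tag characters (an assumption that stands in for the unchecked "Maximum match" option) so that a match can never cross into another element.

```python
import re

# Group 1: the Selah start tag; group 2: the stray preceding words;
# group 3: the word Selah and the end tag.
SELAH = re.compile(r'(<w lemma="strong:H05542">)([^<]+?)(Selah</w>)')


def fix_selah(xml):
    """Move the w start tag so that it wraps only the word Selah."""
    return SELAH.sub(r'\2\1\3', xml)
```

Elements that already contain only the word Selah are left unchanged, because the middle group requires at least one character.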
Note:
- ↑ Although OSIS defines an attribute value type="selah", this only applies to the poetry line element l, none of which are used in the KJV.
Punctuation and Strongs, Future
A much more general issue was also reported. Namely, tagged w elements that span beyond the end of a sentence or phrase. Many of these can be identified by the fact that the spanned text includes at least one terminating punctuation mark [.,;:!?)]. Some of these even contain two or more such punctuation marks, so devising a regexp is a bit fraught. Moreover, for some of those that have a comma, it may be perfectly valid to include the preceding word[s]. Less likely for the other punctuation marks.
Searching for different regexps such as [>.+\?.+</w>] I counted the following:
Count | Punctuation mark |
---|---|
219 | Full-stop |
7646 | Comma (of which 444 have two or more commas) |
1215 | Colon |
1064 | Semicolon |
22 | Exclamation mark |
254 | Question mark |
11 | Right parenthesis |
13 | Left parenthesis (all these also contain another pm) |
It's often the case that the English word that matches the Strong's tag is the last word before the </w>. Even so, I have not proven that this applies to 100% of the above patterns.
This issue was also reported by email on 2015-09-10. David Haslam
- This issue is a symptom of an underlying one that was there from when Strong's markup was first added. It was done with "greedy backwards matching" - viz. The start of a w element is just after the end of the previous w element in the verse text. David Haslam (talk)
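Counts of this kind could be gathered in one pass rather than with a separate regexp per mark. The sketch below is in Python and rests on assumptions: the function name is hypothetical, and only w elements with plain text content are examined, so those containing a nested element such as divineName are skipped and the totals need not match the table above.

```python
import re
from collections import Counter

# A w element whose content is plain text (no nested elements).
W_TEXT = re.compile(r'<w\b[^>]*>([^<]*)</w>')
MARKS = ".,;:!?()"


def count_spanning_marks(xml):
    """Count, per punctuation mark, the w elements whose text contains it."""
    counts = Counter()
    for match in W_TEXT.finditer(xml):
        text = match.group(1)
        for mark in MARKS:
            if mark in text:
                counts[mark] += 1
    return counts
```

An element with two commas is counted once for the comma, which matches the way the table reports "of which 444 have two or more commas" as a subset.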
Parsing the study notes
These subsections document my analyses of the KJV study notes, as well as proposing ways in which we might improve them using standard OSIS markup.
In Xiphos, the text within a catchWord or a rdg element is displayed in italics.
- The proposals detailed in the following subsections have been implemented in KJV module version 2.9 that was released on 2016-01-21.
OSIS catchWord element, 2.9
Most of the study notes in the KJV source text have recognisable key words or key phrases. These should be marked up using the OSIS catchWord element. e.g.
<note type="study">the light from…: Heb. between the light and between the darkness</note>
should become
<note type="study"><catchWord>the light from…</catchWord>: Heb. between the light and between the darkness</note>
The catchWord element would be added by pattern matching to the first colon in the note text.
Out of 6959 study notes, there are 147 notes with more than a single colon, 140 of which have ": or, ".
David Haslam
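The pattern match to the first colon can be sketched as below. This is a minimal Python illustration, not the script that produced 2.9: the function name is hypothetical, and notes are assumed to start with plain text immediately after the note start tag. Notes with more than one colon are still wrapped at the first colon only, so the 147 such notes would need checking by hand.

```python
import re

# Group 2 is everything from the start of the note up to the first colon.
NOTE = re.compile(r'(<note type="study">)([^<:]+)(:)')


def add_catchword(xml):
    """Wrap the text before the first colon of each study note in catchWord."""
    return NOTE.sub(r'\1<catchWord>\2</catchWord>\3', xml)
```

A note without any colon is left untouched, because the pattern requires a colon before the next tag.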
OSIS rdg element, 2.9
Many of the study notes record alternative or more literal renderings of the Hebrew, Chaldee[1] or Greek[2] text.
We might wish to wrap all such readings within the OSIS rdg element. e.g.
<note type="study">And the evening…: Heb. And the evening was, and the morning was etc.</note>
would become
<note type="study"><catchWord>And the evening…</catchWord>: Heb. <rdg>And the evening was, and the morning was</rdg> etc.</note>
Proposed type attributes for the rdg element:
- type="alternate"[3] – used when the subsequent text was introduced by ", or, "[4]
- type="x-equivalent" – used when the note gives the Gr. equivalent (LXX ?) to a Hebrew name.
- type="x-identity" – used when the note has " also called, " (sometimes without a comma).
- type="x-literal" – used when the note gives the Heb. translation more literally than the main text.
- type="x-meaning" – used when the note explains the meaning of a Heb. or Chaldee name or word. Sometimes introduced by " that is, "
The earlier example would then become:
<note type="study"><catchWord>And the evening…</catchWord>: Heb. <rdg type="x-literal">And the evening was, and the morning was</rdg> etc.</note>
A few notes will end up with a rdg element inside a rdg element.
We use a customised OSIS schema, so the inner rdg does not have to be wrapped within a seg element for the file to validate.
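As a rough illustration, the proposed type values could be guessed from the wording that introduces each reading. The function rdg_type and its rules are assumptions for the sake of the sketch. In particular, telling x-literal from x-meaning for "Heb." notes needs human judgement, so the result should be treated only as a first pass.

```python
from typing import Optional

def rdg_type(note_body: str) -> Optional[str]:
    """Guess a rdg type from the text that follows 'catchWord: ' in a note."""
    text = note_body.strip()
    if "also called" in text:
        return "x-identity"
    if text.startswith("that is,"):
        return "x-meaning"
    if text.startswith("Gr."):
        return "x-equivalent"
    if text.startswith("Heb."):
        return "x-literal"  # may really be x-meaning; needs a manual check
    if text.lower().startswith("or,"):
        return "alternate"
    return None  # leave anything else for manual review

print(rdg_type("Heb. And the evening was, and the morning was etc."))
```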
Notes:
- ↑ i.e. Biblical Aramaic.
- ↑ Among the OT notes there are 35 instances of "Gr. " with equivalent names to the Hebrew.
- ↑ As documented in the OSIS 2.1.1 Reference Manual.
- ↑ Of the 2672 notes containing this string, there are 125 notes that contain it twice, and 1 note that has it thrice (see below).
<note type="study">a beacon: or, a tree bereft of branches, or, boughs: or, a mast</note>
transChange type="amplified", 2.9
The following six study notes were candidates for this. They didn't match the concept of a reading.
<note type="study"><catchWord>selvedge</catchWord>: <transChange type="amplified">an edge of cloth so woven that it cannot unravel</transChange></note>
<note type="study"><catchWord>the caul</catchWord>: <transChange type="amplified">it seemeth by anatomy, and the Hebrew doctors, to be the midriff</transChange></note>
<note type="study"><catchWord>selvedge</catchWord>: <transChange type="amplified">an edge of cloth so woven that it cannot unravel</transChange></note>
<note type="study"><catchWord>his reign</catchWord>: <transChange type="amplified">Nebuchadnezzar’s eighth year</transChange></note>
<note type="study"><catchWord>behemoth</catchWord>: <transChange type="amplified">probably an extinct animal of some kind</transChange></note>
<note type="study"><catchWord>leviathan</catchWord>: <transChange type="amplified">probably an extinct animal of some kind</transChange></note>
Punctuation & typos, Ongoing
I doubt if anyone has ever thoroughly audited the punctuation in the KJV study notes against the standard reference printed edition. e.g.
- There were 12 instances of " also called " without a comma, compared to 143 instances of " also called, " with the comma.
- There were 2 instances of ", of" that have now been corrected to ", or".
- The audit will continue after the release of version 2.9.
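Counts like the " also called " figures above can be reproduced with a few lines of Python. This is only a sketch: the function name is an assumption, and the sample notes use placeholder names rather than real note text.

```python
def count_also_called(notes):
    """Return (with_comma, without_comma) counts over an iterable of note strings."""
    with_comma = sum(n.count(" also called, ") for n in notes)
    without_comma = sum(n.count(" also called ") for n in notes)
    return with_comma, without_comma

# Placeholder notes, not taken from the KJV source.
sample_notes = [
    "Aaa: also called, Bbb",
    "Ccc: also called Ddd",
    "Eee: or, Fff",
]
print(count_also_called(sample_notes))
```

The two search strings cannot overlap, because the second requires a space directly after "called" where the first has a comma.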
Everything below here is yet to be done.
catchWord osisRef, 2.12
We might further enhance the study notes by adding a suitable osisRef to each catchWord element.[1] Fine grain references could be used to locate the catchWord text.[2]
- For a single keyword[3][4][5], we could use the @s[word] operator to search for the keyword in the verse text.[6]
- For a keyphrase (multiple words), we could use the @cp operator and specify the corresponding range of codepoints in the verse text.
The former would be easier to implement than the latter, which is also less than ideal as a meaningful enhancement.
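A sketch of both cases in Python, assuming the osisID and the plain verse text are already to hand. The function names are hypothetical. The single-word form follows the osisID@s[word] pattern; for key phrases the sketch only computes the code point offsets and leaves the exact @cp syntax to be checked against the OSIS manual.

```python
import re
from typing import Optional, Tuple

def single_word_osisref(osis_id: str, catch_word: str, verse_text: str) -> Optional[str]:
    """Build osisID@s[word] when catch_word is one whole word found in the verse."""
    if not re.fullmatch(r"[^\s…]+", catch_word):
        return None  # key phrases and elided text need separate handling
    pattern = r"(?<!\w)" + re.escape(catch_word) + r"(?!\w)"
    if re.search(pattern, verse_text) is None:
        return None
    return f"{osis_id}@s[{catch_word}]"

def phrase_codepoints(phrase: str, verse_text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) code point offsets of the first occurrence of phrase."""
    start = verse_text.find(phrase)  # Python str indices count code points
    return None if start < 0 else (start, start + len(phrase))

verse = "So Amasa went to assemble the men of Judah: but he tarried longer"
print(single_word_osisref("2Sam.20.5", "assemble", verse))
print(single_word_osisref("2Sam.20.5", "Assemble", verse))
```

The search is case sensitive, so the capitalised catchWord is reported as missing rather than silently matched.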
Notes:
- ↑ Unless we can use (e.g.) osisRef="Ps.40.0" to point to a Psalm title, notes appertaining to such titles would need to be excluded.
- ↑ See page 91 in the OSIS 2.1 User Manual
- ↑ Including words 'hyphenated' using the en dash and words containing the right single quotation mark used as an apostrophe.
- ↑ Care would be required when the catchWord text ends with an ellipsis and when the text includes the divineName element.
- ↑ With some ingenuity, this could be extended to a word preceded only by the definite or indefinite article, etc.
- ↑ This will only find the first occurrence of the word.
Note punctuation, 2.12
Every margin note in the Blayney 1769 Oxford Edition ends in a full stop. Compare this to what we have. We are missing the full stop for most notes. David Haslam
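One way to list the notes that lack a final full stop is sketched below. The regular expressions and the function name are assumptions; the check is made on the note text with any inner markup stripped.

```python
import re

NOTE_RE = re.compile(r'<note type="study">(.*?)</note>', re.S)
TAG_RE = re.compile(r"<[^>]+>")

def notes_missing_full_stop(xml: str):
    """Return the plain text of every study note that does not end in a full stop."""
    missing = []
    for match in NOTE_RE.finditer(xml):
        text = TAG_RE.sub("", match.group(1)).rstrip()
        if not text.endswith("."):
            missing.append(text)
    return missing

sample = (
    '<note type="study">the light from…: Heb. between the light and between the darkness</note>'
    '<note type="study">And the evening…: Heb. And the evening was, and the morning was etc.</note>'
)
print(notes_missing_full_stop(sample))
```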
Omissions from and errors within note text, Ongoing, released as found
The study notes may have never been thoroughly audited for accuracy. I just found a textual omission.
The second note in Isa.59.5 currently reads: (omitting recently added markup, merely for clarity)
<note type="study">crushed…: or, sprinkled is as if there brake out a viper</note>
It should read:
<note type="study">crushed…: or, that which is sprinkled is as if there brake out a viper</note>
Unmatched catchWord text, 2.12 (if possible)
A systematic analysis of catchWord text has enabled us to identify 10 notes where the catchWord does not properly match to the verse text. Here are the results:
Item # | osisID | catchWord text | verse text |
---|---|---|---|
1 | Gen.36.39 | Hadar, Pau | And Baal–hanan the son of Achbor died, and Hadar reigned in his stead: and the name of his city was Pau; and his wife’s name was Mehetabel, the daughter of Matred, the daughter of Mezahab. |
2 | Judg.6.32 | Jerubbesheth | Therefore on that day he called him Jerubbaal, saying, Let Baal plead against him, because he hath thrown down his altar. |
3 | 2Sam.20.5 | Assemble | So Amasa went to assemble the men of Judah: but he tarried longer than the set time which he had appointed him. |
4 | 2Sam.23.8 | 1ch 11:11 he lift: from whom he: Heb. slain. | These be the names of the mighty men whom David had: The Tachmonite that sat in the seat, chief among the captains; the same was Adino the Eznite: he lift up his spear against eight hundred, whom he slew at one time. |
5 | 1Chr.4.14 | Hathath | And Meonothai begat Ophrah: and Seraiah begat Joab, the father of the valley of Charashim; for they were craftsmen. |
6 | 2Chr.6.29 | toward | Then what prayer or what supplication soever shall be made of any man, or of all thy people Israel, when every one shall know his own sore and his own grief, and shall spread forth his hands in this house: |
7 | 2Chr.6.32 | toward | Moreover concerning the stranger, which is not of thy people Israel, but is come from a far country for thy great name’s sake, and thy mighty hand, and thy stretched out arm; if they come and pray in this house; |
8 | Job.8.20 | come | Behold, God will not cast away a perfect man, neither will he help the evil doers: |
9 | Jer.8.12 | I will | Were they ashamed when they had committed abomination? nay, they were not at all ashamed, neither could they blush: therefore shall they fall among them that fall: in the time of their visitation they shall be cast down, saith the Lord. |
10 | Ezek.47.18 | from (Hauran, Damascus, Gilead, the land) | And the east side ye shall measure from Hauran, and from Damascus, and from Gilead, and from the land of Israel by Jordan, from the border unto the east sea. And this is the east side. |
Observations:
- #1 is a special case, where both the individual words match, but not the comma combination.
- #2 seems to have the wrong catchWord. The note tag is for Jerubbaal.
- #3 is a capitalisation typo. Should be "assemble".
- #4 we already had observed as a peculiar exception to the usual note format/function. See below.
- #5 the catchWord Hathath is actually in the previous verse 13. The note was misplaced!
- #6 should be, Or, towards this house., tagged before "in..."
- #7 is just like #6, but Collins 1936 has no note here!
- #8 we already analysed. Add details.
- #9 the catchWord text I will is at the start of the next verse 13. The note was misplaced!
- #10 is another special case, a bit like #1, but with more comma separated words. The note implies that for each instance of 'from', the alternative is 'between'.
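The kind of check behind this table can be sketched as a plain substring test, after dropping any trailing ellipsis from the catchWord. This is an illustrative reconstruction, not the analysis that was actually run; the function name is an assumption, and the two rows are copied from the table above.

```python
def catchword_matches(catch_word: str, verse_text: str) -> bool:
    """True when the catchWord, minus any trailing ellipsis, occurs verbatim in the verse."""
    stem = catch_word.rstrip("…").strip()
    return bool(stem) and stem in verse_text

rows = [
    ("2Sam.20.5", "Assemble",
     "So Amasa went to assemble the men of Judah: but he tarried longer than "
     "the set time which he had appointed him."),
    ("1Chr.4.14", "Hathath",
     "And Meonothai begat Ophrah: and Seraiah begat Joab, the father of the "
     "valley of Charashim; for they were craftsmen."),
]
unmatched = [osis_id for osis_id, cw, verse in rows if not catchword_matches(cw, verse)]
print(unmatched)
```

Because the test is case sensitive and literal, it flags capitalisation typos and misplaced notes alike, which then need to be told apart by hand.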
Note tag placement
Notes appertaining to Psalm titles, 2.12 (if possible)
116 Psalms have a canonical title. The file kjv.xml contains 71 lines that include <title.+</note>. Within these lines the number of note elements is 101.[1]
Study notes appertaining to text within a Psalm title are currently placed at the end of verse 1, just like any other note. To prevent these notes being orphaned when headings are hidden, it is proposed to move these notes to within the title element. As some Psalms also have one or more notes appertaining to text within verse 1, this change will require careful manual editing, rather than automation by a script.
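The two counts quoted above can be obtained along these lines. The function name is an assumption, and the sample lines are invented stand-ins for lines of kjv.xml.

```python
import re

def title_note_stats(lines):
    """Return (lines matching <title.+</note>, note elements on those lines)."""
    title_lines = [ln for ln in lines if re.search(r"<title.+</note>", ln)]
    note_count = sum(len(re.findall(r"<note\b", ln)) for ln in title_lines)
    return len(title_lines), note_count

# Invented sample lines, not real module content.
sample_lines = [
    '<title canonical="true">t</title>text<note type="study">a: b</note><note type="study">c: d</note>',
    'text<note type="study">e: f</note>',
    '<title canonical="true">t</title>text without notes',
]
print(title_note_stats(sample_lines))
```

The count covers every note on a matching line, including any that belong to verse 1 rather than to the title, which is why the move itself still needs manual editing.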
Notes:
- ↑ Preparing for 2.12 will insert a line feed before the verse sID element in the XML file. This enables verse 1 for such Psalms to be processed in the same way as all other verses.
Before or after the keyword?
My late mother's 1936 Collins edition of the KJV has centre margins with notes and cross-references. Of particular interest is that the cross-reference tags are italicised lowercase superscript letters and the note tags are superscripted integers. These are positioned at the start of each word being referenced. This practice differs from how many of our modules are marked up, where the tag is often placed at the end of the word being referenced. In the KJV module, however, all the study note tags are at the end of the verse. David Haslam
Note tag order
There are 972 lines in the KJV source XML that contain more than one note element. This accounts for 2087 of the notes. Are these notes in the same order as the catchWord text is in the verse text? Some verses have up to four notes. One example is Esther 1:19:
<note type="study"><catchWord>If it…</catchWord>: Heb. <rdg type="x-literal">If it be good with the king</rdg></note>
<note type="study"><catchWord>unto…</catchWord>: Heb. <rdg type="x-literal">unto her companion</rdg></note>
<note type="study"><catchWord>from him</catchWord>: Heb. <rdg type="x-literal">from before him</rdg></note>
<note type="study"><catchWord>be not…</catchWord>: Heb. <rdg type="x-literal">pass not away</rdg></note>
In this example, it looks as though the notes should be re-ordered. The verse reads (without markup, emphasis mine):
Esther 1:19: If it please the king, let there go a royal commandment from him, and let it be written among the laws of the Persians and the Medes, that it be not altered, That Vashti come no more before king Ahasuerus; and let the king give her royal estate unto another that is better than she.
- Now that you are adding osisRef to catchWord, you can probably also reposition the note to the proper location in the verse. That'd be far better than end notes. I'd like to see the 1769 facsimile to determine whether it is before the phrase or after the phrase. The ... of many of the catch words indicates that positioning at the end of the phrase will be difficult. --Dmsmith 07:54, 24 January 2016 (MST)
- The frequent use of the horizontal ellipsis does present problems, which can only be resolved by detailed examination of the verse text for each such note. David Haslam
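A first-pass test for note order is sketched below: find where each catchWord stem first occurs in the verse and see whether those positions ascend. The function names are assumptions. Because only the first occurrence is used, short stems such as "unto" can give misleading results in verses where the word recurs.

```python
def note_positions(catch_words, verse_text):
    """First-occurrence offset of each catchWord stem, or -1 when not found."""
    return [verse_text.find(cw.rstrip("…").strip()) for cw in catch_words]

def in_verse_order(catch_words, verse_text):
    """True when every catchWord is found and the offsets are in ascending order."""
    pos = note_positions(catch_words, verse_text)
    return all(p >= 0 for p in pos) and pos == sorted(pos)

esther_1_19 = (
    "If it please the king, let there go a royal commandment from him, and let it "
    "be written among the laws of the Persians and the Medes, that it be not altered, "
    "That Vashti come no more before king Ahasuerus; and let the king give her royal "
    "estate unto another that is better than she."
)
current_order = ["If it…", "unto…", "from him", "be not…"]
print(in_verse_order(current_order, esther_1_19))
```

For this verse the check reports that the present order does not follow the verse text, which agrees with the observation above.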
XML cosmetic
For version 2.12 the order of the osisID and sID attributes in verse elements will be swapped. This will improve the readability of the OSIS XML file and conform to the example given in OSIS Bibles.
Similar changes:
- The lemma attribute will be moved to before the morph attribute in w elements.
- The canonical attribute will be moved to be the first one in title elements.
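A sketch of the verse and w reorderings, assuming the attributes currently appear exactly as osisID then sID, and morph then lemma, with nothing else in between. The patterns and the function name are illustrative only; title elements are not covered.

```python
import re

VERSE_RE = re.compile(r'<verse osisID="([^"]*)" sID="([^"]*)"/>')
W_RE = re.compile(r'<w morph="([^"]*)" lemma="([^"]*)"')

def reorder_attributes(xml: str) -> str:
    """Put sID before osisID in verse elements and lemma before morph in w elements."""
    xml = VERSE_RE.sub(r'<verse sID="\2" osisID="\1"/>', xml)
    return W_RE.sub(r'<w lemma="\2" morph="\1"', xml)

sample = ('<verse osisID="Gen.1.6" sID="Gen.1.6"/>'
          '<w morph="strongMorph:TH8799" lemma="strong:H0559">said</w>')
print(reorder_attributes(sample))
```

Elements whose attributes are already in the new order do not match either pattern, so running the function twice changes nothing further.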
Cross references?, Future
Sadly lacking from our KJV module are any scripture cross-references. Many printed editions of the AV contain such references. We should explore how the module might be enhanced by obtaining the data from a suitable electronic source.
[2] is of interest in this context, but see the foot of [3] which describes the sources of the data.
The 1769 edition of the KJV included Benjamin Blayney's cross-references. Many of the OT references therein were to the Deuterocanonical books. See also [4].
One cross-reference already
The note for II Samuel 23:8 contains a cross-reference! This is how I would parse this note.
The text in italics is a reading, the rest being note text.
<note type="study">1ch 11:11 he lift…: from whom he…: Heb. slain</note>
II Samuel 23:8 reads: (emphasis mine)
These be the names of the mighty men whom David had: The Tachmonite that sat in the seat, chief among the captains; the same was Adino the Eznite: he lift up his spear against eight hundred, whom he slew at one time.
The text "he lift up his spear" is in italics as per <transChange type="added">.
I Chronicles 11:11 reads: (emphasis mine)
And this is the number of the mighty men whom David had; Jashobeam, an Hachmonite, the chief of the captains: he lifted up his spear against three hundred slain by him at one time.
But could "1ch 11:11" have been a cross-reference with its own separate tag?
- It looks that way in the Collins 1936 edition. Nearby verses 9 & 11 both have references to 1 Chr.11.12.
- The note in verse 8 reads: slain. and the note tag is before whom he slew.
- This note was the only one that has not been marked up in KJV version 2.9
Aside: There's also a numerical discrepancy between these two verses in the Hebrew text. David Haslam
NT margin notes, Future
The KJV module has study notes only in the OT. We should find a source text that also contains all the margin notes found in the NT. e.g.
Matt.1.11@s[Josias] Some read, Josias begat Jakim, and Jakim begat Jechonias. 1 Chr. 3.15.
This example just happens to also contain a cross-reference, but many others do not. David Haslam
Hyphens, Discussion
As discussed before under Hyphenation, only five words in the NT use a hyphen/minus, eleven occurrences in total for the whole Bible. Seeing as the text of the KJV module already requires a font that includes the en dash (U+2013), and thus is not restricted to ASCII, I see no reason why we shouldn't replace these hyphen/minus by the proper Unicode character for hyphen, U+2010. The five words are:
- 3 God-ward
- 1 joint-heirs
- 1 thee-ward
- 3 us-ward
- 3 you-ward
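If the change is agreed, the replacement could be limited to exactly these five words, so that no other hyphen/minus in the file (in attribute values, for instance) is touched. A minimal sketch, with a hypothetical function name:

```python
import re

WORDS = ("God-ward", "joint-heirs", "thee-ward", "us-ward", "you-ward")
PATTERN = re.compile(r"\b(" + "|".join(re.escape(w) for w in WORDS) + r")\b")

def use_unicode_hyphen(text: str) -> str:
    """Replace the hyphen/minus in the five listed words with U+2010 HYPHEN."""
    return PATTERN.sub(lambda m: m.group(1).replace("-", "\u2010"), text)

print(use_unicode_hyphen("us-ward") == "us\u2010ward")
```

In the real source the words may be split by w elements, in which case the pattern would need to allow for markup between the two halves.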
Chapter descriptions, Future
The reference edition of the KJV included chapter descriptions (equivalent to USFM tag \cd_text) with a verse number giving the placement of each section thus described. We might consider adding these. In OSIS, the verse numbers should be made into active links.
The verse numbers within chapter descriptions correspond exactly to the use of the Pilcrows in the main text. See below. In fact, it should be feasible to restore the Pilcrows missing from the NT (after Acts of the Apostles) by taking account of this fact.
Pilcrow signs, 2.12
- This section was moved from User:Dmsmith/KJV2011 – it having never been actioned. David Haslam
- In printed editions of the KJV, there is normally a space immediately after the ¶. When viewed with PocketSword or Xiphos, there is no such space.
This is not an artifact of how the SWORD engine handles the OSIS markup. Example:
<verse osisID="Gen.1.6" sID="Gen.1.6"/><milestone type="x-p" marker="¶"/><w lemma="strong:H0430">And God</w> <w morph="strongMorph:TH8799" lemma="strong:H0559">said</w>, ...
Would a simple solution be to change it to marker="¶ ", i.e. with a space after the Pilcrow sign?
- Yes, this is a fine solution. --Dmsmith 21:07, 12 November 2011 (MST)
- Slight complication! – when the verse starts as red letters, the space is already displayed after the Pilcrow. Compare Matthew 22:11 with Matthew 22:15. David Haslam 08:28, 14 November 2011 (MST)
- This is no longer the case! David Haslam 09:49, 11 September 2015 (MDT)
- When red letters are on (the verse starts as red letters), there is a space between the verse tag and the text, but when red letters are off, there is no space between the verse tag and the text. This is a different issue. David Haslam 01:39, 23 January 2016 (MST)
- I think this is in the SWORD engine, as I see this also on PocketSword. David Haslam 01:40, 23 January 2016 (MST)
- On second look, it would be a fine solution, but Xiphos already has special code to add the space. The change should not be made if it is disruptive to Xiphos. --Dmsmith 07:21, 17 November 2011 (MST)
- Surely two spaces (where there are red letters) is preferable to not having any space after the pilcrow elsewhere? David Haslam 12:22, 28 May 2012 (MDT)
- Xiphos 4.0.4 does not add any space after the ¶ so maybe that special code was never implemented? David Haslam 09:46, 11 September 2015 (MDT)
Inserting a space within the Pilcrow marker has been agreed. It should be implemented before the next release. David Haslam 14:28, 24 January 2016 (MST)
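The agreed change amounts to a literal string replacement on the marker attribute. A minimal sketch, with a hypothetical function name:

```python
def add_pilcrow_space(xml: str) -> str:
    """Change marker="¶" to marker="¶ " (with a trailing space) throughout."""
    return xml.replace('marker="¶"', 'marker="¶ "')

sample = ('<verse osisID="Gen.1.6" sID="Gen.1.6"/><milestone type="x-p" marker="¶"/>'
          '<w lemma="strong:H0430">And God</w>')
print(add_pilcrow_space(sample))
```

Running it a second time has no effect, since the search string includes the closing quote directly after the Pilcrow.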
Missing Pilcrows, 2.12
In modern editions of the KJV there are no pilcrows in the NT from Romans through Revelation. The generally accepted view is that the printers ran out of moveable type and just omitted them. Based on the observation that verses with a pilcrow are the same as those that are numbered in chapter descriptions, it's feasible to restore the aforementioned missing pilcrows. The only exception to this rule is that verse 1 in a chapter never has a pilcrow, because the printed editions generally have a large drop capital on the first verse.
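The rule could be applied roughly as follows, assuming the verse numbers from each chapter description have already been collected. The function names are hypothetical, the verse reference in the usage lines is arbitrary and not taken from any chapter description, and the verse tag is assumed to have its attributes in the osisID, sID order.

```python
MILESTONE = '<milestone type="x-p" marker="¶"/>'

def pilcrow_verses(description_verses):
    """Verses that should carry a pilcrow: those in the description, except verse 1."""
    return sorted({v for v in description_verses if v > 1})

def insert_pilcrow(xml: str, osis_id: str) -> str:
    """Add the pilcrow milestone after the verse start tag, unless already present."""
    tag = f'<verse osisID="{osis_id}" sID="{osis_id}"/>'
    if tag + MILESTONE in xml:
        return xml
    return xml.replace(tag, tag + MILESTONE, 1)

# Arbitrary illustration only.
print(pilcrow_verses([1, 8, 16]))
print(insert_pilcrow('<verse osisID="Rom.1.8" sID="Rom.1.8"/>First', "Rom.1.8"))
```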
Colophons
In the Blayney Edition, the colophon at the end of each Pauline Epistle (except for Ephesians) has a Pilcrow. All the colophons are indented on a new line.
Drop capitals
The Blayney edition uses a drop capital at the start of each chapter in place of the tag for verse 1, with the remaining text of the first word shown in ordinary capitals.
One exception (to be confirmed) is in Revelation 1, where it's verse 4 that has the drop capital, etc.
OSIS header
Currently the header contains only the few work elements required for defining features. It should be expanded to provide further useful information.
The description should explain all the x-prefix attribute values used in the body.
We should record file history in the OSIS header using the revisionDesc element.
See also
- KJV2011 – closed wiki page for earlier changes.
- KJV2011 – DM's personal directory has the latest source text
- Benjamin Blayney's 1769 KJV – David's user page started to collate further information.
External links
- A supplement to the authorised English version of the New Testament : being a critical illustration of its more difficult passages from the Syriac, Latin and earlier English versions, (1845) – by Scrivener, Frederick Henry Ambrose, 1813-1891
- An exhaustive listing of the marginal notes of the 1611 edition of the King James Bible – at Literatura Bautista.