Difference between revisions of "User:Dmsmith/KJV 2.6"

From CrossWire Bible Society
(OSIS rdg element: <BR> A few notes will end up with a '''rdg''' element inside a '''rdg''' element. The inner one should be wrapped within a '''seg''' element to be valid OSIS.)
(See also: * Benjamin Blayney's 1769 KJV – David's user page started to collate further information.)
Line 271:
* [[User:Dmsmith/KJV2011|KJV2011]] &ndash; closed wiki page for earlier changes.
* [http://www.crosswire.org/~dmsmith/kjv2011/ KJV2011] &ndash; DM's personal directory has the latest source text
+ * [[User:David Haslam/Benjamin Blayney's 1769 KJV|Benjamin Blayney's 1769 KJV]] &ndash; David's user page started to collate further information.
[[Category:Modules|KJV 2.6]]
[[Category:English Bibles|KJV 2.6]]

Revision as of 10:30, 18 January 2016

This page is for recommended changes to the KJV module version 2.6 (or later).

Punctuation

1 Cor 15:27 The comma at "him, it is" should not be italicized. There should be a comma between "excepted which". --Dmsmith 11:19, 16 February 2014 (MST)

Done --Dmsmith 17:54, 18 February 2014 (MST)

Words of Christ

The Old Scofield only highlights Words of Christ (WoC) as they come directly from his mouth. Not what others say he said. Not translation of what he said, such as translation from Aramaic. In 2.6, there are 3 errors in markup (maybe more, but these are known). Red is what should be WoC, black is currently red, but shouldn't be:

Mat 8:26 And he saith unto them, Why are ye fearful, O ye of little faith? Then he arose, and rebuked the winds and the sea; and there was a great calm.

Mat 19:18 He saith unto him, Which? Jesus said, Thou shalt do no murder, Thou shalt not commit adultery, Thou shalt not steal, Thou shalt not bear false witness,

Act 1:4 And, being assembled together with them, commanded them that they should not depart from Jerusalem, but wait for the promise of the Father, which, saith he, ye have heard of me.

Done--Dmsmith 06:56, 19 February 2014 (MST)

Added words

Added words & punctuation

Split all transChange elements that contain punctuation marks, so that the punctuation and following space are normal text. 1 Cor 15:27 is one example of many such occurrences.

  • Found and fixed:
    • 67 ,
    • 10 ;
    • 6 :
    • 1 ?

--Dmsmith 18:38, 18 February 2014 (MST)
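
The split described above could be scripted along the following lines. This is only a sketch in Python, not the method actually used for 2.6. The function name split_trans_change is made up for illustration, and the code assumes that the transChange element holds plain text only (no nested w elements) and that only the four punctuation marks counted above matter.

```python
import re

# Assumption: only transChange elements with plain-text content are handled.
TC_OPEN = '<transChange type="added">'
TC_CLOSE = '</transChange>'
TC_RE = re.compile(re.escape(TC_OPEN) + r'([^<]*)' + re.escape(TC_CLOSE))
# The four marks counted above, each with any whitespace that follows it.
PUNCT_RE = re.compile(r'([,;:?]\s*)')


def split_trans_change(xml):
    """Move punctuation (and the space after it) out of added-word markup."""
    def repl(match):
        pieces = PUNCT_RE.split(match.group(1))
        out = []
        for i, piece in enumerate(pieces):
            if not piece:
                continue
            # Odd indexes hold the captured punctuation runs: plain text.
            out.append(piece if i % 2 else TC_OPEN + piece + TC_CLOSE)
        return ''.join(out)
    return TC_RE.sub(repl, xml)
```

Elements without punctuation pass through unchanged, so the function can be run over a whole file more than once.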

Added words & Strong's

Review and correct each instance in which a w element for Strong's & Morph is found within a transChange element. The markup probably belongs to the preceding word. The following lists each instance, together with any preceding word(s) that are not contained by a <w> element.

Gen 14.10	<transChange type="added"><w lemma="strong:H0875">was full of</w></transChange>
Exod 15.12	<transChange type="added"><w lemma="strong:H02098">which</w></transChange>
Exod 15.16	<transChange type="added"><w lemma="strong:H02098">which</w></transChange>
Exod 34.19	<transChange type="added"><w lemma="strong:H02142" morph="strongMorph:TH8735">that is male</w></transChange>
Num 1.16	These <transChange type="added"><w lemma="strong:H07148">were</w></transChange>
Num 3.19	These <transChange type="added"><w lemma="strong:H01992">are</w></transChange>
Num 10.28	Thus <transChange type="added"><w lemma="strong:H0428">were</w></transChange>
Num 13.3	<transChange type="added"><w lemma="strong:H01992">were</w></transChange>
Num 14.28	unto them, <transChange type="added"><w lemma="strong:H03808">As truly as</w></transChange>
Num 20.13	This <transChange type="added"><w lemma="strong:H01992">is</w></transChange>
1Sam 30.27	To <transChange type="added"><w lemma="strong:H0834">them</w></transChange>
2Kgs 19.31	<transChange type="added"><w lemma="strong:H06635" morph="strongMorph:TH8675">of hosts</w></transChange>
2Chr.10.16	<transChange type="added"><w lemma="strong:H07200" morph="strongMorph:TH8804">saw</w></transChange>
Ezra 2.65	and <transChange type="added"><w lemma="strong:H0428">there were</w></transChange>
Ps.17.6		unto me <transChange type="added"><w lemma="strong:H08085" morph="strongMorph:TH8798">and hear</w></transChange>
Ps 39.3		<transChange type="added"><w lemma="strong:H0227">then</w></transChange>
Jer 6.14	<transChange type="added"><w lemma="strong:H01323" morph="strongMorph:TH8676">of the daughter</w></transChange>
Jer 28.9	<transChange type="added"><w lemma="strong:H0227">then</w></transChange>
Jer.51.53	<transChange type="added"><w lemma="strong:H0227">yet</w></transChange>
I'll need some help determining how these should be changed, if at all. It may be that the KJV uses italics for a purpose other than "added" words. --Dmsmith 19:33, 18 February 2014 (MST)
Italics is presentational formatting. The transChange element is semantic. David Haslam 05:19, 19 February 2014 (MST)
The 1611 and 1769 editions of the KJV didn't have an eText with semantic markup. Any semantic markup we have deduces the intention of the authors/printers from the orthographic representation. --Dmsmith 07:00, 19 February 2014 (MST)
We submitted this list to David Instone-Brewer and received a detailed reply. David Haslam 06:37, 17 September 2015 (MDT)
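
A list like the one above could be regenerated with a short script. The following Python sketch is illustrative only: the function name w_inside_trans_change is hypothetical, the verse fragment is wrapped in a dummy root element so that it parses, and the OSIS namespace is assumed to be absent (as in the snippets shown here).

```python
import xml.etree.ElementTree as ET


def w_inside_trans_change(fragment):
    """Return (lemma, text) for each w element nested in a transChange."""
    # Wrap the fragment in a dummy root so that it parses as a document.
    root = ET.fromstring('<v>' + fragment + '</v>')
    hits = []
    for tc in root.iter('transChange'):
        for w in tc.iter('w'):
            hits.append((w.get('lemma'), ''.join(w.itertext())))
    return hits
```

For example, the Num 1.16 fragment above yields one hit, with lemma strong:H07148 and text "were".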

Added words & the Divine Name

We found one instance of the Divine Name element within a transChange element. This is probably inappropriate.

The one example is in 2 Chronicles 17:4. It is rendered in italics and small caps. So accordingly it is an added word representing the tetragrammaton. It is represented properly in OSIS. --Dmsmith 17:49, 18 February 2014 (MST)

Tagging the Divine Name

This is much more complicated than we thought. Observations:

  1. The divine name is also tagged within some study notes (even twice within the same note a few times).
  2. More than one Strong's number is involved.
  3. Five instances are in the NT where the Greek word κυριος is tagged.
  4. There is one instance where two Strong's numbers are joined to the divine name.
  5. In many places, there are some English words between the Strong's tag and the divine name tag.
  6. There are places where the divine name is tagged, even though it is within a transChange element (see previous subsection).
  7. The three hyphenated forms of the divine name (Jehovah–jireh, Jehovah–nissi, Jehovah–shalom) are not tagged in the main text, only in the study notes.
  8. The other two hyphenated forms of the divine name (Jehovah–shammah, Jehovah–tsidkenu) occur only in the study notes, where the English word (Lord) is tagged.

Divine Name tagging in the KJV follows the "small caps" orthographic representation of Lord, God, Yah.
As such, it is found in added words and in notes, where it is not associated with Strong's Numbers.
The Strong's Numbers tagged are H3068, H3069, H3072 and H3050. The first is the tetragrammaton. The second and third are variations of it. The last is Yah.
In the NT, the orthographic representation of Lord as the divine name is backed by Greek, not Hebrew.
The instance of two Strong's Numbers being associated with divine name needs to be reviewed. The leading word is "face", often translated "before" or "presence".
Jer.26.19 <w morph="strongMorph:TH8762" lemma="strong:H02470">and besought</w> <w lemma="strong:H06440 strong:H03068">the <divineName>Lord</divineName></w>
Here the leading word is left untranslated.
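
To find any further cases like Jer.26.19, one could scan for w elements that carry more than one Strong's number and also contain a divineName element. A minimal Python sketch follows, assuming namespace-free fragments and using a made-up function name (multi_lemma_divine_names):

```python
import xml.etree.ElementTree as ET


def multi_lemma_divine_names(fragment):
    """List the lemma values of w elements that have two or more
    Strong's numbers and contain a divineName element."""
    # Dummy root element so that a bare verse fragment parses.
    root = ET.fromstring('<v>' + fragment + '</v>')
    hits = []
    for w in root.iter('w'):
        lemmas = (w.get('lemma') or '').split()
        if len(lemmas) > 1 and w.find('.//divineName') is not None:
            hits.append(lemmas)
    return hits
```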

Reference text policy

It may be sensible to review whether we chose the most suitable published text as our reference standard. The most widely accepted one is the Cambridge University Press - Concord Reference Bible.

Need two things: an e-text and permission for the text. (I think the "crown" claims copyright.) --Dmsmith 19:36, 18 February 2014 (MST)
Crown Copyright applies to the Authorised Version per se, not just to those printed by CUP, who are merely one of the licensed printers for all the works that come under Royal Letters Patent. Refer to our Copyright page. David Haslam 05:08, 19 February 2014 (MST)

Hebrew words

Following the addition of Greek words in the NT from the TR in version 2.6, is it planned to do likewise for Hebrew (& Aramaic) words in the OT from the MT ?

It would be wonderful. However, the tagging of Strong's numbers provided a map to the TR in the src="x y" attribute, where that gave the position of the word in the TR. So, the addition of the Greek was trivial. We have nothing like that for the MT. It won't be trivial. Also, the Strong's tagging in the OT is not comprehensive. In any given verse only some of the words from the MT are tagged. In the NT all the TR words were present in the tagging, even if empty (i.e. untranslated.)
We are more likely to update the morphology of the OT first for those that have some kind of morphology today.
--Dmsmith 09:31, 25 February 2014 (MST)

Psalm 119 Acrostic Stanza Titles

The 22 Hebrew letter acrostic titles in Psalm 119 should be displayed before the first verse of each eight-verse stanza. Currently, the next verse tag is displayed before each stanza title. This is incorrect when compared to the KJV printed edition. The mod2imp output for the first such title is:

$$$Psalms 119:1
<title canonical="true" type="acrostic"><foreign n="א">ALEPH.</foreign></title>
<w lemma="strong:H0835">Blessed</w> <transChange type="added">are</transChange> 
<w lemma="strong:H08549">the undefiled</w> 
<w lemma="strong:H01870">in the way</w>, 
<w lemma="strong:H01980" morph="strongMorph:TH8802">who walk</w> 
<w lemma="strong:H08451">in the law</w> 
<w lemma="strong:H03068">of the <divineName>Lord</divineName></w>.
<note type="study">undefiled: or, perfect, or, sincere</note>

Do we need to wrap each stanza title between suitably constructed milestone preverse div elements?

The preverse div should never be constructed in xml. It is created by osis2mod.
Done in version 2.8----Dmsmith 05:33, 20 December 2015 (MST)
Did you just move the titles to before the stanza? David Haslam 05:43, 20 December 2015 (MST)
So far, yes. I'm testing it. I may have to put it in a section div or change osis2mod. If a div, that would be the first instance of a section div, which may add vertical whitespace that is not present elsewhere in the KJV.--Dmsmith 05:46, 20 December 2015 (MST)

Multiple whitespace

Within the text source for 2.7 (kjvfull.xml) there are 39 instances of double spaces (outside the header):

  • 36 are immediately after the "w" in a w element
  • 2 are after a closing " but within a w element
  • 1 is between two w elements; in Phil.4.2, and is displayed by SWORD as
    that they  be of the same mind

The last of these should be corrected.

Done in version 2.8----Dmsmith 05:31, 20 December 2015 (MST)
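
A check of this kind can be sketched in Python as below. It is not the tool used for the counts above. The function name double_spaces is hypothetical, and the test for "inside a tag" is a simple heuristic that compares the positions of the nearest preceding < and >.

```python
import re


def double_spaces(xml):
    """Return (offset, inside_tag) for each run of two or more spaces."""
    results = []
    for m in re.finditer(r' {2,}', xml):
        before = xml[:m.start()]
        # Inside a tag if the nearest '<' comes after the nearest '>'.
        inside_tag = before.rfind('<') > before.rfind('>')
        results.append((m.start(), inside_tag))
    return results
```

Runs reported with inside_tag set to False are the ones that can show up in the displayed text, as in Phil.4.2.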

Missing punctuation in notes

  • There are 3 study notes that contain the abbreviation "Heb" with no full-stop after it. The locations are 2Chr.2.16, Isa.9.20, Jer.13.21.
Done in version 2.8----Dmsmith 05:36, 20 December 2015 (MST)
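
For future audits, notes of this kind can be located with a regular expression. The Python sketch below assumes that study notes appear as <note type="study">…</note> with no nested notes; the function name notes_missing_stop is made up for illustration.

```python
import re

NOTE_RE = re.compile(r'<note type="study">(.*?)</note>', re.S)
# "Heb" as a whole word, not followed by a full-stop.
HEB_NO_STOP = re.compile(r'\bHeb\b(?!\.)')


def notes_missing_stop(xml):
    """Return the text of each study note with a bare "Heb"."""
    return [n for n in NOTE_RE.findall(xml) if HEB_NO_STOP.search(n)]
```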

Add language identifier for foreign element

Suggest add the following attribute to the foreign element in each acrostic title:

xml:lang="hbo"

Refer to OSIS Reference Manual.

Done in version 2.8--Dmsmith 06:18, 20 December 2015 (MST)
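
For reference, the attribute could be added by a substitution such as the following. This Python sketch is not the change actually made in 2.8. The function name add_acrostic_lang is hypothetical, and the pattern assumes that the foreign element directly follows the opening acrostic title tag, as in the Psalm 119 example above.

```python
import re

# Match '<foreign' right after an acrostic title tag, unless the
# foreign element already carries an xml:lang attribute.
ACROSTIC_FOREIGN = re.compile(
    r'(<title\b[^>]*type="acrostic"[^>]*>\s*<foreign\b)(?![^>]*xml:lang)')


def add_acrostic_lang(xml, lang='hbo'):
    """Insert xml:lang into the foreign element of each acrostic title."""
    return ACROSTIC_FOREIGN.sub(
        lambda m: m.group(1) + ' xml:lang="' + lang + '"', xml)
```

The negative lookahead makes the substitution safe to repeat.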

Should also identify and add other foreign elements.--Dmsmith 06:18, 20 December 2015 (MST)

MENE, MENE, TEKEL, UPHARSIN (what language code?) David Haslam 10:00, 20 December 2015 (MST)

Hyphens

As discussed before under Hyphenation, only five words in the NT use a hyphen/minus, eleven occurrences in total for the whole Bible. Seeing as the text of the KJV module already requires a font that includes the en dash (U+2013), and thus is not restricted to ASCII, I see no reason why we shouldn't replace each hyphen/minus (U+002D) with the proper Unicode character for hyphen, U+2010. The five words are:

3	God-ward
1	joint-heirs
1	thee-ward
3	us-ward
3	you-ward
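
If this proposal is adopted, the replacement could be limited to the five listed words so that no other hyphen/minus (for instance in attribute values) is touched. A minimal Python sketch, with a hypothetical function name:

```python
# The five hyphenated words listed above.
HYPHENATED = ['God-ward', 'joint-heirs', 'thee-ward', 'us-ward', 'you-ward']


def replace_hyphens(text):
    """Replace U+002D with U+2010 in the five listed words only."""
    for word in HYPHENATED:
        text = text.replace(word, word.replace('-', '\u2010'))
    return text
```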

Selah

There are 75 instances of the whole word "Selah" in the KJV.[1] The first is in II Kings 14:7. The rest are found in Psalms (71) and Habakkuk (3).
Of those in Psalms, these 13 locations have the peculiarity that the Strong's markup includes other words besides Selah.

    <w lemma="strong:H05542">thereof. Selah</w>.
    <w lemma="strong:H05542">me. Selah</w>.
    <w lemma="strong:H05542">himself. Selah</w>.
    <w lemma="strong:H05542">before them. Selah</w>.
    <w lemma="strong:H05542">for us. Selah</w>.
    <w lemma="strong:H05542">themselves. Selah</w>.
    <w lemma="strong:H05542">upon us; Selah</w>.
    <w lemma="strong:H05542">of it. Selah</w>.
    <w lemma="strong:H05542">thee. Selah</w>.
    <w lemma="strong:H05542">there. Selah</w>.
    <w lemma="strong:H05542">thee? Selah</w>.
    <w lemma="strong:H05542">for me. Selah</w>.
    <w lemma="strong:H05542">themselves. Selah</w>.

These words & punctuation do not belong properly to the "Selah" but are part of the preceding sentence or phrase. It may therefore be sensible to convert them like this:

    thereof. <w lemma="strong:H05542">Selah</w>.
    me. <w lemma="strong:H05542">Selah</w>.
    himself. <w lemma="strong:H05542">Selah</w>.
    before them. <w lemma="strong:H05542">Selah</w>.
    for us. <w lemma="strong:H05542">Selah</w>.
    themselves. <w lemma="strong:H05542">Selah</w>.
    upon us; <w lemma="strong:H05542">Selah</w>.
    of it. <w lemma="strong:H05542">Selah</w>.
    thee. <w lemma="strong:H05542">Selah</w>.
    there. <w lemma="strong:H05542">Selah</w>.
    thee? <w lemma="strong:H05542">Selah</w>.
    for me. <w lemma="strong:H05542">Selah</w>.
    themselves. <w lemma="strong:H05542">Selah</w>.

This issue was already communicated by email on 2015-09-09. David Haslam

It can be readily fixed by a simple Perl replacement, thus: Perl pattern [(<w lemma="strong:H05542">)(.+)(Selah</w>)] with [$2$$1$$3]

  [X] Match case
  Maximum text buffer size 4096
  [ ] Maximum match (greedy)
  [ ] Allow comments
  [ ] '.' matches newline
  [X] UTF-8 Support
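
The same idea can be expressed in Python, as sketched below. This is an assumed equivalent, not the replacement that was actually run. The function name move_selah is made up, and the middle group is written as [^<]+? rather than .+ so that a match cannot run past the end of one w element into the next.

```python
import re

SELAH_RE = re.compile(r'(<w lemma="strong:H05542">)([^<]+?)(Selah</w>)')


def move_selah(xml):
    """Move the words before "Selah" out of the Strong's-tagged element."""
    # Emit the leading words first, then the opening tag, then Selah.
    return SELAH_RE.sub(r'\2\1\3', xml)
```

Elements that already contain only "Selah" are left alone, because the middle group needs at least one character.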

David Haslam

Note:

  1. Although OSIS defines an attribute value type="selah", this only applies to the poetry line element l, none of which are used in the KJV.

Punctuation and Strongs

A much more general issue was also reported. Namely, tagged w elements that span beyond the end of a sentence or phrase. Many of these can be identified by the fact that the spanned text includes at least one terminating punctuation mark [.,;:!?)]. Some of these even contain two or more such punctuation marks, so devising a regexp is a bit fraught. Moreover, for some of those that have a comma, it may be perfectly valid to include the preceding word[s]. Less likely for the other punctuation marks.

Searching for different regexps such as [>.+\?.+</w>] I counted the following:

Count    Punctuation mark
 219     Full-stop       
7646     Comma (of which 444 have two or more commas)
1215     Colon
1064     Semicolon
  22     Exclamation mark
 254     Question mark
  11     Right parenthesis
  13     Left parenthesis (all these also contain another pm)

It's often the case that the English word that matches the Strong's tag is the last word before the </w>. Even so, I have not proven that this applies to 100% of the above patterns.

This issue was also reported by email on 2015-09-10. David Haslam
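
A rough way to reproduce counts like these is sketched below in Python. It is an assumption-laden approximation rather than the search that produced the table: it only looks at w elements with plain-text content (so those containing a divineName child are skipped), and it counts a mark wherever it occurs in the tagged text, whereas the regexp quoted above requires text on both sides of the mark. Totals may therefore differ. The function name punctuation_in_w is hypothetical.

```python
import re
from collections import Counter

# w elements whose content is plain text only.
W_TEXT = re.compile(r'<w\b[^>]*>([^<]*)</w>')
MARKS = '.,;:!?)('


def punctuation_in_w(xml):
    """Count, per mark, the w elements whose text contains that mark."""
    counts = Counter()
    for text in W_TEXT.findall(xml):
        for mark in MARKS:
            if mark in text:
                counts[mark] += 1
    return counts
```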

Parsing the study notes

These subsections document some of my analyses of the KJV study notes, as well as proposing ways in which we might improve them using standard OSIS markup. David Haslam

Notes appertaining to Psalm titles

Study notes appertaining to text within a Psalm title are currently placed at the end of verse 1, just like any other note. To prevent these notes being orphaned when headings are hidden, it is proposed to move these notes to within the title element. As some Psalms also have one or more notes appertaining to text within verse 1, this change will require careful manual editing, rather than automating by a script.

OSIS catchWord element

Most of the study notes in the KJV source text have recognisable catch words. These should be marked up using the OSIS catchWord element. e.g.

<note type="study">the light from…: Heb. between the light and between the darkness</note>

should become

<note type="study"><catchWord>the light from…</catchWord>: Heb. between the light and between the darkness</note>

The catchWord element would be added by pattern matching to the first colon in the note text.
Out of 6959 study notes, there are 147 notes with more than a single colon, 140 of which have ": or, ".
David Haslam
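
The pattern matching described above might look like the following Python sketch. The function name add_catch_word is made up for illustration. The code wraps everything before the first colon, so notes with more than one colon (and the II Samuel 23:8 note, which begins with a reference) would still need a manual check.

```python
import re

# Text between the opening note tag and the first colon.
NOTE_HEAD = re.compile(r'(<note type="study">)([^<:]+)(:)')


def add_catch_word(xml):
    """Wrap the text before the first colon of each study note."""
    return NOTE_HEAD.sub(r'\1<catchWord>\2</catchWord>\3', xml)
```

Notes that already start with a catchWord element are not matched again, so the function can be re-run safely.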

OSIS rdg element

Many of the study notes record alternative or more literal renderings of the Hebrew, Chaldee[1] or Greek[2] text.
We might wish to wrap all such readings within the OSIS rdg element. e.g.

<note type="study">And the evening…: Heb. And the evening was, and the morning was etc.</note>

would become

<note type="study"><catchWord>And the evening…</catchWord>: Heb. <rdg>And the evening was, and the morning was</rdg> etc.</note>

Proposed type attributes for the rdg element:

  • type="alternative"[3] – used when the subsequent text was introduced by ", or, "[4]
  • type="x-equivalent" – used when the note gives the Gr. equivalent (LXX ?) to a Hebrew name.
  • type="x-identity" – used when the note has " also called, " (sometimes without a comma).
  • type="x-literal" – used when the note gives the Heb. translation more literally than the main text.
  • type="x-meaning" – used when the note explains the meaning of a Heb. or Chaldee name or word. Sometimes introduced by " that is, "

The earlier example would then become:

<note type="study"><catchWord>And the evening…</catchWord>: Heb. <rdg type="x-literal">And the evening was, and the morning was</rdg> etc.</note>

We might also include the xml:lang attribute using a language code for each of the three Biblical languages.
A few notes will end up with a rdg element inside a rdg element. The inner one should be wrapped within a seg element to be valid OSIS.

Notes:

  1. i.e. Biblical Aramaic.
  2. Among the OT notes there are 35 instances of "Gr. " with equivalent names to the Hebrew.
  3. As documented in the OSIS 2.1.1 Reference Manual.
  4. Of the 2672 notes containing this string, there are 125 notes that contain it twice, and 1 note that has it thrice (see below).
<note type="study">a beacon: or, a tree bereft of branches, or, boughs: or, a mast</note>

David Haslam

Punctuation

I doubt if anyone has ever thoroughly audited the punctuation in the KJV study notes against the standard reference printed edition.
e.g. There are 12 instances of " also called " without a comma, compared to 143 instances of " also called, " with the comma.
David Haslam

Cross references?

Sadly lacking from our KJV module are any scripture cross-references. Many printed editions of the AV contain such references. We should explore how the module might be enhanced by obtaining the data from a suitable electronic source.

[1] is of interest in this context, but see the foot of [2] which describes the sources of the data.

The 1769 edition of the KJV included Benjamin Blayney's cross-references. Many of the OT references therein were to the Deuterocanonical books. See also [3].

Note tag placement

My late mother's 1936 Collins edition of the KJV has centre margins with notes and cross-references. Of particular interest is that the cross-reference tags are italicised lowercase superscript letters and the note tags are superscripted integers. These are positioned at the start of each word being referenced. This practice differs from how many of our modules are marked up, where the tag is often placed at the end of the word being referenced. David Haslam

One cross-reference already

The note within II Samuel 23:8 already contains a cross-reference! It should be converted to a proper OSIS xref, and the conf file should be updated to include GlobalOptionFilter=OSISScripref.

<note type="study">1ch 11:11 he lift…: from whom he…: Heb. slain</note>

This note appears to have two "catchWords". It's the only study note that has more than 2 colons. David Haslam

NT margin notes

The KJV module has study notes only in the OT. We should find a source text that also contains all the margin notes found in the NT. e.g.

Matt.1.11@s[Josias] Some read, Josias begat Jakim, and Jakim begat Jechonias. 1 Chr. 3.15.

This example just happens to also contain a cross-reference, but many others do not. David Haslam

See also