User talk:Dmsmith/KJV 2.6

From CrossWire Bible Society

Page name

It may be better to rename this page as KJV2014, so that it can also cover changes after the next release this year. David Haslam 12:35, 18 February 2014 (MST)

I hope to do releases more often than once a year. It was too long between releases. Not sure that this is the best name, but I wanted a page to track what needed to be done and what was done.--Dmsmith 16:49, 18 February 2014 (MST)
Indent to reply, not bullet! David Haslam 05:16, 19 February 2014 (MST)

Analysis

David Haslam has generated the following counted lists:

  1. all transChange elements
  2. proper names hyphenated using the ndash
  3. words containing the Ææ graphemes
  4. possessive words punctuated with the single right quotation mark
  5. words tagged as the divine name
  6. extracts which include the tagged divine name

Revision History

Revision  Date        Description
2.9       2016-01-21  Added markup to notes. Improved markup of Selah.
2.8       2015-12-20  Moved Ps 119 acrostic titles before verse number. Added Feature for no paragraphs.
2.7       2015-08-09  Fixed bugs preventing the display of some Strong's Numbers.
2.6.1     2014-02-15  Added GlobalOptionFilter for OSISLemma.
2.6       2013-10-05  Fixed bugs. Added Greek from TR.
2.5       2013-02-02  Fixed bugs.
2.4       2009-05-29  Fixed bugs.
2.3       2006-10-09  Fixed bugs.
2.2       2004-01-21  Updated to 20040121 snapshot of KJV2003.
2.1       2003-06-24  Changed Old Testament to use OSIS tags, removing the last of the GBF markup. Also updated to 20030624 snapshot of KJV2003. Compressed.
2.0       early 2003  Changed New Testament to use a snapshot of the KJV2003 Project.


Cross references

This may be of interest. http://www.newkreation.com/bible/bible.php described as "49,775 Cross-References / The Bible cross references have been dutifully copied from 1914 A. J. Holman Company's Holman Home Bible in the public domain." David Haslam 07:44, 22 December 2015 (MST)

This resource can be installed to be read by the iSilo app for various platforms. The data is stored in PDB format. David Haslam 12:28, 18 January 2016 (MST)
The iSilo app for Windows (30 day free trial) has an Edit option to Copy Entire Document. I just did that and saved the data as a UTF-8 text file. David Haslam 12:46, 18 January 2016 (MST)

The technical challenge would be to distinguish reference tags from the words they are prefixed to.

Genesis 1
1 In athe beginning bGod created the heaven and the earth. 2 And the earth was cwithout form, and void; and darkness was upon the face of the deep. dAnd the Spirit of God moved upon the face of the waters. 

3 eAnd God said, Let there be light: and there was light. 4 And God saw the light, that it was good: and God divided the light from the darkness. 5 And God called the light fDay, and the darkness he called Night. And the evening and the morning were the first day. 

6 And God said, gLet there be a firmament in the midst of the waters, and let it divide the waters from the waters. 7 And God made the firmament, and divided the waters which were under the firmament from the waters which were above the firmament: and it was so. 8 And God called the firmament Heaven. And the evening and the morning were the second day. 
Only within the app are they superscripted italic letters. I've just left a message with the webmaster enquiring about the source text availability. David Haslam 13:48, 18 January 2016 (MST)
We might explore what can be done with pdbparse. David Haslam
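
To make the challenge concrete, here is a minimal Python sketch of one possible heuristic for separating a reference letter from the word it is glued to in the copied text. It is only an illustration, not something that has been run against the real data. The function name split_ref_tag and the vocabulary argument are hypothetical; the vocabulary is assumed to be a set of lowercase words taken from a clean KJV text without cross-reference letters.

```python
import re

def split_ref_tag(token, vocabulary):
    """Return (tag, word); tag is '' when no reference letter is detected.

    vocabulary is an assumed set of lowercase words from a clean text.
    """
    # Case 1: a lowercase letter glued to a capitalised word, e.g. "bGod".
    if re.match(r"[a-z][A-Z]", token):
        return token[0], token[1:]
    # Case 2: the token is not a known word, but becomes one once its
    # first letter is removed, e.g. "athe" -> "a" + "the".
    core = re.sub(r"[^A-Za-z']", "", token).lower()
    if (len(core) > 1 and token[0].islower()
            and core not in vocabulary and core[1:] in vocabulary):
        return token[0], token[1:]
    return "", token
```

A heuristic like this cannot resolve genuinely ambiguous tokens: "aloud" could be the word itself or the tag "a" followed by "loud", and the sketch always prefers the known word. Such cases would still need checking by hand or against positional information from the source.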

Here's another online resource that has cross-references. https://github.com/scrollmapper/bible_databases
David Haslam 08:07, 22 December 2015 (MST)

Enhanced OSIS markup of the study notes

During the past few days, I developed a TextPipe filter to add the catchWord and rdg elements to the KJV study notes.
This left the following 15 note elements without a rdg element.

<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the caul</catchWord>: it seemeth by anatomy, and the Hebrew doctors, to be the midriff</note>
<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the nations</catchWord>: the nations that warred against Israel</note>
<note type="study"><catchWord>any of…</catchWord>: any of the judges</note>
<note type="study">1ch 11:11 he lift…: from whom he…: Heb. slain</note>
<note type="study"><catchWord>his reign</catchWord>: Nebuchadnezzar’s eighth year</note>
<note type="study"><catchWord>behemoth</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>leviathan</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>where…</catchWord>: without water</note>
<note type="study"><catchWord>inflame</catchWord>: of, pursue</note>
<note type="study"><catchWord>battering…</catchWord>: chief leaders</note>
<note type="study"><catchWord>against thee</catchWord>: of, of thee</note>
<note type="study"><catchWord>O king…</catchWord>: (Chaldee, to the end of chapter seven)</note>
<note type="study"><catchWord>I will…</catchWord>: rather, Where is thy king?</note>

NB. I suspected that the 2 notes containing ": of, " were typos that should be ": or, ".

David Haslam 05:37, 19 January 2016 (MST)

  • ": of, pursue" in Isa.5.11 should be ": or, pursue"
  • ": of, of thee" in Ezek.33.30 should be ": or, of thee"

I have made these two corrections. We then had only 13 notes without a rdg element:

<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the caul</catchWord>: it seemeth by anatomy, and the Hebrew doctors, to be the midriff</note>
<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the nations</catchWord>: the nations that warred against Israel</note>
<note type="study"><catchWord>any of…</catchWord>: any of the judges</note>
<note type="study">1ch 11:11 he lift…: from whom he…: Heb. slain</note>
<note type="study"><catchWord>his reign</catchWord>: Nebuchadnezzar’s eighth year</note>
<note type="study"><catchWord>behemoth</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>leviathan</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>where…</catchWord>: without water</note>
<note type="study"><catchWord>battering…</catchWord>: chief leaders</note>
<note type="study"><catchWord>O king…</catchWord>: (Chaldee, to the end of chapter seven)</note>
<note type="study"><catchWord>I will…</catchWord>: rather, Where is thy king?</note>
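
A list like the one above could be regenerated at any time with a short script. The following Python sketch is not the TextPipe filter that was actually used; it is a hedged equivalent that assumes each study note sits in a single `<note type="study">…</note>` element with no nested note elements. The function name notes_without_rdg is hypothetical.

```python
import re

# Assumes study notes are written exactly as <note type="study">...</note>
# and are never nested inside one another.
NOTE_RE = re.compile(r'<note type="study">(.*?)</note>', re.DOTALL)

def notes_without_rdg(osis_text):
    """Return every study note whose content has no rdg element."""
    return [
        match.group(0)
        for match in NOTE_RE.finditer(osis_text)
        if "<rdg" not in match.group(1)
    ]
```

Calling notes_without_rdg on the text of the XML file and printing the result one item per line would give a list in the same form as the one shown here.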

Some of these have since been changed to have a rdg element with no attribute.

This task has now been substantially completed, and the converted XML file was resent to DM today.

Summary statistics:

  • 6958 catchWord elements
  • 7404 rdg elements
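
Counts like these can be reproduced with a few lines of Python. This is only a sketch using the standard library parser, not the tool that produced the figures above; the function name count_elements is hypothetical, and the namespace prefix is stripped so that the same code works whether or not the file declares the OSIS namespace.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def count_elements(xml_text, names):
    """Count elements whose local name is in names, ignoring namespaces."""
    counts = Counter()
    for element in ET.fromstring(xml_text).iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop "{namespace}"
        if local_name in names:
            counts[local_name] += 1
    return counts
```

For a file the size of a whole Bible, ET.iterparse on the file itself may be preferable to loading the text into memory first.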

We have done away with the seg workaround. DM will use a modified OSIS schema for validation.

David Haslam 05:08, 21 January 2016 (MST)

catchWord osisRef

I have begun a task to add the osisRef attribute to many of the catchWord elements in the study notes. For the time being, catchWord text that has more than one word is not included. Neither are catchWord elements in the notes for canonical Psalm titles. Making good progress...

KJV (OT) catchWord statistics:

  • 6958 = total, of which
  • 5242 = do not contain a space
  • 5958 = now have osisRef (1 of which requires reverting)[1]
  • 1000 = are still without osisRef, of which
  • 0000 = are keywords in verse 1 for Psalms with titles[2]

Notes:

  1. <catchWord osisRef="Num.26.39@s[Shupham…Hupham]">Shupham…Hupham</catchWord>
  2. These are now included because I fixed the XML file such that these lines now start with a verse sID milestone.

The above exception might be better coded as an osisRef range:

<catchWord osisRef="Num.26.39@s[Shupham]-Num.26.39@s[Hupham]">Shupham…Hupham</catchWord>
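
The rule being applied here (single words get a plain @s[...] reference, multi-word text is skipped for now, and a "first…last" pair could become a range) can be sketched in Python as follows. This is an illustration under those assumptions only; the function name catchword_osisref is hypothetical, and the range form is the proposal above, not an agreed convention.

```python
ELLIPSIS = "\u2026"  # the "…" character used in truncated catchWords

def catchword_osisref(verse_id, catchword):
    """Build an osisRef value for a catchWord, or None if it is skipped."""
    text = catchword.strip()
    if ELLIPSIS in text:
        first, _, last = text.partition(ELLIPSIS)
        first, last = first.strip(), last.strip()
        # Only "word…word" becomes a range; "any of…" and similar are skipped.
        if first and last and " " not in first and " " not in last:
            return f"{verse_id}@s[{first}]-{verse_id}@s[{last}]"
        return None
    if not text or " " in text:
        return None  # multi-word catchWords are not handled yet
    return f"{verse_id}@s[{text}]"
```

With this sketch, catchword_osisref("Num.26.39", "Shupham…Hupham") yields the range form shown above.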

David Haslam 05:16, 25 January 2016 (MST)

Related discussion

fine grain operators

The … was probably an artifact of typesetting margin notes: limited space resulted in truncation. I like the idea of a range. Regarding Psalm titles, you can double check the reference by dumping with mod2imp. --Dmsmith 07:48, 24 January 2016 (MST)

The range concept could work in a lot of places, but what happens when the range end word is also found earlier in the verse text than the range begin word? The @s[word] operator finds only the first occurrence of the word in question. I expect there are many such examples. David Haslam 10:42, 24 January 2016 (MST)
I think it should be easy to understand that a range is anchored at the start and follows to the end. I'd argue that @s[a]-@s[b] implies that a and b do not reoccur within the range. Maybe what is lacking is a secondary subscript: @s[a] and @s[a][1] would both mean the first occurrence, while @s[a][3] would mean the third occurrence. Since OSIS is targeted at commentaries as well as Bibles, it should be expected that the text referenced by an osisID is large. OSIS has another operator that gives the counted position of the match. I think it counts code points though. IIRC: cp(n). It'd be better to have one that counted words, e.g. wc(n). The further complication is that @s[asdf] might be a substring match rather than a whole word match. --Dmsmith 11:58, 24 January 2016 (MST)
That's a good point about substrings. We need to develop a tighter specification for the @s[text] operator, but one that gives much more flexibility of use, while retaining precision for word locations. David Haslam 07:14, 25 January 2016 (MST)
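
As a starting point for such a specification, the ideas discussed in this thread can be prototyped in Python. The sketch below is purely illustrative: the occurrence argument models the proposed @s[a][n] subscript, which is not part of OSIS as far as this discussion establishes, and the names find_word and find_range are hypothetical. Matching is whole-word, and the end of a range is searched for only after the start match, so an earlier occurrence of the end word is ignored.

```python
import re

def _whole_word(word):
    # Match word only when it is not embedded in a longer word.
    return re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)")

def find_word(verse_text, word, occurrence=1):
    """Return the offset of the nth whole-word match, or None."""
    matches = _whole_word(word).finditer(verse_text)
    for count, match in enumerate(matches, start=1):
        if count == occurrence:
            return match.start()
    return None

def find_range(verse_text, first, last, occurrence=1):
    """Return (start, end) from first to the next last after it, or None."""
    start = find_word(verse_text, first, occurrence)
    if start is None:
        return None
    # Search for the end word only after the start word.
    match = _whole_word(last).search(verse_text, start + len(first))
    return (start, match.end()) if match else None
```

In "the waters from the waters", a range from "from" to "waters" then ends at the second "waters", even though the word also occurs before "from". A word such as "the" does not match inside "there".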