Difference between revisions of "User talk:Dmsmith/KJV 2.6"

From CrossWire Bible Society
m (Enhanced OSIS markup of the study notes: substantially)
(fine grain operators: reply)
 
(36 intermediate revisions by 2 users not shown)
Line 13: Line 13:
 
# extracts which include the tagged divine name
 
  
== Authoritative reference text for red letter markup ==
+
== Revision History ==
 +
{|  width="100%" border="1" class="sortable"
 +
! width="5%"|Revision
 +
! width="10%"|Date
 +
! width="85%"|Description
 +
|-
 +
| 2.9
 +
| 2016-01-21
 +
| Added markup to notes. Improved markup of Selah.
 +
|-
 +
| 2.8
 +
| 2015-12-20
 +
| Moved Ps 119 acrostic titles before verse number. Added Feature for no paragraphs.
 +
|-
 +
| 2.7
 +
| 2015-08-09
 +
| Fixed bugs preventing the display of some Strong's Numbers.
 +
|-
 +
| 2.6.1
 +
| 2014-02-15
 +
| Added GlobalOptionFilter for OSISLemma
 +
|-
 +
| 2.6
 +
| 2013-10-05
 +
| Fixed bugs. Added Greek from TR.
 +
|-
 +
| 2.5
 +
| 2013-02-02
 +
| Fixed bugs.
 +
|-
 +
| 2.4
 +
| 2009-05-29
 +
| Fixed bugs.
 +
|-
 +
| 2.3
 +
| 2006-10-09
 +
| Fixed bugs.
 +
|-
 +
| 2.2
 +
| 2004-01-21
 +
| Updated to 20040121 snapshot of KJV2003.
 +
|-
 +
| 2.1
 +
| 2003-06-24
 +
| Changed Old Testament to use OSIS tags, removing the last of the GBF markup.  Also updated to 20030624 snapshot of KJV2003.  Compressed.
 +
|-
 +
| 2.0
 +
| early 2003
 +
| Changed New Testament to use a snapshot of the KJV2003 Project
 +
|}
  
Since when did the Old Scofield become our reference text for this purpose? cf. [https://en.wikipedia.org/wiki/Red_letter_edition]. [[User:David Haslam|David Haslam]] 05:50, 19 February 2014 (MST)
+
<references />
: During my earlier efforts, I found that there were lots of variations among the texts. I wanted one that the KJV Only adherents felt was a more accurate 1769 text. I found various listings of differences between current and "true" texts and examined dozens of dead tree texts in several stores. Both combined led me to the Old Scofield. Also, it was important to avoid copyright claims based upon minor changes in the text. Additionally, I worked with Tim Lanfear, who was doing the CCEL KJV, to compare various eTexts. Those comparisons yielded differences that needed to be verified in an independent text. Thus the need for *a* dead tree text. While I was working on this, several websites that were dedicated to producing a "true" text were abandoned with chagrin, having found that it is not a doable task apart from having a facsimile of the 1769, which is not known.
 
: Regarding Red Letters, I don't have a better resource. I'd love the "original" by the person who first did it, or a greater authority. What I have is a consistent authority. In doing the verification of the markup, I flipped through every page of the Old Scofield and compared it visually to BibleDesktop's display of the module. I also reviewed lists which various people provided me. Most were regarding words attributed to Jesus.
 
:--[[User:Dmsmith|Dmsmith]] 06:29, 19 February 2014 (MST)
 
 
 
== KJV 2.7 ==
 
 
 
Version 2.7 was released on 2015-08-29.
 
 
 
History_2.7=Fixed bugs preventing the display of some Strong's Numbers.
 
 
 
''No need to rename this planning page''. [[User:David Haslam|David Haslam]] 06:42, 17 September 2015 (MDT)
 
 
 
== KJV 2.8 ==
 
Version 2.8 was released on 2015-12-20.
 
History_2.8=Moved Ps 119 acrostic titles before verse number. Added Feature for no paragraphs.
 
[[User:David Haslam|David Haslam]] 07:46, 22 December 2015 (MST)
 
  
 
== Cross references ==
 
Line 55: Line 88:
 
== Enhanced OSIS markup of the study notes ==
 
  
During the past few days, I developed a TextPipe filter to add the '''catchWord''' and '''rdg''' elements to the KJV study notes.<BR>This task has now been substantially completed, and the converted XML file was resent to DM today.
+
During the past few days, I developed a TextPipe filter to add the '''catchWord''' and '''rdg''' elements to the KJV study notes.<BR>
 
 
Summary statistics:
 
* 6958 '''catchWord''' elements
 
* 7403 '''rdg''' elements (with 19 that are wrapped in '''seg''')
 
 
This left the following 15 '''note''' elements without a '''rdg''' element.
 
<pre>
Line 78: Line 107:
<note type="study"><catchWord>I will…</catchWord>: rather, Where is thy king?</note>
</pre>
NB. I suspect that the 2 notes containing ": of, " are typos that should be ": or, ".
+
NB. I suspected that the 2 notes containing ": of, " are typos that should be ": or, ".
  
 
[[User:David Haslam|David Haslam]] 05:37, 19 January 2016 (MST)
 
 +
:* ": of, pursue" in Isa.5.11 should be ": or, pursue"
 +
:* ": of, of thee" in Ezek.33.30 should be ": or, of thee"
 +
I have made these two corrections. We then had only 13 notes without a '''rdg''' element:
 +
<pre>
 +
<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
 +
<note type="study"><catchWord>the caul</catchWord>: it seemeth by anatomy, and the Hebrew doctors, to be the midriff</note>
 +
<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
 +
<note type="study"><catchWord>the nations</catchWord>: the nations that warred against Israel</note>
 +
<note type="study"><catchWord>any of…</catchWord>: any of the judges</note>
 +
<note type="study">1ch 11:11 he lift…: from whom he…: Heb. slain</note>
 +
<note type="study"><catchWord>his reign</catchWord>: Nebuchadnezzar’s eighth year</note>
 +
<note type="study"><catchWord>behemoth</catchWord>: probably an extinct animal of some kind</note>
 +
<note type="study"><catchWord>leviathan</catchWord>: probably an extinct animal of some kind</note>
 +
<note type="study"><catchWord>where…</catchWord>: without water</note>
 +
<note type="study"><catchWord>battering…</catchWord>: chief leaders</note>
 +
<note type="study"><catchWord>O king…</catchWord>: (Chaldee, to the end of chapter seven)</note>
 +
<note type="study"><catchWord>I will…</catchWord>: rather, Where is thy king?</note>
 +
</pre>
 +
Some of these have since been changed to have a rdg element with no attribute.
 +
 +
This task has now been substantially completed, and the converted XML file was resent to DM today.
 +
 +
Summary statistics:
 +
* 6958 '''catchWord''' elements
 +
* 7404 '''rdg''' elements
 +
 +
We have done away with the seg workaround. DM will use a modified OSIS schema for validation.
 +
 +
[[User:David Haslam|David Haslam]] 05:08, 21 January 2016 (MST)
 +
 +
== catchWord osisRef  ==
 +
I have begun a task to add the '''osisRef''' attribute to many of the '''catchWord''' elements in the study notes. For the time being, catchWord text that has more than one word is not included. Neither are catchWord elements in the notes for canonical Psalm titles. Making good progress...
 +
 +
'''KJV (OT) catchWord statistics''':
 +
* 6958 = total, of which
 +
* 5242 = do not contain a space
 +
* 5958 = now have osisRef (1 of which requires reverting)<ref><code><catchWord osisRef="Num.26.39@s[Shupham…Hupham]">Shupham…Hupham</catchWord></code></ref>
 +
* 1000 = are still without osisRef, of which
 +
* 0000 = are keywords in verse 1 for Psalms with titles<ref>These are now included because I fixed the XML file such that these lines now start with a verse sID milestone.</ref>
 +
 +
'''Notes:'''
 +
<references/>
 +
 +
The above exception might be better coded as an osisRef range:
 +
 +
::<code><catchWord osisRef="Num.26.39@s[Shupham]-Num.26.39@s[Hupham]">Shupham…Hupham</catchWord></code>
 +
 +
[[User:David Haslam|David Haslam]] 05:16, 25 January 2016 (MST)
 +
=== Related discussion ===
 +
==== fine grain operators ====
 +
The … was probably an artifact of typesetting margin notes. Limited space, resulting in truncation. I like the idea of a range. Regarding Psalm titles, you can double check the reference by dumping with mod2imp. --[[User:Dmsmith|Dmsmith]] 07:48, 24 January 2016 (MST)
 +
:The range concept could work in a lot of places, but what happens when the range end word is also found earlier in the verse text than the range begin word? The '''@s[word]''' operator finds only the first occurrence of the word in question. I expect there are many such examples. [[User:David Haslam|David Haslam]] 10:42, 24 January 2016 (MST)
 +
:: I think it should be easy to understand that a range is anchored at the start and follows to the end. I'd argue that the @s[a]-@s[b] implies that a and b do not reoccur in the range. Maybe it is lacking a secondary subscript as in @s[a] means the first occurrence, also @s[a][1] means the first occurrence, but @s[a][3] means the third occurrence. Since OSIS is targeted to commentaries as well as Bibles, it should be expected that text referenced by an osisID is large. OSIS has another operator that gives the counted position of the match. I think it counts code points though. IIRC: cp(n). It'd be better to have one that counted words, e.g. wc(n). The further complication is that @s[asdf] might be a substring match rather than a whole word match. --[[User:Dmsmith|Dmsmith]] 11:58, 24 January 2016 (MST)
 +
:::That's a good point about substrings. We need to develop a tighter specification for the '''@s[text]''' operator, but one that gives much more flexibility of use, while retaining precision for word locations. [[User:David Haslam|David Haslam]] 07:14, 25 January 2016 (MST)

Latest revision as of 14:14, 25 January 2016

Page name

It may be better to rename this page as KJV2014, so that it can also cover changes after the next release this year. David Haslam 12:35, 18 February 2014 (MST)

I hope to do releases more often than once a year. It was too long between releases. Not sure that this is the best name, but I wanted a page to track what needed to be done and what was done.--Dmsmith 16:49, 18 February 2014 (MST)
Indent to reply, not bullet! David Haslam 05:16, 19 February 2014 (MST)

Analysis

David Haslam has generated the following counted lists:

  1. all transChange elements
  2. proper names hyphenated using the ndash
  3. words containing the Ææ graphemes
  4. possessive words punctuated with the single right quotation mark
  5. words tagged as the divine name
  6. extracts which include the tagged divine name

Revision History

Revision  Date        Description
2.9       2016-01-21  Added markup to notes. Improved markup of Selah.
2.8       2015-12-20  Moved Ps 119 acrostic titles before verse number. Added Feature for no paragraphs.
2.7       2015-08-09  Fixed bugs preventing the display of some Strong's Numbers.
2.6.1     2014-02-15  Added GlobalOptionFilter for OSISLemma
2.6       2013-10-05  Fixed bugs. Added Greek from TR.
2.5       2013-02-02  Fixed bugs.
2.4       2009-05-29  Fixed bugs.
2.3       2006-10-09  Fixed bugs.
2.2       2004-01-21  Updated to 20040121 snapshot of KJV2003.
2.1       2003-06-24  Changed Old Testament to use OSIS tags, removing the last of the GBF markup. Also updated to 20030624 snapshot of KJV2003. Compressed.
2.0       early 2003  Changed New Testament to use a snapshot of the KJV2003 Project


Cross references

This may be of interest. http://www.newkreation.com/bible/bible.php described as "49,775 Cross-References / The Bible cross references have been dutifully copied from 1914 A. J. Holman Company's Holman Home Bible in the public domain." David Haslam 07:44, 22 December 2015 (MST)

This resource can be installed to be read by the iSilo app for various platforms. The data is stored in PDB format. David Haslam 12:28, 18 January 2016 (MST)
The iSilo app for Windows (30 day free trial) has an Edit option to Copy Entire Document. I just did that and saved the data as a UTF-8 text file. David Haslam 12:46, 18 January 2016 (MST)

The technical challenge would be to distinguish reference tags from the words they are prefixed to.

Genesis 1
1 In athe beginning bGod created the heaven and the earth. 2 And the earth was cwithout form, and void; and darkness was upon the face of the deep. dAnd the Spirit of God moved upon the face of the waters. 

3 eAnd God said, Let there be light: and there was light. 4 And God saw the light, that it was good: and God divided the light from the darkness. 5 And God called the light fDay, and the darkness he called Night. And the evening and the morning were the first day. 

6 And God said, gLet there be a firmament in the midst of the waters, and let it divide the waters from the waters. 7 And God made the firmament, and divided the waters which were under the firmament from the waters which were above the firmament: and it was so. 8 And God called the firmament Heaven. And the evening and the morning were the second day. 
Only within the app are they superscripted italics letters. I've just left a message with the webmaster enquiring about the source text availability. David Haslam 13:48, 18 January 2016 (MST)
We might explore what can be done with pdbparse. David Haslam
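
One possible way to approach the tag-separation problem described above, sketched in Python. It assumes that a clean copy of the same verse is available (for example from the existing KJV module) and that every reference tag is a single lowercase letter prefixed to a word. The function name split_reference_tags is hypothetical, and the clean verse text is typed in by hand for illustration.

```python
def split_reference_tags(tagged, clean):
    """Align tagged verse text with a clean copy of the same verse.

    Returns a list of (tag, word) pairs, where tag is the one-letter
    marker prefixed to the word, or None when the word carries no marker.
    Assumption: both texts contain the same words in the same order.
    """
    tagged_tokens = tagged.split()
    clean_tokens = clean.split()
    if len(tagged_tokens) != len(clean_tokens):
        raise ValueError("texts do not align token for token")
    pairs = []
    for t, c in zip(tagged_tokens, clean_tokens):
        if t == c:
            pairs.append((None, c))
        elif len(t) == len(c) + 1 and t[0].islower() and t[1:] == c:
            # One extra leading lowercase letter: treat it as the tag.
            pairs.append((t[0], c))
        else:
            raise ValueError(f"cannot reconcile {t!r} with {c!r}")
    return pairs


tagged = "In athe beginning bGod created the heaven and the earth."
clean = "In the beginning God created the heaven and the earth."
tags = [(tag, word) for tag, word in split_reference_tags(tagged, clean) if tag]
print(tags)
```

Aligning against a known text avoids guessing whether a token such as "athe" is a word in its own right. Without a clean copy, a dictionary lookup would be needed instead, and would be less reliable.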

Here's another online resource that has cross-references. https://github.com/scrollmapper/bible_databases
David Haslam 08:07, 22 December 2015 (MST)

Enhanced OSIS markup of the study notes

During the past few days, I developed a TextPipe filter to add the catchWord and rdg elements to the KJV study notes.
This left the following 15 note elements without a rdg element.

<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the caul</catchWord>: it seemeth by anatomy, and the Hebrew doctors, to be the midriff</note>
<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the nations</catchWord>: the nations that warred against Israel</note>
<note type="study"><catchWord>any of…</catchWord>: any of the judges</note>
<note type="study">1ch 11:11 he lift…: from whom he…: Heb. slain</note>
<note type="study"><catchWord>his reign</catchWord>: Nebuchadnezzar’s eighth year</note>
<note type="study"><catchWord>behemoth</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>leviathan</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>where…</catchWord>: without water</note>
<note type="study"><catchWord>inflame</catchWord>: of, pursue</note>
<note type="study"><catchWord>battering…</catchWord>: chief leaders</note>
<note type="study"><catchWord>against thee</catchWord>: of, of thee</note>
<note type="study"><catchWord>O king…</catchWord>: (Chaldee, to the end of chapter seven)</note>
<note type="study"><catchWord>I will…</catchWord>: rather, Where is thy king?</note>

NB. I suspected that the 2 notes containing ": of, " are typos that should be ": or, ".

David Haslam 05:37, 19 January 2016 (MST)

  • ": of, pursue" in Isa.5.11 should be ": or, pursue"
  • ": of, of thee" in Ezek.33.30 should be ": or, of thee"

I have made these two corrections. We then had only 13 notes without a rdg element:

<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the caul</catchWord>: it seemeth by anatomy, and the Hebrew doctors, to be the midriff</note>
<note type="study"><catchWord>selvedge</catchWord>: an edge of cloth so woven that it cannot unravel</note>
<note type="study"><catchWord>the nations</catchWord>: the nations that warred against Israel</note>
<note type="study"><catchWord>any of…</catchWord>: any of the judges</note>
<note type="study">1ch 11:11 he lift…: from whom he…: Heb. slain</note>
<note type="study"><catchWord>his reign</catchWord>: Nebuchadnezzar’s eighth year</note>
<note type="study"><catchWord>behemoth</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>leviathan</catchWord>: probably an extinct animal of some kind</note>
<note type="study"><catchWord>where…</catchWord>: without water</note>
<note type="study"><catchWord>battering…</catchWord>: chief leaders</note>
<note type="study"><catchWord>O king…</catchWord>: (Chaldee, to the end of chapter seven)</note>
<note type="study"><catchWord>I will…</catchWord>: rather, Where is thy king?</note>

Some of these have since been changed to have a rdg element with no attribute.

This task has now been substantially completed, and the converted XML file was resent to DM today.

Summary statistics:

  • 6958 catchWord elements
  • 7404 rdg elements

We have done away with the seg workaround. DM will use a modified OSIS schema for validation.

David Haslam 05:08, 21 January 2016 (MST)
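
The counts and the list of notes lacking a rdg element could be reproduced with a short script along these lines. This is only a sketch using Python's standard xml.etree.ElementTree, not the TextPipe filter mentioned above. The helper names are hypothetical, and the two-note sample (including where the rdg element is placed) is made up for illustration rather than copied from the module.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip any XML namespace prefix of the form {uri} from a tag name."""
    return tag.rsplit("}", 1)[-1]


def study_note_stats(root):
    """Count catchWord and rdg elements; collect study notes with no rdg."""
    counts = {"catchWord": 0, "rdg": 0}
    without_rdg = []
    for el in root.iter():
        name = local(el.tag)
        if name in counts:
            counts[name] += 1
        if name == "note" and el.get("type") == "study":
            if not any(local(d.tag) == "rdg" for d in el.iter()):
                without_rdg.append("".join(el.itertext()))
    return counts, without_rdg


# Illustrative sample only; the real file is a full OSIS document.
sample = """<div>
<note type="study"><catchWord>inflame</catchWord>: or, <rdg>pursue</rdg></note>
<note type="study"><catchWord>behemoth</catchWord>: probably an extinct animal of some kind</note>
</div>"""

counts, without_rdg = study_note_stats(ET.fromstring(sample))
print(counts)
print(without_rdg)
```

Matching on local names means the same function works whether or not the file declares the OSIS namespace.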

catchWord osisRef

I have begun a task to add the osisRef attribute to many of the catchWord elements in the study notes. For the time being, catchWord text that has more than one word is not included. Neither are catchWord elements in the notes for canonical Psalm titles. Making good progress...

KJV (OT) catchWord statistics:

  • 6958 = total, of which
  • 5242 = do not contain a space
  • 5958 = now have osisRef (1 of which requires reverting)[1]
  • 1000 = are still without osisRef, of which
  • 0000 = are keywords in verse 1 for Psalms with titles[2]

Notes:

  1. <catchWord osisRef="Num.26.39@s[Shupham…Hupham]">Shupham…Hupham</catchWord>
  2. These are now included because I fixed the XML file such that these lines now start with a verse sID milestone.

The above exception might be better coded as an osisRef range:

<catchWord osisRef="Num.26.39@s[Shupham]-Num.26.39@s[Hupham]">Shupham…Hupham</catchWord>

David Haslam 05:16, 25 January 2016 (MST)
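
The rule described above (single-word catchWords get an osisRef, multi-word ones are skipped for now, and an ellipsis joining two words might become a range) can be sketched in Python as follows. The function name catchword_osisref is hypothetical, and leaving a truncated catchWord such as "where…" untagged is an assumption made here, not something stated above.

```python
ELLIPSIS = "\u2026"  # the single-character ellipsis used in the notes


def catchword_osisref(verse_id, catch_word):
    """Return an osisRef string for a catchWord, or None to leave it untagged."""
    if " " in catch_word:
        return None  # multi-word catchWords are not included for the time being
    if ELLIPSIS in catch_word:
        start, _, end = catch_word.partition(ELLIPSIS)
        if start and end:
            # Two words joined by an ellipsis: code as an osisRef range.
            return f"{verse_id}@s[{start}]-{verse_id}@s[{end}]"
        return None  # assumption: truncated catchWords stay untagged
    return f"{verse_id}@s[{catch_word}]"


print(catchword_osisref("Num.26.39", "Shupham\u2026Hupham"))
print(catchword_osisref("Job.40.15", "behemoth"))
print(catchword_osisref("Lev.3.4", "the caul"))
```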

Related discussion

fine grain operators

The … was probably an artifact of typesetting margin notes. Limited space, resulting in truncation. I like the idea of a range. Regarding Psalm titles, you can double check the reference by dumping with mod2imp. --Dmsmith 07:48, 24 January 2016 (MST)

The range concept could work in a lot of places, but what happens when the range end word is also found earlier in the verse text than the range begin word? The @s[word] operator finds only the first occurrence of the word in question. I expect there are many such examples. David Haslam 10:42, 24 January 2016 (MST)
I think it should be easy to understand that a range is anchored at the start and follows to the end. I'd argue that @s[a]-@s[b] implies that a and b do not reoccur in the range. Maybe it is lacking a secondary subscript: @s[a] means the first occurrence, @s[a][1] also means the first occurrence, but @s[a][3] means the third occurrence. Since OSIS is targeted to commentaries as well as Bibles, it should be expected that text referenced by an osisID is large. OSIS has another operator that gives the counted position of the match. I think it counts code points though. IIRC: cp(n). It'd be better to have one that counted words, e.g. wc(n). The further complication is that @s[asdf] might be a substring match rather than a whole word match. --Dmsmith 11:58, 24 January 2016 (MST)
That's a good point about substrings. We need to develop a tighter specification for the @s[text] operator, but one that gives much more flexibility of use, while retaining precision for word locations. David Haslam 07:14, 25 January 2016 (MST)
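
To make the open questions concrete, here is a Python sketch of how the behaviours discussed above might be prototyped. Nothing here is defined by OSIS. The occurrence subscript is only the suggestion made in this thread, the whole-word flag reflects the substring question raised above, and the function names are hypothetical.

```python
import re


def find_occurrence(text, needle, n=1, whole_word=True):
    """Return the (start, end) span of the nth occurrence (1-based), or None."""
    pattern = re.escape(needle)
    if whole_word:
        # Reject matches that are embedded inside a longer word.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    for i, m in enumerate(re.finditer(pattern, text), start=1):
        if i == n:
            return m.span()
    return None


def resolve_range(text, a, b):
    """Resolve @s[a]-@s[b]: anchor at the first a, end at the first b after it."""
    start = find_occurrence(text, a)
    if start is None:
        return None
    m = re.search(r"(?<!\w)" + re.escape(b) + r"(?!\w)", text[start[1]:])
    if m is None:
        return None
    return text[start[0]:start[1] + m.end()]


verse = "In the beginning God created the heaven and the earth."
print(find_occurrence(verse, "the"))                   # first occurrence
print(find_occurrence(verse, "the", n=3))              # third occurrence
print(find_occurrence(verse, "he"))                    # whole-word match only
print(find_occurrence(verse, "he", whole_word=False))  # substring match
print(resolve_range(verse, "heaven", "the"))
```

The last call shows the case raised above, where the end word also occurs before the begin word. Searching for the end word only after the anchored start selects "heaven and the" rather than failing or running backwards.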