Talk:List of eXtensions to OSIS used in SWORD

From CrossWire Bible Society
Revision as of 11:57, 15 August 2012 by David Haslam


Why this page should prove useful

For a valid OSIS XML file to be "fit for purpose", it's necessary to know what to include in order to make good SWORD modules. Though some of these extensions are already partially documented in existing wiki pages, there is no single place where they are all listed along with usage rules. As further pre-processing tools are being developed (such as USX to OSIS), a one-stop reference should help developers take full advantage of existing solutions and workarounds. David Haslam 05:00, 27 April 2012 (MDT)

This page will be a 'work in progress' for a time. I have made a start by adding sections for extensions found by searching the wiki. The result of such a search is probably quite incomplete. Developers who work on the SWORD source code should be able to add further sections. David Haslam 05:11, 27 April 2012 (MDT)
Further extensions were added by reference to usfm2osis.pl and to kjv.xml. David Haslam 05:30, 28 April 2012 (MDT)
Examination of existing modules for unsupported extensions may also suggest opportunities for further enhancements to the SWORD engine, especially if they are used extensively, and when it would seem that their purpose was to affect how text is displayed. David Haslam 02:36, 1 May 2012 (MDT)
Comment from the xulsword developer received by email: David Haslam 12:11, 1 May 2012 (MDT)
"Hey this is great! This is the kind of thing that enables developers to make best use of their time and design products that are compatible. "
It may also help us to decide which extensions should be deprecated. David Haslam 12:16, 8 August 2012 (MDT)

Non-CrossWire OSIS eXtensions

Please list any non-CrossWire OSIS eXtensions here in the talk page. David Haslam 11:45, 27 April 2012 (MDT)

SWORD and JSword

The page is intended to reflect what extensions are supported by the SWORD API. There is no guarantee that these extensions are also supported by JSword, albeit many of them are. David Haslam 05:26, 28 April 2012 (MDT)

Extensions no longer supported ...

If it turns out that I have listed any extension that is no longer supported by the latest release of the SWORD engine, please contact me before editing the main page. David Haslam 05:33, 28 April 2012 (MDT)

Likewise for any extension that was never supported. David Haslam 02:23, 1 May 2012 (MDT)

Extensions used in xulsword

Record research here on OSIS extensions used in xulsword, but not yet implemented or adopted in SWORD.
For further details, see [1]. David Haslam 10:22, 28 April 2012 (MDT)

x-parallel-passage

This can be used within the note element as a value of the attribute subType.

It's used to mark crossReference links for parallel passages.

x-p-indent

This can be used within the milestone element as a value of the attribute type.

It's used to provide a poetry indentation as an alternative to using line elements with level attributes.
Currently, deeper indents are created by two or three <milestone type="x-p-indent" /> elements in series.
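
As a minimal sketch of the "milestones in series" approach described above, the following Python helper emits one x-p-indent milestone per indent level. The function name indent_milestones is hypothetical and is not part of xulsword or SWORD; it only illustrates how a pre-processing tool might generate the markup.

```python
def indent_milestones(depth: int) -> str:
    """Return `depth` x-p-indent milestones in series (hypothetical helper)."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    # One empty milestone element per level of indentation.
    return '<milestone type="x-p-indent" />' * depth


# A two-level indent is simply two milestones back to back.
print(indent_milestones(2))
```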

Extensions found in SWORD modules

Record research here on OSIS extensions found in SWORD modules, for which the status of support by the SWORD engine is still unclear. Please remember to identify the particular module. David Haslam 02:26, 1 May 2012 (MDT)

In the research that follows, I'm recording only notable observations. David Haslam 10:51, 2 May 2012 (MDT)

ESV module

This is a list of all x-prefix extensions in the ESV module (along with a count in column 1):

05206	x-begin-paragraph
19805	x-br
00081	x-declares
05206	x-end-paragraph
00007	x-extra-space
11334	x-indent
00021	x-indent-2
02428	x-preverse
00001	x-psalm-book
00003	x-psalm-doxology
00179	x-same-paragraph
00074	x-selah
00001	x-speaker
00002	x-textual-note
00016	x-us-time
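
Lists like the one above could be produced in many ways, and the method actually used is not recorded here. The following Python sketch shows one possible approach: scan every attribute value in the OSIS text and tally each whitespace-separated token that starts with x-. The names count_extensions and format_report, and the sample string, are assumptions made for illustration.

```python
import re
from collections import Counter

# Captures the value of any double- or single-quoted XML attribute.
ATTR_VALUE = re.compile(r'="([^"]*)"|=\'([^\']*)\'')


def count_extensions(osis_text: str) -> Counter:
    """Tally x- prefixed tokens found in attribute values (hypothetical helper)."""
    counts = Counter()
    for match in ATTR_VALUE.finditer(osis_text):
        value = match.group(1) or match.group(2) or ""
        for token in value.split():
            if token.startswith("x-"):
                counts[token] += 1
    return counts


def format_report(counts: Counter, width: int = 5) -> str:
    """Render 'count<TAB>name' lines, zero-padded and sorted by name."""
    return "\n".join(
        f"{n:0{width}d}\t{name}" for name, n in sorted(counts.items())
    )


# Made-up sample, not taken from any real module.
sample = '<milestone type="x-br"/><l type="x-indent">a</l><milestone type="x-br"/>'
print(format_report(count_extensions(sample)))
```

For the made-up sample, the report has two lines: a count of 00002 for x-br and 00001 for x-indent. A regular expression is used rather than an XML parser so that the sketch also copes with fragments that are not well-formed on their own.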

KJV module

This is an edited list of all x-prefix extensions in the KJV module v2.4 from CrossWire Beta (along with a count in column 1):

03422	x-1,x-2 ... x-60
01424	x-added
00835	x-extra-p
00232	x-milestone
02970	x-p
00232	x-preverse
03422	x-split
00017	x-transChange

LEB module

This is a list of all x-prefix extensions in the LEB module (along with a count in column 1):

000959	x-idiom
000002	x-license
000002	x-trademark

x-idiom

This can be used within the seg element as a value of the attribute type.

This extension is used to identify idiomatic words and phrases, and is normally immediately followed by a note element giving the literal meaning. e.g.

“Behold, the virgin <seg type="x-idiom">˻will become pregnant˼</seg><note>Literally “will have in the womb”</note> and ...

The LEB also uses visible markers (low corner brackets) to mark the start and end of each idiom.
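
To illustrate how a front-end might make use of this pairing, here is a hedged Python sketch that collects each x-idiom segment together with the note that immediately follows it. It assumes a namespace-free fragment wrapped in a verse element (real OSIS files declare a namespace, so the tag names would need qualifying), and the function name idioms_with_literals is hypothetical.

```python
import xml.etree.ElementTree as ET


def idioms_with_literals(fragment_xml: str):
    """Pair each x-idiom seg with the note directly after it, if any."""
    root = ET.fromstring(fragment_xml)
    pairs = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag != "seg" or child.get("type") != "x-idiom":
                continue
            nxt = children[i + 1] if i + 1 < len(children) else None
            literal = None
            # Only pair when the note follows with no intervening text.
            if nxt is not None and nxt.tag == "note" and not (child.tail or "").strip():
                literal = "".join(nxt.itertext())
            pairs.append(("".join(child.itertext()), literal))
    return pairs


# The LEB example from above, wrapped in an assumed <verse> element.
sample = (
    '<verse>“Behold, the virgin <seg type="x-idiom">˻will become pregnant˼</seg>'
    '<note>Literally “will have in the womb”</note> and ...</verse>'
)
print(idioms_with_literals(sample))
```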

OSMHB module

This is a list of all x-prefix extensions in the Hebrew OSMHB module (along with a count in column 1):

000004	x-large
042582	x-maqqef
002287	x-paseq
001181	x-pe
001270	x-qere
000009	x-reversednun
001981	x-samekh
000003	x-small
023193	x-sof-pasuq
000004	x-suspended

WLC module

This is a list of all x-prefix extensions in the Hebrew WLC module (along with a count in column 1):

005915	x-DHSource
000009	x-invertednun
001276	x-ketiv
000004	x-large
472676	x-morph
001181	x-pe
001288	x-qere
001981	x-samekh
000003	x-small
000004	x-suspended
001018	x-textual

WEB module

This is a list of all x-prefix extensions in the WEB module v3.1 (along with a count in column 1):

09596	x-milestone
09596	x-preverse
00002	x-testament

This is a list of all x-prefix extensions in the file web.osis.xml (downloaded from ebible.org 2010-06-15) included for comparison:

000333	x-directAddress
000680	x-doNotGeneratePunctuation
000628	x-noteStartAnchor
003983	x-plural
007660	x-primary
008342	x-secondary

x-plural

This could be used within the w element as a value of the attribute type.

It's intended to mark the pronoun 'you' where it's second person plural in the original languages.
The WEB is a translation that uses 'you' for both singular and plural second person pronouns.
For the whole pronoun word to be highlighted somehow (future implementation?), the syntax might be:

<w type="x-plural">you</w>

As it was in the file web.osis.xml, the pronouns were not wrapped like that, but merely tagged as:

you<w type="x-plural" />

I guess it depends to a large extent on how the translator envisaged them to be visibly displayed.
cf. Printed editions of the NASB mark the plural forms with a superscript asterisk after the word.

<w> is a container element, so <w/> doesn't mean anything. <w type="x-plural"/> doesn't mean that the previous word is plural, it means that the empty word is plural. Also, plurality is properly marked in the morph attribute of <w>. --Osk 06:59, 6 August 2012 (MDT)
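
If the empty form were to be converted into the container form shown earlier, a pre-processing step might look like the Python sketch below. It is only an illustration under stated assumptions: the pattern expects the exact attribute spelling used in web.osis.xml, the name wrap_plural is hypothetical, and the sketch does not address the point that plurality belongs in the morph attribute.

```python
import re

# A word immediately followed by an empty <w type="x-plural" /> element.
EMPTY_PLURAL = re.compile(r'(\w+)<w type="x-plural"\s*/>')


def wrap_plural(text: str) -> str:
    """Move the preceding word inside the w element (hypothetical helper)."""
    return EMPTY_PLURAL.sub(r'<w type="x-plural">\1</w>', text)


print(wrap_plural('I tell you<w type="x-plural" /> the truth'))
# I tell <w type="x-plural">you</w> the truth
```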

x-primary & x-secondary

These could be used within the l element as a value of the attribute type.

It was a method envisaged to mark poetry indents in the WEB translation.

Poetic level (indentation) is indicated by the level attribute of <l>. --Osk 07:04, 6 August 2012 (MDT)
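
A file using these type values could in principle be normalised to the level attribute with a small script. The Python sketch below assumes, without confirmation from the WEB source, that x-primary corresponds to level 1 and x-secondary to level 2; the mapping and the name type_to_level are hypothetical.

```python
import re

# Assumed mapping; the WEB source does not state these equivalences.
LEVELS = {"x-primary": "1", "x-secondary": "2"}
L_TYPE = re.compile(r'<l type="(x-primary|x-secondary)"')


def type_to_level(text: str) -> str:
    """Replace the type extension on <l> with a level attribute."""
    return L_TYPE.sub(lambda m: f'<l level="{LEVELS[m.group(1)]}"', text)


print(type_to_level('<l type="x-secondary">a line of poetry</l>'))
# <l level="2">a line of poetry</l>
```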

eXtensions generated by usfm2osis.py

Osk is currently developing a Python script to convert USFM to OSIS. It would be helpful to list all the OSIS eXtensions generated by usfm2osis.py so that these are readily available for reference when updates are to be made to the SWORD tool osis2mod. David Haslam 06:27, 6 August 2012 (MDT)

Here's what I've extracted so far: (where # is short for an integer value, 1 to 5)
x-alphabeticalContents
x-center
x-chronology
x-description
x-dotUnderline
x-embedded
x-embedded-closing
x-embedded-opening
x-end
x-foreword
x-glossary
x-halfTitlePage
x-indent-#
x-indented
x-indented-#
x-indented-hanging
x-introduction
x-introduction-end
x-level-#
x-liturgical
x-mapIndex
x-nobreak
x-nobreakNext
x-nobreakNext-indented
x-noindent
x-noindent-indented
x-noindent-quote
x-ntQuotesFromLXX
x-optional
x-p
x-promotionalPage
x-quote
x-right
x-sidebar
x-spine
x-study
x-subSubSection
x-subSubSubSection
x-subSubSubSubSection
x-tableofAbbreviations
x-unknown
x-usfm-ie
x-usfm-sts
x-usfm-toc1
x-usfm-toc2
x-usfm-toc3
x-usfm-z-\1
x-weightsAndMeasures
x-workTitle
In x-usfm-z-\1, the \1 substitutes for the name of any user defined USFM tag found in the Paratext files.
It seems that x-p is used only as a temporary attribute within the script, and removed during the osisReorderAndCleanup.
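
For readers unfamiliar with the \1 notation in x-usfm-z-\1, it is a regular-expression backreference to the first captured group. The Python sketch below shows the idea; the pattern is an assumption for illustration and is not copied from usfm2osis.py, and the name z_tag_to_extension is hypothetical.

```python
import re

# Assumed pattern: a user-defined USFM marker is a backslash, 'z', then a name.
Z_TAG = re.compile(r'\\z(\w+)')


def z_tag_to_extension(marker: str) -> str:
    """Substitute the captured tag name for \\1 in the extension value."""
    return Z_TAG.sub(r'x-usfm-z-\1', marker)


print(z_tag_to_extension(r'\zfoo'))
# x-usfm-z-foo
```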

In addition to those listed above, the script can also output these subType attributes for introductions:

x-acts
x-bible
x-deuterocanon
x-epistles
x-gospels
x-history
x-letters
x-newTestament
x-oldTestament
x-pentateuch
x-poetry
x-prophecy

David Haslam 04:04, 13 August 2012 (MDT)