OSIS Tutorial

From CrossWire Bible Society
Revision as of 14:59, 17 October 2011 by David Haslam (Literal Translations: corrected changeType attribute name to simply type)

Creating a Basic OSIS Document

Creating a basic OSIS document can be an easy task with just a little direction, which this brief tutorial hopes to provide.

Throughout this tutorial, we will be marking up a Bible excerpt: Gen 1:1-3, from the King James Version of the Bible, 1611 edition. The excerpt reads as follows:

THE FIRST BOOKE OF MOSES, called GENESIS.
Chap.j.
The creation of the world.
1 The creation of Heauen and Earth, 3 of the
light, 6 of the firmament, 9 of the earth ſe-
parated from the waters, 11 and made fruit-
full, 14 of the Sunne, Moone and Starres,
20 of fiſh and fowle, 24 of beaſts and cat-
tell, 26 of Man in the Image of God. 29 Al-
ſo the appointment of food.
                       1. In the beginning
                          God created the
                          Heauen and the
                          Earth.
                             2.  And the
                          earth was with=
                          out forme , and
                          voyd;and darken=
                          eſſe was vpon
the face of the deepe : And the Spirit
of God mooued vpon the face of the
waters.
  3.  And God ſaid, *Let there be light :
and there was light.
* Pſal.33.6. and 136.5. acts.14.15. and 17.24. hebr.11.3.

While this looks like a bunch of typos, it is a faithful representation of the original.
For an image of this see: http://dewey.library.upenn.edu/sceti/printedbooksNew/index.cfm?TextID=kjbible&PagePosition=77
For the entire first chapter of Genesis see: KJV 1611

XML

At its core, OSIS is an XML markup standard, so every OSIS document must follow the rules for basic XML documents. This means that we need a standard XML declaration to begin our document. This line should do just fine:

<?xml version="1.0" encoding="UTF-8" ?>

The Root Node

The root node for an OSIS document has the element name osis. Since OSIS uses XML Schema to define itself, we can place a link in the root node declaring our document's structure definition (we're an OSIS XML document, not just any XML document). Our complete root node will look like this:

<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace" 
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.bibletechnologies.net/2003/OSIS/namespace http://www.bibletechnologies.net/osisCore.2.1.1.xsd">

Notes:

  1. You may encounter files containing the extra line xmlns:xml="http://www.w3.org/XML/1998/namespace".
    This is not necessary, and does not alter how an OSIS XML file validates.
  2. You may encounter files in which the file osisCore.2.1.1.xsd does not have the full remote path.
    For these situations it's assumed that you have a local copy for the convenience and speed of off-line validation.
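
As a rough illustration of the second note, a root node that points at a local schema copy might look like the sketch below. It assumes that osisCore.2.1.1.xsd has been saved in the same directory as the OSIS file; the relative path is an assumption for illustration, not something the standard requires.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Assumption: osisCore.2.1.1.xsd sits next to this file, so the
     schema location is given as a relative path, not the remote URL. -->
<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.bibletechnologies.net/2003/OSIS/namespace osisCore.2.1.1.xsd">
  <!-- the osisText element goes here -->
</osis>
```

The namespace URI (the first value in xsi:schemaLocation) stays the same; only the second value, the hint for where to find the schema, changes.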

The Work Container

An OSIS document can be either a corpus of multiple works or a single text, like the KJV Bible. Ours will be the latter. We declare this by placing our entire work in an osisText element. This element carries attributes that declare our work's id (osisIDWork) and default reference scheme (osisRefWork). The values of these attributes are not as important as their function: each one links to a work element, which we will discuss a little later, and that work element fully defines the corresponding function for our document. For now, we'll just set them to "KJV" and "defaultReferenceScheme". Here is our osisText line, which also includes the default language of our document:

   <osisText osisIDWork="KJV" osisRefWork="defaultReferenceScheme" xml:lang="en">

The Header

Each OSIS work must have a header section that defines information about the text. This will include copyright and cataloguing data, among other bibliographic information. The header not only contains information about our work, it also contains basic information about any works which we reference in our text. Below is the entire header for our text; we'll explain it in more detail afterwards.

    <header>
        <work osisWork="KJV">
            <title>King James Version of 1611</title>
            <identifier type="OSIS">KJV.TutorEncoding</identifier>
            <refSystem>Bible.KJV</refSystem>
        </work>
        <work osisWork="defaultReferenceScheme">
            <refSystem>Bible.KJV</refSystem>
        </work>
    </header>

Our header includes two work elements. Each work element is uniquely distinguished by its osisWork attribute.

The first work element defines our work. This is designated by matching the osisWork attribute value to the osisIDWork attribute value of our osisText element in the section above. This sounds confusing, but simply notice that they both have the value "KJV"; that's all there is to it.

Inside this work, we have an element that defines our title. We also have a special identifier element with a type value of "OSIS". This identifier element must be present and is used for assigning a canonical name to our OSIS document. We're claiming the "KJV.TutorEncoding" identifier for our document, in case anyone wants to refer to our text if, say, we're included in a large library of OSIS documents.

When marking up Biblical materials, we have a need to reference certain portions of our materials, and not just the entire document. To allow this, we must provide two sides of the same coin:

  • I am referencing this portion;
  • I am this portion

The two functions are facilitated in OSIS with the osisRef and osisID attributes, respectively, and are reviewed in more detail in the next section.

The refSystem element in our work states that whenever we mark up a portion of our text as something like "Genesis 1:1" (side 2 of the coin), we are using the Bible.KJV reference system.

Remember that we had a second attribute, osisRefWork, on osisText in the section above. This attribute states which reference scheme our document will use by default when citing references (side 1 of the coin) in our text. The second work element above matches its value, "defaultReferenceScheme", and declares that we are using the Bible.KJV reference scheme as our default.
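
To make the linkage easier to see, here is the same osisText line and header again, this time with comments showing which attribute pairs with which work element. Nothing new is introduced; this is only an annotated sketch of the fragments above, closed off so that it is well-formed on its own.

```xml
<osisText osisIDWork="KJV" osisRefWork="defaultReferenceScheme" xml:lang="en">
  <header>
    <!-- Matches osisIDWork="KJV": describes this document and the
         reference system that its osisID values belong to. -->
    <work osisWork="KJV">
      <title>King James Version of 1611</title>
      <identifier type="OSIS">KJV.TutorEncoding</identifier>
      <refSystem>Bible.KJV</refSystem>
    </work>
    <!-- Matches osisRefWork="defaultReferenceScheme": the reference
         system used by default when citing references in the text. -->
    <work osisWork="defaultReferenceScheme">
      <refSystem>Bible.KJV</refSystem>
    </work>
  </header>
  <!-- the text divisions follow here -->
</osisText>
```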

Marking and Referencing

Before going any further, we should talk about an OSIS concept that enables us to reference pieces of other works, and also to label pieces of our own work as targets of references. These two functions are represented by the OSIS attributes osisRef and osisID, respectively. The concepts are fairly straightforward and, in their simplest form, easy to comprehend. As an example, let's say we have a commentary that wishes to reference James 1:19. OSIS includes a reference element for this purpose, and for our example, an instance may look something like:

Please see <reference osisRef="Jas.1.19">James 1:19</reference>

The counterpart, osisID, is very similar. Let's say a Bible wishes to mark a section as being James 1:19. OSIS provides a verse element for this, and our Bible may include something like:

<verse osisID="Jas.1.19">Wherefore, my beloved brethren, let every man be swift to hear, slow to speak, slow to wrath:</verse>
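
The osisRef attribute can also point at more than a single verse. The sketch below assumes the usual OSIS reference syntax, in which a range is written as two osisID values joined by a hyphen, and an optional prefix before a colon names one of the osisWork values declared in the header. The surrounding p element and the wording of the sentences are illustrative only.

```xml
<p>
  <!-- A single verse, resolved against the default reference scheme. -->
  Please see <reference osisRef="Jas.1.19">James 1:19</reference>.
  <!-- A range: two osisID values joined by a hyphen. -->
  Compare <reference osisRef="Jas.1.19-Jas.1.20">James 1:19-20</reference>.
  <!-- An explicit work prefix; "KJV" is assumed to be declared as an
       osisWork value in the header of the document. -->
  See also <reference osisRef="KJV:Gen.1.1">Genesis 1:1</reference>.
</p>
```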

Text Divisions

OSIS works can be subdivided into arbitrary smaller sections, if desired. For Bibles, it usually makes sense to keep the traditional divisions that have been used for centuries: Testament, Book, Chapter, Verse. We will use these for our text. OSIS provides a div element to facilitate some of these divisions; other divisions have more specific elements of their own. Here are our three divisions, placed at the beginning of our work to get us down to Genesis, chapter 1.

    <div type="x-testament">
      <div type="book" osisID="Gen">
        <chapter osisID="Gen.1">

Actual Text

Now finally we're ready to start including our actual Bible text in the document. The first verse is Genesis 1:1. We'd like to let the world know the identification of this part of our document, so we'll include the text in a verse element with an appropriate osisID attribute. Here it is:

          <verse osisID="Gen.1.1">In the beginning God created the Heauen and the Earth.</verse>

Literal Translations

The text that we've chosen to encode claims to be a special type of Bible translation, sometimes referred to as a literal translation. Translations of this type attempt to preserve, as best they can, the wording of the text from the original language. These works all tend to use similar mechanisms to indicate where they have needed to deviate from what was presented in the original source for the sake of clear understanding in the target language. Our second verse includes one such anomaly: the second instance of the word was. Most printed editions of the KJV italicize this word, indicating that there was no Hebrew counterpart that was translated into the English "was", but that without this word the sentence would not be correct English. Since OSIS is presentation-agnostic, instead of prescribing a display style such as italic for this purpose, we will mark the anomaly and let the publisher choose how they would like it displayed. OSIS provides a transChange element to allow these translations to mark such deviations, and a type attribute on this element to indicate the type of change made. Here is our Genesis 1:2, which includes this markup.

          <verse osisID="Gen.1.2">
            And the earth was without forme , and voyd;and darkeneſſe
            <transChange type="added">was</transChange>
            vpon the face of the deepe : And the Spirit of God mooued
            vpon the face of the waters.
          </verse>

Quotes

The last verse that we will mark up includes a quote by God, Himself! Let's be sure to get this one correct. In OSIS, quotes can be marked with a q element. An optional who attribute identifies the speaker, and we will use it to designate who is speaking in this portion of Scripture. Since the orthography of the KJV had no quotation marks, this is indicated with an empty marker attribute.

          <verse osisID="Gen.1.3">
            And God ſaid, *<q who="God" marker="">Let there be light</q> : and there was light.
          </verse>

Notes

Genesis 1:3 has a note that still needs to be included. In OSIS, notes are inlined at the point of inclusion. Notes also have various types. In this instance, the note is a cross-reference. To retain the original orthography we use the n attribute to provide the original marker.

<note type="crossReference" n="*">
  <reference osisRef="Ps.33.6">Pſal.33.6.</reference>
  and
  <reference osisRef="Ps.136.5">136.5.</reference>
  <reference osisRef="Acts.14.15">acts.14.15.</reference>
  and
  <reference osisRef="Acts.17.24">17.24.</reference>
  <reference osisRef="Heb.11.3">hebr.11.3.</reference>
</note>

The entire verse would be:

          <verse osisID="Gen.1.3">
            And God ſaid,
            <note type="crossReference" n="*">
              <reference osisRef="Ps.33.6">Pſal.33.6.</reference>
              and
              <reference osisRef="Ps.136.5">136.5.</reference>
              <reference osisRef="Acts.14.15">acts.14.15.</reference>
              and
              <reference osisRef="Acts.17.24">17.24.</reference>
              <reference osisRef="Heb.11.3">hebr.11.3.</reference>
            </note>
            <q who="God" marker="">Let there be light</q>
            : and there was light.
          </verse>

Finishing up

Remember, in XML, every opening tag of an element must have a matching closing tag. Let's be sure to close all of our elements before finishing up our OSIS text. First we'll close our Chapter, then Book, then Testament.

        </chapter>
      </div>
    </div>

Then we'll close our osisText:

  </osisText>

And finally, we'll close our root osis element:

</osis>
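
For reference, the fragments from each section can be assembled into one file. The listing below is a sketch put together from the pieces in this tutorial; it has not been validated against the schema here, so treat it as a starting point rather than a guaranteed-valid document.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.bibletechnologies.net/2003/OSIS/namespace http://www.bibletechnologies.net/osisCore.2.1.1.xsd">
  <osisText osisIDWork="KJV" osisRefWork="defaultReferenceScheme" xml:lang="en">
    <header>
      <work osisWork="KJV">
        <title>King James Version of 1611</title>
        <identifier type="OSIS">KJV.TutorEncoding</identifier>
        <refSystem>Bible.KJV</refSystem>
      </work>
      <work osisWork="defaultReferenceScheme">
        <refSystem>Bible.KJV</refSystem>
      </work>
    </header>
    <div type="x-testament">
      <div type="book" osisID="Gen">
        <chapter osisID="Gen.1">
          <verse osisID="Gen.1.1">In the beginning God created the Heauen and the Earth.</verse>
          <verse osisID="Gen.1.2">
            And the earth was without forme , and voyd;and darkeneſſe
            <transChange type="added">was</transChange>
            vpon the face of the deepe : And the Spirit of God mooued
            vpon the face of the waters.
          </verse>
          <verse osisID="Gen.1.3">
            And God ſaid,
            <note type="crossReference" n="*">
              <reference osisRef="Ps.33.6">Pſal.33.6.</reference>
              and
              <reference osisRef="Ps.136.5">136.5.</reference>
              <reference osisRef="Acts.14.15">acts.14.15.</reference>
              and
              <reference osisRef="Acts.17.24">17.24.</reference>
              <reference osisRef="Heb.11.3">hebr.11.3.</reference>
            </note>
            <q who="God" marker="">Let there be light</q>
            : and there was light.
          </verse>
        </chapter>
      </div>
    </div>
  </osisText>
</osis>
```

Note that the book abbreviations in the osisRef values (Ps, Acts, Heb) are written with their standard OSIS capitalisation, while the displayed text keeps the 1611 spelling.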

Conclusion

And that's it! Congratulations, you've just walked through your first entire OSIS document. With power comes complexity, so there is much more to learn if you wish to unlock the features of OSIS that will allow you to mark up your texts more richly. But some people may prefer to keep the more intricate aspects of OSIS under lock and key, depending on their needs. With OSIS, the choice is yours, and you now know everything necessary to start encoding your own texts, making them usable by organizations all around the world in a variety of presentation venues. Blessings in your endeavours.