Difference between revisions of "DevTools:Modules"

From CrossWire Bible Society
(Creating a module)
m (Related Pages: deleted *GenBook and OpenOffice)
 
(105 intermediate revisions by 10 users not shown)
Line 1: Line 1:
= Module Development=
+
= Module Development Overview=
A SWORD module consists of a set of binary files in any of an increasing number of formats created for SWORD plus a .conf file that specifies the location and attributes of the module.
+
If you want to learn how to create a SWORD module, this is the place to start. Here is a brief overview of the process:
 +
#Collect and install the necessary software tools.
 +
#Obtain the source text and permission from the copyright holder if you wish to distribute a copyrighted module.
 +
#Prepare the source text for import.
 +
#Use an XML validator to check that your source file is properly constructed.
 +
#Import the source text using the appropriate tool.
 +
#Create a .conf file.
 +
#Install and test that the module displays correctly in several of the SWORD front-end applications.
 +
#Submit your module to CrossWire for distribution.
  
The .conf file is located in a standard location, such as the mods.d directory of the SWORD install directory, that may be specified to The SWORD Engine in a number of ways that are outside the scope of this document. This file can be created with a standard text editor like notepad, emacs, vi, or pico. Its contents are described below in section II.
+
=Creating a module=
 +
==Collect and Install Software Tools==
 +
*Install SWORD and collect SWORD module creation tools. For Linux (and Mac?) many of the module creation tools come installed with SWORD, so if you aren't comfortable installing from source simply install from your distribution's repositories. For Windows, you will need to download the module creation utilities from [http://www.crosswire.org/ftpmirror/pub/sword/utils/win32/ http://www.crosswire.org/ftpmirror/pub/sword/utils/win32/].<ref>Unofficial builds that may be from more recent SVN revisions may be found [http://dl.thehellings.com/sword-utils/ here].</ref> Be sure to download the most recent icudt dll and unzip it in the folder where you place the utilities.
  
The module files themselves usually require some pre-processing before they are ready to be imported to SWORD. How you go about this pre-processing is something you will need to decide for yourself. You may be able to do all of your pre-processing with simple search & replace operations in notepad or more complex regular expression search & replace operations in emacs, but the majority of modules will probably require even more complex editing using a scripting language such as Perl plus a fair amount of manual correction. On the other hand, some modules may come in a standard format such as ThML or OSIS, which does not require any modification, assuming the files are valid documents. Pre-processing itself is outside the scope of this document, but we will explain how you need to format documents to prepare them for import to SWORD, both in terms of encoding and markup.
+
*[[Frontends:Xiphos|Xiphos]] (for Windows) users should find a complete set of Sword tools in the Xiphos directory.
  
Once you have a document ready for import, you will need to run it through an importer to create the SWORD module files, which will then be placed in the module directory you specify in your .conf file.
+
*The latest build of the Sword utilities for Windows requires the [http://www.microsoft.com/download/en/details.aspx?id=5555 Visual C++ 2010 Redistributable x86] version. If you have a 64-bit version of Windows, and you have installed only the [http://www.microsoft.com/download/en/details.aspx?id=14632 Visual C++ 2010 Redistributable x64] version, you will also need to install the x86 version.
  
After this, you may test your work and consider submitting it to The SWORD Project for public distribution from our website.
+
*Obtain a good programmer's text editor, preferably one that performs syntax and validation checks. See [[DevTools:Text Editors]] for examples.
  
=Creating a .conf File=
+
<references />
==.conf File Layout==
 
The conf file tells the SWORD engine how to treat the installed module files, for example which kind of markup they contain.
 
  
Below is a listing of the possible directives in that file. Each of these directives is of the form key=value. Some keys can be repeated. Some values can span more than one line, with '\' at the end of a line indicating that the text on the next line continues the value. Some values allow RTF and some allow HTML &lt;a href="xxx"&gt;label&lt;/a&gt; hypertext links. HTML is not allowed otherwise.
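The key=value layout, repeated keys and '\' continuation lines can be illustrated with a small reader. This is only a sketch in Python, not the parser used by The SWORD Engine: the function name <code>parse_conf</code> is hypothetical, and joining continuation lines directly (without inserting a space or newline) is an assumption of this example.

```python
def parse_conf(text):
    """Return (module_name, entries) where entries maps key -> list of values."""
    # Step 1: merge lines ending in a backslash with the line that follows.
    logical_lines = []
    pending = ""
    for line in text.splitlines():
        if line.endswith("\\"):
            pending += line[:-1]  # drop the backslash, keep collecting
            continue
        logical_lines.append(pending + line)
        pending = ""
    if pending:
        logical_lines.append(pending)

    # Step 2: read the [name] header and the key=value directives.
    module_name = None
    entries = {}
    for line in logical_lines:
        stripped = line.strip()
        if not stripped:
            continue
        if module_name is None and stripped.startswith("[") and stripped.endswith("]"):
            module_name = stripped[1:-1]
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # not a directive; ignored in this sketch
        # Repeated keys (e.g. GlobalOptionFilter) accumulate in a list.
        entries.setdefault(key.strip(), []).append(value.strip())
    return module_name, entries


sample = "\n".join([
    "[MyModule]",
    "DataPath=./modules/texts/ztext/mymodule/",
    "About=First part \\",
    "second part",
    "GlobalOptionFilter=OSISStrongs",
    "GlobalOptionFilter=OSISFootnotes",
])
name, entries = parse_conf(sample)
print(name, entries["GlobalOptionFilter"])
```

Keeping every value in a list makes repeated keys and single-valued keys look the same to the caller, which keeps the sketch short.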
+
==Obtain Source Text and Permission to Distribute==
 +
*The easiest texts to work with, especially for learning how to make a module, are texts in the [http://en.wikipedia.org/wiki/Public_domain public domain]. For example, you might try downloading a text from [http://www.ccel.org/ CCEL] first to get you started on the process of moving from a prepared text to a compiled module. Be sure to verify that the source text you obtain is indeed in the public domain.
 +
*Some people provide texts that are freely distributable under some sort of license (GNU GPL, Creative Commons, etc.) or no formal license. Be sure to document where the source grants that permission, and check that the source has the right to grant it, so that you can produce this documentation if you want to distribute the module through CrossWire.
 +
*For copyrighted material, you will need to contact the publisher or author to obtain permission. First check to see that someone else in CrossWire hasn't already pursued permission for that work. A list of requests and attempts to obtain rights on behalf of CrossWire can be found at [http://www.crosswire.org/wiki/index.php/Module_Requests Module Requests].
  
Note on RTF, only the following are allowed:
+
==Prepare the Source Text for Import==
* \qc - for centering
 
* \par - for paragraph breaks
 
* \pard - for resetting paragraph attributes, i.e. turning off centering.
 
* \u{num}? - for Unicode characters, where {num} is a signed, 16-bit representation of the code point and where ? is the ASCII character used in case Unicode is not supported. If {num} is less than 0, add 65536 to it to obtain the code point. This should only be used in modules that have an Encoding=UTF-8, but using the actual UTF-8 character is preferred.
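The signed 16-bit arithmetic behind \u{num}? can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that only code points up to U+FFFF are handled; anything above that range is rejected rather than guessed at.

```python
def rtf_unicode_escape(char, fallback="?"):
    """Format one character as an RTF \\u{num}? escape with an ASCII fallback."""
    code_point = ord(char)
    if code_point > 0xFFFF:
        raise ValueError("this sketch only handles code points up to U+FFFF")
    # Values above 32767 do not fit a signed 16-bit number, so they wrap negative.
    num = code_point - 65536 if code_point > 32767 else code_point
    return "\\u{}{}".format(num, fallback)


def rtf_unicode_decode(num):
    """Recover the character: negative values get 65536 added back."""
    return chr(num + 65536 if num < 0 else num)


print(rtf_unicode_escape("\u03b1"))   # Greek small alpha, U+03B1
print(rtf_unicode_escape("\ufb2a"))   # a code point above 32767
```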
 
  
Enumerated values are shown in bold. These should be used exactly as given and no other values should be used.
+
===Versification===
 +
For Bible modules, SWORD supports a growing number of [[Alternate Versification|Versifications]]. A prerequisite for submission of a module that would not match the default versification for the KJV Bible is to decide which of the several versifications is most suitable for the new module.
  
<table width="100%" border="1">
+
===Encoding===
  <tr>
+
Note that the SWORD Project requires all submitted texts to be Unicode (UTF-8) encoded documents.  
<th>Element</th>
 
<th>Values (type or enumerated)</th>
 
<th>Default Value</th>
 
<th>Allows</th>
 
  </tr>
 
  <tr>
 
<th colspan="4">Required Elements</th>
 
  </tr>
 
  <tr>
 
<td>[name]</td>
 
<td>Each conf file begins with [name], replacing "name" with a short, well-known abbreviation. This must be on the first line and must start at the beginning of that line. It can only contain A-Z, a-z, 0-9 and _.<br/>
 
  
The name of the file should be the lowercase of this name followed by .conf. For example, [MyModule] would be mymodule.conf.</td>
+
Legacy texts might need [[DevTools:Conversion to Unicode|conversion to Unicode]].
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>DataPath</td>
 
<td>&lt;relative system path&gt;
 
  
DataPath is the path to the module relative to the SWORD module root directory. This path should start with "./modules". If the DataPath indicates a directory it should end with a '/'. Otherwise the module name is both the directory and the prefix for each file in that directory.
+
===Character sets===
 +
A common problem in texts created by people less aware of Unicode principles is the presence of characters that look right but belong to a different script than the language of the text. For example, a Cyrillic a and a Latin a look identical but use different code points. It is important to clean this up prior to import, as our module search depends on clean texts with consistent use of letters. Below is a short shell script that produces a list of all characters employed in a text.
  
Typical paths for a module named [mymodule] are:
+
<pre>
  ./modules/texts/rawtext/mymodule/
+
  cat *txt |\
  ./modules/texts/ztext/mymodule/
+
  uni2ascii -paU |\
  ./modules/comments/zcom/mymodule/
+
  sed -e "s/u/\nu/g" -e '/\./d' |\
  ./modules/comments/hrefcom/mymodule/
+
  sort |\
./modules/comments/rawcom/mymodule/
+
  uniq -c -w5 |\
  ./modules/comments/rawcom4/mymodule/
+
  sed -e 's/\\//' -e 's/\W+/\t/g' -e 's/\s*\([0-9][0-9]*\).*u\([0-9A-F]*\)/Character \\x\2 (u\2) was used \1 times/' |\
  ./modules/comments/rawfiles/mymodule/
+
  ascii2uni -aB |\
  ./modules/lexdict/zld/mymodule/mymodule
+
  grep 'used' > charactercount.txt
./modules/lexdict/rawld/mymodule/mymodule
+
</pre>
./modules/lexdict/rawld/devotionals/mymodule/mymodule
 
  ./modules/lexdict/rawld/glossaries/mymodule/mymodule
 
  ./modules/lexdict/rawld4/mymodule/mymodule
 
./modules/genbook/rawgenbook/mymodule/mymodule
 
  
But when it really comes down to it, a valid path could be:<br/>
+
===Images===
./xxx/ or ./xxx/mymodule
+
Images can be included in any type of module. The specifics of how to do this depend upon the markup.
</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>Description</td>
 
<td>&lt;string&gt;<br/>
 
  
This is a short (1 line) title of the module.</td>
+
Some SWORD applications are able to use virtually any image format, but SWORD only officially supports JPEG and PNG image files.
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>ModDrv</td>
 
<td>
 
<b>RawText</b> (for uncompressed Bibles)<br/>
 
<b>zText</b> (for compressed Bibles)<br/>
 
<b>RawCom</b> (for uncompressed Commentaries)<br/>
 
<b>RawCom4</b> (for uncompressed Commentaries having entries greater than 32K bytes)<br/>
 
<b>zCom</b> (for compressed Commentaries)<br/>
 
<b>HREFCom</b> (currently no module uses this type)<br/>
 
<b>RawFiles</b> (for Personal Commentary)<br/>
 
<b>RawLD</b> (for uncompressed Dictionaries)<br/>
 
<b>RawLD4</b> (for uncompressed Dictionaries having entries greater than 32K bytes)<br/>
 
<b>zLD</b> (for compressed Dictionaries)<br/>
 
<b>RawGenBook</b> (for uncompressed tree keyed modules)
 
</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<th colspan="4">Elements required for proper module access</th>
 
  </tr>
 
  <tr>
 
<td>CompressType</td>
 
<td><b>ZIP</b><br/>
 
<b>LZSS</b>
 
  
Used to indicate how a compressed module (zText, zCom, or zLD) is compressed.
+
===Markup===
ZIP is the preferred format. And as of today, no compressed modules use LZSS.
 
</td>
 
<td>LZSS</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>BlockType</td>
 
<td>
 
<b>BOOK</b><br/>
 
<b>CHAPTER</b><br/>
 
<b>VERSE</b><br/>
 
 
 
Used for zText and zCom to indicate how much of the work is compressed into a block. The trade off is size for speed, with BOOK taking the least overall space and the longest time and VERSE taking the greatest overall space and the least time. While BlockType has a default, it is a best practice to specify it. Most Bibles use BOOK and larger Commentaries use CHAPTER. To date, no module uses VERSE.
 
</td>
 
<td>CHAPTER</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>BlockCount</td>
 
<td>&lt;integer&gt;<br/>
 
  
Used for zLD to indicate the number of entries in a compressed block. Higher values will make the module slower, but smaller.</td>
+
(''see also [[DevTools:Misc]]'')
<td>200</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>KeyType</td>
 
<td><b>TreeKey</b><br/>
 
<b>VerseKey</b><br/>
 
  
Used for RawGenBook to indicate whether the module contains a book or a Bible. At this time VerseKey is being developed as the solution for [[Alternate_Versification|alternate versification]].</td>
+
We will accept only plain texts or texts marked up in OSIS or TEI, with the sole exception of texts based on CCEL documents that are marked up in ThML.
<td>TreeKey</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>CipherKey</td>
 
<td>&lt;string&gt;<br/>
 
Contains the unlock key for enciphered modules. A good key is something that is hard to guess. Typically in a format matching the pattern: /[0-9]{4}[A-Za-z]{4}[0-9]{4}[A-Za-z]{4}/. Internally the key can be any byte sequence from 1 to 255 bytes in length. But this file needs it to be readable, plain text, without leading or trailing spaces. Leave a blank line ("CipherKey=") to indicate that the module is enciphered but has no unlock key. (Omit for unlocked modules.)</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<th colspan="4">Elements required for proper rendering</th>
 
  </tr>
 
  <tr>
 
<td>GlobalOptionFilter</td>
 
<td>
 
<b>GBFStrongs</b> (For GBF texts having Strong's Numbers)<br/>
 
<b>GBFFootnotes</b> (For GBF texts having footnotes)<br/>
 
<b>GBFMorph</b> (For GBF texts having morphology information)<br/>
 
<b>GBFHeadings</b> (For GBF texts having headings)<br/>
 
<b>GBFRedLetterWords</b> (For GBF texts marking the Words of Christ)<br/>
 
<b>ThMLStrongs</b> (For THML texts having Strong's Numbers)<br/>
 
<b>ThMLFootnotes</b> (For THML texts having footnotes)<br/>
 
<b>ThMLScripref</b> (For THML texts having cross references)<br/>
 
<b>ThMLMorph</b> (For THML texts having morphology information)<br/>
 
<b>ThMLHeadings</b> (For THML texts having headings)<br/>
 
<b>ThMLVariants</b> (For THML texts having variant readings)<br/>
 
<b>ThMLLemma</b> (For THML texts having lemmas)<br/>
 
<b>UTF8Cantillation</b> (For Hebrew texts having cantillation marks)<br/>
 
<b>UTF8GreekAccents</b> (For Greek texts having accents)<br/>
 
<b>UTF8HebrewPoints</b> (For Hebrew texts having vowel points)<br/>
 
<b>OSISStrongs</b> (For OSIS texts having Strong's Numbers)<br/>
 
<b>OSISFootnotes</b> (For OSIS texts having informational notes)<br/>
 
<b>OSISScripref</b> (For OSIS texts having cross reference type notes)<br/>
 
<b>OSISMorph</b> (For OSIS texts having morphology information)<br/>
 
<b>OSISHeadings</b> (For OSIS texts having non-canonical headings)<br/>
 
<b>OSISRedLetterWords</b> (For OSIS texts marking the Words of Christ)<br/>
 
<b>OSISLemma</b> (For OSIS texts having lemmas)<br/>
 
<b>OSISRuby</b> (For Japanese OSIS texts with ruby: Kana glosses of Han characters)<br/>
 
  
Each of these filters removes/hides the text's feature, when activated by the application.
+
Internally, SWORD can process text in OSIS, TEI and two legacy formats (ThML and GBF). From these formats, it can convert to other formats, including RTF and HTML, for display. OSIS 2.1 is now the preferred format for Bibles and commentaries. At the moment OSIS does not have thorough support for complex dictionaries. For that reason we support TEI for dictionaries.
These filters are applied in the order that they are listed in the conf.
 
        </td>
 
<td>&nbsp;</td>
 
<td>Repeats</td>
 
  </tr>
 
  <tr>
 
<td>Direction</td>
 
<td>
 
<b>LtoR</b> (<u>L</u>eft to <u>R</u>ight)<br/>
 
<b>RtoL</b> (<u>R</u>ight to <u>L</u>eft)<br/>
 
<b>BiDi</b> (<u>Bi</u>-<u>Di</u>rectional)<br/>
 
 
 
Indicate whether the language's script is a left to right script or a right to left script. Languages such as Hebrew, Arabic, Urdu, Farsi have a RtoL script. When a module contains more than one direction, such as a Hebrew/English dictionary, set this value to BiDi. If the RtoL script is transliterated into a LtoR script, set the value to LtoR.
 
</td>
 
<td>LtoR</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>SourceType</td>
 
<td>
 
<b>Plaintext</b><br/>
 
<b>GBF</b> General Bible Format: http://www.ebible.org/bible/gbf.htm<br/>
 
<b>ThML</b> Theological Markup Language: http://www.ccel.org/ThML<br/>
 
<b>OSIS</b> Open Scriptural Information Standard: http://www.bibletechnologies.net<br/>
 
<b>TEI</b> Text Encoding Initiative: http://www.tei-c.org/P4X/DI.html<br/>
 
 
 
These are various ways that the text can be encoded. The preferred encoding is OSIS. TEI is preferred for dictionaries until OSIS supports dictionaries.<br/>
 
 
 
In SWORD, for modules encoded with ThML, OSIS or TEI, each verse, dictionary entry, and book division needs to be well-formed XML or it will result in display problems in some frontends.
 
</td>
 
<td>Plaintext</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>Encoding</td>
 
<td>
 
<b>UTF-8</b><br/>
 
<b>Latin-1</b> (cp1252; see the warning)<br/>
 
 
 
The preferred encoding of texts is UTF-8. UTF-8 modules must be encoded with [http://unicode.org/reports/tr15/ Normalization Form C (NFC)]. Latin-1 is defined by [http://en.wikipedia.org/wiki/Windows-1252 Windows Codepage 1252 (cp1252)] which is a superset of ISO 8859-1.
 
 
 
This encoding indicates how the conf and the module are encoded.
 
 
 
<i><b>Warning</b>: "Latin-1" is an ambiguously used term. See [http://en.wikipedia.org/wiki/ISO_8859-1 ISO 8859-1 at Wikipedia] for technical details; in reality Latin-1 is regularly used as a synonym for ISO-8859-1. Frontend implementors should use "cp1252" or "windows1252" explicitly, not "Latin-1" provided by some programming language libraries.</i>
 
</td>
 
<td>Latin-1</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>DisplayLevel</td>
 
<td>&lt;integer&gt;</td>
 
<td>1</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>Font</td>
 
<td>&lt;string&gt;<br/>
 
Specify the font to be used for display of the module if it is available. Omit this line to use the default font. Do not make use of font-specific encodings in your documents, but use Unicode instead and the Private Use Area if necessary for codepoints that are not handled by Unicode.
 
</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>OSISqToTick</td>
 
<td>true/false<br/>
 
When set to false indicates that OSIS quote elements without a marker attribute are not to produce a quotation mark. This is useful for languages (e.g. Thai) and texts (e.g. KJV) that do not have quotation marks. It is also useful for modules that mark the "Words of Christ" on a verse by verse basis, when the quote spans more than one verse.</td>
 
<td>true</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<th colspan="4">Elements to indicate features</th>
 
  </tr>
 
  <tr>
 
<td>Feature</td>
 
<td>
 
<b>StrongsNumbers</b> (for modules that include Strong's numbers)<br/>
 
<b>GreekDef</b> (for modules with Strong's number encoded Greek definitions)<br/>
 
<b>HebrewDef</b> (for modules with Strong's number encoded Hebrew definitions)<br/>
 
<b>GreekParse</b> (for modules with Greek morphology expansions)<br/>
 
<b>HebrewParse</b> (for modules with Hebrew morphology expansions)<br/>
 
<b>DailyDevotion</b> (for daily devotionals using one of the LD drivers and keyed with MM.DD)<br/>
 
<b>Glossary</b> (for collections of glosses using one of the LD drivers)<br/>
 
<b>Images</b> (for modules that contain images of any type)
 
</td>
 
<td>&nbsp;</td>
 
<td>Repeats</td>
 
  </tr>
 
  <tr>
 
<td>GlossaryFrom</td>
 
<td>&lt;xml:lang identifier&gt;<br/>
 
Glossaries map one language to another. This value indicates the language being translated from.
 
See Lang below for a discussion of valid values.</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>GlossaryTo</td>
 
<td>&lt;xml:lang identifier&gt;<br/>
 
Glossaries map one language to another. This value indicates the language being translated to.
 
See Lang below for a discussion of valid values.</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<th colspan="4">General informatic and installer elements</th>
 
  </tr>
 
  <tr>
 
<td>About</td>
 
<td>&lt;string&gt;<br/>
 
A lengthier description, which may include copyright, source, etc. information, possibly duplicating information in other elements.</td>
 
<td>&nbsp;</td>
 
<td>Continuation<br/>RTF</td>
 
  </tr>
 
  <tr>
 
<td>Version</td>
 
<td>&lt;version string&gt;<br/>
 
Gives the module's revision number. Incrementing it when changes are made alerts users of the SWORD Installers to the presence of updated modules. Please start with version 1.0 and increment by 0.1 for minor updates and by larger values for more major updates such as a new text source. Changes to this conf file should also increment the version number. Do not use non-numbers, such as 1.4a or 1.1.3.</td>
 
<td>1.0</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>History_x.x</td>
 
<td>&lt;string&gt;<br/>
 
Used to alert users to what has changed between different versions. Each time a version is incremented a history line with that version number should explain the change.</td>
 
<td>&nbsp;</td>
 
<td>Repeats</td>
 
  </tr>
 
  <tr>
 
<td>MinimumVersion</td>
 
<td>&lt;version string&gt;<br/>
 
Identifies the minimum version of the Sword library required for this module.</td>
 
<td>1.5.1a</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>Category</td>
 
<td>
 
<b>Daily Devotional</b> (for modules with Feature=DailyDevotion)<br/>
 
<b>Glossaries</b> (for modules with Feature=Glossary)<br/>
 
<b>Cults / Unorthodox / Questionable Material</b><br/>
 
<b>Essays</b> (for essays)<br/>
 
<b>Maps</b> (for modules that primarily consist of maps)<br/>
 
<b>Images</b> (for modules that primarily consist of images)<br/>
 
<b>Biblical Texts</b> (for Bibles)<br/>
 
<b>Commentaries</b><br/>
 
<b>Lexicons / Dictionaries</b><br/>
 
<b>Generic Books</b> (for anything else....)<br/>
 
This is used by installers to further categorize the modules beyond what can be figured out by the ModDrv and Feature.
 
</td>
 
<td>Biblical Texts (for /(Raw|z)Text4?/)<br/>
 
Commentaries (for /(Raw|z|HREF)Com4?/)<br/>
 
Lexicons / Dictionaries (for /(Raw|z)LD4?/)<br/>
 
Generic Books (for RawGenBook)</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>LCSH</td>
 
<td>&lt;tree/string&gt;<br/>
 
Library of Congress Subject Heading. You may search the [http://catalog.loc.gov Library of Congress catalog] or use it as a guide for determining an appropriate LCSH for books that are not in the Library of Congress.</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>Lang</td>
 
<td>&lt;xml:lang identifier&gt;<br/>
 
This is the primary language code of the module and should include a value according to RFC 4646 using ISO639 codes when possible. ISO 639-1 codes are the preferred code (e.g. en for English). If there is none for the given language, use an ISO 639-2/T code (e.g. ceb for Cebuano). Failing that, use ISO 639-3, which covers over 7000 languages. See: http://www.sil.org/iso639-3/codes.asp for ISO 639-1, 639-2/T and 639-3 codes.<br/>
 
 
 
If a text is country specific, such as the Anglicized NIV, include the [http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html ISO 3166-1 country code] after the language code and an underscore (e.g. en_GB for UK English).</td>
 
<td>en</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>InstallSize</td>
 
<td>&lt;integer (indicating bytes)&gt;</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>SwordVersionDate</td>
 
<td>&lt;ISO date string (yyyy-mm-dd)&gt;
 
Indicates the date that the module was changed.</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>Obsoletes</td>
 
<td>&lt;name of module&gt;<br/>
 
Each instance of this element names a module that is made obsolete by this module, usually indicated by the former name of the module.</td>
 
<td>&nbsp;</td>
 
<td>Repeats</td>
 
  </tr>
 
  <tr>
 
<th colspan="4">Copyright &amp; Licensing related elements</th>
 
  </tr>
 
  <tr>
 
<td>Copyright</td>
 
<td>&lt;string&gt;<br/>
 
Contains the copyright notice for the work, including the year of copyright and the owner of the copyright.</td>
 
<td>&nbsp;</td>
 
<td>Continuation</td>
 
  </tr>
 
  <tr>
 
<td>CopyrightHolder</td>
 
<td>&lt;string&gt;<br/>
 
Contains the name of the copyright holder.</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>CopyrightDate</td>
 
<td>&lt;integer (indicating year)&gt;</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>CopyrightNotes</td>
 
<td>&lt;string&gt;</td>
 
<td>&nbsp;</td>
 
<td>Continuation</td>
 
  </tr>
 
  <tr>
 
<td>CopyrightContactName</td>
 
<td>&lt;string&gt;<br/>
 
Contains the name of the copyright holder.</td>
 
<td>&nbsp;</td>
 
<td>Continuation</td>
 
  </tr>
 
  <tr>
 
<td>CopyrightContactNotes</td>
 
<td>&lt;string&gt;</td>
 
<td>&nbsp;</td>
 
<td>Continuation</td>
 
  </tr>
 
  <tr>
 
<td>CopyrightContactAddress</td>
 
<td>&lt;string&gt;<br/>
 
Contains the mailing address of the copyright holder.</td>
 
<td>&nbsp;</td>
 
<td>Continuation</td>
 
  </tr>
 
  <tr>
 
<td>CopyrightContactEmail</td>
 
<td>&lt;string&gt;<br/>
 
Contains the email address of the copyright holder, preferably in the form:<br/> name at xyz dot com.</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>ShortPromo</td>
 
<td>&lt;string&gt;<br/>
 
A link to the home page for the module, perhaps with an encouragement to visit the site.</td>
 
<td>&nbsp;</td>
 
<td>HTML Link</td>
 
  </tr>
 
  <tr>
 
<td>ShortCopyright</td>
 
<td>&lt;string&gt;</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>DistributionLicense</td>
 
<td>
 
<b>Public Domain</b><br/>
 
<b>Copyrighted</b><br/>
 
<b>Copyrighted; Permission to distribute granted to CrossWire</b><br/>
 
<b>Copyrighted; Free non-commercial distribution</b><br/>
 
<b>Copyrighted; Freely distributable</b><br/>
 
<b>Copyrighted; Permission granted to distribute non-commercially in Sword format</b><br/>
 
<b>[http://www.gnu.org/copyleft/fdl.html GFDL]</b><br/>
 
<b>[http://www.gnu.org/copyleft/gpl.html GPL]</b><br/>
 
<b>Creative Commons: [http://creativecommons.org/licenses/by-nc-nd by-nc-nd]</b><br/>
 
<b>Creative Commons: [http://creativecommons.org/licenses/by-nc-sa by-nc-sa]</b><br/>
 
<b>Creative Commons: [http://creativecommons.org/licenses/by-nc by-nc]</b><br/>
 
<b>Creative Commons: [http://creativecommons.org/licenses/by-nd by-nd]</b><br/>
 
<b>Creative Commons: [http://creativecommons.org/licenses/by-sa by-sa]</b><br/>
 
<b>Creative Commons: [http://creativecommons.org/licenses/by by]</b><br/>
 
<br/>
 
Use one of these strings verbatim. The actual copyright and/or license information is held in other elements. The last six licenses are [http://creativecommons.org/ Creative Commons] licenses.</td>
 
<td>&nbsp;</td>
 
<td>&nbsp;</td>
 
  </tr>
 
  <tr>
 
<td>DistributionNotes</td>
 
<td>&lt;string&gt;<br/>
 
Indicates any additional notes about distribution of the module.</td>
 
<td>&nbsp;</td>
 
<td>Continuation</td>
 
  </tr>
 
  <tr>
 
<td>TextSource</td>
 
<td>&lt;string, probably a URL&gt;<br/>
 
Indicates the source of the text, either in prose (such as "CCEL") or as a URL.</td>
 
<td>&nbsp;</td>
 
<td>Continuation</td>
 
  </tr>
 
</table>
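To show how the required elements from the table fit together, here is a minimal Python sketch that checks a module name, derives the conf file name and writes out a small conf. The helper names <code>conf_filename</code> and <code>render_conf</code> are hypothetical, the Description value is a placeholder, and the chosen directives are only one plausible combination for a compressed OSIS Bible, not a recommended template.

```python
import re

# [name] may only contain A-Z, a-z, 0-9 and _ (see the [name] row of the table).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def conf_filename(module_name):
    """Return the conf file name: the lowercased module name plus .conf."""
    if not NAME_PATTERN.fullmatch(module_name):
        raise ValueError("module name may only contain A-Z, a-z, 0-9 and _")
    return module_name.lower() + ".conf"


def render_conf(module_name, entries):
    """Render [name] on the first line followed by key=value directives."""
    lines = ["[{}]".format(module_name)]
    for key, value in entries:
        lines.append("{}={}".format(key, value))
    return "\n".join(lines) + "\n"


example = render_conf("MyModule", [
    ("DataPath", "./modules/texts/ztext/mymodule/"),  # directory, so it ends with '/'
    ("ModDrv", "zText"),
    ("CompressType", "ZIP"),
    ("BlockType", "BOOK"),
    ("SourceType", "OSIS"),
    ("Encoding", "UTF-8"),
    ("Lang", "en"),
    ("Version", "1.0"),
    ("Description", "Example module title"),  # placeholder text
])

print(conf_filename("MyModule"))
print(example)
```

The entries are passed as a list of pairs rather than a dictionary so that repeatable keys such as GlobalOptionFilter could be listed more than once and in a fixed order.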
 
 
 
=Creating a module=
 
The SWORD Project currently requires that all submitted texts be Unicode (specifically UTF-8) encoded documents. We recommend that texts be marked up in OSIS or TEI, but will still accept texts based on CCEL documents that are marked up in ThML.
 
 
 
==Preparing a Text for Import==
 
 
===Encoding===
 
As mentioned above in the conf's Encoding directive, SWORD modules can be encoded either in [http://en.wikipedia.org/wiki/Windows-1252 Windows Codepage 1252 (cp1252)] (a superset of ISO 8859-1) or in UTF-8. See [[Encoding]] for a complete explanation and definition.
 
 
 
For English language texts that only make use of ASCII characters, no change to the source encoding will be required. For other European languages and most other languages, simple converters from ISO and national standard encodings to UTF-8 probably exist. For more complex source encodings, you may need to create your own converter or adapt an existing one. Some currently available conversion tools that you may find useful, depending on your platform and needs, include:
 
 
 
uconv (part of ICU), available compiled for Win32 at http://crosswire.org/ftpmirror/pub/sword/utils/win32/uconv.zip or in source format from ICU at http://oss.software.ibm.com/icu/.
 
font2uni from CCEL, available at http://www.ccel.org/info/gkheb/.
 
 
 
uconv is best suited for standard encodings and font2uni is best suited for font-specific encodings. When creating XML texts, the only entities that should be used are &amp;amp; for '&amp;' and &amp;lt; for '&lt;'. All other entities should be encoded as their UTF-8 equivalents.
 
 
 
===Markup===
 
 
 
(''see also [[Various Tools]]'')
 
 
 
Internally, SWORD can process text in one of four formats: OSIS, TEI, ThML, and GBF. From these formats, it can convert to other formats, including RTF and HTML, for display. OSIS 2.1 is now the preferred format for Bibles and commentaries. At the moment OSIS does not have thorough support for complex dictionaries. For that reason we support TEI for dictionaries.
 
  
 
You may find documentation for each of these standards at their respective websites:<br/>
 
 
*Open Scriptural Information Standard (OSIS) : http://www.bibletechnologies.net/<br/>
 
 
*Text Encoding Initiative (TEI) : http://www.tei-c.org/P4X/DI.html<br/>
 
*Theological Markup Format (ThML) : http://www.ccel.org/ThML/<br/>
 
*General Bible Format (GBF) : http://www.ebible.org/bible/gbf.htm<br/>
 
  
In SWORD, for modules encoded with ThML and OSIS, each verse, dictionary entry, and book division needs to be well-formed XML or it will result in display problems in some frontends. SWORD only handles the subset of the ThML tags that we have found necessary, but we are willing to support additional tags as the need arises.
+
In SWORD, for modules encoded in OSIS, each verse, dictionary entry, and book division needs to be well-formed XML or it will result in display problems in some front-ends.
  
Use of ThML for Sword is deprecated. Supported ThML tags include: &lt;sync&gt; (with type parameters of Strongs, morph, & lemma), &lt;scripRef&gt;, and &lt;note&gt; (plus closing tags where appropriate). HTML tags that ThML inherits, which may be used in SWORD modules, include &lt;div&gt; (with types of sechead for section headings and title for titles), &lt;i&gt;, &lt;br&gt;, and &lt;b&gt;. Additional HTML tags may be interpreted by those SWORD frontends that render HTML, but will not be translated to RTF for the Win32 frontend. Do not submit untidy HTML and label it ThML--it's rude and lazy.
+
===Import formats===
  
GBF is deprecated and no GBF modules will be accepted by the SWORD Project. Supported GBF tags include: &lt;WG&gt;, &lt;WH&gt;, &lt;WTG&gt;, &lt;WTH&gt;, &lt;RX&gt;, &lt;RF&gt;, &lt;FI&gt;, &lt;FB&gt;, &lt;FN&gt;, &lt;FR&gt;, &lt;FS&gt;, &lt;FU&gt;, &lt;FO&gt;, &lt;FV&gt;, &lt;CA&gt;, &lt;CL&gt;, &lt;CG&gt;, &lt;CM&gt;, &lt;CT&gt;, &lt;JR&gt;, &lt;JC&gt;, &lt;JL&gt;, &lt;TT&gt;, and &lt;TS&gt; (plus closing tags where appropriate). In addition, SWORD allows full use of UTF-8 rather than merely ASCII as the GBF standard specifies.
+
====OSIS Formatted General Books====
 
+
With OSIS formatted general books, provided your document is well-formed and valid XML according to the OSIS 2.1 Schema, you should not need to do any further processing. You can use your XML file with xml2gbs. For OSIS encoded Bibles use [[osis2mod]].
===Import formats===
 
====ThML and OSIS Formatted General Books====
 
With ThML and OSIS formatted general books, provided your document is well-formed and valid XML according to the ThML DTD or the OSIS 2.1 Schema, you should not need to do any further processing. You can use your XML file with thml2gbs and xml2gbs. For OSIS encoded Bibles use [[osis2mod]].
 
  
====vpl Format====
+
====VPL Format====
vpl or verse-per-line format may only be used in creating Bibles. This format requires that each line start with a verse reference that SWORD can understand, such as "Genesis 1:1" or "Jn 3:16". Most English abbreviations are acceptable. Following the verse reference, the verse itself should be written, in any kind of markup. For example:
+
[[File Formats#VPL|VPL]] or verse-per-line format may only be used in creating Bibles. This format requires that each line start with a verse reference that SWORD can understand, such as "Genesis 1:1" or "Jn 3:16". Most English abbreviations are acceptable. Following the verse reference, the verse itself should be written, in any kind of markup. For example:
  
 
  Genesis 1:1 In the beginning God created the heaven and the earth.
 
  Genesis 1:1 In the beginning God created the heaven and the earth.
Line 519: Line 83:
 
This format is used with the utility vpl2mod, discussed below. To import Bibles that have combined verses, you will need to use imp format instead of vpl.
 
This format is used with the utility vpl2mod, discussed below. To import Bibles that have combined verses, you will need to use imp format instead of vpl.
  
====imp Format====
+
'''For CrossWire import purposes VPL is acceptable for text only Bibles without any further markup'''
The imp or import format is the most versatile of the import formats and may be used in creating all types of modules (Bibles, commentaries, dictionaries, daily devotionals, glossaries, general books, etc.) in any supported format (GBF, ThML, OSIS or TEI). Each entry in an imp file may take as many lines as are needed. The first line of the entry will have a format such as "$$$&lt;key&gt;" and will be followed by all lines of text that should be included with that entry. So our above example in imp format would be written as:
+
 
$$$Genesis 1:1
+
===Imp Format ===
In the beginning God created the heaven and the earth.
 
$$$Genesis 1:2
 
And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.
 
  
Commentaries would follow the same format, but would probably include a greater number of lines of text. If your Bible or commentary uses a single entry to handle multiple verses, simply give a list or range of verses as the key (e.g. "$$$Genesis 1:1-5", "$$$Exodus 1", "$$$Leviticus 1:1,5"). Lexicons, dictionaries, glossaries and daily devotionals would take a form such as:
+
GenBook and LexDict modules may also be submitted in [[DevTools:Imp Format|Imp Format]].
$$$Adam
 
Adam was the first man created by God.
 
$$$Eve
 
Eve was the first woman created by God.
 
  
For daily devotionals, you must encode the key as "$$$mm.dd", such as "$$$01.01" for January 1st and "$$$12.31" for December 31st.
+
==Validate the Source Text==
  
General books are encoded with each book division as a separate entry. The entries are then listed as a tree hierarchy with keys similar to a file system directory structure. For example, if you were encoding the Josephus' Works, you might have a structure like this:
+
If your source text is either OSIS or TEI, you should definitely check it with a suitable XML text editor before proceeding to the next step.
$$$/War
+
# Check that it has valid XML syntax
The War of the Jews
+
# Validate the XML contents against the specified XML schema
$$$/War/Book 1
 
Book 1 of the War of the Jews
 
$$$/War/Book 1/Chapter 1
 
Chapter 1 of Book 1 of the War of the Jews
 
$$$/War/Book 1/Chapter 1/Section 1
 
Section 1 of Chapter 1 of Book 1 of the War of the Jews
 
$$$/War/Book 1/Chapter 1/Section 2
 
Section 2 of Chapter 1 of Book 1 of the War of the Jews
 
  
==Importing==
+
==Import the Source Text==
 
Now that your text is ready to be imported, you will need to use one of the command line utilities for converting documents to SWORD format. Depending on the format of your document at this point, you will need to use the appropriate importer.
 
Now that your text is ready to be imported, you will need to use one of the command line utilities for converting documents to SWORD format. Depending on the format of your document at this point, you will need to use the appropriate importer.
*If your text is a valid ThML document, use thml2gbs.
 
 
*If your text is a valid OSIS Bible, use [[osis2mod]].
 
*If your text is a valid OSIS Bible, use [[osis2mod]].
 
*If your text is a valid OSIS Commentary, use [[osis2mod]].
 
*If your text is a valid OSIS Commentary, use [[osis2mod]].
*If your text is a valid OSIS document, use xml2gbs.
+
*If your text is a valid OSIS document of some other type, use xml2gbs.
 +
*If your text is a valid TEI P5 dictionary, use [[TEI Dictionaries|tei2mod]].
 
*If your text is a vpl format Bible, use vpl2mod.
 
*If your text is a vpl format Bible, use vpl2mod.
 
*If your text is an imp format Bible or commentary, use imp2vs.
 
*If your text is an imp format Bible or commentary, use imp2vs.
Line 558: Line 107:
  
 
You may find these files in the SWORD Project source distribution or compiled for Win32 at http://crosswire.org/ftpmirror/pub/sword/utils/win32/. Each utility has brief usage information that can be viewed by running it once without any arguments.
 
You may find these files in the SWORD Project source distribution or compiled for Win32 at http://crosswire.org/ftpmirror/pub/sword/utils/win32/. Each utility has brief usage information that can be viewed by running it once without any arguments.
 
==Additional Utilities==
 
There are additional utilities that may be used on SWORD modules:
 
  
 
===Compressing Modules===
 
===Compressing Modules===
Line 569: Line 115:
 
You may wish to try different compression settings to find out which is best for your module. Typically, we use chapter compression for large commentaries, book compression for Bibles, and the Zip compression algorithm.
 
You may wish to try different compression settings to find out which is best for your module. Typically, we use chapter compression for large commentaries, book compression for Bibles, and the Zip compression algorithm.
 
   
 
   
===Locking Modules===
+
===[[Copyright#Locked_Modules|Locking Modules]]===
 
To lock a rawText Bible or rawCom commentary module, use the cipherraw utility. Just run:
 
To lock a rawText Bible or rawCom commentary module, use the cipherraw utility. Just run:
 
  cipherraw &lt;/path/to/module&gt; '&lt;key&gt;'
 
  cipherraw &lt;/path/to/module&gt; '&lt;key&gt;'
 +
 +
===Miscellaneous tools===
 +
Further miscellaneous tools that are 'not ready for public consumption' but may be useful to module authors are found in [[DevTools:Misc]]. These include scripts and programs that are used for the preparation and conversion of various specific modules.
 +
 +
===Debugging modules===
 +
 +
If you are not the module creator and thus do not have a copy of the source text file, some of the SWORD utilities can help in debugging a module. It is especially useful to note that mod2imp followed by [imp2vs | imp2ld | imp2gbs] (depending on module type) often produces a lossless round-trip, no matter which markup variety was used to make the original module. Editing the IMP file between these two steps can therefore be used to test sensible conjectures about the apparent cause of a bug in a particular module.
 +
 +
:'''N.B.''' mod2imp followed by imp2(vs|ld|gbs) is not guaranteed to round-trip losslessly. mod2imp will give an accurate record of the contents of each content entry in the module, but skips past link entries. So for investigating entry contents in order to track down bugs, this process is fine, but CrossWire would never release content that had gone through this process.
 +
 +
If you have created a 'plain text' Bible module using either imp2vs or vpl2mod, it can be worthwhile to see what the module outputs with mod2osis. If the resulting osis file fails to pass the XML syntax check when viewed with a suitable XML editor, this may point to unexpected residual items in the source text. The proper solution then would be to raise the matter with the provider of the source text.
 +
 +
If you have created a devotional, imp2ld appears to have a bug that prevents it from creating the module correctly unless you use the trailing argument 2. 
 +
For example:
 +
imp2ld MyLD.txt MyLD 2
 +
 +
==Create the .conf File==
 +
Before you can test or submit a new module, you need to create a .conf file, which tells Sword how to recognize your module and what to do with it. Instructions for creating a .conf file are on the [[DevTools:conf Files]] page.
 +
 +
==Install and Test the Module==
 +
===Install the Module===
 +
Once you have imported the source document as a binary, several files will result. The number of files depends on the type of module, but you should make sure all of them are in the same folder. That folder will go in the modules folder, and the .conf file should go in the mods.d folder wherever your SWORD modules are installed. Open a front-end and see that it is recognizing the presence of the module. If it appears then you have installed it in the right place. Check to make sure that the content appears, and you are ready to start testing.
  
 
===Checking for Missing Verses===
 
===Checking for Missing Verses===
Line 578: Line 146:
 
on an installed module to generate a list.
 
on an installed module to generate a list.
  
===Miscellaneous tools===
+
==Submit the Module to CrossWire for Distribution==
Further miscellaneous tools that are 'not ready for public consumption' but may be useful to module authors are found in [[DevTools:Misc]]. These include scripts and programs that are used for the preparation and conversion of various specific modules.
 
  
==Submitting content to the SWORD Project==
+
First ensure that your module complies with our stated [[copyright]] policy.
After you have tested your module, you may wish to submit it to the SWORD Project for public release so that other people can benefit from your work. All modules submitted to the SWORD Project for distribution either on the internet or on CDs should include both the module as a single document and the .conf file.
 
  
The module itself should be an uncompiled, plain text document in either vpl (verse-per-line), imp (import), ThML, OSIS or TEI format, ready to be run through one of the import tools.
+
If your module is to be sold rather than distributed for free, CrossWire has a tool to encrypt a module. However, we do not handle any payments, so we suggest that the owner (or an authorised agent) host both the payment system and the delivery of the unlock key.
  
Before any module will be considered for posting, we expect that the following minimum set of tags be included in its .conf file: DataPath, ModDrv, Lang, Description, About, DistributionLicense, and TextSource. We also strongly prefer that an LCSH line be included with the .conf file, but will look the LCSH up ourselves if you have trouble deciding on a value. (You can look at other .conf files for examples.)
+
After you have tested your module, you may wish to submit it to the SWORD Project for public release so that other people can benefit from your work. The submission itself should be an uncompiled, plain text document in either VPL (verse-per-line), IMP (import), OSIS or TEI [[File Formats|format]], ready to be run through one of the module build tools. '''Do not''' submit built modules that you have imported to Sword format; submit only source documents. You also need to supply the relevant conf file entries. Before any module will be considered for hosting, we require that the following minimum set of module configuration fields be included in its .conf file: '''Description''', '''About''', '''DistributionLicense''', and '''TextSource'''. For further detail please read [[Module Submission]].
  
When you feel your module is ready to be submitted, you may email it to modules@crosswire.org. If you are unable to email it or would prefer to send the files by some other means, you may contact us at the same email address, and we can discuss other arrangements.
+
When you decide that your module is ready to be submitted, you may email it to modules@crosswire.org. If you are unable to email it or would prefer to send the files by some other means, you may contact us at the same email address, and we can discuss other arrangements.
  
 
=Related Pages=
 
=Related Pages=
===A Basic [[OSIS Tutorial]]===
+
*A Basic [[OSIS Tutorial]]
===Guide to Writing [[OSIS Bibles]]===
+
*Guide to Writing [[OSIS Bibles]]
===[[DevTools:OSISBookNames|OSIS Book Name Abbreviations]]===
+
*[[OSIS Book Abbreviations|OSIS Book Name Abbreviations]]
===Guide to Writing [[OSIS Commentaries]]===
+
*Guide to Writing [[OSIS Commentaries]]
===Guide to Writing [[TEI Dictionaries]]===
+
*Guide to Writing [[TEI Dictionaries]]
===Guide to [[Converting SFM Bibles to OSIS]]===
+
*Guide to [[Converting SFM Bibles to OSIS]]
===Guide to Writing [[ThML modules]]===
+
*[[Encoding| Text Encoding]]
===[[GenBook and OpenOffice]]===
+
*Related [[File Formats]]
===Definition of [[Encoding]]===
+
*List of [[Official and Affiliated Module Repositories]]
===List of Known [[Module Repositories]]===
+
*[[Zipped modules]]
===[[Module Requests]]===
+
*[[Module Requests]]
===[[New Modules]]===
+
*[[Copyright]]
 +
 
 +
[[Category:Development tools|Modules]]
 +
[[Category:Modules]]

Latest revision as of 20:31, 4 June 2020

Module Development Overview

If you want to learn how to create a SWORD module, this is the place to start. Here is a brief overview of the process:

  1. Collect and install the necessary software tools.
  2. Obtain the source text and permission from the copyright holder if you wish to distribute a copyrighted module.
  3. Prepare the source text for import.
  4. Use an XML validator to check that your source file is properly constructed.
  5. Import the source text using the appropriate tool.
  6. Create a .conf file.
  7. Install and test that the module displays correctly in several of the SWORD front-end applications.
  8. Submit your module to CrossWire for distribution.

Creating a module

Collect and Install Software Tools

  • Install SWORD and collect SWORD module creation tools. For Linux (and Mac?) many of the module creation tools come installed with SWORD, so if you aren't comfortable installing from source simply install from your distribution's repositories. For Windows, you will need to download the module creation utilities from http://www.crosswire.org/ftpmirror/pub/sword/utils/win32/.[1] Be sure to download the most recent icudt dll and unzip it in the folder where you place the utilities.
  • Xiphos (for Windows) users should find a complete set of Sword tools in the Xiphos directory.
  • Obtain a good programmer's text editor, preferably one that performs syntax and validation checks. See DevTools:Text Editors for examples.
  1. Unofficial builds that may be from more recent SVN revisions may be found here.

Obtain Source Text and Permission to Distribute

  • The easiest texts to work with, especially for learning how to make a module, are texts in the public domain. For example, you might try downloading a text from CCEL first to get you started on the process of moving from a prepared text to a compiled module. Be sure to verify that the source text you obtain is indeed in the public domain.
  • Some people provide texts that are freely distributable under some sort of license (GNU GPL, Creative Commons, etc.) or under no formal license. Be sure to document where the source grants that permission, and check that the grantor has the right to grant it, so that you can produce this evidence if you want to distribute the module through CrossWire.
  • For copyrighted material, you will need to contact the publisher or author to obtain permission. First check to see that someone else in CrossWire hasn't already pursued permission for that work. A list of requests and attempts to obtain rights on behalf of CrossWire can be found at Module Requests.

Prepare the Source Text for Import

Versification

For Bible modules, SWORD supports a growing number of Versifications. A prerequisite for submission of a module that would not match the default versification for the KJV Bible is to decide which of the several versifications is most suitable for the new module.

Encoding

Note that the SWORD Project requires all submitted texts to be Unicode (UTF-8) encoded documents.

Legacy texts might need conversion to Unicode.
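
As a minimal sketch of such a conversion, the standard iconv utility can re-encode a legacy file as UTF-8. The helper name, the file names, and the source encoding (ISO-8859-1) below are assumptions for illustration only; you need to know the actual encoding of your own source text.

```shell
#!/bin/sh
# to_utf8 FROM_ENCODING INFILE OUTFILE
# Hypothetical helper: re-encode INFILE from FROM_ENCODING to UTF-8.
to_utf8() {
    iconv -f "$1" -t UTF-8 "$2" > "$3"
}

# Example (assumed encoding and file names):
# to_utf8 ISO-8859-1 legacy.txt source-utf8.txt
```

It is worth opening the converted file in a Unicode-aware editor afterwards, since a wrongly guessed source encoding will convert without error but produce the wrong letters.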

Character sets

A common problem in texts created by people less aware of Unicode principles is the presence of characters that look right but belong to a different script than the language of the text. For example, a Cyrillic a and a Latin a look identical but use different code points. It is important to clean this up prior to import, as our module search depends on clean texts with consistent use of letters. Below is a short shell script that produces a list of all characters employed in a text, with a count for each.

 cat *txt |\
 uni2ascii -paU |\
 sed  -e "s/u/\nu/g" -e '/\./d' |\
 sort |\
 uniq -c -w5 |\
 sed -e 's/\\//' -e 's/\W+/\t/g' -e 's/\s*\([0-9][0-9]*\).*u\([0-9A-F]*\)/Character \\x\2 (u\2) was used \1 times/' |\
 ascii2uni -aB |\
 grep 'used' > charactercount.txt

Images

Images can be included in any type of module. The specifics of how to do this depend on the markup used.

Some SWORD applications are able to use virtually any image format, but SWORD only officially supports JPEG and PNG image files.

Markup

(see also DevTools:Misc)

We will accept only plain texts or texts marked up in OSIS or TEI, with the sole exception of texts based on CCEL documents that are marked up in ThML.

Internally, SWORD can process text in OSIS, TEI and two legacy formats (ThML and GBF). From these formats, it can convert to other formats, including RTF and HTML, for display. OSIS 2.1 is now the preferred format for Bibles and commentaries. At the moment OSIS does not have thorough support for complex dictionaries. For that reason we support TEI for dictionaries.

You may find documentation for each of these standards at their respective websites:

In SWORD, for modules encoded in OSIS, each verse, dictionary entry, and book division needs to be well-formed XML or it will result in display problems in some front-ends.

Import formats

OSIS Formatted General Books

With OSIS formatted general books, provided your document is well-formed and valid XML according to the OSIS 2.1 Schema, you should not need to do any further processing. You can use your XML file with xml2gbs. For OSIS encoded Bibles use osis2mod.

VPL Format

VPL or verse-per-line format may only be used in creating Bibles. This format requires that each line start with a verse reference that SWORD can understand, such as "Genesis 1:1" or "Jn 3:16". Most English abbreviations are acceptable. Following the verse reference, the verse itself should be written, in any kind of markup. For example:

Genesis 1:1 In the beginning God created the heaven and the earth.
Genesis 1:2 And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.

This format is used with the utility vpl2mod, discussed below. To import Bibles that have combined verses, you will need to use imp format instead of vpl.

For CrossWire import purposes, VPL is acceptable for text-only Bibles without any further markup.
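
If a VPL text turns out to need combined verses, one way to get started is to convert it to imp format, where each entry begins with a line of the form $$$<key> followed by the entry text. The following is only a sketch: the function name is hypothetical, and it assumes that every line begins with a reference ending in chapter:verse and that the book name itself contains no colon.

```shell
#!/bin/sh
# vpl_to_imp [FILE...]
# Sketch: turn "Reference c:v text" lines into IMP entries ("$$$key" + text).
# Lines that do not start with a chapter:verse reference are skipped.
vpl_to_imp() {
    awk '
    match($0, /^[^:]*[0-9]+:[0-9]+/) {
        key  = substr($0, 1, RLENGTH)       # e.g. "Genesis 1:1"
        text = substr($0, RLENGTH + 1)      # everything after the reference
        sub(/^[ \t]+/, "", text)            # drop the separator whitespace
        printf "$$$%s\n%s\n", key, text
    }' "$@"
}

# Example (assumed file names):
# vpl_to_imp mybible.vpl > mybible.imp
```

In the resulting file, a combined verse can then be expressed by editing the key by hand, for example to "$$$Genesis 1:1-5".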

Imp Format

GenBook and LexDict modules may also be submitted in Imp Format.

Validate the Source Text

If your source text is either OSIS or TEI, you should definitely check it with a suitable XML text editor before proceeding to the next step.

  1. Check that it has valid XML syntax
  2. Validate the XML contents against the specified XML schema
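
As one possible alternative to an XML editor, the two checks can be run from the command line with xmllint, which ships with libxml2. This is a sketch under assumptions: the helper name and the file names are hypothetical, and the schema argument presumes you have saved a local copy of the XSD schema that your document declares.

```shell
#!/bin/sh
# check_xml FILE [SCHEMA.xsd]
# Step 1: well-formedness check.
# Step 2 (only if a schema is given): validation against that XSD schema.
check_xml() {
    xmllint --noout "$1" || return 1
    if [ -n "${2:-}" ]; then
        xmllint --noout --schema "$2" "$1" || return 1
    fi
}

# Example (assumed file names):
# check_xml mybible.osis.xml osisCore.2.1.1.xsd
```

The function returns a non-zero status as soon as either check fails, so it can be used as a gate in a build script before the import step.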

Import the Source Text

Now that your text is ready to be imported, you will need to use one of the command line utilities for converting documents to SWORD format. Depending on the format of your document at this point, you will need to use the appropriate importer.

  • If your text is a valid OSIS Bible, use osis2mod.
  • If your text is a valid OSIS Commentary, use osis2mod.
  • If your text is a valid OSIS document of some other type, use xml2gbs.
  • If your text is a valid TEI P5 dictionary, use tei2mod.
  • If your text is a vpl format Bible, use vpl2mod.
  • If your text is an imp format Bible or commentary, use imp2vs.
  • If your text is an imp format dictionary, lexicon, glossary, or daily devotional, use imp2ld.
  • If your text is an imp format general book, use imp2gbs.
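
The list above can be condensed into a small lookup, which may be handy in a build script that handles several modules. This is a sketch only: the function name and the format and type labels (osis, bible, genbook and so on) are hypothetical names chosen for the example, while the tool names are the ones listed above.

```shell
#!/bin/sh
# pick_importer FORMAT TYPE
# Prints the importer from the list above for a given source format and
# module type. FORMAT and TYPE labels are hypothetical.
pick_importer() {
    case "$1:$2" in
        osis:bible|osis:commentary)  echo osis2mod ;;
        osis:*)                      echo xml2gbs ;;
        tei:dictionary)              echo tei2mod ;;
        vpl:bible)                   echo vpl2mod ;;
        imp:bible|imp:commentary)    echo imp2vs ;;
        imp:dictionary|imp:lexicon|imp:glossary|imp:devotional)
                                     echo imp2ld ;;
        imp:genbook)                 echo imp2gbs ;;
        *) echo "no importer listed for $1 $2" >&2; return 1 ;;
    esac
}

# Example:
# pick_importer imp devotional
```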

You may find these files in the SWORD Project source distribution or compiled for Win32 at http://crosswire.org/ftpmirror/pub/sword/utils/win32/. Each utility has brief usage information that can be viewed by running it once without any arguments.

Compressing Modules

To compress a Bible, commentary, or LD module, use the mod2zmod utility. First you will need to install the module so that it can be accessed using the SWORD engine. Next, run:

mod2zmod <modname> <datapath> [blockType [compressType]]

blockType can be 4 = book (default), 3 = chapter, or 1 = verse and indicates the granularity of the compression blocks. The larger the block is, the longer it will take to access a piece of the text, but the smaller the resulting module will be. compressType can be either 1 = LZSS (default) or 2 = Zip.

You may wish to try different compression settings to find out which is best for your module. Typically, we use chapter compression for large commentaries, book compression for Bibles, and the Zip compression algorithm.
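
To illustrate the typical settings just described, the sketch below prints (but does not run) a mod2zmod command line following the usage shown above. The function name, the "bible" and "commentary" labels, and the module name and path in the example are all hypothetical.

```shell
#!/bin/sh
# zmod_command MODNAME DATAPATH KIND
# Prints a mod2zmod command using the typical settings described above.
# KIND is a hypothetical label: "bible" or "commentary".
zmod_command() {
    case "$3" in
        bible)      block=4 ;;   # book-sized compression blocks
        commentary) block=3 ;;   # chapter-sized compression blocks
        *) echo "unknown kind: $3" >&2; return 1 ;;
    esac
    echo "mod2zmod $1 $2 $block 2"   # trailing 2 selects Zip
}

# Example (assumed module name and output path):
# zmod_command MyBible ./out/mybible bible
```

Printing the command first makes it easy to review the arguments before running them against an installed module.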

Locking Modules

To lock a rawText Bible or rawCom commentary module, use the cipherraw utility. Just run:

cipherraw </path/to/module> '<key>'

Miscellaneous tools

Further miscellaneous tools that are 'not ready for public consumption' but may be useful to module authors are found in DevTools:Misc. These include scripts and programs that are used for the preparation and conversion of various specific modules.

Debugging modules

If you are not the module creator and thus do not have a copy of the source text file, some of the SWORD utilities can help in debugging a module. It is especially useful to note that mod2imp followed by [imp2vs | imp2ld | imp2gbs] (depending on module type) often produces a lossless round-trip, no matter which markup variety was used to make the original module. Editing the IMP file between these two steps can therefore be used to test sensible conjectures about the apparent cause of a bug in a particular module.

N.B. mod2imp followed by imp2(vs|ld|gbs) is not guaranteed to round-trip losslessly. mod2imp will give an accurate record of the contents of each content entry in the module, but skips past link entries. So for investigating entry contents in order to track down bugs, this process is fine, but CrossWire would never release content that had gone through this process.
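
When editing an IMP dump to test a conjecture, it can help to confirm that the edit did not accidentally add or drop entries. The sketch below compares the key lines of two IMP files; it relies on IMP entries beginning with a line of the form $$$<key>, and the function names and file names are hypothetical.

```shell
#!/bin/sh
# imp_keys FILE
# Prints the "$$$" key lines of an IMP file, in order.
imp_keys() {
    grep '^\$\$\$' "$1"
}

# same_keys A B
# Succeeds only if both IMP files list the same keys in the same order.
same_keys() {
    a=$(imp_keys "$1")
    b=$(imp_keys "$2")
    [ "$a" = "$b" ]
}

# Example (assumed file names):
# same_keys original.imp edited.imp || echo "entry list changed"
```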

If you have created a 'plain text' Bible module using either imp2vs or vpl2mod, it can be worthwhile to see what the module outputs with mod2osis. If the resulting OSIS file fails the XML syntax check when viewed with a suitable XML editor, this may point to unexpected residual items in the source text. The proper solution then would be to raise the matter with the provider of the source text.

If you have created a devotional, imp2ld appears to have a bug that prevents it from creating the module correctly unless you use the trailing argument 2. For example:

imp2ld MyLD.txt MyLD 2

Create the .conf File

Before you can test or submit a new module, you need to create a .conf file, which tells Sword how to recognize your module and what to do with it. Instructions for creating a .conf file are on the DevTools:conf Files page.

Install and Test the Module

Install the Module

Once you have imported the source document as a binary, several files will result. The number of files depends on the type of module, but you should make sure all of them are in the same folder. That folder will go in the modules folder, and the .conf file should go in the mods.d folder wherever your SWORD modules are installed. Open a front-end and see that it is recognizing the presence of the module. If it appears then you have installed it in the right place. Check to make sure that the content appears, and you are ready to start testing.
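
The copy step described above might be scripted as follows. This is a sketch, not the official install procedure: the function name and all three paths are hypothetical, and it assumes that the .conf file's data path points to the place where the module folder ends up.

```shell
#!/bin/sh
# install_module MODULE_DIR CONF_FILE SWORD_ROOT
# Sketch: copy the module folder into SWORD_ROOT/modules and the .conf
# file into SWORD_ROOT/mods.d. All three arguments are hypothetical paths.
install_module() {
    mkdir -p "$3/modules" "$3/mods.d" || return 1
    cp -R "$1" "$3/modules/" || return 1
    cp "$2" "$3/mods.d/"
}

# Example (assumed paths):
# install_module ./build/mybible ./build/mybible.conf "$HOME/.sword"
```

If the front-end does not list the module afterwards, the first thing to check is whether the path recorded in the .conf file matches the folder that was actually created.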

Checking for Missing Verses

You can use the utility emptyvss to find verses in a module that contain no text, since this may indicate errors in the module. Just run:

emptyvss <module name>

on an installed module to generate a list.

Submit the Module to CrossWire for Distribution

First ensure that your module complies with our stated copyright policy.

If your module is to be sold rather than distributed for free, CrossWire has a tool to encrypt a module. However, we do not handle any payments, so we suggest that the owner (or an authorised agent) host both the payment system and the delivery of the unlock key.

After you have tested your module, you may wish to submit it to the SWORD Project for public release so that other people can benefit from your work. The submission itself should be an uncompiled, plain text document in either VPL (verse-per-line), IMP (import), OSIS or TEI format, ready to be run through one of the module build tools. Do not submit built modules that you have imported to Sword format; submit only source documents. You also need to supply the relevant conf file entries. Before any module will be considered for hosting, we require that the following minimum set of module configuration fields be included in its .conf file: Description, About, DistributionLicense, and TextSource. For further detail please read Module Submission.
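
Before sending anything, a quick check for the four required fields can save a round of email. The sketch below assumes that each field appears in the .conf file at the start of a line in the form Field=value; the function name and the file name in the example are hypothetical.

```shell
#!/bin/sh
# check_conf FILE
# Reports each of the four required fields that is missing from a .conf
# file and returns non-zero if any is absent.
check_conf() {
    status=0
    for field in Description About DistributionLicense TextSource; do
        if ! grep -q "^$field=" "$1"; then
            echo "missing: $field"
            status=1
        fi
    done
    return $status
}

# Example (assumed file name):
# check_conf mybible.conf
```

This only confirms that the fields are present; whether their values are acceptable is covered by the Module Submission page.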

When you decide that your module is ready to be submitted, you may email it to modules@crosswire.org. If you are unable to email it or would prefer to send the files by some other means, you may contact us at the same email address, and we can discuss other arrangements.

Related Pages