Difference between revisions of "DevTools:Modules"
m (added link to cp1252) |
|||
Line 210: | Line 210: | ||
<b>Latin-1</b> | <b>Latin-1</b> | ||
− | The preferred encoding of texts is UTF-8. Latin-1 is defined by cp1252 which is a superset of ISO8859-1. | + | The preferred encoding of texts is UTF-8. Latin-1 is defined by [http://en.wikipedia.org/wiki/Windows-1252 cp1252] which is a superset of ISO8859-1. |
This encoding indicates how the conf and the module are encoded. | This encoding indicates how the conf and the module are encoded. |
Revision as of 13:31, 10 September 2007
Module Repositories
List of known Module Repositories
Module Development
OSIS Book Name Abbreviations
Introduction
A SWORD module consists of a set of binary files in any of an increasing number of formats created for SWORD plus a .conf file that specifies the location and attributes of the module.
The .conf file is located in a standard location, such as the mods.d directory of the SWORD install directory, that may be specified to The SWORD Engine in a number of ways that are outside the scope of this document. This file should be created with a standard text editor like notepad, emacs, vi, or pico. Its contents are described below under ".conf File Layout".
The module files themselves usually require some pre-processing before they are ready to be imported to SWORD. How you go about this pre-processing is something you will need to decide for yourself. You may be able to do all of your pre-processing with simple search & replace operations in notepad or more complex regular expression search & replace operations in emacs, but the majority of modules will probably require even more complex editing using a scripting language such as Perl plus a fair amount of manual correction. On the other hand, some modules may come in a standard format such as ThML or OSIS encoded files, which do not require any modification, assuming they are valid documents. Document pre-processing is outside the scope of this document, but we will explain how you need to format documents to prepare them for import to SWORD, both in terms of encoding and markup.
Once you have a document ready for import, you will need to run it through an importer to create the SWORD module files, which will then be placed in the module directory you specify in your .conf file.
After this, you may test your work and consider submitting it to The SWORD Project for public distribution from our website.
.conf File Layout
The conf file tells the SWORD engine how to treat installed module files, which kind of markup they contain, and so forth.
Below is a listing of the possible directives in that file. Each of these directives is of the form key=value. Some keys can be repeated. Some can have values that span more than 1 line, with '\' at the end of a line indicating that the text on the next line continues the value. Some values allow RTF and some allow HTML <a href="xxx">label</a> hypertext links. HTML is not allowed otherwise.
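The key=value form, repeated keys, and '\' continuation described above can be sketched in Python as follows. This is only an illustration, not The SWORD Engine's actual parser: the function names are hypothetical, the sample module and its values are made up, and joining a continued line directly onto the previous one (without inserting a space or newline) is an assumption.

```python
def logical_lines(lines):
    """Join physical lines that end with a backslash into one logical line."""
    buf = ""
    for raw in lines:
        if raw.endswith("\\"):
            buf += raw[:-1]  # drop the backslash and keep accumulating
            continue
        yield buf + raw
        buf = ""
    if buf:  # a dangling continuation at the end of the file
        yield buf


def parse_conf(text):
    """Return (module_name, {key: [values, ...]}) for .conf-style text."""
    lines = text.splitlines()
    header = lines[0] if lines else ""
    # The [name] header must be on the first line and start that line.
    if not (header.startswith("[") and header.endswith("]")):
        raise ValueError("first line must be [name]")
    name = header[1:-1]

    entries = {}
    for line in logical_lines(lines[1:]):
        if not line.strip():
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {line!r}")
        # Some keys repeat, so every key maps to a list of values.
        entries.setdefault(key.strip(), []).append(value.strip())
    return name, entries


# Hypothetical sample; a raw string keeps the trailing backslash literal.
SAMPLE = r"""[MyModule]
Description=A hypothetical module
About=First part of a longer description, \
continued on the next line.
Obsoletes=OldNameA
Obsoletes=OldNameB
"""

name, entries = parse_conf(SAMPLE)
```

Storing every value in a list keeps repeated keys such as Obsoletes from overwriting one another; a caller that expects a single value can take the last element.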
Enumerated values are shown in bold. These should be used exactly as given and no other values should be used.
Element | Values (type or enumerated) | Default Value | Allows |
---|---|---|---|
Required Elements | |||
[name] | Each conf file begins with [name], replacing "name" with a short, well-known abbreviation. This must be on the first line, and must start that line. It can only contain A-Z, a-z and 0-9. The name of the file should be the lowercase of this name followed by .conf. For example, [MyModule] would be mymodule.conf. |
||
DataPath | <relative system path>
DataPath is the path to the module relative to the SWORD module root directory. This path should start with "./modules". If the DataPath indicates a directory it should end with a '/'. Otherwise the module name is both the directory and the prefix for each file in that directory. Typical paths for a module named [mymodule] are: But when it really comes down to it, a valid path could be: |
||
Description | <string> This is a short (1 line) title of the module. |
||
ModDrv |
RawText (for uncompressed Bibles) |
||
Elements required for proper module access | |||
CompressType | ZIP LZSS Used to indicate how a compressed module (zText, zCom, or zLD) is compressed. ZIP is the preferred format; as of this writing, no compressed modules use LZSS. |
LZSS | |
BlockType |
BOOK Used for zText and zCom to indicate how much of the work is compressed into a block. The trade-off is size for speed, with BOOK taking the least overall space and the longest time and VERSE taking the greatest overall space and the least time. While BlockType has a default, it is a best practice to specify it. Most Bibles use BOOK and larger Commentaries use CHAPTER. To date, no module uses VERSE. |
CHAPTER | |
BlockCount | <integer> Used for zLD to indicate the number of entries in a compressed block. Higher values will make the module slower, but smaller. |
200 | |
CipherKey | <string> (typically in a format matching the pattern: /[0-9]{4}[A-Za-z]{4}[0-9]{4}[A-Za-z]{4}/) Contains the unlock key for enciphered modules. Leave the value empty ("CipherKey=") to indicate that the module is enciphered but has no unlock key. (Omit for unlocked modules.) |
||
Elements required for proper rendering | |||
GlobalOptionFilter |
GBFStrongs (For GBF texts having Strong's Numbers) Each of these filters removes/hides the text's feature, when activated by the application. These filters are applied in the order that they are listed in the conf. |
Repeats | |
Direction |
LtoR Indicate whether the language's script is a left to right script or a right to left script. Languages such as Hebrew, Arabic, Urdu, Farsi have a RtoL script. When a module contains more than one direction, such as a Hebrew/English dictionary, set this value to the dominant direction. If the RtoL script is transliterated into a LtoR script, set the value to LtoR. |
LtoR | |
SourceType |
Plaintext These are various ways that the text can be encoded. The preferred encoding is OSIS. TEI is preferred for dictionaries until OSIS supports dictionaries. In SWORD, for modules encoded with ThML, OSIS or TEI, each verse, dictionary entry, and book division needs to be well-formed XML or it will result in display problems in some frontends. |
Plaintext | |
Encoding |
UTF-8 The preferred encoding of texts is UTF-8. Latin-1 is defined by cp1252 which is a superset of ISO8859-1. This encoding indicates how the conf and the module are encoded. |
Latin-1 | |
DisplayLevel | <integer> | 1 | |
Font | <string> Specify the font to be used for display of the module if it is available. Omit this line to use the default font. Do not make use of font-specific encodings in your documents, but use Unicode instead and the Private Use Area if necessary for codepoints that are not handled by Unicode. |
||
OSISqToTick | true/false When set to false indicates that OSIS quote elements without a marker attribute are not to produce a quotation mark. This is useful for languages (e.g. Thai) and texts (e.g. KJV) that do not have quotation marks. It is also useful for modules that mark the "Words of Christ" on a verse by verse basis, when the quote spans more than one verse. | true | |
Elements to indicate features | |||
Feature |
StrongsNumbers (for modules that include Strong's numbers) |
Repeats | |
GlossaryFrom | <xml:lang identifier> Glossaries map one language to another. This value indicates the language being translated from. See Lang below for a discussion of valid values. |
||
GlossaryTo | <xml:lang identifier> Glossaries map one language to another. This value indicates the language being translated to. See Lang below for a discussion of valid values. |
||
General informational and installer elements | |||
About | <string> A lengthier description that may include copyright, source, etc. information, possibly duplicating information in other elements. |
Continuation RTF |
|
Version | <version string> Gives the module's revision number. Incrementing it when changes are made alerts users of the SWORD Installers to the presence of updated modules. Please start with version 1.0 and increment by 0.1 for minor updates and by larger values for more major updates such as a new text source. Changes to this conf file should also increment the version number. Do not use non-numbers, such as 1.4a or 1.1.3. |
1.0 | |
History_x.x | <string> Used to alert users to what has changed between different versions. Each time a version is incremented a history line with that version number should explain the change. |
Repeats | |
MinimumVersion | <version string> Identifies the minimum version of the Sword library required for this module. |
1.5.1a | |
Category |
Daily Devotional (for modules with Feature=DailyDevotion) |
||
LCSH | <tree/string> Library of Congress Subject Heading. You may search the Library of Congress catalog or use it as a guide for determining an appropriate LCSH for books that are not in the Library of Congress. |
||
Lang | <xml:lang identifier> This is the primary language code of the module and should include a value according to RFC 4646 using ISO639 codes when possible. ISO 639-1 codes are the preferred code (e.g. en for English). If there is none for the given language, use an ISO 639-2/T code (e.g. ceb for Cebuano). Failing that, use ISO 639-3, which covers over 7000 languages. See: http://www.sil.org/iso639-3/codes.asp for ISO 639-1, 639-2/T and 639-3 codes. |
en | |
InstallSize | <integer (indicating bytes)> | ||
SwordVersionDate | <ISO date string (yyyy-mm-dd)> Indicates the date that the module was changed. | ||
Obsoletes | <name of module> Each instance of this element names a module that is made obsolete by this module, usually indicated by the former name of the module. |
Repeats | |
Copyright & Licensing related elements | |||
Copyright | <string> Contains the copyright notice for the work, including the year of copyright and the owner of the copyright. |
Continuation | |
CopyrightHolder | <string> Contains the name of the copyright holder. |
||
CopyrightDate | <integer (indicating year)> | ||
CopyrightNotes | <string> | Continuation RTF |
|
CopyrightContactName | <string> Contains the name of the copyright holder. |
Continuation RTF |
|
CopyrightContactNotes | <string> | Continuation RTF |
|
CopyrightContactAddress | <string> Contains the mailing address of the copyright holder. |
Continuation RTF |
|
CopyrightContactEmail | <string> Contains the email address of the copyright holder, preferably in the form: name at xyz dot com. |
||
ShortPromo | <string> | ||
ShortCopyright | <string> | ||
DistributionLicense |
Public Domain |
||
DistributionSource | <string> Indicates where the text may be found, such as a URL. |
Continuation | |
DistributionNotes | <string> Indicates any additional notes about distribution of the module. |
Continuation | |
TextSource | <string, probably a URL> Indicates the source of the text, either in prose (such as "CCEL") or as a URL. |
Continuation |
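A few of the rules stated in the table above lend themselves to simple mechanical checks. The Python sketch below covers three of them: deriving the .conf file name from [name], testing a CipherKey against the "typical" pattern, and rejecting Version strings such as 1.4a or 1.1.3. All function names are hypothetical. Two points are assumptions rather than documented behaviour: the CipherKey pattern is applied to the whole value, and a bare integer such as "2" is accepted as a Version.

```python
import re

# [name] may only contain A-Z, a-z and 0-9.
NAME_RE = re.compile(r"[A-Za-z0-9]+")
# The "typical" CipherKey pattern quoted in the table.
CIPHER_KEY_RE = re.compile(r"[0-9]{4}[A-Za-z]{4}[0-9]{4}[A-Za-z]{4}")
# Digits with at most one fractional part (assumption: "2" is acceptable).
VERSION_RE = re.compile(r"[0-9]+(\.[0-9]+)?")


def conf_filename(name):
    """Return the expected .conf file name for a module called `name`."""
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid module name: {name!r}")
    return name.lower() + ".conf"


def looks_like_typical_cipher_key(value):
    """True if `value` has the typical 4 digits / 4 letters / 4 / 4 shape."""
    return CIPHER_KEY_RE.fullmatch(value) is not None


def is_plain_version(value):
    """True for values like 1.0 or 2.3; False for 1.4a or 1.1.3."""
    return VERSION_RE.fullmatch(value) is not None
```

Because the table only says a CipherKey is "typically" in that format, the second check is best treated as a warning for module authors and not as a hard validation error.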
Things that should move elsewhere
Module Requests
Here is a place to request modules you would like to be made. If the copyright holder has been contacted, note here whether permission was granted.
Bible Versions
Permission has been granted and the module is available as "TurNTB" in beta Refdoc 18:04, 29 August 2007 (MDT)
Books
English
Devotionals
Lexicons
New Modules
List of modules that are being developed
General Book
English
Portuguese
Brazilian
SBB denied permissions, someone from Brasília intends to try again.
SBT has been contacted, anticipating answer.
IBB has left the door open to a future favourable answer. Need to follow up with request for permissions on Versão Revisada and its Almeida Século XXI successor.
Corrigida low-quality copies are available, we need to evaluate if it is worthwhile to move forward: old translation, not too good, OCR will be troublesome.
Tradução Brazileira copy obtained, working library contacts to find a book scanner.
For all these, please ask for further information at sword-devel.
Iberian
Permissions were obtained on a number of texts from SBP, as per post at sword-devel.