Difference between revisions of "DevTools:Locale Files"
Latest revision as of 20:59, 19 August 2019
A locale file is stored in the locales.d folder under the Sword path.
The file name is generally the language code with extension .conf
As Unicode text files, locale files should be encoded as UTF-8 (without a BOM), and the file name should include "-utf8" after the language code.
Other encodings are deprecated.
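The encoding rule above can be checked mechanically before a file is submitted. The following is a minimal Python sketch, not part of SWORD: the function name check_locale_bytes is hypothetical, and it only tests the two conditions stated above (valid UTF-8, no BOM).

```python
import codecs


def check_locale_bytes(data):
    """Return a list of encoding problems found in the raw bytes of a locale file.

    Hypothetical helper: it checks only for a leading UTF-8 BOM and for
    byte sequences that are not valid UTF-8.
    """
    problems = []
    if data.startswith(codecs.BOM_UTF8):
        problems.append("file starts with a UTF-8 BOM")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append(f"not valid UTF-8 at byte {exc.start}")
    return problems
```

To check a real file, read it in binary mode (for example with open(path, "rb")) and pass the bytes to the function; an empty list means neither problem was found.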
Example
Locales require a few things. Let's step through the German locale:
excerpts from /sword/locales.d/de.conf:
Meta
  [Meta]
  Name=de
  Description=German
  Encoding=UTF-8
The above information defines the locale, and the entries should be fairly self-explanatory. Name should be, in order of preference, an ISO 639-1 alpha-2 language code, an ISO 639-2+ alpha-3 language code, or an Ethnologue three-letter language code. SWORD includes a fairly comprehensive list of these in sword/locales.d/locales.conf. This entry, like all entries, is case sensitive.
Text
This section requires a "one-to-one" mapping for each string to be translated.
The following entries are translation strings for anything you might want. REQUIRED are the book names of the Bible, including deuterocanonical books if used. Other entries might translate option names, values, and tips, or any other text returned from the engine.
If you find any errors or omissions, please post a message that you found a constant string in the engine not being (properly) translated.
  [Text]
  Genesis=1. Mose
  Exodus=2. Mose
  Leviticus=3. Mose
  # <snipped rest of book names>
Observe that a full-stop is a permitted character in a book name.
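For quick experiments outside the engine, the [Meta] and [Text] sections shown above can be read with Python's standard configparser module. This is only a sketch: SWORD reads locale files with its own code, and load_locale is a hypothetical helper name. Two settings matter here: optionxform is replaced so that keys stay case sensitive, and "=" is the only delimiter so that a colon in a value is not misread.

```python
import configparser

SAMPLE = """\
[Meta]
Name=de
Description=German
Encoding=UTF-8

[Text]
Genesis=1. Mose
Exodus=2. Mose
Leviticus=3. Mose
"""


def load_locale(text):
    """Parse locale file text into {section: {key: value}} (hypothetical helper)."""
    parser = configparser.ConfigParser(
        delimiters=("=",),   # only "=" separates key and value
        interpolation=None,  # treat "%" literally
        strict=False,        # tolerate repeated keys; the last one wins
    )
    parser.optionxform = str  # keep keys case sensitive (default lowercases them)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


locale = load_locale(SAMPLE)
print(locale["Text"]["Genesis"])
```

The full-stop in "1. Mose" survives parsing unchanged, which matches the observation above.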
Book Abbrevs
This section specifies all possible alternative book names and abbreviations for each Bible book. These entries are required for the locale to work correctly within the SWORD engine, because they teach the verse reference parser every way a book name may be written. The parser allows partial matches: it sorts the entries in this section by their codepoint values and, when several entries match, prefers the one that sorts first. For example, "1 C" is a partial match for both 1 CORINTHIANS=1Cor and 1 CHRONICLES=1Chr, but 1 CHRONICLES is preferred because it sorts first. To make "1 C" prefer 1Cor instead, a third entry is required: 1 C=1Cor. This disambiguation is important to consider because ambiguous partial matches are common among Bible book names, e.g., JUD=Jude (instead of Judges) or JO=John (instead of Job, Jonah, or Joshua).
The format is:
  uppercase alternative book name or abbreviation=osisID
SWORD will sort this list, but it is preferred and standard practice to keep it in alphabetical order, which helps authors define and tune these preferences, e.g.,
  [Book Abbrevs]
  1 C=1Cor
  1 CHRONICLES=1Chr
  1 CORINTHIANS=1Cor
  1 JN=1Jn
  1C=1Cor
  1CHRONICLES=1Chr
  1CORINTHIANS=1Cor
  I C=1Cor
  I CHRONICLES=1Chr
  I CORINTHIANS=1Cor
  IC=1Cor
  ICHRONICLES=1Chr
  ICORINTHIANS=1Cor
As noted earlier, 1 Chronicles comes alphabetically before 1 Corinthians. The entries above say that 1Cor (the OSIS book ID for 1 Corinthians) has precedence up through "1 C". Any character beyond that disambiguates the entry anyway, so the default 1 CHRONICLES and 1 CORINTHIANS entries correctly resolve longer partial matches that start with "1 C".
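The precedence rule described above can be modelled in a few lines of Python. This is a simplified sketch of the behaviour as this page describes it, not the engine's actual implementation; resolve_book is a hypothetical name, and the model assumes that the first entry in codepoint order that begins with the uppercased input wins.

```python
def resolve_book(text, abbrevs):
    """Return the osisID for a (possibly partial) book name, or None.

    Simplified model: uppercase the input, sort the [Book Abbrevs] keys by
    codepoint, and take the first key that starts with the input. An exact
    match always sorts before any longer key that shares the same prefix.
    """
    wanted = text.upper()
    for key in sorted(abbrevs):  # Python sorts str by codepoint value
        if key.startswith(wanted):
            return abbrevs[key]
    return None


default_only = {"1 CHRONICLES": "1Chr", "1 CORINTHIANS": "1Cor"}
tuned = dict(default_only)
tuned["1 C"] = "1Cor"  # the extra entry that changes the preference

print(resolve_book("1 C", default_only))  # 1Chr
print(resolve_book("1 C", tuned))         # 1Cor
print(resolve_book("1 Ch", tuned))        # 1Chr
```

With only the two full names, "1 C" falls to 1 Chronicles. Adding 1 C=1Cor overrides that, while "1 Ch" still resolves to 1 Chronicles because the extra character disambiguates it.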
IMPORTANT:
All verse references output by SWORD must be legal, parsable references when given back to SWORD as input. This means that for each book name there MUST be at least one abbreviation entry consisting of the toupper (uppercase function) of the entire string, EXACTLY as you have translated it in the [Text] section.
For example, the following are the REQUIRED entries for our book names from the excerpt [Text] section example above.
  1. MOSE=Gen
  2. MOSE=Exod
  3. MOSE=Lev
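The requirement above lends itself to an automated check. The sketch below is hypothetical (missing_required_abbrevs is not a SWORD function) and rests on two assumptions: the text_section dictionary contains only book names, and Python's str.upper() stands in for the engine's toupper, which may differ for some non-ASCII characters.

```python
def missing_required_abbrevs(text_section, book_abbrevs):
    """List uppercased translated book names that have no [Book Abbrevs] entry.

    text_section: {English book name: translated name}, book names only.
    book_abbrevs: {uppercase name or abbreviation: osisID}.
    """
    return [
        translated.upper()
        for translated in text_section.values()
        if translated.upper() not in book_abbrevs
    ]


text_section = {"Genesis": "1. Mose", "Exodus": "2. Mose", "Leviticus": "3. Mose"}
incomplete = {"1. MOSE": "Gen", "3. MOSE": "Lev"}

print(missing_required_abbrevs(text_section, incomplete))  # ['2. MOSE']
```

An empty result means every translated book name has the uppercase entry it needs.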
That's it for requirements. Tuning your locale can be important for the user experience. You can add further [Book Abbrevs] entries to assign precedence if, for example, text like "Ma 1:1" takes you to the wrong book (it would resolve to Malachi by default because of alphabetical precedence, but you might want Matthew or Mark). In this case, you would add an entry MA=Matt or MA=Mark.
Pref Abbrevs
This section designates the preferred abbreviation for each book. These are typically used when SWORD is asked to display a very short verse reference or a short Bible book name. The format for these entries is: osisID=Preferred Abbreviation, e.g.,
  [Pref Abbrevs]
  Gen=1Mo
  Exod=2Mo
  Lev=3Mo
  Num=4Mo
  Deut=5Mo
  Josh=Jos
  Judg=Rich
Each preferred abbreviation must necessarily be parsable by the exhaustive list of abbreviations in the [Book Abbrevs] section.
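A rough way to verify this constraint is to run each preferred abbreviation through the same kind of prefix lookup the parser is described as using and confirm that it comes back to its own book. The sketch below is an assumption-laden model, not SWORD code: unparsable_pref_abbrevs is a hypothetical name, and matching is approximated as "first [Book Abbrevs] key, in codepoint order, that starts with the uppercased abbreviation".

```python
def unparsable_pref_abbrevs(pref_abbrevs, book_abbrevs):
    """Return osisIDs whose preferred abbreviation does not resolve back to them.

    pref_abbrevs: {osisID: preferred abbreviation}
    book_abbrevs: {uppercase name or abbreviation: osisID}
    """
    ordered = sorted(book_abbrevs)  # codepoint order
    bad = []
    for osis_id, abbrev in pref_abbrevs.items():
        wanted = abbrev.upper()
        match = next((key for key in ordered if key.startswith(wanted)), None)
        if match is None or book_abbrevs[match] != osis_id:
            bad.append(osis_id)
    return bad


pref = {"Gen": "1Mo", "Judg": "Rich"}
books = {"1MOSE": "Gen", "1. MOSE": "Gen"}  # no entry for Judges yet

print(unparsable_pref_abbrevs(pref, books))  # ['Judg']
```

Here "1Mo" is accepted because it is a partial match for 1MOSE, while "Rich" is reported until an entry such as RICH=Judg is added.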
You can test your locale with the sword/tests/parsekey test program (this
program is in the SWORD source along with several other programs that are
used to validate the configuration files) and try different strings to see
how they parse.
A full-stop is a permitted character in a localized book abbreviation. Other punctuation characters commonly used in verse references are not allowed in localized book names. These include the hyphen '-' (used for verse ranges), the colon ':' (used to separate chapter and verse numbers), and the comma ',' (used for verse lists). Additionally, numerals in non-initial position are not permitted in book names (e.g., '3John' is valid but 'Psalm151' is not).
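These character rules can be turned into a small lint step. The following Python sketch is illustrative only: book_name_problems is a hypothetical helper, and it encodes just the restrictions listed above (no hyphen, colon, or comma, and no numeral after the first character).

```python
FORBIDDEN_PUNCTUATION = "-:,"  # range, chapter/verse, and list separators


def book_name_problems(name):
    """Return a list of rule violations for a localized book name."""
    problems = []
    for ch in FORBIDDEN_PUNCTUATION:
        if ch in name:
            problems.append(f"contains '{ch}'")
    # A numeral is allowed only as the very first character.
    if any(ch.isdigit() for ch in name[1:]):
        problems.append("numeral in non-initial position")
    return problems


for candidate in ("3John", "1. Mose", "Psalm151"):
    print(candidate, book_name_problems(candidate))
```

Names such as "3John" and "1. Mose" pass, since the numeral is in initial position and the full-stop is permitted.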
Submissions
If you create a new locale file as part of the process towards making a module, please submit it to CrossWire.
Submissions should be sent to sword-support@crosswire.org
Maintained locale files
On the CrossWire server, the locale files are stored in /space/home/ftp/pub/sword/raw/locales.d
Users with FTP or SCP access are able to download them from that folder.
Corrections to errors in locale files should be sent to sword-support@crosswire.org