WBTI Bible discussion
General Remarks on Conversion
A long, long time ago, we were given 43 Bibles by Wycliffe. Some of the Bibles belong to WBTI, others to the Bible League.
The 43 Bibles were delivered as zips. Each zip contained OSIS files, a separate file for each book of the Bible. The OSIS files were not all valid at the time of delivery. The contents of the files were encoded as UTF-8, but made extensive use of the SIL PUA for characters not present within Unicode at the time of encoding.
The major task of developing a conversion system for these Bibles was to develop a means for mass conversion of the whole set that could, hopefully, be performed by someone from Wycliffe with a crosswire.org login. The basic conversion process was:
- Unzip the provided zips and output a modifiable table-of-contents file listing the OSIS books in their canonical order. (The converter outputs a list based on alphabetical order of the files, which sometimes needs to be modified. Neither the extracted files nor the table-of-contents file will be overwritten by subsequent executions of the build script, permitting corrections.)
- Create a single OSIS file from the separate book files, using the header of the first file and the body of all files.
- Use TECkit with SIL's own PUA to Unicode 5.1 converter to convert as much of the PUA content as possible to Unicode 5.1. (Since complete conversion remains unlikely and Unicode 5.1 fonts are uncommon, use of recent builds of Charis SIL is recommended and included in the .conf.)
- Determine a module name based on the language, owner, & year of publication: (lowercase language code)_(uppercase owner)_(4-digit year)
- Construct .confs based on header information from the first provided OSIS file.
- Run osis2mod on the combined, validated OSIS file.
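The naming rule in the list above can be sketched in Python. This is only an illustration: the function name `module_name` and its validation rules are assumptions, not part of the actual build script. The example value `amu_BL_1999` is a module name that appears elsewhere on this page.

```python
import re


def module_name(language_code: str, owner: str, year: int) -> str:
    """Build (lowercase language code)_(uppercase owner)_(4-digit year).

    The checks below are assumptions made for this sketch, not rules
    taken from the real converter.
    """
    lang = language_code.strip().lower()
    own = owner.strip().upper()
    if not re.fullmatch(r"[a-z]+", lang):
        raise ValueError(f"unexpected language code: {language_code!r}")
    if not re.fullmatch(r"[A-Z]+", own):
        raise ValueError(f"unexpected owner abbreviation: {owner!r}")
    if not 1000 <= year <= 9999:
        raise ValueError(f"year must have four digits: {year!r}")
    return f"{lang}_{own}_{year}"


print(module_name("AMU", "bl", 1999))
```

Normalising the case inside the function means the header values from the first OSIS file can be passed in as found, without being cleaned up first.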
At this point a few additional steps are needed:
- Convert SIL Ethnologue codes to ISO 639-3. For the most part, especially with minority languages like these, these will be the same, but there are exceptions. (Apparently Ethnologue codes are equal to ISO 639-3 codes but there are some typos in the data that need to be corrected by hand. *sigh*)
- Perform validation on the OSIS files (either with xmllint or, preferably, Xerces).
- Port the whole thing to Linux so that it can be run on the server. (At the moment TECkit is the only obstacle, but the source code will probably compile under Linux.)
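Since complete PUA conversion is described above as unlikely, a check run alongside validation could report which private-use code points survive in the converted text. The following is a minimal sketch, not part of the existing converter. The function names are hypothetical, and it relies only on Python's `unicodedata` general category `Co` (private use) rather than on any SIL-specific table.

```python
import unicodedata
from collections import Counter


def residual_pua(text: str) -> Counter:
    """Count characters whose Unicode general category is Co (private use)."""
    return Counter(ch for ch in text if unicodedata.category(ch) == "Co")


def report(text: str) -> list:
    """Return one 'U+XXXX xN' line per remaining PUA code point, in code point order."""
    return [f"U+{ord(ch):04X} x{n}" for ch, n in sorted(residual_pua(text).items())]


# Hypothetical sample: two occurrences of U+F171 and one of U+E000.
sample = "a\uf171b\uf171c\ue000"
for line in report(sample):
    print(line)
```

An empty report would mean that no private-use characters remain. A non-empty one lists the characters that still depend on a PUA-aware font such as Charis SIL.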
Since the 43 Bibles from WBTI are encoded using similar practices and will likely have similar problems in SWORD software, we should keep discussion in a single place. Nevertheless, if you do see a problem, it would be good to note not only the location (which verse) where you spot a problem but also the particular Bible that it appears in.
Our discussion should focus on how we can make our software work better with this content as it is, rather than how we can fix the content. The content itself validates against the OSIS 2.1.1 schema (after some corrections), according to Xerces.
--Osk 20:42, 30 June 2008 (MDT)
(moved from Wycliffe Bibles section of beta modules page by Osk, original by Dmsmith)
Many modules have a problem with notes tied to headers. This may not be a module problem, but an osis2mod problem or a SWORD engine problem. For example, using SW to view Matt 5:21, there is a note marker that stands before the verse number and the heading appears after the verse number. Perhaps as a side effect, there is way too much whitespace (blank lines) surrounding the note. And the note has no content.
(moved from amu_BL_1999 section of beta modules page by Osk)
Verses are not well-formed. References are not within notes and are not proper.
E.g. Matt 5:21
<title subType="x-preverse" type="section"> <reference>(Lc. 12:57-59)</reference> </title> <div scope="Matt.5.21-Matt.5.26" type="section"> <title>Maˈmo̱ⁿ Jesús na tilaˈwja̱a̱ya ncˈiaaya</title> ’ˈO jnda̱ jndyeˈyoˈ ñˈoom na tyolue nnˈaⁿ nda̱a̱ welooya na matsonaˈ: “Tintseicueˈ xˈiaˈ ndoˈ meiⁿcwiˈñeeⁿcheⁿ tsˈaⁿ na nntsˈaa na ljoˈ, maxjeⁿ nntˈuiityeⁿnaˈ juu.”
This example shows the problem with the osis2mod pre-verse hack: There are simply too many valid conditions for osis2mod to anticipate everything. I'd be interested in what the original for this was. My guess is that it is really good OSIS.
However, osis2mod and the SWORD engine are not that amenable to all good OSIS inputs.
With regard to <div>, osis2mod needs to change this to a milestoned version. Also, because the title is within the <div> it is not a candidate to become "x-preverse". This probably needs to be fixed too.
With regard to <reference> it is perfectly valid OSIS. However, SWORD expects it to have an osisRef, and because it does not, it displays a non-functional link. Further, SWORD expects references to be within <note type="crossReference">...</note>. I may be wrong on this last point but every OSIS module having references wraps them in <note>. Further, SWORD does not expect a <reference> to be between two verses. To which one should it be attached?
--Dmsmith 18:33, 30 June 2008 (MDT)
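The two SWORD expectations described above (a `<reference>` carries an `osisRef`, and it sits inside a `<note>`) can be checked mechanically before running osis2mod. The following is a rough sketch only. The function name `audit_references` is hypothetical, the sample is a cut-down version of the Matt 5:21 title wrapped in a single root element, and the second reference (with its `osisRef` value) is an invented counter-example showing the form SWORD is said to expect.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip any '{namespace}' prefix so the check works with or without one."""
    return tag.rsplit("}", 1)[-1]


def audit_references(osis_fragment: str) -> list:
    """List (reference text, issues) for references SWORD may not handle well."""
    root = ET.fromstring(osis_fragment)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    problems = []
    for ref in root.iter():
        if local(ref.tag) != "reference":
            continue
        issues = []
        if not ref.get("osisRef"):
            issues.append("missing osisRef")
        node, inside_note = ref, False
        while node in parent_of:  # walk up towards the root
            node = parent_of[node]
            if local(node.tag) == "note":
                inside_note = True
                break
        if not inside_note:
            issues.append("not inside <note>")
        if issues:
            problems.append(("".join(ref.itertext()).strip(), issues))
    return problems


SAMPLE = (
    '<div>'
    '<title subType="x-preverse" type="section">'
    '<reference>(Lc. 12:57-59)</reference></title>'
    '<note type="crossReference">'
    '<reference osisRef="Luke.12.57-Luke.12.59">Lc. 12:57-59</reference></note>'
    '</div>'
)

for text, issues in audit_references(SAMPLE):
    print(text, "->", ", ".join(issues))
```

This only flags candidates for review. It does not decide which verse a reference placed between two verses should be attached to.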
Scripture Earth provides resources giving access to Scripture in the heart language, and it now covers many more heart languages. This means that most (if not all) of the translations we received from WBT in the distant past are also now available in several other formats, both online and for download, and that Scripture Earth has Scripture resources for more languages than those represented by our WBT/BL list in CrossWire Beta.
What is not yet found in Scripture Earth are any of these translations in SWORD module format. It would therefore make good sense for CrossWire to work in a new collaboration with WBT with a view to establishing a Scripture Earth module repository managed by WBT, rather than by CrossWire.
A related initiative is the Every Tribe Every Nation (ETEN) alliance, which is well funded by a number of generous donors. This brings together not only WBT, but also UBS and Biblica as Scripture Text Ministry Partners. The ETEN project also draws in several Scripture-based ministries as End User Ministry Partners (EUMPs). These EUMPs already include YouVersion, American Bible Society’s Bible Search, and Faith Comes By Hearing.
At the heart of the ETEN alliance is the Digital Bible Library (DBL). If CrossWire Bible Society could become an approved EUMP, it would give us access to the DBL, and also widen the availability of all our own Scripture resources.
- A library card gives qualified ministries and organizations access to the ETEN Digital Bible Library so that they can distribute translations and media files through websites, apps, print on demand, and other means. Please note that rights holders determine how each digital file may be used. See .
David Haslam attended the ETEN-organized 'digital scripture publishing summit' held in Germany during March 2012. He has more detailed information as a result of his participation in this consultation.
David Haslam 14 June 2012 (MDT)
- As on 2012-06-14, for these countries: Argentina, Aruba, Azerbaijan, Belize, Bolivia, Brazil, Canada, Chile, Colombia, Costa Rica, Dominica, Ecuador, El Salvador, French Guiana, Ghana, Grenada, Guatemala, Guyana, Honduras, Mexico, Myanmar, Nepal, Netherlands Antilles, Panama, Paraguay, Peru, Saint Lucia, Solomon Islands, Spain, Sudan, Suriname, Thailand, Trinidad and Tobago, United Kingdom, United States, Venezuela.
- e.g. For free Bible study software such as ‘The Word’, PDF, audio, etc.
Go Bible editions
A significant number of the translations at Scripture Earth have already been made into Go Bible applications. These are listed as "Download the Java enabled cell phone module" in the relevant language pages. I just downloaded one from  for the Bible in Achi' de Cubulco. It was made on 2011-06-11 using version 2.4.0 of Go Bible, having been created by Bill Dyck, the Americas regional publishing co-ordinator for WBT, who was also at the aforementioned consultation in Germany. The user interface is English.