Difference between revisions of "File Formats"
David Haslam (talk | contribs) m (→Zefania Utilities: –) |
m (Sword -> SWORD) |
Revision as of 16:47, 2 September 2008
The SWORD Project respects copyright. As such, conversion of material that is under copyright is not supported by The SWORD Project.
This page merely lists some of the more common file formats relevant to The SWORD Project and associated utilities.
File Formats
Bible study programs use a plethora of markup formats. Even more have been suggested for use in creating Bibles and other religious material. This subsection describes some of the most common of those formats.
GBF
General Bible Format
This markup format is intended as an aid to preparing Bible texts (specifically the WEB and WEB:ME) for use with various Bible search programs. The complete specification is at http://www.ebible.org/bible/gbf.htm.
This markup format was previously used for some SWORD modules but is now deprecated in favor of OSIS. The rudimentary gbf2osis.pl utility may be used to convert GBF to OSIS for import to SWORD's native format.
HTML
Hyper Text Markup Language
This is the basic markup language of the World Wide Web. Some SWORD front-ends, such as BibleTime, GnomeSword, and Bible Desktop, use HTML for presentation.
IMP
Import Format
This proprietary file format is used by SWORD for import of all types of modules. Three utilities can be used to import IMP files to SWORD's native formats: imp2vs (for Bibles and verse-indexed commentaries), imp2ld (for lexicons, dictionaries, and daily devotionals), and imp2gbs (for all other types of books).
An IMP file consists of any number of entries. Each entry consists of a key line and any number of content lines. The key line is a line beginning with "$$$", followed by the key of the entry. For example, "$$$Gen 1:1" would be the key line for the Genesis 1:1 entry of a Bible or commentary module.
The content lines of an entry may consist of any text (provided that the first three characters of the line are not "$$$"). The internal markup of the content may be in any format supported by SWORD, namely OSIS for any module type or ThML for freeform books from CCEL.
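The entry structure described above can be sketched as a small parser. This is only an illustration of the key-line/content-line layout, not code from The SWORD Project; the names `parse_imp` and `KEY_PREFIX` are hypothetical, and the handling of lines before the first key (ignored) and of repeated keys (content appended) is an assumption.

```python
from typing import Dict, Iterable, List, Optional

KEY_PREFIX = "$$$"  # marker that begins every key line


def parse_imp(lines: Iterable[str]) -> Dict[str, str]:
    """Group IMP lines into a mapping of key -> content.

    Content lines belonging to one entry are joined with newlines.
    """
    entries: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(KEY_PREFIX):
            # A key line opens a new entry (or reopens an existing key).
            current = line[len(KEY_PREFIX):].strip()
            entries.setdefault(current, [])
        elif current is not None:
            entries[current].append(line)
        # Assumption: lines before the first key line are ignored.
    return {key: "\n".join(content) for key, content in entries.items()}


sample = [
    "$$$Gen 1:1\n",
    "First verse text.\n",
    "$$$Gen 1:2\n",
    "Second verse text,\n",
    "continued on another line.\n",
]
entries = parse_imp(sample)
```

The markup inside each content string (OSIS or ThML) is left untouched; the sketch only separates entries.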
LitML
Liturgical Markup Language
This markup format is a descendant of, and complement to, ThML, described at http://hildormen.org/docs/LitML/Guidelines-LitML10-1.0.html.
The markup reflects its orientation towards liturgy and hymns.
OSIS
Open Scripture Information Standard
The Open Scripture Information Standard (OSIS) is "a common format for many visions." It is an XML format for marking up scripture and related text, part of an initiative composed of translators, publishers, scholars, software manufacturers, and technical experts, coordinated by the Bible Technologies Group. It is co-sponsored by the American Bible Society and the Society of Biblical Literature.
The most recent XML schema is OSIS 2.1.1, and a manual is also available.
This markup format is recommended by the CrossWire Bible Society and can be used for creating all types of resources for The SWORD Project. Support for OSIS is actively maintained, and support for any currently unsupported elements or features that a module needs may be requested.
PDF
Portable Document Format
This is an ISO track file format for platform-independent rendering of documents. It is derived from PostScript and is maintained by Adobe. Documents may be text, images, or scanned images of text. Even textual documents cannot reasonably be expected to allow plain-text export. As such, it is designed to be a "read only" format.
RTF
Rich Text Format
This is a markup format designed by Microsoft. It is used as the markup language for presentation in The SWORD Project for Windows. It is also the internal markup format used within STEP books (see below). The format is of limited use as an archival format and there are no plans for SWORD to support it beyond its current use for presentation.
STEP
Standard Template for Electronic Publishing
This file format was formerly used by QuickVerse and WordSearch, and is currently used for some e-Sword books.
Although STEP is not an open standard, its publicly released documentation and specifications are mirrored at http://www.crosswire.org/bsisg/. Some utilities for working with this format are listed below. The SWORD Project is unlikely to support this format in the future, as it is largely dead.
ThML
Theological Markup Language
This format is a variant of XML based on TEI and HTML, developed by and for the Christian Classics Ethereal Library. The specifications for this markup format are available at http://www.ccel.org/ThML/.
This markup format is used in some SWORD resources, but only the creation of free-form "General book" modules based on existing CCEL resources is currently supported. Other works and new works should be created using the OSIS format.
Unbound Bible Format
Unbound Bible Format
BIOLA's Unbound Bible offers many of its resources for download in a proprietary but relatively simple plain-text format.
There is no widespread use of this format, but the rudimentary unb2osis.pl utility may be used to convert Unbound Bible format to OSIS for import to SWORD's native format.
USFM
Unified Standard Format Markers
This plain-text format is a common internal-use format within Bible translation agencies and Bible societies. It is the native format of Paratext. The rudimentary usfm2osis.pl utility may be used to convert USFM to OSIS for import to SWORD's native format.
See also: Converting SFM Bibles to OSIS
USFX
Unified Scripture Format XML
This XML file format is designed to provide clean conversions from Scripture to USFM-compliant file formats. A more comprehensive description can be found at http://ebt.cx/usfx/. There is no widespread use of this format and there are no plans for SWORD to support it in any way.
VPL
Verse-Per-Line
This plain-text format is used by SWORD for import of Bibles. It consists of one verse per line, with an optional verse reference at the beginning. The vpl2mod utility may be used for import. VPL is deprecated in favor of the IMP format, which is more widely useful.
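Because VPL is deprecated in favor of IMP, a rough sketch of turning VPL lines into IMP entries may be useful. It is not a replacement for vpl2mod and makes several assumptions: every line carries a leading reference shaped like "Gen 1:1" or "1 John 4:8" (the pattern `REF` is a guess at that shape), and lines without a reference are rejected rather than numbered by position. The function name `vpl_to_imp` is hypothetical.

```python
import re
from typing import Iterable, Iterator

# Assumed reference shape: optional leading book number, book abbreviation,
# then chapter:verse, followed by whitespace and the verse text.
REF = re.compile(r"^((?:[1-3] ?)?[A-Za-z]+\.? \d+:\d+)\s+(.*)$")


def vpl_to_imp(lines: Iterable[str]) -> Iterator[str]:
    """Yield IMP lines (a "$$$" key line, then one content line) per VPL verse."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = REF.match(line)
        if match is None:
            # Lines without a reference would need positional numbering,
            # which this sketch does not attempt.
            raise ValueError(f"no verse reference found: {line!r}")
        yield "$$$" + match.group(1)
        yield match.group(2)


vpl_sample = [
    "Gen 1:1 First verse text.\n",
    "1 John 4:8 Another verse text.\n",
]
imp_lines = list(vpl_to_imp(vpl_sample))
```

Writing `"\n".join(imp_lines)` to a file would give text in the key-line/content-line layout that IMP uses.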
XSEM
XML Scripture Encoding Model
This XML format was proposed by SIL. A comprehensive description of the markup language can be found at http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=XSEM&_sc=1.
The formal specifications can be found at http://scripts.sil.org/cms/scripts/render_download.php?site_id=nrsi&format=file&media_id=XSEM_Source&filename=XSEM_Source.zip
The designers of this markup language were instrumental in the writing of the OSIS Specification and it has largely been deprecated in favor of using OSIS. There is no widespread use of this format and there are no plans for SWORD to support it in any way.
XML
eXtensible Markup Language
This is a generic family of markup formats. Links to a number of XML specifications can be found at http://xml.coverpages.org/xmlApplications.html. Each flavor has its own specifications. SWORD supports markup in the XML formats OSIS and ThML internally.
Zefania XML
Zefania is an XML format for Bible markup with only the simplest structural tags for book/chapter/verse, notes, etc. The zef2osis.pl utility may be used to convert Zefania XML to OSIS for import to SWORD's native format.
Go Bible
To achieve navigation speed and general ease of use on even the simplest Java mobile phones, Go Bible data is fully indexed as well as compressed (as are all JAR files). The format is described in Go Bible data format. Go Bible data is structured as Book | Chapter | Verse text and does not support notes, headings, cross-references, etc. The developer kit Go Bible Creator can take either ThML or OSIS as the source text, but the source usually has to be specially prepared. For example, OSIS files produced by the Snowfall Software USFM2OSIS script are not structured in the way Go Bible Creator expects. Work has begun on an XSLT script to convert such OSIS XML files to a form suitable for Go Bible.
Utility Programs
Unless otherwise specified, the utility programs listed in this section do not work with file formats used by The SWORD Project.
The SWORD Project
- cipherraw - used to encipher SWORD modules
- diatheke - a basic CLI SWORD frontend
- mkfastmod - creates a search index for a module
- mod2zmod - creates a compressed module from an installed module
IMP Tools
- imp2gbs - imports free-form General books in IMP format to SWORD format
- imp2ld - imports lexicons, dictionaries, and daily devotionals in IMP format to SWORD format
- imp2vs - imports Bibles and commentaries in IMP format to SWORD format
- mod2imp - creates an IMP file from an installed module
VPL Tools
- vpl2mod - imports Bibles and commentaries in Verse-Per-Line format to SWORD format
GBF Tools
- gbf2osis.pl - a Perl utility for converting GBF to OSIS
- gbfconvertor, including gbf2osis, gbf2xsem, & gbf2sf - utilities for converting GBF to OSIS, XSEM, and SFM
- gbfsrc - utilities for converting GBF to "HTML, RTF, TeX, plain ASCII text, a format readable by BibleWorks 5 or later, and a couple of less useful formats"
OSIS Utilities
- mod2osis - creates an OSIS file from an installed module
- osis2mod - imports Bibles and commentaries in OSIS format to SWORD format
- vs2osisref - returns the osisRef of a given (text form) verse reference
- xml2gbs - imports free-form General books in OSIS or ThML format to SWORD format
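To illustrate the kind of mapping vs2osisref performs, the sketch below turns a simple "Book chapter:verse" reference into the dotted Book.Chapter.Verse form used in OSIS identifiers. It is not a reimplementation of that utility: the function name `to_osis_id` is hypothetical, the book abbreviation is assumed to already be a valid OSIS book name with no spaces (for example "Gen"), and ranges or lists of verses are not handled.

```python
import re

# Assumed input shape: a single-token book abbreviation, then chapter:verse.
SIMPLE_REF = re.compile(r"^\s*(\S+)\s+(\d+):(\d+)\s*$")


def to_osis_id(reference: str) -> str:
    """Convert e.g. "Gen 1:1" into the dotted form "Gen.1.1"."""
    match = SIMPLE_REF.match(reference)
    if match is None:
        raise ValueError(f"unsupported reference: {reference!r}")
    book, chapter, verse = match.groups()
    # int() drops any leading zeros in the chapter and verse numbers.
    return f"{book}.{int(chapter)}.{int(verse)}"


osis_id = to_osis_id("Gen 1:1")
```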
STEP Utilities
- step2vpl - exports a STEP book in Verse-Per-Line (VPL) format
- stepdump - dumps the contents of a STEP book
ThML Utilities
- xml2gbs - imports free-form General books in OSIS or ThML format to SWORD format
- CCEL Desktop - a program for viewing and developing CCEL books
- thml2osis - converts ThML to OSIS format.
- ThML Reader from cscholar.com – domain expired, but see [1]
Zefania Utilities
- zef2osis.pl – a Perl utility for converting Zefania XML to OSIS
- Zefania TextKonvertor
- Zefania XML Bible Book Names Changer
- NewTrueSharpSwordAPI – no downloads yet available
- Zefania_2_sword_win32 – sed based scripts maintained by JensG
Go Bible utilities
- Go Bible Creator – a Java SE program for converting either ThML or OSIS to Go Bible.