Frontends:Diatheke
Output formats
Valid output_format values are: CGI, GBF, HTML, HTMLHREF, LaTeX, OSIS, RTF, ThML, XHTML, and plain (default).
Revision as of 15:09, 8 April 2018
- 1 What is diatheke?
- 2 Diatheke option filters
- 3 Diatheke search types
- 4 How do I use diatheke/CLI?
- 5 Diatheke output
- 6 Known weaknesses & bugs
- 7 Tools that use Diatheke
What is diatheke?
Diatheke is a very simple command line interface (CLI) front-end to the SWORD Project's Bible software library. Essentially, "diatheke" is the stuff contained within the file "corediatheke.cpp" in the apps/console/diatheke directory of the SWORD source tree. Corediatheke.cpp contains only one function that is intended to be called from any program using diatheke, and that function performs exactly one lookup in the SWORD library per call. Examples of calls would be a query for a verse (or verse list/range), a search, a request for a list of modules, etc.
Where does the name 'diatheke' come from?
Diatheke means 'testament' or 'commandment'. And diatheke (the program) was originally a command line application. commandment... command line app... It's a pun.
How is diatheke useful to me?
Probably it isn't, but there are a number of front-ends to diatheke (yes, front-ends to a front-end) that are of use. These include:
- diatheke/TCL: a BibleBot for eggdrop that interfaces with diatheke/CLI
- diatheke/CGI: a Perl/CGI interface to diatheke/CLI
- HANDiatheke: a Palm PQA interface to diatheke/CGI
- ActiveDiatheke: an ActiveX control (.OCX) interface to SWORD
The above four are no longer under active development and may no longer be available.
- This section needs expanding.
- Version 4.6 was released during 2013.
- Version 4.7 was released on Aug 30 2015.
How do I get diatheke?
- This section needs updating.
To get the very latest version, grab the SWORD source tree from our SVN repository using the URL: https://crosswire.org/svn/sword/trunk
$ svn checkout https://crosswire.org/svn/sword/trunk sword
If you don't want to use SVN, you can try grabbing a recent release from ftp://ftp.crosswire.org/pub/sword/source/
The SWORD utilities for Windows are also installed when Xiphos is installed. These include a copy of diatheke.exe.
For diatheke/CLI and diatheke/CGI you can download version 4.0 from:
- ftp://ftp.crosswire.org/pub/sword/frontend/diatheke/diatheke-4.0-win32.zip (Windows binary)
- ftp://ftp.crosswire.org/pub/sword/frontend/diatheke/diatheke-4.0-src.zip (source)
For diatheke/TCL and HANDiatheke you can download version 2.0 from ftp://ftp.crosswire.org/pub/sword/frontend/diatheke/diatheke-2.0.tar.gz.
For ActiveDiatheke you can download a preliminary version from ftp://ftp.crosswire.org/pub/sword/frontend/diatheke/ActiveDiatheke.zip.
Diatheke option filters
Module option filters are off by default in diatheke. Each filter must be specified explicitly (with -o) for the corresponding feature to be included in the output.
Valid (output) option_filters values are:
|a||(Greek Accents)||for modules with GlobalOptionFilter=UTF8GreekAccents|
|b||(Bi-Directional Reordering)||for modules with Direction=BiDi or Direction=RtoL|
|c||(Hebrew Cantillation)||for modules with GlobalOptionFilter=UTF8Cantillation|
|e||(Word Enumerations)||for modules with GlobalOptionFilter=OSISEnum|
|f||(Footnotes)||for modules with footnotes (GBF/ThML/OSIS)|
|g||(Glosses/Ruby)||for modules with GlobalOptionFilter=OSISGlosses or GlobalOptionFilter=OSISRuby|
|h||(Section Headings)||for modules with headings (GBF/ThML/OSIS)|
|l||(Lemmas)||for modules with GlobalOptionFilter=ThMLLemma or GlobalOptionFilter=OSISLemma|
|M||(Morpheme segmentation)||for modules with GlobalOptionFilter=OSISMorphSegmentation|
|m||(Morphology)||for modules with GlobalOptionFilter=ThMLMorph or GlobalOptionFilter=OSISMorph|
|n||(Strong's numbers)||for modules with Strong's Numbers (GBF/ThML/OSIS)|
|p||(Arabic Vowels)||for modules with GlobalOptionFilter=UTF8ArabicPoints|
|r||(Arabic Shaping)||for modules with Arabic/Persian script (required for proper rendering in Linux)|
|s||(Scripture Crossrefs)||for modules with GlobalOptionFilter=ThMLScripref or GlobalOptionFilter=OSISScripref|
|t||(Algorithmic Transliterations via ICU)|
|v||(Hebrew Vowels)||for modules with GlobalOptionFilter=UTF8HebrewPoints|
|w||(Red Words of Christ)||for modules with RedLetterWords (GBF/ThML/OSIS)|
|x||(Encoded Transliterations)||for modules with GlobalOptionFilter=OSISXlit|
- Refer to Module configuration files for GlobalOptionFilter and similar items listed.
- The M (Morpheme segmentation) filter was added to the source code on 01/07/2015 by KK.
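When diatheke is driven from a script, the filter letters can be checked against the table above before they are passed on. The following is a minimal Python sketch under that assumption. The names OPTION_FILTERS and option_filter_args are hypothetical helpers and are not part of diatheke or SWORD. The combined form (several letters after one -o) follows the "-o mn" usage shown elsewhere on this page.

```python
# Filter letters copied from the table above. The letters are case
# sensitive: "M" (Morpheme segmentation) differs from "m" (Morphology).
OPTION_FILTERS = {
    "a": "Greek Accents",
    "b": "Bi-Directional Reordering",
    "c": "Hebrew Cantillation",
    "e": "Word Enumerations",
    "f": "Footnotes",
    "g": "Glosses/Ruby",
    "h": "Section Headings",
    "l": "Lemmas",
    "M": "Morpheme segmentation",
    "m": "Morphology",
    "n": "Strong's numbers",
    "p": "Arabic Vowels",
    "r": "Arabic Shaping",
    "s": "Scripture Crossrefs",
    "t": "Algorithmic Transliterations via ICU",
    "v": "Hebrew Vowels",
    "w": "Red Words of Christ",
    "x": "Encoded Transliterations",
}


def option_filter_args(letters):
    """Return the argument pair for -o, or an empty list if no filter is wanted.

    Hypothetical helper: it only validates the letters against the table;
    it does not check whether a given module supports the feature.
    """
    unknown = [ch for ch in letters if ch not in OPTION_FILTERS]
    if unknown:
        raise ValueError("unknown option filter(s): " + "".join(unknown))
    return ["-o", letters] if letters else []


print(option_filter_args("mn"))
```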
Diatheke search types
- This section needs expanding.
Valid search_type values are: phrase (default), regex, multiword, attribute, lucene, multilemma.
Search type lucene only works when the module already has a Lucene search index. Such an index can be created either with an installed front-end app such as Xiphos or with the command line SWORD utility mkfastmod.
Search type regex has some limitations. It doesn't yet fully support UTF-8 encoded text, so the results you get may not be what you expected. For example:
diatheke -b KJV -s regex -k Abed...nego
Verses containing "Abed...nego"-- Daniel 1:7 ; Daniel 2:49 ; Daniel 3:12 ; Daniel 3:13 ; Daniel 3:14 ; Daniel 3:16 ; Daniel 3:19 ; Daniel 3:20 ; Daniel 3:22 ; Daniel 3:23 ; Daniel 3:26 ; Daniel 3:28 ; Daniel 3:29 ; Daniel 3:30 -- 14 matches total (KJV)
Each dot in the search query represents "any single byte", so the multi-byte UTF-8 character U+2013 EN DASH (bytes E2 80 93) in the 'hyphenated' name Abed–nego can be matched by three dots. Whether this happens depends on whether SWORD/diatheke was compiled with or without cxx11regex.
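The difference between byte-wise and character-wise matching can be illustrated outside diatheke. The following is a minimal Python sketch, not diatheke's own code: it uses Python's re module on the same name to show why three dots are needed when each dot consumes one byte, while a character-aware match needs only one dot.

```python
import re

name = "Abed\u2013nego"        # the name written with U+2013 EN DASH
raw = name.encode("utf-8")     # the same text as UTF-8 bytes

# The en dash occupies three bytes in UTF-8.
dash_bytes = "\u2013".encode("utf-8")

# Byte-wise matching: each dot consumes one byte, so three dots are needed.
bytewise_three = re.fullmatch(rb"Abed...nego", raw) is not None
bytewise_one = re.fullmatch(rb"Abed.nego", raw) is not None

# Character-wise matching: a single dot consumes the whole en dash.
charwise_one = re.fullmatch(r"Abed.nego", name) is not None
charwise_three = re.fullmatch(r"Abed...nego", name) is not None

print(" ".join(f"{b:02x}" for b in dash_bytes))   # e2 80 93
print(bytewise_three, bytewise_one)               # True False
print(charwise_one, charwise_three)               # True False
```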
How do I use diatheke/CLI?
Calling diatheke without any parameters will result in the command line syntax help being output to stderr.
The query_key (-k) must be the last argument because all further arguments are added to the key.
The following are a few examples of calling diatheke from the command line. Book names can be abbreviated, provided the abbreviation is unambiguous.
|Retrieve Acts ch 10||diatheke -b KJV -k "Acts 10"|
|First five verses of above||diatheke -b KJV -m 5 -k "Acts 10"|
|Acts chapters 1 and 2||diatheke -b KJV -k "Acts 1-2"|
|Genesis 1:1||diatheke -b KJV -k G 1:1|
|Galatians 1:1 w/ Strong's (if available)||diatheke -b KJV -o n -k "Ga 1:1"|
|I Corinthians 1:1 (also "ic 1:1")||diatheke -b KJV -o n -k "1c 1:1"|
|Revelation 1:1-1:7||diatheke -b KJV -k "Rev 1:1-7"|
|Revelation 1:1||diatheke -b KJV -m 1 -k "R 1:1-7"|
|Revelation 1:1,1:3,1:7 as HTML (w/ <p>, <i>, etc. tags)||diatheke -b KJV -f HTML -k R 1:1,3,7|
|Luke 3:35 with Greek accents, showing all variants||diatheke -b TischMorph -o a -v -1 -k Luke 3:35|
|verses with "my people", quotations optional||diatheke -b KJV -s phrase -k "my people"|
|verses with "skin" and "bones"||diatheke -b KJV -s multiword -k skin bones|
|verses with "church" or "assembly"||diatheke -b KJV -s regex -k "church | assembly"|
|Strong's Greek 3056||diatheke -b StrongsGreek -k 3056|
|Definition of "horn" in Two Babylons||diatheke -b 2BabDict -k horn|
|Entry for John 1:1 in Family Bible Notes||diatheke -b Family -k Jn 1:1|
|Entry for "Lion" in Scripture Alphabet Of Animals||diatheke -b SAOA -k "Lion"|
|Entry for "olive-tree" in Easton's Bible Dictionary||diatheke -b Easton -k olive-tree|
|Matthew 24 from Westcott Hort Greek NT transliterated into Latin script||diatheke -b WHNU -t Latin -o mn -k "Mt 24"|
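When these commands are assembled by a script rather than typed by hand, the rule that the query key comes last can be enforced in one place. The following is a minimal Python sketch under that assumption. The function diatheke_argv and its parameter names are hypothetical; only the flags -b, -s, -o, -f, -m and -k are taken from the examples above. The sketch only builds the argument list and does not run diatheke.

```python
def diatheke_argv(module, key, search_type=None, option_filters=None,
                  output_format=None, max_verses=None):
    """Build an argument list for diatheke with -k always in last place.

    Hypothetical helper: each optional flag is added only when requested,
    and the query key is appended last because diatheke adds every
    argument after -k to the key.
    """
    argv = ["diatheke", "-b", module]
    if search_type:
        argv += ["-s", search_type]
    if option_filters:
        argv += ["-o", option_filters]
    if output_format:
        argv += ["-f", output_format]
    if max_verses is not None:
        argv += ["-m", str(max_verses)]
    argv += ["-k", key]          # must stay the final pair
    return argv


# "First five verses of Acts 10" from the table above.
print(diatheke_argv("KJV", "Acts 10", max_verses=5))
```

Passing the key as one list element also means no extra shell quoting is needed for a key such as "Acts 10" if the list is later handed to subprocess.run.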
The plain text output of diatheke marks any OSIS highlight elements (e.g. <hi type="italic">n</hi>) by wrapping the highlighted text between asterisks (e.g. *n*). It does this whatever the value of the type attribute.
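The effect described above can be imitated with a short Python sketch. This is an illustration of the observable behaviour only, not the code diatheke uses. The name mark_highlights is hypothetical, and the regular expression assumes hi elements are not nested.

```python
import re

# Matches an OSIS hi element with any attributes; the type value is ignored,
# mirroring the behaviour described above. Assumes no nested hi elements.
HI_ELEMENT = re.compile(r"<hi\b[^>]*>(.*?)</hi>", re.DOTALL)


def mark_highlights(osis_fragment):
    """Wrap the content of every hi element between asterisks."""
    return HI_ELEMENT.sub(r"*\1*", osis_fragment)


print(mark_highlights('<hi type="italic">n</hi>'))   # *n*
print(mark_highlights('<hi type="bold">n</hi>'))     # *n*
```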
Redirecting the output
Because diatheke is a command line utility, its output can be readily redirected to a file using the normal features of the command shell. This may be especially useful for (e.g.) a search that has a large number of results.
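The same redirection can be done from a script. The following is a minimal Python sketch that behaves like "command > file" in a shell. The helper name run_to_file and the output file name in the commented usage line are hypothetical, and the usage line assumes diatheke is installed and on the search path.

```python
import subprocess


def run_to_file(argv, path):
    """Run a command and write its standard output to path.

    Equivalent in effect to 'command > path' in a shell. The file is opened
    in binary mode so the bytes are stored exactly as the program wrote them.
    """
    with open(path, "wb") as out:
        subprocess.run(argv, stdout=out, check=True)


# Usage (not run here; assumes diatheke is installed):
# run_to_file(["diatheke", "-b", "KJV", "-s", "phrase", "-k", "my people"],
#             "my_people.txt")
```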
- Only after diatheke version 4.7 – The option LaTeX will produce a compilable document, but may well require tweaking to be usable.
- The output does not include any header lines that would facilitate the output including (e.g.) a font colour table, were the output to be redirected to a Rich Text File. Thus using the output filter -o w (Red Words of Christ) would require such a header to be added by the user in order to ensure the red letter text can be viewed as such.
- The word plain here is merely a handle to distinguish the default from the other output formats. It does not imply that the output encoding is restricted in any way.
- Output encodings determine how any printable non-ASCII characters in the output are encoded.
- The encoding names are written UTF8 and UTF16, without a hyphen, even though a hyphen might be expected.
- When using diatheke to search, the results are all output as a single line without any breaks.
- Diatheke search results contain only the verse references where the pattern is matched (if any).
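Because search results arrive as a single line, a script usually has to split that line itself. The following is a minimal Python sketch that assumes the layout seen in the two sample outputs on this page (a list separated by " ; ", a "-- N matches total" trailer, the module name in parentheses, or the word "none"). The function name parse_search_output is hypothetical, and other diatheke versions or output formats may lay the line out differently.

```python
SAMPLE = ('Verses containing "Abed...nego"-- Daniel 1:7 ; Daniel 2:49 ; '
          'Daniel 3:12 ; Daniel 3:13 ; Daniel 3:14 ; Daniel 3:16 ; '
          'Daniel 3:19 ; Daniel 3:20 ; Daniel 3:22 ; Daniel 3:23 ; '
          'Daniel 3:26 ; Daniel 3:28 ; Daniel 3:29 ; Daniel 3:30 '
          '-- 14 matches total (KJV)')


def parse_search_output(line):
    """Split one line of search output into (list of references, module)."""
    # Everything after the first "-- " is the result list plus the trailer.
    # Assumption: the search key itself does not contain "-- ".
    _, _, rest = line.partition("-- ")
    rest = rest.strip()

    # The module name is the final parenthesised item, if present.
    module = None
    if rest.endswith(")") and "(" in rest:
        rest, _, tail = rest.rpartition("(")
        module = tail[:-1]
        rest = rest.strip()

    if rest == "none":
        return [], module

    # Drop the "-- N matches total" trailer when it is present.
    head, sep, _ = rest.rpartition(" -- ")
    refs_part = head if sep else rest
    refs = [ref.strip() for ref in refs_part.split(";") if ref.strip()]
    return refs, module


refs, module = parse_search_output(SAMPLE)
print(len(refs), refs[0], refs[-1], module)
```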
Known weaknesses & bugs
Both editions (Linux & Windows)
Diatheke does not support OSISReferenceLinks.
Currently, diatheke does not output:
- Any canonical Psalm titles even with the option filter -o h for section headings.
- Any pilcrow symbols in the KJV Bible or similar modules where these are encoded as a marker attribute in the OSIS milestone element.
- Any quotation marks in modules where these are encoded as a marker attribute in the OSIS q element.
- Any footnote text with the option filter -o f for footnotes; only a pair of brackets is output at the location of each note tag.
Currently, diatheke does not distinguish different highlight types in the OSIS hi element. It treats all such styles as if they were bold by wrapping the text between two asterisks.
Search type regex does not yet properly support UTF-8 encoded text. (See above).
For some output filters and/or formats, the XML snippets may include the undefined attribute name savlm in the w elements. e.g.
Genesis 1:1: <w savlm="strong:H07225">In the beginning</w>
This seems to be a bug in the source code. Evidently, it denotes "save lemma".
grep savlm src/modules/filters/*.cpp
src/modules/filters/osishtmlhref.cpp: SWBuf savelemma = tag.getAttribute("savlm");
src/modules/filters/osislatex.cpp: SWBuf savelemma = tag.getAttribute("savlm");
src/modules/filters/osisosis.cpp: tag.setAttribute("savlm", 0);
src/modules/filters/osisrtf.cpp: SWBuf savelemma = tag.getAttribute("savlm");
src/modules/filters/osisstrongs.cpp: SWBuf savlm = l;
src/modules/filters/osisstrongs.cpp: wtag.setAttribute("savlm", savlm);
src/modules/filters/osiswebif.cpp: SWBuf savelemma = tag.getAttribute("savlm");
src/modules/filters/osisxhtml.cpp: SWBuf savelemma = tag.getAttribute("savlm");
Reported as http://tracker.crosswire.org/browse/API-199
The utility diatheke.exe is among the SWORD utilities compiled for Win32.
Under the Windows command shell (cmd.exe), diatheke does not correctly handle non-ASCII characters in the query key. Thus, for example, the following command that works OK in Linux will fail in Windows:
diatheke -b KJV -s phrase -k Æneas
In Windows, the non-ASCII character "Æ" gets changed to U+00E3 LATIN SMALL LETTER A WITH TILDE.
The response in Windows is then:
Verses containing "ãneas"-- none (KJV)
The root cause is that the Windows shell assumes text is encoded as UTF-16 LE, whereas SWORD requires all text to be encoded as UTF-8. This problem mainly affects the search options in diatheke, where a query key is more likely to contain non-ANSI characters. Even so, for any locale in which some Bible book names contain non-ANSI characters, the problem also affects diatheke when the query key is a reference that contains such a character.
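The mismatch between the two encodings is easy to see for the character in the example. The following Python sketch only prints the byte sequences for "Æ" in UTF-8 and in UTF-16 LE. It does not reproduce the Windows behaviour or explain the exact substitution reported above.

```python
ch = "\u00c6"                      # Æ, LATIN CAPITAL LETTER AE

utf8 = ch.encode("utf-8")          # what SWORD expects to receive
utf16le = ch.encode("utf-16-le")   # the encoding the Windows shell assumes

print(" ".join(f"{b:02x}" for b in utf8))      # c3 86
print(" ".join(f"{b:02x}" for b in utf16le))   # c6 00

# The byte sequences differ, so text passed in one encoding and read as
# the other will not come out as the same character.
print(utf8 == utf16le)                         # False
```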
Tools that use Diatheke
AutoKey script for The SWORD Project
Ryan (Adyeth) has developed a script for the AutoKey 0.6x utility to paste Bible text given a reference. It works with OpenOffice, plain text editors, or any other Linux program where you might need to paste scripture passages. It requires Diatheke in order to function. You can download it from his website.