Difference between revisions of "Frontends:Diatheke"

From CrossWire Bible Society
Latest revision as of 17:14, 13 April 2018

What is diatheke?

Diatheke is a very simple command line interface (CLI) front-end to the SWORD Project's Bible software library. Essentially, "diatheke" is the stuff contained within the file "corediatheke.cpp" in the apps/console/diatheke directory of the SWORD source tree. Corediatheke.cpp contains only one function that is intended to be called from any program using diatheke, and that function performs exactly one lookup in the SWORD library per call. Examples of calls would be a query for a verse (or verse list/range), a search, a request for a list of modules, etc.

Where does the name 'diatheke' come from?

Diatheke means 'testament' or 'commandment', and diatheke (the program) was originally a command line application. Commandment... command line app... it's a pun.

How is diatheke useful to me?

Probably it isn't, but there are a number of front-ends to diatheke (yes, front-ends to a front-end) that are of use. These include:

  • diatheke/TCL: a BibleBot for eggdrop that interfaces with diatheke/CLI
  • diatheke/CGI: a Perl/CGI interface to diatheke/CLI
  • HANDiatheke: a Palm PQA interface to diatheke/CGI
  • ActiveDiatheke: an ActiveX control (.OCX) interface to SWORD

The above four are no longer under active development and may even be no longer available.

How do I get diatheke?

This section needs updating.

To get the very latest version, grab the SWORD source tree from our SVN repository using the URL: https://crosswire.org/svn/sword/trunk

e.g.

   $ svn checkout https://crosswire.org/svn/sword/trunk sword

If you don't want to use SVN, you can try grabbing a recent release from ftp://ftp.crosswire.org/pub/sword/source/

The Sword utilities for Windows are also installed when Xiphos is installed. These include a copy of diatheke.exe.

For diatheke/CLI and diatheke/CGI you can download version 4.0 from:

For diatheke/TCL and HANDiatheke you can download version 2.0 from ftp://ftp.crosswire.org/pub/sword/frontend/diatheke/diatheke-2.0.tar.gz.

For ActiveDiatheke you can download a preliminary version from ftp://ftp.crosswire.org/pub/sword/frontend/diatheke/ActiveDiatheke.zip.

Diatheke option filters

Module option filters are off by default in diatheke. They must be specified to include the featured property in the output.

Valid (output) option_filters values are:

a  (Greek Accents)              for modules with GlobalOptionFilter=UTF8GreekAccents[1]
b  (Bi-Directional Reordering)  for modules with Direction=BiDi or Direction=RtoL
c  (Hebrew Cantillation)        for modules with GlobalOptionFilter=UTF8Cantillation
e  (Word Enumerations)          for modules with GlobalOptionFilter=OSISEnum
f  (Footnotes)                  for modules with footnotes (GBF/ThML/OSIS)
g  (Glosses/Ruby)               for modules with GlobalOptionFilter=OSISGlosses or GlobalOptionFilter=OSISRuby
h  (Section Headings)           for modules with headings (GBF/ThML/OSIS)
i  (Intros)                     for modules with introduction divisions (GBF/ThML/OSIS)
l  (Lemmas)                     for modules with GlobalOptionFilter=ThMLLemma or GlobalOptionFilter=OSISLemma
M  (Morpheme segmentation)[2]   for modules with GlobalOptionFilter=OSISMorphSegmentation
m  (Morphology)                 for modules with GlobalOptionFilter=ThMLMorph or GlobalOptionFilter=OSISMorph
n  (Strong's numbers)           for modules with Strong's Numbers (GBF/ThML/OSIS)
p  (Arabic Vowels)              for modules with GlobalOptionFilter=UTF8ArabicPoints
r  (Arabic Shaping)             for modules with Arabic/Persian script (required for proper rendering in Linux)
s  (Scripture Crossrefs)        for modules with GlobalOptionFilter=ThMLScripref or GlobalOptionFilter=OSISScripref
t  (Algorithmic Transliterations via ICU)
v  (Hebrew Vowels)              for modules with GlobalOptionFilter=UTF8HebrewPoints
w  (Red Words of Christ)        for modules with RedLetterWords (GBF/ThML/OSIS)
x  (Encoded Transliterations)   for modules with GlobalOptionFilter=OSISXlit

Notes:

  1. Refer to Module configuration files for this and similar items listed.
  2. Added to source code on 01/07/2015 by KK.

Diatheke search types

This section needs expanding.

Valid search_type values are: phrase (default), regex, multiword, attribute, lucene, multilemma.

Search type lucene only works when the module already has a Lucene search index. Such an index can be created by means of either an installed front-end app such as Xiphos, or using the command line Sword utility mkfastmod.
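The two steps described above (build the index once, then search it) could be wrapped in a pair of shell functions. This is only a sketch: the function names build_index and lucene_search are hypothetical, and it assumes that mkfastmod accepts the module name as its only argument and that you have write permission for the module's data directory.

```shell
# build_index MODULE
# Create the Lucene search index for MODULE with the SWORD utility mkfastmod.
# (build_index is a hypothetical helper name; calling mkfastmod with just
# the module name is an assumption.)
build_index() {
    mkfastmod "$1"
}

# lucene_search MODULE QUERY...
# Run a lucene-type search. -k comes last because every argument
# after it becomes part of the query key.
lucene_search() {
    module=$1
    shift
    diatheke -b "$module" -s lucene -k "$@"
}
```

Typical use would be to run build_index KJV once and then call lucene_search KJV "my people" as often as needed.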

Search type regex has some limitations. It doesn't yet fully support UTF-8 encoded text, so the results you get may not be what you expected. For example:

diatheke -b KJV -s regex -k Abed...nego

gives:

Verses containing "Abed...nego"-- Daniel 1:7 ; Daniel 2:49 ; Daniel 3:12 ; Daniel 3:13 ; Daniel 3:14 ; Daniel 3:16 ; Daniel 3:19 ; Daniel 3:20 ; Daniel 3:22 ; Daniel 3:23 ; Daniel 3:26 ; Daniel 3:28 ; Daniel 3:29 ; Daniel 3:30 -- 14 matches total (KJV)

Each dot in the search query may represent "any single byte" rather than "any single character". The en dash (U+2013) in the 'hyphenated' name Abed–nego is encoded in UTF-8 as three bytes (E2 80 93), so it can be matched by three dots. Whether it is depends on whether SWORD/diatheke was compiled with or without cxx11regex.
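The byte count behind the three dots can be checked without diatheke. The sketch below uses only printf, wc and tr, and writes the en dash as octal escapes (342 200 223 is E2 80 93 in hexadecimal). The function names are illustrative only.

```shell
# Number of bytes in the UTF-8 encoding of U+2013 EN DASH.
endash_bytes() {
    printf '\342\200\223' | wc -c | tr -d '[:space:]'
}

# Number of bytes in the name "Abed–nego" when written with an en dash:
# 4 bytes for "Abed", 3 for the dash, 4 for "nego".
name_bytes() {
    printf 'Abed\342\200\223nego' | wc -c | tr -d '[:space:]'
}
```

endash_bytes prints 3 and name_bytes prints 11, which is why a byte-oriented regex needs three dots between "Abed" and "nego".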

Search type attribute can be used to query special features in a module. e.g.

diatheke -b KJV -s attribute -k Heading///Neginoth

should give the following output:

Entries containing "Heading///Neginoth"-- Psalms 4:1 ; Psalms 6:1 ; Psalms 54:1 ; Psalms 55:1 ; Psalms 67:1 ; Psalms 76:1 ;  -- 6 matches total (KJV)
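Search results arrive as a single summary line, which is awkward to process further. As a rough sketch, a filter like the following could split such a line into one reference per line. The function name search_refs is hypothetical, and the filter assumes the summary format shown in the examples on this page, including the "none" reply diatheke gives when nothing matches.

```shell
# search_refs: read a diatheke search summary on stdin and print
# one reference per line. (Hypothetical helper; it relies on the
# 'containing "KEY"-- ref ; ref ; ... -- N matches total (MODULE)' layout.)
search_refs() {
    sed -e 's/^[^"]*"[^"]*"-- //' \
        -e '/^none /d' \
        -e 's/ *-- [0-9]* matches total.*$//' |
        tr ';' '\n' |
        sed -e 's/^ *//' -e 's/ *$//' -e '/^$/d'
}
```

For example, diatheke -b KJV -s attribute -k Heading///Neginoth | search_refs would turn the output above into six lines, from "Psalms 4:1" to "Psalms 76:1".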

How do I use diatheke/CLI?

Calling diatheke without any parameters will result in the command line syntax help being output to stderr.

The query_key (-k) must be the last argument because all further arguments are added to the key.
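One way to avoid putting an option after the key by mistake is a small wrapper that always emits -k last. The following is a minimal sketch; dk_opts is a hypothetical name and not part of diatheke.

```shell
# dk_opts MODULE FILTERS REFERENCE...
# Look up REFERENCE in MODULE with the given option filter letters switched on.
# (dk_opts is a hypothetical helper name.)
dk_opts() {
    module=$1
    filters=$2
    shift 2
    # -k goes last: everything after it is treated as part of the query key.
    diatheke -b "$module" -o "$filters" -k "$@"
}
```

With this wrapper, dk_opts KJV n "Ga 1:1" and dk_opts KJV n Ga 1:1 both pass the whole reference to diatheke as the key.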

Examples

The following are a few examples of calling diatheke from the command line. Book names can be abbreviated, provided the abbreviation is unambiguous.

Retrieve Acts ch 10

   diatheke -b KJV -k "Acts 10"

First five verses of above

   diatheke -b KJV -m 5 -k "Acts 10"

Acts chapters 1 and 2

   diatheke -b KJV -k "Acts 1-2"

Genesis 1:1

   diatheke -b KJV -k G 1:1

Galatians 1:1 w/ Strong's (if available)

   diatheke -b KJV -o n -k "Ga 1:1"

I Corinthians 1:1 (also "ic 1:1")

   diatheke -b KJV -o n -k "1c 1:1"

Revelation 1:1-1:7

   diatheke -b KJV -k "Rev 1:1-7"

Revelation 1:1

   diatheke -b KJV -m 1 -k "R 1:1-7"

Revelation 1:1,1:3,1:7 as HTML (w/ <p>, <i>, etc. tags)

   diatheke -b KJV -f HTML -k R 1:1,3,7

Luke 3:35 with Greek accents, showing all variants

   diatheke -b TischMorph -o a -v -1 -k Luke 3:35

verses with "my people", quotations optional

   diatheke -b KJV -s phrase -k "my people"

verses with "skin" and "bones"

   diatheke -b KJV -s multiword -k skin bones

verses with "church" or "assembly"

   diatheke -b KJV -s regex -k "church | assembly"

Strong's Greek 3056

   diatheke -b StrongsGreek -k 3056

Definition of "horn" in Two Babylons

   diatheke -b 2BabDict -k horn

Entry for John 1:1 in Family Bible Notes

   diatheke -b Family -k Jn 1:1

Entry for "Lion" in Scripture Alphabet Of Animals

   diatheke -b SAOA -k "Lion"

Entry for "olive-tree" in Easton's Bible Dictionary

   diatheke -b Easton -k olive-tree

Matthew 24 from Westcott Hort Greek NT transliterated into Latin script

   diatheke -b WHNU -t Latin -o mn -k "Mt 24"

System keys

If the module name given with -b is "system", you may use these system keys as the query key: "modulelist", "modulelistnames", "bibliography", and "localelist". For example:

diatheke -b system -k modulelist

will generate a complete list of all installed modules, each with its Description.
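The system keys are handy in scripts, for instance to check that a module is installed before querying it. The sketch below assumes that the "modulelistnames" key prints one module name per line. The function name has_module is hypothetical.

```shell
# has_module NAME
# Succeed if NAME appears as a whole line in the list of installed module names.
# (has_module is a hypothetical helper; one name per line is an assumption
# about the output of the "modulelistnames" system key.)
has_module() {
    diatheke -b system -k modulelistnames | grep -qxF -- "$1"
}
```

A script could then guard a lookup with: has_module KJV && diatheke -b KJV -k "Acts 10"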

Diatheke output

The plain text output of diatheke marks any OSIS highlight elements (e.g. <hi type="italic">n</hi>) by wrapping the highlighted text between asterisks (e.g. *n*). It does this whatever the value of the type attribute.
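If the asterisks are unwanted, they can be stripped after the fact. The following is only a sketch with a hypothetical function name: it removes every pair of asterisks on a line, so it would also strip any asterisks that legitimately belong to the text.

```shell
# strip_hi_markers: read text on stdin and remove the paired asterisks
# that wrap highlighted words, e.g. "*n*" becomes "n".
# (Hypothetical helper; it cannot tell highlight markers from literal asterisks.)
strip_hi_markers() {
    sed 's/\*\([^*]*\)\*/\1/g'
}
```

It is meant to sit at the end of a pipeline, for example: diatheke -b KJV -k "Acts 10" | strip_hi_markers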

Output formats

Valid output_format values are: CGI, GBF, HTML, HTMLHREF, LaTeX[1], OSIS, RTF[2], ThML, XHTML, plain, and internal (default)[3][4].

Output encodings

Valid output_encoding values are: Latin1, UTF8 (default), UTF16, HTML, and RTF.[5][6]

Notes:

  1. Only available after diatheke version 4.7. The LaTeX option produces a compilable document, but the result may well require tweaking to be usable.
  2. The RTF output does not include the header lines, such as a font colour table, that a complete Rich Text File needs. If the output is redirected to a Rich Text File, for example with the option filter -o w (Red Words of Christ), the user must add such a header so that the red letter text can be viewed as such.
  3. The word internal here is merely a handle to distinguish the default from the other output formats.
    It does not imply that the output encoding is restricted in any way.
  4. The default previously was plain text output.
  5. Output encodings determine how any printable non-ASCII characters in the output are encoded.
  6. There is no hyphen in UTF8 or UTF16 even though a hyphen might be expected.
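A script that accepts a format or encoding name from its user may want to check it against the lists above before calling diatheke. The following is a sketch only, with hypothetical function names. It accepts exactly the spellings listed on this page, whereas diatheke itself may be more lenient about capitalisation.

```shell
# is_valid_format NAME: succeed if NAME is one of the output_format
# values listed on this page. (Hypothetical helper.)
is_valid_format() {
    case $1 in
        CGI|GBF|HTML|HTMLHREF|LaTeX|OSIS|RTF|ThML|XHTML|plain|internal) return 0 ;;
        *) return 1 ;;
    esac
}

# is_valid_encoding NAME: succeed if NAME is one of the output_encoding
# values listed on this page. Note the missing hyphen: UTF8, not UTF-8.
is_valid_encoding() {
    case $1 in
        Latin1|UTF8|UTF16|HTML|RTF) return 0 ;;
        *) return 1 ;;
    esac
}
```

With these helpers, is_valid_encoding UTF8 succeeds while is_valid_encoding UTF-8 fails, which catches the hyphen mistake mentioned in note 6.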

Tools that use Diatheke

AutoKey script for The SWORD Project

Ryan V (adyeths) has developed a script for the AutoKey 0.6x utility (http://autokey.sourceforge.net/) that pastes Bible text given a reference. It works with OpenOffice, plain text editors, or any other Linux program where you might need to paste scripture passages. It requires Diatheke in order to function. You can download it from his website: http://sites.google.com/site/adyeths/theswordproject