Difference between revisions of "Talk:File Formats"

From CrossWire Bible Society
(Request for info: SWORD internal file formats)
(Go Bible: new section)
 
(15 intermediate revisions by 3 users not shown)
Line 1: Line 1:
== non-SWORD software ==
+
[[Talk:File Formats/Archive]]
  
I couldn't see the value of listing other Bible software in our Wiki, but tolerated it since it was supposed to be further developed into something more worthwhile. However, now that it's become largely an ad for non-Sword software, I'm going to remove the whole section.
+
== Biblical texts stored in SQLite database format ==
  
Only material related, relevant, or useful to CrossWire and/or the SWORD Project need go in our Wiki. Material not fitting those categories can go in someone else's Wiki and will be deleted from this one.
+
Should you come across a Biblical text stored in [https://en.wikipedia.org/wiki/SQLite SQLite] database format, this open-source program may prove useful.
 +
: [http://sqlitebrowser.org/ DB Browser for SQLite]
 +
The source files are hosted on GitHub, and builds are available for Windows, Mac OS X, Linux, Arch Linux, Fedora, Ubuntu and derivatives, and FreeBSD. You can export a table to CSV format, which could be a stepping stone towards making a SWORD module with other utilities.
 +
:[[User:David Haslam|David Haslam]] ([[User talk:David Haslam|talk]]) 12:34, 16 December 2016 (MST)
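
The CSV export mentioned above can also be scripted rather than done through the GUI. The following is a minimal sketch using only Python's standard library; the <code>verses</code> table and its columns are hypothetical, since a real Bible database will have its own schema, and the helper names are illustrative only.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, header row first."""
    # Table names cannot be bound as SQL parameters, so quote the identifier
    # by hand (doubling any embedded double quotes).
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = conn.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


def demo():
    """Build a tiny in-memory database and return its CSV export as a string."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE verses (book TEXT, chapter INTEGER, verse INTEGER, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO verses VALUES (?, ?, ?, ?)",
        [("Gen", 1, 1, "In the beginning"), ("Gen", 1, 2, "And the earth")],
    )
    out = io.StringIO()
    export_table_to_csv(conn, "verses", out)
    conn.close()
    return out.getvalue()
```

For a file on disk, the same function could be called with <code>sqlite3.connect("some.db")</code> and an open text file in place of the in-memory objects.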
  
--Osk
+
== Deletion ==
  
== Go Bible & collaboration with CrossWire ==
+
I am deleting references to all defunct, foreign or otherwise irrelevant formats and their utilities. I am also removing references to undocumented utilities. There are a few within the source tree which possibly need removing, but we are not helping matters by keeping references here.
  
I have just added some information about Go Bible. Please visit my user page to see why, rather than removing this. [[User:David Haslam|David Haslam]] 15:47, 7 August 2008 (MDT)
+
Further, we no longer support GBF other than in legacy modules. Nor have we seen for years (i.e. since I have been part of the project) any suggested imports from STEP. I will generally pare this page down to be specific to tools currently in use, encouraged and useful. [[User:Refdoc|refdoc]]:[[User_Talk:Refdoc|talk]] 04:11, 8 January 2018 (MST)
 +
:New sections to a talk page should always be added at the bottom. Sections should be in chronological order. --[[User:David Haslam|David Haslam]] ([[User talk:David Haslam|talk]]) 04:47, 8 January 2018 (MST)
  
== Tesseract OCR software ==
+
::As we still accept '''IMP format''', this section should not have been removed. Please re-instate it. [[User:David Haslam|David Haslam]] ([[User talk:David Haslam|talk]]) 16:25, 11 March 2018 (UTC)
 +
:::Found it and added a link to the new page under '''See also'''. [[User:David Haslam|David Haslam]] ([[User talk:David Haslam|talk]])
  
Having just added a stub section about Tesseract, I'd like to suggest that Troy adds a few words about it. [[User:David Haslam|David Haslam]] 20:03, 9 January 2009 (UTC)
+
== Go Bible ==
  
== Zefania XML ==
+
I am removing this section from the main page. It has some broken links. Go Bible is no longer being developed. [[User:David Haslam|David Haslam]] ([[User talk:David Haslam|talk]]) 12:30, 28 April 2018 (UTC)
  
A couple of days ago, there was a message from Wolfgang Schultz in the Sword dev mailing list which announced that he has removed all websites relating to '''Zefania XML'''. Certainly the [http://www.zefania.de/ link] now goes to a "page under construction" message. [[User:David Haslam|David Haslam]] 19:25, 27 April 2009 (UTC)
+
===Go Bible===
 +
Following an agreement made in July 2008 with the program's author Jolon Faichney, [[Projects:Go Bible|Go Bible]] was adopted by CrossWire as its Java ME software project.
  
:The SourceForge repository for various modules is still available at [http://sourceforge.net/projects/zefania-sharp/]. The most recently uploaded module is the Zürcher Bibel, dated April 12, 2009. [[User:David Haslam|David Haslam]] 10:00, 15 May 2009 (UTC)
+
To achieve fast navigation and general ease of use on even the simplest of Java mobile phones, Go Bible data is fully indexed, as well as being compressed (as are all JAR files). The format is described in [http://code.google.com/p/gobible/wiki/GoBibleDataFormat Go Bible data format]. Go Bible data is structured as Book | Chapter | Verse text and does not support notes, headings, cross-references, etc. The developer kit [http://gobible.jolon.org/developer/welcome.html Go Bible Creator] can take USFM, ThML or OSIS as the source text format, but the source files usually have to be specially adapted. For example, OSIS files produced by Snowfall Software's SFMToOSIS script are not structured in the same way. Work has begun on an [http://en.wikipedia.org/wiki/XSL_Transformations XSLT] script to convert such OSIS XML files to a form suitable for Go Bible. [[Projects:Go Bible/Go Bible Creator|Go Bible Creator]] version 2.3.2 and onwards can take a folder of USFM files as the source text format.
  
::The admins for this SourceForge project are Tom Baccei and Mathieu Delarue. [[User:David Haslam|David Haslam]] 10:03, 15 May 2009 (UTC)
+
Go Bible source code is now available [https://crosswire.org/svn/gobible/ here] on the CrossWire Repository. ''To access this you will need to have an account''.
  
:::This repository is still active. Many more Zefania XML files have been uploaded since I last reported. [[User:David Haslam|David Haslam]] 16:14, 25 July 2009 (UTC)
+
GoBibleDataFormat is being extended in the [[Projects:Go Bible/SymScroll|SymScroll]] branch.
 
+
== PMD files? ==
+
 
+
Does anyone know a way (short of obtaining and installing an old version of PageMaker) to extract the text from a PMD file? [[User:David Haslam|David Haslam]] 09:13, 14 June 2009 (UTC)
+
:Anyone tried [http://createpdf.adobe.com/ Create Adobe® PDF Online]? This online service is only available to subscribers in the US & Canada. ''I live in the UK''. [[User:David Haslam|David Haslam]] 09:33, 14 June 2009 (UTC)
+
 
+
== Please discuss before reverting - the page has now become inaccurate ==
+
 
+
I see that 7 of my recent edits have been reverted by [[User:Osk|Osk]]. In future, please discuss before reverting, as the page has now become inaccurate. [[User:David Haslam|David Haslam]] 11:58, 24 July 2009 (UTC)
+
 
+
:I reverted because one of your last edits erased text following "$$$". This is the second time you've done this kind of careless edit. Last time, I (and I believe another editor) corrected all of the deletions you performed. --[[User:Osk|Osk]] 15:47, 24 July 2009 (UTC)
+
 
+
::I have no idea why that occurred, when all I was doing was changing a redlink in the previous paragraph, and that only during my last edit before your reversion. I'm sure this wasn't through carelessness. I normally check my edits using preview before saving. This is quite weird! Could this be due to a subtle bug in the wiki software?  [[User:David Haslam|David Haslam]] 19:44, 24 July 2009 (UTC)
+
 
+
:::I've been thinking about this further. The two items of text that were unintentionally deleted were both Bible references. A couple of weeks ago I installed a Firefox add-on called [http://www.proginosko.com/refalizer.html Bible Refalizer]. I wonder if that was the cause? I'll do some tests in the wiki sandbox, and report later. [[User:David Haslam|David Haslam]] 21:09, 24 July 2009 (UTC)
+
 
+
::::With '''Bible Refalizer''' enabled, when a wiki page or section is edited, Bible references in the edit box get removed. With Refalizer disabled, everything is OK.  This is a serious bug in the Refalizer add-on for Firefox. I have reported it to the programmer, James Anderson. [[User:David Haslam|David Haslam]] 16:31, 29 July 2009 (UTC)
+
 
+
== usfm2osis.pl ==
+
 
+
Is the word 'rudimentary' still needed before [http://crosswire.org/ftpmirror/pub/sword/utils/perl/usfm2osis.pl usfm2osis.pl]? [[User:David Haslam|David Haslam]] 12:48, 26 October 2009 (UTC)
+
 
+
== Creating and copying from PDF files ==
+
 
+
Provided the document's security properties permit copying, the entire content of a PDF file can be copied using '''Adobe Reader 9.x''' (I have successfully used this to paste a complete Bible text into WordPad). Couple this with the fact that several printer drivers permit printing from any Windows application to a PDF file, and let your imagination run riot. Two such printer drivers are the commercial program '''pdfFactory''' from [http://www.fineprint.com/ Fineprint] and the free program [http://www.pdfforge.org/ PDFCreator]. [[User:David Haslam|David Haslam]] 11:21, 19 November 2009 (UTC)
+
 
+
== modwrite / treeidxutil / xmlcatalog ==
+
 
+
Please would someone provide a suitable description of these SWORD tools. I have added a line for each under Miscellaneous. [[User:David Haslam|David Haslam]] 13:28, 7 December 2009 (UTC)
+
:Still waiting. [[User:David Haslam|David Haslam]] 15:57, 19 December 2009 (UTC)
+
 
+
== SGML section ==
+
 
+
I added the SGML section after learning that one of my contacts in SIL has a task for converting some Folio View files to Logos format, and he's doing it via XML. He's probably not using SP, yet I thought it helpful to record what I have found. [[User:David Haslam|David Haslam]] 17:50, 22 April 2010 (UTC)
+
 
+
== XHTML-TE and Go Bible Creator ==
+
 
+
I am in contact with the SIL employee who is adapting Go Bible Creator to add the option to specify XHTML-TE as a source text format. ''Email me if you'd like further details''. [[User:David Haslam|David Haslam]] 12:21, 25 May 2010 (UTC)
+
 
+
== imp2osis.pl ==
+
 
+
Here is a copy of the help output for imp2osis.pl
+
<pre>
+
imp2osis.pl -- IMP (Sword Import) format to OSIS 2.1.1 converter version 2.0.1
+
Revision 227 (2009-10-30)
+
Syntax: imp2osis.pl <osisWork> <input filename> [-o OSIS-file] [-m]
+
 
+
The -m option will produce milestoned <verse/> elements,
+
which are more likely to produce valid OSIS from Bibles with OSIS markup internally.
+
 
+
No attempt is made to convert markup present in the verse entries themselves,
+
so this tool is appropriate for converting Bibles that already contain OSIS markup or plaintext markup.
+
 
+
This tool is ONLY intended for VersKey-type Sword texts, namely Bibles and commentaries.
+
</pre>
+
''Lightly edited to avoid having to use horizontal scrolling''.
+
[[User:David Haslam|David Haslam]] 16:37, 4 December 2010 (UTC)
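
To illustrate the input side of such a conversion, here is a minimal Python sketch that splits IMP-style text into key/content pairs. It assumes the usual IMP layout, in which a line beginning with <code>$$$</code> carries the entry key and the lines that follow carry the entry content. It is not taken from imp2osis.pl, and the function name <code>parse_imp</code> is hypothetical.

```python
def parse_imp(text):
    """Split IMP-style text into a list of (key, content) pairs.

    Assumed layout: a line starting with "$$$" gives the key (for example
    "Gen 1:1"); every following line up to the next "$$$" line is content.
    Lines that appear before the first key are ignored.
    """
    entries = []
    key = None
    lines = []
    for line in text.splitlines():
        if line.startswith("$$$"):
            # A new key closes the previous entry, if there was one.
            if key is not None:
                entries.append((key, "\n".join(lines).strip()))
            key = line[3:].strip()
            lines = []
        elif key is not None:
            lines.append(line)
    # Flush the final entry.
    if key is not None:
        entries.append((key, "\n".join(lines).strip()))
    return entries
```

As with the tool described above, no attempt is made here to interpret markup inside the entries; the content is passed through untouched.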
+
 
+
== Why I have added the section about ODF ==
+
 
+
Works that are supplied as word processing files are sometimes difficult to convert to OSIS XML. The XML content of an ODF file may prove to be a useful intermediate step for format shifting. In theory, it should be feasible to develop an XSLT script to perform such a transformation. [[User:David Haslam|David Haslam]] 13:41, 23 March 2011 (UTC)
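
As a rough illustration of using the XML inside an ODF file as an intermediate step, the sketch below reads <code>content.xml</code> from the ODF zip container and returns the text of each paragraph, using only Python's standard library. It is a starting point rather than a converter: headings (<code>text:h</code>), styles and footnotes are not handled specially, and the helper names are hypothetical. The sample archive built by <code>make_sample_odt</code> is a stand-in for demonstration and is not a complete, valid ODF document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF "text" vocabulary (text:p, text:h, text:span, ...).
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odf_paragraphs(source):
    """Return the plain text of each <text:p> element in an ODF text document.

    `source` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # itertext() flattens nested spans into the paragraph's text.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def make_sample_odt():
    """Build a minimal in-memory archive containing only content.xml (demo only)."""
    content = (
        "<office:document-content"
        ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
        f' xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>In the beginning</text:p>"
        "<text:p>And the <text:span>earth</text:span></text:p>"
        "</office:text></office:body></office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer
```

An XSLT script, as suggested above, would operate on the same <code>content.xml</code>; the Python version is shown only because it needs no extra tooling.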
+
 
+
== Request for info: SWORD internal file formats ==
+
 
+
I arrived at this page looking for info on SWORD's internal file formats.
+
 
+
There's some detail on "other" formats and references to existing tools that convert between some of them and "SWORD format" but it would be useful to tell more about what "SWORD format" actually IS.
+
 
+
Is there documentation available anywhere about what "SWORD format" looks like externally (file names and/or directory structure) and internally (file layout)?
+
 
+
What if someone wanted to develop a document for use in a sword application?
+
 
+
This page seems to suggest that they'd be best advised to format that information in one of these other formats -- because those are documented -- and trust one of the listed conversion routines to somehow make the data usable in sword apps.
+
 
+
Is it really the case that a direct formatting into Sword format is not worth documenting/considering?
+
 
+
Even the indirect path of importing from one of these other formats leaves questions about the resulting "external" structure like:
+
 
+
How do I identify the result -- like to tell if there's any chance that a conversion succeeded or to distribute it to another machine?
+
 
+
Should I expect to see a file? multiple files? a directory? a directory tree?
+
 
+
How, as in by what file name(s), file extension(s), directory name(s) etc. does an application identify such resources to the sword library?
+
 
+
[[User:Pmartel60|Pmartel60]] 15:12, 1 October 2011 (MDT)
+
 
+
: Sword's format is not and will not be documented (except in the sense that it is documented in formal languages, namely C++ & Java). Our source code is open source, so you are welcome to read it, with the understanding that any reimplementation of our code that results from reading our code (such as a Sword format reader) would necessarily be bound by the GPL. Sword's format is not an open format. It is a proprietary format, prone to change without notice, according to our current and future needs. --[[User:Osk|Osk]] 17:12, 1 October 2011 (MDT)
+

Latest revision as of 12:30, 28 April 2018

Talk:File Formats/Archive

Biblical texts stored in SQLite database format

Should you come across a Biblical text stored in SQLite database format, this open-source program may prove useful.

DB Browser for SQLite

The source files are hosted on GitHub, and builds are available for Windows, Mac OS X, Linux, Arch Linux, Fedora, Ubuntu and derivatives, and FreeBSD. You can export a table to CSV format, which could be a stepping stone towards making a SWORD module with other utilities.

David Haslam (talk) 12:34, 16 December 2016 (MST)

Deletion

I am deleting references to all defunct, foreign or otherwise irrelevant formats and their utilities. I am also removing references to undocumented utilities. There are a few within the source tree which possibly need removing, but we are not helping matters by keeping references here.

Further, we no longer support GBF other than in legacy modules. Nor have we seen for years (i.e. since I have been part of the project) any suggested imports from STEP. I will generally pare this page down to be specific to tools currently in use, encouraged and useful. refdoc:talk 04:11, 8 January 2018 (MST)

New sections to a talk page should always be added at the bottom. Sections should be in chronological order. --David Haslam (talk) 04:47, 8 January 2018 (MST)
As we still accept IMP format, this section should not have been removed. Please re-instate it. David Haslam (talk) 16:25, 11 March 2018 (UTC)
Found it and added a link to the new page under See also. David Haslam (talk)

Go Bible

I am removing this section from the main page. It has some broken links. Go Bible is no longer being developed. David Haslam (talk) 12:30, 28 April 2018 (UTC)

Go Bible

Following an agreement made in July 2008 with the program's author Jolon Faichney, Go Bible was adopted by CrossWire as its Java ME software project.

To achieve fast navigation and general ease of use on even the simplest of Java mobile phones, Go Bible data is fully indexed, as well as being compressed (as are all JAR files). The format is described in Go Bible data format. Go Bible data is structured as Book | Chapter | Verse text and does not support notes, headings, cross-references, etc. The developer kit Go Bible Creator can take USFM, ThML or OSIS as the source text format, but the source files usually have to be specially adapted. For example, OSIS files produced by Snowfall Software's SFMToOSIS script are not structured in the same way. Work has begun on an XSLT script to convert such OSIS XML files to a form suitable for Go Bible. Go Bible Creator version 2.3.2 and onwards can take a folder of USFM files as the source text format.

Go Bible source code is now available here on the CrossWire Repository. To access this you will need to have an account.

GoBibleDataFormat is being extended in the SymScroll branch.