DevTools:ICU

ICU

International Components for Unicode (ICU) is an optional but highly recommended library that can be integrated into Sword and Sword applications. It offers implementations of numerous Unicode algorithms, incorporates complete Unicode Standard (TUS) and Unicode Common Locale Data Repository (CLDR) data, and provides Unicode-compatible versions of common algorithms.

Other implementations of many of these algorithms and other sources of this data exist, but ICU provides a superior version in most, if not all, cases, owing to the close involvement of members of the Unicode TC (such as Unicode co-founder Mark Davis) in the development of ICU. ICU itself is BSD licensed and benefits from the work of numerous developers at companies such as Google, Apple, and IBM. Indeed, companies such as these use ICU to provide Unicode support in their own applications. If you use Mac OS X, an iPhone, Adobe Acrobat, Logos, OpenOffice.org, Mozilla, Chrome, Safari, or any recent software from SIL, you have already enjoyed the benefits of ICU in your software.

At present, Sword makes use of ICU for casing (used in search), normalization, and script transliteration.

In the future we would like to add localized collation and employ ICU's regular expressions facility.

icu-sword

While using the stock version of ICU will provide much benefit, a customized version of ICU, called icu-sword, is also maintained for Sword users and developers. This version of ICU is maintained in its own SVN repository and is made available via the Sword FTP site.

icu-sword is regularly synchronized with releases of ICU (usually within a day or two). New versions of icu-sword are tested on four platforms before being released: Mac OS X (current version), Borland C++ Builder 5 (BCB5), MS VC++ (current version), and some flavor of Linux (Ubuntu JJ at the time of writing).

icu-sword incorporates a number of changes from the stock version of ICU:

  1. Additional transliteration schemes, especially for biblical languages using standard scholarly & interchange transliteration systems
  2. Removal of unused locales, charset converters, and collation data to reduce download size
  3. BCB5 project files and fixes to compiler errors for building on BCB5
  4. Scripts to convert between building a complete ICU data bundle and the smaller resource bundle of data used by Sword

Using icu-sword

The icu-sword tarball contains an icu-sword directory in which all of its source code resides. This directory should be placed at the same level as (that is, within the same parent directory as) a directory named sword containing the Sword source code itself.
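The sibling layout described above can be checked before starting a build. The following is only a minimal sketch: the function name check_sword_layout and the parent directory passed to it are hypothetical and not part of Sword or icu-sword.

```shell
# Hypothetical helper: succeeds (exit status 0) only when both an icu-sword
# directory and a sword directory exist directly under the given parent.
check_sword_layout() {
    parent=$1
    [ -d "$parent/icu-sword" ] && [ -d "$parent/sword" ]
}
```

For example, if both source trees were unpacked under a (hypothetical) $HOME/src directory, running check_sword_layout "$HOME/src" && echo "layout ok" would confirm that the two directories sit side by side.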

BCB5

To build icu-sword in BCB5, use the project files found in icu-sword\as_is\borland. Alternatively, to build BibleCS or other Sword apps that use icu-sword, simply build them from their own projects, which will employ icu-sword correctly if it is in the location described above.

VC++

To build icu-sword in VC++, use the project files found in icu-sword\source\allinone. Alternatively, to build Sword utilities from the Sword source tree, simply build them from their own projects, which will employ icu-sword correctly if it is in the location described above. Note that diatheke additionally requires that you build its ICU version to include ICU support; otherwise a non-ICU version is built.

Incorporating icu-sword into VC++ apps

[list needed defines here]

Linux/Mac OS X

icu-sword can be built on Linux, Mac OS X, and other POSIX-like environments using the standard ICU make procedures. The runConfigure script found in icu-sword/source is used to configure the build. After that, run make and make install as usual. If this is not clear, consult the build documentation on ICU's website.
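The configure, make, and make install sequence can be wrapped in a small shell function, sketched here under stated assumptions. Stock ICU's configure wrapper (runConfigureICU) takes a platform name such as Linux as its first argument; this sketch assumes that icu-sword's runConfigure accepts the same kind of argument, which should be verified against the script itself. The function name build_icu_sword is hypothetical.

```shell
# Hypothetical wrapper around the steps described above.
#   $1: path to the icu-sword/source directory
#   $2: platform name handed to runConfigure (assumed to follow stock ICU
#       conventions, e.g. Linux)
# The body runs in a subshell, so the caller's working directory is unchanged.
# Each step runs only if the previous one succeeded.
build_icu_sword() (
    src_dir=$1
    platform=$2
    cd "$src_dir" &&
    ./runConfigure "$platform" &&
    make &&
    make install
)
```

A call might look like build_icu_sword icu-sword/source Linux. Depending on the install location, the make install step may require elevated privileges.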