International Components for Unicode (ICU) is an optional but highly recommended library that can be integrated into Sword and Sword applications. It implements numerous Unicode algorithms, incorporates the complete data of The Unicode Standard (TUS) and the Unicode Common Locale Data Repository (CLDR), and provides Unicode-compatible versions of common algorithms.
Other implementations of many of these algorithms and other sources of this data exist, but ICU provides a superior version in most, if not all, cases, owing to the close involvement of members of the Unicode Technical Committee (such as Unicode co-founder Mark Davis) in its development. ICU itself is distributed under a permissive open-source license and benefits from the work of numerous developers at companies such as Google, Apple, and IBM, which in turn use ICU to provide Unicode support in their own applications. If you use Mac OS X, an iPhone, Adobe Acrobat, Logos, OpenOffice.org, Mozilla, Chrome, Safari, or any recent software from SIL, you have already enjoyed the benefits of ICU in your software.
At present, Sword makes use of ICU for casing (used in search), normalization, and script transliteration.
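To illustrate why casing and normalization matter for search, here is a minimal sketch. It uses Python's standard library rather than ICU or the Sword API, and the function name `search_key` is hypothetical; it only shows the general idea of reducing text to a comparison key so that differences in case or in composed versus decomposed characters do not prevent a match. Script transliteration is not covered, since the standard library has no equivalent.

```python
import unicodedata


def search_key(text: str) -> str:
    """Return a key for case- and normalization-insensitive comparison.

    Hypothetical helper for illustration; not part of Sword or ICU.
    """
    # Decompose first so that case folding sees base letters and
    # combining marks separately.
    decomposed = unicodedata.normalize("NFD", text)
    # Case folding is a more aggressive form of lowercasing intended
    # for comparison (for example, Greek final sigma folds to sigma).
    folded = decomposed.casefold()
    # Recompose so that equivalent inputs end up as identical strings.
    return unicodedata.normalize("NFC", folded)


# Precomposed "e with acute" and "e" + combining acute compare equal.
composed = "\u00e9"
decomposed = "e\u0301"
print(search_key(composed) == search_key(decomposed))

# Uppercase and lowercase Greek spellings of the same word compare equal.
upper = "\u039b\u038c\u0393\u039f\u03a3"
lower = "\u03bb\u03cc\u03b3\u03bf\u03c2"
print(search_key(upper) == search_key(lower))
```

A search built this way would compare the key of the query against the keys of the indexed text instead of comparing the raw strings.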
In the future, we would like to add localized collation and employ ICU's regular expression facility.
While using the stock version of ICU will provide much benefit, a customized version of ICU, called icu-sword, is also maintained for Sword users and developers. This version of ICU is maintained in its own SVN repository and is made available via the Sword FTP site.
icu-sword is regularly synchronized with releases of ICU (usually within a day or two). New versions of icu-sword are tested on four platforms before being released: Mac OS X (current version), Borland C++ Builder 5, MS VC++ (current version), and some flavor of Linux (Ubuntu JJ at the time of writing).
icu-sword incorporates a number of changes from the stock version of ICU:
- Additional transliteration schemes, especially for biblical languages, using standard scholarly and interchange transliteration systems
- Removal of unused locales, charset converters, and collation data to reduce download size
- BCB5 project files and fixes for compiler errors encountered when building on BCB5
- Scripts to convert between building a complete ICU data bundle and the smaller resource bundle of data used by Sword