DevTools:Conversion to Unicode
From CrossWire Bible Society
For English-language texts that use only ASCII characters, no change to the source encoding is required, because ASCII is a subset of UTF-8. For other European languages and most other languages, simple converters from ISO and national standard encodings to UTF-8 probably already exist. For more complex source encodings, you may need to create your own converter or adapt an existing one. Some currently available conversion tools that you may find useful, depending on your platform and needs, include:
- uconv (part of ICU), available compiled for Win32 in the utilities ZIP at http://crosswire.org/ftpmirror/pub/sword/utils/win32 or in source format from ICU at http://www.icu-project.org/.
- font2uni from CCEL, available at http://www.ccel.org/info/gkheb/.
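For standard encodings, the conversion itself amounts to decoding the text from its source encoding and re-encoding it as UTF-8. The following is a minimal sketch in Python, assuming the source encoding is known and is one of the codecs shipped with Python. The function name `convert_to_utf8` and the sample text are illustrative only and are not part of any CrossWire tool.

```python
def convert_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    # decode() is strict by default: it raises UnicodeDecodeError if the
    # input contains bytes that are not valid in the source encoding,
    # which is a useful early warning that the encoding was guessed wrongly.
    text = data.decode(source_encoding)
    return text.encode("utf-8")


# Hypothetical sample: a German word stored as ISO-8859-1 (Latin-1).
latin1_bytes = "Größe".encode("iso-8859-1")
utf8_bytes = convert_to_utf8(latin1_bytes, "iso-8859-1")

# Pure ASCII input comes out byte-for-byte unchanged.
ascii_bytes = convert_to_utf8(b"In the beginning", "ascii")
```

A strict decode is used here on purpose: silently replacing undecodable bytes would hide a wrong guess about the source encoding.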
uconv is best suited for standard encodings and font2uni is best suited for font-specific encodings.
When creating XML texts, the only entities that should be used are &amp; for '&' and &lt; for '<'. All other entities should be replaced by the characters they represent, encoded as UTF-8.
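The entity rule can be applied mechanically. The sketch below, in Python, replaces named and numeric character references with the characters they stand for, while leaving '&' and '<' in their escaped forms. It assumes the entities occur in text content rather than inside attribute values, and that the named entities are the HTML 4 set known to Python's `html.entities` module; the function name `resolve_entities` is hypothetical.

```python
import re
from html.entities import name2codepoint

# Matches decimal (&#233;), hexadecimal (&#xE9;) and named (&eacute;) references.
_ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

# The two characters that must stay escaped in the XML text.
_KEEP = {"&": "&amp;", "<": "&lt;"}


def resolve_entities(text: str) -> str:
    """Replace entities with literal characters, except '&' and '<'."""

    def replace(match: "re.Match[str]") -> str:
        body = match.group(1)
        if body.startswith(("#x", "#X")):
            codepoint = int(body[2:], 16)
        elif body.startswith("#"):
            codepoint = int(body[1:])
        elif body in name2codepoint:
            codepoint = name2codepoint[body]
        else:
            return match.group(0)  # unknown name: leave it untouched
        try:
            char = chr(codepoint)
        except (ValueError, OverflowError):
            return match.group(0)  # not a valid code point: leave it untouched
        # References that resolve to '&' or '<' are written back escaped.
        return _KEEP.get(char, char)

    return _ENTITY.sub(replace, text)


sample = "Caf&eacute; &amp; th&#233; &lt; &#x3b1;&beta; &copy;"
converted = resolve_entities(sample)
utf8_output = converted.encode("utf-8")  # bytes ready to be written to the XML file
```

Numeric references such as `&#38;` are normalized to `&amp;` rather than to a bare '&', so the output remains well-formed.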