Difference between revisions of "Talk:Localized Language Names"
From CrossWire Bible Society
Revision as of 03:11, 13 November 2009
Some languages have multiple scripts. What is the proper way to show that? E.g. Traditional vs Simplified Chinese? And I think Azeri has multiple scripts.
For BibleDesktop, we have localized zh (traditional) and zh_CN (simplified). This is not quite right, but fits pragmatically based on Java's locale and resource bundle mechanism.
For Java, a locale is lang, lang_country, lang_country_variant, or lang__variant (where country is unstated). Java calls the third field the variant rather than the dialect. There is no standard for its values, so it could be used for script.
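As a rough illustration of the four shapes listed above, the sketch below builds each one with the classic three-argument `java.util.Locale` constructor and prints its underscore form. The class name `LocaleShapes` and the helper `shape` are made up for this example, and putting `Hant` in the variant field is only the workaround being discussed here, not an established convention. Recent JDKs mark these constructors as deprecated, hence the suppressed warning.

```java
import java.util.Locale;

public class LocaleShapes {
    // Hypothetical helper: builds a Locale from language, country and variant
    // and returns its underscore-separated string form.
    @SuppressWarnings("deprecation")
    static String shape(String language, String country, String variant) {
        return new Locale(language, country, variant).toString();
    }

    public static void main(String[] args) {
        // lang
        System.out.println(shape("zh", "", ""));
        // lang_country
        System.out.println(shape("zh", "CN", ""));
        // lang_country_variant (variant field abused to carry a script code)
        System.out.println(shape("zh", "TW", "Hant"));
        // lang__variant: the empty country leaves a double underscore
        System.out.println(shape("zh", "", "Hant"));
    }
}
```

The double underscore in the last case is what keeps the variant from being read as a country.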
I think there is a standard for scripts, and that CLDR and ICU are starting to do something with it. Any thoughts? --Dmsmith 00:37, 13 November 2009 (UTC)
- Script codes come from ISO 15924 (http://unicode.org/iso15924/iso15924-codes.html). According to BCP 47, the script goes between the language and the region, so en-Latn-US would be English written in Latin script in/for the US. (This is a bad example, because BCP 47 says not to overspecify by naming a script when it should be obvious.) BCP 47 specifies a hyphen as the separator between subtags, but I would guess that Java would be happy with the same style of tags if you just map the hyphens to underscores.
- Traditional Chinese is zh-Hant, simplified is zh-Hans. Mongolian in Mongolian script, Cyrillic, and Latin would be mn-Mong, mn-Cyrl, and mn-Latn respectively.
- This is all incorporated into CLDR and ICU, but more importantly is officially recognized/maintained by ISO, Unicode, and IANA. --Osk 03:11, 13 November 2009 (UTC)
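For anyone who wants to try the tags above, here is a minimal sketch. It assumes Java 7 or later, where `java.util.Locale` gained `forLanguageTag`, `getScript` and `toLanguageTag`; those methods did not exist in the Java releases current when this thread was written. The class name `ScriptTags` and both helper methods are invented for the example.

```java
import java.util.Locale;

public class ScriptTags {
    // Hypothetical helper: returns the ISO 15924 script subtag of a BCP 47
    // language tag, or the empty string if the tag names no script.
    static String scriptOf(String tag) {
        return Locale.forLanguageTag(tag).getScript();
    }

    // Hypothetical helper: the hyphen-to-underscore mapping guessed at above.
    // This is plain string replacement and does no validation.
    static String underscoreForm(String tag) {
        return tag.replace('-', '_');
    }

    public static void main(String[] args) {
        String[] tags = {"zh-Hant", "zh-Hans", "mn-Mong", "mn-Cyrl", "mn-Latn", "en-Latn-US"};
        for (String tag : tags) {
            Locale locale = Locale.forLanguageTag(tag);
            // Print each component the parser extracted from the tag.
            System.out.println(tag
                    + " language=" + locale.getLanguage()
                    + " script=" + locale.getScript()
                    + " region=" + locale.getCountry()
                    + " underscore=" + underscoreForm(tag));
        }
    }
}
```

Parsing the hyphenated tag directly keeps the script in its own field, rather than relying on an underscore form that older locale code may read as language, country and variant.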