User:Purodha/cldr and lib-c data
- use it,
- make additions quickly and easily on-site,
- contribute to the common libraries for everyone's use.
In order to do that, we need to have:
- import modules
- export modules
that have yet to be programmed.
Use the talk page for discussion and suggestions, and to answer open questions.
Timing and work considerations
Import is obviously the step to begin with. Export need not be ready before the data to be exported has been collected, and can even be postponed until someone is ready to take the data and use it.
A testing and development environment is nearly there, and is likely to be fully functional within days. Early commits can be made so that everyone can monitor progress and run their own tests if they like. Running the code at translatewiki.net would require staff action. Still, even faulty code would be unlikely to break anything else by accident, since it is isolated from the rest.
Whether or not cldr and lib-c imports can be developed in one go is not clear at the moment. It depends on whether they are structurally equal, or at least similar enough. Also, it has to be decided how to deal with non-identical values for identical keys, that is, cases where the two sources contradict one another.
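The question of contradicting values can be made concrete with a small sketch. It assumes that both sources have already been reduced to plain dictionaries sharing one key scheme, which is itself one of the open questions above. The function name `compare_sources`, the key names, and the sample values are invented for illustration and are not taken from either data set.

```python
def compare_sources(cldr, libc):
    """Split the keys shared by both sources into agreeing and conflicting ones.

    Both arguments are dicts of message key -> value. The assumption that
    both sources use the same keys is hypothetical.
    """
    shared = cldr.keys() & libc.keys()
    agreed = {key: cldr[key] for key in shared if cldr[key] == libc[key]}
    conflicts = {
        key: (cldr[key], libc[key])
        for key in sorted(shared)
        if cldr[key] != libc[key]
    }
    return agreed, conflicts


# Invented sample values, only to show the shape of the result.
cldr_data = {"decimal": ",", "group": ".", "only_in_cldr": "x"}
libc_data = {"decimal": ",", "group": " ", "only_in_libc": "y"}

agreed, conflicts = compare_sources(cldr_data, libc_data)
print("agreed:", agreed)
print("conflicts:", conflicts)
```

A list of conflicts like this could be reviewed by hand before any rule for resolving them is fixed. Keys present in only one source are left out on purpose, since they cannot contradict anything.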
Import from cldr
Cldr offers locale data for download from its website in LDML, an XML format that is easy to parse. It contains a hierarchy of items. Both the hierarchy levels and the items have names. This makes it possible to
- generate local message keys by flattening the hierarchy, appending the names to each other with a separator that cannot be part of a name.
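The flattening idea can be sketched as follows, using Python's standard `xml.etree.ElementTree` parser. This is a minimal sketch, not a proposed import module: the separator `|`, the `name[type]` notation for the `type` attribute, and the function name `flatten_ldml` are assumptions made for illustration, and the sample is a small hand-written fragment in LDML style rather than real cldr data. Other attributes that occur in LDML are ignored here.

```python
import xml.etree.ElementTree as ET

# Assumption: this character never occurs in an element name or a type value.
# That has to be checked against the real data before it is relied on.
SEPARATOR = "|"


def flatten_ldml(xml_text, separator=SEPARATOR):
    """Return a dict of flattened key -> text for every leaf element with text."""
    root = ET.fromstring(xml_text)
    messages = {}

    def walk(element, path):
        name = element.tag
        item_type = element.get("type")
        if item_type is not None:
            # Items of the same kind are told apart by their type attribute.
            name = f"{name}[{item_type}]"
        path = path + [name]
        children = list(element)
        if children:
            for child in children:
                walk(child, path)
        elif element.text and element.text.strip():
            messages[separator.join(path)] = element.text.strip()

    walk(root, [])
    return messages


# Hand-written fragment in LDML style, for illustration only.
SAMPLE = """<ldml>
  <localeDisplayNames>
    <languages>
      <language type="de">German</language>
      <language type="fr">French</language>
    </languages>
  </localeDisplayNames>
</ldml>"""

for key, value in flatten_ldml(SAMPLE).items():
    print(key, "=", value)
```

For the fragment above this yields keys such as `ldml|localeDisplayNames|languages|language[de]`. Whether keys of that shape are acceptable as message keys on translatewiki.net is not settled by this sketch.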
There is an overlap of roughly 50% between the languages that translatewiki.net has and the locales that cldr has. They are not named identically.
- We need to decide what to do with locales that we currently do not support.
- We need a mapping for the locales of cldr and languages of translatewiki.net.
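One possible shape for such a mapping is a default rule plus a table of exceptions. The sketch below assumes that most cldr locale identifiers (for example `pt_BR`) can be turned into a translatewiki.net language code by lowercasing and replacing underscores with hyphens, and that the remaining cases are listed by hand. The function names and the single entry in `OVERRIDES` are hypothetical placeholders; the real exceptions would have to be collected from both sides.

```python
# Hypothetical placeholder entry; real exceptions must be collected by hand.
OVERRIDES = {"xx_YY": "xx-special"}


def cldr_to_wiki_code(cldr_locale, overrides=OVERRIDES):
    """Map a cldr locale identifier to a wiki language code.

    Assumed default rule: lowercase, with hyphens instead of underscores.
    """
    if cldr_locale in overrides:
        return overrides[cldr_locale]
    return cldr_locale.replace("_", "-").lower()


def split_by_support(cldr_locales, supported_codes):
    """Separate cldr locales into those with a supported wiki code and the rest."""
    mapped, unsupported = {}, []
    for locale in cldr_locales:
        code = cldr_to_wiki_code(locale)
        if code in supported_codes:
            mapped[locale] = code
        else:
            unsupported.append(locale)
    return mapped, unsupported


# Invented sample input, only to show the two result lists.
mapped, unsupported = split_by_support(["de", "pt_BR", "zz_ZZ"], {"de", "pt-br"})
print("mapped:", mapped)
print("unsupported:", unsupported)
```

The second list is the set of locales for which a decision is still needed, as noted in the first point above.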
There are occasional hints about proposed alternates in the cldr data.
- Hints on alternates could go to either talk pages or editable parts of the message documentation.
There is some explanatory documentation, hints, and format information available from cldr on various items. Some of it is in depth, some merely covers the data types. For some kinds of items, editors must know specific requirements in order to edit them correctly.
- Can the information for editors be imported as a non-editable part of the message documentation?
- Would that be legal? Most likely yes, because such material probably does not qualify as copyrightable work, but it would be better to ask, or to check the license terms. Linking to the information instead is not a good alternative, for practical reasons.
- How to deal with structures local to locales?
- Those of lesser importance for translatewiki.net can be skipped in a first step if no other easy solution is found. There are likely none of considerable urgency.
Reimporting, that is importing a new cldr release once data already exists in translatewiki.net, is still to be assessed.
Import from lib-c
to be added later
- Editing English data should be enabled for this group.
- Editing is likely not to be called "translating" for this data.
- It may be worth considering support for any language defined in ISO 639 for this data. Many things can simply be taken from various published sources by anyone, so selectively restricting language support may be counterproductive.
- CLDR - Unicode Common Locale Data Repository
- LDML - Unicode Locale Data Markup Language
- download locale data - CLDR Releases/Downloads
- survey tool - the CLDR Survey Tool
- translation guidelines - CLDR translation guidelines
- See Weblinks above.
- For instance translatewiki.nets
- Such as international dialling directories, banks' lists of exchange rates, tourist guides, and many others.