
Some questions from new developers

Thanks for the comment. I will write my own widget to handle the translation of the messages.

Anyway, I'm still curious whether I can use something like Chinese for the message IDs; after all, non-English wikis also use non-English page titles.

Also, is there a way to observe all the changes in a project?

Kanashimi (talk) 22:58, 10 February 2022

Chinese IDs are possible, notably if your project uses Chinese as its primary development language. I've not seen many projects doing this with MediaWiki, but I have seen it in other open projects.

Technically nothing would prohibit using Arabic or Hebrew as well. However, Bidi reordering creates additional difficulties if these identifiers are not "isolated" or surrounded by enclosing pairs of punctuation such as parentheses or brackets. Even with the MediaWiki syntax, the brackets/braces that enclose links to page names, transclude templates, or call parser functions cause problems when a link includes a pipe and a display label in another language/script that would itself need to be isolated. We frequently see contributors having difficulty entering them, because of the default Bidi reordering and the absence by default of any isolation that would preserve the input order.
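To make the "isolation" point concrete, here is a minimal Python sketch that wraps an identifier in the Unicode directional isolate characters (U+2068 FIRST STRONG ISOLATE and U+2069 POP DIRECTIONAL ISOLATE) before embedding it in surrounding text. The function names `isolate` and `strip_isolates` are made up for this illustration; this is not something MediaWiki or Translatewiki.net does for you.

```python
# Unicode bidi isolate controls (defined in the Unicode Bidirectional Algorithm).
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction taken from first strong character
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes the isolate

def isolate(identifier: str) -> str:
    """Wrap an identifier so its direction does not affect the surrounding text."""
    return f"{FSI}{identifier}{PDI}"

def strip_isolates(text: str) -> str:
    """Remove the isolate controls again, e.g. before using the text as a key."""
    return text.replace(FSI, "").replace(PDI, "")

# Hypothetical Hebrew identifier embedded in an English sentence.
message = f"See message {isolate('שלום')} (reviewed)."
```

The isolates are invisible formatting characters, so they should be stripped again before the identifier is compared or stored, which is what `strip_isolates` is for.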

With Chinese identifiers, there's no Bidi problem. The problem is that translators often need to refer to messages easily, and they can't type the identifiers unless they have a Chinese IME and know how to use it. On the other hand, almost everyone can type ASCII on their keyboard, but then the translation interface must still provide a way to display and copy these identifiers for reference (in MediaWiki these identifiers should be suitable as wiki page names).

I suggest you look at the Unicode and JavaScript specifications for "internationalized identifiers" and which characters are suitable. Unicode (and CLDR) define the relevant character properties.
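As a rough way to explore this, the Python sketch below reports a few properties of a candidate identifier. It relies on Python's own identifier rules (`str.isidentifier()`, which are based on the Unicode XID_Start/XID_Continue properties), so it only approximates what JavaScript or a given project would accept. The helper name `describe` and the sample identifiers are assumptions for illustration.

```python
import unicodedata

def describe(identifier: str) -> dict:
    """Report a few properties relevant to choosing message identifiers."""
    return {
        # Python's rule, derived from Unicode XID_Start / XID_Continue.
        "valid_python_identifier": identifier.isidentifier(),
        # Can it be typed on practically any keyboard?
        "ascii_only": identifier.isascii(),
        # Does compatibility normalization (NFKC) leave it unchanged?
        "nfkc_stable": unicodedata.normalize("NFKC", identifier) == identifier,
    }

for candidate in ("edit_summary", "edit-summary", "消息标题"):
    print(candidate, describe(candidate))
```

A Chinese identifier passes the identifier check but not the ASCII check, while a hyphenated key is easy to type but is not an identifier by these rules, so the two constraints are independent.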

Things are simpler if identifiers are limited to characters that everyone can type on their keyboard, and if they start and end with characters that have strong directionality, i.e. not punctuation or symbols with weak or neutral direction such as quotation marks or hyphens/dashes/underscores, which may still appear inside identifiers. C/C++/Java identifiers allow underscores anywhere, ignoring this constraint and causing problems for users of RTL scripts. Other languages also allow significant leading or trailing punctuation/symbols such as $ or %, and these cause problems when the identifiers are embedded in other contexts, forcing the use of an escaping system that depends on each environment.

If possible, identifiers should work without needing these escaping mechanisms, and the namespace in which they are defined and used should also provide some useful equivalences, e.g. between underscores and hyphens in the middle of identifiers, and some alternate punctuation for namespace separation. Identifiers and composite identifiers should ideally respect such constraints and avoid characters that require a complex IME; otherwise the project will stay largely confined to the smaller set of contributors or translators who can work with them. Unfortunately, this limits most projects to simpler alphabets, possibly augmented with basic diacritics, and requires some discipline in the use of separators in the middle of identifiers.
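The "strong directionality at both ends" rule can be checked mechanically. The sketch below uses the bidirectional class from Python's `unicodedata` module; the function name `has_strong_edges` is an assumption for this example, and the rule itself is only the guideline described above, not a requirement of any tool.

```python
import unicodedata

# Bidi classes with strong direction: Left-to-Right, Right-to-Left, Arabic Letter.
STRONG_CLASSES = {"L", "R", "AL"}

def has_strong_edges(identifier: str) -> bool:
    """True if the first and last characters both have strong directionality."""
    if not identifier:
        return False
    first = unicodedata.bidirectional(identifier[0])
    last = unicodedata.bidirectional(identifier[-1])
    return first in STRONG_CLASSES and last in STRONG_CLASSES

for candidate in ("edit_summary", "_private", "$total", "key1", "שלום"):
    print(candidate, has_strong_edges(candidate))
```

Note that European digits are also weak (class EN), so an identifier ending in a digit fails this check even though it contains no punctuation.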

You may have other views, but since you are asking here, I think you should consider the needs of real internationalisation for users of any language: do you want to maximize the translatability of your project, or only to have it translated into languages whose users interact and cooperate well enough with Chinese users?

Maybe your project could also support alternate identifiers (defined aliases) written in alternate scripts, by using an internal registry of these equivalences. For now, however, Translatewiki.net has no way to be aware of these equivalences and recognize aliases, even if there's limited support through redirects for the wiki page names that these identifier aliases may be mapped to.
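Such a registry would live inside your own project. A minimal sketch, assuming a hypothetical `AliasRegistry` class and made-up identifiers, could look like this in Python; nothing here corresponds to an existing Translatewiki.net feature.

```python
import unicodedata

def _normalize(name: str) -> str:
    # Normalize to NFC so visually identical spellings compare equal.
    return unicodedata.normalize("NFC", name)

class AliasRegistry:
    """Maps alias identifiers (any script) to one canonical identifier."""

    def __init__(self) -> None:
        self._canonical: dict[str, str] = {}

    def register(self, canonical: str, *aliases: str) -> None:
        for name in (canonical, *aliases):
            key = _normalize(name)
            existing = self._canonical.get(key)
            if existing is not None and existing != canonical:
                raise ValueError(f"{name!r} already maps to {existing!r}")
            self._canonical[key] = canonical

    def resolve(self, name: str) -> str:
        # Raises KeyError for unknown identifiers.
        return self._canonical[_normalize(name)]

registry = AliasRegistry()
registry.register("edit-summary", "编辑摘要")  # hypothetical ASCII key and Chinese alias
```

Only the canonical (here ASCII) key would be exported for translation; the aliases stay a convenience for developers who prefer their own script.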

Verdy p (talk) 07:26, 11 February 2022

Thank you for the explanation.

Anyway, for search purposes, it is best to use English. However, sometimes I may need to change a message ID if there is an error. There seems to be a function in translatewiki to automatically search for similar phrases, so maybe changing the message ID won't cause too many problems.

Kanashimi (talk) 12:20, 11 February 2022

Yes, there is such a function, but contact the site admins for details. I've not used it, but I've seen that there are some known issues and ways to do it correctly. Note that for now, when a message gets renamed or when a message group is split into two separate ones, the moved messages may need to be reviewed again, causing additional work for translators. I've seen a few Phabricator tasks about this.

Verdy p (talk) 15:11, 11 February 2022
 

If you only change a message key and keep the message content identical, our system automatically suggests renaming it along with all its translations. This is reviewed by a translation admin. On the next export, all languages will be updated with the renamed key.
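The idea can be illustrated with a small Python sketch. This is not the actual implementation used on the site; the function names and sample keys are assumptions, and it only pairs a removed key with an added key when their source text is identical and the match is unambiguous.

```python
from collections import defaultdict

def suggest_renames(old_source: dict, new_source: dict) -> dict:
    """Pair removed keys with added keys that carry identical source text."""
    removed = {k: t for k, t in old_source.items() if k not in new_source}
    added = {k: t for k, t in new_source.items() if k not in old_source}

    removed_by_text = defaultdict(list)
    for key, text in removed.items():
        removed_by_text[text].append(key)
    added_by_text = defaultdict(list)
    for key, text in added.items():
        added_by_text[text].append(key)

    renames = {}
    for text, old_keys in removed_by_text.items():
        new_keys = added_by_text.get(text, [])
        # Only suggest a rename when exactly one old and one new key match.
        if len(old_keys) == 1 and len(new_keys) == 1:
            renames[old_keys[0]] = new_keys[0]
    return renames

def apply_renames(translations: dict, renames: dict) -> dict:
    """Carry existing translations over to the renamed keys."""
    return {renames.get(key, key): text for key, text in translations.items()}

old = {"save-buton": "Save", "cancel": "Cancel"}
new = {"save-button": "Save", "cancel": "Cancel"}
renames = suggest_renames(old, new)
french = apply_renames({"save-buton": "Enregistrer", "cancel": "Annuler"}, renames)
```

If the source text changes together with the key, no rename is suggested in this sketch, which matches the condition above that the message content stays identical.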

Nike (talk) 16:33, 11 February 2022