About [[Phabricator:arcanist-core-3119f17b3b50c21d/en]]

Edited by author.
Last edit: 17:33, 20 September 2022

Just [ping], as there has not been any reply for 4 months and we still don't know what "nd" alone means or how it can be translated. Normally Arcanist is part of the Phabricator extensions supported by Wikimedia. But maybe it has not been officially retired, or it does not use this specific message because of some disabled feature? I don't know whom to contact if this is supported upstream, or whether this is an old legacy message that is no longer used.

Several messages also seem to be related:

  • arcanist-core-9b02d9974c14e623 ("st") (it is an ordinal suffix hardcoded for rules valid only in English: "1st", "21st", ... "91st", but not "11th"; note also that in French, Spanish, Portuguese and Italian this ordinal suffix varies with grammatical gender, such as "er" vs "re" in French, or "º" vs "ª" in Spanish/Portuguese/Italian...)
  • arcanist-core-3119f17b3b50c21d ("nd") (same indicated PHP source file, just one line below; it is an ordinal suffix hardcoded for rules valid only in English: "2nd", "22nd", ... "92nd", but not "12th"; this does not vary in French but varies in Spanish/Portuguese/Italian...)
  • arcanist-core-9f6194d012e32351 ("rd") (same indicated PHP source file, just one line below; it is an ordinal suffix hardcoded for rules valid only in English: "3rd", "23rd", ... "93rd", but not "13th"; this does not vary in French but varies in Spanish/Portuguese/Italian...)
  • arcanist-core-fa6af6e97d010a98 ("th") (same indicated PHP source file, just another line below, so it is likely an ordinal suffix hardcoded for rules valid only in English, covering all other cases; this does not vary in French but varies in Spanish/Portuguese/Italian...)

Such messages are perfect examples of "patchwork" messages that are NOT correctly internationalisable (they use abbreviation suffixes out of any context and hardcode linguistic rules that apply only to English), whereas CLDR or similar functions would help much more.
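For reference, the English-only rule that these four suffixes appear to encode can be sketched as follows. This is an illustrative Python sketch, not the Arcanist PHP code; the function name `english_ordinal_suffix` is hypothetical.

```python
def english_ordinal_suffix(n: int) -> str:
    """Return the English ordinal suffix for a non-negative integer."""
    # 11, 12 and 13 (and 111, 112, ...) are exceptions that always take "th".
    if 11 <= n % 100 <= 13:
        return "th"
    # Otherwise the last digit decides: 1 -> st, 2 -> nd, 3 -> rd, else th.
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


if __name__ == "__main__":
    for day in (1, 2, 3, 4, 11, 12, 13, 21, 22, 23, 31):
        print(f"{day}{english_ordinal_suffix(day)}")
```

The selection logic depends on nothing but the number, which is exactly why it cannot express gender, case, or prefix notations used in other languages.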

If you really want to support suffixes after numerals for expressing ordinals, these suffixes should be styled in superscript tags (even if it's not needed in English, and should not be done in German, where the suffix will be a ".").

For this reason, Phabricator should use a good converter from numbers to ordinals that can take these into consideration: good i18n templates/modules for that already exist in Commons, with relevant data for many languages, including the special formatters for centuries or millennia.

Verdy p (talk)16:46, 2 August 2022

For some reason, the translation appears 'outdated' while nd is the only possible translation.

Cigaryno (talk)17:44, 24 August 2022

I don't agree; there are other solutions if these two messages are intended to generate ordinal suffixes: don't hardcode ordinals this way; but this requires patching the source code.

  • If we translate to French for example we could use the alternate suffix "e" for all values... except for the 1st ordinal which should be suffixed by "er", but the current hardcoded rules for English don't work this way (e.g. "1st", "11th", "21st" in English, versus "1er", "11e" and "21e" in French)
  • In German this would be simpler as a general suffix "." can be used for all ordinals
  • Some languages require varying the suffix (e.g. by grammatical gender, plural, or grammatical case, according to the nominal group that the ordinal qualifies)
  • Other languages would use a prefix instead, such as "nº 1".
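To make the contrast between these cases concrete, here is a minimal Python sketch of per-language formatters. The function names are hypothetical, and the rules are only the simplified ones described in the list above (French suffixes "er"/"re"/"e", the German ".", and the "nº" prefix).

```python
def ordinal_fr(n: int, feminine: bool = False) -> str:
    # French: only 1 is special ("1er" or "1re"); every other value takes "e".
    if n == 1:
        return "1re" if feminine else "1er"
    return f"{n}e"


def ordinal_de(n: int) -> str:
    # German: a full stop after the digits marks the ordinal.
    return f"{n}."


def ordinal_prefixed(n: int) -> str:
    # Prefix notation: the output does not depend on the value at all.
    return f"nº {n}"


if __name__ == "__main__":
    for n in (1, 11, 21):
        print(ordinal_fr(n), ordinal_de(n), ordinal_prefixed(n))
```

None of these three formatters can be reproduced by translating the four English suffixes alone, because the English grouping of values (1, 21, 31 together; 11 apart) does not match any of them.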

Using these messages as is makes no sense at all except in English. CLDR has resources about ordinal formatting (but only for the nominative or vocative grammatical case). In fact, it would be simpler to adapt messages if ordinals were not made translatable using this patchwork model, but using complete sentences containing the ordinal (with just a placeholder for the numeric value): we could then use the "PLURAL:" parser function if needed, use alternatives like "nº 1" that do not depend on the numeric value, or translate without any ordinal.

Translating suffixes only, without using a placeholder for the numeral value, makes NO sense. These 4 messages (currently hardcoded and tuned for following rules only valid in English) must be replaced by "$1st", "$1nd", "$1rd", "$1th" (or just by a single message using for example "#$1" or "nº $1" without needing any ordinal form) so that we can create correct translations.
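A minimal sketch of this proposal, in Python rather than PHP: each of the four messages becomes a full pattern containing the "$1" placeholder. The catalog layout, the locale keys, and the function names are assumptions made for illustration; this is not how Arcanist or translatewiki.net actually store messages.

```python
# Hypothetical catalogs: the four keys mirror the four existing messages,
# but each value is a complete pattern with a "$1" placeholder.
KEYS = ("st", "nd", "rd", "th")
CATALOGS = {
    "en": {"st": "$1st", "nd": "$1nd", "rd": "$1rd", "th": "$1th"},
    "de": dict.fromkeys(KEYS, "$1."),
    "prefixed": dict.fromkeys(KEYS, "nº $1"),
}


def english_key(n: int) -> str:
    # The message key is still selected with the hardcoded English rule.
    if 11 <= n % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")


def format_ordinal(locale: str, n: int) -> str:
    # Look up the pattern, then substitute the numeral for the placeholder.
    pattern = CATALOGS[locale][english_key(n)]
    return pattern.replace("$1", str(n))


if __name__ == "__main__":
    for locale in CATALOGS:
        print(locale, [format_ordinal(locale, n) for n in (1, 2, 3, 11, 22)])
```

In this sketch the key is still chosen by the English rule, which is why the German and prefixed catalogs simply repeat one pattern under all four keys.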

  • Only German-like translations (also used in some Central European languages) can use a "." suffix for the ordinal indicator to be used in all 4 messages (which can then all be turned to "$1." when using the "$1" placeholder).
  • But that solution is NOT permitted for Romance languages, which allow a final dot as an abbreviation mark only when the word is abbreviated using only its first *letters*; when there are no letters, just digits, we MUST use the final letters of the ordinal suffix, and they vary in gender, possibly even in grammatical case for some languages (so the alternative is not to use suffixed ordinals, but expressions with a common prefix like "numero $1", abbreviated as "nº $1").
  • Various languages do not support any suffixed notation for ordinals, but only prefixed notations.

After searching the code, I found that these ordinals are used for testing the translated formatting of dates, only for days of the month. This is not clear in the current doc. And I don't know if this code also tests for other locales than English.

But then formatted dates do not necessarily use ordinal suffixes (for example in French, only "1" is suffixed, as "1er", whereas all other days of the month do not use any suffix). Note also that the ordinal suffix "er" is superscripted, but I do not know whether the formatted dates can include HTML formatting, so the test could just as well accept "1er" (e.g. when processing dates with a plain-text command-line interface).
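As an illustration of the French case, the following Python sketch formats a date so that only the first day of the month gets a suffix, with an optional HTML superscript. The function name, the `html` flag, and the month table are hypothetical; they are not taken from the Arcanist date formatter.

```python
import datetime

MONTHS_FR = [
    "janvier", "février", "mars", "avril", "mai", "juin",
    "juillet", "août", "septembre", "octobre", "novembre", "décembre",
]


def format_date_fr(d: datetime.date, html: bool = False) -> str:
    # Only day 1 takes an ordinal suffix in French dates.
    if d.day == 1:
        day = "1<sup>er</sup>" if html else "1er"
    else:
        day = str(d.day)
    return f"{day} {MONTHS_FR[d.month - 1]} {d.year}"


if __name__ == "__main__":
    print(format_date_fr(datetime.date(2022, 10, 1)))
    print(format_date_fr(datetime.date(2022, 10, 2)))
    print(format_date_fr(datetime.date(2022, 10, 1), html=True))
```

A test written around the four English suffixes has no way to express this, since days 2, 3, 21, 22 and so on take no suffix at all.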

Other solutions could also use specific superscript characters. These are normally encoded in Unicode only as "modifier letters", or for compatibility with other notations such as phonetic notations or the common abbreviations used in Spanish/Italian/Portuguese, and only for a restricted set of letters: this subset includes 'ª' and 'º' as symbols for compatibility with ISO 8859-1, and superscript letters such as 'ʳ' (U+02B3) and 'ᵉ' (U+1D49), but only as modifier letters for phonetic notation, not as general-purpose superscripts for the final letters of abbreviations in French and other languages. So these "compatibility superscripts" or "modifier letters" are not usable (they are also incompatible with letter-case transforms): we need rich text with styling (i.e. <sup></sup> elements in HTML) for normal processing. Note that these are not really "styles", because they are semantically significant and offer a distinction in some cases, but not here within dates, so we can still recognize the date "1er octobre 2022" even if it should be "1<sup>er</sup> octobre 2022" with the normally required superscripts.
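The status of these characters can be checked with Python's standard `unicodedata` module. This is only a small inspection sketch (the helper name `describe` is hypothetical): it shows the character names, that compatibility normalization (NFKC) folds them back to plain letters, and that upper-casing leaves them unchanged.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    # Return (Unicode name, NFKC-normalized form, upper-cased form).
    return (unicodedata.name(ch), unicodedata.normalize("NFKC", ch), ch.upper())


if __name__ == "__main__":
    for ch in ("\u00aa", "\u00ba", "\u02b3"):
        name, folded, upper = describe(ch)
        print(f"U+{ord(ch):04X} {name}: NFKC {folded!r}, upper {upper!r}")
```

Any processing step that applies NFKC would therefore silently turn "nº" into "no", and a case transform on the surrounding text would leave the modifier letter as the only character not converted.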

Note as well that English also sometimes uses superscripts for abbreviations when they are semantically significant. But Unicode/ISO/IEC 10646 contains no formatting control that could be used to alter the semantics and rendering of a normal letter to make it superscript. Some font formats like OpenType, and some text renderers, can contextually transform some letters to superscript if they occur after a decimal number, but this does not work with Roman numerals, as in "XX<sup>e</sup> siècle" in French, so you'd still see "XXe siècle".

If these 4 messages are only intended to test the date formatter in the English locale, they should not even be marked for translation (so the PHP source code is incorrect and should not have used pht("st"), for example, to mark them as translatable).

See:

Verdy p (talk)04:38, 25 August 2022