About [[Phabricator:arcanist-core-3119f17b3b50c21d/en]]

For some reason, the translation is shown as 'outdated', even though "nd" is the only possible translation.

Cigaryno (talk) 17:44, 24 August 2022

I don't agree. If these messages are intended to generate ordinal suffixes, there are other solutions: ordinals should not be hardcoded this way, although changing that requires patching the source code.

  • If we translate to French, for example, we could use the suffix "e" for all values except the 1st ordinal, which should be suffixed by "er"; but the current hardcoded rules for English don't work this way (e.g. "1st", "11th", "21st" in English, versus "1er", "11e" and "21e" in French).
  • In German this would be simpler, as a general suffix "." can be used for all ordinals.
  • Some languages require varying the suffix (e.g. by grammatical gender, number, or case, according to the nominal group that the ordinal qualifies).
  • Other languages would use a prefix instead, such as "nº 1".
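To make the contrast between these rules concrete, here is a minimal Python sketch. The function names are hypothetical and are not taken from the Arcanist source; the French and German functions only cover the simple forms quoted above, ignoring gender and case.

```python
def ordinal_en(n: int) -> str:
    """English: the suffix follows the last digit, except for 11, 12 and 13."""
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def ordinal_fr(n: int) -> str:
    """French (masculine, simplified): only 1 takes "er", everything else "e"."""
    return f"{n}er" if n == 1 else f"{n}e"


def ordinal_de(n: int) -> str:
    """German: a single "." suffix for every value."""
    return f"{n}."


def ordinal_prefixed(n: int) -> str:
    """Prefix notation that does not depend on the value at all."""
    return f"nº {n}"


if __name__ == "__main__":
    for n in (1, 11, 21):
        print(ordinal_en(n), ordinal_fr(n), ordinal_de(n), ordinal_prefixed(n))
```

The English rule groups 1 and 21 together ("st"), while the French rule treats 1 alone as special, so the four English suffix buckets cannot be mapped one-to-one onto French suffixes.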

Using these messages as is makes no sense at all except in English. CLDR has resources about ordinal formatting (but only for the nominative or vocative grammatical case). In fact, it would be simpler to adapt the messages if ordinals were not made translatable through this patchwork model, but through complete sentences containing the ordinal, with just a placeholder for the numeric value. We could then use the "PLURAL:" parser function if needed, use alternatives like "nº 1" that do not depend on the numeric value, or translate without any ordinal.

Translating suffixes only, without a placeholder for the numeric value, makes NO sense. These 4 messages (currently hardcoded and tuned to rules that are valid only in English) must be replaced by "$1st", "$1nd", "$1rd", "$1th" (or by a single message such as "#$1" or "nº $1", which needs no ordinal form) so that we can create correct translations.

  • Only German-like translations (also used in some Central European languages) can use a "." suffix as the ordinal indicator in all 4 messages (which can then all be turned into "$1." when using the "$1" placeholder).
  • But that solution is NOT permitted for Romance languages, which allow a final dot as an abbreviation mark only when the word is abbreviated to its first *letters*. When there are no letters, just digits, we MUST use the final letters of the ordinal suffix, and these vary by gender, possibly even by grammatical case in some languages. The alternative is not to use suffixed ordinals at all, but expressions with a common prefix like "numero $1", abbreviated as "nº $1".
  • Various languages do not support any suffixed notation for ordinals, but only prefixed notations.
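As an illustration of the placeholder-based messages proposed here, the following Python sketch uses a hypothetical catalog. The `CATALOG` dictionary, the locale key "prefix" and both function names are assumptions made for this example; none of them is Arcanist's or translatewiki.net's real API.

```python
# Hypothetical catalog: keys are the proposed source messages containing the
# "$1" placeholder, values are what a translator could enter for each locale.
SOURCE_KEYS = ("$1st", "$1nd", "$1rd", "$1th")

CATALOG = {
    "en": {key: key for key in SOURCE_KEYS},
    # German-like: every message becomes the same "$1." form.
    "de": {key: "$1." for key in SOURCE_KEYS},
    # A locale that only has a prefixed notation (made-up locale name).
    "prefix": {key: "nº $1" for key in SOURCE_KEYS},
}


def english_key(n: int) -> str:
    """Pick the source message the way the English-tuned rules would."""
    if 11 <= n % 100 <= 13:
        return "$1th"
    return {1: "$1st", 2: "$1nd", 3: "$1rd"}.get(n % 10, "$1th")


def render_ordinal(locale: str, n: int) -> str:
    """Look up the translated template and substitute the number for "$1"."""
    template = CATALOG[locale][english_key(n)]
    return template.replace("$1", str(n))


if __name__ == "__main__":
    for locale in CATALOG:
        print(locale, [render_ordinal(locale, n) for n in (1, 2, 3, 11)])
```

Because each translated message sees the whole number, a translator can move the indicator before the digits or drop the English distinctions entirely, which is impossible when only the bare suffix is translatable.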

After searching the code, I found that these ordinals are used for testing the translated formatting of dates, only for days of the month. This is not clear in the current doc. And I don't know if this code also tests for other locales than English.

But formatted dates do not necessarily use ordinal suffixes. In French, for example, only "1" is suffixed, as "1er"; all other days of the month take no suffix. Note also that the ordinal suffix "er" is normally superscripted, but I do not know whether the formatted dates can include HTML formatting, so the test could just as well accept plain "1er" (e.g. when processing dates with a plain-text command-line interface).
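The French day-of-month convention described above can be sketched as follows. This is only an illustration in Python: `format_date_fr` and its `html` flag are hypothetical and do not reflect how the Arcanist date formatter is actually written.

```python
MONTHS_FR = [
    "janvier", "février", "mars", "avril", "mai", "juin",
    "juillet", "août", "septembre", "octobre", "novembre", "décembre",
]


def format_date_fr(day: int, month: int, year: int, html: bool = False) -> str:
    """Format a date the French way: only the 1st of the month gets a suffix.

    With html=True the suffix is wrapped in <sup>; otherwise plain "1er"
    is produced, as a plain-text command-line interface would need.
    """
    if day == 1:
        day_text = "1<sup>er</sup>" if html else "1er"
    else:
        day_text = str(day)
    return f"{day_text} {MONTHS_FR[month - 1]} {year}"


if __name__ == "__main__":
    print(format_date_fr(1, 10, 2022))
    print(format_date_fr(1, 10, 2022, html=True))
    print(format_date_fr(21, 10, 2022))
```

Note that day 21 gets no suffix at all here, whereas the English rules would put it in the same "st" bucket as day 1.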

Other solutions could use specific superscript characters. However, Unicode encodes these only as "modifier letters" or compatibility characters, for phonetic notations or for common abbreviations used in Spanish/Italian/Portuguese, and only for a restricted set of letters. This subset includes 'ª' and 'º' as symbols for compatibility with ISO 8859-1, and a superscript 'r' as 'ʳ' (U+02B3) as a modifier letter for IPA; a superscript 'e' exists only as the phonetic modifier letter 'ᵉ' (U+1D49). These characters are encoded for phonetic notation, not as general-purpose superscripts for the final letters of abbreviations in French and other languages. So these "compatibility superscripts" or "modifier letters" are not usable (they are also incompatible with letter-case transforms): we need rich text with styling (i.e. <sup></sup> elements in HTML) for normal processing. Note that these are not really "styles", because they are semantically significant and offer a distinction in some cases; but not here within dates, so we can still recognize the date "1er octobre 2022" even if it should be "1<sup>er</sup> octobre 2022" with the normally required superscripts.
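The characters mentioned above can be inspected with Python's standard `unicodedata` module. The helper names below are made up for this sketch; only `unicodedata.name` and `str.upper` are real library calls.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def changes_when_uppercased(text: str) -> bool:
    """True if uppercasing alters the text, as it does for ordinary letters."""
    return text.upper() != text


if __name__ == "__main__":
    # 'ª', 'º' and the modifier letter 'ʳ' discussed above.
    for ch in ("\u00aa", "\u00ba", "\u02b3"):
        print(describe(ch))

    # Ordinary letters follow case transforms; the modifier letter does not.
    print(changes_when_uppercased("1er"))
    print(changes_when_uppercased("1\u02b3"))
```

The modifier letter has no uppercase mapping, so a date written with it would not follow the rest of the text through a case transform, which is one reason markup such as <sup></sup> is the safer choice.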

Note as well that English also sometimes uses superscripts for abbreviations when they are semantically significant. But Unicode/ISO/IEC 10646 contains no formatting control that could alter the semantics and rendering of a normal letter to make it superscript. Some font formats like OpenType, and some text renderers, have a capability that contextually transforms some letters to superscript if they occur after a decimal number, but this does not work with Roman numerals, like "XX<sup>e</sup> siècle" in French, so you'd still see "XXe siècle".

If these 4 messages are just intended to test the date formatter in the English locale only, they should not even be marked for translation (so the PHP source code is incorrect and should not have used pht("st"), for example, to mark them as translatable).

See:

Verdy p (talk) 04:38, 25 August 2022