PLURAL documentation

Fragment of a discussion from User talk:FShbib

I don't see how this differs from what was already specified precisely in Portal:Ar#Plural_rules, which lists the exact rules for matching number forms, as specified in CLDR and used in MediaWiki, and gives exemplar values for each plural class. Everything was there, including the fact that the "zero", "one" and "two" forms are matched by exact value, so you don't need to render the numeric value in them: you can translate them with a plain word for the numeral, or by inflecting the nouns, verbs or adjectives to match.
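As an illustration of those matching rules, here is a minimal Python sketch of the CLDR cardinal plural categories for Arabic. The function name `arabic_plural_category` is hypothetical, and this is not MediaWiki's own code, only a restatement of the published CLDR rules.

```python
from decimal import Decimal


def arabic_plural_category(value):
    """Return the CLDR cardinal plural category for Arabic (illustrative sketch)."""
    # CLDR rules operate on the absolute value of the number.
    n = abs(Decimal(str(value)))
    # Any value with a non-zero fractional part falls into "other".
    if n != n.to_integral_value():
        return "other"
    i = int(n)
    # "zero", "one" and "two" are matched by exact value.
    if i == 0:
        return "zero"
    if i == 1:
        return "one"
    if i == 2:
        return "two"
    # "few" and "many" depend only on the last two digits.
    if 3 <= i % 100 <= 10:
        return "few"
    if 11 <= i % 100 <= 99:
        return "many"
    # Everything else (100, 101, 102, 200, ...) is "other".
    return "other"
```

For example, 103 is "few", 111 is "many", and both 100 and 1.5 are "other".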

What matters is to have at least 4 forms if you need to render the numeric value in the translation. The last form specified (there can be up to 6) is always mapped to the "other" class, which also matches when the effective value is not a plain integer. Not all translations need 4 to 6 forms: you can use a single one (the default) when numerals are rendered only as digits, for example in short forms such as a number followed by an abbreviated unit like "11 km", or in narrow cells of data tables.
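A rough sketch of how a form could be picked when fewer than 6 are supplied. It assumes that the last supplied form stands in for every missing later class; `CATEGORIES` and `pick_form` are hypothetical names, not MediaWiki functions.

```python
# Order in which positional forms are assumed to be given for Arabic.
CATEGORIES = ["zero", "one", "two", "few", "many", "other"]


def pick_form(category, forms):
    """Pick the form for a plural category, falling back to the last form given."""
    if not forms:
        raise ValueError("at least one form is required")
    index = CATEGORIES.index(category)
    # Assumption: classes beyond the supplied forms reuse the last one.
    return forms[min(index, len(forms) - 1)]
```

With a single form such as `["$1 km"]` every class gets the same text; with 4 forms, "few", "many" and "other" all use the 4th one, which is why that form needs to carry the numeric value.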

The Translate UI must then check that the last form specified contains the numeric value. But there may be cases where the numeric value needs to be rendered with "Central/Standard" Arabic digits (Eastern Perso-Arabic digits would be used for Farsi/Persian, Pashto, Urdu, Gilaki, Baluchi, etc., but those are different languages), whereas the 1st parameter given immediately after "PLURAL:" only accepts (for now?) Western Arabic decimal digits (from ASCII/Basic Latin). So when translating a message rendered by MediaWiki as wikitext, the current version of the MediaWiki "PLURAL:" parser function may have to take one parameter in first position (containing the numeric value in ASCII) that differs from the parameter containing the numeral formatted with Central Arabic decimal digits, which is the one needed in the last form.

But if the source message (in English) references only the parameter for the ASCII value (with no grouping delimiter, using only ASCII decimal digits and "+" or "-", and only "." as the decimal separator), we have no other choice than using that parameter as the formatted value in the last form given to the "PLURAL:" function (for the "other" plural class). This consideration is not specific to Arabic: it also concerns English, where we have the same distinction between "raw" and "formatted" numeric values. In that case the English source message specifies another "$variable" in the last form, and that "$variable" must be present in the last form of the translation (and should also be present in the 4th and 5th forms if the translation specifies them). Otherwise the last form must use the same parameter as the 1st parameter given just after "PLURAL:". Variables may however be omitted in the 1st, 2nd and 3rd forms, provided these are specified before the last default form (and they even should be omitted in Arabic, where these forms map to "zero", "one" and "two").
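The check described above could look roughly like the following Python sketch. It is only an assumption of how a validator might work, not the Translate extension's actual code. The names `check_plural_forms`, `PLURAL_RE` and `VARIABLE_RE` are hypothetical, nested templates are not handled, and it only applies to messages where the number is rendered inside the forms.

```python
import re

# Matches a simple, non-nested {{PLURAL:value|form|form|...}} call.
PLURAL_RE = re.compile(r"\{\{PLURAL:([^|{}]*)\|([^{}]*)\}\}")
# Matches a positional message parameter such as $1 or $2.
VARIABLE_RE = re.compile(r"\$\d+")


def check_plural_forms(message):
    """Return warnings for forms that should contain a $variable but do not."""
    warnings = []
    for match in PLURAL_RE.finditer(message):
        forms = match.group(2).split("|")
        for position, form in enumerate(forms, start=1):
            is_last = position == len(forms)
            # Forms 1 to 3 may omit the variable; the 4th and later forms
            # and the last form are expected to contain one.
            if (position >= 4 or is_last) and not VARIABLE_RE.search(form):
                warnings.append(
                    f"form {position} of {match.group(0)!r} has no $variable"
                )
    return warnings
```

A call such as `check_plural_forms("{{PLURAL:$1|no pages|one page|two pages|pages}}")` returns one warning, because the 4th (and last) form lacks a variable.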

Messages not rendered with the MediaWiki parser (e.g. using GetText) normally don't use the syntax with the "PLURAL:" prefix (with a colon), but a "template-like" syntax starting with "PLURAL" immediately followed by the vertical bar before the 1st parameter containing the value. Further parameters may then be specified positionally (when used by the GetText i18n library) or by name, like "|zero=..." (when used by tools based on CLDR: in those messages the "other" parameter should be present and in last position, as it is not necessarily named; all other forms are optional and can be in any order). This case is a bit more complex to validate in the Translate UI.


IMHO, the "PLURAL:" parser function of MediaWiki should accept non-ASCII decimal digits in the 1st parameter, and possibly alternate typographic forms of the plus and minus signs, but no form other than "." for the decimal separator (it should warn if there's a comma ","), and it should silently discard whitespace used as grouping characters. The alternative would be another syntax like "LOCALPLURAL:", taking a second parameter with a language name before the actual forms, so that we can use the *same* $variable, containing the localized formatted value, both in the 1st parameter (containing the actual number) and in the last parameter (for the "other" form by default). But that syntax would still not be recognized by the current Translate UI and tool.
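To make the first proposal concrete, here is a minimal Python sketch of such a normalization step. It is purely hypothetical: `normalize_plural_number` is not an existing MediaWiki function, the sets of accepted sign characters are an assumption, and the sketch raises an error for a comma where the proposal only asks for a warning.

```python
import unicodedata

# Assumed set of accepted sign variants (not an exhaustive list).
PLUS_SIGNS = {"+", "\uFF0B"}             # plus sign, fullwidth plus sign
MINUS_SIGNS = {"-", "\u2212", "\uFF0D"}  # hyphen-minus, minus sign, fullwidth hyphen-minus


def normalize_plural_number(text):
    """Convert a localized number string to ASCII digits, sign and '.'."""
    out = []
    for ch in text:
        if ch == ",":
            # Ambiguous: could be a grouping or a decimal separator.
            raise ValueError("comma is not accepted in the PLURAL value")
        if ch.isspace():
            # Silently drop whitespace used as a grouping character.
            continue
        if ch in PLUS_SIGNS:
            out.append("+")
        elif ch in MINUS_SIGNS:
            out.append("-")
        elif ch == ".":
            out.append(".")
        else:
            # Map any Unicode decimal digit to its ASCII equivalent.
            digit = unicodedata.decimal(ch, None)
            if digit is None:
                raise ValueError(f"unexpected character {ch!r}")
            out.append(str(digit))
    return "".join(out)
```

With this sketch, both "١٢٣" (Arabic-Indic digits) and "۱۲۳" (Eastern Perso-Arabic digits) normalize to "123", while the Arabic decimal separator (U+066B) is rejected like any other non-"." separator.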

Verdy p (talk) 15:17, 30 October 2022