Difference between CLDR plural rules and MediaWiki plural rules
Last edit: 18:43, 29 January 2012
In other words, we want to know whether there is a language in which some number-included plural form is not a subset of a number-less plural form. If there is, we cannot obtain the number-less forms just by combining existing (number-included) plural forms in a suitable way (in practice, by giving them the same translation).
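To make the subset question concrete, here is a minimal sketch of how one could test it mechanically. Everything in it is an assumption for illustration: the rule sets are made up and do not describe any real language, `map_forms` is a hypothetical helper (not part of MediaWiki or CLDR), and the check only looks at a finite sample of numbers.

```python
from typing import Callable, Dict, Iterable, Optional

Rule = Callable[[int], bool]


def covered(rule: Rule, sample: Iterable[int]) -> set:
    """Return the numbers in the sample that the rule selects."""
    return {n for n in sample if rule(n)}


def map_forms(
    numbered: Dict[str, Rule],
    numberless: Dict[str, Rule],
    sample: Iterable[int] = range(0, 200),
) -> Dict[str, Optional[str]]:
    """Map each number-included form to a number-less form that contains it.

    A value of None means the form is not a subset of any number-less form,
    so the number-less form cannot be built by reusing translations.
    """
    sample = list(sample)
    mapping: Dict[str, Optional[str]] = {}
    for name, rule in numbered.items():
        nums = covered(rule, sample)
        mapping[name] = next(
            (
                target
                for target, t_rule in numberless.items()
                if nums <= covered(t_rule, sample)
            ),
            None,
        )
    return mapping


# Hypothetical rule sets, invented for this example only.
numbered_rules = {
    "one": lambda n: n == 1,
    "few": lambda n: 2 <= n <= 4,
    "other": lambda n: n == 0 or n >= 5,
}
numberless_rules = {
    "singular": lambda n: n == 1,
    "plural": lambda n: n != 1,
}
# A hypothetical form that mixes 1 with 11, 21, ... and so fits neither.
mixed_rules = {"ends_in_one": lambda n: n % 10 == 1}

print(map_forms(numbered_rules, numberless_rules))
print(map_forms(mixed_rules, numberless_rules))
```

If every value in the first mapping is a form name, the number-less forms can be reached by giving the grouped number-included forms the same translation. A `None`, as for the made-up `ends_in_one` form, would be the kind of case worth reporting.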
I'm not aware of any language that lacks the wanted subset relation. Since there may be some, I think CLDR should be made aware of this, with the suggestion that they add a note to their explanatory page. Unless someone reports having done so already, I shall do that.
Well, I think it would be a superset, but anyway. When going through the above examples again, I found that in Colognian, a sentence followed or preceded by a list (whose items you can count) is to be treated and built exactly as if the number were included, even if the sentence itself is numberless.
The page on plural rule syntax at CLDR says: "There are two extra values that can be used with count attributes: 0 and 1. These are used for the explicit values, and may or may not be the same as the forms for "zero" and "one"." It seems that CLDR has got around the problem of defining additional categories for use in particular circumstances by introducing these two additional values. Would it be possible to write code for MediaWiki plural which does the same, enabling the use of '0' and '1' only when needed?
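As a rough illustration of what "only when needed" could mean, here is a minimal sketch in which an explicit form for 0 or 1 wins if the translator supplied one, and the ordinary category is used otherwise. The function names, the dictionary layout and the English-like rule are all assumptions made for this example; none of it is existing MediaWiki or CLDR code.

```python
from typing import Callable, Dict


def english_like_category(n: int) -> str:
    """Hypothetical two-category rule: 'one' for exactly 1, else 'other'."""
    return "one" if n == 1 else "other"


def select_form(
    n: int,
    forms: Dict[str, str],
    category_of: Callable[[int], str] = english_like_category,
) -> str:
    """Pick a translation, preferring an explicit '0' or '1' entry if present."""
    if n in (0, 1) and str(n) in forms:
        return forms[str(n)]
    return forms[category_of(n)]


# The translator adds an explicit form for 0 only; 1 uses the normal category.
forms = {
    "0": "No messages",
    "one": "{n} message",
    "other": "{n} messages",
}

for count in (0, 1, 5):
    print(select_form(count, forms).format(n=count))
```

The point of the design is that a translation without the "0" or "1" keys behaves exactly as before, so the extra values cost nothing for languages that do not need them.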
I wrote the previous comment before I had understood how MediaWiki uses more than one defined plural ruleset to handle numberless sentences (and potentially sentences with zero?). MediaWiki's solution appears to be elegant, with simpler syntax for translators for numberless sentences.
However, you also say in another thread that it is 'hard to unify MediaWiki rules with other systems'. Would it be easier to unify with other systems if, instead of making the second ruleset shorter than the normal ruleset, we made it longer, typically by adding an additional rule for 1 (or for 1 and 2, for Scottish Gaelic for example) and an additional rule for 0 where needed (Swahili would benefit from an additional rule for 0, for example)? Making the second ruleset longer is not as elegant as the current system. But does it help with compatibility with other systems?
Other systems only support one ruleset; that's the problem. And frankly, I don't see any reason for using multiple rulesets for one language except for translators' convenience. As far as I know, the second ruleset is always a shortcut for something that can be done with the first ruleset in a slightly longer way. Making the shortcut harder to use makes it even less useful, and in that case we should rather just drop it.
A second ruleset is used for the languages using rules J and K. Combining both rulesets would entail increasing the number of forms for these languages by 1. As you say, this shouldn't be a great hardship, given that they only have 3 forms at present.
It sounds to me as if you would like to see MediaWiki use only one ruleset for all languages, and that this ruleset should also be usable for sentences without numbers. If so, we could add a paragraph to Plural#Alternative_ruleset saying that alternative rulesets are not recommended.
I will try to find time to add to the discussion at CLDR, now that we have examples of affected languages.