On different plural systems
Last edit: 16:39, 10 March 2012
I did some research on the various plural systems used by projects in translatewiki.net. It compares plural rules between three main systems: MediaWiki, Gettext and CLDR (used by Ruby projects). The results are on the page Plural/Comparison_of_plural_rules_in_various_databases. The most troubling part is the set of detailed reports after the overview. In the detailed part, the plural rules are compared between all applicable systems, and any differences are printed out. Can we reach out to speakers of those languages to verify which ruleset (if any) is correct?
If we can sort those issues out, we can go further and try to get the CLDR data as complete as possible, and have all other projects use the CLDR definitions in one way or another. We could, for example, provide an extended list of plural rules (based on CLDR data, with missing rules added) for all the supported projects.
In case anyone wonders about Gettext: there is no agreed standard for those rules, and different projects may use different rules for the same language. The list of Gettext rules here is the one used by translatewiki.net and is built from multiple sources. The raw data for Gettext is at  and preprocessed CLDR data is at . One more peculiarity of the CLDR data is that it also tries to take decimal numbers into account.
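To make the comparison concrete, here is a minimal sketch of how two rulesets for one language can be checked against each other over a range of integers. It is not the script behind the report page. The function names are hypothetical, and the two rules are only illustrative examples of Gettext-style expressions that different projects use for French (`n != 1` and `n > 1`). Decimal numbers are ignored.

```python
from collections import defaultdict
from typing import Callable, Hashable

Rule = Callable[[int], Hashable]


def french_a(n: int) -> int:
    # Gettext-style "plural=(n != 1)": only 1 takes the first form.
    return int(n != 1)


def french_b(n: int) -> int:
    # Gettext-style "plural=(n > 1)": both 0 and 1 take the first form.
    return int(n > 1)


def disagreements(a: Rule, b: Rule, limit: int = 200) -> list:
    """Integers below `limit` where two rules with the same form numbering differ."""
    return [n for n in range(limit) if a(n) != b(n)]


def partition(rule: Rule, limit: int = 200) -> set:
    """Group the integers below `limit` by the form the rule assigns to them."""
    groups = defaultdict(set)
    for n in range(limit):
        groups[rule(n)].add(n)
    return {frozenset(group) for group in groups.values()}


def same_partition(a: Rule, b: Rule, limit: int = 200) -> bool:
    """True if both rules split the integers identically, whatever the labels are."""
    return partition(a, limit) == partition(b, limit)


print(disagreements(french_a, french_b))
print(same_partition(french_a, lambda n: "one" if n == 1 else "other"))
```

The partition check is there because the systems label forms differently (numeric indexes in Gettext, names such as "one" and "other" in CLDR), so a direct comparison of return values only works when both rules number their forms the same way.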
I haven't attempted to do anything with Gettext yet, and won't be in a position to tackle this for a while. From discussions with the Bedwyr Language Institute at Bangor University about the plural rules (they requested the changes at CLDR), it appears that most commercial websites and computer programs have in the past simply not included any plural rules for Welsh. Since the mutation rules are complex, it appears that some linguists have opted not to mutate at all and just use the singular form of a word in its unmutated state, analogous to English-language websites which use 'him' to refer to 'him/her'. Given that, it might prove impossible to get Gettext to follow the CLDR and translatewiki.net pattern. How big a problem is it likely to be in future if Gettext continues to be out of step with CLDR for Welsh?
OK. If that is so, am I right in thinking that for the projects here which use Gettext, we can use the 6 plural rules for Welsh, the same as those currently used in MediaWiki and CLDR?
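For illustration, a six-form Welsh rule could be written in Gettext's Plural-Forms syntax roughly as below. This is a sketch under assumptions: the boundaries follow my understanding of the CLDR categories for Welsh (zero, one, two, few for 3, many for 6, other for everything else), the order of the forms is my own choice, and the names `WELSH_PLURAL_FORMS` and `welsh_plural_index` are hypothetical. It should be checked against the actual MediaWiki and CLDR data before use.

```python
# Assumed category boundaries: 0, 1, 2, 3, 6, and everything else.
WELSH_PLURAL_FORMS = (
    "nplurals=6; "
    "plural=(n==0) ? 0 : (n==1) ? 1 : (n==2) ? 2 : (n==3) ? 3 : (n==6) ? 4 : 5;"
)


def welsh_plural_index(n: int) -> int:
    """Python equivalent of the plural expression in WELSH_PLURAL_FORMS."""
    special = {0: 0, 1: 1, 2: 2, 3: 3, 6: 4}
    # Any integer not listed above falls through to the last ("other") form.
    return special.get(n, 5)


for n in (0, 1, 2, 3, 4, 6, 8, 11):
    print(n, welsh_plural_index(n))
```

With a header like this, each translated message would need six forms in the PO file, in the same order as the indexes above.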