On different plural systems

I did some research on the various plural systems used by projects in translatewiki.net. It compares plural rules between three main systems: MediaWiki, Gettext and CLDR (used by Ruby projects). The results are on the page Plural/Comparison_of_plural_rules_in_various_databases. The most troubling part is the detailed reports after the overview. In the detailed part the plural rules are compared between all applicable systems, and any differences are printed out. Can we reach out to speakers of those languages to verify which ruleset (if any) is correct?
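To illustrate what such a comparison involves, here is a minimal sketch in Python. It is not the script used to produce the comparison page; the function names and the three sample rules are made up for illustration. It groups a range of integers by the plural form each rule assigns and compares the groupings rather than the labels, since different systems may name or order their forms differently.

```python
from collections import defaultdict


def partition(rule, numbers):
    """Group numbers by the plural form the rule assigns to them."""
    groups = defaultdict(list)
    for n in numbers:
        groups[rule(n)].append(n)
    # Only the groupings matter, not the labels or their order.
    return {tuple(group) for group in groups.values()}


def same_grouping(rule_a, rule_b, numbers=range(0, 201)):
    """True if both rules split the tested integers into the same groups."""
    numbers = list(numbers)
    return partition(rule_a, numbers) == partition(rule_b, numbers)


# Illustrative rules only (assumptions, not taken from any database).
def rule_not_one(n):
    return 0 if n == 1 else 1            # index-style: singular only for 1


def rule_named(n):
    return "one" if n == 1 else "other"  # name-style, same grouping as above


def rule_greater_than_one(n):
    return 0 if n <= 1 else 1            # singular for both 0 and 1


print(same_grouping(rule_not_one, rule_named))
print(same_grouping(rule_not_one, rule_greater_than_one))
```

The first two rules use different labels but split the numbers identically, so they count as the same; the third treats 0 differently and would show up as a difference. A check like this only covers the integers it tests, so it says nothing about decimal numbers.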

If we can sort those issues out, we can go further and try to get the CLDR data as complete as possible, and have all other projects use CLDR definitions in one way or another. We could for example provide an extended list of plural rules (based on CLDR data with missing rules added) for all the supported projects.

Nike 16:04, 10 September 2010

I am looking for someone who wants to help get this sorted out.

Siebrand 09:25, 11 September 2010

I sent out two notices that I hope will get some attention.

Siebrand 09:50, 11 September 2010

In case anyone wonders about Gettext: there isn't any agreed standard for its rules, and different projects may use different rules for the same language. The Gettext rules listed in the comparison are those used by translatewiki.net, and the list is built from multiple sources. The raw data for Gettext is at [1] and the preprocessed CLDR data is at [2]. There is one more peculiarity with the CLDR data: it also tries to take decimal numbers into account.

Nike 10:19, 11 September 2010

I am glad to say that CLDR have now changed their plural rules for Welsh to agree with the rules on translatewiki.net. See the ticket and the rules.

I haven't attempted to do anything with Gettext yet, and won't be in a position to tackle this for a while. Having had some discussions with the Bedwyr Language Institute at Bangor University about the plural rules (they requested the changes at CLDR), it appears that most commercial websites and computer programs have in the past just not put any plural rules for Welsh onto their systems. Since the mutation rules are complex, it appears that some linguists have opted to not mutate at all, just using the singular form of a word in an unmutated state, analogous to websites in English which use 'him' to refer to 'him/her'. Given that that is the case, it might prove impossible to get Gettext to follow the CLDR and translatewiki.net pattern. How big a problem is it likely to be in future if Gettext continues to be out of step with CLDR for Welsh?

Lloffiwr 18:05, 4 December 2010

As far as I know we can put any rule we want into Gettext headers.

Nike 19:12, 4 December 2010

OK. If that is so, am I right in thinking that for the projects here which use Gettext, we can use the 6 plural rules for Welsh, same as currently used on MediaWiki and CLDR?

Lloffiwr 19:18, 4 December 2010

Yes, that's how I think it works. The current set of Gettext rules in the comparison is just a collection of rules from different sources.
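As a rough sketch of what a six-form Welsh rule could look like in a Gettext header, with a Python mirror for trying it out: the thread does not spell out the six rules, so this assumes the categories are 0, 1, 2, 3, 6 and everything else, as in the CLDR data as I understand it. The header string and the function name are hypothetical and should be checked against the actual MediaWiki and CLDR definitions before use.

```python
# Hypothetical Plural-Forms header value for Welsh (an assumption, not copied
# from any project): separate forms for 0, 1, 2, 3 and 6, plus a catch-all.
WELSH_PLURAL_FORMS = (
    "nplurals=6; "
    "plural=(n==0) ? 0 : (n==1) ? 1 : (n==2) ? 2 : (n==3) ? 3 : (n==6) ? 4 : 5;"
)


def welsh_plural_index(n: int) -> int:
    """Python mirror of the C expression in WELSH_PLURAL_FORMS."""
    special = {0: 0, 1: 1, 2: 2, 3: 3, 6: 4}
    return special.get(n, 5)  # every other integer uses the last form


for n in (0, 1, 2, 3, 4, 6, 10):
    print(n, welsh_plural_index(n))
```

The index order only has to match the order of the msgstr[0] to msgstr[5] entries in the translation file, so a project could order the forms differently as long as the header and the translations agree.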

Nike 19:29, 4 December 2010