Plural

From translatewiki.net
See also Comparison of plural rules in various databases

This page is a draft and may contain errors. Text marked (???????) needs confirmation from developers. Improvements welcome.

Introduction

Plural formatting provides a way for software to render sentences in a grammatically correct form, for all possible values of the numerical variables on which the sentence depends.

For example:

  • The following page is in the current category.
  • The following 10 pages are in the current category.

The software needs to contain a definition of the number of different forms which are needed to render plural in a particular language. English uses 2 forms thus:

  • 1: form 1
  • 0 and all other numbers: form 2

Other languages may have one form only, or they may use more than 2 forms; up to 6 forms have been requested by the languages supported at translatewiki.net so far. This means that plural syntax may be needed in messages which do not need it in English (see these explanations for software developers: on Mediawiki, on Gettext).
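
As a minimal sketch of the idea above (not the code of any project listed on this page), a plural rule can be thought of as a function from a number to a form index. The function names, the `render` helper and the use of `$1` as a placeholder are assumptions made for this illustration.

```python
# Sketch only: each rule maps a number to a zero-based form index.
def english_rule(n: int) -> int:
    """English: 1 takes form 1 (index 0); 0 and all other numbers take form 2 (index 1)."""
    return 0 if n == 1 else 1


def single_form_rule(n: int) -> int:
    """A language with one form only: every number takes the same form."""
    return 0


def render(forms: list[str], rule, n: int) -> str:
    """Pick the form for n and substitute the number for the $1 placeholder."""
    return forms[rule(n)].replace("$1", str(n))


english_forms = [
    "The following page is in the current category.",
    "The following $1 pages are in the current category.",
]

print(render(english_forms, english_rule, 1))
print(render(english_forms, english_rule, 10))
```

A language with more forms would supply a longer list of forms and a rule that returns more than two distinct indexes.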

Software projects localised at translatewiki.net use 1 of the following 3 databases of the definitions of plural forms to generate plural:

  • Mediawiki
  • CLDR
  • Gettext

Some projects do not yet support plural. Notes on plural systems used by each project should be on each project page.

Table of plural formats used by projects

Plural ruleset: projects using it

  • Mediawiki: FUDforum, MediaWiki, Toolserver, Wikia
  • CLDR: Encyclopedia of Life (Ruby implementation), FreeCol (custom implementation), OpenStreetMap (Ruby implementation), Shapado (Ruby implementation)
  • Gettext: iHRIS (plural not used), Mwlib.rl (PediaPress) (plural not used), StatusNet
  • Custom: Pywikipedia (uses same rules as plural-gettext.txt)
  • Plural not supported: Kiwix, MantisBT, Mifos, Okawix, Open Images, WikipediaMobile, WikiBlame, WikiReader, WiktionaryMobile

Plural rules for Mediawiki are localised here at translatewiki.net. Convergence between the different plural rules set on different software has been discussed as a long-term goal at translatewiki.net. A comparison of plural rules by language as set on Mediawiki, CLDR and gettext is given on Plural/Comparison of plural rules in various databases.

Messages needing plural

Some projects have plural enabled for all messages, even when the English message does not need it. Others, such as Mediawiki, must enable plural in specific messages which need it.

Software developers should be aware that plural should be used even where the result can never be 1 and the English message therefore does not need plural, because other languages may still need different forms for the remaining numbers. If you think a message needs plural but does not currently use it, and the project supports plural, please report this. You can report either on the project talk page at translatewiki.net or on the project’s own bug reporting site, as directed on the project page.


Plural in Mediawiki

Mediawiki plural supports 0 and whole numbers. It does not currently support separate forms for fractions or decimal numbers. There is an open bug report for this at Bugzilla 28128.

Localising plural rules for a language

When a new language is set up at Mediawiki its default Plural has 2 forms (the same as English):

  • 1 - form 1
  • 0, all other numbers - form 2

The forms for a language can be changed at any time by putting a request on support, describing how plural works in your language. Give an example for each number, or group of numbers, which has a different plural. When the proposal is accepted a staff member at translatewiki.net will code the plural forms into the Mediawiki code at SVN.

The new code will not appear on Wikimedia Foundation sites until they are updated with the changes from SVN, which may mean waiting for weeks or months, especially for the smaller sites. Sites outside the Wikimedia Foundation can choose to update their software to the latest SVN version whenever they wish. The localisation extension updates messages only and does not cover other changes to SVN, such as plural rules.

Mediawiki can support any number of forms of plural (some languages have 6 forms), but the more forms there are, the more work a translator has to do to localise messages. There are 2 types of application which Mediawiki can handle but which occur only rarely at Mediawiki and on its extensions. These are:

  • 0. Many messages containing plural can only ever appear when the variable is not 0. Software developers also often write a separate message when 0 is a possible result for a variable. However, occasionally 0 does appear in messages with other numbers possible, as in this example:
    • Wlnote ("Below are the last $1 changes in the last $2 hours, as of $3, $4.")
  • Sentences where a number is not stated. Mediawiki software developers occasionally use the plural function in sentences where the number does not appear at all. For example:

In some languages additional forms may be needed to handle localisation of these messages. This is most likely to apply to languages where the number 1 would otherwise not be in a group of its own.

However, each additional plural form means more work for translators localising messages that use plural. For this reason, translators in some languages have decided not to include forms for one or both of these special cases.

Another strategy available on Mediawiki is to define an alternative ruleset for a language, as described below.

Documenting plural rules for a language

It is recommended that when plural rules are coded in Mediawiki for a language, a note is made on the language portal (or a portal sub-page) of the Mediawiki plural forms for that language, preferably in English as well as the portal language.


Plural syntax in Mediawiki

In Mediawiki the syntax for the example above is:

  • in English - The following {{PLURAL:$1|page is|$1 pages are}} in the current category.
  • in Upper Sorbian – {{PLURAL:$1|Slědowaca strona je|Slědowacej $1 stronje stej|Slědowace $1 strony su|Slědowacych $1 stronow je}} w tutej kategoriji:

$1 is the number of pages.

If fewer forms are written than the language requires, the software repeats the last form written until it reaches the number of forms required. For example, in the Welsh message

  • "$1 {{PLURAL:$1||diwrnod|ddiwrnod|diwrnod|diwrnod|diwrnod}}" you could instead write
  • "$1 {{PLURAL:$1||diwrnod|ddiwrnod|diwrnod}}" and the result would be the same as using the full syntax.
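
The padding behaviour described above can be sketched as follows. This is an illustration only, not the Mediawiki implementation; the function name `pad_forms` is an assumption.

```python
def pad_forms(forms: list[str], required: int) -> list[str]:
    """Repeat the last written form until `required` forms are available."""
    if not forms:
        raise ValueError("at least one form must be written")
    missing = max(0, required - len(forms))
    return forms + [forms[-1]] * missing


# The shortened Welsh example: the first form is deliberately empty.
short = ["", "diwrnod", "ddiwrnod", "diwrnod"]
full = ["", "diwrnod", "ddiwrnod", "diwrnod", "diwrnod", "diwrnod"]

print(pad_forms(short, 6) == full)  # True
```

By the same logic, a single written form is repeated for every form the language requires.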

Messages not needing plural

If a particular message uses plural in English but does not need plural in your language then you should still write the plural function. If you do not, then the localisation checks at translatewiki.net and MediaWiki will identify this as a message which should use plural but doesn’t – see examples on ‘Localization checks’.

However, you do not have to repeat the text for each form in your language. You can just write it once, {{PLURAL:$1|text}}. The software will produce the text for all values of the variable $1.

If your language does not need the PLURAL function at all, then explain this on the Support page. You may be able to get the check software to ignore missing PLURALs in your language.
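
The kind of check described in this section can be sketched roughly as below. This is a simplified assumption of how such a check might work, not the code used by translatewiki.net or MediaWiki; the function name is hypothetical and the sketch only looks for the literal text `{{PLURAL:`.

```python
import re

# Matches the opening of a Mediawiki-style plural function, e.g. "{{PLURAL:$1|".
PLURAL_OPEN = re.compile(r"\{\{PLURAL:")


def missing_plural(source: str, translation: str) -> bool:
    """Return True when the source uses PLURAL but the translation does not."""
    return bool(PLURAL_OPEN.search(source)) and not PLURAL_OPEN.search(translation)


source = "The following {{PLURAL:$1|page is|$1 pages are}} in the current category."

print(missing_plural(source, "{{PLURAL:$1|text}}"))  # False: one form is enough
print(missing_plural(source, "text"))                # True: PLURAL was dropped
```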

MediaWiki messages not supporting plural

There are some messages, for example log messages, which cannot use PLURAL on all variables.

Alternative ruleset

It is possible to define more than one ruleset for a language, as long as each ruleset has a different number of forms. The software counts the forms written in a plural function and uses the ruleset with that number of forms. Alternative rulesets are used in languages where the number 1 is grouped with other numbers. The alternative ruleset usually gives 1 a form of its own, which makes it possible to write plural in sentences where no number appears.

Russian is an example of a language with two rulesets.

  • Russian ruleset for sentences where the number appears
    1. n mod 10 is 1 and n mod 100 is not 11 (1, 21, 31, 41, 51, 61...)
    2. n mod 10 in 2..4 and n mod 100 not in 12..14 (2-4, 22-24, 32-34...)
    3. n mod 10 is 0 or n mod 10 in 5..9 or n mod 100 in 11..14 (0, 5-20, 25-30, 35-40...)
  • Russian ruleset for sentences where the number does not appear
    1. n is 1 (1)
    2. n is not 1 (0, 2, 3...)

Example of Russian message using the alternative ruleset with 2 forms: Раньше на сайте {{PLURAL:$1|уже был [$2 файл]|были [$2 файлы]}} с точно таким же содержанием, но {{PLURAL:$1|он был удалён|они были удалены}}.
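
The two Russian rulesets listed above, and the selection between them by counting the forms written, can be sketched as follows. This is an illustration under stated assumptions, not the Mediawiki code: the function names are hypothetical, and any number of forms other than 2 or 3 is simply rejected here.

```python
def russian_three_forms(n: int) -> int:
    """Ruleset for sentences where the number appears (zero-based form index)."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, 41...
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, 32-34...
    return 2      # 0, 5-20, 25-30...


def russian_two_forms(n: int) -> int:
    """Alternative ruleset for sentences where the number does not appear."""
    return 0 if n == 1 else 1


# The ruleset is chosen by the number of forms the translator wrote.
RULESETS = {3: russian_three_forms, 2: russian_two_forms}


def choose_form(forms: list[str], n: int) -> str:
    rule = RULESETS.get(len(forms))
    if rule is None:
        raise ValueError("this sketch only handles 2 or 3 forms")
    return forms[rule(n)]


# With the number shown, 21 takes the first form; without it, 21 is plural.
print(choose_form(["файл", "файла", "файлов"], 21))
print(choose_form(["он был удалён", "они были удалены"], 21))
```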

Plural in CLDR

  • Description of plural on CLDR - CLDR.
  • Notes on setting plural rules for a language at CLDR – see creating a new locale.
  • The forms available for a particular language are listed on CLDR.
  • The latest version of the forms downloaded to translatewiki.net is available here in plain language. This information refers to one of the shared pluralization classes (written in code) which you can find at github. However, each project uses its own database of plural rules drawn from CLDR, usually via its i18n framework. When translating a project, you should check the version of the plural rules for your language used by that project, which should be visible via a link on the project page. (???????)

Plural syntax for projects using the CLDR database

CLDR does allow separate forms for fractions or decimal numbers.

Two extra values can be used with count attributes: 0 and 1. These stand for the explicit numbers 0 and 1, and they may or may not match the forms "zero" and "one". In some languages the form "zero" might cover both 0 and 1, and languages which do not use plural at all might not have the forms "zero" or "one".

If your language has not yet been localised at CLDR, then the plural rules used will be the same as for English.

Plural syntax for projects using the rails-i18n framework

In projects using the rails-i18n framework (the internationalisation framework used with Ruby on Rails) the syntax for the example above is:

  • The following {{PLURAL|one=page is|%{count} pages are}} in the current category.

Here is an example from Open Street Map.

The general case, for a language which uses all the possible forms in CLDR, is {{PLURAL|zero=text|one=text|two=text|few=text|many=text|text}}. 'text' in the example may contain the variable %{count}, which is rendered as a number in digits. The last item listed covers 'other'.

You must always use all the forms defined for your language.

The rails-i18n framework supports zero as a form for any language, where zero is defined as the value 0. Use of the zero form is optional; if it is not supplied, one of the other forms is selected according to the pluralization rules for the language. However, the i18n file parser of the Translate extension does not currently support the explicit values 0 and 1.
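
The keyword syntax and the optional zero form described in this section can be sketched as follows. This is an illustration only, not the code of rails-i18n or of the Translate extension. The function names are hypothetical, and the English category rule stands in for whatever rule a project takes from its CLDR data.

```python
CATEGORIES = ("zero", "one", "two", "few", "many")


def parse_forms(body: str) -> dict[str, str]:
    """Split 'one=page is|%{count} pages are' into forms keyed by category.

    The last item carries no keyword and covers 'other'.
    """
    forms = {}
    for part in body.split("|"):
        key, sep, text = part.partition("=")
        if sep and key in CATEGORIES:
            forms[key] = text
        else:
            forms["other"] = part
    return forms


def english_category(n: int) -> str:
    """Stand-in for a per-language rule: English has 'one' and 'other'."""
    return "one" if n == 1 else "other"


def select(forms: dict[str, str], category_rule, count: int) -> str:
    """Use the zero form for 0 when supplied, otherwise follow the language rule."""
    if count == 0 and "zero" in forms:
        text = forms["zero"]
    else:
        # Raises KeyError if a form defined for the language was left out.
        text = forms[category_rule(count)]
    return text.replace("%{count}", str(count))


forms = parse_forms("one=page is|%{count} pages are")
print(select(forms, english_category, 1))   # page is
print(select(forms, english_category, 10))  # 10 pages are
print(select(forms, english_category, 0))   # 0 pages are
```

If a zero form is added, for instance `zero=no pages are`, it is used for the value 0 and the other values are unaffected.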

Plural in Gettext

There is no centralized standard for plural forms in Gettext, so the Translate extension keeps its own list. The latest version of the forms is available in the Mediawiki code on git.

Plural syntax for projects using Gettext

Gettext plurals in translation work mostly the same as MediaWiki plurals. They are identified by the {{PLURAL:GETTEXT|...}} syntax. The number might not be available for use in the message text. All forms must always be provided.

Here is an example from StatusNet.
