Plural

From translatewiki.net
See also Plural/Comparison of plural rules in various databases

Introduction

Plural formatting provides a way for software to render sentences in a grammatically correct form, for all possible values of the numerical variables on which the sentence depends.

For example:

  • The following page is in the current category.
  • The following 10 pages are in the current category.

The software needs to contain a definition of how many different forms are needed to render plural in a particular language, and which numbers use each form. English uses 2 forms, as follows:

  • 1: form 1
  • 0 and all other numbers: form 2
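
As a minimal sketch of the English rule above, the selection can be written as a small function. Python is used here only for illustration, and the names `english_form_index` and `category_message` are hypothetical; they are not part of any project's API.

```python
def english_form_index(n: int) -> int:
    """Return 0 (form 1) when n is exactly 1, otherwise 1 (form 2)."""
    return 0 if n == 1 else 1


def category_message(n: int) -> str:
    """Render the example sentence for n pages using the two English forms."""
    forms = ["The following page is", f"The following {n} pages are"]
    return f"{forms[english_form_index(n)]} in the current category."


print(category_message(1))
print(category_message(10))
```

Note that 0 falls into form 2 together with all other numbers, exactly as the list above states.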

Other languages may have one form only, or they may use more than 2 forms; up to 6 forms have been requested by the languages supported at translatewiki.net so far. This means that plural syntax may be needed in messages which do not need it in English (see these explanations for software developers: Notes on plural on MediaWiki.org, and on Gettext).

Software projects localised at translatewiki.net use 1 of the following 3 databases of the definitions of plural forms to generate plural:

  • MediaWiki (the legacy rules used before the switch to CLDR in 2012-09)
  • CLDR
  • Gettext

Some projects do not yet support plural. Notes on plural systems used by each project should be on each project page.

Table of plural formats used by projects

Plural ruleset: CLDR

  Projects currently supported on translatewiki.net:
  • Encyclopedia of Life (Ruby implementation)
  • Etherpad lite
  • FreeCol (custom implementation)
  • lib.reviews
  • MediaWiki (1.20 and above, custom implementation)
  • OpenStreetMap (Ruby implementation)
  • Wiki Ed Dashboard (Ruby implementation)

  Formerly supported projects:
  • iNaturalist (Ruby implementation)
  • Shapado (Ruby implementation)

Plural ruleset: MediaWiki (old)

  Projects currently supported on translatewiki.net:
  • FUDforum
  • Intuition

  Formerly supported projects:
  • Wikia
  • MediaWiki (1.19 and below) (legacy ruleset no longer used)

Plural ruleset: Gettext

  Projects currently supported on translatewiki.net:
  • Pywikibot (does not use gettext but uses custom rules from "plural-gettext.txt")

  Formerly supported projects:
  • iHRIS (plural not used)
  • Mwlib.rl (PediaPress) (plural not used)
  • StatusNet

Plural not supported

  Projects currently supported on translatewiki.net:
  • Blockly
  • Kiwix
  • MantisBT
  • Mifos
  • WikimediaMobile
  • WikiBlame

  Formerly supported projects:
  • Crosswatch
  • Okawix
  • Open Images
  • WikiReader
  • WiktionaryMobile

Convergence between the different plural rulesets used by different software has been discussed as a long-term goal at translatewiki.net. A comparison of plural rules by language as set in MediaWiki, CLDR and Gettext is given on Comparison of plural rules in various databases.

Messages needing plural

Some projects have plural enabled for all messages, even when the English message does not need it. Others, such as MediaWiki, must enable plural in the specific messages which need it.

Software developers should be aware that plural should be used even where the value can never be 1, so that the English message itself would not need plural. If you think a message needs plural but does not currently use it, and the project supports plural, please report this. You can report either on the project talk page at translatewiki.net or on the project's own bug reporting site, as directed on the project page.

Plural in MediaWiki

Plural rules for MediaWiki for each language explained in plain language
Notes on plural for developers (on MediaWiki.org)

Since version 1.20, MediaWiki has used plural rules from CLDR with some local overrides. MediaWiki has long used a compact inline plural syntax, which we consider superior to other ways of specifying plural rules. See below for the actual syntax.

Localising plural rules for a language

When a new language is set up in MediaWiki, its default plural has 2 forms (the same as English):

  • number 1 – form 1
  • 0 and all other numbers – form 2

The forms for a language can be changed at any time by putting a request on Support, describing how plural works in your language. Give an example for each number, or group of numbers, which has a different plural. Effort should be made to ensure that new or updated plural rules will end up in the CLDR database.

New plural rules will take effect on Wikimedia Foundation sites in about two weeks if defined directly in MediaWiki. Other sites need to wait for a new MediaWiki release.

MediaWiki can support any number of plural forms (some languages have 6 forms), but the more forms there are, the more work a translator has to do to localise messages. There are two types of application which MediaWiki can handle but which occur only rarely in MediaWiki and its extensions:

  1. Zero. Many messages containing plural can only ever appear when the variable is not 0. Software developers also often write a separate message when 0 is a possible result for a variable. However, occasionally 0 does appear in messages where other numbers are also possible, as in this example:
    • Wlnote ("Below {{PLURAL:$1|is the last change|are the last <strong>$1</strong> changes}} in the last {{PLURAL:$2|hour|<strong>$2</strong> hours}}, as of $3, $4.")
    In some languages, additional forms may be needed to handle localisation of these messages. This is most likely to apply to languages where the number 1 would otherwise not be in a group of its own.
  2. Sentences where a number is not stated. MediaWiki software developers occasionally use the plural function in sentences where the number does not appear at all. For example:
    This causes problems for languages which do not need different plural forms with numbers, but do use a pluralizer when the number is not present.

Since these messages are in the minority, it has been proposed to allow the rule condition and its text to be specified inline for such exceptions.

Another (discouraged) strategy available on MediaWiki is to define an alternative ruleset for a language, as described below.

Documenting plural rules for a language

Please add examples of how to use plural in your language to the portal page of your language. For example, the Finnish portal should have:

  • Normal use $1 {{PLURAL:$1|päivä|päivää}}
  • Idempotent use $1 {{PLURAL:$1|päivän}} ajan

Plural syntax in MediaWiki

In MediaWiki, the syntax for the category example given in the introduction is shown below. $1 represents the number of pages.

  • in English – The following {{PLURAL:$1|page is|$1 pages are}} in the current category.
  • in Upper Sorbian – {{PLURAL:$1|Slědowaca strona je|Slědowacej $1 stronje stej|Slědowace $1 strony su|Slědowacych $1 stronow je}} w tutej kategoriji:

If the number of forms written is fewer than the number of forms required by the plural rules of the language, the last available form will be used for all missing forms. For example, the two Welsh messages below are equivalent.

  • $1 {{PLURAL:$1||diwrnod|ddiwrnod|diwrnod|diwrnod|diwrnod}}
  • $1 {{PLURAL:$1||diwrnod|ddiwrnod|diwrnod}}
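
The fallback to the last written form can be sketched as follows. This is an illustration under stated assumptions, not MediaWiki's actual code: `pick_form` is a hypothetical helper, the form index is assumed to have been computed already by the language's plural rules, and at least one form is assumed to be written.

```python
from typing import Sequence


def pick_form(forms: Sequence[str], index: int) -> str:
    """Return forms[index], or the last written form when index is past the end."""
    return forms[min(index, len(forms) - 1)]


# The two Welsh messages from the list above, split into their forms.
full = ["", "diwrnod", "ddiwrnod", "diwrnod", "diwrnod", "diwrnod"]
short = ["", "diwrnod", "ddiwrnod", "diwrnod"]

# For every one of the six possible form indexes both lists select the same text.
same = all(pick_form(full, i) == pick_form(short, i) for i in range(6))
print(same)
```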

The forms correspond to the named plural rules in CLDR, with some local overrides. Not shown in the rulesets is the form "other", which is used for any number not covered by the other rules.

  • in English: {{PLURAL:$1|one|other}}
  • in Lower Sorbian: {{PLURAL:$1|one|two|few|other}}

Explicit plural forms

Since MediaWiki 1.22, it is possible to force PLURAL to output a specific string for a specific number, especially 1 or 0, overriding the plural rules of any language. This makes it possible to do without separate messages for those cases.

Examples:

  • Accepted by {{PLURAL:$1|1=you|$1 users including you}} is a valid English usage
  • {{PLURAL:42|42=The answer is 42|Wrong answer|Wrong answers}} gives: The answer is 42

An explicit form such as 42= can be placed anywhere in the PLURAL statement after the first |. Translators may use this syntax even when the English message does not, and do not have to use it when the English message does.
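
One way to picture the behaviour described above is the following sketch. It assumes that forms written as "N=text" are checked against the number first and are otherwise set aside before the ordinary rule-based selection; `select_plural` and `english_form_index` are hypothetical names, and the code is not taken from MediaWiki.

```python
import re
from typing import Callable, Sequence

EXPLICIT_FORM = re.compile(r"(\d+)=(.*)", re.DOTALL)


def english_form_index(n: int) -> int:
    """English rule: index 0 for exactly 1, index 1 for everything else."""
    return 0 if n == 1 else 1


def select_plural(n: int, forms: Sequence[str],
                  rule_index: Callable[[int], int] = english_form_index) -> str:
    """Pick the text for n, letting explicit "N=text" forms win over the rules."""
    ordinary = []
    for form in forms:
        match = EXPLICIT_FORM.fullmatch(form)
        if match is None:
            ordinary.append(form)      # a normal, rule-based form
        elif int(match.group(1)) == n:
            return match.group(2)      # explicit form for exactly this number
    if not ordinary:
        return ""
    # Missing forms fall back to the last ordinary form that was written.
    return ordinary[min(rule_index(n), len(ordinary) - 1)]


print(select_plural(42, ["42=The answer is 42", "Wrong answer", "Wrong answers"]))
print(select_plural(1, ["1=you", "$1 users including you"]))
```

Because explicit forms are separated from the ordinary ones before the rule is applied, their position in the statement does not affect which ordinary form is chosen.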

Messages not needing plural

If a particular message uses plural in English but does not need plural in your language then you should still write the plural function. If you do not, then the localisation checks at translatewiki.net will flag the message as problematic and requiring revision.

However, you can use the shortcut explained above (missing forms fall back to the last form written) to avoid repeating the forms: {{PLURAL:$1|text}}.

More examples of message definitions

# Simple plural
'key' => '$1 crying {{PLURAL:$1|baby|babies}}'
# Can be used multiple times
'key' => '$1 crying {{PLURAL:$1|baby|babies}}. Please feed {{PLURAL:$1|it|them}}.'
# Even on different variables
'key' => '$1 crying {{PLURAL:$1|baby|babies}}. Please feed {{PLURAL:$1|it|them}}. You have $2 {{PLURAL:$2|apple|apples}}.'
# The number doesn't need to be visible
'key' => 'You have {{PLURAL:$1|a new message|new messages}}.'
# And you can nest it freely
'key' => 'Please send {{PLURAL:$1|this $2 {{PLURAL:$2|word long|words long}} message|these messages, which are $2 {{PLURAL:$2|word long|words long}}}}.'
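
To make the nesting behaviour concrete, here is a rough sketch of an expander for these message definitions. It is not the MediaWiki parser: it assumes English rules only, handles nothing but {{PLURAL:$N|...}} and $N parameters, and all names in it (`expand`, `english_form_index`, the two regular expressions) are hypothetical.

```python
import re
from typing import Sequence

# Matches an innermost {{PLURAL:$N|...}} call, i.e. one with no nested braces inside.
INNERMOST_PLURAL = re.compile(
    r"\{\{PLURAL:\$(\d+)\|((?:(?!\{\{|\}\}).)*)\}\}", re.DOTALL
)
PARAMETER = re.compile(r"\$(\d+)")


def english_form_index(n: int) -> int:
    """English rule: index 0 for exactly 1, index 1 for everything else."""
    return 0 if n == 1 else 1


def expand(message: str, params: Sequence[int]) -> str:
    """Expand PLURAL calls from the inside out, then fill in the $N parameters."""

    def choose(match: "re.Match[str]") -> str:
        n = params[int(match.group(1)) - 1]      # $1 is params[0]
        forms = match.group(2).split("|")
        return forms[min(english_form_index(n), len(forms) - 1)]

    replaced = 1
    while replaced:                               # repeat until no PLURAL call is left
        message, replaced = INNERMOST_PLURAL.subn(choose, message)
    return PARAMETER.sub(lambda m: str(params[int(m.group(1)) - 1]), message)


nested = (
    "Please send {{PLURAL:$1|this $2 {{PLURAL:$2|word long|words long}} message"
    "|these messages, which are $2 {{PLURAL:$2|word long|words long}}}}."
)
print(expand("$1 crying {{PLURAL:$1|baby|babies}}", [1]))
print(expand(nested, [2, 1]))
```

Expanding the innermost calls first means that the outer call only ever sees plain text when its forms are split on |.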

MediaWiki messages not supporting plural

There are some messages, for example log messages, which cannot use PLURAL on all variables. These messages cannot be parsed by invoking the MediaWiki parser to store or render them; instead a simpler i18n framework (with fewer internal dependencies) is used. There may be no connection with the user's preferences for their current language (e.g. in messages generated by background jobs and internal bots, or in messages returned by the API, which may work only with a single locale, and must not fail).

This may also happen in order to preserve compatibility with older supported versions of MediaWiki or its supported extensions (newer versions or alternate extensions may eventually support PLURAL, but will use different messages, translated separately).

Alternative ruleset (obsolete)

This section is obsolete since MediaWiki 1.20:

It is possible though discouraged to define more than one ruleset for a language, as long as the number of forms in each ruleset is different. The software will count the forms written in a plural function and use the ruleset corresponding to the total number of forms used. This is used in languages where the number 1 is included in a group with other numbers. The alternative ruleset usually gives 1 a form of its own. This is used to write plural in sentences where no number appears.

Russian is an example of a language with two rulesets.

  • Russian ruleset for sentences where the number appears
    1. n mod 10 is 1 and n mod 100 is not 11 (1, 21, 31, 41, 51, 61...)
    2. n mod 10 in 2..4 and n mod 100 not in 12..14 (2-4, 22-24, 32-34...)
    3. n mod 10 is 0 or n mod 10 in 5..9 or n mod 100 in 11..14 (0, 5-20, 25-30, 35-40...)
  • Russian ruleset for sentences where the number does not appear
    1. n is 1 (1)
    2. n is not 1 (0, 2, 3...)
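
The two Russian rulesets listed above, and the selection between them by the number of forms written, can be sketched like this. The function names are hypothetical, and the sketch only mirrors the rules as stated in this (obsolete) section.

```python
def russian_index_with_number(n: int) -> int:
    """Three-form ruleset, for sentences where the number appears."""
    if n % 10 == 1 and n % 100 != 11:
        return 0                      # 1, 21, 31, 41...
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1                      # 2-4, 22-24, 32-34...
    return 2                          # 0, 5-20, 25-30, 35-40...


def russian_index_without_number(n: int) -> int:
    """Two-form ruleset, for sentences where the number does not appear."""
    return 0 if n == 1 else 1


def russian_index(n: int, forms_written: int) -> int:
    """Choose the ruleset by counting the forms written in the plural function."""
    if forms_written == 2:
        return russian_index_without_number(n)
    return russian_index_with_number(n)


# 21 behaves like 1 when the number is shown, but not in the two-form ruleset.
print(russian_index(21, 3), russian_index(21, 2))
```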

Example of Russian message using the alternative ruleset with 2 forms:
Раньше на сайте {{PLURAL:$1|уже был [$2 файл]|были [$2 файлы]}} с точно таким же содержанием, но {{PLURAL:$1|он был удалён|они были удалены}}.

Plural in CLDR

  • Description of plural on CLDR.
  • Notes on setting plural rules for a language at CLDR – see creating a new locale.
  • The forms available for a particular language are listed on CLDR.

Plural syntax for projects using the CLDR database

CLDR does allow separate forms for fractions or decimal numbers.

There are two extra values that can be used with count attributes: 0 and 1. These are used for the explicit values, and may or may not be the same as the forms for "zero" and "one". "zero" might in some languages contain 0 and 1, or languages not using plural at all might not even have forms "zero" or "one".

If your language has not yet been localised at CLDR, then the plural rules used will be the same as in the fallback language or English.

Plural syntax for projects using the rails-i18n framework

In projects using the rails-i18n framework (used with Ruby on Rails), the syntax for the category example given in the introduction is:

  • The following {{PLURAL|one=page is|%{count} pages are}} in the current category.

Here is an example from OpenStreetMap.

The general case, for a language which uses all the possible forms in CLDR, is {{PLURAL|zero=text|one=text|two=text|few=text|many=text|text}}. 'text' in the example may contain the variable '%{count}', which is rendered as a number in digits. The last item listed covers 'other'.

You must always use all the forms defined for your language.

The rails-i18n framework supports zero as a form for any language, where zero is defined as the value 0. Use of the zero form is optional; if it is not supplied, one of the other forms will be selected according to the pluralization rules for the language. However, the Translate extension's i18n file parser does not currently support using 0 and 1.
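
The optional zero form described above can be sketched as a lookup that checks for it first and otherwise falls back to the language's own rule. The sketch is written in Python rather than Ruby purely for illustration; `pluralize` and `english_category` are hypothetical names, and the "No pages are" text is an invented example, not a real message.

```python
from typing import Callable, Mapping


def english_category(count: int) -> str:
    """English rule expressed as CLDR category names."""
    return "one" if count == 1 else "other"


def pluralize(forms: Mapping[str, str], count: int,
              category: Callable[[int], str] = english_category) -> str:
    """Use the zero form for 0 when it is supplied, else the rule-based form."""
    if count == 0 and "zero" in forms:
        text = forms["zero"]
    else:
        text = forms[category(count)]
    # %{count} is rendered as the number in digits.
    return text.replace("%{count}", str(count))


without_zero = {"one": "The following page is",
                "other": "The following %{count} pages are"}
with_zero = dict(without_zero, zero="No pages are")

print(pluralize(with_zero, 0))
print(pluralize(without_zero, 0))
```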

Plural in Gettext

There is no centralized standard for plural forms in Gettext, so the Translate extension keeps its own list. The latest version of the forms is available in the MediaWiki Translate code on git.

Plural syntax for projects using Gettext

Gettext plurals in translation work mostly the same as MediaWiki plurals. They are identified by the {{PLURAL:GETTEXT|...}} syntax. The number might not be available for use in the message text. All forms must always be provided.

Here is an example from StatusNet: {{PLURAL:GETTEXT|%1$d byte|%1$d bytes}}.
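
As an illustrative sketch, the StatusNet example can be evaluated with a Python equivalent of the English rule from the GNU gettext manual (nplurals=2; plural=n != 1). The helper names are hypothetical, and the check on the number of forms reflects the requirement stated above that all forms must always be provided; it is not how any particular project enforces it.

```python
from typing import Callable, Sequence


def english_plural(n: int) -> int:
    """Python equivalent of the gettext expression plural=(n != 1)."""
    return int(n != 1)


def gettext_plural(forms: Sequence[str], n: int, nplurals: int = 2,
                   plural: Callable[[int], int] = english_plural) -> str:
    """Select the form for n, insisting that every form has been provided."""
    if len(forms) != nplurals:
        raise ValueError(f"expected {nplurals} forms, got {len(forms)}")
    # %1$d is the positional placeholder for the number in this message.
    return forms[plural(n)].replace("%1$d", str(n))


forms = ["%1$d byte", "%1$d bytes"]
print(gettext_plural(forms, 1))
print(gettext_plural(forms, 5))
```

Unlike the MediaWiki syntax, there is no fallback to the last written form in this sketch: a message with too few forms is rejected outright.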

See also: GNU gettext utilities: Plural forms