
Portal talk:Lv/LiquidThreads

From translatewiki.net
This page contains archived discussions using the Liquid Threads extension. You can still participate in existing conversations, but you should start new conversations using the new discussion tools on the main talk page.

Contents

Thread title                                      Replies   Last modified
PLURAL localization                               2         23:19, 3 April 2016
Bot request for Latvian(lv) plural translations   6         15:21, 12 January 2015
Documentation of plural rules                     0         16:51, 28 April 2012

PLURAL localization

I was pointed here after reading Plural#Localising plural rules for a language. In Latvian, we have some problems with PLURAL. The wikicode could look like this (we have a template that does much the same as PLURAL, and no mistakes have been noticed in the last 5 years):

{{#ifexpr: (abs({{{1}}}) mod 10 = 1) and (abs({{{1}}}) mod 100 != 11)
| {{{2}}}<!-- word in singular; in [[Plural#Localising plural rules for a language]] described as "form1" -->
| {{{3}}}<!-- word in plural; in [[Plural#Localising plural rules for a language]] described as "form2" -->}}

Examples (the PLURAL function is included to enable checking on lvwiki):

PLURAL   Number   Should be   Currently is
pieces   0        pieces      piece
piece    1        piece       pieces
pieces   11       pieces      piece
pieces   42       pieces      pieces
pieces   321      piece       pieces
pieces   322      pieces      pieces
pieces   500      pieces      piece
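
For readers who do not read parser-function syntax, the two-form rule from the template above can be sketched in Python. This is only an illustration of the #ifexpr logic; the function name `lv_two_form` is made up for this sketch and is not part of MediaWiki or of the lvwiki template.

```python
def lv_two_form(count: int, singular: str, plural: str) -> str:
    """Pick the singular form for numbers ending in 1 but not in 11."""
    n = abs(count)
    if n % 10 == 1 and n % 100 != 11:
        return singular  # "form1" in the template above
    return plural        # "form2" in the template above


# Reproduce the "Should be" column of the table above.
for number in (0, 1, 11, 42, 321, 322, 500):
    print(number, lv_two_form(number, "piece", "pieces"))
```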

Can this be fixed?

Edgars2007 (talk)16:46, 24 December 2014

The current rules come from CLDR and are:

<pluralRules locales="lv prg">
    <pluralRule count="zero">n % 10 = 0 or n % 100 = 11..19 or v = 2 and f % 100 = 11..19 @integer 0, 10~20, 30, 40, 50, 60, 100, 1000, 10000, 100000, 1000000, … @decimal 0.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, …</pluralRule>
    <pluralRule count="one">n % 10 = 1 and n % 100 != 11 or v = 2 and f % 10 = 1 and f % 100 != 11 or v != 2 and f % 10 = 1 @integer 1, 21, 31, 41, 51, 61, 71, 81, 101, 1001, … @decimal 0.1, 1.0, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 10.1, 100.1, 1000.1, …</pluralRule>
    <pluralRule count="other"> @integer 2~9, 22~29, 102, 1002, … @decimal 0.2~0.9, 1.2~1.9, 10.2, 100.2, 1000.2, …</pluralRule>
</pluralRules>
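
To make the three categories easier to follow, the rules quoted above can be sketched in Python. This is a rough reading of the rule text, not the code MediaWiki or CLDR actually runs. It assumes the input is a plain decimal string, that n is the absolute value, v the number of visible fraction digits and f those digits as an integer; the function names are invented for this sketch.

```python
from decimal import Decimal


def operands(number: str):
    """Return the operands n, v and f for a plain decimal string."""
    text = number.lstrip("+-")
    _, _, fraction = text.partition(".")
    n = Decimal(text)                     # absolute value of the number
    v = len(fraction)                     # count of visible fraction digits
    f = int(fraction) if fraction else 0  # visible fraction digits as an integer
    return n, v, f


def in_range(value: Decimal, low: int, high: int) -> bool:
    """True when value is a whole number between low and high inclusive."""
    return value == value.to_integral_value() and low <= value <= high


def lv_category(number: str) -> str:
    """Return 'zero', 'one' or 'other' following the quoted lv rules."""
    n, v, f = operands(number)
    if n % 10 == 0 or in_range(n % 100, 11, 19) or (v == 2 and 11 <= f % 100 <= 19):
        return "zero"
    if (
        (n % 10 == 1 and n % 100 != 11)
        or (v == 2 and f % 10 == 1 and f % 100 != 11)
        or (v != 2 and f % 10 == 1)
    ):
        return "one"
    return "other"


for sample in ("0", "11", "21", "42", "500", "0.1", "1.2"):
    print(sample, lv_category(sample))
```

With these rules 0, 11 and 500 fall into "zero", so a two-form message written for the old rules shows its first form for them.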

You would have to convince CLDR to change the rules. We only override CLDR rules in exceptional cases.

Nike (talk)00:44, 25 December 2014

The use of

{{PLURAL|form1|fallbackform}}

with positional (numbered) parameters should be deprecated in favor of the more explicit labelled forms:

{{PLURAL|one=form1|zero=form2|fallbackform}}

The mapping from labels (one, zero, other... as used in CLDR) to positional numbers (compatible with imports/exports for the legacy GNU Gettext format) can be made language-dependent, but the database should record translations with those labels instead of implicitly numbered positions. All we can assume is that the form "other" (used by CLDR) should be mapped to the fallback form (the last form used in the legacy syntax using positional parameters).

This way no bot action would be required, except for first transforming all existing translations so that they use labelled forms (except for the last form, which should remain "other").

Adding a plural rule for the "zero" form will then have minimal impact. It will be possible to change the fallback order later (once all the existing translations have had their forms labelled), by remapping the positional numbers assigned to each label.
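
A minimal sketch of that idea in Python, purely as an illustration: the function names and the per-language label order are assumptions made up here, not an existing MediaWiki or Translate API. Positional forms get labels from a language-specific order, the last form always becomes "other", and lookups fall back to "other" when a category has no form of its own.

```python
def label_forms(forms, label_order):
    """Turn legacy positional forms into labelled forms.

    The last positional form is always the fallback and is stored as "other".
    """
    *leading, fallback = forms
    if len(leading) > len(label_order):
        raise ValueError("more positional forms than labels for this language")
    labelled = dict(zip(label_order, leading))
    labelled["other"] = fallback
    return labelled


def pick_form(labelled, category):
    """Return the form for a category, falling back to "other"."""
    return labelled.get(category, labelled["other"])


# Hypothetical label order for the leading positions of one language.
legacy = label_forms(["form1", "fallbackform"], ["one", "zero"])
print(legacy)                     # {'one': 'form1', 'other': 'fallbackform'}
print(pick_form(legacy, "zero"))  # fallbackform
```

Because the stored data is keyed by label, adding a "zero" rule later only changes which lookups hit the fallback; the stored translations stay untouched.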

This could also allow adding more labels, for example for gender, case, or for selectors of their object (reader, author, each cited person, group of persons, subject of the verb, object of the sentence, genitive possessor...).

A translation could also carry these labels as tags. For example, in chess, the French translation of "tower" is "tour", tagged "feminine" and "one" (the French singular); its plural "tours" would be tagged "feminine" and "other". The French translation of "pawn" is "pion", tagged "masculine" and "one"; its plural "pions" would be tagged "masculine" and "other". There is no need to add multiple variants: the source message only requests the translation of "tower" and "pawn". When such an item is used in a variable, the selectors are accessible from the variable name and variants are selected automatically, so that:

{{#grammarswitch: ${variablename|?=gender articlecontract} ${count|?=plural}
| f plain    zero  = aucune       ${variablename} ${color|f one}
| f plain    one   = la           ${variablename} ${color|f one}
| f contract one   =            l’${variablename} ${color|f one}
| f          other = les ${count} ${variablename} ${color|f other}
| m plain    zero  = aucun        ${variablename} ${color|m one}
| m plain    one   = le           ${variablename} ${color|m one}
| m contract one   =            l’${variablename} ${color|m one}
| m          other
| #default         = les ${count} ${variablename} ${color|m other}
}}
// E.g. Given variablename="tower", count="2", color="white", we get: "les 2 tours blanches"

The selectors used above are generic (and look complex here), but shorter aliases for combinations could also exist, defined by a mapping similar to the mapping of form labels to numbers. Note that for the ${color} adjective (which would be replaced by the translated forms "blanc(he)(s)" or "noir(e)(s)", depending on the color value and on the gender and plural of the name used with it), we select the appropriate form of the adjective using a smaller set of labels (we could use the "zero" form, but its default is the same as "one" in French).

Selectors in the grammar switch are just an unordered space-separated list of symbolic tags (labels). A selector matches if all tags in the selector are present in the first parameter of the grammar switch. Some known synonyms (defined for the language) could be used such as "s" for "one", or "ms" instead of "m one".
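
The matching rule can be sketched as a subset test in Python. This is only an illustration of the proposal: "grammarswitch" does not exist, the function name is invented, and picking the first matching case is an assumption of this sketch.

```python
def select_variant(active_tags, cases, default=None):
    """Return the text of the first case whose selector tags are all active."""
    active = set(active_tags.split())
    for selector, text in cases:
        if set(selector.split()) <= active:  # every selector tag is present
            return text
    return default


# Articles only, modelled loosely on the French example above.
cases = [
    ("f plain zero", "aucune"),
    ("f plain one", "la"),
    ("f contract one", "l’"),
    ("f other", "les"),
    ("m plain zero", "aucun"),
    ("m plain one", "le"),
    ("m contract one", "l’"),
]

print(select_variant("f plain other", cases, default="les"))  # les
print(select_variant("m plain one", cases, default="les"))    # le
```

Since tags are compared as sets, "plain f one" and "f plain one" select the same case.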

Variables can be queried in two ways:

  • using a "?" parameter in the pipe to retrieve a list of specific tags matching some known pattern (e.g. "gender" matches the tags "m" or "f" if they are set in the translation of the variable, and returns them). Synonym tags could be matched here too (such as "?=gp" instead of "?=gender plural"); this would retrieve the union of the tags "m" or "f" for the gender and "one" or "other" for the plural form.
  • using a simple pipe followed by a space-separated list of tags, to select one of its known translations (internally stored like a grammar switch). Synonym tags could be matched here too (such as "|ms" instead of "|m one"); this would select the variant to return for the masculine singular form in French.

Tags (labels) are just symbolic names (containing only digits, letters, or dashes/underscores, with significant case to simplify implementations). There's also no limitation on the number of tags that can be set or queried from a translation or combined in a selector of a "grammarswitch".

The "grammarswitch" syntax above is for MediaWiki or for use in parts of a compound translation of the same variant; the internal representation in the database (for storing multiple variants of the same translation unit) will just create as many translation items as needed, indexed by the tag list sorted alphabetically.
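
As a small illustration of such an index, here is a Python sketch where the storage key is the sorted tag list. The key format (tags joined by spaces) and the function name are assumptions of this sketch, not a description of any existing database schema.

```python
def variant_key(tags):
    """Build a canonical key: unique tags, sorted, joined by spaces."""
    return " ".join(sorted(set(tags)))


# Hypothetical storage for the variants of one translation unit ("tower").
variants = {
    variant_key(["f", "one"]): "tour",
    variant_key(["other", "f"]): "tours",
}

print(variants[variant_key(["one", "f"])])  # tour
```

Sorting makes the key independent of the order in which a translator wrote the tags.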

We can also tag the resulting translation with its appropriate tags. By default, the tags for the result of a "grammarswitch" are those in its selector (in the first parameter), and we don't need to change them. Such a change may be needed if the full translation contains more than one grammarswitch, but by default all tags in all grammarswitches are combined in a union, and literal text outside the switch doesn't alter the list of tags. To set another list of tags for the result, we could use a syntax such as "${|=new tags}" at the end of the variant (in wikisyntax). This system of defaults would allow useful inheritance of linguistic properties for complex composites, but the tags would collide if we wanted to record, for example, two different genders for distinct parts, in which case new tags should be set for each part. As another example, a composite containing two singular names linked by a conjunction like "and"/"et" would not keep the "one" tag of the singular, but could reset it to the tag "other" with "${|one = other}", and could force the resulting gender to become masculine when both "m" and "f" are present with "${|m f = m}". Both can be set together: "${|one = other|m f = m}", which means: if tag "one" is set, remove it and set tag "other"; if tags "m" and "f" are both set, remove them and add tag "m".

Other examples are possible for selectors of case, person/object/subject, and so on. Many linguistic exceptions can also be taken into account (e.g. for the example above, composite color names such as "bleu-vert" are invariable in French even when used as epithet or attribute adjectives).
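
The tag-rewriting rules described above can be sketched in Python as follows. The "${|...}" syntax is only a proposal, so the parser and the function names here are hypothetical; the sketch applies the rules in the order written, which is also an assumption.

```python
def parse_rules(spec):
    """Parse "one = other|m f = m" into (required tags, replacement tags) pairs."""
    rules = []
    for part in spec.split("|"):
        required, replacement = part.split("=")
        rules.append((set(required.split()), set(replacement.split())))
    return rules


def apply_rules(tags, rules):
    """If all required tags are set, remove them and add the replacement tags."""
    result = set(tags)
    for required, replacement in rules:
        if required <= result:
            result = (result - required) | replacement
    return result


rules = parse_rules("one = other|m f = m")
# Union of the tags of a masculine singular and a feminine singular name.
print(sorted(apply_rules({"one", "m", "f"}, rules)))  # ['m', 'other']
```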

Verdy p (talk)21:55, 3 April 2016
 
 

Bot request for Latvian(lv) plural translations

As we recently found out, our plurals are implemented using three categories, not two. We would need a bot to run on all MediaWiki-related messages that have only two parameters and copy the second value into the first position (from a language perspective this is OK). For example, {{PLURAL:$1|one|other}} should result in {{PLURAL:$1|other|one|other}}. Any volunteers?
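
The requested substitution could look roughly like the Python sketch below. It is not the bot that was actually run; the regular expression and the function name are assumptions. It deliberately skips PLURAL calls whose forms contain nested braces or that already have three or more forms.

```python
import re

# Matches {{PLURAL:<param>|<form1>|<form2>}} with exactly two simple forms.
TWO_FORM_PLURAL = re.compile(r"\{\{PLURAL:([^|{}]+)\|([^|{}]*)\|([^|{}]*)\}\}")


def expand_two_form_plural(message: str) -> str:
    """Copy the second form into a new first position: a|b becomes b|a|b."""
    return TWO_FORM_PLURAL.sub(r"{{PLURAL:\1|\3|\2|\3}}", message)


print(expand_two_form_plural("{{PLURAL:$1|one|other}}"))
# {{PLURAL:$1|other|one|other}}
```

Messages changed this way would still need review by translators, since the first and third forms are not always identical.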

Papuass (talk)15:56, 9 January 2015

This is a related thread where we were convinced that this is a bug in MediaWiki: Thread:Support/PLURAL_localization. Updating messages would fix that.

Papuass (talk)15:59, 9 January 2015

The last time the rules changed was one year ago, for CLDR 24. According to that patch (not particularly clear):

  • Latvian (lv) used to count only 0 as 'zero' form, but in CLDR 24, any number satisfying the following formula is counted as zero: n % 10 = 0 or n % 100 = 11..19 or v = 2 and f % 100 = 11..19 Examples: 0, 10~20, 30, 40, 50, 60, 100. Updated the tests accordingly. Not overriding it in MW. Users will see different plural form for the above numbers.

Translated in human-readable form, the three forms of PLURAL are:

  1. 0, multiples of 10, numbers ending with 11 to 19, and the corresponding decimals [was: only 0];
  2. 1 and all numbers or decimals ending with 1, except those ending with 11 [was: numbers ending with 1 except 11];
  3. all remaining numbers, e.g. 2, 3, ..., 9; 22, 23, ..., 29; from 0.2 to 0.9, from 1.2 to 1.9 [was: nothing].

Does this correspond to what you're seeing? Is it possible to make a grammatical translation with these rules? (Nobody reported the rules as erroneous to CLDR yet; only an aesthetic change.)

The substitution you suggest means that you were still using the rules which were deprecated in June 2012. :( There is only one other language whose rules changed this way in 2012, ksh (it used to have 1, other, 0; in CLDR it's 0, 1, other); I don't know if their translations were corrected back then.

Nemo (talk)20:06, 9 January 2015

Yes, we know that we were using the old plural rules. We were unable to find the reason, so we created a plural template for the most obvious places. I spent some time yesterday finding out the real reason.

If the bot marked the requested edits as fuzzy, we would validate the translations later. Case 1 and case 3 can be the same in most cases.

Papuass (talk)12:30, 10 January 2015

Ok, if it's "most cases" then we can't merge the two rules. I don't understand why the "zero" rule comes first, though. Anyway, I'm working on the automatic replacement.

In the meanwhile, can you please do as Lloffiwr asked and document the rule somewhere here on the portal? Thanks.

Nemo (talk)13:29, 10 January 2015

The bot is almost done. You can revise all the translations within this page: [1].

Nemo (talk)14:21, 10 January 2015
 
 
 
 
 

Documentation of plural rules

I have drafted a list of MediaWiki plural rules for all languages. I would be grateful if a translator could check that the rules for Latvian are correctly recorded in the list. If I have made a mistake please go ahead and correct it.

It could also be useful for future translators of Latvian to be able to read notes on using plural, especially MediaWiki plural, on your portal or a sub-page of the portal. I note that the various plural systems used by projects have different rules for Latvian.

Lloffiwr (talk)16:51, 28 April 2012