A DiscordWikiBot message always fails

This message is marked as wrong:

https://translatewiki.net/w/i.php?title=Special:Translate&showMessage=discordwikibot-configuring-status-streaming&group=discordwikibot&language=he&filter=&optional=1&action=translate

I see these errors:

Incorrect number of plural forms in {1:שנייה אחת|{1}. It must have 4 plural forms. Currently 2 plural forms are given.
Incorrect number of plural forms in {0:דקה אחת|{0}. It must have 4 plural forms. Currently 2 plural forms are given.

I suspect that the plural forms in the translations are parsed incorrectly by the validator: The variables in this format look like {1}, and the validator appears to think that the closing curly brace of {1} is the closing curly brace of the whole plural expression. I might be wrong, however.

Amir E. Aharoni (talk) 07:30, 24 November 2022
Edited by author.
Last edit: 10:09, 24 November 2022

Your syntax is just wrong. You cannot put placeholders {0} or {1} inside the "plural selectors" starting with "{0:" or "{1:".

It should work if you write it as

התגובה האחרונה מהסטרימים: {0} {0:דקה אחת|דקות|דקות|דקות} {1} {1:שנייה אחת|שניות|שניות|שניות} מוקדם יותר.

Why did you place these placeholders inside when they can be obviously left outside (exactly like in English)?

You tried a syntax that looks like this in English: {0:{0} second|{0} second|{0} second|{0} seconds} instead of simply {0} {0:second|second|second|seconds}.

It looks syntactically correct, but the pywikibot-specific form of the plural selector cannot parse placeholders inside each of the listed forms. Yes, this looks like a bug in the parser, but I'm not even sure that pywikibot would accept it, although the MediaWiki syntax for plural forms does support this. Even if the message validator is fixed, I'm not sure that pywikibot will accept it: it may have the same "bug" if it searches for the first closing brace after "{0:" or "{1:" with a regexp. A regexp cannot "count" the nesting levels of braces to determine which closing brace terminates the selector, so it cannot reliably split the result on pipes to get the list of plural forms.
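The failure mode described above can be sketched in Python. This is only an illustration: the pattern NAIVE_SELECTOR and the English stand-in message are assumptions, not the actual code of the validator or of the bot.

```python
import re

# Hypothetical pattern: a selector is "{N:" followed by everything up to
# the FIRST closing brace. It has no notion of nested braces.
NAIVE_SELECTOR = re.compile(r"\{(\d+):([^}]*)\}")


def naive_forms(message):
    """Return the text matched as a selector and the forms split on pipes."""
    match = NAIVE_SELECTOR.search(message)
    return match.group(0), match.group(2).split("|")


# English stand-in with four forms, three of which embed the placeholder.
message = "{0:one second|{0} seconds|{0} seconds|{0} seconds}"
matched, forms = naive_forms(message)

print(matched)     # the match stops at the brace that closes the inner {0}
print(len(forms))  # only two forms are seen, although four were written
```

The truncated match has the same shape as the fragments quoted in the error messages at the top of this thread, which is consistent with the "first closing brace" explanation but does not prove it.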

Verdy p (talk) 09:38, 24 November 2022

They cannot "be obviously left outside". Your suggestion is grammatically wrong in the Hebrew language. Different numbers need different word order. Stop guessing things about the grammar of languages that you don't know.

Is it documented anywhere that {1} cannot be used within plural clauses in this format?

Amir E. Aharoni (talk) 10:06, 24 November 2022

I'm not "guessing" at all, I analyse what you tried to do. I've not made any assumption at all about Hebrew (and I'm perfectly aware that it needs 4 clauses for plural forms and that languages may need to change the word order depending on numeric values).

I also note that you wanted to remove the placeholder for the 1st form (zero in Hebrew), as if you had written this in English (which is also not accepted by the validator when I try it on the "en-x-lolcat" locale for tests):

 {0:zero second|{0} second|{0} second|{0} seconds}

But to support it, pywikibot cannot use a basic regexp to parse it; it needs a LALR (or recursive) parser (like the MediaWiki parser) that can parse the string delimited by either "}" or "|", using a counter to track the nesting level of braces. Such a thing would also be needed in English if we wanted to use a word like "zero" or "no" instead of displaying the numeric value as a placeholder at a fixed position.
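A minimal sketch of such a brace-counting splitter, in Python. The function name parse_selector and its return shape are hypothetical; it only shows the counting idea and is not how any of the tools discussed here is implemented.

```python
def parse_selector(text, start):
    """Parse a selector such as '{0:a|{0} b}' that begins at text[start].

    Returns (argument index, list of forms, position after the closing brace).
    Pipes split forms only at nesting depth 1, so nested {0} stays intact.
    """
    if text[start] != "{":
        raise ValueError("selector must start with '{'")
    colon = text.index(":", start)
    index = text[start + 1:colon]
    forms, current, depth = [], [], 1
    pos = colon + 1
    while pos < len(text):
        ch = text[pos]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # This brace closes the selector itself.
                forms.append("".join(current))
                return index, forms, pos + 1
        if ch == "|" and depth == 1:
            forms.append("".join(current))
            current = []
        else:
            current.append(ch)
        pos += 1
    raise ValueError("unbalanced braces")


message = "{0:zero second|{0} second|{0} second|{0} seconds}"
index, forms, end = parse_selector(message, 0)
print(index, forms)
```

With the depth counter, the example from this post yields four forms, and the inner {0} placeholders are kept as part of each form.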

We could have also tried to use this in English and French (even if they have 2 forms for plurals):

 {0:{0} or more|several}
 {0:{0} ou plus|plusieurs}

And here also this is not accepted for the same reason, so this is not specific at all to Hebrew.

So does this need to be fixed not just in the TWN validator, but also in pywikibot, which uses the string? If this does not work in pywikibot, you need to find a translation that forces the unconditional inclusion of placeholders outside plural forms.

Also, we should find where that string is used in pywikibot to know how it parses it. Fixing that in TWN's parser won't have an effect if it's not correct in pywikibot. If it is fixed in pywikibot, then a solution can be found in TWN that mimics it.

But if pywikibot is not fixed, maybe it could change the format of the string, to use for example "$0" instead of "{0}" for placeholders, while keeping "{0: ... }" for plural forms (this would require no fix in TWN itself), so that these placeholders can be safely inserted inside plural clauses.

And please stay polite, Amir: you've made claims and insisted multiple times about things you absolutely don't know in other languages (from the perception you have in your own country), and also enforced your position abusively! And you have just made another false assumption about my own knowledge. I have documented and described plural rules in many places, and even asserted many times that word order could not be assumed in sentences, including for plurals, or other grammatical or semantic features, or letter cases and case conversions (e.g. Turkish), or punctuation (notably in Armenian, Greek, Spanish), or spacing (notably Southeast Asian languages)...

Verdy p (talk) 10:07, 24 November 2022

If you don't know the answer, don't write an answer, especially such a long one. I don't have time to explain all your mistakes to you, I'll just note two obvious ones:

  1. The first form is not zero in Hebrew.
  2. This project has absolutely nothing to do with pywikibot.

People repeatedly complain that they are confused and intimidated by your long and wrong responses. This is seriously harmful. The next time you do this, I'll block you to prevent further disruption.

Amir E. Aharoni (talk) 08:24, 25 November 2022

Sorry, but I thought this "discordwikibot" was an add-on based on "pywikibot". This does not change what I said, just substitute the word. There's evidently no "disruption", as you've understood perfectly. But why do you threaten so many people and take such a position, even when they show you that you are wrong and give proof?

But the fact that this bot is written in C# makes things even worse, and we have documentation about this (you asked for such a reference, this is one!):

https://learn.microsoft.com/en-us/dotnet/standard/base-types/composite-formatting

Which explicitly says that format strings used inside placeholders (themselves surrounded by braces) do NOT support nesting braces for embedding other placeholders:

Interpreting nested braces isn't supported. [sic!]

So I was perfectly right in my first sentence above:

You cannot put placeholders {0} or {1} inside the "plural selectors" starting with "{0:" or "{1:"

There's a discussion about this in the section "Escaping braces" of that documentation article. There's an example showing how "{{{0:D}}}" is parsed by the standard format provider, and it does not work as intended: it generates a literal "{", then formats the variable at position [0] with the non-working format "D}" (which results from unescaping the first two "}"), and then outputs the remaining "}".

There's no work-around given (by Microsoft in its standard docs for C#) to do all that in a *single* localization string. The default "FormatProvider" used by "String.Format()" in C# does not tolerate/support that.

The only solution for that case is to use another "FormatProvider" class to change how placeholders are parsed. There's currently no standard for this in C#: many packages try to implement plural rules, but many of these "solutions" work only for a few languages or rely on bad assumptions based on English or another single language. I've seen packages written to support Russian or Hebrew with a fallback to English, but none working across languages, and none using a well-defined syntax that supports embedded plural forms inside the same localized string. It is possible to write one in C#, e.g. using the MediaWiki syntax, by implementing such a "FormatProvider" class. That's what several C# packages supporting CLDR-based localisation do, unless they simply use ICU ported to or interfaced from C#.

But if I look closely at the Microsoft article, one way to do that would be to use "double-brace" escaping for placeholders embedded inside placeholders, maybe something like:

{0:zero second|{{0}} second|{{0}} second|{{0}} seconds}

That's something to try, if it works in the bot and if the bot really implements a "custom field formatter" for plural rules: the *parsed* string "zero second|{0} second|{0} second|{0} seconds" (after unescaping) contains pipes and is not a standard field formatter. Only then would it make sense to fix TWN's validator (which needs to properly count the nesting level of braces) and document this somewhere on this wiki. But I fear that the result will contain visible braces. In any case, this "double-brace escaping" (which apparently exists in C# for compatibility with "interpolated strings") would be very unfriendly for translators (how many braces must we use depending on context?). Writing an "IFormatProvider" would be safer than trying to use an "ICustomFormatter", where "format strings" are not supposed to contain any brace or any text in a human language, just basic codes like "D" for dates or "F2" for floating-point numbers with 2 decimals of precision. A "custom formatter" is normally used to format another type of value, such as a complex number, instead of using the default "ToString" method if there's one for that type.

I don't know how the bot's author chose that syntax to support plural formatters, but that syntax natively cannot support what you think would be correct, and we cannot progress without asking these developers what they intend to do. For now, there's no way to embed any placeholder in the clauses of the existing "custom format string" that this bot uses for plural forms, so there's no real need to fix TWN's validator to support something that does not actually work, without knowing what the final solution will be. But we may still fix TWN's validator so that it correctly pairs opening/closing braces (this will still work with most strings using the "double-brace" escaping described above, as braces should be paired in any valid translatable message).

It seems that DiscordWikiBot uses such a custom format provider called "SmartFormat", but I don't know which version is used. See for example https://github.com/axuno/SmartFormat/pull/322 for a recent change to support languages with a single plural form (like Japanese, or sometimes even in English, where the terms to translate, e.g. "you are", do not vary with plural in English but do vary in other languages). This requires a change in the Wikibot project: either disabling the "autodetection" of the custom format (which requires the presence of at least one '|' pipe), or specifying the name of the "plural:" formatter explicitly when there's a single form (i.e. {0:plural:form1|form2} or just {0:plural:form1} or even {0:plural:}).

This change is very recent (8 days ago in the "SmartFormat" repository, so it is not part of the latest version 3.2.0, released 2 months ago). I don't know if "SmartFormat" implements or supports the "double-brace" escaping (documented by Microsoft for the default format provider for "String.Format()" in C#), or if that escaping is still applied by default (in the "String" class) before any custom format provider is called (this seems to be the case according to a discussion in closed bug #322 and its related bug #320, linked at the start of this paragraph).

A comment in the commit log for closing bug #322 recommends that "autodetection" be turned off, and it will soon be off by default. In that case you'll need to name the "plural:" formatter in ALL translated messages, the legacy syntax "{0:form1|form2}" will no longer work, and all translated messages for that Wikibot using plural forms will have to be updated here in ALL languages! This would also affect other projects supported in TWN that are written in C# or in other programming languages based on .NET/CLI and that chose to use "SmartFormat".
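If the explicit "plural:" name does become mandatory, existing translations could in principle be updated mechanically. Here is a rough Python sketch of such a rewrite, assuming flat selectors with no nested braces; the pattern IMPLICIT_PLURAL and the function name_plural_formatter are hypothetical helpers, not part of SmartFormat, DiscordWikiBot or TWN.

```python
import re

# Hypothetical pattern for a legacy selector like {0:form1|form2}:
# a numeric index, no "plural:" prefix yet, no nested braces,
# and at least one pipe (the same condition as the "autodetection").
IMPLICIT_PLURAL = re.compile(r"\{(\d+):(?!plural:)([^{}]*\|[^{}]*)\}")


def name_plural_formatter(message):
    """Rewrite {N:a|b} as {N:plural:a|b}, leaving everything else untouched."""
    return IMPLICIT_PLURAL.sub(r"{\1:plural:\2}", message)


print(name_plural_formatter("{0} {0:second|seconds}"))
print(name_plural_formatter("{0:plural:second|seconds}"))
```

Any other unnamed format that happens to contain a pipe would be rewritten as well, so each changed message would still need a human check.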

And sorry if this looks too long for you, but this is an ongoing search for a working solution (that needs to be documented when it works).

Verdy p (talk) 12:30, 25 November 2022