A DiscordWikiBot message always fails

Fragment of a discussion from Support

If you don't know the answer, don't write an answer, especially such a long one. I don't have time to explain all your mistakes to you, so I'll just note two obvious ones:

  1. The first form is not zero in Hebrew.
  2. This project has absolutely nothing to do with pywikibot.

People repeatedly complain that they are confused and intimidated by your long and wrong responses. This is seriously harmful. The next time you do this, I'll block you to prevent further disruption.

Amir E. Aharoni (talk) 08:24, 25 November 2022

Sorry, but I thought this "DiscordWikiBot" was an add-on based on "pywikibot". This does not change what I said; just substitute the name. There is evidently no "disruption", as you have perfectly understood. But why do you threaten so many people for taking such a position, even when they show you that you are wrong and give proof?

But the fact that this bot is written in C# makes things even worse, and we have documentation about this (you asked for such a reference; this is one!):

https://learn.microsoft.com/en-us/dotnet/standard/base-types/composite-formatting

Which explicitly says that format strings used inside placeholders (themselves surrounded by braces) do NOT support nesting braces for embedding other placeholders:

Interpreting nested braces isn't supported. [sic!]

So what I said in my first sentence above was perfectly right:

You cannot put placeholders {0} or {1} inside the "plural selectors" starting with "{0:" or "{1:"

There's a discussion about this in the section "Escaping braces" of that documentation article. It gives an example showing how "{{{0:D}}}" is parsed by the standard format provider, and it does not work as intended: the first "{{" generates a literal "{", then the variable at position [0] is formatted with the non-working format "D}" (which results from unescaping the first two "}"), and the last "}" only closes the placeholder, so the value itself is never displayed.
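To make that walk-through concrete, here is a small Python toy model of the parsing steps as the documentation describes them. It is not the .NET implementation, real behaviour may differ between runtime versions, and the name `tokenize` is hypothetical. Treating "{{" inside a placeholder as an escaped brace is an assumption of this sketch.

```python
def tokenize(fmt: str) -> list:
    """Toy model of composite-format scanning (NOT the real .NET parser)."""
    tokens, literal, i, n = [], [], 0, len(fmt)
    while i < n:
        if fmt.startswith("{{", i):        # escaped opening brace in literal text
            literal.append("{")
            i += 2
        elif fmt.startswith("}}", i):      # escaped closing brace in literal text
            literal.append("}")
            i += 2
        elif fmt[i] == "{":                # start of a placeholder
            if literal:
                tokens.append(("literal", "".join(literal)))
                literal = []
            i += 1
            body = []
            while True:
                if i >= n:
                    raise ValueError("unterminated placeholder")
                if fmt.startswith("}}", i):    # escaped "}" ends up in the format
                    body.append("}")
                    i += 2
                elif fmt.startswith("{{", i):  # assumption: escaped "{" likewise
                    body.append("{")
                    i += 2
                elif fmt[i] == "}":            # a single "}" closes the placeholder
                    i += 1
                    break
                elif fmt[i] == "{":
                    raise ValueError("nested braces are not supported")
                else:
                    body.append(fmt[i])
                    i += 1
            index, _, spec = "".join(body).partition(":")
            tokens.append(("item", index, spec))
        elif fmt[i] == "}":
            raise ValueError("unmatched closing brace")
        else:
            literal.append(fmt[i])
            i += 1
    if literal:
        tokens.append(("literal", "".join(literal)))
    return tokens


print(tokenize("{{{0:D}}}"))
# [('literal', '{'), ('item', '0', 'D}')]
```

In this model the format that reaches the formatter is "D}" rather than "D", which is the surprise the article warns about.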

There's no work-around given (by Microsoft in its standard docs for C#) to do all that in a *single* localization string. The default "FormatProvider" used by "String.Format()" in C# does not tolerate/support that.

The only solution for that case is to use another "FormatProvider" class, to change how placeholders are parsed. And there's currently no standard for this in C#: many packages try to implement plural rules, with many "solutions" working only for a few languages or using bad assumptions based on English or another language. I've seen packages written to support Russian or Hebrew with a fallback to English, but none working across languages, and none using a well-defined syntax supporting embedded plural forms inside the same localized string. It is possible to write one in C#, e.g. using the MediaWiki syntax, by implementing such a "FormatProvider" class; that's what several packages written in C# to support CLDR-based localisation are doing, or they simply use ICU ported or interfaced into C#.

But if I look closely at the Microsoft article, one way to do that would be to use "double-brace" escaping for placeholders embedded inside placeholders, maybe something like:

{0:zero second|{{0}} second|{{0}} second|{{0}} seconds}

That's something to try... if it works in the bot, and if the bot really implements a "custom field formatter" for plural rules, because the *parsed* string "zero second|{0} second|{0} second|{0} seconds" (after unescaping), which contains pipes, is not a standard field format. Only then would it make sense to fix TWN's validator (which needs to properly count the nesting level of braces) and to document this somewhere on this wiki. But I fear that the result will be visible braces. Anyway, this "double-brace escaping" (which apparently exists in C# for compatibility with "interpolated strings") would be very unfriendly for translators (how many braces to use would depend on context!), and writing an "IFormatProvider" would be safer than trying to use an "ICustomFormatter". There, "format strings" are not supposed to contain any braces or any text in a human language, just basic codes like "D" for dates or "F2" for floating-point numbers with 2 decimals of precision; a "custom formatter" is normally used to format another type of value, such as a complex number, instead of using its default "ToString" method if that type has one.
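As a rough illustration of what such a plural formatter would have to do with the unescaped string, here is a Python sketch. It is not the bot's or SmartFormat's code. The name `expand_plural` is hypothetical, the caller is assumed to have already chosen the plural form index (for example from CLDR rules for the target language), and falling back to the last form when too few forms are given is also an assumption.

```python
def expand_plural(spec: str, value: int, form_index: int) -> str:
    """Pick one pipe-separated form and fill in the embedded {0} placeholder.

    `form_index` is assumed to be computed elsewhere by language-specific
    plural rules; this sketch does not implement any such rules.
    """
    forms = spec.split("|")
    # Assumption: when fewer forms are supplied than expected, use the last one.
    chosen = forms[min(form_index, len(forms) - 1)]
    return chosen.replace("{0}", str(value))


spec = "zero second|{0} second|{0} second|{0} seconds"
print(expand_plural(spec, 5, 3))  # 5 seconds
```

The sketch only covers the second pass over an already unescaped clause. Whether the real formatter ever receives the clause in that form is exactly the open question above.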

I don't know why the bot's author chose that syntax to support plural formatters, but that syntax natively cannot support what you think would be correct, and we cannot progress without asking these developers what they intend to do. For now, there's no way to embed any placeholder in the clauses of the existing "custom format string" that this bot uses for plural forms, so there's no real need to fix TWN's validator to support something that does not actually work yet, without knowing what the final solution will be. But we may still fix TWN's validator so that it correctly pairs opening and closing braces (this will still work with most strings using the "double-brace" escaping described above, as those braces should also be paired in any valid translatable message).
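The brace-pairing check itself is simple. A minimal Python sketch follows; it is not TWN's actual validator, and the name `braces_balanced` is hypothetical. It only tracks nesting depth, so doubled braces pass as long as they are paired.

```python
def braces_balanced(message: str) -> bool:
    """Return True when every "{" has a matching "}" in the right order."""
    depth = 0
    for ch in message:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a closing brace appeared before its opening brace
                return False
    return depth == 0      # nothing may be left open at the end


print(braces_balanced("{0:zero second|{{0}} second|{{0}} seconds}"))  # True
print(braces_balanced("{0:one second|{0} seconds"))                   # False
```

Such a check says nothing about whether the bot can actually render the message. It only rejects strings whose braces cannot be paired at all.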

It seems that DiscordWikiBot uses such a custom format provider, called "SmartFormat", but I don't know which version is used. See for example https://github.com/axuno/SmartFormat/pull/322 for a recent change to support languages with a single plural form (like Japanese; it also matters for English source terms such as "you are" that do not vary with number in English but do vary in other languages). This requires a change in the Wikibot project: either disabling the "autodetection" of the custom format (which requires the presence of at least one '|' pipe), or specifying the name of the "plural:" formatter explicitly when there's a single form (i.e. {0:plural:form1|form2}, or just {0:plural:form1}, or even {0:plural:}). This change is very recent (8 days ago in the "SmartFormat" repository, so it is not part of the latest release, version 3.2.0, published 2 months ago).

I don't know whether "SmartFormat" implements or supports the "double-brace" escaping (documented by Microsoft for the default format provider of "String.Format()" in C#), or whether that escaping is still applied by default (in the "String" class) before any custom format provider is called; the latter seems to be the case according to the discussion in the closed #322 and its related #320, linked at the start of this post's SmartFormat reference. A comment in the commit log closing #322 recommends that "autodetection" be turned off, and it will soon be off by default. In that case you'll need to name the "plural:" formatter in ALL translated messages, the legacy syntax "{0:form1|form2}" will no longer work, and all translated messages of this bot that use plural forms will have to be updated here in ALL languages! This would also affect other projects supported in TWN that are written in C# or in other .NET/CLI-based programming languages and that chose to use "SmartFormat".
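If autodetection does get turned off, the affected translations would first have to be found. Here is a rough Python sketch of such a scan. It is an approximation, not a SmartFormat parser: the name `find_legacy_plurals` is hypothetical, it ignores placeholders that contain nested braces, and it would also flag any other named formatter that uses pipes.

```python
import re

# A numbered placeholder whose format part contains a pipe but does not
# start with the explicit "plural:" formatter name.
LEGACY_PLURAL = re.compile(r"\{\d+:(?!plural:)[^{}]*\|[^{}]*\}")


def find_legacy_plurals(message: str) -> list:
    """Return placeholders that appear to rely on plural autodetection."""
    return [m.group(0) for m in LEGACY_PLURAL.finditer(message)]


print(find_legacy_plurals("{0:form1|form2} and {1:plural:a|b}"))
# ['{0:form1|form2}']
```

A scan like this would only give a first list of candidates; each hit would still need checking against what the bot really accepts.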

And sorry if this looks too long for you, but this is an ongoing search for a working solution (that needs to be documented when it works).

Verdy p (talk) 12:30, 25 November 2022