GRAMMAR

  • The magic keyword "GENDER:" is already taken, but for a limited usage: its first parameter is a username on the local wiki (by default, the user reading the page, whose human gender may be unknown or voluntarily set to neutral by that user). It is completely unrelated to grammatical gender (which may even follow other rules, e.g. for non-humans or inanimate concepts, for reasons of style, politeness, respect or disrespect, or which sometimes changes depending on plurals or uncountability).
  • We have a keyword "GRAMMAR:" whose first parameter is already a "tag", but it does not accept a list of tags (some of them required, others optional to alter the rendered variant).

But we have NO way to return BOTH the text of the translated variant, AND classification tags on output; only one input tag is accepted.

  • The magic keyword "PLURAL:" has no input tags, it only takes a numerical value on input.

Returning tags only on output would be inefficient: it requires either returning all possible variants, each one with its own tags for external selection, or returning some functional object able to generate the selection without enumerating all possible variants. For common cases, like derived terms in the conjugation of verbs or grammatical cases, and for more complex cases like German verbs with their "detachable" particles or pronominal verbs, we need input tags. Output tags should be used for the generated sets of variants. Then we'll need some orchestrator to reduce the sets of variants and generate the function calls that will make the selections. But for translators this is a complex task: we need to allow them to just add variants and complement each output with one or more tags that they can easily select from a set of known tags; and these tags must make sense to human translators (so tag names must follow some known and documented convention). That's not something easy to design in a single basic "magic keyword" in MediaWiki, except for very limited use cases.


IMHO, using tags on input rather than output will be more powerful and will offer a simpler interface. Putting all tags on input requires a "functional" syntax, but with possibly unordered parameters (the input would just be some unordered set of tags). Adding variants for a given message to translate just requires each added variant to have a distinct set of tags, including the empty set, which would be the default translation (it should always be usable in isolation, e.g. as an item in a bulleted list or in an index, but not as a "title" or section header, as that adds some specific usage context requiring its own generative tag, for example for its capitalization).

If we accept that "tags" cannot be named with any space in them (only alphanumerics or hyphens), then such a solution is implementable in MediaWiki using the existing "GRAMMAR:" magic keyword as:

{{GRAMMAR: |tag1 tag2 = variant1 |tag1 tag3 = variant2 | ...}}

and it would be important to remember that the order of those tags is not significant (so "tag1 tag2" and "tag2 tag1" are equivalent). Optionally some wildcard may be used, but it would complicate the task for translators. Tags should be well known and documented, not invented freely by each translator, so that each one can be validated.

But this new syntax (only to be used in the middle of a translated message) may as well use another separate keyword like "SELECT:", "VARIANTS:" and so on.
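
To make the idea of unordered input tags concrete, here is a minimal sketch in Python (not MediaWiki code). The function names, the German sample texts and the tag names are all hypothetical, and it assumes that a part without "=" is the untagged default variant and that a request with no exact match falls back to that default.

```python
import re

# Assumption: tags are made only of alphanumerics or hyphens, no spaces.
TAG_RE = re.compile(r"[A-Za-z0-9-]+")

def parse_variants(body):
    """Parse 'default |tag1 tag2 = text |...' into {frozenset(tags): text}."""
    variants = {}
    for part in body.split("|"):
        part = part.strip()
        if not part:
            continue  # skip empty parameters
        if "=" in part:
            tag_str, text = part.split("=", 1)
        else:
            tag_str, text = "", part  # no tags: the default variant
        tags = frozenset(tag_str.split())
        for tag in tags:
            if not TAG_RE.fullmatch(tag):
                raise ValueError(f"invalid tag name: {tag!r}")
        if tags in variants:
            raise ValueError(f"duplicate tag set: {sorted(tags)}")
        variants[tags] = text.strip()
    return variants

def select(variants, wanted):
    """Return the variant whose tag set equals `wanted`, else the default (or None)."""
    return variants.get(frozenset(wanted), variants.get(frozenset()))

# Hypothetical message with a default and two tagged variants.
variants = parse_variants(
    "Benutzer |genitive singular = des Benutzers |dative plural = den Benutzern"
)
print(select(variants, ["singular", "genitive"]))  # tag order does not matter
print(select(variants, ["accusative"]))            # no match: default variant
```

Using a frozenset as the key is what makes "tag1 tag2" and "tag2 tag1" equivalent. One known limitation of this sketch: an untagged default text containing "=" would be misread as a tagged variant.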

For the case of variants making up the whole content of the message, we don't necessarily need any syntax to be exposed to translators, as they may just have a "+" button to add variants and an extra field where they can add suitable tags (the validator would just have to check that the sets of tags are distinct for each added variant of the message, and would store them in some "canonical" order). And the UI may offer facilities to easily select "known" tags from a repository appropriate for each language, so that translators have less difficulty selecting and using them. That syntax would be hidden.
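
As an illustration of what such a validator could check, here is a rough Python sketch. The tag repository `KNOWN_TAGS`, the function names and the choice of alphabetical sorting as the "canonical" order are assumptions made only for this example.

```python
# Hypothetical per-language repository of documented tags.
KNOWN_TAGS = {
    "de": {"nominative", "genitive", "dative", "accusative", "singular", "plural"},
}

def canonical(tags):
    """Canonical form of a tag set: deduplicated and sorted alphabetically (an assumption)."""
    return tuple(sorted(set(tags)))

def validate_variants(lang, entries):
    """Check (tags, text) pairs and return {canonical_tags: text}.

    Raises ValueError on an unknown tag or when two variants share a tag set.
    """
    known = KNOWN_TAGS.get(lang, set())
    stored = {}
    for tags, text in entries:
        unknown = set(tags) - known
        if unknown:
            raise ValueError(f"unknown tags for {lang}: {sorted(unknown)}")
        key = canonical(tags)
        if key in stored:
            raise ValueError(f"tag set {list(key)} is used by more than one variant")
        stored[key] = text
    return stored

stored = validate_variants("de", [
    ([], "Benutzer"),                             # default variant, empty tag set
    (["singular", "genitive"], "des Benutzers"),  # stored under ("genitive", "singular")
])
print(stored)
```

The translator would never see these keys; the UI would only show the "+" button and the tag picker, and the validator would run when the variants are saved.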

Any translated message (even if it has a single variant listed) could have one or more tags added to its variants (including the single default variant): those tags would then provide the needed semantics that also allow generators (like conjugators for verbs) to work with them. Internally each stored variant would be used as a pure functional object, capable of using the tags given on input and the defined text of the variant to generate the simple text output. Basic i18n libraries or applications would ignore the input tags and would work only with the defined text (but would lose the semantics of that text if they use it to generate other texts).

The main difference is that stored translations would no longer be opaque texts.
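
A possible shape for such a "functional object", sketched in Python under my own assumptions: the `Variant` class, its `generator` field and the "title" and "noun-phrase" tags are hypothetical names, and the only generator shown is a trivial capitalization rule.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional

@dataclass(frozen=True)
class Variant:
    text: str                           # the defined text, all a basic i18n library needs
    tags: FrozenSet[str] = frozenset()  # semantic tags attached by the translator
    # Optional pure function (text, input_tags) -> rendered text.
    generator: Optional[Callable[[str, FrozenSet[str]], str]] = None

    def render(self, input_tags=()):
        """Render the variant for the given input tags; plain text if no generator."""
        if self.generator is None:
            return self.text
        return self.generator(self.text, frozenset(input_tags))

def capitalize_for_title(text, input_tags):
    """Hypothetical generator: capitalize the first letter when used as a title."""
    return text[:1].upper() + text[1:] if "title" in input_tags else text

v = Variant("list of users", frozenset({"noun-phrase"}), capitalize_for_title)
print(v.render())           # same as the stored text
print(v.render(["title"]))  # generated form for a title context
print(v.text)               # what a basic library ignoring tags would use
```

A real conjugator or declension generator would of course be far more involved; the point is only that the stored variant carries both its text and enough tags for generators to act on.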

Verdy p (talk) 22:48, 22 December 2022