More efficient text generation

Hi. There are many system messages like MediaWiki:Wm-license-cc-by-sa-2.5-br-text/en, with variations for each license type, each license version and each country, probably many hundreds of combinations in all. Translating each and every one of them into so many languages, even with extensive copying and pasting, is a huge waste of time and manual work. This can easily be simplified by having one single text with a few parameters in it. Then, for each language, it would be enough to translate that text, its various parameters, and a list of country names. Instead of spending hours of pure boredom, it could take just a few minutes to finish the job. Plus, whenever we need to update or add a set of licenses (a new version, a new country, etc.) we could do that quickly and painlessly, unlike now. Also the result would be guaranteed to be uniform and many mistakes could be avoided (who will ever spot a wrong version number or a wrong country name across so many pages?).
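As a rough illustration of the single-text-with-parameters idea, here is a minimal Python sketch. The template wording, the country list and the function names are all hypothetical and only stand in for whatever the real license messages contain; it shows one translated template being expanded into every version and country combination.

```python
# Hypothetical wording and names; the real message text is not reproduced here.
TEMPLATE_EN = "This work is licensed under the {license} {version} {country} license."

# Per-language list of country names, translated once.
COUNTRIES_EN = {"br": "Brazil", "de": "Germany", "jp": "Japan"}


def render_message(template, license_name, version, country):
    """Fill the single translated template with one license combination."""
    return template.format(license=license_name, version=version, country=country)


def render_all(template, license_name, versions, countries):
    """Generate every version/country combination from one template."""
    return {
        (version, code): render_message(template, license_name, version, name)
        for version in versions
        for code, name in countries.items()
    }


messages = render_all(TEMPLATE_EN, "CC BY-SA", ["2.0", "2.5"], COUNTRIES_EN)
print(len(messages))
print(messages[("2.5", "br")])
```

With this layout a translator would touch one template and one country list per language, and adding a version or a country would mean adding one entry rather than a new batch of messages.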

After all, the translators are human beings, not machines. Let's make it easier for them when it is possible. And in this case it clearly is possible.

AdiJapan 04:13, 23 June 2010

We have considered this, and it was not an option for *all* of our target languages. As we promote i18n quality over efficiency in the area of language coverage, that does mean that there is duplication, among other places in the area you are pointing out. That is unavoidable.

Users are human beings, not machines. The goal is to provide them with the best possible user interface we, developers and translators, can.

Siebrand 06:17, 23 June 2010

What sort of problems prevent using a template-like text? Is there any language in which, for instance, replacing the name of a country would result in rephrasing the whole sentence? That doesn't seem plausible to me. Can you point me to the discussion where those problems were identified and the decision was taken?

Anyway, if there are indeed such languages, then there must be some other ways to automate the text generation, without sacrificing the quality for the end user, at least in part or at least for those many languages where this is possible. The way things are done now is simply stone-age style. I'm pretty sure the original English messages weren't written manually one by one, but in some sort of automatic way.

By the way, duplication is a bitter euphemism here. If I just had to double my effort I wouldn't bother asking. But I counted some 100-odd messages and there are probably many more on the way, for other jurisdictions and licenses. So calling it hundred-plication or thousand-plication would be more appropriate.

AdiJapan 08:47, 23 June 2010

I'm sorry you feel that way. This is just how it works. You can choose to skip the message group Wikimedia License Texts and/or ask a fellow Romanian translator to do it. As far as I'm aware, we don't have this type of duplication too often.

Siebrand 09:06, 23 June 2010
 

Yes, I am translating into a language where you cannot simply have a template into which denominators for countries, languages, etc. slip without having to adjust the grammar according to gender and other properties of these denominators. E.g. Switzerland and Turkey have to have a female singular article, the UK has to have a neuter singular, the USA has to have a male plural article, the United Arab Emirates have to have a neuter plural article, while France and India and many others must not have articles at all.
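To make the problem concrete, here is a small Python sketch that uses German as a stand-in, purely as an assumption, since the language is not named here and its article system may differ. The template wording and all names are hypothetical. It contrasts substituting the bare country name with storing each country together with the article it needs in that sentence position.

```python
# Hypothetical template; "für" takes the accusative in German.
TEMPLATE_DE = "{license} {version}, Fassung für {country}"

# Bare names, as a naive shared country list would hold them.
BARE_NAMES_DE = {"ch": "Schweiz", "gb": "Vereinigtes Königreich", "fr": "Frankreich"}

# Names stored with the article this sentence position requires (or none).
PHRASES_DE = {
    "ch": "die Schweiz",
    "tr": "die Türkei",
    "gb": "das Vereinigte Königreich",
    "us": "die USA",
    "fr": "Frankreich",
    "in": "Indien",
}


def render_de(country_table, code, license_name="CC BY-SA", version="2.5"):
    """Fill the template from whichever country table is passed in."""
    return TEMPLATE_DE.format(
        license=license_name, version=version, country=country_table[code]
    )


print(render_de(BARE_NAMES_DE, "ch"))  # article missing, reads wrong in German
print(render_de(PHRASES_DE, "ch"))     # article supplied by the translator
print(render_de(PHRASES_DE, "fr"))     # no article needed
```

A per-country phrase table like this only covers one grammatical position. A language that needs a different case or article form in each sentence would need one such form per use, which is the kind of cost that has to be weighed against plain copy and paste.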

While maintaining changes to these repeated texts looks laborious, just making a start on them via copy and paste is not too bad, imho.

Purodha Blissenbach 15:46, 23 January 2011