Translating:Localisation for developers

From translatewiki.net

This page contains a collection of advice and information on localisation including management of localisation, mainly for software developers and software project managers. Some of the information here is based on the experiences of translatewiki.net users in collaboration with the developers of supported projects. Links to resources on localisation outside of translatewiki.net are also provided here. If you have good examples of localisation problems, or have found good resources for localisation, please add them to this page. If you are interested in using translatewiki.net to localise your project please read the page on new projects.

This page is not intended to be a substitute for on-line discussion. You can contact translatewiki.net staff live on the IRC channel #mediawiki-i18n at freenode.

Principles

The following principles are closely related and underlie most of what follows.

Translators should translate

If you want a good translation, you need translators. If you require programming, you'll get coders instead.


Translating the wiki way

Translating the wiki way means being like wikis: low barriers to translate, translations used immediately, a forgiving system to improve translations.

MediaWiki translation is done by hundreds of translators using a simple syntax: it doesn't take a coder to add strings to MediaWiki or to translate them; translations go live on MediaWiki wikis within 24-48 hours of the English messages being added; and users, translators and developers get instant feedback from each other.

i18n must not be an afterthought

Translating the wiki way means that i18n (and feedback from it) informs your development from the beginning; your product is made in partnership with translators.

Continuous translation

Some say "continuous translation" or "continuous localization" to mean translating the wiki way.

Management of supported projects

Unexpected permanently locked pages (bug occurring randomly, caused by SemanticMediaWiki for its "change propagation")

Due to a bug in the SemanticMediaWiki extension of MediaWiki, adding, changing or removing the categorisation of any created or modified page may unexpectedly and unpredictably lock that page permanently after the edit is saved. Anyone who then tries to edit the page sees the message below at the top of the page, and the page can no longer be edited by anyone, even a site administrator (unless they use a specific admin-only but dangerous tool that bypasses many checks to replace content in "raw mode", or use admin-only tools to update the SQL database directly, which can further break the normal caching behavior of MediaWiki or seriously damage the data integrity of SemanticMediaWiki):

You do not have permission to edit this page, for the following reason:
This page is locked to prevent accidental data modification while a change propagation update is run. The process may take moment before the page is unlocked as it depends on the size and frequency of the job queue scheduler.

This is apparently caused by a race condition: a background job can be scheduled to run later in one thread when it has in fact already run in another thread and terminated while trying to remove this temporary status, before the pending "Change propagation::+" status has actually been stored in the database. (See the initial bug 2494 on GitHub, opened on 3 Jun 2017 and then closed without being fixed, and bug 4344, still unsolved since 27 Oct 2019.)

The pages currently affected are (new pages may appear at any time; refresh this list by clicking the clock at the top of this page to run the SemanticMediaWiki query again and regenerate this report):


Also, there are some additional categories showing this related message in a red banner when just visiting them:

They seem to use another SemanticMediaWiki property (not yet identified). Maybe this is not a property but a bug that occurred while the background job was running and then failed, leaving the job in an unterminated state even though the change propagation property was removed. The effect is the same: the affected pages are also locked and left permanently uneditable (no pending job will remove this broken status).

This last bug seems to affect doubly redirected categories (the SemanticMediaWiki background job makes a false, unchecked assumption that causes it to crash prematurely): they are listed on Special:DoubleRedirects, but it is not possible to fix these broken redirects to point to the correct target category, because the first message shown above appears when editing them. This is not caused by the same SemanticMediaWiki property, but by another internal state (not yet determined); this state is possibly not stored in the database by a documented property but could be caused internally by the hook code of the SemanticMediaWiki extension itself.

Note that SemanticMediaWiki is not supposed to manage any semantic properties for redirected categories, which it officially "does not support". However, SemanticMediaWiki should not break MediaWiki itself (whether these redirects are valid but the target is also redirected, or the contributor made a typo in the edited redirect). Hard redirects on categories are possible, useful when they relate to exact synonyms, and valid when the categories are empty (no members). There are also simple ways to check that they remain empty and to keep them so (by using a category tracking template, just below the redirect itself in the category page).

Project page

The main project management tool is the project page and its talk page. The project pages contain information useful to both project managers and translators including:

  • name, logo and short description of the project
  • project statistics
  • name of translatewiki.net coordinator for the project
  • details of how to contact the project itself, and whom to contact. Some projects have forums specializing in the localisation of the project, or e-mail lists like the FreeCol-translators mailing list
  • links to general project documentation and screenshots, useful for translators in understanding what they are translating
  • notes on PLURAL or GRAMMAR support available on the project
  • notes on translation flow, including frequency of 'commits' and frequency of software updates of the project
  • the localisation threshold
  • sub-pages with project documentation such as a glossary of project-specific terminology
  • the project talk page is usually the first place where translators can discuss project matters with developers. Translators can report suspected errors in the source message, and request clarification of a message, or request PLURAL, GRAMMAR or GENDER support, where available
  • optionally, a link to the public repository and issue tracker and a mention of the license used.

Things which contribute to efficient and good quality localisation

  • Good communication between translatewiki.net and the project. Where no answers to posts on the talk page are forthcoming, translators may try to contact the project directly, or may just end up discouraged.
  • Good documentation
  • The speedy appearance of translations in the project releases. If a link to software release notes or logs is provided, then translators can check progress. This helps to provide the motivation to translators to continue translating.
  • Feedback from the project end-users to the translators. Where a project uses technical terms which may already have been localised, a way should be set up for end-users to provide the correct localised technical terms to the translators. For example, pointing potential contributors to a project-specific glossary page might be an option.

Support requests

All messages and requests about supported projects are placed on Support. We use {{Support}} to track requests which need attention/action and help the developers/project managers find them.

After a few weeks at most, all the threads belonging to one of the supported projects (whether resolved or not) will be moved to their talk pages by the maintainers of Support. In addition to that, and if possible before moving, the above-mentioned template should be added and updated as soon as possible so that threads are shown in several relevant places, including the global list Support/Open requests.

Assuming that's been done correctly, some neat things can happen, see for instance the following tutorial:


Export

The translations are committed to your project's repository by translatewiki.net staff because this phase is usually seamless on the project's end, while the sync needs care to protect translatewiki.net itself. Export is manually-initiated and happens semi-regularly, usually twice a week or when there is a particular need (e.g. upcoming release, many new translations).

The commit phase is usually called "export", while "sync" is the import of new strings: the two are independent and syncs are often more frequent (daily, often automatically as of 2016). More internal details can be found at New project setup, Repository management and related pages.

Some example exports for various formats:

Languages, messages, editing, translator issues

All this and more is covered in the FAQ.

Message identification

Messages at translatewiki.net are referred to using message names, also called message keys. If a project does not have any yet, they can be generated automatically. When forming message keys, it is best to use only lower case letters, digits, hyphens and dots, since many other characters tend to be impractical or cause problems in a MediaWiki context.
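As a rough illustration of the naming advice above, the following Python sketch checks candidate keys against that character set. The function name `is_recommended_key` and the sample keys are hypothetical, and restricting "lower case letters" to ASCII a-z is an assumption made here; the check encodes only the recommendation on this page, not a rule enforced by translatewiki.net.

```python
import re

# Assumption: "lower case letters" is read as ASCII a-z; digits, hyphens
# and dots are the only other characters allowed.
_KEY_PATTERN = re.compile(r"[a-z0-9.-]+")


def is_recommended_key(key: str) -> bool:
    """Return True if the key uses only the recommended characters."""
    return _KEY_PATTERN.fullmatch(key) is not None


if __name__ == "__main__":
    # Hypothetical keys, for illustration only.
    for key in ("myproject-login-button", "MyProject_Login Button"):
        print(key, is_recommended_key(key))
```

A check like this could run when new messages are added to a project, so that impractical keys are caught before translators see them.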

It is wise to group messages by the context in which they appear and to keep them in the order in which they appear in reality. If possible, message keys should indicate the grouping. Both practices help translators produce more consistent, better-quality translations.
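One possible way to let keys indicate grouping is a shared prefix per screen or feature. The sketch below assumes a hyphen-separated prefix convention, which is an illustration and not a translatewiki.net requirement; `group_by_prefix` and the sample keys are hypothetical.

```python
from collections import defaultdict


def group_by_prefix(keys, separator="-"):
    """Group message keys by the part before the first separator.

    The order of keys within each group, and the order in which groups
    first appear, is preserved.
    """
    groups = defaultdict(list)
    for key in keys:
        prefix = key.split(separator, 1)[0]
        groups[prefix].append(key)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical keys, listed in the order they appear in the interface.
    keys = [
        "login-title",
        "login-username-label",
        "login-submit",
        "settings-title",
        "settings-save",
    ]
    for prefix, members in group_by_prefix(keys).items():
        print(prefix, members)
```

With keys named this way, messages from the same screen sort and display together, so a translator meets them in context.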

Message reuse should be avoided in most instances. Even if message texts are identical in the original language, their translations may not be, and may need to reflect the context of use. Where identical message texts are also translated identically, copy and paste and translation memory let translators work very efficiently.

Message documentation

When text strings are split into individual "messages", the context essential to translation is lost. To restore enough context for successful translation, documentation is written manually for the messages. It is stored under a pseudo-language with the code qqq (message documentation), one of the ISO 639 codes reserved for private use. This code is used to record message documentation in English, which is visible to translators.

The message documentation page is usually written manually. Programmers are encouraged to contribute to the message documentation. Useful information includes:

  1. message handling (parsing, escaping, plain text, etc.)
  2. types of parameters with example values
  3. where the message is used (pages, locations in the user interface, e-mails sent to users, etc.)
  4. how the message is used where it is used (a page title, button text, etc.)
  5. which other messages are used together with this message, or which other messages this message refers to, with links
  6. anything else which could be understood if the message were seen in context, but not when the message is displayed alone (which is the case when it is being translated)
  7. special properties of the message, such as a page name requiring an initial capital letter; that it should not be a direct translation; that it supports PLURAL or GRAMMAR (in MediaWiki), etc.
  8. parts of the message which must not be translated, such as generic namespace names or URLs
  9. translation hints such as synonyms for terms, the grammatical function of a short message (mainly whether a noun or verb), etc.
  10. a link to a definition of any website-specific technical terms. Glossaries are not yet integrated into the documentation system.
  11. a link to a page using the message, or to a screenshot of the page

The last item often provides enough information on its own, without having to write all of the other documentation.

In order to edit a qqq documentation message in translatewiki.net you will need to have an account. You can choose Message documentation as the translation language to easily find messages that lack documentation. Alternatively, developers can modify the qqq message files directly.
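For developers who maintain the qqq files directly, a small script can list messages that still lack documentation. This is a minimal sketch that assumes messages and their documentation are available as plain key-to-text mappings; the real file format depends on the project, and `find_undocumented` and the sample data are hypothetical.

```python
def find_undocumented(source_messages, documentation):
    """Return keys present in the source messages but missing or blank in qqq."""
    return [
        key
        for key in source_messages
        if not documentation.get(key, "").strip()
    ]


if __name__ == "__main__":
    # Hypothetical data: English source messages and their qqq documentation.
    en = {
        "login-title": "Log in",
        "login-submit": "Log in",
    }
    qqq = {
        "login-title": "Title of the login page.",
    }
    print(find_undocumented(en, qqq))
```

In the sample data both keys share the English text "Log in", yet one is a page title and the other a button label, which is exactly the kind of distinction the documentation has to convey.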

Projects that use GNU gettext can use a method whereby comments in the source code are picked up by gettext tools, and the Translate extension forwards those comments into an uneditable part of message documentation.

Generalities on giving context

Text originally from FUEL Project & Rajesh Ranjan, Context in Technical Translation Concept and Guidelines, 10 April 2013 (CC-BY-SA 3.0).

Types of Context

Context can be divided into the following categories:

  1. Verbal Context — The text or talk surrounding an expression. The expression can be anything: words, sentences, speech, etc.
  2. Social context — This can be defined in terms of objective social variables such as class, gender, persons, social identity, etc.
  3. Syntagmatic contexts — When the semantics of a phrase or word is determined mainly by context, the context is called syntagmatic. More than 95% of the vocabulary a person knows comes from syntagmatic context.
  4. Paradigmatic contexts — Paradigmatic context concentrates on dictionary meanings analysed within semantic relationships. It is also called lexematic context.

Five Aspects of Context

Linguists Alan K. Melby and Christopher Foster hold that, for all practical purposes, context consists of the following five factors, which they consider relevant to understanding the source text and producing the target text:

  1. Co-text — Portion of the text, surrounding sentences
  2. Rel-text — Related text, fixing a problem is difficult; e.g. terminology, style guide
  3. Chron-text — Versions of a text
  4. Bi-text — Bilingual resources, translation memory
  5. Non-text — Beyond text, paralinguistic information; e.g. intent, audiences, purposes

Where We Often Need Context

There are some obvious places where translators essentially need comments. Here are some of them:

  1. Protocols — Protocols should be commented, especially when they appear as URIs, e.g. help:index. Protocol responses should also be given proper contextual comments, e.g. “Server returned HELLO”
  2. File names — File names should be commented, e.g. fuel.txt
  3. Program Names — Program names should be commented, e.g. firefox
  4. Ambiguous words — Particularly words that can be used in different forms, such as both verb and noun, e.g. empty
  5. Environment variables — Environment variables should be left intact, so context should be provided, e.g. LANG=hi_IN.utf8
  6. Abbreviation & Acronyms — It is generally better practice to give context for abbreviations and acronyms, e.g. l10n, m17n, FIR
  7. Config Files Entries — Config file entries should be given context, e.g. publican config file entries:
    --------------------------------
    surname: <YourSurname>
    firstname: <YourFirstname>
    email: <YourEmailAddress>
    langs: <YourLanguage>
    lang: <YourLanguage>
    ---------------------------------

Adaptation of source messages

Translators are asked to suggest improvements to messages and corrections to grammar in the source messages, as they are spotted during the translation process. Translators also propose the addition of PLURAL and GRAMMAR support to a message where needed, if possible. Proposals for amending messages are posted on the project talk page at translatewiki.net or on the project website.

Guidance on i18n and L10n

Links to advice on other websites

Examples of common problems