From translatewiki.net
As of version 1.15, MediaWiki will support a new magic word {{GENDER:}} in addition to {{PLURAL:}} and {{GRAMMAR:}}. It is part of a bigger i18n improvement which aims to make MediaWiki fully gender-aware in every aspect.
In other words, there are currently two new features for gender:
- Users can set their preferred gender in their preferences.
- Magic word for specifying alternative wordings for each gender.
User preference
Currently there are three options: unknown (called "unspecified" in the UI), male and female. Of these, unknown is the default for all existing and new users. In the future this feature may be (ab)used to add a so-called polite gender, but this is still up for discussion.
Magic word
The new magic word {{GENDER:}} works much like {{PLURAL:}} and {{GRAMMAR:}}. See the following examples for the syntax. They assume three genders, which is something that can change (per language?).
1. {{GENDER:Username|male text|female text|text for unspecified}}
2. {{GENDER:|male text|female text|text for unspecified}}
3. {{GENDER:.|male text|female text|text for unspecified}} (A dot as the username means: use the default user gender on this wiki.)
Examples (wikitext → rendered result):
Gender of Nike is {{GENDER:Nike|male|female|unknown}}. → Gender of Nike is male.
Gender of Nike is {{GENDER:Nike|male|female}}. → Gender of Nike is male.
Default user gender on this wiki is {{GENDER:.|male|female|unknown}}. → Default user gender on this wiki is unknown.
Default is {{GENDER:.|male|female}} if unknown is not defined. → Default is male if unknown is not defined.
If the third parameter is omitted, it will default to the first (male) form. This behaviour can be changed.
The first version can be used in page content and in interface messages, wherever (any) username is given in a parameter as-is. The second version is a special case for interface messages only; it takes the gender of the current user.
There is one caveat for the second version, which relates to the many ways messages are handled in MediaWiki. The rule of thumb is that it works in messages where plural and grammar work. However, I anticipate that gender is needed in many other messages where those are not used, and it may or may not work in these messages. For this reason I ask you to report in which messages you (want to) use gender functionality.
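The selection rule described above (male form first, female form second, optional third form for an unspecified gender, falling back to the first form when the third is omitted) can be sketched roughly as follows. This is only an illustration in Python; the function name `gender_form` is hypothetical and this is not the MediaWiki implementation.

```python
def gender_form(gender, forms):
    """Pick the wording for `gender` from the forms passed to {{GENDER:}}.

    forms[0] is the male text, forms[1] the female text, and the optional
    forms[2] the text for an unspecified gender. (Hypothetical helper.)
    """
    if not forms:
        return ""
    # Anything that is not "male" or "female" is treated as unspecified.
    index = {"male": 0, "female": 1}.get(gender, 2)
    # A missing form falls back to the first (male) form, as described above.
    return forms[index] if index < len(forms) else forms[0]
```

For example, `gender_form("unknown", ["male", "female"])` picks the first form, which matches the last example in the list above.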
Gender use cases
Message names etc.
User namespace aliases
I am currently testing the code with dummy male and female user namespaces; these will not stay. If everything goes fine, it will go into trunk. Please test and comment! You can also start assembling a list of gender aliases for languages which need them.
- Works for me. Please define in [be-tarask] Belarusian (Taraškievica orthography) – Беларуская (тарашкевіца): Удзельнік and Гутаркі_ўдзельніка (User and User talk) for male and unspecified variants; Удзельніца and Гутаркі ўдзельніцы for female variant (note: it's already added into $namespaceAliases).
- I think the same should be done for [ru] Russian – Русский: Участник and Обсуждение_участника for male and unspecified; Участница and Обсуждение участницы for female (also in $namespaceAliases already).
- EugeneZelenko 15:33, 2 February 2009 (UTC)
- That's right. The code would remove some of the charges of sexism in the Russian translation. --ajvol 09:19, 3 February 2009 (UTC)
Testing and more examples
Discussion
Please ask questions and comment here: how you use gender or plan to use it, whether it works as expected, whether it does everything it should, how gender affects your language (even if it doesn't), and any other issues. – Nike 15:35, 28 January 2009 (UTC)
- In case it wasn't clear: you are allowed to use the gender magic word now. Just please list the messages above so we can check for possible problems in the message parsing. – Nike 12:23, 31 January 2009 (UTC)
Technical implementation
{{GENDER:}} without user parameter does not work in page content. It always uses the default gender (usually unknown). This is by design. – Nike 20:57, 25 January 2009 (UTC)
- The other alternative is to allow it and add gender to the cache key (dropping cache efficiency to 1/3). – Nike 20:59, 25 January 2009 (UTC)
Changing the gender and then reloading the page reflects the change immediately, even if not logged in. This is good on one hand, but indicates that it kills the parser cache. Is that so? It should be a config option, IMHO. The parser cache is important, and people's gender doesn't change that often. -- Duesentrieb 21:53, 25 January 2009 (UTC)
- We can mark the user's gender in the URL, like "action": gender=m or gender=f or gender=u. It is a good idea for cache efficiency. Sp5uhe 21:56, 28 January 2009 (UTC)
I suggest that we allow {{GENDER:$n|… inside page content as well. It would be useful especially in Help: pages and community pages that ask users to do something. Redirecting is also not cheap, but when caching is an issue, I suggest indeed adding this gender stuff to the URLs. --Purodha Blissenbach 00:49, 29 January 2009 (UTC)
- As to the above-mentioned caveat for the second version, I suggest generally allowing the use of {{PLURAL/GRAMMAR/GENDER, html entities, and <span> … </span> in practically every message that is rendered as visible html. --Purodha Blissenbach 01:21, 29 January 2009 (UTC)
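To illustrate the "1/3" remark about adding gender to the cache key: if the rendered output depends on the viewer's gender, each page needs one cached copy per gender option instead of a single shared copy. The following Python sketch shows the idea only. The key format and the names `cache_key` and `cache_variants` are assumptions made for this illustration and are not MediaWiki's parser cache API.

```python
GENDERS = ("unknown", "male", "female")

def cache_key(page_id, revision_id, gender=None):
    """Build a hypothetical cache key; `gender` is added only when given."""
    parts = ["pcache", str(page_id), str(revision_id)]
    if gender is not None:
        if gender not in GENDERS:
            raise ValueError("unknown gender option: " + gender)
        parts.append("gender=" + gender)
    return ":".join(parts)

def cache_variants(page_id, revision_id, depends_on_viewer_gender):
    """Return every key under which one revision of a page may be cached."""
    if not depends_on_viewer_gender:
        # One shared copy serves all readers.
        return {cache_key(page_id, revision_id)}
    # One copy per gender option: three entries instead of one.
    return {cache_key(page_id, revision_id, g) for g in GENDERS}
```

Varying the key only for pages whose output actually depends on the viewer would limit the cost, but that is a design assumption of this sketch, not something decided in the discussion above.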
Extending and localizing Gender
Looking at localized versions of PLURAL, one may think of likewise localizing GENDER, since languages have deviating uses of grammatical genders. When we have localized forms of GENDER allowing more forms than just the English "male", "female", and "unknown", we need more than just adding more options locally. What we also need is a matrix mapping each language's individual genders to a gender of every other language, since when switching languages, genders of users often prevail and need to be expressed in another language. At first sight, that sounds more complicated than it really is, since mapping to gender-unaware languages is always easy, there is always the "unknown" case which can be used, and very many languages have identically structured genders; thus we do not end up with a 285×285 matrix at the moment, but rather with something considerably smaller, like 10 by 10 or less, maybe.
- I am not taking T-V-distinctions into account here. Even though they may be both theoretically and practically integrated into a revamped GENDER, they are structurally independent of grammatical genders. Distinct forms most usually may apply to any of the grammatical genders likewise, thus the final count of forms is the count of grammatical genders times the count of T-V-distinctions. Until it is clear how to deal with them, there is no need to go into details on them here, and most generally, thoughts presented here apply to any kind of distinction, be it distinct grammatical genders, or T-V-distinctions, or those two combined.
How to get to the mapping? We need to classify languages according to their uses of grammatical gender(s), and put identical uses together into individual classes. Example: Since most have a male/female/[neuter] scheme, and neuter hardly ever applies to users/usernames, the only thing to check would be whether or not the grammatical female of one language is as well the grammatical female of each other language, and the grammatical male of one language translates as well to the grammatical male of each other in a class. Likely, there will be many having their grammatical genders matching natural genders, and those all fit in one class. Etc.
Now, there are deviations (please add to the list!):
- grammatical gender and natural gender are unrelated,
- children are grammatical neuters,
- two sorts of female grammatical gender,
- likely, there will be more.
What can we do to map those correctly? Of course, that depends. In any case, we must ask users for additional information, because unless we have it, we cannot use it. So, for example, when grammatical and natural gender are not identical (and unless there is an algorithm for finding the grammatical gender of a user name, which we sometimes have), we must ask people for either gender. This poses the problem that users may not know what their names' grammatical genders could be in foreign languages. Likewise, we could ask "are you a child?", or maybe "the year you were born". In one case of two female genders, the distinction is in part contextual and semantic, actually a cross between a grammatical gender and a T-V-distinction; our only choice is to ask users how they want to be talked about, and getting this right may, to some, be more personally touching than all other gender stuff and T-V-distinctions. Of course, mapping from other languages to these two grammatical genders is only possible when we ask the same question everywhere. We should probably expect most answers to fall into the "don't know" class where this language is not understood, should we not?
The last thought suggests that, in such hard-to-explain cases, we could spare foreign users the need to answer questions, and use some sort of default-to-unknown mapping anyways. Even more so for the time being, since these things will have to be reassessed once we introduce T-V-distinctions. --Purodha Blissenbach 09:41, 13 February 2009 (UTC)
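The class-based mapping with a default-to-unknown fallback proposed in this post could look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the class names, the assignment of languages to classes, the labels `female_a` and `female_b` for the two female genders, and the individual mappings would all have to come from the translators of each language.

```python
# Hypothetical assignment of languages to gender classes.
LANGUAGE_CLASS = {
    "fi": "none",          # no grammatical gender
    "hu": "none",
    "de": "male_female",   # two genders apply to users
    "ru": "male_female",
    "ksh": "two_female",   # two female genders, per the description above
}

# Mappings are defined between classes, not between individual languages,
# which keeps the matrix small.
CLASS_MAP = {
    ("two_female", "male_female"): {
        "male": "male",
        "female_a": "female",
        "female_b": "female",
    },
    # Which female gender applies cannot be derived, so only "male" maps.
    ("male_female", "two_female"): {"male": "male"},
}

def map_gender(gender, from_lang, to_lang):
    """Translate a stored gender value between two languages."""
    src = LANGUAGE_CLASS[from_lang]
    dst = LANGUAGE_CLASS[to_lang]
    if src == dst:
        return gender
    # Anything without an explicit mapping defaults to "unknown".
    return CLASS_MAP.get((src, dst), {}).get(gender, "unknown")
```

With this layout, mapping to a gender-unaware language needs no table entry at all, since the fallback already yields "unknown".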
- I'm not yet aware of languages where grammatical gender would differ from natural gender in singular third person. --Nike 16:51, 13 February 2009 (UTC)
- If languages have third person, singular, etc., in their grammars at all ;-)
- In the Central Franconian group and other Western languages of the West Middle German Group, there are some having grammatical gender different from natural gender throughout, and others to some extent.
- Also the Standard German language has, I believe somewhat historically related, the distinction between "die Frau", female (woman; wife; lady; ma'am; Mrs.), and "das Mädchen", neuter (girl, female teenager, etc.), and "das Fräulein", neuter (the young unmarried woman; unmarried adult female; high-classed unmarried daughter; female sales-person, clerk, teacher, etc.) for example. References to them in the 3rd person go, of course, with the grammatical gender, when these words are used alone or as qualifiers. E.g. when talking about a social event: "Das Fräulein Müller und die Frau Schmitz sind beide eingetroffen. Es ging nach links, und sie nach rechts" (Miss Müller and Mrs. Schmitz both have arrived. Miss Müller turned left, and Mrs. Schmitz turned right.) In the first German sentence, you could leave the articles out. That would, however, not alter the second sentence. You cannot translate the second sentence to English properly without having to somehow circumvent the ambiguity arising from the fact that, in English, grammatical gender cannot be used to distinguish who turned where. Note also that the first sentence tells us that Miss Müller has to be considerably older, or of higher social rank, than Mrs. Schmitz, probably both. If not so, the speaker was grossly impolite, and misbehaving, and should be ashamed of himself. Note also that the second sentence indirectly suggests that on the left, there was likely the place for the more important people to gather, or meet, while on the right, there was the place of the more ordinary people - likely, but not necessarily.
- These specifics are a bit outside the scope of a language interface of a computer program, you may say? Not at all. If you want to get the tone right that some messages about other users use, you should be able to get these things right in the same way that every social event host at his or her microphone would. There is not much of a difference between a social event in real life, and a wiki community meeting online, after all, when it comes to publicly speaking about participants. --Purodha Blissenbach 10:24, 7 March 2009 (UTC)
I was a bit hasty with regard to my evaluation of this feature for Nynorsk. Though, I must say that I've never needed it in the translations I've done so far; so, are there any examples of existing messages where GENDER could be used? Or could we get messages like 'See {{GENDER:x|his|her|the user's}} user page' in the future? I've not really understood the purpose of this feature yet. --Harald Khan Ճ 20:59, 28 January 2009 (UTC)
- Yes, such differentiations should be possible, imho. --Purodha Blissenbach 00:44, 29 January 2009 (UTC)
- The following messages seem good candidates for using GENDER in Slavic languages: Blockipsuccesstext ("$1 has been blocked. See IP block list to review blocks."), Blocklogentry ("blocked $1 with an expiry time of $2 $3"), Ipb already blocked (""$1" is already blocked"), Ipb-needreblock ("== Already blocked == $1 is already blocked. Do you want to change the settings?"), Alreadyrolled ("Cannot rollback last edit of $1 by $2 (talk | contribs); someone else has edited or rolled back the page already. The last edit to the page was by $3 (talk | contribs)."), Confirmrecreate ("User $1 (talk) deleted this page after you started editing with reason: $2 Please confirm that you really want to recreate this page."). --EugeneZelenko 04:13, 29 January 2009 (UTC)
- The message Blocklogentry ("blocked $1 with an expiry time of $2 $3") is an interesting example. In Alemannic, not only should the blocked user $1 have a gender-specific article (dr $1 if male or d $1 if female), but the name of the sysop that precedes the message should also have a gender-specific article. But the username of the sysop is not mentioned in the message. Als-Holder 17:48, 29 January 2009 (UTC)
- Same with all Ripuarian. (I've avoided that up to now using wordings like: "the user $n" where the (required) article is grammatical-genderwise bound to "user", and else using 'lazy forms' (like "don't" replacing "do not") that happen to be spelt alike for all four genders.)
messages with {{GENDER}} in Polish - working list. Sp5uhe 12:06, 30 January 2009 (UTC)
- Looks like there is a set of messages which need a new parameter for the acting user. – Nike 15:14, 30 January 2009 (UTC)
- Yes and no. We Ripuarians can cope with "list style" log entries which can do without articles in front of names, or ungrammatified dates, and/or times.
- Of course people do accommodate and find ways to hide problems, but the software should not dictate how your language can be used. – Nike 09:14, 31 January 2009 (UTC)
- Well done. As far as I understand, this list is relevant for other Slavic languages, especially Russian. --ajvol 09:26, 3 February 2009 (UTC)
- I think this list may be useful for gender in other languages, but not for all messages, because collocations are specific to individual languages. Sp5uhe 22:25, 6 February 2009 (UTC)
Gender in languages
Info: wikipedia:T-V distinction and wikipedia:Grammatical gender
- [fi] Finnish – Suomi: no grammatical gender, few lexical differences, t-v.
- [gsw] Swiss German – Alemannisch: 3 T-V distinctions, 3 grammatical genders, 2 of which apply to users. In contrast to German, gender is relevant for names. For example, a message like "is done by (username)" should be translated as "isch gmacht vum (username)" if the user is male, and "isch gmacht vu dr (username)" if female. So far it has been translated as "isch gmacht vu (username)", but this is not the grammatically correct form. For Alemannic, gender will be great progress. Als-Holder 13:14, 29 January 2009 (UTC)
- [hu] Hungarian – Magyar: no grammatical gender
- [pl] Polish – Polski: has grammatical gender. See the working list of messages with {{GENDER}}.
- [nn] Norwegian Nynorsk – Norsk (nynorsk): has feminine, masculine and neuter.
If the GENDER-thingy only points to user names, it should be useless for this language, AFAIK. --Harald Khan Ճ 17:57, 28 January 2009 (UTC)
- Users are the only thing we know the gender of, or what do you mean? – Nike 18:25, 28 January 2009 (UTC)
- I was thinking of inflection with regard to the genders of words. Some variables in messages could possibly have different genders. --Harald Khan Ճ 18:31, 28 January 2009 (UTC)
- Which most often are user-supplied input, and thus we cannot know the gender. If, however, there is a preset list of items, something could be done about it. – Nike 18:49, 28 January 2009 (UTC)
- [uk] Ukrainian – Українська has grammatical gender. It should be чоловіча (Male), жіноча (Female), не зазначена (Unspecified).--Ahonc 21:09, 28 January 2009 (UTC)
- The gender-* alternatives in the gender selection are available for translation in mediawiki messages group. – Nike 21:39, 28 January 2009 (UTC)
- [cs] Czech – Česky: has grammatical gender. Almost every message which is in past tense is gender-dependent.
— Danny B. 21:50, 28 January 2009 (UTC)
- [pt] Portuguese – Português: both grammatical gender and T-V distinction. Malafaya 00:00, 29 January 2009 (UTC)
- [ksh] Ripoarisch – Ripoarisch: has 2 T-V distinctions, and, from a programmer's point of view, 4+1 grammatical genders (2 female, 1 male, 1 neuter, +unknown); neuter is usually not applicable to human users, but is to e.g. bots having a neuter denomination. Common addressing patterns suggest adding T-V distinctions as additional 'genders', which would imho definitely increase acceptance and reputability of the software. --Purodha Blissenbach 00:28, 29 January 2009 (UTC)
- [de] German – Deutsch: 2 T-V distinctions, 3 grammatical genders, 2 of which apply to users. While it is comparatively easy and common to avoid gender dependencies when addressing users or talking about users, T-V distinctions are an issue. They are already taken care of by a separate 'polite form' localization of de, however.
- [be] Belarusian – Беларуская: 2 T-V distinctions, 3 grammatical genders, 2 of which apply to users. The basic application is the same as for Czech. Also phrases like Dear, XYZ!. --EugeneZelenko 04:11, 29 January 2009 (UTC)
- [ru] Russian – Русский: 2 T-V distinctions, 3 grammatical genders, 2 of which apply to users. The basic application is the same as for Czech. Also phrases like Dear, XYZ!
- [el] Greek – Ελληνικά: 2 T-V distinctions, 3 grammatical genders (feminine, masculine and neutral), 2 of which apply to users. --Geraki 14:46, 29 January 2009 (UTC)
- [sv] Swedish – Svenska: T-V distinctions are generally not used anymore. 2 grammatical genders, but only 1 applies to users. The difference between feminine and masculine is only present in personal and possessive pronouns. --Boivie 07:06, 30 January 2009 (UTC)
- [ca] Catalan – Català: T-V distinctions: they are disappearing in common language but they are used in computer language, however the rules are simple and don't need any programming gadget. Grammatical gender: two genders male-female which affect most adjectives.
- [ia] Interlingua – Interlingua: No grammatical gender. Two personal genders, used as in English (he=ille, she=illa, him=le, her=la). There is a T-V distinction (tu/vos) but we've chosen to address the users informally using the T-form in this Interlingua translation. – McDutchie 14:35, 2 February 2009 (UTC)
- [es] Spanish – Español: has grammatical gender. It should be Masculino (Male), Femenino (Female), No especificado (Unspecified) BicScope --(Talk)-- 18:04, 5 February 2009 (UTC)
- [su] Sundanese – Basa Sunda: no grammatical gender, no difference at all.
- [cy] Welsh – Cymraeg: Welsh has two genders, masculine and feminine. Grammatically, using gender at all would lead to very many knock-on effects in text, mainly because of gender distinctions in mutations. Because of this, it is quite common in texts to adopt the convention that the male gender is used throughout, and refers to both male and female. I do not anticipate that we will use GENDER very much at all, since the simple option of not using it is acceptable. However, I have applied it to the Babel boxes, since the result is more elegant than before. Lloffiwr 13:43, 13 April 2009 (UTC)
- [sw] Swahili – Kiswahili: Swahili does not distinguish between masculine and feminine. However, there is a grammatical distinction between 10 different 'classes' of noun. One class includes 'people' and another class includes 'things', which would include bots. Grammatically, the class of a noun determines the agreements of demonstrative pronouns and verbs in a sentence. There is a subject prefix and an object infix in a Swahili verb which varies according to noun class. Usually it does not matter how we refer to bots, since a bot cannot take offence! However, it would be a nice feature to have a 'bot' 'gender' in the log entries, which could be used for bot usernames. For example, Blocklogentry ("blocked $1 with an expiry time of $2 $3"), using two usernames, would read (meaning 'blocked $1'):
- 'amemzuia $1' for a sysop who was a real person and a blocked user who was a real person
- 'imemzuia $1' for a bot sysop (I know this can't be but a bot could do actions to other users in other logs) and a blocked user who was a real person
- 'ameizuia $1' for a sysop who was a real person and a blocked bot user
- 'imeizuia $1' for a bot sysop and a blocked bot user.
However, since there is a real person behind every bot, this is a very low priority feature. At present all users are referred to as if they were real people. If there were a lot of other languages which had a similar issue and an additional 'bot' gender were added in the future, then we would consider using this in Swahili, which is why I am mentioning this. Lloffiwr 13:43, 13 April 2009 (UTC)
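The four Swahili forms listed above follow one pattern: a subject prefix for the acting user, the marker "me", an object infix for the blocked user, and the stem "zuia". A small Python sketch of how a hypothetical 'bot' gender could select the right form is shown below. The split into pieces is inferred from the four forms given above, and the names `SUBJECT_PREFIX`, `OBJECT_INFIX` and `blocked_verb` are made up for this illustration.

```python
# Pieces inferred from the four forms listed above (an assumption of this sketch).
SUBJECT_PREFIX = {"person": "a", "bot": "i"}  # who performs the block
OBJECT_INFIX = {"person": "m", "bot": "i"}    # who gets blocked

def blocked_verb(actor, target):
    """Return the verb for "blocked $1" given the noun class of both users."""
    return SUBJECT_PREFIX[actor] + "me" + OBJECT_INFIX[target] + "zuia"
```

For instance, `blocked_verb("person", "bot")` gives "ameizuia", the third form in the list.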
Meta message group
I think it would be most efficient/effective if we were to create a runtime meta message group for this, similar to MediaWiki:Betawiki-messages. That way the more knowledgeable users can keep the group up to date, and it can be reached from within the default Special:Translate UI without any hassle. Niklas: possible? Siebrand 21:27, 28 January 2009 (UTC)
- You are assuming that the group is homogeneous across languages. It is still open whether it is, or whether there are multiple types of groups inside it, or whether it is just too random to be useful. In any case we need examples for messages where it is used. – Nike 21:37, 28 January 2009 (UTC)
- I have long wished for a classification of messages according to the type of data supplied for the $n variables, such as user names, dates, times, etc. Messages using user names are one class, which has to use {{GENDER:$n|… . Messages addressing users are another class, which has to use {{GENDER:|… . Btw., there are several words, expressions and phrases in several languages that can be used to identify the latter, e.g. en:"please", en:"click", ksh:"donn", ksh:"bes esu joot", etc. --Purodha Blissenbach 01:15, 29 January 2009 (UTC)
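As a rough illustration of the two classes described in this comment, messages that already use GENDER could be sorted by which form of the magic word they contain. The Python sketch below is only that; the class labels "about-user" and "addressing-user" and the function name `classify` are invented here, and it does not detect messages that would need GENDER but do not use it yet.

```python
import re

# {{GENDER:$n|...  refers to a user whose name is passed in as a parameter.
NAMED_USER = re.compile(r"\{\{GENDER:\$\d+\|")
# {{GENDER:|...  refers to the current user, i.e. the reader being addressed.
CURRENT_USER = re.compile(r"\{\{GENDER:\|")

def classify(message_text):
    """Return the set of (hypothetical) gender classes a message falls into."""
    classes = set()
    if NAMED_USER.search(message_text):
        classes.add("about-user")
    if CURRENT_USER.search(message_text):
        classes.add("addressing-user")
    return classes
```

A message can fall into both classes at once, which is why a set is returned rather than a single label.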
Namespaces
What about the most obvious use of the gender setting: translating the User and User talk namespaces? At least this should be done for personal pages and in the default signature. Lower priority could be given to translating them in various special pages (watchlist, recent changes, etc). --EugeneZelenko 04:17, 29 January 2009 (UTC)
- For example see a first request: https://bugzilla.wikimedia.org/show_bug.cgi?id=17160 --- Best regards, Melancholie 04:49, 29 January 2009 (UTC)
- Sorry, this bug request talks about aliases, not actual usage. MessagesBe_tarask.php and MessagesRu.php already have aliases, but they are not affecting personal pages titles and signatures. --EugeneZelenko 14:52, 29 January 2009 (UTC)
- I don't think translating namespaces based on the user they are related to is such a good idea. If you want to get to the user page of a user, you usually just go to User:Name. If you don't know the gender of the user, you would have to test whether the page is at User_(m):Name or User_(f):name. Regards, --ChrisiPK 17:31, 1 February 2009 (UTC)
- They would just be aliases, redirecting to the correct one, naturally. – Nike 17:35, 1 February 2009 (UTC)
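The alias idea can be sketched as follows: the gender of the page owner only decides which namespace name is displayed, while every variant resolves to the same namespace, so nobody has to guess. The Python below is a sketch under that assumption. The Russian and Belarusian (Taraškievica) names are the ones proposed earlier on this page; the table and function names are hypothetical and this is not how $namespaceAliases is implemented.

```python
# Display names for the User namespace, per language and owner gender.
USER_NAMESPACE_NAMES = {
    "ru": {"male": "Участник", "female": "Участница", "unknown": "Участник"},
    "be-tarask": {"male": "Удзельнік", "female": "Удзельніца", "unknown": "Удзельнік"},
}

def display_namespace(lang, owner_gender):
    """Name to show for the user page of a user with the given gender."""
    names = USER_NAMESPACE_NAMES.get(lang)
    if names is None:
        return "User"
    return names.get(owner_gender, names["unknown"])

def resolve_namespace(lang, name):
    """Every gendered variant is an alias of the one canonical namespace."""
    if name == "User" or name in USER_NAMESPACE_NAMES.get(lang, {}).values():
        return "User"
    return None
```

With this layout, a link written with either variant, or with plain User, ends up at the same page.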