CLDR and pt


This will be long, so I ask for your patience. There is a problem in the CLDR -> MediaWiki mapping. CLDR has the languages:

pt
pt-BR
pt-PT

and its policy is that the main pt is the same as the variant with the largest population. So:

"CLDR pt" = "CLDR pt-BR".

MediaWiki does not have pt-PT. Its languages are:

pt
pt-BR

to avoid maintaining a third localisation. But MediaWiki then maps "MediaWiki pt" to "CLDR pt" instead of the correct "CLDR pt-PT". So,

"MediaWiki pt" = "CLDR pt" = "CLDR pt-BR".

For any purpose external to MediaWiki, this makes MediaWiki's languages:

pt-BR
pt-BR
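
The mapping problem described above can be sketched in a few lines of Python. This is only an illustration: the dictionaries and the function name are hypothetical and are not MediaWiki or CLDR code. The only facts taken from this thread are that "CLDR pt" carries the Brazilian data and that "MediaWiki pt" is meant to be European Portuguese.

```python
# Hypothetical illustration; none of these names exist in MediaWiki or CLDR.

# Name of Polish ("pl") in each CLDR locale, per the example in this thread.
CLDR_NAME_OF_PL = {
    "pt": "polonês",     # CLDR's main "pt" carries the Brazilian data
    "pt-BR": "polonês",
    "pt-PT": "polaco",
}

# Naive mapping: each MediaWiki code is looked up under the same CLDR code.
NAIVE_MAP = {"pt": "pt", "pt-br": "pt-BR"}

# Corrected mapping: "MediaWiki pt" means European Portuguese.
CORRECTED_MAP = {"pt": "pt-PT", "pt-br": "pt-BR"}


def name_of_polish(mediawiki_code, code_map):
    """Return the name of Polish shown to a user of the given MediaWiki language."""
    return CLDR_NAME_OF_PL[code_map[mediawiki_code]]


print(name_of_polish("pt", NAIVE_MAP))      # polonês
print(name_of_polish("pt", CORRECTED_MAP))  # polaco
```

With the naive map, both MediaWiki codes end up with Brazilian data; only the corrected map gives "MediaWiki pt" its own names.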

Ensuing issues

This has wide implications. Two that I've come across, and which seem related, are:

  1. My language here is pt. Yet when I use {{#languagename:pl}} the extension gives me "polonês" (from "CLDR pt") instead of "polaco" (from "CLDR pt-PT"). In other words, the language names in the MediaWiki universe are incorrect for users whose language is pt. Date formats, currency designations, etc. are presumably incorrect for such users as well. Please remember that we corrected the PLURAL thing the other day, so that's fine.
  2. FreeCol seems to ignore the existence of the "MediaWiki pt" language. So, although it has been translated here, it's not used at all in FreeCol. I am guessing that this is related, because the pt translations seem to have been complete for some time and are not in FreeCol. Please let me know if they will be used, because there are many messages and I'd like to be sure before reviewing some issues in the current translation.

How can we get around all this? Anything I can do to help?

Hamilton Abreu23:17, 21 November 2009

FreeCol has FreeColMessages_pt_BR.properties and FreeColMessages_pt_PT.properties, and we map "pt: pt_PT" and "pt-br: pt_BR". So there is no issue there, is there?

As for CLDR, I do not have an answer, nor a solution. I will ask Niklas to take a look at this.

Siebrand00:26, 22 November 2009

Yep, you're right about FreeCol. Right now, it seems the issue is that they haven't uploaded the translations for quite some time. We have translations from the beginning of the year which are not in 0.8.4 (Oct release) yet. I'll follow this up with them, so no issue on this side.

As for CLDR, thanks for following this up. It's quite important.

Hamilton Abreu01:35, 22 November 2009

By the way, the FreeCol issue is being followed up here: https://sourceforge.net/tracker/?func=detail&aid=2901948&group_id=43225&atid=435578 (don't know if the URL will work as login is required).

Hamilton Abreu21:53, 23 November 2009

I guess the problem is that there were once pt, pt-PT and pt-BR translations. I disabled one of those some time ago, but there should be two variants in trunk at least (not sure about the ages old 0.8.x).

Nike21:40, 24 November 2009

According to them, they weren't reloading already translated messages. So changes made here apparently were not committed. But they've assured me that version 0.9 of the software (release date unknown) will reload all messages. If somehow you guys could make sure that they synch to the appropriate language codes that would be great. I'll review the translations soon.

Hamilton Abreu23:58, 24 November 2009

FreeCol already has pt_PT and pt_BR in trunk and the codemap looks like this:

    pt:    pt_PT
    pt-br: pt_BR

Can you test the trunk version and see that they are correct?

Nike09:47, 26 November 2009

Sorry Nike, but how do I go about downloading the trunk version for Windows? I'm unable to do compilations and so on, so I would need a version I can install with a click. Currently they have 0.8.4 and 0.9.0 available in this way, but those are too old. I looked on SourceForge but couldn't figure out if such a thing even exists for the trunk version.

Hamilton Abreu21:43, 28 November 2009
Nike13:39, 1 December 2009

Trunk version is correct. I'll be reviewing some of the messages and hopefully they'll be committed later. Many thanks. Let's consider this issue closed.

Hamilton Abreu23:52, 1 December 2009
 
 
 
 
 
 
 
 

Hummm... I could be wrong about this being the cause for missing languages in other packages. FUDForum only lists one theme, "default - portuguese", and presents the "pt" messages. So the "pt-BR" messages are unused. Can we get them to list both, perhaps adding a "default - portuguese (br)" and placing the "pt-BR" messages in it? Anything I can do to help?

Hamilton Abreu00:27, 22 November 2009

FUDforum has 'pt' and 'pt-br'. What's the problem?

P.s. I see you are uploading screenshots of the FreeCol installation. Please report those issues in the FreeCol bug tracker.

Siebrand00:34, 22 November 2009

I'd have to install it to be sure, and that won't happen soon. Let's forget FUDForum for now.

Hamilton Abreu02:00, 22 November 2009
 
 

Only language names are extracted from CLDR, everything else should be correct. We could introduce pt-PT locale, but I don't think we could change the meaning of pt without causing massive outcry.

Nike10:24, 22 November 2009

Let me provide the terms for getting that done:

  • organise a vote on pt.wp that the 'pt' code should use 'pt-br' as default. I would recommend that the vote last at least a month, to allow sufficient input from Portuguese Wikipedians
  • if the outcome favours using the Brazilian variant, create a request in bugzilla:.

The result of processing the bugzilla: request would be the following:

  • all translations in /pt would be put in 'pt-pt'
  • all source code repositories would be updated - a lot of work, but we would do that.
  • 'pt' would be made empty, and fall back to 'pt-br'
  • all current 'pt' special page names, etc, that differ from those in 'pt-br' would be added to 'pt-br' as aliases for backward compatibility.
Siebrand10:39, 22 November 2009
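
The end state proposed above can be sketched as a message lookup with a fallback chain. This is a hedged illustration in Python: the message keys, the sample strings, and the `FALLBACKS` table are invented for the example, the fallbacks to "en" are an assumption, and MediaWiki's real fallback mechanism is not shown.

```python
# Hypothetical data; message keys and strings are invented for illustration.
MESSAGES = {
    "pt": {},                                   # emptied, as proposed
    "pt-pt": {"lang-pl": "polaco"},             # former contents of "pt"
    "pt-br": {"lang-pl": "polonês"},
    "en": {"lang-pl": "Polish", "lang-de": "German"},
}

# Each code names the code to try next. Only "pt" -> "pt-br" comes from the
# proposal; the fallbacks to "en" are assumptions made for this sketch.
FALLBACKS = {"pt": "pt-br", "pt-br": "en", "pt-pt": "en"}


def get_message(code, key):
    """Walk the fallback chain until some language defines the key."""
    seen = set()
    while code is not None and code not in seen:
        seen.add(code)  # guard against accidental cycles in FALLBACKS
        if key in MESSAGES.get(code, {}):
            return MESSAGES[code][key]
        code = FALLBACKS.get(code)
    raise KeyError(key)


print(get_message("pt", "lang-pl"))     # polonês, via the pt -> pt-br fallback
print(get_message("pt-pt", "lang-pl"))  # polaco
```

Under this arrangement a user who keeps 'pt' silently switches to the Brazilian strings, which is why the proposal ties the change to a vote.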

+Mention it in the release notes for everybody else other than Wikimedia projects.

Nike16:42, 22 November 2009

If I understood Nike's input correctly, then the "CLDR pt-PT" -> "MediaWiki pt" mapping is already done correctly in MediaWiki, so no issue there.

As for adopting the CLDR main language thingy in MediaWiki (i.e. having a separate "pt-PT", then having pt fall back to pt-BR) and Siebrand's terms for getting it done, I don't regard it as necessary. It's somewhat silly in itself, and as long as we're clear about what "pt" and "pt-br" mean in MediaWiki (and we are), we should be fine. Does anyone disagree?

This means that the only open issues here would be:

  • a bug in extension Language Names (Version 1.7.1 (CLDR 1.7.1)), because it maps "CLDR pt" -> "MediaWiki pt" when, in fact, the mapping should be "CLDR pt-PT" -> "MediaWiki pt". If we all agree on this, I can follow it up (any special procedure for that?).
  • any future extensions resorting to CLDR may commit the same mapping error - I'd suggest we deal with that on a case-by-case basis, as they turn up. Does anyone disagree?
Hamilton Abreu22:10, 23 November 2009

Right. I think I may have misunderstood. So the CLDR conversion script needs to have code mapping. Niklas?

Siebrand23:12, 23 November 2009

Will look into it when I have time. (Should be easy to do if anyone wants to hack the code though)

Nike21:33, 24 November 2009

Just to ensure we're in synch, what you guys are talking about will resolve the Language Names extension issue, right? So, I won't need to follow it up independently.

Hamilton Abreu00:03, 25 November 2009

Is it correct now?

Nike09:41, 26 November 2009

Thanks for addressing this, Nike. Well... I'm unsure :-). We have:

  • {{#languagename:pl}} now returns the correct "polaco", so that is certainly fixed.
  • But, on the "Other languages" box, in the "Intro" pages of this wiki, it still says "polonês" (incorrect) instead of the correct "polaco". Could this be due to the cache, perhaps?
  • On the weird side of things, though, in the sidebar, under section "Recent changes", there was a change from "Traduções em português" (correct) to "Traduções em Portuguese" (incorrect). It seems to have reverted to the English language name... any idea why?
Hamilton Abreu20:51, 26 November 2009
  • The sidebar's "in other languages" comes from MediaWiki.
  • Don't know about the translations in ... thing. Maybe the pt-pt data in CLDR only has overrides for the pt translations?
Nike07:14, 27 November 2009

Yes, it should be the case that the pt-PT variant only has overrides, pretty much like all other variants that CLDR does not consider main. So, I guess it needs to be retrieved in the same way as en-GB, de-AT, de-CH, etc. Is that possible?

Hamilton Abreu23:57, 1 December 2009
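
Retrieving an override-only variant, as suggested above, amounts to layering the variant's entries over the main locale's full list. A minimal Python sketch, assuming made-up sample data: only the "polonês"/"polaco" pair comes from this thread, while the German entry and the function name are illustrative and do not reflect the actual CLDR conversion script.

```python
# Full list from the main locale ("CLDR pt"); sample data only.
PT_MAIN = {"pl": "polonês", "de": "alemão"}

# Override-only list from the variant ("CLDR pt-PT"): just the differences.
PT_PT_OVERRIDES = {"pl": "polaco"}


def merged_names(main, overrides):
    """Start from the main locale and let the variant's entries win."""
    names = dict(main)       # copy, so the main list is left untouched
    names.update(overrides)  # variant entries replace the main ones
    return names


names = merged_names(PT_MAIN, PT_PT_OVERRIDES)
print(names["pl"])  # polaco (overridden by pt-PT)
print(names["de"])  # alemão (inherited from the main locale)
```

Using only the override list, without the main list underneath, would leave every unlisted language without a Portuguese name, which would be consistent with the guess above about names reverting to English.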

Would be easier if MediaWiki used the same codes...

Nike06:41, 2 December 2009
 

May I be incredibly pushy here and nudge you a tiny little itsy bitsy bit? Maybe some hack possible?

Hamilton Abreu22:46, 8 December 2009
 

Bump. Is there some way we can address this, as most language names continue to be in English?

Hamilton Abreu21:27, 3 January 2010
 

I'd go for fixing the code discrepancy. That is beyond my abilities, however.

Nike21:49, 3 January 2010