Language names not in Unicode CLDR

The CLDR input period will reopen soon. For now, CLDR is beta-testing a new, faster and easier version of the site. Generally the input period takes about 1 month, followed by a vetting period of about 1 month (sometimes more if there is a lot of new input data, or if there are performance problems). The next release follows within the next 2 weeks. We can expect a new release of CLDR at the beginning of summer. It could be good to request these 100 languages in the CLDR bug report to allow input there, even if Wikimedia has already started working on this language list (Wikimedia can also start discussing and fixing many of them, and these discussions can be considered during the CLDR vetting process).

Verdy p (talk)‎

Verdy: Bug 6763 on CLDR was opened by us, so there is no need to make a further report. Localised language names are usually taken from CLDR. This bug will only help those languages which have a locale at CLDR (currently 240 compared to 345 supported languages at translatewiki.net) - but that is a different issue. User Whym pointed out that we do have some local names data ourselves. Translatewiki.net_languages#Language_names does mention that it is possible to get language names added at Wikimedia if CLDR doesn't yet support the locale or the language name, but this won't help other projects. Entering data at CLDR is definitely preferable, where possible.

I hope to do a review of translatewiki.net supported languages again next winter to identify more new languages at translatewiki.net which are not yet included in the CLDR language names in the survey tool. We can then raise a new bug at CLDR to get these added to the survey tool. If the localisation statistics at CLDR are good for the batch of new names just added, then our chances of getting another batch of language names added will be better. Basically, the more people who contribute localised language names at CLDR, the better!
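As a rough illustration of such a review (not an existing script), the comparison amounts to a set difference between two lists of language codes. The function name and the sample codes below are made up for the example; the real inputs would be the list of languages supported at translatewiki.net and the list of language names offered in the survey tool.

```python
def missing_from_cldr(supported_codes, cldr_codes):
    """Return the supported language codes that CLDR does not list yet, sorted."""
    return sorted(set(supported_codes) - set(cldr_codes))


# Placeholder inputs for illustration only, not the real lists.
supported = ["cy", "br", "xx-example", "yy-example"]
in_cldr = ["cy", "br", "de"]

# Codes that would go into a new request at CLDR.
print(missing_from_cldr(supported, in_cldr))
```

The resulting list is what could be attached to a new bug at CLDR.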

Lloffiwr (talk)‎

When you posted this initial thread, the CLDR input period was not yet open. When I replied, it was still not open. This is no longer the case: CLDR input has started again, but it currently has some startup problems with performance, so it is difficult to use. Each input requires waiting about 20-30 seconds after each click or submission; otherwise the next clicks are handled asynchronously in the wrong order, or your clicks may be handled up to one minute later, when the screen content has finally changed, and the click then goes to a different element than the one actually clicked. For now, it is simply unusable for mass input. Additionally, sessions are frequently lost.

These problems in the CLDR Survey tool are not new (they have existed for many years), but they get worse with each version, because the UI performs too many background requests and because it constantly reflows the content, the whole page being laid out in one giant table.

The CLDR Survey tool has not been seriously designed, and it is unusable without a very solid PC: even with gigabytes of memory and an eight-core CPU, its JavaScript is dramatically slow, even at off-peak hours, with fewer than a dozen users connected to it. It really lacks a good server (my own desktop PC is probably more powerful than the one used to host the CLDR survey).

So we cannot recommend it to many people. It is simply more efficient to submit a bug report with the necessary data in XML format than to use the online tool (and in fact most of the content of CLDR has been submitted this way). The tool itself is only usable for vetting a few items.
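To give an idea of what such XML data could look like, here is a minimal sketch in Python that builds an LDML fragment containing only language names. It assumes the usual LDML layout (ldml / localeDisplayNames / languages / language); the function name is made up for this example, the two Welsh names are sample data, and a real submission would have to follow whatever the CLDR ticket asks for.

```python
import xml.etree.ElementTree as ET


def build_ldml_language_names(locale, names):
    """Return an LDML fragment with language names localised into `locale`.

    `names` maps a language code to its name as written in `locale`.
    """
    ldml = ET.Element("ldml")
    identity = ET.SubElement(ldml, "identity")
    ET.SubElement(identity, "language", {"type": locale})
    display = ET.SubElement(ldml, "localeDisplayNames")
    languages = ET.SubElement(display, "languages")
    # Sort the codes so the output is stable and easy to review.
    for code in sorted(names):
        entry = ET.SubElement(languages, "language", {"type": code})
        entry.text = names[code]
    return ET.tostring(ldml, encoding="unicode")


# Sample data only: two language names in Welsh (cy).
print(build_ldml_language_names("cy", {"br": "Llydaweg", "de": "Almaeneg"}))
```

A fragment like this can be attached to a bug report instead of entering each name by hand in the survey tool.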

Maybe the Wikimedia Foundation could provide a grant to the CLDR project if we want to use it as a source. But for now, Wikimedia projects would simply be better off providing the data themselves, here on translatewiki.net, and doing most of the work of creating and vetting it here, even if we then submit it in XML format to the CLDR project, where it will be merged and vetted in a later version. The CLDR project is very slow to accept new data, too slow for us when we have more urgent needs (and my opinion is that our own community is larger and provides more data of better quality than the very small community on CLDR).

For now, the other "major" participants in CLDR (Microsoft, Google, Apple, IBM) have not contributed the resources they should have given to CLDR to make it effective. In fact they also have their own internal development processes and submit data very passively to CLDR, where cooperation is in fact very poor. Google really could have provided the needed technical infrastructure of servers, and a few of its web designers, but apparently it is not really interested in providing more than about 30-50 locales, for which most of the needed data is already in CLDR, and the rest can remain in English for Google. Microsoft, Apple and IBM are apparently on the same path and don't care much about our desire in Wikimedia to support more languages.

Even though I have been a member of the CLDR project and Unicode for years, I am still convinced that Translatewiki.net performs better than CLDR and provides more data, faster, in a more efficient way, with faster corrections of errors, faster delivery, and a much larger community of contributors and users.

However, the CLDR project is still good for the technical aspect of specifications. But for collecting the data itself I cannot recommend it: there are really a lot of errors in this data that were reported years ago and remain impossible to change, year after year, because there are not enough votes and because it is nearly impossible to involve more people in this Survey tool.

I don't say that CLDR is unneeded, but it should be just one (minor) source of data for Wikimedia projects (and for most other open projects, including Ubuntu and Launchpad). I propose to reverse the direction of interaction with CLDR for as long as the CLDR project does not scale better, with serious technical resources and more serious involvement of Google, Microsoft, Apple, IBM, Adobe, Oracle (and other large "official members" of the CLDR TC).

Verdy p (talk)‎

Our new users have made thousands of submissions already, so I'm glad to say the survey tool is fine at least for some people.

Nemo (talk)‎

Speaking of usability, on Firefox it was nearly unusable because of frequent freezes, but Chromium seemed to work well for me.

whym‎

The CLDR Survey tool uses overly long forms, and some browsers have severe problems handling long lists. Things are going better now that long lists have been split into sublists. But it's true that it is sometimes slow: the server sometimes delays its responses to the browser in a very strange way (sometimes several minutes after the change), and the display is not necessarily synchronized with the input that was done. I also don't like the fact that clicking on an item moves elements up and down on the page because of these delays: sometimes this causes vetting clicks to be sent to the wrong item.

So use the Survey tool with care: if you see some strange behavior, or if it suddenly starts being extremely slow, this is because too many pending requests have accumulated in the browser. Many of them are completed, but the completion event was not received, and these stale HTTP sessions increase the number of background threads and sessions up to the point that they may hang the browser (even Chrome or Chromium).

In Firefox the tool is clearly unusable (let's not speak about Opera...), but the tool also works in IE. The tool lacks some development and evolves slowly year after year; Google, IBM and Microsoft should offer more help to the few programmers maintaining it for the CLDR TC. The server side of this tool, however, is much more stable today and much faster than it was in past years.

Verdy p (talk)‎