Localisation of page
I'll give you the current state of play for Welsh and Swahili, as working examples. For Welsh, somebody has entered quite a lot of proposed terms into the survey tool, and I have voted on those and added some further proposals. Eight votes are needed for a term to be published on CLDR. However, I have come to a temporary halt with it whilst waiting for a response from the linguists at the Bedwyr Centre for Celtic languages on a number of issues: the use of diacritic marks, the use of letters not normally part of the Welsh alphabet, principles for converting the names of unfamiliar languages into Welsh, and the like. Going on previous experience, I am not expecting them to answer in a hurry, so realistically the number of language names in the Welsh database is not going to increase significantly for at least another 6 months. If we were to set up the localisation of language names here, then the issue of the diacritic marks would disappear, because I would decide to use them, ignoring potential problems for users with outdated browsers which cannot cope with them (CLDR advises that this should be considered when deciding whether to include diacritic marks). And I would probably choose to follow the principles of localisation in the Academy Welsh Language Dictionary (which emphasises phonetic pronunciation over similarity to accepted 'international' versions of a language name). I would still like to get the opinion of the linguists at the Bedwyr Centre, but if they couldn't deal with my queries quickly then I would just go ahead anyway, subject to contributions from other Welsh translatewiki.net and Wikipedia users.
On Swahili, I have proposed a few mainly African language names on the survey tool after consulting dictionaries and native speakers. Some of the proposed terms already on CLDR are way off base - I know some are wrong, but am not in a position to come up with a correct term myself. However, I still haven't managed to establish contact with professional linguists, because of various communication difficulties. The dictionary sources available to me contain only the names of very well-known languages. There are as yet no generally accepted written forms for the names of less prominent languages, and I (and, I venture, my Tanzanian co-translator also) would not really like to propose these terms on translatewiki.net until we had managed to consult a professional linguist, who could either do the work for us or propose principles for us to work with.
These two languages have both been written languages for a long time. Some of the languages supported here have been written for a short time only, or they are non-state languages with little opportunity to develop academic or other specialist vocabulary. I don't know whether other translators have other factors to contribute from their own experience.
It would be very nice to be able to see the language name for Breton in Welsh (Llydaweg) and Asu in Swahili (Kipare). Both are correct and have already been proposed (by me) on CLDR, but are waiting for votes before being published. So I like your idea to create a database here, which CLDR might be persuaded to accept as a bulk set of proposals (or as a vote where a term has already been proposed). But creating the database here would not be straightforward, as described above.
For the languages that are not yet available on CLDR, it would be really good to have a database here on translatewiki.net.
For all languages, it would be good to be able to create a database here, provided MediaWiki could use a localised CLDR term first and, where one didn't exist, fall back to the translatewiki.net or MediaWiki database term instead. I don't know whether that is possible. If it were, we wouldn't have to depend on the speed of development of CLDR.
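To make the proposed lookup order concrete, here is a minimal sketch in Python. It is not how MediaWiki works: the function name `localised_language_name` and the two dictionaries are hypothetical, and the sample entries only stand in for a published CLDR table and a locally maintained one.

```python
from typing import Dict, Optional

# Hypothetical tables, keyed by display locale and then by language code.
NameTable = Dict[str, Dict[str, str]]


def localised_language_name(
    code: str, locale: str, cldr_names: NameTable, local_names: NameTable
) -> Optional[str]:
    """Return the CLDR name if one exists, else the locally maintained one."""
    cldr_name = cldr_names.get(locale, {}).get(code)
    if cldr_name:
        return cldr_name
    # No CLDR term yet: fall back to the local (translatewiki.net) table.
    return local_names.get(locale, {}).get(code)


# Illustrative data only; it does not reflect the real state of CLDR.
cldr_names = {"cy": {"cy": "Cymraeg"}}
local_names = {"cy": {"br": "Llydaweg"}, "sw": {"asa": "Kipare"}}

welsh_for_breton = localised_language_name("br", "cy", cldr_names, local_names)
swahili_for_asu = localised_language_name("asa", "sw", cldr_names, local_names)
```

With this ordering, a local entry is used only until CLDR publishes a term for the same code, at which point the CLDR term takes over without any local clean-up.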
It's imho pretty easy to populate a database of language names, since ISO 639 has names already. Depending on the substandard (currently 1, 2B, 2T, 3, 5, plus 6 in preparation), we have English, French, and one or several autonyms (the name of a language in the language itself). They are all available for more or less automatic bulk download. Since I've done that several times already for my own tools, I'd be available to do the same for twn as well. I know there are a few special cases with dropped codes and altered meanings, but the bulk is fairly easy. Btw., the Babel extension is doing something very similar already.
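As a rough illustration of such a bulk import, the sketch below reads a small tab-separated table into a code-to-name dictionary. The column names `Id`, `Part1` and `Ref_Name` and the inline sample are assumptions made for the example; a real download would need its own column mapping and handling of retired codes.

```python
import csv
import io
from typing import Dict

# Assumed layout: three-letter code, optional two-letter code, English name.
SAMPLE = (
    "Id\tPart1\tRef_Name\n"
    "bre\tbr\tBreton\n"
    "cym\tcy\tWelsh\n"
    "asa\t\tAsu (Tanzania)\n"
)


def load_language_names(tsv_text: str) -> Dict[str, str]:
    """Map every code found in the table to its English reference name."""
    names: Dict[str, str] = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        names[row["Id"]] = row["Ref_Name"]
        if row["Part1"]:  # the two-letter code is absent for many languages
            names[row["Part1"]] = row["Ref_Name"]
    return names


names = load_language_names(SAMPLE)
```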
Have you considered encouraging the Welsh linguists to work with the CLDR directly? Quite a number of such organizations contribute data directly to the CLDR. Please don't create yet another divergent process.
This is a sore point. As well as asking the Bedwyr Language Centre, who tell me that they are the 'official' contributors to CLDR, to comment on the principles of localising foreign proper nouns to Welsh for CLDR, I have also asked them to contribute terms and vote on those already contributed by others. I have deliberately held off from voting myself where the term to be adopted is not obvious, to allow them to take the lead in localising. I expect, however, to be waiting a long time, if past experience is anything to go by, before they can do anything with this, since their time and resources are heavily committed to all sorts of things. I intended to convey my cautious approach to the question of standardisation in my post above and am sorry that it was not clearly stated.
At the same time, there is nothing controversial about the word 'Llydaweg', meaning 'Breton language', which has been around for hundreds of years, and no-one needs a language specialist to rubber-stamp it. It would be nice to be able to have 'Llydaweg' in a database created here, for use here until it is superseded by the CLDR term, whenever that may become available. Whether that is practical is a matter for the developers here to address.
As it stands 'Llydaweg' is the unopposed entry. It has the status of 'contributed', but it certainly doesn't need 8 votes to be 'in CLDR' as things stand. It will show up as "<language type="br" draft="contributed">Llydaweg</language>" and twn (and other CLDR consumers) can decide whether "contributed" is better than nothing.
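For consumers weighing whether "contributed" is good enough, a minimal sketch of that decision in Python follows. The function `accept_name` and the `DRAFT_RANK` table are hypothetical helpers, not part of any CLDR tooling; the sketch assumes that a missing `draft` attribute means the entry is approved.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Draft levels ordered from least to most trusted (assumed ordering).
DRAFT_RANK = {"unconfirmed": 0, "provisional": 1, "contributed": 2, "approved": 3}


def accept_name(xml_fragment: str, minimum: str = "contributed") -> Optional[Tuple[str, str]]:
    """Return (code, name) if the entry's draft level meets the minimum."""
    element = ET.fromstring(xml_fragment)
    status = element.get("draft", "approved")  # no attribute: treat as approved
    if DRAFT_RANK.get(status, 0) >= DRAFT_RANK[minimum]:
        return element.get("type"), element.text
    return None


entry = '<language type="br" draft="contributed">Llydaweg</language>'
lenient = accept_name(entry)                     # accepts "contributed"
strict = accept_name(entry, minimum="approved")  # rejects it
```

A consumer that prefers something over nothing would use the lenient default; one that wants only fully vetted data would set the minimum to "approved".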
You said that you held off on voting; did you at least enter the terms? The time for submitting data has now passed in most cases. What you could have done is enter an option and then change your vote back to n/o (no opinion) - that puts an entry in but doesn't yet vote for it.
Thanks for participating, and please watch for a mass e-mail soon about the vetting phase.