Changing the alphabet for the Nogai language

Is this Latin alphabet used in any professional publications, such as books or newspapers? Is it taught in schools?

Or is it only used in informal writing?

Amir E. Aharoni (talk) 10:15, 18 April 2022

There is some information at https://en.wikipedia.org/wiki/Nogai_alphabets: Nogai has three scripts (Cyrillic, Latin and Arabic), and their usage depends on the country where Nogai speakers live. So far the existing Nogai translations have been made in Cyrillic by default. Wikipedia claims that Latin is used today, and that Cyrillic, introduced in 1962, may not be used everywhere, or could have fallen out of use after political changes in the regions where it was introduced (and where it may have been enforced for a bit less than 30 years).

Verdy p (talk) 15:55, 18 April 2022
 

Here are some sources. The Latin alphabet used today: https://www.omniglot.com/writing/nogai.htm

An example of it being used for a translation: https://lyricstranslate.com/tr/ka%C3%B1ili-kanl%C4%B1.html

TayfunEt. (talk) 09:30, 24 April 2022
 

Also, is it possible for untranslated words to be shown in Kazakh? (The Kazakh language is very similar to the Nogai language.)

TayfunEt. (talk) 09:35, 24 April 2022

Fallbacks should use the same script. So Nogai-Cyrillic could fall back first to Kazakh-Cyrillic, Nogai-Arabic could fall back first to Kazakh-Arabic, and Nogai-Latin could fall back to Kazakh-Latin. For now only the Cyrillic version of Nogai is enabled (as the default script), meaning that it can only fall back to Kazakh-Cyrillic.
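As a minimal sketch of the same-script fallback rule described above (not translatewiki.net's actual configuration): the variant codes, the `FALLBACKS` table, the `lookup` helper and the final fallback to English are all assumptions made for illustration.

```python
# Hypothetical variant codes: each Nogai script falls back to Kazakh in the same script.
FALLBACKS = {
    "nog-cyrl": ["kk-cyrl"],
    "nog-latn": ["kk-latn"],
    "nog-arab": ["kk-arab"],
}


def lookup(key, variant, messages, final="en"):
    """Return (text, code) from the first variant in the chain that has the message.

    `messages` maps a variant code to a dict of message key -> text.
    If nothing in the chain has the message, the key itself is returned with code None.
    """
    chain = [variant, *FALLBACKS.get(variant, []), final]
    for code in chain:
        text = messages.get(code, {}).get(key)
        if text is not None:
            return text, code
    return key, None


# Placeholder strings only; these are not real translations.
sample = {
    "kk-cyrl": {"search": "<kk-cyrl search>"},
    "en": {"search": "Search", "save": "Save"},
}
```

With this sample data, a Nogai-Cyrillic lookup of "search" would come from Kazakh-Cyrillic, while a Nogai-Latin lookup would skip straight to English because no Kazakh-Latin text exists in the sample.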

Mixing Cyrillic and Latin because of partial translations is often possible without much trouble in composite messages (those that contain placeholder variables for values that are also translatable with fallbacks).

However, mixing Cyrillic and Arabic scripts, or Latin and Arabic scripts, often causes layout problems (but the same can be said when the default fallback is English, written in Latin, while Kazakh or Nogai is written in Arabic).

Such a request is conceivable, but Kazakh also needs effort to be more properly translated, notably in the Arabic script, before that latter variant becomes a viable fallback for Nogai-Arabic. Nogai-Arabic is still not enabled for now anyway, so this request should not cause major problems. We'll talk another time about whether to start translating to Nogai-Arabic, unless a fallback to Persian/Farsi is preferred in that case, because Persian is much more advanced and much more widely used today than Kazakh-Arabic, which is historical.

Verdy p (talk) 16:41, 24 April 2022

We want to use the Latin script as the default script now, which means that the fallback language (Kazakh) needs to be in Latin. As for the scripts, what I have in mind is something like the Crimean Tatar Wikipedia, which also uses two scripts, with Latin as the main one.

TayfunEt. (talk) 18:46, 24 April 2022
Edited by 2 users.
Last edit: 12:27, 25 April 2022

So you are requesting that the existing translations made in Cyrillic be moved to a script extension?

And that Latin script become the default (without needing an extension, replacing the existing Cyrillic version once it has been moved), or just that an additional Latin-script extension be created (as for Kazakh, which is the more likely outcome, giving a situation similar to Serbian, Chinese or Kazakh, with no predefined default choice)?

Question: Is it viable to define a stable transliterator between Cyrillic and Latin for Nogai (such as the one used for Serbian), with an agreed standard that makes the conversion bijective (i.e. reversible without loss and without creating orthographic problems, except through a dictionary of known exceptions, like the large one used for Chinese between its two standard script forms)?

Verdy p (talk) 20:31, 24 April 2022

Oh yes, it can be like the Kazakh Wikipedia and the Serbian Wikipedia (where the language menu of a page also lets you change the alphabet). And for your information, we will use the new Latin alphabet (not the 1928 version). File:Nogai latin alphabet.gif

TayfunEt. (talk) 12:27, 25 April 2022
Edited by author.
Last edit: 08:10, 26 April 2022

At least Omniglot provides a good start for a transliterator. (https://www.omniglot.com/writing/nogai.htm)

It may eventually be reversible, with these simple exceptions using digraphs that would take precedence when converting back to Cyrillic. (Cyrillic is still in use in Dagestan and Chechnya in Russia, while the Latin script is used in Turkey and Romania; it is based on the Turkic Latin alphabet in the Latin subset of ISO/IEC 8859-9, which also fully supports both Turkish and Romanian.)

  • "ya / Ya / YA" = "я / Я" [ja],
  • "yo / Yo / YO" = "ё / Ё" [jo] (only occurring in Nogai loanwords borrowed from Russian),
  • "yu / Yu / YU" = "ю / Ю" [ju] (or [jy], only word-initially or after a vowel),
  • "y / Y / ’" = "ь / Ь" [ɯ] (only occurring in Nogai loanwords borrowed from Russian),
    The Latin apostrophe needs special handling so that it can be associated with the Cyrillic soft sign "ь / Ь": preferably the curly version, U+2019 RIGHT SINGLE QUOTATION MARK (’), not the ASCII quote, even if the ASCII apostrophe is probably commonly used as a substitute. There is no curly quote in ISO/IEC 8859-9, but it is present at 0x92 in the well-supported code pages Windows-1254 (for the Latin-based alphabet of Turkish) and Windows-1250 (for the Latin-based alphabets of Central European languages, including Romanian), although it is rarely present on the standard physical keyboards used in Turkey and Romania.
  • "j / J" = "й / Й" [j] (note: there is no need for a distinction between dotted and undotted "j / J" in the Latin script, so it is "soft-dotted" as in English or Italian),
  • "i / İ" (dotted) = "и / И" [i],
  • "ı / I" (undotted) = "ш / Ш" [ɯ],
    Romanian users may have trouble distinguishing the dotted and undotted vowels in the Latin script, as they may not have "ı" (undotted lowercase) or "İ" (dotted capital) on their physical Latin keyboard. Turkish users won't have this problem.
  • "ts / Ts / TS" = "ц / Ц" [ts] (only occurring in Nogai loanwords borrowed from Russian),
  • "şç / Şç / ŞÇ" = "щ / Щ" [ɕː] (only occurring in Nogai loanwords borrowed from Russian),
    Note: because of their keyboard, users in Romania writing Nogai in Latin script may type a comma-below diacritic, which is standard in the Romanian language, rather than the cedilla. The same also happens frequently in Romanian itself, where the two diacritics are confused, and many old devices had no fonts with the comma below. Among legacy 8-bit charsets, the comma below was mapped as precomposed characters (on the base letters "s / S / t / T") only in ISO/IEC 8859-16 (Pan-EU Latin-10), and only after it had been encoded separately in Unicode, while the cedilla was used in ISO 8859-1 (Western European Latin-1), ISO 8859-2 (Central European Latin-2), ISO 8859-3 (Southern European Latin-3) and ISO 8859-9 (Turkish Latin-5). The ISO/IEC 8859-16 (Latin-10) charset was added but fell out of real use because Unicode support was already preferred, including in Wikipedia, which already used UTF-8 at that time. Soon after that, ISO decided to stop adding and supporting new 8-bit charsets and to focus only on Unicode. Microsoft did not need this charset either, because it had long since mapped the cedilla below "c / C / s / S" in the Windows code pages based on the Latin variants of ISO/IEC 8859.
    Only Romanian users on modern systems that are fully Unicode-compatible, with modern Unicode-based fonts that supply the common "WGL" subset of Latin, may have keyboard layouts featuring the comma below, and may type it by default rather than the cedilla, as if they were typing correctly in Romanian. Turkic-speaking users won't do that.
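One way a converter could tolerate the comma-below input described above is to fold it to the cedilla form before transliterating. A minimal sketch, assuming that only the s-based letters need folding (Unicode has no precomposed "c with comma below") and that the helper name `fold_comma_below` is hypothetical:

```python
import unicodedata

# Romanian comma-below letters mapped to the cedilla letters used in the
# Turkic-style Latin alphabet.
COMMA_TO_CEDILLA = str.maketrans({
    "\u0219": "\u015F",  # ș (s with comma below) -> ş (s with cedilla)
    "\u0218": "\u015E",  # Ș (S with comma below) -> Ş (S with cedilla)
})


def fold_comma_below(text):
    """Replace s/S with comma below by s/S with cedilla.

    NFC is applied first so that a base letter followed by
    U+0326 COMBINING COMMA BELOW is composed into U+0219 / U+0218.
    """
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(COMMA_TO_CEDILLA)
```

Running this as a preprocessing step would let text typed on a Romanian layout match the same conversion rules as text typed on a Turkish layout.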

If that transliterator is enabled, then it could be installed by default and thus we would not even need to create separate translations for Nogai between these two scripts, and the wiki could remain unified with the same content, equally accessible from Russia and Turkey (or Romania).

Note also that MediaWiki supports a special syntax (using -{code1:text1;code2:text2}-) to mark specific conversion rules in articles (it is used, for example, in Chinese to make exceptions to the converter between simplified and traditional Chinese).

Verdy p (talk) 13:53, 25 April 2022

And what about Kazakh as the fallback language?

TayfunEt. (talk) 04:58, 26 April 2022

Transliterating the content between Latin and Cyrillic would come first in my opinion, before trying a fallback to Kazakh in the appropriate script (Latin or Cyrillic, whichever has content).

Note that Kazakh Wikipedia also uses a transliterator, only from Cyrillic to Latin or to Arabic: all its article names (including proper names, except for names like "Twitter" or "Los Angeles Times") and category names use only Cyrillic; some compatibility namespace names may have Latin aliases in English (but they are translated to Cyrillic), and some template names were borrowed from English Wikipedia or Commons without necessarily being renamed. So Kazakh does not need an extension or separate content, including in translations made on this wiki... except that Wikimedia transliterators are not installed on translatewiki.net (which is not just translating for Wikimedia's MediaWiki-based wikis).

This means that Kazakh Wikipedia only uses and maintains the "kk-Cyrl" translation made here on translatewiki.net. "kk-Latn" and "kk-Arab" are used for the UI only: the UI does not use the content transliterator, but lets users choose their user language independently of page contents, which are transliterated according to the script variant selector and not according to the current user language used for the rest of the UI.

Verdy p (talk) 06:45, 26 April 2022

You are wrong about some of the letters:

j = ж

ı (undotted)= ы

y = й

ş = ш

Ь (the soft sign) is not found in the Nogai Latin alphabet. Please don't confuse it with the Latin alphabet from 1928; the Latin alphabet which is used today is different. It is the same as the Crimean Tatar Latin alphabet; the only difference is the letter Ä ä, which is Аь аь in Nogai Cyrillic. Thank you for your help!
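With these corrections, the digraph-precedence idea discussed earlier in the thread can be sketched as a longest-match replacement. This is only an illustration: the table is partial (just the pairs quoted in this thread, lowercase only), the name `to_cyrillic` is hypothetical, and it is not the converter MediaWiki would actually use.

```python
# Partial Latin -> Cyrillic table, built only from pairs quoted in this thread.
# Lowercase only; a real table would need the full alphabet and case handling
# (including dotted İ and undotted I).
PAIRS = {
    "ya": "я", "yo": "ё", "yu": "ю",  # digraphs, matched before single "y"
    "ts": "ц", "şç": "щ",             # digraphs from Russian loanwords
    "ä": "аь",
    "j": "ж", "y": "й", "ş": "ш",
    "i": "и", "ı": "ы",
}

# Longest keys first, so that digraphs take precedence over single letters.
KEYS = sorted(PAIRS, key=len, reverse=True)


def to_cyrillic(text):
    """Convert Latin-script text using PAIRS; unmapped characters pass through."""
    out = []
    pos = 0
    while pos < len(text):
        for key in KEYS:
            if text.startswith(key, pos):
                out.append(PAIRS[key])
                pos += len(key)
                break
        else:
            # No rule matched at this position: copy the character unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)
```

Because "ya" is tried before "y", the input "ya" becomes "я" rather than "й" followed by an unconverted "a". Sequences such as "ts" that are not meant as "ц" would still need an exception mechanism, which this sketch does not attempt.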

TayfunEt. (talk) 12:34, 26 April 2022