Alphabet changing Nogai language

TayfunEt. (talk)09:17, 18 April 2022

Is this Latin alphabet used in any professional publications, such as books or newspapers? Is it taught in schools?

Or is it only used in informal writing?

Amir E. Aharoni (talk)10:15, 18 April 2022

There is some information on (3 scripts: Cyrillic, Latin and Arabic, but their usage depends on the country where Nogai speakers live). For now, the existing Nogai translations have been made using Cyrillic by default. Wikipedia claims that Latin is used today, and that Cyrillic, introduced in 1962, may not be used everywhere, or could have fallen out of use after political changes in the region where it was introduced (and may have been enforced for a bit less than 30 years).

Verdy p (talk)15:55, 18 April 2022

Here are some sources: The Latin alphabet used today:

Used to translate:

TayfunEt. (talk)09:30, 24 April 2022

And is it possible that the untranslated words will show in Kazakh (Kazakh language is very similar to the Nogai language)?

TayfunEt. (talk)09:35, 24 April 2022

Fallbacks should use the same script. So Nogai-Cyrillic could fall back first to Kazakh-Cyrillic, Nogai-Arabic to Kazakh-Arabic, and Nogai-Latin to Kazakh-Latin. For now only the Cyrillic version of Nogai is enabled (as the default script), meaning that it can only fall back to Kazakh-Cyrillic.
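
To make the same-script rule concrete, here is a minimal Python sketch of a per-script fallback lookup. It is only an illustration: the language codes ("nog-Cyrl" and so on), the final fallback to English, and the tiny message store are assumptions made for this example, not the actual configuration of this wiki.

```python
# Hypothetical fallback chains: each Nogai script variant falls back to
# the Kazakh variant in the same script, then to English as a last resort.
FALLBACKS = {
    "nog-Cyrl": ["kk-Cyrl", "en"],
    "nog-Latn": ["kk-Latn", "en"],
    "nog-Arab": ["kk-Arab", "en"],
}

# Hypothetical message store; an empty dict means "nothing translated yet".
MESSAGES = {
    "nog-Cyrl": {},
    "kk-Cyrl": {"search": "Іздеу"},
    "en": {"search": "Search", "help": "Help"},
}


def lookup(key, lang, messages=MESSAGES):
    """Return (text, code) from the first language in the chain that has the key."""
    for code in [lang, *FALLBACKS.get(lang, ["en"])]:
        text = messages.get(code, {}).get(key)
        if text is not None:
            return text, code
    raise KeyError(key)


print(lookup("search", "nog-Cyrl"))  # ('Іздеу', 'kk-Cyrl')
print(lookup("help", "nog-Cyrl"))    # ('Help', 'en')
```

With chains defined this way, an untranslated Nogai-Cyrillic message is shown in Kazakh-Cyrillic when available, and a script mismatch only appears when the chain reaches English.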

Mixing Cyrillic and Latin because of partial translations is often possible without much problem in composite messages (which contain placeholder variables for values that are also translatable with fallbacks).

However mixing Cyrillic+Arabic scripts or Latin+Arabic scripts often causes layout problems (but the same can be said when the default fallback is to English, written in Latin, and Kazakh or Nogai would be written in Arabic).

Such a request is conceivable, but Kazakh also needs efforts to get more properly translated, notably in the Arabic script, before that latter variant becomes a viable fallback for Nogai-Arabic (which is still not enabled for now anyway, so this request should not cause major problems: we'll talk another time about the opportunity to start translating to Nogai-Arabic, unless another possible fallback to Persian/Farsi is preferred in that case, because Persian is much more advanced and much more used today than Kazakh-Arabic, which is historical).

Verdy p (talk)16:41, 24 April 2022

We now want to use the Latin script as the default script, and this means that the fallback language (Kazakh) needs to be in Latin. For the scripts, what I have in mind is something like the Crimean Tatar Wikipedia, where two scripts are also used and the main script is Latin.

TayfunEt. (talk)18:46, 24 April 2022
Edited by 2 users.
Last edit: 12:27, 25 April 2022

So you request that existing translations made in Cyrillic be moved to a subscript extension?

And make Latin script the default (without needing an extension, replacing the existing Cyrillic version already moved before that), or just create an additional Latin-script extension (like for Kazakh, which is more likely to occur, giving a situation similar to Serbian, Chinese or Kazakh without any predefined default choice)?

Question: Is it viable to define a stable transliterator between Cyrillic and Latin for Nogai (such as the one used in Serbian) with an agreed standard allowing such conversion to be bijective (i.e. reversible without loss and without creating orthographic problems, except by using a dictionary of known exceptions, like the large one used for Chinese between its two standard script forms)?

Verdy p (talk)20:31, 24 April 2022

Oh yes, it can be like the Kazakh Wikipedia and Serbian Wikipedia (that is, you go to the languages of the page and can also change the alphabet). And for information, we will use the new Latin alphabet (not the 1928 version). File:Nogai latin alphabet.gif

TayfunEt. (talk)12:27, 25 April 2022
Edited by author.
Last edit: 08:10, 26 April 2022

At least Omniglot provides a good start for a transliterator.

It may eventually be reversible with these simple exceptions using digraphs that would take precedence for the conversion back to Cyrillic (still in use in Russian Dagestan and Chechnya, while the Latin script is used in Turkey and Romania, and based on the Turkic Latin alphabet in the ISO/IEC 8859-9 subset for Latin, which also fully supports both Turkish and Romanian):

  • "ya / Ya / YA" = "я / Я" [ja],
  • "yo / Yo / YO" = "ё / Ё" [jo] (only occurring in Nogai loanwords borrowed from Russian),
  • "yu / Yu / YU" = "ю / Ю" [ju] (or [jy] only if initial of the word or after a vowel),
  • "y / Y / ’" = "ь / Ь" [ɯ] (only occurring in Nogai loanwords borrowed from Russian),
    The special handling for the Latin apostrophe (preferably the curled version U+2019 (right single quotation mark) i.e. (’), not the ASCII quote, even if the ASCII apostrophe is probably commonly used as a substitute), to be associated to the Cyrillic soft sign letter "ь / Ь" (because there's no curly quote in ISO/IEC 8859-9, but it is present in the well supported codepages Windows-1254 for the Latin-based alphabet in Turkish or Windows-1250 for the Latin-based alphabet in Central European languages including Romanian, where the apostrophe is coded at 0x92 but rarely present on standard physical keyboards used in Turkey and Romania).
  • "j / J" = "й / Й" [j] (note: there's no need for a distinction between dotted and undotted "j/J" in the Latin script, so it is "soft-dotted" like in English or Italian)
  • "i / İ" (dotted) = "и / И" [i],
  • "ı / I" (undotted) = "ш / Ш" [ɯ],
    Romanian users may have problems distinguishing the dotted and undotted vowels in the Latin script, as they may not have "ı" (undotted lowercase) or "İ" (dotted capital) on their physical Latin keyboard. Turkish users won't have this problem.
  • "ts / Ts / TS" = "ц / Ц" [ts] (only occurring in Nogai loanwords borrowed from Russian),
  • "şç / Şç / ŞÇ" = "щ / Щ" [ɕː] (only occurring in Nogai loanwords borrowed from Russian),
    Note: due to their keyboard, users in Romania writing Nogai in Latin script may type a comma-below diacritic, which is standard in the Romanian language, rather than the cedilla. The same confusion of the two diacritics also happens frequently in Romanian itself: many old devices did not have fonts with the comma below, which was mapped in legacy 8-bit charsets for Romanian as precomposed characters (for the base letters "s / S / t / T") only in ISO/IEC 8859-16 (Pan-EU Latin-10), and only after it was encoded separately in Unicode, while the cedilla was used in ISO 8859-1 (Western European Latin-1), ISO 8859-2 (Central European Latin-2), ISO 8859-3 (Southern European Latin-3) and ISO 8859-9 (Turkic Latin-5). The ISO/IEC 8859-16 (Latin-10) charset was added but fell out of real use as support for Unicode was already preferred, including in Wikipedia, which already used UTF-8 at that time; soon after that, ISO decided to no longer add and support new 8-bit charsets, focusing only on Unicode. Microsoft also did not need this charset, because it had long since mapped the cedilla below "c / C / s / S" in Windows codepages based on Latin variants of ISO/IEC 8859.
    Only Romanian users using modern systems that are fully Unicode compatible and use modern Unicode-based fonts that supply the "WGL" common subset of Latin, may have keyboard layouts featuring the comma below, and may type it by default rather than typing cedillas, as if they were typing correctly in Romanian. Turkic-speaking users won't do that.
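
The idea of digraphs taking precedence over single letters can be sketched as a longest-match-first table lookup. The Python below is only an illustration of that technique under stated assumptions: the table holds just the digraphs and the dotted "i / İ" pair from the list above, every other character is passed through unchanged, and the function name is hypothetical. It is not an agreed Nogai transliteration standard, and the inputs are arbitrary letter sequences, not real Nogai words.

```python
# Partial Latin -> Cyrillic table taken from the list above (assumption:
# illustrative subset only; single letters other than "i" are left out).
TABLE = {
    "ya": "я", "Ya": "Я", "YA": "Я",
    "yo": "ё", "Yo": "Ё", "YO": "Ё",
    "yu": "ю", "Yu": "Ю", "YU": "Ю",
    "ts": "ц", "Ts": "Ц", "TS": "Ц",
    "şç": "щ", "Şç": "Щ", "ŞÇ": "Щ",
    "i": "и", "İ": "И",
}


def latin_to_cyrillic(text):
    """Convert text, trying two-character entries before one-character ones."""
    out = []
    pos = 0
    while pos < len(text):
        for width in (2, 1):  # digraphs take precedence
            chunk = text[pos:pos + width]
            if chunk in TABLE:
                out.append(TABLE[chunk])
                pos += len(chunk)
                break
        else:
            # No table entry: copy the character unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)


print(latin_to_cyrillic("şçi tsi Yu"))  # щи ци Ю
```

Listing each case variant explicitly avoids relying on generic case folding, which does not treat dotted "İ" and undotted "I" the way a Turkic alphabet needs.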

If that transliterator is enabled, then it could be installed by default and thus we would not even need to create separate translations for Nogai between these two scripts, and the wiki could remain unified with the same content, equally accessible from Russia and Turkey (or Romania).

Note also that MediaWiki supports special syntax (using -{code1:text1;code2:text2}-) to mark specific transliteration rules in articles (it is used, for example, in Chinese to make exceptions to the converter between simplified and traditional Chinese).

Verdy p (talk)13:53, 25 April 2022

And the fallback language Kazakh?

TayfunEt. (talk)04:58, 26 April 2022

Transliterating the content between Latin and Cyrillic would come first in my opinion, before trying a fallback to Kazakh in the appropriate script (Latin or Cyrillic, whichever has content).

Note that Kazakh Wikipedia also uses a transliterator, only from Cyrillic to Latin or to Arabic: all its article names (including proper names, except for brands like "Twitter" or "Los Angeles Times") and category names use only Cyrillic; some compatibility namespace names may have a Latin alias in English (but they are translated to Cyrillic), and some template names are borrowed from English Wikipedia or Commons without necessarily being renamed. So Kazakh does not need an extension or separate contents, including in translations made on this wiki... except that Wikimedia transliterators are not installed on (which is not just translating for Wikimedia's MediaWiki-based wikis).

This means that Kazakh Wikipedia only uses and maintains the "kk-Cyrl" translation made here on, "kk-Latn" and "kk-Arab" are used for the UI only (which does not use the content transliterator, but allows users to choose their user language for the UI independently of page contents, which use the script variant selector for transliteration, and not the current user language used for the rest of the UI).

Verdy p (talk)06:45, 26 April 2022

You are wrong about some of the letters:

j = ж

ı (undotted)= ы

y = й

ş = ш

Ь (the soft sign) is not found in the Nogai Latin alphabet. Please don't confuse it with the Latin alphabet from 1928; the Latin alphabet which is used today is different. It is the same as the Crimean Tatar Latin alphabet; the difference is the letter Ä ä, which is Аь аь in Nogai Cyrillic. Thank you for your help!
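
As a rough illustration of why the Ä ä / Аь аь pair matters for a converter, here is a small Python sketch that uses only the five lowercase pairs given in this post and checks that they survive a round trip. The helper name and the test string are assumptions made for the example (the string is an arbitrary letter sequence, not a Nogai word), and nothing here shows that the full alphabet converts without loss.

```python
# The five lowercase pairs from the correction above (Latin, Cyrillic).
PAIRS = [("j", "ж"), ("ı", "ы"), ("y", "й"), ("ş", "ш"), ("ä", "аь")]

TO_CYRILLIC = {latin: cyr for latin, cyr in PAIRS}
TO_LATIN = {cyr: latin for latin, cyr in PAIRS}


def convert(text, table):
    """Replace table entries, trying the longest keys first."""
    widths = sorted({len(key) for key in table}, reverse=True)
    out = []
    pos = 0
    while pos < len(text):
        for width in widths:
            chunk = text[pos:pos + width]
            if chunk in table:
                out.append(table[chunk])
                pos += len(chunk)
                break
        else:
            # Characters outside the table are copied unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)


sample = "jıyşä"
cyrillic = convert(sample, TO_CYRILLIC)
print(cyrillic)                     # жыйшаь
print(convert(cyrillic, TO_LATIN))  # jıyşä
```

Because "аь" is two Cyrillic characters, the Cyrillic-to-Latin direction has to match it before any single-letter rule, which is the same longest-match idea that applies to Latin digraphs in the other direction.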

TayfunEt. (talk)12:34, 26 April 2022

TayfunEt., discussing it all is pointless until you bring up a reliable source that this Latin alphabet is used anywhere formally for school teaching, books, newspapers, professional websites (not social network posts), or something comparable. Omniglot is not a reliable source: it's just one guy who puts anything that people send to him on his website.

To the best of my knowledge, the only formal alphabet for Nogai is Cyrillic. Until proven otherwise, it will remain the default here.

Amir E. Aharoni (talk)09:40, 28 April 2022

And there is plenty of evidence of the use of Nogai in Romania (Northern Nogai) and Turkey (Southern Nogai), written with the Latin script. The use of Cyrillic is only for Nogai people who emigrated further north at the beginning of the 20th century with the transfer of most of the former larger Nogai region to Russia, and soon afterwards, in the USSR, into the Moldavian SSR and the Ukrainian SSR (including Crimea, now in independent Ukraine but occupied by Russia).

Yes, there are linguistic links with Crimean Tatar, but Tatars in South-East Romania are well known; most of them, however, moved to Turkey, and in both countries they use the Latin script today. Nogai is not just for Russia or Ukraine, where they were assimilated to or confused with Crimean Tatars who use the Cyrillic script for their language. Linguistically, Nogai is recognized in Romania and is spoken and written by "Romanian Tatars" (some of them assimilated to "Turkish" people), but they don't use the Crimean Tatar language or its Cyrillic script. Nogai people in Romania and Moldova now share many more family links with Nogai people in Turkey and have more relations with them than with Crimea (Ukraine or Russia), even if Nogai people in Romania are also now in contact with Bulgars (who use the Cyrillic script even in Romania) and also with Greeks, but in a limited way due to religion (Nogais are mostly Sunni Muslims like Turks, whereas Romanians, Greeks, and Bulgars are mostly Catholic or Orthodox Christians).

Nogai political parties in Southeastern Romania (generally Islamic and often designating themselves as "Tatars" rather than "Turks" because they promote local autonomy and better acceptance of their Islamic religion) use the Latin script for their local communication, and they also find financial and media support from Turkey due to their active and strong links. And if they don't communicate in the Nogai language, they communicate in Turkish, even if officially they also have to communicate in Romanian for local elections. The links with the Tatars of Crimea have been almost cut for a bit more than a century, but not the links with Turkey, which was founded at the end of WW1 after the collapse of the Ottoman Empire (which caused the historic Nogai region to be split essentially in two parts, between Romania and the successors of the Russian Empire).

Verdy p (talk)10:15, 28 April 2022

There is a project supported by Tubitak in one of the prominent universities of Turkey. They are using a Turkish-based Latin script. Many papers in DergiPark also use the Latin script (e.g. TDD Journal).

Joseph (talk)14:50, 30 April 2022

These two links don't directly show texts in the Latin script or documentation of orthography. Can you please give direct links to such texts? I tried searching for it in the websites, and couldn't find anything.

Note that transcriptions of words or sentences in scientific papers are not really texts, because that's not what ordinary people use for reading and writing.

Amir E. Aharoni (talk)08:43, 1 May 2022

Just look at the video on this page, whose title is clear (even if you don't read Turkish, you can easily decipher it). Go to the description given on YouTube. You'll see that this is active university work, done in Turkey and supported by UNESCO (so this is serious).

Verdy p (talk)16:23, 2 May 2022