Start Hanja Script language for Korean

Hello. Recently, some Korean users started a new Hanja wiki, and they want a Hanja interface for it. Also, in the Wikimedia projects, a new request for a Hanja Wikipedia has not been rejected, and if the LC accepts the request, it will need its own interface. According to ISO 15924, the code Kore is assigned to Korean with Hanja, so we can use the code Kore or ko-Kore for it. I am also interested in this, and more Koreans would contribute to it.

Ellif (talk)09:23, 13 October 2020

The ISO 15924 code [Kore] does NOT mean that the text uses only (or preferably) Hanja. This code indicates text written in a mix of ALL scripts used in Korean, so it accepts both Hangul and Han/Hanja.
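To make the distinction between [Hang], [Hani] and [Kore] concrete, here is a minimal Python sketch that guesses the narrowest of the three codes from the characters a string contains. The function name `script_subtag` is hypothetical, and the check relies only on Unicode character names from the standard `unicodedata` module, so it is a rough illustration rather than a full script detector. Note that Hangul-only text is still valid [Kore]; the sketch merely reports the narrowest fit.

```python
import unicodedata

def script_subtag(text):
    """Guess the narrowest ISO 15924 code (Hang, Hani or Kore) for Korean text."""
    has_hangul = has_han = False
    for ch in text:
        name = unicodedata.name(ch, "")
        if name.startswith("HANGUL"):
            has_hangul = True
        elif name.startswith(("CJK UNIFIED", "CJK COMPATIBILITY IDEOGRAPH")):
            has_han = True
    if has_hangul and has_han:
        return "Kore"   # mix of Hangul and Han
    if has_han:
        return "Hani"   # Han (Hanja) only
    if has_hangul:
        return "Hang"   # Hangul only
    return None         # neither script found (e.g. Latin, digits, punctuation)

print(script_subtag("한글"))
print(script_subtag("漢字"))
print(script_subtag("大韓民國 만세"))
```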

Korean by default uses [Kore], i.e. [ko] is a perfect synonym of [ko-Kore], just like [ja] is a perfect synonym of [ja-Jpan], i.e. it uses all Japanese kana syllabaries, [Hira]+[Kana], and Kanji [Hani] (you can use [Hrkt] to indicate that all Kanji are transcribed to kana, Hiragana and Katakana).

Using [ko-Kore] is then NONSENSE. Look at BCP 47, and notably the IANA database, which states that [ko] uses [Kore] as its default implied script. [ko-Kore] should then not be used in BCP 47. It exists only for legacy applications that want to include a script subtag for all languages and do not want to perform a lookup of the language code to map it to its default script.
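As a rough illustration of the default-script rule described above, the following Python sketch drops a script subtag when it matches the language's Suppress-Script value. The `SUPPRESS_SCRIPT` table is a hand-copied assumption holding only the two entries mentioned in this thread; a real implementation would read the full IANA Language Subtag Registry. The function name is hypothetical, and the parsing is simplified (it assumes the script subtag directly follows the language subtag, with no extlang in between).

```python
# Hand-copied subset of the registry's Suppress-Script fields (assumption:
# only the two languages discussed here are listed).
SUPPRESS_SCRIPT = {"ko": "Kore", "ja": "Jpan"}

def drop_default_script(tag):
    """Remove the script subtag when it equals the language's Suppress-Script."""
    parts = tag.split("-")
    lang = parts[0].lower()
    # A script subtag is exactly four letters; here it must follow the language.
    if len(parts) > 1 and len(parts[1]) == 4 and parts[1].isalpha():
        if SUPPRESS_SCRIPT.get(lang) == parts[1].title():
            parts = [parts[0]] + parts[2:]
    return "-".join(parts)

print(drop_default_script("ko-Kore"))
print(drop_default_script("ko-Hani"))
print(drop_default_script("ja-Jpan-JP"))
```

Under these assumptions [ko-Kore] collapses to [ko], while [ko-Hani] and [ko-Hang] are left alone because they carry information that [ko] does not.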

If you wanted to express a preference for Hanja, you should use [Hani], i.e. [ko-Hani], so that all words that have a Hanja transcription are written with it and not in Hangul [Hang]. You could likewise do the same thing in Japanese with [ja-Hani] to restrict the use of kana.

Verdy p (talk)10:52, 13 October 2020
Edited by author.
Last edit: 14:17, 13 October 2020

For once, I agree with Verdy p: the correct code for the Hanja script (and only the Hanja script) is ko-Hani.

EDIT: However, I see on Hanja wikis and websites that text is written in a mix of Hangul and Hanja scripts (this is also the case with the word given as an example on meta). Maybe we should move Hangul-only translations to ko-Hang and Hanja translations to ko?

Thibaut (talk)13:14, 13 October 2020

And it's strange that this wiki actually uses all Korean scripts (including Hangul) everywhere in its content (I'm not speaking about the interface of the wiki itself)... So this wiki actually uses [ko-Kore], even if they want to remove Hangul (as much as possible) in favor of Hanja. I'm not really sure this is possible in modern use, unless you accept borrowing Chinese sinograms, most probably in their simplified forms for modern terminology, but possibly in traditional forms for coherence with Hanja. I'm not sure that users will be able to read these borrowed Han sinograms, since they are also composed with phonetic components that are appropriate to Mandarin, but most probably not to Korean.

Do you want to be inventive and create ad hoc Hanjas, composed of a traditional Hanja radical and a Hangul part to replace the phonetic parts of Han sinograms? You won't be able to compose them in Unicode, so you'll fall back to using Hangul only (or maybe Latin or Cyrillic). Hangul is more complete than the basic alphabet and has many ad hoc jamos (plus a few diacritics) to compose more complex and more precise clusters that respect the phonetics better than modern Hangul, which has simplified the phonology by reducing the number of jamos (the other ones being kept for historic use and encoded in Unicode, even if they cannot be precomposed into a full syllable using a single Unicode character).

Note that Unicode contains a few clusters that don't respect the pure (L*V*T*) Hangul composition of jamos, just like Japanese has some precomposed clusters of kana (possibly mixed with Latin, e.g. for international measurement units), and some ad hoc Hanjas specific to Korean (not used in Chinese languages, Japanese, or historic Vietnamese).

Verdy p (talk)14:13, 13 October 2020

I don't think we should move Korean translations made in Hangul only to [ko-Hang]. This is the normal modern form of Korean [ko-Kore].

The use of [ko-Hang] would be ONLY to propose a transliteration to Hangul of Hanjas still normally used in modern Korean.

The interest of [ko-Hang] would be to allow a site to display Hangul easily without needing a large Han font, which may be useful for small devices that have limited fonts.

Note also that [ko-Jamo] is not needed, given that encoded Hangul clusters are canonically decomposable to jamos (so rendering on low-resolution terminals that can only display aligned, non-composed jamos is possible algorithmically). This also makes it easy to generate a romanisation fully algorithmically. Only Hanjas cause problems, as they require a lookup in a possibly large dictionary (which is not fully defined and is extended regularly in successive Unicode versions). In some cases, Hanjas do not have a well-defined phonology and may map to several distinct Hangul clusters, depending on the reader; that's probably why some Hanjas have been kept to avoid confusion, notably for proper names.
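The canonical decomposition mentioned above can be sketched in Python with the standard `unicodedata` module: NFD normalization splits each precomposed Hangul syllable into its conjoining jamos, and NFC recomposes them. The helper name `to_jamos` and the sample word are only illustrations.

```python
import unicodedata

def to_jamos(text):
    """Decompose precomposed Hangul syllables into conjoining jamos (NFD)."""
    return unicodedata.normalize("NFD", text)

word = "한글"
jamos = to_jamos(word)

# Show the code point of every jamo produced by the decomposition.
print([f"U+{ord(ch):04X}" for ch in jamos])

# Recomposing with NFC restores the original precomposed syllables.
print(unicodedata.normalize("NFC", jamos) == word)
```

No dictionary is involved here, which is the point made above: the Hangul part is purely algorithmic, while Hanja readings would need a separate lookup table.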

Verdy p (talk)14:29, 13 October 2020

Sorry, that's not how things work for Korean speakers. Nobody uses Hanja in their daily life in 2020. A proposal to mix Korean and Hanja in domain names met quite a lot of resistance in 2018 when it was submitted to ICANN.

Hanja usage is now basically limited to disambiguation and to Hanja enthusiasts, and most words are expressed in Hangul. No Hanja is required for any expression in Korean. We do borrow words from Hanja, but we display them in Hangul. Nope. If they want to use Hanja in the interface, it should be segregated into its own area, not placed where it takes over Hangul's area.

revi11:20, 15 October 2020

Ok then, ko-Hani it is.

Thibaut (talk)14:50, 15 October 2020

You have to notice that all Korean translations in translatewiki are in ko-Hang status, not ko-Kore status. Since the FRAMEWORK ACT ON KOREAN LANGUAGE was enacted in 2005, the official documents of all government bodies must be written in Hangul only (Article 14). Also, all newspapers became Hangul-only before 2000. So you may have thought that many Koreans like to use Hanja in their Korean writing, but it is going extinct.

If this decision goes to the Korean Wikimedia community, the community will not approve your idea. ko-Hang should be the original status of Korean (ko), and gukhanmun (Hangul-Hanja mixed script) should have the ko-Kore status.

Ellif (talk)10:54, 15 October 2020

"So, You have thought that many Koreans like to Hanja during their Korean writings, but it goes to extinct."

No, I didn’t think that; I know Hanja is rarely used nowadays. It was just a question, since ko and ko-Kore mean, according to the IETF, a mix of Hangul + Hanja, that’s all.

Thibaut (talk)15:00, 15 October 2020

Also, the Korean act applies only to official acts of Korean government bodies or agencies. It does not apply to independent projects, notably wikis, which are not required to use Hangul only, nor to the web in general. So "ko"="ko-Kore" in IETF effectively remains: only the Korean government now sticks to using only "ko-Hang", and only for its official documents, not for its informative documents or its communication to the national or international public. It still actively supports Hanjas in its cultural works and still works actively within the standardization bodies for Unicode/ISO 10646, notably in the IRG for the "ideographic" scripts (which would have been better named "sinographic", given that sinograms are not purely ideographic, and that there are other unrelated ideographic scripts that the IRG does NOT work on, such as "SignWriting" or various hieroglyphic scripts developed far outside Eastern Asia).

Korean people in Korean-written wikis make heavy use of Hanjas (the Korean Wikipedia has many of them which are not even transliterated to modern Hangul with an additional annotation). Nothing in fact prohibits the default Korean UI [ko] from using Hanjas, even if it's rare there (it's not forbidden at all; it is just a matter of community preferences, not decided by the South Korean government alone).

If there's a conflict between the South Korean government's goals and the community's goals, I would then prefer the addition of the [ko-Hang] locale in this wiki, so that overrides are possible for the few [ko] resources that may exhibit Hanjas (with fallback to [ko] = [ko-Kore]).

And I'm also not opposed to the addition of [ko-Hani] for the few users that still want to promote Hanjas as their preferred script (also with fallback to [ko] = [ko-Kore]).
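The fallback arrangement proposed above could look roughly like the following Python sketch. The message store, the message keys and the `FALLBACKS` table are all hypothetical illustrations, not actual translatewiki.net or MediaWiki configuration.

```python
# Hypothetical fallback table: script-specific locales fall back to plain [ko].
FALLBACKS = {"ko-Hang": "ko", "ko-Hani": "ko"}

# Hypothetical message store: only overrides need to exist in ko-Hang / ko-Hani.
MESSAGES = {
    "ko": {"search": "검색", "language": "언어"},
    "ko-Hani": {"search": "檢索"},
}

def lookup(locale, key):
    """Return the message for `key`, walking the fallback chain from `locale`."""
    while locale is not None:
        message = MESSAGES.get(locale, {}).get(key)
        if message is not None:
            return message
        locale = FALLBACKS.get(locale)  # None once the chain is exhausted
    return None

print(lookup("ko-Hani", "search"))    # override defined in ko-Hani
print(lookup("ko-Hani", "language"))  # falls back to ko
print(lookup("ko-Hang", "search"))    # no ko-Hang override, falls back to ko
```

With this kind of chain, [ko] stays untouched and the script-specific locales only have to carry the messages that actually differ.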

In other words, we don't need [ko-Kore] here, and [ko] can remain as it is. Note that language tagging is not a requirement to use only the indicated script; it just sets a preference order. So even in [ko]=[ko-Kore] it is possible to embed Latin, not just Hangul or Hanjas, and this use is in fact not rare at all, notably for famous trademarks, including "Samsung", or international brands and company names like "Google", "Apple", "Microsoft", "Facebook", "Amazon", "IBM", "Honda", "Mercedes-Benz", and so on, or international bodies in which Korean government bodies participate, like "ITU", "ISO", or "Unicode".

In summary:

  • Adding [ko-Kore]: no, use [ko] instead (no need for this duplicate according to BCP 47 rules and the IANA database), which is sufficient for Wikipedia and Wiktionary editions in Korean.
  • Forbidding the use of Hanjas in [ko]: no.
  • Adding [ko-Hang]: maybe yes, if there's a need for the official bodies of the South Korean government. We have to wait for such a demand for a wiki maintained by this government.
  • Adding [ko-Hani]: maybe yes, for cultural reasons and if there's an interested community to support it that prefers Hanjas in its wikis rather than modern Hangul.
  • Adding [ko-Brai]: may be needed for blind Korean users using the Braille script, if the automatic conversion from Hangul (or worse, from Hanja) is not easy to implement (and it cannot work as it does in Chinese, with extensive dictionaries containing normative romanizations sponsored by the PRC government, because those are based on the Standard Mandarin spoken language and not at all on the Standard Korean spoken language).
  • We have already accepted [ko-KP] for a different standard decided and used by North Korean official communication, but in fact it's not actively supported because of a lack of contributors and severe restrictions for local native users on the Internet. North Korean people that now reside in South Korea have just adopted the community rules used in South Korea and elsewhere worldwide.
Verdy p (talk)18:41, 16 October 2020

Well - I don't much care about anything to do with Hanja as long as no Hanja appears in [ko]. The Korean-speaking wiki community's expectation is that Korean is generally written in Hangul and that there is no need for Hanja injection - as established in w:ko:위키백과:사랑방 (일반)/2015년 제14주#한국어 위키에서의 한자 사용에 관한 의견 요청 (and w:ko:위키백과:사랑방/2012년 제40주#한자->한글 자동변환기능의 도입에 관하여 which also features wider community rejection of hanja use).

revi01:14, 19 October 2020
Edited by author.
Last edit: 20:07, 21 October 2020

You're wrong, wikis are not just first source.

They are not the South Korean government, which states that it no longer wants any Hanja characters in its own official/legal publications (it's their choice, but it only applies to these publications, and not to cultural publications, as the South Korean government still actively supports its cultural heritage, and the South Korean government also participates directly in the IRG, the Ideographic Rapporteur Group, as part of a joint working group between Unicode and ISO/TC for the standardisation of sinographic characters: it has approved many sinographic characters and even made additions specific to the Hanja subset, not used elsewhere).

As well this policy does not apply to all Korean users for their own private use and even in their communications and cooperative works.

Also, the wikis are not restricted to just South Korean users. And even if we have few users from North Korea, the South Korean government's requirements do not apply to them, and do not apply to other Korean users located elsewhere. We can't validly erase all Hanjas from Korean culture.

Under ISO and BCP 47, the inclusion of Hanjas in [ko] is perfectly valid, standard, and even approved by the South Korean government for general use, even if the South Korean government will not use them for its own official/legal publications. Some Korean users will also want to apply the same thing, but this is also their own private choice/preference and not a requirement. A single wiki may also decide the same thing, but this would require a community decision, which would apply only to this wiki.

The only way to "enforce" the restriction *against* Hanjas (what you want, or what the South Korean government wants for its official/legal publications, excluding its actively supported cultural work) is to provide specific translations for [ko-Hang]. But be aware that such enforcement is in fact difficult to maintain: even the South Korean government requires the use of other scripts, notably Latin, including in official documents (e.g. on passports and in many international treaties, trade/transport agencies, and many commercial contracts), as well as others (notably Kanji and kana for Japanese people legally in Korea, Traditional Han for Taiwanese people, Simplified Han for PRC Chinese people and Singaporeans, sometimes Cyrillic as well, or Brahmic and Arabic scripts used by religious people coming legally to Korea from abroad). If it's not possible to use only Hangul with [ko-Hang], it must be able to fall back to the more permissive [ko] code.

But there won't be any restriction for [ko]=[ko-Kore]=[ko-Hang]+[ko-Hani] (for example with important historic books in the Korean Wikisource); likewise, it's perfectly valid to request specific translations for [ko-Hani] to create a wiki that would prefer to use Hanjas as much as possible in preference to Hangul. And all these wikis must also remain open to North Koreans and Koreans living elsewhere.

Verdy p (talk)10:13, 19 October 2020

I never said anything about the Government; I don't care about the Government. I'm not sure why you continue to mention the Government. Please re-read my statements and stop attacking a strawman.

revi19:38, 21 October 2020

When you reply, please either supply a diff showing that User:Revi said anything about the GOVERNMENT, or make a fresh argument on the point that Korean Wikipedia rejected the hanja usage.

I did not read your message beyond the "firs the South Korean Govenment" because the rest is merely hitting the strawman.

revi19:49, 21 October 2020

I'm not "attacking" anyone like you are doing indirectly. And I read your message perfectly well (several times). Also, I've NEVER stated that Korean Wikipedia rejected the hanja usage (I said exactly the opposite!). Note that a part of my reply above was unexpectedly truncated in one sentence: "the fir ..." is where this occurred, just before I added the comment about the South Korean government's statement for its own works in its own administration.

The initial request above (by Ellif) was incorrectly argued. And it's a fact that Korean is more diverse than you may think. You are just arguing about some users' preferences, and these preferences are not a policy even in the supported projects (notably for Wikimedia, where there's freedom of choice, just like there's freedom of speech). I do not want to raise any edit wars, which the Korean communities must handle using their normal community decisions, local to each wiki (which does not apply here, because this is for many other projects as well, not just those of Wikimedia: there are tons of wikis translated here that do not belong to Wikimedia, even if they use MediaWiki, and each of them can apply its own local policy).

I mention the South Korean government because it is on topic with your statement and the links you provided, which have a very limited scope (and are not a policy decision) inside a talk page of the Korean Wikipedia, where there are also other valid arguments.

You also cited the intent of some Koreans campaigning against the inclusion of Hanja in Korean domain names (which is a separate, unrelated issue, and this opinion is not approved by ICANN, which cannot act on this, because the policy for names in the two Korean TLDs is decided by these Korean governments, and does not apply to gTLDs, which each have their own policy, as long as they match the IDNA framework, which allows for all scripts standardized in Unicode/ISO/IEC 10646 and with relevant policies in RFCs and the Unicode standard itself, approved by many governments and almost all web standardization bodies, including IANA for BCP 47, which is the most important standard, even more important than ISO 639, which still does not regulate at all the scripts or orthographies to be used).

Verdy p (talk)19:50, 21 October 2020

I cannot accept your opinions.

  • [ko-Hang] is the same as [ko], therefore ko should not be moved from its current status. If you do not agree, I ask you to look at the Rodong Sinmun. I want to ask you one more thing: does anyone teach ko-Kore to learners of Korean? As you can easily find out on Duolingo, no. Even North Koreans and Koreans in China, Russia, and Kazakhstan do not write in ko-Kore nowadays. Therefore,
  • Yes to [ko-Kore] for Korean with Hanja.
  • No to [ko-Hani] for Korean with Hanja, because modern Korean cannot be expressed with Han characters only (unless you want to resort to Gugyeol, which is not accepted by the Language Committee). If you are thinking about writing Chinese in the Korean way, you should rather contribute to the Classical Chinese Wikipedia.
Ellif (talk)08:59, 23 October 2020

You're alone... By IETF's definition in the IANA database for BCP47, [ko]=[ko-Kore]:

Type: language
Subtag: ko
Description: Korean
Added: 2005-10-16
Suppress-Script: Kore

[ko-Hang] is therefore a subset of [ko]=[ko-Kore]

And under the ISO 15924 standard, [ko-Kore] is perfectly defined as Korean in the [Kore] script mix, i.e. all scripts used for Korean, past or present, which include Hangul+Hanjas.

The problem you have is the mapping [ko]=[ko-Kore], while you want [ko-Hang]. But this is not what is in the IANA database and the BCP 47 standard (which is also THE standard for all web applications, including HTML, XML, CSS, SVG, and almost all programming languages, as well as almost all I18n libraries used in applications that are not restricted to just some ISO 639 part).

The situation is exactly the same for Korean as it is for Japanese [ja]=[ja-Jpan]: here also, modern Japanese can no longer be written only with Kanji. But this "only" is not relevant for saying that modern Japanese is [ja-Hrkt] (though it is possible to make some approximation to it and drop Kanji for some limited usages). Likewise, modern Korean is a mix which may be written using Hangul "only", but there's no such restriction enforced, except for some limited usages, like the South Korean government's official documents. Koreans can legally use Hanjas as they want, and they do (including the South Korean government, otherwise it would not even be an active member of the joint IRG working group for encoding sinographic scripts in Unicode/ISO/IEC 10646).

And Korean [ko] is not limited to just its current modern form, it's an umbrella for all variants including historic forms, and variants used in North Korea, and cultural variants that are still living today: Hanjas are not dead.

Verdy p (talk)16:42, 23 October 2020

So, if you have seen any of these ‘living articles’ in the Korean Wikimedia projects, please notify me; actually, none exist.

Ellif (talk)07:05, 29 April 2021