Entry | Unified Hangul Code |
Definition |
Name: Unified Hangul Code
Alias(es): Windows Code Page 949, IBM Code Page 1363
Image: Unified Hangul Code.svg (Layout of the Unified Hangul Code)
Language: Korean
Extends: EUC-KR
Standard: WHATWG Encoding Standard (as "EUC-KR")[1]
Classification: Extended ISO 646 (not in the strictest sense of the term, as ASCII bytes can appear as trail bytes, although this is limited to letter bytes), variable-width encoding, CJK encoding

Unified Hangul Code (UHC; Korean: 통합형 한글 코드, Tonghabhyeong Hangeul Kodeu)[1][2] or Extended Wansung (Korean: 확장 완성형, Hwagjang Wanseonghyeong),[2] also known under Microsoft Windows as Code Page 949 (Windows-949, MS949 or, ambiguously, CP949), is the Microsoft Windows code page for the Korean language. It extends Wansung Code (KS C 5601:1987, encoded as EUC-KR) to include all 11172 Hangul syllables present in Johab (KS C 5601:1992 annex 3).[2][3] This corresponds to the pre-composed syllables available in Unicode 2.0 and later.

Wansung Code has the drawback that it assigns codes only to the 2350 precomposed Hangul syllables which have their own KS X 1001 (KS C 5601) code points (out of 11172 in total, not counting those using obsolete jamo). All other syllables must be written as eight-byte composition sequences, which some partial implementations of the standard do not support.[4] UHC resolves this by assigning a single code to every syllable that can be constructed from modern jamo, placing the additional assignments outside the encoding space used for KS X 1001.

Terminology

Unified Hangul Code is not registered with IANA as a standard to communicate information over the Internet.[5] Alternatives include UTF-8. However, the W3C/WHATWG Encoding Standard used by HTML5 incorporates the Unified Hangul Code extensions into its definition of "EUC-KR".[6]

Microsoft assigns Windows-949 the label "ks_c_5601-1987",[7][8] which properly applies to KS X 1001 itself (KS C 5601 being the original name of KS X 1001).
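The difference in syllable coverage between EUC-KR and UHC, and the way labels resolve to one or the other, can be observed with a minimal Python sketch. It assumes the `euc_kr` and `cp949` codecs bundled with Python; the helper names `encoded_length` and `canonical_name` are hypothetical and exist only for this illustration. How the `euc_kr` codec treats a syllable outside KS X 1001 (rejecting it or emitting a composition sequence) is deliberately reported rather than assumed.

```python
import codecs


def encoded_length(text, codec):
    """Return the encoded size of text in bytes, or None if the codec rejects it."""
    try:
        return len(text.encode(codec))
    except UnicodeEncodeError:
        return None


def canonical_name(label):
    """Return the codec name Python resolves a label to, or None if unknown."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None


IN_KS_X_1001 = "\uac00"  # 가, one of the 2350 syllables with a KS X 1001 code
UHC_ONLY = "\ub620"      # 똠, a modern syllable with no KS X 1001 code

if __name__ == "__main__":
    # Compare how many bytes each codec needs for each syllable.
    for syllable in (IN_KS_X_1001, UHC_ONLY):
        for codec in ("euc_kr", "cp949"):
            print(repr(syllable), codec, encoded_length(syllable, codec))
    # Show which codec each label is resolved to by this Python installation.
    for label in ("uhc", "ms949", "ks_c_5601-1987"):
        print(label, "->", canonical_name(label))
```

Both codecs should produce identical bytes for 가, since UHC keeps the KS X 1001 assignments unchanged, while 똠 needs only two bytes under `cp949`.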
The WHATWG treat the label "ks_c_5601-1987" interchangeably with "EUC-KR" with the intent of being "compatible with deployed content".[9] The Unicode Consortium's "OBSOLETE/EASTASIA" collection of withdrawn mappings included mappings for Unified Hangul Code as "KSC5601.TXT", with the automatically derived mappings for 7-bit KS X 1001 being included as "KSX1001.TXT".[10]

IBM's code page 949 is another, otherwise unrelated, extension of EUC-KR. International Components for Unicode (ICU) uses "cp949", "949" or "ibm-949" to refer to that IBM code page,[11] and "ms949" or "windows-949" (or several variants of "ks_c_5601-1987") to refer to the Windows mapping of UHC.[15] Python, by contrast, recognises "cp949", "949", "ms949" and "uhc" as labels for UHC, and does not include an IBM-949 codec.[12] Out of the labels incorporating the code page number, the WHATWG recognise only "windows-949".[9]

IBM's code page for Unified Hangul Code is called Code page 1363 (IBM-1363), or "Korean MS-Win". It is a combination of Code page 1126 and Code page 1362.[13] It differs from the Windows mapping in having a single-byte mapping of 0x5C to the Won sign (U+20A9);[14] Windows maps 0x5C to U+005C (the Unicode code point for the backslash) as in ASCII,[15] although fonts often still render it as a Won sign.[16] The IBM mapping for UHC is available as "ibm-1363" in ICU.[14]

References

1. "한글 코드에 대하여" (in Korean). W3C. http://www.w3c.or.kr/i18n/hangul-i18n/ko-code.html
2. Zsigri, Gyula (2002-06-18). "KSC and UHC". http://zsigri.tripod.com/fontboard/cjk/ksc.html
3. "INFO: Hangul (Korean) Character Sets". Microsoft Support. Microsoft. https://support.microsoft.com/en-gb/help/170557/info-hangul-korean-character-sets
4. Shin, Jungshik. "What are KS X 1001(KS C 5601) and other Hangul codes?". Hangul & Internet in Korea FAQ. http://stason.org/TULARC/languages/korean/8-What-are-KS-X-1001-KS-C-5601-and-other-Hangul-codes.html
5. "Character Sets". Iana.org. Retrieved 2017-01-11. http://www.iana.org/assignments/character-sets
6. "5. Indexes (§ index EUC-KR)". Encoding Standard. WHATWG. https://encoding.spec.whatwg.org/#index-euc-kr
7. "Encoding.WindowsCodePage Property - .NET Framework (current version)". MSDN. Microsoft. https://msdn.microsoft.com/en-us/library/system.text.encoding.windowscodepage(v=vs.110).aspx
8. "Code Page Identifiers". Windows Dev Center. Microsoft. https://docs.microsoft.com/en-us/windows/desktop/intl/code-page-identifiers
9. "4.2. Names and labels". Encoding Standard. WHATWG. https://encoding.spec.whatwg.org/#names-and-labels
10. Shin, Jungshik. "KSX1001.TXT: KS X 1001 to Unicode table". Unicode, Inc. https://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/KSC/KSX1001.TXT
11. "ibm-949_P110-1999 (alias cp949)". Converter Explorer. International Components for Unicode. https://ssl.icu-project.org/icu-bin/convexp?conv=cp949
12. "codecs — Codec registry and base classes § Standard Encodings". Python 3.7.2 documentation. Python Software Foundation. https://docs.python.org/3.7/library/codecs.html#standard-encodings
13. "Coded character set identifiers - CCSID 1363". IBM Globalization. IBM. Archived from the original (https://www-01.ibm.com/software/globalization/ccsid/ccsid1363.html) on 2014-11-29. https://web.archive.org/web/20141129210404/http://www-01.ibm.com/software/globalization/ccsid/ccsid1363.html
14. "ibm-1363". Converter Explorer. International Components for Unicode. http://demo.icu-project.org/icu-bin/convexp?conv=ibm-1363
15. "windows-949-2000". Converter Explorer. International Components for Unicode. http://demo.icu-project.org/icu-bin/convexp?conv=windows-949-2000
16. Kaplan, Michael S. (2005-09-17). "When is a backslash not a backslash?". Sorting it all out. http://archives.miloush.net/michkap/archive/2005/09/17/469941.html
Categories: Character sets | Windows code pages | Encodings of Asian languages | Hangul |