Entry: Shift JIS
Definition

  1. Description

  2. Multiple versions

      Windows-932 / Windows-31J
      MacJapanese
      Other variants

  3. Shift JIS byte map

      As defined in JIS X 0208:1997
      With vendor or JIS X 0213 extensions

  4. See also

  5. References

  6. External links

{{short description|Japanese character encoding}}

{{infobox character encoding
| name = Shift_JIS-2004
| mime =
| alias = Shift_JISx0213
| standard = JIS X 0213
| lang = Japanese, Ainu, English, Russian
| status =
| extends = Shift_JIS (1997),
JIS X 0201 (8-bit)
| encodes = JIS X 0213
| prev = Shift_JIS (1997)
| next =
}}

The newer JIS X 0213 standard defines an extended variant of Shift_JIS referred to as Shift_JISx0213 (in a previous version of the standard) or Shift_JIS-2004. It is a superset of standard Shift JIS.[15]

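In order to represent the allocated rows on both planes of JIS X 0213, Shift_JIS-2004 maps code points to byte pairs as follows.[16]

<math>s_1 = \begin{cases} \left\lfloor \dfrac{k + 257}{2} \right\rfloor & \text{if } m = 1 \text{ and } 1 \le k \le 62 \\ \left\lfloor \dfrac{k + 385}{2} \right\rfloor & \text{if } m = 1 \text{ and } 63 \le k \le 94 \\ \left\lfloor \dfrac{k + 479}{2} \right\rfloor - 3 \left\lfloor \dfrac{k}{8} \right\rfloor & \text{if } m = 2 \text{ and } k \in \{1, 3, 4, 5, 8, 12, 13, 14, 15\} \\ \left\lfloor \dfrac{k + 411}{2} \right\rfloor & \text{if } m = 2 \text{ and } 78 \le k \le 94 \end{cases}</math>

<math>s_2 = \begin{cases} t + 63 & \text{if } k \text{ is odd and } 1 \le t \le 63 \\ t + 64 & \text{if } k \text{ is odd and } 64 \le t \le 94 \\ t + 158 & \text{if } k \text{ is even} \end{cases}</math>

In the above, <math>(s_1, s_2)</math> is a two-byte Shift_JIS-2004 sequence, <math>m</math> is the {{Nihongo|plane|面|men|surface}} number (1 or 2), <math>k</math> is the {{Nihongo|row|区|ku|ward}} number (1-94) and <math>t</math> is the {{Nihongo|cell|点|ten|point}} number (1-94). The ku and ten numbers are equivalent to <math>j_1 - 32</math> and <math>j_2 - 32</math> respectively, where <math>(j_1, j_2)</math> is a two-byte JIS sequence referencing a given plane.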
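The mapping from plane, row and cell numbers to a byte pair can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function name `sjis2004_bytes` is hypothetical, and the constants assume the commonly published Shift_JIS-2004 rules (decimal values, as in the cited source), in which plane 2 only allocates rows 1, 3, 4, 5, 8, 12 to 15 and 78 to 94.

```python
from typing import Tuple


def sjis2004_bytes(m: int, k: int, t: int) -> Tuple[int, int]:
    """Map (men, ku, ten) to a Shift_JIS-2004 byte pair (s1, s2).

    Hypothetical helper; constants are an assumption based on the
    commonly published Shift_JIS-2004 mapping rules.
    """
    if not 1 <= t <= 94:
        raise ValueError("ten must be in 1..94")

    # Lead byte: two rows share one lead byte, so k is halved.
    if m == 1 and 1 <= k <= 62:
        s1 = (k + 257) // 2              # 0x81..0x9F
    elif m == 1 and 63 <= k <= 94:
        s1 = (k + 385) // 2              # 0xE0..0xEF
    elif m == 2 and k in (1, 3, 4, 5, 8, 12, 13, 14, 15):
        s1 = (k + 479) // 2 - (k // 8) * 3   # 0xF0..0xF4
    elif m == 2 and 78 <= k <= 94:
        s1 = (k + 411) // 2              # 0xF4..0xFC
    else:
        raise ValueError("row is not allocated in Shift_JIS-2004")

    # Trail byte: odd rows use the lower half (skipping 0x7F),
    # even rows use the upper half.
    if k % 2 == 1:
        s2 = t + 63 if t <= 63 else t + 64
    else:
        s2 = t + 158
    return s1, s2


if __name__ == "__main__":
    s1, s2 = sjis2004_bytes(1, 1, 1)
    print(hex(s1), hex(s2))  # 0x81 0x40
```

Note that plane 2 rows 1 and 8 both yield the lead byte 0xF0; they are distinguished by the trail byte range, since row 1 is odd and row 8 is even.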

The same set of characters can be represented by EUC-JIS-2004, the EUC-JP based counterpart.

Some of the additions collide with popular Shift JIS extensions, including Windows code page 932, which is used in web standards (see above). For example, compare plane 1 row 89 in JIS X 0213 (beginning 硃, 硎, 硏…)[17] to row 89 in the JIS X 0208 variant defined in web standards (beginning 纊, 褜, 鍈…).[18] In addition, some of the characters map to Unicode characters beyond the BMP.
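The row 89 collision can be observed by decoding the same bytes with two codecs. The sketch below assumes Python's standard library codecs named `shift_jis_2004` and `cp932`, and assumes that row 89, cell 1 corresponds to the byte pair 0xED 0x40; the helper name `decode_both` is hypothetical.

```python
def decode_both(raw: bytes):
    """Decode the same bytes as Shift_JIS-2004 and as Windows code page 932."""
    return raw.decode("shift_jis_2004"), raw.decode("cp932")


if __name__ == "__main__":
    # Assumed byte pair for row 89, cell 1 (lead byte 0xED, trail byte 0x40).
    as_2004, as_cp932 = decode_both(b"\xed\x40")
    print(as_2004, as_cp932)
    print(as_2004 == as_cp932)  # the two interpretations differ
```

A decoder therefore has to know which variant produced the data; the bytes alone do not say.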

Other variants

The space with lead bytes 0xF5 to 0xF9 (beyond the region used for JIS X 0208) is used by Japanese mobile phone operators for pictographs for use in email.[19] KDDI goes further and defines hundreds more in the space with lead bytes 0xF3 and 0xF4.[20]

Beyond even this, numerous minor variations on Shift JIS exist, with individual characters altered here and there. Most of these extensions and variants have no IANA registration, so there is much scope for confusion if the extensions are used.

A further variant must be used when Shift JIS text is embedded in string literals in the source code of C and similar programming languages. Because 0x5C is the beginning of an escape sequence in those languages, this variant doubles the byte 0x5C if it appears as the second byte of a two-byte character, but not if it appears as a single-byte "¥" (ASCII: "\") character. This is best handled by a special editor that encodes Shift JIS in this way.
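The doubling rule can be illustrated with a short Python sketch. The function names are hypothetical, and the lead byte ranges (0x81 to 0x9F and 0xE0 to 0xFC) are an assumption chosen to cover standard Shift JIS plus common extensions; a real tool would follow the exact variant in use.

```python
def is_lead_byte(b: int) -> bool:
    """Assumed lead-byte ranges for Shift JIS and its common extensions."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def double_trail_backslashes(data: bytes) -> bytes:
    """Double 0x5C only where it is the trail byte of a two-byte character."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if is_lead_byte(b) and i + 1 < len(data):
            trail = data[i + 1]
            out += bytes((b, trail))
            if trail == 0x5C:
                out.append(0x5C)  # extra byte so the compiler's unescaping restores the pair
            i += 2
        else:
            # Single-byte character (including a lone 0x5C): copy unchanged.
            out.append(b)
            i += 1
    return bytes(out)


if __name__ == "__main__":
    # The katakana "ソ" is 0x83 0x5C in Shift JIS.
    print(double_trail_backslashes(b"\x83\x5c"))  # b'\x83\\\\'
    print(double_trail_backslashes(b"a\\n"))      # b'a\\n'
```

A single-byte 0x5C is left alone so that intended escape sequences such as `\n` keep working.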

Shift JIS byte map

As defined in JIS X 0208:1997

The chart below gives the detailed meaning of each byte in a stream encoded in standard Shift JIS (conforming to JIS X 0208:1997).

{{Shift-JIS byte map}}

With vendor or JIS X 0213 extensions

Some of the bytes which are not used for single-byte codes or initial bytes in JIS X 0208:1997 are used by certain extensions, resulting in the layout detailed in the chart below.

{{Shift-JIS byte map extended}}

See also

  • Japanese language and computers
  • Mojibake
  • Shift JIS art
  • Microsoft code page 932

References

1. ^ {{infobox character encoding
| name = Shift JIS
| mime = Shift_JIS
| alias =
| standard = JIS X 0208:1997 Appendix 1
| lang = Primarily Japanese, but also supporting English, Russian
| status =
| prev =
| extends = JIS X 0201 8-bit format.
| encodes = JIS X 0208
| next = Shift_JIS-2004 (JIS)
Windows-31J (web)
| classification = Extended ISO 646,{{efn|Not in the strictest sense of the term, as ASCII bytes can appear as trail bytes.}} Variable-width encoding, CJK encoding
| extra =
{{notelist}}
}}
Shift JIS (Shift Japanese Industrial Standards, also SJIS, MIME name Shift_JIS) is a character encoding for the Japanese language, originally developed by a Japanese company called ASCII Corporation in conjunction with Microsoft and standardized as JIS X 0208 Appendix 1. 0.4% of all web pages used Shift JIS in September 2018, a decline from 1.3% in July 2014. https://w3techs.com/technologies/history_overview/character_encoding
2. ^ j1 and j2 are each in the range 33 (0x21) to 126 (0x7E) inclusive, i.e. 7-bit character values excluding the control characters 0–31 (0x00–0x1F) and 127 (0x7F), and the space character 32 (0x20).
3. ^{{cite web | url=https://www.iana.org/assignments/character-sets/character-sets.xhtml | publisher=IANA | title=Character Sets}}
4. ^{{cite web|url=https://msdn.microsoft.com/en-us/library/system.text.encoding.windowscodepage(v=vs.110).aspx |title=Encoding.WindowsCodePage Property - .NET Framework (current version) |work=MSDN |publisher=Microsoft}}
5. ^{{cite web |url=https://docs.microsoft.com/en-us/windows/desktop/intl/code-page-identifiers |title=Code Page Identifiers |publisher=Microsoft |work=Windows Dev Center}}
6. ^{{cite web | url=https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.nlsgdrf/ibm-943_ibm-932.htm | title=IBM-943 and IBM-932 | publisher=IBM | work=IBM Knowledge Center}}
7. ^{{cite web | url=https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP932.TXT | title=CP932.TXT | publisher=Unicode Consortium}}
8. ^{{cite web | url=http://www.opengroup.or.jp:80/jvc/cde/ucs-conv-e.html#ch3_1_1 | title=3.1.1 Details of Problems | publisher=The Open Group Japan | work=Problems and Solutions for Unicode and User/Vendor Defined Characters | archiveurl=https://web.archive.org/web/19990203115405/http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html#ch3_1_1 | archivedate=1999-02-03 | dead-url=yes | df= }}
9. ^{{cite web | title=When is a backslash not a backslash? | date=2005-09-17 | author=Kaplan, Michael S. | url=http://archives.miloush.net/michkap/archive/2005/09/17/469941.html}}
10. ^{{cite web | url=http://archives.miloush.net/michkap/archive/2007/05/26/2901371.html | title=The PUA outside of Unicode | author=Kaplan, Michael S | work=Sorting it all out | date=2007-05-26}}
11. ^{{cite web | url=https://encoding.spec.whatwg.org/#index-jis0208 | title=5. Indexes (§ Index jis0208) | publisher=WHATWG | work=Encoding Standard}}
12. ^{{cite web | url=https://encoding.spec.whatwg.org/#names-and-labels | title=4.2. Names and labels | publisher=WHATWG | work=Encoding Standard}}
13. ^{{cite web | url=ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/JAPANESE.TXT | title=JAPANESE.TXT: Map (external version) from Mac OS Japanese encoding to Unicode 2.1 and later. | publisher=Apple Computer, Inc.; Unicode Consortium}}
14. ^{{cite web | url=https://developer.apple.com/documentation/coreservices/1399915-encoding_variants_for_macjapanes?language=objc | title=Encoding Variants for MacJapanese | publisher=Apple | work=Apple Developer Documentation}}
15. ^{{cite web | url=http://x0213.org/codetable/index.en.html | title=JIS X 0213 Code Mapping Tables | publisher=x0213.org}}
16. ^{{cite web | url=http://www.asahi-net.or.jp/~wq6k-yn/code/enc-x0213.html#sjis-2004 | title=JIS X 0213の代表的な符号化方式 § Shift_JIS-2004 | language=ja}} Hexadecimal numbers in the source have been converted to decimal for display.
17. ^{{cite web | url=https://www.itscj.ipsj.or.jp/iso-ir/233.pdf | title=233: Japanese Graphic Character Set for Information Interchange, Plane 1 | publisher=IPSJ}}
18. ^{{cite web | url=https://encoding.spec.whatwg.org/jis0208.html | title=Index jis0208 visualization | publisher=WHATWG | work=Encoding Standard}}
19. ^{{cite web | url=https://www.fileformat.info/info/emoji/docomo.htm | title=Original Emoji from DoCoMo | publisher=FileFormat.info}}
20. ^{{cite web | url=https://www.fileformat.info/info/emoji/kddi.htm | title=Original Emoji from KDDI | publisher=FileFormat.info}}

External links

  • Shift-JIS Kanji Table{{snd}} a table of the non-ASCII part of the codeset
  • {{cite web |url=http://www.microsoft.com/globaldev/reference/dbcs/932.htm |title=Windows Codepage 932 |date=May 1, 2005 |archiveurl=https://web.archive.org/web/20080307021230/http://www.microsoft.com/GLOBALDEV/Reference/dbcs/932.mspx |archivedate=2008-03-07 |website=Microsoft |deadurl=yes |df= }}{{snd}} Microsoft's definition
  • Forms of Shift-JIS in ICU (International Components for Unicode)
    • ibm-942 (sjis78)
    • ibm-943 (contains the \u00A5 ↔ \x5C mapping)
    • Shift JIS (contains the \u005C ↔ \x5C mapping)

1 : Encodings of Japanese

Copyright © 2023 OENC.NET All Rights Reserved
京ICP备2021023879号 Last updated: 2024/9/22 3:39:53