ISO/IEC 6937

Single byte characters
Two byte characters
Codepage layout
See also
References
External links

ISO/IEC 6937:2001, Information technology - Coded graphic character set for text communication - Latin alphabet, is a multibyte extension of ASCII, or more precisely of ISO/IEC 646-IRV. It was developed jointly with ITU-T (then CCITT) for telematic services under the name T.51, and first became an ISO standard in 1983. Certain byte codes are used as lead bytes for letters with diacritics (accents). The value of the lead byte often indicates which diacritic the letter carries, and the follow byte then holds the ASCII value of the letter that the diacritic is on. Only certain combinations of lead byte and follow byte are allowed, and for some follow bytes there are exceptions to the usual interpretation of the lead byte. ISO/IEC 6937 does not encode any combining characters at all. Some free-standing diacritics can nevertheless be represented, often by using the code for ASCII space as the follow byte.

ISO/IEC 6937's architects were Hugh McGregor Ross, Peter Fenwick, Bernard Marti and Loek Zeckendorf.

ISO 6937/2 defines 327 characters found in modern European languages that use the Latin alphabet. Non-Latin European characters, such as Cyrillic and Greek, are not included in the standard. Some diacritics used with the Latin alphabet, such as the Romanian comma below, are also not included; the cedilla is used instead, because no distinction between cedilla and comma below was made at the time.

IANA has registered the charset names ISO_6937-2-25 and ISO_6937-2-add for two (older) versions of this standard (plus control codes). But in practice this character encoding is unused on the Internet.

The ISO/IEC 2022 escape sequence to specify the right-hand side of the ISO/IEC 6937 character set is ESC - R (hex 1B 2D 52) [1].
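As a small illustration, the three bytes of that escape sequence can be written out and searched for in a byte stream. This is only a sketch: the names `DESIGNATE_6937_RIGHT` and `find_designation` are hypothetical, and the function does nothing beyond locating the sequence.

```python
# Bytes of the escape sequence quoted above: ESC (0x1B), "-" (0x2D), "R" (0x52).
DESIGNATE_6937_RIGHT = bytes([0x1B, 0x2D, 0x52])


def find_designation(stream: bytes) -> int:
    """Return the offset of the first occurrence of the sequence, or -1 if absent."""
    return stream.find(DESIGNATE_6937_RIGHT)
```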

Single byte characters

The primary set of ISO 6937/2 is based on ISO 646-IRV (characters 0x00..0x7F) as it stood before the 1991 revision, that is, with character 0x24 still denoted as an "international currency sign" (¤) instead of the dollar sign ($):

	!"#¤%&'()*+,-./0123456789:;<=>?@	ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`	abcdefghijklmnopqrstuvwxyz{|}

The supplementary set (characters 0x80..0xFF) contains a selection of spacing and non-spacing graphic characters, additional symbols and some locations reserved for future standardisation.

Two byte characters

Characters that are not represented in the primary set are coded in two bytes. The first byte, the "non-spacing diacritical mark", is followed by a letter from the primary set, e.g.:

small e with acute accent (é) = [Acute]+e

In total 13 diacritical marks can be followed by the selected characters from the primary set:

Accent	Lead byte	Follow letters	Result
Grave	0xC1	AEIOUaeiou	ÀÈÌÒÙàèìòù
Acute	0xC2	ACEILNORSUYZacegilnorsuyz	ÁĆÉÍĹŃÓŔŚÚÝŹáćéģíĺńóŕśúýź
Circumflex	0xC3	ACEGHIJOSUWYaceghijosuwy	ÂĈÊĜĤÎĴÔŜÛŴŶâĉêĝĥîĵôŝûŵŷ
Tilde	0xC4	AINOUainou	ÃĨÑÕŨãĩñõũ
Macron	0xC5	AEIOUaeiou	ĀĒĪŌŪāēīōū
Breve	0xC6	AGUagu	ĂĞŬăğŭ
Dot	0xC7	CEGIZcegz	ĊĖĠİŻċėġż
Umlaut or diæresis	0xC8	AEIOUYaeiouy	ÄËÏÖÜŸäëïöüÿ
Ring	0xCA	AUau	ÅŮåů
Cedilla	0xCB	CGKLNRSTcklnrst	ÇĢĶĻŅŖŞŢçķļņŗşţ
Double acute	0xCD	OUou	ŐŰőű
Ogonek	0xCE	AEIUaeiu	ĄĘĮŲąęįų
Caron	0xCF	CDELNRSTZcdelnrstz	ČĎĚĽŇŘŠŤŽčďěľňřšťž
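
The table above can be turned into a lookup keyed by (lead byte, follow byte). The following Python sketch does that and decodes a byte string; it is an illustration, not a complete ISO/IEC 6937 decoder. The names `ROWS`, `COMPOSED` and `decode_letters` are hypothetical, bytes below 0x80 are simply treated as ASCII, and the single-byte supplementary set and free-standing diacritics are deliberately left out.

```python
import unicodedata

# Rows copied from the table above:
# lead byte -> (allowed follow letters, resulting characters in the same order).
ROWS = {
    0xC1: ("AEIOUaeiou", "ÀÈÌÒÙàèìòù"),
    0xC2: ("ACEILNORSUYZacegilnorsuyz", "ÁĆÉÍĹŃÓŔŚÚÝŹáćéģíĺńóŕśúýź"),
    0xC3: ("ACEGHIJOSUWYaceghijosuwy", "ÂĈÊĜĤÎĴÔŜÛŴŶâĉêĝĥîĵôŝûŵŷ"),
    0xC4: ("AINOUainou", "ÃĨÑÕŨãĩñõũ"),
    0xC5: ("AEIOUaeiou", "ĀĒĪŌŪāēīōū"),
    0xC6: ("AGUagu", "ĂĞŬăğŭ"),
    0xC7: ("CEGIZcegz", "ĊĖĠİŻċėġż"),
    0xC8: ("AEIOUYaeiouy", "ÄËÏÖÜŸäëïöüÿ"),
    0xCA: ("AUau", "ÅŮåů"),
    0xCB: ("CGKLNRSTcklnrst", "ÇĢĶĻŅŖŞŢçķļņŗşţ"),
    0xCD: ("OUou", "ŐŰőű"),
    0xCE: ("AEIUaeiu", "ĄĘĮŲąęįų"),
    0xCF: ("CDELNRSTZcdelnrstz", "ČĎĚĽŇŘŠŤŽčďěľňřšťž"),
}

# Flatten the rows into a lookup keyed by (lead byte, follow byte).
COMPOSED = {}
for lead, (letters, results) in ROWS.items():
    # NFC guards against the results having been pasted in decomposed form.
    results = unicodedata.normalize("NFC", results)
    assert len(letters) == len(results), hex(lead)
    for letter, result in zip(letters, results):
        COMPOSED[(lead, ord(letter))] = result


def decode_letters(data: bytes) -> str:
    """Decode ASCII bytes and two-byte accented letters; reject everything else."""
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            # Primary set, treated as plain ASCII in this sketch.
            out.append(chr(byte))
            i += 1
        elif byte in ROWS:
            if i + 1 >= len(data):
                raise ValueError("lead byte at end of input")
            pair = (byte, data[i + 1])
            if pair not in COMPOSED:
                raise ValueError(
                    f"combination not in table: {byte:#04x} {data[i + 1]:#04x}"
                )
            out.append(COMPOSED[pair])
            i += 2
        else:
            raise ValueError(f"byte {byte:#04x} is outside this sketch")
    return "".join(out)
```

For example, `decode_letters(b"caf\xc2e")` yields "café", while a pair that is not listed in the table, such as 0xC1 followed by "x", raises `ValueError`. Using an explicit table rather than composing base letter and accent on the fly keeps the decoder limited to the combinations that the table actually allows.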

Codepage layout

The references to combining characters in the range U+0300 to U+036F for the codes 0xC1 to 0xCF below only indicate which "accent" is usually intended by that lead byte. ISO/IEC 6937 does not encode any combining characters whatsoever. Instead, there is an explicit list of precomposed characters that are encoded.

A small anomaly is that Latin small letter g with cedilla is coded as if it had an acute accent, that is, with a 0xC2 lead byte. Because the descender of the lowercase g interferes with a cedilla, the lowercase letter is usually written with a turned comma above: Ģ ģ.
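
The anomaly can be made concrete with Python's standard `unicodedata` module. The sketch below maps each lead byte to the combining mark it usually indicates (taken from row C of the layout) and composes it with the follow letter via NFC normalisation. The names `LEAD_TO_MARK`, `naive_compose` and `compose` are hypothetical, and the functions do not check that a pair belongs to the standard's explicit list of precomposed characters.

```python
import unicodedata

# Lead byte -> combining mark that it usually indicates (row C of the layout).
LEAD_TO_MARK = {
    0xC1: "\u0300", 0xC2: "\u0301", 0xC3: "\u0302", 0xC4: "\u0303",
    0xC5: "\u0304", 0xC6: "\u0306", 0xC7: "\u0307", 0xC8: "\u0308",
    0xCA: "\u030A", 0xCB: "\u0327", 0xCD: "\u030B", 0xCE: "\u0328",
    0xCF: "\u030C",
}


def naive_compose(lead: int, follow: int) -> str:
    """Compose the letter with the indicated mark; ignores the explicit repertoire."""
    return unicodedata.normalize("NFC", chr(follow) + LEAD_TO_MARK[lead])


def compose(lead: int, follow: int) -> str:
    """Same as naive_compose, but with the g-cedilla anomaly handled."""
    # 0xC2 followed by "g" stands for g with cedilla (U+0123), not g with acute.
    if (lead, follow) == (0xC2, ord("g")):
        return "\u0123"
    return naive_compose(lead, follow)
```

Here `naive_compose(0xC2, ord("g"))` gives ǵ (U+01F5, g with acute), whereas `compose` returns ģ (U+0123). Unicode itself decomposes U+0123 into g plus the combining cedilla U+0327, which is why a purely mark-based mapping gets this one letter wrong.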

Unicode splits 0xE2 into two distinct characters, capital D with stroke (Đ, U+0110) and capital Eth (Ð, U+00D0). Their lowercase forms, đ (0xF2) and ð (0xF3), usually look different and have separate codes.

	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
0
1
2	SP	!	"	#	$	%	&	'	(	)	*	+	,	-	.	/
3	0	1	2	3	4	5	6	7	8	9	:	;	<	=	>	?
4	@	A	B	C	D	E	F	G	H	I	J	K	L	M	N	O
5	P	Q	R	S	T	U	V	W	X	Y	Z	[	\	]	^	_
6	`	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o
7	p	q	r	s	t	u	v	w	x	y	z	{	|	}	~
8
9
A	NBSP U+00A0	¡ U+00A1	¢ U+00A2	£ U+00A3		¥ U+00A5		§ U+00A7	¤ U+00A4	‘ U+2018	“ U+201C	« U+00AB	← U+2190	↑ U+2191	→ U+2192	↓ U+2193
B	° U+00B0	± U+00B1	² U+00B2	³ U+00B3	× U+00D7	µ U+00B5	¶ U+00B6	· U+00B7	÷ U+00F7	’ U+2019	” U+201D	» U+00BB	¼ U+00BC	½ U+00BD	¾ U+00BE	¿ U+00BF
C		U+0300	U+0301	U+0302	U+0303	U+0304	U+0306	U+0307	U+0308		U+030A	U+0327		U+030B	U+0328	U+030C
D	― U+2015	¹ U+00B9	® U+00AE	© U+00A9	™ U+2122	♪ U+266A	¬ U+00AC	¦ U+00A6					⅛ U+215B	⅜ U+215C	⅝ U+215D	⅞ U+215E
E	Ω U+2126	Æ U+00C6	Đ/Ð U+0110/U+00D0	ª U+00AA	Ħ U+0126		Ĳ U+0132	Ŀ U+013F	Ł U+0141	Ø U+00D8	Œ U+0152	º U+00BA	Þ U+00DE	Ŧ U+0166	Ŋ U+014A	ŉ U+0149
F	ĸ U+0138	æ U+00E6	đ U+0111	ð U+00F0	ħ U+0127	ı U+0131	ĳ U+0133	ŀ U+0140	ł U+0142	ø U+00F8	œ U+0153	ß U+00DF	þ U+00FE	ŧ U+0167	ŋ U+014B	SHY U+00AD

References

1. Supplementary Set of ISO/IEC 6937:1992, https://www.itscj.ipsj.or.jp/iso-ir/156.pdf. This is the right-hand (high) half of the character set; the left-hand half is U.S. ASCII, https://www.itscj.ipsj.or.jp/iso-ir/006.pdf.

External links

ISO pages: ISO 6937-1:1983 (https://www.iso.org/standard/13466.html), ISO 6937-2:1983 (https://www.iso.org/standard/13467.html), ISO 6937-2:1983/Add 1:1989 (https://www.iso.org/standard/13468.html), ISO/IEC 6937:1994 (https://www.iso.org/standard/13465.html), ISO/IEC 6937:2001 (https://www.iso.org/standard/31393.html)
WD 6937, Coded graphic character set for text communication - Latin alphabet (Revision of ISO/IEC 6937:1994) (ISO/IEC 6937:1994 draft)
