Entry: Windows-1252
Definition

  1. Details

  2. Character set

     History 

  3. See also

  4. References

  5. Further reading

  6. External links

{{About|the character encoding commonly mislabeled as "ANSI"|the actual ANSI character encoding|ASCII}}{{Refimprove|date=December 2014}}{{Infobox character encoding
| name = Windows-1252
| mime = windows-1252
| image = Windows-1252-infobox.svg
| caption =
| alias =
| by = Microsoft
| standard = WHATWG Encoding Standard
| lang = English, various others
| status =
| extends = ISO 8859-1 (excluding C1 controls)
| prev =
| next =
| encodes = ISO 8859-15
| classification = extended ASCII, Windows-125x
}}

Windows-1252 or CP-1252 (code page 1252) is a single-byte character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows for English and some other Western languages (other languages use different default encodings).

It is probably the most-used 8-bit character encoding in the world. {{As of|2019|03}}, 0.6% of all web sites declared use of Windows-1252,[1][2] but at the same time 3.4% used ISO 8859-1,[1] which by HTML5 standards should be considered the same encoding,[3] so that 4.0% of web sites effectively used Windows-1252. In addition, most web browsers will correctly render it if encountered in text that claims to be UTF-8, so its actual usage may be higher.

Details

This character encoding is a superset of ISO 8859-1 in terms of printable characters, but differs from the IANA's ISO-8859-1 by using displayable characters rather than control characters in the 80 to 9F (hex) range. Notable additional characters include curly quotation marks and all the printable characters that are in ISO 8859-15 (though at different positions than in ISO 8859-15). It is known to Windows by the code page number 1252, and by the IANA-approved name "windows-1252".
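The difference in the 80 to 9F range can be illustrated with a short Python sketch. It assumes that the standard-library codecs `cp1252`, `latin-1`, and `iso8859_15` are acceptable stand-ins for Windows-1252, ISO 8859-1, and ISO 8859-15; the sample bytes are chosen purely for illustration.

```python
# Bytes 0x93 and 0x94 are curly double quotes in Windows-1252,
# but C1 control characters in ISO 8859-1.
raw = b"\x93quoted\x94"

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("latin-1")

print(repr(as_cp1252))  # curly quotes U+201C and U+201D around the word
print(repr(as_latin1))  # control characters U+0093 and U+0094 instead

# The euro sign exists in both Windows-1252 and ISO 8859-15,
# but at different byte positions.
euro_cp1252 = "\u20ac".encode("cp1252")
euro_8859_15 = "\u20ac".encode("iso8859_15")
print(euro_cp1252.hex(), euro_8859_15.hex())
```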

It is very common to mislabel Windows-1252 text with the charset label ISO-8859-1. A common result was that all the quotes and apostrophes (produced by "smart quotes" in word-processing software) were replaced with question marks or boxes on non-Windows operating systems, making text difficult to read. Most modern web browsers and e-mail clients treat the media type charset ISO-8859-1 as Windows-1252 to accommodate such mislabeling. This is now standard behavior in the HTML5 specification, which requires that documents advertised as ISO-8859-1 actually be parsed with the Windows-1252 encoding.[3]
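The label handling described above can be sketched roughly as follows. This is a minimal illustration, not the algorithm from the specification: `WINDOWS_1252_LABELS` is a partial, hand-copied subset of the labels that the WHATWG Encoding Standard maps to windows-1252, and `resolve_label` and `decode_declared` are hypothetical helper names introduced here.

```python
# Partial subset of labels that the WHATWG Encoding Standard treats as
# windows-1252; the authoritative list is in the standard itself.
WINDOWS_1252_LABELS = {
    "ascii", "cp1252", "iso-8859-1", "iso8859-1", "iso_8859-1",
    "l1", "latin1", "us-ascii", "windows-1252", "x-cp1252",
}


def resolve_label(label: str) -> str:
    """Map a declared charset label to a Python codec name (hypothetical helper)."""
    normalized = label.strip().lower()
    if normalized in WINDOWS_1252_LABELS:
        return "cp1252"
    # Any other label is passed through unchanged in this sketch.
    return normalized


def decode_declared(data: bytes, declared_charset: str) -> str:
    """Decode bytes using the codec that the declared label resolves to."""
    return data.decode(resolve_label(declared_charset))


# Text with "smart quotes", saved as Windows-1252 but labeled ISO-8859-1.
body = "It\u2019s \u201csmart\u201d".encode("cp1252")
text = decode_declared(body, "ISO-8859-1")
print(text)
```

Decoding the same bytes strictly as ISO 8859-1 would instead yield C1 control characters where the quotes should be, which is the source of the question marks and boxes mentioned above.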

Historically, the phrase "ANSI Code Page" was used in Windows to refer to non-DOS encodings; the intention was that most of these would be ANSI standards such as ISO-8859-1. Even though Windows-1252 was the first and by far most popular code page named so in Microsoft Windows parlance, the code page has never been an ANSI standard. Microsoft explains, "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."[4]

In LaTeX packages, CP-1252 is referred to as "ansinew".

{{anchor|Code page layout}}

Character set

The following table shows Windows-1252. Each character is shown with its Unicode equivalent based on the Unicode.org mapping of Windows-1252 with "best fit".[5]

{{chset-table-header|Windows-1252 (CP1252)}}
0_
0
{{chset-ctrl|0000|NUL|0}}{{chset-ctrl|0001|SOH|1}}{{chset-ctrl|0002|STX|2}}{{chset-ctrl|0003|ETX|3}}{{chset-ctrl|0004|EOT|4}}{{chset-ctrl|0005|ENQ|5}}{{chset-ctrl|0006|ACK|6}}{{chset-ctrl|0007|BEL|7}}{{chset-ctrl|0008|BS|8}}{{chset-ctrl|0009|HT|9}}{{chset-ctrl|000A|LF|10}}{{chset-ctrl|000B|VT|11}}{{chset-ctrl|000C|FF|12}}{{chset-ctrl|000D|CR|13}}{{chset-ctrl|000E|SO|14}}{{chset-ctrl|000F|SI|15}}
1_
16
{{chset-ctrl|0010|DLE|16}}{{chset-ctrl|0011|DC1|17}}{{chset-ctrl|0012|DC2|18}}{{chset-ctrl|0013|DC3|19}}{{chset-ctrl|0014|DC4|20}}{{chset-ctrl|0015|NAK|21}}{{chset-ctrl|0016|SYN|22}}{{chset-ctrl|0017|ETB|23}}{{chset-ctrl|0018|CAN|24}}{{chset-ctrl|0019|EM|25}}{{chset-ctrl|001A|SUB|26}}{{chset-ctrl|001B|ESC|27}}{{chset-ctrl|001C|FS|28}}{{chset-ctrl|001D|GS|29}}{{chset-ctrl|001E|RS|30}}{{chset-ctrl|001F|US|31}}
2_
32
{{chset-ctrl|0020|SP|32}}{{chset-cell|0021|!|33}}{{chset-cell|0022|"|34}}{{chset-cell|0023|#|35}}{{chset-cell|0024|$|36}}{{chset-cell|0025|%|37}}{{chset-cell|0026|&|38}}{{chset-cell|0027|'|39}}{{chset-cell|0028|(|40}}{{chset-cell|0029|)|41}}{{chset-cell|002A|*|42}}{{chset-cell|002B|+|43}}{{chset-cell|002C|,|44}}{{chset-cell|002D|-|45}}{{chset-cell|002E|.|46}}{{chset-cell|002F|/|47}}
3_
48
{{chset-cell|0030|0|48}}{{chset-cell|0031|1|49}}{{chset-cell|0032|2|50}}{{chset-cell|0033|3|51}}{{chset-cell|0034|4|52}}{{chset-cell|0035|5|53}}{{chset-cell|0036|6|54}}{{chset-cell|0037|7|55}}{{chset-cell|0038|8|56}}{{chset-cell|0039|9|57}}{{chset-cell|003A|:|58}}{{chset-cell|003B|;|59}}{{chset-cell|003C|<|60}}{{chset-cell|003D|=|61}}{{chset-cell|003E|>|62}}{{chset-cell|003F|?|63}}
4_
64
{{chset-cell|0040|@|64}}{{chset-cell|0041|A|65}}{{chset-cell|0042|B|66}}{{chset-cell|0043|C|67}}{{chset-cell|0044|D|68}}{{chset-cell|0045|E|69}}{{chset-cell|0046|F|70}}{{chset-cell|0047|G|71}}{{chset-cell|0048|H|72}}{{chset-cell|0049|I|73}}{{chset-cell|004A|J|74}}{{chset-cell|004B|K|75}}{{chset-cell|004C|L|76}}{{chset-cell|004D|M|77}}{{chset-cell|004E|N|78}}{{chset-cell|004F|O|79}}
5_
80
{{chset-cell|0050|P|80}}{{chset-cell|0051|Q|81}}{{chset-cell|0052|R|82}}{{chset-cell|0053|S|83}}{{chset-cell|0054|T|84}}{{chset-cell|0055|U|85}}{{chset-cell|0056|V|86}}{{chset-cell|0057|W|87}}{{chset-cell|0058|X|88}}{{chset-cell|0059|Y|89}}{{chset-cell|005A|Z|90}}{{chset-cell|005B|[|91}}{{chset-cell|005C|\|92}}{{chset-cell|005D|]|93}}{{chset-cell|005E|^|94}}{{chset-cell|005F|_|95}}
6_
96
{{chset-cell|0060|`|96}}{{chset-cell|0061|a|97}}{{chset-cell|0062|b|98}}{{chset-cell|0063|c|99}}{{chset-cell|0064|d|100}}{{chset-cell|0065|e|101}}{{chset-cell|0066|f|102}}{{chset-cell|0067|g|103}}{{chset-cell|0068|h|104}}{{chset-cell|0069|i|105}}{{chset-cell|006A|j|106}}{{chset-cell|006B|k|107}}{{chset-cell|006C|l|108}}{{chset-cell|006D|m|109}}{{chset-cell|006E|n|110}}{{chset-cell|006F|o|111}}
7_
112
{{chset-cell|0070|p|112}}{{chset-cell|0071|q|113}}{{chset-cell|0072|r|114}}{{chset-cell|0073|s|115}}{{chset-cell|0074|t|116}}{{chset-cell|0075|u|117}}{{chset-cell|0076|v|118}}{{chset-cell|0077|w|119}}{{chset-cell|0078|x|120}}{{chset-cell|0079|y|121}}{{chset-cell|007A|z|122}}{{chset-cell|007B|{|123}}{{chset-cell|007C|||124}}{{chset-cell|007D|}|125}}{{chset-cell|007E|~|126}}{{chset-ctrl|007F|DEL|127}}
8_
128
{{chset-cell|20AC|€|128}} {{chset-cell|201A|‚|130}}{{chset-cell|0192|ƒ|131}}{{chset-cell|201E|„|132}}{{chset-cell|2026|…|133}}{{chset-cell|2020|†|134}}{{chset-cell|2021|‡|135}}{{chset-cell|02C6|ˆ|136}}{{chset-cell|2030|‰|137}}{{chset-cell|0160|Š|138}}{{chset-cell|2039|‹|139}}{{chset-cell|0152|Œ|140}} {{chset-cell|017D|Ž|142}} 
9_
144
 {{chset-cell|2018|‘|145}}{{chset-cell|2019|’|146}}{{chset-cell|201C|“|147}}{{chset-cell|201D|”|148}}{{chset-cell|2022|•|149}}{{chset-cell|2013|–|150}}{{chset-cell|2014|—|151}}{{chset-cell|02DC|˜ |152}}{{chset-cell|2122|™|153}}{{chset-cell|0161|š|154}}{{chset-cell|203A|›|155}}{{chset-cell|0153|œ|156}} {{chset-cell|017E|ž|158}}{{chset-cell|0178|Ÿ|159}}
A_
160
{{chset-ctrl|00A0|NBSP|160}}{{chset-cell|00A1|¡|161}}{{chset-cell|00A2|¢|162}}{{chset-cell|00A3|£|163}}{{chset-cell|00A4|¤|164}}{{chset-cell|00A5|¥|165}}{{chset-cell|00A6|¦|166}}{{chset-cell|00A7|§|167}}{{chset-cell|00A8|¨|168}}{{chset-cell|00A9|©|169}}{{chset-cell|00AA|ª|170}}{{chset-cell|00AB|«|171}}{{chset-cell|00AC|¬|172}}{{chset-ctrl|00AD|SHY|173}}{{chset-cell|00AE|®|174}}{{chset-cell|00AF|¯|175}}
B_
176
{{chset-cell|00B0|°|176}}{{chset-cell|00B1|±|177}}{{chset-cell|00B2|²|178}}{{chset-cell|00B3|³|179}}{{chset-cell|00B4|´|180}}{{chset-cell|00B5|µ|181}}{{chset-cell|00B6|¶|182}}{{chset-cell|00B7|·|183}}{{chset-cell|00B8|¸|184}}{{chset-cell|00B9|¹|185}}{{chset-cell|00BA|º|186}}{{chset-cell|00BB|»|187}}{{chset-cell|00BC|¼|188}}{{chset-cell|00BD|½|189}}{{chset-cell|00BE|¾|190}}{{chset-cell|00BF|¿|191}}
C_
192
{{chset-cell|00C0|À|192}}{{chset-cell|00C1|Á|193}}{{chset-cell|00C2|Â|194}}{{chset-cell|00C3|Ã|195}}{{chset-cell|00C4|Ä|196}}{{chset-cell|00C5|Å|197}}{{chset-cell|00C6|Æ|198}}{{chset-cell|00C7|Ç|199}}{{chset-cell|00C8|È|200}}{{chset-cell|00C9|É|201}}{{chset-cell|00CA|Ê|202}}{{chset-cell|00CB|Ë|203}}{{chset-cell|00CC|Ì|204}}{{chset-cell|00CD|Í|205}}{{chset-cell|00CE|Î|206}}{{chset-cell|00CF|Ï|207}}
D_
208
{{chset-cell|00D0|Ð|208}}{{chset-cell|00D1|Ñ|209}}{{chset-cell|00D2|Ò|210}}{{chset-cell|00D3|Ó|211}}{{chset-cell|00D4|Ô|212}}{{chset-cell|00D5|Õ|213}}{{chset-cell|00D6|Ö|214}}{{chset-cell|00D7|×|215}}{{chset-cell|00D8|Ø|216}}{{chset-cell|00D9|Ù|217}}{{chset-cell|00DA|Ú|218}}{{chset-cell|00DB|Û|219}}{{chset-cell|00DC|Ü|220}}{{chset-cell|00DD|Ý|221}}{{chset-cell|00DE|Þ|222}}{{chset-cell|00DF|ß|223}}
E_
224
{{chset-cell|00E0|à|224}}{{chset-cell|00E1|á|225}}{{chset-cell|00E2|â|226}}{{chset-cell|00E3|ã|227}}{{chset-cell|00E4|ä|228}}{{chset-cell|00E5|å|229}}{{chset-cell|00E6|æ|230}}{{chset-cell|00E7|ç|231}}{{chset-cell|00E8|è|232}}{{chset-cell|00E9|é|233}}{{chset-cell|00EA|ê|234}}{{chset-cell|00EB|ë|235}}{{chset-cell|00EC|ì|236}}{{chset-cell|00ED|í|237}}{{chset-cell|00EE|î|238}}{{chset-cell|00EF|ï|239}}
F_
240
{{chset-cell|00F0|ð|240}}{{chset-cell|00F1|ñ|241}}{{chset-cell|00F2|ò|242}}{{chset-cell|00F3|ó|243}}{{chset-cell|00F4|ô|244}}{{chset-cell|00F5|õ|245}}{{chset-cell|00F6|ö|246}}{{chset-cell|00F7|÷|247}}{{chset-cell|00F8|ø|248}}{{chset-cell|00F9|ù|249}}{{chset-cell|00FA|ú|250}}{{chset-cell|00FB|û|251}}{{chset-cell|00FC|ü|252}}{{chset-cell|00FD|ý|253}}{{chset-cell|00FE|þ|254}}{{chset-cell|00FF|ÿ|255}}
{{chset-legend}} {{Legend-inline|Transparent|border=medium solid gray|Differences from ISO-8859-1}}

According to the information on Microsoft's and the Unicode Consortium's websites, positions 81, 8D, 8F, 90, and 9D are unused; however, the Windows API [https://msdn.microsoft.com/en-us/library/windows/desktop/dd319072%28v=vs.85%29.aspx MultiByteToWideChar] maps these to the corresponding C1 control codes. The "best fit" mapping documents this behavior, too.[5]
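As a rough illustration of the two behaviors, Python's built-in `cp1252` codec treats these five positions as undefined and raises an error, whereas a decoder following the "best fit" behavior would pass them through as C1 controls. The sketch below emulates the latter in pure Python; it does not call the Windows API, and `decode_with_c1_fallback` is a hypothetical helper name.

```python
# Positions left unassigned in Windows-1252.
UNUSED_POSITIONS = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def decode_with_c1_fallback(data: bytes) -> str:
    """Decode Windows-1252, mapping unused bytes to same-numbered C1 controls."""
    chars = []
    for byte in data:
        if byte in UNUSED_POSITIONS:
            # e.g. byte 0x81 becomes U+0081
            chars.append(chr(byte))
        else:
            chars.append(bytes([byte]).decode("cp1252"))
    return "".join(chars)


# Python's strict codec rejects the unused positions.
try:
    b"\x81".decode("cp1252")
    strict_rejects = False
except UnicodeDecodeError:
    strict_rejects = True

decoded = decode_with_c1_fallback(b"\x80\x81\x8d\x8f\x90\x9d")
print(strict_rejects, [hex(ord(c)) for c in decoded])
```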

History

  • The first version of code page 1252, used in Microsoft Windows 1.0, did not have positions D7 and F7 defined. All positions in the range 80–9F were also undefined.
  • In the second version, used in Microsoft Windows 2.0, positions D7, F7, 91, and 92 were defined.
  • The third version, used since Microsoft Windows 3.1, had all the present-day positions defined except the euro sign and the Z with caron character pair (Ž and ž).
  • The final version, listed above, debuted in Microsoft Windows 98 and was ported to older versions of Windows with the euro symbol update.

See also

  • Western Latin character sets (computing)
  • Windows-1250

References

1. {{cite web|url=https://w3techs.com/technologies/history_overview/character_encoding|title=Historical trends in the usage of character encodings, February 2019|publisher=|accessdate=2019-02-18}}
2. {{cite web|url=https://w3techs.com/faq|title=Frequently Asked Questions|publisher=}}
3. {{cite web |url=https://encoding.spec.whatwg.org/#names-and-labels |title=Encoding |at=sec. 5.2 Names and labels |publisher=WHATWG |date=27 January 2015 |accessdate=4 February 2015 |archiveurl=https://web.archive.org/web/20150204174315/https://encoding.spec.whatwg.org/#names-and-labels |archivedate=4 February 2015 |dead-url=no}}
4. {{cite web |url=https://download.microsoft.com/download/5/6/8/56803da0-e4a0-4796-a62c-ca920b73bb17/21-Unicode_WinXP.pdf |title=Unicode and Windows XP |page=1 |last1=Wissink |first1=Cathy |publisher=Microsoft |date=5 April 2002 |accessdate=4 February 2015 |archiveurl=https://web.archive.org/web/20150204175931/http://download.microsoft.com/download/5/6/8/56803da0-e4a0-4796-a62c-ca920b73bb17/21-Unicode_WinXP.pdf |archivedate=4 February 2015 |dead-url=no}}
5. {{cite web |url=https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt |title=Unicode mappings of Windows-1252 with 'Best Fit' |publisher=Unicode |accessdate=4 February 2015 |archiveurl=https://web.archive.org/web/20150204175922/http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt |archivedate=4 February 2015 |dead-url=no}}

Further reading

  • {{cite web |title=Codepage 1004 - Windows Extended |publisher=IBM |date=2001 |url=http://www.borgendale.com/codepage/cp1004.gif |access-date=2018-05-13 |dead-url=no |archive-url=https://web.archive.org/web/20180513184106/http://www.borgendale.com/codepage/cp1004.gif |archive-date=2018-05-13}} (used by OS/2)

External links

  • [https://msdn.microsoft.com/en-us/library/cc195054.aspx Code Page 1252 Windows Latin 1 (ANSI){{snd}} Windows-1252 reference chart]
  • [https://www.iana.org/assignments/charset-reg/windows-1252 IANA Charset Name Registration]
  • [https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT Unicode mapping table for Windows-1252]
{{Character encoding}}

