Entry | Base64 |
Definition |
Base64 is a group of similar binary-to-text encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation. The term Base64 originates from a specific MIME content transfer encoding. Each Base64 digit represents exactly 6 bits of data. Three 8-bit bytes (24 bits in total) can therefore be represented by four 6-bit Base64 digits.

Design

The particular set of 64 characters chosen to represent the 64 place-values for the base varies between implementations. The general strategy is to choose 64 characters that are common to most encodings and that are also printable. This combination leaves the data unlikely to be modified in transit through information systems, such as email, that were traditionally not 8-bit clean.[1] For example, MIME's Base64 implementation uses A-Z, a-z, and 0-9 for the first 62 values; other variations share this property but differ in the symbols chosen for the last two values.

The earliest instances of this type of encoding were created for dialup communication between systems running the same OS (e.g., uuencode for UNIX, BinHex for the TRS-80, later adapted for the Macintosh) and could therefore make more assumptions about what characters were safe to use. For instance, uuencode uses uppercase letters, digits, and many punctuation characters, but no lowercase.[2][3][4][1]

Base64 table

The Base64 index table (standard alphabet, as used by MIME): values 0-25 map to A-Z, 26-51 to a-z, 52-61 to 0-9, 62 to "+", and 63 to "/". The "=" character is used for padding.
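For reference, the full 64-entry index table of the standard alphabet can be generated rather than typed out. The following is a minimal sketch, assuming the RFC 4648 standard alphabet; the names `ALPHABET` and `index_table` are illustrative and not part of any library.

```python
import string

# Standard Base64 alphabet (RFC 4648, section 4): A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def index_table(columns: int = 4) -> str:
    """Render the 64-entry table as text: index, 6-bit pattern, character."""
    per_column = len(ALPHABET) // columns
    rows = []
    for r in range(per_column):
        cells = []
        for c in range(columns):
            i = r + c * per_column
            cells.append(f"{i:2d} {i:06b} {ALPHABET[i]}")
        rows.append("   ".join(cells))
    return "\n".join(rows)


print(index_table())
```

Each row lists four entries, so the table reads down the columns in the same way as the usual printed layout.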
Examples

The example below uses ASCII text for simplicity, but this is not a typical use case, as it can already be safely transferred across all systems that can handle Base64. The more typical use is to encode binary data (such as an image); the resulting Base64 data will only contain 64 different ASCII characters, all of which can reliably be transferred across systems that may corrupt the raw source bytes.

A quote from Thomas Hobbes' Leviathan:

Man is distinguished, not only by his reason, but by this singular passion from other animals, which is a lust of the mind, that by a perseverance of delight in the continued and indefatigable generation of knowledge, exceeds the short vehemence of any carnal pleasure.

is represented as a byte sequence of 8-bit-padded ASCII characters encoded in MIME's Base64 scheme as follows (newlines and whitespace may be present anywhere but are to be ignored on decoding):

TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlz
IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg
dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu
dWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRo
ZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=

In the above quote, the encoded value of Man is TWFu. Encoded in ASCII, the characters M, a, and n are stored as the bytes 77, 97, and 110, which are the 8-bit binary values 01001101, 01100001, and 01101110. These three values are joined together into the 24-bit string 010011010110000101101110. Groups of 6 bits are then read from left to right as four numbers (19, 22, 5, and 46), which are converted into their corresponding Base64 characters (T, W, F, and u).

As this example illustrates, Base64 encoding converts three octets into four encoded characters.
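The conversion of one three-octet group can be traced step by step. The sketch below assumes the standard alphabet and uses an illustrative helper, `encode_group`, which is not a library function; it is checked against Python's `base64.b64encode`.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three octets as four Base64 characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly three octets")
    # Join the three octets into one 24-bit integer, first octet most significant.
    buffer = int.from_bytes(three_bytes, "big")
    # Read four 6-bit groups from the most significant end.
    sextets = [(buffer >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[s] for s in sextets)


print(encode_group(b"Man"))  # TWFu
print(encode_group(b"Man") == base64.b64encode(b"Man").decode("ascii"))  # True
```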
If there are only two significant input octets (e.g., 'Ma'), or when the last input group contains only two octets, all 16 bits will be captured in the first three Base64 digits (18 bits); the two least significant bits of the last content-bearing 6-bit block will turn out to be zero, and are discarded on decoding (along with the following = padding character).

If there is only one significant input octet (e.g., 'M'), or when the last input group contains only one octet, all 8 bits will be captured in the first two Base64 digits (12 bits); the four least significant bits of the last content-bearing 6-bit block will turn out to be zero, and are discarded on decoding (along with the following two = padding characters).

Output padding

The final == sequence indicates that the last group contained only one byte, and a final = indicates that it contained two bytes.
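The three cases (a full group, two octets, one octet) can be observed directly. This is a small sketch using Python's standard `base64` module; `padding_count` is an illustrative helper, not a library function.

```python
import base64


def padding_count(n_bytes: int) -> int:
    """Number of '=' characters appended for an input of n_bytes octets."""
    return (3 - n_bytes % 3) % 3


for text in (b"Man", b"Ma", b"M"):
    encoded = base64.b64encode(text).decode("ascii")
    print(text.decode("ascii"), encoded, padding_count(len(text)))
# Man TWFu 0
# Ma TWE= 1
# M TQ== 2
```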
The same characters will be encoded differently depending on their position within the three-octet group which is encoded to produce the four characters. For example:
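One way to see this position dependence is to encode successively shorter suffixes of a word, so that the same trailing characters fall at different offsets within their three-octet groups. The sketch below uses the last word of the Hobbes quote and Python's standard `base64` module; the variable names are illustrative.

```python
import base64

word = b"pleasure."
# Drop one leading byte at a time so the remaining text shifts position
# inside its three-octet groups.
encodings = {
    word[i:].decode("ascii"): base64.b64encode(word[i:]).decode("ascii")
    for i in range(5)
}
for text, encoded in encodings.items():
    print(f"{text!r:12} -> {encoded}")
# 'pleasure.'  -> cGxlYXN1cmUu
# 'leasure.'   -> bGVhc3VyZS4=
# 'easure.'    -> ZWFzdXJlLg==
# 'asure.'     -> YXN1cmUu
# 'sure.'      -> c3VyZS4=
```

The encoding of "sure." (c3VyZS4=) does not appear as a substring of the encoding of "pleasure." (cGxlYXN1cmUu), even though the text itself does.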
The ratio of output bytes to input bytes is 4:3 (approximately 33% overhead). Specifically, given an input of n bytes, the output will be $4 \lceil n/3 \rceil$ bytes long, including padding characters.

In theory, the padding character is not needed for decoding, since the number of missing bytes can be calculated from the number of Base64 digits. In some implementations, the padding character is mandatory, while for others it is not used. One case in which padding characters are required is concatenating multiple Base64 encoded files.

Decoding Base64 with padding

When decoding Base64 text, four characters are typically converted back to three bytes. The only exceptions are when padding characters exist. A single = indicates that the four characters will decode to only two bytes, while == indicates that the four characters will decode to only a single byte.
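The length relationship can be checked numerically. The sketch below assumes the padded length is 4 times the ceiling of n/3 and the unpadded length is the ceiling of 4n/3; `encoded_length` is an illustrative helper compared against Python's `base64.b64encode`.

```python
import base64


def encoded_length(n: int, padded: bool = True) -> int:
    """Length of the Base64 output for n input bytes."""
    if padded:
        return 4 * ((n + 2) // 3)  # 4 * ceil(n / 3)
    return (4 * n + 2) // 3        # ceil(4 * n / 3)


for n in (1, 2, 3, 4, 100):
    actual = base64.b64encode(bytes(n))
    print(n, encoded_length(n), encoded_length(n, padded=False), len(actual))
```

The difference between the padded and unpadded lengths is the number of = characters, which is why a decoder can in principle infer it from the digit count alone.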
Decoding Base64 without padding

Without padding, after normal decoding of four characters to three bytes over and over again, fewer than four encoded characters may remain. In this situation only two or three characters can remain. A single remaining encoded character is not possible, because a single Base64 character only contains 6 bits and 8 bits are required to create a byte. A minimum of two Base64 characters is therefore required: the first character contributes 6 bits, and the second character contributes its first 2 bits.
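A decoder that expects padding can still handle unpadded input if the missing = characters are restored first. The following is a minimal sketch, assuming the input contains only alphabet characters (no whitespace); `decode_unpadded` is an illustrative name, not a library function.

```python
import base64


def decode_unpadded(text: str) -> bytes:
    """Decode Base64 text whose trailing '=' padding has been removed."""
    remainder = len(text) % 4
    if remainder == 1:
        # One leftover character carries only 6 bits, less than a full byte.
        raise ValueError("a single trailing Base64 character cannot encode a byte")
    # Restore 0, 1, or 2 padding characters to reach a multiple of four.
    return base64.b64decode(text + "=" * (-len(text) % 4))


print(decode_unpadded("TWFu"))  # b'Man'
print(decode_unpadded("TWE"))   # b'Ma'
print(decode_unpadded("TQ"))    # b'M'
```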
Implementations and history

Variants summary table

Implementations may have some constraints on the alphabet used for representing some bit patterns. This notably concerns the last two characters used in the index table for index 62 and 63, and the character used for padding (which may be mandatory in some protocols, or removed in others). The table below summarizes these known variants, and links to the subsections below.
Because Base64 has so many variants, base62x has been introduced to unify them by excluding symbols from its output, so that the textual representation contains only letters and digits.

Privacy-enhanced mail

The first known standardized use of the encoding now called MIME Base64 was in the Privacy-enhanced Electronic Mail (PEM) protocol, proposed by RFC 989 in 1987. PEM defines a "printable encoding" scheme that uses Base64 encoding to transform an arbitrary sequence of octets to a format that can be expressed in short lines of 6-bit characters, as required by transfer protocols such as SMTP.[6]

The current version of PEM (specified in RFC 1421) uses a 64-character alphabet consisting of upper- and lower-case Roman letters (A-Z, a-z), the numerals (0-9), and the "+" and "/" symbols. The "=" symbol is also used as a padding suffix.

To convert data to PEM printable encoding, the first byte is placed in the most significant eight bits of a 24-bit buffer, the next in the middle eight, and the third in the least significant eight bits. If there are fewer than three bytes left to encode (or in total), the remaining buffer bits will be zero. The buffer is then used, six bits at a time, most significant first, as indices into the string "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/", and the indicated character is output.

The process is repeated on the remaining data until fewer than four octets remain. If three octets remain, they are processed normally. If fewer than three octets (24 bits) are remaining to encode, the input data is right-padded with zero bits to form an integral multiple of six bits.

After encoding the non-padded data, if two octets of the 24-bit buffer are padded zeros, two "=" characters are appended to the output; if one octet of the 24-bit buffer is filled with padded zeros, one "=" character is appended. This signals the decoder that the zero bits added due to padding should be excluded from the reconstructed data, and it guarantees that the encoded output length is a multiple of four characters.

PEM requires that all encoded lines consist of exactly 64 printable characters, with the exception of the last line, which may contain fewer printable characters. Lines are delimited by whitespace characters according to local (platform-specific) conventions.
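The buffer-based procedure and the 64-character line rule can be sketched together as follows. This is an illustration of the steps described above, not an implementation taken from RFC 1421; the function name `printable_encode` is hypothetical, and a plain "\n" is assumed as the local line delimiter.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def printable_encode(data: bytes, line_length: int = 64) -> str:
    """Encode data with a 24-bit buffer, then split into fixed-length lines."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 zero-filled octets
        buffer = int.from_bytes(chunk + b"\x00" * missing, "big")
        # Consume the buffer six bits at a time, most significant first.
        chars = [ALPHABET[(buffer >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        if missing:
            # Characters derived only from padding zeros are replaced by '='.
            chars[-missing:] = "=" * missing
        out.extend(chars)
    encoded = "".join(out)
    lines = [encoded[i:i + line_length] for i in range(0, len(encoded), line_length)]
    return "\n".join(lines)


sample = bytes(range(100))
print(printable_encode(sample))
```

Removing the line breaks from the result gives the same text as `base64.b64encode`, since PEM and MIME share the alphabet and padding rule.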
MIME

The MIME (Multipurpose Internet Mail Extensions) specification lists Base64 as one of two binary-to-text encoding schemes (the other being quoted-printable).[3] MIME's Base64 encoding is based on that of the RFC 1421 version of PEM: it uses the same 64-character alphabet and encoding mechanism as PEM, and uses the "=" symbol for output padding in the same way, as described in RFC 2045.

MIME does not specify a fixed length for Base64-encoded lines, but it does specify a maximum line length of 76 characters. Additionally it specifies that any extra-alphabetic characters must be ignored by a compliant decoder, although most implementations use a CR/LF newline pair to delimit encoded lines.

Thus, the actual length of MIME-compliant Base64-encoded binary data is usually about 137% of the original data length, though for very short messages the overhead can be much higher due to the overhead of the headers. Very roughly, the final size of Base64-encoded binary data is equal to 1.37 times the original data size + 814 bytes (for headers). The size of the decoded data can be approximated with this formula:

$\text{bytes} = (\text{string\_length}(\text{encoded\_string}) - 814) / 1.37$

UTF-7

UTF-7, described first in RFC 1642, which was later superseded by RFC 2152, introduced a system called modified Base64. This data encoding scheme is used to encode UTF-16 as ASCII characters for use in 7-bit transports such as SMTP. It is a variant of the Base64 encoding used in MIME.[7][8]

The "Modified Base64" alphabet consists of the MIME Base64 alphabet, but does not use the "=" padding character.

OpenPGP

OpenPGP, described in RFC 4880, describes Radix-64 encoding, also known as "ASCII armor". Radix-64 is identical to the "Base64" encoding described by MIME, with the addition of an optional 24-bit CRC.
The checksum is calculated on the input data before encoding; the checksum is then encoded with the same Base64 algorithm and, prefixed by the "=" symbol as a separator, appended to the encoded output data.[9]

RFC 3548

RFC 3548, entitled The Base16, Base32, and Base64 Data Encodings, is an informational (non-normative) memo that attempts to unify the RFC 1421 and RFC 2045 specifications of Base64 encodings, alternative-alphabet encodings, and the seldom-used Base32 and Base16 encodings.

Unless implementations are written to a specification that refers to RFC 3548 and specifically requires otherwise, RFC 3548 forbids implementations from generating messages containing characters outside the encoding alphabet or without padding, and it also declares that decoder implementations must reject data that contain characters outside the encoding alphabet.[4]

RFC 4648 (https://tools.ietf.org/html/rfc4648#section-5)

This RFC obsoletes RFC 3548 and focuses on Base64/32/16: "This document describes the commonly used Base64, Base32, and Base16 encoding schemes. It also discusses the use of line-feeds in encoded data, use of padding in encoded data, use of non-alphabet characters in encoded data, use of different encoding alphabets, and canonical encodings."

Filenames

Another variant, called modified Base64 for filename, uses '-' instead of '/', because Unix and Windows filenames cannot contain '/'. It may be preferable to use the modified Base64 for URL instead, since the filenames could then also be used in URLs.

URL applications

Base64 encoding can be helpful when fairly lengthy identifying information is used in an HTTP environment. For example, a database persistence framework for Java objects might use Base64 encoding to encode a relatively large unique id (generally 128-bit UUIDs) into a string for use as an HTTP parameter in HTTP forms or HTTP GET URLs.
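The UUID case can be sketched as follows. The example is written in Python rather than Java, uses the URL-safe alphabet of RFC 4648 (base64url, with '-' and '_' in place of '+' and '/'), and drops the padding, which is one common choice rather than a requirement; the helper names `uuid_to_token` and `token_to_uuid` are hypothetical.

```python
import base64
import uuid


def uuid_to_token(value: uuid.UUID) -> str:
    """Encode the 16 raw UUID bytes as an unpadded URL-safe Base64 string."""
    return base64.urlsafe_b64encode(value.bytes).rstrip(b"=").decode("ascii")


def token_to_uuid(token: str) -> uuid.UUID:
    """Restore the padding and decode the token back into a UUID."""
    padded = token + "=" * (-len(token) % 4)
    return uuid.UUID(bytes=base64.urlsafe_b64decode(padded))


original = uuid.UUID("12345678-1234-5678-1234-567812345678")
token = uuid_to_token(original)
print(len(str(original)), len(token))  # 36 22
print(token_to_uuid(token) == original)  # True
```

Sixteen bytes encode to 22 characters without padding, compared with 36 characters for the usual hexadecimal form with hyphens.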
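Also, many applications need to encode binary data in a way that is convenient for inclusion in URLs, including in hidden web form fields, and Base64 is a convenient encoding to render them in a compact way.

Using standard Base64 in a URL requires encoding of the '+', '/' and '=' characters into percent-encoded hexadecimal sequences ('+' becomes '%2B', '/' becomes '%2F' and '=' becomes '%3D'), which makes the string unnecessarily longer.

For this reason, modified Base64 for URL variants exist (such as base64url in RFC 4648), where the '+' and '/' characters of standard Base64 are replaced by '-' and '_' respectively, so that percent-encoding is no longer necessary.

Program identifiers

There are other variants [clarification needed] that use

XML

XML identifiers and name tokens are encoded using two variants: [citation needed]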
HTML

The atob() and btoa() JavaScript methods, defined in the HTML5 draft specification,[10] provide Base64 encoding and decoding functionality to web pages. The btoa() method outputs padding characters, but these are optional in the input of the atob() method.

Other applications

Base64 can be used in a variety of contexts:
Radix-64 applications not compatible with Base64
See also
References

1. "The Base16, Base32, and Base64 Data Encodings". RFC 4648. IETF. October 2006. Retrieved March 18, 2010.
2. "Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures". RFC 1421. IETF. February 1993. Retrieved March 18, 2010.
3. "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies". RFC 2045. IETF. November 1996. Retrieved March 18, 2010.
4. "The Base16, Base32, and Base64 Data Encodings". RFC 3548. IETF. July 2003. Retrieved March 18, 2010.
5. "YUIBlog". YUIBlog. http://www.yuiblog.com/blog/2010/07/06/in-the-yui-3-gallery-base64-and-y64-encoding/ Retrieved 2012-06-21.
6. "Privacy Enhancement for Internet Electronic Mail". RFC 989. IETF. February 1987. Retrieved March 18, 2010.
7. "UTF-7 A Mail-Safe Transformation Format of Unicode". RFC 1642. IETF. July 1994. Retrieved March 18, 2010.
8. "UTF-7 A Mail-Safe Transformation Format of Unicode". RFC 2152. IETF. May 1997. Retrieved March 18, 2010.
9. "OpenPGP Message Format". RFC 4880. IETF. November 2007. Retrieved March 18, 2010.
10. "7.3. Base64 utility methods". HTML 5.2 Editor's Draft. World Wide Web Consortium. https://w3c.github.io/html/webappapis.html#base64-utility-methods Retrieved 2 January 2017. Introduced by changeset 5814, 2011-02-01.
11. <image xlink:href="data:image/jpeg;base64, JPEG contents encoded in Base64 " ... />
12. JSFiddle. "Edit fiddle - JSFiddle". jsfiddle.net. http://jsfiddle.net/MxHPq/
13. "The GEDCOM Standard Release 5.5". Homepages.rootsweb.ancestry.com. http://homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gctoc.htm Retrieved 2012-06-21.
14. Provos, Niels (1997-02-13). "src/lib/libc/crypt/bcrypt.c r1.1". https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libc/crypt/bcrypt.c?rev=1.1&content-type=text/x-cvsweb-markup Retrieved 2018-05-18.
15. "6PACK a "real time" PC to TNC protocol". http://private.freepage.de/cgi-bin/feets/freepage_ext/41030x030A/rewrite/alexs/xfr/flexnet/6pack_en/6pack.htm Retrieved 2013-05-19.

External links

Wikibooks: Algorithm implementation, Miscellaneous/Base64
5 : Usenet|Email|Internet Standards|Binary-to-text encoding formats|Data serialization formats |