Bfloat16 floating-point format

Contents

  1. bfloat16 floating-point format
      Exponent encoding
  2. Encoding of special values
      Positive and negative infinity
      NaN
  3. Range and precision
  4. Rounding modes
  5. Examples
      Zeros and infinities
      Special values
      NaNs
  6. See also
  7. References
{{lowercase title}}{{Floating-point}}

The bfloat16 floating-point format is a computer number format occupying 16 bits in computer memory; it represents a wide dynamic range of numeric values by using a floating radix point. This format is a truncated (16-bit) version of the 32-bit IEEE 754 single-precision floating-point format (binary32) with the intent of accelerating machine learning and near-sensor computing.[1] It preserves the approximate dynamic range of 32-bit floating-point numbers by retaining 8 exponent bits, but supports only an 8-bit precision rather than the 24-bit significand of the binary32 format. Bfloat16 numbers are even less suitable for integer calculations than single-precision 32-bit floating-point numbers, but this is not their intended use.

The bfloat16 format is utilized in upcoming Intel AI processors (such as the Nervana NNP-L1000, Xeon processors, and Intel FPGAs),[2][3][4] in Google Cloud TPUs,[5][6][7] and in TensorFlow.[7][8]

bfloat16 floating-point format

bfloat16 has the following format:

  • Sign bit: 1 bit
  • Exponent width: 8 bits
  • Significand precision: 8 bits (7 explicitly stored), as opposed to 24 bits in a classical single-precision floating-point format

The bfloat16 format, being a truncated IEEE 754 single-precision 32-bit float, allows for fast conversion to and from an IEEE 754 single-precision 32-bit float, and preserves the exponent bits while reducing the significand. Preserving the exponent bits maintains the 32-bit float's range of ≈ 10^−38 to ≈ 3 × 10^38.[9]
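This relationship can be sketched in a few lines of Python (an illustrative sketch, not any particular library's API; the function names are ours): the bfloat16 bit pattern is simply the upper 16 bits of the corresponding binary32 pattern, so conversion in either direction is only a shift.

 # Illustrative sketch: bfloat16 as the upper 16 bits of a binary32 pattern.
 import struct

 def float32_to_bfloat16_bits(x):
     """Truncate a binary32 value to its 16-bit bfloat16 pattern."""
     (bits32,) = struct.unpack("<I", struct.pack("<f", x))
     return bits32 >> 16          # keep sign, 8 exponent bits, top 7 fraction bits

 def bfloat16_bits_to_float32(bits16):
     """Widen a bfloat16 pattern back to binary32 by appending 16 zero bits."""
     (x,) = struct.unpack("<f", struct.pack("<I", bits16 << 16))
     return x

 print(hex(float32_to_bfloat16_bits(1.0)))   # 0x3f80
 print(bfloat16_bits_to_float32(0xc000))     # -2.0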

The bits are laid out as follows:

 sign: 1 bit | exponent: 8 bits | fraction: 7 bits

Contrast to an IEEE 754 single-precision 32-bit float:

 sign: 1 bit | exponent: 8 bits | fraction: 23 bits

Exponent encoding

The bfloat16 binary floating-point exponent is encoded using an offset-binary representation, with the zero offset being 127; this offset is also known as the exponent bias in the IEEE 754 standard.

  • Emin = 01H − 7FH = −126
  • Emax = FEH − 7FH = 127
  • Exponent bias = 7FH = 127

Thus, in order to get the true exponent as defined by the offset-binary representation, the offset of 127 has to be subtracted from the value of the exponent field.

The minimum and maximum values of the exponent field (00H and FFH) are interpreted specially, like in the IEEE 754 standard formats.

Exponent     | Significand zero | Significand non-zero   | Equation
00H          | zero, −0         | subnormal numbers      | (−1)^signbit × 2^−126 × 0.significandbits
01H, …, FEH  | normalized value | normalized value       | (−1)^signbit × 2^(exponentbits−127) × 1.significandbits
FFH          | ±infinity        | NaN (quiet, signaling) |

The minimum positive normal value is 2^−126 ≈ 1.18 × 10^−38 and the minimum positive (subnormal) value is 2^(−126−7) = 2^−133 ≈ 9.2 × 10^−41.
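As a hedged illustration of the encoding rules above (the function is ours, not taken from any library), a bfloat16 bit pattern can be decoded in Python by applying the bias-127 formula row by row from the table:

 # Illustrative decoder for the table above: bias 127, 7 stored fraction bits.
 def decode_bfloat16(bits):
     sign = -1.0 if (bits >> 15) & 1 else 1.0
     exponent = (bits >> 7) & 0xFF        # 8-bit biased exponent field
     fraction = bits & 0x7F               # 7 explicitly stored significand bits
     if exponent == 0x00:                 # zeros and subnormal numbers
         return sign * 2.0**-126 * (fraction / 128)
     if exponent == 0xFF:                 # infinities and NaNs
         return sign * float("inf") if fraction == 0 else float("nan")
     return sign * 2.0**(exponent - 127) * (1 + fraction / 128)

 print(decode_bfloat16(0x0080))   # 2**-126, minimum positive normal value
 print(decode_bfloat16(0x0001))   # 2**-133, minimum positive subnormal value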

Encoding of special values

Positive and negative infinity

Just as in IEEE 754, positive and negative infinity are represented with their corresponding sign bits, all 8 exponent bits set (FFhex) and all significand bits zero. Explicitly,

val s_exponent_signcnd

+inf = 0_11111111_0000000

-inf = 1_11111111_0000000

NaN

Just as in IEEE 754, NaN values are represented with either sign bit, all 8 exponent bits set (FFhex) and not all significand bits zero. Explicitly,

val s_exponent_signcnd

+NaN = 0_11111111_klmnopq

-NaN = 1_11111111_klmnopq

where at least one of k, l, m, n, o, p, or q is 1. As with IEEE 754, NaN values can be quiet or signaling, although there are no known uses of signaling bfloat16 NaNs as of September 2018.
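A small Python sketch of these rules (the helper name is ours; the quiet/signaling split by the top fraction bit follows the usual IEEE 754 convention and matches the NaN examples later in this article):

 # Illustrative classifier: exponent FF hex means infinity (fraction zero) or NaN.
 def classify_bfloat16(bits):
     exponent = (bits >> 7) & 0xFF
     fraction = bits & 0x7F
     if exponent != 0xFF:
         return "finite"
     if fraction == 0:
         return "-inf" if bits >> 15 else "+inf"
     return "qNaN" if fraction & 0x40 else "sNaN"   # top fraction bit marks a quiet NaN

 print(classify_bfloat16(0x7f80))   # +inf
 print(classify_bfloat16(0xffc1))   # qNaN
 print(classify_bfloat16(0xff81))   # sNaN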

Range and precision

Bfloat16 is designed to maintain the number range of the 32-bit IEEE 754 single-precision floating-point format (binary32), while reducing the precision from 24 bits to 8 bits.
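For a rough sense of what the 8-bit significand means in practice, the gap between 1.0 and the next representable value ("machine epsilon") is 2^−7 for bfloat16 versus 2^−23 for binary32, i.e. roughly 2–3 significant decimal digits instead of about 7:

 # Rough illustration: gap between 1.0 and the next representable value.
 bf16_eps = 2.0**-7     # 0.0078125   -> roughly 2-3 significant decimal digits
 fp32_eps = 2.0**-23    # ~1.19e-07   -> roughly 7 significant decimal digits
 print(bf16_eps, fp32_eps)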

Rounding modes

According to this [https://software.intel.com/sites/default/files/managed/40/8b/bf16-hardware-numerics-definition-white-paper.pdf white paper], round-to-nearest-even (RNE) is the default and only supported rounding mode.
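A minimal sketch of such a conversion in Python (assumptions: the common add-half-and-truncate bit trick, written by us rather than taken from the white paper; NaN payloads are not handled specially here), where ties round to the pattern whose last kept bit is even:

 # Illustrative round-to-nearest-even conversion from binary32 to bfloat16.
 import struct

 def float32_to_bfloat16_rne(x):
     (bits32,) = struct.unpack("<I", struct.pack("<f", x))
     rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)   # ties go to the even pattern
     return ((bits32 + rounding_bias) >> 16) & 0xFFFF

 print(hex(float32_to_bfloat16_rne(1.00390625)))  # tie at 1 + 2**-8   -> 0x3f80 (even)
 print(hex(float32_to_bfloat16_rne(1.01171875)))  # tie at 1 + 3*2**-8 -> 0x3f82 (even)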

Examples

These examples are given in bit representation, in hexadecimal and binary, of the floating-point value. This includes the sign, (biased) exponent, and significand.

 3f80 = 0 01111111 0000000 = 1
 c000 = 1 10000000 0000000 = −2
 7f7f = 0 11111110 1111111 = (2^8 − 1) × 2^−7 × 2^127 ≈ 3.38953139 × 10^38 (max finite positive value in bfloat16 precision)
 0080 = 0 00000001 0000000 = 2^−126 ≈ 1.175494351 × 10^−38 (min normalized positive value in bfloat16 precision and single-precision floating point)

The maximum positive finite value of a normal bfloat16 number is 3.38953139 × 10^38, slightly below (2^24 − 1) × 2^−23 × 2^127 = 3.402823466 × 10^38, the max finite positive value representable in single precision.
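As a quick arithmetic check of the two quoted maxima (plain Python floats, using only the formulas already given above):

 # Quick check of the maxima quoted above.
 max_bf16 = (2**8 - 1) * 2.0**-7 * 2.0**127      # 0x7f7f -> ~3.3895314e38
 max_fp32 = (2**24 - 1) * 2.0**-23 * 2.0**127    # binary32 max -> ~3.4028235e38
 print(max_bf16, max_fp32)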

Zeros and infinities

 0000 = 0 00000000 0000000 = 0
 8000 = 1 00000000 0000000 = −0
 7f80 = 0 11111111 0000000 = infinity
 ff80 = 1 11111111 0000000 = −infinity

Special values

 4049 = 0 10000000 1001001 = 3.140625 ≈ π (pi)
 3eab = 0 01111101 0101011 = 0.333984375 ≈ 1/3

NaNs

 ffc1 = x 11111111 1000001 => qNaN
 ff81 = x 11111111 0000001 => sNaN

See also

  • Half-precision floating-point format: 16-bit float with 1-bit sign, 5-bit exponent, and 11-bit significand, as defined by IEEE 754
  • ISO/IEC 10967, Language Independent Arithmetic
  • Primitive data type
  • Minifloat

References

1. ^{{Cite book |doi=10.23919/DATE.2018.8342167|chapter=A transprecision floating-point platform for ultra-low power computing|title=2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)|pages=1051–1056|year=2018|last1=Tagliavini|first1=Giuseppe|last2=Mach|first2=Stefan|last3=Rossi|first3=Davide|last4=Marongiu|first4=Andrea|last5=Benini|first5=Luca|isbn=978-3-9819263-0-9|arxiv=1711.10374}}
2. ^{{Cite web | title = Intel unveils Nervana Neural Net L-1000 for accelerated AI training | author = Khari Johnson | work = VentureBeat | date = 2018-05-23 | accessdate = 2018-05-23 | url = https://venturebeat.com/2018/05/23/intel-unveils-nervana-neural-net-l-1000-for-accelerated-ai-training/ |quote = ...Intel will be extending bfloat16 support across our AI product lines, including Intel Xeon processors and Intel FPGAs. }}
3. ^{{Cite web | title = Intel Lays Out New Roadmap for AI Portfolio | author = Michael Feldman | work = TOP500 Supercomputer Sites | date = 2018-05-23 | accessdate = 2018-05-23 | url = https://www.top500.org/news/intel-lays-out-new-roadmap-for-ai-portfolio/ | quote = Intel plans to support this format across all their AI products, including the Xeon and FPGA lines }}
4. ^{{Cite web | title = Intel To Launch Spring Crest, Its First Neural Network Processor, In 2019 | author = Lucian Armasu | work = Tom's Hardware | date = 2018-05-23 | accessdate = 2018-05-23 | url = https://www.tomshardware.com/news/intel-neural-network-processor-lake-crest,37105.html | quote = Intel said that the NNP-L1000 would also support bfloat16, a numerical format that’s being adopted by all the ML industry players for neural networks. The company will also support bfloat16 in its FPGAs, Xeons, and other ML products. The Nervana NNP-L1000 is scheduled for release in 2019. }}
5. ^{{Cite web | title = Available TensorFlow Ops {{!}} Cloud TPU {{!}} Google Cloud | author = | work = Google Cloud | date = | accessdate = 2018-05-23 | url = https://cloud.google.com/tpu/docs/tensorflow-ops | quote = This page lists the TensorFlow Python APIs and graph operators available on Cloud TPU. }}
6. ^{{Cite web | title = Comparing Google's TPUv2 against Nvidia's V100 on ResNet-50 | author = Elmar Haußmann | work = RiseML Blog | date = 2018-04-26 | accessdate = 2018-05-23 | url = https://blog.riseml.com/comparing-google-tpuv2-against-nvidia-v100-on-resnet-50-c2bbb6a51e5e | language = | quote = For the Cloud TPU, Google recommended we use the bfloat16 implementation from the official TPU repository with TensorFlow 1.7.0. Both the TPU and GPU implementations make use of mixed-precision computation on the respective architecture and store most tensors with half-precision. }}
7. ^{{Cite web | title = ResNet-50 using BFloat16 on TPU | author = Tensorflow Authors | work = Google | date = 2018-07-23 | accessdate = 2018-11-06 | url = https://github.com/tensorflow/tpu/tree/0ece10f6f4e523eab79aba0247b513fe57d38ae6/models/experimental/resnet_bfloat16 | quote = }}
8. ^{{cite report |title= TensorFlow Distributions |author= Joshua V. Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, Rif A. Saurous |date= 2017-11-28 |id= Accessed 2018-05-23 |arxiv= 1711.10604 |quote= All operations in TensorFlow Distributions are numerically stable across half, single, and double floating-point precisions (as TensorFlow dtypes: tf.bfloat16 (truncated floating point), tf.float16, tf.float32, tf.float64). Class constructors have a validate_args flag for numerical asserts |bibcode= 2017arXiv171110604D }}
9. ^{{Cite web | title = Livestream Day 1: Stage 8 (Google I/O '18) - YouTube | author = | work = Google | date = 2018-05-08 | accessdate = 2018-05-23 | url = https://www.youtube.com/watch?v=vm67WcLzfvc&t=2555 | language = | quote = In many models this is a drop-in replacement for float-32 }}
{{data types}}{{DEFAULTSORT:bfloat16 floating-point format}}

Categories: Binary arithmetic | Floating point types
