Entry: Bit Manipulation Instruction Sets
Definition

  1. ABM (Advanced Bit Manipulation)

  2. BMI1 (Bit Manipulation Instruction Set 1)

  3. BMI2 (Bit Manipulation Instruction Set 2)

      Parallel bit deposit and extract

  4. TBM (Trailing Bit Manipulation)

  5. Supporting CPUs

  6. See also

  7. References

  8. Further reading

  9. External links

Bit manipulation instruction sets (BMI sets) are extensions to the x86 instruction set architecture for microprocessors from Intel and AMD. Their purpose is to speed up bit manipulation. All the instructions in these sets are non-SIMD and operate only on general-purpose registers.

Intel published two sets, BMI (referred to here as BMI1) and BMI2, both introduced with the Haswell microarchitecture. AMD published another two: ABM (Advanced Bit Manipulation), a subset of SSE4a whose instructions Intel implements as part of SSE4.2 and BMI1, and TBM (Trailing Bit Manipulation), an extension to BMI1 that was introduced with Piledriver-based processors and dropped again in Zen-based processors.[1]

ABM (Advanced Bit Manipulation)

Only AMD implements ABM as a single instruction set; every AMD processor supports either both instructions or neither. Intel considers POPCNT part of SSE4.2 and LZCNT part of BMI1. POPCNT has a separate CPUID flag; however, Intel uses AMD's ABM flag to indicate LZCNT support (since LZCNT completes ABM).[2]

Instruction  Description[3]
POPCNT       Population count
LZCNT        Leading zeros count

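LZCNT is related to the Bit Scan Reverse (BSR) instruction, but the two are not interchangeable. LZCNT sets ZF if the result is zero and CF if the source is zero, whereas BSR sets ZF only when the source is zero. Unlike BSR, LZCNT also produces a defined result (the source operand size in bits) if the source operand is zero. For a non-zero source, the LZCNT and BSR results add up to the operand width minus 1; for the 32-bit value 0x000f0000, LZCNT gives 12 and BSR gives 19.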
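The behaviour described above can be sketched in portable C. This is an illustrative software model, not how the hardware or any compiler implements these instructions; the function names `popcnt32`, `lzcnt32` and `bsr32` are hypothetical, and only the 32-bit operand size is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of POPCNT: number of set bits in x. */
static unsigned popcnt32(uint32_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1u;            /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Software model of LZCNT: number of zero bits above the highest set bit.
   A zero source yields the operand size (32), which is a defined result. */
static unsigned lzcnt32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* BSR-style result (index of the highest set bit), derived from LZCNT.
   This helper requires a non-zero argument. */
static unsigned bsr32(uint32_t x) {
    assert(x != 0);
    return 31u - lzcnt32(x);
}
```

In practice a compiler builtin or intrinsic would normally be used instead of loops like these; the sketch only makes the zero-input behaviour and the relationship between the two results concrete.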

BMI1 (Bit Manipulation Instruction Set 1)

The instructions below are those enabled by the BMI bit in CPUID. Intel officially considers LZCNT as part of BMI, but advertises LZCNT support using the ABM CPUID feature flag.[2] BMI1 is available in AMD's Jaguar,[5] Piledriver[4] and newer processors, and in Intel's Haswell[5] and newer processors.

Instruction  Description[2]                          Equivalent C expression[6]
ANDN         Logical and not                         ~x & y
BEXTR        Bit field extract (with register)       (src >> start) & ((1 << len) - 1)[7]
BLSI         Extract lowest set isolated bit         x & -x
BLSMSK       Get mask up to lowest set bit           x ^ (x - 1)
BLSR         Reset lowest set bit                    x & (x - 1)
TZCNT        Count the number of trailing zero bits  N/A
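The C expressions in the table can be wrapped in small functions to see what each instruction computes. The following is a sketch for 32-bit operands only; the function names are hypothetical, `bextr32` covers only the simple case where `start` and `len` are both below 32, and `tzcnt32` is a loop-based model added here because the table gives no expression for TZCNT.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t andn32(uint32_t x, uint32_t y) { return ~x & y; }        /* ANDN   */
static uint32_t blsi32(uint32_t x)   { return x & (0u - x); }            /* BLSI   */
static uint32_t blsmsk32(uint32_t x) { return x ^ (x - 1u); }            /* BLSMSK */
static uint32_t blsr32(uint32_t x)   { return x & (x - 1u); }            /* BLSR   */

/* BEXTR, simple case only: extract len bits starting at bit start. */
static uint32_t bextr32(uint32_t src, unsigned start, unsigned len) {
    assert(start < 32 && len < 32);
    return (src >> start) & ((1u << len) - 1u);
}

/* TZCNT model: number of zero bits below the lowest set bit,
   or the operand size (32) when the source is zero. */
static unsigned tzcnt32(uint32_t x) {
    unsigned n = 0;
    if (x == 0)
        return 32;
    while (!(x & 1u)) {
        x >>= 1;
        n++;
    }
    return n;
}
```

For example, with x = 0x58 (binary 0101 1000), `blsi32` isolates 0x08, `blsmsk32` gives 0x0F, `blsr32` gives 0x50 and `tzcnt32` gives 3.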

BMI2 (Bit Manipulation Instruction Set 2)

Intel introduced BMI2 together with BMI1 in its line of Haswell processors. AMD is the only vendor that has produced processors supporting BMI1 without BMI2; BMI2 is supported by AMD's Excavator architecture and newer.[8]

Instruction  Description
BZHI         Zero high bits starting with specified bit position
MULX         Unsigned multiply without affecting flags, and arbitrary destination registers
PDEP         Parallel bits deposit
PEXT         Parallel bits extract
RORX         Rotate right logical without affecting flags
SARX         Shift arithmetic right without affecting flags
SHRX         Shift logical right without affecting flags
SHLX         Shift logical left without affecting flags
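As a rough illustration of the non-PDEP/PEXT entries, the sketch below models the results of BZHI, MULX, RORX and SARX for 32-bit operands in portable C. The function names are hypothetical, and the models capture only the computed values: the "without affecting flags" property is a hardware detail that has no counterpart in C. Shift and rotate counts are reduced modulo 32, matching the 32-bit forms of the instructions.

```c
#include <assert.h>
#include <stdint.h>

/* BZHI model: clear all bits of src at position index and above.
   Only the low 8 bits of index are used; an index of 32 or more
   leaves src unchanged. */
static uint32_t bzhi32(uint32_t src, unsigned index) {
    index &= 0xFFu;
    return index < 32 ? src & ((1u << index) - 1u) : src;
}

/* MULX model: full 64-bit unsigned product, split into two halves. */
static void mulx32(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo) {
    uint64_t p = (uint64_t)a * b;
    assert(hi && lo);
    *hi = (uint32_t)(p >> 32);
    *lo = (uint32_t)p;
}

/* RORX model: rotate right by n (mod 32). */
static uint32_t rorx32(uint32_t x, unsigned n) {
    n &= 31u;
    return n ? (x >> n) | (x << (32u - n)) : x;
}

/* SARX model: arithmetic right shift by n (mod 32) on a 32-bit pattern,
   written without relying on implementation-defined signed shifts. */
static uint32_t sarx32(uint32_t x, unsigned n) {
    uint32_t r;
    n &= 31u;
    r = x >> n;
    if ((x & 0x80000000u) && n)
        r |= ~(0xFFFFFFFFu >> n);   /* replicate the sign bit */
    return r;
}
```

SHRX and SHLX correspond to the plain C `>>` and `<<` operators on unsigned values with the same count masking.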

Parallel bit deposit and extract

The PDEP and PEXT instructions are generalized bit-level compress and expand instructions. Each takes two inputs: a source and a selector. The selector is a bitmap that selects the bits to be packed or unpacked. PEXT copies the selected bits of the source to contiguous low-order bits of the destination; higher-order destination bits are cleared. PDEP does the opposite: contiguous low-order bits of the source are copied to the selected bit positions of the destination; other destination bits are cleared. These instructions can extract any bitfield of the input and perform bit-level shuffling that previously would have been expensive. Although their effect resembles bit-level gather-scatter SIMD instructions, PDEP and PEXT (like the rest of the BMI instruction sets) operate on general-purpose registers.[9][10]

Below are a few 16-bit examples of these operations:

Input             Selector (example)  Parallel bit extract  Parallel bit deposit
rrrrrggggggbbbbb  1111100000000000    00000000000rrrrr      bbbbb00000000000
rrrrrggggggbbbbb  0000011111100000    0000000000gggggg      00000gbbbbb00000
rrrrrggggggbbbbb  0000000000011111    00000000000bbbbb      00000000000bbbbb
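The semantics described in this section can be modelled with a simple bit-by-bit loop. The sketch below is a reference model in portable C, not a description of how processors implement the instructions; the names `pext32` and `pdep32` are hypothetical, and the usage note assumes a 5-6-5 layout like the one in the examples.

```c
#include <stdint.h>

/* PEXT model: gather the bits of src selected by mask into the
   contiguous low-order bits of the result; remaining bits are zero. */
static uint32_t pext32(uint32_t src, uint32_t mask) {
    uint32_t dst = 0;
    unsigned k = 0;                       /* next free low-order position */
    for (unsigned i = 0; i < 32; i++) {
        if (mask & (1u << i)) {
            if (src & (1u << i))
                dst |= 1u << k;
            k++;
        }
    }
    return dst;
}

/* PDEP model: scatter the contiguous low-order bits of src to the
   positions selected by mask; unselected bits are zero. */
static uint32_t pdep32(uint32_t src, uint32_t mask) {
    uint32_t dst = 0;
    unsigned k = 0;                       /* next low-order source bit */
    for (unsigned i = 0; i < 32; i++) {
        if (mask & (1u << i)) {
            if (src & (1u << k))
                dst |= 1u << i;
            k++;
        }
    }
    return dst;
}
```

With the 16-bit value 0xB5A6 (r = 0x16, g = 0x2D, b = 0x06 in a 5-6-5 layout), `pext32(0xB5A6, 0x07E0)` yields 0x2D, and `pdep32(0x2D, 0x07E0)` puts those six bits back at positions 5 to 10, giving 0x05A0.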

TBM (Trailing Bit Manipulation)

TBM consists of instructions complementary to the instruction set started by BMI1; their complementary nature means they do not necessarily need to be used directly but can be generated by an optimizing compiler when supported.[11] AMD introduced TBM together with BMI1 in its Piledriver[4] line of processors; AMD Jaguar and Zen-based processors do not support TBM.[12]

Instruction  Description[3]                           Equivalent C expression[13]
BEXTR        Bit field extract (with immediate)       (src >> start) & ((1 << len) - 1)
BLCFILL      Fill from lowest clear bit               x & (x + 1)
BLCI         Isolate lowest clear bit                 x | ~(x + 1)
BLCIC        Isolate lowest clear bit and complement  ~x & (x + 1)
BLCMSK       Mask from lowest clear bit               x ^ (x + 1)
BLCS         Set lowest clear bit                     x | (x + 1)
BLSFILL      Fill from lowest set bit                 x | (x - 1)
BLSIC        Isolate lowest set bit and complement    ~x | (x - 1)
T1MSKC       Inverse mask from trailing ones          ~x | (x + 1)
TZMSK        Mask from trailing zeros                 ~x & (x - 1)
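The expressions in the table can be checked directly in C. The sketch below wraps each one in a function for 32-bit operands; the function names are hypothetical, and BEXTR is omitted because its expression is the same as in the BMI1 table. Because the source notes that compilers can generate TBM instructions where supported, writing the plain expression is usually enough.

```c
#include <stdint.h>

static uint32_t blcfill32(uint32_t x) { return x & (x + 1u); }    /* BLCFILL */
static uint32_t blci32(uint32_t x)    { return x | ~(x + 1u); }   /* BLCI    */
static uint32_t blcic32(uint32_t x)   { return ~x & (x + 1u); }   /* BLCIC   */
static uint32_t blcmsk32(uint32_t x)  { return x ^ (x + 1u); }    /* BLCMSK  */
static uint32_t blcs32(uint32_t x)    { return x | (x + 1u); }    /* BLCS    */
static uint32_t blsfill32(uint32_t x) { return x | (x - 1u); }    /* BLSFILL */
static uint32_t blsic32(uint32_t x)   { return ~x | (x - 1u); }   /* BLSIC   */
static uint32_t t1mskc32(uint32_t x)  { return ~x | (x + 1u); }   /* T1MSKC  */
static uint32_t tzmsk32(uint32_t x)   { return ~x & (x - 1u); }   /* TZMSK   */
```

For x = 0x57 (binary 0101 0111, three trailing ones), `blcic32` isolates the lowest clear bit as 0x08 and `blcs32` sets it, giving 0x5F. For x = 0x58 (three trailing zeros), `tzmsk32` gives 0x07.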

Supporting CPUs

  • Intel
    • Intel Nehalem processors and newer (like Sandy Bridge, Ivy Bridge) (POPCNT supported)
    • Intel Silvermont processors (POPCNT supported)
    • Intel Haswell processors and newer (like Skylake, Broadwell) (ABM, BMI1 and BMI2 supported)[5]
  • AMD
    • K10-based processors (ABM supported)
    • "Cat" low-power processors
      • Bobcat-based processors (ABM supported)[14]
      • Jaguar-based processors and newer (ABM and BMI1 supported)[12]
      • Puma-based processors and newer (ABM and BMI1 supported)[12]
    • "Heavy Equipment" processors
      • Bulldozer-based processors (ABM supported)
      • Piledriver-based processors (ABM, BMI1 and TBM supported)[1]
      • Steamroller-based processors (ABM, BMI1 and TBM supported)
      • Excavator-based processors (ABM, BMI1, BMI2 and TBM supported)[8]
    • Zen-based processors (ABM, BMI1 and BMI2 supported)
    • Zen+-based processors (ABM, BMI1 and BMI2 supported)

See also

  • Advanced Vector Extensions (AVX)
  • AES instruction set
  • CLMUL instruction set
  • F16C
  • FMA instruction set
  • Intel ADX
  • XOP instruction set
  • Intel BCD opcodes (also used for advanced bit manipulation techniques)

References

1. "New "Bulldozer" and "Piledriver" Instructions". http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf. Retrieved 2014-01-03.
2. "Intel Advanced Vector Extensions Programming Reference" (PDF). intel.com. Intel. June 2011. http://software.intel.com/file/36945. Retrieved 2014-01-03.
3. "AMD64 Architecture Programmer's Manual, Volume 3: General-Purpose and System Instructions" (PDF). amd.com. AMD. October 2013. http://support.amd.com/TechDocs/24594.pdf. Retrieved 2014-01-02.
4. Hollingsworth, Brent. "New "Bulldozer" and "Piledriver" instructions" (PDF). Advanced Micro Devices, Inc. http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf. Retrieved 2014-12-11.
5. Locktyukhin, Max. "How to detect New Instruction support in the 4th generation Intel® Core™ processor family". www.intel.com. Intel. https://software.intel.com/en-us/articles/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family. Retrieved 2014-12-11.
6. "bmiintrin.h from GCC 4.8". https://gcc.gnu.org/viewcvs/gcc/branches/gcc-4_8-branch/gcc/config/i386/bmiintrin.h?revision=201047&view=markup. Retrieved 2014-03-17.
7. "Chess Programming BMI1". http://chessprogramming.wikispaces.com/BMI1. Retrieved 2014-04-08.
8. "AMD Excavator Core May Bring Dramatic Performance Increases". X-bit labs. 2013-10-18. http://www.xbitlabs.com/news/cpu/display/20131018224745_AMD_Excavator_Core_May_Dramatic_Performance_Increases.html. Archived at https://web.archive.org/web/20131023074809/http://www.xbitlabs.com/news/cpu/display/20131018224745_AMD_Excavator_Core_May_Dramatic_Performance_Increases.html on 2013-10-23. Retrieved 2013-11-24.
9. "chessprogramming - BMI2". http://chessprogramming.wikispaces.com/BMI2. Retrieved 2014-02-09.
10. Yedidya Hilewitz; Ruby B. Lee. "A New Basis for Shifters in General-Purpose Processors for Existing and Advanced Bit Manipulations" (PDF). palms.princeton.edu. IEEE Transactions on Computers. 58 (8): 1035–1048. August 2009. http://palms.princeton.edu/system/files/IEEE_TC09_NewBasisForShifters.pdf. Retrieved 2014-02-10.
11. "chessprogramming - TBM". http://chessprogramming.wikispaces.com/TBM. Retrieved 2014-02-09.
12. "Family 16h AMD A-Series Data Sheet" (PDF). amd.com. AMD. October 2013. http://support.amd.com/TechDocs/52169_KB_A_Series_Mobile.pdf. Retrieved 2014-01-02.
13. "tbmintrin.h from GCC 4.8". https://gcc.gnu.org/viewcvs/gcc/branches/gcc-4_8-branch/gcc/config/i386/tbmintrin.h?revision=196696&view=markup. Retrieved 2014-03-17.
14. "BIOS and Kernel Developer's Guide for AMD Family 14h". http://developer.amd.com/wordpress/media/2012/10/43170_14h_Mod_00h-0Fh_BKDG.pdf. Retrieved 2014-01-03.

Further reading

  • Warren Jr., Henry S. (2013). Hacker's Delight (2nd ed.). Addison Wesley - Pearson Education, Inc. ISBN 978-0-321-84268-8.

External links

  • Intel Intrinsics Guide: https://software.intel.com/sites/landingpage/IntrinsicsGuide/