Entry | FMA instruction set
Definition |
The FMA instruction set is an extension to the 128- and 256-bit Streaming SIMD Extensions instructions in the x86 microprocessor instruction set to perform fused multiply–add (FMA) operations.[1] There are two variants, FMA3 and FMA4.
New instructions
FMA3 and FMA4 instructions have almost identical functionality but are not compatible with each other. Both provide fused multiply–add (FMA) instructions for floating-point scalar and SIMD operations; FMA3 instructions take three operands, while FMA4 instructions take four. The FMA operation has the form d = round(a · b + c), where round performs a single rounding of a · b + c so that the result fits in the destination register when it has more significant bits than the destination can hold. The four-operand form (FMA4) allows a, b, c and d to be four different registers, while the three-operand form (FMA3) requires that d be the same register as a, b or c. The three-operand form makes the code shorter and the hardware implementation slightly simpler, while the four-operand form provides more programming flexibility.
See XOP instruction set for more discussion of compatibility issues between Intel and AMD.
FMA3 instruction set
CPUs with FMA3
Excerpt from FMA3
FMA4 instruction set
CPUs with FMA4
Excerpt from FMA4
History
The incompatibility between Intel's FMA3 and AMD's FMA4 arose because both companies changed plans without coordinating coding details with each other: AMD switched from FMA3 to FMA4 at almost the same time that Intel switched from FMA4 to FMA3.
Compiler and assembler support
Different compilers provide different levels of support for FMA4.
References
1. Woltmann, George (Prime95). "Intel AVX and GIMPS". mersenneforum.org. Great Internet Mersenne Prime Search (GIMPS) project. http://www.mersenneforum.org/showthread.php?t=14335&highlight=fused+multiply+add. Retrieved 27 July 2011. Quote: "FMA3 and FMA4 are not instruction sets, they are individual instructions -- fused multiply add. They could be quite useful depending on how Intel and AMD implement them"
2. Maffeo, Robin. "AMD and the Visual Studio 11 Beta". AMD. March 1, 2012. http://developer.amd.com/community/blog/2012/03/01/amd-and-the-visual-studio-11-beta/. Archived at https://archive.is/20131109140742/http://developer.amd.com/community/blog/2012/03/01/amd-and-the-visual-studio-11-beta/ on November 9, 2013. Retrieved 2018-11-07.
3. "AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions". AMD. May 1, 2009. http://support.amd.com/TechDocs/43479.pdf
4. "New "Bulldozer" and "Piledriver" Instructions A step forward for high performance software development". AMD. October 2012. http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf
5. http://agner.org/optimize/blog/read.php?i=838
6. "www.amd.com, FMA4 support model list". https://products.amd.com/en-us/search/cpu#Default=%7B%22k%22%3A%22%22%2C%22r%22%3A%5B%7B%22n%22%3A%22FMAOWSCHCS%22%2C%22t%22%3A%5B%22%5C%22%C7%82%C7%82464d4134%5C%22%22%5D%2C%22o%22%3A%22OR%22%2C%22k%22%3Afalse%2C%22m%22%3A%7B%22%5C%22%C7%82%C7%82464d4134%5C%22%22%3A%22FMA4%22%7D%7D%5D%7D#2d521741-4cc8-44d2-aa87-874f9bb51787=%7B%22k%22%3A%22%22%7D
7. "www.amd.com, FMA4 support model list". https://products.amd.com/en-us/search/APU/AMD-Ryzen™-Processors/AMD-Ryzen™-5-Processor-with-Radeon™-Vega-Graphics/AMD-Ryzen™-5-2400G/243
8. "www.amd.com, FMA4 support model list". https://products.amd.com/en-us/search/APU/AMD-Ryzen™-Processors/AMD-Ryzen™-3-Processor-with-Radeon™-Vega-Graphics/AMD-Ryzen™-3-2200G/244
9. "128-Bit SSE5 Instruction Set". AMD Developer Central. http://developer.amd.com/SSE5. Archived at https://web.archive.org/web/20080115163416/http://developer.amd.com/SSE5 on 2008-01-15. Retrieved 2008-01-28.
10. "Intel Advanced Vector Extensions Programming Reference". Intel. http://softwarecommunity.intel.com/isn/downloads/intelavx/Intel-AVX-Programming-Reference-31943302.pdf (dead link as of September 2017). Retrieved 2008-04-05.
11. "Intel Advanced Vector Extensions Programming Reference". Intel. http://software.intel.com/en-us/avx/. Retrieved 2009-05-06.
12. "Striking a balance". Dave Christie, AMD Developer blogs. May 6, 2009. http://blogs.amd.com/developer/2009/05/06/striking-a-balance/. Archived at https://archive.li/20120708101459/http://blogs.amd.com/developer/2009/05/06/striking-a-balance/ on July 8, 2012. Retrieved 2018-11-07.
13. "New Bulldozer and Piledriver Instructions". AMD. http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf. Retrieved 25 July 2013.
14. "Software Optimization Guide for AMD Family 15h Processors". AMD. http://support.amd.com/us/Processor_TechDocs/47414_15h_sw_opt_guide.pdf. Retrieved 19 April 2012.
15. "Intel Architecture Instruction Set Extensions Programming Reference". Intel. http://software.intel.com/sites/default/files/319433-015.pdf. Retrieved 25 July 2013.
16. "The microarchitecture of Intel, AMD and VIA CPUs An optimization guide for assembly programmers and compiler makers". http://www.agner.org/optimize/microarchitecture.pdf. Retrieved 2017-05-02.
17. https://sourceware.org/ml/binutils/2015-03/msg00078.html
18. https://sourceware.org/ml/binutils/2015-08/msg00039.html
19. "Discussion – Ryzen has undocumented support for FMA4". https://www.reddit.com/r/Amd/comments/68s4bj/ryzen_has_undocumented_support_for_fma4/dh0y353/. Retrieved 2017-05-10.
20. "AMD Ryzen Machine Crashes to a Sequence of FMA3 Instructions". https://www.techpowerup.com/231536/amd-ryzen-machine-crashes-to-a-sequence-of-fma3-instructions. Retrieved 2017-09-10.
21. Latif, Lawrence. "AMD Bulldozer only FMA4 and XOP instructions are supported by GCC Intel still mute". The Inquirer. Nov 14, 2011. http://www.theinquirer.net/inquirer/news/2124866/amd-bulldozer-fma4-xop-instructions-supported-gcc
22. "FMA4 Intrinsics Added for Visual Studio 2010 SP1". http://msdn.microsoft.com/en-us/library/vstudio/gg445134(v=vs.100).aspx
23. "EKOPath man doc". http://www.pathscale.com/node/272. Archived at https://web.archive.org/web/20160623224118/http://www.pathscale.com/node/272 on 2016-06-23. Retrieved 2013-07-24.
24. "LLVM 3.1 Release Notes". http://llvm.org/releases/3.1/docs/ReleaseNotes.html
25. "Enable detection of AVX and AVX2 support through CPUID". LLVM. 2012-04-26. http://llvm.org/viewvc/llvm-project?view=revision&revision=155618