Entry | Calling convention
Definition
In computer science, a calling convention is an implementation-level (low-level) scheme for how subroutines receive parameters from their caller and how they return a result. Implementations differ in where parameters, return values, return addresses and scope links are placed, and in how the tasks of preparing for a function call and restoring the environment afterward are divided between the caller and the callee.

Calling conventions may be related to a particular programming language's evaluation strategy, but most often they are not considered part of it (or vice versa), as the evaluation strategy is usually defined on a higher abstraction level and seen as part of the language rather than as a low-level implementation detail of a particular language's compiler.

Variations
Calling conventions may differ in:
In some cases, differences also include the following:
Compiler variation
Although some languages may partially specify the calling convention in the language specification (or in some pivotal implementation), different implementations of such languages (i.e. different compilers) may still use various calling conventions, often selectable. Reasons for this are performance, frequent adaptation to the conventions of other popular languages (with or without technical reasons), and restrictions or conventions imposed by various "platforms" (combinations of CPU architectures and operating systems).

Architecture variation
CPU architectures always have more than one possible calling convention. With many general-purpose registers and other features, the potential number of calling conventions is large, although some architectures are formally specified to use only one calling convention, supplied by the architect.

x86 (32-bit)
Main article: x86 calling conventions
The x86 architecture is used with many different calling conventions. Due to the small number of architectural registers, the x86 calling conventions mostly pass arguments on the stack, while the return value (or a pointer to it) is passed in a register. Some conventions use registers for the first few parameters, which may improve performance for short, simple leaf routines that are invoked very frequently (i.e. routines that do not call other routines and do not have to be reentrant).

Example call:
Typical callee structure (some or all of the instructions below, except ret, may be optimized away in simple procedures):

ARM (A32)
The standard 32-bit ARM calling convention allocates the 15 general-purpose registers as:
The 16th register, r15, is the program counter. If the returned value is of a type too large to fit in r0 to r3, or of a size that cannot be determined statically at compile time, then the caller must allocate space for that value at run time and pass a pointer to that space in r0. Subroutines must preserve the contents of r4 to r11 and the stack pointer, for example by saving them to the stack in the function prologue, using them as scratch space, and then restoring them from the stack in the function epilogue. In particular, subroutines that call other subroutines must save the return address in the link register r14 to the stack before calling those other subroutines. However, such subroutines do not need to return that value to r14; they merely need to load that value into r15, the program counter, to return. The ARM calling convention mandates using a full-descending stack.[1] This calling convention causes a "typical" ARM subroutine to
ARM (A64)
The 64-bit ARM (AArch64) calling convention allocates the 31 general-purpose registers as:[2]
The 32nd register, which serves as a stack pointer or as a zero register depending on the context, is referenced either as sp or xzr. All registers starting with x have a corresponding 32-bit register prefixed with w. Thus, the 32-bit form of x0 is called w0.

PowerPC
The PowerPC architecture has a large number of registers, so most functions can pass all arguments in registers for single-level calls. Additional arguments are passed on the stack, and space for register-based arguments is also always allocated on the stack as a convenience to the called function in case multi-level calls are used (recursive or otherwise) and the registers must be saved. This is also of use in variadic functions.

MIPS
Main article: MIPS architecture
The most commonly used[3] calling convention for 32-bit MIPS is the O32[4] ABI, which passes the first four arguments to a function in the registers $a0-$a3; subsequent arguments are passed on the stack. Space on the stack is reserved for $a0-$a3 in case the callee needs to save its arguments, but the registers are not stored there by the caller. The return value is stored in register $v0; a second return value may be stored in $v1. The 64-bit N64 ABI allows for more arguments in registers, for more efficient function calls when there are more than four parameters. There is also the N32 ABI, which likewise allows for more arguments in registers. The return address when a function is called is stored in the $ra register automatically by use of the JAL (jump and link) or JALR (jump and link register) instructions. The function prologue of a (non-leaf) MIPS subroutine pushes the return address (in $ra) to the stack.[5][6]

The N32 and N64 ABIs pass the first eight arguments to a function in the registers $a0-$a7; subsequent arguments are passed on the stack. The return value (or a pointer to it) is stored in the register $v0; a second return value may be stored in $v1. In both the N32 and N64 ABIs all registers are considered to be 64 bits wide.
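The register-versus-stack split described above for O32 and N32/N64 can be sketched as a small lookup. This is a simplified model, not ABI-complete: it assumes every argument is a single word-sized integer or pointer and ignores the extra rules for floating-point and aggregate arguments. The names mips_abi, arg_register_count and arg_register are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical names for this sketch; not taken from any MIPS toolchain. */
typedef enum { ABI_O32, ABI_N32, ABI_N64 } mips_abi;

/* Integer argument registers available per ABI, as stated in the text:
   $a0-$a3 for O32, $a0-$a7 for N32 and N64. */
int arg_register_count(mips_abi abi) {
    return abi == ABI_O32 ? 4 : 8;
}

/* Returns N when the index-th (0-based) argument travels in $aN,
   or -1 when it is passed on the stack.
   Assumption: every argument is one word-sized integer or pointer. */
int arg_register(mips_abi abi, int index) {
    assert(index >= 0);
    return index < arg_register_count(abi) ? index : -1;
}
```

For example, arg_register(ABI_O32, 4) yields -1 because the fifth argument goes on the stack under O32, while arg_register(ABI_N64, 4) yields 4, meaning $a4.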
On both O32 and N32/N64 the stack grows downwards; the N32/N64 ABIs, however, require 64-bit alignment for all stack entries. The frame pointer ($30) is optional and in practice rarely used except when the stack allocation in a function is determined at runtime, for example by calling alloca(). For N32 and N64, the return address is typically stored 8 bytes before the stack pointer, although this may be optional. For the N32 and N64 ABIs, a function must preserve the $s0-$s7 registers, the global pointer ($gp or $28), the stack pointer ($sp or $29) and the frame pointer ($30). The O32 ABI is the same except that the calling function, rather than the called function, is required to save the $gp register. For multi-threaded code, the thread-local storage pointer is typically stored in special hardware register $29 and is accessed by using the rdhwr (read hardware register) instruction. At least one vendor is known to store this information in the $k0 register, which is normally reserved for kernel use, but this is not standard. The $k0 and $k1 registers ($26–$27) are reserved for kernel use and should not be used by applications, since these registers can be changed at any time by the kernel due to interrupts, context switches or other events.
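The preservation rules above, including the one difference between O32 and N32/N64 for $gp, can be written as a predicate over register numbers. This is a minimal sketch: the text gives the numbers for $gp, $sp and $fp, while mapping $s0-$s7 to $16-$23 is the conventional MIPS numbering and is an assumption not stated above. The names mips_abi and callee_must_preserve are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for this sketch. */
typedef enum { ABI_O32, ABI_N32, ABI_N64 } mips_abi;

/* $gp, $sp and $fp numbers come from the text; $s0-$s7 = $16-$23 is the
   conventional numbering, assumed here. */
enum { REG_S0 = 16, REG_S7 = 23, REG_GP = 28, REG_SP = 29, REG_FP = 30 };

/* True when the called function must leave the register unchanged on return. */
bool callee_must_preserve(mips_abi abi, int reg) {
    assert(reg >= 0 && reg <= 31);
    if (reg >= REG_S0 && reg <= REG_S7) return true;  /* $s0-$s7 */
    if (reg == REG_SP || reg == REG_FP) return true;  /* stack and frame pointer */
    if (reg == REG_GP) return abi != ABI_O32;         /* under O32 the caller saves $gp */
    return false;                                     /* everything else may be clobbered */
}
```

Under this model callee_must_preserve(ABI_N64, 28) is true while callee_must_preserve(ABI_O32, 28) is false, which is exactly the $gp difference described above.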
Registers that are preserved across a call are registers that (by convention) will not be changed by a system call or procedure (function) call. For example, $s-registers must be saved to the stack by a procedure that needs to use them, and $sp and $fp are always incremented by constants and decremented back after the procedure is done with them (and the memory they point to). By contrast, $ra is changed automatically by any normal function call (one that uses jal), and $t-registers must be saved by the program before any procedure call if the program needs the values inside them after the call.

The userspace calling convention of position-independent code on Linux additionally requires that when a function is called, the $t9 register must contain the address of that function.[8] This convention dates back to the System V ABI supplement for MIPS.[9]

SPARC
The SPARC architecture, unlike most RISC architectures, is built on register windows. There are 24 accessible registers in each register window: 8 are the "in" registers (%i0-%i7), 8 are the "local" registers (%l0-%l7), and 8 are the "out" registers (%o0-%o7). The "in" registers are used to pass arguments to the function being called, and any additional arguments need to be pushed onto the stack. However, space is always allocated by the called function to handle a potential register window overflow, local variables, and (on 32-bit SPARC) returning a struct by value. To call a function, one places the arguments for the function to be called in the "out" registers; when the function is called, the "out" registers become the "in" registers and the called function accesses the arguments in its "in" registers. When the called function completes, it places the return value in the first "in" register, which becomes the first "out" register when the called function returns.
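The way a caller's "out" registers become the callee's "in" registers can be modelled with a flat array and a sliding window index. This is only a software model under stated assumptions: a hypothetical window count of 4, a call depth that grows upward, and an assertion where real hardware would wrap around and trap on window overflow. All names (sparc_model, in_reg, out_reg, window_save, window_restore, callee_add, call_add) are invented for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NWINDOWS 4        /* assumed window count for this sketch */
#define WINDOW_STEP 16    /* 8 ins + 8 locals; the 8 outs overlap the next window's ins */

typedef struct {
    uint32_t file[NWINDOWS * WINDOW_STEP + 8]; /* flat register file */
    int depth;                                 /* current window, 0 = outermost */
} sparc_model;

/* Window layout at a given depth: ins, then locals, then outs. */
uint32_t *in_reg(sparc_model *m, int i)    { return &m->file[m->depth * WINDOW_STEP + i]; }
uint32_t *local_reg(sparc_model *m, int i) { return &m->file[m->depth * WINDOW_STEP + 8 + i]; }
uint32_t *out_reg(sparc_model *m, int i)   { return &m->file[m->depth * WINDOW_STEP + 16 + i]; }

/* Entering a function slides the window: the caller's outs become our ins. */
void window_save(sparc_model *m)    { assert(m->depth < NWINDOWS - 1); m->depth++; }
/* Leaving slides it back: our ins are the caller's outs again. */
void window_restore(sparc_model *m) { assert(m->depth > 0); m->depth--; }

/* Hypothetical callee: adds its first two arguments, result in its first "in". */
void callee_add(sparc_model *m) {
    window_save(m);
    *in_reg(m, 0) = *in_reg(m, 0) + *in_reg(m, 1);
    window_restore(m);
}

/* Hypothetical caller: arguments go in the outs, the result comes back in the first out. */
uint32_t call_add(sparc_model *m, uint32_t a, uint32_t b) {
    *out_reg(m, 0) = a;
    *out_reg(m, 1) = b;
    callee_add(m);
    return *out_reg(m, 0);
}
```

No register contents are copied on call or return in this model; only the window index moves, which is the point of the overlap.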
The System V ABI,[10] which most modern Unix-like systems follow, passes the first six arguments in "in" registers %i0 through %i5, reserving %i6 for the frame pointer and %i7 for the return address.

IBM System/360 and successors
The IBM System/360 is another architecture without a hardware stack. The examples below illustrate the calling convention used by OS/360 and successors prior to the introduction of 64-bit z/Architecture; other operating systems for System/360 might have different calling conventions.
Notes:
In the System/390 ABI[11] and the z/Architecture ABI,[12] used in Linux:
SuperH
Main article: SuperH

68k
The most common calling convention for the Motorola 68000 series is:[13][14][15][16]
IBM 1130
Main article: IBM 1130
The IBM 1130 was a small 16-bit word-addressable machine. It had only six registers plus condition indicators, and no stack. The registers are Instruction Address Register (IAR), Accumulator (ACC), Accumulator Extension (EXT), and three index registers X1–X3. The calling program is responsible for saving ACC, EXT, X1, and X2.[17] There are two pseudo-operations for calling subroutines, Arguments follow the

    * 1130 subroutine example
          ENT  SUB        Declare "SUB" an external entry point
    SUB   DC   0          Reserved word at entry point, conventionally coded "DC *-*"
    * Subroutine code begins here
    * If there were arguments the addresses can be loaded indirectly from the return address
          LDX  I 1 SUB    Load X1 with the address of the first argument (for example)
          ...
    * Return sequence
          LD   RES        Load integer result into ACC
    * If no arguments were provided, indirect branch to the stored return address
          B    I SUB      If no arguments were provided
          END  SUB

Subroutines in IBM 1130, CDC 6600 and PDP-8 (all three computers were introduced in 1965) store the return address in the first location of a subroutine.[19]

Implementation considerations
This variability must be considered when combining modules written in multiple languages, or when calling operating system or library APIs from a language other than the one in which they are written; in these cases, special care must be taken to coordinate the calling conventions used by caller and callee. Even a program using a single programming language may use multiple calling conventions, either chosen by the compiler for code optimization or specified by the programmer.

Threaded code
Main article: Threaded code
Threaded code places all the responsibility for setting up for and cleaning up after a function call on the called code. The calling code does nothing but list the subroutines to be called.
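A rough model of this idea in portable C is sketched below. It is an approximation under stated assumptions: real threaded-code systems dispatch a list of addresses with a small machine-code inner interpreter, whereas here the "thread" is a NULL-terminated array of function pointers and the C compiler's own calling convention still runs underneath. The names vm, word, run, w_dup, w_add, w_mul and square_plus_self are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 64

/* Data stack shared by all words; top counts the items currently held. */
typedef struct { long data[STACK_MAX]; int top; } vm;
typedef void (*word)(vm *);

void push(vm *m, long v) { assert(m->top < STACK_MAX); m->data[m->top++] = v; }
long pop(vm *m)          { assert(m->top > 0); return m->data[--m->top]; }

/* Each word takes its arguments from the stack and leaves its results there,
   so all setup and cleanup lives inside the word itself. */
void w_dup(vm *m) { long a = pop(m); push(m, a); push(m, a); }
void w_add(vm *m) { long b = pop(m); long a = pop(m); push(m, a + b); }
void w_mul(vm *m) { long b = pop(m); long a = pop(m); push(m, a * b); }

/* The "calling code": nothing but a list of subroutines. Computes x*x + x. */
const word square_plus_self[] = { w_dup, w_dup, w_mul, w_add, NULL };

/* Inner interpreter: walk the list and invoke each word in turn. */
void run(vm *m, const word *thread) {
    for (const word *ip = thread; *ip != NULL; ip++)
        (*ip)(m);
}
```

Pushing 6 and running square_plus_self leaves 42 on the stack; the list itself contains no argument-passing or cleanup code at all.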
Listing only the subroutines in the calling code puts all the function setup and cleanup code in one place (the prolog and epilog of the function) rather than in the many places that function is called. This makes threaded code the most compact calling convention. Threaded code passes all arguments on the stack, and all return values are returned on the stack. This makes naive implementations slower than calling conventions that keep more values in registers. However, threaded code implementations that cache several of the top stack values in registers (in particular, the return address) are usually faster than subroutine calling conventions that always push and pop the return address to the stack.[20][21][22]

PL/I
The default calling convention for programs written in the PL/I language passes all arguments by reference, although other conventions may optionally be specified. The arguments are handled differently for different compilers and platforms, but typically the argument addresses are passed via an argument list in memory. A final, hidden address may be passed pointing to an area to contain the return value. Because of the wide variety of data types supported by PL/I, a data descriptor may also be passed to define, for example, the lengths of character or bit strings, the dimension and bounds of arrays (dope vectors), or the layout and contents of a data structure. Dummy arguments are created for arguments which are constants or which do not agree with the type of argument the called procedure expects.
References
1. "Procedure Call Standard for the ARM Architecture". 2008.
2. "ARM Cortex-A Series Programmer’s Guide for ARMv8-A, §9.1.1. Parameters in general-purpose registers". ARM Developer. https://developer.arm.com/products/architecture/cpu-architecture/a-profile/docs/den0024/latest/the-abi-for-arm-64-bit-architecture/register-use-in-the-aarch64-procedure-call-standard/parameters-in-general-purpose-registers. Accessed 7 October 2018.
3. Sweetman, Dominic. See MIPS Run, 2nd edition. Morgan Kaufmann. ISBN 0-12088-421-6.
4. "MIPS32 Instruction Set Quick Reference". https://www.mips.com/?do-download=mips32-instruction-set-quick-reference-v1-01
5. Karen Miller. "The MIPS Register Usage Conventions". 2006.
6. Hal Perkins. "MIPS Calling Convention". 2006. https://courses.cs.washington.edu/courses/cse410/09sp/examples/MIPSCallingConventionsSummary.pdf
7. MIPSpro N32 ABI Handbook. Silicon Graphics. https://www.linux-mips.org/pub/linux/mips/doc/ABI/MIPS-N32-ABI-Handbook.pdf
8. "PIC code - LinuxMIPS". www.linux-mips.org. https://www.linux-mips.org/wiki/PIC_code. Accessed 2018-09-21.
9. "System V Application Binary Interface MIPS RISC Processor Supplement, 3rd Edition", p. 3-12. http://math-atlas.sourceforge.net/devel/assembly/mipsabi32.pdf
10. System V Application Binary Interface SPARC Processor Supplement, 3rd edition. http://sparc.org/wp-content/uploads/2014/01/psABI3rd.pdf.gz
11. "S/390 ELF Application Binary Interface Supplement". http://refspecs.linuxbase.org/ELF/zSeries/lzsabi0_s390.html
12. "zSeries ELF Application Binary Interface Supplement". https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries.html#AEN410
13. Dr. Mike Smith. "SHARC (21k) and 68k Register Comparison". http://people.ucalgary.ca/~smithmr/2002webs/encm515_02/02general/background_info/registercompare.htm
14. XGCC: The Gnu C/C++ Language System for Embedded Development. Embedded Support Tools Corporation, 2000, p. 59. http://gendev.spritesmind.net/files/xgcc/xgcc.pdf
15. "COLDFIRE/68K: ThreadX for the Freescale ColdFire Family". http://rtos.com/products/threadx/ColdFire68K. Archived 2015-10-02 at https://web.archive.org/web/20151002215924/http://rtos.com/products/threadx/ColdFire68K#
16. Andreas Moshovos. "Subroutines Continued: Passing Arguments, Returning Values and Allocating Local Variables". http://www.eecg.toronto.edu/~moshovos/ECE243-06/l12-subroutines-2.html. Quote: "all registers except d0, d1, a0, a1 and a7 should be preserved across a call."
17. IBM Corporation. IBM 1130 Disk Monitor System, Version 2 System Introduction (C26-3709-0). 1967, p. 67. http://media.ibm1130.org/E0018.pdf. Accessed Dec 21, 2014.
18. IBM Corporation. IBM 1130 Assembler Language (C26-5927-4). 1968, pp. 24–25. http://media.ibm1130.org/E0022.pdf
19. Mark Smotherman. "Subroutine and procedure call support: Early history". 2004.
20. Brad Rodriguez. "Moving Forth, Part 1: Design Decisions in the Forth Kernel". Quote: "On the 6809 or Zilog Super8, DTC is faster than STC."
21. Anton Ertl. "Speed of various interpreter dispatch techniques".
22. Mathew Zaleski. "YETI: a graduallY Extensible Trace Interpreter". 2008. Chapter 4: Design and Implementation of Efficient Interpretation. Quote: "Although direct-threaded interpreters are known to have poor branch prediction properties...the latency of a call and return may be greater than an indirect jump."

External links
Wikibooks: Embedded Systems (Mixed C and Assembly Programming)
Wikibooks: X86 Disassembly (Calling Conventions)