Skip to content

Latest commit

 

History

History
125 lines (95 loc) · 4.54 KB

cxx-x64-v2-calling-convention.md

File metadata and controls

125 lines (95 loc) · 4.54 KB

◀️ C++ Proposals and ideas

C++ Implementation Proposal: x86-64 calling convention v2

TL;DR: Spill everything, pack what you can, don't skip any registers, and be done with it.

This is obviously NOT a suggestion to change ABI for interfacing with the OS. That would not be feasible in any realistic form (albeit AArch64 Windows does have 2 native ABIs).

This is a suggestion for ABI used within the process and accompanied DLLs.
Just like C++ process on x86-32 is __cdecl, but is interfacing the OS via __stdcall.

Rationale

Large production C++ codebases still resist modernization efforts, such as replacing pointer & size parameters with std::span or std::string_view, or nullable pointers with std::optional or std::unique_ptr, for the reason that doing so is significant measurable pessimisation.

The culprit is Windows' X64 calling convention, which mandates those utilities are passed via pointer, instead of spilled into registers. This introduces complex constructions at the call site and, primarily, unnecessary indirection inside the callee.

The simplicity argument

After a mess of myriad x86-32 calling conventions, it is often presented as an important advantage and a fact, that there is only one single calling convention on "X64". That is not true.

There are at least System V AMD64, Microsoft's x64 and __vectorcall conventions. And then [[trivial_abi]] exists on Clang.

Proposed calling convention semantics

Non-skipping

  • Parameter registers are progressively allocated, not assigned by argument index.

Spilling

  • Structures, larger than a register, passed as arguments are spilled into registers

Packing

  • All structures smaller or equal to a register size are passed in single register.
    This can be achieved by a single MOV instruction.
    • Potential issues with structures of non-power of 2 size:
      • Loading 6B at the end of page, causing possible access violation fault. Restrict to power of 2?
      • Mov 6B to a register moves 8B. Insecure. Possible data leaking. Require extra masking?
  • While 8B heterogeneous struct { int i; float f; } is passed in a single register, larger ones are spilled per-element. Someone will eventually need to unpack those before use, so why not do that at call site.
  • Structures containing multiple float/double types are packed into as many XMM registers as possible.
    • YMM/ZMM?

Lifetime

Various

  • Empty objects do not consume register nor stack slot, when passed by value.

Examples

struct Pt {
    int x;
    int y;
};
struct Complex {
    float r;
    float i;
};
struct Data {
    int a;
    float b;
    Pt pt;
    __m128 c;
};
struct Version {
    std::uint8_t  major;
    std::uint8_t  minor;
    std::uint16_t patch;
    std::uint32_t build;
};

void CONVENTIONNAME function (int z, Data s, float f, Complex g, Version v);

Assignment

Parameter Register
z rcx
s.a rdx
s.b xmm0
s.pt r8
s.c xmm1
f xmm2
g xmm3
v r9

Possible names

  • __spillcall, __packcall, __spcall, ...

Limitations

  • Working within current boundaries of Microsoft Visual C++ we need to fit within:
    • RCX, RDX, R8, R9, R10, R11, and XMM0 to XMM5 are considered volatile (caller-saved)
  • If the arguments don't fit the available registers, the calling convention reverts to the original one.

Return values

  • The exactly same mechanism is used to pack and spill return value into RAX, R10, R11, XMM4 and XMM5 registers.
    • Microsoft's ABI considers those volatile on return

Additional reading