cranelift: Truly dynamic vector types #4418

sparker-arm · 2022-07-08T13:25:44Z

Current Status

Basic support for dynamic vector types has been committed in preparation for supporting the Wasm Flexible Vectors extension. The RFC discussion around the design is here. While we now have support for dynamic/flexible/scalable vectors, the implementation doesn't yet support truly dynamic types. This is a top-level issue tracking what needs to be done to get there.

Cranelift's type system now contains a dynamic vector type for each corresponding fixed-width type, with the dynamic type being a fixed-width type scaled by a target-defined factor. Currently, this factor (dyn_scale_target_const) is legalized to a constant. This makes lowering and stack layout simple. Below is an example CLIF function.

function %i8x16_splat_add(i8, i8) -> i8x16 {
  gv0 = dyn_scale_target_const.i8x16
  dt0 = i8x16*gv0

block0(v0: i8, v1: i8):
  v2 = splat.dt0 v0
  v3 = splat.dt0 v1
  v4 = iadd v2, v3
  v5 = extract_vector v4, 0
  return v5
}
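To make the "fixed-width type scaled by a target-defined factor" relationship concrete, here is a minimal Rust sketch of the size arithmetic. `BaseVector` and `dynamic_vector_bytes` are hypothetical names for illustration only; they are not Cranelift's actual `Type` API.

```rust
/// Hypothetical model of a fixed-width base vector type such as i8x16.
/// This is an illustration, not Cranelift's real `Type`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct BaseVector {
    pub lane_bits: u64,
    pub lanes: u64,
}

impl BaseVector {
    /// Size of the fixed-width base type in bytes.
    pub const fn bytes(self) -> u64 {
        self.lane_bits / 8 * self.lanes
    }
}

/// Size in bytes of the dynamic type `base * scale`, where `scale` plays the
/// role of the target-defined factor (a constant today, a runtime value later).
pub fn dynamic_vector_bytes(base: BaseVector, scale: u64) -> u64 {
    base.bytes() * scale
}
```

With `i8x16` as the base type, a scale of 1 gives a 16-byte vector and a scale of 4 gives a 64-byte vector.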

Next Steps

The two main areas that need more work are the IR and the MachInst ABI layer. The first part is to modify the existing GlobalValue, or introduce a new one, so that it is legalized and lowered to a runtime scaling value, perhaps in a similar way to how global values are currently used in the ABI to generate the stack limit.

The second, and bigger issue, is in the ABI layer where we determine the stack layout. A complicating factor here is that the 'Vanilla' layer is mainly shared between the backends and, of course, everything is also designed around known constant sizes. However, I think the biggest challenge is the interface with the register allocator.

Spill Slots
The register allocator currently only supports two register classes, and types aren't tracked, so a target with wide vector support could use far more stack than necessary. For example, with the current implementation, a target using AVX-512 would require 64 bytes to spill a single-precision float.

To enable truly dynamic types, the interface with the register allocator will need to change. If we leave it to return a constant value, we will need to accommodate the maximum possible register size (2048 bits, or 256 bytes, in the case of SVE), and that is a prohibitive cost for most CPUs.

Also, wider vectors usually come with predication, and predicate registers are unlikely to map to either of the existing regalloc classes.
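The spill-slot cost can be sketched with some simple arithmetic. The two functions below are hypothetical and ignore alignment; they only contrast sizing every slot for the widest register in a class with sizing each slot for the value actually spilled.

```rust
/// Stack bytes used when every spill slot in a register class is sized for
/// the widest register in that class (a simplified model of the behaviour
/// described above; alignment padding is ignored).
pub fn class_sized_spill_bytes(widest_reg_bytes: u64, spilled_values: usize) -> u64 {
    widest_reg_bytes * spilled_values as u64
}

/// Stack bytes used if the allocator tracked the size of each spilled value
/// (hypothetical; again ignoring alignment padding).
pub fn type_sized_spill_bytes(value_bytes: &[u64]) -> u64 {
    value_bytes.iter().sum()
}
```

For three spilled single-precision floats on a target with 64-byte vector registers, the class-sized model reserves 192 bytes where 12 bytes of payload would do.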

Stack Slots
Along with spill slots, our frame layout will need to handle dynamically sized stack slots which are defined in the IR or, possibly, arguments passed on the stack. The current stack layout is as follows (there's currently no distinction between spill slots for fixed and dynamic types):

//!   (high address)
//!
//!                              +---------------------------+
//!                              |          ...              |
//!                              | stack args                |
//!                              | (accessed via FP)         |
//!                              +---------------------------+
//! SP at function entry ----->  | return address            |
//!                              +---------------------------+
//! FP after prologue -------->  | FP (pushed by prologue)   |
//!                              +---------------------------+
//!                              |          ...              |
//!                              | clobbered callee-saves    |
//! unwind-frame base     ---->  | (pushed by prologue)      |
//!                              +---------------------------+
//!                              |          ...              |
//!                              | spill slots               |
//!                              | (accessed via nominal SP) |
//!                              |          ...              |
//!                              | sized stack slots         |
//!                              | dynamic stack slots       |
//!                              | (accessed via nominal SP) |
//! nominal SP --------------->  | (alloc'd by prologue)     |
//! (SP at end of prologue)      +---------------------------+
//!                              | [alignment as needed]     |
//!                              |          ...              |
//!                              | args for call             |
//! SP before making a call -->  | (pushed at callsite)      |
//!                              +---------------------------+
//!
//!   (low address)

We likely want to collect all the dynamically sized stack values, move them up the stack, and introduce a new StackAMode so that they are addressed via FP. Spill and stack slots of compile-time known sizes can be accessed as they are now, but the way we calculate SP at the end of the prologue will need to be modified. The current implementation allows the TargetIsa to report a fixed size for each dynamic type, so stack offsets can be calculated at compile time. For dynamically sized objects, I think we'll want to use vmctx to generate our scaling factor from a GlobalValue and then multiply a slot index by the scaling factor to get our address.
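As a rough illustration of "multiply a slot index by the scaling factor", the sketch below computes an FP-relative offset for a dynamic slot. It assumes, purely for illustration, that the dynamic area sits directly below the clobbered callee-saves, that all dynamic slots share one base size, and that the stack grows downwards. `dynamic_slot_fp_offset` is a hypothetical helper, not an existing Cranelift function.

```rust
/// Offset from FP (in bytes, negative because the stack grows down) of the
/// lowest byte of dynamic slot `index`.
///
/// Assumed layout (hypothetical): FP, then `callee_save_bytes` of clobbered
/// callee-saves, then dynamic slots of `base_slot_bytes * scale` bytes each.
/// `scale` stands in for the runtime scaling factor loaded via a GlobalValue.
pub fn dynamic_slot_fp_offset(
    callee_save_bytes: i64,
    base_slot_bytes: i64,
    scale: i64,
    index: i64,
) -> i64 {
    let slot_bytes = base_slot_bytes * scale;
    -(callee_save_bytes + (index + 1) * slot_bytes)
}
```

With 32 bytes of callee-saves, a 16-byte base slot and a runtime scale of 2, slot 0 starts at FP-64 and slot 1 at FP-96. Only the multiplication by `scale` has to happen at run time; everything else is known when the function is compiled.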

It could be that a target wants to specify multiple scaling values though, depending on the type/register that will be used. So, we could group the values so that each group is using the same scale value. The awkward part here is that we won't have a uniform space to scale across, and so we'll need a method to 'jump' over groups.
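One way to picture the "jump over groups" problem is the sketch below, where each group of slots carries its own scale and the offset of a slot is the total size of all earlier groups plus a scaled index within its own group. `SlotGroup` and `slot_offset` are hypothetical names, and the layout (groups packed back to back, offsets measured from the start of the dynamic area) is an assumption made for the example.

```rust
/// A run of dynamic slots that all share the same scaling value
/// (hypothetical structure, for illustration only).
pub struct SlotGroup {
    /// Size of one slot before scaling, in bytes.
    pub base_slot_bytes: u64,
    /// Number of slots in this group.
    pub slots: u64,
    /// Scaling value for this group; a runtime value in practice.
    pub scale: u64,
}

impl SlotGroup {
    /// Total bytes occupied by the whole group.
    pub fn total_bytes(&self) -> u64 {
        self.base_slot_bytes * self.slots * self.scale
    }
}

/// Byte offset of slot `index` in group `group`, measured from the start of
/// the dynamic area. Returns `None` if the group or slot does not exist.
pub fn slot_offset(groups: &[SlotGroup], group: usize, index: u64) -> Option<u64> {
    let g = groups.get(group)?;
    if index >= g.slots {
        return None;
    }
    // "Jump" over every earlier group, each scaled by its own factor.
    let skipped: u64 = groups[..group].iter().map(SlotGroup::total_bytes).sum();
    Some(skipped + index * g.base_slot_bytes * g.scale)
}
```

The sum over earlier groups is the awkward part: it costs one multiply-add per group with a distinct runtime scale, so keeping the number of groups small keeps address generation cheap.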

sparker-arm added the cranelift:area:regalloc (Issues related to register allocation), cranelift:area:machinst (Issues related to instruction selection and the new MachInst backend), and cranelift:area:clif labels Jul 8, 2022

sparker-arm mentioned this issue Jul 8, 2022

[RFC] Dynamic Vector Support #4200

Merged

akirilov-arm added the cranelift Issues related to the Cranelift code generator label Jul 11, 2022

workingjubilee mentioned this issue Mar 3, 2023

RFC: Add a scalable representation to allow support for scalable vectors rust-lang/rfcs#3268

Open

fitzgen mentioned this issue Oct 14, 2024

Implement support for Wasm's flexible-vectors proposal #9464

Open
