Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UCT/API: introduce sys_dev field for mem_alloc #10448

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

Akshay-Venkatesh
Copy link
Contributor

@Akshay-Venkatesh Akshay-Venkatesh commented Jan 23, 2025

What? Why?

As we move towards supporting multi-device configurations and relaxing GPU context setting requirements, UCP and other UCT users need to be able to indicate on which system device the allocation is being requested on without having to rely on inferring the same from context set against calling thread. This patch extends allocation params struct to include system device. In this initial implementation all UCT allocators ignore the system device argument but eventually different transports can convert the given system device into a bus_id and allocate against that specific device.

UCT_MEM_ALLOC_PARAM_FIELD_NAME = UCS_BIT(4)
UCT_MEM_ALLOC_PARAM_FIELD_NAME = UCS_BIT(4),

/** Enables @ref uct_mem_alloc_params_t::sys_device */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is called sys_dev in uct_mem_alloc_params_t. Pls align one of them with the other.
also UCT_MEM_ALLOC_PARAM_FIELD_SYS_DEVICE

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UCT_MEM_ALLOC_PARAM_FIELD_SYS_DEV

@@ -101,12 +102,15 @@ ucs_status_t uct_mem_alloc(size_t length, const uct_alloc_method_t *methods,
params->name : "anonymous-uct_mem_alloc";
mem_type = (params->field_mask & UCT_MEM_ALLOC_PARAM_FIELD_MEM_TYPE) ?
params->mem_type : UCS_MEMORY_TYPE_HOST;
sys_dev = (params->field_mask & UCT_MEM_ALLOC_PARAM_FIELD_SYS_DEVICE) ?
params->sys_dev : UCS_SYS_DEVICE_ID_UNKNOWN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whould we return INVALID_PARAM if sys_dev is not UCS_SYS_DEVICE_ID_UNKNOWN and mem_type is host?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brminich This assumes that GPU MDs cannot allocate host memory which is true today but maybe it is better for UCT MDs to return that error than catch it here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you think it can be a valid case in the future, then ok

Comment on lines 2573 to 2575
* Index of the system device on which memory is to be allocated.
* Eg: UCS_SYS_DEVICE_ID_UNKNOWN to allocate on host memory, or a specific
* index to allocate memory on GPU device.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just
"System device on which memory is to be allocated, or UCS_SYS_DEVICE_ID_UNKNOWN to allow allocating on any device".
The host memory comment is misleading because mem_type field is used to select host memory

src/uct/base/uct_mem.c Show resolved Hide resolved
src/uct/ib/mlx5/dv/ib_mlx5dv_md.c Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants