Capturing Stream Safety #1240
Comments
Thanks for bringing up this issue. We are looking into it now and will get back to you as soon as we can with some answers. Thanks again.
Hello @FreddieWitherden, rocBLAS functions are not safe to use with HIP Graph functions. We will work towards making them Graph safe in future releases of rocBLAS.
Thank you for this. My understanding is that, to make them graph safe, whenever a function (such as SGEMM) that wants to use temporary storage is called, the code should first call
AFAIK it requires creating a pool of memory associated with the graph. Nodes in the graph asynchronously allocate from the pool; once an allocation succeeds, kernels are launched asynchronously; and once the kernels have completed, the memory the node allocated from the pool is asynchronously freed. There needs to be sufficient memory in the pool to allow progress. The order of the asynchronous operations is controlled by the graph.
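To make the ordering concrete, here is a minimal host-only sketch of the allocate, launch, free sequence described above. It is plain C++ with no HIP calls: `MockPool`, `MockNode`, and `run_graph` are hypothetical names invented for illustration, the pool only counts bytes, and the nodes run strictly one after another, which is a simplification of real asynchronous graph execution.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical fixed-capacity pool; tracks byte counts only, no device memory.
class MockPool {
public:
    explicit MockPool(std::size_t capacity) : capacity_(capacity) {}

    // Succeeds only if the pool still has room for `bytes`.
    bool try_alloc(std::size_t bytes) {
        if (in_use_ + bytes > capacity_) return false;
        in_use_ += bytes;
        return true;
    }

    void release(std::size_t bytes) {
        assert(bytes <= in_use_);
        in_use_ -= bytes;
    }

    std::size_t in_use() const { return in_use_; }

private:
    std::size_t capacity_;
    std::size_t in_use_ = 0;
};

// Hypothetical graph node: needs scratch space while its kernel runs.
struct MockNode {
    std::size_t scratch_bytes;
    std::function<void()> kernel;
};

// Runs nodes in graph order and returns how many completed.
// Stops at the first node whose allocation the pool cannot satisfy.
std::size_t run_graph(MockPool& pool, const std::vector<MockNode>& nodes) {
    std::size_t completed = 0;
    for (const MockNode& node : nodes) {
        if (!pool.try_alloc(node.scratch_bytes)) break;  // no progress possible
        node.kernel();                                   // "launch" the kernel
        pool.release(node.scratch_bytes);                // free after completion
        ++completed;
    }
    return completed;
}
```

With two nodes needing 256 and 512 bytes, a 512-byte pool lets both complete, while a 300-byte pool stalls at the second node, which is the "sufficient memory to allow progress" condition.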
The approach outlined above is somewhat simpler and takes advantage of the fact that only one instance of a captured graph can be meaningfully run at once. Thus, when a stream is detected to be capturing, it is sufficient to allocate fresh temporary storage (which is only ever used for that particular kernel invocation and never reused). Although a little wasteful, this avoids any specific interaction with the graph, the need for the graph to be able to allocate and deallocate memory (which I do not think is currently possible in HIP), and any overhead associated with this. I believe this is the approach taken by cuBLAS to ensure graph safety.
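A rough sketch of this fresh-storage idea, written as host-only C++ so it can be compiled without HIP. `MockStream` and `WorkspaceProvider` are hypothetical names, not rocBLAS or cuBLAS internals. The `capturing` flag stands in for whatever a real capture-status query (presumably `hipStreamIsCapturing`) would report, and host buffers stand in for device scratch memory.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a stream; `capturing` mimics the result of a
// capture-status query on a real stream.
struct MockStream {
    bool capturing = false;
};

// Hypothetical workspace manager, for illustration only.
class WorkspaceProvider {
public:
    // Returns scratch storage of at least `bytes` bytes for one kernel launch.
    unsigned char* acquire(const MockStream& stream, std::size_t bytes) {
        assert(bytes > 0);
        if (stream.capturing) {
            // Capture path: a fresh buffer per call, never reused and kept
            // alive as long as the provider, so graph replays see valid memory.
            captured_.push_back(std::make_unique<unsigned char[]>(bytes));
            return captured_.back().get();
        }
        // Eager path: one shared buffer, grown on demand and reused.
        if (shared_size_ < bytes) {
            shared_ = std::make_unique<unsigned char[]>(bytes);
            shared_size_ = bytes;
        }
        return shared_.get();
    }

    // Number of buffers handed out while capturing.
    std::size_t captured_count() const { return captured_.size(); }

private:
    std::unique_ptr<unsigned char[]> shared_;
    std::size_t shared_size_ = 0;
    std::vector<std::unique_ptr<unsigned char[]>> captured_;
};
```

Two eager calls return the same buffer, while two calls on a capturing stream return distinct buffers that are never handed out again. The cost is the extra memory held for every captured call.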
Is the current version of rocBLAS safe to use in the context of a stream which is capturing? For example:
The issue is whether any kernels in rocBLAS use scratch space. For this to be safe, rocBLAS needs to detect whether a stream is capturing and, if so, allocate fresh storage (which is never reused or deallocated). This is because a graph can be launched in the context of any stream(s).