Capturing Stream Safety #1240
Comments
Thanks for bringing up this issue. We are looking into it now and will get back to you as soon as we can with some answers. Thanks again.
Hello @FreddieWitherden, rocBLAS functions are not safe to use with HIP Graph functions. We will work towards making them Graph safe in future releases of rocBLAS.
Thank you for this. My understanding is that the way to make them graph safe is as follows: whenever a function (such as SGEMM) that wants to use temporary storage is called, the code should first call
AFAIK it requires creating a pool of memory associated with the graph. Nodes in the graph asynchronously allocate from the pool. Once an allocation succeeds, the kernels are launched asynchronously, and once the kernels have completed, the memory the node allocated from the pool is asynchronously freed. There needs to be sufficient memory in the pool to allow progress. The order of the asynchronous operations is controlled by the graph.
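To make the "sufficient memory to allow progress" point concrete, here is a minimal host-only sketch of the pool bookkeeping described above. `GraphPool`, `try_allocate`, and `release` are hypothetical names invented for illustration; they are not rocBLAS or HIP APIs, and the sketch only counts bytes rather than touching device memory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity pool tied to one graph. It tracks byte counts
// only; a real implementation would manage device memory.
class GraphPool {
public:
    explicit GraphPool(std::size_t capacity) : capacity_(capacity) {}

    // A graph node requests scratch space before launching its kernels.
    // Returns false when the pool cannot satisfy the request right now,
    // in which case the node cannot make progress until memory is released.
    bool try_allocate(std::size_t bytes) {
        if (bytes > capacity_ - in_use_) return false;
        in_use_ += bytes;
        return true;
    }

    // Called once the node's kernels have completed.
    void release(std::size_t bytes) {
        assert(bytes <= in_use_);
        in_use_ -= bytes;
    }

    std::size_t in_use() const { return in_use_; }

private:
    std::size_t capacity_;
    std::size_t in_use_ = 0;
};
```

With a capacity of 100 bytes, a node asking for 60 succeeds, a second concurrent request for 60 has to wait, and it succeeds once the first node has released its memory. This is why the pool must be sized for the largest set of nodes the graph allows to be in flight together.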
The approach outlined above is somewhat simpler and takes advantage of the fact that only one instance of a captured graph can be meaningfully run at once. Thus, when one detects that a stream is capturing, it is sufficient to simply allocate fresh temporary storage (which is only ever used for that particular kernel invocation and never reused). Although a little wasteful, this avoids any specific interaction with the graph, the need for the graph to be able to allocate and deallocate memory (which I do not think is currently possible in HIP), and any overhead associated with this. I believe this is the approach taken by cuBLAS to ensure graph safety.
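A rough host-only sketch of that policy follows. `MockStream` and `ScratchProvider` are hypothetical stand-ins, not rocBLAS internals. In real HIP code the capture flag would presumably come from `hipStreamIsCapturing` and the buffers from a device allocator such as `hipMalloc`; here both are modelled with plain host memory so the logic can be read and compiled in isolation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a HIP stream; only the capture state is modelled.
struct MockStream {
    bool capturing = false;
};

// Hypothetical workspace manager illustrating the "fresh storage while
// capturing" policy. Host memory stands in for device memory.
class ScratchProvider {
public:
    // Returns scratch space of at least `bytes` bytes for one kernel launch.
    void* acquire(const MockStream& stream, std::size_t bytes) {
        assert(bytes > 0);
        if (stream.capturing) {
            // Capturing: allocate a fresh buffer that stays alive as long as
            // the provider does and is never handed to another launch.
            retained_.push_back(std::make_unique<unsigned char[]>(bytes));
            return retained_.back().get();
        }
        // Not capturing: reuse one shared workspace, growing it on demand.
        if (bytes > shared_size_) {
            shared_ = std::make_unique<unsigned char[]>(bytes);
            shared_size_ = bytes;
        }
        return shared_.get();
    }

    // Number of buffers pinned on behalf of captured launches.
    std::size_t retained_count() const { return retained_.size(); }

private:
    std::unique_ptr<unsigned char[]> shared_;
    std::size_t shared_size_ = 0;
    std::vector<std::unique_ptr<unsigned char[]>> retained_;
};
```

Two launches on a non-capturing stream receive the same shared buffer, whereas two launches on a capturing stream each receive a distinct buffer that is never recycled. The cost is the extra memory held for every captured call, which is the wastefulness mentioned above.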
Hi @FreddieWitherden, HIP Graph support was added as a beta feature for rocBLAS Level 1, Level 2, and Level 3 (pointer mode host) functions in ROCm 5.5.0, so you should be able to use those rocBLAS functions with hipStreamBeginCapture now (see the docs for more information). Let me know if you have any follow-up questions; otherwise I can close this issue.
@FreddieWitherden Closing this issue as resolved.
Unsure if it is related to the capturing, but we still observe issues with our code when using the graph API; see: The code itself is almost function-for-function identical to what we do on CUDA (which has been in production for 2-3 years with no reported issues), whereas on HIP, when we transition to graphs, we get invalid results on even our simple test cases.
@FreddieWitherden, more information here: https://rocm.docs.amd.com/projects/rocBLAS/en/latest/reference/memory-alloc.html#stream-order-alloc
Is the current version of rocBLAS safe to use in the context of a stream which is capturing? For example:
The concern is whether any kernels in rocBLAS use scratch space. For this to be safe, rocBLAS needs to detect whether a stream is capturing and, if so, allocate fresh storage (which is never reused or deallocated). This is because a graph can be launched in the context of any stream(s).