This repository has been archived by the owner on Nov 25, 2024. It is now read-only.
We always launch thrust algorithms on a CUDA stream, which involves an extra `cudaStreamSynchronize` within each thrust call, e.g.:

wholegraph/cpp/src/wholememory_ops/functions/exchange_ids_nccl_func.cu, line 63 in 9f290c4
wholegraph/cpp/src/wholegraph_ops/unweighted_sample_without_replacement_func.cuh, line 340 in 9f290c4

It would be better to change these calls to `thrust::cuda::par_nosync`, which makes it easier to overlap them with other operations.
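
To illustrate the proposed change, here is a minimal sketch of a thrust call that uses `thrust::cuda::par_nosync.on(stream)`. It is not the WholeGraph code: `double_in_place_nosync` and `run_nosync_example` are hypothetical names made up for this example, and it assumes a Thrust version that ships `par_nosync` (1.16 or later). Error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <thrust/functional.h>
#include <thrust/system/cuda/execution_policy.h>
#include <thrust/transform.h>

#include <numeric>
#include <vector>

// Hypothetical helper (not from WholeGraph): doubles n ints in device memory.
// With par_nosync the call only enqueues work on `stream`; thrust is not
// required to synchronize the stream before returning to the host.
void double_in_place_nosync(int* d_data, int n, cudaStream_t stream)
{
  thrust::transform(thrust::cuda::par_nosync.on(stream),
                    d_data, d_data + n,  // first input range
                    d_data,              // second input range (same data)
                    d_data,              // output written in place
                    thrust::plus<int>());
}

// Hypothetical driver: fills 0..n-1 on the host, doubles it on the device,
// and returns the result. The only explicit host-side wait is the final
// cudaStreamSynchronize.
std::vector<int> run_nosync_example(int n)
{
  std::vector<int> host(n);
  std::iota(host.begin(), host.end(), 0);

  cudaStream_t stream;
  cudaStreamCreate(&stream);

  int* d_data = nullptr;
  cudaMalloc(&d_data, n * sizeof(int));

  cudaMemcpyAsync(d_data, host.data(), n * sizeof(int),
                  cudaMemcpyHostToDevice, stream);
  double_in_place_nosync(d_data, n, stream);
  cudaMemcpyAsync(host.data(), d_data, n * sizeof(int),
                  cudaMemcpyDeviceToHost, stream);

  // The caller now owns synchronization: wait once, after all work is queued.
  cudaStreamSynchronize(stream);

  cudaFree(d_data);
  cudaStreamDestroy(stream);
  return host;
}
```

The trade-off is that the caller becomes responsible for synchronizing before reading results on the host. Algorithms that return a value to the host, such as `thrust::reduce`, would still need to wait for the device, so the benefit applies mainly to calls like the transform shown here.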