Allow GPU models to train on sparse matrices that exceed the size of available GPU memory #605
Use CUDA Unified Virtual Memory (UVM) for sparse matrices on the GPU. This allows GPU models to train on input sparse matrices that exceed the size of GPU memory, by letting CUDA page data to and from host memory on demand.
This has been tested on an ALS model with around 2B entries in the sparse matrix, on a GPU with 16GB of memory. Previously this OOM'ed, since we need around 32GB of GPU memory to store the sparse matrix and its transpose, but with this change training succeeded - and was around 20x faster on the GPU than on the CPU.
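For reference, a minimal sketch of the general technique: allocating the CSR arrays of a sparse matrix with `cudaMallocManaged` instead of `cudaMalloc`, so the driver can page them between host and device. The function and struct-free layout here are illustrative only, not the actual code in this PR; the `cudaMemAdvise` hint is an optional assumption about tuning, not something the PR necessarily does.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

#define CHECK_CUDA(call)                                                   \
    do {                                                                   \
        cudaError_t err = (call);                                          \
        if (err != cudaSuccess) {                                          \
            fprintf(stderr, "CUDA error %s at %s:%d\n",                    \
                    cudaGetErrorString(err), __FILE__, __LINE__);          \
            return err;                                                    \
        }                                                                  \
    } while (0)

// Hypothetical helper: allocate the three CSR arrays in unified (managed)
// memory. The GPU can dereference these pointers directly; pages that do
// not fit in device memory are migrated from host RAM on demand, so the
// matrix can be larger than the GPU's physical memory.
cudaError_t alloc_managed_csr(int rows, long long nnz,
                              int** indptr, int** indices, float** data) {
    CHECK_CUDA(cudaMallocManaged((void**)indptr,  (rows + 1) * sizeof(int)));
    CHECK_CUDA(cudaMallocManaged((void**)indices, nnz * sizeof(int)));
    CHECK_CUDA(cudaMallocManaged((void**)data,    nnz * sizeof(float)));

    // Optional hint (assumed, not confirmed from the PR): the matrix is
    // read-only during training, so advise the driver it is read-mostly,
    // letting it keep read-only copies resident on the device.
    cudaMemAdvise(*indices, nnz * sizeof(int),   cudaMemAdviseSetReadMostly, 0);
    cudaMemAdvise(*data,    nnz * sizeof(float), cudaMemAdviseSetReadMostly, 0);
    return cudaSuccess;
}
```

With this kind of allocation, existing kernels that read the CSR arrays need no changes: the same pointers work on both host and device, and oversubscription is handled by the UVM paging machinery rather than by manual chunking.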