- implementations.py: contains the gpu_copy and unified functions
  - gpu_copy: memory copies between CPU and GPU
  - unified: uses CUDA unified memory (through the unified_{pointers, arrays} functions)
- timers.py: contains a helper function that times a given function and returns its result and timing stats
- test.py: manually tests the functions from implementations.py with the parameters below
  - size: array size (NxN) used for testing (default=10)
  - repeat: number of benchmark repetitions (default=10)
  - num_oper: number of operation loops to perform (default=10)
  - gpu_copy: use the gpu_copy function
  - unified: use the unified function
- bench.py: benchmarks the functions listed above and plots the results
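
As a rough illustration of the kind of helper timers.py is described as providing, here is a minimal sketch that times a function over several repetitions and returns both its result and some timing stats. The name `time_function`, its signature, and the keys of the returned dictionary are assumptions made for this example and are not taken from the repository.

```python
import statistics
import time


def time_function(func, *args, repeat=10, **kwargs):
    """Hypothetical helper: call func `repeat` times, return (result, stats).

    The name, signature, and stats keys are illustrative assumptions,
    not the actual contents of timers.py.
    """
    if repeat < 1:
        raise ValueError("repeat must be >= 1")

    durations = []
    result = None
    for _ in range(repeat):
        start = time.perf_counter()        # monotonic, high-resolution clock
        result = func(*args, **kwargs)     # keep the result of the last call
        durations.append(time.perf_counter() - start)

    stats = {
        "repeat": repeat,
        "min": min(durations),
        "max": max(durations),
        "mean": statistics.mean(durations),
        # The sample standard deviation needs at least two measurements.
        "stdev": statistics.stdev(durations) if repeat > 1 else 0.0,
    }
    return result, stats


if __name__ == "__main__":
    # Time a small CPU-only function as a stand-in for gpu_copy or unified.
    result, stats = time_function(sum, range(1000), repeat=5)
    print(result, stats)
```

When the timed function launches GPU work, it would need to synchronize the device before returning, because CUDA kernel launches are asynchronous and the wall-clock measurement would otherwise stop too early.
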
GPU specs used for the experiments with bench.py:
nvidia-smi