This repository contains my research on the NVIDIA CUDA architecture for the University of Bologna course Architettura Dei Calcolatori Elettronici M.
The repository is structured as follows:
- notes/
  contains the notes taken from various sources
- resources/
  contains the resources used for the research (books, papers, presentations, etc.)
  - Some papers are password protected, as they are licensed for the University of Bologna and its students only.
- code/
  contains the code examples used during the research to understand the CUDA architecture
- M. Garland, "CUDA parallel programming model," 2008 IEEE Hot Chips 20 Symposium (HCS), Stanford, CA, USA, 2008, pp. 1-29, doi: 10.1109/HOTCHIPS.2008.7476519.
  keywords: {Instruction sets;Graphics processing units;Tutorials;Parallel programming;Kernel;Parallel algorithms;Synchronization},
  URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7476519&isnumber=7476511
- E. Lindholm, J. Nickolls, S. Oberman and J. Montrym, "NVIDIA Tesla: A Unified Graphics and Computing Architecture," in IEEE Micro, vol. 28, no. 2, pp. 39-55, March-April 2008, doi: 10.1109/MM.2008.31.
  keywords: {Graphics;Computer architecture;Parallel processing;Pipelines;Concurrent computing;Load management;Multicore processing;Parallel programming;Portable computers;Workstations;Hot Chips 19;GPU;parallel processor;SIMT;SIMD;unified graphics and parallel computing architecture;graphics processing unit;cooperative thread array;Tesla},
  URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4523358&isnumber=4523348
- M. Garland et al., "Parallel Computing Experiences with CUDA," in IEEE Micro, vol. 28, no. 4, pp. 13-27, July-Aug. 2008, doi: 10.1109/MM.2008.57.
  keywords: {Parallel processing;Programming profession;Parallel programming;Concurrent computing;Computer architecture;Computer graphics;Kernel;Throughput;Central Processing Unit},
  URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4626815&isnumber=4626808
- J. Nickolls, "Scalable parallel programming with CUDA introduction," 2008 IEEE Hot Chips 20 Symposium (HCS), Stanford, CA, USA, 2008, pp. 1-9, doi: 10.1109/HOTCHIPS.2008.7476518.
  keywords: {Graphics processing units;Instruction sets;Tutorials;Parallel programming;Multithreading;Parallel processing},
  URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7476518&isnumber=7476511
- D. De Donno, A. Esposito, L. Tarricone and L. Catarinucci, "Introduction to GPU Computing and CUDA Programming: A Case Study on FDTD [EM Programmer's Notebook]," in IEEE Antennas and Propagation Magazine, vol. 52, no. 3, pp. 116-122, June 2010, doi: 10.1109/MAP.2010.5586593.
  keywords: {Graphics processing unit;Finite difference methods;Time domain analysis;Parallel processing;Parallel programming;Parallel processing;parallel programming;FDTD methods;general purpose graphics processing unit (GPU);Compute Unified Device Architecture (CUDA)},
  URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5586593&isnumber=5586558
- An Even Easier Introduction to CUDA
  URL: https://developer.nvidia.com/blog/even-easier-introduction-cuda/
- CUDA Programming Guide
  URL: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html
- A history of NVIDIA Stream Multiprocessor
  URL: https://fabiensanglard.net/cuda/
- CUDA Grid-Stride Loops
  URL: https://developer.nvidia.com/blog/cuda-pro-tip-write-flexible-kernels-grid-stride-loops/
- CUDA refresher
  URL: https://developer.nvidia.com/blog/tag/cuda-refresher/
- Understanding SPs & SMs
  URL: https://stackoverflow.com/questions/2207171/how-much-is-run-concurrently-on-a-gpu-given-its-numbers-of-sms-and-sps/2213744#2213744