Lecturer(s)
|
-
Bližňák Michal, Ing. Ph.D.
-
Janků Peter, Ing. Ph.D.
|
Course content
|
Lectures:
- Introduction to heterogeneous programming.
- Introduction to the CUDA API, kernel programming, memory model. Parallel vector addition.
- Multidimensional blocks and grids, thread synchronization. Parallel matrix multiplication.
- Optimization of global memory operations, coalesced memory access. Parallel convolution.
- Atomic operations. Parallel histogram.
- Advanced CUDA API I: events, time measurement, compute capability (CC) detection, ...
- Advanced CUDA API II: data transfer using streams, parallel tasks.
- Optimization of CUDA applications.
- Visualization of computed results. Parallel Mandelbrot set.
- Unified memory model in CUDA 6.
- Introduction to the OpenCL API.
- Introduction to the OpenACC API.

Exercises:
- Installation and configuration of the development environment for the CUDA API.
- CUDA C API: vector addition.
- CUDA C API: basic matrix multiplication (in global memory).
- CUDA C API: tiled matrix multiplication (in shared memory).
- CUDA C API: convolution.
- CUDA C API: histogram.
|
Learning activities and teaching methods
|
unspecified
|
prerequisite |
---|
Knowledge |
---|
The course prerequisites are "Programming in C language" and "Parallel Processes and Programming". |
learning outcomes |
---|
describe the essential elements of NVIDIA GPU computing architectures |
explain how computational operations are scheduled on the GPU |
enumerate the differences between performing calculations on the GPU and on the CPU |
list GPU memory types and characterize them |
define the CUDA programming model and its relation to the hardware |
Skills |
---|
create a simple application using the C programming language and the CUDA library |
use the different memory types implemented on the GPU |
use streams to overlap memory operations and kernel execution |
analyze the execution of a CUDA application by using profiling |
propose a way to parallelize an algorithm on the GPU using the CUDA library |
Recommended literature
|
-
David B. Kirk, Wen-mei W. Hwu. Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann, 2016. ISBN 978-0128119860.
-
John Cheng, Max Grossman, Ty McKercher. Professional CUDA C Programming. Wrox, 2014. ISBN 978-1118739327.
-
Kernighan, Brian W. Programovací jazyk C : [ANSI C99]. Vyd. 1. Brno : Computer Press, 2006. ISBN 80-251-0897-X.
-
Prata, Stephen. Mistrovství v C++. 3., aktualiz. vyd. Brno : Computer Press, 2007. ISBN 978-80-251-1749-1.
-
Sanders, Jason. CUDA by Example. Addison-Wesley, 2011. ISBN 978-0131387683.
|