|
Lecturer(s)
|
-
Beltran Prieto Luis Antonio, MSc.
-
Janků Peter, Ing. Ph.D.
-
Mirshahi Sina, MSc.
-
Bližňák Michal, Ing. Ph.D.
-
Vala Radek, Ing. Ph.D.
|
|
Course content
|
Lectures:
- Introduction to heterogeneous programming.
- Introduction to the CUDA API, kernel programming, memory model. Parallel vector addition.
- Multidimensional blocks and grids, thread synchronization. Parallel matrix multiplication.
- Optimization of global memory operations, coalesced memory access. Parallel convolution.
- Atomic operations. Parallel histogram.
- Advanced CUDA API I: events, time measurement, CC detection, ...
- Advanced CUDA API II: data transfer using streams, parallel tasks.
- Optimization of CUDA applications.
- Visualization of computed results. Parallel Mandelbrot set.
- Unified memory model in CUDA 6.
- Introduction to the OpenCL API.
- Introduction to the OpenACC API.

Exercises:
- Installation and configuration of the development environment for the CUDA API.
- CUDA C API: vector addition.
- CUDA C API: basic matrix multiplication (in global memory).
- CUDA C API: tiled matrix multiplication (in shared memory).
- CUDA C API: convolution.
- CUDA C API: histogram.
|
|
Learning activities and teaching methods
|
|
unspecified
|
| prerequisite |
|---|
| Knowledge |
|---|
| The course prerequisites are "Programming in C language" and "Parallel Processes and Programming". |
| learning outcomes |
|---|
| describe the essential elements of the architecture of NVIDIA compute GPUs |
| explain the principle of computational operations scheduling on the GPU |
| enumerate the differences between performing calculations on the GPU and on the CPU |
| list GPU memory types and characterize them |
| define the CUDA programming model and its connection to the hardware |
| Skills |
|---|
| create a simple application using the C programming language and the CUDA library |
| use the different memory types implemented on the GPU |
| use streams to overlap memory operations and kernel execution |
| analyze the execution of a CUDA application by using profiling |
| propose a way to parallelize an algorithm on the GPU using the CUDA library |
|
Recommended literature
|
|