Semiconductor technology has reached physical limits that prevent faster miniaturization of chips and the development of more powerful, more energy-efficient processors. Current development focuses on devices with multicore processors and general-purpose graphics processing units on one side, and on a wide variety of computer clusters and cloud services on the other.
To use these systems efficiently, it is essential to know their architecture and the appropriate software interfaces. We will learn how to write multithreaded programs, write OpenMP programs for shared-memory systems, exploit general-purpose graphics processing units with OpenCL, and use the OpenMPI library to work with distributed-memory systems.
We will learn about the pitfalls of parallel and distributed systems, understand them, and find ways to avoid them. We will become familiar with typical parallel programming patterns and learn how to use them. In the practical part of the course, we will use one of Slovenia's supercomputing systems.
At the end of the course, you will be able to select the most appropriate hardware platform and write an efficient parallel program for the problem at hand.