Syllabus
High Performance Computing - (414444)

Credit Scheme : 03 Credits
Examination Scheme : Mid_Semester : 30 Marks, End_Semester : 70 Marks

Unit I Introduction to Parallel Computing
Introduction : What is parallel computing ? Motivating Parallelism, Scope of Parallel Computing, Parallel Computing - Grand Challenges and Advantages, Dichotomy of Parallel Computing Platforms (Flynn's Classifications, Distributed Memory Architecture, Shared Memory Architecture, Hybrid Architecture), Communication Costs in Parallel Machines, Interconnection Networks and Routing Mechanisms, Impact of Process-Processor Mapping and Mapping Techniques. (Chapter - 1)

Unit II Principles of Parallel Algorithm Design
Parallel programming paradigms (Task farming, Pipelining, Divide and conquer), Preliminaries, Decomposition Techniques, Characteristics of Tasks and Interactions, Mapping Techniques for Load Balancing, Parallel Algorithm Models, Accelerator based computing (Introduction to CUDA and OpenACC). (Chapter - 2)

Unit III Basic Communication
Message passing paradigm : Synchronous and asynchronous communication calls, Blocking vs. Non-blocking, Introduction to MPI : Point-to-point communication, Collective Communication : One-to-All Broadcast and All-to-One Reduction, All-to-All Broadcast and Reduction, All-Reduce and Prefix-Sum Operations, Scatter and Gather, All-to-All Personalized Communication, Circular Shift, Shared memory programming and synchronisation. (Chapter - 3)

Unit IV Analytical Modeling of Parallel Programs
Sequential execution time, Parallel execution time and Sources of Overhead in Parallel Programs, Performance Metrics for Parallel Systems (Speedup, Efficiency, Amdahl's law, Gustafson's law), The effect of Granularity on performance, Scalability of parallel systems, Minimum execution time and minimum cost-optimal execution time, Asymptotic Analysis of parallel programs, Other scalability metrics.
(Chapter - 4)

Unit V Shared Memory Programming
CUDA Architecture, CUDA Programming (Kernels, Synchronization, Memory Contention and Device to Host Communications), OpenMP Programming. (Chapter - 5)

Unit VI Parallel Algorithms and Applications
Dense Matrix Algorithms (Cannon's Algorithm) : Matrix-Vector Multiplication, Matrix-Matrix Multiplication, Monte Carlo Simulation (Calculation of PI), Parallel Sorting Algorithms (Bubble Sort and its Variants, Parallelizing Quicksort), Parallel graph algorithms (All-Pairs Shortest Paths, Algorithms for sparse graphs), Parallel search algorithms (Depth-First Search, Best-First Search). (Chapter - 6)