Modular techniques and interfaces for data race detection in multi-paradigm parallel programming
- Modulare Techniken und Schnittstellen zur Erkennung von Speicherzugriffsanomalien in der Parallelprogrammierung mit mehreren Paradigmen
Protze, Joachim; Müller, Matthias S. (Thesis advisor); Träff, Jesper L. (Thesis advisor); Schulz, Martin (Thesis advisor)
Aachen : Joachim Protze (2021)
Book, Dissertation / PhD Thesis
Dissertation, RWTH Aachen University, 2021
The demand for ever-growing computing capabilities in scientific computing and simulation has led to heterogeneous computing systems with multiple levels of parallelism. The aggregated performance of the TOP500 high-performance computing (HPC) systems showed an annual growth rate of 85% for the years 1993 to 2013. As this rate significantly exceeds the 40% to 60% growth supported by Moore’s law, the additional growth has come from an increasing number of distributed-memory computing nodes connected by a network. The Message Passing Interface (MPI) has proved to be the dominant programming paradigm for distributed-memory computing, the most coarse-grained level of parallelism in HPC. While the performance gains from Moore’s law in the last century mainly went into single-core performance through higher clock frequencies, the number of computing cores per socket has been increasing since the beginning of this century. The cores within a socket or a node share memory. Although MPI can be, and is, used for shared-memory parallelization, explicit use of shared memory, as with OpenMP, can improve the scalability and performance of parallel applications. As a result, hybrid MPI and OpenMP programming is a common paradigm in HPC. Memory access anomalies such as data races are a severe issue in parallel programming. Data race detection has been studied for years, and various static and dynamic analysis techniques have been presented. This work does not attempt to propose fundamentally new analysis techniques; instead, it shows how the high-level abstractions of MPI and OpenMP can be mapped to the low-level abstraction of analysis tools without affecting the soundness of the analysis. It develops and presents analysis workflows to identify memory access anomalies in hybrid, multi-paradigm parallel applications.
This work collects parallel variants of memory access anomalies known from sequential programming and identifies specific patterns for distributed-memory and shared-memory programming. It identifies the high-level synchronization, concurrency, and memory access semantics that the specifications of the parallel programming paradigms define, implicitly or explicitly, and maps them to the analysis abstraction. Among these high-level concurrency concepts, several sources of concurrency within a single thread can be identified. This work compares two techniques for handling this high-level concurrency in data race analysis and finds that a combined approach works best in the general case. The evaluation shows that the analysis workflow provides high precision while enabling increased recall for concurrency within a thread.