Pattern-Based Programming Abstractions for Heterogeneous Parallel Computing.

Ernstsson, August.

Pattern-Based Programming Abstractions for Heterogeneous Parallel Computing. - 1st ed. - 1 online resource (292 pages) - Linköping Studies in Science and Technology. Dissertations Series ; v.2205 . - Linköping Studies in Science and Technology. Dissertations Series .

Intro -- Populärvetenskaplig sammanfattning -- Abstract -- Acknowledgments -- Contents -- Introduction -- Aims and research questions -- Published work behind this thesis -- Other work behind this thesis -- Structure -- Background and related work -- Motivation -- High-level parallel programming -- Skeleton programming -- Related work -- GrPPI -- Musket -- Kokkos -- SYCL -- MLIR -- StarPU -- C++ AMP, and other industry efforts -- Other related frameworks, libraries, and toolchains -- Independent surveys -- Earlier related work on SkePU -- SkePU overview -- Basic constructs -- Backend architecture -- History -- SkePU 2 design principles -- SkePU 3 design principles -- Skeleton set -- Skeleton set -- Map skeleton -- Freely accessible containers inside user functions -- Variadic type signatures -- Multi-valued return -- Index-dependent computations -- MapPairs skeleton -- MapOverlap skeleton -- Edge handling modes -- Update modes -- Reduce skeleton -- One-dimensional reductions -- Two-dimensional reductions -- Scan skeleton -- MapReduce skeleton -- MapPairsReduce skeleton -- Call skeleton -- User functions -- User functions as lambda expressions -- User types -- User constants -- Strided skeletons -- Strides Map, MapPairs, and their reduce variants -- Strides in MapOverlap -- Data representation with smart data-containers -- Smart data-containers -- Container indexing -- Container proxies -- MatRow proxy -- MatCol proxy -- Region proxy -- Memory consistency model -- External scope -- Standard library -- Deterministic random number generation -- Complex numbers -- Linear algebra -- Image filtering and visualization -- Benchmark utilities -- High-level consistent input and output -- General utilities -- Implementation -- Implementation overview -- Language embedding and type safety -- Improved type safety from SkePU 1 -- Source-to-source compiler.
Backends -- Sequential CPU backend -- Multi-core CPU backend: OpenMP -- GPU backends: OpenCL and CUDA -- C and Fortran language bindings -- Continuous integration and testing -- Dependencies -- Availability -- Hybrid CPU-GPU skeleton execution -- Introduction -- Workload partitioning and implementation -- StarPU backend implementation -- Auto-tuning -- Skeleton programming on large-scale cluster systems -- Background -- StarPU-MPI backend -- GPI backend -- GASPI and GPI -- Implementation -- Design -- Synchronization and state tracking -- Consistency model and double buffering -- Communication pattern -- Data representation -- Data transfers and caching -- Conclusions -- Extending smart data-containers for data locality awareness -- Introduction -- Large-scale data processing with MapReduce and Spark -- MapReduce -- Spark -- Lazily evaluated skeletons with tiling -- Basic approach and benefits -- Backend selection -- Loop optimization -- Evaluation points -- Further application areas -- Implementation -- Lazy tiling for stencil computations -- Applications and comparison to kernel fusion -- Polynomial evaluation using Horner's method -- Exponentiation by repeated squaring -- Heat propagation -- Related work -- High-level skeleton fusion -- Comparison to lineages -- Kernel fusion -- Types of fusions -- Example: N-body simulation -- Future work -- Multi-variant user functions -- Introduction -- Idea and implementation -- Use cases -- Vectorization example -- Generalized multi-variant components with the Call skeleton -- Other use cases -- Related work -- A deterministic portable parallel pseudo-random number generator -- Introduction -- Determinism in heterogeneous parallel computing -- Parallel pseudo-random number generation -- Previous manual parallelization of PRNG in SkePU programs -- Monte Carlo pi calculation-index-based scrambling.
Markov Chain Monte Carlo methods in LQCD-PRNG with explicit state -- Designing a deterministic PRNG for SkePU -- Global synchronization -- Stream splitting -- State forwarding -- Optimizing long or iterated skeleton chains by pre-forwarding -- API extension design -- Related work -- Towards a modernized auto-tuner -- Background -- SkePU variadic tuner design -- Implementation -- Multi-dimensional argument sequences -- Sampler -- Execution plan and persistence -- Future work -- Evaluation results -- SkePU usability evaluation -- SkePU 2 prototype survey -- SkePU 3 survey -- Initial SkePU 2 performance evaluation -- Performance evaluation of lineages -- Sequences of Maps -- Heat propagation -- Hybrid backend -- Single skeleton evaluation -- Generic application evaluation -- Comparison to dynamic hybrid scheduling using StarPU -- Evaluation of multi-variant user functions -- Vectorization -- Median filtering -- Application benchmarks of SkePU 3 -- Libsolve ODE solver -- N-body -- Blackscholes and Streamcluster -- Brain simulation -- CO2 capture -- Supercapacitor simulation -- Conjugate gradient -- Experimental evaluation of deterministic PRNG -- Monte-Carlo Pi approximation -- LQCD Mini-Application -- Miller-Rabin primality testing -- Natural noise generation -- Programmability evaluation -- SkePU-GPI cluster backend -- Microbenchmarks of SkePU 3 -- OpenMP scheduling modes -- SkePU memory consistency model -- Variadic tuner prototype -- High-level skeleton fusion -- Limitations and future work -- Limitations -- Applicability of data-parallel patterns -- Dynamic data structures -- Limitations of language embedding -- Future work -- Further backend targets: reconfigurable accelerators -- Extending the parallel pattern set: stream parallelization -- Testing, debugging, and visualization -- Higher-level language interface -- Conclusions -- Bibliography. 
Additions and changes from the licentiate thesis -- New contributions -- Other changes -- Definitions -- Abbreviations -- Domain-specific terminology -- SkePU-specific terminology -- SkePU-BLAS API -- Application source code samples -- N-body simulation -- Game of life -- Conjugate gradient -- CO2 capture -- Dr-sammanst.

9789179291952


Electronic books.