000 | 07720nam a22004333i 4500 | ||
---|---|---|---|
001 | EBC6887234 | ||
003 | MiAaPQ | ||
005 | 20240122001522.0 | ||
006 | m o d | | ||
007 | cr cnu|||||||| | ||
008 | 231124s2021 xx o ||||0 eng d | ||
020 | _a9789179291952 _q(electronic bk.) | ||
020 | _z9789179291952 | ||
035 | _a(MiAaPQ)EBC6887234 | ||
035 | _a(Au-PeEL)EBL6887234 | ||
035 | _a(OCoLC)1298391047 | ||
040 | _aMiAaPQ _beng _erda _epn _cMiAaPQ _dMiAaPQ | ||
100 | 1 | _aErnstsson, August. | |
245 | 1 | 0 | _aPattern-Based Programming Abstractions for Heterogeneous Parallel Computing. |
250 | _a1st ed. | ||
264 | 1 | _aLinköping : _bLinkopings Universitet, _c2021. | |
264 | 4 | _c©2022. | |
300 | _a1 online resource (292 pages) | ||
336 | _atext _btxt _2rdacontent | ||
337 | _acomputer _bc _2rdamedia | ||
338 | _aonline resource _bcr _2rdacarrier | ||
490 | 1 | _aLinköping Studies in Science and Technology. Dissertations Series ; _vv.2205 | |
505 | 0 | _aIntro -- Populärvetenskaplig sammanfattning -- Abstract -- Acknowledgments -- Contents -- Introduction -- Aims and research questions -- Published work behind this thesis -- Other work behind this thesis -- Structure -- Background and related work -- Motivation -- High-level parallel programming -- Skeleton programming -- Related work -- GrPPI -- Musket -- Kokkos -- SYCL -- MLIR -- StarPU -- C++ AMP, and other industry efforts -- Other related frameworks, libraries, and toolchains -- Independent surveys -- Earlier related work on SkePU -- SkePU overview -- Basic constructs -- Backend architecture -- History -- SkePU 2 design principles -- SkePU 3 design principles -- Skeleton set -- Skeleton set -- Map skeleton -- Freely accessible containers inside user functions -- Variadic type signatures -- Multi-valued return -- Index-dependent computations -- MapPairs skeleton -- MapOverlap skeleton -- Edge handling modes -- Update modes -- Reduce skeleton -- One-dimensional reductions -- Two-dimensional reductions -- Scan skeleton -- MapReduce skeleton -- MapPairsReduce skeleton -- Call skeleton -- User functions -- User functions as lambda expressions -- User types -- User constants -- Strided skeletons -- Strides Map, MapPairs, and their reduce variants -- Strides in MapOverlap -- Data representation with smart data-containers -- Smart data-containers -- Container indexing -- Container proxies -- MatRow proxy -- MatCol proxy -- Region proxy -- Memory consistency model -- External scope -- Standard library -- Deterministic random number generation -- Complex numbers -- Linear algebra -- Image filtering and visualization -- Benchmark utilities -- High-level consistent input and output -- General utilities -- Implementation -- Implementation overview -- Language embedding and type safety -- Improved type safety from SkePU 1 -- Source-to-source compiler. | |
505 | 8 | _aBackends -- Sequential CPU backend -- Multi-core CPU backend: OpenMP -- GPU backends: OpenCL and CUDA -- C and Fortran language bindings -- Continuous integration and testing -- Dependencies -- Availability -- Hybrid CPU-GPU skeleton execution -- Introduction -- Workload partitioning and implementation -- StarPU backend implementation -- Auto-tuning -- Skeleton programming on large-scale cluster systems -- Background -- StarPU-MPI backend -- GPI backend -- GASPI and GPI -- Implementation -- Design -- Synchronization and state tracking -- Consistency model and double buffering -- Communication pattern -- Data representation -- Data transfers and caching -- Conclusions -- Extending smart data-containers for data locality awareness -- Introduction -- Large-scale data processing with MapReduce and Spark -- MapReduce -- Spark -- Lazily evaluated skeletons with tiling -- Basic approach and benefits -- Backend selection -- Loop optimization -- Evaluation points -- Further application areas -- Implementation -- Lazy tiling for stencil computations -- Applications and comparison to kernel fusion -- Polynomial evaluation using Horner's method -- Exponentiation by repeated squaring -- Heat propagation -- Related work -- High-level skeleton fusion -- Comparison to lineages -- Kernel fusion -- Types of fusions -- Example: N-body simulation -- Future work -- Multi-variant user functions -- Introduction -- Idea and implementation -- Use cases -- Vectorization example -- Generalized multi-variant components with the Call skeleton -- Other use cases -- Related work -- A deterministic portable parallel pseudo-random number generator -- Introduction -- Determinism in heterogeneous parallel computing -- Parallel pseudo-random number generation -- Previous manual parallelization of PRNG in SkePU programs -- Monte Carlo pi calculation-index-based scrambling. | |
505 | 8 | _aMarkov Chain Monte Carlo methods in LQCD-PRNG with explicit state -- Designing a deterministic PRNG for SkePU -- Global synchronization -- Stream splitting -- State forwarding -- Optimizing long or iterated skeleton chains by pre-forwarding -- API extension design -- Related work -- Towards a modernized auto-tuner -- Background -- SkePU variadic tuner design -- Implementation -- Multi-dimensional argument sequences -- Sampler -- Execution plan and persistence -- Future work -- Evaluation results -- SkePU usability evaluation -- SkePU 2 prototype survey -- SkePU 3 survey -- Initial SkePU 2 performance evaluation -- Performance evaluation of lineages -- Sequences of Maps -- Heat propagation -- Hybrid backend -- Single skeleton evaluation -- Generic application evaluation -- Comparison to dynamic hybrid scheduling using StarPU -- Evaluation of multi-variant user functions -- Vectorization -- Median filtering -- Application benchmarks of SkePU 3 -- Libsolve ODE solver -- N-body -- Blackscholes and Streamcluster -- Brain simulation -- CO2 capture -- Supercapacitor simulation -- Conjugate gradient -- Experimental evaluation of deterministic PRNG -- Monte-Carlo Pi approximation -- LQCD Mini-Application -- Miller-Rabin primality testing -- Natural noise generation -- Programmability evaluation -- SkePU-GPI cluster backend -- Microbenchmarks of SkePU 3 -- OpenMP scheduling modes -- SkePU memory consistency model -- Variadic tuner prototype -- High-level skeleton fusion -- Limitations and future work -- Limitations -- Applicability of data-parallel patterns -- Dynamic data structures -- Limitations of language embedding -- Future work -- Further backend targets: reconfigurable accelerators -- Extending the parallel pattern set: stream parallelization -- Testing, debugging, and visualization -- Higher-level language interface -- Conclusions -- Bibliography. | |
505 | 8 | _aAdditions and changes from the licentiate thesis -- New contributions -- Other changes -- Definitions -- Abbreviations -- Domain-specific terminology -- SkePU-specific terminology -- SkePU-BLAS API -- Application source code samples -- N-body simulation -- Game of life -- Conjugate gradient -- CO2 capture -- Dr-sammanst. | |
588 | _aDescription based on publisher supplied metadata and other sources. | ||
590 | _aElectronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2023. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries. | ||
655 | 4 | _aElectronic books. | |
776 | 0 | 8 | _iPrint version: _aErnstsson, August _tPattern-Based Programming Abstractions for Heterogeneous Parallel Computing _dLinköping : Linkopings Universitet,c2021 _z9789179291952 |
797 | 2 | _aProQuest (Firm) | |
830 | 0 | _aLinköping Studies in Science and Technology. Dissertations Series | |
856 | 4 | 0 | _uhttps://ebookcentral.proquest.com/lib/bacm-ebooks/detail.action?docID=6887234 _zClick to View |
999 | _c309110 _d309110 |