High performance parallelism pearls : multicore and many-core programming approaches : Vol. 2 / [electronic resource]
by Reinders, James; Jeffers, Jim.
Material type: Book
Publisher: Waltham, MA : Morgan Kaufmann, an imprint of Elsevier, 2015
Description: 1 online resource (574 pages).
ISBN: 9780128038901; 012803890X.
Subject(s): Parallel programming (Computer science) -- Data processing | Coprocessors | Computer programming | COMPUTERS -- Computer Literacy | COMPUTERS -- Computer Science | COMPUTERS -- Data Processing | COMPUTERS -- Hardware -- General | COMPUTERS -- Information Technology | COMPUTERS -- Machine Theory | COMPUTERS -- Reference | Electronic book | Electronic books
Online resources: ScienceDirect
Print version record.
Front Cover; High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches; Copyright; Contents; Contributors; Acknowledgments; Foreword; Making a bet on many-core; 2013 Stampede-Intel Many-Core System -- A First; HPC journey and revelation; Stampede users discover: It's parallel programming; This book is timely and important; Preface; Inspired by 61 cores: A new era in programming; Chapter 1: Introduction; Applications and techniques; SIMD and vectorization; OpenMP and nested parallelism; Latency optimizations; Python; Streams; Ray tracing; Tuning prefetching.
MPI shared memory; Using every last core; OpenCL vs. OpenMP; Power analysis for nodes and clusters; The future of many-core; Downloads; For more information; Chapter 2: Numerical Weather Prediction Optimization; Numerical weather prediction: Background and motivation; WSM6 in the NIM; Shared-memory parallelism and controlling horizontal vector length; Array alignment; Loop restructuring; Compile-time constants for loop and array bounds; Performance improvements; Summary; For more information; Chapter 3: WRF Goddard Microphysics Scheme Optimization; The motivation and background.
WRF Goddard microphysics scheme; Goddard microphysics scheme; Benchmark setup; Code optimization; Removal of the vertical dimension from temporary variables for a reduced memory footprint; Collapse i- and j-loops into smaller cells for smaller footprint per thread; Addition of vector alignment directives; Summary of the code optimizations; Analysis using an instruction Mix report; VTune performance metrics; Performance effects of the optimization of Goddard microphysics scheme on the WRF; Summary; Acknowledgments; For more information; Chapter 4: Pairwise DNA Sequence Alignment Optimization.
Pairwise sequence alignment; Parallelization on a single coprocessor; Multi-threading using OpenMP; Vectorization using SIMD intrinsics; Parallelization across multiple coprocessors using MPI; Performance results; Summary; For more information; Chapter 5: Accelerated Structural Bioinformatics for Drug Discovery; Parallelism enables proteome-scale structural bioinformatics; Overview of eFindSite; Benchmarking dataset; Code profiling; Porting eFindSite for coprocessor offload; Parallel version for a multicore processor; Task-level scheduling for processor and coprocessor; Case study; Summary.
For more information; Chapter 6: Amber PME Molecular Dynamics Optimization; Theory of MD; Acceleration of neighbor list building using the coprocessor; Acceleration of direct space sum using the coprocessor; Additional optimizations in coprocessor code; Removing locks whenever possible; Exclusion list optimization; Reduce data transfer and computation in offload code; Modification of load balance algorithm; PME direct space sum and neighbor list work; PME reciprocal space sum work; Bonded force work; Compiler optimization flags; Results; Conclusions; For more information.
Chapter 7: Low-Latency Solutions for Financial Services Applications.
High Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming, illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of t.