000 06190cam a2200673Ki 4500
001 ocn915311550
003 OCoLC
005 20190328114812.0
006 m o d
007 cr cnu---unuuu
008 150801s2015 mau o 000 0 eng d
040 _aEBLCP
_beng
_epn
_cEBLCP
_dN$T
_dOPELS
_dYDXCP
_dDEBSZ
_dOCLCQ
_dMERUC
_dOCLCQ
_dVT2
_dWRM
_dU3W
_dD6H
_dOCLCF
_dAU@
_dCOO
_dOCLCQ
_dWYU
_dOCLCQ
019 _a916948885
_a938482533
_a1066622354
020 _a9780128038901
_q(electronic bk.)
020 _a012803890X
_q(electronic bk.)
020 _z9780128038192
035 _a(OCoLC)915311550
_z(OCoLC)916948885
_z(OCoLC)938482533
_z(OCoLC)1066622354
050 4 _aQA76.642
_b.R456 2015
072 7 _aCOM
_x013000
_2bisacsh
072 7 _aCOM
_x014000
_2bisacsh
072 7 _aCOM
_x018000
_2bisacsh
072 7 _aCOM
_x067000
_2bisacsh
072 7 _aCOM
_x032000
_2bisacsh
072 7 _aCOM
_x037000
_2bisacsh
072 7 _aCOM
_x052000
_2bisacsh
082 0 4 _a004.1
100 1 _aReinders, James.
245 1 0 _aHigh performance parallelism pearls
_h[electronic resource] :
_bmulticore and many-core programming approaches. Vol. 2 /
_cJames Reinders, Jim Jeffers.
260 _aWaltham, MA :
_bMorgan Kaufmann, an imprint of Elsevier,
_c2015.
300 _a1 online resource (574 pages)
336 _atext
_btxt
_2rdacontent
337 _acomputer
_bc
_2rdamedia
338 _aonline resource
_bcr
_2rdacarrier
588 0 _aPrint version record.
505 0 _aFront Cover; High Performance Parallelism Pearls: Multicore and Many-core Programming Approaches; Copyright; Contents; Contributors; Acknowledgments; Foreword; Making a bet on many-core; 2013 Stampede-Intel Many-Core System -- A First; HPC journey and revelation; Stampede users discover: It's parallel programming; This book is timely and important; Preface; Inspired by 61 cores: A new era in programming; Chapter 1: Introduction; Applications and techniques; SIMD and vectorization; OpenMP and nested parallelism; Latency optimizations; Python; Streams; Ray tracing; Tuning prefetching.
505 8 _aMPI shared memory; Using every last core; OpenCL vs. OpenMP; Power analysis for nodes and clusters; The future of many-core; Downloads; For more information; Chapter 2: Numerical Weather Prediction Optimization; Numerical weather prediction: Background and motivation; WSM6 in the NIM; Shared-memory parallelism and controlling horizontal vector length; Array alignment; Loop restructuring; Compile-time constants for loop and array bounds; Performance improvements; Summary; For more information; Chapter 3: WRF Goddard Microphysics Scheme Optimization; The motivation and background.
505 8 _aWRF Goddard microphysics scheme; Goddard microphysics scheme; Benchmark setup; Code optimization; Removal of the vertical dimension from temporary variables for a reduced memory footprint; Collapse i- and j-loops into smaller cells for smaller footprint per thread; Addition of vector alignment directives; Summary of the code optimizations; Analysis using an instruction mix report; VTune performance metrics; Performance effects of the optimization of Goddard microphysics scheme on the WRF; Summary; Acknowledgments; For more information; Chapter 4: Pairwise DNA Sequence Alignment Optimization.
505 8 _aPairwise sequence alignment; Parallelization on a single coprocessor; Multi-threading using OpenMP; Vectorization using SIMD intrinsics; Parallelization across multiple coprocessors using MPI; Performance results; Summary; For more information; Chapter 5: Accelerated Structural Bioinformatics for Drug Discovery; Parallelism enables proteome-scale structural bioinformatics; Overview of eFindSite; Benchmarking dataset; Code profiling; Porting eFindSite for coprocessor offload; Parallel version for a multicore processor; Task-level scheduling for processor and coprocessor; Case study; Summary.
505 8 _aFor more information; Chapter 6: Amber PME Molecular Dynamics Optimization; Theory of MD; Acceleration of neighbor list building using the coprocessor; Acceleration of direct space sum using the coprocessor; Additional optimizations in coprocessor code; Removing locks whenever possible; Exclusion list optimization; Reduce data transfer and computation in offload code; Modification of load balance algorithm; PME direct space sum and neighbor list work; PME reciprocal space sum work; Bonded force work; Compiler optimization flags; Results; Conclusions; For more information.
505 8 _aChapter 7: Low-Latency Solutions for Financial Services Applications.
520 _aHigh Performance Parallelism Pearls Volume 2 offers another set of examples that demonstrate how to leverage parallelism. Similar to Volume 1, the techniques included here explain how to use processors and coprocessors with the same programming, illustrating the most effective ways to combine Xeon Phi coprocessors with Xeon and other multicore processors. The book includes examples of successful programming efforts, drawn from across industries and domains such as biomed, genetics, finance, manufacturing, imaging, and more. Each chapter in this edited work includes detailed explanations of the programming techniques used.
650 0 _aParallel programming (Computer science)
_xData processing.
650 0 _aCoprocessors.
650 0 _aComputer programming.
650 7 _aCOMPUTERS
_xComputer Literacy.
_2bisacsh
650 7 _aCOMPUTERS
_xComputer Science.
_2bisacsh
650 7 _aCOMPUTERS
_xData Processing.
_2bisacsh
650 7 _aCOMPUTERS
_xHardware
_xGeneral.
_2bisacsh
650 7 _aCOMPUTERS
_xInformation Technology.
_2bisacsh
650 7 _aCOMPUTERS
_xMachine Theory.
_2bisacsh
650 7 _aCOMPUTERS
_xReference.
_2bisacsh
650 7 _aComputer programming.
_2fast
_0(OCoLC)fst00872390
650 7 _aCoprocessors.
_2fast
_0(OCoLC)fst01740892
655 0 _aElectronic books.
655 4 _aElectronic books.
700 1 _aJeffers, Jim.
776 0 8 _iPrint version:
_aJeffers, Jim.
_tHigh Performance Parallelism Pearls Volume Two : Multicore and Many-core Programming Approaches.
_dBurlington : Elsevier Science, ©2015
_z9780128038192
856 4 0 _3ScienceDirect
_uhttp://www.sciencedirect.com/science/book/9780128038192
999 _c247123
_d247123