
Accelerator Programming Using Directives
Description
The 6 full papers presented were carefully reviewed and selected from 12 submissions. The papers share knowledge and experience on programming emerging complex parallel computing systems. They are organized in three sections: applications; using OpenMP; and program evaluation.
Content
- Intro
- 2018: 5th Workshop on Accelerator Programming Using Directives (WACCPD) http://waccpd.org/
- Organization
- Contents
- Applications
- Heterogeneous Programming and Optimization of Gyrokinetic Toroidal Code Using Directives
- Abstract
- 1 Introduction
- 2 Simulation Platforms: Titan, SummitDev, and Summit
- 3 Scientific Methods of GTC
- 4 Porting and Optimization Strategy
- 5 GPU Porting Status
- 6 Performance
- 6.1 Solver Performance Improvement
- 6.2 Scaling Performance
- 6.3 Tests on SummitDev
- 6.4 Performance and Scalability on Summit
- 7 Conclusion
- Acknowledgments
- References
- Using Compiler Directives for Performance Portability in Scientific Computing: Kernels from Molecular Simulation
- 1 Introduction
- 2 Background
- 2.1 Performance Portability
- 2.2 Molecular Dynamics
- 3 Portability Goals: Timings and Architectures
- 4 Designing the Kernels
- 4.1 The Programming Model and Its Portable Subset
- 4.2 Modular Format and Kernels
- 5 Binning Module (Neighbor-List Updates): Bin-Assign, Bin-Count, and Bin Sorting
- 5.1 Bin-Assign, Bin-Count
- 5.2 Parallel Algorithm Design for Bin Count and Gather
- 6 The Squared Pairwise Distance Calculation: Performance, Portability, and Effort
- 6.1 Use of OpenACC for the Squared Distance Calculation: GPU
- 6.2 Comparison to CUDA Kernel
- 6.3 OpenACC on the CPU
- 6.4 Comparison to a Purely BLAS-Based Algorithm: Lowest Programming Knowledge Required
- 7 Programming Effort
- 8 Conclusions
- A Artifact Description Appendix: Using Compiler Directives for Performance Portability in Scientific Computing: Kernels from Molecular Simulation
- A.1 Abstract
- A.2 Description
- References
- Using OpenMP
- OpenMP Code Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and Selecting Better Grid Geometries
- 1 Introduction
- 2 Background on Warp Specialization and Elision
- 3 Fission of Multiple-Parallel-Region Target Regions
- 4 Overlapping Data Transfer and Split Kernel Execution
- 5 Pipelining Data Transfer and Parallel Loop Execution
- 6 Custom Grid Geometry
- 7 Estimating Potential Benefits of Transformations
- 7.1 Combining Kernel Splitting with Elision Improves Performance
- 7.2 Elision Amplifies Benefits of Custom Grid Geometry
- 7.3 Pipelining Improves Performance for High Trip Counts
- 8 Related Work
- 9 Conclusion
- A Artifact Description Appendix: OpenMP Target Offloading: Splitting GPU Kernels, Pipelining Communication and Computation, and Selecting Better Grid Geometries
- A.1 Abstract
- A.2 Description
- A.3 Installation
- A.4 Experiment Workflow
- A.5 Evaluation and Expected Results
- A.6 Experiment Customization
- A.7 Notes
- References
- A Case Study for Performance Portability Using OpenMP 4.5
- 1 Introduction
- 2 The GPP Kernel and Its Baseline CPU Implementation
- 2.1 GPP Kernel
- 2.2 Baseline CPU Implementation
- 3 GPU Implementations of the GPP Kernel
- 3.1 Implementation Groundwork
- 3.2 OpenMP 4.5
- 3.3 OpenACC
- 3.4 CUDA
- 3.5 Performance Comparison Among GPU Implementations
- 4 Porting GPU Implementations Back to CPU
- 4.1 OpenACC
- 4.2 OpenMP 4.5
- 5 Related Work
- 6 Summary and Future Work
- A Reproducibility
- References
- Program Evaluation
- OpenACC Routine Directive Propagation Using Interprocedural Analysis
- 1 Introduction
- 2 OpenACC Routine Directive
- 3 Implementation of Automatic Routine Directive Propagation
- 3.1 PGI Interprocedural Analysis
- 3.2 Initial Compile Summary Information and Error Suppression
- 3.3 Propagating acc routine Information with IPA
- 3.4 Recompile Step
- 4 Examples
- 4.1 Propagating acc routine seq as Default
- 4.2 Propagating routine Type Across Files
- 4.3 Detecting routine Level of Parallelism Mismatch Across Files
- 4.4 Detecting Unannotated Global Variable Usage
- 5 Summary
- References
- OpenACC Based GPU Parallelization of Plane Sweep Algorithm for Geometric Intersection
- 1 Introduction
- 2 Background and Related Work
- 2.1 Segment Intersection Problem
- 2.2 Naive Brute Force Approach
- 2.3 Plane Sweep Algorithm
- 2.4 Existing Work on Parallelizing Segment Intersection Algorithms
- 2.5 OpenMP and OpenACC
- 3 Parallel Plane Sweep Algorithm
- 3.1 Algorithm Correctness
- 3.2 Algorithmic Analysis
- 4 Directive-Based Implementation Details
- 5 Experimental Results
- 5.1 Experimental Setup
- 5.2 Performance of Brute Force Parallel Algorithm
- 5.3 Performance of Parallel Plane Sweep Algorithm
- 5.4 Speedup and Efficiency Comparisons
- 6 Conclusion and Future Work
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino, and many more (Kindle: limited support only).
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, there is a personalised watermark embedded in the eBook that can be used to identify the purchaser of the eBook in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.