
OpenMP in a Heterogeneous World

Content
- Title
- Preface
- Organization
- Table of Contents
- Eighth International Workshop on OpenMP IWOMP 2012
- Proposed Extensions to OpenMP
- Specification and Performance Evaluation of Parallel I/O Interfaces for OpenMP
- Introduction
- Related Work
- Interface Specification
- Introduction to the Annotation Used
- Interface Specification
- Implementation in the OpenUH Compiler
- Performance Evaluation
- Resources
- Results
- BT I/O
- Conclusions
- References
- The Design of OpenMP Thread Affinity
- Introduction
- Related Work
- Current Implementation-specific Approaches
- Machine Model
- Places
- Place List
- Model Specification
- Strengths and Limitations of the Machine Model
- Affinity
- Threads per Place
- Affinity Policies
- Runtime Library Routines
- Use Scenarios and Implementation
- Reference Implementation on IBM POWER
- Future Enhancements
- Summary
- References
- Auto-scoping for OpenMP Tasks
- Introduction
- Motivation and Related Work
- Proposal
- Implementation
- Evaluation
- Conclusions and Future Work
- References
- A Case for Including Transactions in OpenMP II: Hardware Transactional Memory
- Introduction
- Current OpenMP Synchronization Mechanisms
- Prior Approaches to Concurrency Control
- Mutual Exclusion
- Non-blocking Atomic Primitives and OpenMP Atomics
- Lock Elision
- Integrated TM Language Support
- Proposed OpenMP Extension
- Blue Gene/Q TM Implementation
- Compiler Support
- TM Runtime
- Experimental Results Using BUSTM
- Geometries Used
- Experiments in Deterministic Mode
- Experiments in Probabilistic Mode
- Conclusions, Current and Future Work
- References
- Extending OpenMP* with Vector Constructs for Modern Multicore SIMD Architectures
- Introduction
- Related Work
- Motivation
- SIMD Extensions to OpenMP
- Vectorized Worksharing Construct
- Vectorization and Parallelization
- Additional Vectorization Clauses
- Vectorizing Functions
- Implementation
- Evaluation
- Methodology
- Benchmarks
- Results
- Conclusions and Future Work
- References
- Introducing Task Cancellation to OpenMP
- Introduction
- Related Work
- Task Cancellation in OpenMP
- The OpenMP Tasking Model
- The Motivation behind Adding Task Cancellation to OpenMP
- Adding Cancellable Tasks to OpenMP
- Creating Cancellable Tasks
- Dealing with Nested Tasks
- Cancelling Tasks
- Protecting Tasks from Cancellation
- When Does Cancellation Take Place?
- Evaluation of OpenMP Task Cancellation
- Conclusion and Future Work
- References
- Runtime Environments
- Automatic OpenMP Loop Scheduling: A Combined Compiler and Runtime Approach
- Introduction
- Motivation
- Architecture
- Compiler Analysis
- Compiler Backend
- Runtime Monitoring
- Loop Scheduling Algorithm
- Evaluation
- Kernel Experiments
- Real-World Applicability
- Related Work
- Conclusion
- References
- LIBKOMP, an Efficient OpenMP Runtime System for Both Fork-Join and Data Flow Paradigms
- Introduction
- The libKOMP Runtime System
- The libKOMP Execution Model
- Parallel Regions in libKOMP
- Data Access Modes for Dependent Tasks
- Stack-Based Execution
- Work Stealing and Data Flow Dependencies
- Discussion
- Parallel Loops in libKOMP
- Performance Evaluation
- Task Management Overhead
- Parallel Loops
- Barcelona OpenMP Tasks Suite (BOTS)
- Data Flow Tasks versus Fork-Join Tasks
- Related Work
- Conclusion
- References
- A Compiler-Assisted Runtime-Prefetching Scheme for Heterogeneous Platforms
- Introduction
- Language Extensions
- Compiling Supports
- Runtime Use-Def Analysis and Prefetch Scheduling
- Adaptive Scheduling
- Implementation and Evaluation
- Related Work
- Conclusion and Future Work
- References
- Optimization and Accelerators
- Experiments with WRF on Intel Many Integrated Core (Intel MIC) Architecture
- Introduction
- Intel MIC Architecture
- Hardware Architecture
- System Software Architecture
- Software Stack Implementation
- OpenMP
- MPI
- Compiler Offload
- WRF Benchmark
- WRF Offload
- MPI Implementation
- Timing Model
- Timing Measurement and Results
- Conclusions and Future Work
- References
- Optimizing the Advanced Accelerator Simulation Framework Synergia Using OpenMP
- Introduction
- Benchmark Application
- Platforms
- Improving the Performance Using OpenMP
- Parallelizing the Loops
- Using OpenMP for FFTW
- Parallelizing Deposit
- Performance Discussion
- Related Work
- Summary and Conclusions
- References
- Using Compiler Directives for Accelerating CFD Applications on GPUs
- Introduction
- GPU Programming
- Benchmark Implementations
- Baseline Code
- Implementations Using ACC Directives
- CUDA Implementations
- Performance Study
- Matrix Transposition
- Matrix Multiplication
- SP Benchmark
- Conclusions
- References
- Effects of Compiler Optimizations in OpenMP to CUDA Translation
- Introduction
- Overview of OpenMPC System
- Optimization Options
- Improving the OpenMPC Tuning System
- Modified IE (MIE) Algorithm for OpenMPC
- Iterative Elimination
- Grouping of Different Optimization Options
- MIE Running Strategy
- Performance Analysis
- Setup
- Performance Comparison between Pruned Exhaustive and Modified IE Algorithms
- Impact of Individual Optimization Options
- Conclusion and Future Work
- References
- Task Parallelism
- Assessing OpenMP Tasking Implementations on NUMA Architectures
- Introduction
- Related Work
- Monitoring Task Execution
- Load Balancing vs. Data Locality
- Task Overhead
- Task Behavior on NUMA Architectures
- STREAM
- SMXV in a CG Kernel
- Application Case Studies
- Summary
- References
- Performance Analysis Techniques for Task-Based OpenMP Applications
- Introduction
- Related Work
- Performance Problems Related to Tasking
- The OTF2 Task Event Model
- Task Interruption
- Evaluation
- Conclusion
- References
- Task-Based Execution of Nested OpenMP Loops
- Introduction
- Proof of Concept: Re-writing Loop Code Manually
- Overcoming Limitations by Automatic Transformation
- The OMPi Compiler
- Automating the Process
- Ordered
- Evaluation
- Synthetic Benchmark
- Face Detection
- Conclusion
- References
- Validation and Benchmarks
- SPEC OMP2012 - An Application Benchmark Suite for Parallel Systems Using OpenMP
- Introduction
- Design and Principles of SPEC OMP2012
- General Design
- Run Rules
- Description of the Benchmark
- Energy Efficiency
- First Scalability Results
- Related Work
- Summary and Conclusion
- References
- An OpenMP 3.1 Validation Testsuite
- Introduction
- The Design of an OpenMP Validation Suite
- Implementation
- Directives and Clauses
- Support for OpenMP 3.1
- Evaluation
- Related Work
- Conclusion
- References
- Poster Papers
- Performance Analysis of an Hybrid MPI/OpenMP ALM Software for Life Insurance Policies on Multi-core Architectures
- Introduction
- The ALM Software for Life Insurance Policies
- Performance Results
- References
- Adaptive OpenMP for Large NUMA Nodes
- Introduction
- Adaptive OpenMP Runtime
- Experimental Results
- Conclusion and Future Work
- References
- A Generalized Directive-Based Approach for Accelerating PDE Solvers
- Cube-Flu
- Parallelization and Directive-Based GPU Porting
- Performance Results
- References
- Design of a Shared-Memory Model for CAPE
- Introduction
- Shared-Memory Models on Distributed Systems
- OpenMP flush Directive and Memory-Consistency Mechanism
- Updated Home-Based Lazy Release Consistency Model
- The Global Flush Using the UHLRC Model
- The Selective Flush Directive Using the UHLRC Model
- Conclusion and Future Works
- References
- Overlapping Computations with Communications and I/O Explicitly Using OpenMP Based Heterogeneous Threading Models
- Introduction and Background
- Heterogeneous OpenMP Model: BigDFT Implementation
- Experiments and Results
- Discussion and Proposal for OpenMP Extensions
- References
- A Microbenchmark Suite for OpenMP Tasks
- Introduction
- Benchmark Design and Implementation
- Benchmark Results
- Hardware
- Results
- Conclusions
- References
- Support for Thread-Level Speculation into OpenMP
- Introduction
- Our Proposal
- Conclusions
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. There are no technical restrictions that prevent illegal distribution. However, a personalised watermark embedded in the eBook can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.