
Parallel Computing Technologies

Content
- Intro
- Preface
- Organization
- Contents
- Parallel Programming Methods and Tools
- Trace-Based Optimization of Fragmented Programs Execution in LuNA System
- 1 Introduction
- 2 Trace Playback in LuNA System
- 2.1 LuNA System
- 2.2 Trace Playback in LuNA System
- 3 Discussion and Related Works
- 3.1 Analysis and Discussion
- 3.2 Related Works
- 4 Experiments
- 5 Conclusion
- References
- A New Model-Based Approach to Performance Comparison of MPI Collective Algorithms
- 1 Introduction
- 2 Related Work
- 2.1 Analytical Performance Models of MPI Collective Algorithms
- 2.2 Measurement of Model Parameters
- 2.3 Selection of Collective Algorithms Using Machine Learning Algorithms
- 3 Implementation-Derived Analytical Models of Collective Algorithms
- 3.1 Binomial Tree Broadcast Algorithm
- 4 Estimation of Model Parameters
- 4.1 Estimation of γ(P)
- 4.2 Estimation of Algorithm Specific α and β
- 5 Experimental Results and Analysis
- 5.1 Experiment Setup
- 5.2 Experimental Estimation of Model Parameters
- 5.3 Accuracy of Selection of Optimal Collective Algorithms Using the Constructed Analytical Performance Models
- 6 Conclusions
- References
- Deterministic OpenMP and the LBP Parallelizing Manycore Processor
- 1 Introduction
- 2 Deterministic OpenMP
- 3 The PISC ISA
- 4 The LBP Parallelizing Processor
- 4.1 The Cores
- 4.2 The Pipeline
- 4.3 The Memory
- 5 A Matrix Multiplication Program Example Experiment
- 6 Conclusion and Perspectives
- References
- Additional Parallelization of Existing MPI Programs Using SAPFOR
- 1 Introduction
- 2 MPI-Aware Parallelization in SAPFOR
- 3 Results
- 4 Related Works
- 5 Conclusion
- References
- Sparse System Solution Methods for Complex Problems
- 1 Introduction
- 2 S3M Package Structure
- 3 Mathematical Aspects
- 3.1 AMG Method
- 3.2 Smoothers
- 3.3 Constrained Pressure Residual Algorithm
- 3.4 Two-Stage Method
- 3.5 Two-Stage Symmetric Gauss-Seidel Algorithm
- 4 S3M Properties
- 5 Numerical Experiments
- 5.1 Symmetric Matrices
- 5.2 Schur Complement Method for Mimetic Finite Difference Scheme
- 5.3 CPR for Two-Phase Oil Recovery Problem
- References
- Resource-Independent Description of Information Graphs with Associative Operations in Set@l Programming Language
- 1 Introduction
- 2 Topological Modification of Graphs with Associative Operations
- 3 Synthesis of Computing Structure
- 4 Description of Basic Topologies for Information Graphs with Associative Operations in Set@l Programming Language
- 5 Development of Resource-Independent Program in Set@l Language
- 6 Conclusions
- References
- High-Level Synthesis of Scalable Solutions from C-Programs for Reconfigurable Computer Systems
- 1 Introduction
- 2 A Review of Existing High-Level Synthesis Tools
- 3 Calculation Scaling for Reconfigurable Computer System
- 3.1 Calculation Scaling Based on Inductive Programs Technology
- 3.2 Calculation Scaling Based on Performance and Hardware Costs Reduction
- 4 The Methodology of High-Level Synthesis for Scalable Solutions of C-Programs
- 4.1 Creation of the Problem Information Graph
- 4.2 Analysis of the Structure of the Problem Information Graph
- 4.3 Transformation of the Information Graph into the Scalable Parallel Pipeline Form
- 4.4 Calculation of the Problem Parallelism Parameters for the Available Hardware Resource
- 4.5 Data Processing Interval Optimization for the Generated Problem Solution
- 5 Results of Experimental Research for the Developed Complex
- 6 Conclusion
- References
- Precompiler for the ACELAN-COMPOS Package Solvers
- 1 Precompiler for Accelerating Solvers for SLE with Block-Band Matrix
- 1.1 Iterative Algorithms for Solving the Target SLEs
- 1.2 Precompiler for the Solver for SLE
- 1.3 Numerical Experiments with the Solver for SLEs with Block-Band Matrices
- 1.4 Parallelization of a Loop with Linear Recurrent Dependence
- 2 Precompiler for Accelerating the Gauss-Seidel Algorithm for Solving the Dirichlet Problem
- 2.1 Automatic Tiling for the Gauss-Seidel Algorithm for Solving the Dirichlet Problem
- 2.2 Linearization of Expressions in the OPS
- 2.3 Gauss-Seidel Algorithm Accelerating by Expression Linearization
- 3 On the Problem of High-Efficiency Software Portability
- References
- Execution of NVRAM Programs with Persistent Stack
- 1 Introduction
- 2 Model
- 2.1 System Model
- 2.2 Failure Model
- 2.3 Operation Execution
- 2.4 Correctness
- 3 Persistent Stack
- 3.1 Program Stack Concept
- 3.2 Issues of Existing Implementations
- 3.3 Persistent Stack Structure
- 3.4 Update of the Persistent Stack
- 4 System Implementation
- 4.1 Pointers to the Memory in NVRAM
- 4.2 Handling Return Values
- 4.3 Architecture of the System
- 5 Verification
- 5.1 Serializability
- 5.2 Running Examples
- 6 Future Work
- 7 Conclusion
- References
- A Study on the Influence of Monitoring System Noise on MPI Collective Operations
- 1 Introduction
- 2 Background and Related Work
- 3 The Proposed Method
- 3.1 Monitoring System Detector
- 3.2 Statistical Criteria
- 3.3 Experimental Setup
- 3.4 Allocation of the Detector
- 4 Experimental Results
- 4.1 Using All Logical Cores
- 4.2 Using Only Physical Cores
- 4.3 Setting Core Affinity
- 4.4 Increased Monitoring Frequency
- 5 Conclusion
- References
- High-Efficiency Specialized Support for Dense Linear Algebra Arithmetic in LuNA System
- 1 Introduction
- 2 Particular Execution Algorithms Approach
- 2.1 Main Idea
- 2.2 Main Definitions and Class of Algorithms Description
- 2.3 Compiler
- 2.4 Run-Time Library and Task Graph Execution
- 3 Performance Evaluation
- 4 Conclusion
- References
- Applications
- Efficient Cluster Parallelization Technology for Aerothermodynamics Problems
- 1 Introduction
- 2 Mathematical Model
- 3 General Organization of the Computational Code
- 4 Shared Memory Parallelization in the OpenMP Environment
- 5 Cluster Parallelization Approach
- 5.1 Grid Optimization and Partitioning
- 5.2 Structure of the Code
- 6 Performance Results
- 7 Conclusion
- References
- Computational Aspects of Solving Grid Equations in Heterogeneous Computing Systems
- 1 Introduction
- 2 Grid Equations Solving Method
- 3 Software Implementation of the Method for Solving Grid Equations
- 4 Parallel Implementation
- 5 Conclusion
- References
- Optimized Hybrid Execution of Dense Matrix-Matrix Multiplication on Clusters of Heterogeneous Multicore and Many-Core Platforms
- 1 Introduction
- 2 Related Work
- 3 System Model
- 4 The Ordering Problem
- 5 Closed-Form Solutions
- 5.1 Case I: LON Does Not Compute
- 5.2 Case II: LON Participates in the Computation
- 6 On-Line Estimation of Cost Parameters
- 7 Experimental Results
- 8 Conclusion and Future Work
- References
- Parallelization of Robust Multigrid Technique Using OpenMP Technology
- 1 Introduction
- 2 Parallel RMT
- 3 Algebraic Parallelism of RMT
- 4 Geometric Parallelism of RMT
- 5 Parallel Multigrid Iteration
- 6 Remarks on Parallel Implementation
- 7 Conclusions
- References
- Network Reliability Calculation with Use of GPUs
- 1 Introduction
- 2 Notations and Definitions
- 3 ATR Calculation Using GPU
- 4 Calculating NRDC Using GPU
- 5 Conclusion
- References
- Memory-Efficient Data Structures
- Implicit Data Layout Optimization for Portable Parallel Programming in C++
- 1 Introduction
- 2 Programming for Accelerators: PACXX vs. State of the Art
- 3 Data Layout Optimization: From Manual to Automatic
- 4 Implicit Data Layout: Design and Implementation
- 5 Conclusion
- References
- On Defragmentation Algorithms for GPU-Native Octree-Based AMR Grids
- 1 Introduction
- 2 Discrete Model on a Cartesian Grid with AMR
- 3 Parallel Defragmentation Algorithms
- 4 Numerical Results
- 5 Conclusions
- References
- Zipped Data Structure for Adaptive Mesh Refinement
- 1 Introduction
- 2 Data Structure
- 2.1 Random Access and Sequential Traversal
- 2.2 Algorithms for Neighbor Search
- 2.3 Mesh Adaptation Algorithms
- 3 Implementation
- 4 Benchmarks
- 5 Conclusion
- References
- Automatic Parallel Tiled Code Generation Based on Dependence Approximation
- 1 Introduction
- 2 Background
- 3 Idea of Dependence Approximation and Code Generation
- 3.1 Dependence Approximation
- 3.2 Time Partitions Constraints and Finding Linearly Independent Solutions to Them
- 3.3 Parallel Tiled Code Generation
- 4 Formal Algorithm
- 5 Related Work
- 6 Experiment
- 6.1 Comparison of Schedules Generated for Polybench Benchmarks
- 6.2 Examining Tile Dimensions for Dynamic Programming Codes
- 7 Conclusions
- References
- Experimental Studies
- Scalability Issues in FFT Computation
- 1 Introduction
- 2 Parallel FFT Performance Bottleneck
- 2.1 Scalability Issues
- 2.2 Peak Performance Model
- 2.3 Choosing the Fastest FFT Parallel Algorithm
- 3 Experimental Results
- 3.1 Strong and Weak Scalability
- 3.2 MPI Selection for Further Acceleration
- 4 Conclusion
- References
- High Performance Implementation of Boris Particle Pusher on DPC++. A First Look at oneAPI
- 1 Introduction
- 2 Method
- 3 Data Structures and Algorithm
- 4 Exploiting Parallelism Using the OneAPI Technology
- 4.1 Reference Implementation of the Boris Pusher
- 4.2 Porting the Pusher to DPC++
- 4.3 Improving Scaling Efficiency
- 5 Numerical Results
- 5.1 Computational Infrastructure
- 5.2 Benchmarks
- 5.3 Results and Discussion
- 6 Conclusion
- References
- Evaluating the Performance of Kunpeng 920 Processors on Modern HPC Applications
- 1 Introduction
- 2 Target Architectures
- 3 Related Work
- 4 Developed Benchmarking System
- 5 Implemented Benchmarks
- 5.1 Triada Variations
- 5.2 Matrix Transpose
- 5.3 One Dimensional Stencil
- 5.4 Two- and Three-Dimensional Stencil
- 5.5 LCopt
- 5.6 HPL (solving a SLE)
- 5.7 Random Number Generation
- 5.8 N-Body
- 5.9 Random Memory Access
- 5.10 Bellman-Ford and Page Rank Graph Algorithms
- 5.11 OpenMP-Based Benchmarks
- 5.12 Scaling of Multi-threaded Benchmarks
- 6 Analysis of Hardware Features
- 7 Conclusions
- References
- Job Management
- Optimization of Resources Allocation in High Performance Distributed Computing with Utilization Uncertainty
- 1 Introduction and Related Works
- 2 Resource Selection Algorithm
- 2.1 Resources Utilization Model
- 2.2 Resources Allocation Under Uncertainties
- 2.3 Near-Optimal Resources Allocation
- 2.4 Greedy Resources Allocation Algorithm
- 2.5 Time Scan Optimization
- 3 Simulation Study
- 3.1 Simulation Environment
- 3.2 Dynamic Resources Allocation
- 3.3 Time Scan Optimization
- 4 Conclusion and Future Work
- References
- Influence of Execution Time Forecast Accuracy on the Efficiency of Scheduling Jobs in a Distributed Network of Supercomputers
- 1 Introduction
- 2 Experiments
- 2.1 GDN Testbed
- 2.2 The Approach for Preparing Test Set
- 2.3 Assumptions and Methodology of the Simulation
- 2.4 Simulation Results
- 3 Conclusion
- References
- Performance Estimation of a BOINC-Based Desktop Grid for Large-Scale Molecular Docking
- 1 Introduction
- 2 Related Works
- 3 The Project SiDock@home
- 4 Performance Analysis of a Desktop Grid
- 4.1 Alternative Performance Metrics of a Desktop Grid
- 5 Conclusion
- References
- Essential Algorithms
- Consensus-Free Ledgers When Operations of Distinct Processes are Commutative
- 1 Introduction
- 1.1 Context of the Study
- 1.2 Computing Model
- 1.3 When the Operations append() of Different Processes Commute
- 2 Underlying Formalization
- 2.1 A Quick Look at Mazurkiewicz's Traces
- 2.2 Problem Formalization
- 2.3 From a Specification to Executions
- 3 An Algorithm Implementing a PC-Ledger
- 3.1 Reliable Broadcast
- 3.2 Local Data Structures
- 4 Proof of the Algorithm
- 5 Conclusion
- A Exercise: From a PC-Ledger to a Distributed PC-State Machine
- References
- Design and Implementation of Highly Scalable Quantifiable Data Structures
- 1 Introduction
- 2 Proving that a Data Structure is Quantifiably Correct
- 3 Design of a Quantifiable Stack
- 4 Design of a Quantifiable Queue
- 5 Quantifiability Applied to Other Types of Data Structures
- 6 Performance
- 7 Related Work
- 8 Conclusion
- References
- Optimal Concurrency for List-Based Sets
- 1 Introduction
- 2 Concurrency Analysis of List-Based Sets
- 2.1 Preliminaries
- 2.2 Concurrency as Admissible Schedules of Sequential Code
- 2.3 Concurrency Analysis of the Lazy and Harris-Michael Linked Lists
- 3 The VBL List
- 3.1 Value-Aware Try-Lock
- 3.2 VBL List
- 4 Experimental Evaluation
- 5 Related Work and Concluding Remarks
- References
- Mobile Agents Operating on a Grid: Some Non-conventional Issues in Parallel Computing
- 1 Rules of the Game
- 2 Several Cops Capture Several Robbers
- 3 Chasing a Continuous Stream of Invaders
- 3.1 Herd Immunity from Invaders
- 3.2 Controlling the Size of an Everlasting Invasion
- 4 Data Collection in a Sparse Sensor Network
- 5 Some Considerations and Possible Extensions
- References
- Computing Services
- Parallel Computations in Integrated Environment of Engineering Modeling and Global Optimization
- 1 Introduction
- 2 OpenFOAM+Globalizer Integration
- 3 Optimizing the Beam Profile of Complex Geometry
- 4 Conclusion
- References
- Implementing Autonomic Internet of Things Ecosystems - Practical Considerations
- 1 Introduction
- 2 Autonomic Computing for the Real-World IoT
- 2.1 Port Automation Pilot
- 2.2 Smart Safety of Workers Pilot
- 2.3 Cohesive Vehicle Monitoring and Diagnostics Pilot
- 3 State-of-the-Art in Tools for Autonomic Computing
- 4 Needs vs Available Tools - Critical Analysis
- 5 Concluding Remarks
- References
- Information-Analytical System to Support the Solution of Compute-Intensive Problems of Mathematical Physics on Supercomputers
- 1 Introduction
- 2 Conceptual Scheme of Intelligent Support for Solving Compute-Intensive Problems
- 3 The Knowledge Base
- 4 Implementation of Information-Analytical Internet Resource
- 5 Related Works
- 6 Conclusion
- References
- The Web Platform for Storing Biotechnologically Significant Properties of Bacterial Strains
- 1 Introduction
- 2 Web-Platform
- 2.1 Architecture
- 2.2 The Data Store
- 2.3 The Metadata Server
- 2.4 The Web-Service Core
- 2.5 The Graphical User Interface
- 3 Conclusion
- References
- Cellular Automata
- Minimal Covering of the Space by Domino Tiles
- 1 Introduction
- 2 Problem Statement
- 3 Design of the CA Rules
- 3.1 Templates
- 3.2 Hit Value
- 3.3 Processing Scheme
- 3.4 The First Rule
- 3.5 The Second Rule: Minimizing the Number of Dominoes
- 3.6 Performance for Other Field Sizes
- 4 Conclusion
- References
- Application of the Generalized Extremal Optimization and Sandpile Model in Search for the Airborne Contaminant Source
- 1 Introduction
- 2 Sandpile Model
- 3 GEO Algorithm
- 4 Two-Dimensional Cellular Automata for Localization Model
- 5 The GEO-Sandpile Localization Model
- 6 Verification of the GEO-Sandpile Model Effectiveness in the Localization of the Contaminant Source
- 6.1 Generation of Testing Data
- 6.2 Test Cases Assumptions
- 7 Results of the GEO-Sandpile Localization Model
- 7.1 Results for Various Wind Speed Test Cases
- 7.2 Results for Various Wind Directions and Target Source Positions
- 8 Conclusions and Future Works
- References
- Author Index
System requirements
File format: PDF
Copy protection: Watermark-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use the free software Adobe Reader, Adobe Digital Editions, or any other PDF viewer of your choice (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or another reading app for eBooks, e.g., PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays a book page identically on any hardware, which makes it suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Watermark-DRM, a "soft" form of copy protection. This means that there are no technical restrictions to prevent illegal distribution. However, a personalised watermark is embedded in the eBook; it can be used to identify the purchaser in the event of misuse and to provide evidence for legal purposes.
For more information, see our eBook Help page.