Architecture-independent programming and automatic parallelisation have long been regarded as two different means of alleviating the prohibitive costs of parallel software development. Building on recent advances in both areas, Architecture-Independent Loop Parallelisation proposes a unified approach to the parallelisation of scientific computing code. This novel approach is based on the bulk-synchronous parallel model of computation, and succeeds in automatically generating parallel code that is architecture-independent, scalable, and of analytically predictable performance.
Series
Edition
Softcover reprint of the original 1st ed. 2000
Language
Place of publication
Target audience
For professional and research use
Research
Illustrations
Dimensions
Height: 235 mm
Width: 155 mm
Thickness: 11 mm
Weight
ISBN-13
978-1-4471-1197-9 (9781447111979)
DOI
10.1007/978-1-4471-0763-7
1 Introduction
  1.1 Motivation
  1.2 Parallelisation Approach Proposed in the Book
  1.3 Organisation of the Book
2 The Bulk-Synchronous Parallel Model
  2.1 Introduction
  2.2 Bulk-Synchronous Parallel Computers
  2.3 The BSP Programming Model
  2.4 The BSP Cost Model
  2.5 Assessing the Efficiency of BSP Code
  2.6 The Development of BSP Applications
  2.7 BSP Pseudocode
3 Data Dependence Analysis and Code Transformation
  3.1 Introduction
  3.2 Data Dependence
  3.3 Code Transformation Techniques
4 Communication Overheads in Loop Nest Scheduling
  4.1 Introduction
  4.2 Related Work
  4.3 Communication Overheads Due to Input Data
  4.4 Inter-Tile Communication Overheads
  4.5 Summary
5 Template-Matching Parallelisation
  5.1 Introduction
  5.2 Related Work
  5.3 Communication-Free Scheduling
  5.4 Wavefront Block Scheduling
  5.5 Iterative Scheduling
  5.6 Reduction Scheduling
  5.7 Recurrence Scheduling
  5.8 Scheduling Broadcast Loop Nests
  5.9 Summary
6 Generic Loop Nest Parallelisation
  6.1 Introduction
  6.2 Related Work
  6.3 Data Dependence Analysis
  6.4 Potential Parallelism Identification
  6.5 Data and Computation Partitioning
  6.6 Communication and Synchronisation Generation
  6.7 Performance Analysis
  6.8 Summary
7 A Strategy and a Tool for Architecture-Independent Loop Parallelisation
  7.1 Introduction
  7.2 Related Work
  7.3 A Two-Phase Strategy for Loop Nest Parallelisation
  7.4 BSPscheduler: an Architecture-Independent Loop Paralleliser
  7.5 Summary
8 The Effectiveness of Architecture-Independent Loop Parallelisation
  8.1 Introduction
  8.2 Matrix-Vector and Matrix-Matrix Multiplication
  8.3 LU Decomposition
  8.4 Algebraic Path Problem
  8.5 Finite Difference Iteration on a Cartesian Grid
  8.6 Merging
  8.7 Summary
9 Conclusions
  9.1 Summary of Contributions and Concluding Remarks
  9.2 Future work directions