
Programming Massively Parallel Processors
A Hands-on Approach
Morgan Kaufmann (Publisher)
5th Edition
Published on 2 June 2026
Book
Paperback/Softback
680 pages
978-0-443-43900-1 (ISBN)
Description
Programming Massively Parallel Processors: A Hands-on Approach, Fifth Edition shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Concise, intuitive, and practical, it is based on years of road-testing in the authors' own parallel computing courses. Various techniques for constructing and optimizing parallel programs are explored in detail, while case studies demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs. This new edition has been updated with an expanded repertoire of optimizations, new patterns and applications, and more coverage of important CUDA features.
More details
Edition
5th edition
Language
English
Place of publication
San Francisco
United States
Publishing group
Elsevier Science & Technology
Target group
College/higher education
Professional and scholarly
Product notice
Paperback (trade)
Unsewn / adhesive bound
Dimensions
Height: 235 mm
Width: 191 mm
Weight
450 g
ISBN-13
978-0-443-43900-1 (9780443439001)
Copyright in bibliographic data and cover images is held by Nielsen Book Services Limited or by the publishers or by their respective licensors: all rights reserved.
Other editions
Additional editions

Wen-Mei W. Hwu | David B. Kirk | Izzat El Hajj
Programming Massively Parallel Processors
A Hands-on Approach
E-Book
02/2026
5th Edition
Morgan Kaufmann
€65.99
Available for download
Previous edition

Wen-Mei W. Hwu | David B. Kirk | Izzat El Hajj
Programming Massively Parallel Processors
A Hands-on Approach
Book
09/2022
4th Edition
Morgan Kaufmann
€86.00
Shipment within 10-15 days
Persons
Wen-mei W. Hwu is a Senior Director of Research at NVIDIA and the Sanders-AMD Endowed Chair Professor Emeritus of Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. His work focuses on parallel computing, covering architecture, implementation, compilers, and algorithms. Dr. Hwu has received numerous honors, including the ACM/IEEE Eckert-Mauchly Award, the ACM Grace Murray Hopper Award, and the IEEE B.R. Rau Award. He is an IEEE and ACM Fellow. He earned his Ph.D. in Computer Science from UC Berkeley.
David B. Kirk is known for major contributions to graphics, hardware, and algorithms. Before pursuing his Ph.D. at Caltech, he earned B.S. and M.S. degrees in mechanical engineering from MIT and worked at Raster Technologies and Hewlett-Packard's Apollo Systems Division. After completing his doctorate, he served as chief scientist and head of technology at Crystal Dynamics. In 1997, he became Chief Scientist at NVIDIA. Dr. Kirk has received numerous honors, including the IEEE Seymour Cray Computer Engineering Award and the ACM SIGGRAPH Computer Graphics Achievement Award. He is a member of the U.S. National Academy of Engineering.
Izzat El Hajj is an Assistant Professor of Computer Science at the American University of Beirut. His research focuses on leveraging accelerator architectures to tackle challenging computations, with a focus on GPU computing, processing-in-memory, and performance modeling. He earned his Ph.D. in Electrical and Computer Engineering at the University of Illinois at Urbana-Champaign. He has received the Dan Vivoli Endowed Fellowship (UIUC) and the Distinguished Graduate Award from the American University of Beirut.
Author
CTO, MulticoreWare and professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA
NVIDIA Fellow
Assistant Professor, Department of Computer Science, American University of Beirut, Lebanon
Content
1. Introduction
Part I. Fundamental Concepts
2. Heterogeneous data parallel computing
3. Multidimensional grids and data
4. Compute architecture and scheduling
5. Memory architecture and data locality
6. Performance considerations
Part II. Parallel Patterns
7. Convolution
8. Stencil
9. Parallel histogram
10. Reduction
11. Prefix sum (scan)
12. Merge
Part III. Advanced Patterns and Applications
13. Sorting
14. Filtering (new)
15. Sparse matrix computation
16. Wavefront Algorithms (new)
17. Graph traversal
18. Deep learning
19. Multi-GPU API (new)
20. Electrostatic potential map
21. Parallel programming and computational thinking
Part IV. Advanced Practices
22. Programming a heterogeneous computing cluster
23. Advanced Optimizations for Matrix Multiplication (new)
24. Advanced practices and future evolution
25. Conclusion and outlook