
Markov Decision Processes and Reinforcement Learning
Cambridge University Press
Expected publication date: approx. 31 August 2026
Book
Hardback
780 pages
978-1-009-09841-0 (ISBN)
Description
This book offers a comprehensive introduction to Markov decision process and reinforcement learning fundamentals using common mathematical notation and language. Its goal is to provide a solid foundation that enables readers to engage meaningfully with these rapidly evolving fields. Topics covered include finite and infinite horizon models, partially observable models, value function approximation, simulation-based methods, Monte Carlo methods, and Q-learning. Rigorous mathematical concepts and algorithmic developments are supported by numerous worked examples. As an up-to-date successor to Martin L. Puterman's influential 1994 textbook, this volume assumes familiarity with probability, mathematical notation, and proof techniques. It is ideally suited for students, researchers, and professionals in operations research, computer science, engineering, and economics.
Reviews
'This book invites a new generation of students, researchers, and practitioners into the thrilling world of sequential decision making. The mathematical exposition of both the foundations and cutting-edge ideas is both intuitive and rigorous.' Mykel Kochenderfer, Stanford University
'Puterman and Chan have authored the authoritative guide to Markov Decision Processes and Reinforcement Learning. This comprehensive book seamlessly blends theoretical insights with practical implementations and expert advice. It serves as the go-to resource for AI and Operations Research researchers and practitioners seeking state-of-the-art analysis of MDPs and RL.' Alan Mackworth, University of British Columbia (Emeritus)
'Puterman and Chan have produced an outstanding modern treatment of the subject that distills the core concepts, models, and computational approaches in a rigorous yet highly accessible way. This book will serve as a standard, 'must-have' text for those new to the subject or looking for a coherent distillation that seamlessly integrates both operations research and computer science perspectives. Practitioners will find this book to be an indispensable resource for developing models for use in the real world.' Daniel Adelman, The University of Chicago Booth School of Business
'This excellent new book provides a structured and up-to-date account of the field that is both accessible and rigorous. It covers a broad range of topics – from a variety of objective functions for the classical problem to its partially observable variant. It concludes with detailed chapters on reinforcement learning that should guide the reader in practical applications.' Abhijit Gosavi, Missouri University of Science and Technology
Language
English
Place of publication
Cambridge
United Kingdom
Target group
College/higher education
Product notice
Laminated cover
Illustrations
Worked examples or Exercises
Dimensions
Height: 229 mm
Width: 152 mm
Thickness: 39 mm
Weight
500 g
ISBN-13
978-1-009-09841-0 (9781009098410)
Persons
Martin L. Puterman is Professor Emeritus at the Sauder School of Business, University of British Columbia. He received the INFORMS Lanchester Prize for his widely cited 1994 book Markov Decision Processes. He is an INFORMS Fellow and has received the CORS Award of Merit, the CORS Practice Prize and the INFORMS Case Competition Award.
Timothy C. Y. Chan is Associate Vice-President and Vice-Provost, Strategic Initiatives and Professor of Industrial Engineering at the University of Toronto. He is an award-winning teacher, having been recognized with the INFORMS Prize for Teaching of OR/MS Practice, the INFORMS Case Competition Award, and the University of Toronto President's Teaching Award.
Author
University of British Columbia, Vancouver
University of Toronto
Content
Preface; 1. Introduction
Part I. Fundamentals: 2. Markov decision process fundamentals; 3. Examples and applications
Part II. Classical Markov Decision Process Models: 4. Finite horizon models; 5. Infinite horizon models: expected discounted reward; 6. Infinite horizon models: expected total reward; 7. Infinite horizon models: long-run average reward; 8. Partially observable Markov decision processes
Part III. Reinforcement Learning: 9. Value function approximation; 10. Simulation in tabular models; 11. Simulation with function approximation
Appendix A. Notation and conventions; Appendix B. Markov chains; Appendix C. Linear programming
Bibliography; Index