
Foundations of Statistics for Data Scientists
With R and Python
CRC Press
1st Edition
Will be published approx. on 15. September 2028
Book
Paperback/Softback
468 pages
978-0-367-74843-2 (ISBN)
Description
Shows the elements of statistical science that are highly relevant for students who plan to become data scientists
Places less emphasis on probability theory and methods of probability such as combinatorics, derivations of probability distributions of transformations of random variables (except for explanations of t, chi-squared, and F constructions), formal statements and proofs of theorems, and decision theory
Introduces some modern topics that do not normally appear in "math stat" texts but are especially relevant for data scientists, such as generalized linear models for non-normal responses (e.g., logistic regression), Bayesian and regularized fitting of models (e.g., showing an example using the lasso), classification and clustering, and implementing methods with modern software (R and Python)
Reviews / Votes
"[...] Overall, I found the book to be a creative and refreshing take on the challenge of building foundations of "classical" statistics while helping introduce newer topics that are increasingly central to the statistical sciences. Important ideas of the past 50 years (see Gelman and Vehtari 2021) such as resampling, regularization, and hierarchical modeling are incorporated as optional sections (marked with an asterisk). The authors have captured much of the excitement of the statistical sciences and shared it in a way that I believe that students (and instructors) will share their enthusiasm. I look forward to teaching using this book."-Nicholas J. Horton in the Journal of the American Statistical Association, July 2022
"If you find the other books (co-)authored by A. Agresti interesting, you will not be disappointed this time either. The book is a very good mixture of theory and practice. It presents the topics in statistical science that any data scientist should be familiar with. ... In general, the theory is provided in an easy to read and understand way. Mathematical details are limited to minimum. The emphasis is on the intuitive explanation of the statistical theory and its implementation in practice. And because of that the theory is broadly illustrated with examples based on the real data (which is an additional asset of the book). ... Another pro worth mentioning is the way how the book is organized. It is extremely easy to go back and find the content which is needed. Blueshaded areas with key messages, R codes presented in blue, summaries at the end of each sections - all of this makes this book very transparent and well organized. The book can be truly recommended to students who would like to start their journey as Data Scientists or young practitioners in this field. It can be also a great inspiration for lecturers."
-Kinga Salapa in ISCB Book Reviews, September 2022
"The main goal of this textbook is to present foundational statistical methods and theory that are relevant in the field of data science. The authors depart from the typical approaches taken by many conventional mathematical statistics textbooks by placing more emphasis on providing the students with intuitive and practical interpretations of those methods with the aid of R programming codes. The book also takes slightly different organizations and presents a few topics that are not commonly found in conventional mathematical statistics textbooks. Notably, the book introduces both the frequentist approach and the Bayesian approach for each chapter on statistical inference in Chapters 4 - 6...I find its particular strength to be its intuitive presentation of statistical theory and methods without getting bogged down in mathematical details that are perhaps less useful to the practitioners."
-Mintaek Lee, Boise State University
"The statistical training for budding data scientists is different than the statistical training for budding statisticians, or other scientists. Data scientists require a different mix of theory and practice than statisticians, plus a great deal more exposure to computation than many other types of scientists. The aspects of this manuscript that I find appealing for the courses I teach: 1. The use of real data. 2. The use of R but with the option to use Python. 3. A good mix of theory and practice. 4. The text is well-written with good exercises. 5. The coverage of topics (e.g. Bayesian methods and clustering) that are not usually part of a course in statistics at the level of this book".
-Jason M. Graham, University of Scranton
"This book distinguishes itself with its focus on computational aspects of statistics (the appendices on R and Python and the examples throughout the text that use R). The 'cost' of this approach seems to be that much less attention is given to probability than in a standard text. There is a definite market for this approach - computational statistics/data science do not really require as much probability background as is usually given, while more focus on the way that things are actually done in practice (with software such as R or Python) is extremely beneficial to students that are looking to apply statistical methods. There is a wealth of problems in the book, and their variety (both computational and theoretical) is much appreciated. Also, the expansive appendices on R and Python wonderful, and will be of great help to students...Two major reasons that I would adopt the book are that its discussions seem to be slightly nontraditional in some cases (see above), yet still getting the salient points across. I also am happy about the examples throughout the text that use R-this is very useful for my students."
-Christopher Gaffney, Drexel University
"I will most likely adopt the proposed book for my class. The book seems to provide just about right level of mathematics-not too theoretical or like many other cookbooks which are available for R programming."
-Tumulesh Solanky, University of New Orleans
"The book is well-written and the examples are well-suited for building foundations for statistical science for data science as a discipline. The material covers most of the theoretical backgrounds in statistics. Throughout the book, the authors have used R programming to illustrate the concepts. In many cases, simulations were presented to support the theory. Each chapter has abundant practical exercises for the readers to explore the materials further. This textbook can serve as a textbook for a data science curriculum."
-Steve Chung, Cal State University Fresno
More details
Language
English
Place of publication
London
United Kingdom
Publishing group
Taylor & Francis Ltd
Target group
College/higher education
Illustrations
104 Halftones, color; 3 Halftones, black and white; 104 Illustrations, color; 3 Illustrations, black and white
Dimensions
Height: 254 mm
Width: 178 mm
ISBN-13
978-0-367-74843-2 (9780367748432)
Other editions
Additional editions

Book
11/2021
1st Edition
Chapman & Hall/CRC
€166.90
Shipment within 15-20 days

E-Book
11/2021
1st Edition
Chapman & Hall/CRC
€125.99
Available for download

Persons
Alan Agresti, Distinguished Professor Emeritus at the University of Florida, is the author of seven books, including Categorical Data Analysis (Wiley) and Statistics: The Art and Science of Learning from Data (Pearson), and has presented short courses in 35 countries. His awards include an honorary doctorate from De Montfort University (UK) and Statistician of the Year from the American Statistical Association (Chicago chapter).
Maria Kateri, Professor of Statistics and Data Science at the RWTH Aachen University, authored the monograph Contingency Table Analysis: Methods and Implementation Using R (Birkhaeuser/Springer) and a textbook on mathematics for economists (in German). She has long-term experience in teaching statistics courses to students of Data Science, Mathematics, Statistics, Computer Science, Business Administration, and Engineering.
Content
1. Introduction to Statistical Science
2. Probability Distributions
3. Sampling Distributions
4. Statistical Inference: Estimation
5. Statistical Inference: Significance Testing
6. Linear Models and Least Squares
7. Generalized Linear Models
8. Classification and Clustering
9. Statistical Science: A Historical Overview
Appendices