
Practical Statistics for Data Scientists
50+ Essential Concepts Using R and Python
O'Reilly (Publisher)
2nd Edition
Published on 29 June 2020
Book
Paperback/Softback
350 pages
978-1-4920-7294-2 (ISBN)
Description
Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not.
Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you're familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format.
With this book, you'll learn:
Why exploratory data analysis is a key preliminary step in data science
How random sampling can reduce bias and yield a higher-quality dataset, even with big data
How the principles of experimental design yield definitive answers to questions
How to use regression to estimate outcomes and detect anomalies
Key classification techniques for predicting which categories a record belongs to
Statistical machine learning methods that "learn" from data
Unsupervised learning methods for extracting meaning from unlabeled data
More details
Edition
2nd New edition
Language
English
Place of publication
Sebastopol
United States
Edition type
New edition
Dimensions
Height: 233 mm
Width: 178 mm
ISBN-13
978-1-4920-7294-2 (9781492072942)
Other editions
New editions
Book
approx. 07/2026
3rd Edition
O'Reilly
€79.50
Not yet published
Additional editions

E-Book
04/2020
O'Reilly
€50.49
Available for download

Peter Bruce | Andrew Bruce | Peter Gedeck
Persons
Peter Bruce is the founder and Chief Academic Officer of the Institute for Statistics Education at Statistics.com, which offers about 80 courses in statistics and analytics, roughly half of which are aimed at data scientists. He has authored or co-authored several books in statistics and analytics. He earned his bachelor's degree at Princeton and master's degrees at Harvard and the University of Maryland.
Andrew Bruce, Principal Research Scientist at Amazon, has over 30 years of experience in statistics and data science in academia, government, and business. The co-author of Applied Wavelet Analysis with S-PLUS, he earned his bachelor's degree at Princeton and his PhD in statistics at the University of Washington.
Peter Gedeck, Senior Data Scientist at Collaborative Drug Discovery, specializes in the development of machine learning algorithms to predict biological and physicochemical properties of drug candidates. Co-author of Data Mining for Business Analytics, he earned PhDs in chemistry from the University of Erlangen-Nürnberg in Germany and in mathematics from Fernuniversität Hagen, Germany.