
Practical Machine Learning in R
Description
Machine learning, a branch of artificial intelligence (AI) that enables computers to improve their results and learn new approaches without explicit instructions, allows organizations to reveal patterns in their data and incorporate predictive analytics into their decision-making process. Practical Machine Learning in R provides a hands-on approach to solving business problems with intelligent, self-learning computer algorithms.
Bestselling authors and data analytics experts Fred Nwanganga and Mike Chapple explain what machine learning is, demonstrate its organizational benefits, and provide hands-on examples created in the R programming language. A perfect guide for professional self-taught learners or students in an introductory machine learning course, this reader-friendly book illustrates the numerous real-world business uses of machine learning approaches. Clear and detailed chapters cover data wrangling, R programming with the popular RStudio tool, classification and regression techniques, performance evaluation, and more.
* Explores data management techniques, including data collection, exploration and dimensionality reduction
* Covers unsupervised learning, where readers identify and summarize patterns using approaches such as Apriori, Eclat and clustering
* Describes the principles behind the Nearest Neighbor, Decision Tree and Naive Bayes classification techniques
* Explains how to evaluate and choose the right model, as well as how to improve model performance using ensemble methods such as Random Forest and XGBoost
Practical Machine Learning in R is a must-have guide for business analysts, data scientists, and other professionals interested in leveraging the power of AI to solve business problems, as well as students and independent learners seeking to enter the field.

Persons
MIKE CHAPPLE, PHD, is associate teaching professor of information technology, analytics, and operations at the University of Notre Dame's Mendoza College of Business. Mike is a bestselling author of over 25 books, and he currently serves as academic director of the University's Master of Science in Business Analytics program.
Content
- Cover
- Title Page
- Copyright Page
- About the Authors
- About the Technical Editors
- Acknowledgments
- Contents at a Glance
- Contents
- Introduction
- What Does This Book Cover?
- Reader Support for This Book
- Part I Getting Started
- Chapter 1 What Is Machine Learning?
- Discovering Knowledge in Data
- Introducing Algorithms
- Artificial Intelligence, Machine Learning, and Deep Learning
- Machine Learning Techniques
- Supervised Learning
- Unsupervised Learning
- Model Selection
- Classification Techniques
- Regression Techniques
- Similarity Learning Techniques
- Model Evaluation
- Classification Errors
- Regression Errors
- Types of Error
- Partitioning Datasets
- Holdout Method
- Cross-Validation Methods
- Exercises
- Chapter 2 Introduction to R and RStudio
- Welcome to R
- R and RStudio Components
- The R Language
- RStudio
- RStudio Desktop
- RStudio Server
- Exploring the RStudio Environment
- R Packages
- The CRAN Repository
- Installing Packages
- Loading Packages
- Package Documentation
- Writing and Running an R Script
- Data Types in R
- Vectors
- Testing Data Types
- Converting Data Types
- Missing Values
- Exercises
- Chapter 3 Managing Data
- The Tidyverse
- Data Collection
- Key Considerations
- Collecting Ground Truth Data
- Data Relevance
- Quantity of Data
- Ethics
- Importing the Data
- Reading Comma-Delimited Files
- Reading Other Delimited Files
- Data Exploration
- Describing the Data
- Instance
- Feature
- Dimensionality
- Sparsity and Density
- Resolution
- Descriptive Statistics
- Visualizing the Data
- Comparison
- Relationship
- Distribution
- Composition
- Data Preparation
- Cleaning the Data
- Missing Values
- Noise
- Outliers
- Class Imbalance
- Transforming the Data
- Normalization
- Discretization
- Dummy Coding
- Reducing the Data
- Sampling
- Dimensionality Reduction
- Exercises
- Part II Regression
- Chapter 4 Linear Regression
- Bicycle Rentals and Regression
- Relationships Between Variables
- Correlation
- Regression
- Simple Linear Regression
- Ordinary Least Squares Method
- Simple Linear Regression Model
- Evaluating the Model
- Residuals
- Coefficients
- Diagnostics
- Multiple Linear Regression
- The Multiple Linear Regression Model
- Evaluating the Model
- Residual Diagnostics
- Influential Point Analysis
- Multicollinearity
- Improving the Model
- Considering Nonlinear Relationships
- Considering Categorical Variables
- Considering Interactions Between Variables
- Selecting the Important Variables
- Strengths and Weaknesses
- Case Study: Predicting Blood Pressure
- Importing the Data
- Exploring the Data
- Fitting the Simple Linear Regression Model
- Fitting the Multiple Linear Regression Model
- Exercises
- Chapter 5 Logistic Regression
- Prospecting for Potential Donors
- Classification
- Logistic Regression
- Odds Ratio
- Binomial Logistic Regression Model
- Dealing with Missing Data
- Dealing with Outliers
- Splitting the Data
- Dealing with Class Imbalance
- Training a Model
- Evaluating the Model
- Coefficients
- Diagnostics
- Predictive Accuracy
- Improving the Model
- Dealing with Multicollinearity
- Choosing a Cutoff Value
- Strengths and Weaknesses
- Case Study: Income Prediction
- Importing the Data
- Exploring and Preparing the Data
- Training the Model
- Evaluating the Model
- Exercises
- Part III Classification
- Chapter 6 k-Nearest Neighbors
- Detecting Heart Disease
- k-Nearest Neighbors
- Finding the Nearest Neighbors
- Labeling Unlabeled Data
- Choosing an Appropriate k
- k-Nearest Neighbors Model
- Dealing with Missing Data
- Normalizing the Data
- Dealing with Categorical Features
- Splitting the Data
- Classifying Unlabeled Data
- Evaluating the Model
- Improving the Model
- Strengths and Weaknesses
- Case Study: Revisiting the Donor Dataset
- Importing the Data
- Exploring and Preparing the Data
- Dealing with Missing Data
- Normalizing the Data
- Splitting and Balancing the Data
- Building the Model
- Evaluating the Model
- Exercises
- Chapter 7 Naïve Bayes
- Classifying Spam Email
- Naïve Bayes
- Probability
- Joint Probability
- Conditional Probability
- Classification with Naïve Bayes
- Additive Smoothing
- Naïve Bayes Model
- Evaluating the Model
- Strengths and Weaknesses of the Naïve Bayes Classifier
- Case Study: Revisiting the Heart Disease Detection Problem
- Importing the Data
- Exploring and Preparing the Data
- Building the Model
- Evaluating the Model
- Exercises
- Chapter 8 Decision Trees
- Predicting Build Permit Decisions
- Decision Trees
- Recursive Partitioning
- Entropy
- Information Gain
- Gini Impurity
- Pruning
- Building a Classification Tree Model
- Splitting the Data
- Training a Model
- Evaluating the Model
- Strengths and Weaknesses of the Decision Tree Model
- Case Study: Revisiting the Income Prediction Problem
- Importing the Data
- Exploring and Preparing the Data
- Building the Model
- Evaluating the Model
- Exercises
- Part IV Evaluating and Improving Performance
- Chapter 9 Evaluating Performance
- Estimating Future Performance
- Cross-Validation
- k-Fold Cross-Validation
- Leave-One-Out Cross-Validation
- Random Cross-Validation
- Bootstrap Sampling
- Beyond Predictive Accuracy
- Kappa
- Precision and Recall
- Sensitivity and Specificity
- Visualizing Model Performance
- Receiver Operating Characteristic Curve
- Area Under the Curve
- Exercises
- Chapter 10 Improving Performance
- Parameter Tuning
- Automated Parameter Tuning
- Customized Parameter Tuning
- Ensemble Methods
- Bagging
- Boosting
- Stacking
- Exercises
- Part V Unsupervised Learning
- Chapter 11 Discovering Patterns with Association Rules
- Market Basket Analysis
- Association Rules
- Identifying Strong Rules
- Support
- Confidence
- Lift
- The Apriori Algorithm
- Discovering Association Rules
- Generating the Rules
- Evaluating the Rules
- Strengths and Weaknesses
- Case Study: Identifying Grocery Purchase Patterns
- Importing the Data
- Exploring and Preparing the Data
- Generating the Rules
- Evaluating the Rules
- Exercises
- Notes
- Chapter 12 Grouping Data with Clustering
- Clustering
- k-Means Clustering
- Segmenting Colleges with k-Means Clustering
- Creating the Clusters
- Analyzing the Clusters
- Choosing the Right Number of Clusters
- The Elbow Method
- The Average Silhouette Method
- The Gap Statistic
- Strengths and Weaknesses of k-Means Clustering
- Case Study: Segmenting Shopping Mall Customers
- Exploring and Preparing the Data
- Clustering the Data
- Evaluating the Clusters
- Exercises
- Note
- Index
- EULA
System requirements
File format: PDF
Copy protection: Adobe DRM (Digital Rights Management)
- Computer (Windows, macOS, Linux): Install the free Adobe Digital Editions reader before downloading (see eBook Help).
- Tablet/smartphone (Android, iOS): Install the free Adobe Digital Editions app or the PocketBook app before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe DRM, a "hard" form of copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so you need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.