Advanced Machine Learning with R

Name: Advanced Machine Learning with R | Tackle data analytics and machine learning challenges and build complex applications with R 3.5
Brand: Packt Publishing
Price: 45.99 EUR
Availability: OnlineOnly

Cory Lesmeister (Author)

Packt Publishing

1st Edition

Published on 20 May 2019

664 pages

E-Book

ePUB with Adobe-DRM

978-1-83864-574-8 (ISBN)

€45.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

Master machine learning techniques with real-world projects that use R alongside TensorFlow, H2O, MXNet, and other frameworks

Key Features:

Gain expertise in machine learning, deep learning, and other techniques
Build intelligent end-to-end projects for finance, social media, and a variety of domains
Implement multi-class classification, regression, and clustering

Book Description:

R is one of the most popular languages when it comes to exploring the mathematical side of machine learning and easily performing computational statistics.

This Learning Path shows you how to leverage the R ecosystem to build efficient machine learning applications that carry out intelligent tasks within your organization. You'll tackle realistic projects such as building powerful machine learning models with ensembles to predict employee attrition. You'll explore different clustering techniques to segment customers using wholesale data and use TensorFlow and Keras-R for performing advanced computations. You'll also be introduced to reinforcement learning along with its various use cases and models. Additionally, it shows you how some of these black-box models can be diagnosed and understood.

By the end of this Learning Path, you'll be equipped with the skills you need to deploy machine learning techniques in your own projects.

This Learning Path includes content from the following Packt products:

R Machine Learning Projects by Dr. Sunil Kumar Chinnamgari
Mastering Machine Learning with R - Third Edition by Cory Lesmeister

What you will learn:

Develop a joke recommendation engine to recommend jokes that match users' tastes
Build autoencoders for credit card fraud detection
Work with image recognition and convolutional neural networks
Make predictions for casino slot machines using reinforcement learning
Implement NLP techniques for sentiment analysis and customer segmentation
Produce simple and effective data visualizations for improved insights
Use NLP to extract insights from text
Implement tree-based classifiers, including random forests and boosted trees

Who this book is for:

If you are a data analyst, data scientist, or machine learning developer, this is an ideal Learning Path for you. Each project will help you test your skills in implementing machine learning algorithms and techniques. A basic understanding of machine learning and a working knowledge of R programming are necessary to get the most out of this Learning Path.

Cory Lesmeister has over fourteen years of quantitative experience and is currently a senior data scientist for the advanced analytics team at Cummins, Inc. in Columbus, Indiana. He has spent 16 years at Eli Lilly and Company in sales, market research, Lean Six Sigma, marketing analytics, and new product forecasting. He also has several years of experience in the insurance and banking industries, both as a consultant and as a manager of marketing analytics. A former US Army active duty and reserve officer, Cory was stationed in Baghdad, Iraq, in 2009 serving as the strategic advisor to the 29,000-person Iraqi Oil Police, succeeding where others failed by acquiring and delivering promised equipment to help the country secure and protect its oil infrastructure. He has a BBA in aviation administration from the University of North Dakota and a commercial helicopter license.

Dr. Sunil Kumar Chinnamgari has a Ph.D. in computer science and he specializes in machine learning and natural language processing. He is an AI researcher with more than 14 years of industry experience. Currently, he works in the capacity of a lead data scientist with a US financial giant. He has published several research papers in Scopus and IEEE journals and is a frequent speaker at various meetups. He is an avid coder and has won multiple hackathons. In his spare time, Sunil likes to teach, travel, and spend time with family.

Content

Cover
Title Page
Copyright and Credits
About Packt
Contributors
Table of Contents
Preface
Chapter 1: Preparing and Understanding Data
Overview
Reading the data
Handling duplicate observations
Descriptive statistics
Exploring categorical variables
Handling missing values
Zero and near-zero variance features
Treating the data
Correlation and linearity
Summary
Chapter 2: Linear Regression
Univariate linear regression
Building a univariate model
Reviewing model assumptions
Multivariate linear regression
Loading and preparing the data
Modeling and evaluation - stepwise regression
Modeling and evaluation - MARS
Reverse transformation of natural log predictions
Summary
Chapter 3: Logistic Regression
Classification methods and linear regression
Logistic regression
Model training and evaluation
Training a logistic regression algorithm
Weight of evidence and information value
Feature selection
Cross-validation and logistic regression
Multivariate adaptive regression splines
Model comparison
Summary
Chapter 4: Advanced Feature Selection in Linear Models
Regularization overview
Ridge regression
LASSO
Elastic net
Data creation
Modeling and evaluation
Ridge regression
LASSO
Elastic net
Summary
Chapter 5: K-Nearest Neighbors and Support Vector Machines
K-nearest neighbors
Support vector machines
Manipulating data
Dataset creation
Data preparation
Modeling and evaluation
KNN modeling
Support vector machine
Summary
Chapter 6: Tree-Based Classification
An overview of the techniques
Understanding a regression tree
Classification trees
Random forest
Gradient boosting
Datasets and modeling
Classification tree
Random forest
Extreme gradient boosting - classification
Feature selection with random forests
Summary
Chapter 7: Neural Networks and Deep Learning
Introduction to neural networks
Deep learning - a not-so-deep overview
Deep learning resources and advanced methods
Creating a simple neural network
Data understanding and preparation
Modeling and evaluation
An example of deep learning
Keras and TensorFlow background
Loading the data
Creating the model function
Model training
Summary
Chapter 8: Creating Ensembles and Multiclass Methods
Ensembles
Data understanding
Modeling and evaluation
Random forest model
Creating an ensemble
Summary
Chapter 9: Cluster Analysis
Hierarchical clustering
Distance calculations
K-means clustering
Gower and PAM
Gower
PAM
Random forest
Dataset background
Data understanding and preparation
Modeling
Hierarchical clustering
K-means clustering
Gower and PAM
Random forest and PAM
Summary
Chapter 10: Principal Component Analysis
An overview of the principal components
Rotation
Data
Data loading and review
Training and testing datasets
PCA modeling
Component extraction
Orthogonal rotation and interpretation
Creating scores from the components
Regression with MARS
Test data evaluation
Summary
Chapter 11: Association Analysis
An overview of association analysis
Creating transactional data
Data understanding
Data preparation
Modeling and evaluation
Summary
Chapter 12: Time Series and Causality
Univariate time series analysis
Understanding Granger causality
Time series data
Data exploration
Modeling and evaluation
Univariate time series forecasting
Examining the causality
Linear regression
Vector autoregression
Summary
Chapter 13: Text Mining
Text mining framework and methods
Topic models
Other quantitative analysis
Data overview
Data frame creation
Word frequency
Word frequency in all addresses
Lincoln's word frequency
Sentiment analysis
N-grams
Topic models
Classifying text
Data preparation
LASSO model
Additional quantitative analysis
Summary
Chapter 14: Exploring the Machine Learning Landscape
ML versus software engineering
Types of ML methods
Supervised learning
Unsupervised learning
Semi-supervised learning
Reinforcement learning
Transfer learning
ML terminology - a quick review
Deep learning
Big data
Natural language processing
Computer vision
Cost function
Model accuracy
Confusion matrix
Predictor variables
Response variable
Dimensionality reduction
Class imbalance problem
Model bias and variance
Underfitting and overfitting
Data preprocessing
Holdout sample
Hyperparameter tuning
Performance metrics
Feature engineering
Model interpretability
ML project pipeline
Business understanding
Understanding and sourcing the data
Preparing the data
Model building and evaluation
Model deployment
Learning paradigm
Datasets
Summary
Chapter 15: Predicting Employee Attrition Using Ensemble Models
Philosophy behind ensembling
Getting started
Understanding the attrition problem and the dataset
K-nearest neighbors model for benchmarking the performance
Bagging
Bagged classification and regression trees (treeBag) implementation
Support vector machine bagging (SVMBag) implementation
Naive Bayes (nbBag) bagging implementation
Randomization with random forests
Implementing an attrition prediction model with random forests
Boosting
The GBM implementation
Building attrition prediction model with XGBoost
Stacking
Building attrition prediction model with stacking
Summary
Chapter 16: Implementing a Jokes Recommendation Engine
Fundamental aspects of recommendation engines
Recommendation engine categories
Content-based filtering
Collaborative filtering
Hybrid filtering
Getting started
Understanding the Jokes recommendation problem and the dataset
Converting the DataFrame
Dividing the DataFrame
Building a recommendation system with an item-based collaborative filtering technique
Building a recommendation system with a user-based collaborative filtering technique
Building a recommendation system based on an association-rule mining technique
The Apriori algorithm
Content-based recommendation engine
Differentiating between ITCF and content-based recommendations
Building a hybrid recommendation system for Jokes recommendations
Summary
References
Chapter 17: Sentiment Analysis of Amazon Reviews with NLP
The sentiment analysis problem
Getting started
Understanding the Amazon reviews dataset
Building a text sentiment classifier with the BoW approach
Pros and cons of the BoW approach
Understanding word embedding
Building a text sentiment classifier with pretrained word2vec word embedding based on Reuters news corpus
Building a text sentiment classifier with GloVe word embedding
Building a text sentiment classifier with fastText
Summary
Chapter 18: Customer Segmentation Using Wholesale Data
Understanding customer segmentation
Understanding the wholesale customer dataset and the segmentation problem
Categories of clustering algorithms
Identifying the customer segments in wholesale customer data using k-means clustering
Working mechanics of the k-means algorithm
Identifying the customer segments in the wholesale customer data using DIANA
Identifying the customer segments in the wholesale customers data using AGNES
Summary
Chapter 19: Image Recognition Using Deep Neural Networks
Technical requirements
Understanding computer vision
Achieving computer vision with deep learning
Convolutional Neural Networks
Layers of CNNs
Introduction to the MXNet framework
Understanding the MNIST dataset
Implementing a deep learning network for handwritten digit recognition
Implementing dropout to avoid overfitting
Implementing the LeNet architecture with the MXNet library
Implementing computer vision with pretrained models
Summary
Chapter 20: Credit Card Fraud Detection Using Autoencoders
Machine learning in credit card fraud detection
Autoencoders explained
Types of AEs based on hidden layers
Types of AEs based on restrictions
Applications of AEs
The credit card fraud dataset
Building AEs with the H2O library in R
Autoencoder code implementation for credit card fraud detection
Summary
Chapter 21: Automatic Prose Generation with Recurrent Neural Networks
Understanding language models
Exploring recurrent neural networks
Comparison of feedforward neural networks and RNNs
Backpropagation through time
Problems and solutions to gradients in RNN
Exploding gradients
Vanishing gradients
Building an automated prose generator with an RNN
Implementing the project
Summary
Chapter 22: Winning the Casino Slot Machines with Reinforcement Learning
Understanding RL
Comparison of RL with other ML algorithms
Terminology of RL
The multi-arm bandit problem
Strategies for solving MABP
The epsilon-greedy algorithm
Boltzmann or softmax exploration
Decayed epsilon greedy
The upper confidence bound algorithm
Thompson sampling
Multi-arm bandit - real-world use cases
Solving the MABP with UCB and Thompson sampling algorithms
Summary
Appendix: Creating a Package
Other Books You May Enjoy
Index

Schweitzer Fachinformationen
