
Machine Learning with R Cookbook, Second Edition

Person
Yu-Wei Chiu (David Chiu) is the founder of LargitData (www.LargitData.com), a startup that focuses mainly on big data and machine learning products. He previously worked for Trend Micro as a software engineer, where he was responsible for building big data platforms for business intelligence and customer relationship management systems. In addition to being a start-up entrepreneur and data scientist, he specializes in using Spark and Hadoop to process big data and in applying data mining techniques for data analysis. Yu-Wei is also a professional lecturer who has delivered lectures on big data and machine learning in R and Python and has given tech talks at a variety of conferences. In 2015, Yu-Wei wrote Machine Learning with R Cookbook, Packt Publishing. In 2013, Yu-Wei reviewed Bioinformatics with R Cookbook, Packt Publishing. For more information, please visit his personal website at www.ywchiu.com.

Acknowledgement

I am immensely grateful to my family and friends for supporting and encouraging me to complete this book. I would like to sincerely thank my mother, Ming-Yang Huang (Miranda Huang); my mentor, Man-Kwan Shan; the proofreader of this book, Brendan Fisher; the members of LargitData; the Data Science Program (DSP); and other friends who have offered their support.
Content
- Cover
- Copyright
- Credits
- About the Authors
- About the Reviewers
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Practical Machine Learning with R
- Introduction
- Downloading and installing R
- Getting ready
- How to do it...
- How it works...
- See also
- Downloading and installing RStudio
- Getting ready
- How to do it...
- How it works...
- See also
- Installing and loading packages
- Getting ready
- How to do it...
- How it works...
- See also
- Understanding of basic data structures
- Data types
- Data structures
- Vectors
- How to do it...
- How it works...
- Lists
- How to do it...
- How it works...
- Array
- How to do it...
- How it works...
- Matrix
- How to do it...
- DataFrame
- How to do it...
- Basic commands for subsetting
- How to do it...
- Data input
- Reading and writing data
- Getting ready
- How to do it...
- How it works...
- There's more...
- Manipulating data
- Getting ready
- How to do it...
- How it works...
- There's more...
- Applying basic statistics
- Getting ready
- How to do it...
- How it works...
- There's more...
- Visualizing data
- Getting ready
- How to do it...
- How it works...
- See also
- Getting a dataset for machine learning
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 2: Data Exploration with Air Quality Datasets
- Introduction
- Using air quality dataset
- Getting ready
- How to do it...
- How it works...
- There's more...
- Converting attributes to factor
- Getting ready
- How to do it...
- How it works...
- There's more...
- Detecting missing values
- Getting ready
- How to do it...
- How it works...
- There's more...
- Imputing missing values
- Getting ready
- How to do it...
- How it works...
- Exploring and visualizing data
- Getting ready
- How to do it...
- Predicting values from datasets
- Getting ready
- How to do it...
- How it works...
- Chapter 3: Analyzing Time Series Data
- Introduction
- Looking at time series data
- Getting ready
- How to do it...
- How it works...
- See also
- Plotting and forecasting time series data
- Getting ready
- How to do it...
- How it works...
- See also
- Extracting, subsetting, merging, filling, and padding
- Getting ready
- How to do it...
- How it works...
- See also
- Successive differences and moving averages
- Getting ready
- How to do it...
- How it works...
- See also
- Exponential smoothing
- Getting ready
- How to do it...
- How it works...
- See also
- Plotting the autocorrelation function
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 4: R and Statistics
- Introduction
- Understanding data sampling in R
- Getting ready
- How to do it...
- How it works...
- See also
- Operating a probability distribution in R
- Getting ready
- How to do it...
- How it works...
- There's more...
- Working with univariate descriptive statistics in R
- Getting ready
- How to do it...
- How it works...
- There's more...
- Performing correlations and multivariate analysis
- Getting ready
- How to do it...
- How it works...
- See also
- Conducting an exact binomial test
- Getting ready
- How to do it...
- How it works...
- See also
- Performing a student's t-test
- Getting ready
- How to do it...
- How it works...
- See also
- Performing the Kolmogorov-Smirnov test
- Getting ready
- How to do it...
- How it works...
- See also
- Understanding the Wilcoxon Rank Sum and Signed Rank test
- Getting ready
- How to do it...
- How it works...
- See also
- Working with Pearson's Chi-squared test
- Getting ready
- How to do it...
- How it works...
- There's more...
- Conducting a one-way ANOVA
- Getting ready
- How to do it...
- How it works...
- There's more...
- Performing a two-way ANOVA
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 5: Understanding Regression Analysis
- Introduction
- Different types of regression
- Fitting a linear regression model with lm
- Getting ready
- How to do it...
- How it works...
- There's more...
- Summarizing linear model fits
- Getting ready
- How to do it...
- How it works...
- See also
- Using linear regression to predict unknown values
- Getting ready
- How to do it...
- How it works...
- See also
- Generating a diagnostic plot of a fitted model
- Getting ready
- How to do it...
- How it works...
- There's more...
- Fitting multiple regression
- Getting ready
- How to do it...
- How it works...
- Summarizing multiple regression
- Getting ready
- How to do it...
- How it works...
- See also
- Using multiple regression to predict unknown values
- Getting ready
- How to do it...
- How it works...
- See also
- Fitting a polynomial regression model with lm
- Getting ready
- How to do it...
- How it works...
- There's more...
- Fitting a robust linear regression model with rlm
- Getting ready
- How to do it...
- How it works...
- There's more...
- Studying a case of linear regression on SLID data
- Getting ready
- How to do it...
- How it works...
- See also
- Applying the Gaussian model for generalized linear regression
- Getting ready
- How to do it...
- How it works...
- See also
- Applying the Poisson model for generalized linear regression
- Getting ready
- How to do it...
- How it works...
- See also
- Applying the Binomial model for generalized linear regression
- Getting ready
- How to do it...
- How it works...
- See also
- Fitting a generalized additive model to data
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing a generalized additive model
- Getting ready
- How to do it...
- How it works...
- There's more...
- Diagnosing a generalized additive model
- Getting ready
- How to do it...
- How it works...
- There's more...
- Chapter 6: Survival Analysis
- Introduction
- Loading and observing data
- Getting ready
- How to do it...
- How it works...
- There's more...
- Viewing the summary of survival analysis
- Getting ready
- How to do it...
- How it works...
- Visualizing the Survival Curve
- Getting ready
- How to do it...
- How it works...
- Using the log-rank test
- Getting ready
- How to do it...
- How it works...
- Using the COX proportional hazard model
- Getting ready
- How to do it...
- How it works...
- Nelson-Aalen Estimator of cumulative hazard
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 7: Classification 1 - Tree, Lazy, and Probabilistic
- Introduction
- Preparing the training and testing datasets
- Getting ready
- How to do it...
- How it works...
- There's more...
- Building a classification model with recursive partitioning trees
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing a recursive partitioning tree
- Getting ready
- How to do it...
- How it works...
- See also
- Measuring the prediction performance of a recursive partitioning tree
- Getting ready
- How to do it...
- How it works...
- See also
- Pruning a recursive partitioning tree
- Getting ready
- How to do it...
- How it works...
- See also
- Handling missing data and split and surrogate variables
- Getting ready
- How to do it...
- How it works...
- See also
- Building a classification model with a conditional inference tree
- Getting ready
- How to do it...
- How it works...
- See also
- Control parameters in conditional inference trees
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing a conditional inference tree
- Getting ready
- How to do it...
- How it works...
- See also
- Measuring the prediction performance of a conditional inference tree
- Getting ready
- How to do it...
- How it works...
- See also
- Classifying data with the k-nearest neighbor classifier
- Getting ready
- How to do it...
- How it works...
- See also
- Classifying data with logistic regression
- Getting ready
- How to do it...
- How it works...
- See also
- Classifying data with the Naïve Bayes classifier
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 8: Classification 2 - Neural Network and SVM
- Introduction
- Classifying data with a support vector machine
- Getting ready
- How to do it...
- How it works...
- See also
- Choosing the cost of a support vector machine
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing an SVM fit
- Getting ready
- How to do it...
- How it works...
- See also
- Predicting labels based on a model trained by a support vector machine
- Getting ready
- How to do it...
- How it works...
- There's more...
- Tuning a support vector machine
- Getting ready
- How to do it...
- How it works...
- See also
- The basics of neural network
- Getting ready
- How to do it...
- Training a neural network with neuralnet
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing a neural network trained by neuralnet
- Getting ready
- How to do it...
- How it works...
- See also
- Predicting labels based on a model trained by neuralnet
- Getting ready
- How to do it...
- How it works...
- See also
- Training a neural network with nnet
- Getting ready
- How to do it...
- How it works...
- See also
- Predicting labels based on a model trained by nnet
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 9: Model Evaluation
- Introduction
- Why do models need to be evaluated?
- Different methods of model evaluation
- Estimating model performance with k-fold cross-validation
- Getting ready
- How to do it...
- How it works...
- There's more...
- Estimating model performance with Leave One Out Cross Validation
- Getting ready
- How to do it...
- How it works...
- See also
- Performing cross-validation with the e1071 package
- Getting ready
- How to do it...
- How it works...
- See also
- Performing cross-validation with the caret package
- Getting ready
- How to do it...
- How it works...
- See also
- Ranking the variable importance with the caret package
- Getting ready
- How to do it...
- How it works...
- There's more...
- Ranking the variable importance with the rminer package
- Getting ready
- How to do it...
- How it works...
- See also
- Finding highly correlated features with the caret package
- Getting ready
- How to do it...
- How it works...
- See also
- Selecting features using the caret package
- Getting ready
- How to do it...
- How it works...
- See also
- Measuring the performance of the regression model
- Getting ready
- How to do it...
- How it works...
- There's more...
- Measuring prediction performance with a confusion matrix
- Getting ready
- How to do it...
- How it works...
- See also
- Measuring prediction performance using ROCR
- Getting ready
- How to do it...
- How it works...
- See also
- Comparing an ROC curve using the caret package
- Getting ready
- How to do it...
- How it works...
- See also
- Measuring performance differences between models with the caret package
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 10: Ensemble Learning
- Introduction
- Using the Super Learner algorithm
- Getting ready
- How to do it...
- How it works...
- Using ensemble to train and test
- Getting ready
- How to do it...
- How it works...
- Classifying data with the bagging method
- Getting ready
- How to do it...
- How it works...
- There's more...
- Performing cross-validation with the bagging method
- Getting ready
- How to do it...
- How it works...
- See also
- Classifying data with the boosting method
- Getting ready
- How to do it...
- How it works...
- There's more...
- Performing cross-validation with the boosting method
- Getting ready
- How to do it...
- How it works...
- See also
- Classifying data with gradient boosting
- Getting ready
- How to do it...
- How it works...
- There's more...
- Calculating the margins of a classifier
- Getting ready
- How to do it...
- How it works...
- See also
- Calculating the error evolution of the ensemble method
- Getting ready
- How to do it...
- How it works...
- See also
- Classifying data with random forest
- Getting ready
- How to do it...
- How it works...
- There's more...
- Estimating the prediction errors of different classifiers
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 11: Clustering
- Introduction
- Clustering data with hierarchical clustering
- Getting ready
- How to do it...
- How it works...
- There's more...
- Cutting trees into clusters
- Getting ready
- How to do it...
- How it works...
- There's more...
- Clustering data with the k-means method
- Getting ready
- How to do it...
- How it works...
- See also
- Drawing a bivariate cluster plot
- Getting ready
- How to do it...
- How it works...
- There's more...
- Comparing clustering methods
- Getting ready
- How to do it...
- How it works...
- See also
- Extracting silhouette information from clustering
- Getting ready
- How to do it...
- How it works...
- See also
- Obtaining the optimum number of clusters for k-means
- Getting ready
- How to do it...
- How it works...
- See also
- Clustering data with the density-based method
- Getting ready
- How to do it...
- How it works...
- See also
- Clustering data with the model-based method
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing a dissimilarity matrix
- Getting ready
- How to do it...
- How it works...
- There's more...
- Validating clusters externally
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 12: Association Analysis and Sequence Mining
- Introduction
- Transforming data into transactions
- Getting ready
- How to do it...
- How it works...
- See also
- Displaying transactions and associations
- Getting ready
- How to do it...
- How it works...
- See also
- Mining associations with the Apriori rule
- Getting ready
- How to do it...
- How it works...
- See also
- Pruning redundant rules
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing association rules
- Getting ready
- How to do it...
- How it works...
- See also
- Mining frequent itemsets with Eclat
- Getting ready
- How to do it...
- How it works...
- See also
- Creating transactions with temporal information
- Getting ready
- How to do it...
- How it works...
- See also
- Mining frequent sequential patterns with cSPADE
- Getting ready
- How to do it...
- How it works...
- See also
- Using the TraMineR package for sequence analysis
- Getting ready
- How to do it...
- How it works...
- Visualizing sequence, Chronogram, and Traversal Statistics
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 13: Dimension Reduction
- Introduction
- Why to reduce the dimension?
- Performing feature selection with FSelector
- Getting ready
- How to do it...
- How it works...
- See also
- Performing dimension reduction with PCA
- Getting ready
- How to do it...
- How it works...
- There's more...
- Determining the number of principal components using the scree test
- Getting ready
- How to do it...
- How it works...
- There's more...
- Determining the number of principal components using the Kaiser method
- Getting ready
- How to do it...
- How it works...
- See also
- Visualizing multivariate data using biplot
- Getting ready
- How to do it...
- How it works...
- There's more...
- Performing dimension reduction with MDS
- Getting ready
- How to do it...
- How it works...
- There's more...
- Reducing dimensions with SVD
- Getting ready
- How to do it...
- How it works...
- See also
- Compressing images with SVD
- Getting ready
- How to do it...
- How it works...
- See also
- Performing nonlinear dimension reduction with ISOMAP
- Getting ready
- How to do it...
- How it works...
- There's more...
- Performing nonlinear dimension reduction with Local Linear Embedding
- Getting ready
- How to do it...
- How it works...
- See also
- Chapter 14: Big Data Analysis (R and Hadoop)
- Introduction
- Preparing the RHadoop environment
- Getting ready
- How to do it...
- How it works...
- See also
- Installing rmr2
- Getting ready
- How to do it...
- How it works...
- See also
- Installing rhdfs
- Getting ready
- How to do it...
- How it works...
- See also
- Operating HDFS with rhdfs
- Getting ready
- How to do it...
- How it works...
- See also
- Implementing a word count problem with RHadoop
- Getting ready
- How to do it...
- How it works...
- See also
- Comparing the performance between an R MapReduce program and a standard R program
- Getting ready
- How to do it...
- How it works...
- See also
- Testing and debugging the rmr2 program
- Getting ready
- How to do it...
- How it works...
- See also
- Installing plyrmr
- Getting ready
- How to do it...
- How it works...
- See also
- Manipulating data with plyrmr
- Getting ready
- How to do it...
- How it works...
- See also
- Conducting machine learning with RHadoop
- Getting ready
- How to do it...
- How it works...
- See also
- Configuring RHadoop clusters on Amazon EMR
- Getting ready
- How to do it...
- How it works...
- See also
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The ePub file format works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.
File format: PDF
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (limited support on Kindle).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.