
R: Predictive Analysis
Description
All about e-books | Answers to questions about e-books, copy protection, and file formats can be found in our Info & Help section.
Content
- Cover
- Copyright
- Credits
- Preface
- Table of Contents
- Module 1: Data Analysis with R
- Chapter 1: RefresheR
- Navigating the basics
- Getting help in R
- Vectors
- Functions
- Matrices
- Loading data into R
- Working with packages
- Chapter 2: The Shape of Data
- Univariate data
- Frequency distributions
- Central tendency
- Spread
- Populations, samples, and estimation
- Probability distributions
- Visualization methods
- Exercises
- Summary
- Chapter 3: Describing Relationships
- Multivariate data
- Relationships between a categorical and a continuous variable
- Relationships between two categorical variables
- The relationship between two continuous variables
- Visualization methods
- Exercises
- Summary
- Chapter 4: Probability
- Basic probability
- A tale of two interpretations
- Sampling from distributions
- The normal distribution
- Exercises
- Summary
- Chapter 5: Using Data to Reason About the World
- Estimating means
- The sampling distribution
- Interval estimation
- Smaller samples
- Exercises
- Summary
- Chapter 6: Testing Hypotheses
- Null Hypothesis Significance Testing
- Testing the mean of one sample
- Testing two means
- Testing more than two means
- Testing independence of proportions
- What if my assumptions are unfounded?
- Exercises
- Summary
- Chapter 7: Bayesian Methods
- The big idea behind Bayesian analysis
- Choosing a prior
- Who cares about coin flips
- Enter MCMC - stage left
- Using JAGS and runjags
- Fitting distributions the Bayesian way
- The Bayesian independent samples t-test
- Exercises
- Summary
- Chapter 8: Predicting Continuous Variables
- Linear models
- Simple linear regression
- Simple linear regression with a binary predictor
- Multiple regression
- Regression with a non-binary predictor
- Kitchen sink regression
- The bias-variance trade-off
- Linear regression diagnostics
- Advanced topics
- Exercises
- Summary
- Chapter 9: Predicting Categorical Variables
- k-Nearest Neighbors
- Logistic regression
- Decision trees
- Random forests
- Choosing a classifier
- Exercises
- Summary
- Chapter 10: Sources of Data
- Relational Databases
- Using JSON
- XML
- Other data formats
- Online repositories
- Exercises
- Summary
- Chapter 11: Dealing with Messy Data
- Analysis with missing data
- Analysis with unsanitized data
- Other messiness
- Exercises
- Summary
- Chapter 12: Dealing with Large Data
- Wait to optimize
- Using a bigger and faster machine
- Be smart about your code
- Using optimized packages
- Using another R implementation
- Use parallelization
- Using Rcpp
- Be smarter about your code
- Exercises
- Summary
- Chapter 13: Reproducibility and Best Practices
- R Scripting
- R projects
- Version control
- Communicating results
- Exercises
- Summary
- Module 2: Learning Predictive Analytics with R
- Chapter 1: Visualizing and Manipulating Data Using R
- The roulette case
- Histograms and bar plots
- Scatterplots
- Boxplots
- Line plots
- Application - Outlier detection
- Formatting plots
- Summary
- Chapter 2: Data Visualization with Lattice
- Loading and discovering the lattice package
- Discovering multipanel conditioning with xyplot()
- Discovering other lattice plots
- Updating graphics
- Case study - exploring cancer-related deaths in the US
- Summary
- Chapter 3: Cluster Analysis
- Distance measures
- Learning by doing - partition clustering with kmeans()
- Using k-means with public datasets
- Summary
- Chapter 4: Agglomerative Clustering Using hclust()
- The inner working of agglomerative clustering
- Agglomerative clustering with hclust()
- Summary
- Chapter 5: Dimensionality Reduction with Principal Component Analysis
- The inner working of Principal Component Analysis
- Learning PCA in R
- Summary
- Chapter 6: Exploring Association Rules with Apriori
- Apriori - basic concepts
- The inner working of apriori
- Analyzing data with apriori in R
- Summary
- Chapter 7: Probability Distributions, Covariance, and Correlation
- Probability distributions
- Covariance and correlation
- Summary
- Chapter 8: Linear Regression
- Understanding simple regression
- Working with multiple regression
- Analyzing data in R: correlation and regression
- Robust regression
- Bootstrapping
- Summary
- Chapter 9: Classification with k-Nearest Neighbors and Naïve Bayes
- Understanding k-NN
- Working with k-NN in R
- Understanding Naïve Bayes
- Working with Naïve Bayes in R
- Computing the performance of classification
- Summary
- Chapter 10: Classification Trees
- Understanding decision trees
- ID3
- C4.5
- C5.0
- Classification and regression trees and random forest
- Conditional inference trees and forests
- Installing the packages containing the required functions
- Performing the analyses in R
- Caret - a unified framework for classification
- Summary
- Chapter 11: Multilevel Analyses
- Nested data
- Multilevel regression
- Multilevel modeling in R
- Predictions using multilevel models
- Summary
- Chapter 12: Text Analytics with R
- An introduction to text analytics
- Loading the corpus
- Data preparation
- Creating the training and testing data frames
- Classification of the reviews
- Mining the news with R
- Summary
- Chapter 13: Cross-validation and Bootstrapping Using Caret and Exporting Predictive Models Using PMML
- Cross-validation and bootstrapping of predictive models using the caret package
- Exporting models using PMML
- Summary
- Appendix A: Exercises and Solutions
- Exercises
- Solutions
- Appendix B: Further Reading and References
- Preface
- Chapter 1 - Setting GNU R for Predictive Modeling
- Chapter 2 - Visualizing and Manipulating Data Using R
- Chapter 3 - Data Visualization with Lattice
- Chapter 4 - Cluster Analysis
- Chapter 5 - Agglomerative Clustering Using hclust()
- Chapter 6 - Dimensionality Reduction with Principal Component Analysis
- Chapter 7 - Exploring Association Rules with Apriori
- Chapter 8 - Probability Distributions, Covariance, and Correlation
- Chapter 9 - Linear Regression
- Chapter 10 - Classification with k-Nearest Neighbors and Naïve Bayes
- Chapter 11 - Classification Trees
- Chapter 12 - Multilevel Analyses
- Chapter 13 - Text Analytics with R
- Chapter 14 - Cross-validation and Bootstrapping Using Caret and Exporting Predictive Models Using PMML
- Module 3: Mastering Predictive Analytics with R
- Chapter 1: Gearing Up for Predictive Modeling
- Models
- Types of models
- The process of predictive modeling
- Performance metrics
- Summary
- Chapter 2: Linear Regression
- Introduction to linear regression
- Simple linear regression
- Multiple linear regression
- Assessing linear regression models
- Problems with linear regression
- Feature selection
- Regularization
- Summary
- Chapter 3: Logistic Regression
- Classifying with linear regression
- Introduction to logistic regression
- Predicting heart disease
- Assessing logistic regression models
- Regularization with the lasso
- Classification metrics
- Extensions of the binary logistic classifier
- Summary
- Chapter 4: Neural Networks
- The biological neuron
- The artificial neuron
- Stochastic gradient descent
- Multilayer perceptron networks
- Predicting the energy efficiency of buildings
- Predicting glass type revisited
- Predicting handwritten digits
- Summary
- Chapter 5: Support Vector Machines
- Maximal margin classification
- Support vector classification
- Kernels and support vector machines
- Predicting chemical biodegradation
- Cross-validation
- Predicting credit scores
- Multiclass classification with support vector machines
- Summary
- Chapter 6: Tree-based Methods
- The intuition for tree models
- Algorithms for training decision trees
- Predicting class membership on synthetic 2D data
- Predicting the authenticity of banknotes
- Predicting complex skill learning
- Summary
- Chapter 7: Ensemble Methods
- Bagging
- Boosting
- Predicting atmospheric gamma ray radiation
- Predicting complex skill learning with boosting
- Random forests
- Summary
- Chapter 8: Probabilistic Graphical Models
- A little graph theory
- Bayes' Theorem
- Conditional independence
- Bayesian networks
- The Naïve Bayes classifier
- Hidden Markov models
- Predicting promoter gene sequences
- Predicting letter patterns in English words
- Summary
- Chapter 9: Time Series Analysis
- Fundamental concepts of time series
- Some fundamental time series
- Stationarity
- Stationary time series models
- Non-stationary time series models
- Predicting intense earthquakes
- Predicting lynx trappings
- Predicting foreign exchange rates
- Other time series models
- Summary
- Chapter 10: Topic Modeling
- An overview of topic modeling
- Latent Dirichlet Allocation
- Modeling the topics of online news stories
- Summary
- Chapter 11: Recommendation Systems
- Rating matrix
- Collaborative filtering
- Singular value decomposition
- R and Big Data
- Predicting recommendations for movies and jokes
- Loading and preprocessing the data
- Exploring the data
- Other approaches to recommendation systems
- Summary
- Bibliography
- Index
System requirements
File format: PDF
Copy protection: Adobe DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays every book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe DRM, a "hard" form of copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.