
Hands-On Predictive Analytics with Python
Description
Predictive analytics is an applied field that uses a variety of quantitative methods to make predictions from data. It involves much more than just throwing data onto a computer to build a model. This book provides practical coverage to help you understand the most important concepts of predictive analytics. Using practical, step-by-step examples, we build predictive analytics solutions with cutting-edge Python tools and packages. The book's step-by-step approach starts by defining the problem and moves on to identifying relevant data. It then covers preparing the data, exploring and visualizing relationships, and building, tuning, evaluating, and deploying models. Each stage has relevant practical examples and efficient Python code. You will work with models such as KNN, random forests, and neural networks using the most important libraries in Python's data science stack: NumPy, pandas, Matplotlib, Seaborn, Keras, Dash, and so on. In addition to hands-on code examples, you will find intuitive explanations of the inner workings of the main techniques and algorithms used in predictive analytics. By the end of this book, you will be ready to build high-performance predictive analytics solutions using Python.

Content
- Cover
- Title Page
- Copyright and Credits
- About Packt
- Contributors
- Table of Contents
- Preface
- Chapter 1: The Predictive Analytics Process
- Technical requirements
- What is predictive analytics?
- Reviewing important concepts of predictive analytics
- The predictive analytics process
- Problem understanding and definition
- Data collection and preparation
- Dataset understanding using EDA
- Model building
- Model evaluation
- Communication and/or deployment
- CRISP-DM and other approaches
- A quick tour of Python's data science stack
- Anaconda
- Jupyter
- NumPy
- A mini NumPy tutorial
- SciPy
- pandas
- Matplotlib
- Seaborn
- Scikit-learn
- TensorFlow and Keras
- Dash
- Summary
- Further reading
- Chapter 2: Problem Understanding and Data Preparation
- Technical requirements
- Understanding the business problem and proposing a solution
- Context is everything
- Define what is going to be predicted
- Make explicit the data that will be required
- Think about access to the data
- Proposing a solution
- Define your methodology
- Define key metrics of model performance
- Define the deliverables of the project
- Practical project - diamond prices
- Diamond prices - problem understanding and definition
- Getting more context
- Diamond prices - proposing a solution at a high level
- Goal
- Methodology
- Metrics for the model
- Deliverables for the project
- Diamond prices - data collection and preparation
- Dealing with missing values
- Practical project - credit card default
- Credit card default - problem understanding and definition
- Credit card default - proposing a solution
- Goal
- Methodology
- Metrics for the model
- Deliverables of the project
- Credit card default - data collection and preparation
- Credit card default - numerical features
- Encoding categorical features
- Low variance features
- Near collinearity
- One-hot encoding with pandas
- A brief introduction to feature engineering
- Summary
- Further reading
- Chapter 3: Dataset Understanding - Exploratory Data Analysis
- Technical requirements
- What is EDA?
- Univariate EDA
- Univariate EDA for numerical features
- Univariate EDA for categorical features
- Bivariate EDA
- Two numerical features
- Scatter plots
- The Pearson correlation coefficient
- Two categorical features
- Cross tables
- Barplots for two categorical variables
- One numerical feature and one categorical feature
- Introduction to graphical multivariate EDA
- Summary
- Further reading
- Chapter 4: Predicting Numerical Values with Machine Learning
- Technical requirements
- Introduction to ML
- Tasks in supervised learning
- Creating your first ML model
- The goal of ML models - generalization
- Overfitting
- Evaluation function and optimization
- Practical considerations before modeling
- Introducing scikit-learn
- Further feature transformations
- Train-test split
- Dimensionality reduction using PCA
- Standardization - centering and scaling
- MLR
- Lasso regression
- KNN
- Training versus testing error
- Summary
- Further reading
- Chapter 5: Predicting Categories with Machine Learning
- Technical requirements
- Classification tasks
- Predicting categories and probabilities
- Credit card default dataset
- Logistic regression
- A simple logistic regression model
- A complete logistic regression model
- Classification trees
- How trees work
- The good and the bad of trees
- Training a larger classification tree
- Random forests
- Training versus testing error
- Multiclass classification
- Naive Bayes classifiers
- Conditional probability
- Bayes' theorem
- Using Bayesian terms
- Back to the classification problem
- Gaussian Naive Bayes
- Gaussian Naive Bayes with scikit-learn
- Summary
- Further reading
- Chapter 6: Introducing Neural Nets for Predictive Analytics
- Technical requirements
- Introducing neural network models
- Deep learning
- Anatomy of an MLP - elements of a neural network model
- How MLPs learn
- Introducing TensorFlow and Keras
- TensorFlow
- Keras - deep learning for humans
- Regressing with neural networks
- Building the MLP for predicting diamond prices
- Training the MLP
- Making predictions with the neural network
- Classification with neural networks
- Building the MLP for predicting credit card default
- Evaluating predictions
- The dark art of training neural networks
- So many decisions; so little time
- Regularization for neural networks
- Using a validation set
- Early stopping
- Dropout
- Practical advice on training neural networks
- Summary
- Further reading
- Chapter 7: Model Evaluation
- Technical requirements
- Evaluation of regression models
- Metrics for regression models
- MSE and Root Mean Squared Error (RMSE)
- MAE
- R-squared (R²)
- Defining a custom metric
- Visualization methods for evaluating regression models
- Evaluation for classification models
- Confusion matrix and related metrics
- Visualization methods for evaluating classification models
- Visualizing probabilities
- Receiver Operating Characteristic (ROC) and precision-recall curves
- Defining a custom metric for classification
- The k-fold cross-validation
- Summary
- Further reading
- Chapter 8: Model Tuning and Improving Performance
- Technical requirements
- Hyperparameter tuning
- Optimizing a single hyperparameter
- Optimizing more than one parameter
- Improving performance
- Improving our diamond price predictions
- Fitting a neural network
- Transforming the target
- Analyzing the results
- Not only a technical problem but a business problem
- Summary
- Chapter 9: Implementing a Model with Dash
- Technical requirements
- Model communication and/or deployment phase
- Using a technical report
- A feature of an existing product
- Using an analytic application
- Introducing Dash
- What is Dash?
- Plotly
- Installation
- The application layout
- Building a basic static app
- Building a basic interactive app
- Implementing a predictive model as a web application
- Producing the predictive model objects
- Building the web application
- Summary
- Further reading
- Other Books You May Enjoy
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The ePub file format works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our ebook Help page.