
Introduction to Machine Learning with Python

Content
- Cover
- Copyright
- Table of Contents
- Preface
- Who Should Read This Book
- Why We Wrote This Book
- Navigating This Book
- Online Resources
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Safari
- How to Contact Us
- Acknowledgments
- From Andreas
- From Sarah
- Chapter 1. Introduction
- 1.1 Why Machine Learning?
- 1.1.1 Problems Machine Learning Can Solve
- 1.1.2 Knowing Your Task and Knowing Your Data
- 1.2 Why Python?
- 1.3 scikit-learn
- 1.3.1 Installing scikit-learn
- 1.4 Essential Libraries and Tools
- 1.4.1 Jupyter Notebook
- 1.4.2 NumPy
- 1.4.3 SciPy
- 1.4.4 matplotlib
- 1.4.5 pandas
- 1.4.6 mglearn
- 1.5 Python 2 Versus Python 3
- 1.6 Versions Used in this Book
- 1.7 A First Application: Classifying Iris Species
- 1.7.1 Meet the Data
- 1.7.2 Measuring Success: Training and Testing Data
- 1.7.3 First Things First: Look at Your Data
- 1.7.4 Building Your First Model: k-Nearest Neighbors
- 1.7.5 Making Predictions
- 1.7.6 Evaluating the Model
- 1.8 Summary and Outlook
- Chapter 2. Supervised Learning
- 2.1 Classification and Regression
- 2.2 Generalization, Overfitting, and Underfitting
- 2.2.1 Relation of Model Complexity to Dataset Size
- 2.3 Supervised Machine Learning Algorithms
- 2.3.1 Some Sample Datasets
- 2.3.2 k-Nearest Neighbors
- 2.3.3 Linear Models
- 2.3.4 Naive Bayes Classifiers
- 2.3.5 Decision Trees
- 2.3.6 Ensembles of Decision Trees
- 2.3.7 Kernelized Support Vector Machines
- 2.3.8 Neural Networks (Deep Learning)
- 2.4 Uncertainty Estimates from Classifiers
- 2.4.1 The Decision Function
- 2.4.2 Predicting Probabilities
- 2.4.3 Uncertainty in Multiclass Classification
- 2.5 Summary and Outlook
- Chapter 3. Unsupervised Learning and Preprocessing
- 3.1 Types of Unsupervised Learning
- 3.2 Challenges in Unsupervised Learning
- 3.3 Preprocessing and Scaling
- 3.3.1 Different Kinds of Preprocessing
- 3.3.2 Applying Data Transformations
- 3.3.3 Scaling Training and Test Data the Same Way
- 3.3.4 The Effect of Preprocessing on Supervised Learning
- 3.4 Dimensionality Reduction, Feature Extraction, and Manifold Learning
- 3.4.1 Principal Component Analysis (PCA)
- 3.4.2 Non-Negative Matrix Factorization (NMF)
- 3.4.3 Manifold Learning with t-SNE
- 3.5 Clustering
- 3.5.1 k-Means Clustering
- 3.5.2 Agglomerative Clustering
- 3.5.3 DBSCAN
- 3.5.4 Comparing and Evaluating Clustering Algorithms
- 3.5.5 Summary of Clustering Methods
- 3.6 Summary and Outlook
- Chapter 4. Representing Data and Engineering Features
- 4.1 Categorical Variables
- 4.1.1 One-Hot-Encoding (Dummy Variables)
- 4.1.2 Numbers Can Encode Categoricals
- 4.2 OneHotEncoder and ColumnTransformer: Categorical Variables with scikit-learn
- 4.3 Convenient ColumnTransformer Creation with make_column_transformer
- 4.4 Binning, Discretization, Linear Models, and Trees
- 4.5 Interactions and Polynomials
- 4.6 Univariate Nonlinear Transformations
- 4.7 Automatic Feature Selection
- 4.7.1 Univariate Statistics
- 4.7.2 Model-Based Feature Selection
- 4.7.3 Iterative Feature Selection
- 4.8 Utilizing Expert Knowledge
- 4.9 Summary and Outlook
- Chapter 5. Model Evaluation and Improvement
- 5.1 Cross-Validation
- 5.1.1 Cross-Validation in scikit-learn
- 5.1.2 Benefits of Cross-Validation
- 5.1.3 Stratified k-Fold Cross-Validation and Other Strategies
- 5.2 Grid Search
- 5.2.1 Simple Grid Search
- 5.2.2 The Danger of Overfitting the Parameters and the Validation Set
- 5.2.3 Grid Search with Cross-Validation
- 5.3 Evaluation Metrics and Scoring
- 5.3.1 Keep the End Goal in Mind
- 5.3.2 Metrics for Binary Classification
- 5.3.3 Metrics for Multiclass Classification
- 5.3.4 Regression Metrics
- 5.3.5 Using Evaluation Metrics in Model Selection
- 5.4 Summary and Outlook
- Chapter 6. Algorithm Chains and Pipelines
- 6.1 Parameter Selection with Preprocessing
- 6.2 Building Pipelines
- 6.3 Using Pipelines in Grid Searches
- 6.4 The General Pipeline Interface
- 6.4.1 Convenient Pipeline Creation with make_pipeline
- 6.4.2 Accessing Step Attributes
- 6.4.3 Accessing Attributes in a Pipeline inside GridSearchCV
- 6.5 Grid-Searching Preprocessing Steps and Model Parameters
- 6.6 Grid-Searching Which Model To Use
- 6.6.1 Avoiding Redundant Computation
- 6.7 Summary and Outlook
- Chapter 7. Working with Text Data
- 7.1 Types of Data Represented as Strings
- 7.2 Example Application: Sentiment Analysis of Movie Reviews
- 7.3 Representing Text Data as a Bag of Words
- 7.3.1 Applying Bag-of-Words to a Toy Dataset
- 7.3.2 Bag-of-Words for Movie Reviews
- 7.4 Stopwords
- 7.5 Rescaling the Data with tf-idf
- 7.6 Investigating Model Coefficients
- 7.7 Bag-of-Words with More Than One Word (n-Grams)
- 7.8 Advanced Tokenization, Stemming, and Lemmatization
- 7.9 Topic Modeling and Document Clustering
- 7.9.1 Latent Dirichlet Allocation
- 7.10 Summary and Outlook
- Chapter 8. Wrapping Up
- 8.1 Approaching a Machine Learning Problem
- 8.1.1 Humans in the Loop
- 8.2 From Prototype to Production
- 8.3 Testing Production Systems
- 8.4 Building Your Own Estimator
- 8.5 Where to Go from Here
- 8.5.1 Theory
- 8.5.2 Other Machine Learning Frameworks and Packages
- 8.5.3 Ranking, Recommender Systems, and Other Kinds of Learning
- 8.5.4 Probabilistic Modeling, Inference, and Probabilistic Programming
- 8.5.5 Neural Networks
- 8.5.6 Scaling to Larger Datasets
- 8.5.7 Honing Your Skills
- 8.6 Conclusion
- Index
- About the Authors
- Colophon
System requirements
File format: PDF
Copy-Protection: Adobe-DRM (Digital Rights Management)
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe DRM, a "hard" form of copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.