
Building Machine Learning Systems with Python
Description
- Implement popular supervised and unsupervised machine learning algorithms in Python
- Discover best practices for building production-grade machine learning systems from scratch
Book Description
Machine learning enables systems to make predictions based on historical data. Python is one of the most popular languages used to develop machine learning applications, thanks to its extensive library support. This updated third edition of Building Machine Learning Systems with Python helps you get up to speed with the latest trends in artificial intelligence (AI). With this guide's hands-on approach, you'll learn to build state-of-the-art machine learning models from scratch. Complete with ready-to-implement code and real-world examples, the book starts by introducing the Python ecosystem for machine learning. You'll then learn best practices for preparing data for analysis and later gain insights into implementing supervised and unsupervised machine learning techniques such as classification, regression and clustering. As you progress, you'll understand how to use Python's scikit-learn and TensorFlow libraries to build production-ready and end-to-end machine learning system models, and then fine-tune them for high performance. By the end of this book, you'll have the skills you need to confidently train and deploy enterprise-grade machine learning models in Python.
What you will learn
- Build a classification system that can be applied to text, images, and sound
- Solve regression-related problems using scikit-learn and TensorFlow
- Recommend products to users based on their previous purchases
- Explore different methods of applying deep neural networks to your data
- Understand recent advances in computer vision and natural language processing (NLP)
- Deploy Amazon Web Services (AWS) to run data models on the cloud
Who this book is for
This book is for data scientists, machine learning developers, and Python developers who want to learn how to build increasingly complex machine learning systems. Prior knowledge of Python programming is expected.
Content
- Cover
- Title Page
- Copyright and Credits
- Packt Upsell
- Contributors
- Table of Contents
- Preface
- Chapter 1: Getting Started with Python Machine Learning
- Machine learning and Python - a dream team
- What the book will teach you - and what it will not
- How to best read this book
- What to do when you are stuck
- Getting started
- Introduction to NumPy, SciPy, Matplotlib, and TensorFlow
- Installing Python
- Chewing data efficiently with NumPy and intelligently with SciPy
- Learning NumPy
- Indexing
- Handling nonexistent values
- Comparing the runtime
- Learning SciPy
- Fundamentals of machine learning
- Asking a question
- Getting answers
- Our first (tiny) application of machine learning
- Reading in the data
- Preprocessing and cleaning the data
- Choosing the right model and learning algorithm
- Before we build our first model
- Starting with a simple straight line
- Toward more complex models
- Stepping back to go forward - another look at our data
- Training and testing
- Answering our initial question
- Summary
- Chapter 2: Classifying with Real-World Examples
- The Iris dataset
- Visualization is a good first step
- Classifying with scikit-learn
- Building our first classification model
- Evaluation - holding out data and cross-validation
- How to measure and compare classifiers
- A more complex dataset and the nearest-neighbor classifier
- Learning about the seeds dataset
- Features and feature engineering
- Nearest neighbor classification
- Looking at the decision boundaries
- Which classifier to use
- Summary
- Chapter 3: Regression
- Predicting house prices with regression
- Multidimensional regression
- Cross-validation for regression
- Penalized or regularized regression
- L1 and L2 penalties
- Using Lasso or ElasticNet in scikit-learn
- Visualizing the Lasso path
- P-greater-than-N scenarios
- An example based on text documents
- Setting hyperparameters in a principled way
- Regression with TensorFlow
- Summary
- Chapter 4: Classification I - Detecting Poor Answers
- Sketching our roadmap
- Learning to classify classy answers
- Tuning the instance
- Tuning the classifier
- Fetching the data
- Slimming the data down to chewable chunks
- Preselecting and processing attributes
- Defining what a good answer is
- Creating our first classifier
- Engineering the features
- Training the classifier
- Measuring the classifier's performance
- Designing more features
- Deciding how to improve the performance
- Bias, variance and their trade-off
- Fixing high bias
- Fixing high variance
- High or low bias?
- Using logistic regression
- A bit of math with a small example
- Applying logistic regression to our post-classification problem
- Looking behind accuracy - precision and recall
- Slimming the classifier
- Ship it!
- Classification using TensorFlow
- Summary
- Chapter 5: Dimensionality Reduction
- Sketching our roadmap
- Selecting features
- Detecting redundant features using filters
- Correlation
- Mutual information
- Asking the model about the features using wrappers
- Other feature selection methods
- Feature projection
- Principal component analysis
- Sketching PCA
- Applying PCA
- Limitations of PCA and how LDA can help
- Multidimensional scaling
- Autoencoders, or neural networks for dimensionality reduction
- Summary
- Chapter 6: Clustering - Finding Related Posts
- Measuring the relatedness of posts
- How not to do it
- How to do it
- Preprocessing - similarity measured as a similar number of common words
- Converting raw text into a bag of words
- Counting words
- Normalizing word count vectors
- Removing less important words
- Stemming
- Installing and using NLTK
- Extending the vectorizer with NLTK's stemmer
- Stop words on steroids
- Our achievements and goals
- Clustering
- K-means
- Getting test data to evaluate our ideas
- Clustering posts
- Solving our initial challenge
- Another look at noise
- Tweaking the parameters
- Summary
- Chapter 7: Recommendations
- Rating predictions and recommendations
- Splitting into training and testing
- Normalizing the training data
- A neighborhood approach to recommendations
- A regression approach to recommendations
- Combining multiple methods
- Basket analysis
- Obtaining useful predictions
- Analyzing supermarket shopping baskets
- Association rule mining
- More advanced basket analysis
- Summary
- Chapter 8: Artificial Neural Networks and Deep Learning
- Using TensorFlow
- TensorFlow API
- Graphs
- Sessions
- Useful operations
- Saving and restoring neural networks
- Training neural networks
- Convolutional neural networks
- Recurrent neural networks
- LSTM for predicting text
- LSTM for image processing
- Summary
- Chapter 9: Classification II - Sentiment Analysis
- Sketching our roadmap
- Fetching the Twitter data
- Introducing the Naïve Bayes classifier
- Getting to know the Bayes theorem
- Being naïve
- Using Naïve Bayes to classify
- Accounting for unseen words and other oddities
- Accounting for arithmetic underflows
- Creating our first classifier and tuning it
- Solving an easy problem first
- Using all classes
- Tuning the classifier's parameters
- Cleaning tweets
- Taking the word types into account
- Determining the word types
- Successfully cheating using SentiWordNet
- Our first estimator
- Putting everything together
- Summary
- Chapter 10: Topic Modeling
- Latent Dirichlet allocation
- Building a topic model
- Comparing documents by topic
- Modeling the whole of Wikipedia
- Choosing the number of topics
- Summary
- Chapter 11: Classification III - Music Genre Classification
- Sketching our roadmap
- Fetching the music data
- Converting into WAV format
- Looking at music
- Decomposing music into sine-wave components
- Using FFT to build our first classifier
- Increasing experimentation agility
- Training the classifier
- Using a confusion matrix to measure accuracy in multiclass problems
- An alternative way to measure classifier performance using receiver-operator characteristics
- Improving classification performance with mel frequency cepstral coefficients
- Music classification using TensorFlow
- Summary
- Chapter 12: Computer Vision
- Introducing image processing
- Loading and displaying images
- Thresholding
- Gaussian blurring
- Putting the center in focus
- Basic image classification
- Computing features from images
- Writing your own features
- Using features to find similar images
- Classifying a harder dataset
- Local feature representations
- Image generation with adversarial networks
- Summary
- Chapter 13: Reinforcement Learning
- Types of reinforcement learning
- Policy and value network
- Q-network
- Excelling at games
- A small example
- Using TensorFlow for the text game
- Playing Breakout
- Summary
- Chapter 14: Bigger Data
- Learning about big data
- Using jug to break up your pipeline into tasks
- An introduction to tasks in jug
- Looking under the hood
- Using jug for data analysis
- Reusing partial results
- Using Amazon Web Services
- Creating your first virtual machines
- Installing Python packages on Amazon Linux
- Running jug on our cloud machine
- Automating the generation of clusters with cfncluster
- Summary
- Appendix A: Where to Learn More About Machine Learning
- Online courses
- Books
- Blogs
- Data sources
- Getting competitive
- All that was left out
- Summary
- Other Books You May Enjoy
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The ePUB file format works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our ebook Help page.