Machine Learning in the AWS Cloud

Add Intelligence to Applications with Amazon SageMaker and Amazon Rekognition
Standards Information Network (publisher)
  • 1st edition
  • published August 13, 2019
  • 528 pages
E-book | ePUB with Adobe DRM | system requirements
978-1-119-55672-5 (ISBN)
Put the power of AWS Cloud machine learning services to work in your business and commercial applications! Machine Learning in the AWS Cloud introduces readers to the machine learning (ML) capabilities of the Amazon Web Services ecosystem and provides practical examples to solve real-world regression and classification problems. While readers do not need prior ML experience, they are expected to have some knowledge of Python and a basic knowledge of Amazon Web Services.

Part One introduces readers to fundamental machine learning concepts. You will learn about the types of ML systems, how they are used, and challenges you may face with ML solutions. Part Two focuses on machine learning services provided by Amazon Web Services. You'll be introduced to the basics of cloud computing and AWS offerings in the cloud-based machine learning space. Then you'll learn to use Amazon Machine Learning to solve a simpler class of machine learning problems, and Amazon SageMaker to solve more complex problems.

  • Learn techniques for data preprocessing, basic feature engineering, data visualization, and model building
  • Discover common neural network frameworks with Amazon SageMaker
  • Solve computer vision problems with Amazon Rekognition
  • Benefit from illustrations, source code examples, and sidebars in each chapter

The book appeals to both Python developers and technical/solution architects. Developers will find concrete examples that show them how to perform common ML tasks with Python on AWS. Technical/solution architects will find useful information on the machine learning capabilities of the AWS ecosystem.
  • English
  • USA
John Wiley & Sons Inc
  • For professional and research use
  • 55.16 MB

ABHISHEK MISHRA has more than 19 years' experience across a broad range of enterprise technologies. He consults as a security and fraud solution architect with Lloyds Banking Group PLC in London. He is the author of Amazon Web Services for Mobile Developers.

Introduction xxiii

Part 1 Fundamentals of Machine Learning 1

Chapter 1 Introduction to Machine Learning 3

What is Machine Learning? 4

Tools Commonly Used by Data Scientists 4

Common Terminology 5

Real-World Applications of Machine Learning 7

Types of Machine Learning Systems 8

Supervised Learning 8

Unsupervised Learning 9

Semi-Supervised Learning 10

Reinforcement Learning 11

Batch Learning 11

Incremental Learning 12

Instance-based Learning 12

Model-based Learning 12

The Traditional Versus the Machine Learning Approach 13

A Rule-based Decision System 14

A Machine Learning-based System 17

Summary 25

Chapter 2 Data Collection and Preprocessing 27

Machine Learning Datasets 27

Scikit-learn Datasets 27

AWS Public Datasets 30

Datasets 30

UCI Machine Learning Repository 30

Data Preprocessing Techniques 31

Obtaining an Overview of the Data 31

Handling Missing Values 42

Creating New Features 44

Transforming Numeric Features 46

One-Hot Encoding Categorical Features 47

Summary 50

Chapter 3 Data Visualization with Python 51

Introducing Matplotlib 51

Components of a Plot 54

Figure 55


Axis 56

Axis Labels 56

Grids 57

Title 57

Common Plots 58

Histograms 58

Bar Chart 62

Grouped Bar Chart 63

Stacked Bar Chart 65

Stacked Percentage Bar Chart 67

Pie Charts 69

Box Plot 71

Scatter Plots 73

Summary 78

Chapter 4 Creating Machine Learning Models with Scikit-learn 79

Introducing Scikit-learn 79

Creating a Training and Test Dataset 80

K-Fold Cross Validation 84

Creating Machine Learning Models 86

Linear Regression 86

Support Vector Machines 92

Logistic Regression 101

Decision Trees 109

Summary 114

Chapter 5 Evaluating Machine Learning Models 115

Evaluating Regression Models 115

RMSE Metric 117

R2 Metric 119

Evaluating Classification Models 119

Binary Classification Models 119

Multi-Class Classification Models 126

Choosing Hyperparameter Values 131

Summary 132

Part 2 Machine Learning with Amazon Web Services 133

Chapter 6 Introduction to Amazon Web Services 135

What is Cloud Computing? 135

Cloud Service Models 136

Cloud Deployment Models 138

The AWS Ecosystem 139

Machine Learning Application Services 140

Machine Learning Platform Services 141

Support Services 142

Sign Up for an AWS Free-Tier Account 142

Step 1: Contact Information 143

Step 2: Payment Information 145

Step 3: Identity Verification 145

Step 4: Support Plan Selection 147

Step 5: Confirmation 148

Summary 148

Chapter 7 AWS Global Infrastructure 151

Regions and Availability Zones 151

Edge Locations 153

Accessing AWS 154

The AWS Management Console 156

Summary 160

Chapter 8 Identity and Access Management 161

Key Concepts 161

Root Account 161

User 162

Identity Federation 162

Group 163


Role 164

Common Tasks 165

Creating a User 167

Modifying Permissions Associated with an Existing Group 172

Creating a Role 173

Securing the Root Account with MFA 176

Setting Up an IAM Password Rotation Policy 179

Summary 180

Chapter 9 Amazon S3 181

Key Concepts 181

Bucket 181

Object Key 182

Object Value 182

Version ID 182

Storage Class 182

Costs 183

Subresources 183

Object Metadata 184

Common Tasks 185

Creating a Bucket 185

Uploading an Object 189

Accessing an Object 191

Changing the Storage Class of an Object 195

Deleting an Object 196

Amazon S3 Bucket Versioning 197

Accessing Amazon S3 Using the AWS CLI 199

Summary 200

Chapter 10 Amazon Cognito 201

Key Concepts 201

Authentication 201

Authorization 201

Identity Provider 202

Client 202

OAuth 2.0 202

OpenID Connect 202

Amazon Cognito User Pool 202

Identity Pool 203

Amazon Cognito Federated Identities 203

Common Tasks 204

Creating a User Pool 204

Retrieving the App Client Secret 213

Creating an Identity Pool 214

User Pools or Identity Pools: Which One Should You Use? 218

Summary 219

Chapter 11 Amazon DynamoDB 221

Key Concepts 221

Tables 222

Global Tables 222

Items 222

Attributes 222

Primary Keys 222

Secondary Indexes 223

Queries 223

Scans 223

Read Consistency 224

Read/Write Capacity Modes 224

Common Tasks 225

Creating a Table 225

Adding Items to a Table 228

Creating an Index 231

Performing a Scan 233

Performing a Query 235

Summary 236

Chapter 12 AWS Lambda 237

Common Use Cases for Lambda 237

Key Concepts 238

Supported Languages 238

Lambda Functions 238

Programming Model 239

Execution Environment 243

Service Limitations 244

Pricing and Availability 244

Common Tasks 244

Creating a Simple Python Lambda Function Using the AWS Management Console 244

Testing a Lambda Function Using the AWS Management Console 250

Deleting an AWS Lambda Function Using the AWS Management Console 253

Summary 255

Chapter 13 Amazon Comprehend 257

Key Concepts 257

Natural Language Processing 257

Topic Modeling 259

Language Support 259

Pricing and Availability 259

Text Analysis Using the Amazon Comprehend Management Console 260

Interactive Text Analysis with the AWS CLI 262

Entity Detection with the AWS CLI 263

Key Phrase Detection with the AWS CLI 264

Sentiment Analysis with the AWS CLI 265

Using Amazon Comprehend with AWS Lambda 266

Summary 274

Chapter 14 Amazon Lex 275

Key Concepts 275

Bot 275

Client Application 276

Intent 276

Slot 276

Utterance 277

Programming Model 277

Pricing and Availability 278

Creating an Amazon Lex Bot 278

Creating Amazon DynamoDB Tables 278

Creating AWS Lambda Functions 285

Creating the Chatbot 304

Customizing the AccountOverview Intent 308

Customizing the ViewTransactionList Intent 312

Testing the Chatbot 314

Summary 315

Chapter 15 Amazon Machine Learning 317

Key Concepts 317

Datasources 318

ML Model 318

Regularization 319

Training Parameters 319

Descriptive Statistics 320

Pricing and Availability 321

Creating Datasources 321

Creating the Training Datasource 324

Creating the Test Datasource 330

Viewing Data Insights 332

Creating an ML Model 337

Making Batch Predictions 341

Creating a Real-Time Prediction Endpoint for Your Machine Learning Model 346

Making Predictions Using the AWS CLI 347

Using Real-Time Prediction Endpoints with Your Applications 349

Summary 350

Chapter 16 Amazon SageMaker 353

Key Concepts 353

Programming Model 354

Amazon SageMaker Notebook Instances 354

Training Jobs 354

Prediction Instances 355

Prediction Endpoint and Endpoint Configuration 355

Amazon SageMaker Batch Transform 355

Data Channels 355

Data Sources and Formats 356

Built-in Algorithms 356

Pricing and Availability 357

Creating an Amazon SageMaker Notebook Instance 357

Preparing Test and Training Data 362

Training a Scikit-learn Model on an Amazon SageMaker Notebook Instance 364

Training a Scikit-learn Model on a Dedicated Training Instance 368

Training a Model Using a Built-in Algorithm on a Dedicated Training Instance 379

Summary 384

Chapter 17 Using Google TensorFlow with Amazon SageMaker 387

Introduction to Google TensorFlow 387

Creating a Linear Regression Model with Google TensorFlow 390

Training and Deploying a DNN Classifier Using the TensorFlow Estimators API and Amazon SageMaker 408

Summary 419

Chapter 18 Amazon Rekognition 421

Key Concepts 421

Object Detection 421

Object Location 422

Scene Detection 422

Activity Detection 422

Facial Recognition 422

Face Collection 422

API Sets 422

Non-Storage and Storage-Based Operations 423

Model Versioning 423

Pricing and Availability 423

Analyzing Images Using the Amazon Rekognition Management Console 423

Interactive Image Analysis with the AWS CLI 428

Using Amazon Rekognition with AWS Lambda 433

Creating the Amazon DynamoDB Table 433

Creating the AWS Lambda Function 435

Summary 444

Appendix A Anaconda and Jupyter Notebook Setup 445

Installing the Anaconda Distribution 445

Creating a Conda Python Environment 447

Installing Python Packages 449

Installing Jupyter Notebook 451

Summary 454

Appendix B AWS Resources Needed to Use This Book 455

Creating an IAM User for Development 455

Creating S3 Buckets 458

Appendix C Installing and Configuring the AWS CLI 461

Mac OS Users 461

Installing the AWS CLI 461

Configuring the AWS CLI 462

Windows Users 464

Installing the AWS CLI 464

Configuring the AWS CLI 465

Appendix D Introduction to NumPy and Pandas 467

NumPy 467

Creating NumPy Arrays 467

Modifying Arrays 471

Indexing and Slicing 474

Pandas 475

Creating Series and Dataframes 476

Getting Dataframe Information 478

Selecting Data 481

Index 485


Amazon Web Services (AWS) is one of the leading cloud-computing platforms in the industry today. At the time this book was written, AWS offered more than 100 services, each of which resided in one of 18 different service categories. For someone who is new to cloud computing or to the AWS ecosystem, the sheer number of services on offer can be daunting. It can be difficult to know where to begin and what services to focus on.

Developers who are new to machine learning, as well as experienced data scientists, are often not aware of the power of the public cloud and, in particular, of AWS's offerings in the machine learning space. In the past, cloud-based machine learning offerings were limited in the types of algorithms they could support and the level of customization that was possible. All of this changed when Amazon announced SageMaker, a service that provided the ability to build machine learning models based on Amazon's implementation of cutting-edge algorithms, as well as the option to build custom models with frameworks such as Scikit-learn and Google TensorFlow.

Real-world use cases of cloud-based machine learning do not use the model in isolation; they rely on a number of supporting systems such as databases, load balancers, API gateways, and identity providers, all of which are provided by AWS. This book is written to give seasoned machine learning experts and enthusiasts alike an introduction to a selection of AWS machine learning services that are based on pre-trained models, as well as step-by-step examples of how to train and deploy your own custom models on Amazon SageMaker. For enthusiasts who are new to machine learning, this book also provides a selection of chapters that cover the fundamentals of machine learning such as data preprocessing, visualization, feature engineering, and the use of common Python libraries such as NumPy, Pandas, and Scikit-learn.

This book attempts at all times to balance theory and practice, giving you enough visibility into the underlying concepts and providing you with best practices and practical advice that you can apply at your workplace right away. I have also made every attempt to keep the content up-to-date and relevant. Even though this makes the book susceptible to being outdated in a few rare instances, I am confident the content will remain useful and relevant through the next versions of the AWS services.

Who This Book Is For

This book is best suited for software developers who wish to learn about machine learning in general and how to leverage machine learning-specific offerings from AWS. The book is also useful to data scientists, system architects, and application architects, who want to get an introduction to some of the commonly used AWS services in the machine learning space.

If you are new to both machine learning and AWS, I advise that you read all chapters from start to finish. If you are an experienced data scientist, you may want to skip ahead to Part 2 to learn about machine learning-specific AWS services.

What This Book Covers

This book covers building and training machine learning models with Python on the AWS cloud, as well as a number of ready-to-use machine learning services such as Amazon Rekognition, Amazon Comprehend, and Amazon Lex.

The book also covers general high-level concepts of machine learning, including feature engineering and data visualization, as well as supporting AWS services that are used to build machine learning systems, such as Amazon IAM, Amazon Cognito, Amazon S3, Amazon DynamoDB, and AWS Lambda.

The model-building and evaluation code in this book is written in Python 3. Services provided by Amazon, Apple, and Google are updated frequently, so you may sometimes encounter a newer version of a screen when you follow the instructions in a chapter.

How This Book Is Structured

This book consists of 18 chapters that are grouped into two parts, and four appendices. The first part consists of five chapters and covers the fundamentals of machine learning using Python. This part covers techniques for feature engineering, data visualization, model building, and model evaluation using Pandas, NumPy, Matplotlib, and Scikit-learn. The examples developed in this part make use of Jupyter Notebook and are aimed at readers who are new to machine learning.

Part 2 covers building machine learning applications using AWS services. This part starts with introducing the basics of commonly used AWS services such as Amazon S3, Amazon DynamoDB, and AWS Lambda. It then proceeds to AWS services that deal specifically with machine learning such as Amazon Comprehend, Amazon Lex, Amazon Machine Learning, and Amazon SageMaker. Two chapters are dedicated to Amazon SageMaker; the first one covers building and deploying models using built-in algorithms and Scikit-learn, and the second one covers building and deploying a model with Google TensorFlow. Not all chapters in this part include source code, but where applicable, you can download the source code that accompanies each chapter using a GitHub link. Some of the chapters in this part require you to upload files to Amazon S3; you will need to substitute the names of buckets in the examples with those from your own account.
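
As a minimal sketch of that bucket-name substitution (not taken from the book), one option is to keep the bucket name in a single constant and build every S3 URI from it. The bucket name `my-own-ml-bucket`, the object key `datasets/train.csv`, and the helper `s3_uri` are hypothetical placeholders chosen only for illustration.

```python
# Hypothetical placeholder: replace with a bucket that exists in your own AWS account.
BUCKET_NAME = "my-own-ml-bucket"


def s3_uri(key: str, bucket: str = BUCKET_NAME) -> str:
    """Build an s3:// URI for an object key in the given bucket."""
    # Strip leading slashes so the URI never contains "//" after the bucket name.
    return f"s3://{bucket}/{key.lstrip('/')}"


# Hypothetical object key used only for illustration.
train_data_uri = s3_uri("datasets/train.csv")
print(train_data_uri)
```

Keeping the name in one place means that only `BUCKET_NAME` has to change when an example is moved to a different account.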

The chapters in Part 1 include:

  • Introduction to Machine Learning (Chapter 1) This is an introduction to the types of machine learning systems, their applications, and tools used to build machine learning systems.
  • Data Collection and Preprocessing (Chapter 2) This chapter covers sources that can be used to obtain training data, techniques to explore datasets, and basic feature engineering.
  • Data Visualization with Python (Chapter 3) This chapter covers techniques to visualize datasets using Matplotlib.
  • Creating Machine Learning Models with Scikit-learn (Chapter 4) This chapter covers techniques to build and train classification and regression models using Scikit-learn.
  • Evaluating Machine Learning Models (Chapter 5) This chapter covers techniques to evaluate the quality of a machine learning model.
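
One of the regression metrics listed for Chapter 5 in the table of contents is RMSE (root mean squared error). The following is a minimal, standard-library sketch of that metric, not the book's own code; the function name `rmse` and the sample values are assumptions made for illustration.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the squared differences, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical target values and model predictions.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print(rmse(y_true, y_pred))
```

A lower value indicates predictions that sit closer to the observed targets, and the result is expressed in the same units as the target variable.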

The chapters in Part 2 include:

  • Introduction to Amazon Web Services (Chapter 6) This chapter is a brief primer on cloud computing and Amazon Web Services. It also covers commonly encountered service and deployment models.
  • AWS Global Infrastructure (Chapter 7) This chapter introduces AWS regions, availability zones, and edge locations.
  • Identity and Access Management (Chapter 8) This chapter introduces one of the key services provided by AWS to secure your resources in the Amazon cloud. It also provides instructions to sign up for an account under the AWS free tier.
  • Amazon S3 (Chapter 9) This chapter introduces one of the most commonly used storage services provided by AWS, Amazon Simple Storage Service (S3).
  • Amazon Cognito (Chapter 10) This chapter introduces Amazon's cloud-based OAuth 2.0-compliant identity management solution, Amazon Cognito.
  • Amazon DynamoDB (Chapter 11) This chapter introduces Amazon's managed NoSQL database service, Amazon DynamoDB.
  • AWS Lambda (Chapter 12) This chapter introduces AWS Lambda, a service designed to allow you to run code in the Amazon cloud without having to provision or manage any infrastructure.
  • Amazon Comprehend (Chapter 13) This chapter introduces Amazon Comprehend, a cloud-based natural language processing service that you can integrate into your applications to analyze the contents of text documents.
  • Amazon Lex (Chapter 14) This chapter introduces Amazon Lex, a cloud-based service that you can use to create chatbots and integrate them into your applications.
  • Amazon Machine Learning (Chapter 15) This chapter introduces Amazon Machine Learning, a fully managed cloud-based service that you can use to build and deploy simple machine learning models without any programming.
  • Amazon SageMaker (Chapter 16) This chapter introduces Amazon SageMaker, a cloud-based machine learning service that can be used to train and deploy both built-in and custom machine learning models.
  • Using Google TensorFlow with Amazon SageMaker (Chapter 17) This chapter introduces Google's TensorFlow framework and covers the use of Amazon SageMaker to build and deploy TensorFlow models.
  • Amazon Rekognition (Chapter 18) This chapter introduces Amazon Rekognition, a fully managed cloud-based service that can be used to add computer vision capabilities to your applications.

The appendices cover the following topics:

  • Anaconda and Jupyter Notebook Setup (Appendix A) This appendix provides instructions to install the Anaconda distribution and set up a Jupyter Notebook server on your local computer.
  • AWS Resources Needed to Use This Book (Appendix B) This appendix provides information on the AWS resources that you need to set up in your account in order to follow along with the examples in the book.
  • Installing and Configuring the AWS CLI (Appendix C) This appendix provides instructions to download and install the AWS CLI tool.
  • Introduction to NumPy and Pandas (Appendix D) This appendix provides an introduction to two Python libraries commonly used by data scientists: NumPy and...

File format: ePUB
Copy protection: Adobe DRM (Digital Rights Management)

Computer (Windows; MacOS X; Linux): Before downloading, install the free software Adobe Digital Editions (see e-book help).

Tablet/smartphone (Android; iOS): Before downloading, install the free app Adobe Digital Editions (see e-book help).

E-book reader: Bookeen, Kobo, Pocketbook, Sony, Tolino, and many others (not Kindle)

The ePUB file format is well suited to novels and non-fiction, that is, to "flowing" text without a complex layout. On e-readers and smartphones, line and page breaks adjust automatically to the small displays. Adobe DRM is a "hard" form of copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the e-book. You therefore need to prepare your reading hardware before downloading.

When using the Adobe Digital Editions reading software, we strongly recommend authorizing it with your personal Adobe ID after installation.

For more information, see our e-book help.

Download (available immediately)

€32.99
incl. 7% VAT
Download / single license