Official Google Cloud Certified Professional Machine Learning Engineer Study Guide

Name: Official Google Cloud Certified Professional Machine Learning Engineer Study Guide
Brand: Wiley-Scrivener
Price: 52.99 EUR
Availability: OnlineOnly

Mona Mona, Pratap Ramamurthy (Authors)

Wiley-Scrivener (Publisher)

1st Edition

Published on 27 October 2023

579 pages

E-Book

ePUB with Adobe-DRM

978-1-119-98156-5 (ISBN)

€52.99 incl. 7% VAT

E-Book Single Licence

Available for download

Content

Cover Page
Title Page
Copyright Page
Acknowledgments
About the Authors
About the Technical Editors
Contents at a Glance
Contents
Introduction
Google Cloud Professional Machine Learning Engineer Certification
Why Become Professional ML Engineer (PMLE) Certified?
How to Become Certified
Who Should Buy This Book
How This Book Is Organized
Chapter Features
Bonus Digital Contents
Conventions Used in This Book
Google Cloud Professional ML Engineer Objective Map
How to Contact the Publisher
Chapter 1 Framing ML Problems
Translating Business Use Cases
Machine Learning Approaches
Supervised, Unsupervised, and Semi-supervised Learning
Classification, Regression, Forecasting, and Clustering
ML Success Metrics
Regression
Responsible AI Practices
Summary
Exam Essentials
Review Questions
Chapter 2 Exploring Data and Building Data Pipelines
Visualization
Box Plot
Line Plot
Bar Plot
Scatterplot
Statistics Fundamentals
Mean
Median
Mode
Outlier Detection
Standard Deviation
Correlation
Data Quality and Reliability
Data Skew
Data Cleaning
Scaling
Log Scaling
Z-score
Clipping
Handling Outliers
Establishing Data Constraints
Exploration and Validation at Big-Data Scale
Running TFDV on Google Cloud Platform
Organizing and Optimizing Training Datasets
Imbalanced Data
Data Splitting
Data Splitting Strategy for Online Systems
Handling Missing Data
Data Leakage
Summary
Exam Essentials
Review Questions
Chapter 3 Feature Engineering
Consistent Data Preprocessing
Encoding Structured Data Types
Mapping Numeric Values
Mapping Categorical Values
Feature Selection
Class Imbalance
Classification Threshold with Precision and Recall
Area under the Curve (AUC)
Feature Crosses
TensorFlow Transform
TensorFlow Data API (tf.data)
TensorFlow Transform
GCP Data and ETL Tools
Summary
Exam Essentials
Review Questions
Chapter 4 Choosing the Right ML Infrastructure
Pretrained vs. AutoML vs. Custom Models
Pretrained Models
Vision AI
Video AI
Natural Language AI
Translation AI
Speech-to-Text
Text-to-Speech
AutoML
AutoML for Tables or Structured Data
AutoML for Images and Video
AutoML for Text
Recommendations AI/Retail AI
Document AI
Dialogflow and Contact Center AI
Custom Training
How a CPU Works
GPU
TPU
Provisioning for Predictions
Scaling Behavior
Finding the Ideal Machine Type
Edge TPU
Deploy to Android or iOS Device
Summary
Exam Essentials
Review Questions
Chapter 5 Architecting ML Solutions
Designing Reliable, Scalable, and Highly Available ML Solutions
Choosing an Appropriate ML Service
Data Collection and Data Management
Google Cloud Storage (GCS)
BigQuery
Vertex AI Managed Datasets
Vertex AI Feature Store
NoSQL Data Store
Automation and Orchestration
Use Vertex AI Pipelines to Orchestrate the ML Workflow
Use Kubeflow Pipelines for Flexible Pipeline Construction
Use TensorFlow Extended SDK to Leverage Pre-built Components for Common Steps
When to Use Which Pipeline
Serving
Offline or Batch Prediction
Online Prediction
Summary
Exam Essentials
Review Questions
Chapter 6 Building Secure ML Pipelines
Building Secure ML Systems
Encryption at Rest
Encryption in Transit
Encryption in Use
Identity and Access Management
IAM Permissions for Vertex AI Workbench
Securing a Network with Vertex AI
Privacy Implications of Data Usage and Collection
Google Cloud Data Loss Prevention
Google Cloud Healthcare API for PHI Identification
Best Practices for Removing Sensitive Data
Summary
Exam Essentials
Review Questions
Chapter 7 Model Building
Choice of Framework and Model Parallelism
Data Parallelism
Model Parallelism
Modeling Techniques
Artificial Neural Network
Deep Neural Network (DNN)
Convolutional Neural Network
Recurrent Neural Network
What Loss Function to Use
Gradient Descent
Learning Rate
Batch
Batch Size
Epoch
Hyperparameters
Transfer Learning
Semi-supervised Learning
When You Need Semi-supervised Learning
Limitations of SSL
Data Augmentation
Offline Augmentation
Online Augmentation
Model Generalization and Strategies to Handle Overfitting and Underfitting
Bias Variance Trade-Off
Underfitting
Overfitting
Regularization
Summary
Exam Essentials
Review Questions
Chapter 8 Model Training and Hyperparameter Tuning
Ingestion of Various File Types into Training
Collect
Process
Store and Analyze
Developing Models in Vertex AI Workbench by Using Common Frameworks
Creating a Managed Notebook
Exploring Managed JupyterLab Features
Data Integration
BigQuery Integration
Ability to Scale the Compute Up or Down
Git Integration for Team Collaboration
Schedule or Execute a Notebook Code
Creating a User-Managed Notebook
Training a Model as a Job in Different Environments
Training Workflow with Vertex AI
Training Dataset Options in Vertex AI
Pre-built Containers
Custom Containers
Distributed Training
Hyperparameter Tuning
Why Hyperparameters Are Important
Techniques to Speed Up Hyperparameter Optimization
How Vertex AI Hyperparameter Tuning Works
Vertex AI Vizier
Tracking Metrics During Training
Interactive Shell
TensorFlow Profiler
What-If Tool
Retraining/Redeployment Evaluation
Data Drift
Concept Drift
When Should a Model Be Retrained?
Unit Testing for Model Training and Serving
Testing for Updates in API Calls
Testing for Algorithmic Correctness
Summary
Exam Essentials
Review Questions
Chapter 9 Model Explainability on Vertex AI
Model Explainability on Vertex AI
Explainable AI
Interpretability and Explainability
Feature Importance
Vertex Explainable AI
Data Bias and Fairness
ML Solution Readiness
How to Set Up Explanations in the Vertex AI
Summary
Exam Essentials
Review Questions
Chapter 10 Scaling Models in Production
Scaling Prediction Service
TensorFlow Serving
Serving (Online, Batch, and Caching)
Real-Time Static and Dynamic Reference Features
Pre-computing and Caching Prediction
Google Cloud Serving Options
Online Predictions
Batch Predictions
Hosting Third-Party Pipelines (MLflow) on Google Cloud
Testing for Target Performance
Configuring Triggers and Pipeline Schedules
Summary
Exam Essentials
Review Questions
Chapter 11 Designing ML Training Pipelines
Orchestration Frameworks
Kubeflow Pipelines
Vertex AI Pipelines
Apache Airflow
Cloud Composer
Comparison of Tools
Identification of Components, Parameters, Triggers, and Compute Needs
Schedule the Workflows with Kubeflow Pipelines
Schedule Vertex AI Pipelines
System Design with Kubeflow/TFX
System Design with Kubeflow DSL
System Design with TFX
Hybrid or Multicloud Strategies
Summary
Exam Essentials
Review Questions
Chapter 12 Model Monitoring, Tracking, and Auditing Metadata
Model Monitoring
Concept Drift
Data Drift
Model Monitoring on Vertex AI
Drift and Skew Calculation
Input Schemas
Logging Strategy
Types of Prediction Logs
Log Settings
Model Monitoring and Logging
Model and Dataset Lineage
Vertex ML Metadata
Vertex AI Experiments
Vertex AI Debugging
Summary
Exam Essentials
Review Questions
Chapter 13 Maintaining ML Solutions
MLOps Maturity
MLOps Level 0: Manual/Tactical Phase
MLOps Level 1: Strategic Automation Phase
MLOps Level 2: CI/CD Automation, Transformational Phase
Retraining and Versioning Models
Triggers for Retraining
Versioning Models
Feature Store
Solution
Data Model
Ingestion and Serving
Vertex AI Permissions Model
Custom Service Account
Access Transparency in Vertex AI
Common Training and Serving Errors
Training Time Errors
Serving Time Errors
TensorFlow Data Validation
Vertex AI Debugging Shell
Summary
Exam Essentials
Review Questions
Chapter 14 BigQuery ML
BigQuery - Data Access
BigQuery ML Algorithms
Model Training
Model Evaluation
Prediction
Explainability in BigQuery ML
BigQuery ML vs. Vertex AI Tables
Interoperability with Vertex AI
Access BigQuery Public Dataset
Import BigQuery Data into Vertex AI
Access BigQuery Data from Vertex AI Workbench Notebooks
Analyze Test Prediction Data in BigQuery
Export Vertex AI Batch Prediction Results
Export BigQuery Models into Vertex AI
BigQuery Design Patterns
Hashed Feature
Transforms
Summary
Exam Essentials
Review Questions
Appendix: Answers to Review Questions
Chapter 1: Framing ML Problems
Chapter 2: Exploring Data and Building Data Pipelines
Chapter 3: Feature Engineering
Chapter 4: Choosing the Right ML Infrastructure
Chapter 5: Architecting ML Solutions
Chapter 6: Building Secure ML Pipelines
Chapter 7: Model Building
Chapter 8: Model Training and Hyperparameter Tuning
Chapter 9: Model Explainability on Vertex AI
Chapter 10: Scaling Models in Production
Chapter 11: Designing ML Training Pipelines
Chapter 12: Model Monitoring, Tracking, and Auditing Metadata
Chapter 13: Maintaining ML Solutions
Chapter 14: BigQuery ML
Index
EULA

Introduction

When customers have a business problem, such as detecting objects in an image, machine learning can sometimes solve it very well. Google Cloud Platform (GCP) provides an extensive set of tools for building a model that accomplishes this and deploying it for production use. This book covers many different use cases, such as using sales data to forecast the next quarter, identifying objects in images or videos, and even extracting information from text documents. It helps an engineer build a secure, scalable, resilient machine learning application and automate the whole process using the latest technologies.

The purpose of this book is to help you pass the latest version of the Google Cloud Professional ML Engineer (PMLE) exam. Even after you've taken and passed the PMLE exam, this book should remain a useful reference as it covers the basics of machine learning, BigQuery ML, the Vertex AI platform, and MLOps.

Google Cloud Professional Machine Learning Engineer Certification

A Professional Machine Learning Engineer designs, builds, and productionizes ML models to solve business challenges using Google Cloud technologies and knowledge of proven ML models and techniques. The ML engineer considers responsible AI throughout the ML development process and collaborates closely with other job roles to ensure the long-term success of models. The ML engineer should be proficient in all aspects of model architecture, data pipeline interaction, and metrics interpretation. The ML engineer needs familiarity with foundational concepts of application development, infrastructure management, data engineering, and data governance. Through an understanding of training, retraining, deploying, scheduling, monitoring, and improving models, the ML engineer designs and creates scalable solutions for optimal performance.

Why Become Professional ML Engineer (PMLE) Certified?

There are several good reasons to get your PMLE certification.

Provides proof of professional achievement: Certifications are quickly becoming status symbols in the computer service industry. Organizations, including members of the computer service industry, are recognizing the benefits of certification.
Increases your marketability: According to Forbes (www.forbes.com/sites/louiscolumbus/2020/02/10/15-top-paying-it-certifications-in-2020/?sh=12f63aa8358e), jobs that require GCP certifications are the highest-paying jobs for the second year in a row, paying an average salary of $175,761/year. As a result, many engineers want to get certified. Of the many certifications that GCP offers, the AI/ML certified engineer is a new certification and is still evolving.
Provides an opportunity for advancement: IDC's research (www.idc.com/getdoc.jsp?containerId=IDC_P40729) indicates that while AI/ML adoption is on the rise, the cost, lack of expertise, and lack of life cycle management tools are among the top three inhibitors to realizing AI and ML at scale. This book is the first on the market to cover Google Cloud AI/ML tools and technology in line with the latest Professional ML Engineer certification guidelines, released on February 22, 2022.
Recognizes Google as a leader in open source and AI: Google is the main contributor to many of the groundbreaking open source projects that dramatically changed the landscape of AI/ML, including TensorFlow, Kubeflow, Word2vec, BERT, and T5. Although these projects are in the open source domain, Google has the distinct ability to bring them to market through the Google Cloud Platform (GCP). In this regard, the other cloud providers are frequently seen as trailing Google's offering.
Raises customer confidence: As the IT community, users, small business owners, and the like become more familiar with the PMLE certified professional, more of them will realize that the PMLE professional is more qualified to architect secure, cost-effective, and scalable ML solutions in the Google Cloud environment than a noncertified individual.

How to Become Certified

You do not have to work for a particular company. It's not a secret society. There is no prerequisite to take this exam. However, there is a recommendation to have 3+ years of industry experience, including one or more years designing and managing solutions using Google Cloud.

This exam is 2 hours and has 50-60 multiple-choice questions.

You can register for this exam in two ways:

Take the online-proctored exam from home or anywhere else. You can review the online testing requirements at www.webassessor.com/wa.do?page=certInfo&branding=GOOGLECLOUD&tabs=13.
Take the on-site, proctored exam at a testing center.

We usually prefer the on-site option because we like the focus time in a proctored environment. We have taken all our certifications in a test center. You can find a test center near you at www.kryterion.com/Locate-Test-Center.

Who Should Buy This Book

This book is intended to help students, developers, data scientists, IT professionals, and ML engineers gain expertise in the ML technology on the Google Cloud Platform and take the Professional Machine Learning Engineer exam. This book intends to take readers through the machine learning process starting from data and moving on through feature engineering, model training, and deployment on the Google Cloud. It also walks readers through best practices for when to pick custom models versus AutoML or pretrained models. Google Cloud AI/ML technologies are presented through real-world scenarios to illustrate how IT professionals can design, build, and operate secure ML cloud environments to modernize and automate applications.

Anybody who wants to pass the Professional ML Engineer exam may benefit from this book. If you're new to Google Cloud, this book covers the updated machine learning exam course material, including the Google Cloud Vertex AI platform, MLOps, and BigQuery ML. This is the only book on the market to cover the complete Vertex AI platform, from bringing your data to training, tuning, and deploying your models.

Since it's a professional-level study guide, this book is written with the assumption that you know the basics of the Google Cloud Platform, such as compute, storage, networking, databases, and identity and access management (IAM) or have taken the Google Cloud Associate-level certification exam. Moreover, this book assumes you understand the basics of machine learning and data science in general. In case you do not understand a term or concept, we have included a glossary for your reference.

How This Book Is Organized

This book consists of 14 chapters plus supplementary information: a glossary, this introduction, and the assessment test after the introduction. The chapters are organized as follows:

Chapter 1: Framing ML Problems. This chapter covers how you can translate business challenges into ML use cases.
Chapter 2: Exploring Data and Building Data Pipelines. This chapter covers visualization, statistical fundamentals at scale, evaluation of data quality and feasibility, establishing data constraints (e.g., TFDV), organizing and optimizing training datasets, data validation, handling missing data, handling outliers, and data leakage.
Chapter 3: Feature Engineering. This chapter covers topics such as encoding structured data types, feature selection, class imbalance, feature crosses, and transformations (TensorFlow Transform).
Chapter 4: Choosing the Right ML Infrastructure. This chapter covers topics such as evaluation of compute and accelerator options (e.g., CPU, GPU, TPU, edge devices) and choosing appropriate Google Cloud hardware components. It also covers choosing the best solution (ML vs. non-ML, custom vs. pre-packaged [e.g., AutoML, Vision API]) based on the business requirements. It discusses defining how the model output should be used to solve the business problem. It also covers deciding how incorrect results should be handled and identifying data sources (available vs. ideal). Finally, it discusses AI solutions such as CCAI, DocAI, and Recommendations AI.
Chapter 5: Architecting ML Solutions. This chapter explains how to design reliable, scalable, and highly available ML solutions. Other topics include how you can choose appropriate ML services for a use case (e.g., Cloud Build, Kubeflow), component types (e.g., data collection, data management), automation, orchestration, and serving in machine learning.
Chapter 6: Building Secure ML Pipelines. This chapter describes how to build secure ML systems (e.g., protecting against unintentional exploitation of data/model, hacking). It also covers the privacy implications of data usage and/or collection (e.g., handling sensitive data such as personally identifiable information [PII] and protected health information [PHI]).
Chapter 7: Model Building. This chapter describes the choice of framework and model parallelism. It also covers modeling techniques given interpretability requirements, transfer learning, data augmentation, semi-supervised learning, model generalization, and strategies to handle overfitting and...
