
Official Google Cloud Certified Professional Machine Learning Engineer Study Guide
Description
In Google Cloud Certified Professional Machine Learning Study Guide, a team of accomplished artificial intelligence (AI) and machine learning (ML) specialists delivers an expert roadmap to AI and ML on the Google Cloud Platform, based on the new exam curriculum. With Sybex, you'll prepare faster and smarter for the Google Cloud Certified Professional Machine Learning Engineer exam and get ready to hit the ground running on your first day at your new job as an ML engineer.
The book walks readers through the machine learning process from start to finish, from data through feature engineering, model training, and deployment on Google Cloud. It also discusses best practices on when to pick a custom model vs. AutoML or pretrained models with the Vertex AI platform. Technologies such as TensorFlow, Kubeflow, and Vertex AI are presented by way of real-world scenarios to help you apply the theory to practical examples and show you how IT professionals design, build, and operate secure ML cloud environments.
The book also shows you how to:
* Frame ML problems and architect ML solutions from scratch
* Banish test anxiety by verifying and checking your progress with built-in self-assessments and other practical tools
* Use the Sybex online practice environment, complete with practice questions and explanations, a glossary, objective maps, and flash cards
A can't-miss resource for everyone preparing for the Google Cloud Certified Professional Machine Learning certification exam, or for a new career in ML powered by the Google Cloud Platform, this Sybex Study Guide has everything you need to take the next step in your career.

Persons
ABOUT THE AUTHORS
MONA is an AI/ML specialist in the Google Public Sector. She is the author of Natural Language Processing with AWS AI Services and a frequent speaker at cloud computing and machine learning events. She was a Sr. AI/ML specialist SA at AWS before joining Google. She has 14 certifications and has created courses for AWS AI/ML Certification Speciality Exam readiness. She has authored 17 articles on AI/ML and also co-authored a research paper on CORD-19 Neural Search, which won an award at the AAAI Conference on Artificial Intelligence.
Pratap Ramamurthy is an AI/ML Specialist Customer Engineer in Google Cloud. Previously, he worked as a Sr. Principal Solution Architect at H2O.ai and before that was a Partner Solution Architect at AWS. He has authored several research papers and holds 3 patents.
Content
- Cover Page
- Title Page
- Copyright Page
- Acknowledgments
- About the Authors
- About the Technical Editors
- Contents at a Glance
- Contents
- Introduction
- Google Cloud Professional Machine Learning Engineer Certification
- Why Become Professional ML Engineer (PMLE) Certified?
- How to Become Certified
- Who Should Buy This Book
- How This Book Is Organized
- Chapter Features
- Bonus Digital Contents
- Conventions Used in This Book
- Google Cloud Professional ML Engineer Objective Map
- How to Contact the Publisher
- Chapter 1 Framing ML Problems
- Translating Business Use Cases
- Machine Learning Approaches
- Supervised, Unsupervised, and Semi-supervised Learning
- Classification, Regression, Forecasting, and Clustering
- ML Success Metrics
- Regression
- Responsible AI Practices
- Summary
- Exam Essentials
- Review Questions
- Chapter 2 Exploring Data and Building Data Pipelines
- Visualization
- Box Plot
- Line Plot
- Bar Plot
- Scatterplot
- Statistics Fundamentals
- Mean
- Median
- Mode
- Outlier Detection
- Standard Deviation
- Correlation
- Data Quality and Reliability
- Data Skew
- Data Cleaning
- Scaling
- Log Scaling
- Z-score
- Clipping
- Handling Outliers
- Establishing Data Constraints
- Exploration and Validation at Big-Data Scale
- Running TFDV on Google Cloud Platform
- Organizing and Optimizing Training Datasets
- Imbalanced Data
- Data Splitting
- Data Splitting Strategy for Online Systems
- Handling Missing Data
- Data Leakage
- Summary
- Exam Essentials
- Review Questions
- Chapter 3 Feature Engineering
- Consistent Data Preprocessing
- Encoding Structured Data Types
- Mapping Numeric Values
- Mapping Categorical Values
- Feature Selection
- Class Imbalance
- Classification Threshold with Precision and Recall
- Area under the Curve (AUC)
- Feature Crosses
- TensorFlow Transform
- TensorFlow Data API (tf.data)
- TensorFlow Transform
- GCP Data and ETL Tools
- Summary
- Exam Essentials
- Review Questions
- Chapter 4 Choosing the Right ML Infrastructure
- Pretrained vs. AutoML vs. Custom Models
- Pretrained Models
- Vision AI
- Video AI
- Natural Language AI
- Translation AI
- Speech-to-Text
- Text-to-Speech
- AutoML
- AutoML for Tables or Structured Data
- AutoML for Images and Video
- AutoML for Text
- Recommendations AI/Retail AI
- Document AI
- Dialogflow and Contact Center AI
- Custom Training
- How a CPU Works
- GPU
- TPU
- Provisioning for Predictions
- Scaling Behavior
- Finding the Ideal Machine Type
- Edge TPU
- Deploy to Android or iOS Device
- Summary
- Exam Essentials
- Review Questions
- Chapter 5 Architecting ML Solutions
- Designing Reliable, Scalable, and Highly Available ML Solutions
- Choosing an Appropriate ML Service
- Data Collection and Data Management
- Google Cloud Storage (GCS)
- BigQuery
- Vertex AI Managed Datasets
- Vertex AI Feature Store
- NoSQL Data Store
- Automation and Orchestration
- Use Vertex AI Pipelines to Orchestrate the ML Workflow
- Use Kubeflow Pipelines for Flexible Pipeline Construction
- Use TensorFlow Extended SDK to Leverage Pre-built Components for Common Steps
- When to Use Which Pipeline
- Serving
- Offline or Batch Prediction
- Online Prediction
- Summary
- Exam Essentials
- Review Questions
- Chapter 6 Building Secure ML Pipelines
- Building Secure ML Systems
- Encryption at Rest
- Encryption in Transit
- Encryption in Use
- Identity and Access Management
- IAM Permissions for Vertex AI Workbench
- Securing a Network with Vertex AI
- Privacy Implications of Data Usage and Collection
- Google Cloud Data Loss Prevention
- Google Cloud Healthcare API for PHI Identification
- Best Practices for Removing Sensitive Data
- Summary
- Exam Essentials
- Review Questions
- Chapter 7 Model Building
- Choice of Framework and Model Parallelism
- Data Parallelism
- Model Parallelism
- Modeling Techniques
- Artificial Neural Network
- Deep Neural Network (DNN)
- Convolutional Neural Network
- Recurrent Neural Network
- What Loss Function to Use
- Gradient Descent
- Learning Rate
- Batch
- Batch Size
- Epoch
- Hyperparameters
- Transfer Learning
- Semi-supervised Learning
- When You Need Semi-supervised Learning
- Limitations of SSL
- Data Augmentation
- Offline Augmentation
- Online Augmentation
- Model Generalization and Strategies to Handle Overfitting and Underfitting
- Bias Variance Trade-Off
- Underfitting
- Overfitting
- Regularization
- Summary
- Exam Essentials
- Review Questions
- Chapter 8 Model Training and Hyperparameter Tuning
- Ingestion of Various File Types into Training
- Collect
- Process
- Store and Analyze
- Developing Models in Vertex AI Workbench by Using Common Frameworks
- Creating a Managed Notebook
- Exploring Managed JupyterLab Features
- Data Integration
- BigQuery Integration
- Ability to Scale the Compute Up or Down
- Git Integration for Team Collaboration
- Schedule or Execute a Notebook Code
- Creating a User-Managed Notebook
- Training a Model as a Job in Different Environments
- Training Workflow with Vertex AI
- Training Dataset Options in Vertex AI
- Pre-built Containers
- Custom Containers
- Distributed Training
- Hyperparameter Tuning
- Why Hyperparameters Are Important
- Techniques to Speed Up Hyperparameter Optimization
- How Vertex AI Hyperparameter Tuning Works
- Vertex AI Vizier
- Tracking Metrics During Training
- Interactive Shell
- TensorFlow Profiler
- What-If Tool
- Retraining/Redeployment Evaluation
- Data Drift
- Concept Drift
- When Should a Model Be Retrained?
- Unit Testing for Model Training and Serving
- Testing for Updates in API Calls
- Testing for Algorithmic Correctness
- Summary
- Exam Essentials
- Review Questions
- Chapter 9 Model Explainability on Vertex AI
- Model Explainability on Vertex AI
- Explainable AI
- Interpretability and Explainability
- Feature Importance
- Vertex Explainable AI
- Data Bias and Fairness
- ML Solution Readiness
- How to Set Up Explanations in the Vertex AI
- Summary
- Exam Essentials
- Review Questions
- Chapter 10 Scaling Models in Production
- Scaling Prediction Service
- TensorFlow Serving
- Serving (Online, Batch, and Caching)
- Real-Time Static and Dynamic Reference Features
- Pre-computing and Caching Prediction
- Google Cloud Serving Options
- Online Predictions
- Batch Predictions
- Hosting Third-Party Pipelines (MLflow) on Google Cloud
- Testing for Target Performance
- Configuring Triggers and Pipeline Schedules
- Summary
- Exam Essentials
- Review Questions
- Chapter 11 Designing ML Training Pipelines
- Orchestration Frameworks
- Kubeflow Pipelines
- Vertex AI Pipelines
- Apache Airflow
- Cloud Composer
- Comparison of Tools
- Identification of Components, Parameters, Triggers, and Compute Needs
- Schedule the Workflows with Kubeflow Pipelines
- Schedule Vertex AI Pipelines
- System Design with Kubeflow/TFX
- System Design with Kubeflow DSL
- System Design with TFX
- Hybrid or Multicloud Strategies
- Summary
- Exam Essentials
- Review Questions
- Chapter 12 Model Monitoring, Tracking, and Auditing Metadata
- Model Monitoring
- Concept Drift
- Data Drift
- Model Monitoring on Vertex AI
- Drift and Skew Calculation
- Input Schemas
- Logging Strategy
- Types of Prediction Logs
- Log Settings
- Model Monitoring and Logging
- Model and Dataset Lineage
- Vertex ML Metadata
- Vertex AI Experiments
- Vertex AI Debugging
- Summary
- Exam Essentials
- Review Questions
- Chapter 13 Maintaining ML Solutions
- MLOps Maturity
- MLOps Level 0: Manual/Tactical Phase
- MLOps Level 1: Strategic Automation Phase
- MLOps Level 2: CI/CD Automation, Transformational Phase
- Retraining and Versioning Models
- Triggers for Retraining
- Versioning Models
- Feature Store
- Solution
- Data Model
- Ingestion and Serving
- Vertex AI Permissions Model
- Custom Service Account
- Access Transparency in Vertex AI
- Common Training and Serving Errors
- Training Time Errors
- Serving Time Errors
- TensorFlow Data Validation
- Vertex AI Debugging Shell
- Summary
- Exam Essentials
- Review Questions
- Chapter 14 BigQuery ML
- BigQuery - Data Access
- BigQuery ML Algorithms
- Model Training
- Model Evaluation
- Prediction
- Explainability in BigQuery ML
- BigQuery ML vs. Vertex AI Tables
- Interoperability with Vertex AI
- Access BigQuery Public Dataset
- Import BigQuery Data into Vertex AI
- Access BigQuery Data from Vertex AI Workbench Notebooks
- Analyze Test Prediction Data in BigQuery
- Export Vertex AI Batch Prediction Results
- Export BigQuery Models into Vertex AI
- BigQuery Design Patterns
- Hashed Feature
- Transforms
- Summary
- Exam Essentials
- Review Questions
- Appendix: Answers to Review Questions
- Chapter 1: Framing ML Problems
- Chapter 2: Exploring Data and Building Data Pipelines
- Chapter 3: Feature Engineering
- Chapter 4: Choosing the Right ML Infrastructure
- Chapter 5: Architecting ML Solutions
- Chapter 6: Building Secure ML Pipelines
- Chapter 7: Model Building
- Chapter 8: Model Training and Hyperparameter Tuning
- Chapter 9: Model Explainability on Vertex AI
- Chapter 10: Scaling Models in Production
- Chapter 11: Designing ML Training Pipelines
- Chapter 12: Model Monitoring, Tracking, and Auditing Metadata
- Chapter 13: Maintaining ML Solutions
- Chapter 14: BigQuery ML
- Index
- EULA
Introduction
When customers have a business problem, say to detect objects in an image, sometimes it can be solved very well using machine learning. Google Cloud Platform (GCP) provides an extensive set of tools for building a model that can accomplish this and deploying it for production usage. This book will cover many different use cases, such as using sales data to forecast the next quarter, identifying objects in images or videos, and even extracting information from text documents. This book helps an engineer build a secure, scalable, resilient machine learning application and automate the whole process using the latest technologies.
The purpose of this book is to help you pass the latest version of the Google Cloud Professional ML Engineer (PMLE) exam. Even after you've taken and passed the PMLE exam, this book should remain a useful reference as it covers the basics of machine learning, BigQuery ML, the Vertex AI platform, and MLOps.
Google Cloud Professional Machine Learning Engineer Certification
A Professional Machine Learning Engineer designs, builds, and productionizes ML models to solve business challenges using Google Cloud technologies and knowledge of proven ML models and techniques. The ML engineer considers responsible AI throughout the ML development process and collaborates closely with other job roles to ensure the long-term success of models. The ML engineer should be proficient in all aspects of model architecture, data pipeline interaction, and metrics interpretation. The ML engineer needs familiarity with foundational concepts of application development, infrastructure management, data engineering, and data governance. Through an understanding of training, retraining, deploying, scheduling, monitoring, and improving models, the ML engineer designs and creates scalable solutions for optimal performance.
Why Become Professional ML Engineer (PMLE) Certified?
There are several good reasons to get your PMLE certification.
- Provides proof of professional achievement. Certifications are quickly becoming status symbols in the computer service industry. Organizations, including members of the computer service industry, are recognizing the benefits of certification.
- Increases your marketability. According to Forbes (www.forbes.com/sites/louiscolumbus/2020/02/10/15-top-paying-it-certifications-in-2020/?sh=12f63aa8358e), jobs that require GCP certifications are the highest-paying jobs for the second year in a row, paying an average salary of $175,761/year. So, there is a demand from many engineers to get certified. Of the many certifications that GCP offers, the AI/ML certified engineer is a new certification and is still evolving.
- Provides an opportunity for advancement. IDC's research (www.idc.com/getdoc.jsp?containerId=IDC_P40729) indicates that while AI/ML adoption is on the rise, the cost, lack of expertise, and lack of life cycle management tools are among the top three inhibitors to realizing AI and ML at scale.
- This book is the first in the market to talk about Google Cloud AI/ML tools and the technology covering the latest Professional ML Engineer certification guidelines released on February 22, 2022.
- Recognizes Google as a leader in open source and AI. Google is the main contributor to many of the path-breaking open source projects that dramatically changed the landscape of AI/ML, including TensorFlow, Kubeflow, Word2vec, BERT, and T5. Although these projects are in the open source domain, Google has the distinct ability of bringing them to the market through the Google Cloud Platform (GCP). In this regard, the other cloud providers are frequently seen as trailing Google's offering.
- Raises customer confidence. As the IT community, users, small business owners, and the like become more familiar with the PMLE certified professional, more of them will realize that the PMLE professional is more qualified to architect secure, cost-effective, and scalable ML solutions on the Google Cloud environment than a noncertified individual.
How to Become Certified
You do not have to work for a particular company. It's not a secret society. There is no prerequisite to take this exam. However, there is a recommendation to have 3+ years of industry experience, including one or more years designing and managing solutions using Google Cloud.
This exam is 2 hours and has 50-60 multiple-choice questions.
You can register two ways for this exam:
- Take the online-proctored exam from anywhere or sitting at home. You can review the online testing requirements at www.webassessor.com/wa.do?page=certInfo&branding=GOOGLECLOUD&tabs=13.
- Take the on-site, proctored exam at a testing center.
We usually prefer the on-site option because we like the focus time in a proctored environment. We have taken all our certifications in a test center. You can locate a test center near you at www.kryterion.com/Locate-Test-Center.
Who Should Buy This Book
This book is intended to help students, developers, data scientists, IT professionals, and ML engineers gain expertise in the ML technology on the Google Cloud Platform and take the Professional Machine Learning Engineer exam. This book intends to take readers through the machine learning process starting from data and moving on through feature engineering, model training, and deployment on the Google Cloud. It also walks readers through best practices for when to pick custom models versus AutoML or pretrained models. Google Cloud AI/ML technologies are presented through real-world scenarios to illustrate how IT professionals can design, build, and operate secure ML cloud environments to modernize and automate applications.
Anybody who wants to pass the Professional ML Engineer exam may benefit from this book. If you're new to Google Cloud, this book covers the updated machine learning exam course material, including the Google Cloud Vertex AI platform, MLOps, and BigQuery ML. This is the only book on the market to cover the complete Vertex AI platform, from bringing your data to training, tuning, and deploying your models.
Since it's a professional-level study guide, this book is written with the assumption that you know the basics of the Google Cloud Platform, such as compute, storage, networking, databases, and identity and access management (IAM) or have taken the Google Cloud Associate-level certification exam. Moreover, this book assumes you understand the basics of machine learning and data science in general. In case you do not understand a term or concept, we have included a glossary for your reference.
How This Book Is Organized
This book consists of 14 chapters plus supplementary information: a glossary, this introduction, and the assessment test after the introduction. The chapters are organized as follows:
- Chapter 1: Framing ML Problems. This chapter covers how you can translate business challenges into ML use cases.
- Chapter 2: Exploring Data and Building Data Pipelines. This chapter covers visualization, statistical fundamentals at scale, evaluation of data quality and feasibility, establishing data constraints (e.g., TFDV), organizing and optimizing training datasets, data validation, handling missing data, handling outliers, and data leakage.
- Chapter 3: Feature Engineering. This chapter covers topics such as encoding structured data types, feature selection, class imbalance, feature crosses, and transformations (TensorFlow Transform).
- Chapter 4: Choosing the Right ML Infrastructure. This chapter covers topics such as evaluation of compute and accelerator options (e.g., CPU, GPU, TPU, edge devices) and choosing appropriate Google Cloud hardware components. It also covers choosing the best solution (ML vs. non-ML, custom vs. pre-packaged [e.g., AutoML, Vision API]) based on the business requirements. It discusses defining how the model output should be used to solve the business problem. It also covers deciding how incorrect results should be handled and identifying data sources (available vs. ideal). It talks about AI solutions such as CCAI, DocAI, and Recommendations AI.
- Chapter 5: Architecting ML Solutions. This chapter explains how to design reliable, scalable, and highly available ML solutions. Other topics include how you can choose appropriate ML services for a use case (e.g., Cloud Build, Kubeflow), component types (e.g., data collection, data management), automation, orchestration, and serving in machine learning.
- Chapter 6: Building Secure ML Pipelines. This chapter describes how to build secure ML systems (e.g., protecting against unintentional exploitation of data/model, hacking). It also covers the privacy implications of data usage and/or collection (e.g., handling sensitive data such as personally identifiable information [PII] and protected health information [PHI]).
- Chapter 7: Model Building. This chapter describes the choice of framework and model parallelism. It also covers modeling techniques given interpretability requirements, transfer learning, data augmentation, semi-supervised learning, model generalization, and strategies to handle overfitting and...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books – i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.