
Data Science on the Google Cloud Platform
Implementing end-to-end real-time data pipelines: from ingest to machine learning
Valliappa Lakshmanan (Author)
O'Reilly (Publisher)
Published on 8 January 2018
Book
Paperback/Softback
400 pages
978-1-4919-7456-8 (ISBN)
Description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Over the course of the book, you'll work through a sample business decision by employing a variety of data science approaches.
Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.
You'll learn how to:
Automate and schedule data ingest, using an App Engine application
Create and populate a dashboard in Google Data Studio
Build a real-time analysis pipeline to carry out streaming analytics
Conduct interactive data exploration with Google BigQuery
Create a Bayesian model on a Cloud Dataproc cluster
Build a logistic regression machine-learning model with Spark
Compute time-aggregate features with a Cloud Dataflow pipeline
Create a high-performing prediction model with TensorFlow
Use your deployed model as a microservice you can access from both batch and real-time pipelines
More details
Language
English
Place of publication
Sebastopol
United States
Target group
Professional and scholarly
Dimensions
Height: 234 mm
Width: 181 mm
Thickness: 21 mm
Weight
700 g
ISBN-13
978-1-4919-7456-8 (9781491974568)
Copyright in bibliographic data and cover images is held by Nielsen Book Services Limited or by the publishers or by their respective licensors: all rights reserved.
Other editions

Valliappa Lakshmanan
Data Science on the Google Cloud Platform
Implementing End-to-End Real-Time Data Pipelines: From Ingest to Machine Learning
E-Book
12/2017
O'Reilly
€50.49
Available for download
Person
Valliappa (Lak) Lakshmanan is currently a Tech Lead for Data and Machine Learning Professional Services for Google Cloud. His mission is to democratize machine learning so that it can be done by anyone anywhere using Google's amazing infrastructure, without deep knowledge of statistics or programming or ownership of a lot of hardware. Before Google, he led a team of data scientists at The Climate Corporation and was a Research Scientist at the NOAA National Severe Storms Laboratory, working on machine learning applications for severe weather diagnosis and prediction.