
Spark Operations
O'Reilly (Publisher)
Book
Paperback/Softback
300 pages
978-1-4919-2028-2 (ISBN)
Description
Learn everything you need to know to reliably automate, deploy, and maintain Apache Spark on AWS, Google Cloud, on-premises infrastructure, and OpenStack. Unlike other books, which focus on the internals of how Apache Spark works and mostly interest data scientists, this book covers the operational aspects of Apache Spark. It will benefit anyone in a DevOps role, helping you understand how Spark was designed to handle scale and failures, along with the operational considerations and tools useful in running Apache Spark.
More details
Language
English
Place of publication
Sebastopol
United States
Target group
Professional and scholarly
Dimensions
Height: 250 mm
Width: 150 mm
Thickness: 15 mm
Weight
666 g
ISBN-13
978-1-4919-2028-2 (9781491920282)
Persons
Timothy Chen is a Distributed Systems Engineer at Mesosphere, where he helps customers build up their Mesos infrastructures. He has extensive experience with Spark, Mesos, Kafka, and many other Big Data solutions. He is a committer and PMC member for the Apache Drill and Apache Mesos projects and regularly contributes code to Apache Spark.
Parviz Deyhim is an Architect at Databricks, working with customers and the community on leading-edge Spark deployments. He has been involved in the Spark community since early 2012 and the early days of Apache Spark. In addition, he spent three years working at Amazon Web Services. During his time at AWS he championed adopting Spark on AWS and was responsible for Spark as an offering on AWS EMR.
Denny Lee is a Senior Director of Data Sciences Engineering at Concur. He regularly (co)presents various Spark and Big Data sessions and webinars, most recently at Tableau Data14 and a Cloudera Apache Spark webinar. He writes regular blog posts on Spark and big data topics on his blog (dennyglee.com) and Concur's blog (concur.com/blog). He is the lead organizer of the Seattle Spark Meetup group and Seattle Mesos User Group, and has been involved with Spark since 2012. In his previous work at Microsoft, he helped build Project Isotope (Hadoop on Windows and Azure), co-authored various Wrox books on Analysis Services and PowerPivot, and wrote extensive blog posts on sqlcat.com.
Benjamin Stickel is a Systems Engineer at Concur, and is actively involved in implementing a log-everything architecture to augment daily operations and improve visibility into the production environment. As part of the Architecture team for Concur, he has actively integrated ElasticSearch into the production environment, along with Hadoop and Spark, and he has spearheaded the use of Mesos to handle various workloads. He has also actively contributed to the integration of Puppet for configuration management and participated with multiple DevOps teams.