
Automating Data Quality Monitoring at Scale
Scaling Beyond Rules with Machine Learning
O'Reilly (publisher)
Expected publication: approx. January 30, 2024
Book
Softcover
170 pages
978-1-0981-4593-4 (ISBN)
Description
The world's businesses ingest a combined 2.5 quintillion bytes of data every day. But how much of this vast amount of data--used to build products, power AI systems, and drive business decisions--is poor quality or just plain bad? This practical book shows you how to ensure that the data your organization relies on contains only high-quality records.
Most data engineers, data analysts, and data scientists genuinely care about data quality, but they often don't have the time, resources, or understanding to create a data quality monitoring solution that succeeds at scale. In this book, Jeremy Stanley and Paige Schwartz from Anomalo explain how you can use automated data quality monitoring to cover all your tables efficiently, proactively alert on every category of issue, and resolve problems immediately.
This book will help you:
Learn why data quality is a business imperative
Understand and assess unsupervised learning models for detecting data issues
Implement notifications that reduce alert fatigue and let you triage and resolve issues quickly
Integrate automated data quality monitoring with data catalogs, orchestration layers, and BI and ML systems
Understand the limits of automated data quality monitoring and how to overcome them
Learn how to deploy and manage your monitoring solution at scale
Maintain automated data quality monitoring for the long term
Further details
Language
English
Place of publication
Sebastopol
USA
Product note
Paperback
Perfect binding (adhesive)
Dimensions
Height: 230 mm
Width: 174 mm
Thickness: 14 mm
Weight
396 g
ISBN-13
978-1-0981-4593-4 (9781098145934)
Copyright in bibliographic data and cover images is held by Nielsen Book Services Limited or by the publishers or by their respective licensors: all rights reserved.
Other editions

Jeremy Stanley | Paige Schwartz
Automating Data Quality Monitoring
E-Book
01/2024
O'Reilly
€50.49
Available for download

About the authors
Jeremy Stanley is co-founder and CTO at Anomalo. Prior to Anomalo, Jeremy was the VP of Data Science at Instacart, where he led machine learning and drove multiple initiatives to improve the company's profitability. Previously, he led data science and engineering at other hyper-growth companies like Sailthru. He's applied machine learning and AI technologies to everything from insurance and accounting to ad-tech and last-mile delivery logistics. He's also a recognized thought leader in the data science community with hugely popular blog posts like "Deep Learning with Emojis (not Math)". Jeremy holds a BS in Mathematics from Wichita State University and an MBA from Columbia University.
Paige Schwartz is a professional technical writer at Anomalo who has worked with clients such as Airbnb, Grammarly, and Samsara, as well as successful startups like CodeSignal, Tecton, Clerky, and Fiddler. She specializes in communicating complex software engineering topics to a general audience and has spent her career working with machine learning and data systems, including 5 years as a product manager on Google Search. She holds a joint BA in Computer Science and English from UC Berkeley.