
Mastering Databricks Lakehouse Platform

Content
- Intro
- Cover Page
- Title Page
- Copyright Page
- Dedication Page
- About the Authors
- About the Reviewer
- Acknowledgement
- Preface
- Errata
- Table of Contents
- 1. Getting Started with Databricks Platform
- Structure
- Objectives
- Introduction to Databricks
- What can we do with Databricks?
- Databricks architecture
- Control plane
- Data plane
- How does it work?
- Databricks for Data Engineers and Data Scientists
- Databricks SQL
- Features of Databricks SQL
- SQL endpoints for Databricks SQL
- Databricks components
- Workspace
- Notebooks
- Libraries
- Folder
- MLflow experiment
- Interface
- Databricks UI
- Databricks API
- Databricks CLI
- Data management
- DBFS
- Tables
- Database
- Metastore
- Computation management
- Cluster
- All-purpose cluster
- Job cluster
- Pools
- Databricks runtime
- Databricks runtime
- Databricks runtime for machine learning
- Photon
- Databricks light
- Databricks runtime for genomics (deprecated)
- Access management
- User
- Group
- Access Control Lists (ACLs)
- Conclusion
- Multiple choice questions
- Answers
- 2. Management of Databricks Platform
- Structure
- Objectives
- Databricks cluster basics
- Cluster computation resources
- Clusters
- Cluster governance
- Platform architecture, security, and data protection
- Platform architecture
- Platform security
- Data Protection
- Databricks data access management
- Databricks cluster management
- Databricks SQL Analytics administration
- Conclusion
- Multiple choice questions
- Answers
- 3. Spark, Databricks, and Building a Data Quality Framework
- Structure
- Objectives
- Introduction to Apache Spark
- History
- Evolution to Databricks
- What happened to Apache Spark?
- Features of Apache Spark
- The book paraphrase and translation analogy
- Spark and its evolution
- Components of Apache Spark
- Resilient Distributed Dataset (RDD)
- Datasets and DataFrames
- Directed Acyclic Graph (DAG)
- Execution mechanism
- Processing data using Databricks pipeline
- Building an audit framework with Databricks
- Time travel
- Conclusion
- Multiple choice questions
- Answers
- 4. Data Sharing and Orchestration with Databricks
- Orchestrating Data and Machine Learning pipelines in Databricks
- Running Databricks tasks using Amazon Managed Airflow
- Run and orchestrate the Databricks tasks using Data Factory
- Create an Azure Databricks linked service
- Conclusion
- Multiple choice questions
- Answers
- 5. Simplified ETL with Delta Live Tables
- Structure
- Objectives
- Delta Live Table concepts
- Components of the Delta Live Table
- Creating Delta Live Tables using Python and SQL
- Delta Live Table components
- Development workflow with Delta Live Table
- Delta Live Table configurations
- Conclusion
- Multiple choice questions
- Answers
- 6. SCD Type 2 Implementation with Delta Lake
- Structure
- Objectives
- Streaming data with Structured Streaming
- Change Data Feed
- Conclusion
- Multiple choice questions
- Answers
- 7. Machine Learning Model Management with Databricks
- Structure
- Objectives
- Introduction to MLOps and MLflow
- Model life cycle management using MLflow
- Getting started with MLflow environment
- MLflow installation
- Setting up MLflow project with model repository
- Train and deploy the model
- Log model metrics
- Conclusion
- Multiple choice questions
- Answers
- 8. Continuous Integration and Delivery with Databricks
- Structure
- Objectives
- Repos for Git integration
- Conclusion
- Multiple choice questions
- Answers
- 9. Visualization with Databricks
- Structure
- Objectives
- Databricks SQL Analytics
- Databricks as a data source with Tableau
- Databricks DirectQuery with Power BI
- Databricks DirectQuery with Qlik
- Databricks DirectQuery with TIBCO Spotfire Analyst
- Conclusion
- Multiple choice questions
- Answers
- 10. Best Security and Compliance Practices of Databricks
- Structure
- Objectives
- Delta Lake: hyperparameter tuning with Hyperopt
- Access control and secret management
- Cluster configuration and policies
- Data governance
- GDPR and CCPA compliance using Delta Lake
- Conclusion
- Multiple choice questions
- Answers
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; macOS; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The ePub file format works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks adjust automatically to fit the small display.
This eBook uses Adobe DRM, a "hard" form of copy protection. If the requirements listed here are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.