
Analytics Engineering with SQL and dbt
Description
With the shift from data warehouses to data lakes, data now lands in repositories before it's been transformed, enabling engineers to model raw data into clean, well-defined datasets. dbt (data build tool) helps you take data further. This practical book shows data analysts, data engineers, BI developers, and data scientists how to create a true self-service transformation platform through the use of dynamic SQL.
Authors Rui Machado from Monstarlab and Hélder Russa from Jumia show you how to quickly deliver new data products by focusing more on value delivery and less on architectural and engineering aspects. If you know your business well and have the technical skills to model raw data into clean, well-defined datasets, you'll learn how to design and deliver data models without any technical influence.
With this book, you'll learn:
- What dbt is and how a dbt project is structured
- How dbt fits into the data engineering and analytics worlds
- How to collaborate on building data models
- The main tools and architectures for building useful, functional data models
- How to fit dbt into data warehouse and data lake architectures
- How to build tests for data transformations

Content
- Cover
- Copyright
- Table of Contents
- Preface
- Why We Wrote This Book
- Who This Book Is For
- How This Book Is Organized
- Conventions Used in This Book
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- Chapter 1. Analytics Engineering
- Databases and Their Impact on Analytics Engineering
- Cloud Computing and Its Impact on Analytics Engineering
- The Data Analytics Lifecycle
- The New Role of Analytics Engineer
- Responsibilities of an Analytics Engineer
- Enabling Analytics in a Data Mesh
- Data Products
- dbt as a Data Mesh Enabler
- The Heart of Analytics Engineering
- The Legacy Processes
- Using SQL and Stored Procedures for ETL/ELT
- Using ETL Tools
- The dbt Revolution
- Summary
- Chapter 2. Data Modeling for Analytics
- A Brief on Data Modeling
- The Conceptual Phase of Modeling
- The Logical Phase of Modeling
- The Physical Phase of Modeling
- The Data Normalization Process
- Dimensional Data Modeling
- Modeling with the Star Schema
- Modeling with the Snowflake Schema
- Modeling with Data Vault
- Monolith Data Modeling
- Building Modular Data Models
- Enabling Modular Data Models with dbt
- Testing Your Data Models
- Generating Data Documentation
- Debugging and Optimizing Data Models
- Medallion Architecture Pattern
- Summary
- Chapter 3. SQL for Analytics
- The Resiliency of SQL
- Database Fundamentals
- Types of Databases
- Database Management System
- "Speaking" with a Database
- Creating and Managing Your Data Structures with DDL
- Manipulating Data with DML
- Inserting Data with INSERT
- Selecting Data with SELECT
- Updating Data with UPDATE
- Deleting Data with DELETE
- Storing Queries as Views
- Common Table Expressions
- Window Functions
- SQL for Distributed Data Processing
- Data Manipulation with DuckDB
- Data Manipulation with Polars
- Data Manipulation with FugueSQL
- Bonus: Training Machine Learning Models with SQL
- Summary
- Chapter 4. Data Transformation with dbt
- dbt Design Philosophy
- dbt Data Flow
- dbt Cloud
- Setting Up dbt Cloud with BigQuery and GitHub
- Using the dbt Cloud UI
- Using the dbt Cloud IDE
- Structure of a dbt Project
- Jaffle Shop Database
- YAML Files
- Models
- Sources
- Tests
- Analyses
- Seeds
- Documentation
- dbt Commands and Selection Syntax
- Jobs and Deployment
- Summary
- Chapter 5. dbt Advanced Topics
- Model Materializations
- Tables, Views, and Ephemeral Models
- Incremental Models
- Materialized Views
- Snapshots
- Dynamic SQL with Jinja
- Using SQL Macros
- dbt Packages
- Installing Packages
- Exploring the dbt_utils Package
- Using Packages Inside Macros and Models
- dbt Semantic Layer
- Summary
- Chapter 6. Building an End-to-End Analytics Engineering Use Case
- Problem Definition: An Omnichannel Analytics Case
- Operational Data Modeling
- Conceptual Model
- Logical Model
- Physical Model
- High-Level Data Architecture
- Analytical Data Modeling
- Identify the Business Processes
- Identify Facts and Dimensions in the Dimensional Data Model
- Identify the Attributes for Dimensions
- Define the Granularity for Business Facts
- Creating Our Data Warehouse with dbt
- Tests, Documentation, and Deployment with dbt
- Data Analytics with SQL
- Conclusion
- Index
- About the Authors
- Colophon
- Using Code Examples
System requirements
File format: PDF
Copy-Protection: Adobe-DRM (Digital Rights Management)
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.