
Principles of Data Wrangling

Content
- Cover
- Copyright
- Table of Contents
- Foreword
- Chapter 1. Introduction
- Magic Thresholds, PYMK, and User Growth at Facebook
- Chapter 2. A Data Workflow Framework
- How Data Flows During and Across Projects
- Connecting Analytic Actions to Data Movement: A Holistic Workflow Framework for Data Projects
- Raw Data Stage Actions: Ingest Data and Create Metadata
- Ingesting Known and Unknown Data
- Creating Metadata
- Refined Data Stage Actions: Create Canonical Data and Conduct Ad Hoc Analyses
- Designing Refined Data
- Refined Stage Analytical Actions
- Production Data Stage Actions: Create Production Data and Build Automated Systems
- Creating Optimized Data
- Designing Regular Reports and Automated Products/Services
- Data Wrangling within the Workflow Framework
- Chapter 3. The Dynamics of Data Wrangling
- Data Wrangling Dynamics
- Additional Aspects: Subsetting and Sampling
- Core Transformation and Profiling Actions
- Data Wrangling in the Workflow Framework
- Ingesting Data
- Describing Data
- Assessing Data Utility
- Designing and Building Refined Data
- Ad Hoc Reporting
- Exploratory Modeling and Forecasting
- Building an Optimized Dataset
- Regular Reporting and Building Data-Driven Products and Services
- Chapter 4. Profiling
- Overview of Profiling
- Individual Value Profiling: Syntactic Profiling
- Individual Value Profiling: Semantic Profiling
- Set-Based Profiling
- Profiling Individual Values in the Candidate Master File
- Syntactic Profiling in the Candidate Master File
- Set-Based Profiling in the Candidate Master File
- Chapter 5. Transformation: Structuring
- Overview of Structuring
- Intrarecord Structuring: Extracting Values
- Positional Extraction
- Pattern Extraction
- Complex Structure Extraction
- Intrarecord Structuring: Combining Multiple Record Fields
- Interrecord Structuring: Filtering Records and Fields
- Interrecord Structuring: Aggregations and Pivots
- Simple Aggregations
- Column-to-Row Pivots
- Row-to-Column Pivots
- Chapter 6. Transformation: Enriching
- Unions
- Joins
- Inserting Metadata
- Derivation of Values
- Generic
- Proprietary
- Chapter 7. Using Transformation to Clean Data
- Addressing Missing/NULL Values
- Addressing Invalid Values
- Chapter 8. Roles and Responsibilities
- Skills and Responsibilities
- Data Engineer
- Data Architect
- Data Scientist
- Analyst
- Roles Across the Data Workflow Framework
- Organizational Best Practices
- Chapter 9. Data Wrangling Tools
- Data Size and Infrastructure
- Data Structures
- Excel
- SQL
- Trifacta Wrangler
- Transformation Paradigms
- Excel
- SQL
- Trifacta Wrangler
- Choosing a Data Wrangling Tool
- About the Authors
- Colophon
System requirements
File format: PDF
Copy protection: Adobe DRM (Digital Rights Management)
- Computer (Windows, macOS, Linux): Install the free Adobe Digital Editions reader before downloading (see eBook Help).
- Tablet/smartphone (Android, iOS): Install the free Adobe Digital Editions app or the PocketBook app before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware, which makes it well suited to complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe DRM, a "hard" form of copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID once it is installed.
For more information, see our eBook Help page.