
Getting Started with Impala

Content
- Intro
- Copyright
- Table of Contents
- Introduction
- Who Is This Book For?
- Conventions Used in This Book
- Using Code Examples
- Safari® Books Online
- How to Contact Us
- Content Updates
- March 30, 2016
- Acknowledgments
- Chapter 1. Why Impala?
- Impala's Place in the Big Data Ecosystem
- Flexibility for Your Big Data Workflow
- High-Performance Analytics
- Exploratory Business Intelligence
- Chapter 2. Getting Up and Running with Impala
- Installation
- Connecting to Impala
- Your First Impala Queries
- Chapter 3. Impala for the Database Developer
- The SQL Language
- Standard SQL for Queries
- Limited DML
- No Transactions
- Numbers
- Recent Additions
- Big Data Considerations
- Billions and Billions of Rows
- HDFS Block Size
- Parquet Files: The Biggest Blocks of All
- How Impala Is Like a Data Warehouse
- Physical and Logical Data Layouts
- The HDFS Storage Model
- Distributed Queries
- Normalized and Denormalized Data
- File Formats
- Text File Format
- Parquet File Format
- Getting File Format Information
- Switching File Formats
- Aggregation
- Chapter 4. Common Developer Tasks for Impala
- Getting Data into an Impala Table
- INSERT Statement
- LOAD DATA Statement
- External Tables
- Figuring Out Where Impala Data Resides
- Manually Loading Data Files into HDFS
- Hive
- Sqoop
- Kite
- Porting SQL Code to Impala
- Using Impala from a JDBC or ODBC Application
- JDBC
- ODBC
- Using Impala with a Scripting Language
- Running Impala SQL Statements from Scripts
- Variable Substitution
- Saving Query Results
- The impyla Package for Python Scripting
- Optimizing Impala Performance
- Optimizing Query Performance
- Optimizing Memory Usage
- Working with Partitioned Tables
- Finding the Ideal Granularity
- Inserting into Partitioned Tables
- Adding and Loading New Partitions
- Keeping Statistics Up to Date for Partitioned Tables
- Writing User-Defined Functions
- Collaborating with Your Administrators
- Designing for Security
- Anticipate Memory Usage
- Understanding Resource Management
- Helping to Plan for Performance (Stats, HDFS Caching)
- Understanding Cluster Topology
- Always Close Your Queries
- Chapter 5. Tutorials and Deep Dives
- Tutorial: From Unix Data File to Impala Table
- Tutorial: Queries Without a Table
- Tutorial: The Journey of a Billion Rows
- Generating a Billion Rows of CSV Data
- Normalizing the Original Data
- Converting to Parquet Format
- Making a Partitioned Table
- Next Steps
- Deep Dive: Joins and the Role of Statistics
- Creating a Million-Row Table to Join With
- Loading Data and Computing Stats
- Reviewing the EXPLAIN Plan
- Trying a Real Query
- The Story So Far
- Final Join Query with 1B x 1M Rows
- Anti-Pattern: A Million Little Pieces
- Tutorial: Across the Fourth Dimension
- TIMESTAMP Data Type
- Format Strings for Dates and Times
- Working with Individual Date and Time Fields
- Date and Time Arithmetic
- Let's Solve the Y2K Problem
- More Fun with Dates
- Tutorial: Verbose and Quiet impala-shell Output
- Tutorial: When Schemas Evolve
- Numbers Versus Strings
- Dealing with Out-of-Range Integers
- Tutorial: Levels of Abstraction
- String Formatting
- Temperature Conversion
- Tutorial: Subqueries
- Subqueries in the FROM Clause
- Subqueries in the FROM Clause for Join Queries
- Subqueries in the WHERE Clause
- Uncorrelated and Correlated Subqueries
- Common Table Expressions in the WITH Clause
- Tutorial: Analytic Functions
- Analyzing the Numbers 1 Through 10
- Running Totals and Moving Averages
- Breaking Ties
- Tutorial: Complex Types
- ARRAY: A List of Items with Identical Types
- MAP: A Hash Table or Dictionary with Key-Value Pairs
- STRUCT: A Row-Like Object for Flexible Typing and Naming
- Nesting Complex Types to Represent Arbitrary Data Structures
- Querying Tables with Nested Complex Types
- Constructing Data for Complex Types
- About the Author
- Colophon
System requirements
File format: PDF
Copy-Protection: Adobe-DRM (Digital Rights Management)
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID once it is installed.
For more information, see our eBook Help page.