Hadoop: The Definitive Guide

Name: Hadoop: The Definitive Guide
Price: 43.0 EUR
Availability: Out of stock
Author: Tom White
ISBN: 978-0-596-52197-4

Tom White (Author)

O'Reilly (Publisher)

1st Edition

Published on 14 July 2009

Book

Paperback/Softback

524 pages

978-0-596-52197-4 (ISBN)

€43.00 incl. 7% VAT

Out of stock; check for reprint

New edition available


Table of Contents

* Chapter 1 Meet Hadoop
  * Data!
  * Data Storage and Analysis
  * Comparison with Other Systems
  * A Brief History of Hadoop
  * The Apache Hadoop Project
* Chapter 2 MapReduce
  * A Weather Dataset
  * Analyzing the Data with Unix Tools
  * Analyzing the Data with Hadoop
  * Scaling Out
  * Hadoop Streaming
  * Hadoop Pipes
* Chapter 3 The Hadoop Distributed Filesystem
  * The Design of HDFS
  * HDFS Concepts
  * The Command-Line Interface
  * Hadoop Filesystems
  * The Java Interface
  * Data Flow
  * Parallel Copying with distcp
  * Hadoop Archives
* Chapter 4 Hadoop I/O
  * Data Integrity
  * Compression
  * Serialization
  * File-Based Data Structures
* Chapter 5 Developing a MapReduce Application
  * The Configuration API
  * Configuring the Development Environment
  * Writing a Unit Test
  * Running Locally on Test Data
  * Running on a Cluster
  * Tuning a Job
  * MapReduce Workflows
* Chapter 6 How MapReduce Works
  * Anatomy of a MapReduce Job Run
  * Failures
  * Job Scheduling
  * Shuffle and Sort
  * Task Execution
* Chapter 7 MapReduce Types and Formats
  * MapReduce Types
  * Input Formats
  * Output Formats
* Chapter 8 MapReduce Features
  * Counters
  * Sorting
  * Joins
  * Side Data Distribution
  * MapReduce Library Classes
* Chapter 9 Setting Up a Hadoop Cluster
  * Cluster Specification
  * Cluster Setup and Installation
  * SSH Configuration
  * Hadoop Configuration
  * Post Install
  * Benchmarking a Hadoop Cluster
  * Hadoop in the Cloud
* Chapter 10 Administering Hadoop
  * HDFS
  * Monitoring
  * Maintenance
* Chapter 11 Pig
  * Installing and Running Pig
  * An Example
  * Comparison with Databases
  * Pig Latin
  * User-Defined Functions
  * Data Processing Operators
  * Pig in Practice
* Chapter 12 HBase
  * HBasics
  * Concepts
  * Installation
  * Clients
  * Example
  * HBase Versus RDBMS
  * Praxis
* Chapter 13 ZooKeeper
  * Installing and Running ZooKeeper
  * An Example
  * The ZooKeeper Service
  * Building Applications with ZooKeeper
  * ZooKeeper in Production
* Chapter 14 Case Studies
  * Hadoop Usage at Last.fm
  * Hadoop and Hive at Facebook
  * Nutch Search Engine
  * Log Processing at Rackspace
  * Cascading
  * TeraByte Sort on Apache Hadoop
* Appendix Installing Apache Hadoop
  * Prerequisites
  * Installation
  * Configuration
* Appendix Cloudera's Distribution for Hadoop
  * Prerequisites
  * Standalone Mode
  * Pseudo-Distributed Mode
  * Fully Distributed Mode
  * Hadoop-Related Packages
* Appendix Preparing the NCDC Weather Data
* Colophon


Schweitzer Fachinformationen