
Hadoop in Practice
Includes 85 Techniques
Alex Holmes (Author)
Manning Publications (Publisher)
Published on 13 October 2012
Book
Paperback/Softback
536 pages
978-1-61729-023-7 (ISBN)
Description
Summary
Hadoop in Practice collects 85 Hadoop examples and presents them in a problem/solution format. Each technique addresses a specific task you'll face, like querying big data using Pig or writing a log file loader. You'll explore each problem step by step, learning both how to build and deploy that specific solution along with the thinking that went into its design. As you work through the tasks, you'll find yourself growing more comfortable with Hadoop and at home in the world of big data.
About the Technology
Hadoop is an open source MapReduce platform designed to query and analyze data distributed across large clusters. Especially effective for big data systems, Hadoop powers mission-critical software at Apple, eBay, LinkedIn, Yahoo, and Facebook. It offers developers handy ways to store, manage, and analyze data.
About the Book
Hadoop in Practice collects 85 battle-tested examples and presents them in a problem/solution format. It balances conceptual foundations with practical recipes for key problem areas like data ingress and egress, serialization, and LZO compression. You'll explore each technique step by step, learning how to build a specific solution along with the thinking that went into it. As a bonus, the book's examples create a well-structured and understandable codebase you can tweak to meet your own needs.
This book assumes the reader knows the basics of Hadoop.
Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. All code from the book is also available.
What's Inside
- Conceptual overview of Hadoop and MapReduce
- 85 practical, tested techniques
- Real problems, real solutions
- How to integrate MapReduce and R
Table of Contents
- Hadoop in a heartbeat
- Moving data in and out of Hadoop
- Data serialization: working with text and beyond
- Applying MapReduce patterns to big data
- Streamlining HDFS for big data
- Diagnosing and tuning performance problems
- Utilizing data structures and algorithms
- Integrating R and Hadoop for statistics and more
- Predictive analytics with Mahout
- Hacking with Hive
- Programming pipelines with Pig
- Crunch and other technologies
- Testing and debugging
More details
Language
English
Place of publication
New York
United States
Product notice
Paperback (trade)
Unsewn / adhesive bound
Illustrations
Illustrations
Dimensions
Height: 242 mm
Width: 192 mm
Thickness: 33 mm
Weight
900 g
ISBN-13
978-1-61729-023-7 (9781617290237)
Person
Alex Holmes is a senior software engineer with extensive expertise in solving big data problems using Hadoop. He has presented at JavaOne and Jazoon and is a technical lead at VeriSign.