
Practical Real-time Data Processing and Analytics
Description

Person
Shilpi has more than 12 years of experience (3 of them in the big data space) in developing and executing various facets of enterprise solutions, in both the products and services dimensions of the software industry. An engineer by degree and profession, she has worn varied hats, such as developer, technical leader, product owner, and tech manager, and has seen all the flavors that the industry has to offer. She has architected and worked through some pioneering production implementations in big data on Storm and Impala with auto-scaling in AWS. Shilpi also authored Real-time Analytics with Storm and Cassandra with Packt Publishing.
Saurabh Gupta is a software engineer who has worked on aspects of software requirements, design, execution, and delivery. He has 10 years of experience in the IT industry, more than 3 of them in the big data domain. He designs and handles real-time as well as batch processing projects running in production, using technologies such as Impala, Storm, NiFi, and Kafka, with deployment on AWS using Docker. Saurabh has also worked in product development and delivery, and has exposure to various IoT use cases, including telecom, healthcare, smart cities, and smart cars.
Content
- Intro
- Practical Real-Time Data Processing and Analytics
- Credits
- About the Authors
- About the Reviewers
- www.PacktPub.com
- Why subscribe?
- Customer Feedback
- Table of Contents
- Preface
- What this book covers
- What you need for this book
- Who this book is for
- Conventions
- Reader feedback
- Customer support
- 1 Introducing Real-Time Analytics
- What is big data?
- Big data infrastructure
- Real-time analytics - the myth and the reality
- Near real-time solution - an architecture that works
- Lambda architecture - analytics possibilities
- IOT - thoughts and possibilities
- Cloud - considerations for NRT and IOT
- Summary
- 2 Real Time Applications - The Basic Ingredients
- The NRT system and its building blocks
- NRT - high-level system view
- NRT - technology view
- Summary
- 3 Understanding and Tailing Data Streams
- Understanding data streams
- Setting up infrastructure for data ingestion
- Tapping data from source to the processor - expectations and caveats
- Comparing and choosing what works best for your use case
- Do it yourself
- Summary
- 4 Setting up the Infrastructure for Storm
- Overview of Storm
- Storm architecture and its components
- Setting up and configuring Storm
- Real-time processing job on Storm
- Summary
- 5 Configuring Apache Spark and Flink
- Setting up and a quick execution of Spark
- Setting up and a quick execution of Flink
- Setting up and a quick execution of Apache Beam
- Balancing in Apache Beam
- Summary
- 6 Integrating Storm with a Data Source
- RabbitMQ - messaging that works
- RabbitMQ exchanges
- RabbitMQ - integration with Storm
- PubNub data stream publisher
- String together Storm-RMQ-PubNub sensor data topology
- Summary
- 7 From Storm to Sink
- Setting up and configuring Cassandra
- Storm and Cassandra topology
- Storm and IMDB integration for dimensional data
- Integrating the presentation layer with Storm
- Do It Yourself
- Summary
- 8 Storm Trident
- State retention and the need for Trident
- Basic Storm Trident topology
- Trident internals
- Trident operations
- DRPC
- Do It Yourself
- Summary
- 9 Working with Spark
- Spark overview
- Distinct advantages of Spark
- Spark - use cases
- Spark architecture - working inside the engine
- Spark pragmatic concepts
- Spark 2.x - advent of data frames and datasets
- Summary
- 10 Working with Spark Operations
- Spark - packaging and API
- RDD pragmatic exploration
- Shared variables - broadcast variables and accumulators
- Summary
- 11 Spark Streaming
- Spark Streaming concepts
- Spark Streaming - introduction and architecture
- Packaging structure of Spark Streaming
- Connecting Kafka to Spark Streaming
- Summary
- 12 Working with Apache Flink
- Flink architecture and execution engine
- Flink basic components and processes
- Integration of source stream to Flink
- Flink processing and computation
- Flink persistence
- FlinkCEP
- Pattern API
- Gelly
- DIY
- Summary
- 13 Case Study
- Introduction
- Data modeling
- Tools and frameworks
- Setting up the infrastructure
- Implementing the case study
- Running the case study
- Summary
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The ePub file format works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our eBook Help page.