Building Data Streaming Applications with Apache Kafka

Name: Building Data Streaming Applications with Apache Kafka | Design, develop and streamline applications using Apache Kafka, Storm, Heron and Spark
Brand: Packt Publishing
Price: 44.49 EUR
Availability: Online only

Design, develop and streamline applications using Apache Kafka, Storm, Heron and Spark

Manish Kumar (Author)

Packt Publishing

Published on 8 July 2025

278 pages

E-Book

ePUB with Adobe-DRM

978-1-78728-763-1 (ISBN)

€44.49 incl. 7% VAT

System requirements

for ePUB with Adobe-DRM

E-Book Single Licence

Available for download

Content

Cover
Title Page
Copyright
Credits
About the Authors
About the Reviewer
www.PacktPub.com
Customer Feedback
Table of Contents
Preface
Chapter 1: Introduction to Messaging Systems
Understanding the principles of messaging systems
Understanding messaging systems
Peeking into a point-to-point messaging system
Publish-subscribe messaging system
Advanced Message Queuing Protocol
Using messaging systems in big data streaming applications
Summary
Chapter 2: Introducing Kafka the Distributed Messaging Platform
Kafka origins
Kafka's architecture
Message topics
Message partitions
Replication and replicated logs
Message producers
Message consumers
Role of Zookeeper
Summary
Chapter 3: Deep Dive into Kafka Producers
Kafka producer internals
Kafka Producer APIs
Producer object and ProducerRecord object
Custom partition
Additional producer configuration
Java Kafka producer example
Common message publishing patterns
Best practices
Summary
Chapter 4: Deep Dive into Kafka Consumers
Kafka consumer internals
Understanding the responsibilities of Kafka consumers
Kafka consumer APIs
Consumer configuration
Subscription and polling
Committing and polling
Additional configuration
Java Kafka consumer
Scala Kafka consumer
Rebalance listeners
Common message consuming patterns
Best practices
Summary
Chapter 5: Building Spark Streaming Applications with Kafka
Introduction to Spark
Spark architecture
Pillars of Spark
The Spark ecosystem
Spark Streaming
Receiver-based integration
Disadvantages of receiver-based approach
Java example for receiver-based integration
Scala example for receiver-based integration
Direct approach
Java example for direct approach
Scala example for direct approach
Use case log processing - fraud IP detection
Maven
Producer
Property reader
Producer code
Fraud IP lookup
Expose hive table
Streaming code
Summary
Chapter 6: Building Storm Applications with Kafka
Introduction to Apache Storm
Storm cluster architecture
The concept of a Storm application
Introduction to Apache Heron
Heron architecture
Heron topology architecture
Integrating Apache Kafka with Apache Storm - Java
Example
Integrating Apache Kafka with Apache Storm - Scala
Use case - log processing in Storm, Kafka, Hive
Producer
Producer code
Fraud IP lookup
Running the project
Summary
Chapter 7: Using Kafka with Confluent Platform
Introduction to Confluent Platform
Deep diving into Confluent architecture
Understanding Kafka Connect and Kafka Stream
Kafka Streams
Playing with Avro using Schema Registry
Moving Kafka data to HDFS
Camus
Running Camus
Gobblin
Gobblin architecture
Kafka Connect
Flume
Summary
Chapter 8: Building ETL Pipelines Using Kafka
Considerations for using Kafka in ETL pipelines
Introducing Kafka Connect
Deep dive into Kafka Connect
Introductory examples of using Kafka Connect
Kafka Connect common use cases
Summary
Chapter 9: Building Streaming Applications Using Kafka Streams
Introduction to Kafka Streams
Using Kafka in Stream processing
Kafka Stream - lightweight Stream processing library
Kafka Stream architecture
Integrated framework advantages
Understanding tables and Streams together
Maven dependency
Kafka Stream word count
KTable
Use case example of Kafka Streams
Maven dependency of Kafka Streams
Property reader
IP record producer
IP lookup service
Fraud detection application
Summary
Chapter 10: Kafka Cluster Deployment
Kafka cluster internals
Role of Zookeeper
Replication
Metadata request processing
Producer request processing
Consumer request processing
Capacity planning
Capacity planning goals
Replication factor
Memory
Hard drives
Network
CPU
Single cluster deployment
Multicluster deployment
Decommissioning brokers
Data migration
Summary
Chapter 11: Using Kafka in Big Data Applications
Managing high volumes in Kafka
Appropriate hardware choices
Producer read and consumer write choices
Kafka message delivery semantics
At least once delivery
At most once delivery
Exactly once delivery
Big data and Kafka common usage patterns
Kafka and data governance
Alerting and monitoring
Useful Kafka metrics
Producer metrics
Broker metrics
Consumer metrics
Summary
Chapter 12: Securing Kafka
An overview of securing Kafka
Wire encryption using SSL
Steps to enable SSL in Kafka
Configuring SSL for Kafka Broker
Configuring SSL for Kafka clients
Kerberos SASL for authentication
Steps to enable SASL/GSSAPI - in Kafka
Configuring SASL for Kafka broker
Configuring SASL for Kafka client - producer and consumer
Understanding ACL and authorization
Common ACL operations
List ACLs
Understanding Zookeeper authentication
Apache Ranger for authorization
Adding Kafka Service to Ranger
Adding policies
Best practices
Summary
Chapter 13: Streaming Application Design Considerations
Latency and throughput
Data and state persistence
Data sources
External data lookups
Data formats
Data serialization
Level of parallelism
Out-of-order events
Message processing semantics
Summary
Index

Schweitzer Fachinformationen
