
Building Data Streaming Applications with Apache Kafka

Person
Chanchal Singh has more than half a decade of experience in product development and architecture design. He has worked closely with the leadership teams of various companies, including directors, CTOs, and founding members, to define their technical roadmaps. He is the founder of and a speaker at the meetup group Big Data and AI Pune Meetup - Experience Speaks. He is a co-author of the book Building Data Streaming Applications with Apache Kafka. He has a Bachelor's degree in Information Technology from the University of Mumbai and a Master's degree in Computer Application from Amity University. He was also part of the Entrepreneur Cell in IIT Mumbai. His LinkedIn profile can be found under the username Chanchal Singh.
Manish Kumar works as Director of Technology and Architecture at VSquare. He has over 13 years' experience in providing technology solutions to complex business problems. He has worked extensively on web application development, IoT, big data, cloud technologies, and blockchain. Aside from this book, Manish has co-authored three books (Mastering Hadoop 3, Artificial Intelligence for Big Data, and Building Streaming Applications with Apache Kafka).
Content
- Cover
- Title Page
- Copyright
- Credits
- About the Authors
- About the Reviewer
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Introduction to Messaging Systems
- Understanding the principles of messaging systems
- Understanding messaging systems
- Peeking into a point-to-point messaging system
- Publish-subscribe messaging system
- Advanced Message Queuing Protocol
- Using messaging systems in big data streaming applications
- Summary
- Chapter 2: Introducing Kafka the Distributed Messaging Platform
- Kafka origins
- Kafka's architecture
- Message topics
- Message partitions
- Replication and replicated logs
- Message producers
- Message consumers
- Role of Zookeeper
- Summary
- Chapter 3: Deep Dive into Kafka Producers
- Kafka producer internals
- Kafka Producer APIs
- Producer object and ProducerRecord object
- Custom partition
- Additional producer configuration
- Java Kafka producer example
- Common messaging publishing patterns
- Best practices
- Summary
- Chapter 4: Deep Dive into Kafka Consumers
- Kafka consumer internals
- Understanding the responsibilities of Kafka consumers
- Kafka consumer APIs
- Consumer configuration
- Subscription and polling
- Committing and polling
- Additional configuration
- Java Kafka consumer
- Scala Kafka consumer
- Rebalance listeners
- Common message consuming patterns
- Best practices
- Summary
- Chapter 5: Building Spark Streaming Applications with Kafka
- Introduction to Spark
- Spark architecture
- Pillars of Spark
- The Spark ecosystem
- Spark Streaming
- Receiver-based integration
- Disadvantages of receiver-based approach
- Java example for receiver-based integration
- Scala example for receiver-based integration
- Direct approach
- Java example for direct approach
- Scala example for direct approach
- Use case log processing - fraud IP detection
- Maven
- Producer
- Property reader
- Producer code
- Fraud IP lookup
- Expose hive table
- Streaming code
- Summary
- Chapter 6: Building Storm Applications with Kafka
- Introduction to Apache Storm
- Storm cluster architecture
- The concept of a Storm application
- Introduction to Apache Heron
- Heron architecture
- Heron topology architecture
- Integrating Apache Kafka with Apache Storm - Java
- Example
- Integrating Apache Kafka with Apache Storm - Scala
- Use case - log processing in Storm, Kafka, Hive
- Producer
- Producer code
- Fraud IP lookup
- Running the project
- Summary
- Chapter 7: Using Kafka with Confluent Platform
- Introduction to Confluent Platform
- Deep diving into Confluent architecture
- Understanding Kafka Connect and Kafka Stream
- Kafka Streams
- Playing with Avro using Schema Registry
- Moving Kafka data to HDFS
- Camus
- Running Camus
- Gobblin
- Gobblin architecture
- Kafka Connect
- Flume
- Summary
- Chapter 8: Building ETL Pipelines Using Kafka
- Considerations for using Kafka in ETL pipelines
- Introducing Kafka Connect
- Deep dive into Kafka Connect
- Introductory examples of using Kafka Connect
- Kafka Connect common use cases
- Summary
- Chapter 9: Building Streaming Applications Using Kafka Streams
- Introduction to Kafka Streams
- Using Kafka in Stream processing
- Kafka Stream - lightweight Stream processing library
- Kafka Stream architecture
- Integrated framework advantages
- Understanding tables and Streams together
- Maven dependency
- Kafka Stream word count
- KTable
- Use case example of Kafka Streams
- Maven dependency of Kafka Streams
- Property reader
- IP record producer
- IP lookup service
- Fraud detection application
- Summary
- Chapter 10: Kafka Cluster Deployment
- Kafka cluster internals
- Role of Zookeeper
- Replication
- Metadata request processing
- Producer request processing
- Consumer request processing
- Capacity planning
- Capacity planning goals
- Replication factor
- Memory
- Hard drives
- Network
- CPU
- Single cluster deployment
- Multicluster deployment
- Decommissioning brokers
- Data migration
- Summary
- Chapter 11: Using Kafka in Big Data Applications
- Managing high volumes in Kafka
- Appropriate hardware choices
- Producer write and consumer read choices
- Kafka message delivery semantics
- At least once delivery
- At most once delivery
- Exactly once delivery
- Big data and Kafka common usage patterns
- Kafka and data governance
- Alerting and monitoring
- Useful Kafka metrics
- Producer metrics
- Broker metrics
- Consumer metrics
- Summary
- Chapter 12: Securing Kafka
- An overview of securing Kafka
- Wire encryption using SSL
- Steps to enable SSL in Kafka
- Configuring SSL for Kafka Broker
- Configuring SSL for Kafka clients
- Kerberos SASL for authentication
- Steps to enable SASL/GSSAPI in Kafka
- Configuring SASL for Kafka broker
- Configuring SASL for Kafka client - producer and consumer
- Understanding ACL and authorization
- Common ACL operations
- List ACLs
- Understanding Zookeeper authentication
- Apache Ranger for authorization
- Adding Kafka Service to Ranger
- Adding policies
- Best practices
- Summary
- Chapter 13: Streaming Application Design Considerations
- Latency and throughput
- Data and state persistence
- Data sources
- External data lookups
- Data formats
- Data serialization
- Level of parallelism
- Out-of-order events
- Message processing semantics
- Summary
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The ePub file format works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, you will unfortunately not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.