
INTERNET OF THINGS

TOPIC: APACHE KAFKA


Group members:
Taniya Souza [1DA21CS150]
Srinivasan R [1DA21CS143]
Yashwanth B K [1DA21CS171]
Yashwanth Gowda B [1DA21CS172]

Guide: Prof. Lavanya Santosh, CSE Dept
What is Apache Kafka?
• Apache Kafka is an open-source distributed event-streaming platform.

• Originally developed by LinkedIn and donated to the Apache Software Foundation in 2011.

• Designed to handle high-throughput, low-latency, real-time data streams.


Key Features of Kafka
 Distributed System: Runs as a cluster of brokers for scalability and fault tolerance.

 Durable Storage: Data is stored on disk and replicated across brokers.

 High Throughput: Can handle millions of messages per second.

 Low Latency: Ensures quick delivery of messages.

 Decoupling Systems: Allows independent development and scaling of producers and consumers.
Why Use Kafka?
•Ideal for modern data-driven applications.

•Helps in building real-time analytics systems.

•Serves as a backbone for microservices communication.

•Ensures scalability to handle large datasets.

•Integrates with popular big data frameworks like Spark, Flink, and Hadoop.
Core Functions
• Publish and Subscribe: Enables real-time messaging between producers and consumers through topics.

• Durable Storage: Persistently stores data streams on disk, allowing replay and recovery.

• Scalable Partitioning: Divides topics into partitions for parallel and distributed data processing.

• Fault Tolerance: Ensures data availability and reliability through replication across brokers.

• Real-Time Stream Processing: Processes and analyzes data streams in real time using Kafka Streams or external tools.

Kafka Architecture Overview
• Kafka is a publish-subscribe messaging system with the following components:
• Producers: Publish messages to topics.
• Consumers: Subscribe to topics to consume messages.
• Brokers: Manage the storage and retrieval of messages.
• Topics: Categories to which messages are published.
• Partitions: Break down topics for scalability.
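The components above can be sketched with a toy in-memory model (illustrative only; a real Kafka broker is a networked service that persists messages to disk, and the class and method names here are invented for the sketch):

```python
# Toy in-memory sketch of Kafka's core components (not real Kafka).
class Broker:
    """Stores the messages for the partitions it owns."""
    def __init__(self):
        self.partitions = {}  # (topic, partition) -> list of messages

    def append(self, topic, partition, message):
        self.partitions.setdefault((topic, partition), []).append(message)

    def read(self, topic, partition, offset):
        return self.partitions.get((topic, partition), [])[offset:]


class Producer:
    """Publishes messages to a topic on the broker."""
    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, message, partition=0):
        self.broker.append(topic, partition, message)


class Consumer:
    """Subscribes to topics and tracks how far it has read."""
    def __init__(self, broker):
        self.broker = broker
        self.offsets = {}  # (topic, partition) -> next offset to read

    def poll(self, topic, partition=0):
        offset = self.offsets.get((topic, partition), 0)
        messages = self.broker.read(topic, partition, offset)
        self.offsets[(topic, partition)] = offset + len(messages)
        return messages


broker = Broker()
Producer(broker).send("orders", "order-1")
Producer(broker).send("orders", "order-2")
consumer = Consumer(broker)
print(consumer.poll("orders"))  # ['order-1', 'order-2']
print(consumer.poll("orders"))  # [] -- already consumed
```

Note how the producer and consumer never talk to each other directly; the broker in the middle is what decouples them.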
Kafka Topics
 A topic is a logical channel for data streams.

 Each topic is divided into partitions for parallel processing.

 Data in topics is retained for a configurable period, even after consumption.

 Topics can have configurations for replication and data retention.

 Example: A “Sales Data” topic could have partitions based on regions.
Producers and Consumers
• Producers: Send data to Kafka topics.
  – Push messages to specific partitions.
  – Can define custom partitioning logic (e.g., based on keys).

• Consumers: Read data from topics.
  – Join consumer groups for parallel processing.
  – Kafka ensures that each partition is read by one consumer in a group.
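The one-consumer-per-partition rule can be illustrated with a hypothetical round-robin assignment (real Kafka uses pluggable assignor strategies such as range or cooperative-sticky, handled by the group coordinator):

```python
def assign_partitions(partitions, consumers):
    """Round-robin a topic's partitions across the consumers of one group."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

# 6 partitions shared by a 2-consumer group:
# each partition is read by exactly one consumer.
result = assign_partitions(list(range(6)), ["c1", "c2"])
print(result)  # {'c1': [0, 2, 4], 'c2': [1, 3, 5]}
```

Adding a third consumer to the group would redistribute the six partitions two apiece, which is how consumer groups scale reads horizontally.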


Brokers and Clusters
• A Kafka cluster consists of multiple brokers.

• Brokers: Handle storage and management of data

streams.

• Each broker handles a subset of partitions.

• Collaborate to provide fault tolerance and scalability.

• Clusters use ZooKeeper (or KRaft, in newer versions)

for managing configurations and leader election.


Kafka Partitions
 Topics are divided into partitions to distribute data

and allow parallelism.

 Data Placement: Messages in partitions are stored

in the order they arrive.

 Key-Based Partitioning: Ensures that messages

with the same key go to the same partition.

 Example: A “User Activity” topic could have partitions

for different user IDs.
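Key-based placement boils down to hashing the key modulo the partition count. Kafka's default partitioner uses murmur2 on the key bytes; `crc32` stands in here so the sketch stays in the standard library:

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition: the same key always lands on the
    same partition (Kafka's default uses murmur2; crc32 stands in here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# All events for one user hash to one partition,
# so their relative order is preserved.
assert partition_for("user-42", 3) == partition_for("user-42", 3)
```

This is also why changing the number of partitions on a live topic breaks key-to-partition affinity: the modulus changes, so existing keys may map elsewhere.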


Offset and Message Ordering
• Offset: A unique identifier for each message in a partition.

• Used to keep track of consumed messages.

• Kafka guarantees message order within a partition but not across partitions.

• Consumers can reset offsets for reprocessing messages.
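Offset bookkeeping can be sketched as follows (a simplification: a real consumer commits its offsets back to Kafka and seeks via the client API; the `OffsetTracker` class here is invented for illustration):

```python
class OffsetTracker:
    """Tracks a consumer's position in one partition's log."""
    def __init__(self):
        self.offset = 0  # offset of the next message to read

    def consume(self, log, max_records=2):
        batch = log[self.offset:self.offset + max_records]
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        """Reset the offset, e.g. to reprocess earlier messages."""
        self.offset = offset


log = ["evt-0", "evt-1", "evt-2", "evt-3"]  # one partition, in arrival order
tracker = OffsetTracker()
print(tracker.consume(log))  # ['evt-0', 'evt-1']
print(tracker.consume(log))  # ['evt-2', 'evt-3']
tracker.seek(0)              # rewind for reprocessing
print(tracker.consume(log))  # ['evt-0', 'evt-1'] again
```

Because each partition keeps its own log and offset sequence, ordering holds within one partition but nothing relates offsets across partitions.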


Durability and Replication
•Kafka ensures durability by replicating data across brokers.

•Leader Replica: Handles all read and write requests for a partition.

•Follower Replicas: Maintain copies and take over if the leader fails.

•Acknowledgments: Producers can configure how many replicas must confirm

a message before it's considered successful.
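The acknowledgment rule can be sketched as a tiny simulation (heavily simplified: real Kafka expresses this through the producer's `acks` setting of 0, 1, or `all` together with in-sync replica tracking; the `Partition` class here is invented for illustration):

```python
class Partition:
    """One partition with a leader and follower replicas (simplified)."""
    def __init__(self, num_replicas=3):
        self.replicas = [[] for _ in range(num_replicas)]  # replica 0 = leader

    def produce(self, message, required_acks):
        acks = 0
        for replica in self.replicas:
            replica.append(message)  # leader writes, then followers replicate
            acks += 1
            if acks >= required_acks:
                break  # enough confirmations; lagging followers catch up later
        return acks >= required_acks


p = Partition(num_replicas=3)
assert p.produce("m1", required_acks=1)  # like acks=1: leader alone confirms
assert p.produce("m2", required_acks=3)  # like acks=all: every replica confirms
```

The trade-off is latency versus durability: fewer required acks means faster writes, but a leader crash before followers catch up can lose the message.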


Use Cases of Kafka
•Real-Time Analytics:
Monitor and analyze social media feeds or website activities.
•Log Aggregation:
Centralized logging from distributed systems.
•Event Sourcing:
Capture application changes as a sequence of events.
•Data Integration:
Sync databases and applications.
•Stream Processing:
Process and analyze data in real-time with Kafka Streams or other tools.
Advantages of Kafka
•Scalability: Can scale horizontally by adding brokers.
•Flexibility: Works with multiple programming languages.
•Resilience: Fault-tolerant with replication and partitioning.
•Performance: Handles millions of events per second with low latency.
•Integration: Seamlessly integrates with popular tools like Spark and Flink.
Challenges with Kafka
 Complex Setup: Requires expertise to configure and maintain.

 Resource-Intensive: High memory usage for durability and performance.

 Message Duplication: Can occur without proper configuration.

 Operational Overhead: ZooKeeper dependency in older versions.


SUMMARY
 Apache Kafka is a distributed platform for real-time data streaming and
processing, designed for high-throughput, low-latency, and fault-tolerant
communication.
 Kafka uses topics for organizing data, partitions for scalability, and
replication for reliability, enabling efficient handling of massive data
streams.
 Common applications include real-time analytics, event-driven
architectures, log aggregation, and data integration between diverse
systems.
THANK YOU
