apache-kafka-101: a simple presentation on how to use Kafka, by Teja Illa
Apache Kafka: A Distributed Streaming Platform
Apache Kafka is a distributed streaming platform that has revolutionized the way organizations handle real-time data feeds. Developed by LinkedIn and later open-sourced, Kafka has become a cornerstone technology for building real-time data pipelines and streaming applications. Its unique architecture and capabilities make it an essential tool for modern data-driven enterprises.
At its core, Kafka is designed to handle high-throughput, fault-tolerant, publish-subscribe messaging. It allows for the decoupling of data streams from core applications, enabling a more flexible and scalable approach to data processing. Kafka's distributed nature ensures high availability and durability, making it suitable for critical data pipelines in large-scale enterprises.
Architecture and Core Concepts:
Kafka's architecture is built around several key concepts:
1. Topics: These are categories or feed names to which records are published. Topics in Kafka are always multi-subscriber; a topic can have zero, one, or many consumers that subscribe to the data written to it.
2. Partitions: Each topic is divided into partitions, which are the unit of parallelism in Kafka. Partitions allow for horizontal scaling and improved throughput.
3. Producers: These are client applications that publish (write) events to Kafka topics.
4. Consumers: These are applications or processes that subscribe to (read and process) events from Kafka topics.
5. Brokers: These are the servers that form the Kafka cluster, storing and serving data.
6. ZooKeeper: Used for managing and coordinating Kafka brokers (though newer Kafka versions replace this dependency with the built-in KRaft consensus mechanism).
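To make the relationship between these concepts concrete, here is a toy in-memory model (an illustrative sketch of my own, not the real Kafka API; the `InMemoryTopic` class and its names are invented for this example):

```python
# Toy in-memory model of topics, partitions, and offsets.
# Illustrative only -- not the real Kafka client API.

class InMemoryTopic:
    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an append-only list; a record's index is its offset.
        self.partitions = [[] for _ in range(num_partitions)]

    def produce(self, key, value):
        # Records with the same key land in the same partition,
        # which is how Kafka preserves per-key ordering.
        partition = hash(key) % len(self.partitions)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

topic = InMemoryTopic("clicks", num_partitions=3)
p1, o1 = topic.produce("user-42", "page:/home")
p2, o2 = topic.produce("user-42", "page:/cart")
assert p1 == p2      # same key -> same partition
assert o2 == o1 + 1  # offsets grow sequentially within a partition
```

Real Kafka producers behave analogously: the default partitioner hashes the record key to pick a partition, and the broker assigns each appended record the next offset in that partition.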
The interaction between these components forms the backbone of Kafka's powerful streaming capabilities. Producers write data to topics, which are partitioned across brokers in the cluster. Consumers read from these topics, processing the data in real-time or batches, depending on the use case.
One of Kafka's standout features is its log-based approach to data storage. Each partition is an ordered, immutable sequence of records that is continually appended to. This log structure allows for high-performance sequential reads and writes, contributing significantly to Kafka's speed and efficiency.
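The append-only log described above can be sketched in a few lines (an illustrative Python sketch, not Kafka's actual implementation): a partition is an immutable sequence, and a consumer is just a cursor into it.

```python
# Sketch of log-structured consumption (illustrative only):
# records are appended, never modified, and a consumer tracks
# its own position (offset), so replaying history is trivial.

log = []  # one partition's log
for event in ["created", "paid", "shipped"]:
    log.append(event)  # append-only writes; existing records never change

def consume(log, from_offset):
    # Sequential read starting at a saved offset -- the consumer,
    # not the broker, decides where to resume.
    return log[from_offset:]

assert consume(log, 0) == ["created", "paid", "shipped"]  # full replay
assert consume(log, 2) == ["shipped"]                     # resume mid-stream
```

This is why multiple independent consumer groups can read the same topic at different speeds: each only stores an offset, and the log itself is shared and immutable.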
Use Cases and Applications:
Kafka's versatility makes it suitable for a wide range of applications across various industries:
1. Messaging: Kafka can serve as a more traditional message broker, replacing legacy systems with a more scalable and reliable solution.
2. Activity Tracking: It's extensively used for collecting and processing user activity data (clicks, searches, etc.) for analytics or real-time monitoring.
Strimzi - Where Apache Kafka meets OpenShift - OpenShift Spain MeetUp, by José Román Martín Gil
Apache Kafka is the most widely used data streaming broker among companies. It can easily handle millions of messages and is the foundation of many architectures based on events, microservices, orchestration, and now cloud environments. OpenShift is the most widespread Platform as a Service (PaaS). It is based on Kubernetes and helps companies easily deploy any kind of workload in a cloud environment. Thanks to many of its features, it is the base for many architectures built on stateless applications to create new Cloud Native Applications. Strimzi is an open source community that implements a set of Kubernetes Operators to help you manage and deploy Apache Kafka brokers in OpenShift environments.
These slides introduce Strimzi as a new component on OpenShift for managing your Apache Kafka clusters.
Slides used at OpenShift Meetup Spain:
- https://www.meetup.com/es-ES/openshift_spain/events/261284764/
This document summarizes Netflix's use of Kafka in their data pipeline. It discusses how the pipeline evolved from S3 and EMR to incorporate Kafka producers and consumers handling 400 billion events per day, and how Netflix runs Kafka clusters with different priorities and configurations. It also covers the challenges of operating Kafka at that scale, such as ZooKeeper client issues, cluster scaling, and tuning Kafka clients and brokers, along with the solutions Netflix developed. Finally, it outlines Netflix's roadmap, which includes contributing to open source projects like Kafka and testing failure resilience.
DevOps Days Boston 2017: Real-world Kubernetes for DevOps, by Ambassador Labs
DevOps Days Boston 2017
Microservices is an increasingly popular approach to building cloud-native applications. Dozens of new technologies that streamline adopting microservices development such as Docker, Kubernetes, and Envoy have been released over the past few years. But how do you actually use these technologies together to develop, deploy, and run microservices?
In this presentation, we’ll cover the nuances of deploying containerized applications on Kubernetes, including creating a Kubernetes manifest, debugging and logging, and how to build an automated continuous deployment pipeline. Then, we’ll do a brief tour of some of the advanced concepts related to microservices, including service mesh, canary deployments, resilience, and security.
Data Science in Production: Technologies That Drive Adoption of Data Science ..., by Nir Yungster
Critical to a data science team’s ability to drive impact is its effectiveness in incorporating its solutions into new or existing products. When collaborating with other engineering teams, and especially when solutions must operate at scale, technological choices can be critical factors in determining what type of outcome you'll have. We walk through strategies and specific technologies - Airflow, Docker, Kubernetes - that can help promote successful collaboration between data science and engineering.
This document discusses Knewton's use of ZooKeeper and PettingZoo to implement distributed machine learning on a Python cluster. It begins by explaining what ZooKeeper is and how it provides services for distributed synchronization. It then discusses the state of ZooKeeper libraries for Python, including incomplete bindings and lack of high-level recipes. PettingZoo is introduced as Knewton's library that implements common ZooKeeper recipes for Python, allowing their machine learning models to be sharded and scaled across multiple machines. Distributed discovery, distributed bags, leader queues, and role matching are highlighted as key recipes that enable dynamic reconfiguration and load balancing of their distributed system.
Dissolving the Problem (Making an ACID-Compliant Database Out of Apache Kafka®), by Confluent
Presenter: Tim Berglund, Senior Director of Developer Experience, Confluent
It has become a truism in the past decade that building systems at scale, using non-relational databases, requires giving up on the transactional guarantees afforded by the relational databases of yore. ACID transactional semantics are fine, but we all know you can’t have them all in a distributed system. Or can we?
In this talk, I will argue that by designing our systems around a distributed log like Apache Kafka®, we can in fact achieve ACID semantics at scale. We can ensure that distributed write operations can be applied atomically, consistently, in isolation between services, and of course with durability. What seems to be a counterintuitive conclusion ends up being straightforwardly achievable using existing technologies, as an elusive set of properties becomes relatively easy to achieve with the right architectural paradigm underlying the application.
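One way to picture this argument (a hedged sketch of my own, not code from the talk): if each multi-entity update is appended to the shared log as a single record, then every consumer that replays the log applies the update all-or-nothing and deterministically reaches the same state.

```python
# Illustrative sketch: atomic multi-entity updates via single log appends.
# Because the transfer is ONE record in a totally ordered log, no replayer
# can ever observe a debit without the matching credit.

log = [{"from": "alice", "to": "bob", "amount": 30}]

def replay(log):
    # Deterministic replay from a fixed initial state.
    balances = {"alice": 100, "bob": 100}
    for tx in log:
        balances[tx["from"]] -= tx["amount"]
        balances[tx["to"]] += tx["amount"]
    return balances

# Two independent services replaying the same log agree exactly,
# and total money is conserved (a consistency invariant).
assert replay(log) == replay(log) == {"alice": 70, "bob": 130}
```

The initial balances and record shape here are invented for the example; the point is only that a single totally ordered log plus deterministic replay yields atomicity and cross-service agreement.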
This document provides an overview of Apache Kafka including its main components, architecture, and ecosystem. It describes how LinkedIn used Kafka to solve their data pipeline problem by decoupling systems and allowing for horizontal scaling. The key elements of Kafka are producers that publish data to topics, the Kafka cluster that stores streams of records in a distributed, replicated commit log, and consumers that subscribe to topics. Kafka Connect and the Schema Registry are also introduced as part of the Kafka ecosystem.
Apache Kafka's Common Pitfalls & Intricacies: A Customer Support Perspective, hosted by Confluent
"As Apache Kafka gains widespread adoption, an increasing number of people face its pitfalls. Despite completing courses and reading documentation, many encounter hurdles navigating Kafka's subtle complexities.
Join us for an enlightening session led by the customer support team of Conduktor, where we engage daily with users grappling with Kafka's subtleties. We've observed recurring themes in user queries: What happens when a consumer group rebalances? What is an advertised listener? Why aren't my records displayed in chronological order when I consume them? How does retention work?
For all these questions, the answer is "it depends". In this talk, we aim to demystify these uncertainties by presenting nuanced scenarios for each query. That way you will be more confident about how your Kafka infrastructure works behind the scenes, and you'll be equipped to share this knowledge with your colleagues. By being aware of the most common misconceptions, you should be able to speed up your own learning curve and help others more effectively."
Orchestrating Cloud Applications With TOSCA, by Arthur Berezin
This document summarizes a presentation about orchestrating cloud applications with TOSCA (Topology and Orchestration Specification for Cloud Applications). It introduces TOSCA, which aims to provide cross-cloud portability for describing application topologies, workflows, policies, and orchestration. Key components of a TOSCA topology are described, including node types, relationships, requirements and capabilities, inputs and outputs. The document also discusses Cloudify, an open-source TOSCA orchestrator that can deploy and manage applications using TOSCA blueprints.
One year solving infrastructure management with FusionDirectory and OpenLDAP, ..., by OW2
Today the world of infrastructure is moving: the advent of cloud, infrastructure on demand, and SaaS are innovative concepts requiring a change in our methods. But what about managing these platforms, their security, systems, and users?
The infrastructure is not necessarily internal anymore, so establishing a workflow has become indispensable, as have daily operations by less skilled staff and the delegation of operations.
In this conference we will see, through concrete and detailed cases, how FusionDirectory helps solve these problems day to day thanks to its modularity, its API, and its web services.
Intro to GitOps with Weave GitOps, Flagger and Linkerd, by Weaveworks
This document provides an overview of GitOps, service meshes, Linkerd, Flux, Weave GitOps and progressive delivery. It introduces the speakers and outlines the agenda which includes explanations of GitOps, service meshes, Linkerd and Weave GitOps. It then demonstrates how Weave GitOps and Linkerd can be used together for progressive delivery and provides a Q&A section at the end.
Montreal Kubernetes Meetup: Developer-first workflows (for microservices) on ..., by Ambassador Labs
1. The document discusses developer-first workflows for building and operating microservices on Kubernetes.
2. It recommends creating self-sufficient, autonomous teams and using Kubernetes, Docker, and Envoy to provide the basic infrastructure primitives needed for distributed workflows.
3. The strategies suggested depend on the service maturity level and include using similar development and production environments for prototyping, implementing software redundancy for production services, and defining service level objectives and network observability for internal dependencies.
Ultimate Guide to Microservice Architecture on Kubernetes, by kloia
This document provides an overview of microservice architecture on Kubernetes. It discusses:
1. Benefits of microservice architecture like independent deployability and scalability compared to monolithic applications.
2. Best practices for microservices including RESTful design, distributed configuration, client code generation, and API gateways.
3. Tools for microservices on Kubernetes including Prometheus for monitoring, Elasticsearch (ELK) stack for logging, service meshes, and event sourcing with CQRS.
This document provides an overview of intermediate GIT concepts including merge conflicts, tags, stashes, pull requests, workflows, and hooks. It defines merge conflicts as occurring when multiple developers have edited the same part of a codebase. It describes how tags are used for versioning and stashes for temporarily storing code. Pull requests are discussed as a way to get code reviews and collaborate. Common workflows like forking and GIT flow are presented. Finally, hooks are defined as scripts that run automatically during GIT interactions and examples of client-side and server-side hooks are given.
Video and slides synchronized, mp3 and slide download available at URL http://bit.ly/2lGNybu.
Stefan Krawczyk discusses how his team at StitchFix uses the cloud to enable over 80 data scientists to be productive. He also talks about prototyping ideas, algorithms, and analyses; how they set up and keep schemas in sync between Hive, Presto, Redshift, and Spark; and how they make access easy for their data scientists. Filmed at qconsf.com.
Stefan Krawczyk is Algo Dev Platform Lead at StitchFix, where he’s leading development of the algorithm development platform. He spent formative years at Stanford, LinkedIn, Nextdoor & Idibon, working on everything from growth engineering, product engineering, data engineering, to recommendation systems, NLP, data science and business intelligence.
Troubleshooting and Best Practices with WSO2 Enterprise Integrator, by WSO2
This slide deck discusses how to troubleshoot an issue in WSO2 Enterprise Integrator and follow best practices in order to optimize output and avoid failure.
No production system is complete without a way to monitor it. In software, we define observability as the ability to understand how our system is performing. This talk dives into the capabilities and tools recommended for implementing observability when running K8s in production, the main platform today for deploying and maintaining containers with cloud-native solutions.
We start by introducing the concept of observability in the context of distributed systems such as K8s and the difference with monitoring. We continue by reviewing the observability stack in K8s and the main functionalities. Finally, we will review the tools K8s provides for monitoring and logging, and get metrics from applications and infrastructure.
Between the points to be discussed we can highlight:
-Introducing the concept of observability
-Observability stack in K8s
-Tools and apps for implementing Kubernetes observability
-Integrating Prometheus with OpenMetrics
Topic: Speedtest: Benchmark Your Apache Kafka®️
Abstract: In this session, Mark will talk about running benchmarking utilities for Apache Kafka: how to determine how many MB/sec a cluster can handle, how to set up automated benchmark runs (including the repo), and how to use this to find and optimize client-side producer configuration properties.
NConf project presentation: information about the project, the developers, and the software; explanations of how it works and its features; using it with Nagios; live demo; questions & answers.
Citi Tech Talk: Monitoring and Performance, by Confluent
The objective of the engagement is for Citi to have an understanding of, and a path forward for, monitoring their Confluent Platform, covering:
- Platform Monitoring
- Maintenance and Upgrade
Introduction to Stream Processing Using Kafka Streams, by Confluent
Matías Cascallares, Confluent, Customer Success Architect
Streams is one of the trendy keywords! In this presentation, we will see how to implement stream processing with Kafka Streams, what considerations we need to keep in mind, and take a short tour of ksqlDB as a tool.
https://www.meetup.com/Mexico-Kafka/events/276717476/
The document discusses best practices for taking code to production, including deploying using a 12 factor app methodology. It emphasizes that code should work reproducibly across environments from development to production. Specific practices discussed include using version control, declaring explicit dependencies, separating configuration from code, treating backing services as external resources, and executing apps as stateless processes. The document also covers continuous integration/delivery, testing at various stages, code reviews, and blue-green deployments.
OSMC 2023 | What's new with Grafana Labs's Open Source Observability stack, by ..., NETWAYS
Open source is at the heart of what we do at Grafana Labs, and there is so much happening! The intent of this talk is to update everyone on the latest developments in Grafana, Pyroscope, Faro, Loki, Mimir, Tempo, and more. Everyone has at least heard of Grafana, but maybe some of the other projects mentioned above are new to you? Welcome to this talk 😉 Besides the update on what is new, we will also quickly introduce each project during this talk.
The document discusses the evolution of agile teams from having no tests to implementing behavior-driven development and domain-driven design using Behat acceptance tests. It provides examples of using Behat scenarios to drive the development of a domain model for a messaging system without frameworks or controllers. The benefits are a simple, framework-agnostic domain model that is easy to understand and test and separates business logic from the user interface layers.
Apache Kafka is a distributed streaming platform. It provides a high-throughput distributed messaging system with publish-subscribe capabilities. The document discusses Kafka producers and consumers, Kafka clients in different programming languages, and important configuration settings for Kafka brokers and topics. It also demonstrates sending messages to Kafka topics from a Java producer and consuming messages from the console consumer.
19. @yourtwitterhandle | developer.confluent.io
Topics
● Named container for similar events
○ System contains lots of topics
○ Can duplicate data between topics
● Durable logs of events
○ Append only
○ Can only seek by offset, not indexed
● Events are immutable
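The "append only, seek by offset" behavior can be pictured with a toy in-memory log. This is purely an illustration of the data model, not Kafka's API; the `ToyLog` class name and its methods are invented for the sketch:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a topic partition: an append-only list of immutable events,
// addressed by offset. Real Kafka logs live on broker disks and are
// replicated; this sketch only shows the access pattern.
public class ToyLog {
    private final List<String> events = new ArrayList<>();

    // Appending returns the offset the event was written at.
    public long append(String event) {
        events.add(event);
        return events.size() - 1;
    }

    // Reads are by offset; there is no key-based index into the log.
    public String read(long offset) {
        return events.get((int) offset);
    }

    public static void main(String[] args) {
        ToyLog log = new ToyLog();
        log.append("user_signed_up");
        long offset = log.append("user_clicked");
        System.out.println(log.read(offset)); // prints "user_clicked"
    }
}
```

Note there is no `update` or `delete`: once written, an event stays at its offset, which is what makes replaying a topic from any point possible.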
Brokers
● A computer, instance, or container running the Kafka process
● Manage partitions
● Handle write and read requests
● Manage replication of partitions
● Intentionally very simple
Replication
● Copies of data for fault tolerance
● One lead partition and N-1 followers
● In general, writes and reads happen to the leader
● An invisible process to most developers
● Tunable in the producer
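"Tunable in the producer" refers chiefly to the `acks` setting, which controls how many replicas must confirm a write. A minimal sketch of durability-related producer properties (the broker address is a placeholder; `acks` and `enable.idempotence` are real Kafka producer config keys):

```java
import java.util.Properties;

// Sketch of producer settings that trade latency for durability.
public class ProducerDurabilityProps {
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        // acks=all: the leader waits until the full set of in-sync replicas
        // has the write before acknowledging it to the producer.
        props.put("acks", "all");
        // Idempotence prevents duplicate writes when retries fire across
        // a leader failover.
        props.put("enable.idempotence", "true");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("acks")); // prints "all"
    }
}
```

With `acks=1` only the leader confirms, and `acks=0` waits for nobody; `all` is the usual choice when losing a record is worse than a slower write.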
// Producer setup (KafkaProducerApplication example): load configuration,
// read the target topic name, then build the client and the application.
final Properties props = KafkaProducerApplication.loadProperties(args[0]);
final String topic = props.getProperty("output.topic.name");
final Producer<String, String> producer = new KafkaProducer<>(props);
final KafkaProducerApplication producerApp = new KafkaProducerApplication(producer, topic);
// Consumer setup (KafkaConsumerApplication example): load configuration,
// build the consumer, and wire in a handler that writes values to a file.
final Properties consumerAppProps = KafkaConsumerApplication.loadProperties(args[0]);
final String filePath = consumerAppProps.getProperty("file.path");
final Consumer<String, String> consumer = new KafkaConsumer<>(consumerAppProps);
final ConsumerRecordsHandler<String, String> recordsHandler =
    new FileWritingRecordsHandler(Paths.get(filePath));
final KafkaConsumerApplication consumerApplication =
    new KafkaConsumerApplication(consumer, recordsHandler);
// Poll loop: subscribe, then repeatedly fetch batches and hand them off.
public void runConsume(final Properties consumerProps) {
  try {
    consumer.subscribe(Collections.singletonList(consumerProps.getProperty("input.topic.name")));
    while (keepConsuming) {
      final ConsumerRecords<String, String> consumerRecords =
          consumer.poll(Duration.ofSeconds(1));
      recordsHandler.process(consumerRecords);
    }
  } finally {
    consumer.close();
  }
}
// Handler: collect the values from the batch and append them to the file.
public void process(final ConsumerRecords<String, String> consumerRecords) {
  final List<String> valueList = new ArrayList<>();
  consumerRecords.forEach(record -> valueList.add(record.value()));
  if (!valueList.isEmpty()) {
    try {
      Files.write(path, valueList,
          StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }
}
Consumers
● Client application
● Reads messages from topics
● Connection pooling
● Network protocol
● Horizontally and elastically scalable
● Maintains ordering within partitions at scale
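Horizontal scaling is driven by the `group.id` setting: consumers sharing a group id divide a topic's partitions among themselves, and each partition is read by exactly one member of the group, which is what preserves per-partition ordering at scale. A sketch of typical consumer properties (broker address and group name are placeholders; the keys are real Kafka consumer configs):

```java
import java.util.Properties;

// Sketch of consumer-group configuration for a String-keyed, String-valued topic.
public class ConsumerGroupProps {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        // All consumers that share this group.id split the partitions;
        // adding a consumer triggers a rebalance that redistributes them.
        props.put("group.id", "file-writer-group");       // placeholder name
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("group.id"));
    }
}
```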
Kafka Connect
● Data integration system and ecosystem
● Because some other systems are not Kafka
● External client process; does not run on brokers
● Horizontally scalable
● Fault tolerant
● Declarative
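"Declarative" means a connector is defined by configuration rather than code: you describe the source or sink and Connect runs it. A hedged sketch of what a source-connector config submitted to the Connect REST API looks like, modeled on the FileStreamSource connector that ships with Kafka (the name, file path, and topic here are illustrative):

```json
{
  "name": "file-source-example",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "file": "/tmp/input.txt",
    "topic": "file-lines",
    "tasks.max": "1"
  }
}
```

No producer code is written; Connect instantiates the connector, reads lines from the file, and publishes them to the topic.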
Connectors
● Pluggable software component
● Interfaces to external system and to Kafka
● Also exist as runtime entities
● Source connectors act as producers
● Sink connectors act as consumers
Schema Registry
● Server process external to Kafka brokers
● Maintains a database of schemas
● HA deployment option available
● Consumer/Producer API component
Schema Registry
● Defines schema compatibility rules per topic
● Producer API prevents incompatible messages from being produced
● Consumer API prevents incompatible messages from being consumed
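As a concrete example of a compatibility rule: under BACKWARD compatibility (Schema Registry's default mode), a new field may be added to an Avro schema only if it carries a default, so that consumers on the new schema can still read old records. A hypothetical `Rating` schema after such an evolution (the record and field names are illustrative):

```json
{
  "type": "record",
  "name": "Rating",
  "fields": [
    {"name": "movie_id", "type": "long"},
    {"name": "rating", "type": "double"},
    {"name": "source", "type": "string", "default": "web"}
  ]
}
```

Removing the default from `source`, or changing `rating` to a string, would be rejected at produce time rather than discovered as a deserialization failure downstream.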
Kafka Streams
● Functional Java API
● Filtering, grouping, aggregating, joining, and more
● Scalable, fault-tolerant state management
● Scalable computation based on consumer groups
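The functional style of the Kafka Streams DSL can be previewed with a plain-Java analogy: the sketch below applies filter, group, and aggregate to a finite list using `java.util.stream`. This is not the Kafka Streams API (which works on unbounded, partitioned event streams with managed state); it only shows the shape of the operations:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Plain-Java analogy of the filter/group/aggregate style the Kafka Streams
// DSL applies to unbounded event streams. NOT the Kafka Streams API.
public class StreamsAnalogy {

    // Count click events per page, given events like "click:home" or "view:cart".
    static Map<String, Long> clicksPerPage(List<String> events) {
        return events.stream()
                .filter(e -> e.startsWith("click:"))          // filtering
                .collect(Collectors.groupingBy(               // grouping
                        e -> e.substring("click:".length()),
                        Collectors.counting()));              // aggregating
    }

    public static void main(String[] args) {
        List<String> events = List.of("click:home", "click:cart", "view:home", "click:home");
        System.out.println(clicksPerPage(events)); // counts: home=2, cart=1
    }
}
```

In real Kafka Streams the equivalent pipeline (`stream.filter(...).groupBy(...).count()`) never terminates: the "table" of counts is continuously updated as new events arrive, with its state replicated for fault tolerance.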
ksqlDB
● A database optimized for stream processing
● Runs on its own scalable, fault-tolerant cluster adjacent to the Kafka cluster
● Stream processing programs written in SQL
src > main > ksql > rate-movies.sql

CREATE TABLE rated_movies AS
    SELECT title,
           SUM(rating) / COUNT(rating) AS avg_rating
    FROM ratings
    INNER JOIN movies
        ON ratings.movie_id = movies.movie_id
    GROUP BY title
    EMIT CHANGES;