Apache Kafka Essentials

CONTENTS

∙ Introduction
∙ About Apache Kafka
∙ Kafka Connect
∙ Kafka Streams
∙ Conclusion
∙ Additional Resources
INTRODUCTION

Two trends have emerged in the information technology space. First, the diversity and velocity of the data that an enterprise wants to collect for decision-making continue to grow. Second, there is a growing need for an enterprise to make decisions in real time based on that collected data. For example, financial institutions want to not only detect fraud immediately, but also offer a better banking experience through features like real-time alerting, real-time product recommendations, and more effective customer service.

Apache Kafka is a streaming engine for collecting, caching, and processing high volumes of data in real time. As illustrated in Figure 1, Kafka typically serves as a central real-time hub for data within an enterprise.

The main benefits of Kafka are:

1. High throughput: Each server is capable of handling hundreds of MB of data per second.
2. High availability: Data can be stored redundantly on multiple servers and can survive individual server failures.
3. High scalability: New servers can be added over time to scale out the system.
4. Easy integration with external data sources or data sinks.
APACHE KAFKA

Figure 1: Apache Kafka as a central real-time hub

The key concepts in Kafka are summarized below:

Topic       Defines a logical name for producing and consuming records.
Partition   Defines a non-overlapping subset of records within a topic.
Offset      A unique sequential number assigned to each record within a topic partition.
Record      A record contains a key, a value, a timestamp, and a list of headers.
Broker      Server where records are stored. Multiple brokers can be used to form a cluster.

Figure 2 depicts a topic with two partitions. Partition 0 has 5 records, with offsets from 0 to 4, and partition 1 has 4 records, with offsets from 0 to 3.

QUICKSTART FOR APACHE KAFKA

It's easy to get started with Kafka. The following are the steps to get Kafka running in your environment (a sketch of the corresponding shell commands follows the list):

1. Download the latest Apache Kafka binary distribution from http://kafka.apache.org/downloads and untar it.
2. Start the ZooKeeper server.
3. Start the Kafka broker.
4. Create a topic.
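For reference, below is a minimal sketch of the shell commands behind steps 2-4, assuming the scripts and default configuration files that ship with the binary distribution. The topic name test is reused by the code examples that follow; on older Kafka releases, topics are created with a --zookeeper flag instead of --bootstrap-server.

# 2. Start the ZooKeeper server
> bin/zookeeper-server-start.sh config/zookeeper.properties

# 3. Start the Kafka broker
> bin/kafka-server-start.sh config/server.properties

# 4. Create a topic named "test"
> bin/kafka-topics.sh --create --topic test \
    --bootstrap-server localhost:9092 \
    --partitions 1 --replication-factor 1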
props.put("value.serializer",
Records within a partition are always delivered to the consumer in
"org.apache.kafka.common.serialization. offset order. By saving the offset of the last consumed record from
StringSerializer"); each partition, the consumer can resume from where it left off after a
restart. In the example above, we use the commitSync() API to save
Producer<String, String> producer = new
the offsets explicitly after consuming a batch of records. One can also
save the offsets automatically by setting the property enable.auto.
KafkaProducer<>(props);
commit to true.
producer.send(
A record in Kafka is not removed from the broker immediately after
new ProducerRecord<String, String>("test", "key", it is consumed. Instead, it is retained according to a configured
"value")); retention policy. The following table summarizes the two common
policies:
In the above example, both the key and value are strings, so we are Retention Policy Meaning
using a StringSerializer . It’s possible to customize the serializer The number of hours to keep a record on
log.retention.hours
when types become more complex. the broker.
The maximum size of records retained in
The following code snippet shows how to consume records with log.retention.bytes
a partition.
string key and value in Java.
new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("test"));
while (true) {
consumer.poll(100);
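As noted above, the serializer can be customized when the record types become more complex. Below is a minimal sketch of one way to do this on recent Kafka client versions (2.x or later, where configure() and close() have default implementations). The Payment class, its fields, and the UTF-8 encoding scheme are illustrative assumptions, not part of the original example.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.serialization.Serializer;

// A hypothetical value type, used only for illustration.
class Payment {
    final String accountId;
    final long amountCents;

    Payment(String accountId, long amountCents) {
        this.accountId = accountId;
        this.amountCents = amountCents;
    }
}

// A custom serializer that encodes a Payment as a UTF-8 string.
public class PaymentSerializer implements Serializer<Payment> {
    @Override
    public byte[] serialize(String topic, Payment data) {
        if (data == null) {
            return null;
        }
        String encoded = data.accountId + ":" + data.amountCents;
        return encoded.getBytes(StandardCharsets.UTF_8);
    }
}

To use it, set value.serializer to the fully qualified class name of PaymentSerializer and implement a matching Deserializer on the consumer side.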
KAFKA CONNECT

The following steps show how to run the existing file connectors that ship with Kafka:

1. Prepare some data in a source file:

   > echo -e "hello\nworld" > test.txt

2. Start a file source and a file sink connector:

   > bin/connect-standalone.sh config/connect-standalone.properties \
       config/connect-file-source.properties \
       config/connect-file-sink.properties

3. Verify the data in the destination file:

   > more test.sink.txt
   hello
   world

Each line of the source file is stored in the intermediate Kafka topic connect-test as a JSON record such as:

   {"schema":{"type":"string","optional":false},"payload":"world"}

In the example above, the data in the source file test.txt is first streamed into a Kafka topic connect-test through a file source connector. The records in connect-test are then streamed into the destination file test.sink.txt. If a new line is added to test.txt, it will show up immediately in test.sink.txt. Note that we achieve this by running two connectors without writing any custom code.
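For reference, the file source connector started in step 2 is configured by config/connect-file-source.properties, which is also where the test.txt file name and the connect-test topic come from. The listing below is a sketch of what that file typically contains; the exact contents may differ slightly between Kafka versions.

name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test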
The file source connector can also transform each record before it is written to Kafka:

1. Add the following lines to connect-file-source.properties:

   transforms=MakeMap, InsertSource
   transforms.MakeMap.type=org.apache.kafka.connect.transforms.HoistField$Value
   transforms.MakeMap.field=line
   transforms.InsertSource.type=org.apache.kafka.connect.transforms.InsertField$Value
   transforms.InsertSource.static.field=data_source
   transforms.InsertSource.static.value=test-file-source

2. Restart the file source connector:

   > bin/connect-standalone.sh config/connect-standalone.properties \
       config/connect-file-source.properties

3. Verify the transformed records in the Kafka topic connect-test:

   {"line":"hello","data_source":"test-file-source"}
   {"line":"world","data_source":"test-file-source"}

The MakeMap transformation wraps each input line into a record with a single field named line, and the InsertSource transformation adds a static data_source field identifying where the record came from.
To perform more complex processing than these simple per-record transformations, the records in a Kafka topic can be processed with the Streams API (covered in more detail below).

KAFKA STREAMS

The quickest way to try Kafka Streams is to run the bundled word count example and read its output with the console consumer:

> bin/kafka-run-class.sh org.apache.kafka.streams.examples.wordcount.WordCountDemo

> bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
    --topic streams-wordcount-output \
    --formatter kafka.tools.DefaultMessageFormatter \
    --property print.key=true \
    --property key.deserializer=org.apache.kafka.common.serialization.StringDeserializer \
    --property value.deserializer=org.apache.kafka.common.serialization.LongDeserializer

The most common way of using Kafka Streams is through the Streams DSL, which includes operations such as filtering, joining, grouping, and aggregation. The following code snippet shows the core logic of the word count example written in the Streams DSL:

final Serde<String> stringSerde = Serdes.String();
final Serde<Long> longSerde = Serdes.Long();

StreamsBuilder builder = new StreamsBuilder();

// build a stream from an input topic
KStream<String, String> source = builder.stream(
    "streams-plaintext-input",
    Consumed.with(stringSerde, stringSerde));

KTable<String, Long> counts = source
    .flatMapValues(value -> Arrays.asList(value.toLowerCase().split(" ")))
    .groupBy((key, value) -> value)
    .count();

// convert the output to another topic
counts.toStream().to("streams-wordcount-output",
    Produced.with(stringSerde, longSerde));
KSTREAMS DSL

COMMONLY USED OPERATIONS IN KGROUPEDSTREAM

count()
    Count the number of records in this stream by the grouped key and return it as a KTable.
    Example: kt = kgs.count();
        kgs: ("k1", (("k1", 1), ("k1", 3))) ("k2", (("k2", 2)))
        kt:  ("k1", 2) ("k2", 1)

reduce(Reducer)
    Combine the values of records in this stream by the grouped key and return it as a KTable.
    Example: kt = kgs.reduce((aggValue, newValue) -> aggValue + newValue);
        kgs: ("k1", (("k1", 1), ("k1", 3))) ("k2", (("k2", 2)))
        kt:  ("k1", 4) ("k2", 2)

windowedBy(Windows)
    Further group the records by the timestamp and return it as a TimeWindowedKStream.
    Example: twks = kgs.windowedBy(TimeWindows.of(100));
        kgs: ("k1", (("k1", 1, 100t), ("k1", 3, 150t)))
             ("k2", (("k2", 2, 100t), ("k2", 4, 250t)))   * t indicates a timestamp
        twks: ("k1", 100t -- 200t, (("k1", 1, 100t), ("k1", 3, 150t)))
              ("k2", 100t -- 200t, (("k2", 2, 100t)))
              ("k2", 200t -- 300t, (("k2", 4, 250t)))
QUERYING THE STATES IN KSTREAMS

While processing the data in real time, a KStreams application locally maintains state, such as the word counts in the previous example. That state can be queried interactively through an API described in the Interactive Queries section of the Kafka documentation, which avoids the need for an external data store for exporting and serving it.
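A brief sketch of such an interactive query is shown below. It assumes a recent Kafka Streams version, a running KafkaStreams instance named streams, and that the count() in the word count example was materialized into a named store, e.g. .count(Materialized.as("counts-store")); the store name is an illustrative assumption.

import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

// fetch a read-only view of the locally maintained "counts-store" state store
ReadOnlyKeyValueStore<String, Long> store = streams.store(
    StoreQueryParameters.fromNameAndType(
        "counts-store", QueryableStoreTypes.keyValueStore()));

// current count for the word "hello" (null if this instance holds no such key)
Long helloCount = store.get("hello");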
EXACTLY-ONCE PROCESSING IN KSTREAMS

Failures in the brokers or the clients may introduce duplicates during the processing of records. KStreams provides the capability of processing records exactly once, even under failures. This can be achieved by simply setting the property processing.guarantee to exactly_once in KStreams.
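Concretely, exactly-once processing is enabled with a single configuration property on the Streams application. A minimal sketch is shown below; the application ID and bootstrap server are illustrative, and newer Kafka releases also accept the exactly_once_v2 setting.

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-app");       // assumed name
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // assumed address
// equivalent to processing.guarantee=exactly_once
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);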
EXTENDING APACHE KAFKA

While processing data directly in an application via KStreams is functional, many client applications are looking for a lighter-weight interface to Apache Kafka Streams via low-code environments or continuous queries using SQL-like commands. In many cases, developers looking to leverage continuous query functionality want a low-code environment where stream processing can be dynamically accessed, modified, and scaled in real time. Accessing Apache Kafka data streams via SQL is just one approach to this low-code stream processing, and many commercial vendors (TIBCO Software, Confluent) as well as open-source solutions (Apache Spark) offer SQL access that unlocks Apache Kafka data streams for stream processing.

Apache Kafka is a data distribution platform; it's what you do with the data that is important. Once data is available via Kafka, it can be distributed to many different processing engines, from integration services, event streaming, and AI/ML functions to data analytics.

Figure 4: Leveraging Apache Kafka beyond data distribution

For more information on low-code stream processing options, including SQL access to Apache Kafka, please see the Additional Resources section below.

ONE SIZE DOES NOT FIT ALL

With the increasing popularity of real-time stream processing and the rise of event-driven architectures, a number of alternatives have started to gain traction for real-time data distribution. Apache Kafka is the flavor of choice for distributed, high-volume data streaming; however, many implementations have begun to struggle with building solutions at scale when the application's requirements go beyond a single data center or single location.

So, while Apache Kafka is purpose-built for real-time data distribution and stream processing, it will not fit all the requirements of every enterprise application. Alternatives like Apache Pulsar, Eclipse Mosquitto, and many others may be worth exploring. For more information on comparisons between Apache Kafka and these alternatives, please see the Additional Resources section below.

CONCLUSION

Apache Kafka has become the de-facto standard for high-performance, distributed data streaming. It has a large and growing community of developers, corporations, and applications.

ADDITIONAL RESOURCES

∙ Apache NiFi website
∙ Apache Kafka Mirroring and Replication
∙ Apache Pulsar Vs. Apache Kafka O'Reilly eBook
With over 20 years of experience building, architecting, and designing large-scale messaging infrastructure, William McLane is one of the thought leaders for global data distribution. William and TIBCO have a history of building mission-critical, real-world data distribution architectures that power everything from some of the largest financial services institutions to global-scale transportation and logistics tracking operations. From Pub/Sub, to point-to-point, to real-time data streaming, William has experience designing, building, and leveraging the right tools for building a nervous system that can connect, augment, and unify your enterprise.