Spring Boot – Integrate with Apache Kafka for Streaming
Last Updated: 27 Aug, 2024
Apache Kafka is a widely used distributed streaming platform that enables the development of scalable, fault-tolerant, and high-throughput applications. In this article, we'll walk you through the process of integrating Kafka with a Spring Boot application, providing detailed code examples and explanations along the way. By the end of this article, you'll gain a strong understanding of how to seamlessly incorporate Kafka into your Spring Boot projects, enhancing your application's performance and capabilities.
What are Kafka Streams?
Kafka Streams is a client library built on top of Apache Kafka. It enables the processing of unbounded streams of events in a declarative manner. Streaming data examples include stock market prices, system logs, or the number of users on a website at any given moment.
Kafka Streams builds on the close relationship between streams and tables: a stream of events can be viewed as the changelog of a continuously updated table, and a table as a snapshot of a stream. It supports operations such as joins, grouping, aggregation, and filtering on streaming events. A key concept in Kafka Streams is the processor topology, which describes the operations performed on one or more event streams. The topology is a directed acyclic graph (DAG) whose nodes fall into three categories:
- Source Nodes: Consume one or more Kafka topics and forward the records to successor nodes.
- Processor Nodes: Receive records from upstream nodes, process them, and optionally forward new records to downstream nodes. For instance, a source node named "Source" is added to the topology with the addSource method, and a processor node named "Process" with predefined logic is added using the addProcessor method.
- Sink Nodes: Receive records from upstream nodes and write them to a Kafka topic.
The topology is constructed as an acyclic graph, which is then passed to a KafkaStreams instance for consuming, processing, and producing records.
Processor API
The Processor API offers flexibility for defining and connecting custom processors to the processing topology. It allows the creation of stream processors that handle one record at a time and supports both stateless and stateful operations. Stateful operations connect stream processors to state stores.
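To make the source, processor, and sink roles concrete, here is a minimal, illustrative sketch of a topology built directly with the Processor API. It is not part of the original example: the topic names ("input-topic", "output-topic") and the UppercaseProcessor are assumptions chosen only to show how addSource, addProcessor, and addSink fit together.
Java
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

public class UppercaseTopology {

    // Stateless processor: upper-cases each record value and forwards it downstream
    static class UppercaseProcessor implements Processor<String, String, String, String> {
        private ProcessorContext<String, String> context;

        @Override
        public void init(ProcessorContext<String, String> context) {
            this.context = context;
        }

        @Override
        public void process(Record<String, String> record) {
            context.forward(record.withValue(record.value().toUpperCase()));
        }
    }

    public static Topology build() {
        Topology topology = new Topology();
        // Source node "Source" consumes from the (hypothetical) input-topic
        topology.addSource("Source", "input-topic");
        // Processor node "Process" applies the custom logic and forwards new records
        topology.addProcessor("Process", UppercaseProcessor::new, "Source");
        // Sink node "Sink" writes the processed records to output-topic
        topology.addSink("Sink", "output-topic", "Process");
        return topology;
    }
}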
Dependencies
Dependencies are libraries that provide specific functionalities for use in your application. In Spring Boot, dependency management and auto-configuration work together seamlessly. To integrate Kafka Streams with Spring Boot, add the following dependencies in your pom.xml
file:
<dependency>
<groupId>org.springframework.kafka</groupId>
<artifactId>spring-kafka</artifactId>
<version>3.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>3.6.1</version>
</dependency>
Configuration
Spring Boot's auto-configuration feature automatically configures your Spring application based on the jar dependencies you've included. Now, let's define the Kafka Streams configuration in a Java configuration class:
- @Configuration: Part of the Spring core framework; it indicates that the class declares one or more @Bean definition methods.
- @EnableKafka: Enables detection of @KafkaListener annotations; the KafkaListenerContainerFactory it relies on is responsible for creating the listener container for a specific endpoint.
- @EnableKafkaStreams: Enables Kafka Streams auto-configuration support in Spring Boot.
Java
@Configuration
@EnableKafka
@EnableKafkaStreams
public class KafkaConfig {
@Value(value = "${spring.kafka.bootstrap-servers}")
private String bootstrapAddress;
@Bean(name = KafkaStreamsDefaultConfiguration.DEFAULT_STREAMS_CONFIG_BEAN_NAME)
KafkaStreamsConfiguration kStreamsConfig() {
Map<String, Object> props = new HashMap<>();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
return new KafkaStreamsConfiguration(props);
}
// other config
}
Basic Kafka Producer and Consumer
Before diving into Kafka Streams, it’s essential to understand how to integrate basic Kafka producers and consumers with Spring Boot.
Producer Configuration
To configure a Kafka producer in Spring Boot, you can either set the bootstrap servers and serializer properties in the application.yml file and rely on auto-configuration, or define the ProducerFactory and KafkaTemplate beans yourself in a Java configuration class, as shown below. In either case, make sure the spring-kafka dependency is present in your pom.xml file.
Java
@Configuration
public class KafkaProducerConfig {
@Bean
public ProducerFactory<String, String> producerFactory() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
return new DefaultKafkaProducerFactory<>(configProps);
}
@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
return new KafkaTemplate<>(producerFactory());
}
}
Consumer Configuration
Similarly, to configure a Kafka consumer in Spring Boot, you can specify the bootstrap-servers, key-deserializer, and value-deserializer settings in the application.yml file, or define the beans in a Java configuration class as shown below. The consumer configuration typically also includes a group-id. Make sure the required dependencies are in your pom.xml file.
Code Example:
Java
@Configuration
@EnableKafka
public class KafkaConsumerConfig {
@Bean
public ConsumerFactory<String, String> consumerFactory() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
configProps.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
configProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
configProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
return new DefaultKafkaConsumerFactory<>(configProps);
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
ConcurrentKafkaListenerContainerFactory<String, String> factory =
new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
return factory;
}
}
Producer and Consumer Example
To demonstrate Kafka producer and consumer functionality, first configure a Kafka producer to send messages to a Kafka topic. Then, configure a Kafka consumer using a Kafka listener.
- This involves setting up a producer bean and using its 'send' method to publish messages.
- Next, you configure a Kafka consumer using a Kafka listener with the @KafkaListener annotation, which allows the application to automatically receive and process messages from the Kafka topic to which the producer is sending data.
- This setup illustrates a complete cycle where the producer sends data to a topic, and the consumer processes that data as it arrives.
Code Example:
Java
@Service
public class KafkaProducerService {
private final KafkaTemplate<String, String> kafkaTemplate;
@Autowired
public KafkaProducerService(KafkaTemplate<String, String> kafkaTemplate) {
this.kafkaTemplate = kafkaTemplate;
}
public void sendMessage(String message) {
kafkaTemplate.send("my_topic", message);
}
}
@Service
public class KafkaConsumerService {
@KafkaListener(topics = "my_topic", groupId = "group_id")
public void consume(String message) {
System.out.println("Consumed message: " + message);
}
}
Output:
KafkaProducerService: Calling sendMessage("Hello Kafka") publishes the message to my_topic. Note that the service as written does not print anything itself.
KafkaConsumerService: When the Kafka consumer receives the message "Hello Kafka" from the topic, the console shows:
Consumed message: Hello Kafka
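If you also want a "Message sent" confirmation from the producer, one option (an illustrative variation, not part of the original service) is to attach a callback to the CompletableFuture that KafkaTemplate.send returns in spring-kafka 3.x, for example by rewriting sendMessage as:
Java
public void sendMessage(String message) {
    kafkaTemplate.send("my_topic", message)
        .whenComplete((result, ex) -> {
            if (ex == null) {
                // Logged once the broker has acknowledged the record
                System.out.println("Message sent: " + message);
            } else {
                System.err.println("Failed to send message: " + ex.getMessage());
            }
        });
}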
Testing Kafka Integration
You can test the producer and consumer setup by using a REST controller to trigger the producer and view the output from the consumer in the console.
Code Example:
Java
@RestController
public class KafkaController {
private final KafkaProducerService kafkaProducerService;
@Autowired
public KafkaController(KafkaProducerService kafkaProducerService) {
this.kafkaProducerService = kafkaProducerService;
}
@GetMapping("/send/{message}")
public String sendMessage(@PathVariable String message) {
kafkaProducerService.sendMessage(message);
return "Message sent successfully!";
}
}
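With the application running locally (assuming the default server port 8080 and a Kafka broker on localhost:9092), opening http://localhost:8080/send/HelloKafka in a browser or any HTTP client should return "Message sent successfully!", and shortly afterwards the application console should print "Consumed message: HelloKafka".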
Building a Topology
Now that we have set up the configuration, let’s build the topology for our application to keep a count of the words from input messages:
- @Component: An annotation that allows Spring to automatically detect and manage custom beans in the application context.
- @Autowired: The Spring framework facilitates automatic dependency injection; by declaring bean dependencies, you let the Spring container automatically wire the relationships between collaborating beans.
Java
@Component
public class WordCountProcessor {
private static final Serde<String> STRING_SERDE = Serdes.String();
@Autowired
void buildPipeline(StreamsBuilder streamsBuilder) {
KStream<String, String> messageStream = streamsBuilder
.stream("input-topic", Consumed.with(STRING_SERDE, STRING_SERDE));
KTable<String, Long> wordCounts = messageStream
.mapValues((ValueMapper<String, String>) String::toLowerCase)
.flatMapValues(value -> Arrays.asList(value.split("\\W+")))
.groupBy((key, word) -> word, Grouped.with(STRING_SERDE, STRING_SERDE))
                // Materialize the counts in a queryable state store named "counts",
                // which the REST controller below uses to look up individual words
                .count(Materialized.as("counts"));

        // Counts are Longs, so write them out with an explicit Long value serde
        wordCounts.toStream().to("output-topic", Produced.with(STRING_SERDE, Serdes.Long()));
}
}
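As a side note (not part of the original article), the topology can be sanity-checked without a running broker using Kafka Streams' TopologyTestDriver. The sketch below assumes the kafka-streams-test-utils artifact is on the classpath and that the WordCountProcessor class above is accessible from where the check runs; the topic names match the ones used in the pipeline.
Java
import java.util.Properties;

import org.apache.kafka.common.serialization.LongDeserializer;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.TopologyTestDriver;

public class WordCountTopologyCheck {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-app-test");
        // The test driver never contacts a broker, so a dummy address is fine
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        // Reuse the pipeline defined in WordCountProcessor
        // (assumes the class is accessible, e.g. this check lives in the same package)
        StreamsBuilder builder = new StreamsBuilder();
        new WordCountProcessor().buildPipeline(builder);

        try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
            TestInputTopic<String, String> input =
                    driver.createInputTopic("input-topic", new StringSerializer(), new StringSerializer());
            TestOutputTopic<String, Long> output =
                    driver.createOutputTopic("output-topic", new StringDeserializer(), new LongDeserializer());

            input.pipeInput("key", "hello kafka hello streams");

            // readKeyValuesToMap keeps the latest count per word; "hello" should map to 2
            output.readKeyValuesToMap().forEach((word, count) ->
                    System.out.println(word + " -> " + count));
        }
    }
}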
Creating REST Endpoints
After defining our pipeline with the declarative steps, create the REST controller. This provides the endpoints to POST messages to the input topic and GET the counts for a specified word; the GET endpoint is shown below, and a possible POST endpoint is sketched after it.
- @GetMapping: The @GetMapping annotation in Spring is a powerful tool for building RESTful web services. It maps HTTP GET requests to specific handler methods in Spring controllers, making it easy to define the endpoints of a RESTful API.
- @PathVariable: The @PathVariable annotation handles template variables in the request URI mapping.
Java
@RestController
public class WordCountController {
@Autowired
    private StreamsBuilderFactoryBean factoryBean; // Spring Kafka bean that manages the KafkaStreams instance
@GetMapping("/count/{word}")
public Long getWordCount(@PathVariable String word) {
KafkaStreams kafkaStreams = factoryBean.getKafkaStreams();
ReadOnlyKeyValueStore<String, Long> counts = kafkaStreams.store(
StoreQueryParameters.fromNameAndType("counts", QueryableStoreTypes.keyValueStore())
);
return counts.get(word);
}
}
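The controller above covers the GET side. For posting messages to the input topic, one possibility (an illustrative sketch, not shown in the original article) is a small companion endpoint that reuses the KafkaTemplate bean configured earlier; the /message path is an assumption.
Java
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MessageController {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public MessageController(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    @PostMapping("/message")
    public void publish(@RequestBody String message) {
        // Messages posted here feed the word-count topology via input-topic
        kafkaTemplate.send("input-topic", message);
    }
}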
Conclusion
This article walked you through how to integrate Apache Kafka for streaming in a Spring Boot application. We explored the basics of Kafka producers and consumers, configured Kafka Streams, built a simple topology, and set up REST endpoints for interacting with the stream data. By following these steps, you can enhance your Spring Boot applications with the robust capabilities of Apache Kafka for handling streaming data.