Kafka is a distributed messaging system that lets applications publish and subscribe to streams of records organized into topics. Producers write data to topics and consumers read from topics. The data is partitioned and replicated across a cluster of machines called brokers for reliability and scalability. A common data format such as Avro can be used to serialize the data.
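To make the producer/consumer model above concrete, here is a minimal Java producer sketch. The broker address, topic name, and record contents are assumptions for illustration, and a setup using Avro would swap the string serializer for an Avro serializer backed by a schema registry.

```java
// Minimal producer sketch; localhost:9092 and the "orders" topic are assumptions.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all"); // wait for all in-sync replicas to acknowledge

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The key determines the partition; records with the same key stay ordered.
            producer.send(new ProducerRecord<>("orders", "order-42", "{\"amount\": 99.5}"));
            producer.flush();
        }
    }
}
```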
Kafka Tutorial - Introduction to Apache Kafka (Part 1), by Jean-Paul Azar
Why is Kafka so fast? Why is Kafka so popular? Why Kafka? This slide deck is a tutorial for the Kafka streaming platform. It covers Kafka architecture with some small command-line examples, then expands into a multi-server example to demonstrate failover of brokers as well as consumers. It then walks through simple Java client examples for a Kafka producer and a Kafka consumer. The Kafka design section has been expanded and references added. The tutorial also covers Avro, the Schema Registry, and advanced Kafka producers.
Exactly-Once Financial Data Processing at Scale with Flink and Pinot, by Flink Forward
Flink Forward San Francisco 2022.
At Stripe we have created a complete end-to-end exactly-once processing pipeline to process financial data at scale, by combining the exactly-once power of Flink, Kafka, and Pinot. The pipeline provides an exactly-once guarantee, end-to-end latency within a minute, deduplication against hundreds of billions of keys, and sub-second query latency against the whole dataset of trillions of rows. In this session we will discuss the technical challenges of designing, optimizing, and operating the whole pipeline, including Flink, Kafka, and Pinot. We will also share our lessons learned and the benefits gained from exactly-once processing.
by Xiang Zhang, Pratyush Sharma & Xiaoman Dong
In the last few years, Apache Kafka has been used extensively in enterprises for real-time data collection, delivery, and processing. In this presentation, Jun Rao, Co-founder, Confluent, gives a deep dive into some of the key internals that help make Kafka popular.
- Companies like LinkedIn are now sending more than 1 trillion messages per day to Kafka. Learn about the underlying design in Kafka that leads to such high throughput.
- Many companies (e.g., financial institutions) are now storing mission critical data in Kafka. Learn how Kafka supports high availability and durability through its built-in replication mechanism.
- One common use case of Kafka is for propagating updatable database records. Learn how a unique feature called compaction in Apache Kafka is designed to solve this kind of problem more naturally.
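To illustrate the compaction point above, the following hedged sketch uses the Kafka AdminClient to create a compacted topic, so that only the latest record per key is retained; the topic name, partition count, and replication factor are illustrative assumptions.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("customer-profiles", 6, (short) 3)
                    .configs(Map.of(
                            // Keep only the most recent value per key instead of deleting by age.
                            TopicConfig.CLEANUP_POLICY_CONFIG, TopicConfig.CLEANUP_POLICY_COMPACT));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```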
Kafka is an open-source distributed commit log service that provides high-throughput messaging functionality. It is designed to handle large volumes of data and different use cases, such as online and offline processing, more efficiently than alternatives like RabbitMQ. Kafka works by splitting topics into partitions spread across a cluster of machines and replicating those partitions for fault tolerance. It can be used as a central data hub or pipeline for collecting, transforming, and streaming data between systems and applications.
This is a brief introduction to Apache Kafka Connect, covering its benefits, use cases, and the motivation behind building Kafka Connect, along with a short discussion of its architecture.
ksqlDB is a streaming SQL engine that enables stream processing on top of Apache Kafka. ksqlDB is built on Kafka Streams and provides capabilities for consuming messages from Kafka, analysing these messages in near real time with a SQL-like language, and producing results back to a Kafka topic. Not a single line of Java code has to be written, and you can reuse your SQL know-how. This significantly lowers the bar for getting started with stream processing.
ksqlDB offers powerful stream processing capabilities, such as joins, aggregations, time windows, and support for event time. In this talk I will present how ksqlDB integrates with the Kafka ecosystem and demonstrate how easy it is to implement a solution using ksqlDB for the most part. This will be done in a live demo on a fictitious IoT sample.
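As a rough illustration of the SQL-only workflow described above, the sketch below submits two statements through the ksqlDB Java client; the host/port, stream, table, and topic names are assumptions, and the same SQL could be pasted into the ksqlDB CLI instead.

```java
import io.confluent.ksql.api.client.Client;
import io.confluent.ksql.api.client.ClientOptions;

public class KsqlDbSketch {
    public static void main(String[] args) throws Exception {
        ClientOptions options = ClientOptions.create()
                .setHost("localhost")   // assumed ksqlDB server host
                .setPort(8088);         // default ksqlDB port
        Client client = Client.create(options);

        // Declare a stream over an existing Kafka topic (names are assumptions).
        client.executeStatement(
            "CREATE STREAM sensor_readings (device_id VARCHAR KEY, temperature DOUBLE) "
            + "WITH (KAFKA_TOPIC='iot-readings', VALUE_FORMAT='JSON');").get();

        // A continuously updated windowed aggregate, written back to Kafka.
        client.executeStatement(
            "CREATE TABLE avg_temperature AS "
            + "SELECT device_id, AVG(temperature) AS avg_temp "
            + "FROM sensor_readings WINDOW TUMBLING (SIZE 1 MINUTE) "
            + "GROUP BY device_id EMIT CHANGES;").get();

        client.close();
    }
}
```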
Full recorded presentation at https://www.youtube.com/watch?v=2UfAgCSKPZo for Tetrate Tech Talks on 2022/05/13.
Envoy's support for the Kafka protocol, in the form of the broker filter and the mesh filter.
Contents:
- overview of Kafka (use cases, partitioning, producer/consumer, protocol);
- proxying Kafka (non-Envoy specific);
- proxying Kafka with Envoy;
- handling Kafka protocol in Envoy;
- Kafka-broker-filter for per-connection proxying;
- Kafka-mesh-filter to provide front proxy for multiple Kafka clusters.
References:
- https://adam-kotwasinski.medium.com/deploying-envoy-and-kafka-8aa7513ec0a0
- https://adam-kotwasinski.medium.com/kafka-mesh-filter-in-envoy-a70b3aefcdef
Kafka Streams State Stores Being Persistent, by Confluent
This document discusses Kafka Streams state stores. It provides examples of using different types of windowing (tumbling, hopping, sliding, session) with state stores. It also covers configuring state store logging, caching, and retention policies. The document demonstrates how to define windowed state stores in Kafka Streams applications and discusses concepts like grace periods.
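As a hedged example of the windowing and grace-period concepts mentioned above, the following Kafka Streams snippet counts records in five-minute tumbling windows backed by a named windowed state store; topic and store names are assumptions, and `ofSizeAndGrace` assumes a Kafka 3.x client (older releases use `TimeWindows.of(...).grace(...)`).

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.state.WindowStore;

public class WindowedStoreSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("page-views", Consumed.with(Serdes.String(), Serdes.String()))
               .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
               // 5-minute tumbling windows; records arriving within the 1-minute grace
               // period still update their window, later ones are dropped.
               .windowedBy(TimeWindows.ofSizeAndGrace(Duration.ofMinutes(5), Duration.ofMinutes(1)))
               .count(Materialized.<String, Long, WindowStore<Bytes, byte[]>>as("page-view-counts"));

        // builder.build() yields the topology to pass to a KafkaStreams instance.
    }
}
```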
Everything You Always Wanted to Know About Kafka’s Rebalance Protocol but Wer..., by Confluent
Apache Kafka is a scalable streaming platform with built-in dynamic client scaling. The elastic scale-in/scale-out feature leverages Kafka’s “rebalance protocol”, which was designed in the 0.9 release and has been improved ever since. The original design aims at on-prem deployments of stateless clients. However, it does not always align with modern deployment tools like Kubernetes and stateful stream processing clients such as Kafka Streams. Those shortcomings led to two major recent improvement proposals, namely static group membership and incremental rebalancing (which will hopefully be available in version 2.3). This talk provides a deep dive into the details of the rebalance protocol, starting from its original design in version 0.9 up to the latest improvements and future work. We discuss internal technical details, pros and cons of the existing approaches, and explain how to configure your client correctly for your use case. Additionally, we discuss configuration tradeoffs for stateless, stateful, on-prem, and containerized deployments.
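The following hedged configuration sketch shows the two client-side knobs most associated with these improvements: static group membership (via a stable group.instance.id) and cooperative, incremental rebalancing; the group id, instance id, and timeout value are illustrative assumptions.

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;

public class RebalanceConfigSketch {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-service");

        // Static group membership: a restart with the same instance id does not
        // trigger a rebalance, as long as the member returns within the session timeout.
        props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "orders-service-pod-0");
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "60000");

        // Incremental cooperative rebalancing instead of stop-the-world rebalances.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                  CooperativeStickyAssignor.class.getName());
        return props;
    }
}
```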
A brief introduction to Apache Kafka and its use as a platform for streaming data. It introduces some of the newer components of Kafka that help make this possible, including Kafka Connect, a framework for capturing continuous data streams, and Kafka Streams, a lightweight stream processing library.
Building distributed systems is challenging. Luckily, Apache Kafka provides a powerful toolkit for putting together big services as a set of scalable, decoupled components. In this talk, I'll describe some of the design tradeoffs when building microservices, and how Kafka's powerful abstractions can help. I'll also talk a little bit about what the community has been up to with Kafka Streams, Kafka Connect, and exactly-once semantics.
Presentation by Colin McCabe, Confluent, Big Data Day LA
Bringing Kafka Without Zookeeper Into Production with Colin McCabe | Kafka Su..., by HostedbyConfluent
The document discusses bringing Apache Kafka clusters into production without using ZooKeeper for coordination and metadata storage. It describes how Kafka uses ZooKeeper currently and the problems with this approach. It then introduces KRaft, which replaces ZooKeeper by using Raft consensus to replicate cluster metadata within Kafka. The key aspects of deploying, operating and troubleshooting KRaft-based Kafka clusters are covered, including formatting storage, controller setup, rolling upgrades, and examining the replicated metadata log.
The document provides an introduction and overview of Apache Kafka presented by Jeff Holoman. It begins with an agenda and background on the presenter. It then covers basic Kafka concepts like topics, partitions, producers, consumers and consumer groups. It discusses efficiency and delivery guarantees. Finally, it presents some use cases for Kafka and positioning around when it may or may not be a good fit compared to other technologies.
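To complement the producer example earlier, here is a minimal consumer-group sketch of the kind such an introduction typically walks through; the broker address, group id, and topic name are assumptions for illustration.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "billing-service"); // consumers sharing a group id split the partitions
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```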
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K..., by HostedbyConfluent
"Apache Flink’s Exactly-Once Semantics (EOS) integration for writing to Apache Kafka has several pitfalls, due mostly to the fact that the Kafka transaction protocol was not originally designed with distributed transactions in mind. The integration uses Java reflection hacks as a workaround, and the solution can still result in data loss under certain scenarios. Can we do better?
In this session, you’ll see how the Flink and Kafka communities are uniting to tackle these long-standing technical debts. We’ll introduce the basics of how Flink achieves EOS with external systems and explore the common hurdles that are encountered when implementing distributed transactions. Then we’ll dive into the details of the proposed changes to both the Kafka transaction protocol and Flink transaction coordination that seek to provide a more robust integration.
By the end of the talk, you’ll know the unique challenges of EOS with Flink and Kafka and the improvements you can expect across both projects."
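For context, the Kafka transaction protocol referenced in this abstract is exposed to plain Java clients through the transactional producer API; the hedged sketch below writes to two topics atomically, with the topic names and transactional id as illustrative assumptions.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TransactionalProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Enables idempotence and ties all writes to one transactional identity.
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "payments-writer-1");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            try {
                producer.beginTransaction();
                producer.send(new ProducerRecord<>("payments", "p-1", "captured"));
                producer.send(new ProducerRecord<>("payment-audit", "p-1", "captured"));
                // Both records become visible to read_committed consumers atomically.
                producer.commitTransaction();
            } catch (Exception e) {
                // Simplified error handling: a real client distinguishes fatal producer errors.
                producer.abortTransaction();
                throw new RuntimeException(e);
            }
        }
    }
}
```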
Managing Microservices With The Istio Service Mesh on Kubernetes, by Iftach Schonbaum
Istio is an open source service mesh that provides traffic management, service identity and security, observability and policy enforcement capabilities for microservices. At its core, Istio uses the Envoy proxy as a sidecar for each microservice to mediate all inbound and outbound traffic. It provides features like load balancing, failure recovery, metrics and monitoring out of the box without requiring any code changes to the application. Istio's control plane components like Pilot, Mixer and Citadel manage the proxy configuration, collect telemetry, and handle authentication tasks.
Performance Analysis and Optimizations for Kafka Streams Applications (Guozha..., by Confluent
High-speed, low-footprint data stream processing is in high demand for Kafka Streams applications. However, how to write an efficient streaming application using the Streams DSL is a question many users have asked, since it requires some deep knowledge of Kafka Streams internals. In this talk, I will show how to analyze your Kafka Streams applications, find performance bottlenecks and unnecessary storage costs, and optimize your application code accordingly using the Streams DSL.
In addition, I will talk about the new optimization framework that we have been developing inside Kafka Streams since the 2.1 release, which replaced the in-place translation of the Streams DSL with a comprehensive process composed of topology compilation and rewriting phases, with a focus on reducing the various storage footprints of Streams applications, such as state stores and internal topics.
This talk is aimed at developers who are interested in writing their first or second Streams application using the Streams DSL, or those who have already launched several Kafka Streams services in production but want to further optimize them, giving a better understanding of how the different Streams operators in the DSL are translated into the streams topology at runtime. By taking a deep dive into the newly added optimization framework for the Streams DSL, the audience will gain more insight into the kinds of optimization opportunities that are possible for Kafka Streams, and which of them the library is tackling right now and in the near future.
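As a small companion to this abstract, the hedged sketch below enables the Streams DSL optimization framework through configuration; the application id, bootstrap servers, and topics are assumptions, and the constant name is `TOPOLOGY_OPTIMIZATION_CONFIG` in recent releases (older ones use `TOPOLOGY_OPTIMIZATION`).

```java
import java.util.Properties;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class OptimizedTopologySketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "clickstream-enricher");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Ask Streams to rewrite the logical plan (e.g. reuse repartition topics,
        // merge redundant state stores) before building the physical topology.
        props.put(StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG, StreamsConfig.OPTIMIZE);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("clicks-input").to("clicks-output"); // placeholder DSL logic

        // Passing the same properties to build() lets the optimizer see the flag.
        KafkaStreams streams = new KafkaStreams(builder.build(props), props);
        streams.start();
    }
}
```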
Best Practices for Middleware and Integration Architecture Modernization with..., by Claus Ibsen
This document discusses best practices for middleware and integration architecture modernization using Apache Camel. It provides an overview of Apache Camel, including what it is, how it works through routes, and the different Camel projects. It then covers trends in integration architecture like microservices, cloud native, and serverless. Key aspects of Camel K and Camel Quarkus are summarized. The document concludes with a brief discussion of the Camel Kafka Connector and pointers to additional resources.
The document discusses intra-cluster replication in Apache Kafka, including its architecture, in which partitions are replicated across brokers for high availability. Kafka uses a leader and in-sync replica (ISR) approach to provide strongly consistent replication while tolerating failures. Performance considerations in Kafka replication include latency and durability tradeoffs for producers and optimizing throughput for consumers.
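The producer-side latency/durability tradeoff mentioned above is largely a configuration choice; the hedged sketch below contrasts a durable profile with a lower-latency one, with the broker address as an illustrative assumption (min.insync.replicas, which pairs with acks=all, is set on the topic or broker rather than the client).

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class ReplicationTradeoffs {
    public static Properties durableProducerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Wait for the leader and all in-sync replicas before acknowledging:
        // highest durability, highest latency.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // Idempotence avoids duplicates when acknowledgements are lost and retried.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        return props;
    }

    public static Properties lowLatencyProducerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Acknowledge as soon as the leader has the record: lower latency,
        // but data can be lost if the leader fails before followers catch up.
        props.put(ProducerConfig.ACKS_CONFIG, "1");
        return props;
    }
}
```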
Azure Event Hubs - Behind the Scenes With Kasun Indrasiri | Current 2022, by HostedbyConfluent
Azure Event Hubs is a hyperscale PaaS event stream broker with protocol support for HTTP, AMQP, and Apache Kafka RPC that accepts and forwards several trillion (!) events per day and is available in all global Azure regions. This session is a look behind the curtain where we dive deep into the architecture of Event Hubs and look at the Event Hubs cluster model, resource isolation, and storage strategies and also review some performance figures.
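Because Event Hubs exposes a Kafka-compatible endpoint, a standard Kafka client can typically connect with configuration changes only; the hedged sketch below shows the usual SASL settings, with the namespace name and connection-string placeholder as assumptions.

```java
import java.util.Properties;

public class EventHubsKafkaConfig {
    public static Properties props() {
        Properties props = new Properties();
        // Event Hubs exposes its Kafka endpoint on port 9093 of the namespace host (assumed name).
        props.put("bootstrap.servers", "my-namespace.servicebus.windows.net:9093");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        // The connection string from the Azure portal is used as the SASL password.
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            + "username=\"$ConnectionString\" "
            + "password=\"<event-hubs-connection-string>\";");
        return props;
    }
}
```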
Aljoscha Krettek is the PMC chair of Apache Flink and Apache Beam, and co-founder of data Artisans. Apache Flink is an open-source platform for distributed stream and batch data processing. It allows for stateful computations over data streams in real-time and historically. Flink supports batch and stream processing using APIs like DataSet and DataStream. Data Artisans originated Flink and provides an application platform powered by Flink and Kubernetes for building stateful stream processing applications.
Change Data Streaming Patterns for Microservices With Debezium, by Confluent
(Gunnar Morling, Red Hat) Kafka Summit SF 2018
Debezium (noun | de·be·zi·um | /dɪ:ˈbɪ:ziːəm/): the secret sauce for change data capture (CDC), streaming changes from your datastore and enabling you to solve multiple challenges: synchronizing data between microservices, gradually extracting microservices from existing monoliths, maintaining different read models in CQRS-style architectures, updating caches and full-text indexes, and feeding operational data to your analytics tools.
Join this session to learn what CDC is about, how it can be implemented using Debezium, an open source CDC solution based on Apache Kafka, and how it can be utilized for your microservices. Find out how Debezium captures all the changes from datastores such as MySQL, PostgreSQL and MongoDB, how to react to the change events in near real time, and how Debezium is designed to not compromise on data correctness and completeness even if things go wrong. In a live demo we’ll show how to set up a change data stream out of your application’s database without any code changes needed. You’ll see how to sink the change events into other databases and how to push data changes to your clients using WebSockets.
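As a hedged illustration of running CDC without a full Connect cluster, the sketch below uses Debezium's embedded engine API to stream MySQL changes into a callback inside your own JVM; the connector properties, host names, and credentials are assumptions, and the schema-history settings required by the MySQL connector (whose keys differ between Debezium versions) are deliberately omitted.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

public class EmbeddedCdcSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("name", "inventory-engine");
        props.put("connector.class", "io.debezium.connector.mysql.MySqlConnector");
        props.put("database.hostname", "mysql");          // assumed host
        props.put("database.port", "3306");
        props.put("database.user", "debezium");           // assumed credentials
        props.put("database.password", "secret");
        props.put("database.server.id", "184054");
        props.put("topic.prefix", "inventory");           // key name varies by Debezium version
        props.put("table.include.list", "inventory.orders");
        // File-based offset storage so the engine can resume where it left off.
        props.put("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.put("offset.storage.file.filename", "/tmp/debezium-offsets.dat");
        // Note: schema-history storage settings are also required for MySQL and omitted here.

        DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
                .using(props)
                .notifying(event -> System.out.println(event.value())) // one JSON event per row change
                .build();

        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(engine); // runs until engine.close() is called
    }
}
```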
Apache Kafka has evolved from an enterprise messaging system into a fully distributed streaming data platform (Kafka Core + Kafka Connect + Kafka Streams) for building streaming data pipelines and streaming data applications.
This talk, which I gave at the Chicago Java Users Group (CJUG) on June 8th, 2017, focuses mainly on Kafka Streams, a lightweight open source Java library for building stream processing applications on top of Kafka, using Kafka topics as input/output.
You will learn more about the following:
1. Apache Kafka: a Streaming Data Platform
2. Overview of Kafka Streams: Before Kafka Streams? What is Kafka Streams? Why Kafka Streams? What are Kafka Streams key concepts? Kafka Streams APIs and code examples?
3. Writing, deploying and running your first Kafka Streams application
4. Code and Demo of an end-to-end Kafka-based Streaming Data Application
5. Where to go from here?
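In the spirit of item 3 above, here is a minimal first Kafka Streams application: a word count that reads from one topic and writes the running counts to another. The topic names and application id are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class WordCountApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> lines = builder.stream("text-input");

        KTable<String, Long> counts = lines
                .flatMapValues(line -> Arrays.asList(line.toLowerCase(Locale.ROOT).split("\\W+")))
                .groupBy((key, word) -> word, Grouped.with(Serdes.String(), Serdes.String()))
                .count();

        counts.toStream().to("word-counts", Produced.with(Serdes.String(), Serdes.Long()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```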
Meetup tools for-cloud_native_apps_meetup20180510-vs, by minseok kim
When a system is built out of microservices, the coupling between services decreases, so release velocity goes up and you can respond more flexibly, but the number of things to manage grows and many operational difficulties appear. It is not easy to let other microservices reliably discover microservices that are created and destroyed on every deployment, or to manage the configuration of all those microservices consistently. Tools such as the Spring Cloud project and cloud platforms such as Pivotal Cloud Foundry exist to solve these problems. In this meetup we will look at the difficulties of operating microservices and the various tools that help.
Preparing Applications for OpenShift, Cloud Native, by rockplace
What is Cloud-native - DevOps, MSA and Cloud-native: Preparing Applications for OpenShift, Cloud Native
Webinar replay video: https://www.youtube.com/watch?v=tzSBS-vki6w
Why containers? - OpenShift deployment case studies and considerations when moving your environment to containers, by rockplace
[Webinar: Speed up your business with Microsoft Azure and Red Hat OpenShift]
Why containers? - OpenShift deployment case studies and considerations when moving your environment to containers
Koo Cheon-mo (구천모), Managing Director, rockplace
Video replay: https://youtu.be/i3yKrHLHYJI
Deview 2013 :: Backend PaaS, Breaking Down CloudFoundry, by Nanha Park
# Part 1
We will look at the environment surrounding developers, then give a Cloud Foundry overview, go through the components that make up Cloud Foundry, and finally cover the deployment environment.
# Part 2
From installation to code; the recorded demo video will be provided later.
If anything is missing, please send questions to [email protected] and I will do my best to answer. Thank you.
Migration, backup and restore made easy using Kannika, by Confluent
In this presentation, you’ll discover how easily you can migrate data from any Kafka-compatible event hub to Confluent using Kannika’s intuitive self-service interface. We’ll guide you through the process, showing how the same approach can be applied to define specific event data sets and effortlessly spin up secure environments for demos, testing, or other purposes.
You’ll also learn how to back up event data in just a few steps by transferring compressed data to the cloud storage location of your choice. In addition, we’ll demonstrate how to restore filtered datasets of topics, ensuring quick recovery and maintaining business continuity when needed.
Five Things You Need to Know About Data Streaming in 2025, by Confluent
Topics that Peter covers:
Tapping into the Potential of Data Products: Data drives some of today's most important business use cases. Data products enable instant access to reliable and trustworthy data by eliminating the data mess created by point-to-point connections.
The Need to Tap into 'Quick Thinking': The C-level has to reorient itself so it doesn't become the bottleneck to adaptability in a data-driven world. Nine in 10 (90%) business leaders say they must now react in real-time. Learn what you can do to provide executive access to real-time data to enable 'Quick Thinking.'
Rise Above Data Hurdles: Discover how to enforce governance at data production. Reestablishing trustworthiness later is almost always harder, so investing in data tools that solve business problems rather than add to them is essential.
Paradigm to Shift Left: Shift Left is a new paradigm for processing and governing data at any scale, complexity, and latency. Shift Left moves the processing and governance of data closer to the source, enabling organisations to build their data once, build it right and reuse it anywhere within moments of its creation.
The Need for a Strategic View: The positive correlation between data streaming maturity and significant business returns underscores the importance of a long-term, strategic view of data streaming investments. It also highlights the value of advancing beyond initial, siloed use cases to a more integrated approach that leverages data streaming across the enterprise.
From Stream to Screen: Real-Time Data Streaming to Web Frontends with Conflue..., by Confluent
In this presentation, we’ll demonstrate how Confluent and Lightstreamer come together to tackle the last-mile challenge of extending your Kafka architecture to web and mobile platforms.
Learn how to effortlessly build real-time web applications within minutes, subscribing to Kafka topics directly from your web pages, with unmatched low latency and high scalability.
Explore how Confluent's leading Kafka platform and Lightstreamer's intelligent proxy work seamlessly to bridge Kafka with the internet frontier, delivering data in real-time.
Confluent for the FSI Sector: Accelerating Innovation with Data Streaming..., by Confluent
Confluent for the FSI sector:
- What data streaming is and why your company needs it
- Who we are and how Confluent can help you:
- Making Kafka broadly accessible
- Stream, Connect, Process and Governance
- Deep dive into the technology solutions implemented within the Data Streaming Platform
- From theory to practice: real-world applications of FSI architectures
Data in Motion Tour 2024 Riyadh, Saudi Arabia, by Confluent
Data streaming platforms are becoming increasingly important in today’s fast-paced world. From retail giants who need to monitor inventory levels to ensure stores never run out of items, to new-age, innovative banks who are building out-of-the-box banking solutions for traditional retail banks, data streaming platforms are at the centre, powering these workflows.
Data streaming platforms connect all your applications, systems, and teams with a shared view of the most up-to-date, real-time data. From Gen AI, stream governance to stream processing - it’s these cutting edge developments that will be featured during the day.
Build a Real-Time Decision Support Application for Financial Market Traders w..., by Confluent
Quix's intuitive visual programming interface and extensive library of pre-built components make it easy to build these applications without complex coding. Experience how this dynamic duo accelerates the development and deployment of your trading strategies, empowering you to make more informed decisions with real-time data!
Compose Gen-AI Apps With Real-Time Data - In Minutes, Not Weeks, by Confluent
As businesses strive to stay at the forefront of innovation, the ability to quickly develop scalable Generative AI (GenAI) applications is essential. Join us for an exclusive webinar featuring MIA Platform, MongoDB, and Confluent, where you'll learn how to compose GenAI apps with real-time data integration in a fraction of the time.
Discover how these three powerful platforms work together to ensure applications remain responsive, relevant, and adaptive to user preferences and contextual changes. Our experts will guide you through leveraging MIA Platform's microservices architecture and low-code development, MongoDB's flexibility, and Confluent's stream processing capabilities. Experience live demonstrations and practical insights that will transform your approach to AI-driven app development, enabling you to accelerate your development process from weeks to mere minutes. Don't miss this opportunity to keep your business at the cutting edge.
Building Real-Time Gen AI Applications with SingleStore and Confluent, by Confluent
Discover how SingleStore and Confluent together create a powerful foundation for real-time generative AI applications. Learn how SingleStore's high-performance data platform and Confluent integrate to process and analyze streaming data in real-time. We'll explore real-world, innovative solutions and show you how SingleStore + Confluent can unlock new gen AI opportunities with your clients.
Unlocking value with event-driven architecture, by Confluent
Harness the power of real-time data streaming and event-driven microservices for the future of Sky with Confluent and Kafka®.
In this tech talk we will explore the potential of Confluent and Apache Kafka® to revolutionize enterprise architecture and unlock new business opportunities. We will dig into the key concepts, guiding you through building scalable, resilient, real-time applications for data streaming.
You will discover how to build event-driven microservices with Confluent, taking advantage of a modern, reactive architecture.
The talk will also present real-world use cases of Confluent and Kafka®, showing how these technologies can optimize business processes and generate concrete value.
Data Streaming for Next-Generation Real-Time AI, by Confluent
Building reliable, secure, governed AI applications requires an equally solid real-time data foundation, all the more so when managing huge flows of data in constant motion.
How do you get there? Rely on a true data streaming platform that lets you scale and quickly build real-time AI applications on top of trustworthy data.
Find out more! Don't miss our upcoming webinar, during which we will:
• Explore the GenAI paradigm and how this new technology is reshaping the business landscape, addressing the need to provide real-time context and solutions that meet your company's requirements.
• Examine the uncertainties of the evolving AI landscape and the crucial importance of data streaming and data processing.
• Look in detail at the continuously evolving architecture and the key role of Kafka and Confluent in AI applications.
• Analyze the benefits of a data streaming platform like Confluent in bridging legacy systems and GenAI, making it easier to develop and use predictive and generative AI.
Unleashing the Future: Building a Scalable and Up-to-Date GenAI Chatbot with ..., by Confluent
As businesses strive to remain at the cutting edge of innovation, the demand for scalable and up-to-date conversational AI solutions has become paramount. Generative AI (GenAI) chatbots that seamlessly integrate into our daily lives and adapt to the ever-evolving nuances of human interaction are crucial. Real-time data plays a pivotal role in ensuring the responsiveness and relevance of these chatbots, empowering them to stay abreast of the latest trends, user preferences, and contextual information.
Break data silos with real-time connectivity using Confluent Cloud Connectors, by Confluent
Connectors integrate Apache Kafka® with external data systems, enabling you to move away from a brittle spaghetti architecture to one that is more streamlined, secure, and future-proof. However, if your team still spends multiple dev cycles building and managing connectors using just open source Kafka Connect, it’s time to consider a faster and cost-effective alternative.
Building API data products on top of your real-time data infrastructure, by Confluent
This talk and live demonstration will examine how Confluent and Gravitee.io integrate to unlock value from streaming data through API products.
You will learn how data owners and API providers can document and secure data products on top of Confluent brokers, including schema validation, topic routing, and message filtering.
You will also see how data and API consumers can discover and subscribe to products in a developer portal, as well as how they can integrate with Confluent topics through protocols like REST, Websockets, Server-sent Events and Webhooks.
Whether you want to monetize your real-time data, enable new integrations with partners, or provide self-service access to topics through various protocols, this webinar is for you!
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente..., by Confluent
In our exclusive webinar, you'll learn why event-driven architecture is the key to unlocking cost efficiency, operational effectiveness, and profitability. Gain insights on how this approach differs from API-driven methods and why it's essential for your organization's success.
Seamlessly Connect Sources and Sinks to Confluent Cloud
1. Seamlessly Connect Sources and Sinks to Confluent Cloud
HyunSoo Kim
Senior Solutions Engineer, Korea SE Team Lead
2. Agenda
1. Workshop tips and guidance
2. Presentation: overview of Confluent connectors and use cases
3. Workshop: hands-on lab guided by Confluent experts
4. Q&A
3. Workshop tips and guidance:
1. Follow the instructions in the 'Chat' window while the session is in progress. [Icon in the toolbar at the bottom of Zoom]
2. If you run into technical problems, click the 'Raise Hand' button or post in the 'Chat' window. [A member of the Confluent team will help you.]
3. This is an interactive session, so please ask questions using the 'Q&A button' in the toolbar at the bottom of Zoom. There will also be time for 'Q&A' before the session ends.
4. Before considering a move to an event-driven architecture...
The existing architecture looked like this:
• Hard to scale, low throughput
• Multiple single points of failure
• Non-durable data
• Complex data integration
• High legacy system costs
5. And over time it changed into this...
[Diagram: business unit 1, business unit 2, and a public cloud]
6. Confluent for Data in Motion
[Diagram: event streaming applications such as real-time inventory management, real-time fraud detection, real-time customer 360, machine learning models, real-time data transformation, and LOB apps, built on a universal event pipeline that connects data stores, logs, third-party apps, custom apps/microservices, mainframes, devices, Hadoop, data warehouses, Splunk, and more]
7. Kafka Connect
A code-free way to connect known systems (databases, object storage, queues, etc.) to Apache Kafka.
[Diagram: data sources → Kafka Connect → Kafka → Kafka Connect → data sinks]
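To make the "code-free" point concrete: a connector is just configuration submitted to a Connect worker, for example over its REST API. The hedged Java sketch below registers the FileStreamSinkConnector that ships with Apache Kafka as a stand-in; the worker URL, topic, and file path are assumptions, and the workshop's GCS sink connector would use its own configuration keys.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterFileSink {
    public static void main(String[] args) throws Exception {
        // Connector definition: purely configuration, no connector code to write.
        String connectorJson = """
            {
              "name": "orders-file-sink",
              "config": {
                "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
                "tasks.max": "1",
                "topics": "orders",
                "file": "/tmp/orders.txt"
              }
            }
            """;

        // Assumed local Connect worker listening on the default REST port 8083.
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(connectorJson))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```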
9. Connecting widely used data sources & sinks
175+ pre-built connectors: 80+ supported by Confluent, 45+ partner-supported and Confluent-verified
[Diagram label: data diode]
10. You can easily browse connectors by:
• Sources vs. sinks
• Confluent-supported vs. partner-supported
• Commercial vs. free
• Available in Confluent Cloud
confluent.io/ko-kr/hub
Confluent Hub
17. Fully Managed Connectors
● Fully managed and hosted in Confluent Cloud
● Simple GUI-based configuration - easy to set up using the UI, the ccloud CLI, or the Connect API
● Elastic scaling with no infrastructure to manage
● 30 fully managed connectors, with more on the way
18. Self-Managed Connectors
1) Deploy a local Connect cluster using Confluent Platform
2) Connect the local Connect cluster to Confluent Cloud
3) Install the self-managed connector on the Connect cluster
4) Run the connector
● Installed on a local Kafka Connect cluster that is connected to your Confluent Cloud cluster
● Supported by Confluent
● Multiple deployment options:
○ Zip
○ RPM (RHEL and CentOS)
○ DEB (Ubuntu and Debian)
○ Docker
○ Kubernetes - Confluent Operator
○ Terraform (for GCP/AWS)
20. Confluent Cloud
[Slide: platform overview for architects, operators, and developers. Labels include: dynamic performance and elasticity; automatic balancing and upgrades; flexible DevOps automation (CLI | Terraform | Operator | Ansible); GUI-based management and monitoring (cloud dashboard | Metrics API); efficient operations at scale; freedom of choice; committer-driven expertise; event streaming database (ksql); rich pre-built ecosystem (Connectors | Hub | Schema Registry); polyglot development (non-Java clients | REST Proxy); global resilience (Replicator); data compatibility (Schema Registry); enterprise-grade security (SSO/SAML | ACL); open source | community licensed; unlimited developer productivity; production prerequisites; fully managed cloud service; self-managed software; training partners; enterprise support; professional services; Apache Kafka]
22. Signing up for a Confluent Cloud account
● Sign up here! https://www.confluent.io/confluent-cloud/tryfree/
● After signing up and logging in, click the menu icon in the top right, click "Billing & payment", and then enter your payment details under "Payment details & contacts".
● When you sign up for a Confluent Cloud account, up to $400 of credit is added to the account for the first two months.
● Note: in this workshop you will create resources that incur cost. The $400 credit covers the resources created during the workshop. "Spin down your cluster" afterwards to avoid additional charges.
Lab guide:
https://github.com/confluentinc/commercial-workshops/tree/master/series-getting-started-with-cc/workshop-connectors
23. Workshop contents
- Set up a Confluent Cloud environment using fully managed and self-managed connectors
- Learn use cases for fully managed and self-managed connectors
- Follow the end-to-end flow of data in the Confluent Cloud environment
25. Workshop architecture diagram (this lab uses GCS)
26. Workshop cautions
- Create a completely new Environment.
- Create the Confluent Cloud cluster in the GCP Seoul region.
- Prerequisites: item 6
  A service account has already been created and is provided for you
  Bucket Name: bucket_workshop_20220223_kr
  Region: asia-northeast3 (Seoul)
  The IAM policy has already been applied
- Use the fully managed Google Cloud Storage (GCS) Sink Connector
  Select 'Use an existing API key'
27. Fully managed Google Cloud Storage (GCS) Sink Connector creation screen
[Screenshot; important: use the bucket bucket_workshop_20220223_kr]
28. Output of running docker-compose
29. Assignment
After creating the GCS Sink Connector, capture a screenshot of it running correctly and email it to korea-marketing@confluent.io