Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
You cannot operate what you cannot measure. In this talk, I am going to present the built-in metrics framework of Kafka Streams that supports monitoring Kafka Streams applications. You will learn how to set up monitoring of metrics for your Kafka Streams applications, and you will hear about the following recent improvements to the metrics framework that extend and simplify monitoring. KIP-444 simplifies and extends the built-in metrics framework itself. The RocksDB metrics introduced in KIP-471 and KIP-607 allow you to look directly into the built-in persistent state stores of your Kafka Streams applications. Finally, KIP-613 specifies metrics that measure end-to-end latencies in your applications. This talk will help you collect intel about the behavior of your Kafka Streams applications and reason about their deployment. In the end, you will be able to better understand your applications and run them in a more robust manner.
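As a quick orientation for readers new to the framework, here is a minimal sketch (application id, broker, and topic names are placeholders) that switches the metrics recording level to DEBUG, which is what exposes the finer-grained metrics such as the per-store RocksDB metrics, and then reads the built-in metrics programmatically; in production you would more likely scrape the same values via JMX.

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class MetricsPeek {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "metrics-demo");      // placeholder app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        // DEBUG enables the finer-grained metrics (e.g. the per-store RocksDB metrics from KIP-471).
        props.put(StreamsConfig.METRICS_RECORDING_LEVEL_CONFIG, "DEBUG");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");                    // placeholder topics

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Every built-in metric is also queryable programmatically:
        for (Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
            MetricName name = entry.getKey();
            System.out.printf("%s / %s = %s%n",
                name.group(), name.name(), entry.getValue().metricValue());
        }
    }
}
```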
Give Your Confluent Platform Superpowers! (Sandeep Togrika, Intel and Bert Ha...
Whether you are a die-hard DC Comics enthusiast, mad for Marvel, or completely clueless when it comes to comic books, at the end of the day each of us would love to possess the superpower to transform data in seconds versus minutes or days. But architects and developers are challenged with designing and managing platforms that scale elastically and combine event streams with stored data to enable more contextually rich data analytics. This is made even more complex with data coming from hundreds of sources, and in hundreds of terabytes, or even petabytes, per day.
Now, with Apache Kafka and Intel hardware technology advances, organizations can turn massive volumes of disparate data into actionable insights with the ability to filter, enrich, join, and process data in-stream. Let's consider Information Security. IT leaders need to ensure all company data and IP is secured against threats and vulnerabilities. A combination of real-time event streaming with Confluent Platform and Intel Architecture has enabled threat detection efforts that once took hours to be completed in seconds, while simultaneously reducing technical debt and data processing and storage costs.
In this session, Confluent and Intel architects will share detailed performance benchmarking results and a new joint reference architecture. We’ll detail ways to remove Kafka performance bottlenecks, improve platform resiliency, and ensure high availability using Confluent Control Center and Multi-Region Clusters. And we’ll offer up tips for addressing challenges that you may be facing in your own super-heroic efforts to design, deploy, and manage your organization’s data platforms.
A Look into the Mirror: Patterns and Best Practices for MirrorMaker2 | Cliff ...
From migrations between Apache Kafka clusters to multi-region deployments across datacenters, the introduction of MirrorMaker2 has expanded the possibilities for Apache Kafka deployments and use cases. In this session you will learn about patterns, best practices, and learnings compiled from running MirrorMaker2 in production at every scale.
Enhancing Apache Kafka for Large Scale Real-Time Data Pipeline at Tencent | K...
In this session we share our experience of building a real-time data pipeline at Tencent PCG - one that handles 20 trillion daily messages with 700 clusters and 100Gb/s bursting traffic from a single app. We discuss our roadmap of enhancing Kafka to break its limits in terms of scalability, robustness, and cost of operation.
We first built a proxy layer that aggregates physical clusters in a way agnostic to the clients. While this architecture solves many operational problems, it requires significant development to stay future-proof. After retrospectives with our customers and careful study of the ongoing work in the community, we then designed a region federation solution in the broker layer, which allows us to deploy clusters at a much larger scale than previously possible, while at the same time providing better failure recovery and operability. We discuss how we make this development compatible with KIP-500 and KIP-405, and the two KIPs (693 and 694) that we submitted for discussion.
Beyond the Brokers | Emma Humber and Andrew Borley, IBM
While Kafka has guarantees around the number of server failures a cluster can tolerate, it is prudent to have infrastructure in place for when an environment becomes unavailable during a planned or unplanned outage, to avoid service interruptions or even data loss.
This talk describes the architectures available to you when planning for an outage. We will examine configurations including active/passive and active/active as well as availability zones and debate the benefits and limitations of each. We will also cover how to set up each configuration using the tools in Kafka.
Whether downtime while you fail over clients to a backup is acceptable or you require your Kafka clusters to be highly available, this talk will give you an understanding of the options available to mitigate the impact of the loss of an environment.
Not Your Mother's Kafka - Deep Dive into Confluent Cloud Infrastructure | Gwe...
Confluent Cloud runs a modified version of Apache Kafka - redesigned to be cloud-native and deliver a serverless user experience. In this talk, we will discuss key improvements we've made to Kafka and how they contribute to Confluent Cloud availability, elasticity, and multi-tenancy. You'll learn about innovations that you can use on-prem, and everything you need to make the most of Confluent Cloud.
Event Streaming with Kafka Streams and Spring Cloud Stream | Soby Chacko, VMware
Spring Cloud Stream is a framework built on top of the foundations of Spring Boot, the foremost JVM framework for developing microservice applications. It brings the familiar patterns and philosophies that Spring has championed for years through its programming model by allowing developers to focus primarily on the business logic of their applications. Kafka Streams is a powerful stream processing library built on top of Apache Kafka and attracts many developers because of its simplicity and deployment models as microservice applications. By developing Kafka Streams applications using Spring Cloud Stream, application developers get the best of both worlds - the simpler stream processing execution model of Kafka Streams and the battle-tested microservices foundations of Spring Boot via Spring Cloud Stream. This talk will explore:
- The integration points and various capabilities of Spring Cloud Stream touchpoints with Kafka Streams
- How to build event streaming applications using Spring’s programming model built on top of Kafka Streams, including a demo of a stateful application using Kafka Streams and Spring Cloud Stream’s functional support
- How to use interactive queries to expose materialized views from the state stores in the application
- How this Kafka Streams application can run as part of a data pipeline using Spring Cloud Data Flow in Kubernetes
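To make the programming model concrete, here is a minimal sketch of a functional-style Kafka Streams processor exposed as a Spring bean; binding names follow the framework's process-in-0/process-out-0 convention, and the topic names in the comments are placeholders.

```java
import java.util.function.Function;
import org.apache.kafka.streams.kstream.KStream;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class CountingApplication {

    public static void main(String[] args) {
        SpringApplication.run(CountingApplication.class, args);
    }

    // Spring Cloud Stream binds this function to topics via configuration, e.g.:
    //   spring.cloud.stream.bindings.process-in-0.destination=events   (placeholder topic)
    //   spring.cloud.stream.bindings.process-out-0.destination=counts  (placeholder topic)
    @Bean
    public Function<KStream<String, String>, KStream<String, Long>> process() {
        // Counts occurrences per key; the binder wires serdes and lifecycle for us.
        return input -> input
                .groupByKey()
                .count()
                .toStream();
    }
}
```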
Kafka Excellence at Scale – Cloud, Kubernetes, Infrastructure as Code (Vik Wa...
Cloud is changing the world; Kubernetes is changing the world; real-time event streaming is changing the world. In this talk we explore some of the best practices for synergistically combining the power of these paradigm shifts to achieve a much greater return on your Kafka investments. From declarative deployments, zero-downtime upgrades, and elastic scaling to self-healing and automated governance, learn how you can bring the next level of speed, agility, resilience, and security to your Kafka implementations.
Guaranteed Event Delivery with Kafka and NodeJS | Amitesh Madhur, Nutanix
The business systems of an organization are a continuous source of events. Each system also needs to know about events happening in the other systems. Exchanging these events through direct API calls creates a web of inter-dependencies, is fragile, and fails to scale. We examine how this problem can be solved through the use of the right integration patterns, implemented as a lightweight event hub that leverages the power of Kafka and Confluent to operate at enterprise scale. We demonstrate how JavaScript, with its event-driven programming model, can be a good fit for implementing an event hub that ensures guaranteed message delivery in the face of failures within the individual subscriber systems.
Many organizations have large engineering teams skilled in NodeJS and a multitude of NodeJS applications. We show how these teams can easily leverage the power of Kafka and scale their applications with the right architectural building blocks. We also offer insights from our own experience of building NodeJS-based Kafka applications.
Making your Life Easier with MongoDB and Kafka (Robert Walters, MongoDB) Kafk...
Kafka Connect makes it possible to easily integrate data sources like MongoDB! In this session we will first explore how MongoDB enables developers to rapidly innovate through the use of the document model. We will then bring the document model to life and showcase how to integrate MongoDB and Kafka through the use of the MongoDB Connector for Apache Kafka. Finally, we will explore the different ways of using the connector, including the new Confluent Cloud integration.
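As a flavor of what the setup looks like, below is a hedged sketch of a source-connector configuration as it would be submitted to the Kafka Connect REST API; the connection URI, database, and collection names are placeholders, while the connector class is the one shipped with the MongoDB connector.

```java
import java.util.Map;

public class MongoSourceConfig {
    // Source connector config: each change-stream event from shop.orders becomes a Kafka record.
    // Connection details are placeholders for illustration.
    static Map<String, String> config() {
        return Map.of(
            "connector.class", "com.mongodb.kafka.connect.MongoSourceConnector",
            "connection.uri", "mongodb://mongo:27017",
            "database", "shop",
            "collection", "orders",
            "tasks.max", "1"
        );
    }
}
```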
Everything you ever needed to know about Kafka on Kubernetes but were afraid ...
Kubernetes has become the de facto standard for running cloud-native applications, and many users turn to it to run stateful applications such as Apache Kafka as well. You can use different tools to deploy Kafka on Kubernetes - write your own YAML files, use Helm charts, or go for one of the available operators. But there is one thing all of these have in common: you still need very good knowledge of Kubernetes to make sure your Kafka cluster works properly in all situations. This talk will cover different Kubernetes features such as resources, affinity, tolerations, pod disruption budgets, topology spread constraints, and more. It will explain why they are important for Apache Kafka and how to use them. If you are interested in running Kafka on Kubernetes and do not know all of these features, this is the talk for you.
Twitter’s Apache Kafka Adoption Journey | Ming Liu, Twitter
Until recently, the Messaging team at Twitter had been running an in-house-built Pub/Sub system, namely EventBus (built on top of Apache DistributedLog and Apache BookKeeper, and similar in architecture to Apache Pulsar), to cater to our pub/sub needs. In 2018, we made the decision to move to Apache Kafka by migrating existing use cases as well as onboarding new use cases directly onto Apache Kafka. Fast forward to today: Kafka is now an essential piece of Twitter infrastructure and processes over 200M messages per second. In this talk, we will share the learnings and challenges from our journey to Apache Kafka.
Understanding Kafka Produce and Fetch API calls for high throughput applicat...
The data team at Cloudflare uses Kafka to process tens of petabytes a day. All this data is moved using the two foundational Kafka API calls: Produce (API key 0) and Fetch (API key 1). Understanding the structure of these calls (and of the underlying RecordSet structure) is key to building high-throughput clients.
The talk describes the basics of the Kafka wire protocol (API keys, correlation IDs) and the structure of the Produce and Fetch calls. It shows how the asynchronous nature of the wire protocol can combine with the structure of the Produce and Fetch calls to increase latency and reduce client throughput; a solution is offered through the use of synchronous single-partition calls.
The RecordSet structure, which is used to encode and store sets (batches) of records, is described, and its implications on Fetch requests are discussed. The relationship between Fetch API calls and "consume" operations is discussed, as is the impact of offset alignment to RecordSet boundaries.
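Although the talk operates at the wire-protocol level, the same batching trade-offs surface in the standard Java client as producer settings that control how records are grouped into Produce requests. A minimal sketch, with placeholder broker and topic names:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BatchTunedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        // How many bytes of records to pack into one batch per partition...
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);
        // ...and how long to wait for a batch to fill before sending the Produce request.
        props.put(ProducerConfig.LINGER_MS_CONFIG, 20);
        // Caps concurrent Produce requests per connection (1 preserves ordering on retry,
        // at some throughput cost).
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("events", "key", "value"));    // placeholder topic
        }
    }
}
```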
Changing landscapes in data integration - Kafka Connect for near real-time da...
1. The document discusses Kafka Connect and its evolution for managing near real-time streaming pipelines. It describes how Kafka Connect can be used for data integration across different systems and challenges around identity when deploying Kafka Connect.
2. It introduces the concept of a managed Kafka Connect which deploys Kafka Connect on the customer's own Kubernetes namespace to avoid security and identity issues. The managed Connect is configured and managed through a centralized control plane.
3. It details how the managed Kafka Connect control plane can be used to provision Kafka Connect clusters, deploy connectors between different systems, override connector configurations, and monitor tasks.
Supercharge Your Real-time Event Processing with Neo4j's Streams Kafka Connec...
Do your event streams involve connected-data domains such as fraud detection, live logistics routing, or predicting network outages? How can you maintain the analysis and leverage those connections in real time?
Graph databases differ from traditional, tabular ones in that they treat connections between data as first class citizens. This means they are optimized for detecting and understanding these relationships – providing insight at speed and at scale.
By combining event streams from Kafka along with the power of the Neo4j graph database for interrogating and investigating connections, you make real-time, event-driven intelligent insight a reality.
Neo4j Streams integrates Neo4j with Apache Kafka event streams, serving as a source of data (for instance, change data capture) or as a sink to ingest any kind of Kafka event into your graph. In this session we’ll show you how to get up and running with Neo4j Streams and how to sink and source data between graphs and streams.
Using Kafka as a Database For Real-Time Transaction Processing | Chad Preisle...
You have learned about Kafka event sourcing with streams and using Kafka as a database, but you may be having a tough time wrapping your head around what that means and what challenges you will face. Kafka’s exactly-once semantics, data retention rules, and Streams DSL make it a great database for real-time transaction processing. This talk will focus on how to use Kafka events as a database. We will talk about using KTables vs. GlobalKTables, and how to apply them to patterns we use with traditional databases. We will go over a real-world example of joining events against existing data and some issues to be aware of. We will finish by covering some important things to remember about state stores, partitions, and streams to help you avoid problems when your data sets become large.
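For readers who have not met the two table abstractions mentioned above, the sketch below (topic names and join logic are placeholders) contrasts a co-partitioned KTable join with a GlobalKTable join:

```java
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.GlobalKTable;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;

public class EnrichmentTopology {
    public static void build(StreamsBuilder builder) {
        KStream<String, String> orders = builder.stream("orders");     // placeholder topics

        // KTable: partitioned like the stream, so the join key must be the record key
        // and both topics need the same number of partitions (co-partitioning).
        KTable<String, String> customers = builder.table("customers");
        orders.join(customers, (order, customer) -> order + " by " + customer)
              .to("orders-enriched");

        // GlobalKTable: fully replicated to every instance, so it can be joined on a key
        // derived from the stream record and needs no co-partitioning, at the cost of
        // every instance holding the entire table.
        GlobalKTable<String, String> products = builder.globalTable("products");
        orders.join(products,
                    (orderKey, orderValue) -> orderValue, // lookup-key mapping (placeholder logic)
                    (order, product) -> order + " / " + product)
              .to("orders-with-products");
    }
}
```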
Kafka error handling patterns and best practices | Hemant Desale and Aruna Ka...
Transaction Banking from Goldman Sachs is a high-volume, latency-sensitive digital banking platform offering. We have chosen an event-driven architecture to build highly decoupled and independent microservices in a cloud-native manner, designed to meet the objectives of security, availability, latency, and scalability. Kafka was a natural choice – to decouple producers and consumers and to scale easily for high-volume processing. However, there are certain aspects that require careful consideration – handling errors and partial failures, managing downtime of consumers, and secure communication between brokers and producers/consumers. In this session, we will present the patterns and best practices that helped us build robust event-driven applications. We will also present our solution approach that has been reused across multiple application domains. We hope that by sharing our experience, we can establish a reference implementation that application developers can benefit from.
Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...
In order to maximize Kafka accessibility within an organization, Kafka operators must choose an authentication option that balances security with ease of use. Kafka has historically been limited to a small number of authentication options that are difficult to integrate with a Single Sign-On (SSO) strategy, such as mutual TLS, basic auth, and Kerberos. The arrival of SASL/OAUTHBEARER in Kafka 2.0.0 affords system operators a flexible framework for integrating Kafka with their existing authentication infrastructure. Ron Dagostino (State Street Corporation) and Mike Kaminski (The New York Times) team up to discuss SASL/OAUTHBEARER and its real-world applications. Ron, who contributed the feature to core Kafka, explains the origins and intricacies of its development along with additional, related security changes, including client re-authentication (merged and scheduled for release in v2.2.0) and the plans for support of SASL/OAUTHBEARER in librdkafka-based clients. Mike Kaminski, a developer on The Publishing Pipeline team at The New York Times, talks about how his team leverages SASL/OAUTHBEARER to break down silos between teams by making it easy for product owners to get connected to the Publishing Pipeline’s Kafka cluster.
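For orientation, here is roughly what a Java client opting into SASL/OAUTHBEARER configures; the login module is the one that ships with Kafka, while the broker address and the callback-handler class name are placeholders for whatever your SSO integration provides:

```java
import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.common.config.SaslConfigs;

public class OAuthClientConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put(CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, "kafka:9093"); // placeholder broker
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "OAUTHBEARER");
        // The login module ships with Kafka; token retrieval is delegated to a callback handler.
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
            "org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required;");
        // A custom handler fetches tokens from your OAuth/SSO provider (class name hypothetical).
        props.put(SaslConfigs.SASL_LOGIN_CALLBACK_HANDLER_CLASS,
            "com.example.MyOAuthLoginCallbackHandler");
        return props;
    }
}
```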
Stream Processing with Apache Kafka and .NET
Presentation from South Bay.NET meetup on 3/30.
Speaker: Matt Howlett, Software Engineer at Confluent
Apache Kafka is a scalable streaming platform that forms a key part of the infrastructure at many companies including Uber, Netflix, Walmart, Airbnb, Goldman Sachs and LinkedIn. In this talk Matt will give a technical overview of Kafka, discuss some typical use cases (from surge pricing to fraud detection to web analytics) and show you how to use Kafka from within your C#/.NET applications.
Keeping Analytics Data Fresh in a Streaming Architecture | John Neal, Qlik
Qlik is an industry leader across its solution stack, both on the Data Integration side of things with Qlik Replicate (real-time CDC) and Qlik Compose (data warehouse and data lake automation), and on the Analytics side with Qlik Sense. These two “sides” of Qlik are coming together more frequently these days as the need for “always fresh” data increases across organizations.
When real-time streaming applications are the topic du jour, these organizations look to Apache Kafka to provide the architectural backbone those applications require. Those same companies turn to Qlik Replicate to put the data from their enterprise database systems into motion at scale, whether that data resides in “legacy” mainframe databases; traditional relational databases such as Oracle, MySQL, or SQL Server; or applications such as SAP and Salesforce.
In this session we will look in depth at how Qlik Replicate can be used to continuously stream changes from a source database into Apache Kafka. From there, we will explore how a purpose-built consumer can be used to provide the bridge between Apache Kafka and an analytics application such as Qlik Sense.
What is Apache Kafka and What is an Event Streaming Platform?
Speaker: Gabriel Schenker, Lead Curriculum Developer, Confluent
Streaming platforms have emerged as a popular, new trend, but what exactly is a streaming platform? Part messaging system, part Hadoop made fast, part fast ETL and scalable data integration. With Apache Kafka® at the core, event streaming platforms offer an entirely new perspective on managing the flow of data. This talk will explain what an event streaming platform such as Apache Kafka is and some of the use cases and design patterns around its use—including several examples of where it is solving real business problems. New developments in this area such as KSQL will also be discussed.
This three-day course teaches developers how to build applications that can publish and subscribe to data from an Apache Kafka cluster. Students will learn Kafka concepts and components, how to use Kafka and Confluent APIs, and how to develop Kafka producers, consumers, and streams applications. The hands-on course covers using Kafka tools, writing producers and consumers, ingesting data with Kafka Connect, and more. It is designed for developers who need to interact with Kafka as a data source or destination.
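As a small taste of the subscribe side the course covers, here is a minimal consumer sketch; broker, group id, and topic names are placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class HelloConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "hello-group");             // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("page-views"));                        // placeholder topic
            while (true) {
                // Poll the broker and print each record's key and value.
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("%s => %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```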
Delivering: from Kafka to WebSockets | Adam Warski, SoftwareMill
Here's the challenge: we've got a Kafka topic, where services publish messages to be delivered to browser-based clients through web sockets.
Sounds simple? It might, but we're faced with an increasing number of messages, as well as a growing count of web socket clients. How do we scale our solution? As our system contains a larger number of servers, failures become more frequent. How to ensure fault tolerance?
There are a couple of possible architectures. Each WebSocket node might consume all messages; otherwise, we need an intermediary that redistributes the messages to the proper WebSocket nodes.
Here, we might either use a Kafka topic, or a streaming forwarding service. However, we still need a feedback loop so that the intermediary knows where to distribute messages.
We’ll take a look at the strengths and weaknesses of each solution, as well as limitations created by the chosen technologies (Kafka and web sockets).
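As a rough sketch of the first variant, where every WebSocket node consumes all messages: each node uses a unique consumer group id so it sees the full stream, and forwards only messages addressed to clients connected locally. The Session interface, topic name, and keying-by-user convention are assumptions for illustration:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class WebSocketFanout {
    /** Local registry of connected clients; Session is a stand-in for your WebSocket API. */
    interface Session { void send(String message); }
    private final ConcurrentHashMap<String, Session> sessionsByUser = new ConcurrentHashMap<>();

    public void run() {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker
        // Unique group id per node => every node sees every message ("consume all" variant).
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "ws-node-" + UUID.randomUUID());
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("client-messages"));                   // placeholder topic
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    // Key = target user id; drop messages for users not connected to this node.
                    Session session = sessionsByUser.get(record.key());
                    if (session != null) {
                        session.send(record.value());
                    }
                }
            }
        }
    }
}
```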
Live Event Debugging With ksqlDB at Reddit | Hannah Hagen and Paul Kiernan, R...
Convincing developers to write tests for new code is hard; convincing developers to write tests for new event data is even harder. At Reddit, engineers have often deployed new app versions, only to find out later that an event wasn’t firing at all, or was missing critical fields. This raises the question: “How can engineers at Reddit be confident that the events they instrument are accurate and complete?”
In this session, we will learn about an internal tool developed at Reddit to QA events in real-time. This KSQL-powered web app streams events from our pipeline, allowing developers to filter events they care about using criteria like User ID, Device ID or the type of user interaction. With a backbone of KSQL and Kafka Streams, engineers can get real-time feedback on how accurate (or how erroneous) their event data is.
Cloud native Kafka | Sascha Holtbruegge and Margaretha Erber, HiveMQ
Joins in Kafka Streams and ksqlDB are a killer feature for data processing, and basic join semantics are well understood. However, in a streaming world, records are associated with timestamps that impact the semantics of joins: welcome to the fabulous world of _temporal_ join semantics. For joins, timestamps are as important as the actual data, and it is important to understand how they impact the join result.
In this talk we want to deep-dive on the different types of joins, with a focus on their temporal aspects. Furthermore, we relate the individual join operators to the overall "time engine" of the Kafka Streams query runtime and explain its relationship to operator semantics. To allow developers to apply their knowledge of temporal join semantics, we provide best practices, tips and tricks to "bend" time, and configuration advice to get the desired join results. Last, we give an overview of recent developments, and an outlook on future ones, that improve joins even further.
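To make the temporal aspect concrete, here is a minimal stream-stream join sketch in which time is explicit: two records join only if their timestamps fall within the join window, and the grace period bounds how long out-of-order records are still accepted. Topic names and durations are placeholders:

```java
import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.JoinWindows;
import org.apache.kafka.streams.kstream.KStream;

public class TemporalJoin {
    public static void build(StreamsBuilder builder) {
        KStream<String, String> clicks = builder.stream("clicks"); // placeholder topics
        KStream<String, String> views  = builder.stream("views");

        // A stream-stream join is inherently temporal: records join only if their
        // timestamps are within 5 minutes of each other; the 1-minute grace period
        // determines how long late ("out-of-order") records can still join.
        clicks.join(views,
                (click, view) -> click + "/" + view,
                JoinWindows.ofTimeDifferenceAndGrace(Duration.ofMinutes(5), Duration.ofMinutes(1)))
              .to("clicks-joined-views");
    }
}
```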
Developing and Deploying Microservices to IBM Cloud Private
This document discusses developing and deploying microservices on IBM Cloud Private. It provides an overview of IBM Cloud Private including its architecture, editions, and included content. It also covers Kubernetes concepts like pods and services. Helm is introduced as a tool for managing Kubernetes applications and charts. Finally, an example application called Stock Trader is presented to demonstrate how a hybrid cloud application could be built on IBM Cloud Private.
Edge 2016 Session 1886 Building your own docker container cloud on ibm power...
Material from the IBM Edge 2016 session on a client use case of Spectrum Conductor for Containers.
https://www-01.ibm.com/events/global/edge/sessions/
Please refer to http://ibm.biz/ConductorForContainers for more details about Spectrum Conductor for Containers.
Please refer to https://www.youtube.com/watch?v=7YMjP6EypqA and https://www.youtube.com/watch?v=d9oVPU3rwhE for the demo of Spectrum Conductor for Containers.
Technical Capabilities of the kitsune framework
kitsune enables developers to migrate or build serverless applications that are cloud-agnostic. With kitsune, you need not have any knowledge of cloud-native or serverless components. You focus on the user experience while the framework takes care of architecture, scalability, and performance.
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
As the demand for real-time data processing continues to grow, so too do the challenges associated with building production-ready applications that can handle large volumes of data quickly. In this talk, we will explore common problems faced when building real-time applications at scale, with a focus on a specific use case: detecting and responding to cyclist crashes. Using telemetry data collected from a fitness app, we’ll demonstrate how we used a combination of Apache Kafka and Python-based microservices running on Kubernetes to build a pipeline for processing and analyzing this data in real time. We'll also discuss how we used machine learning techniques to build a model for detecting collisions and how we implemented notifications to alert family members of a crash. Our ultimate goal is to help you navigate the challenges that come with building data-intensive, real-time applications that use ML models. By showcasing a real-world example, we aim to provide practical solutions and insights that you can apply to your own projects.
Key takeaways:
An understanding of the common challenges faced when building real-time applications at scale
Strategies for using Apache Kafka and Python-based microservices to process and analyze data in real-time
Tips for implementing machine learning models in a real-time application
Best practices for responding to and handling critical events in a real-time application
VMworld Recap summarizes announcements from VMworld including:
- Updates to vRealize Automation to simplify deployment, enhance authentication, and allow blueprint modeling with a graphical design canvas.
- vRealize Business improvements to provide single-pane-of-glass cost analysis across clouds and more granular cost reporting.
- New starter kits that bundle vRealize Suite licenses, professional services, and training to help customers automate cloud management.
Confluent Operator as Cloud-Native Kafka Operator for Kubernetes
Agenda:
- Cloud Native vs. SaaS / Serverless Kafka
- The Emergence of Kubernetes
- Kafka on K8s Deployment Challenges
- Confluent Operator as Kafka Operator
- Q&A
Confluent Operator enables you to:
- Provision, manage, and operate Confluent Platform (including ZooKeeper, Apache Kafka, Kafka Connect, KSQL, Schema Registry, REST Proxy, and Control Center)
- Deploy on any Kubernetes platform (vanilla K8s, OpenShift, Rancher, Mesosphere, Cloud Foundry, Amazon EKS, Azure AKS, Google GKE, etc.)
- Automate provisioning of Kafka pods in minutes
- Monitor SLAs through Confluent Control Center or Prometheus
- Scale Kafka elastically, handle fail-over, and automate rolling updates
- Automate security configuration
It is built on our first-hand knowledge of running Confluent at scale and is fully supported for production usage.
This document discusses application modernization and provides an overview of Docker containers, WebSphere Application Server (WAS) lift-and-shift, and next steps. It introduces modernization stages like lift-and-shift, refactor, and rebuild. It then covers Docker containers for WAS Liberty and traditional WAS. IBM Cloud Private (ICP) and Helm charts for automating deployments on Kubernetes are also discussed. The document concludes with a brief discussion of WAS lift-and-shift to IBM Cloud and potential next steps involving OpenShift and ICP.
K8s for dev - paris oss-summit-microsoft-5-decembre-short
This document discusses several open source tools for Kubernetes development including Helm, Brigade, Kashti and Draft. It provides overviews of each tool's purpose and benefits. For example, it states that Helm helps define, install and upgrade even complex Kubernetes applications using reusable charts. It also includes links to demo videos showing how these tools can be used together for continuous integration and delivery pipelines on Kubernetes.
Pivotal Container Service: the new solution for managing Kubernetes in the enterprise
The document discusses Pivotal Container Service (PKS), a container management platform from VMware and Pivotal. PKS provides an enterprise-grade solution for provisioning, operating, and managing Kubernetes clusters across multiple clouds. It integrates Kubernetes with VMware technologies like NSX-T, vSphere, and vRealize to provide networking, security, storage, and management capabilities. PKS aims to simplify running containers at scale in production by handling tasks like cluster operations, upgrades, and monitoring.
IBM Cloud Paris Meetup - 20180628 - IBM Cloud Private
IBM Cloud Private is a Kubernetes platform that allows organizations to develop modern applications using microservices architectures within their own datacenters. It includes Kubernetes for container orchestration, Cloud Foundry for application development and deployment, and Terraform for infrastructure provisioning on public and private clouds. IBM Cloud Private provides middleware, analytics and other services through Helm charts as well as core operational services for security, DevOps and hybrid integration. It can run on existing infrastructure from IBM, Dell, Cisco, NetApp, Lenovo and others.
Building Cloud-Native Applications with Kubernetes, Helm and Kubeless
This document discusses building cloud-native applications with Kubernetes, Helm, and Kubeless. It introduces cloud-native concepts like containers and microservices. It then explains how Kubernetes provides container orchestration and Helm provides application packaging. Finally, it discusses how Kubeless enables serverless functionality on Kubernetes.
Event streaming: A paradigm shift in enterprise software architecture
This talk helps developers and architects understand the benefits, opportunities, and challenges in moving from traditional point-to-point integration in application architecture to one with event streaming. Apache Kafka and Spring provide a solid foundation for enterprises and large organizations to implement event streaming solutions. Examples and common patterns are covered towards the end.
Many thanks to James Watters and all the original content authors, editors and aggregators referenced in the slides.
NDC London 2021 - Application Autoscaling Made Easy With Kubernetes Event-Dri...
Kubernetes Event-driven Autoscaling (KEDA) provides application autoscaling on Kubernetes using a variety of metric sources. It automatically scales deployments, jobs, and other resources. KEDA supports over 30 built-in scalers for sources like Azure, AWS, Google Cloud, and more. It is cloud-agnostic and focuses on scaling applications without managing the scaling internals. The Azure Functions CLI makes it easy to deploy functions to Kubernetes and automatically configure KEDA for autoscaling. KEDA is an open source project with over 2,800 stars on GitHub and contributions from Microsoft, Red Hat, and other companies.
ContainerConf 2019, November 2019, Mannheim: talk by Mario-Leander Reimer (@LeanderReimer, chief technologist at QAware)
== Please download the document if it appears blurry! ==
Abstract:
Not so long ago, microservice architectures revolutionized the way we build software systems: instead of monoliths, systems are now composed and run as autonomous services.
Serverless and FaaS are the next logical step in this evolution, reducing the complexity of developing and operating such systems.
FaaS platforms are currently springing up like mushrooms: Knative, OpenFaaS, Fission, and Nuclio are just a few examples. But which of them are already suitable for use in your next project? Can they support hybrid architectures, or does everything have to be fully functionless? Let's find out.
Accelerate Digital Transformation with IBM Cloud PrivateMichael Elder
Accelerate the journey to cloud-native, refactor existing mission-critical workloads, and catalyze enterprise digital transformations.
How do you ensure the success of your enterprise in highly competitive market landscapes? How will you deliver new cloud-native workloads, modernize existing estates, and drive integration between them?
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Abstract: Apache Kafka evolved from an enterprise messaging system to a fully distributed streaming data platform for building real-time streaming data pipelines and streaming data applications without the need for other tools/clusters for data ingestion, storage and stream processing. In this talk you will learn more about:
- A quick introduction to Kafka Core, Kafka Connect and Kafka Streams through code examples, key concepts and key features
- A reference architecture for building such Kafka-based streaming data applications
- A demo of an end-to-end Kafka-based streaming data application
The Kubernetes WebLogic revival (part 2)
The second of two sessions Martien & I presented at UKOUG Techfest19 in Brighton, UK about:
(a) Running WebLogic in containers, managed by Kubernetes
(b) Oracle's Container Engine for Kubernetes (OKE) - Oracle Cloud's managed k8s service
Multi-Arch Infra From the Ground Up
Webinar: https://www.brighttalk.com/webcast/17792/588393?utm_source=ArmLtd&utm_medium=brighttalk&utm_campaign=588393
Multi-architecture infrastructures enable workloads to run on the best hardware for the task (Arm or x86) and optimize price/performance ratios while boosting design flexibility. However, migrating from a single- to a multi-arch framework can be tricky.
Join me to learn about:
- What is multi-architecture infrastructure?
- The trend for moving cloud workloads to multi-arch
- How early adopters handled the challenges
- The growing software ecosystem of Arm
- Comparing the price-performance of common workloads on x86 vs Arm
- Resources to get you started
This talk prepares developers for the road ahead with the ability to run workloads on the best hardware without needing to be concerned with the underlying architecture.
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
"In this talk, attendees will be provided with an introduction to Kafka Connect and the basics of Single Message Transforms (SMTs) and how they can be used to transform data streams in a simple and efficient way. SMTs are a powerful feature of Kafka Connect that allow custom logic to be applied to individual messages as they pass through the data pipeline. The session will explain how SMTs work, the types of transformations they can be used for, and how they can be applied in a modular and composable way.
Further, the session will discuss where SMTs fit in with Kafka Connect and when they should be used. Examples will be provided of how SMTs can be used to solve common data integration challenges, such as data enrichment, filtering, and restructuring. Attendees will also learn about the limitations of SMTs and when it might be more appropriate to use other tools or frameworks.
Additionally, an overview of the alternatives to SMTs, such as Kafka Streams and KSQL, will be provided. This will help attendees make an informed decision about which approach is best for their specific use case.
Whether attendees are developers, data engineers, or data scientists, this talk will provide valuable insights into how Kafka Connect and SMTs can help streamline data processing workflows. Attendees will come away with a better understanding of how these tools work and how they can be used to solve common data integration challenges.
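To give a flavor of that modular, composable chaining, below is a hedged sketch of a connector configuration with two SMTs applied in order; the connector class and topic naming are hypothetical, while the two transform classes ship with Kafka Connect:

```java
import java.util.Map;

public class ConnectorWithSmts {
    // Connector configuration as you would POST it to the Connect REST API
    // (connector class is a placeholder; the SMT classes are Kafka's built-ins).
    static Map<String, String> config() {
        return Map.of(
            "connector.class", "com.example.SomeSourceConnector",
            "tasks.max", "1",
            // Chain of two transforms, applied in order to every record:
            "transforms", "addSource,route",
            // 1) Enrich each record value with a static field.
            "transforms.addSource.type", "org.apache.kafka.connect.transforms.InsertField$Value",
            "transforms.addSource.static.field", "origin",
            "transforms.addSource.static.value", "orders-db",
            // 2) Rewrite the destination topic name.
            "transforms.route.type", "org.apache.kafka.connect.transforms.RegexRouter",
            "transforms.route.regex", "(.*)",
            "transforms.route.replacement", "staging-$1"
        );
    }
}
```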
"While Apache Kafka lacks native support for topic renaming, there are scenarios where renaming topics becomes necessary. This presentation will delve into the utilization of MirrorMaker 2.0 as a solution for renaming Kafka topics. It will illustrate how MirrorMaker 2.0 can efficiently facilitate the migration of messages from the old topic to the new one and how Kafka Connect Metrics can be employed to monitor the mirroring progress. The discussion will encompass the complexity of renaming Kafka topics, addressing certain limitations, and exploring potential workarounds when using MirrorMaker 2.0 for this purpose. Despite not being originally designed for topic renaming, MirrorMaker 2.0 has a suitable solution for renaming Kafka topics.
Blog post: https://engineering.hellofresh.com/renaming-a-kafka-topic-d6ff3aaf3f03
Evolution of NRT Data Ingestion Pipeline at Trendyol
"Trendyol, Turkey's leading e-commerce company, is committed to positively impacting the lives of millions of customers. Our decision-making processes are entirely driven by data. As a data warehouse team, our primary goal is to provide accurate and up-to-date data, enabling the extraction of valuable business insights.
We utilize the benefits provided by Kafka and Kafka Connect to facilitate the transfer of data from the source to our analytical environment. We recently transitioned our Kafka Connect clusters from on-premise VMs to Kubernetes. This shift was driven by our desire to effectively manage rapid growth (marked by a growing number of producers, consumers, and daily messages), ensuring proper monitoring and consistency. Consistency is crucial, especially in instances where we employ Single Message Transforms to manipulate records, like filtering based on their keys or converting a JSON object into a JSON string.
Monitoring our cluster's health is key and we achieve this through Grafana dashboards and alerts generated through kube-state-metrics. Additionally, Kafka Connect's JMX metrics, coupled with NewRelic, are employed for comprehensive monitoring.
The session will aim to explain our approach to NRT data ingestion, outlining the role of Kafka and Kafka Connect, our transition journey to K8s, and methods employed to monitor the health of our clusters.
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
"Join our lightning talk to delve into the strategies vital for maintaining a resilient Kafka service.
While proactive monitoring is key for issue prevention, failures will still occur. Rapid detection tools will enable you to identify and resolve problems before they impact end-users. This session explores the techniques employed by Kafka cloud providers for this detection, many of which are also applicable if you are managing independent Kafka clusters or applications.
The talk focuses on health-checking, a powerful tool that encompasses an application and its monitoring to validate Kafka environment availability. The session navigates through Kafka health-check methods, sharing best practices, identifying common pitfalls, and highlighting the monitoring of critical performance metrics like throughput and latency for early issue detection.
Attendees will gain valuable insights into the art of health-checking their Kafka environment, equipping them with the tools to identify and address issues before they escalate into critical problems. We invite all Kafka enthusiasts to join us in this talk to foster a deeper understanding of Kafka health-checking and ensure the continued smooth operation of your Kafka environment.
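As a minimal illustration of the liveness end of the health-checking spectrum, the sketch below simply asks the cluster to resolve its controller within a deadline; a fuller check along the lines discussed in the talk would add a produce/consume round trip to also observe throughput and latency. The broker address is supplied by the caller:

```java
import java.util.Properties;
import java.util.concurrent.TimeUnit;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;

public class KafkaHealthCheck {
    public static boolean isHealthy(String bootstrapServers) {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, 5_000);
        try (AdminClient admin = AdminClient.create(props)) {
            // Liveness: can we reach the cluster and resolve a controller within the deadline?
            admin.describeCluster().controller().get(5, TimeUnit.SECONDS);
            return true;
        } catch (Exception e) {
            // Timeout or connection failure => report unhealthy so alerting can fire early.
            return false;
        }
    }
}
```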
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
"Stream processing systems traditionally gave their users the choice between at least once processing and at most once processing: accepting duplicate data or missing data. But ideally we would provide exactly-once processing, where every event in the input data is represented exactly once in the output.
Kafka provides a transaction API that enables exactly-once processing when Kafka is both your source and your sink. But this API has turned out not to be well suited for use by high-level streaming systems, requiring various workarounds to still provide transactional processing.
In this talk, I’ll cover how the transaction API works, how systems like Arroyo and Flink have used it to build exactly-once support, and how improvements to the API will enable better end-to-end support for consistent stream processing."
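For orientation, here is a minimal consume-transform-produce loop over the transaction API, sketched with confluent-kafka. Topic, group, and transactional IDs are placeholders; this shows the pattern the talk examines, not Arroyo's or Flink's actual code.

```python
# Sketch: exactly-once consume-transform-produce with Kafka transactions.
from confluent_kafka import Consumer, Producer, TopicPartition

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder
    "group.id": "eos-demo",
    "isolation.level": "read_committed",    # only read committed input
    "enable.auto.commit": False,
})
producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "transactional.id": "eos-demo-1",       # stable id enables zombie fencing
})

consumer.subscribe(["input"])
producer.init_transactions()

while True:  # sketch: run until interrupted
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    producer.begin_transaction()
    producer.produce("output", value=msg.value().upper())
    # Committing the input offset inside the transaction makes the output
    # record and the consumer's progress visible atomically.
    producer.send_offsets_to_transaction(
        [TopicPartition(msg.topic(), msg.partition(), msg.offset() + 1)],
        consumer.consumer_group_metadata())
    producer.commit_transaction()
```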
"In this talk, we will explore the exciting world of IoT and computer vision by presenting a unique project: Fish Plays Pokemon. Using an ESP Eye camera connected to an ESP32 and other IoT devices, to monitor fish's movements in an aquarium.
This project showcases the power of IoT and computer vision, demonstrating how even a fish can play a popular video game. We will discuss the challenges we faced during development, including real-time processing, IoT device integration, and Kafka message consumption.
By the end of the talk, attendees will have a better understanding of how to combine IoT, computer vision, and the usage of a serverless cloud to create innovative projects. They will also learn how to integrate IoT devices with Kafka to simulate keyboard behavior, opening up endless possibilities for real-time interactions between the physical and digital worlds."
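The Kafka-to-keyboard bridge the abstract mentions might look roughly like this sketch; the topic, group, key mapping, and the use of pyautogui are all assumptions standing in for the project's actual stack.

```python
# Sketch: consume movement events and inject the mapped keystroke.
from confluent_kafka import Consumer
import pyautogui  # stand-in for whatever key-injection tool was used

KEYMAP = {b"up": "w", b"down": "s", b"left": "a", b"right": "d"}  # assumed

consumer = Consumer({"bootstrap.servers": "localhost:9092",  # placeholder
                     "group.id": "fish-input"})
consumer.subscribe(["fish-movements"])  # hypothetical topic

while True:
    msg = consumer.poll(0.1)
    if msg is None or msg.error():
        continue
    key = KEYMAP.get(msg.value())
    if key:
        pyautogui.press(key)  # simulate the game input
```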
What is tiered storage and what is it good for? After this session you will know how to leverage the tiered storage feature to enable longer retention than the storage attached to brokers allows. You will get acquainted with the different configuration options and learn what to expect when you enable the feature, such as when the first upload to the remote object storage will take place.
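To illustrate the configuration surface, here is a sketch of creating a tiered topic. It assumes Kafka 3.6+ with a remote storage plugin already configured and remote.log.storage.system.enable=true on the brokers; names and retention values are illustrative.

```python
# Sketch: create a topic whose older segments tier to remote object storage.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})  # placeholder
topic = NewTopic(
    "events-tiered", num_partitions=6, replication_factor=3,
    config={
        "remote.storage.enable": "true",                # tier this topic
        "retention.ms": str(30 * 24 * 60 * 60 * 1000),  # 30 days overall
        "local.retention.ms": str(6 * 60 * 60 * 1000),  # ~6 h on broker disks
    })
admin.create_topics([topic])["events-tiered"].result()  # raises on failure
```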
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
"Real-time 24/7 monitoring and verification of massive data is challenging – even more so for the world’s second largest manufacturer of memory chips and semiconductors. Tolerance levels are incredibly small, any small defect needs to be identified and dealt with immediately. The goal of semiconductor manufacturing is to improve yield and minimize unnecessary work.
However, even with real-time data collection, the data was not easy to manipulate by users and it took many days to enable stream processing requests – limiting its usefulness and value to the business.
You’ll hear why SK hynix switched to Confluent and how we developed a self-service stream process portal on top of it. Now users have an easy-to-use service to manipulate the data they want.
Results have been impressive, stream processing requests are available the same day – previously taking 5 days! We were also able to drive down costs by 10% as stream processing requests no longer require additional hardware.
What you’ll take away from our talk:
- What were the pain points in the previous environment
- How we transitioned to Confluent without service downtime
- Creating a self-service stream processing portal built on top of Connect and ksqlDB (a ksqlDB request sketch follows this list)
- Use cases of the stream processing portal"
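A portal like this ultimately issues statements to ksqlDB's REST API on users' behalf. A minimal sketch, with hypothetical stream, topic, and host names:

```python
# Sketch: submit a ksqlDB statement the way a portal backend might.
import requests

stmt = """
    CREATE STREAM sensor_alerts AS
      SELECT sensor_id, reading
      FROM sensor_readings
      WHERE reading > 100
      EMIT CHANGES;
"""

resp = requests.post("http://ksqldb:8088/ksql",  # assumed default port
                     json={"ksql": stmt, "streamsProperties": {}})
resp.raise_for_status()
print(resp.json())
```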
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
"Discover how default configurations might impact ingestion times, especially when dealing with large files. We'll explore a real-world scenario with a 20,000,000+ line file, assessing metrics and exploring the bottleneck in the default setup. Understand the intricacies of batch size calculations and how to optimize them based on your unique data characteristics.
Walk away with actionable insights as we showcase a practical example, turning a 7-hour ingestion process into a mere 30 minutes for over 30,000,000 records in a Kafka topic. Uncover metrics, configurations, and best practices to elevate the performance of your Kafka Connect CSV source connectors. Don't miss this opportunity to optimize your data pipeline and ensure smooth, efficient data flow."
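The batch-size arithmetic the abstract hints at is simple to sketch; the numbers below are illustrative assumptions, not the talk's measurements.

```python
# Back-of-envelope: size producer batches from the average record size.
avg_record_bytes = 220            # assumed average CSV-derived record
target_records_per_batch = 5000   # assumed

overrides = {
    # The default batch.size is 16384 bytes, which forces tiny batches and
    # many round trips on multi-million-line files.
    "batch.size": avg_record_bytes * target_records_per_batch,  # ~1.1 MB
    "linger.ms": 50,               # give batches time to fill
    "compression.type": "lz4",
}
print(overrides)
```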
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
"In order to meet the current and ever-increasing demand for near-zero RPO/RTO systems, a focus on resiliency is critical. While Kafka offers built-in resiliency features, a perfect blend of client and cluster resiliency is necessary in order to achieve a highly resilient Kafka client application.
At Fidelity Investments, Kafka is used for a variety of event streaming needs such as core brokerage trading platforms, log aggregation, communication platforms, and data migrations. In this lightning talk, we will discuss the governance framework that has enabled producers and consumers to achieve their SLAs during unprecedented failure scenarios. We will highlight how we automated resiliency tests through chaos engineering and tightly integrated observability dashboards for Kafka clients to analyze and optimize client configurations. And finally, we will summarize the chaos test suite and the "test, test and test" mantra that are helping Fidelity Investments reach its goal of a future with zero down-time."
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
"There are various strategies for securely connecting to Kafka clusters between different networks or over the public internet. Many cloud providers even offer endpoints that privately route traffic between networks and are not exposed to the internet. But, depending on your network setup and how you are running Kafka, these options ... might not be an option!
In this session, we’ll discuss how you can use SSH bastions or a self-managed PrivateLink endpoint to establish connectivity to your Kafka clusters without exposing brokers directly to the internet. We explain the required network configuration, and show how we at Materialize have contributed to librdkafka to simplify these scenarios and avoid fragile workarounds."
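As a taste of the bastion approach, here is a sketch of opening a local port-forward before pointing a client at it. The hosts are placeholders, and as the talk suggests, naive tunnels become fragile the moment the client follows the brokers' advertised listeners.

```python
# Sketch: forward a private broker's port through an SSH bastion.
import subprocess

tunnel = subprocess.Popen([
    "ssh", "-N",
    "-L", "19092:broker-0.internal:9092",  # local 19092 -> private broker
    "user@bastion.example.com",            # placeholder bastion host
])
# A client can now bootstrap against localhost:19092 ... until it is
# redirected to another broker's advertised listener, which is exactly
# the fragility the librdkafka contributions aim to remove.
```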
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
"In my talk, we will examine all the stages of building our self-service Streaming Data Platform based on Apache Flink and Kafka Connect, from the selection of a solution for stateful streaming data processing, right up to the successful design of a robust self-service platform, covering the challenges that we’ve met.
I will share our experience in providing non-Java developers with a company-wide self-service solution, which allows them to quickly and easily develop their streaming data pipelines.
Additionally, I will highlight specific business use cases that would not have been implemented without our platform."
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
"Almost everyone has heard about large language models, and tens of millions of people have tried out OpenAI ChatGPT and Google Bard. However, the intricate architecture and underlying mathematics driving these remarkable systems remain elusive to many.
LLMs are fascinating - so let's grab a drink and dive deep into how these systems are built. In the time it takes to enjoy a round of drinks, you'll understand the inner workings of these models. We'll take our first sip of word vectors, enjoy the refreshing taste of the transformer, and drain a glass understanding how these models are trained on phenomenally large quantities of data.
Large language models for your streaming application - explained with a little maths and a lot of pub stories"
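If you want a pre-pub taste of that first sip, here is a toy word-vector similarity computation. Real LLM embeddings have thousands of dimensions; these three-dimensional vectors are made up for illustration.

```python
# Toy illustration: cosine similarity between hand-made word vectors.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

king, queen, beer = [0.9, 0.8, 0.1], [0.85, 0.82, 0.15], [0.1, 0.2, 0.95]
print(cosine(king, queen))  # high: related words sit close together
print(cosine(king, beer))   # low: unrelated words point elsewhere
```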
"Monitoring is a fundamental operation when running Kafka and Kafka applications in production. There are numerous metrics available when using Kafka, however the sheer number is overwhelming, making it challenging to know where to start and how to properly utilise them.
This session will introduce you to some of the key metrics that should be monitored and best practices for fine-tuning your monitoring. We will delve into which metrics are the best indicators of a cluster's availability and performance, and which are most helpful when debugging client applications."
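For a sense of what "key metrics" means in practice, here is a sketch that polls three widely watched broker MBeans through a Jolokia HTTP agent. The JMX bridge and host are assumptions; the MBean names are the standard Kafka ones.

```python
# Sketch: read a few headline broker metrics over Jolokia's REST bridge.
import requests

MBEANS = [
    "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
    "kafka.controller:type=KafkaController,name=ActiveControllerCount",
    "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce",
]

for mbean in MBEANS:
    r = requests.get(f"http://broker:8778/jolokia/read/{mbean}")  # placeholder
    r.raise_for_status()
    print(mbean, "->", r.json().get("value"))
```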
Kafka Streams relies on state restoration to maintain standby tasks as a failure-recovery mechanism, as well as to rebuild state after rebalances. When you scale your application instances up or down, you need to know the current state of the restoration process for each active and standby task in order to keep restoration as short as possible. During this presentation, you will get an understanding of how KIP-869 provides valuable information about active task restoration after a rebalance and how KIP-988 opens a window onto the continuous process of standby restoration. Whenever you have to decide whether to scale your application instances up or down, both KIPs will be an invaluable ally.
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
"In this talk, we will dive into the world of Kafka producer configs and explore how to understand and optimize them for better performance. We will cover the different types of configs, their impact on performance, and how to tune them to achieve the best results. Whether you're new to Kafka or a seasoned pro, this session will provide valuable insights and practical tips for improving your Kafka producer performance.
- Introduction to Kafka producer internals and workflow
- Understanding producer configs like linger.ms, batch.size, and buffer.memory, and their impact on performance
- Learning about producer configs like max.block.ms, delivery.timeout.ms, request.timeout.ms, and retries to make the producer more resilient
- Discussing configs like enable.idempotence, max.in.flight.requests.per.connection, and transaction-related configs to achieve delivery guarantees
- Q&A session with attendees to address specific questions and concerns."
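As promised above, a configuration sketch pulling the outline together, written with confluent-kafka; the values are illustrative, and buffer.memory and max.block.ms are omitted because they are Java-client settings without direct librdkafka equivalents.

```python
# Sketch: throughput, resilience and delivery-guarantee settings combined.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",  # placeholder
    # Throughput: trade a little latency for fuller batches.
    "linger.ms": 20,
    "batch.size": 131072,
    # Resilience: bound total delivery time, per-request time, and retries.
    "delivery.timeout.ms": 120000,
    "request.timeout.ms": 30000,
    "retries": 2147483647,
    # Delivery guarantees: idempotence de-duplicates retries while keeping
    # ordering with up to 5 in-flight requests per connection.
    "enable.idempotence": True,
    "max.in.flight.requests.per.connection": 5,
})
```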
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
"Data contracts are one of the hottest topics in the data management community. A data contract is a formal agreement between a data producer and its consumers, aimed at reducing data downtime and improving data quality. Schemas are an important part of data contracts, but they are not the only relevant element.
In this talk, we’ll:
1. see why data contracts are so important but also difficult to implement;
2. identify the characteristics of a well-designed data contract, discussing its anatomy, its main elements, and how to formally describe them;
3. show how to manage the lifecycle of a data contract leveraging Confluent Platform's services (a Schema Registry sketch follows)."
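As referenced in the list above, here is a sketch of the schema slice of a data contract: pinning a subject's compatibility mode and registering a schema version through Schema Registry's REST API. The URL, subject, and schema are placeholders.

```python
# Sketch: register the schema part of a data contract.
import json
import requests

SR = "http://schema-registry:8081"  # placeholder
subject = "orders-value"            # placeholder subject

schema = {
    "type": "record", "name": "Order",
    "fields": [{"name": "id", "type": "string"},
               {"name": "amount", "type": "double"}],
}

# The compatibility mode is the enforceable heart of the schema agreement.
requests.put(f"{SR}/config/{subject}",
             json={"compatibility": "BACKWARD"}).raise_for_status()
requests.post(f"{SR}/subjects/{subject}/versions",
              json={"schema": json.dumps(schema)}).raise_for_status()
```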
"In the realm of stateful stream processing, Apache Flink has emerged as a powerful and versatile platform. However, the conventional SQL-based approach often limits the full potential of Flink applications.
We will delve into the benefits of adopting a code-first approach, which provides developers with greater control over application logic, facilitates complex transformations, and enables more efficient handling of state and time. We will also discuss how the code-first approach can lead to more maintainable and testable code, ultimately improving the overall quality of your Flink applications.
Whether you're a seasoned Flink developer or just starting your journey, this talk will provide valuable insights into how a code-first approach can revolutionize your stream processing applications."
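One possible reading of "code-first", sketched with PyFlink's DataStream API (the talk may well center on the Java API; the pipeline below is a made-up example):

```python
# Sketch: arbitrary code in the pipeline -- the control that SQL tends to hide.
from pyflink.datastream import StreamExecutionEnvironment

env = StreamExecutionEnvironment.get_execution_environment()

readings = env.from_collection([("sensor-1", 98.0), ("sensor-2", 113.5)])
alerts = (readings
          .filter(lambda r: r[1] > 100.0)           # custom predicate
          .map(lambda r: f"ALERT {r[0]}: {r[1]}"))  # custom transformation
alerts.print()

env.execute("code_first_sketch")
```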
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
"Change Data Capture (CDC) has become a commodity in data engineering, much in part due to the ever-rising success of Debezium [1]. But is that all there is? In this lightning talk, we’ll outline the current state of the CDC ecosystem, and understand why adopting a Debezium alternative is still a hard sell. If you’ve ever wondered what else is out there, but can’t keep up with the sprawling of new tools in the ecosystem; we’ll wrap it up for you!
[1] https://debezium.io/"
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
"Separation of compute and storage has become the de-facto standard in the data industry for batch processing.
The addition of tiered storage to open source Apache Kafka is the first step in bringing true separation of compute and storage to the streaming world.
In this talk, we'll discuss in technical detail how to take the concept of tiered storage to its logical extreme by building an Apache Kafka protocol compatible system that has zero local disks.
Eliminating all local disks in the system requires not only separating storage from compute, but also separating data from metadata. This is a monumental task that requires reimagining Kafka's architecture from the ground up, but the benefits are worth it.
This approach enables a stateless, elastic, and serverless deployment model that minimizes operational overhead and also drives inter-zone networking costs to almost zero."
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc
Most consumers believe they’re making informed decisions about their personal data—adjusting privacy settings, blocking trackers, and opting out where they can. However, our new research reveals that while awareness is high, taking meaningful action is still lacking. On the corporate side, many organizations report strong policies for managing third-party data and consumer consent yet fall short when it comes to consistency, accountability and transparency.
This session will explore the research findings from TrustArc’s Privacy Pulse Survey, examining consumer attitudes toward personal data collection and practical suggestions for corporate practices around purchasing third-party data.
Attendees will learn:
- Consumer awareness around data brokers and what consumers are doing to limit data collection
- How businesses assess third-party vendors and their consent management operations
- Where business preparedness needs improvement
- What these trends mean for the future of privacy governance and public trust
This discussion is essential for privacy, risk, and compliance professionals who want to ground their strategies in current data and prepare for what’s next in the privacy landscape.
DevOpsDays Atlanta 2025 - Building 10x Development Organizations | Justin Reock
Building 10x Organizations with Modern Productivity Metrics
10x developers may be a myth, but 10x organizations are very real, as proven by the influential study performed in the 1980s, ‘The Coding War Games.’
Right now, here in early 2025, we seem to be experiencing YAPP (Yet Another Productivity Philosophy), and that philosophy is converging on developer experience. It seems that with every new method we invent for the delivery of products, whether physical or virtual, we reinvent productivity philosophies to go alongside them.
But which of these approaches actually work? DORA? SPACE? DevEx? What should we invest in and create urgency behind today, so that we don’t find ourselves having the same discussion again in a decade?
The Evolution of Meme Coins: A New Era for Digital Currency | Abi john
Analyze the growth of meme coins from mere online jokes to potential assets in the digital economy. Explore the community, culture, and utility that are elevating meme coins into a new era of cryptocurrency.
This is the keynote of the Into the Box conference, highlighting the release of the BoxLang JVM language, its key enhancements, and its vision for the future.
Book industry standards are evolving rapidly. In the first part of this session, we’ll share an overview of key developments from 2024 and the early months of 2025. Then, BookNet’s resident standards expert, Tom Richardson, and CEO, Lauren Stewart, have a forward-looking conversation about what’s next.
Link to recording, presentation slides, and accompanying resource: https://bnctechforum.ca/sessions/standardsgoals-for-2025-standards-certification-roundup/
Presented by BookNet Canada on May 6, 2025 with support from the Department of Canadian Heritage.
HCL Nomad Web – Best Practices and Managing Multiuser Environmentspanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-nomad-web-best-practices-and-managing-multiuser-environments/
HCL Nomad Web is heralded as the next generation of the HCL Notes client, offering numerous advantages such as eliminating the need for packaging, distribution, and installation. Nomad Web client upgrades will be installed “automatically” in the background. This significantly reduces the administrative footprint compared to traditional HCL Notes clients. However, troubleshooting issues in Nomad Web presents unique challenges compared to the Notes client.
Join Christoph and Marc as they demonstrate how to simplify the troubleshooting process in HCL Nomad Web, ensuring a smoother and more efficient user experience.
In this webinar, we will explore effective strategies for diagnosing and resolving common problems in HCL Nomad Web, including
- Accessing the console
- Locating and interpreting log files
- Accessing the data folder within the browser’s cache (using OPFS)
- Understanding the difference between single- and multi-user scenarios
- Utilizing Client Clocking
Dev Dives: Automate and orchestrate your processes with UiPath MaestroUiPathCommunity
This session is designed to equip developers with the skills needed to build mission-critical, end-to-end processes that seamlessly orchestrate agents, people, and robots.
📕 Here's what you can expect:
- Modeling: Build end-to-end processes using BPMN.
- Implementing: Integrate agentic tasks, RPA, APIs, and advanced decisioning into processes.
- Operating: Control process instances with rewind, replay, pause, and stop functions.
- Monitoring: Use dashboards and embedded analytics for real-time insights into process instances.
This webinar is a must-attend for developers looking to enhance their agentic automation skills and orchestrate robust, mission-critical processes.
👨🏫 Speaker:
Andrei Vintila, Principal Product Manager @UiPath
This session streamed live on April 29, 2025, 16:00 CET.
Check out all our upcoming Dev Dives sessions at https://community.uipath.com/dev-dives-automation-developer-2025/.
Designing Low-Latency Systems with Rust and ScyllaDB: An Architectural Deep DiveScyllaDB
Want to learn practical tips for designing systems that can scale efficiently without compromising speed?
Join us for a workshop where we’ll address these challenges head-on and explore how to architect low-latency systems using Rust. During this free interactive workshop oriented for developers, engineers, and architects, we’ll cover how Rust’s unique language features and the Tokio async runtime enable high-performance application development.
As you explore key principles of designing low-latency systems with Rust, you will learn how to:
- Create and compile a real-world app with Rust
- Connect the application to ScyllaDB (NoSQL data store)
- Negotiate tradeoffs related to data modeling and querying
- Manage and monitor the database for consistently low latencies
How Can I use the AI Hype in my Business Context?Daniel Lehner
Is AI just hype? Or is it the game changer your business needs?
Everyone’s talking about AI but is anyone really using it to create real value?
Most companies want to leverage AI. Few know 𝗵𝗼𝘄.
✅ What exactly should you ask to find real AI opportunities?
✅ Which AI techniques actually fit your business?
✅ Is your data even ready for AI?
If you’re not sure, you’re not alone. This is a condensed version of the slides I presented at a Linkedin webinar for Tecnovy on 28.04.2025.
Generative Artificial Intelligence (GenAI) in BusinessDr. Tathagat Varma
My talk for the Indian School of Business (ISB) Emerging Leaders Program Cohort 9. In this talk, I discussed key issues around the adoption of GenAI in business - benefits, opportunities, and limitations. I also discussed how my research on the Theory of Cognitive Chasms helps address some of these issues.
Mobile App Development Company in Saudi ArabiaSteve Jonas
EmizenTech is a globally recognized software development company, proudly serving businesses since 2013. With 11+ years of industry experience and a team of 200+ skilled professionals, we have successfully delivered 1200+ projects across various sectors. As a leading Mobile App Development Company in Saudi Arabia, we offer end-to-end solutions for iOS, Android, and cross-platform applications. Our apps are known for their user-friendly interfaces, scalability, high performance, and strong security features. We tailor each mobile application to meet the unique needs of different industries, ensuring a seamless user experience. EmizenTech is committed to turning your vision into a powerful digital product that drives growth, innovation, and long-term success in the competitive mobile landscape of Saudi Arabia.
Role of Data Annotation Services in AI-Powered ManufacturingAndrew Leo
From predictive maintenance to robotic automation, AI is driving the future of manufacturing. But without high-quality annotated data, even the smartest models fall short.
Discover how data annotation services are powering accuracy, safety, and efficiency in AI-driven manufacturing systems.
Precision in data labeling = Precision on the production floor.