How Netflix runs Apache Flink at very large scale in two scenarios: (1) thousands of stateless routing jobs in the context of the Keystone data pipeline, and (2) a single large-state job with many TBs of state and parallelism in the low thousands.
Deploying Flink on Kubernetes - David Anderson (Ververica)
Kubernetes has rapidly established itself as the de facto standard for orchestrating containerized infrastructures. And with the recent completion of the refactoring of Flink's deployment and process model known as FLIP-6, Kubernetes has become a natural choice for Flink deployments. In this talk we will walk through how to get Flink running on Kubernetes
Where is my bottleneck? Performance troubleshooting in Flink (Flink Forward)
Flink Forward San Francisco 2022.
In this talk, we will cover various topics around performance issues that can arise when running a Flink job and how to troubleshoot them. We’ll start with the basics, like understanding what the job is doing and what backpressure is. Next, we will see how to identify bottlenecks and which tools or metrics can be helpful in the process. Finally, we will discuss potential performance issues during the checkpointing or recovery process, as well as some tips and Flink features that can speed up checkpointing and recovery times.
by
Piotr Nowojski
Using the New Apache Flink Kubernetes Operator in a Production Deployment (Flink Forward)
Flink Forward San Francisco 2022.
Running natively on Kubernetes, the new Apache Flink Kubernetes Operator is a great way to deploy and manage Flink application and session deployments. In this presentation, we provide:
- A brief overview of Kubernetes operators and their benefits
- An introduction to the five levels of the operator maturity model
- An introduction to the newly released Apache Flink Kubernetes Operator and the FlinkDeployment CRs
- Dockerfile modifications you can make to swap out the UBI images and Java of the underlying Flink Operator container
- Enhancements we're making in versioning/upgradeability/stability and security
- A demo of the Apache Flink Operator in action, with a technical preview of an upcoming product using the Flink Kubernetes Operator
- Lessons learned
- Q&A
by
James Busche & Ted Chang
The document discusses scaling state management in Apache Flink streaming applications to very large state. It describes how Flink uses state sharding and increasing operator parallelism to scale stateful computation. For fault tolerance, it discusses scaling checkpointing by making checkpoints asynchronous and less frequent, and scaling recovery by replicating state so fewer operators need recovery. It presents work in progress on incremental checkpointing and recovery to further optimize state management for large, stateful streaming applications.
Tuning Apache Kafka Connectors for Flink (Flink Forward)
Flink Forward San Francisco 2022.
In normal situations, the default Kafka consumer and producer configuration options work well. But we all know life is not all roses and rainbows, and in this session we’ll explore a few knobs that can save the day in atypical scenarios. First, we'll take a detailed look at the parameters available when reading from Kafka. We’ll inspect the parameters that help us quickly spot an application lock or crash, the ones that can significantly improve performance, and the ones to touch with gloves since they could cause more harm than benefit. Moreover, we’ll explore the partitioning options and discuss when diverging from the default strategy is needed. Next, we’ll discuss the Kafka sink. After browsing the available options, we'll dive deep into understanding how to approach use cases like sinking enormous records, managing spikes, and handling small but frequent updates. If you want to understand how to make your application survive when the sky is dark, this session is for you!
by
Olena Babenko
Stephan Ewen - Experiences running Flink at Very Large Scale (Ververica)
This talk shares experiences from deploying and tuning Flink stream processing applications at very large scale. We share lessons learned from users, contributors, and our own experiments about running demanding streaming jobs at scale. The talk explains what aspects currently render a job particularly demanding, shows how to configure and tune a large-scale Flink job, and outlines what the Flink community is working on to make the out-of-the-box experience as smooth as possible. We will, for example, dive into:
- analyzing and tuning checkpointing
- selecting and configuring state backends
- understanding common bottlenecks
- understanding and configuring network parameters
Apache Kafka is a distributed streaming platform. It provides a high-throughput distributed messaging system with publish-subscribe capabilities. The document discusses Kafka producers and consumers, Kafka clients in different programming languages, and important configuration settings for Kafka brokers and topics. It also demonstrates sending messages to Kafka topics from a Java producer and consuming messages from the console consumer.
Flink Forward San Francisco 2022.
Resource Elasticity is a frequently requested feature in Apache Flink: Users want to be able to easily adjust their clusters to changing workloads for resource efficiency and cost saving reasons. In Flink 1.13, the initial implementation of Reactive Mode was introduced, later releases added more improvements to make the feature production ready. In this talk, we’ll explain scenarios to deploy Reactive Mode to various environments to achieve autoscaling and resource elasticity. We’ll discuss the constraints to consider when planning to use this feature, and also potential improvements from the Flink roadmap. For those interested in the internals of Flink, we’ll also briefly explain how the feature is implemented, and if time permits, conclude with a short demo.
by
Robert Metzger
Practical learnings from running thousands of Flink jobs (Flink Forward)
Flink Forward San Francisco 2022.
Task Managers constantly running out of memory? Flink job keeps restarting from cryptic Akka exceptions? Flink job running but doesn’t seem to be processing any records? We share practical learnings from running thousands of Flink Jobs for different use-cases and take a look at common challenges they have experienced such as out-of-memory errors, timeouts and job stability. We will cover memory tuning, S3 and Akka configurations to address common pitfalls and the approaches that we take on automating health monitoring and management of Flink jobs at scale.
by
Hong Teoh & Usamah Jassat
This document provides an overview of cBPF and eBPF. It discusses the history and implementation of cBPF, including how it was originally used for packet filtering. It then covers eBPF in more depth, explaining what it is, its history, implementation including different program types and maps. It also discusses several uses of eBPF including networking, firewalls, DDoS mitigation, profiling, security, and chaos engineering. Finally, it introduces XDP and DPDK, comparing XDP's benefits over DPDK.
Replacing Your Shared Drive with Alfresco - Open Source ECM (Alfresco Software)
1) Alfresco replaces traditional shared drives with a virtual file system that provides better document management capabilities than a standard file system, including search, version control, metadata, and collaboration tools.
2) It utilizes a rules engine and smart spaces to automate processes like metadata extraction and workflow. Rules can be created to organize, structure, and enrich content.
3) Alfresco provides a virtual file system via several protocols like CIFS, WebDAV, and FTP to emulate a shared drive and allow dragging and dropping of files while also enabling server-side actions.
Keystone Data Pipeline manages several thousand Flink pipelines with variable workloads. These pipelines are simple routers which consume from Kafka and write to one of three sinks. To alleviate our operational overhead, we’ve implemented autoscaling for our routers. Autoscaling has reduced our resource usage by 25%-45% (varying by region and time) and has reduced our on-call burden. This talk will take an in-depth look at the mathematics, algorithms, and infrastructure details for implementing autoscaling of simple pipelines at scale. It will also discuss future work on autoscaling complex pipelines.
Building large scale transactional data lake using Apache Hudi (Bill Liu)
Data is critical infrastructure for building machine learning systems. From ensuring accurate ETAs to predicting optimal traffic routes, providing safe, seamless transportation and delivery experiences on the Uber platform requires reliable, performant, large-scale data storage and analysis. In 2016, Uber developed Apache Hudi, an incremental processing framework, to power business-critical data pipelines at low latency and high efficiency, and to help distributed organizations build and manage petabyte-scale data lakes.
In this talk, I will describe what Apache Hudi is and its architectural design, and then dive deep into improving data operations with features such as data versioning and time travel.
We will also go over how Hudi brings kappa architecture to big data systems and enables efficient incremental processing for near real time use cases.
Speaker: Satish Kotha (Uber)
Apache Hudi committer and Engineer at Uber. Previously, he worked on building real time distributed storage systems like Twitter MetricsDB and BlobStore.
Website: https://www.aicamp.ai/event/eventdetails/W2021043010
Evening out the uneven: dealing with skew in Flink (Flink Forward)
Flink Forward San Francisco 2022.
When running Flink jobs, skew is a common problem that results in wasted resources and limited scalability. In the past years, we have helped our customers and users solve various skew-related issues in their Flink jobs or clusters. In this talk, we will present the different types of skew that users often run into: data skew, key skew, event time skew, state skew, and scheduling skew, and discuss solutions for each of them. We hope this will serve as a guideline to help you reduce skew in your Flink environment.
by
Jun Qin & Karl Friedrich
Producer Performance Tuning for Apache Kafka (Jiangjie Qin)
Kafka is well known for high throughput ingestion. However, to get the best latency characteristics without compromising on throughput and durability, we need to tune Kafka. In this talk, we share our experiences to achieve the optimal combination of latency, throughput and durability for different scenarios.
This document introduces the (B)ELK stack, which consists of Beats, Elasticsearch, Logstash, and Kibana. It describes each component and how they work together. Beats are lightweight data shippers that collect data from logs and systems. Logstash processes and transforms data from inputs like Beats. Elasticsearch stores and indexes the data. Kibana provides visualization and analytics capabilities. The document provides examples of using each tool and tips for working with the ELK stack.
The document provides an overview of Kubernetes networking concepts including single pod networking, pod to pod communication, service discovery and load balancing, external access patterns, network policies, Istio service mesh, multi-cluster networking, and best practices. It covers topics such as pod IP addressing, communication approaches like L2, L3, overlays, services, ingress controllers, network policies, multi-cluster use cases and deployment options.
Basic and Advanced Analysis of Ceph Volume Backend Driver in Cinder - John Haan (Ceph Community)
This document discusses basic and advanced features of using Ceph as the backend driver for Cinder block storage in OpenStack. It begins with basic concepts like Cinder volumes, snapshots, and backups using RBD copy-on-write, snapshots, and export/import diffs. More advanced topics covered include image-cached volumes to improve volume creation performance, and replication between Ceph clusters for disaster recovery using RBD mirroring. The document provides configuration details and diagrams to illustrate how data is stored and managed in Ceph for basic and advanced Cinder integration.
Flink Forward San Francisco 2022.
This talk will take you on the long journey of Apache Flink into the cloud-native era. It started all the way from where Hadoop and YARN were the standard way of deploying and operating data applications.
We're going to deep dive into the cloud-native set of principles and how they map to the Apache Flink internals and recent improvements. We'll cover fast checkpointing, fault tolerance, resource elasticity, minimal infrastructure dependencies, industry-standard tooling, ease of deployment and declarative APIs.
After this talk you'll get a broader understanding of the operational requirements for a modern streaming application and where the current limits are.
by
David Moravek
Best Practices: How to Analyze IoT Sensor Data with InfluxDB (InfluxData)
InfluxDB is the purpose-built time series platform. Its high ingest capability makes it perfect for collecting, storing, and analyzing time-stamped data from sensors — down to the nanosecond. The InfluxDB platform has everything developers need: the data collection agent, the database, visualization tools, and the data querying and scripting language. Join this webinar as Brian Gilmore provides a product overview; he will also deep-dive with some helpful tips and tricks. Stick around for a live demo and Q&A time.
Join this webinar as Brian Gilmore dives into:
The basics of time series data and applications
A platform overview — learn about InfluxDB, Telegraf, and Flux
InfluxDB use case examples — start collecting data at the edge and use your preferred IoT protocol (e.g., MQTT)
PCI Express* based Storage: Data Center NVM Express* Platform Topologies (Odinot Stanislas)
This document discusses PCI Express based solid state drives (SSDs) for data centers. It covers the growth opportunity for PCIe SSDs, topology options using various form factors like SFF-8639 and M.2, and validation tools. It also discusses hot plug support on Intel Xeon processor based servers and upcoming industry workshops to advance the PCIe SSD ecosystem.
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J... (Databricks)
Watch video at: http://youtu.be/Wg2boMqLjCg
Want to learn how to write faster and more efficient programs for Apache Spark? Two Spark experts from Databricks, Vida Ha and Holden Karau, provide performance tuning and testing tips for your Spark applications.
Dynamically Scaling Data Streams across Multiple Kafka Clusters with Zero Fli... (Flink Forward)
Flink Forward San Francisco 2022.
Flink consumers read from Kafka as a scalable, high throughput, and low latency data source. However, there are challenges in scaling out data streams where migration and multiple Kafka clusters are required. Thus, we introduced a new Kafka source to read sharded data across multiple Kafka clusters in a way that conforms well with elastic, dynamic, and reliable infrastructure. In this presentation, we will present the source design and how the solution increases application availability while reducing maintenance toil. Furthermore, we will describe how we extended the existing KafkaSource to provide mechanisms to read logical streams located on multiple clusters, to dynamically adapt to infrastructure changes, and to perform transparent cluster migrations and failover.
by
Mason Chen
The document summarizes a talk on container performance analysis. It discusses identifying bottlenecks at the host, container, and kernel level using various Linux performance tools. It then provides an overview of how containers work in Linux using namespaces and control groups (cgroups). Finally, it demonstrates some example commands like docker stats, systemd-cgtop, and bcc/BPF tools that can be used to analyze containers and cgroups from the host system.
Kubernetes is a platform for managing containerized workloads and services that provides a container-centric management environment. It aims to provide high utilization, high availability, minimize fault recovery time, and reduce the probability of correlated failures through a declarative job specification language, name service integration, real-time job monitoring, and analyzing and simulating system behavior using APIs and dashboards. Kubernetes can manage 100,000s of jobs, 1000s of applications across multiple clusters each with 10,000s of machines.
Flink Forward San Francisco 2018: Steven Wu - "Scaling Flink in Cloud" (Flink Forward)
Over 109 million subscribers are enjoying more than 125 million hours of TV shows and movies per day on Netflix. This leads to a massive amount of data flowing through our data ingestion pipeline to improve service and user experience. It powers various data analytics use cases like personalization, operational insight, and fraud detection. At the heart of this massive data ingestion pipeline is a self-serve stream processing platform that processes 3 trillion events and 12 PB of data every day. We recently migrated this stream processing platform from Samza to Flink. In this talk, we will share the challenges and issues that we ran into when running Flink at scale in the cloud. We will dive deep into the troubleshooting techniques and lessons learned.
This document discusses end-to-end processing of 3.7 million telemetry events per second using a lambda architecture at Symantec. It provides an overview of Symantec's security data lake infrastructure, the telemetry data processing architecture using Kafka, Storm and HBase, tuning targets for the infrastructure components, and performance benchmarks for Kafka, Storm and Hive.
Running Presto and Spark on the Netflix Big Data Platform (Eva Tse)
This document summarizes Netflix's big data platform, which uses Presto and Spark on Amazon EMR and S3. Key points:
- Netflix processes over 50 billion hours of streaming per quarter from 65+ million members across over 1000 devices.
- Their data warehouse contains over 25 PB stored on S3; they read about 10% of it daily, and writes are about 10% of reads.
- They use Presto for interactive queries and Spark for both batch and iterative jobs.
- They have customized Presto and Spark for better performance on S3 and Parquet, and contributed code back to open source projects.
- Their architecture leverages dynamic EMR clusters with Presto and Spark deployed via bootstrap actions for scalability.
This document provides an overview of Amazon Kinesis and how it can be used to build a real-time big data application on AWS. Key points discussed include using Kinesis to collect streaming data from sources, processing the data in real-time using services like Kinesis, EMR and Redshift, and storing and analyzing the results. Examples are provided of ingesting log data from sources into Kinesis, analyzing the data with Hive on EMR, and loading results into Redshift for interactive querying and business intelligence.
Intro to Apache Kafka I gave at the Big Data Meetup in Geneva in June 2016. Covers the basics and gets into some more advanced topics. Includes demo and source code to write clients and unit tests in Java (GitHub repo on the last slides).
Flink at Netflix - PayPal Speaker Series (Monal Daxini)
(1) Monal Daxini presented on Netflix's use of Apache Flink for stream processing.
(2) Netflix introduced Flink two years ago and has driven its adoption within the company.
(3) Key aspects of Netflix's Flink usage include roughly 2,000 routing jobs processing about 3 trillion events per day across some 10,000 containers.
The document discusses saving streaming data from Kafka to S3 using Spark Streaming while ensuring exactly-once delivery. It describes two options for handling failures: (1) writing offsets to a database, requiring additional cleanup; and (2) combining offsets with file paths in S3 to allow overwriting on failure without duplication. The implemented solution uses the second approach by partitioning data by date and sum of starting offsets and deleting folders before writing to ensure exactly-once delivery in a simple way without additional systems.
This document discusses using Amazon Web Services (AWS) with ColdFusion 11. It begins with an introduction to AWS and then covers specific AWS services that can be integrated with ColdFusion, including Simple Storage Service (S3) for file storage, DynamoDB for a NoSQL database, and Elasticache for caching. It also provides instructions for running ColdFusion 11 on an AWS Elastic Compute Cloud (EC2) instance using the official ColdFusion 11 Amazon Machine Image (AMI).
Link to the full talk - https://youtu.be/2Rf5t2Eh6IQ
https://go.dok.community/slack
https://dok.community
ABSTRACT OF THE TALK
This talk will provide a high-level overview of Kubernetes, Helm charts and how they can be used to deploy Apache Druid clusters of any size.
We'll review how Kubernetes functionality enables resilience and self-healing, historical tiers through node group affinity, middle manager scaling through Kubernetes autoscaling to optimize ingestion capacity and some of the gotchas along the way.
BIO
Sergio Ferragut is a database veteran turned Developer Advocate at Imply. His experience includes 16 years at Teradata in professional services and engineering roles.
He has direct experience in building analytics applications spanning the retail, supply chain, pricing optimization and IoT spaces.
Sergio has worked at multiple technology start-ups including APL and Splice Machine where he helped guide product design and field messaging.
Serverless Machine Learning on Modern Hardware Using Apache Spark with Patric... (Databricks)
Recently, there has been increased interest in running analytics and machine learning workloads on top of serverless frameworks in the cloud. The serverless execution model provides fine-grained scaling and unburdens users from having to manage servers, but it also adds substantial performance overheads because all data and intermediate state of compute tasks are stored on remote shared storage.
In this talk I first provide a detailed performance breakdown from a machine learning workload using Spark on AWS Lambda. I show how the intermediate state of tasks — such as model updates or broadcast messages — is exchanged using remote storage and what the performance overheads are. Later, I illustrate how the same workload performs on-premise using Apache Spark and Apache Crail deployed on a high-performance cluster (100Gbps network, NVMe Flash, etc.). Serverless computing simplifies the deployment of machine learning applications. The talk shows that performance does not need to be sacrificed.
We are using Elasticsearch to power the search feature of our public frontend, serving 10k queries per hour across 8 markets in SEA.
Here we are sharing our experiences of running Elasticsearch on Kubernetes, presenting our general setup, configuration tweaks and possible pitfalls.
Taking advantage of the Amazon Web Services (AWS) Family (Ben Hall)
The document provides an overview of using Amazon Web Services (AWS) for hosting applications and storing files. It summarizes key AWS services including Amazon S3 for object storage, Amazon EC2 for virtual servers, and Amazon CloudFront for content delivery. It also provides code examples for accessing S3 and EC2 using APIs and SDKs.
AWS re:Invent presentation: Unmeltable Infrastructure at Scale by Loggly (SolarWinds Loggly)
This document summarizes Loggly's transition from their first generation log management infrastructure to their second generation infrastructure built on Apache Kafka, Twitter Storm, and ElasticSearch on AWS. The first generation faced challenges around tightly coupling event ingestion and indexing. The new system uses Kafka as a persistent queue, Storm for real-time event processing, and ElasticSearch for search and storage. This architecture leverages AWS services like auto-scaling and provisioned IOPS for high availability and scale. The new system provides improved elasticity, multi-tenancy, and a pre-production staging environment.
This talk (delivered at QConLondon 2016) covers the evolution of Coursera's nearline architecture, delves into our latest generation system, and then covers the flagship application of the architecture (evaluating programming assignments).
Kafka Tiered Storage | Satish Duggana and Sriharsha Chintalapani, Uber (Hosted by Confluent)
Kafka is a vital part of data infrastructure in many organizations. When the Kafka cluster grows and more data is stored in Kafka for a longer duration, several issues related to scalability, efficiency, and operations become important to address. Kafka cluster storage is typically scaled by adding more broker nodes to the cluster. But this also adds needless memory and CPUs to the cluster making overall storage cost less efficient compared to storing the older data in external storage.
Tiered storage is introduced to extend Kafka's storage beyond the local storage available on the Kafka cluster by retaining the older data in cheaper stores, such as HDFS, S3, Azure or GCS with minimal impact on the internals of Kafka.
We will talk about
- How tiered storage addresses the above problems and also brings several other advantages.
- High level architecture of tiered storage
- Future work planned as part of tiered storage.
Logging for Production Systems in The Container Era discusses how to effectively collect and analyze logs and metrics in microservices-based container environments. It introduces Fluentd as a centralized log collection service that supports pluggable input/output, buffering, and aggregation. Fluentd allows collecting logs from containers and routing them to storage systems like Kafka, HDFS and Elasticsearch. It also supports parsing, filtering and enriching log data through plugins.
ETL with SPARK - First Spark London meetup (Rafal Kwasny)
The document discusses how Spark can be used to supercharge ETL workflows by running them faster and with less code compared to traditional Hadoop approaches. It provides examples of using Spark for tasks like sessionization of user clickstream data. Best practices are covered like optimizing for JVM issues, avoiding full GC pauses, and tips for deployment on EC2. Future improvements to Spark like SQL support and Java 8 are also mentioned.
Lambda is AWS's serverless compute service that allows you to run code without provisioning or managing servers. Code is triggered by events and runs in isolated containers. Key points:
- Code is written as single functions that are triggered by events from AWS services or APIs
- Functions run in managed containers that are allocated memory and compute proportionally
- Functions are stateless and ephemeral, running code only in response to events
- AWS handles automatic scaling of functions based on event load and manages the underlying infrastructure
Managing big data stored on ADLSgen2/Databricks may be challenging. Setting up security, moving or copying the data of Hive tables or their partitions may be very slow, especially when dealing with hundreds of thousands of files.
5. Job isolation: single job
[Diagram: each Flink job runs as its own standalone Flink cluster on Titus; the Job Manager is deployed as Titus Job #1 and the Task Managers as Titus Job #2.]
6. State backend and checkpoint store
State backend:
● Memory
● File system
● RocksDB
Checkpoint store:
● HDFS
● S3
Source: http://flink.apache.org/
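To make the pairing concrete, here is a minimal sketch (bucket path and job name are illustrative) that keeps working state on the heap with the filesystem backend and ships snapshots to S3:

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Snapshot all operator state every 30 seconds.
        env.enableCheckpointing(30_000);
        // Keep working state on the heap; write snapshots to S3.
        env.setStateBackend(new FsStateBackend("s3://my-bucket/checkpoints"));
        env.fromElements(1, 2, 3).print(); // stand-in pipeline
        env.execute("checkpointed-job");
    }
}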
7. Why S3 as the snapshot store
● The only checkpoint store supported out of the box in the Amazon cloud
● Cost-effective, scalable, and durable
8. S3 concepts
● Massive storage system
● Bucket: container for objects
● Object: identified by a key (and a version)
● Filesystem-like operations
○ GET, PUT, DELETE, LIST, HEAD
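To make those operations concrete, a minimal sketch using the AWS SDK for Java v1 (bucket and key names are illustrative):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class S3OpsSketch {
    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        String bucket = "my-bucket";
        String key = "checkpoints/my-job/chk-42";

        s3.putObject(bucket, key, "serialized state");    // PUT
        boolean exists = s3.doesObjectExist(bucket, key); // HEAD
        String body = s3.getObjectAsString(bucket, key);  // GET
        s3.listObjects(bucket, "checkpoints/my-job/");    // LIST by prefix
        s3.deleteObject(bucket, key);                     // DELETE
        System.out.println("exists=" + exists + ", body=" + body);
    }
}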
14. S3 Performance
● Optimized for high I/O throughput
● Not optimized for high request rates without tweaking key names
● Not optimized for small files
● Not optimized for consistent low latency
26. Math 201: S3 writes
● ~200,000 operators; each operator writes its checkpoint to S3
● Checkpoint interval is 30 seconds
● ~6,600 writes per second (= 200,000 / 30)
○ Actual writes are 2-3x lower because only the Kafka source operators have state
40. Issue #1: Hadoop S3 file system
● Half of the HEAD requests failed because they probed non-existent objects
● Always two HEADs for the same object (with and without a trailing slash)
○ checkpoints/<flink job>/fe68ab5591614163c19b55ff4aa66ac
○ checkpoints/<flink job>/fe68ab5591614163c19b55ff4aa66ac/
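The double HEAD falls out of how a filesystem existence check is emulated on an object store. A schematic sketch of the pattern with hypothetical stand-in types (not the actual Hadoop source): the client first probes the key as a file, then as a directory marker with a trailing slash, so checking a non-existent path costs two failed HEADs.

import java.io.FileNotFoundException;
import java.util.Map;

// Hypothetical stand-in for an object store client; headObject returns
// object metadata, or null when the HEAD request fails (object missing).
interface ObjectStore {
    Map<String, String> headObject(String key);
}

class EmulatedFileSystem {
    private final ObjectStore store;

    EmulatedFileSystem(ObjectStore store) {
        this.store = store;
    }

    boolean exists(String key) throws FileNotFoundException {
        if (store.headObject(key) != null) {       // HEAD .../fe68...6ac
            return true;                           // it is a regular "file"
        }
        if (store.headObject(key + "/") != null) { // HEAD .../fe68...6ac/
            return true;                           // it is a "directory" marker
        }
        // For a non-existent path, both HEADs above have already failed.
        throw new FileNotFoundException(key);
    }
}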
42. BTrace: a dynamic tracing tool for Java
● Dynamically traces a running Java process
● Dynamically instruments the classes of the target application to inject tracing code ("bytecode tracing")
47. Findings from task manager
● No S3 writes
● 4 HEAD requests per checkpoint interval
○ 1 (subtask) * 2 (operators) * 2 (with and without trailing slash)
49. Math 301: metadata reqs
● ~200,000 operators
● Each operator creates 2 HEAD requests (with and without trailing slash)
● Checkpoint interval is 30 seconds
● ~13,000 (= 200,000 * 2 / 30) HEAD reqs/s from task managers, even though they write zero S3 files
50. Create CheckpointStreamFactory only once during operator initialization (FLINK-5800)
Fixed in 1.2.1 (https://github.com/apache/flink/pull/3312)
58. Current implementation issue (FLINK-8042)
● Reverts to a full restart immediately if the replacement container doesn't come back in time (FLINK-8042)
● Fix expected in FLIP-6
61. Recap of scaling stateless jobs
● Introduce a random prefix in the checkpoint path to spread S3 writes from many different jobs
● Avoid S3 writes from task managers
● Enable fine-grained recovery (+1 standby)
63. Often comes with data shuffling
[Diagram: a source → window → sink job with a keyBy shuffle in between; operator instances A1-A3 (source), B1-B3 (window), C1-C3 (sink)]
64. Challenges of large-state job
● Introduce random hex chars in checkpoint path to spread S3 writes from different jobs
○ Single job writes large state to S3
● Avoid S3 writes from task managers
○ Each task manager has large state
● Enable fine-grained recovery (+1 standby)
○ Connected job graph
65. Challenges of large-state job
● Single job writes large state to S3
● Each task manager has large state
● Connected job graph
81. Recap of scaling stateful jobs
● Inject a dynamic random prefix in the checkpoint path to spread S3 writes from operators in the same job
● Enable incremental checkpoints with RocksDB
● Challenge: the connected graph makes recovery more expensive
#2: Today I am going to share our experiences running Flink at scale in a cloud environment: what the challenges are, and what solutions we found.
#5: We run Flink on our Titus container platform. Titus is similar to Kubernetes; it was developed in house and is not open sourced yet.
#7: The Flink state backend defines the data structures that hold the state. It also implements the logic for taking a snapshot of the job state and storing that snapshot in a distributed file system like S3. Checkpointing is how Flink achieves fault tolerance.
#8: Flink supports S3 as the distributed storage system for checkpoint state out of the box. Hadoop and Presto both provide S3 adapters that implement the HDFS interface on top of Amazon S3.
S3 is very cost effective. It is scalable, although sometimes you may need to jump through some hoops. And it is highly durable, with eleven 9's of durability.
#9: S3 is designed as a massive (effectively infinitely large) storage system with very high durability. Netflix uses S3 for our data warehouse, which stores over a hundred petabytes of compressed data.
#10: S3 shards data by range partitioning. Object keys are stored in order across multiple partitions.
#11: With range partitioning, S3 can support prefix queries efficiently. In this example, when you query objects with this date prefix, S3 knows it only needs to look at partitions 1 and 2.
#12: If you have a big rollout and a sudden traffic jump, you will want to work with AWS to pre-partition your bucket for higher throughput.
#13: Using a sequential prefix, such as a timestamp, increases the likelihood that Amazon S3 will target one specific partition for a large number of your keys, overwhelming the I/O capacity of that partition.
#14: If your workload consistently exceeds 100 requests per second for a bucket, Amazon recommends avoiding sequential key names and introducing some random prefix into the key names, so that the key names, and hence the I/O load, are distributed across more than one partition.
Note that with random prefixes you can no longer do prefix queries, because there is no common prefix anymore.
#15: S3 is optimized for high I/O throughput, but not for small files. That's why our Hive data warehouse compacts small files into larger ones (a few hundred MB each) to improve read performance.
If you want to checkpoint at high frequency (e.g., every second), S3 is probably not the best choice. You would want to consider a store that can deliver consistently low latency (e.g., DynamoDB).
#17: At the 10,000-foot level, the Keystone data pipeline is responsible for moving data from producers to sinks for data consumption. We will get into more details of the Keystone pipeline when we talk about the Keystone router later.
#18: Pretty much every application publishes some data to our data pipeline.
#24: 2,000 jobs, and they come in different sizes. Some small jobs only need one container with 1 CPU; some large jobs have over 100 containers, each with 8 CPUs.
#26: Let's zoom in a little on how Flink performs checkpoints. As the checkpoint barrier passes, each operator snapshots its state and uploads the snapshot to S3. In other words, each operator writes to S3 during each checkpoint cycle.
#27: The actual write rate is probably 2-3x lower than 6,600, because only the Kafka source operators have state and need to write to S3. But even ~2,000 writes per second is still a lot.
While it is straightforward to do a back-of-the-envelope calculation for the write volume, it is difficult to estimate the request rates for the other S3 operations (like GET or LIST) that also occur.
#30: At the beginning, we set the checkpoint path like this. Using a timestamp causes sequential key names, and as we said earlier, sequential keys don't scale well.
#31: We said earlier that we need to avoid sequential key names if we want to scale beyond 100 requests/second without throttling. So we introduced a 4-character random hex prefix into the S3 checkpoint path (e.g., s3://bucket/4a7f/checkpoints/<job>). Such random hex characters distribute the S3 writes from many different routing jobs across different S3 partitions. This is just a trick in our deployment tooling; no change is needed in Flink.
#33: Each operator writes a checkpoint file to S3 for its own state. For a stateless job, this creates many small files. After writing the snapshot to S3, each operator sends an acknowledgement back to the job manager.
#34: After the job manager has received the acknowledgements from all operators, it writes an uber checkpoint file containing all the metadata from those acknowledgements.
#35: Flink has this awesome memory-threshold feature. We set the threshold to 1 MB for the Keystone router.
#36: If an operator's state is smaller than this threshold (the default is 1,024 bytes), the task manager ships the state to the job manager directly, without writing anything to S3.
#37: After the job manager has received the acknowledgements from all operators, it writes the uber checkpoint file with the state embedded alongside the other metadata.
#38: Again: Flink has this awesome memory-threshold feature, and we set the threshold to 1 MB for the Keystone router (see the sketch below).
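A minimal sketch of raising that threshold to 1 MB, assuming the Flink 1.x FsStateBackend constructor that takes a file-state size threshold; the config-file equivalent of that era is the state.backend.fs.memory-threshold key, and the bucket name is a placeholder:

import java.net.URI;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MemoryThresholdSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(30_000);
        // State smaller than the second argument (bytes) is shipped inline to the
        // job manager in the acknowledgement instead of becoming a tiny S3 file.
        env.setStateBackend(new FsStateBackend(new URI("s3://my-bucket/checkpoints"), 1024 * 1024));
        env.fromElements(1, 2, 3).print();  // placeholder job body
        env.execute("memory-threshold-sketch");
    }
}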
#40: If you are not familiar with S3: HEAD requests query object metadata, and PUT requests are writes. What really caught us by surprise is that there were ~150 times as many HEAD requests as PUT requests.
We enabled S3 access logging to dig in.
#41: The first request is for the directory without the trailing slash character, which always results in a 404 NoSuchKey failure. The second request, with the trailing slash, always succeeds. This is an unfortunate behavior of the Hadoop S3 file system implementation, but it is actually a minor issue in the whole picture, since it only accounts for a 2x difference. What about the other ~75x? That is the bigger fish we should target. I believe this minor issue still exists as of today.
#42: I manually spot-checked client IP addresses in the access log. Those HEAD requests all come from task managers. But task managers no longer write any checkpoint files to S3, so why are they making so many HEAD requests?
#43: To find out, I started running BTrace on a task manager process.
#49: I don't expect you to read the stack trace here; here is the takeaway. Even though the task manager doesn't actually write to S3, it still goes through the checkpoint code path, where an FsCheckpointStreamFactory object is created for each operator in each checkpoint cycle. The FsCheckpointStreamFactory constructor calls the mkdirs() method, which results in S3 metadata requests.
#50: Even though HEAD requests are fairly cheap metadata queries, they still count when S3 enforces request-rate throttling. And again, S3 is not optimized for high request rates.
#51: The key problem is that the CheckpointStreamFactory is created in each checkpoint cycle. After we shared this finding on 1.2.0, Stephan Ewen quickly fixed it in the 1.2.1 release.
#52: For stateless jobs, I strongly encourage you to consider the fine-grained recovery that Flink has implemented since 1.3; a configuration sketch follows below.
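On a real cluster this is set in flink-conf.yaml; what follows is a hedged sketch using a local environment and the 1.3-era key and value names (later Flink versions call the strategy "region"):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FineGrainedRecoverySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Restart only the failed tasks (and what they are connected to)
        // instead of the whole job graph.
        conf.setString("jobmanager.execution.failover-strategy", "individual");
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(4, conf);
        env.fromElements(1, 2, 3).print();  // placeholder job body
        env.execute("fine-grained-recovery-sketch");
    }
}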
#53: Here is a simple, embarrassingly parallel job DAG: no data shuffling, three operators, each running with a parallelism of 3. A is the source operator and C is the sink operator.
#54: (Same simple, embarrassingly parallel DAG as the previous slide: no data shuffling, three operators at parallelism 3, A the source and C the sink.)
#55: Flink only needs to restart the portion of the DAG marked in gray. The other parallel chains are unaffected and untouched.
#57: This graph shows the impact of a full job restart. The X axis is time; the Y axis is the message rate per second. The red line is the incoming message rate to the Kafka topic; the blue line is the rate at which the Flink job consumes records. In this graph the message rate peaks at 800K messages per second and is coming off peak hours. We enabled Chaos Monkey to kill one container every 10 minutes. You can see that each kill caused a full job restart, followed by a recovery spike of over 2x the incoming message rate. That means significant duplicates, which can be problematic for some jobs. You may wonder why we would run Chaos Monkey with such frequent kills. It simulates a real-world scenario: as I mentioned earlier, our Flink jobs run on the Titus container platform, and when the Titus team updates code on the agent hosts, they kill one container per ASG every 10 minutes to evacuate containers off the old agents.
#58: The small bumps are fine-grained recovery working; the big spikes are full restarts. This Flink job is actually not too bad: only a small number of recoveries reverted to a full restart. In another job, we have seen it revert to a full restart over 80% of the time.
#60: That is how we reduce or avoid reverting to a full restart.
#61: The same Flink job with fine-grained recovery enabled. This is a 20-node cluster, so killing one task manager takes out about 5% of the job graph, and the recovery bump is proportional to that, at ~5%.
#63: Now let's shift gears from stateless computation to stateful computation, and look at the challenges and some of the solutions for scaling large-state jobs. By large state, I mean as large as terabytes.
#64: A stateful job often has data shuffling to bring events for the same key to the same operator. This is a connected graph now, not embarrassingly parallel anymore.
#66: Here the challenge is that hundreds or thousands of parallel operators from the same job are writing large state to S3.
#67: We introduced a new config option to dynamically substitute the "_entropy_key" substring in the checkpoint path with a 4-character random hex string for each S3 write. In other words, each operator gets a checkpoint path with its own random prefix. This way, we spread the S3 writes from different operators of the same Flink job across different S3 partitions.
#68: We would like to contribute this improvement back; we are discussing it with the community in FLINK-9061. A toy sketch of the substitution idea follows below.
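This is purely illustrative, not our actual patch: each write replaces the placeholder with fresh random hex characters. (Upstream Flink later shipped a similar mechanism as "entropy injection", configured through the s3.entropy.key and s3.entropy.length options.)

import java.util.concurrent.ThreadLocalRandom;

public class EntropyPathSketch {

    // Replace the placeholder with 4 random hex chars, a fresh value per write.
    static String injectEntropy(String pathTemplate) {
        String hex = Integer.toHexString(
                ThreadLocalRandom.current().nextInt(0x10000) | 0x10000).substring(1);
        return pathTemplate.replace("_entropy_key", hex);
    }

    public static void main(String[] args) {
        // Prints something like s3://bucket/3f9a/checkpoints/my-job/chk-42
        System.out.println(injectEntropy("s3://bucket/_entropy_key/checkpoints/my-job/chk-42"));
    }
}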
#70: For a large-state job, we had to do the following tuning so that the Flink job can keep up with the state churn and checkpointing (see the sketch after these notes).
For very large state (terabytes), you probably only want to use the RocksDB state backend in Flink; the memory and filesystem state backends just can't scale to very large state.
Our containers come with SSD ephemeral disks, and Flink has predefined RocksDB tuning for SSD drives that works well out of the box.
Since this job has a large cluster size and high parallelism, we also found it helpful to increase the network buffer memory from the default 1 GB to 4 GB.
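A minimal sketch of those tunings in code, assuming the flink-statebackend-rocksdb dependency is on the classpath; the bucket name is a placeholder, and the network-buffer key mentioned in the comment comes from later Flink docs, so verify it against your version:

import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LargeStateTuningSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(30_000);
        // RocksDB state backend; the second argument enables incremental checkpoints.
        RocksDBStateBackend backend =
                new RocksDBStateBackend("s3://my-bucket/checkpoints", true);
        // Predefined tuning profile for SSD-backed local disks.
        backend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
        env.setStateBackend(backend);
        // Network buffer memory is raised cluster-side in flink-conf.yaml,
        // e.g. taskmanager.network.memory.max: 4gb (key name varies by version).
        env.fromElements(1, 2, 3).print();  // placeholder job body
        env.execute("large-state-tuning-sketch");
    }
}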
#71: I want to share some performance test numbers. By no means are we claiming this is the best you can do with Flink; I just want to give you some idea of what is possible with Flink today. There is plenty of room for improvement, both in the Flink platform and in our application.
#72: For those not familiar with savepoints: a savepoint is like a checkpoint, but it allows you to rescale the job to a different parallelism. We use savepoints to get an idea of the total state size.
#73: We are pretty happy with these numbers. At the least, they show that we can build a large-state application on Flink that deals with terabytes of state.
#78: Assume A1, B1, and C1 run on TM #1, and similarly for TM #2 and TM #3. When TM #3 gets terminated, the full job gets restarted.
#79: Currently, all operators on all task managers download data from S3 and recover from the downloaded data. Is that really necessary? Obviously task manager #3 has no choice: its ephemeral disk is lost when the container is terminated, so the data is gone. But what about task managers #1 and #2? They are still running, and their local disks still have the data. If we could reschedule the same operators onto the same task managers, they potentially would not need to download data from S3.
#80: That is exactly what the upcoming feature called task-local recovery will do. Flink implements scheduling affinity that puts the same operators back on the same task managers. This way, task managers #1 and #2 can recover the job from local data (a configuration sketch follows below).
This may not be a big deal in a cluster with 3 task manager nodes, but think about the large-state job I showed earlier for the performance numbers. Instead of all 200 task managers going to S3 to download 21 TB of state, with task-local recovery only 1 task manager needs to download ~100 GB of state from S3. That makes a huge difference.
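A hedged sketch of turning this on, using the state.backend.local-recovery key that task-local recovery shipped with (Flink 1.5+); on a real cluster the setting belongs in flink-conf.yaml:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class LocalRecoverySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Keep a secondary copy of each operator's state on local disk and
        // prefer it over the S3 copy when the task is rescheduled locally.
        conf.setString("state.backend.local-recovery", "true");
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironment(4, conf);
        env.fromElements(1, 2, 3).print();  // placeholder job body
        env.execute("task-local-recovery-sketch");
    }
}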
#81: Once task-local recovery is available, we also want to explore using EBS with it. For those not familiar with EBS (Elastic Block Store): you can think of an EBS volume as a network-attached hard drive that can be mounted to an instance, and to only one instance at a time.
Even for task manager #3, after the replacement container comes up it can attach the EBS volume from the previously terminated container; the data is still there in the persistent EBS volume, so task manager #3 can also recover from local data. Nobody needs to download anything from S3, which will make recovery much faster.
#83: Before opening up for questions, I want to mention that I will be at the O'Reilly booth between 3 and 4 pm this afternoon. If you have more questions or would just like to chat, please drop by.