How to Improve the Observability of Apache Cassandra and Kafka applications with Prometheus and OpenTracing

Mar 26, 2019

As distributed cloud applications grow more complex, dynamic, and massively scalable, “observability” becomes more critical. Observability is the practice of using metrics, monitoring and distributed tracing to understand how a system works. We’ll explore two complementary Open Source technologies: Prometheus for monitoring application metrics, and OpenTracing and Jaeger for distributed tracing. We’ll discover how they improve the observability of an Anomaly Detection application, deployed on AWS Kubernetes, and using Instaclustr managed Apache Cassandra and Kafka clusters.

How to Improve
the Observability of
Apache Cassandra and Kafka applications
with Prometheus and OpenTracing
March 27 2019
Paul Brebner
Technology Evangelist
instaclustr.com

As distributed cloud
applications grow more
complex, dynamic, and
massively scalable,
“observability” becomes
more critical.
Observability is the practice
of using metrics, monitoring
and distributed tracing to
understand how a system
works.
And find the invisible cows
Observability
Critical

Open APM Landscape
Lots of options
https://openapm.io/landscape

Open APM Landscape
In this webinar we’ll explore two complementary Open
Source technologies:
- Prometheus for monitoring application metrics, and
- OpenTracing and Jaeger for distributed tracing.
We’ll discover how they improve the observability of
- an Anomaly Detection application,
- deployed on AWS Kubernetes, and
- using Instaclustr managed Apache Cassandra and
Kafka clusters.

Goal
To increase the
observability of this
anomaly detection
application
Kubernetes
Cluster
?

Cloud
context
Running across
Kafka, Cassandra, &
Kubernetes Clusters

Observability
Goal 1: Metrics
[Figure: metric graphs. Axis labels: TPS, Time, Rows, Anomalies. Series: Producer rate, Consumer rate, Anomaly checks rate, Detector duration, Rows returned, Anomaly rate]

Observability
Goal 2: Distributed
Tracing

Overview
1 Prometheus for Monitoring
2 OpenTracing for Distributed Tracing
3 Conclusions

Monitoring
with
Prometheus
Popular Open Source monitoring system from SoundCloud
Now a Cloud Native Computing Foundation (CNCF) project

Prometheus
Monitoring of
applications and
servers
Pull-based
Architecture &
Components…

Prometheus
Server
Responsible for service discovery, pulling metrics from monitored applications, storing metrics, and analysis of time series data

Prometheus
GUI
Built in simple graphing GUI, and
native support for Grafana

Prometheus
Optional push gateway and alerting

Prometheus
How does metrics
capture work?
Instrumentation and Agents (Exporters)
- Client libraries for instrumenting applications in
multiple programming languages
- Java client collects JVM metrics and enables
custom application metrics
- Node exporter for host hardware metrics

Prometheus
Data Model
■ Metrics
● Time series data
ᐨ timestamp and value; name, key:value pairs
● By convention name includes
ᐨ thing being monitored, logical type, and units
ᐨ e.g. http_requests_total, http_duration_seconds
■ Prometheus automatically adds labels
● Job, host:port
■ Metric types (only relevant for instrumentation)
● Counter (increasing values)
● Gauge (values up and down)
● Histogram
● Summary
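The data model above can be illustrated with a small sketch. This is a stdlib-only Java illustration, not the Prometheus client library: it only shows how a metric name plus a set of labels identifies one time series. The class name `SeriesKeySketch`, the job name, and the host:port value are all hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Stdlib-only sketch: a metric name plus its labels identifies one time series. */
final class SeriesKeySketch {

    /** Builds a canonical identifier such as name{k1="v1",k2="v2"}. */
    static String seriesKey(String name, Map<String, String> labels) {
        if (labels.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name).append('{');
        boolean first = true;
        // Copying into a TreeMap sorts label names, so the same label set always gives the same key.
        for (Map.Entry<String, String> e : new TreeMap<>(labels).entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(e.getKey()).append("=\"").append(e.getValue()).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, String> labels = new TreeMap<>();
        labels.put("job", "anomalia");            // hypothetical job name
        labels.put("instance", "10.0.0.1:1234");  // hypothetical host:port
        System.out.println(seriesKey("http_requests_total", labels));
    }
}
```

Two samples with the same name but different label values produce different keys, which is why they are stored and graphed as separate series.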

Target
metrics
Business metric
(Anomaly checks/s)
Diagnostic metrics
[Figure: metric graphs. Axis labels: TPS, Time, Rows, Anomalies. Series: Producer rate, Consumer rate, Anomaly checks rate, Detector duration, Rows returned, Anomaly rate]

Steps
Basic
■ Create and register Prometheus Metric types
● (e.g. Counter) for each timeseries type (e.g. throughputs) including
name and units
■ Instrument the code
● e.g. increment the count, using name of the component (e.g.
producer, consumer, etc) as label
■ Create HTTP server in code
■ Tell Prometheus where to scrape from (config file)
■ Run Prometheus Server
■ Browse to Prometheus server
■ View and select metrics, check that there’s data
■ Construct expression
■ Graph the expression
■ Run and configure Grafana for better graphs

Instrumentation
Counter example
// Requires the Prometheus Java client: import io.prometheus.client.Counter;
// appName is a String defined elsewhere in the application
// Use a single Counter for throughput metrics
// for all stages of the pipeline
// stages are distinguished by labels
static final Counter pipelineCounter = Counter
    .build()
    .name(appName + "_requests_total")
    .help("Count of executions of pipeline stages")
    .labelNames("stage")
    .register();
. . .
// After successful execution of each stage:
// increment producer/consumer/detector rate count
pipelineCounter.labels("producer").inc();
. . .
pipelineCounter.labels("consumer").inc();
. . .
pipelineCounter.labels("detector").inc();
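To make the "one Counter, many labelled stages" idea concrete, here is a minimal stdlib-only sketch of a labelled counter. It is an illustration of the concept only and does not reproduce how the Prometheus client is implemented; the class name `LabeledCounterSketch` is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Stdlib-only sketch of a counter whose children are distinguished by a "stage" label value. */
final class LabeledCounterSketch {
    // One independent, monotonically increasing count per label value.
    private final ConcurrentHashMap<String, LongAdder> children = new ConcurrentHashMap<>();

    /** Increments the child for the given stage, creating it on first use. */
    void inc(String stage) {
        children.computeIfAbsent(stage, k -> new LongAdder()).increment();
    }

    /** Returns the current count for a stage, or 0 if it was never incremented. */
    long get(String stage) {
        LongAdder child = children.get(stage);
        return child == null ? 0L : child.sum();
    }

    public static void main(String[] args) {
        LabeledCounterSketch pipeline = new LabeledCounterSketch();
        pipeline.inc("producer");
        pipeline.inc("producer");
        pipeline.inc("consumer");
        System.out.println("producer=" + pipeline.get("producer")
                + " consumer=" + pipeline.get("consumer"));
    }
}
```

Each label value gets its own count, so a single metric definition yields one time series per pipeline stage.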

Instrumentation
Gauge example
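// Requires the Prometheus Java client: import io.prometheus.client.Gauge;
// A Gauge can go up and down
// Used to measure the current value of some variable.
// pipelineGauge will measure duration of each labelled stage
static final Gauge pipelineGauge = Gauge
    .build()
    .name(appName + "_duration_seconds")
    .help("Gauge of stage durations in seconds")
    .labelNames("stage")
    .register();
. . .
// in detector pipeline, compute duration and set
// (assumes nowTime and startTime are millisecond timestamps;
// Gauge.setToTime() expects a Runnable to time, so set() is used
// for a duration that has already been computed)
double duration = (nowTime - startTime) / 1000.0;
pipelineGauge.labels("detector").set(duration);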
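Because the metric name ends in `_seconds`, the measured value has to be converted to seconds before it is recorded. The stdlib-only sketch below shows one way to time a stage and convert the result; `StageTimerSketch` and its method names are hypothetical helpers, not part of any library.

```java
/** Stdlib-only sketch: time a pipeline stage and express the duration in seconds. */
final class StageTimerSketch {

    /** Converts a nanosecond interval to seconds as a double. */
    static double nanosToSeconds(long nanos) {
        return nanos / 1_000_000_000.0;
    }

    /** Runs the stage and returns how long it took, in seconds. */
    static double timeSeconds(Runnable stage) {
        long start = System.nanoTime();  // monotonic clock, suited to measuring intervals
        stage.run();
        return nanosToSeconds(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        double seconds = timeSeconds(() -> {
            // placeholder for the detector stage
        });
        System.out.println("detector stage took " + seconds + " s");
    }
}
```

The returned value is what would be passed to a gauge's `set(...)` call for the corresponding stage label.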

HTTP Server
For metric pulls
// Requires: import io.prometheus.client.exporter.HTTPServer;
//           import java.io.IOException;
// Metrics are pulled by Prometheus
// Create an HTTP server as the endpoint to pull from
// If there are multiple processes running on the same server
// then you need different port numbers
// Add IPs and port numbers to the Prometheus configuration
// file.
HTTPServer server = null;
try {
    server = new HTTPServer(1234);
} catch (IOException e) {
    e.printStackTrace();
}
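To show what such a pull endpoint does, here is a stdlib-only sketch that uses the JDK's built-in `com.sun.net.httpserver.HttpServer` instead of the Prometheus client's `HTTPServer`. It serves plain-text `name value` lines at `/metrics`. The class name, metric names, and the simplified output (no labels, HELP or TYPE lines) are assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Stdlib-only sketch of a scrape endpoint; the real client library does this for you. */
final class MetricsEndpointSketch {

    /** Renders one "name value" line per metric, sorted by name. */
    static String render(Map<String, Double> samples) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : new TreeMap<>(samples).entrySet()) {
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Starts an HTTP server that answers requests to /metrics with the current samples. */
    static HttpServer start(int port, Map<String, Double> samples) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = render(samples).getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        Map<String, Double> samples = new TreeMap<>();
        samples.put("anomalia_requests_total", 42.0);  // hypothetical metric
        start(1234, samples);  // same port as the example above; runs until the process is stopped
    }
}
```

Prometheus would then be configured to scrape this host and port, once per process when several processes share a server.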

Using
Prometheus
Configure
Run
■ Configure Prometheus with the IP addresses and ports to poll.
● Edit the default prometheus.yml file
● Includes polling frequency, timeouts etc.
● OK for testing but doesn’t scale for production systems
■ Get, install and run Prometheus.
● Initially just running locally.
Graphs
Counter
■ Browse to Prometheus Server URL
■ No default dashboards
■ View and select metrics
■ Execute them to graph
■ Counter value increases over time

Rate
Graph using irate
function
■ Enter expressions, e.g. irate function
■ Expression language has multiple data types and many
functions
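In the expression browser, a per-second rate is typically written in the form `irate(<metric>[1m])`, where `<metric>` stands for the counter name. The stdlib-only sketch below illustrates the idea behind an instant rate: it uses only the last two samples in the window and treats a decrease as a counter reset. It is a simplified illustration, not Prometheus code, and `IrateSketch` is a hypothetical name.

```java
/** Stdlib-only sketch of an instant per-second rate over the last two samples of a counter. */
final class IrateSketch {

    /**
     * @param timesSeconds sample timestamps in seconds, ascending
     * @param values       counter values at those timestamps
     * @return per-second rate of increase between the last two samples
     */
    static double irate(double[] timesSeconds, double[] values) {
        int n = values.length;
        if (n < 2 || timesSeconds.length != n) {
            throw new IllegalArgumentException("need at least two samples with matching timestamps");
        }
        double dt = timesSeconds[n - 1] - timesSeconds[n - 2];
        double last = values[n - 1];
        double prev = values[n - 2];
        // A counter only goes up; a lower value means it restarted from zero.
        double increase = last < prev ? last : last - prev;
        return increase / dt;
    }

    public static void main(String[] args) {
        double[] t = {0, 15, 30};
        double[] v = {100, 130, 190};
        System.out.println(irate(t, v));  // (190 - 130) / 15
    }
}
```

Because only the two most recent samples are used, the result reacts quickly to changes but is noisier than an average over the whole window.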

Gauge
graph
Pipeline stage
durations in
seconds
■ Doesn’t need a function as it’s a Gauge

Grafana
Prometheus GUI ok
for debugging
Grafana better for
production
■ Install and run Grafana
■ Browse to Grafana URL, create a Prometheus data
source, add a Prometheus Graph.
■ Can enter multiple Prometheus expressions and graph
them on the same graph.
■ Example shows rate and duration metrics

Simple Test
configuration
Prometheus Server
outside Kubernetes
cluster, pulls metrics
from Pods
Dynamic/many
Pods are a
challenge
■ IP addresses to pull from are dynamic
● Have to update Prometheus pull configurations
● In production too many Pods to do this manually

Prometheus
on
Kubernetes
A few extra steps
makes life easier
■ Create and register Prometheus Metric types
● (e.g. Counter) for each timeseries type (e.g. throughputs) including name and
units
■ Instrument the code
● e.g. increment the count, using name of the component (e.g. producer,
consumer, etc) as label
■ Create HTTP server in code
■ Run Prometheus Server on Kubernetes cluster,
using Kubernetes Operator
■ Configure so it dynamically monitors selected Pods
■ Enable ingress and external access to Prometheus
server
■ Browse to Prometheus server
■ View and select metrics, check that there’s data
■ Construct expression
■ Graph the expression
■ Run and configure Grafana for better graphs

Prometheus
In production on Kubernetes
Use Prometheus Operator
1 Install Prometheus Operator and Run
2 Configure Service Objects to monitor Pods
3 Configure ServiceMonitors to discover Service Objects
4 Configure Prometheus objects to specify which ServiceMonitors should be included
5 Allow ingress to Prometheus by using a Kubernetes NodePort Service
6 Create Role-based access control rules for both Prometheus and Prometheus Operator
7 Configure AWS EC2 firewalls
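Steps 3 and 4 can be sketched with two Prometheus Operator custom resources. The fragment below is a minimal illustration, not the configuration used in the talk: all names, labels, the port name, and the service account are hypothetical.

```yaml
# Step 3: a ServiceMonitor selects Service objects by label
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: anomalia-monitor          # hypothetical
  labels:
    team: anomalia                # hypothetical label used by the selector below
spec:
  selector:
    matchLabels:
      app: anomalia               # matches the labels on the Service objects
  endpoints:
    - port: metrics               # named port on the Service that exposes the metrics
---
# Step 4: a Prometheus object selects which ServiceMonitors to include
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus  # needs the RBAC rules from step 6
  serviceMonitorSelector:
    matchLabels:
      team: anomalia
```

With selectors in place, Pods behind matching Services are picked up and dropped automatically, so no IP addresses need to be maintained by hand.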

Weave Scope
Showing Prometheus monitoring Pods
Prometheus now magically monitors Pods as they come and go
[Figure: Weave Scope view of the Prometheus Operator and the monitored Pods]

OpenTracing
Use Case:
Topology Maps
■ Prometheus collects and displays metric aggregations
● No dependency or order information, no single events
■ Distributed tracing shows “call tree” (causality, timing) for
each event
■ And Topology Maps

OpenTracing
Standard API for
distributed tracing
■ Specification, not implementation
■ Need
● Application instrumentation
● OpenTracing tracer
[Diagram: Traced Applications, API, Tracer implementations (Open Source, Datadog)]

Spans
Smallest logical unit
of work in
distributed system
■ Spans are smallest logical unit of work
● Have name, start time, duration, associated component
■ Simplest trace is a single span

Trace
Multi-span trace
■ Spans can be related
● ChildOf = synchronous dependency (wait)
● FollowsFrom = asynchronous relationships (no wait)
■ A Trace is a DAG of Spans.
● 1 or more Spans.

Instrumentation
Steps
■ Language specific client instrumentation
● Used to create spans in the application within the same process
■ Contributed libraries for frameworks
● E.g. Elasticsearch, Cassandra, Kafka etc.
● Used to create spans across process boundaries (Kafka producers
-> consumers)
■ Choose and Instantiate a Tracer implementation
// Example instrumentation for consumer -> detector spans
// Requires: io.opentracing.Tracer, io.opentracing.Span, io.opentracing.References
// initTracer(...) is not shown here; it instantiates the chosen Tracer implementation
static Tracer tracer = initTracer("AnomaliaMachina");
. . .
Span span1 = tracer.buildSpan("consumer").start();
. . .
span1.finish();
Span span2 = tracer
    .buildSpan("detector")
    .addReference(References.CHILD_OF, span1.context())
    .start();
. . .
span2.finish();

Tracing
across
process
boundaries
Inject/extract
metadata
■ To trace across process boundaries (processes,
servers, clouds) OpenTracing injects metadata into
the cross-process call flows to build traces across
heterogeneous systems.
■ Inject and extract a spanContext, how depends on
protocol.
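The inject/extract pattern can be sketched without any tracing library. In the stdlib-only example below, a `Map<String, String>` stands in for the protocol's headers (for example, Kafka record headers). The header key names and the `SpanContext` fields are assumptions chosen for illustration and do not follow any particular tracer's wire format.

```java
import java.util.HashMap;
import java.util.Map;

/** Stdlib-only sketch of propagating a span context across a process boundary. */
final class ContextPropagationSketch {

    static final class SpanContext {
        final String traceId;
        final String spanId;

        SpanContext(String traceId, String spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }
    }

    // Hypothetical header names; real tracers define their own.
    static final String TRACE_KEY = "sketch-trace-id";
    static final String SPAN_KEY = "sketch-span-id";

    /** Sender side: copy the context into the outgoing message headers. */
    static void inject(SpanContext ctx, Map<String, String> carrier) {
        carrier.put(TRACE_KEY, ctx.traceId);
        carrier.put(SPAN_KEY, ctx.spanId);
    }

    /** Receiver side: rebuild the context, or return null if the message was not traced. */
    static SpanContext extract(Map<String, String> carrier) {
        String traceId = carrier.get(TRACE_KEY);
        String spanId = carrier.get(SPAN_KEY);
        if (traceId == null || spanId == null) {
            return null;
        }
        return new SpanContext(traceId, spanId);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        inject(new SpanContext("trace-1", "span-7"), headers);  // producer process
        SpanContext received = extract(headers);                // consumer process
        System.out.println(received.traceId + " / " + received.spanId);
    }
}
```

The receiver uses the extracted context as the reference for its own new span, which is how spans from separate processes end up in the same trace.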

How to do
this for
Kafka?
Producer
Automatically
inserts a span
context into Kafka
headers using
Interceptors
// Register tracer with GlobalTracer:
GlobalTracer.register(tracer);
// Add TracingProducerInterceptor to sender properties:
senderProps.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG,
    TracingProducerInterceptor.class.getName());
// Instantiate KafkaProducer
KafkaProducer<Integer, String> producer =
    new KafkaProducer<>(senderProps);
// Send
producer.send(...);
// 3rd party library
// https://github.com/opentracing-contrib/java-kafka-client

Consumer
side
Extract spanContext
// Once you have a consumer record, extract
// the span context and
// create a new FOLLOWS_FROM span
SpanContext spanContext = tracer.extract(
    Format.Builtin.TEXT_MAP,
    new MyHeadersMapExtractAdapter(record.headers(), false));
Span newSpan = tracer.buildSpan("consumer")
    .addReference(References.FOLLOWS_FROM, spanContext)
    .start();

Jaeger
Tracer
Open Source Tracer
Uber/CNCF

Jaeger
Tracer
How to use?
• Tracers can have different architectures and protocols
• Jaeger should scale well in production because it
  • can use Cassandra and Spark
  • uses adaptive sampling
• Need to instantiate a Jaeger tracer in your code

Jaeger
GUI
■ Install and start Jaeger
■ Browse to Jaeger URL
■ Find traces by name, operation, and filter.
■ Select to drill down for more detail.

Jaeger
Single trace
■ Insight into total trace time, relationships and times
of spans
■ This is a trace of a single event through the
anomaly detector pipeline
● Producer (async)
● Consumer (async)
● Detector (async, with sync children)
ᐨ CassandraWrite
ᐨ CassandraRead
ᐨ AnomalyDetector

Jaeger
Dependencies view
■ Correctly shows anomaly detector topology
■ Only metric is number of spans observed
■ Can’t select subset of traces, or filter
■ Force-directed view; selecting a node highlights its
dependencies

Kafka
Challenge
Multiple Kafka topic
topologies
■ More complex example (application simulates
complex event flows across topics)
■ Show dependencies between source, intermediate
and sink Kafka topics.

Conclusions
Observations &
Alternatives
■ Topology view is basic (c.f. some commercial APMs)
■ Still need Prometheus for metrics
● in theory OpenTracing has everything needed for metrics.
■ Other OpenTracing tracers may be worth trying, e.g.
Datadog
■ OpenCensus is a competing approach.
■ Manual instrumentation is tedious and potentially
error prone, many commercial APMs use byte-code
injection to avoid this problem
■ The future? Kubernetes based service mesh
frameworks could construct traces for microservices
without instrumentation
● as they have visibility into how Pods interact with each other and
external systems
● and Pods only contain a single microservice, not a monolithic
application

Results
Scaled out to 48
Cassandra nodes
Approx 600 cores
for whole system
109 Pods for
Prometheus to
monitor
Producer rate metric
(9 Pods)
Peak Producer rate = 2.3 Million events/s
Prometheus was critical for collecting, computing and displaying
the metrics, as this needed to be done from multiple Pods

Business
metric
Detector rate
100 Pods
220,000 anomaly
checks/s computed
from 100 stacked
metrics
Anomaly Checks/s = 220,000
Prometheus was critical for tuning the system to achieve near perfect linear
scalability - used metrics for consumer and detector rate to tune thread pool
sizes to optimize anomaly checks/s, for increasingly bigger systems.
OpenTracing and Jaeger were useful during test deployment
- to check/debug if components were working together as expected
- but were not used in the final production deployment
- as more set-up was required, using the Jaeger Kubernetes Operator:
https://github.com/jaegertracing/jaeger-operator

Cassandra &
OpenTracing
Visibility into
Cassandra clusters?
■ OpenTracing the example application was
● Across Kafka producers/consumers
● And within the Kubernetes deployed application
■ What options are there for improved visibility of
tracing of Cassandra clusters?
■ Instaclustr managed service
● OpenTracing support for the C* driver
● May not require any support from C* clusters
● https://github.com/opentracing-contrib/java-cassandra-driver
■ Self-managed clusters
● end-to-end OpenTracing through a C* cluster
● May require support from C* cluster
● https://github.com/thelastpickle/cassandra-zipkin-tracing

Cassandra &
Prometheus
Visibility into
Cassandra clusters?
Option 1
Instaclustr managed
service
■ Prometheus monitoring of the example application
● limited to application metrics collected from Kubernetes Pods
■ What options are there for integration with Cassandra
Cluster metrics?
■ Instaclustr managed Cassandra
● 3rd party Prometheus exporter, native integration planned
● https://www.instaclustr.com/support/api-integrations/integrations/using-instaclustr-monitoring-api-prometheus/

Cassandra &
Prometheus
Visibility into
Cassandra clusters?
Option 2
Self-managed
clusters
■ Instaclustr Open Source contributions (under development)
● cassandra-exporter exports Cassandra metrics to Prometheus
ᐨ https://github.com/instaclustr/cassandra-exporter
● Kubernetes Operator for Apache Cassandra
ᐨ https://github.com/instaclustr/cassandra-operator/
● The Cassandra operator will create the appropriate objects to inform the
Prometheus operator about the metrics endpoints available from Cassandra
■ Instaclustr customers can then use
● Prometheus to monitor their own applications
● Prometheus federation to scrape the Cassandra Prometheus server to
integrate application and cluster metrics
ᐨ https://prometheus.io/docs/prometheus/latest/federation/

Prometheus
Federation
Prometheus servers can pull metrics from other Prometheus servers
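Federation is configured as an ordinary scrape job that targets another server's `/federate` endpoint. The fragment below is a minimal sketch for the application-side Prometheus; the job name, target host, and `match[]` selector are hypothetical and would need to match the Cassandra-side server.

```yaml
scrape_configs:
  - job_name: 'federate-cassandra'        # hypothetical job name
    scrape_interval: 15s
    honor_labels: true                    # keep the labels set by the source server
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="cassandra"}'             # hypothetical selector for the series to pull
    static_configs:
      - targets:
          - 'cassandra-prometheus:9090'   # hypothetical address of the other Prometheus server
```

Only series matching the `match[]` selectors are pulled, so application and cluster metrics can be combined without copying everything.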

More
information?
Anomalia Machina
Blogs: Massively
Scalable Anomaly
Detection with
Apache Kafka and
Cassandra
■ Anomalia Machina 5 – Application Monitoring with Prometheus
● https://www.instaclustr.com/anomalia-machina-5-1-application-monitoring-prometheus-massively-scalable-anomaly-detection-apache-kafka-cassandra/
■ Anomalia Machina 6 – Application Tracing with OpenTracing
● https://www.instaclustr.com/anomalia-machina-6-application-tracing-opentracing-massively-scalable-anomaly-detection-apache-kafka-cassandra/
■ Anomalia Machina 8 – Production Application Deployment with Kubernetes
● https://www.instaclustr.com/anomalia-machina-8-production-application-deployment-kubernetes-massively-scalable-anomaly-detection-apache-kafka-cassandra/
● Enabling Ingress into Kubernetes: Connecting Prometheus to the Application running in Kubernetes
■ Anomalia Machina 10 – Final Results (soon)
● Using Prometheus Operator
■ All Blogs

The End
Instaclustr Managed Platform
Multiple Open Source Technologies and Providers
www.instaclustr.com/platform/


The document provides an overview of Apache Samza, including its key differentiators and future plans. It discusses Samza's performance advantages from using local state instead of remote databases. Samza allows stateful stream processing and incremental checkpointing for applications with terabytes of state. It supports a variety of input sources, processing as a service on YARN or embedded as a library. Upcoming features include a high-level API, support for event time windows, pipelines, and exactly-once processing while auto-scaling local state.

QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons

QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemons

QConSF18 - Disenchantment: Netflix Titus, its Feisty Team, and Daemonsaspyker

Disenchantment is a Netflix show following the medieval misadventures of a hard-drinking princess, her feisty elf, and her personal demon. In this talk, we will follow the story of Netflix’s container management platform, Titus, which powers critical aspects of the Netflix business (video encoding & streaming, big data, recommendations & machine learning, and other workloads). We’ll cover the challenges growing Titus from 10’s to 1000’s of workloads. We’ll talk about our feisty team’s work across container runtimes, scheduling & control plane, and cloud infrastructure integration. We’ll talk about the demons we’ve found on this journey covering operability, security, reliability and performance.

Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...

Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...

Flink Forward Berlin 2017: Piotr Wawrzyniak - Extending Apache Flink stream p...Flink Forward

Many stream processing applications can benefit from or need to rely on the prediction made with machine learning (ML) methods. In this presentation, new features of Apache Samoa are presented with a real data processing scenario. These features make Apache SAMOA fully accessible for Apache Flink users: (1) the data stream processed within Apache Flink is forwarded to Apache Samoa stream mining engine to perform predictions with stream-oriented ML models, (2) ML models evolve after every labelled instance and, at the same time, new predictions are sent back to Apache Flink. In both cases, Apache Kafka is used for data exchange. Hence, Apache Samoa is used as stream mining engine, provided with input data from, and sending predictions to Apache Flink. During the presentation, real life aspects are illustrated with code examples, such as input and prediction stream integration and monitoring latency of data processing and stream mining.

So You Want to Write a Connector?

So You Want to Write a Connector?

So You Want to Write a Connector? confluent

(Randall Hauch, Confluent) Kafka Summit SF 2018 The Kafka Connect framework makes it easy to move data into and out of Kafka, and you want to write a connector. Where do you start, and what are the most important things to know? This is an advanced talk that will cover important aspects of how the Connect framework works and best practices of designing, developing, testing and packaging connectors so that you and your users will be successful. We’ll review how the Connect framework is evolving, and how you can help develop and improve it.

Flink Forward Berlin 2017: Steffen Hausmann - Build a Real-time Stream Proces...

Flink Forward Berlin 2017: Steffen Hausmann - Build a Real-time Stream Proces...

Flink Forward Berlin 2017: Steffen Hausmann - Build a Real-time Stream Proces...Flink Forward

The increasing number of available data sources in today's application stacks created a demand to continuously capture and process data from various sources to quickly turn high volume streams of raw data into actionable insights. Apache Flink addresses many of the challenges faced in this domain as it's specifically tailored to distributed computations over streams. While Flink provides all the necessary capabilities to process streaming data, provisioning and maintaining a Flink cluster still requires considerable effort and expertise. We will discuss how cloud services can remove most of the burden of running the clusters underlying your Flink jobs and explain how to build a real-time processing pipeline on top of AWS by integrating Flink with Amazon Kinesis and Amazon EMR. We will furthermore illustrate how to leverage the reliable, scalable, and elastic nature of the AWS cloud to effectively create and operate your real-time processing pipeline with little operational overhead.

Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)

Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)

Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil

Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR

Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR

Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRconfluent

Monitoring in Big Data Platform - Albert Lewandowski, GetInData

Monitoring in Big Data Platform - Albert Lewandowski, GetInData

Monitoring in Big Data Platform - Albert Lewandowski, GetInDataGetInData

Did you like it? Check out our blog to stay up to date: https://getindata.com/blog The webinar was organized by GetinData on 2020. During the webinar we explaned the concept of monitoring and observability with focus on data analytics platforms. Watch more here: https://www.youtube.com/watch?v=qSOlEN5XBQc Whitepaper - Monitoring ang Observability for Data Platform: https://getindata.com/blog/white-paper-big-data-monitoring-observability-data-platform/ Speaker: Albert Lewandowski Linkedin: https://www.linkedin.com/in/albert-lewandowski/ ___ Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets. Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries. https://getindata.com

MeetUp Monitoring with Prometheus and Grafana (September 2018)

MeetUp Monitoring with Prometheus and Grafana (September 2018)

MeetUp Monitoring with Prometheus and Grafana (September 2018)Lucas Jellema

This presentation introduces the concept of monitoring - focusing on why and how and finally on the tools to use. It introduces Prometheus (metrics gathering, processing, alerting), application instrumentation and Prometheus exporters and finally it introduces Grafana as a common companion for dashboarding, alerting and notifications. This presentations also introduces the handson workshop - for which materials are available from https://github.com/lucasjellema/monitoring-workshop-prometheus-grafana

Similar to How to Improve the Observability of Apache Cassandra and Kafka applications with Prometheus and OpenTracing (20)

Monitoring in Big Data Platform - Albert Lewandowski, GetInData

Monitoring in Big Data Platform - Albert Lewandowski, GetInData

Monitoring in Big Data Platform - Albert Lewandowski, GetInDataGetInData

Did you like it? Check out our blog to stay up to date: https://getindata.com/blog The webinar was organized by GetinData on 2020. During the webinar we explaned the concept of monitoring and observability with focus on data analytics platforms. Watch more here: https://www.youtube.com/watch?v=qSOlEN5XBQc Whitepaper - Monitoring ang Observability for Data Platform: https://getindata.com/blog/white-paper-big-data-monitoring-observability-data-platform/ Speaker: Albert Lewandowski Linkedin: https://www.linkedin.com/in/albert-lewandowski/ ___ Getindata is a company founded in 2014 by ex-Spotify data engineers. From day one our focus has been on Big Data projects. We bring together a group of best and most experienced experts in Poland, working with cloud and open-source Big Data technologies to help companies build scalable data architectures and implement advanced analytics over large data sets. Our experts have vast production experience in implementing Big Data projects for Polish as well as foreign companies including i.a. Spotify, Play, Truecaller, Kcell, Acast, Allegro, ING, Agora, Synerise, StepStone, iZettle and many others from the pharmaceutical, media, finance and FMCG industries. https://getindata.com

MeetUp Monitoring with Prometheus and Grafana (September 2018)

MeetUp Monitoring with Prometheus and Grafana (September 2018)

MeetUp Monitoring with Prometheus and Grafana (September 2018)Lucas Jellema

This presentation introduces the concept of monitoring - focusing on why and how and finally on the tools to use. It introduces Prometheus (metrics gathering, processing, alerting), application instrumentation and Prometheus exporters and finally it introduces Grafana as a common companion for dashboarding, alerting and notifications. This presentations also introduces the handson workshop - for which materials are available from https://github.com/lucasjellema/monitoring-workshop-prometheus-grafana

Prometheus with Grafana - AddWeb Solution, by AddWeb Solution Pvt. Ltd.

Prometheus is an open source monitoring tool that collects metrics from monitored applications and stores them for querying and alerting. Grafana is used to visualize the metrics collected by Prometheus through customizable dashboards. It connects Prometheus as a data source and fires queries to retrieve and display time-series data as graphs and visualizations. This allows monitoring of server status, resources, containers, probes, and other metrics to more easily track system performance and issues over time.

Prometheus - Utah Software Architecture Meetup - Clint Checketts, by clintchecketts

2022-05-23-DevOps pro Europe - Managing Apps at scale.pdf, by Łukasz Piątkowski

Monitoring with Prometheus, by Richard Langlois P. Eng.

MuleSoft Meetup Roma - Processi di Automazione su CloudHub, by Alfonso Martino

The document summarizes an event held by the Rome MuleSoft Meetup Group to discuss automation of processes on CloudHub using MuleSoft's Anypoint Platform. The agenda included presentations on using infrastructure as code to automate CloudHub setup, managing API proxies, and a Q&A session. A tool called the CloudHub Automation Tool was demonstrated, which uses Terraform and other open source tools to automate CloudHub configuration and setup of environments, users, and other resources through code. The document also provided information on migrating APIs from a legacy system to the Anypoint Platform at scale.

Slack in the Age of Prometheus, by George Luong

This document summarizes Slack's transition from Graphite to Prometheus for monitoring. It describes the issues with Graphite including difficulty discovering metrics, slow queries, lack of tagging, and inability to scale. Prometheus was chosen because it meets requirements for high availability, ease of use, fast queries, scaling, and customization. The document outlines Slack's Prometheus architecture with HA clusters and discusses challenges of monitoring many metrics from web apps and jobs. It also previews future plans including Consul for service discovery and adopting Thanos and per-service Prometheus instances.

Prometheus and Grafana, by Lhouceine OUHAMZA

Prometheus is an open-source monitoring system that collects metrics from configured targets, stores time-series data, and allows users to query and visualize the data. It works by scraping metrics over HTTP from applications and servers, storing the data in its time-series database, and providing a UI and query language to analyze the data. Prometheus is useful for monitoring system metrics like CPU usage and memory as well as application metrics like HTTP requests and errors.

About Qtp 92techgajanan

About Qtp_1 92techgajanan

About QTP 9.2chandrasekhar

DevOps Spain 2019. Beatriz Martínez-IBM, by atSistemas

This document provides an overview of cloud native monitoring with Prometheus. It discusses Prometheus and how it has become the standard for metrics-based monitoring. It covers monitoring systems and applications with Prometheus, including scraping metrics, querying, and instrumenting applications to expose metrics. It also discusses alerting with Alertmanager and scaling Prometheus through federation and projects like Thanos. The document aims to explain how Prometheus enables observability of systems in cloud native environments and the growing ecosystem around Prometheus.

Webinar Monitoring in era of cloud computing, by CREATE-NET

Create-Net is a research center that offers cloud computing research, consulting, training, and webinars. This webinar discusses monitoring in the cloud computing era, beginning with introductions to Ceilometer and Monasca. Ceilometer is OpenStack's metering framework that collects data from OpenStack services through agents and notifications. It stores data in a database and provides an API. Monasca is a monitoring as a service platform that processes metrics and events at scale through microservices and stores data for querying and visualization. The webinar concludes with a discussion of trends in cloud monitoring.

Build cloud native solution using open source, by Nitesh Jadhav

Native Support of Prometheus Monitoring in Apache Spark 3.0, by Databricks

Client-Side Performance Monitoring (MobileTea, Rome), by Andrew Rota

The document discusses effective strategies for monitoring client-side web performance. It recommends collecting both real user monitoring metrics from actual users as well as synthetic metrics from automated tests. It describes tools like Navigation Timing API, paint metrics, custom metrics, and open-source libraries that can capture metrics. It also discusses storing and visualizing metrics with tools like Graphite and Grafana and how to reduce noise and account for environment differences when analyzing performance data. The overall goal is to utilize performance metrics to inform decisions that improve the user experience.

System monitoring, by HardikBadola

This document provides guidance on setting up system monitoring using Prometheus, Node Exporter, and Grafana. It discusses why system monitoring is important for maintaining stability and catching issues early. Prometheus is an open-source monitoring tool that collects and stores metrics. Node Exporter exposes system metrics to Prometheus. Grafana is used to create visualizations and dashboards from Prometheus data. The document outlines installing and configuring these tools, including configuring Prometheus to scrape Node Exporter and setting up Grafana.

Prometheus-Grafana-RahulSoni1584KnolX.pptx.pdf, by Knoldus Inc.

Monitoring kubernetes with prometheus-operator, by Lili Cosic

This document summarizes monitoring Kubernetes clusters with the prometheus-operator. It introduces the prometheus-operator project, describes the main components it provides like Prometheus, Alertmanager, ServiceMonitor and PodMonitor custom resources. It explains how these resources work and how the operator configures and deploys monitoring targets. It also introduces the kube-prometheus project which provides manifests to easily monitor a Kubernetes cluster out of the box. Finally it provides tips on troubleshooting and where to find help and documentation for using the prometheus-operator.

More from Paul Brebner (20)

Streaming More For Less With Apache Kafka Tiered Storage

Apache Kafka's tiered storage is not just a new feature but a major architectural shift that enables virtually unlimited storage. Traditionally designed for fast, high-throughput real-time streaming, Kafka now also supports more extensive data retention and replay capabilities. This talk will delve into the mysteries of Kafka's time and space, exploring the architectural changes behind tiered storage and how it functions—whether it's more like a tiered fountain or a pumped hydro dam. We'll uncover the performance, scalability, tuning, sizing and cost impacts, and examine intriguing and challenging Kafka replaying use cases. Talk from Day 3 of FOSSASIA 2025 Bangkok in the Cloud and DevOps track, https://eventyay.com/e/4c0e0c27/session/9517

30 Of My Favourite Open Source Technologies In 30 Minutes

Closing talk in the main auditorium at FOSSASIA (Hanoi, Vietnam, April 10 2024). What do the following apparently unrelated Open Source technologies have in common? Apache Cassandra, Apache Lucene, Apache Spark, Apache Zeppelin, Apache Kafka, Apache Kafka Connect, Apache Kafka Streams, Apache Kafka MirrorMaker2, Apache Camel, Apache Superset, Apache ZooKeeper, Apache Curator, Kubernetes, Guava, Redis, OpenSearch, PostgreSQL, Prometheus, Grafana, OpenTracing, Jaeger, Debezium, Karapace, Cadence, FerretDB, TensorFlow, and more! They are all technologies that I've used over the last 7 years to help solve challenging big data application problems. This talk will take a bird's eye view of each one and how they can be used together in your next big data project.

Superpower Your Apache Kafka Applications Development with Complementary Open...

Kafka Summit talk (Bangalore, India, May 2, 2024, https://events.bizzabo.com/573863/agenda/session/1300469 ) Many Apache Kafka use cases take advantage of Kafka’s ability to integrate multiple heterogeneous systems for stream processing and real-time machine learning scenarios. But Kafka also exists in a rich ecosystem of related but complementary stream processing technologies and tools, particularly from the open-source community. In this talk, we’ll take you on a tour of a selection of complementary tools that can make Kafka even more powerful. We’ll focus on tools for stream processing and querying, streaming machine learning, stream visibility and observation, stream meta-data, stream visualisation, stream development including testing and the use of Generative AI and LLMs, and stream performance and scalability. By the end you will have a good idea of the types of Kafka “superhero” tools that exist, which are my favourites (and what superpowers they have), and how they combine to save your Kafka applications development universe from swamploads of data stagnation monsters!

Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...

Closing talk for the Performance Engineering track at Community Over Code EU (Bratislava, Slovakia, June 5 2024) https://eu.communityovercode.org/sessions/2024/why-apache-kafka-clusters-are-like-galaxies-and-other-cosmic-kafka-quandaries-explored/ Instaclustr (now part of NetApp) manages 100s of Apache Kafka clusters of many different sizes, for a variety of use cases and customers. For the last 7 years I’ve been focused outwardly on exploring Kafka application development challenges, but recently I decided to look inward and see what I could discover about the performance, scalability and resource characteristics of the Kafka clusters themselves. Using a suite of Performance Engineering techniques, I will reveal some surprising discoveries about cosmic Kafka mysteries in our data centres, related to: cluster sizes and distribution (using Zipf’s Law), horizontal vs. vertical scalability, and predicting Kafka performance using metrics, modelling and regression techniques. These insights are relevant to Kafka developers and operators.

Architecting Applications With Multiple Open Source Big Data Technologies

Keynote for Data Engineering track at Community over Code EU (Bratislava, Slovakia, June 4 2024) https://eu.communityovercode.org/sessions/2024/architecting-applications-with-multiple-open-source-big-data-technologies/ When I started as the Instaclustr Technology Evangelist 7 years ago, I already had a background in computer science R&D and thought I knew a few things about architecting complex distributed systems. But it was still challenging to learn multiple new Apache (and other) Big Data technologies and build and scale realistic demonstration applications for domains such as IoT/logistics, fintech, anomaly detection, geospatial data, data pipelines and a drone delivery application - with streaming machine learning. What did I learn that my younger (-7 years) self could have benefited from? This talk highlights some of my discoveries using Apache Cassandra, Lucene, Kafka, Kafka Connect, Kafka Streams, Camel, Superset; and Karapace, PostgreSQL, Debezium, OpenSearch, Uber’s Cadence (for workflow orchestration), and more.

The Impact of Hardware and Software Version Changes on Apache Kafka Performan...

Apache Kafka's performance and scalability can be impacted by both hardware and software dimensions. In this presentation, we explore two recent experiences from running a managed Kafka service. The first example recounts our experiences with running Kafka on AWS's Graviton2 (ARM) instances. We performed extensive benchmarking but didn't initially see the expected performance benefits. We developed multiple hypotheses to explain the unrealized performance improvement, but we could not experimentally determine the cause. We then profiled the Kafka application, and after identifying and confirming a likely cause, we found a workaround and obtained the hoped-for improved price/performance. The second example explores the ability of Kafka to scale with increasing partitions. We revisit our previous benchmarking experiments with the newest version of Kafka (3.X), which has the option to replace Zookeeper with the new KRaft protocol. We test the theory that Kafka with KRaft can 'scale to millions of partitions' and also provide valuable experimental feedback on how close KRaft is to being production-ready. Presentation for the ApacheCon NA Performance Engineering Track, October 6, 2022, Sheraton Hotel, New Orleans.

Apache ZooKeeper and Apache Curator: Meet the Dining Philosophers

Spinning your Drones with Cadence Workflows and Apache Kafka

The rapid rise in Big Data use cases over the last decade has been accelerated by popular massively scalable open-source technologies such as Apache Cassandra® for storage, Apache Kafka® for streaming, and OpenSearch® for search. Now there’s a new member of the peloton, Cadence, for orchestration - code-based scalable fault-tolerant workflow orchestration. To illustrate the most important Cadence concepts (and more) we’ll build a realistic drone delivery service demonstration application. We’ll also explore what happens when orchestration meets choreography, and use the drone application to illustrate different ways to integrate Cadence with Apache Kafka, including reusing Kafka microservices. But how scalable is Cadence in practice? We’ll fill the sky with drones - how many drones can we get flying at once?

Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...

Modern event-based/streaming distributed systems embrace the idea that change is inevitable and actually desirable! Without being change-aware, systems are inflexible, can’t evolve or react, and are simply incapable of keeping up with real-time real-world data. But how can we speed up an “Elephant” (PostgreSQL) to be as fast as a “Cheetah” (Kafka)? In this talk, we'll introduce the Debezium PostgreSQL Connector, and explain how to deploy, configure and run it on a Kafka Connect cluster, explore the semantics and format of the change data events (including Schemas and Table/Topic mapping), and test the performance. Finally, we'll show how to stream the change data events into an example downstream system, Elasticsearch, using an open source sink connector. Presentation for PostgresConf.CN and PGConf.Asia 2021 https://www.highgo.ca/2022/01/19/2021-pg-asia-conference-delivered-another-successful-online-conference-again/

Scaling Open Source Big Data Cloud Applications is Easy/Hard

In the last decade, the development of modern horizontally scalable open-source Big Data technologies such as Apache Cassandra (for data storage), and Apache Kafka (for data streaming) enabled cost-effective, highly scalable, reliable, low-latency applications, and made these technologies increasingly ubiquitous. To enable reliable horizontal scalability, both Cassandra and Kafka utilize partitioning (for concurrency) and replication (for reliability and availability) across clustered servers. But building scalable applications isn’t as easy as just throwing more servers at the clusters, and unexpected speed humps are common. Consequently, you also need to understand the performance impact of partitions, replication, and clusters; monitor the correct metrics to have an end-to-end view of applications and clusters; conduct careful benchmarking, and scale and tune iteratively to take into account performance insights and optimizations. In this presentation, I will explore some of the performance goals, challenges, solutions, and results I discovered over the last 5 years building multiple realistic demonstration applications. The examples will include trade-offs with elastic Cassandra auto-scaling, scaling a Cassandra and Kafka anomaly detection application to 19 Billion checks per day, and building low-latency streaming data pipelines using Kafka Connect for multiple heterogeneous source and sink systems. Invited keynote for 5th Workshop on Hot Topics in Cloud Computing Performance (HotCloudPerf 2022) https://hotcloudperf.spec.org/ at ICPE 2022 https://icpe2022.spec.org/

OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard

DeveloperWeek Management 2022 Conference Presentation https://www.developerweek.com/global/conference/management/schedule/ In the last decade, the development of modern horizontally scalable open-source Big Data technologies such as Apache Cassandra (for data storage), and Apache Kafka (for data streaming) enabled cost-effective, highly scalable, reliable, low-latency applications, and made these technologies increasingly ubiquitous. To enable reliable horizontal scalability, both Cassandra and Kafka utilize partitioning (for concurrency) and replication (for reliability and availability) across clustered servers. But building scalable applications isn’t as easy as just throwing more servers at the clusters, and unexpected speed humps are common. Consequently, you also need to understand the performance impact of partitions, replication, and clusters; monitor the correct metrics to have an end-to-end view of applications and clusters; conduct careful benchmarking, and scale and tune iteratively to take into account performance insights and optimizations. In this presentation, I will explore some of the performance goals, challenges, solutions, and results I discovered over the last 5 years building multiple realistic demonstration applications. The examples will include trade-offs with elastic Cassandra auto-scaling, scaling a Cassandra and Kafka anomaly detection application to 19 Billion checks per day, and building low-latency streaming data pipelines using Kafka Connect for multiple heterogeneous source and sink systems.

A Visual Introduction to Apache Kafka

In this Cartoon Style Visual Introduction to Apache Kafka we’re going to build a “Postal Service” to deliver party invitations to two groups, Nerds and Pugsters – find out who goes to the party. Along the way we’ll learn about Kafka Producers, Consumers, Groups, Topics, Partitions, Keys, Records, Delivery Semantics (Guaranteed delivery, and who gets what messages). We’ll also have a quick look at Streams (mail sorting) and Connectors (how mail gets delivered between post offices). Presentation for Open Source 101 2022: https://opensource101.com/sessions/a-visual-introduction-to-apache-kafka/ Video: https://youtu.be/NUnsHFn52sE

Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...

This presentation will explore how we added location data to a scalable real-time anomaly detection application, built around Apache Kafka, and Cassandra. Kafka and Cassandra are designed for time-series data, however, it’s not so obvious how they can efficiently process spatiotemporal data (space and time). In order to find location-specific anomalies, we need ways to represent locations, to index locations, and to query locations. We explore alternative geospatial representations including: Latitude/Longitude points, Bounding Boxes, Geohashes, and go vertical with 3D representations, including 3D Geohashes. For each representation we also explore possible Cassandra implementations including: Clustering columns, Secondary indexes, Denormalized tables, and the Cassandra Lucene Index Plugin. To conclude we measure and compare the query throughput of some of the solutions, and summarise the results in terms of accuracy vs. performance to answer the question “Which geospatial data representation and Cassandra implementation is best?” ApacheCon NA 2020 Geospatial track presentation https://www.apachecon.com/acah2020/tracks/geospatial.html

Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...

With the rapid onset of the global Covid-19 Pandemic from the start of this year the USA Centers for Disease Control and Prevention (CDC) had to quickly implement a new Covid-19 specific pipeline to collect testing data from all of the USA’s states and territories, and carry out other critical steps including integration, cleaning, checking, enrichment, analysis, and enforcing data governance and privacy etc. The pipeline then produces multiple consumable results for federal and public agencies. They did this in under 30 days, using Apache Kafka. In this presentation we'll build a similar (but simpler) pipeline for ingesting, integrating, indexing, searching/analysing and visualising some publicly available tidal data. We'll briefly introduce each technology and component, and walk through the steps of using Apache Kafka, Kafka Connect, Elasticsearch and Kibana to build the pipeline and visualise the results.

Grid Middleware – Principles, Practice and Potential

A presentation I gave at the UCL Computer Science department in 2004, while I was on leave from CSIRO and managing the UK OGSA Evaluation Project, working with Wolfgang Emmerich. Paul Brebner, University College London, Computer Science Department Seminar: "Grid Middleware - Principles, Practice, and Potential", 1 November 2004. The project page was still here (2020): http://sse.cs.ucl.ac.uk/UK-OGSA/

Grid middleware is easy to install, configure, secure, debug and manage acros...

A presentation made while I was managing the UK OGSA Evaluation Project in 2004, while I was on leave from CSIRO, at UCL Computer Science department, working with Wolfgang Emmerich: in which we "believe 6 impossible things before breakfast". This project encountered and partially solved many of the problems that Cloud computing finally solved. Paul Brebner, Oxford University Computing Laboratory invited talk: "Grid middleware is easy to install, configure, debug and manage - across multiple sites (One can't believe impossible things)", 15 October 2004. The project web site is still here (2020): http://sse.cs.ucl.ac.uk/UK-OGSA/

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

This presentation will explore how we added location data to a scalable real-time anomaly detection application, built around Apache Kafka, and Cassandra. Kafka and Cassandra are designed for time-series data, however, it’s not so obvious how they can process geospatial data. In order to find location-specific anomalies, we need a way to represent locations, index locations, and query locations. We explore alternative geospatial representations including: Latitude/Longitude points, Bounding Boxes, Geohashes, and go vertical with 3D representations, including 3D Geohashes. To conclude we measure and compare the query throughput of some of the solutions, and summarise the results in terms of accuracy vs. performance to answer the question “Which geospatial data representation and Cassandra implementation is best?” This version is a slightly shorter version of previous ones. Google Cloud Special Edition, Sydney Data Engineering Meetup https://www.meetup.com/Sydney-Data-Engineering-Meetup/events/269146076/

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

This presentation will explore how we added location data to a scalable real-time anomaly detection application, built around Apache Kafka, and Cassandra. Kafka and Cassandra are designed for time-series data, however, it’s not so obvious how they can process geospatial data. In order to find location-specific anomalies, we need a way to represent locations, index locations, and query locations. We explore alternative geospatial representations including: Latitude/Longitude points, Bounding Boxes, Geohashes, and go vertical with 3D representations, including 3D Geohashes. For each representation we also explore possible Cassandra implementations including: Clustering columns, Secondary indexes, Denormalized tables, and the Cassandra Lucene Index Plugin. To conclude we measure and compare the query throughput of some of the solutions, and summarise the results in terms of accuracy vs. performance to answer the question “Which geospatial data representation and Cassandra implementation is best?” Updated version of presentation for 30 April 2020 Melbourne Distributed Meetup (online)

Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...

Apache Kafka, Apache Cassandra and Kubernetes are open source big data technologies enabling applications and business operations to scale massively and rapidly. While Kafka and Cassandra underpin the data layer of the stack, providing the capability to stream, disseminate, store and retrieve data at very low latency, Kubernetes is a container orchestration technology that helps with automated application deployment and scaling of application clusters. In this presentation, Paul will reveal how he architected a massive scale deployment of a streaming data pipeline with Kafka and Cassandra to cater to an example Anomaly detection application running on a Kubernetes cluster and generating and processing a massive number of events. Anomaly detection is a method used to detect unusual events in an event stream. It is widely used in a range of applications such as financial fraud detection, security, threat detection, website user analytics, sensors, IoT, system health monitoring, etc. When such applications operate at massive scale, generating millions or billions of events, they impose significant computational, performance and scalability challenges on anomaly detection algorithms and data layer technologies. Paul will demonstrate the scalability, performance and cost effectiveness of Apache Kafka, Cassandra and Kubernetes, with results from his experiments allowing the Anomaly detection application to scale to 19 Billion anomaly checks per day. Melbourne Big Data Meetup, March 5 2020 https://www.eventbrite.com/e/melbourne-big-data-meetup-realtime-anomaly-detection-with-cassandra-kafka-tickets-93028445585

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Paul Brebner

Geospatial data makes it possible to leverage location, location, location! Geospatial data is taking off, as companies realize that just about everyone needs the benefits of geospatially aware applications. As a result there are no shortages of unique but demanding use cases of how enterprises are leveraging large-scale and fast geospatial big data processing. The data must be processed in large quantities - and quickly - to reveal hidden spatiotemporal insights vital to businesses and their end users. In the rush to tap into geospatial data, many enterprises will find that representing, indexing and querying geospatially-enriched data is more complex than they anticipated - and might bring about tradeoffs between accuracy, latency, and throughput.This presentation will explore how we added location data to a scalable real-time anomaly detection application, built around Apache Kafka, and Cassandra. Kafka and Cassandra are designed for time-series data, however, it’s not so obvious how they can process geospatial data. In order to find location-specific anomalies, we need a way to represent locations, index locations, and query locations. We explore alternative geospatial representations including: Latitude/Longitude points, Bounding Boxes, Geohashes, and go vertical with 3D representations, including 3D Geohashes. To conclude we measure and compare the query throughput of some of the solutions, and summarise the results in terms of accuracy vs. performance to answer the question “Which geospatial data representation and Cassandra implementation is best?”

Streaming More For Less With Apache Kafka Tiered Storage

Streaming More For Less With Apache Kafka Tiered Storage

Streaming More For Less With Apache Kafka Tiered StoragePaul Brebner

30 Of My Favourite Open Source Technologies In 30 Minutes

30 Of My Favourite Open Source Technologies In 30 Minutes

30 Of My Favourite Open Source Technologies In 30 MinutesPaul Brebner

Superpower Your Apache Kafka Applications Development with Complementary Open...

Superpower Your Apache Kafka Applications Development with Complementary Open...

Superpower Your Apache Kafka Applications Development with Complementary Open...Paul Brebner

Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...

Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...

Why Apache Kafka Clusters Are Like Galaxies (And Other Cosmic Kafka Quandarie...Paul Brebner

Architecting Applications With Multiple Open Source Big Data Technologies

Architecting Applications With Multiple Open Source Big Data Technologies

Architecting Applications With Multiple Open Source Big Data TechnologiesPaul Brebner

The Impact of Hardware and Software Version Changes on Apache Kafka Performan...

The Impact of Hardware and Software Version Changes on Apache Kafka Performan...

The Impact of Hardware and Software Version Changes on Apache Kafka Performan...Paul Brebner

Apache ZooKeeper and Apache Curator: Meet the Dining Philosophers

Apache ZooKeeper and Apache Curator: Meet the Dining Philosophers

Apache ZooKeeper and Apache Curator: Meet the Dining PhilosophersPaul Brebner

Spinning your Drones with Cadence Workflows and Apache Kafka

Spinning your Drones with Cadence Workflows and Apache Kafka

Spinning your Drones with Cadence Workflows and Apache KafkaPaul Brebner

Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...

Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...

Change Data Capture (CDC) With Kafka Connect® and the Debezium PostgreSQL Sou...Paul Brebner

Scaling Open Source Big Data Cloud Applications is Easy/Hard

Scaling Open Source Big Data Cloud Applications is Easy/Hard

Scaling Open Source Big Data Cloud Applications is Easy/HardPaul Brebner

OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard

OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/Hard

OPEN Talk: Scaling Open Source Big Data Cloud Applications is Easy/HardPaul Brebner

A Visual Introduction to Apache Kafka

A Visual Introduction to Apache Kafka

A Visual Introduction to Apache KafkaPaul Brebner

Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...

Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...

Massively Scalable Real-time Geospatial Anomaly Detection with Apache Kafka a...Paul Brebner

Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...

Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...

Building a real-time data processing pipeline using Apache Kafka, Kafka Conne...Paul Brebner

Grid Middleware – Principles, Practice and Potential

Grid Middleware – Principles, Practice and Potential

Grid Middleware – Principles, Practice and PotentialPaul Brebner

Grid middleware is easy to install, configure, secure, debug and manage acros...

Grid middleware is easy to install, configure, secure, debug and manage acros...

Grid middleware is easy to install, configure, secure, debug and manage acros...Paul Brebner

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Paul Brebner

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Paul Brebner

Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...

Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...

Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Paul Brebner

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...

Massively Scalable Real-time Geospatial Data Processing with Apache Kafka and...Paul Brebner

How to Improve the Observability of Apache Cassandra and Kafka applications with Prometheus and OpenTracing

1. How to Improve the Observability of Apache Cassandra and Kafka applications with Prometheus and OpenTracing March 27 2019 Paul Brebner Technology Evangelist instaclustr.com

2. As distributed applications grow more complex, dynamic, and massively scalable, “observability” becomes more critical. Observability is the practice of using metrics, monitoring and distributed tracing to understand how a system works. Observability Critical

3. As distributed cloud applications grow more complex, dynamic, and massively scalable, “observability” becomes more critical. Observability is the practice of using metrics, monitoring and distributed tracing to understand how a system works. And find the invisible cows Observability Critical

4. Open APM Landscape Lots of options https://openapm.io/landscape

5. Open APM Landscape In this webinar we’ll explore two complementary Open Source technologies: - Prometheus for monitoring application metrics, and - OpenTracing and Jaeger for distributed tracing. We’ll discover how they improve the observability of - an Anomaly Detection application, - deployed on AWS Kubernetes, and - using Instaclustr managed Apache Cassandra and Kafka clusters.

6. Goal To increase the observability of this anomaly detection application Kubernetes Cluster ?

7. Cloud context Running across Kafka, Cassandra, & Kubernetes Clusters

8. Observability Goal 1: Metrics [Figure: graphs of producer rate, consumer rate and anomaly checks rate (TPS), detector duration (time), rows returned (rows), and anomaly rate (anomalies)]

9. Observability Goal 2: Distributed Tracing

10. Overview 1. Prometheus for Monitoring 2. OpenTracing for Distributed Tracing 3. Conclusions

11. Monitoring with Prometheus Popular Open Source monitoring system from SoundCloud, now a Cloud Native Computing Foundation (CNCF) project

12. Prometheus Monitoring of applications and servers Pull-based Architecture & Components…

13. Prometheus Server Server responsible for service discovery, pulling metrics from monitored applications, storing metrics, and analysis of time series data

14. Prometheus GUI Built in simple graphing GUI, and native support for Grafana

15. Prometheus Optional push gateway and alerting

16. Prometheus How does metrics capture work? Instrumentation and Agents (Exporters) - Client libraries for instrumenting applications in multiple programming languages - Java client collects JVM metrics and enables custom application metrics - Node exporter for host hardware metrics

17. Prometheus Data Model ■ Metrics ● Time series data ᐨ timestamp and value; name, key:value pairs ● By convention name includes ᐨ thing being monitored, logical type, and units ᐨ e.g. http_requests_total, http_duration_seconds ■ Prometheus automatically adds labels ● Job, host:port ■ Metric types (only relevant for instrumentation) ● Counter (increasing values) ● Gauge (values up and down) ● Histogram ● Summary
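As a rough illustration of the data model on this slide, the sketch below renders one sample the way it appears on a scrape endpoint in the Prometheus text format: a metric name, optional key="value" labels in braces, then the value. The class name `SampleLine` and the metric name `demo_requests_total` are hypothetical, and escaping of special characters in label values is deliberately left out to keep the sketch short.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any Prometheus client library.
final class SampleLine {

    // Renders name{key="value",...} value, or just "name value" when there are no labels.
    // Assumption: label values contain no quotes, backslashes or newlines (no escaping here).
    static String render(String name, Map<String, String> labels, double value) {
        String labelText = labels.entrySet().stream()
            .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
            .collect(Collectors.joining(","));
        String series = labels.isEmpty() ? name : name + "{" + labelText + "}";
        return series + " " + value;
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("stage", "producer");
        // prints: demo_requests_total{stage="producer"} 42.0
        System.out.println(render("demo_requests_total", labels, 42.0));
    }
}
```

Each distinct combination of name and label values is its own time series, which is why one Counter with a "stage" label can cover the producer, consumer and detector.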

18. Target metrics Business metric (Anomaly checks/s) Diagnostic metrics [Figure: graphs of producer rate, consumer rate and anomaly checks rate (TPS), detector duration (time), rows returned (rows), and anomaly rate (anomalies)]

19. Steps Basic ■ Create and register Prometheus Metric types ● (e.g. Counter) for each timeseries type (e.g. throughputs) including name and units ■ Instrument the code ● e.g. increment the count, using name of the component (e.g. producer, consumer, etc) as label ■ Create HTTP server in code ■ Tell Prometheus where to scrape from (config file) ■ Run Prometheus Server ■ Browse to Prometheus server ■ View and select metrics, check that there’s data ■ Construct expression ■ Graph the expression ■ Run and configure Grafana for better graphs

20. Instrumentation Counter example
import io.prometheus.client.Counter;
// Use a single Counter for throughput metrics
// for all stages of the pipeline;
// stages are distinguished by labels
static final Counter pipelineCounter = Counter.build()
    .name(appName + "_requests_total")
    .help("Count of executions of pipeline stages")
    .labelNames("stage")
    .register();
. . .
// After successful execution of each stage:
// increment producer/consumer/detector rate count
pipelineCounter.labels("producer").inc();
. . .
pipelineCounter.labels("consumer").inc();
. . .
pipelineCounter.labels("detector").inc();

21. Instrumentation Gauge example
import io.prometheus.client.Gauge;
// A Gauge can go up and down
// Used to measure the current value of some variable.
// pipelineGauge will measure duration of each labelled stage
static final Gauge pipelineGauge = Gauge.build()
    .name(appName + "_duration_seconds")
    .help("Gauge of stage durations in seconds")
    .labelNames("stage")
    .register();
. . .
// in detector pipeline, compute duration and set
// (duration is assumed to already be in seconds, to match the metric name)
long duration = nowTime - startTime;
// set() records a value; setToTime() takes a Runnable to time, not a number
pipelineGauge.labels("detector").set(duration);

22. HTTP Server For metric pulls
import io.prometheus.client.exporter.HTTPServer;
import java.io.IOException;
// Metrics are pulled by Prometheus
// Create an HTTP server as the endpoint to pull from
// If there are multiple processes running on the same server
// then you need different port numbers
// Add IPs and port numbers to the Prometheus configuration file.
HTTPServer server = null;
try {
    server = new HTTPServer(1234);
} catch (IOException e) {
    e.printStackTrace();
}
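To make the pull model more concrete, here is a minimal sketch of roughly what such a scrape endpoint does, written with only the JDK's built-in `com.sun.net.httpserver` package rather than the Prometheus client library. It is not how the client's `HTTPServer` is implemented; the class name, the `demo_requests_total` metric and the single hard-coded "detector" label are assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a client library's metrics endpoint.
final class MetricsEndpointSketch {

    // Application code would increment this after each detector run.
    static final AtomicLong detectorChecks = new AtomicLong();

    // Builds the response body in the Prometheus text format.
    static String exposition() {
        return "# TYPE demo_requests_total counter\n"
            + "demo_requests_total{stage=\"detector\"} " + detectorChecks.get() + "\n";
    }

    // Starts an HTTP server that answers GET /metrics with the current values.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = exposition().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

Calling `MetricsEndpointSketch.start(1234)` would expose the counter on port 1234, the same port used on the slide; the application only keeps the numbers up to date, and the Prometheus server decides when to fetch them.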

23. Using Prometheus Configure Run ■ Configure Prometheus with the IP addresses and ports to poll. ● Edit the default prometheus.yml file ● Includes polling frequency, timeouts etc ● OK for testing but doesn’t scale for production systems ■ Get, install and run Prometheus. ● Initially just running locally.

24. Graphs Counter ■ Browse to Prometheus Server URL ■ No default dashboards ■ View and select metrics ■ Execute them to graph ■ Counter value increases over time

25. Rate Graph using irate function ■ Enter expressions, e.g. irate function ■ Expression language has multiple data types and many functions
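The arithmetic behind an irate graph can be sketched in a few lines. PromQL's irate looks at the last two samples in the selected range and divides the increase in the counter by the time between them; if the counter went backwards (for example after a process restart), the last value alone is taken as the increase. The `IrateSketch` class and its `Sample` type below are hypothetical names for illustration, not Prometheus code.

```java
// Hypothetical illustration of the per-second instant rate calculation.
final class IrateSketch {

    // One scraped sample of a counter: when it was taken and its value.
    static final class Sample {
        final double timestampSeconds;
        final double value;

        Sample(double timestampSeconds, double value) {
            this.timestampSeconds = timestampSeconds;
            this.value = value;
        }
    }

    // Per-second rate between the two most recent samples.
    // A drop in value is treated as a counter reset, so the increase is the last value itself.
    static double irate(Sample previous, Sample last) {
        double increase = last.value >= previous.value
            ? last.value - previous.value
            : last.value;
        return increase / (last.timestampSeconds - previous.timestampSeconds);
    }

    public static void main(String[] args) {
        // 150 more requests over 15 seconds
        System.out.println(irate(new Sample(0, 100), new Sample(15, 250)));
    }
}
```

This is why a Counter needs a function such as irate to become a throughput graph, while the raw counter graph on the previous slide only ever climbs.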

26. Gauge graph Pipeline stage durations in seconds ■ Doesn’t need a function as it’s a Gauge

27. Grafana Prometheus GUI is OK for debugging; Grafana is better for production ■ Install and run Grafana ■ Browse to Grafana URL, create a Prometheus data source, add a Prometheus Graph. ■ Can enter multiple Prometheus expressions and graph them on the same graph. ■ Example shows rate and duration metrics

28. Simple Test configuration Prometheus Server outside Kubernetes cluster, pulls metrics from Pods Dynamic/many Pods are a challenge ■ IP addresses to pull from are dynamic ● Have to update Prometheus pull configurations ● In production too many Pods to do this manually

29. Prometheus on Kubernetes A few extra steps make life easier ■ Create and register Prometheus Metric types ● (e.g. Counter) for each timeseries type (e.g. throughputs) including name and units ■ Instrument the code ● e.g. increment the count, using name of the component (e.g. producer, consumer, etc) as label ■ Create HTTP server in code ■ Run Prometheus Server on Kubernetes cluster, using Kubernetes Operator ■ Configure so it dynamically monitors selected Pods ■ Enable ingress and external access to Prometheus server ■ Browse to Prometheus server ■ View and select metrics, check that there’s data ■ Construct expression ■ Graph the expression ■ Run and configure Grafana for better graphs

35. Prometheus In production on Kubernetes Use Prometheus Operator 1 Install Prometheus Operator and Run 2 Configure Service Objects to monitor Pods 3 Configure ServiceMonitors to discover Service Objects 4 Configure Prometheus objects to specify which ServiceMonitors should be included 5 Allow ingress to Prometheus by using a Kubernetes NodePort Service 6 Create Role-based access control rules for both Prometheus and Prometheus Operator 7 Configure AWS EC2 firewalls
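
Steps 2 to 4 can be sketched with the manifests below. The resource kinds and fields are those defined by the Prometheus Operator, but every name, label and port name here is a hypothetical placeholder rather than the configuration used in the talk.

```yaml
# Step 2: a Service that selects the application Pods (names and labels are hypothetical)
apiVersion: v1
kind: Service
metadata:
  name: anomalia-metrics
  labels:
    app: anomalia-machina
spec:
  selector:
    app: anomalia-machina
  ports:
    - name: metrics
      port: 1234
---
# Step 3: a ServiceMonitor that discovers Services by label
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: anomalia-monitor
  labels:
    team: anomalia
spec:
  selector:
    matchLabels:
      app: anomalia-machina
  endpoints:
    - port: metrics
      interval: 15s
---
# Step 4: a Prometheus object that includes the matching ServiceMonitors
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      team: anomalia
```

The label selectors are what replace the hand-maintained target list: Pods that come and go behind the Service are picked up without editing the Prometheus configuration.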

36. Weave Scope Prometheus now magically monitors Pods as they come and go Showing Prometheus monitoring Pods Prometheus Operator Pods

37. OpenTracing Use Case: Topology Maps ■ Prometheus collects and displays metric aggregations ● No dependency or order information, no single events ■ Distributed tracing shows “call tree” (causality, timing) for each event ■ And Topology Maps

38. OpenTracing Standard API for distributed tracing ■ Specification, not implementation ■ Need ● Application instrumentation ● OpenTracing tracer Traced Applications API Tracer implementations Open Source, Datadog

39. Spans Smallest logical unit of work in a distributed system ■ Spans are the smallest logical unit of work ● Have name, start time, duration, associated component ■ Simplest trace is a single span

40. Trace Multi-span trace ■ Spans can be related ● ChildOf = synchronous dependency (wait) ● FollowsFrom = asynchronous relationships (no wait) ■ A Trace is a DAG of Spans. ● 1 or more Spans.
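
The span relationships can be pictured with a small data model. This is a sketch only: the types below are not the OpenTracing API, and it simplifies the DAG to at most one reference per span. The span names follow the example pipeline, and the reference types chosen are illustrative.

```java
// Illustrative data model only; these are not the OpenTracing API types.
final class TraceSketch {
    enum RefType { CHILD_OF, FOLLOWS_FROM }

    static final class Span {
        final String name;
        final Span parent;     // null for the root span
        final RefType refType; // null for the root span

        Span(String name, Span parent, RefType refType) {
            this.name = name;
            this.parent = parent;
            this.refType = refType;
        }
    }

    // Number of references followed from this span back to the root of the trace.
    static int depth(Span span) {
        int depth = 0;
        for (Span s = span; s.parent != null; s = s.parent) {
            depth++;
        }
        return depth;
    }

    // Render the chain from the root down to this span.
    static String path(Span span) {
        return span.parent == null ? span.name : path(span.parent) + " -> " + span.name;
    }

    public static void main(String[] args) {
        Span producer = new Span("producer", null, null);
        Span consumer = new Span("consumer", producer, RefType.FOLLOWS_FROM); // async, no wait
        Span detector = new Span("detector", consumer, RefType.CHILD_OF);     // sync, waits
        System.out.println(path(detector) + " (depth " + depth(detector) + ")");
    }
}
```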

41. Instrumentation Steps ■ Language specific client instrumentation ● Used to create spans in the application within the same process ■ Contributed libraries for frameworks ● E.g. Elasticsearch, Cassandra, Kafka etc ● Used to create spans across process boundaries (Kafka producers -> consumers) ■ Choose and Instantiate a Tracer implementation
// Example instrumentation for consumer -> detector spans
static Tracer tracer = initTracer("AnomaliaMachina");
. . .
Span span1 = tracer.buildSpan("consumer").start();
. . .
span1.finish();
Span span2 = tracer
    .buildSpan("detector")
    .addReference(References.CHILD_OF, span1.context())
    .start();
. . .
span2.finish();

42. Tracing across process boundaries Inject/extract metadata ■ To trace across process boundaries (processes, servers, clouds) OpenTracing injects metadata into the cross-process call flows to build traces across heterogeneous systems. ■ Inject and extract a spanContext, how depends on protocol.
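
The inject/extract idea can be sketched without any tracing library: the sender copies the span's identifiers into the message headers (the carrier) and the receiver rebuilds them. The `Context` class and header names below are hypothetical and are not the OpenTracing `SpanContext` or any tracer's real wire format.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hand-rolled span context, not the OpenTracing SpanContext interface.
final class PropagationSketch {
    static final class Context {
        final String traceId;
        final String spanId;

        Context(String traceId, String spanId) {
            this.traceId = traceId;
            this.spanId = spanId;
        }
    }

    // Hypothetical header names; each real tracer defines its own wire format.
    static final String TRACE_HEADER = "x-sketch-trace-id";
    static final String SPAN_HEADER = "x-sketch-span-id";

    // Sender side: copy the context into the message headers (the carrier).
    static void inject(Context ctx, Map<String, String> carrier) {
        carrier.put(TRACE_HEADER, ctx.traceId);
        carrier.put(SPAN_HEADER, ctx.spanId);
    }

    // Receiver side: rebuild the context, or return null when the message carried none.
    static Context extract(Map<String, String> carrier) {
        String traceId = carrier.get(TRACE_HEADER);
        String spanId = carrier.get(SPAN_HEADER);
        return (traceId == null || spanId == null) ? null : new Context(traceId, spanId);
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        inject(new Context("trace-1", "span-7"), headers);
        Context received = extract(headers);
        System.out.println(received.traceId + " " + received.spanId); // prints trace-1 span-7
    }
}
```

The receiver then starts its own span with a reference to the extracted context, which is what joins the two processes into one trace.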

43. How to do this for Kafka? Producer Automatically inserts a span context into Kafka headers using Interceptors
// Register tracer with GlobalTracer:
GlobalTracer.register(tracer);
// Add TracingProducerInterceptor to sender properties:
senderProps.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, TracingProducerInterceptor.class.getName());
// Instantiate KafkaProducer
KafkaProducer<Integer, String> producer = new KafkaProducer<>(senderProps);
// Send
producer.send(...);
// 3rd party library
// https://github.com/opentracing-contrib/java-kafka-client

44. Consumer side Extract spanContext
// Once you have a consumer record, extract
// the span context and
// create a new FOLLOWS_FROM span
SpanContext spanContext = tracer.extract(Format.Builtin.TEXT_MAP, new MyHeadersMapExtractAdapter(record.headers(), false));
newSpan = tracer.buildSpan("consumer").addReference(References.FOLLOWS_FROM, spanContext).start();

45. Jaeger Tracer Open Source Tracer Uber/CNCF

46. Jaeger Tracer How to use? • Tracers can have different architectures and protocols • Jaeger should scale well in production as • It can use Cassandra and Spark • Uses adaptive sampling • Need to instantiate a Jaeger tracer in your code

47. Jaeger GUI ■ Install and start Jaeger ■ Browse to Jaeger URL ■ Find traces by name, operation, and filter. ■ Select to drill down for more detail.

48. Jaeger Single trace ■ Insight into total trace time, relationships and times of spans ■ This is a trace of a single event through the anomaly detector pipeline ● Producer (async) ● Consumer (async) ● Detector (async, with sync children) ᐨ CassandraWrite ᐨ CassandraRead ᐨ AnomalyDetector

49. Jaeger Dependencies view ■ Correctly shows anomaly detector topology ■ Only metric is number of spans observed ■ Can’t select subset of traces, or filter ■ Force directed view, select node and highlights dependencies

50. Kafka Challenge Multiple Kafka topic topologies ■ More complex example (application simulates complex event flows across topics) ■ Show dependencies between source, intermediate and sink Kafka topics.

51. Conclusions Observations & Alternatives ■ Topology view is basic (cf. some commercial APMs) ■ Still need Prometheus for metrics ● in theory OpenTracing has everything needed for metrics. ■ Other OpenTracing tracers may be worth trying, e.g. Datadog ■ OpenCensus is a competing approach. ■ Manual instrumentation is tedious and potentially error-prone; many commercial APMs use byte-code injection to avoid this problem ■ The future? Kubernetes-based service mesh frameworks could construct traces for microservices without instrumentation ● as they have visibility into how Pods interact with each other and external systems ● and Pods only contain a single microservice, not a monolithic application

52. Results Scaled out to 48 Cassandra nodes Approx 600 cores for whole system 109 Pods for Prometheus to monitor Producer rate metric (9 Pods) Peak Producer rate = 2.3 Million events/s Prometheus was critical for collecting, computing and displaying the metrics, as this needed to be done from multiple Pods

53. Business metric Detector rate 100 Pods 220,000 anomaly checks/s computed from 100 stacked metrics Anomaly Checks/s = 220,000 Prometheus was critical for tuning the system to achieve near perfect linear scalability - used metrics for consumer and detector rate to tune thread pool sizes to optimize anomaly checks/s, for increasingly bigger systems. OpenTracing and Jaeger were useful during test deployment - to check/debug if components were working together as expected - but weren’t used in the final production deployment - as more set-up was required using the Jaeger Kubernetes Operator: https://github.com/jaegertracing/jaeger-operator

54. Cassandra & OpenTracing Visibility into Cassandra clusters? ■ OpenTracing the example application was ● Across Kafka producers/consumers ● And within the Kubernetes deployed application ■ What options are there for improved visibility of tracing of Cassandra clusters? ■ Instaclustr managed service ● OpenTracing support for the C* driver ● May not require any support from C* clusters ● https://github.com/opentracing-contrib/java-cassandra-driver ■ Self-managed clusters ● end-to-end OpenTracing through a C* cluster ● May require support from C* cluster ● https://github.com/thelastpickle/cassandra-zipkin-tracing

55. Cassandra & Prometheus Visibility into Cassandra clusters? Option 1 Instaclustr managed service ■ Prometheus monitoring of the example application ● limited to application metrics collected from Kubernetes Pods ■ What options are there for integration with Cassandra Cluster metrics? ■ Instaclustr managed Cassandra ● 3rd party Prometheus exporter, native integration planned ● https://www.instaclustr.com/support/api-integrations/integrations/using-instaclustr-monitoring-api-prometheus/

56. Cassandra & Prometheus Visibility into Cassandra clusters? Option 2 Self-managed clusters ■ Instaclustr OpenSource contributions (under development) ● cassandra-exporter exports Cassandra metrics to Prometheus ᐨ https://github.com/instaclustr/cassandra-exporter ● Kubernetes Operator for Apache Cassandra ᐨ https://github.com/instaclustr/cassandra-operator/ ● The Cassandra operator will create the appropriate objects to inform the Prometheus operator about the metrics endpoints available from Cassandra ■ Instaclustr customers can then use ● Prometheus to monitor their own applications ● Prometheus federation to scrape the Cassandra Prometheus server to integrate application and cluster metrics ᐨ https://prometheus.io/docs/prometheus/latest/federation/

57. Prometheus Federation Federation Prometheus servers can pull metrics from other Prometheus servers
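
A federation scrape job on the application's Prometheus server might look like the sketch below, which follows the pattern in the Prometheus federation documentation. The `job="cassandra"` selector and the target address are hypothetical placeholders.

```yaml
# Added to the application Prometheus server's prometheus.yml (sketch)
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true             # keep the labels set by the source server
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="cassandra"}'      # hypothetical selector for the series to pull
    static_configs:
      - targets:
          - 'cassandra-prometheus:9090'   # hypothetical address of the Cassandra Prometheus server
```

Only series matching a `match[]` selector are pulled, so application and cluster metrics can be combined without copying everything.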

58. More information? Anomalia Machina Blogs: Massively Scalable Anomaly Detection with Apache Kafka and Cassandra ■ Anomalia Machina 5 – Application Monitoring with Prometheus ● https://www.instaclustr.com/anomalia-machina-5-1-application-monitoring-prometheus-massively-scalable-anomaly-detection-apache-kafka-cassandra/ ■ Anomalia Machina 6 – Application Tracing with OpenTracing ● https://www.instaclustr.com/anomalia-machina-6-application-tracing-opentracing-massively-scalable-anomaly-detection-apache-kafka-cassandra/ ■ Anomalia Machina 8 – Production Application Deployment with Kubernetes ● https://www.instaclustr.com/anomalia-machina-8-production-application-deployment-kubernetes-massively-scalable-anomaly-detection-apache-kafka-cassandra/ ● Enabling Ingress into Kubernetes: Connecting Prometheus to the Application running in Kubernetes ■ Anomalia Machina 10 – Final Results (soon) ● Using Prometheus Operator ■ All Blogs

59. The End Instaclustr Managed Platform Multiple Open Source Technologies and Providers www.instaclustr.com/platform/