dbis INSTITUT FÜR INFORMATIK
HUMBOLDT-UNIVERSITÄT ZU BERLIN
A Tale of Squirrels and Storms
Flink Forward 2015
Matthias J. Sax
mjsax@{informatik.hu-berlin.de|apache.org}
@MatthiasJSax
Humboldt-Universität zu Berlin
Department of Computer Science
October 13th, 2015
–MatthiasJ.Sax–SquirrelsandStorms
1/22
About Me
Ph.D. student in CS, DBIS Group, HU Berlin
involved in the Stratosphere research project
working on data stream processing and optimization
Aeolus: built on top of Apache Storm
(https://github.com/mjsax/aeolus)
Committer at Apache Flink
Flink and Storm
vs.
Flink vs. Storm
Similarities of Flink and Storm
true stream processing engines (no micro-batching)
low latencies (roughly 100 ms or less)
executing data flow programs
parallel and distributed
fault-tolerant
cloud or cluster environment
Trident:
similar Java API
exactly-once processing
Flink vs. Storm
Advantages of Storm:
super low latency (< 10ms)
very robust:
stateless JVM for easy restart on failure
Zookeeper manages cluster state
isolation of topology
dynamic scaling (to some extent)
multi-language protocol (for experts only)
distributed RPC
Flink vs. Storm
Advantages of Flink [1]:
richer API
Java and Scala
type-safe programs
system is aware of multiple input streams
ordered stream processing
system and user timestamps
count/time and customized windows
stateful processing
lightweight fault-tolerance
Chandy-Lamport distributed snapshots
[1] http://data-artisans.com/real-time-stream-processing-the-next-step-for-apache-flink/
Flink vs. Storm
Advantages of Flink (cont.):
provides exactly-once sinks
native flow control (back pressure) [2]
higher throughput (more than 100x) [3]
no lambda or kappa architecture necessary
native support for iterations (cyclic data flows)
managed memory
[2] http://data-artisans.com/how-flink-handles-backpressure/
[3] http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
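Flow control through back pressure can be pictured as a bounded buffer between a producer and a slower consumer: once the buffer is full, the producer can only continue after the consumer has freed a slot. The sketch below is a generic, single-threaded illustration built on java.util.concurrent. It is not Flink's network stack, and the class name BackPressureSketch and the method produceAll are made up for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackPressureSketch {

    /**
     * Pushes `records` items through a bounded buffer and returns how often
     * the producer found the buffer full and had to wait for the consumer.
     */
    public static int produceAll(int records, int bufferCapacity) {
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(bufferCapacity);
        int stalls = 0;
        for (int record = 0; record < records; record++) {
            // offer() returns false instead of blocking when the buffer is full
            while (!buffer.offer(record)) {
                stalls++;
                // the (simulated) slow consumer frees exactly one slot;
                // only then can the producer make progress again
                buffer.poll();
            }
        }
        return stalls;
    }

    public static void main(String[] args) {
        System.out.println("stalls with capacity 2: " + produceAll(5, 2));
        System.out.println("stalls with capacity 5: " + produceAll(5, 5));
    }
}
```

In this toy model the producer is never faster than the consumer once the buffer has filled up, which is the effect back pressure is meant to achieve without dropping records.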
System Architecture: Storm
[Figure: Storm cluster with a Client, Nimbus, three Zookeeper nodes, five Supervisors, and six Workers]
System Architecture: Flink
[Figure: Flink cluster with clients (Web Client, CLI, Shell), a JobManager, five TaskManagers, and a second JobManager]
Topology Deployment: Storm
default: round-robin scheduling
high overhead due to intra-JVM and/or network communication
localOrShuffle connection pattern poorly exploited
isolation of topologies
custom scheduler possible (for experts only)
[Figure: placement of operator instances Src, T1, T2, F1, F2, C1, C2, Sk]
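The effect of round-robin placement on locality can be illustrated with a toy model. The sketch below counts how many forward edges (instance i of one operator to instance i of the next) stay on the same worker, once for round-robin placement and once for a placement that keeps each pipeline slice on one worker. The worker count, the operator layout, and all names (PlacementSketch, localEdges) are assumptions made for this example, not Storm's or Flink's actual scheduler.

```java
public class PlacementSketch {

    /** Round-robin: tasks are numbered operator by operator and dealt out to workers in turn. */
    static int roundRobinWorker(int operator, int instance, int parallelism, int workers) {
        return (operator * parallelism + instance) % workers;
    }

    /** Pipeline placement: instance i of every operator lands on the same worker. */
    static int pipelineWorker(int instance, int workers) {
        return instance % workers;
    }

    /** Counts edges (operator s, instance i) -> (operator s+1, instance i) that stay on one worker. */
    public static int localEdges(int operators, int parallelism, int workers, boolean roundRobin) {
        int local = 0;
        for (int s = 0; s + 1 < operators; s++) {
            for (int i = 0; i < parallelism; i++) {
                int from = roundRobin
                        ? roundRobinWorker(s, i, parallelism, workers)
                        : pipelineWorker(i, workers);
                int to = roundRobin
                        ? roundRobinWorker(s + 1, i, parallelism, workers)
                        : pipelineWorker(i, workers);
                if (from == to) {
                    local++;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        // assumed layout: three operators (e.g. T, F, C), two instances each, four workers
        int operators = 3, parallelism = 2, workers = 4;
        System.out.println("round-robin local edges: " + localEdges(operators, parallelism, workers, true));
        System.out.println("pipeline local edges:    " + localEdges(operators, parallelism, workers, false));
    }
}
```

Under these assumptions round-robin placement leaves no forward edge local, while the pipeline placement keeps all four of them inside a single worker.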
Topology Deployment: Flink
deploys whole pipeline to each TaskManager
local-forward is default
operator chaining
[Figure: placement of operator instances Src, T1, T2, F1, F2, C1, C2, Sk]
Storm Compatibility
Allows you to [4]:
execute Storm topologies in Flink
embed Spouts/Bolts in Flink streaming programs
[Figure: the Flink stack]
Deployment: Local (JVM, Embedded); Cluster (Standalone, YARN); Cloud (GCE, EC2)
Runtime: Distributed Streaming Dataflow
APIs: DataSet API (Batch Processing); Streaming API (Stream Processing)
Libraries and compatibility layers: FlinkML (Machine Learning); Gelly (Graph API & Library); Table API (Batch); Hadoop M/R Compatibility; Table API (Streaming); Storm Compatibility
[4] https://ci.apache.org/projects/flink/flink-docs-master/apis/storm_compatibility.html
Storm Compatibility: API
Execute whole topologies:
FlinkTopologyBuilder
FlinkSubmitter
FlinkClient
FlinkLocalCluster
Embedded mode:
SpoutWrapper
BoltWrapper
Additionally:
FiniteSpout interface
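The idea behind a finite spout can be sketched without Storm or Flink on the classpath: a spout that can report the end of its input lets the surrounding run loop terminate instead of calling nextTuple() forever. All types below are simplified, hypothetical stand-ins, and the method name reachedEnd() is an assumption of this sketch rather than a quote from the compatibility layer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

public class FiniteSpoutSketch {

    /** Hypothetical stand-in for a Storm Spout: each nextTuple() call may emit one record. */
    interface Spout {
        void open(Consumer<String> collector);
        void nextTuple();
    }

    /** Hypothetical finite variant: additionally reports when the input is exhausted. */
    interface FiniteSpout extends Spout {
        boolean reachedEnd();
    }

    /** A spout over a fixed list of lines. */
    static final class ListSpout implements FiniteSpout {
        private final Iterator<String> lines;
        private Consumer<String> collector;

        ListSpout(List<String> lines) {
            this.lines = lines.iterator();
        }

        public void open(Consumer<String> collector) {
            this.collector = collector;
        }

        public void nextTuple() {
            if (lines.hasNext()) {
                collector.accept(lines.next());
            }
        }

        public boolean reachedEnd() {
            return !lines.hasNext();
        }
    }

    /** The wrapper's run loop: keep calling nextTuple() until the spout says it is done. */
    static List<String> runToCompletion(FiniteSpout spout) {
        List<String> out = new ArrayList<>();
        spout.open(out::add);
        while (!spout.reachedEnd()) {
            spout.nextTuple();
        }
        return out;
    }

    /** Convenience entry point: reads all lines through a ListSpout. */
    public static List<String> readAll(List<String> lines) {
        return runToCompletion(new ListSpout(lines));
    }

    public static void main(String[] args) {
        System.out.println(readAll(Arrays.asList("To be, or not to be,", "that is the question")));
    }
}
```

With a regular (infinite) spout the loop condition would never become false, which is why a finite source needs the extra signal.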
Storm Compatibility: Internals
Wrappers for Operators and Collectors
redirecting method calls
run() ⇒ nextTuple()
processElement() ⇒ execute()
emit() ⇒ collect()
translating data types
TupleX, POJO ⇔ Tuple/Values
primitive types for single attribute input/output
[Figure: a Bolt inside a BoltWrapper; processElement() calls execute(), and emit() is forwarded to collect() on the Flink Collector]
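The redirection of method calls can be sketched as a plain adapter. The interfaces below are minimal, hypothetical stand-ins for the Storm and Flink types (the real ones carry more methods and context objects), so this only shows the shape of the idea: processElement() is forwarded to execute(), emit() is forwarded to collect(), and a single-attribute tuple is translated to and from a raw value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WrapperSketch {

    /** Hypothetical stand-in for Storm's output collector: a Bolt calls emit(). */
    interface StormCollector {
        void emit(List<Object> values);
    }

    /** Hypothetical stand-in for a Storm Bolt. */
    interface Bolt {
        void prepare(StormCollector collector);
        void execute(List<Object> tuple);
    }

    /** Hypothetical stand-in for a Flink collector: operators call collect(). */
    interface FlinkCollector<T> {
        void collect(T record);
    }

    /** Adapter: looks like a Flink operator from outside, drives a Bolt inside. */
    static final class BoltWrapper<IN, OUT> {
        private final Bolt bolt;

        BoltWrapper(Bolt bolt, FlinkCollector<OUT> out) {
            this.bolt = bolt;
            // emit() => collect(): a single-attribute output tuple is unwrapped to a raw value
            bolt.prepare(values -> {
                @SuppressWarnings("unchecked")
                OUT record = (OUT) values.get(0);
                out.collect(record);
            });
        }

        /** processElement() => execute(): the raw input is wrapped as a one-field tuple. */
        void processElement(IN element) {
            bolt.execute(Arrays.<Object>asList(element));
        }
    }

    /** Example Bolt: splits a line into lower-case words and emits one tuple per word. */
    static final class TokenizerBolt implements Bolt {
        private StormCollector collector;

        public void prepare(StormCollector collector) {
            this.collector = collector;
        }

        public void execute(List<Object> tuple) {
            String line = (String) tuple.get(0);
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    collector.emit(Arrays.<Object>asList(word));
                }
            }
        }
    }

    /** Runs the wrapped tokenizer over the given lines and returns the collected words. */
    public static List<String> tokenize(String... lines) {
        List<String> out = new ArrayList<>();
        BoltWrapper<String, String> wrapper = new BoltWrapper<>(new TokenizerBolt(), out::add);
        for (String line : lines) {
            wrapper.processElement(line);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("To be, or not to be"));
    }
}
```

The Bolt itself never learns that it runs outside Storm: it only sees a collector with an emit() method, which is what makes re-using unmodified Bolt code possible.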
WordCount on Storm
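// Uses Storm's TopologyBuilder, Fields, Config, and StormSubmitter
// (backtype.storm.* packages in Storm 0.9.x).
public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    // source: emits the lines of the input file
    builder.setSpout("source",
        new FileSpout("/tmp/hamlet.txt"));
    // tokenizer: splits lines into words, input distributed randomly
    builder.setBolt("tokenizer", new BoltTokenizer())
        .shuffleGrouping("source");
    // counter: the same word always goes to the same counter instance
    builder.setBolt("counter", new BoltCounter())
        .fieldsGrouping("tokenizer",
            new Fields("word"));
    // sink: writes the counts to a file
    builder.setBolt("sink",
        new BoltFileSink("/tmp/count.txt"))
        .shuffleGrouping("counter");
    Config conf = new Config();
    StormSubmitter.submitTopology("WordCount", conf,
        builder.createTopology());
}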
WordCount on Flink
// Same topology as on Storm; only the builder and the submitter
// are replaced by their Flink counterparts.
public static void main(String[] args) throws Exception {
    FlinkTopologyBuilder builder = new FlinkTopologyBuilder();
    builder.setSpout("source",
        new FileSpout("/tmp/hamlet.txt"));
    builder.setBolt("tokenizer", new BoltTokenizer())
        .shuffleGrouping("source");
    builder.setBolt("counter", new BoltCounter())
        .fieldsGrouping("tokenizer",
            new Fields("word"));
    builder.setBolt("sink",
        new BoltFileSink("/tmp/count.txt"))
        .shuffleGrouping("counter");
    Config conf = new Config();
    FlinkSubmitter.submitTopology("WordCount", conf,
        builder.createTopology());
}
Storm on Flink
Running a Storm topology on Flink:
changing two lines of code is sufficient
(TopologyBuilder becomes FlinkTopologyBuilder, StormSubmitter becomes FlinkSubmitter)
WordCount: Embedded Spout
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env
        = StreamExecutionEnvironment.getExecutionEnvironment();
    // wrap the Storm spout as a Flink source; the type information
    // tells Flink that the spout emits Tuple1<String>
    DataStream<Tuple1<String>> source
        = env.addSource(
            new SpoutWrapper<Tuple1<String>>(
                new FileSpout("/tmp/hamlet.txt")),
            TypeExtractor.getForObject(
                new Tuple1<String>("")));
    // do further processing on source
    source.flatMap(new Tokenizer())
        // out -> Tuple2<String, Integer>
        .keyBy(0).sum(1).writeAsText("/tmp/count.txt");
    env.execute("WordCount");
}
WordCount: Embedded Bolt
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env
        = StreamExecutionEnvironment.getExecutionEnvironment();
    DataStream<String> text
        = env.readTextFile("/tmp/hamlet.txt");
    // wrap the Storm bolt as a Flink operator; the type information
    // describes the bolt's output type Tuple2<String, Integer>
    DataStream<Tuple2<String, Integer>> tokens
        = text.transform(
            "tokenizer",
            TypeExtractor.getForObject(
                new Tuple2<String, Integer>("", 0)),
            new BoltWrapper<String,
                Tuple2<String, Integer>>(
                    new BoltTokenizer()));
    // do further processing on tokens
    tokens.keyBy(0).sum(1).writeAsText("/tmp/count.txt");
    env.execute("WordCount");
}
Embedded Compatibility Mode
Re-use code within a Flink streaming program:
Spouts as Flink sources
Bolts as Flink operators
Pros:
mix-and-match of Storm and Flink operators
configure Spouts/Bolts (Map/Config)
splitting Spout/Bolt output streams
type-safe embedding
also raw types, i.e., String instead of Tuple1<String>
convert infinite Spouts to finite sources
FiniteSpout interface
Cons: currently, quite a bit of boilerplate code is necessary :/
Outlook: Storm Compatibility
Current status:
available in master branch
based on Storm 0.9.4
will be part of Flink 0.10.0
Work in progress:
Hooks
Metrics
Next steps:
enable fault-tolerance
introduce FlinkTridentTopology
improve embedded mode (StormEnvironment)
Thanks!
