SlideShare a Scribd company logo
Aljoscha Krettek – PMC of Apache Flink and Apache Beam, Co-Founder at data Artisans
Apache Flink® and what it is used for
© 2018 data Artisans2
About Data Artisans
Original Creators of
Apache Flink®
RealTime Stream Processing
Enterprise Ready
2008 2009 2011
Studying and working at IBM
Development on Stratosphere begins
I joinTU Berlin research group with other co-founders
2014
data Artisans is founded
© 2018 data Artisans3
What is Apache Flink?
Queries
Applications
Devices
etc.
Database
Stream
File / Object
Storage
Historic
Data
Streams
Application
Stateful computations over streams
real-time and historic
fast, scalable, fault tolerant, in-memory,
event time, large state, exactly-once
© 2018 data Artisans4
Overview of Flink Use Cases
Event-driven
applications
Stream
Processing
Batch
processing
© 2018 data Artisans5
Batch Processing
a.k.a. collect now, figure
out later
© 2018 data Artisans6
Batch Processing Technologies Supported by Flink
DataSet
API
© 2018 data Artisans7
Observation 1
The origin of data are streams
© 2018 data Artisans8
Stream Processing
we can build applications directly
on data streams
© 2018 data Artisans9
Stream Processing Technologies Supported by Flink
DataStream
API
© 2018 data Artisans10
Observation 2
Stream Processing changes the database-centric
architecture
© 2018 data Artisans11
Changing the Two-Tier Architecture
reads/writes across
tier boundary
asynchronous writes
of large blobs
all modifications
are local
Classic tiered architecture Streaming architectureClassic tiered architecture
database
layer
compute
layer
application working state
+ historic state
compute
+
stream storage
and
snapshot storage
(backup)
application state
Streaming architecture
© 2018 data Artisans12
Keystone Routing Pipeline at Netflix
(as presented at Flink Forward San Francisco, 2018)
@
 Athena X
 SQL to define metrics
 Thresholds and actions to trigger
 Blends analytics and
actions
Streams from
Hadoop, Kafka, etc
SQL, thresholds,
actions
Analytics
Alerts
Derived streams
© 2018 data Artisans13
Observation 3
Stream Processing is about building applications,
not platforms
© 2018 data Artisans14
Internal streaming data platforms
built with Apache Flink
© 2018 data Artisans15
KubernetesResource Manager
Logging
Metrics
CI / CD
Application Platforms
deploying new
applications
scaling
applications
Kubernetes
© 2018 data Artisans16
Kubernetes
Database
Kubernetes
• Example: Scaling down a replicated database
• 3 replicas, 4 node scale down
need to move or
reorganize data
before container
shutdown
Kubernetes & stateful applications What about stateful containers?
© 2018 data Artisans17
• consistent stateful upgrades
‒application evolution and bug fixes
• migration of application state
‒cluster migration, A/B testing
• re-processing and reinstatement
‒fix corrupt results, bootstrap new applications
• state evolution (schema evolution)
A B
Stateful Questions
Container-based
Resource Orchestration
Stateful Stream
Processing & Snapshots
Kubernetes Apache Flink
Container-based
platform for stateful
data-driven applications
dA Platform
Code, Resource, Config, and
Snapshot Management
Application
Manager
© 2018 data Artisans18
Architecture
Apache Flink
Stateful stream processing
Kubernetes
Container platform
Logging
Metrics
dA Application
Manager
Application
lifecycle
management
Storage
In Action
App
Manager
Kubernetes
Resource
Allocation
Job Control
Snapshot
Management
CI/CDWeb
interface
© 2018 data Artisans19
Powered by Apache Flink
© 2018 data Artisans20
Free Trial Download of data Artisans Platform
data-artisans.com/download
THANK YOU! WE ARE HIRING
data-artisans.com/careers
@aljoscha
@dataArtisans
@ApacheFlink
© 2018 data Artisans22
Backup Slides
Ad

More Related Content

What's hot (20)

Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
Till Rohrmann
 
Apache flink
Apache flinkApache flink
Apache flink
Ahmed Nader
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
datamantra
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
DataWorks Summit/Hadoop Summit
 
Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem Apache NiFi in the Hadoop Ecosystem
Apache NiFi in the Hadoop Ecosystem
DataWorks Summit/Hadoop Summit
 
Unified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache FlinkUnified Stream and Batch Processing with Apache Flink
Unified Stream and Batch Processing with Apache Flink
DataWorks Summit/Hadoop Summit
 
Flink vs. Spark
Flink vs. SparkFlink vs. Spark
Flink vs. Spark
Slim Baltagi
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistent
confluent
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kai Wähner
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward
 
kafka
kafkakafka
kafka
Amikam Snir
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
Kostas Tzoumas
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developers
confluent
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
Flink Forward
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
Jeff Holoman
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
Xiang Fu
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Flink Forward
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
confluent
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
Flink Forward
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
GauravBiswas9
 
Introduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processingIntroduction to Apache Flink - Fast and reliable big data processing
Introduction to Apache Flink - Fast and reliable big data processing
Till Rohrmann
 
Introduction to Apache Flink
Introduction to Apache FlinkIntroduction to Apache Flink
Introduction to Apache Flink
datamantra
 
Kafka Streams State Stores Being Persistent
Kafka Streams State Stores Being PersistentKafka Streams State Stores Being Persistent
Kafka Streams State Stores Being Persistent
confluent
 
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache KafkaKafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kafka Streams vs. KSQL for Stream Processing on top of Apache Kafka
Kai Wähner
 
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward San Francisco 2019: Moving from Lambda and Kappa Architectures ...
Flink Forward
 
Apache Flink internals
Apache Flink internalsApache Flink internals
Apache Flink internals
Kostas Tzoumas
 
Apache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and DevelopersApache Kafka Fundamentals for Architects, Admins and Developers
Apache Kafka Fundamentals for Architects, Admins and Developers
confluent
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
Flink Forward
 
Introduction to Apache Kafka
Introduction to Apache KafkaIntroduction to Apache Kafka
Introduction to Apache Kafka
Jeff Holoman
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
Xiang Fu
 
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and PinotExactly-Once Financial Data Processing at Scale with Flink and Pinot
Exactly-Once Financial Data Processing at Scale with Flink and Pinot
Flink Forward
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
confluent
 
Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...Tame the small files problem and optimize data layout for streaming ingestion...
Tame the small files problem and optimize data layout for streaming ingestion...
Flink Forward
 

Similar to Apache Flink and what it is used for (20)

The Past, Present, and Future of Apache Flink®
The Past, Present, and Future of Apache Flink®The Past, Present, and Future of Apache Flink®
The Past, Present, and Future of Apache Flink®
Aljoscha Krettek
 
(Past), Present, and Future of Apache Flink
(Past), Present, and Future of Apache Flink(Past), Present, and Future of Apache Flink
(Past), Present, and Future of Apache Flink
Aljoscha Krettek
 
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward
 
dA Platform Overview
dA Platform OverviewdA Platform Overview
dA Platform Overview
Robert Metzger
 
The Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache FlinkThe Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache Flink
Aljoscha Krettek
 
Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...
Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...
Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...
Flink Forward
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analytics
confluent
 
Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...
Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...
Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...
VMware Tanzu
 
Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...
Aljoscha Krettek
 
Big Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHING
Big Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHINGBig Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHING
Big Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHING
Matt Stubbs
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
Kostas Tzoumas
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017
Nitin Kumar
 
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks
 
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Slim Baltagi
 
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
DataWorks Summit/Hadoop Summit
 
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Slim Baltagi
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming Analytics
Slim Baltagi
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive Applications
VMware Tanzu
 
xGem Data Stream Processing
xGem Data Stream ProcessingxGem Data Stream Processing
xGem Data Stream Processing
Jorge Hirtz
 
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward
 
The Past, Present, and Future of Apache Flink®
The Past, Present, and Future of Apache Flink®The Past, Present, and Future of Apache Flink®
The Past, Present, and Future of Apache Flink®
Aljoscha Krettek
 
(Past), Present, and Future of Apache Flink
(Past), Present, and Future of Apache Flink(Past), Present, and Future of Apache Flink
(Past), Present, and Future of Apache Flink
Aljoscha Krettek
 
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward San Francisco 2018: Robert Metzger & Patrick Lucas - "dA Platfo...
Flink Forward
 
The Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache FlinkThe Past, Present, and Future of Apache Flink
The Past, Present, and Future of Apache Flink
Aljoscha Krettek
 
Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...
Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...
Flink Forward Berlin 2018: Aljoscha Krettek & Till Rohrmann - Keynote: "A Yea...
Flink Forward
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analytics
confluent
 
Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...
Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...
Real-time Analysis of Data Processing Pipelines with Spring Cloud Data Flow a...
VMware Tanzu
 
Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...Stream processing for the practitioner: Blueprints for common stream processi...
Stream processing for the practitioner: Blueprints for common stream processi...
Aljoscha Krettek
 
Big Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHING
Big Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHINGBig Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHING
Big Data LDN 2018: STREAM PROCESSING TAKES ON EVERYTHING
Matt Stubbs
 
Streaming in the Wild with Apache Flink
Streaming in the Wild with Apache FlinkStreaming in the Wild with Apache Flink
Streaming in the Wild with Apache Flink
Kostas Tzoumas
 
Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017Confluent kafka meetupseattle jan2017
Confluent kafka meetupseattle jan2017
Nitin Kumar
 
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next LevelHortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks DataFlow (HDF) 3.3 - Taking Stream Processing to the Next Level
Hortonworks
 
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics FrameworksOverview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Overview of Apache Fink: the 4 G of Big Data Analytics Frameworks
Slim Baltagi
 
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics FrameworksOverview of Apache Flink: the 4G of Big Data Analytics Frameworks
Overview of Apache Flink: the 4G of Big Data Analytics Frameworks
DataWorks Summit/Hadoop Summit
 
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics FrameworksOverview of Apache Fink: The 4G of Big Data Analytics Frameworks
Overview of Apache Fink: The 4G of Big Data Analytics Frameworks
Slim Baltagi
 
Apache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming AnalyticsApache Flink: Real-World Use Cases for Streaming Analytics
Apache Flink: Real-World Use Cases for Streaming Analytics
Slim Baltagi
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive Applications
VMware Tanzu
 
xGem Data Stream Processing
xGem Data Stream ProcessingxGem Data Stream Processing
xGem Data Stream Processing
Jorge Hirtz
 
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward Berlin 2018: Stephan Ewen - Keynote: "Unlocking the next wave o...
Flink Forward
 
Ad

More from Aljoscha Krettek (12)

Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream Processor
Aljoscha Krettek
 
Talk Python To Me: Stream Processing in your favourite Language with Beam on ...
Talk Python To Me: Stream Processing in your favourite Language with Beam on ...Talk Python To Me: Stream Processing in your favourite Language with Beam on ...
Talk Python To Me: Stream Processing in your favourite Language with Beam on ...
Aljoscha Krettek
 
The Evolution of (Open Source) Data Processing
The Evolution of (Open Source) Data ProcessingThe Evolution of (Open Source) Data Processing
The Evolution of (Open Source) Data Processing
Aljoscha Krettek
 
Python Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on FlinkPython Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on Flink
Aljoscha Krettek
 
Robust stream processing with Apache Flink
Robust stream processing with Apache FlinkRobust stream processing with Apache Flink
Robust stream processing with Apache Flink
Aljoscha Krettek
 
Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)
Aljoscha Krettek
 
Advanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applicationsAdvanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applications
Aljoscha Krettek
 
Apache Flink - A Stream Processing Engine
Apache Flink - A Stream Processing EngineApache Flink - A Stream Processing Engine
Apache Flink - A Stream Processing Engine
Aljoscha Krettek
 
Adventures in Timespace - How Apache Flink Handles Time and Windows
Adventures in Timespace - How Apache Flink Handles Time and WindowsAdventures in Timespace - How Apache Flink Handles Time and Windows
Adventures in Timespace - How Apache Flink Handles Time and Windows
Aljoscha Krettek
 
Flink 0.10 - Upcoming Features
Flink 0.10 - Upcoming FeaturesFlink 0.10 - Upcoming Features
Flink 0.10 - Upcoming Features
Aljoscha Krettek
 
Data Analysis with Apache Flink (Hadoop Summit, 2015)
Data Analysis with Apache Flink (Hadoop Summit, 2015)Data Analysis with Apache Flink (Hadoop Summit, 2015)
Data Analysis with Apache Flink (Hadoop Summit, 2015)
Aljoscha Krettek
 
Apache Flink Hands-On
Apache Flink Hands-OnApache Flink Hands-On
Apache Flink Hands-On
Aljoscha Krettek
 
Apache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream ProcessorApache Flink(tm) - A Next-Generation Stream Processor
Apache Flink(tm) - A Next-Generation Stream Processor
Aljoscha Krettek
 
Talk Python To Me: Stream Processing in your favourite Language with Beam on ...
Talk Python To Me: Stream Processing in your favourite Language with Beam on ...Talk Python To Me: Stream Processing in your favourite Language with Beam on ...
Talk Python To Me: Stream Processing in your favourite Language with Beam on ...
Aljoscha Krettek
 
The Evolution of (Open Source) Data Processing
The Evolution of (Open Source) Data ProcessingThe Evolution of (Open Source) Data Processing
The Evolution of (Open Source) Data Processing
Aljoscha Krettek
 
Python Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on FlinkPython Streaming Pipelines with Beam on Flink
Python Streaming Pipelines with Beam on Flink
Aljoscha Krettek
 
Robust stream processing with Apache Flink
Robust stream processing with Apache FlinkRobust stream processing with Apache Flink
Robust stream processing with Apache Flink
Aljoscha Krettek
 
Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)Unified stateful big data processing in Apache Beam (incubating)
Unified stateful big data processing in Apache Beam (incubating)
Aljoscha Krettek
 
Advanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applicationsAdvanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applications
Aljoscha Krettek
 
Apache Flink - A Stream Processing Engine
Apache Flink - A Stream Processing EngineApache Flink - A Stream Processing Engine
Apache Flink - A Stream Processing Engine
Aljoscha Krettek
 
Adventures in Timespace - How Apache Flink Handles Time and Windows
Adventures in Timespace - How Apache Flink Handles Time and WindowsAdventures in Timespace - How Apache Flink Handles Time and Windows
Adventures in Timespace - How Apache Flink Handles Time and Windows
Aljoscha Krettek
 
Flink 0.10 - Upcoming Features
Flink 0.10 - Upcoming FeaturesFlink 0.10 - Upcoming Features
Flink 0.10 - Upcoming Features
Aljoscha Krettek
 
Data Analysis with Apache Flink (Hadoop Summit, 2015)
Data Analysis with Apache Flink (Hadoop Summit, 2015)Data Analysis with Apache Flink (Hadoop Summit, 2015)
Data Analysis with Apache Flink (Hadoop Summit, 2015)
Aljoscha Krettek
 
Ad

Recently uploaded (20)

Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdfComplete Guide to Advanced Logistics Management Software in Riyadh.pdf
Complete Guide to Advanced Logistics Management Software in Riyadh.pdf
Software Company
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
#StandardsGoals for 2025: Standards & certification roundup - Tech Forum 2025
BookNet Canada
 
Drupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy ConsumptionDrupalcamp Finland – Measuring Front-end Energy Consumption
Drupalcamp Finland – Measuring Front-end Energy Consumption
Exove
 
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Enhancing ICU Intelligence: How Our Functional Testing Enabled a Healthcare I...
Impelsys Inc.
 
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath MaestroDev Dives: Automate and orchestrate your processes with UiPath Maestro
Dev Dives: Automate and orchestrate your processes with UiPath Maestro
UiPathCommunity
 
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptxDevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
DevOpsDays Atlanta 2025 - Building 10x Development Organizations.pptx
Justin Reock
 
Mobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi ArabiaMobile App Development Company in Saudi Arabia
Mobile App Development Company in Saudi Arabia
Steve Jonas
 
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptxIncreasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Increasing Retail Store Efficiency How can Planograms Save Time and Money.pptx
Anoop Ashok
 
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes Partner Innovation Updates for May 2025
ThousandEyes
 
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
AI EngineHost Review: Revolutionary USA Datacenter-Based Hosting with NVIDIA ...
SOFTTECHHUB
 
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
Transcript: #StandardsGoals for 2025: Standards & certification roundup - Tec...
BookNet Canada
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?How Can I use the AI Hype in my Business Context?
How Can I use the AI Hype in my Business Context?
Daniel Lehner
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 

Apache Flink and what it is used for

  • 1. Aljoscha Krettek – PMC of Apache Flink and Apache Beam, Co-Founder at data Artisans Apache Flink® and what it is used for
  • 2. © 2018 data Artisans2 About Data Artisans Original Creators of Apache Flink® RealTime Stream Processing Enterprise Ready 2008 2009 2011 Studying and working at IBM Development on Stratosphere begins I joinTU Berlin research group with other co-founders 2014 data Artisans is founded
  • 3. © 2018 data Artisans3 What is Apache Flink? Queries Applications Devices etc. Database Stream File / Object Storage Historic Data Streams Application Stateful computations over streams real-time and historic fast, scalable, fault tolerant, in-memory, event time, large state, exactly-once
  • 4. © 2018 data Artisans4 Overview of Flink Use Cases Event-driven applications Stream Processing Batch processing
  • 5. © 2018 data Artisans5 Batch Processing a.k.a. collect now, figure out later
  • 6. © 2018 data Artisans6 Batch Processing Technologies Supported by Flink DataSet API
  • 7. © 2018 data Artisans7 Observation 1 The origin of data are streams
  • 8. © 2018 data Artisans8 Stream Processing we can build applications directly on data streams
  • 9. © 2018 data Artisans9 Stream Processing Technologies Supported by Flink DataStream API
  • 10. © 2018 data Artisans10 Observation 2 Stream Processing changes the database-centric architecture
  • 11. © 2018 data Artisans11 Changing the Two-Tier Architecture reads/writes across tier boundary asynchronous writes of large blobs all modifications are local Classic tiered architecture Streaming architectureClassic tiered architecture database layer compute layer application working state + historic state compute + stream storage and snapshot storage (backup) application state Streaming architecture
  • 12. © 2018 data Artisans12 Keystone Routing Pipeline at Netflix (as presented at Flink Forward San Francisco, 2018) @  Athena X  SQL to define metrics  Thresholds and actions to trigger  Blends analytics and actions Streams from Hadoop, Kafka, etc SQL, thresholds, actions Analytics Alerts Derived streams
  • 13. © 2018 data Artisans13 Observation 3 Stream Processing is about building applications, not platforms
  • 14. © 2018 data Artisans14 Internal streaming data platforms built with Apache Flink
  • 15. © 2018 data Artisans15 KubernetesResource Manager Logging Metrics CI / CD Application Platforms deploying new applications scaling applications Kubernetes
  • 16. © 2018 data Artisans16 Kubernetes Database Kubernetes • Example: Scaling down a replicated database • 3 replicas, 4 node scale down need to move or reorganize data before container shutdown Kubernetes & stateful applications What about stateful containers?
  • 17. © 2018 data Artisans17 • consistent stateful upgrades ‒application evolution and bug fixes • migration of application state ‒cluster migration, A/B testing • re-processing and reinstatement ‒fix corrupt results, bootstrap new applications • state evolution (schema evolution) A B Stateful Questions Container-based Resource Orchestration Stateful Stream Processing & Snapshots Kubernetes Apache Flink Container-based platform for stateful data-driven applications dA Platform Code, Resource, Config, and Snapshot Management Application Manager
  • 18. © 2018 data Artisans18 Architecture Apache Flink Stateful stream processing Kubernetes Container platform Logging Metrics dA Application Manager Application lifecycle management Storage In Action App Manager Kubernetes Resource Allocation Job Control Snapshot Management CI/CDWeb interface
  • 19. © 2018 data Artisans19 Powered by Apache Flink
  • 20. © 2018 data Artisans20 Free Trial Download of data Artisans Platform data-artisans.com/download
  • 21. THANK YOU! WE ARE HIRING data-artisans.com/careers @aljoscha @dataArtisans @ApacheFlink
  • 22. © 2018 data Artisans22 Backup Slides

Editor's Notes

  • #3: • data Artisans was founded by the original creators of Apache Flink • We provide dA Platform, a complete stream processing infrastructure with open-source Apache Flink
  • #13: Alibaba 472 mio records / second at peak Largest job: thousands of subtasks, tens of TBs of state Thousands of jobs >5k nodes >500k CPU cores Netflix ~ 3 trillion events/day ~2000 routing jobs ~10k containers ~200k parallel operator instances Uber Athena X SQL to define metrics Thresholds and actions to trigger Blends analytics and actions All have in common: built internal platforms
  • #21: • Also included is the Application Manager, which turns dA Platform into a self-service platform for stateful stream processing applications. • dA Platform is generally available, and you can download a free trial today!