Rob Keevil - KnowIT
Fokko Driesprong - GoDataDriven
Working with Skewed Data:
The Iterative Broadcast
#EUde11
About Rob Keevil
• Philosophy, Politics & Economics
• Big Data Architect
• Financial and Anti-Fraud Domains
• Scala Programmer
About Fokko Driesprong
• Software Engineering and Distributed Systems
• Data Engineer at GoDataDriven
• Committer & PPMC Member, Apache Airflow
• Open source enthusiast
https://godatadriven.com/players/fokko-driesprong
Overview
• The Skew Problem at ING
• Join Strategies in Spark
• Our Solution
• Performance Analysis
• Considerations
• Future Work
The Skew Problem at ING
• Billions of transactions…
• Billions of accounts…
• Millions of companies…
• …with a very uneven distribution!
Familiar?
Join Types Native to Spark
• Sort Merge Join
• Broadcast Hash Join
• Shuffled Hash Join
• Broadcast Nested Loop Join
• Cartesian Product
The Sort Merge Join
[Diagram: sort merge join of two evenly distributed tables across two executors]
Stage 1: Determine partitions by hashing the key(s)
Table A (Key, Column A): 1 HELLO, 2 SPARK, 3 GOOD, 4 SPAM
Table B (Key, Column B): 1 WORLD!, 2 SUMMIT, 1 DUBLIN, 2 STREAMING, 1 EVERYBODY, 2 SQL, 3 MORNING, 3 EVENING, 3 NIGHT, 4 SPAM, 4 SPAM, 4 SPAM
Executor 1 receives keys 1 and 3; Executor 2 receives keys 2 and 4
Stage 2: Merge
Result on Executor 1:
Key Column A Column B
1 HELLO WORLD!
1 HELLO DUBLIN
1 HELLO EVERYBODY
3 GOOD MORNING
3 GOOD EVENING
3 GOOD NIGHT
Result on Executor 2:
Key Column A Column B
2 SPARK SUMMIT
2 SPARK STREAMING
2 SPARK SQL
4 SPAM SPAM
4 SPAM SPAM
4 SPAM SPAM
[Diagram: the same sort merge join with a skewed table]
Stage 1: Determine partitions by hashing the key(s)
Regular Table (Key, Column A): 1 HELLO, 2 SPARK, 3 GOOD, 4 SPAM
Skewed Table (Key, Column B): 4 SPAM, 2 SUMMIT, 3 MORNING, 4 SPAM, 1 WORLD!, 4 SPAM, 4 SPAM, 4 SPAM, 4 SPAM, 4 SPAM, 4 SPAM, 4 SPAM
Executor 1 receives keys 1 and 3; Executor 2 receives keys 2 and 4
Stage 2: Merge
Result on Executor 1 (2 rows):
Key Column A Column B
1 HELLO WORLD!
3 GOOD MORNING
Result on Executor 2 (10 rows):
Key Column A Column B
2 SPARK SUMMIT
4 SPAM SPAM (9 rows)
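The imbalance in the two examples above can be reproduced with a minimal sketch in plain Python. It is not Spark code: `partition_sizes` is a hypothetical helper, and routing integer keys by modulo is only an assumed stand-in for Spark's hash partitioning. The key lists are the join keys of the two "Column B" tables from the slides.

```python
from collections import Counter


def partition_sizes(keys, num_partitions):
    """Count how many rows land in each partition when rows are routed by key."""
    # Assumption: modulo on integer keys stands in for hash partitioning.
    return Counter(key % num_partitions for key in keys)


# Join keys of the evenly distributed table and of the skewed table.
uniform_keys = [1, 2, 1, 2, 1, 2, 3, 3, 3, 4, 4, 4]
skewed_keys = [4, 2, 3, 4, 1, 4, 4, 4, 4, 4, 4, 4]

uniform_sizes = partition_sizes(uniform_keys, 2)
skewed_sizes = partition_sizes(skewed_keys, 2)

print(dict(uniform_sizes))  # both partitions hold 6 rows
print(dict(skewed_sizes))   # one partition holds 10 rows, the other 2
```

All rows with the same key must land in the same partition, so a single hot key decides how much work the slowest executor has to do.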
Potential Solutions
• Give it more resources?
• Repartition the data?
What about Broadcast Hash Join?
• Leaves the partitions of the larger table untouched
• Copies the entire smaller table to every executor
• Iterates over the large table, and joins using a HashMap
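The three bullets above can be sketched in plain Python. This is an illustration, not Spark's implementation: `broadcast_hash_join` is a hypothetical function, a `dict` plays the role of the HashMap, and each inner list stands for one partition of the larger table. The rows are taken from the slides.

```python
def broadcast_hash_join(large_partitions, small_table):
    """Left join every partition of the large table against the small table."""
    # "Broadcast": one hash map built from the whole small table,
    # probed by every partition.
    lookup = dict(small_table)
    joined = []
    for partition in large_partitions:
        # The large table is not shuffled; rows stay in their partition.
        joined.append([(key, lookup.get(key), value) for key, value in partition])
    return joined


small = [(1, "HELLO"), (4, "SPAM"), (2, "SPARK"), (3, "GOOD")]
large = [
    [(4, "SPAM"), (4, "SPAM"), (4, "SPAM"), (2, "SUMMIT"), (3, "MORNING"), (4, "SPAM")],
    [(4, "SPAM"), (4, "SPAM"), (1, "WORLD!"), (4, "SPAM"), (4, "SPAM"), (4, "SPAM")],
]

result = broadcast_hash_join(large, small)
for partition in result:
    print(partition)
```

Because the large table keeps its original partitioning, the hot key 4 stays spread over both partitions instead of piling up on one executor.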
[Diagram: broadcast hash join across two executors]
Stage 1: Broadcast smaller dataset to all executors
Smaller Table (Key, Column A): 1 HELLO, 4 SPAM, 2 SPARK, 3 GOOD
Larger Table on Executor 1 (Key, Column B): 4 SPAM, 4 SPAM, 4 SPAM, 2 SUMMIT, 3 MORNING, 4 SPAM
Larger Table on Executor 2 (Key, Column B): 4 SPAM, 4 SPAM, 1 WORLD!, 4 SPAM, 4 SPAM, 4 SPAM
Stage 2: Left join
Result on Executor 1:
Key Column A Column B
4 SPAM SPAM
4 SPAM SPAM
4 SPAM SPAM
2 SPARK SUMMIT
3 GOOD MORNING
4 SPAM SPAM
Result on Executor 2:
Key Column A Column B
4 SPAM SPAM
4 SPAM SPAM
1 HELLO WORLD!
4 SPAM SPAM
4 SPAM SPAM
4 SPAM SPAM
But…
• …What if your “smaller” table isn’t so small?
• The whole small table needs to fit in driver memory and in the memory of every executor
Introducing: The Iterative Broadcast!
• Divide the smaller table into “passes”
• Broadcast a pass & left join to the larger dataset
• Clear the broadcast partition from memory
• Repeat until all passes are processed
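The loop described by these four bullets can be sketched in plain Python. It is an illustration under assumptions, not the authors' Spark code: `iterative_broadcast_join` is a hypothetical function, pass numbers are assigned by key modulo the number of passes (the slides do not say how passes are assigned), and a `dict` per pass stands in for the broadcast variable.

```python
def iterative_broadcast_join(large_partitions, small_table, num_passes):
    """Left join the large table against the small table one pass at a time."""
    # Divide the smaller table into passes (assumed rule: key modulo num_passes).
    passes = [dict() for _ in range(num_passes)]
    for key, value in small_table:
        passes[key % num_passes][key] = value

    # Start with the small-table column empty for every large-table row.
    result = [[(key, None, b) for key, b in partition] for partition in large_partitions]

    for lookup in passes:
        # Broadcast one pass and left join it, keeping matches from earlier passes.
        result = [
            [(key, a if a is not None else lookup.get(key), b) for key, a, b in partition]
            for partition in result
        ]
        # The pass is no longer needed here and can be cleared from memory.
    return result


small = [(1, "HELLO"), (4, "SPAM"), (2, "SPARK"), (3, "GOOD")]
large = [
    [(4, "SPAM"), (4, "SPAM"), (4, "SPAM"), (2, "SUMMIT"), (3, "MORNING"), (4, "SPAM")],
    [(4, "SPAM"), (4, "SPAM"), (1, "WORLD!"), (4, "SPAM"), (4, "SPAM"), (4, "SPAM")],
]

joined = iterative_broadcast_join(large, small, num_passes=2)
for partition in joined:
    print(partition)
```

Only one pass of the smaller table is held in memory at a time, which is the point of the technique; the cost is one join over the larger table per pass.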
[Diagram: iterative broadcast join across two executors]
Stage 1: Assign Pass Number to Smaller Table
Key Column A Pass
1 HELLO 1
4 SPAM 2
2 SPARK 2
3 GOOD 1
Stage 2: Broadcast first pass (keys 1 and 3) to both executors
Stage 3: Left join first pass
After the first pass, only the rows with keys 1 and 3 have Column A filled in:
Executor 1: 3 GOOD MORNING
Executor 2: 1 HELLO WORLD!
Stage 4: Broadcast second pass (keys 2 and 4) to both executors
Stage 5: Left join second pass
Performance Analysis
• Setup notes
– Tests on Amazon EMR using 4x m4.4xlarge workers
– 5 cores and 18 GB memory per executor
– Spark 2.1
– Benchmark on GitHub
– Skewed dataset, 10000 keys
• Largest key 10000 rows, 2nd key 5000 rows, 3rd key 2500 rows … 10000th key 1 row
– Uniform dataset, evenly spread
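A distribution shaped like the skewed dataset described above can be generated with a short sketch. The halving rule with a floor of one row is an assumption read off the listed counts (10000, 5000, 2500, …, 1); the slides do not give the exact generator, and `rows_for_key` and `skewed_key_counts` are hypothetical names.

```python
def rows_for_key(rank, largest=10000):
    """Row count for the key with the given 1-based rank."""
    # Assumed rule: each key has half the rows of the previous one, never below 1.
    return max(1, largest >> (rank - 1))


def skewed_key_counts(num_keys=10000, largest=10000):
    """Row counts for all keys, from the largest key to the smallest."""
    return [rows_for_key(rank, largest) for rank in range(1, num_keys + 1)]


counts = skewed_key_counts()
total_rows = sum(counts)
print(counts[:3], counts[-1])
print("share of rows in the largest key:", counts[0] / total_rows)
```

Under this assumed rule the largest key alone accounts for roughly a third of all rows, which is the kind of imbalance that stalls a sort merge join.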
Performance Analysis – Skewed Data
[Chart]
Performance Analysis – Uniform Data
[Chart]
Considerations
• Hyperparameters
– Size of the broadcast variable
– The number of passes
• Additional complexity
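The two hyperparameters are linked: once a size limit for a single broadcast is chosen, it gives a lower bound on the number of passes. A minimal sketch of that arithmetic follows; `passes_needed` and both parameter names are hypothetical, and sizes are assumed to be measured in bytes.

```python
def passes_needed(small_table_bytes, broadcast_budget_bytes):
    """Smallest number of passes so that each pass fits in the broadcast budget."""
    if broadcast_budget_bytes <= 0:
        raise ValueError("broadcast budget must be positive")
    # Integer ceiling division; at least one pass is always required.
    return max(1, (small_table_bytes + broadcast_budget_bytes - 1) // broadcast_budget_bytes)


# Example: a 10 GB table with a 3 GB budget per broadcast.
gib = 1024 ** 3
print(passes_needed(10 * gib, 3 * gib))
```

This assumes rows are spread evenly over the passes; with uneven passes, more passes than this lower bound may be needed.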
Future Work
• In-memory iterations
• Native to Spark
Recap
• Skew hurts Spark parallelism and stability
• Default join types have issues with skewed data
• Iterative Broadcast Join can be used to process skewed data while maintaining parallelism
That’s All Folks!
https://github.com/godatadriven/iterative-broadcast-join
Ad

More Related Content

What's hot (20)

Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
Databricks
 
Spark SQL Join Improvement at Facebook
Spark SQL Join Improvement at FacebookSpark SQL Join Improvement at Facebook
Spark SQL Join Improvement at Facebook
Databricks
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Databricks
 
Rds data lake @ Robinhood
Rds data lake @ Robinhood Rds data lake @ Robinhood
Rds data lake @ Robinhood
BalajiVaradarajan13
 
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
Andrew Lamb
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
hadooparchbook
 
Simplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks DeltaSimplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks Delta
Databricks
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache Spark
Databricks
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Databricks
 
Spark overview
Spark overviewSpark overview
Spark overview
Lisa Hua
 
MPP vs Hadoop
MPP vs HadoopMPP vs Hadoop
MPP vs Hadoop
Alexey Grishchenko
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0
Cloudera, Inc.
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
Databricks
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
Kernel TLV
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
HostedbyConfluent
 
Flink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL EngineFlink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL Engine
HostedbyConfluent
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
The Hive
 
A look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutionsA look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutions
Databricks
 
Analyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Analyzing MySQL Logs with ClickHouse, by Peter ZaitsevAnalyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Analyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Altinity Ltd
 
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Alex Levenson
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
Databricks
 
Spark SQL Join Improvement at Facebook
Spark SQL Join Improvement at FacebookSpark SQL Join Improvement at Facebook
Spark SQL Join Improvement at Facebook
Databricks
 
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Designing ETL Pipelines with Structured Streaming and Delta Lake—How to Archi...
Databricks
 
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
2022-06-23 Apache Arrow and DataFusion_ Changing the Game for implementing Da...
Andrew Lamb
 
Top 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applicationsTop 5 mistakes when writing Spark applications
Top 5 mistakes when writing Spark applications
hadooparchbook
 
Simplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks DeltaSimplifying Change Data Capture using Databricks Delta
Simplifying Change Data Capture using Databricks Delta
Databricks
 
Deep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache SparkDeep Dive: Memory Management in Apache Spark
Deep Dive: Memory Management in Apache Spark
Databricks
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Databricks
 
Spark overview
Spark overviewSpark overview
Spark overview
Lisa Hua
 
Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0Efficient Data Storage for Analytics with Apache Parquet 2.0
Efficient Data Storage for Analytics with Apache Parquet 2.0
Cloudera, Inc.
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
Databricks
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
HostedbyConfluent
 
Flink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL EngineFlink SQL: The Challenges to Build a Streaming SQL Engine
Flink SQL: The Challenges to Build a Streaming SQL Engine
HostedbyConfluent
 
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of FacebookTech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook
The Hive
 
A look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutionsA look under the hood at Apache Spark's API and engine evolutions
A look under the hood at Apache Spark's API and engine evolutions
Databricks
 
Analyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Analyzing MySQL Logs with ClickHouse, by Peter ZaitsevAnalyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Analyzing MySQL Logs with ClickHouse, by Peter Zaitsev
Altinity Ltd
 
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Hadoop Summit 2015: Performance Optimization at Scale, Lessons Learned at Twi...
Alex Levenson
 

Viewers also liked (6)

Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...
Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...
Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...
Spark Summit
 
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
Spark Summit
 
Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...
Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...
Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...
Spark Summit
 
Saving Energy with Apache Spark and Toon with Stephen Galsworthy
Saving Energy with Apache Spark and Toon with Stephen GalsworthySaving Energy with Apache Spark and Toon with Stephen Galsworthy
Saving Energy with Apache Spark and Toon with Stephen Galsworthy
Spark Summit
 
Optimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Optimal Strategies for Large Scale Batch ETL Jobs with Emma TangOptimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Optimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Databricks
 
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuAn Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
Databricks
 
Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...
Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...
Beyond unit tests: Testing for Spark/Hadoop Workflows with Shankar Manian Ana...
Spark Summit
 
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
Spark Summit
 
Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...
Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...
Fire in the Sky: An Introduction to Monitoring Apache Spark in the Cloud with...
Spark Summit
 
Saving Energy with Apache Spark and Toon with Stephen Galsworthy
Saving Energy with Apache Spark and Toon with Stephen GalsworthySaving Energy with Apache Spark and Toon with Stephen Galsworthy
Saving Energy with Apache Spark and Toon with Stephen Galsworthy
Spark Summit
 
Optimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Optimal Strategies for Large Scale Batch ETL Jobs with Emma TangOptimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Optimal Strategies for Large Scale Batch ETL Jobs with Emma Tang
Databricks
 
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai YuAn Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
An Adaptive Execution Engine for Apache Spark with Carson Wang and Yucai Yu
Databricks
 
Ad

More from Spark Summit (20)

FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Spark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
Spark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
Spark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Spark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
Spark Summit
 
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
Spark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Spark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Spark Summit
 
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
Spark Summit
 
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
Spark Summit
 
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang WuApache Spark Structured Streaming Helps Smart Manufacturing with  Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Spark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
Spark Summit
 
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin KimPowering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
Spark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Spark Summit
 
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Spark Summit
 
Working with Skewed Data: The Iterative Broadcast, with Fokko Driesprong and Rob Keevil