Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications
Francisco J. Lacueva, ITAINNOVA
Rosa Montañés, ITAINNOVA
#EUent6
Index
• everisMoriarty
– Win-win dialog: Developer & Data Scientist
– everisMoriarty platform
• Spark in everisMoriarty
– Basic example
– Real cases
• TT
• FACTS4WORKERS
• Streaming
everisMoriarty
everisMoriarty is a platform for end-to-end development of data analytics models
everis (www.everis.es) belongs to NTT Data Group.
It offers IT solutions, services, and outsourcing to several sectors, such as telcos, financial entities, public administration, industry, utilities, energy providers, or health companies.
Revenues: 1031M€
everis has sites in 16 countries and 19000 professionals.
NTT Data Group has sites in 50 countries and 110000 professionals.
ITAINNOVA (www.itainnova.es) is a non-profit R&D centre owned by the Aragon Regional Government.
Sites: Zaragoza and Huesca
Facilities: 15000 m2
Budget: 15M€
59% private projects, 30% CPF, 11% NCPF
Investments: 1.5M€
Knowledge areas: Materials, Mechatronics, Electrical Power Systems, Industrial Processes, ICT-Big Data, Laboratories - Quality
everisMoriarty Win-win dialog
Aspect “WorkTeam” (solution)
Developer: "I use your models and develop tools more quickly and effectively."
Data Scientist: "I build models, patterns, and cognitive systems ready to use."
Data Scientist
Collaborate with developer
everisMoriarty
More than 100 workitems (WIs) covering ML and DL models, DB connectors, crawlers, text mining algorithms, Spark components…
Workflows (WFs) can be used as a WI within another WF.
Supported programming languages: Java, Python.
Automatic provision of REST APIs for published WFs.
Easy integration with third-party APIs.
[Figure: platform editor showing the list of workitems, an eM WorkFlow, an eM WorkItem, WorkFlow parameters, and the Deploy button]
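The idea that a workflow can be reused as a workitem inside another workflow can be sketched in a few lines of Python. This is only an illustration under assumptions: the `WorkFlow` class, the `WorkItem` type, and the three step functions are hypothetical names invented here, not the everisMoriarty API.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for platform concepts; not the everisMoriarty API.
WorkItem = Callable[[Dict], Dict]  # a WI maps a context dict to a new dict


class WorkFlow:
    """An ordered chain of workitems that is itself callable like a workitem."""

    def __init__(self, name: str, steps: List[WorkItem]):
        self.name = name
        self.steps = steps

    def __call__(self, context: Dict) -> Dict:
        # Run every step in order, feeding each result into the next step.
        for step in self.steps:
            context = step(context)
        return context


def lowercase_text(ctx: Dict) -> Dict:
    return {**ctx, "text": ctx["text"].lower()}


def count_words(ctx: Dict) -> Dict:
    return {**ctx, "n_words": len(ctx["text"].split())}


def tag_length(ctx: Dict) -> Dict:
    return {**ctx, "is_long": ctx["n_words"] > 3}


# An inner WF, then an outer WF that uses the inner one as a single step.
preprocess = WorkFlow("preprocess", [lowercase_text, count_words])
pipeline = WorkFlow("pipeline", [preprocess, tag_length])

result = pipeline({"text": "Hiding Apache Spark Complexity"})
```

Because a `WorkFlow` has the same call signature as a workitem, nesting needs no special case: the outer workflow cannot tell whether a step is a single function or a whole chain.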
Spark in everisMoriarty
Spark Cluster (Cloudera CDH)
eM Spark WIs:
Generic, SparkStreaming, SparkML
Spark WIs overview
Spark Basic WIs
• Spark Context:
– SparkContextCreator
– SparkContextStop
• Data Sources:
– SparkCSVLoader
– SparkCSVWriter
– MongoSpark
– …
Spark Streaming WIs
• Spark Streaming Context:
– SparkStreamingContextCreator
– SparkStreamingContextStartAndWait
– SparkStreamingContextStop
• DStream transformations:
– SparkStreamingInputDStreamWindower
Spark Machine Learning WIs
• SparkKMeans
• SparkLinearRegression
• SparkDecisionTreeClassification
• SparkPCA
• SparkClassifier
• SparkTokenizer
• SparkTfIdf
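To make the last two machine learning workitems more concrete, here is a minimal pure-Python sketch of tokenization followed by TF-IDF weighting. It assumes, without confirmation from the slides, that SparkTokenizer and SparkTfIdf wrap the corresponding Spark ML stages; the smoothed IDF used here, log((N + 1) / (df + 1)), is the one Spark's documentation gives. The function names and the sample corpus are made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Raw term frequency times smoothed IDF: log((N + 1) / (df + 1))."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((n_docs + 1) / (df + 1)) for t, df in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


corpus = ["Spark hides complexity", "Spark streaming", "fast prototyping"]
weights = tf_idf([tokenize(d) for d in corpus])
```

A term that appears in many documents, such as "spark" here, gets a lower weight than a term that appears in only one.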
Spark in everisMoriarty
Integration with Spark:
– Spark WFs (using Spark WIs)
• Native Spark operations & Spark data types
– Other WFs
• No native Spark operations & no Spark data types
• Reused within Spark WFs, i.e. as mapFunction.
• Distributed execution over the Spark environment.
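A rough sketch of how a workflow with no Spark operations or data types could be reused as a map function follows. All names (`record_processing_wf`, `as_map_function`, `run_locally`) are hypothetical, and the local list comprehension only stands in for the distributed step; with PySpark the wrapped function would be passed to `rdd.map` instead.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def record_processing_wf(record: Record) -> Record:
    """Hypothetical non-Spark workflow: plain Python, no Spark types."""
    cleaned = record["text"].strip().lower()
    return {**record, "text": cleaned, "length": str(len(cleaned))}


def as_map_function(wf: Callable[[Record], Record]) -> Callable[[Record], Record]:
    """Wrap a workflow so a failing record is flagged instead of aborting the job."""
    def map_function(record: Record) -> Record:
        try:
            return wf(record)
        except Exception as exc:  # keep the job alive; mark the bad record
            return {**record, "error": type(exc).__name__}
    return map_function


def run_locally(records: Iterable[Record], fn: Callable[[Record], Record]) -> List[Record]:
    """Local stand-in for a distributed map; with PySpark this would be rdd.map(fn)."""
    return [fn(r) for r in records]


records = [{"id": "1", "text": "  Hello Spark "}, {"id": "2"}]
processed = run_locally(records, as_map_function(record_processing_wf))
```

The workflow itself stays free of Spark imports, so it can be tested on a laptop and later distributed unchanged.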
Spark in everisMoriarty
(Very) Basic Example
Start Event → SparkLoadData → ProcessData → SparkStoreData → End Event
ProcessData runs a record-processing WF, which encapsulates the mapFunction.
The record-processing WF is Data Scientist-based. Examples:
- Classification (classify data)
- Opinion Mining (classify by opinion)
- Invoke REST APIs (use external resource)
- ...
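The load, process, store chain of the basic example can be imitated locally with the standard library. In this sketch `load_data` and `store_data` are hypothetical stand-ins for the SparkLoadData and SparkStoreData steps, and `classify_opinion` is a deliberately naive word-list classifier playing the role of the record-processing WF; none of them reflects the platform's actual workitems.

```python
import csv
import io
from typing import Dict, List

POSITIVE = {"good", "great", "fast"}
NEGATIVE = {"bad", "slow", "broken"}


def load_data(csv_text: str) -> List[Dict[str, str]]:
    """Stand-in for SparkLoadData: parse CSV text into records."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def classify_opinion(record: Dict[str, str]) -> Dict[str, str]:
    """Toy opinion-mining mapFunction: count positive vs negative words."""
    words = record["comment"].lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {**record, "opinion": label}


def store_data(records: List[Dict[str, str]]) -> str:
    """Stand-in for SparkStoreData: serialize records back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


raw = "id,comment\n1,great and fast service\n2,slow delivery\n3,arrived today\n"
stored = store_data([classify_opinion(r) for r in load_data(raw)])
```

Swapping `classify_opinion` for another per-record function changes what the pipeline does without touching the load and store steps.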
Real case
Spark in everisMoriarty
• PILOT DOMAIN: Dynamic Supply Networks / eCommerce
• DOMAIN LEAD: Athens University of Economics and Business AUEB
• USE CASES:
• Route Planning
• Forecasting
https://transformingtransport.eu/
Data Transformation & Route Planning Service Invocation
Spark in everisMoriarty
• Data processed over Spark platform
• Parallelized execution
<Service>
  <ID>261680</ID>
  <Name>Cliente261680</Name>
  <Duration>PT10M</Duration>
  <Location>
    <Address>No Address</Address>
    <HouseNumber />
    <City />
    <PostalCode />
    <Region />
    <Country>ES</Country>
    <Coord srs="EPSG:4326" x="37.984332" y="23.726466" />
    <GeocodeLevel>XY</GeocodeLevel>
  </Location>
  <Priority>1</Priority>
  <Windows>
    <Window start="2017-04-28T08:00:00" end="2017-04-28T16:00:00" />
  </Windows>
  <UnloadUnits>1</UnloadUnits>
  <UnloadKg>6.000</UnloadKg>
  <UnloadM3>0.036</UnloadM3>
  <Comments />
</Service>
Company A
Company B
Company C
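As an illustration of the data transformation step, the sketch below parses one `<Service>` record like the one shown above and turns it into a flat JSON payload. The element and attribute names come from the sample; the target schema (`service_minutes`, `load_kg`, the `stops` wrapper) is an assumption made up here, since the slides do not show the route planning service's request format. The sample is shortened to the fields the sketch reads.

```python
import json
import re
import xml.etree.ElementTree as ET
from typing import Dict

SERVICE_XML = """<Service>
  <ID>261680</ID>
  <Name>Cliente261680</Name>
  <Duration>PT10M</Duration>
  <Location>
    <Country>ES</Country>
    <Coord srs="EPSG:4326" x="37.984332" y="23.726466" />
  </Location>
  <Priority>1</Priority>
  <Windows>
    <Window start="2017-04-28T08:00:00" end="2017-04-28T16:00:00" />
  </Windows>
  <UnloadKg>6.000</UnloadKg>
</Service>"""


def duration_minutes(iso: str) -> int:
    """Parse the hour/minute subset of ISO 8601 durations, e.g. PT10M or PT1H30M."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?", iso)
    if match is None:
        raise ValueError(f"unsupported duration: {iso}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return hours * 60 + minutes


def service_to_stop(xml_text: str) -> Dict:
    """Map one <Service> element to a flat dict (hypothetical target schema)."""
    svc = ET.fromstring(xml_text)
    coord = svc.find("Location/Coord")
    window = svc.find("Windows/Window")
    return {
        "id": svc.findtext("ID"),
        "service_minutes": duration_minutes(svc.findtext("Duration")),
        "x": float(coord.get("x")),
        "y": float(coord.get("y")),
        "window": [window.get("start"), window.get("end")],
        "load_kg": float(svc.findtext("UnloadKg")),
    }


stop = service_to_stop(SERVICE_XML)
payload = json.dumps({"stops": [stop]})
```

Since `service_to_stop` works on one record at a time and keeps no state, it is the kind of function that can be applied in parallel across many records.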
Real case
Spark in everisMoriarty
FACTS4WORKERS. Worker Centric Workplaces in Smart Factories.
PILOT DOMAIN: H2020- FoF 4. 2014
DOMAIN LEAD: Virtual Vehicle
USE CASES:
Assessing the Impact on Workers Based on System-Logged Information
4 large industry partners
5 non-profit research centers
2 SMEs
This project has received funding from
the European Union's Horizon 2020
research and innovation programme
under Grant Agreement n˚ 636778
http://facts4workers.eu
Spark in everisMoriarty
[Figure: FACTS4WORKERS approach and architecture.
Approach (iterative, agile development): Abstract World and Real World; As-is Situation, Problem Scenario, Should-be Situation, Activity Scenario; Instance, Problem, Solution, Artefact, Evaluation.
Architecture layers:
• HMI/HCI: mobile devices, wearables (e.g. smart watches), smart glasses, desktop.
• Workflow Engine (WFE): the BBs and services provided by the WFE offer standard APIs (REST JSON API).
• F4W BBs: Authentication, Control Charts, Logbook, ERP Interface, Multimedia Management, Semantic Search, Training Module, Chat Module, Video Chat Module, Alarm Warning Manager, User Content Rating, Machine Status, Operator Skill Profiling, Task Manager, planned BBs…
• Backend Systems: Docs Repository, PMS, KMS, Social Big Data, Industrial IoT Big Data, 3D Models Transformers, Environment Sensors, Data Repositories, Social Software, Security System, ERP…]
Spark in everisMoriarty
[Figure: the F4W architecture (HMI/HCI, Workflow Engine, F4W BBs, Backend Systems) with the added labels Dashboard, Alarms, and Reports.]
• Real case: Spark Streaming
Spark in everisMoriarty
• USE CASES:
• Urban Mobility events processing
• Classification of individuals
Spark in everisMoriarty
Urban mobility events processing & classification
[Figure: pipeline with Spark Context, Spark Streaming connector, Spark Engine, and Spark Streaming Context Start; outputs to HDFS, DBs, and Dashboards]
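The two streaming use cases, event processing and classification of individuals, can be pictured as a window over micro-batches followed by a per-window classification. The sketch below is pure Python and entirely hypothetical: the event fields, the `frequent`/`occasional` rule, and the threshold are invented for illustration. In Spark Streaming the windowing step would typically be a DStream `window` operation rather than the `deque` used here.

```python
from collections import Counter, deque
from typing import Deque, Dict, Iterable, Iterator, List

Event = Dict[str, str]


def windowed(batches: Iterable[List[Event]], window_len: int) -> Iterator[List[Event]]:
    """Yield, per incoming micro-batch, the events of the last `window_len` batches."""
    recent: Deque[List[Event]] = deque(maxlen=window_len)
    for batch in batches:
        recent.append(batch)  # the oldest batch drops out automatically
        yield [event for b in recent for event in b]


def classify_individuals(events: List[Event], threshold: int = 2) -> Dict[str, str]:
    """Toy rule: users with at least `threshold` events in the window are 'frequent'."""
    counts = Counter(e["user"] for e in events)
    return {u: "frequent" if n >= threshold else "occasional" for u, n in counts.items()}


batches = [
    [{"user": "a", "stop": "s1"}],
    [{"user": "a", "stop": "s2"}, {"user": "b", "stop": "s1"}],
    [{"user": "b", "stop": "s3"}],
]
labels_per_window = [classify_individuals(w) for w in windowed(batches, window_len=2)]
```

With a window of two batches, a user's label can change from one window to the next as older events slide out.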
Moving faster with Spark
Easy Spark integration
• Increase computation capabilities
• Reduce execution time
• Easy deployment of non-Spark WFs over Spark infrastructure
Fast Prototyping
• Basic Spark applications
• Spark ML applications
• Spark Streaming applications
Moving faster with Spark
Fast Prototyping & Fast Deployment
[Figure: delivery pipeline in three stages: Design and Development (development environment), Testing and Deployment (continuous integration and delivery), Production (production environment). Other labels: code control, Integration Engine, QA, unit & integration tests, functional tests, performance tests.]
Gartner: Market Trends: Top Five Buyer Expectations of Intelligent Automation in Data and Analytics Services.
Published: 17 March 2017 ID: G00322319
Questions?
Contact us
Rosa Montañés
rmontanes@itainnova.es
Master IA, Data Scientist and BD Architect
Telecom Engineer
Francisco J. Lacueva
fjlacueva@itainnova.es
Data Scientist and BD Architect
Master Software Engineer
Spark Summit
 
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
Spark Summit
 
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data  with Ramya RaghavendraImproving Traffic Prediction Using Weather Data  with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
 
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
Spark Summit
 
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim DowlingApache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
 
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub WozniakNext CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
Spark Summit
 
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya RaghavendraImproving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Spark Summit
 
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
Spark Summit
 
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim SimeonovGoal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
Spark Summit
 
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Spark Summit
 
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir VolkGetting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Spark Summit
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Spark Summit
 
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Indicium: Interactive Querying at Scale Using Apache Spark, Zeppelin, and Spa...
Spark Summit
 
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
Apache Spark-Bench: Simulate, Test, Compare, Exercise, and Yes, Benchmark wit...
Spark Summit
 
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Spark Summit
 
Variant-Apache Spark for Bioinformatics with Piotr Szul
Variant-Apache Spark for Bioinformatics with Piotr SzulVariant-Apache Spark for Bioinformatics with Piotr Szul
Variant-Apache Spark for Bioinformatics with Piotr Szul
Spark Summit
 
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Spark Summit
 
Best Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene PangBest Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene Pang
Spark Summit
 
Ad

Recently uploaded (20)

03 Daniel 2-notes.ppt seminario escatologia
03 Daniel 2-notes.ppt seminario escatologia03 Daniel 2-notes.ppt seminario escatologia
03 Daniel 2-notes.ppt seminario escatologia
Alexander Romero Arosquipa
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.pptJust-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
ssuser5f8f49
 
Geometry maths presentation for begginers
Geometry maths presentation for begginersGeometry maths presentation for begginers
Geometry maths presentation for begginers
zrjacob283
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 
04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story
ccctableauusergroup
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
computer organization and assembly language.docx
computer organization and assembly language.docxcomputer organization and assembly language.docx
computer organization and assembly language.docx
alisoftwareengineer1
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 
Ch3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendencyCh3MCT24.pptx measure of central tendency
Ch3MCT24.pptx measure of central tendency
ayeleasefa2
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.pptJust-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
ssuser5f8f49
 
Geometry maths presentation for begginers
Geometry maths presentation for begginersGeometry maths presentation for begginers
Geometry maths presentation for begginers
zrjacob283
 
Defense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptxDefense Against LLM Scheming 2025_04_28.pptx
Defense Against LLM Scheming 2025_04_28.pptx
Greg Makowski
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 
04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story04302025_CCC TUG_DataVista: The Design Story
04302025_CCC TUG_DataVista: The Design Story
ccctableauusergroup
 
chapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.pptchapter3 Central Tendency statistics.ppt
chapter3 Central Tendency statistics.ppt
justinebandajbn
 
VKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptxVKS-Python Basics for Beginners and advance.pptx
VKS-Python Basics for Beginners and advance.pptx
Vinod Srivastava
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
Digilocker under workingProcess Flow.pptx
Digilocker  under workingProcess Flow.pptxDigilocker  under workingProcess Flow.pptx
Digilocker under workingProcess Flow.pptx
satnamsadguru491
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
computer organization and assembly language.docx
computer organization and assembly language.docxcomputer organization and assembly language.docx
computer organization and assembly language.docx
alisoftwareengineer1
 
Simple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptxSimple_AI_Explanation_English somplr.pptx
Simple_AI_Explanation_English somplr.pptx
ssuser2aa19f
 

Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—Industry 4.0 and Logistics Success Examples with Francisco J. Lacueva and Rosa Montañés

  • 1. Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications. Francisco J. Lacueva, ITAINNOVA; Rosa Montañés, ITAINNOVA. #EUent6
  • 2. Index. everisMoriarty: win-win dialog (Developer & Data Scientist); everisMoriarty platform. Spark in everisMoriarty: basic example; real cases (TT, FACTS4WORKERS, Streaming).
  • 3. everisMoriarty is a platform for end-to-end development of data analytics models. everis (www.everis.es) belongs to NTT Data Group. It offers IT solutions, services and outsourcing to several sectors such as telcos, financial entities, public administration, industry, utilities, energy providers or health companies. Revenues: 1031M€. everis has sites in 16 countries and 19000 professionals. NTT Data Group has sites in 50 countries and 110000 professionals. ITAINNOVA (www.itainnova.es) is a non-profit R&D centre owned by the Aragon Regional Government. Sites: Zaragoza and Huesca. 15000m2 facilities. 15M€ budget: 59% private projects, 30% CPF, 11% NCPF. 1.5M€ investments. Knowledge areas: Materials, Mechatronics, Electrical Power Systems, Industrial Processes, ICT-Big Data, Laboratories-Quality.
  • 5. everisMoriarty win-win dialog. Aspect “WorkTeam” (solution). Developer: “I use your models and develop tools more quickly and effectively.” Data Scientist: “I build models, patterns, and cognitive systems ready to use.” The data scientist collaborates with the developer.
  • 6. everisMoriarty. More than 100 workitems (WIs) covering: ML and DL models, DB connectors, crawlers, text mining algorithms, Spark components… Workflows (WFs) can be used as a WI within another WF. Supported programming languages: Java, Python. Automatic provision of REST APIs for published WFs. Easy integration with third-party APIs. Screenshot labels: Deploy button, list of workitems, eM WorkFlow, eM WorkItem, WorkFlow parameters.
  • 7. Spark in everisMoriarty. Spark cluster (Cloudera CDH). eM Spark WIs: Generic, SparkStreaming, SparkML.
  • 8. everisMoriarty Spark WIs overview. Spark basic WIs: Spark Context (SparkContextCreator, SparkContextStop); Data Sources (SparkCSVLoader, SparkCSVWriter, MongoSpark, …). Spark Streaming WIs: Spark Streaming Context (SparkStreamingContextCreator, SparkStreamingContextStartAndWait, SparkStreamingContextStop); DStream transformations (SparkStreamingInputDStreamWindower). Spark Machine Learning WIs: SparkKMeans, SparkLinearRegression, SparkDecisionTreeClassification, SparkPCA, SparkClassifier, SparkTokenizer, SparkTfIdf.
  • 9. Spark in everisMoriarty. Integration with Spark: (a) Spark WFs (using Spark WIs): native Spark operations and Spark data types. (b) Other WFs: no native Spark operations and no Spark data types; they are reused within Spark WFs, i.e. as the mapFunction, and so get distributed execution over the Spark environment.
  • 10. Spark in everisMoriarty. (Very) basic example. Workflow: Start Event → SparkLoadData → ProcessData → SparkStoreData → End Event. Record processing WF: encapsulates the mapFunction; data-scientist-based.
  • 11. Spark in everisMoriarty. (Very) basic example: classify data. Same workflow. Record processing WF: encapsulates the mapFunction; data-scientist-based: classification.
  • 12. Spark in everisMoriarty. (Very) basic example: classify by opinion. Same workflow. Record processing WF: encapsulates the mapFunction; data-scientist-based: classification, opinion mining.
  • 13. Spark in everisMoriarty. (Very) basic example: use external resource. Same workflow. Record processing WF: encapsulates the mapFunction; data-scientist-based: classification, opinion mining, invoking REST APIs.
  • 14. Spark in everisMoriarty. (Very) basic example. Same workflow. Record processing WF: encapsulates the mapFunction; data-scientist-based: classification, opinion mining, invoking REST APIs, ...
  • 15. Spark in everisMoriarty. Real case. Pilot domain: Dynamic Supply Networks / eCommerce. Domain lead: Athens University of Economics and Business (AUEB). Use cases: route planning; forecasting. https://transformingtransport.eu/
  • 16. Spark in everisMoriarty. Data transformation & route planning service invocation. Data processed over the Spark platform; parallelized execution. Diagram labels: Company A, Company B, Company C. Sample service record: <Service> <ID>261680</ID> <Name>Cliente261680</Name> <Duration>PT10M</Duration> <Location> <Address>No Address</Address> <HouseNumber /> <City /> <PostalCode /> <Region /> <Country>ES</Country> <Coord srs="EPSG:4326" x="37.984332" y="23.726466" /> <GeocodeLevel>XY</GeocodeLevel> </Location> <Priority>1</Priority> <Windows> <Window start="2017-04-28T08:00:00" end="2017-04-28T16:00:00" /> </Windows> <UnloadUnits>1</UnloadUnits> <UnloadKg>6.000</UnloadKg> <UnloadM3>0.036</UnloadM3> <Comments /> </Service>
  • 17. Spark in everisMoriarty. Real case: FACTS4WORKERS, Worker-Centric Workplaces in Smart Factories. Pilot domain: H2020-FoF 4. 2014. Domain lead: Virtual Vehicle. Use cases: assessing the impact on workers based on system-logged information. Partners: 4 large industry partners, 5 non-profit research centers, 2 SMEs. This project has received funding from the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 636778. http://facts4workers.eu
  • 18. Spark in everisMoriarty. Architecture diagram labels. Method: Abstract World, Real World, As-is Situation, Problem Scenario, Should-be Situation, Activity Scenario, Instance Problem Solution Artefact, Problem Artefact Evaluation; iterative, agile development. HMI/HCI: Mobile Devices, Wearables (e.g. Smart Watches), Smart Glasses, Desktop. Workflow Engine (WFE): BBs and services provided by the WFE; provides standard APIs: REST, JSON API. F4W BBs: Authentication, Control Charts, Logbook, ERP Interface, Multimedia Management, Semantic Search, Training Module, Chat Module, Video Chat Module, Alarm Warning Manager, User Content Rating, Machine Status, Operator Skill Profiling, Task Manager, planned BBs … Backend Systems: Docs Repository, PMS, KMS, Social Big Data, Industrial IoT Big Data, 3D Models Transformers, Environment Sensors, Data Repositories, Social Software, Security System, ERP, …
  • 19. Spark in everisMoriarty. Architecture diagram labels. HMI/HCI: Mobile Devices, Wearables (e.g. Smart Watches), Smart Glasses, Desktop. Workflow Engine. F4W BBs: Authentication, Control Charts, Logbook, ERP Interface, Multimedia Management, Semantic Search, Training Module, Chat Module, Video Chat Module, Alarm Warning Manager, User Content Rating, Machine Status, Operator Skill Profiling, Task Manager, planned BBs … Backend Systems: Docs Repository, PMS, KMS, Social Big Data, Industrial IoT Big Data, 3D Models Transformers, Environment Sensors, Data Repositories, Social Software, Security System, ERP, … Outputs shown: Dashboard, Alarms, Reports.
  • 20. Spark in everisMoriarty. Real case: Spark Streaming. Use cases: urban mobility events processing; classification of individuals.
  • 21. Spark in everisMoriarty. Urban mobility events processing & classification. Diagram labels: HDFS, DBs, Dashboards.
  • 22. Spark in everisMoriarty. Urban mobility events processing & classification. Diagram labels: Spark Streaming connector, Spark Engine, Spark Context, Spark Streaming Context Start, HDFS, DBs, Dashboards.
  • 23. Moving faster with Spark. Easy Spark integration: increase computation capabilities; reduce execution time; easy deployment of non-Spark WFs over Spark infrastructure. Fast prototyping: basic Spark applications; Spark ML applications; Spark Streaming applications.
  • 24. Moving faster with Spark. Fast prototyping & fast deployment. Stages: Design and Development; Testing and Deployment; Production. Diagram labels: Code control, Integration Engine, Development environment, Continuous Integration and Delivery, Production Environment, QA, Functional Tests, Performance Tests, Unit & Integration Tests. Source: Gartner, Market Trends: Top Five Buyer Expectations of Intelligent Automation in Data and Analytics Services. Published: 17 March 2017. ID: G00322319.
  • 26. Contact us. Rosa Montañés ([email protected]): Master IA, Data Scientist and BD Architect, Telecom Engineer. Francisco J. Lacueva ([email protected]): Data Scientist and BD Architect, Master Software Engineer.
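
The (very) basic example on slides 10 to 14 chains SparkLoadData, ProcessData and SparkStoreData, with ProcessData wrapping a record-processing workflow as the mapFunction. A minimal Python sketch of that pattern follows, under stated assumptions: the step functions `classify` and `opinion`, the helper `as_map_function`, and the sample records are hypothetical illustrations and not everisMoriarty workitems, and the built-in `map` stands in for a Spark map so the snippet runs without a cluster.

```python
from functools import reduce
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]
Step = Callable[[Record], Record]


def classify(record: Record) -> Record:
    """Hypothetical classification step: tag a record by text length."""
    label = "long" if len(record["text"]) > 20 else "short"
    return {**record, "label": label}


def opinion(record: Record) -> Record:
    """Hypothetical opinion-mining step: naive keyword polarity."""
    text = record["text"].lower()
    if "good" in text:
        polarity = "positive"
    elif "bad" in text:
        polarity = "negative"
    else:
        polarity = "neutral"
    return {**record, "opinion": polarity}


def as_map_function(steps: List[Step]) -> Step:
    """Chain record-level steps into one function usable as a map function."""
    def run(record: Record) -> Record:
        return reduce(lambda acc, step: step(acc), steps, record)
    return run


def run_pipeline(records: Iterable[Record], map_function: Step) -> List[Record]:
    # Local stand-in for load -> map -> store. With PySpark, the same
    # function could be passed to RDD.map instead of the built-in map.
    return list(map(map_function, records))


# The "record processing WF" is ordinary, non-Spark code.
record_wf = as_map_function([classify, opinion])
results = run_pipeline(
    [{"text": "Good delivery"}, {"text": "Bad route, arrived very late"}],
    record_wf,
)
```

The steps know nothing about Spark; only the call that applies them to the data set would change when moving from the local stand-in to a cluster, which is the separation between data scientist and developer that the slides describe.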