© Copyright SELA Software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com
Eyal Ben Ivri
Building Big Data Solutions
on Azure
About me
Eyal Ben Ivri
Big Data & Cloud Architect, Sela Group
Focus On Hadoop Eco-System & Big-Data +
NoSQL Solutions
Modern Data – The Big Picture
IoT
User Data
Media Files
Documents
Machine Data
Log Files
The Light Rail problem – TLV
Railway
Imagine the new light Rail maintenance
company
IoT – Internet of Trains (and cameras, and cash
registers and carts and rails and more…)
Analyze data in stream and in batch
Dashboards
Alerts
The perfect problem
What We Need
An integrated data solution that will be:
Able to process events from external sources
Able to walk data through different pipelines
Fast and responsive
Big-Data Ready
In Other Words
Consume: BI, Dashboards, Applications
Process: ETL, Aggregations, Computation, Analysis, Querying
Persist: Hadoop, SQL, NoSQL
Ingest: IoT, Structured Data, Un-Structured Data
Microsoft Azure Services for IoT and BigData
Stages: Devices, Device Connectivity, Storage, Analytics, Presentation & Action
Device Connectivity: Event Hubs, Service Bus, External Data Sources
Storage: SQL Database, Table/Blob Storage, DocumentDB, Data Lake Store, External Data Sources
Analytics: Machine Learning, Stream Analytics, HDInsight, Data Factory, Data Lake Analytics
Presentation & Action: App Service, Power BI, Notification Hubs, Mobile Services, BizTalk Services
Event Hub
Messages at scale
Why not throw it into a queue, and have a
listener at the backend?
Standard Service Bus queues and topics hit scaling
limits because of their architecture
Event Hub uses a partition model
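To make the partition model concrete, here is a minimal Python sketch of the general idea: events that share a key always land in the same partition, so each partition stays an ordered log that one consumer can read independently. The hash scheme, the `partition_for` and `route` helpers, and the sample events are assumptions for illustration only; they are not the algorithm the service actually uses.

```python
import hashlib
from collections import defaultdict

PARTITION_COUNT = 4  # illustrative value only


def partition_for(partition_key: str, partition_count: int = PARTITION_COUNT) -> int:
    """Map a key to a partition index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def route(events, partition_count: int = PARTITION_COUNT):
    """Group (key, payload) events into per-partition logs, preserving arrival order."""
    partitions = defaultdict(list)
    for key, payload in events:
        partitions[partition_for(key, partition_count)].append((key, payload))
    return dict(partitions)


# Hypothetical sample events keyed by train id.
events = [
    ("train-7", "door open"),
    ("train-9", "speed 40"),
    ("train-7", "door closed"),
]
logs = route(events)
```

Because the key alone decides the partition, consumers can scale out by reading different partitions in parallel while still seeing each train's events in order.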
Getting Started
Easy to set up
Two Configurations
Partition Count – depends on the number of consumers (2–32)
Message Retention (days) – between 1 and 7 days
Secured using SAS Policies
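As a rough illustration of what a SAS policy gives a device, the sketch below builds a Shared Access Signature token with the Python standard library only. It follows the commonly documented token layout (`sr`, `sig`, `se`, `skn`) as far as I know it, so check it against the official documentation before relying on it. The namespace, hub name, policy name and key are all made-up placeholders.

```python
import base64
import hashlib
import hmac
import urllib.parse


def make_sas_token(resource_uri: str, key_name: str, key: str, expiry_epoch: int) -> str:
    """Sign '<url-encoded uri>\\n<expiry>' with HMAC-SHA256 and format the token."""
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    string_to_sign = f"{encoded_uri}\n{expiry_epoch}"
    digest = hmac.new(
        key.encode("utf-8"), string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    signature = urllib.parse.quote_plus(base64.b64encode(digest).decode("utf-8"))
    return (
        "SharedAccessSignature "
        f"sr={encoded_uri}&sig={signature}&se={expiry_epoch}&skn={key_name}"
    )


# Placeholder values; a real policy name and key come from the portal.
token = make_sas_token(
    resource_uri="https://tlv-railway.servicebus.windows.net/telemetry",
    key_name="send-only",
    key="not-a-real-key",
    expiry_epoch=1700000000,
)
```

A policy limited to sending means a compromised device token cannot be used to read other devices' messages.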
IoT with Event Hubs
Device Connectivity & Management
Devices: RTOS, Linux, Windows, Android, iOS
Field Gateway
Protocol Adaptation
Cloud Gateway: Event Hubs
IoT & Data Processing Patterns
Device Connectivity & Management
Devices: RTOS, Linux, Windows, Android, iOS
Field Gateway
Protocol Adaptation
Cloud Gateway: Event Hubs & IoT Hub
Analytics & Operationalized Insights
Hot Path Analytics: Azure Stream Analytics, Azure HDInsight Storm
Hot Path Business Logic: Service Fabric & Actor Framework
Batch Analytics & Visualizations: Azure HDInsight, AzureML, Power BI, Azure Data Factory
TLV Railway
Can now ingest millions of messages each
second
These messages carry data from:
Devices
End-Machines
Servers
Next, we need to use this data to create real-
time alerts when something goes wrong
Azure Stream Analytics
Automatic recovery
Monitoring and alerting
Scale on demand
Managed Cloud Service
Each unit handles 1MB/s
Can scale up to 1GB/s
SQL-like language
Temporal windowing semantics
Support for reference data
Stream Analytics – Main Concepts
Inputs
Can be stream or reference data (metadata)
Stream Data sources can be Event Hub, Blob Storage
(using blobs with timestamps) or IoT Hub (preview)
Serialization types support CSV, JSON, and Avro
Query
A SQL query that selects from the input(s) and
writes results to the output(s)
Output
Can be Blob, SQL, Event Hub (notification), Power BI
(preview), Table storage, Service Bus or DocumentDB
Tumbling Windows
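How many trains entered each station every 5
minutes?
SELECT StationId, COUNT(*) AS TrainCount FROM EntryStream
GROUP BY StationId, TumblingWindow(minute, 5)
(StationId is an assumed column name for the station each entry event belongs to.)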
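The grouping that the tumbling-window query performs can be mimicked in a few lines of plain Python, which may help when reasoning about which events end up in which result row. This is only a sketch: the keys "A" and "B" stand for whatever column the query groups by, the timestamps are invented, and the windows here are half-open `[start, end)` aligned to the Unix epoch, which may not match the service's exact boundary rule.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
WINDOW = timedelta(minutes=5)


def window_end(ts: datetime, size: timedelta = WINDOW) -> datetime:
    """Return the end of the fixed, non-overlapping window that contains ts."""
    index = (ts - EPOCH) // size  # integer number of whole windows before ts
    return EPOCH + (index + 1) * size


def count_per_window(events, size: timedelta = WINDOW) -> Counter:
    """Count events per (grouping key, window end), like GROUP BY key, TumblingWindow."""
    counts = Counter()
    for key, ts in events:
        counts[(key, window_end(ts, size))] += 1
    return counts


# Invented sample events: (grouping key, event time).
events = [
    ("A", datetime(2016, 1, 1, 8, 1)),
    ("A", datetime(2016, 1, 1, 8, 4)),
    ("B", datetime(2016, 1, 1, 8, 3)),
    ("A", datetime(2016, 1, 1, 8, 6)),
]
counts = count_per_window(events)
```

Each event contributes to exactly one window, which is what distinguishes a tumbling window from the overlapping window types.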
Temporal Windows
Tumbling Window
A series of fixed-sized, non-overlapping and
contiguous time intervals
Hopping Window
Scheduled overlapping windows
Sliding Window
Outputs events only for those points in time when
the content of the window actually changes
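To see how hopping windows differ from tumbling ones, the following Python sketch lists every window an event belongs to. It assumes half-open `[start, end)` windows aligned to the Unix epoch and a window size that is a whole multiple of the hop; the function name and sample timestamp are made up, and the service's own boundary handling may differ.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def hopping_window_ends(ts: datetime, size: timedelta, hop: timedelta):
    """Return the end times of all hop-aligned windows of length `size` containing ts."""
    if size % hop != timedelta(0):
        raise ValueError("this sketch needs size to be a multiple of hop")
    # Smallest hop-aligned instant strictly after ts: end of the earliest window.
    first_end = EPOCH + ((ts - EPOCH) // hop + 1) * hop
    # Later windows still containing ts end one hop apart, size // hop in total.
    return [first_end + k * hop for k in range(size // hop)]


ts = datetime(2016, 1, 1, 8, 7)
# A 10-minute window that hops every 5 minutes: the event falls into two windows.
overlapping = hopping_window_ends(ts, timedelta(minutes=10), timedelta(minutes=5))
# When hop equals size, the hopping window degenerates into a tumbling window.
tumbling = hopping_window_ends(ts, timedelta(minutes=5), timedelta(minutes=5))
```

With a hop smaller than the size, one event is counted in several results, so totals across windows should not be summed naively.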
TLV Railway
Can now respond in near-real-time to events as
they happen
Track and maintain malfunctioning equipment
Receive real time data regarding customers
entering and leaving stations
Data can now be processed, so we need a place
to save it, preferably at scale.
DocumentDB and Azure Data
Services
fully managed, scalable, queryable, schema free JSON
document database service for modern applications
transactional processing
rich query
managed as a service
elastic scale
internet accessible http/rest
schema-free data model
arbitrary data formats
DocumentDB features
JSON Documents
SQL support
LINQ Support
REST API Support
JS Support (triggers, UDFs, stored procedures)
Automatic Index
Multiple Document Transactions
Tunable Consistency
DocumentDB Key Concept
Collection
A collection of Documents
Not a table (different entities can go into the same
collection)
Collections = Partitions
Not just logical containers, but physical ones
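The point that a collection is not a table can be illustrated without any SDK. In the Python sketch below a plain list stands in for a collection, and documents of different shapes live side by side, distinguished only by a `type` field. The field names, the `find` helper and the sample values are assumptions for illustration; a real application would go through the service's query API instead.

```python
import json

# One "collection" holding two kinds of entities with different fields.
collection = [
    {"id": "1", "type": "sensorReading", "trainId": "train-7", "speedKmh": 42},
    {"id": "2", "type": "station", "name": "Station A", "platforms": 2},
    {"id": "3", "type": "sensorReading", "trainId": "train-9", "doorOpen": True},
]


def find(docs, **criteria):
    """Return documents whose fields equal all given criteria (hypothetical helper)."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


readings = find(collection, type="sensorReading")
stations = find(collection, type="station")

# Every document is plain JSON, so it survives a serialization round trip unchanged.
round_tripped = json.loads(json.dumps(collection))
```

Note that the two sensor readings do not even share the same fields, which a fixed table schema would not allow.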
Demo
TLV Railway – Part 1
TLV Railway
Can now store its data in a highly scalable store
Great for interactive querying of any data
Messages from sensors
Reference Data
But this data (and other data) needs to move to
other places (SQL, Batch processing, ML). How?
What is Azure Data Factory?
Azure Data Factory is a managed service to produce
trusted information from data stored in the cloud
and on-premises. Easily create, orchestrate and
schedule highly available, fault-tolerant workflows
to move and transform your data at scale.
Evolving Approaches to Analytics
ETL tool approach: Extract original data -> Transform in an ETL tool (SSIS, etc.) -> Load transformed data into the EDW (SQL Server, Teradata, etc.) -> BI Tools
Scale-out approach: Ingest original data and streaming data -> Scale-out storage & compute (HDFS, Blob Storage, etc.) -> Transform & Load -> Data Marts, Data Lake(s) -> Dashboards, Apps
Data Factory – Main concepts
Data Store
A data source/sink component
(SQL (Azure or on-premises), Storage, DocumentDB and
more)
Data Set
A defined data set that is contained inside a data store
One data store can have many data sets
Compute
A service for computation
HDInsight, Azure Batch, Data Lake Analytics, Azure ML
Data Factory – Main concepts
Pipeline
Set of instructions
“Take data from data set A and move to compute,
then store results in data set B”
Slices
Everything is time sliced
A data set (source) can declare on what time
intervals the data can be sliced, and the pipeline will
be activated when a new slice is ready
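The time-slicing idea can be sketched in plain Python: a data set declares an interval, the time range is cut into slices, and the pipeline runs once per ready slice. Everything named here (`slices`, `run_pipeline`, the hourly interval, the sample timestamps) is a hypothetical stand-in rather than the Data Factory API, and the "activity" is just a row count.

```python
from datetime import datetime, timedelta


def slices(start: datetime, end: datetime, interval: timedelta):
    """Yield consecutive (slice_start, slice_end) pairs that fit fully inside [start, end]."""
    current = start
    while current + interval <= end:
        yield (current, current + interval)
        current += interval


def run_pipeline(slice_window, source_rows):
    """Hypothetical activity: count source rows whose timestamp falls in the slice."""
    lo, hi = slice_window
    return sum(1 for ts in source_rows if lo <= ts < hi)


# Invented timestamps standing in for rows of data set A.
rows = [
    datetime(2016, 1, 1, 8, 10),
    datetime(2016, 1, 1, 8, 50),
    datetime(2016, 1, 1, 9, 30),
    datetime(2016, 1, 1, 10, 59),
]

hourly = list(slices(datetime(2016, 1, 1, 8), datetime(2016, 1, 1, 11), timedelta(hours=1)))
# One pipeline run per slice; the results would be written to data set B.
results = [run_pipeline(window, rows) for window in hourly]
```

Because each run only touches its own slice, a failed slice can be re-run on its own without reprocessing the rest of the data.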
JSON
TLV Railway
Can now integrate different services and
different data sources
Move data with ease and as little hassle as
possible
What about aggregations, deeper dive into
data, for more complex analysis?
HDInsight
Hadoop-as-a-Service
Based on the Hortonworks distribution
Few flavors:
Hadoop (Windows + Linux)
Storm (Windows + Linux)
HBase (Windows + Linux)
Spark (Windows + Linux)
Hadoop vs. Relational DB
Compared by: data size, access, updates, structure, integrity, scaling
Demo
TLV Railway – Part 2
TLV Railway - Summary
Can now perform advanced analytics on top of
large amounts of data, in a variety of formats
(not just structured, boring data)
Can integrate all the loose ends of data coming
in with data generated in “Old-School” data
platforms like SQL, collected from
Line-of-Business applications
We’ve covered data ingestion, responding in
real-time, querying, storing and processing
Azure Stack
Hadoop and OSS vs. Azure IoT and BigData Ecosystem
Azure Ecosystem        OSS
Event Hubs             Kafka
Stream Analytics       Storm
HDInsight              Hadoop
Map Reduce             Map Reduce
Hive                   Hive
Spark                  Spark
HBase                  HBase
Azure ML               Mahout
Data Factory           Pig
DocumentDB             MongoDB / Couchbase
Data Lake (preview)
Is “TLV Railway” fake?
London did it first
Summary
Get started today at
http://azure.microsoft.com
Questions
gmuir1066
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 
Developing Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response ApplicationsDeveloping Security Orchestration, Automation, and Response Applications
Developing Security Orchestration, Automation, and Response Applications
VICTOR MAESTRE RAMIREZ
 
Cleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdfCleaned_Lecture 6666666_Simulation_I.pdf
Cleaned_Lecture 6666666_Simulation_I.pdf
alcinialbob1234
 
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptxmd-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
md-presentHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHation.pptx
fatimalazaar2004
 
DPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdfDPR_Expert_Recruitment_notice_Revised.pdf
DPR_Expert_Recruitment_notice_Revised.pdf
inmishra17121973
 
Medical Dataset including visualizations
Medical Dataset including visualizationsMedical Dataset including visualizations
Medical Dataset including visualizations
vishrut8750588758
 
Minions Want to eat presentacion muy linda
Minions Want to eat presentacion muy lindaMinions Want to eat presentacion muy linda
Minions Want to eat presentacion muy linda
CarlaAndradesSoler1
 
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
Safety Innovation in Mt. Vernon A Westchester County Model for New Rochelle a...
James Francis Paradigm Asset Management
 
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Day 1 - Lab 1 Reconnaissance Scanning with NMAP, Vulnerability Assessment wit...
Abodahab
 
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnTemplate_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
Template_A3nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
cegiver630
 
C++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptxC++_OOPs_DSA1_Presentation_Template.pptx
C++_OOPs_DSA1_Presentation_Template.pptx
aquibnoor22079
 
Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..Secure_File_Storage_Hybrid_Cryptography.pptx..
Secure_File_Storage_Hybrid_Cryptography.pptx..
yuvarajreddy2002
 
Classification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptxClassification_in_Machinee_Learning.pptx
Classification_in_Machinee_Learning.pptx
wencyjorda88
 
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.pptJust-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
Just-In-Timeasdfffffffghhhhhhhhhhj Systems.ppt
ssuser5f8f49
 
Conic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptxConic Sectionfaggavahabaayhahahahahs.pptx
Conic Sectionfaggavahabaayhahahahahs.pptx
taiwanesechetan
 
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbbEDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
EDU533 DEMO.pptxccccvbnjjkoo jhgggggbbbb
JessaMaeEvangelista2
 
Deloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit contextDeloitte Analytics - Applying Process Mining in an audit context
Deloitte Analytics - Applying Process Mining in an audit context
Process mining Evangelist
 
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your CompetitorsAI Competitor Analysis: How to Monitor and Outperform Your Competitors
AI Competitor Analysis: How to Monitor and Outperform Your Competitors
Contify
 
Stack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptxStack_and_Queue_Presentation_Final (1).pptx
Stack_and_Queue_Presentation_Final (1).pptx
binduraniha86
 
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
Adobe Analytics NOAM Central User Group April 2025 Agent AI: Uncovering the S...
gmuir1066
 
FPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptxFPET_Implementation_2_MA to 360 Engage Direct.pptx
FPET_Implementation_2_MA to 360 Engage Direct.pptx
ssuser4ef83d
 

Building big data solutions on azure

  • 1. © Copyright SELA Software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com Eyal Ben Ivri Building Big Data Solutions on Azure
  • 2. About me Eyal Ben Ivri Big Data & Cloud Architect, Sela Group Focus On Hadoop Eco-System & Big-Data + NoSQL Solutions
  • 3. Modern Data – The Big Picture: IoT, user data, media files, documents, machine data, log files.
  • 5. The Light Rail problem – TLV Railway. Imagine the new light rail maintenance company. IoT – Internet of Trains (and cameras, and cash registers, and carts, and rails, and more…). Analyze data in stream and in batch: dashboards, alerts. The perfect problem.
  • 6. What We Need: an integrated data solution that will be able to process events from external sources, able to walk data through different pipelines, fast and responsive, and Big-Data ready.
  • 7. In Other Words. Consume: BI, dashboards, applications. Process: ETL, aggregations, computation, analysis, querying. Persist: Hadoop, SQL, NoSQL. Ingest: IoT, structured data, unstructured data.
  • 8. Microsoft Azure Services for IoT and BigData. Devices. Device Connectivity: Event Hubs, Service Bus, External Data Sources. Storage: SQL Database, Table/Blob Storage, DocumentDB, Data Lake Store, External Data Sources. Analytics: Machine Learning, Stream Analytics, HDInsight, Data Factory, Data Lake Analytics. Presentation & Action: App Service, Power BI, Notification Hubs, Mobile Services, BizTalk Services.
  • 10. Event Hub: messages at scale. Why not throw it into a queue and have a listener at the backend? Scaling limits, because of the architecture of queues and topics in a standard Service Bus. Event Hub uses a partition model.
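The partition model on this slide can be illustrated with a minimal Python sketch. It assumes that events carrying the same partition key should always land in the same partition; the hash function and the names `assign_partition` and `route` are hypothetical and are not the Event Hubs implementation, whose internal hashing is not described in the slides.

```python
import hashlib


def assign_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index with a stable hash.

    Illustrative only: SHA-256 is an assumption, not the hash Event Hubs uses.
    """
    if partition_count < 1:
        raise ValueError("partition_count must be positive")
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def route(events, partition_count):
    """Distribute (key, payload) events across partitions, preserving order per key."""
    partitions = [[] for _ in range(partition_count)]
    for key, payload in events:
        partitions[assign_partition(key, partition_count)].append((key, payload))
    return partitions
```

Because each key maps to exactly one partition, independent consumers can read partitions in parallel while still seeing every event for a given device in order, which is the scaling property a single queue with one listener lacks.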
  • 11. Getting Started. Easy to set up. Two configurations: Partition Count, which depends on the number of consumers (2-32), and Message Retention, between 1 and 7 days. Secured using SAS policies.
  • 12. IoT with Event Hubs. Device Connectivity & Management: Devices (RTOS, Linux, Windows, Android, iOS); Field Gateway (protocol adaptation); Cloud Gateway (Event Hubs).
  • 13. IoT & Data Processing Patterns. Device Connectivity & Management: Devices (RTOS, Linux, Windows, Android, iOS); Field Gateway (protocol adaptation); Cloud Gateway (Event Hubs & IoT Hub). Analytics & Operationalized Insights: Batch Analytics & Visualizations (Azure HDInsight, AzureML, Power BI, Azure Data Factory); Hot Path Analytics (Azure Stream Analytics, Azure HDInsight Storm); Hot Path Business Logic (Service Fabric & Actor Framework).
  • 14. TLV Railway: can now ingest millions of messages each second. These messages carry data from devices, end-machines, and servers. Next, we need to use this data to create real-time alerts when something goes wrong.
  • 15. Azure Stream Analytics. Managed cloud service: automatic recovery, monitoring and alerting, scale on demand. Each unit handles 1 MB/s; can scale up to 1 GB/s. SQL-like language: temporal windowing semantics, support for reference data.
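As a rough sizing aid for the throughput figures on this slide, the arithmetic can be sketched in Python. The function name `units_needed` is hypothetical, and reading "1 GB/s" as 1000 MB/s is an assumption; actual capacity planning depends on query complexity and should follow the service's own guidance.

```python
import math

MB_PER_UNIT = 1.0       # from the slide: each unit handles 1 MB/s
MAX_MB_PER_S = 1000.0   # assumption: "1 GB/s" interpreted as 1000 MB/s


def units_needed(ingest_mb_per_s: float) -> int:
    """Return the number of units required for a given ingest rate in MB/s."""
    if ingest_mb_per_s <= 0:
        raise ValueError("ingest rate must be positive")
    if ingest_mb_per_s > MAX_MB_PER_S:
        raise ValueError("ingest rate exceeds the stated 1 GB/s ceiling")
    # Round up: a fractional unit cannot be provisioned.
    return math.ceil(ingest_mb_per_s / MB_PER_UNIT)
```

For example, a stream averaging 37.5 MB/s would need 38 units under these assumptions.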
  • 16. Stream Analytics – Main Concepts. Inputs: can be stream or reference data (metadata). Stream data sources can be Event Hub, Blob Storage (using blobs with timestamps), or IoT Hub (preview). Supported serialization types are CSV, JSON, and Avro. Query: a SQL query that selects from input(s) and writes results to output(s). Output: can be Blob, SQL, Event Hub (notification), Power BI (preview), Table storage, Service Bus, or DocumentDB.
  • 17. Tumbling Windows. How many trains entered each station every 5 minutes? SELECT TrainId, COUNT(*) FROM EntryStream GROUP BY TrainId, TumblingWindow(minute, 5)
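To make the windowing semantics of the query concrete, here is a minimal Python sketch that emulates a 5-minute tumbling-window count over an in-memory list. It is not Stream Analytics code: the event shape `(timestamp_seconds, train_id)` and the function name `tumbling_counts` are assumptions, and windows are keyed by their start time for simplicity. Like the query as written on the slide, it groups by `TrainId`.

```python
from collections import Counter

WINDOW_SECONDS = 5 * 60  # matches TumblingWindow(minute, 5)


def tumbling_counts(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window_start, train_id) in fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, train_id) pairs with
    integer timestamps; each event belongs to exactly one window.
    """
    counts = Counter()
    for ts, train_id in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[(window_start, train_id)] += 1
    return dict(counts)
```

An event at second 299 falls in the window starting at 0, while an event at second 300 opens the next window, so no event is ever counted twice.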
  • 18. Temporal Windows. Tumbling Window: a series of fixed-size, non-overlapping, contiguous time intervals. Hopping Window: scheduled overlapping windows. Sliding Window: outputs events only for those points in time when the content of the window actually changes.
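The difference between tumbling and hopping windows can be sketched in Python by listing which windows contain a given event. This is an illustration under stated assumptions, not the service's implementation: timestamps are non-negative integers, windows start at multiples of the hop size, each window is treated as the half-open interval from its start up to (but not including) start plus size, and the name `hopping_window_starts` is hypothetical.

```python
def hopping_window_starts(ts, size, hop):
    """Return the start times of all hopping windows that contain `ts`.

    Windows start at 0, hop, 2*hop, ... and each covers [start, start + size).
    When hop == size the windows do not overlap, which is the tumbling case.
    """
    if size <= 0 or hop <= 0:
        raise ValueError("size and hop must be positive")
    last = ts // hop                        # latest window starting at or before ts
    first = max(0, (ts - size) // hop + 1)  # earliest window still covering ts
    return [k * hop for k in range(first, last + 1)]
```

With a 10-unit window hopping every 5 units, an event at time 7 belongs to the windows starting at 0 and 5; with size and hop both 5, it belongs only to the window starting at 5.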
  • 19. TLV Railway: can now respond in near-real-time to events as they happen, track and maintain malfunctioning equipment, and receive real-time data regarding customers entering and leaving stations. Data can now be processed, so we need a place to save it, preferably at scale.
  • 20. DocumentDB and Azure Data Services. A fully managed, scalable, queryable, schema-free JSON document database service for modern applications. Transactional processing, rich query, managed as a service, elastic scale, internet accessible (HTTP/REST), schema-free data model, arbitrary data formats.
  • 21. DocumentDB features: JSON documents, SQL support, LINQ support, REST API support, JS support (triggers, UDFs, stored procedures), automatic indexing, multi-document transactions, tunable consistency.
  • 22. DocumentDB Key Concept: Collection. A collection of documents. Not a table (different entities can go into the same collection). Collections = partitions: not just logical containers, but physical ones.
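The point that a collection is not a table can be illustrated with plain Python dictionaries standing in for JSON documents. This sketch does not use any DocumentDB SDK; the field names (`type`, `trainId`, `speedKmh`, `platforms`) and the helper `query` are hypothetical and chosen only for the light rail scenario.

```python
# One collection holding documents of different shapes (no shared schema).
collection = [
    {"id": "1", "type": "sensorReading", "trainId": "T1", "speedKmh": 42},
    {"id": "2", "type": "station", "name": "Central", "platforms": 2},
    {"id": "3", "type": "sensorReading", "trainId": "T2", "speedKmh": 57},
]


def query(docs, **equals):
    """Return documents whose fields equal all the given values.

    Documents that lack a requested field simply do not match.
    """
    return [d for d in docs if all(d.get(k) == v for k, v in equals.items())]
```

Calling `query(collection, type="sensorReading")` returns the two sensor documents, while the station document sits in the same collection with entirely different fields.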
  • 24. TLV Railway: can now store its data in a highly scalable store. Great for interactive querying of any data: messages from sensors, reference data. But this data (and other data) needs to move to other places (SQL, batch processing, ML). How?
  • 25. What is Azure Data Factory? Azure Data Factory is a managed service to produce trusted information from data stored in the cloud and on-premises. Easily create, orchestrate, and schedule highly available, fault-tolerant workflows to move and transform your data at scale.
  • 26. Evolving Approaches to Analytics. ETL approach: Extract Original Data → Transform with an ETL Tool (SSIS, etc.) → Load Transformed Data into an EDW (SQL Svr, Teradata, etc.) → BI Tools. Scale-out approach: Ingest Original Data and Streaming data → Scale-out Storage & Compute (HDFS, Blob Storage, etc.) → Transform & Load → Data Marts, Data Lake(s) → Dashboards, Apps.
  • 27. Data Factory – Main concepts. Data Store: a data source/sink component (SQL, Azure or on-premises; Storage; DocumentDB; and more). Data Set: a defined data set contained inside a data store; one data store can have many data sets. Compute: a service for computation (HDInsight, Azure Batch, Data Lake Analytics, Azure ML).
  • 28. Data Factory – Main concepts. Pipeline: a set of instructions, e.g. “Take data from data set A and move to compute, then store results in data set B”. Slices: everything is time sliced. A data set (source) can declare the time intervals on which its data can be sliced, and the pipeline is activated when a new slice is ready. JSON.
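The time-slicing idea can be sketched in Python as follows. This is an in-memory illustration, not Data Factory's JSON pipeline definition or API: the names `time_slices` and `run_pipeline` are hypothetical, records are assumed to be `(timestamp, value)` pairs, and only complete slices are processed.

```python
from datetime import datetime, timedelta


def time_slices(start, end, interval):
    """Yield (slice_start, slice_end) pairs for each complete interval in [start, end)."""
    current = start
    while current + interval <= end:
        yield current, current + interval
        current += interval


def run_pipeline(records, start, end, interval, transform):
    """Apply `transform` to the values of each complete slice.

    Returns a dict mapping slice_start to the transformed result, standing in
    for "take data from data set A, compute, store results in data set B".
    """
    results = {}
    for slice_start, slice_end in time_slices(start, end, interval):
        in_slice = [value for ts, value in records if slice_start <= ts < slice_end]
        results[slice_start] = transform(in_slice)
    return results
```

With hourly slices and `transform=sum`, each hour of source data produces one aggregated output, and a record arriving after the last complete slice is left for a later run.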
  • 32. TLV Railway: can now integrate different services and different data sources, and move data with ease and as little hassle as possible. What about aggregations and a deeper dive into the data, for more complex analysis?
  • 34. HDInsight: Hadoop-as-a-Service, based on the Hortonworks distribution. A few flavors: Hadoop, Storm, HBase, and Spark (each on Windows + Linux).
  • 37. TLV Railway - Summary. Can now perform advanced analytics on top of large amounts of data, in a variety of formats (not just structured, boring data). Can integrate all the loose ends of data coming in with data generated in “Old-School” data platforms like SQL, collected from Line-of-Business applications. We’ve covered data ingestion, responding in real time, querying, storing, and processing. Azure Stack.
  • 38. Hadoop and OSS vs. Azure IoT and BigData Ecosystem (Azure Ecosystem → OSS): Event Hubs → Kafka; Stream Analytics → Storm; HDInsight → Hadoop; Map Reduce → Map Reduce; Hive → Hive; Spark → Spark; HBase → HBase; Azure ML → Mahout; Data Factory → Pig; DocumentDB → MongoDB / Couchbase.
  • 41. London did it first
  • 42. Summary. Get started today at http://azure.microsoft.com

Editor's Notes

  • #9: Key goal of slide: IoT, as you know, is a hot area these days, and there are a number of players that claim to be active in this space. They tend to focus on specific elements you see in this diagram. Microsoft has the most comprehensive portfolio of cloud services that customers need to develop and deploy end-to-end IoT solutions. Customers are adopting these services and are successfully deploying their solutions today (reference Rockwell, ThyssenKrupp). Talk track [Short Version for Sam’s Leadership Session]: As we think about Azure IoT services, Microsoft has the most comprehensive portfolio of cloud services that customers need to develop and deploy end-to-end IoT solutions, ranging from devices that produce data, to connecting them to cloud storage, to driving analytics that yield valuable business insights and allow enterprises to take action. Talk track [Long Version Chris’ Breakout Session]: As we think about Azure IoT services, there are a collection of capabilities involved. First there are Producers. These can be basic sensors, small form factor devices, traditional computer systems, or even complex assets made up of a number of data sources. Next we have the Connect Devices capabilities on the ingress level within and around Azure. The primary destination is Service Bus & Event Hubs, but this relies on client agent technology either at the edge device level or within a field or cloud gateway. We also have capabilities for other external data sources to provide data. As data is ingressed to Azure, there are various Storage options, and a number of destinations can be engaged. Traditional database technology, table or blob storage, or even more complex destinations like DocumentDB are possible. External or third-party technologies can also be used. This is where the flexibility and agility of a platform shows its strength. This is where analysts like Gartner are forming opinions about just how robust our platform can be.
As this data is processed in Azure, there are a number of capabilities that can be utilized. Machine Learning, HDInsight, and Stream Analytics are examples of tools that can analyze the data in various ways. Finally, the concept of Take Actions uses Azure services. Data may populate a LOB portal, be pushed to apps, or be presented in analytics and productivity tools. These are all ways that the data gets out of these architecture points to allow organizations to use analysis to change or transform their business. Through all of these areas, there is the possibility of utilizing existing investments either within your Azure environment or elsewhere.
  • #10, #31, #32: Same notes as #9.