60 min
Big data on Azure for Architects
Data Complexity: Variety and Velocity
Megabytes (10^6)
Gigabytes (10^9)
Terabytes (10^12)
Petabytes (10^15)
Exabytes (10^18)
Volume Velocity
Variety Variability
Reduces
NoSQL:
• No cleansing!
• No ETL!
• No load!
• Analyze the data where it lands! Store now, question later
RDBMS flow:
1. Data arrives
2. Derive a schema
3. Cleanse the data
4. Transform the data
5. Load the data
6. SQL queries
NoSQL flow:
1. Data arrives
2. Application program
How, if I don't know the structure?
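The "store now, question later" idea can be sketched as schema-on-read: raw records are kept exactly as they arrive, and a structure is applied only when a question is asked. This is a minimal illustration, not a storage engine; the record layout and field names are hypothetical.

```python
import json

# Raw events are stored exactly as they arrive: no schema, cleansing, or load step.
# (The sample records are made up for illustration.)
raw_store = [
    '{"user": "anna", "action": "login", "ms": 120}',
    '{"user": "piotr", "action": "search", "query": "azure"}',
    'not even JSON',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: parse what parses, project the requested fields."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unreadable input is skipped at query time, not rejected at load time
        if not isinstance(record, dict):
            continue
        # Missing fields become None instead of failing the whole query.
        yield {name: record.get(name) for name in fields}


rows = list(read_with_schema(raw_store, ["user", "action"]))
```

The trade-off is that every query pays the parsing cost and has to decide what to do with malformed input, which the RDBMS flow settles once, at load time.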
Distributed Storage (HDFS)
Query (Hive)
Distributed Processing (MapReduce)
Data Integration (ODBC/Sqoop/REST)
Event Pipeline (Event Hub/Flume)
Legend:
Red = Core Hadoop
Blue = Data processing
Gray = Microsoft integration points and value adds
Orange = Data movement
Green = Packages
YARN
Name Node
Data Node
HDFS API
DFS (1 Data Node per
Worker Role) and Compute
Cluster / VM
Azure Storage (WASB)
Benefits:
Data reuse and sharing
Data storage cost
Elastic scale-out
Geo-replication
…
Data Node
Most important benefit:
Data is INDEPENDENT of the cluster
And WASB is FAST…
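Because the data lives in Azure Storage rather than on the cluster nodes, jobs address it by URI, and the same URI stays valid after the cluster is deleted and recreated. A minimal sketch of building such URIs, assuming the usual `wasb://<container>@<account>.blob.core.windows.net/<path>` form and the short `wasb:///<path>` form for the cluster's default container; the helper name and sample values are hypothetical.

```python
def wasb_uri(path, container=None, account=None, secure=False):
    """Build a WASB URI; with no container/account, address the default container."""
    scheme = "wasbs" if secure else "wasb"
    path = "/" + path.lstrip("/")
    if container is None and account is None:
        # Short form: resolved against the cluster's default storage container.
        return f"{scheme}://{path}"
    if container is None or account is None:
        raise ValueError("container and account must be given together")
    return f"{scheme}://{container}@{account}.blob.core.windows.net{path}"


default_uri = wasb_uri("example/data/gutenberg/davinci.txt")
explicit_uri = wasb_uri("/logs/a.log", container="raw", account="myacct")
```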
SOSP paper: "Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency"
Report: http://nasuni.com
[Diagram: write path through the storage layers]
Incoming Write Request
Front End Layer: FE servers
Partition Layer: Partition Servers, Partition Master, Lock Service
Stream Layer: Extent Nodes (EN), M nodes (Paxos)
Ack
Blob Index (sorted by key):
Account Name    Container Name    Blob Name
aaaa            aaaa              aaaaa
……..            ……..              ……..
zzzz            zzzz              zzzzz
Storage Stamp: Front-End Server, Partition Master, Partition Servers PS 1, PS 2, PS 3
Partition Map:
A-H: PS1
H’-R: PS2
R’-Z: PS3
Blob Index split by key range (Account Name, Container Name, Blob Name):
A-H (PS 1):     aaaa, aaaa, aaaaa  …  harry, pictures, sunrise
H’-R (PS 2):    harry, pictures, sunset  …  richard, videos, soccer
R’-Z (PS 3):    richard, videos, tennis  …  zzzz, zzzz, zzzzz
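The partition map lookup (A-H: PS1, H’-R: PS2, R’-Z: PS3) can be sketched as a binary search over the lowest key each partition server owns. This is an illustration of range partitioning in general, not the service's actual implementation; the boundary keys below are taken from the example rows on the slide, and the function name is hypothetical.

```python
from bisect import bisect_right

# Each entry: (lowest key served, partition server). Keys are
# (account name, container name, blob name) tuples, compared lexicographically.
partition_map = [
    (("aaaa", "aaaa", "aaaaa"), "PS1"),
    (("harry", "pictures", "sunset"), "PS2"),
    (("richard", "videos", "tennis"), "PS3"),
]
low_keys = [low for low, _ in partition_map]


def route(account, container, blob):
    """Return the partition server whose key range contains the blob key."""
    key = (account, container, blob)
    # The last range whose low key is <= the requested key owns it.
    index = bisect_right(low_keys, key) - 1
    if index < 0:
        raise KeyError(f"{key} sorts before the first partition")
    return partition_map[index][1]
```

For example, `route("harry", "pictures", "sunrise")` lands on PS1 while `route("harry", "pictures", "sunset")` lands on PS2, which matches the split shown in the diagram. Moving a boundary only changes one entry in the map; no blob data has to be rewritten.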
• Programming framework (library and runtime) for analyzing datasets stored in HDFS
• Composed of user-supplied Map and Reduce functions:
• Map() - subdivide and conquer
• Reduce() - combine and reduce cardinality
Do work() Do work() Do work() ………
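The Map and Reduce roles can be sketched with a single-process word count. This is plain Python that imitates the three phases (map, shuffle, reduce) in memory; it is not the Hadoop API, and the function names are chosen for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map: split one input record into (word, 1) pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between the phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["to be or", "not to be"])
```

In a real cluster each `map_phase` call can run on a different node next to its data block, and the shuffle moves pairs across the network so that every key ends up at exactly one reducer.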
context.write(word, one);
context.write(key, new IntWritable(sum));
wasb:///example/data/gutenberg/davinci.txt wasb:///example/data/WordCountOutput
Run in PowerShell:
Start-AzureHDInsightJob
Get-AzureStorageBlob
https://pltkhdc01.azurehdinsight.net:443/ambari/api/v1/clusters/pltkhdc01.azurehdinsight.net/services/yarn
• It’s important to check that the results generated by queries are realistic, valid, and useful for better ROI.
• Automate tasks in a repeatable solution, and run the solution from a remote computer rather than directly from the cluster server desktop.
• There’s a huge range of tools that you can use with Hadoop, and choosing the most appropriate can be difficult.
• If you decide to use a resource-intensive application such as HBase or Storm, you should consider running it on a separate cluster.
Data-flow platform to transform and analyze HDFS data
Scripting – No Java Needed!
Focus on semantics, not on implementation
Extensible through user-defined functions and methods
Pigs Eat Anything
Pig can operate on data whether it has metadata or not.
Pigs Live Anywhere
Pig is not tied to one particular parallel framework.
Pigs Are Domestic Animals
Pig is designed to be easily controlled. Complex tasks involving interrelated data transformations can be simplified and encoded as data flow sequences. Pig programs accomplish huge tasks, but they are easy to write and maintain.
Pigs Fly
Pig processes data quickly. The system automatically optimizes execution of Pig jobs, so the user can focus on semantics.
-- Load the raw log file with the default loader
LOGS = LOAD 'wasb:///example/data/sample.log';
-- Extract the log level from each line (null when no level is present)
LEVELS = FOREACH LOGS GENERATE REGEX_EXTRACT($0, '(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)', 1) AS LOGLEVEL;
FILTEREDLEVELS = FILTER LEVELS BY LOGLEVEL IS NOT NULL;
GROUPEDLEVELS = GROUP FILTEREDLEVELS BY LOGLEVEL;
-- Count the occurrences of each level
FREQUENCIES = FOREACH GROUPEDLEVELS GENERATE group AS LOGLEVEL, COUNT(FILTEREDLEVELS.LOGLEVEL) AS COUNT;
RESULT = ORDER FREQUENCIES BY COUNT DESC;
DUMP RESULT;
STORE RESULT INTO 'tkR1';
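For readers who do not know Pig Latin, the same data flow (extract the level, drop lines without one, group, count, sort descending) can be sketched in plain Python. This is a local, single-machine illustration, assuming that the first level keyword found anywhere in a line is the one that counts; the sample log lines are invented.

```python
import re
from collections import Counter

# Same alternation as in the Pig script.
LEVEL = re.compile(r"(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)")


def level_frequencies(lines):
    """Return (level, count) pairs ordered by count, highest first."""
    matches = (LEVEL.search(line) for line in lines)
    # Lines without a level are dropped (FILTER ... IS NOT NULL).
    levels = (match.group(1) for match in matches if match)
    # Counter plays the role of GROUP + COUNT; sorted plays the role of ORDER ... DESC.
    return sorted(Counter(levels).items(), key=lambda item: item[1], reverse=True)


sample = [
    "2012-02-03 18:35:34 SampleClass6 [INFO] everything normal",
    "2012-02-03 18:35:35 SampleClass4 [ERROR] incorrect id",
    "2012-02-03 18:35:36 SampleClass1 [INFO] verbose detail",
    "a line without any level",
]
frequencies = level_frequencies(sample)
```

The Pig version expresses the same steps declaratively, which lets the runtime spread them over the cluster instead of one process.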
Check result in PS
Hadoop 2.0
What is Machine Learning (ML)
Solve extremely hard problems
Extract more value from Big Data
Drive a shift in business analytics
Machine Learning Process Model (based on the CRISP-DM model)
Stages shown: Idea, Business Knowledge, Data, Data Understanding, Data Preparation, Modelling, Evaluation, Publish
Volume, batch processing
Events, real-time processing
Relay
• NAT and Firewall Traversal Service
• Request/Response Services
• Unbuffered with TCP Throttling
• Hybrid Connection
Queue
• Transactional Cloud AMQP/HTTP Broker
• High-Scale, High-Reliability Messaging
• Sessions, Scheduled Delivery, etc.
Topic
• Transactional Message Distribution
• Up to 2000 subscriptions per Topic
• Up to 2K/100K filter rules per subscription
Notification Hub
• High-scale notification distribution
• Most mobile push notification services
• Millions of notification targets
Event Hub
• Events, massive scale
Event Producers: > 1M producers, > 1 GB/sec aggregate throughput
Partitions: addressed directly or by PartitionKey hash
Throughput Units (TUs):
• 1 ≤ TUs ≤ Partition Count
• TU: 1 MB/s writes, 2 MB/s reads
• Billing is per TU
AMQP 1.0
Credit-based flow control
Client-side cursors
Offset by Id or Timestamp
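The two sizing rules above (a key is hashed to a partition; TUs are bounded by the partition count and rated at 1 MB/s in, 2 MB/s out) can be turned into a small back-of-the-envelope calculator. This is a sketch under stated assumptions: CRC32 is only a stand-in for whatever hash the service really uses, and both function names are hypothetical.

```python
import math
import zlib

INGRESS_MB_PER_TU = 1  # write rate per throughput unit, from the slide
EGRESS_MB_PER_TU = 2   # read rate per throughput unit, from the slide


def partition_for(partition_key, partition_count):
    """Map a partition key to a partition index (CRC32 is a stand-in hash)."""
    return zlib.crc32(partition_key.encode("utf-8")) % partition_count


def throughput_units_needed(ingress_mb_s, egress_mb_s, partition_count):
    """Smallest TU count covering both rates, subject to 1 <= TUs <= partition count."""
    needed = max(
        1,
        math.ceil(ingress_mb_s / INGRESS_MB_PER_TU),
        math.ceil(egress_mb_s / EGRESS_MB_PER_TU),
    )
    if needed > partition_count:
        raise ValueError(
            f"{needed} TUs needed but only {partition_count} partitions available"
        )
    return needed


tus = throughput_units_needed(ingress_mb_s=3.5, egress_mb_s=4, partition_count=8)
```

Because all events with the same key hash to the same partition, per-device ordering is preserved, but one very chatty key can saturate a single partition regardless of how many TUs are purchased.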
[Diagram: event processing pipeline]
Stages: Event producers, Collection, Ingestor (broker), Transformation, Long-term storage, Presentation and action
Event producers: Applications; Devices: Legacy IoT (custom protocols), IP-capable devices (Windows/Linux), Low-power devices (RTOS)
Collection: Field gateways, Cloud gateways (web APIs)
Ingestor (broker): Event hubs, Service bus
Transformation: Stream processing, Storage adapters
Services shown: Azure DBs, Azure storage, HDInsight, Stream Analytics, Storm, IEventProcessor
Presentation and action: Search and query, Data analytics (Excel), Web/thick client dashboards, Devices to take action
Daughter jumping in garage
Me with compressed (cold) air
Me with small dryer
* Tick tuples are Storm’s built-in mechanism for generating tuples and sending them to each bolt in the topology at specified intervals.
Worth checking: https://storm.apache.org/apidocs/backtype/storm/topology/TopologyBuilder.BoltGetter.html
EventHubSpout
spoutConfig.getPartitionCount
PartialCountBolt
EventHubSpout
DBGlobalCountBolt
collector.emit
collector.ack
db.insertValue(System.currentTimeMillis(), partialCount);
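The shape of this topology (a spout feeding several partial-count bolts, which flush their counts on a tick to one global-count bolt that records a timestamped total) can be simulated in a few lines. This is plain Python, not the Storm API; the class and method names are hypothetical, round-robin dispatch stands in for Storm's grouping, and a list stands in for the database.

```python
class PartialCountBolt:
    """Counts events seen since the last tick (stand-in for one bolt task)."""

    def __init__(self):
        self.count = 0

    def on_event(self, event):
        self.count += 1

    def on_tick(self):
        # Emit the partial count and start a fresh interval.
        partial, self.count = self.count, 0
        return partial


class GlobalCountBolt:
    """Sums the partial counts and records one row per interval."""

    def __init__(self, sink):
        self.sink = sink  # a list standing in for the database table

    def on_partials(self, now_ms, partials):
        self.sink.append((now_ms, sum(partials)))


def run_interval(events, bolts, global_bolt, now_ms):
    for index, event in enumerate(events):
        bolts[index % len(bolts)].on_event(event)  # round-robin dispatch
    global_bolt.on_partials(now_ms, [bolt.on_tick() for bolt in bolts])


sink = []
bolts = [PartialCountBolt(), PartialCountBolt()]
run_interval(["e1", "e2", "e3", "e4", "e5"], bolts, GlobalCountBolt(sink), now_ms=1000)
```

Counting in two stages keeps the write rate to the database at one row per interval, no matter how many events arrive or how many partial-count tasks run in parallel.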
Scenario: Data Journeys
Azure service map:
Data Sources: Feeds, IoT
Orchestration: Service Bus, Event Hub, Data Factory
Compute: Stream Analytics, HDInsight, Machine Learning, Virtual Machines
Storage: Table Storage, Blob Storage, SQL Azure, DocumentDB
Visualisation: Power BI
Scenario: Predictive Analytics (same Azure service map as above)
Scenario: Near real time analysis (same Azure service map as above)
Scenario: Big Data (same Azure service map as above)
Scenario: “Traditional” BI (same Azure service map as above)
tkopacz@microsoft.com
Service Fabric runs on:
Azure: Windows Server, Linux
Hosted Clouds: Windows Server, Linux
Private Clouds: Windows Server, Linux
Service Fabric capabilities:
High Availability
Hyper-Scale
Hybrid Operations
High Density
Microservices
Rolling Upgrades
Stateful services
Low Latency
Fast startup & shutdown
Container Orchestration & lifecycle management
Replication & Failover
Simple programming models
Load balancing
Self-healing
Data Partitioning
Automated Rollback
Health Monitoring
Placement Constraints
Ad

More Related Content

What's hot (20)

Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes
John Archer
 
Data Lakes with Azure Databricks
Data Lakes with Azure DatabricksData Lakes with Azure Databricks
Data Lakes with Azure Databricks
Data Con LA
 
Cortana Analytics Workshop: Operationalizing Your End-to-End Analytics Solution
Cortana Analytics Workshop: Operationalizing Your End-to-End Analytics SolutionCortana Analytics Workshop: Operationalizing Your End-to-End Analytics Solution
Cortana Analytics Workshop: Operationalizing Your End-to-End Analytics Solution
MSAdvAnalytics
 
Big Data in Azure
Big Data in AzureBig Data in Azure
Big Data in Azure
DataWorks Summit/Hadoop Summit
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
Rajesh Nadipalli
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics Suite
James Serra
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
James Serra
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Alberto Diaz Martin
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
Mark Rittman
 
Azure Databricks for Data Scientists
Azure Databricks for Data ScientistsAzure Databricks for Data Scientists
Azure Databricks for Data Scientists
Richard Garris
 
DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...
DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...
DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...
Lace Lofranco
 
Introduction to Azure HDInsight
Introduction to Azure HDInsightIntroduction to Azure HDInsight
Introduction to Azure HDInsight
Stéphane Fréchette
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
Alberto Diaz Martin
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
Dmitry Anoshin
 
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Michael Rys
 
Designing big data analytics solutions on azure
Designing big data analytics solutions on azureDesigning big data analytics solutions on azure
Designing big data analytics solutions on azure
Mohamed Tawfik
 
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu GantaAzure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Databricks
 
Big Data Architecture and Design Patterns
Big Data Architecture and Design PatternsBig Data Architecture and Design Patterns
Big Data Architecture and Design Patterns
John Yeung
 
How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...
How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...
How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...
Databricks
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
James Serra
 
Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes
John Archer
 
Data Lakes with Azure Databricks
Data Lakes with Azure DatabricksData Lakes with Azure Databricks
Data Lakes with Azure Databricks
Data Con LA
 
Cortana Analytics Workshop: Operationalizing Your End-to-End Analytics Solution
Cortana Analytics Workshop: Operationalizing Your End-to-End Analytics SolutionCortana Analytics Workshop: Operationalizing Your End-to-End Analytics Solution
Cortana Analytics Workshop: Operationalizing Your End-to-End Analytics Solution
MSAdvAnalytics
 
Hd insight essentials quick view
Hd insight essentials quick viewHd insight essentials quick view
Hd insight essentials quick view
Rajesh Nadipalli
 
Cortana Analytics Suite
Cortana Analytics SuiteCortana Analytics Suite
Cortana Analytics Suite
James Serra
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
James Serra
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Alberto Diaz Martin
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
Mark Rittman
 
Azure Databricks for Data Scientists
Azure Databricks for Data ScientistsAzure Databricks for Data Scientists
Azure Databricks for Data Scientists
Richard Garris
 
DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...
DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...
DataOps for the Modern Data Warehouse on Microsoft Azure @ NDCOslo 2020 - Lac...
Lace Lofranco
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
Alberto Diaz Martin
 
Building Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft AzureBuilding Modern Data Platform with Microsoft Azure
Building Modern Data Platform with Microsoft Azure
Dmitry Anoshin
 
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ...
Michael Rys
 
Designing big data analytics solutions on azure
Designing big data analytics solutions on azureDesigning big data analytics solutions on azure
Designing big data analytics solutions on azure
Mohamed Tawfik
 
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu GantaAzure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Azure Databricks – Customer Experiences and Lessons Denzil Ribeiro Madhu Ganta
Databricks
 
Big Data Architecture and Design Patterns
Big Data Architecture and Design PatternsBig Data Architecture and Design Patterns
Big Data Architecture and Design Patterns
John Yeung
 
How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...
How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...
How Azure Databricks helped make IoT Analytics a Reality with Janath Manohara...
Databricks
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
James Serra
 

Viewers also liked (20)

Desayuno de arquitectos: Big data en azure
Desayuno de arquitectos: Big data en azureDesayuno de arquitectos: Big data en azure
Desayuno de arquitectos: Big data en azure
Guillermo Javier Bellmann
 
Haddop in Business Intelligence
Haddop in Business IntelligenceHaddop in Business Intelligence
Haddop in Business Intelligence
HGanesh
 
Big data in Azure
Big data in AzureBig data in Azure
Big data in Azure
Venkatesh Narayanan
 
Azure Big Data Story
Azure Big Data StoryAzure Big Data Story
Azure Big Data Story
Lynn Langit
 
Azure architecture
Azure architectureAzure architecture
Azure architecture
Amal Dev
 
Windows Azure and the Hybrid Cloud
Windows Azure and the Hybrid CloudWindows Azure and the Hybrid Cloud
Windows Azure and the Hybrid Cloud
Windows Azure
 
Building Big data solutions in Azure
Building Big data solutions in AzureBuilding Big data solutions in Azure
Building Big data solutions in Azure
Mostafa
 
Improving Application Security With Azure
Improving Application Security With AzureImproving Application Security With Azure
Improving Application Security With Azure
Softchoice Corporation
 
Architecting azure IaaS Solutions
Architecting azure IaaS SolutionsArchitecting azure IaaS Solutions
Architecting azure IaaS Solutions
swapnilrkambli
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
Patrick Nicolas
 
Microsoft Azure Hybrid Cloud - Getting Started For Techies
Microsoft Azure Hybrid Cloud - Getting Started For TechiesMicrosoft Azure Hybrid Cloud - Getting Started For Techies
Microsoft Azure Hybrid Cloud - Getting Started For Techies
Aidan Finn
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache Ambari
Hortonworks
 
Hadoop Ecosystem Architecture Overview
Hadoop Ecosystem Architecture Overview Hadoop Ecosystem Architecture Overview
Hadoop Ecosystem Architecture Overview
Senthil Kumar
 
Azure Stack - Azure in your own Data Center
Azure Stack - Azure in your own Data CenterAzure Stack - Azure in your own Data Center
Azure Stack - Azure in your own Data Center
Adnan Hashmi
 
Optimize your azure architecture
Optimize your azure architectureOptimize your azure architecture
Optimize your azure architecture
Asaf Nakash
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop Ecosystem
InSemble
 
MS Cloud Summit Paris 2017 - Azure Stack
MS Cloud Summit Paris 2017 - Azure StackMS Cloud Summit Paris 2017 - Azure Stack
MS Cloud Summit Paris 2017 - Azure Stack
Benoît SAUTIERE
 
Big Data en Azure: Azure Data Lake
Big Data en Azure: Azure Data LakeBig Data en Azure: Azure Data Lake
Big Data en Azure: Azure Data Lake
Guillermo Javier Bellmann
 
Intorducing Big Data and Microsoft Azure
Intorducing Big Data and Microsoft AzureIntorducing Big Data and Microsoft Azure
Intorducing Big Data and Microsoft Azure
Khalid Salama
 
Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015
assafleb
 
Haddop in Business Intelligence
Haddop in Business IntelligenceHaddop in Business Intelligence
Haddop in Business Intelligence
HGanesh
 
Azure Big Data Story
Azure Big Data StoryAzure Big Data Story
Azure Big Data Story
Lynn Langit
 
Azure architecture
Azure architectureAzure architecture
Azure architecture
Amal Dev
 
Windows Azure and the Hybrid Cloud
Windows Azure and the Hybrid CloudWindows Azure and the Hybrid Cloud
Windows Azure and the Hybrid Cloud
Windows Azure
 
Building Big data solutions in Azure
Building Big data solutions in AzureBuilding Big data solutions in Azure
Building Big data solutions in Azure
Mostafa
 
Improving Application Security With Azure
Improving Application Security With AzureImproving Application Security With Azure
Improving Application Security With Azure
Softchoice Corporation
 
Architecting azure IaaS Solutions
Architecting azure IaaS SolutionsArchitecting azure IaaS Solutions
Architecting azure IaaS Solutions
swapnilrkambli
 
Microsoft Azure Hybrid Cloud - Getting Started For Techies
Microsoft Azure Hybrid Cloud - Getting Started For TechiesMicrosoft Azure Hybrid Cloud - Getting Started For Techies
Microsoft Azure Hybrid Cloud - Getting Started For Techies
Aidan Finn
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache Ambari
Hortonworks
 
Hadoop Ecosystem Architecture Overview
Hadoop Ecosystem Architecture Overview Hadoop Ecosystem Architecture Overview
Hadoop Ecosystem Architecture Overview
Senthil Kumar
 
Azure Stack - Azure in your own Data Center
Azure Stack - Azure in your own Data CenterAzure Stack - Azure in your own Data Center
Azure Stack - Azure in your own Data Center
Adnan Hashmi
 
Optimize your azure architecture
Optimize your azure architectureOptimize your azure architecture
Optimize your azure architecture
Asaf Nakash
 
Introduction To Hadoop Ecosystem
Introduction To Hadoop EcosystemIntroduction To Hadoop Ecosystem
Introduction To Hadoop Ecosystem
InSemble
 
MS Cloud Summit Paris 2017 - Azure Stack
MS Cloud Summit Paris 2017 - Azure StackMS Cloud Summit Paris 2017 - Azure Stack
MS Cloud Summit Paris 2017 - Azure Stack
Benoît SAUTIERE
 
Intorducing Big Data and Microsoft Azure
Intorducing Big Data and Microsoft AzureIntorducing Big Data and Microsoft Azure
Intorducing Big Data and Microsoft Azure
Khalid Salama
 
Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015Real world hybrid cloud session - OpenStack DACH 2015
Real world hybrid cloud session - OpenStack DACH 2015
assafleb
 
Ad

Similar to Big data on Azure for Architects (20)

Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
Flavio Vit
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionBig Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Etu Solution
 
How can Hadoop & SAP be integrated
How can Hadoop & SAP be integratedHow can Hadoop & SAP be integrated
How can Hadoop & SAP be integrated
Douglas Bernardini
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
Roushan Sinha
 
Real time analytics
Real time analyticsReal time analytics
Real time analytics
Leandro Totino Pereira
 
Building a scalable analytics environment to support diverse workloads
Building a scalable analytics environment to support diverse workloadsBuilding a scalable analytics environment to support diverse workloads
Building a scalable analytics environment to support diverse workloads
Alluxio, Inc.
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Bhupesh Bansal
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedIn
Hadoop User Group
 
Hadoop tutorial
Hadoop tutorialHadoop tutorial
Hadoop tutorial
Aamir Ameen
 
Hadoop Tutorial.ppt
Hadoop Tutorial.pptHadoop Tutorial.ppt
Hadoop Tutorial.ppt
Sathish24111
 
Hadoop_arunam_ppt
Hadoop_arunam_pptHadoop_arunam_ppt
Hadoop_arunam_ppt
jerrin joseph
 
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
nnakasone
 
getFamiliarWithHadoop
getFamiliarWithHadoopgetFamiliarWithHadoop
getFamiliarWithHadoop
AmirReza Mohammadi
 
عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟
datastack
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016
Ranjith Sekar
 
CCD-410 Cloudera Study Material
CCD-410 Cloudera Study MaterialCCD-410 Cloudera Study Material
CCD-410 Cloudera Study Material
Roxycodone Online
 
Hadoop bigdata overview
Hadoop bigdata overviewHadoop bigdata overview
Hadoop bigdata overview
harithakannan
 
Hadoop Big Data A big picture
Hadoop Big Data A big pictureHadoop Big Data A big picture
Hadoop Big Data A big picture
J S Jodha
 
Slides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesSlides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data Lakes
DATAVERSITY
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
Subhas Kumar Ghosh
 
Big Data and Hadoop
Big Data and HadoopBig Data and Hadoop
Big Data and Hadoop
Flavio Vit
 
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data SolutionBig Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Big Data Taiwan 2014 Track2-2: Informatica Big Data Solution
Etu Solution
 
How can Hadoop & SAP be integrated
How can Hadoop & SAP be integratedHow can Hadoop & SAP be integrated
How can Hadoop & SAP be integrated
Douglas Bernardini
 
Building a scalable analytics environment to support diverse workloads
Building a scalable analytics environment to support diverse workloadsBuilding a scalable analytics environment to support diverse workloads
Building a scalable analytics environment to support diverse workloads
Alluxio, Inc.
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Bhupesh Bansal
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedIn
Hadoop User Group
 
Hadoop Tutorial.ppt
Hadoop Tutorial.pptHadoop Tutorial.ppt
Hadoop Tutorial.ppt
Sathish24111
 
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
Ai tour 2019 Mejores Practicas en Entornos de Produccion Big Data Open Source...
nnakasone
 
عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟عصر کلان داده، چرا و چگونه؟
عصر کلان داده، چرا و چگونه؟
datastack
 
Hadoop and BigData - July 2016
Hadoop and BigData - July 2016Hadoop and BigData - July 2016
Hadoop and BigData - July 2016
Ranjith Sekar
 
CCD-410 Cloudera Study Material
CCD-410 Cloudera Study MaterialCCD-410 Cloudera Study Material
CCD-410 Cloudera Study Material
Roxycodone Online
 
Hadoop bigdata overview
Hadoop bigdata overviewHadoop bigdata overview
Hadoop bigdata overview
harithakannan
 
Hadoop Big Data A big picture
Hadoop Big Data A big pictureHadoop Big Data A big picture
Hadoop Big Data A big picture
J S Jodha
 
Slides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data LakesSlides: Accelerating Queries on Cloud Data Lakes
Slides: Accelerating Queries on Cloud Data Lakes
DATAVERSITY
 
Ad

Big data on Azure for Architects

  • 3. Data Complexity: Variety and Velocity. Megabytes (10^6), Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18)
  • 6. RDBMS: (1) data arrives, (2) derive a schema, (3) cleanse the data, (4) transform the data, (5) load the data, (6) SQL queries. How, if I don't know the structure? NoSQL: (1) data arrives, (2) application program. NoSQL reduces the pipeline: no cleansing, no ETL, no load. Analyze the data where it lands! Store now, question later.
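
The "store now, question later" idea from slide 6 can be sketched in a few lines of Python. This is an illustration only: the sample events, the field names, and the helper `read_with_schema` are assumptions made for the example and do not come from the slides.

```python
import json

# Raw events are kept exactly as they arrived; nothing is cleansed or loaded first.
RAW_EVENTS = [
    '{"user": "anna", "action": "login", "ms": 120}',
    '{"user": "piotr", "action": "search", "query": "hdinsight"}',
    'not json at all',
    '{"user": "anna", "action": "logout"}',
]

def read_with_schema(lines, fields):
    """Apply a schema at query time: project only the requested fields."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed input is skipped when reading, not when storing
        if isinstance(record, dict):
            yield {field: record.get(field) for field in fields}

# The "question" is decided only now, long after the data was stored.
actions_by_user = {}
for row in read_with_schema(RAW_EVENTS, ["user", "action"]):
    actions_by_user.setdefault(row["user"], []).append(row["action"])
```

A different question tomorrow only needs a different `fields` list; the stored data stays untouched.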
  • 16. Distributed Storage (HDFS); YARN; Distributed Processing (MapReduce); Query (Hive); Data Integration (ODBC/SQOOP/REST); Event Pipeline (EventHub/Flume). Legend: Red = Core Hadoop; Blue = Data processing; Gray = Microsoft integration points and value adds; Orange = Data Movement; Green = Packages.
  • 17. Name Node, Data Node, HDFS API. DFS (1 Data Node per Worker Role) and Compute on the Cluster / VM. Azure Storage (WASB) benefits: data reuse and sharing, data storage cost, elastic scale-out, geo-replication, … Most important benefit: data are INDEPENDENT from the cluster. And WASB is FAST…
  • 21. SOSP Paper - Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency. http://nasuni.com (report link)
  • 22. Incoming Write Request and Ack pass through three layers. Front End Layer: FE servers. Partition Layer: Partition Servers, Partition Master, Lock Service. Stream Layer: Extent Nodes (EN), M nodes (Paxos).
  • 23. Storage Stamp: Front-End Server, Partition Master, Partition Servers PS 1, PS 2, PS 3. The Blob Index is keyed by (Account Name, Container Name, Blob Name) and sorted from (aaaa, aaaa, aaaaa) to (zzzz, zzzz, zzzzz). It is split into three ranges: up to (harry, pictures, sunrise); from (harry, pictures, sunset) to (richard, videos, soccer); from (richard, videos, tennis) onward. Partition Map: A-H: PS1, H’-R: PS2, R’-Z: PS3.
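
The range-partitioned lookup on slide 23 can be sketched as a binary search over sorted range boundaries. This is a minimal sketch: the boundary keys are the sample rows shown on the slide, while the function name `lookup_partition_server` and the in-memory list are assumptions for illustration, not how the storage service is implemented.

```python
from bisect import bisect_left

# Inclusive upper bound of each key range, in sorted order. Keys are
# (account, container, blob) tuples, compared lexicographically.
RANGE_UPPER_BOUNDS = [
    ("harry", "pictures", "sunrise"),  # last key in the first range (PS1)
    ("richard", "videos", "soccer"),   # last key in the second range (PS2)
]
SERVERS = ["PS1", "PS2", "PS3"]        # PS3 serves every key after the last bound

def lookup_partition_server(account, container, blob):
    """Return the partition server whose key range contains the blob key."""
    key = (account, container, blob)
    # bisect_left counts the bounds strictly below the key, which is the range index.
    return SERVERS[bisect_left(RANGE_UPPER_BOUNDS, key)]
```

Because the map only stores range boundaries, moving a range to another server means editing one entry rather than rehashing every key.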
  • 27. Programming framework (library and runtime) for analyzing datasets stored in HDFS, composed of user-supplied Map and Reduce functions. Map(): subdivide and conquer. Reduce(): combine and reduce cardinality. Diagram: several "Do work()" tasks side by side.
  • 30. WordCount: the mapper emits context.write(word, one); the reducer emits context.write(key, new IntWritable(sum)). Input: wasb:///example/data/gutenberg/davinci.txt. Output: wasb:///example/data/WordCountOutput. Run in PowerShell (PS): Start-AzureHDInsightJob, Get-AzureStorageBlob.
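
The Map and Reduce roles from slides 27 and 30 can be illustrated with a single-process word count in Python. This is a sketch of the programming model only, not Hadoop code: the function names and the in-memory shuffle are assumptions, and a real job would run the phases on many nodes.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair per word, like context.write(word, one)."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group values by key; in Hadoop the framework does this between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse all values of one key, like context.write(key, sum)."""
    return key, sum(values)

def word_count(lines):
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

For example, `word_count(["to be or not to be"])` counts "to" and "be" twice and the other words once.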
  • 36.
    • It’s important to check that the results generated by queries are realistic, valid, and useful for better ROI.
    • Automate tasks in a repeatable solution, and run the solution from a remote computer rather than directly from the cluster server desktop.
    • There’s a huge range of tools that you can use with Hadoop, and choosing the most appropriate can be difficult.
    • If you decide to use a resource-intensive application such as HBase or Storm, you should consider running it on a separate cluster.
  • 37. Pig: data-flow platform to transform and analyze HDFS data. Scripting, no Java needed! Focus on semantics, not on implementation. Extensible through user-defined functions and methods.
    • Pigs Eat Anything: Pig can operate on data whether it has metadata or not.
    • Pigs Live Anywhere: Pig is not tied to one particular parallel framework.
    • Pigs Are Domestic Animals: Pig is designed to be easily controlled. Complex tasks involving interrelated data transformations can be simplified and encoded as data flow sequences. Pig programs accomplish huge tasks, but they are easy to write and maintain.
    • Pigs Fly: Pig processes data quickly. The system automatically optimizes execution of Pig jobs, so the user can focus on semantics.
  • 39. Pig Latin script that counts log levels:
      LOGS = LOAD 'wasb:///example/data/sample.log';
      LEVELS = foreach LOGS generate REGEX_EXTRACT($0, '(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)', 1) as LOGLEVEL;
      FILTEREDLEVELS = FILTER LEVELS by LOGLEVEL is not null;
      GROUPEDLEVELS = GROUP FILTEREDLEVELS by LOGLEVEL;
      FREQUENCIES = foreach GROUPEDLEVELS generate group as LOGLEVEL, COUNT(FILTEREDLEVELS.LOGLEVEL) as COUNT;
      RESULT = order FREQUENCIES by COUNT desc;
      DUMP RESULT;
      STORE RESULT INTO 'tkR1';
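
For readers who do not know Pig Latin, the same data flow (extract, filter, group, count, order) can be written in plain Python. This is a hedged equivalent for illustration: it runs on an in-memory list of lines rather than on WASB, and the function name `log_level_frequencies` is an assumption.

```python
import re
from collections import Counter

# Same alternation as the REGEX_EXTRACT pattern in the Pig script.
LEVEL_PATTERN = re.compile(r"(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)")

def log_level_frequencies(lines):
    """Return (level, count) pairs ordered by count, highest first."""
    # Extract the first level found on each line; lines without a level are
    # dropped, which mirrors the "LOGLEVEL is not null" filter.
    matches = (LEVEL_PATTERN.search(line) for line in lines)
    levels = (match.group(1) for match in matches if match)
    # Counter groups and counts; most_common() orders by count descending.
    return Counter(levels).most_common()
```

The Pig version expresses the same steps declaratively, which lets the runtime spread them across the cluster.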
  • 58. What is Machine Learning (ML)? Solve extremely hard problems. Extract more value from Big Data. Drive a shift in business analytics.
  • 70. Service Bus entities:
    • Relay: NAT and firewall traversal service; request/response services; unbuffered with TCP throttling; Hybrid Connection.
    • Queue: transactional cloud AMQP/HTTP broker; high-scale, high-reliability messaging; sessions, scheduled delivery, etc.
    • Topic: transactional message distribution; up to 2000 subscriptions per Topic; up to 2K/100K filter rules per subscription.
    • Notification Hub: high-scale notification distribution; most mobile push notification services; millions of notification targets.
    • Event Hub: EVENTS, MASSIVE SCALE.
  • 71. Event Hubs. Event producers: > 1M producers, > 1 GB/sec aggregate throughput. Partitions: addressed directly or by PartitionKey hash. Throughput Units (TU): 1 ≤ TUs ≤ partition count; one TU gives 1 MB/s writes and 2 MB/s reads; we pay per TU. Protocol: AMQP 1.0 with credit-based flow control. Client-side cursors; offset by Id or Timestamp.
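
The throughput-unit arithmetic and the PartitionKey routing from slide 71 can be sketched as follows. The per-TU limits are the figures quoted on the slide; the function names are hypothetical, and the SHA-256 based hash is an assumption chosen only to make the example deterministic, since the slide does not say which hash function Event Hubs uses.

```python
import hashlib
import math

WRITE_MB_PER_TU = 1.0  # from the slide: 1 MB/s writes per throughput unit
READ_MB_PER_TU = 2.0   # from the slide: 2 MB/s reads per throughput unit

def throughput_units_needed(write_mb_s, read_mb_s, partition_count):
    """Smallest TU count that covers both ingress and egress."""
    needed = max(
        1,
        math.ceil(write_mb_s / WRITE_MB_PER_TU),
        math.ceil(read_mb_s / READ_MB_PER_TU),
    )
    # The slide's constraint: 1 <= TUs <= partition count.
    if needed > partition_count:
        raise ValueError("load needs more TUs than there are partitions")
    return needed

def partition_for_key(partition_key, partition_count):
    """Map a PartitionKey to a partition index (illustrative hash only)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count
```

Since cost is per TU, a workload writing 3.5 MB/s and reading 4 MB/s on 8 partitions would be sized at 4 TUs under these figures.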
  • 72. Pipeline stages: event producers, collection, ingestor (broker), transformation, long-term storage, presentation and action. Components: applications, legacy IoT (custom protocols), devices, IP-capable devices (Windows/Linux), low-power devices (RTOS); field gateways, cloud gateways (web APIs); Event Hubs, Service Bus; stream processing (Storm, Stream Analytics, IEventProcessor), storage adapters; Azure DBs, Azure storage, HDInsight; search and query, data analytics (Excel), web/thick client dashboards, devices to take action.
  • 86. * The tick tuple scheme is Storm’s built-in mechanism for generating tuples and sending them to each bolt in the topology at specified intervals. Worth checking: https://storm.apache.org/apidocs/backtype/storm/topology/TopologyBuilder.BoltGetter.html
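
How a bolt might react to tick tuples can be shown with a small plain-Python simulation. This is not the Storm API: the `TICK` sentinel, the `RollingCountBolt` class, and `run_stream` are all hypothetical, and the simulation sends a tick after every few events, whereas Storm sends ticks on a time interval.

```python
TICK = object()  # stand-in for a tick tuple sent by the system

class RollingCountBolt:
    """Counts incoming values and flushes the counts whenever a tick arrives."""

    def __init__(self):
        self.counts = {}
        self.emitted = []

    def execute(self, tup):
        if tup is TICK:
            # Tick: emit a snapshot of the current window and start a new one.
            self.emitted.append(dict(self.counts))
            self.counts.clear()
        else:
            self.counts[tup] = self.counts.get(tup, 0) + 1

def run_stream(bolt, events, tick_every):
    """Feed events to the bolt, injecting a tick after every tick_every events."""
    for index, event in enumerate(events, start=1):
        bolt.execute(event)
        if index % tick_every == 0:
            bolt.execute(TICK)
```

The point of the pattern is that the bolt needs no timer of its own; the periodic work is triggered by a tuple like any other input.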
  • 102. Data Journeys (Azure). Feeds: IoT, Data Sources. Orchestration: Service Bus, Event Hub, Data Factory. Compute: Stream Analytics, HDInsight, Machine Learning, Virtual Machines. Storage: Table Storage, Blob Storage, SQL Azure, DocumentDB. Visualisation: Power BI. Near real time analysis.
  • 103. Predictive Analytics (Azure): same component map as slide 102.
  • 104. Near real time analysis (Azure): same component map as slide 102.
  • 105. Big Data (Azure): same component map as slide 102.
  • 106. “Traditional” BI (Azure): same component map as slide 102.
  • 110. Service Fabric runs on Azure (Windows Server, Linux), Hosted Clouds (Windows Server, Linux), and Private Clouds (Windows Server, Linux). Capabilities: High Availability, Hyper-Scale, Hybrid Operations, High Density, Microservices, Rolling Upgrades, Stateful services, Low Latency, Fast startup & shutdown, Container Orchestration & lifecycle management, Replication & Failover, Simple programming models, Load balancing, Self-healing, Data Partitioning, Automated Rollback, Health Monitoring, Placement Constraints.