Building Modern Cloud Analytics Solution
Dmitry Anoshin
Outline
• About Me
• Role of Analytics
• History of Cloud
• Analytics powered by Microsoft Azure
• DW modernization Project
• Use cases and Challenges
• Alternative Solution with Azure
About Myself
• Working in Business Intelligence since 2007
#dimaworkplace
Technical Skills Matrix
[Figure: skills timeline with year markers 2007, 2010, 2015, and 2019, covering Data Warehouse, ETL/ELT, Business Intelligence, Big Data, Cloud Analytics (AWS, Azure, GCP), and Machine Learning]
Other Activities
Jumpstart Snowflake: A Step-by-Step Guide to Modern Cloud Analytics
• Victoria Power BI and Victoria SQL Server meetup
• Victoria and Vancouver Tableau User Group
• Conferences (EDW 2018, 2019, Data Architecture Summit)
• Amazon internal conferences
Role of Analytics
Business Value
Stakeholders Employees Customers
Value
“The goal of any organization is to generate value”
The Future of Competition.
https://www.amazon.com/Future-Competition-Co-Creating-Unique-Customers/dp/1578519535
BI Value Chain
Stakeholders Employees Customers
Value
Decisions
Data
Value creation is based on effective decisions
Effective decisions are based on accurate information
For data to be a differentiator, customers need to be able to…
• Capture and store new non-relational data at PB-EB scale in real time
• Discover value in new types of analytics that go beyond batch reporting to incorporate real-time, predictive, voice, and image recognition
• Democratize access to data in a secure and governed way
New types of analytics
Dashboards, Predictive, Image Recognition, Voice, Real-time
New types of data
Cloud Analytics
Introduction
Cloud Early History
• 1970: Time-sharing concept by GE
• 1977: Cloud symbol used in ARPANET
• 1990: VPN by telecom
• 1993: “Cloud” refers to distributed computing
• 1994: Cloud metaphor for virtualized services
Cloud Recent History
• 2002: AWS
• 2006: AWS Elastic Compute Cloud
• 2006: Google Docs
• 2008: Google App Engine
• 2008: Microsoft announced Azure
• 2010: Microsoft Azure
Why move to the Cloud?
• Elasticity
• Pay for what you need
• Fail fast
• Fast time to market
• Secure
• Reliable
• Business SLA
Downsides of on-premise solution
• Scale Constrained
• Up-front cost
• Maintenance Resources
• Tuning and Deployment
Cloud Restrictions -> Hybrid Clouds
• Sensitive Data
• Data Moving Cost
• Public/Private Cloud
Cloud Service Models
Cloud Service Models – friendly version
Cloud Analytics with Microsoft Azure
Microsoft Azure for Analytics
Data Analytics with Azure
Data Integration and Transformation
• Data Factory
• Integration Service
• Kafka
• Event Hub
Big Data
• Data Lake Gen 1
• Data Lake Gen 2
• Blob Storage
• HDInsight
• Data Lake Analytics
• Stream Analytics
• PolyBase
• Cosmos DB
Data Warehouse and Databases
• SQL DW
• Analysis Services
• SQL Database
• SQL Server in VM
• Cosmos DB
Analytics
• Analysis Services
• ML Analytics
• Business Intelligence
DW Modernization Use Case
BI/DW (before)
[Architecture diagram. Source Layer: Files, Inventory, Sales, SFTP. Storage Layer: Data Warehouse, ETL (PL/SQL). Access Layer: Ad-hoc SQL.]
Cloud Migration Strategy
Lift & Shift
• Typical approach
• Move all at once
• Target the platform first, then evolve
• Gets you to the cloud quickly
• Relatively small barrier to learning new technology, since it tends to be a close fit
Split & Flip
• Split the application into logical functional data layers
• Match the data functionality with the right technology
• Leverage the wide selection of tools on AWS to best fit the need
• Move data in phases: prototype, learn, and perfect
Migration Approach
Useful tools:
• Total Cost of Ownership (TCO) Calculator
• Azure Database Migration Service
• Azure Migration Assistant
Building Modern Data Platform with Microsoft Azure
Cloud Data Warehouse
What is Azure DW?
• Decoupled storage and compute
• MPP (massively parallel processing)
• Distribution styles: Hash / Round-robin / Replicate
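The three distribution styles can be illustrated with a small sketch. This is not how the service is implemented: the MD5-based hash is a stand-in chosen for illustration, and the function names are hypothetical. The only product detail relied on is that Azure SQL DW spreads each table over 60 distributions.

```python
import hashlib

NUM_DISTRIBUTIONS = 60  # Azure SQL DW spreads each table over 60 distributions


def hash_distribution(key: str) -> int:
    """Hash style: rows with the same key always land in the same distribution.

    MD5 is only an illustrative stand-in for the service's internal hash.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_DISTRIBUTIONS


def round_robin_distributions(n_rows: int) -> list[int]:
    """Round-robin style: rows are dealt out evenly, ignoring their content."""
    return [i % NUM_DISTRIBUTIONS for i in range(n_rows)]


def replicate(rows: list[dict], n_nodes: int) -> list[list[dict]]:
    """Replicate style: every compute node keeps a full copy of a small table."""
    return [list(rows) for _ in range(n_nodes)]
```

Hash distribution tends to suit large fact tables joined on the distribution key, round-robin suits staging tables with no obvious key, and replication suits small dimension tables.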
MPP?
SQL Database vs SQL Data Warehouse
What is Azure Data Factory?
Azure Data Factory (ADF) is Microsoft’s fully managed ELT service in the cloud, delivered as a Platform as a Service (PaaS).
Lack of Notification
Problem: Users miss notification emails, or the emails land in spam.
Solution: Send notifications to a messenger through webhooks (Slack, Chime, and so on).
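A minimal sketch of such a notification, assuming a Slack-style incoming webhook that accepts a JSON body with a `text` field. The function names, job name, and message wording are hypothetical, and other messengers may expect a different payload shape.

```python
import json
import urllib.request


def build_message(job_name: str, status: str, details: str = "") -> bytes:
    """Build the JSON body for a Slack-style incoming webhook."""
    text = f"ETL job '{job_name}' finished with status {status}."
    if details:
        text += f" {details}"
    return json.dumps({"text": text}).encode("utf-8")


def notify(webhook_url: str, job_name: str, status: str, details: str = "") -> int:
    """POST the message to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=build_message(job_name, status, details),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping `build_message` separate from `notify` makes the message format testable without calling the network.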
Lack of Logging
Problem: We had no detailed logs about ETL performance, so we had no insight into it.
Solution: Collect logs and events. We can now collect logs at any level of jobs and transformations.
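One way to collect such events is to wrap each job step in a context manager that records status and duration as a JSON log line. This is a sketch under assumptions: the `logged_step` helper, the field names, and the in-memory `events` list are hypothetical, not part of ADF or any ETL tool.

```python
import json
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("etl")


@contextmanager
def logged_step(job: str, step: str, events: list):
    """Record status and duration of one job step as a JSON event."""
    started = time.perf_counter()
    event = {"job": job, "step": step, "status": "success"}
    try:
        yield event  # the step may attach extra fields, e.g. row counts
    except Exception as exc:
        event["status"] = "failed"
        event["error"] = str(exc)
        raise  # the failure is logged, not swallowed
    finally:
        event["duration_s"] = round(time.perf_counter() - started, 3)
        events.append(event)
        logger.info(json.dumps(event))
```

A step would then run as `with logged_step("daily_sales", "load_orders", events) as ev: ev["rows"] = row_count`, where `row_count` comes from the step itself.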
Self-Service BI
Problem: Business users want an interactive, self-service tool, with fast time to market and less dependency on IT.
Solution: Implement a modern visual analytics platform.
Marketing Automation
Problem: The Marketing team wants to “move fast and break things”.
Solution: Using ADF, we gave Marketing template jobs, and they now run their jobs themselves.
Affiliates
Insights
Integration with BI
Problem: Having the best BI tool doesn’t guarantee a good SLA.
Solution: Build a trigger-based integration between Matillion ETL and Tableau. Add data quality checks.
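The idea of gating a BI refresh on data quality checks can be sketched as follows. The check functions, the `order_id` column, and the `trigger_refresh` callback are all hypothetical; the callback stands in for whatever call starts the refresh in the BI tool and is not a Matillion or Tableau API.

```python
from typing import Callable


def check_not_empty(rows: list[dict]) -> bool:
    """The load produced at least one row."""
    return len(rows) > 0


def check_no_null_keys(rows: list[dict]) -> bool:
    """Every row has a value in the (assumed) key column order_id."""
    return all(row.get("order_id") is not None for row in rows)


def publish_if_valid(
    rows: list[dict],
    checks: list[Callable[[list[dict]], bool]],
    trigger_refresh: Callable[[], None],
) -> list[str]:
    """Run all checks; trigger the BI refresh only if none fail.

    Returns the names of the failed checks (empty when data was published).
    """
    failed = [check.__name__ for check in checks if not check(rows)]
    if not failed:
        trigger_refresh()
    return failed
```

Returning the failed check names lets the caller forward them to a notification channel instead of silently skipping the refresh.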
Evolving to Cloud Data Analytics Platform
Streaming Data
Problem: The organization uses a NoSQL database and a mobile application, and it is critical to deliver near-real-time analytics.
Solution: Using Apache Kafka, we stream data into the data lake and query it in near real time.
[Diagram components: Mobile App, Cosmos DB, Kafka, Data Lake, Dashboard]
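A rough sketch of the landing step: events are read in micro-batches and routed to time-partitioned folders in the data lake so they can be queried soon after arrival. The event source here is a plain Python iterable standing in for a Kafka consumer, and the `ts` field, folder layout, and function names are assumptions made for illustration.

```python
from datetime import datetime, timezone
from itertools import islice
from typing import Iterable, Iterator


def partition_path(event: dict, root: str = "datalake/events") -> str:
    """Map an event to a date/hour folder based on its epoch timestamp (UTC)."""
    ts = datetime.fromtimestamp(event["ts"], tz=timezone.utc)
    return f"{root}/date={ts:%Y-%m-%d}/hour={ts:%H}"


def micro_batches(stream: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Cut an event stream into batches of at most `size` events."""
    iterator = iter(stream)
    while batch := list(islice(iterator, size)):
        yield batch


def route(stream: Iterable[dict], size: int = 100) -> dict[str, list[dict]]:
    """Group events by target folder, one micro-batch at a time."""
    folders: dict[str, list[dict]] = {}
    for batch in micro_batches(stream, size):
        for event in batch:
            folders.setdefault(partition_path(event), []).append(event)
    return folders
```

Partitioning by date and hour keeps near-real-time queries from scanning the whole lake.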
Clickstream Analytics
Problem: The business wants to analyze bot traffic and discover broken URLs. Access logs are ~50 GB per day, in 5,600 log files per day.
Solution: Use Databricks to produce Parquet files and store them in Azure Data Lake Gen2. Users can query the data with T-SQL and BI tools.
[Diagram components: Load Balancer, Access Logs, Blob Storage, Databricks, Parquet, Data Lake, Data Factory, SQL DW. Query with SQL or Databricks.]
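The parsing step behind this use case can be sketched in plain Python. The deck does not state the log format, so this assumes lines in the common "combined" access log layout. The bot markers, the use of status 404 as the broken-URL signal, and the function names are also assumptions. In the deck this work runs on Databricks; the sketch only shows the logic.

```python
import re
from collections import Counter

# Assumed "combined" access log layout: ip, timestamp, request, status, size,
# then optional referrer and user agent.
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)
BOT_MARKERS = ("bot", "crawler", "spider")  # illustrative, not exhaustive


def parse(line: str):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    agent = (record["agent"] or "").lower()
    record["is_bot"] = any(marker in agent for marker in BOT_MARKERS)
    return record


def summarize(lines) -> dict:
    """Count requests, bot requests, and 404 hits per URL."""
    total, bot_requests, broken = 0, 0, Counter()
    for line in lines:
        record = parse(line)
        if record is None:
            continue  # skip lines that do not match the assumed format
        total += 1
        bot_requests += record["is_bot"]
        if record["status"] == 404:
            broken[record["url"]] += 1
    return {"requests": total, "bot_requests": bot_requests, "broken_urls": dict(broken)}
```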
DevOps onboarding
Problem: The solution isn’t reliable and can easily break. As a result, end users have a bad experience, which affects business decisions.
Solution: Adopt a Continuous Integration methodology for the Cloud Data Platform
• Agile and Kanban board
• Code branching (Git)
• Gated check-ins
• Automated Tests
• Build
• Release
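As an illustration of the "Automated Tests" item, here is a sketch of a unit test that a gated check-in could run against a transformation. The `deduplicate_orders` function and its `order_id` and `updated_at` columns are hypothetical examples, not taken from the project.

```python
import unittest


def deduplicate_orders(rows: list[dict]) -> list[dict]:
    """Keep the latest version of each order, identified by order_id."""
    latest: dict = {}
    for row in rows:
        current = latest.get(row["order_id"])
        if current is None or row["updated_at"] > current["updated_at"]:
            latest[row["order_id"]] = row
    return sorted(latest.values(), key=lambda r: r["order_id"])


class DeduplicateOrdersTest(unittest.TestCase):
    def test_keeps_latest_version(self):
        rows = [
            {"order_id": 1, "updated_at": 10, "amount": 5},
            {"order_id": 1, "updated_at": 20, "amount": 7},
            {"order_id": 2, "updated_at": 15, "amount": 3},
        ]
        result = deduplicate_orders(rows)
        self.assertEqual([r["amount"] for r in result], [7, 3])

    def test_empty_input(self):
        self.assertEqual(deduplicate_orders([]), [])
```

In a CI pipeline, such tests would typically run on every check-in, with the build and release steps proceeding only when they pass.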
Evolving to Cloud Data Analytics Platform
Alternative Implementation
What is Matillion ETL?
What is Snowflake?


Editor's Notes

  • #14: The cloud symbol was used to represent networks of computing equipment in the original ARPANET as early as 1977. The term cloud was used to refer to platforms for distributed computing as early as 1993, when Apple spin-off General Magic and AT&T used it in describing their (paired) Telescript and PersonaLink technologies.