Say goodbye to data silos! Analytics in a Day will simplify and accelerate your journey towards the modern data warehouse. Join CCG and Microsoft for a half-day virtual workshop, hosted by James McAuliffe.
Organizations have been collecting, storing, and accessing data from the beginning of computerization. Insights gained from analyzing the data enable them to identify new opportunities, improve core processes, enable continuous learning and differentiation, remain competitive, and thrive in an increasingly challenging business environment.
The well-established data architecture, consisting of a data warehouse, fed from multiple operational data stores, and fronted by BI tools, has served most organizations well. However, over the last two decades, with the explosion of internet-scale data, and the advent of new approaches to data and computational processing, this tried-and-true data architecture has come under strain, and has created both challenges and opportunities for organizations.
In this green paper, we will discuss modern approaches to data architecture that have evolved to address these challenges and provide a framework for companies to build a data architecture and better adapt to increasing demands of the modern business environment. This discussion of data architecture will be tied to the Data Maturity Journey introduced in EQengineered’s June 2021 green paper on Data Modernization.
The document discusses the challenges of maintaining separate data lake and data warehouse systems. It notes that businesses need to integrate these areas to overcome issues like managing diverse workloads, providing consistent security and user management across use cases, and enabling data sharing between data science and business analytics teams. An integrated system is needed that can support both structured analytics and big data/semi-structured workloads from a single platform.
Say goodbye to data silos! Analytics in a Day will simplify and accelerate your journey towards the modern data warehouse. Join CCG and Microsoft for a two-day virtual workshop, hosted by James McAuliffe.
This white paper presents the opportunities created by data lakes and advanced analytics, as well as the challenges of integrating, mining, and analyzing the data collected from these sources. It covers the important characteristics of the data lake architecture and the Data and Analytics as a Service (DAaaS) model, and examines the features of a successful data lake and how to design one well. It also describes how data, applications, and analytics are strung together to speed up the generation of insights for industry, using the data lake as a powerful architecture for mining and analyzing unstructured data.
Analytics in a Day Ft. Synapse Virtual Workshop | CCG
Say goodbye to data silos! Analytics in a Day will simplify and accelerate your journey towards the modern data warehouse. Join CCG and Microsoft for a half-day virtual workshop, hosted by James McAuliffe.
Power BI Advanced Data Modeling Virtual Workshop | CCG
Join CCG and Microsoft for a virtual workshop, hosted by Solution Architect, Doug McClurg, to learn how to create professional, frustration-free data models that engage your customers.
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra... | Data Con LA
The document discusses how an Enterprise Data Lake (EDL) provides a more effective solution for enterprise BI and analytics compared to traditional enterprise data warehouses (EDW). It argues that EDL allows enterprises to retain all datasets, service ad-hoc requests with no latency or development time, and offer a low-cost, low-maintenance solution that supports direct analytics and reporting on data stored in its native format. The document promotes EDL as a mainstream solution that should be part of every mid-sized and large enterprise's standard IT stack.
Caserta Concepts, Datameer and Microsoft shared their combined knowledge and a use case on big data, the cloud and deep analytics. Attendees learned how a global leader in the test, measurement and control systems market reduced their big data implementations from 18 months to just a few.
Speakers shared how to provide a business user-friendly, self-service environment for data discovery and analytics, and focused on how to extend and optimize Hadoop-based analytics, highlighting the advantages and practical applications of deploying on the cloud for enhanced performance, scalability and lower TCO.
Agenda included:
- Pizza and Networking
- Joe Caserta, President, Caserta Concepts - Why are we here?
- Nikhil Kumar, Sr. Solutions Engineer, Datameer - Solution use cases and technical demonstration
- Stefan Groschupf, CEO & Chairman, Datameer - The evolving Hadoop-based analytics trends and the role of cloud computing
- James Serra, Data Platform Solution Architect, Microsoft - Benefits of the Azure Cloud Service
- Q&A, Networking
For more information on Caserta Concepts, visit our website: http://casertaconcepts.com/
Data Lakehouse, Data Mesh, and Data Fabric (r2) | James Serra
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a modern data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. They all may sound great in theory, but I'll dig into the concerns you need to be aware of before taking the plunge. I’ll also include use cases so you can see what approach will work best for your big data needs. And I'll discuss Microsoft's version of the data mesh.
Creating a Next-Generation Big Data Architecture | Perficient, Inc.
If you’ve spent time investigating Big Data, you quickly realize that the issues surrounding Big Data are often complex to analyze and solve. The sheer volume, velocity and variety change the way we think about data – including how enterprises approach data architecture.
Significant reduction in costs for processing, managing, and storing data, combined with the need for business agility and analytics, requires CIOs and enterprise architects to rethink their enterprise data architecture and develop a next-generation approach to solve the complexities of Big Data.
Creating the data architecture while integrating Big Data into the heart of the enterprise data architecture is a challenge. This webinar covered:
- Why Big Data capabilities must be strategically integrated into an enterprise’s data architecture
- How a next-generation architecture can be conceptualized
- The key components of a robust next-generation architecture
- How to incrementally transition to a next-generation data architecture
Whether you are interested in healthcare data analytics or looking to get started with big data and marketing, these fundamental principles from data experts will contribute to your success. http://www.qubole.com/new-series-big-data-tips/
This is a run-through at a 200 level of the Microsoft Azure Big Data Analytics for the Cloud data platform based on the Cortana Intelligence Suite offerings.
Data Lakehouse, Data Mesh, and Data Fabric (r1) | James Serra
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. I’ll include use cases so you can see what approach will work best for your big data needs.
Fixing data science & Accelerating Artificial Super Intelligence Development | ManojKumarR41
This presentation discusses challenges, problems, issues, measures, mistakes, opportunities, ideas, technologies, research and visions around data science.
HashGraph, Data Mesh, Data Trajectories, Citrix HDX and Anonos BigPrivacy
A combination of these five and a few other ideas will ultimately lead us to the VGB Platform. Another document will follow soon, explaining the vision and how exactly to work toward it to gradually develop this platform, which fixes data science efforts globally.
Chug building a data lake in Azure with Spark and Databricks | Brandon Berlinrut
- The document discusses building a data lake in Azure using Spark and Databricks. It begins with an introduction of the presenter and their experience.
- The rest of the document is organized into sections that discuss decisions around why to use a data lake and Azure/Databricks, how to build the lake by ingesting and organizing data, using Delta Lake for integrated and curated layers, securing the lake, and enabling analytics against the lake.
- The key aspects covered include getting data into the lake from various sources using custom Spark jobs, organizing the lake into layers, cataloging data, using Delta Lake for transactional tables, implementing role-based security, and allowing ad-hoc queries.
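The layered organization described above can be pictured with a minimal, standard-library sketch. It is not the presenter's implementation: it uses local folders and JSON files in place of Azure storage, Spark jobs, and Delta Lake tables, and the `raw` layer name, the `sales` dataset, the record fields, and every function name are hypothetical choices made only for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical layer names: the summary mentions "integrated" and "curated"
# layers; "raw" is an assumed name for the landing area.
LAYERS = ("raw", "integrated", "curated")
PART_FILE = "part-0000.json"


def layer_path(root: Path, layer: str, dataset: str) -> Path:
    """Return (and create) the folder for one dataset in one layer."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    path = root / layer / dataset
    path.mkdir(parents=True, exist_ok=True)
    return path


def ingest(root: Path, dataset: str, records: list) -> None:
    """Land source records unchanged in the raw layer."""
    (layer_path(root, "raw", dataset) / PART_FILE).write_text(json.dumps(records))


def integrate(root: Path, dataset: str) -> list:
    """Clean raw records: drop rows without an id, keep the last row per id."""
    raw = json.loads((layer_path(root, "raw", dataset) / PART_FILE).read_text())
    latest = {}
    for row in raw:
        if row.get("id") is not None:
            latest[row["id"]] = row  # a later row replaces an earlier one
    cleaned = list(latest.values())
    (layer_path(root, "integrated", dataset) / PART_FILE).write_text(json.dumps(cleaned))
    return cleaned


def curate(root: Path, dataset: str) -> dict:
    """Aggregate integrated records into total amount per region."""
    rows = json.loads((layer_path(root, "integrated", dataset) / PART_FILE).read_text())
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    (layer_path(root, "curated", dataset) / PART_FILE).write_text(json.dumps(totals))
    return totals


def run_demo() -> dict:
    """Push a tiny made-up dataset through all three layers."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        ingest(root, "sales", [
            {"id": 1, "region": "east", "amount": 10},
            {"id": 1, "region": "east", "amount": 12},     # duplicate id, replaces the row above
            {"id": 2, "region": "west", "amount": 5},
            {"id": None, "region": "west", "amount": 99},  # dropped: no id
        ])
        integrate(root, "sales")
        return curate(root, "sales")


if __name__ == "__main__":
    print(run_demo())
```

Keeping each layer in its own folder means raw data is never overwritten by cleaning or aggregation, so a later layer can be rebuilt from an earlier one. In a real lake, a transactional table format such as the Delta Lake tables mentioned above would take the place of the plain JSON files.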
This document provides a sector roadmap for cloud analytic databases in 2017. It discusses key topics such as usage scenarios, disruption vectors, and an analysis of companies in the sector. Some main points:
- Cloud databases can now be considered the default option for most selections in 2017 due to economics and functionality.
- Several newer cloud-native offerings have been able to leapfrog more established databases through tight integration of cloud features like elasticity and separation of compute and storage.
- While traditional database functionality is still required, cloud dynamics are causing needs for capabilities like robust SQL support, diverse data support, and dynamic environment adaptation.
- Vendor solutions are evaluated on disruption vectors including SQL support, optimization, elasticity, environment
Power BI for Big Data and the New Look of Big Data Solutions | James Serra
New features in Power BI give it enterprise tools, but that does not mean it automatically creates an enterprise solution. In this talk we will cover these new features (composite models, aggregations tables, dataflow) as well as Azure Data Lake Store Gen2, and describe the use cases and products of an individual, departmental, and enterprise big data solution. We will also talk about why a data warehouse and cubes still should be part of an enterprise solution, and how a data lake should be organized.
Big Data Analytics in the Cloud with Microsoft Azure | Mark Kromer
Big Data Analytics in the Cloud using Microsoft Azure services was discussed. Key points included:
1) Azure provides tools for collecting, processing, analyzing and visualizing big data including Azure Data Lake, HDInsight, Data Factory, Machine Learning, and Power BI. These services can be used to build solutions for common big data use cases and architectures.
2) U-SQL is a language for preparing, transforming and analyzing data that allows users to focus on the what rather than the how of problems. It uses SQL and C# and can operate on structured and unstructured data.
3) Visual Studio provides an integrated environment for authoring, debugging, and monitoring U-SQL scripts and jobs. This allows
Data Quality in the Data Hub with RedPointGlobal | Caserta
At a Big Data Warehousing Meetup, George Corugedo, CTO of RedPoint Global, demonstrated how to use your big data platform for data integration, data quality and identity resolution to provide a true 360-degree view of your customer on Hadoop using the RedPoint product.
For more information or questions, please contact us at www.casertaconcepts.com.
The document discusses Cassandra and how it is used by various companies for applications requiring scalability, high performance, and reliability. It summarizes Cassandra's capabilities and how companies like Netflix, Backupify, Ooyala, and Formspring have used Cassandra to handle large and increasing amounts of data and queries in a scalable and cost-effective manner. The document also describes DataStax's commercial offerings around Apache Cassandra including support, tools, and services.
In this presentation at DAMA New York, Joe started by asking a key question: why are we doing this? Why analyze and share all these massive amounts of data? Basically, it comes down to the belief that in any organization, in any situation, if we can get the data and make it correct and timely, insights from it will become instantly actionable for companies to function more nimbly and successfully. Enabling the use of data can be a world-changing, world-improving activity and this session presents the steps necessary to get you there. Joe explained the concept of the "data lake" and also emphasized the role of a strong data governance strategy that incorporates seven components needed for a successful program.
For more information on this presentation or Caserta Concepts, visit our website at http://casertaconcepts.com/.
The RNC recently tackled a massive data migration that will help them scale tremendously to support national campaigns at every level of government. Convergence Consulting Group supported the RNC in migrating their data from legacy on-premises systems to a Microsoft Azure Cloud data warehouse. The RNC and its partners can now utilize Microsoft Power BI to expose the data from anywhere with a few simple clicks. See some examples of recent polling data in the presentation. Questions? Contact us at (813) 265-3239.
Presentation on Data Mesh: the paradigm shift is a new type of ecosystem architecture, a shift left toward a modern distributed architecture that supports domain-specific data and treats data as a product, enabling each domain to handle its own data pipelines.
The document discusses Azure Synapse Analytics, a limitless analytics service that delivers insights from all data sources with unmatched speed. It provides a unified experience with Azure Synapse Studio for SQL, Apache Spark, pipelines, and BI/AI integration. Key capabilities include cloud-scale analytics, a modern data warehouse with SQL and Spark runtimes, and an integrated platform for AI/BI/continuous intelligence. Synapse Studio is the main interface with hubs for overview, data exploration, development, orchestration, and management.
Introduces Microsoft’s data platform for on-premises and cloud, and the challenges businesses face with data and data sources. Covers the evolution of database systems in the modern world, what businesses are doing with their data, and what new needs arise as industry landscapes change.
Dive into the opportunities available for businesses and industry verticals: those already identified and those not yet explored.
Understand Microsoft’s cloud vision and what the Azure platform offers as Infrastructure as a Service or Platform as a Service for building your own offerings.
Introduces and demos real-world scenarios and case studies in which businesses have used the cloud and Azure to create new and innovative solutions that unlock this potential.
Harnessing Microsoft Fabric and Azure Service Fabric Analytics as a Service a... | Microsoft Dynamics
Understand the key capabilities of Microsoft Fabric Services and how they offer solutions for today's data and analytics needs.
https://dynatechconsultancy.com/microsoft-fabric
IBM Cloud Pak for Data is a unified platform that simplifies data collection, organization, and analysis through an integrated cloud-native architecture. It allows enterprises to turn data into insights by unifying various data sources and providing a catalog of microservices for additional functionality. The platform addresses challenges organizations face in leveraging data due to legacy systems, regulatory constraints, and time spent preparing data. It provides a single interface for data teams to collaborate and access over 45 integrated services to more efficiently gain insights from data.
High-Performance Analytics in the Cloud with Apache Impala | Cloudera, Inc.
With more and more data being generated and stored in the cloud, you need a modern data platform that can extend to any environment so you can derive value from all your data. Cloudera Enterprise is the leading enterprise Hadoop platform for cloud deployments. It’s the easiest way to manage and secure Hadoop data across any cloud environment and includes component-level support for cloud-native object stores. This makes the platform uniquely suited to handle transient jobs like ETL and BI analytics, as well as persistent workloads like stream processing and advanced analytics.
With the recent release of Cloudera 5.8, Apache Impala (incubating) has added support for Amazon S3, enabling business analysts to get instant insights from all data through high-performance exploratory analytics and BI.
3 Things to learn:
Join David Tishgart, Director of Product Marketing, and James Curtis, Senior Analyst Data Platforms & Analytics at 451 Research, as they discuss:
* Best practices for analytic workloads in the cloud
* A live demo and real-world use cases
* What’s next for Cloudera and the cloud
Enable Better Decision Making with Power BI Visualizations & Modern Data Estate | CCG
Self-service BI empowers users to reach analytic outputs through data visualizations and reporting tools. Solution Architect and Cloud Solution Specialist, James McAuliffe, will be taking you through a journey of Azure's Modern Data Estate.
The document discusses challenges with traditional data warehousing and analytics including high upfront costs, difficulty managing infrastructure, and inability to scale easily. It introduces Amazon Web Services (AWS) and Amazon Redshift as a solution, allowing for easy setup of data warehousing and analytics in the cloud at low costs without large upfront investments. AWS services like Amazon Redshift provide flexible, scalable infrastructure that is easier to manage than traditional on-premise systems and enables organizations to more effectively analyze large amounts of data.
The document discusses Microsoft's approach to implementing a data mesh architecture using their Azure Data Fabric. It describes how the Fabric can provide a unified foundation for data governance, security, and compliance while also enabling business units to independently manage their own domain-specific data products and analytics using automated data services. The Fabric aims to overcome issues with centralized data architectures by empowering lines of business and reducing dependencies on central teams. It also discusses how domains, workspaces, and "shortcuts" can help virtualize and share data across business units and data platforms while maintaining appropriate access controls and governance.
To disrupt and innovate, you need access to data. All of your data. The challenge for many organisations is that the data they need is locked away in a variety of silos. And there's perhaps no bigger silo than one of the most widely deployed business applications: SAP. Bringing together all your data for analytics and machine learning unlocks new insights and business value. Together, Cloudera and Datavard hold the key to breaking SAP data out of its silo, providing access to unlimited and untapped opportunities that currently lie hidden.
Data lakes can scale with the cloud, break down integration barriers and data stored in silos, and pave the way for new business opportunities. All of this contributes to a better basis for decision-making for management and employees. Come and hear how.
David Bojsen, Architect, Microsoft
Big Data and Data Warehousing Together with Azure Synapse Analytics (SQLBits ... | Michael Rys
SQLBits 2020 presentation on how you can build solutions based on the modern data warehouse pattern with Azure Synapse Spark and SQL including demos of Azure Synapse.
Microsoft Fabric is the next version of Azure Data Factory, Azure Data Explorer, Azure Synapse Analytics, and Power BI. It brings all of these capabilities together into a single unified analytics platform that goes from the data lake to the business user in a SaaS-like environment. Therefore, the vision of Fabric is to be a one-stop shop for all the analytical needs of every enterprise and one platform for everyone from a citizen developer to a data engineer. Fabric will cover the complete spectrum of services including data movement, data lake, data engineering, data integration and data science, observational analytics, and business intelligence. With Fabric, there is no need to stitch together different services from multiple vendors. Instead, the customer enjoys an end-to-end, highly integrated, single offering that is easy to understand, onboard, create and operate.
This is a hugely important new product from Microsoft and I will simplify your understanding of it via a presentation and demo.
Agenda:
What is Microsoft Fabric?
Workspaces and capacities
OneLake
Lakehouse
Data Warehouse
ADF
Power BI / DirectLake
Resources
Unlock Data-driven Insights in Databricks Using Location IntelligencePrecisely
Today’s data-driven organisations are turning to Databricks for a cloud-based, open, unified platform for data and AI. Yet many companies struggle to unlock the value of the data they have in Databricks. To capitalise on the promise of a competitive edge through increased efficiency and insight, data scientists are turning to location to make sense of massive volumes of business data.
Watch this on-demand to hear from The Spatial Distillery Co. and Databricks on how to leverage advanced location intelligence and enrichment solutions in Databricks to:
- Simplify the complexity of location data and transform it into valuable insights
- Enrich data with thousands of attributes for better, more accurate analytics, AI, and ML models
- Leverage the power of Databricks to integrate geospatial data into business processes for real-time answers
- Create more meaningful and timely customer interactions by streamlining customer-facing and operational tasks
Think of big data as all data, no matter what the volume, velocity, or variety. The simple truth is a traditional on-prem data warehouse will not handle big data. So what is Microsoft’s strategy for building a big data solution? And why is it best to have this solution in the cloud? That is what this presentation will cover. Be prepared to discover all the various Microsoft technologies and products from collecting data, transforming it, storing it, to visualizing it. My goal is to help you not only understand each product but understand how they all fit together, so you can be the hero who builds your companies big data solution.
This document provides an overview of a course on implementing a modern data platform architecture using Azure services. The course objectives are to understand cloud and big data concepts, the role of Azure data services in a modern data platform, and how to implement a reference architecture using Azure data services. The course will provide an ARM template for a data platform solution that can address most data challenges.
The document discusses how organizations can leverage big data through Oracle's integrated big data solutions. It describes Oracle's offerings for acquiring and organizing big data from various sources using products like Oracle NoSQL Database and Hadoop. It then discusses how Oracle solutions allow users to analyze large datasets using R and visualize insights in BI dashboards. Finally, it provides an overview of Oracle's Exalytics and Big Data Appliance hardware and software platforms for processing and managing big data at scale.
The document discusses how data accessibility is driving innovation in manufacturing through cloud and vault data management systems. It outlines how data has become more disruptive as information needs to be accessed in real-time across sites and stakeholders. Those not leveraging their data may fall behind. The presentation will demonstrate how a cloud-based vault provides real-time accessibility, analytics, and concurrent engineering across organizations and on mobile devices. Key benefits include data centricity, productivity, flexibility, and reduced costs compared to on-premise systems. Attendees will understand how partners can help leverage data for business optimization through these solutions.
Introduction to Machine Learning with Azure & DatabricksCCG
Join CCG and Microsoft for a hands-on demonstration of Azure’s machine learning capabilities. During the workshop, we will:
- Hold a Machine Learning 101 session to explain what machine learning is and how it fits in the analytics landscape
- Demonstrate Azure Databricks’ capabilities for building custom machine learning models
- Take a tour of the Azure Machine Learning’s capabilities for MLOps, Automated Machine Learning, and code-free Machine Learning
By the end of the workshop, you’ll have the tools you need to begin your own journey to AI.
The document outlines several upcoming workshops hosted by CCG, an analytics consulting firm, including:
- An Analytics in a Day workshop focusing on Synapse on March 16th and April 20th.
- An Introduction to Machine Learning workshop on March 23rd.
- A Data Modernization workshop on March 30th.
- A Data Governance workshop with CCG and Profisee on May 4th focusing on leveraging MDM within data governance.
More details and registration information can be found on ccganalytics.com/events. The document encourages following CCG on LinkedIn for event updates.
How to Monetize Your Data Assets and Gain a Competitive AdvantageCCG
Join us for this session where Doug Laney will share insights from his best-selling book, Infonomics, about how organizations can actually treat information as an enterprise asset.
You had a strategy. You were executing it. You were then side-swiped by COVID, spending countless cycles blocking and tackling. It is now time to step back onto your path.
CCG is holding a workshop to help you update your roadmap and get your team back on track and review how Microsoft Azure Solutions can be leveraged to build a strong foundation for governed data insights.
Machine Learning with Azure and Databricks Virtual WorkshopCCG
Join CCG and Microsoft for a hands-on demonstration of Azure’s machine learning capabilities. During the workshop, we will:
- Hold a Machine Learning 101 session to explain what machine learning is and how it fits in the analytics landscape
- Demonstrate Azure Databricks’ capabilities for building custom machine learning models
- Take a tour of the Azure Machine Learning’s capabilities for MLOps, Automated Machine Learning, and code-free Machine Learning
By the end of the workshop, you’ll have the tools you need to begin your own journey to AI.
Join Brian Beesley, Director of Data Science, for an executive-level tour of AI capabilities. Get an inside peek at how others have used AI, and learn how you can harness the power of AI to transform your business.
Virtual Governance in a Time of Crisis WorkshopCCG
The CCGDG framework is focused on the following 5 key competencies. These 5 competencies were identified as areas within DG that have the biggest ROI for you, our customer. The pandemic has uncovered many challenges related to governance, therefore the backbone of this model is the emphasis on risk mitigation.
1. Program Management
2. Data Quality
3. Data Architecture
4. Metadata Management
5. Privacy
Advance Data Visualization and Storytelling Virtual WorkshopCCG
Join CCG and Microsoft for a virtual workshop, hosted by Senior BI Architect, Martin Rivera, taking you through a journey of advanced data visualization and storytelling.
In early 2019, Microsoft created the AZ-900 Microsoft Azure Fundamentals certification. This is a certification for all individuals, IT or non IT background, who want to further their careers and learn how to navigate the Azure cloud platform.
Learn about AZ-900 exam concepts and how to prepare and pass the exam
This document provides an overview and agenda for a Power BI Advanced training course. The course objectives are outlined, which include understanding data modeling concepts, calculated columns and measures, and evaluation contexts in DAX. The agenda lists the modules to be covered, including data modeling best practices, modeling scenarios, and DAX. Housekeeping items are provided, instructing participants to send questions to Sami and mute their lines. It is noted the session will be recorded.
This document provides an overview of Azure core services, including compute, storage, and networking options. It discusses Azure management tools like the portal, PowerShell, and CLI. For compute, it covers virtual machines, containers, App Service, and serverless options. For storage, it discusses SQL Database, Cosmos DB, blob, file, queue, and data lake storage. It also discusses networking concepts like load balancing and traffic management. The document ends with potential exam questions related to Azure services.
This document provides an agenda and objectives for an advanced Power BI training session. The agenda includes sections on Power BI M transformations, merge types, creating a BudgetFact table using multiple queries, and data profiling. The objectives are to understand M transformations, merging queries, using multiple queries for advanced transformations, and data profiling. Attendees will learn key M transformations like transpose, pivot columns, and unpivot columns. They will also learn about different merge types in Power BI.
This document provides an overview of Azure cloud concepts for exam preparation. It begins with an introduction to cloud computing benefits like scalability, reliability and cost effectiveness. It then covers Azure architecture including regions, availability zones and performance service level agreements. The document reviews cloud deployment models and compares infrastructure as a service, platform as a service and software as a service. It also discusses how to use the Azure pricing calculator and reduce infrastructure costs. Potential exam questions are provided at the end.
Business intelligence dashboards and data visualizations serve as a launching point for better business decision making. Learn how you can leverage Power BI to easily build reports and dashboards with interactive visualizations.
Data Governance and MDM | Profisse, Microsoft, and CCGCCG
CCG will introduce a methodology and framework for DG that allows organizations to assess DG faster, deriving actionable insights that can be quickly implemented with minimal disruption. CCG will also review how Microsoft Azure Solutions can be leveraged to build a strong foundation for governed data insights. In addition, Profisee will introduce a popular component of data governance, MDM.
Data Governance with Profisee, Microsoft & CCG CCG
1. The workshop agenda covers data governance fundamentals, assessing an organization's data governance maturity using the CCGDG framework, and prioritizing a roadmap for improvement.
2. The Profisee presentation promotes their master data management solution for enabling digital transformation by providing a single view of critical data across systems.
3. Profisee's solution focuses on five key areas: stewardship, matching configuration, adjusting the configuration, operational matching, and workflow management to ensure data quality.
[Webinar] Top Power BI Updates You *Acutally* Need to Know CCG
1)Summary of the over 25 feature improvements made by Power BI in 2019
2) Top ways to leverage the changes in functionality
3) Ways to get buy-in and further utilize your Microsoft Power BI investment
Key takeaways:
-Identify with the key reasons for failing Data Governance initiatives
-Uncover the commonly used Data Governance terms and their meanings
-Learn the Framework for a successful Data Governance Program
This document provides an overview of machine learning and how it can be used by organizations. It begins with definitions of key concepts like data science, advanced analytics, artificial intelligence, statistics, and machine learning. It then discusses why machine learning has become more feasible in recent years due to increases in data, computing power, and attention from researchers. Examples are given of common machine learning applications in areas like computer vision, natural language processing, and personalized recommendations. The document outlines the machine learning process of creating and evaluating statistical models on data to make predictions. It also discusses the roles of people, processes, technology, and data in successfully applying machine learning within an organization.
The document provides an overview of machine learning and how it works. It explains that machine learning involves using algorithms to generate models by analyzing large datasets to find patterns. The models are mathematical representations of relationships in the data that can be used to make predictions on new data. Machine learning works by giving historical data to algorithms which then determine the best model to use for making future predictions, rather than being explicitly programmed with a fixed model. This iterative process allows the models to become more accurate over time as more data is analyzed.
By James Francis, CEO of Paradigm Asset Management
In the landscape of urban safety innovation, Mt. Vernon is emerging as a compelling case study for neighboring Westchester County cities. The municipality’s recently launched Public Safety Camera Program not only represents a significant advancement in community protection but also offers valuable insights for New Rochelle and White Plains as they consider their own safety infrastructure enhancements.
Telangana State, India’s newest state that was carved from the erstwhile state of Andhra
Pradesh in 2014 has launched the Water Grid Scheme named as ‘Mission Bhagiratha (MB)’
to seek a permanent and sustainable solution to the drinking water problem in the state. MB is
designed to provide potable drinking water to every household in their premises through
piped water supply (PWS) by 2018. The vision of the project is to ensure safe and sustainable
piped drinking water supply from surface water sources
This comprehensive Data Science course is designed to equip learners with the essential skills and knowledge required to analyze, interpret, and visualize complex data. Covering both theoretical concepts and practical applications, the course introduces tools and techniques used in the data science field, such as Python programming, data wrangling, statistical analysis, machine learning, and data visualization.
1. Analytics in a Day
Cloud analytics in the age of
self-service and data science
2. Housekeeping
Please message Sami
with any questions,
concerns or if you
need assistance during
this workshop.
Please mute your line!
We will be applying mute.
This session will be
recorded.
If you do not want to be
recorded, please disconnect at
this time.
Links:
See chat window.
Worksheet:
See handouts.
To make presentation
larger, draw the
bottom half of screen
‘up’.
3. Agenda
Workshop Instruction & Discussion
9:00 – 10:00 The heart of analytics - why modernize the data warehouse?
10:00 – 11:00 Optimizing analytics with Azure Synapse
11:00 – 11:15 Insights for all with Power BI + Azure Synapse
Break for 15 min
Hands-on lab
11:30 – 1:00 Hands-on lab using the Azure Synapse Studio
Times are approximate and will be fluid with the class
4. James McAuliffe,
Cloud Solution Architect
James McAuliffe is a Cloud Solution Architect with over 20 years of technology
industry experience. During this journey into data and analytics, he’s held all
of the traditional Business Intelligence Solution project roles, ranging from
design and development to complete life cycle BI implementations. He is a
Microsoft Preferred Partner Solutions expert and has worked with clients of all
sizes, from local businesses to Fortune 500 companies.
And I like old Italian cars.
linkedin.com/in/jamesmcauliffesql/
8. 10%
of organizations are expected
to have a highly profitable
business unit specifically for
productizing and
commercializing data by 2020
$100M
The most digitally transformed
enterprises generate on
average $100 million in
additional operating income
each year
5,247GB
Approximate amount of data
for every man, woman and
child on earth in 2020
Data is a key strategic asset
9. Data Landscape – Volume and Pressure
IDC Data Age 2025 - The Digitization of the World
10. Data Landscape - Different Types of Data
• Mobile
• Social
• Scanners
• Sensors
• RFID
• Devices - IoT
• Feeds/APIs
• Other, non-traditional sources
85%
12. The heart of analytics
Section 1
Data businesses need
data warehouses
Section 2
Data warehouses &
data lakes come together
Section 3
BI & DW come together
Section 4
The cloud for modern analytics
Section 5
A new class of analytics
14. Is the data warehouse
still relevant?
What’s changed since 1988?
A 30-year-old architecture, still going strong
Commerce and technology
The data warehouse itself
16. All data businesses need
to be analytic businesses
Without analytics data is a cost center,
not a resource
17. Analytic businesses need
to evolve data science
Every business has opportunities to make
analytics faster, easier, and more insightful
18. Store
Data Ingestion Big Data Data Warehousing
The cloud data warehouse in the data-driven business
19. Data Ingestion Big Data Data Warehousing
Store
The cloud data warehouse in the data-driven business
20. Store
Data Ingestion Big Data Data Warehousing
Cloud data
SaaS data
On-premises data
Devices data
The cloud data warehouse in the data-driven business
26. 80%
report struggling to
become mature users
of data*
55%
report data silos and
data management
difficulties as roadblocks*
* Harvard Business Review (2019), Understanding why analytics strategies fall short for some, but not for others
Analytics & AI is the #1 investment for business leaders,
however they struggle to maximize ROI
27. Big data
Experimentation
Fast exploration
Semi-structured
Data science
OR
Relational data
Proven security & privacy
Dependable performance
Structured
Business analytics
Data lake Data warehouse
Businesses are forced to maintain
two critical, yet independent analytics systems
38. The new economy
thrives on data literacy
Communicating with data is a critical skill in
the new economy
39. Users and IT must come
together in the new enterprise
Get over the IT / business divide
40. Governance and self-service
enhance decision-making
Governance is not about making the right decisions,
it is about making decisions the right way
41. The importance of data models
BI models Power BI
• Built and maintained by business users or BI developers
• Use enterprise models, departmental data, and external sources
• Focused on a single subject area, but often widely shared
Machine Learning
models
Azure Synapse
Analytics
• Built and maintained by data scientists
• Mostly developed from raw sources in the data lake
• Often experimental, needing a data engineer for production use
Azure Synapse
Analytics
Enterprise models
• Built and maintained by IT architects
• Consolidated data from many systems
• Centralized as an authoritative source for reporting and analysis
42. Enterprise models in the
self-service environment
If business users
are tech-smart and
data literate, why
do they need
enterprise models?
Consistency
Some business processes can be built once and shared as a
corporate standard
Governance
Certain data sets need complex security and privacy controls
Efficiency
No need to repeat designing, preparing, loading, or securing
Line-of-business sources
Data ingestion &
transformation
Enterprise models
Azure Synapse
Analytics
Power BI
43. BI models in the enterprise
environment
If enterprise
models are so
important, why do
users need self-
service BI models?
Flexibility
Some data sets are temporary, external, or ad hoc and don’t need to
be consolidated
Efficiency
Tech-smart business users have fresh and innovative ideas they need to
explore with agility
Ad-hoc, departmental and
external sources
Line-of-business sources
Data ingestion &
transformation
Power BI
Enterprise models
Azure Synapse
Analytics
BI models
44. Section 4
The cloud for modern analytics
Data science models in the
enterprise environment
What is the role of
the data
warehouse with
data science?
Integrating results with enterprise models
Making the results of data science easily available for business functions
Serving enterprise data for data scientists
Helps ensure consistency across diverse analyses
Power BI
Azure Synapse
Analytics
Azure
Databricks
Enterprise models
Azure Synapse
Analytics
Data science results
49. Structured, unstructured, and streaming data
integrated in a single, scalable, environment
A cloud analytics platform
is the hub for all data models
50. BI
Bring together the best of both worlds with the market-
leading BI service and the industry-leading analytics platform
Power BI can analyze and visualize
massive volumes of data
Azure Synapse Analytics provides a
scalable platform to enable real-time BI
Analytics
51. Section 5
A new class of analytics
Power BI can analyze and
visualize massive volumes of data
Azure Synapse Analytics
provides a scalable platform
to enable real-time BI
Azure Machine Learning natively
integrates with Azure Synapse &
Power BI to democratize AI across
your business
BI Analytics Machine learning
Bring together the best of both worlds with the market-
leading BI service and the industry-leading analytics platform
53. Unified experience
Azure Synapse Studio
Integration Management Monitoring Security
Analytics runtimes
SQL
Azure Data Lake Storage
Azure Machine
Learning
On-premises data
Cloud data
SaaS data
Streaming data
Power BI
Azure Synapse lies at the heart of business, AI, and BI
Azure Synapse Analytics
54. Unified experience
Azure Synapse Studio
Integration Management Monitoring Security
SQL
Azure Data Lake Storage
Azure Machine
Learning
On-premises
data
Cloud
data
SaaS data
Streaming
data
Cloud analytics has taken a leap forward
with a unified, unmatched platform
Azure Synapse Analytics
Power BI
57. Introducing Azure
Synapse Analytics
A limitless analytics service with unmatched
time to insight, that delivers insights from all
your data, across data warehouses and big
data analytics systems, with blazing speed
Simply put, Azure Synapse is Azure SQL Data
Warehouse evolved
We have taken the same industry leading data
warehouse and elevated it to a whole new level of
performance and capabilities
58. Azure Synapse
Analytics
Snowflake
Standard
Amazon
Redshift
Google
BigQuery
per byte
$33
$103
$48
…$564
94% less
TPC-H benchmark comparison
Price-performance | Lower is better
* GigaOm TPC-H benchmark report, January 2019, “GigaOm report: Data Warehouse in the Cloud Benchmark”
With the best price-performance
in the business
Up to 14x faster and costs 94%
less than other cloud providers
A breakthrough in the cost of enterprise analytics
59. Data consolidation using
Azure Synapse Analytics
Migration to the cloud for
efficient business operations
Using Azure Synapse Analytics
for predictive analytics
Organizations that fully harness their data outperform
60. At the core of all use cases is…Azure Synapse Analytics
Real-time
analytics
Modern data
warehousing
Advanced
analytics
"We want to analyze
data coming from
multiple sources and
in varied formats"
"We want to leverage
the analytics platform
for advanced fraud
detection"
“We’re trying to get
insights from our
devices in real-time”
Cloud-scale analytics
64. Query and analyze data with
T-SQL using both provisioned
and serverless models
Quickly create notebooks with
your choice of Python, Scala,
SparkSQL, and .NET for
Apache Spark
Build end-to-end workflows
for your data movement and
data processing scenarios
Execute all data tasks with a
simple UI and unified
environment
Azure Synapse Analytics
Synapse SQL
Apache Spark
for Synapse
Synapse Pipelines Synapse Studio
65. Integrated analytics platform for AI, BI, and continuous intelligence
Platform
Azure
Data Lake Storage
Common Data Model
Enterprise Security
Optimized for Analytics
Data lake integrated and Common Data
Model aware
METASTORE
SECURITY
MANAGEMENT
MONITORING
Integrated platform services
for management, security, monitoring,
and metastore
DATA INTEGRATION
Analytics Runtimes
Integrated analytics runtimes available
provisioned and serverless
Synapse SQL offering T-SQL for batch,
streaming, and interactive processing
Synapse Spark for big data processing
with Python, Scala, R and .NET
PROVISIONED (DW) SERVERLESS
Form Factors
SQL
Languages
Python .NET Java Scala R
Multiple languages suited to different
analytics workloads
Experience Synapse Studio
SaaS developer experiences for code
free and code first
Artificial Intelligence / Machine Learning / Internet of Things
Intelligent Apps / Business Intelligence
Designed for analytics workloads at any
scale
Azure Synapse Analytics
66. Integrated analytics platform for AI, BI, and continuous intelligence
Platform
Azure
Data Lake Storage
Common Data Model
Enterprise Security
Optimized for Analytics
METASTORE
SECURITY
MANAGEMENT
MONITORING
DATA INTEGRATION
Analytics Runtimes
PROVISIONED (DW) SERVERLESS
Form Factors
SQL
Languages
Python .NET Java Scala R
Experience Synapse Studio
Artificial Intelligence / Machine Learning / Internet of Things
Intelligent Apps / Business Intelligence
Azure Synapse Analytics
Connected Services
Azure Data Catalog
Azure Data Lake Storage
Azure Data Share
Azure Databricks
Azure HDInsight
Azure Machine Learning
Power BI
3rd Party Integration
69. Synapse Studio is divided into
Activity hubs
Hubs organize the tasks needed for
building analytics solutions
Synapse Studio
Overview: Quick access to common gestures, most-recently used items, and links to tutorials and documentation.
Data: Explore structured and unstructured data.
Develop: Write code and define the business logic of the pipeline via notebooks, SQL scripts, data flows, etc.
Orchestrate: Design pipelines that move and transform data.
Monitor: Centralized view of all resource usage and activities in the workspace.
Manage: Configure the workspace, pool, and access to artifacts.
80. Author SQL Scripts
Execute SQL script on provisioned SQL
Pool or SQL Serverless
Publish individual SQL script or multiple
SQL scripts through Publish all feature
Support for languages and IntelliSense
Develop hub -
SQL scripts
81. View results in table or chart form and
export results in several popular formats
Develop hub -
SQL scripts
82. Data flows are a visual way of
specifying how to transform data,
providing a code-free experience
Develop hub -
Data flows
83. Develop hub –
Power BI
Create Power BI reports in the workspace
Provide access to published reports in the
workspace
Update reports in real time from Synapse
workspace and show on Power BI service
Visually explore and analyze data
85. Best-in-class
Price-performance is calculated by GigaOm as the TPC-H metric of cost of ownership divided by composite query.
Results based on GigaOm’s TPC-H results, published in January 2019
Leader in price per performance
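The price-performance figure described above is a ratio, so it can be sketched in a few lines. The function name and the sample numbers below are hypothetical and are not taken from the GigaOm report; they only show how a lower cost per unit of query throughput gives a lower (better) score.

```python
def price_performance(cost_of_ownership: float, composite_queries_per_hour: float) -> float:
    """Cost of ownership divided by composite query throughput (lower is better)."""
    if composite_queries_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return cost_of_ownership / composite_queries_per_hour


# Hypothetical inputs, for illustration only.
platform_a = price_performance(cost_of_ownership=1_000_000, composite_queries_per_hour=10_000)
platform_b = price_performance(cost_of_ownership=1_500_000, composite_queries_per_hour=5_000)
cheaper = "A" if platform_a < platform_b else "B"
```

With these made-up inputs, platform A wins because it delivers twice the throughput at two thirds of the cost.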
87. Price-performance @ 30TB
Lower is better
[Bar chart comparing Amazon Redshift, Google BigQuery Flat Rate, Azure Synapse Analytics, and Snowflake Standard on a dollar axis from $0 to $600; bar labels as extracted: $1310, $570, $309, $206, $286, $153]
Best-in-class
Price-performance is calculated by GigaOm as the TPC-H metric of cost of ownership divided by composite query.
Results based on GigaOm’s TPC-H results, published in January 2019
89. -- T-SQL syntax for scoring data in SQL DW
SELECT d.*, p.Score
FROM PREDICT(MODEL = @onnx_model, DATA = dbo.mytable AS d)
WITH (Score float) AS p;
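As a rough illustration of what the query above returns, the sketch below appends a Score column to every input row. It is plain Python, not the Synapse PREDICT implementation; `toy_model`, `score_rows`, and the sample rows are hypothetical stand-ins for an uploaded ONNX model and `dbo.mytable`.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


def toy_model(row: Row) -> float:
    """Hypothetical stand-in for an uploaded model: a fixed linear score."""
    return 0.5 * row["amount"] + 2.0 * row["visits"]


def score_rows(rows: List[Row], model: Callable[[Row], float]) -> List[Row]:
    """Return each input row with an added 'Score' column, like SELECT d.*, p.Score."""
    return [{**row, "Score": model(row)} for row in rows]


table = [{"amount": 10.0, "visits": 1.0}, {"amount": 4.0, "visits": 3.0}]
scored = score_rows(table, toy_model)
```

The input rows are left untouched and the scores travel alongside the data, which mirrors the slide's point that scoring happens where the data lives.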
Upload
models
Machine learning
enabled DW
Native PREDICT-ion
T-SQL based experience
(interactive/batch scoring)
Interoperability with other
models built elsewhere
Scoring executed where the
data lives
T-SQL Language
Data Warehouse
Data
+
Score models
Model Predictions
=
Synapse SQL
Create models
90. Event Hubs
IoT Hub
T-SQL language
Built-in streaming ingestion & analytics
Streaming Ingestion Data Warehouse
Synapse SQL
Heterogeneous
data preparation
and ingestion
Native SQL streaming
High throughput ingestion
(up to 200MB/sec)
Delivery latencies in seconds
Ingestion throughput scales with
compute scale
Analytics capabilities
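A quick back-of-the-envelope check of the throughput figure above: at the stated ceiling of 200MB/sec, how long would a given volume take to ingest? The helper name and the 72 GB example are assumptions, and decimal units (1 GB = 1000 MB) are assumed.

```python
def ingest_seconds(volume_gb: float, throughput_mb_per_sec: float = 200.0) -> float:
    """Seconds needed to ingest volume_gb at a sustained throughput (1 GB = 1000 MB)."""
    if throughput_mb_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return volume_gb * 1000.0 / throughput_mb_per_sec


seconds = ingest_seconds(72)   # hypothetical 72 GB batch of events
minutes = seconds / 60
```

This treats the ceiling as a sustained rate, which real workloads may not reach.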
91. Empower more users
per data warehouse
Leverage up to 128 concurrent
slots, simultaneously, on a single
data warehouse
Number of simultaneous workloads
increases with data warehouse capacity
Utilize preset functions to allocate
resources that need them the most
92. Intra cluster workload isolation
(Scale in)
Marketing
CREATE WORKLOAD GROUP Sales
WITH
(
[ MIN_PERCENTAGE_RESOURCE = 60 ]
[ CAP_PERCENTAGE_RESOURCE = 100 ]
[ MAX_CONCURRENCY = 6 ] )
40%
Data
warehouse
Local In-Memory + SSD Cache
Compute
1000c DWU
60%
Sales
60%
100%
Workload aware
query execution
Workload isolation
Multiple workloads share
deployed resources
Reservation or shared resource
configuration
Online changes to workload policies
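The reservation arithmetic on this slide (Sales reserves 60%, leaving 40% shared) can be sketched as follows. This models only the percentages, not the Synapse scheduler; the class and function names are hypothetical, with fields named after the options shown in the statement above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkloadGroup:
    name: str
    min_percentage_resource: int   # share reserved exclusively for this group
    cap_percentage_resource: int   # most this group may ever consume


def shared_pool_percent(groups: List[WorkloadGroup]) -> int:
    """Percentage of resources left unreserved and shared by all workloads."""
    reserved = sum(g.min_percentage_resource for g in groups)
    if reserved > 100:
        raise ValueError("reserved percentages exceed 100")
    for g in groups:
        if g.cap_percentage_resource < g.min_percentage_resource:
            raise ValueError(f"{g.name}: cap is below the reserved minimum")
    return 100 - reserved


sales = WorkloadGroup("Sales", min_percentage_resource=60, cap_percentage_resource=100)
shared = shared_pool_percent([sales])
```

Because the Sales cap is 100, Sales can also draw on the shared 40% when other workloads such as Marketing are idle.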
93. Cluster N
Multi-clusters
(Scale out)
Sales Marketing
Finance
Data Warehouses
Workload
Management
Scale-out Clusters
Independent elasticity,
pause, and resume
Highest performance
Physical workload isolation
Highest concurrency
Chargeback per cluster
94. Benefits:
• Most predictable cost
• Most efficient for unpredictable workloads
• No cache eviction for scaling (no performance cliff)
• Workload isolation
• Single endpoint (auto isolation with classification)
Benefits:
• Maximize cluster throughput
• Workload aware query scheduling
• Fine grained cluster scaling
Benefits:
• Best performance
• Physical workload isolation
• Chargeback
• Highest concurrency
Intra-cluster workload isolation
(scale in)
Marketing
Sales
60%
40%
Data
Warehouse
Autonomous workload balancing
Cluster
1
Cluster
2
Cluster
3
Data
Warehouse
Cluster
N
Multi-clusters
(scale out)
Data
Warehouse
95. CREATE MATERIALIZED VIEW vw_ProductSales
WITH (DISTRIBUTION = HASH(ProductKey))
AS
SELECT
ProductName,
ProductKey,
SUM(Amount) AS TotalSales
FROM
FactSales fs
INNER JOIN DimProduct dp ON fs.prodkey = dp.prodkey
GROUP BY
ProductName,
ProductKey
See more by scaling
to petabytes
96. ProductName ProductKey TotalSales
Product A 5453 784,943.00
Product B 763 48,723.00
… … …
FactSales
Table
10B Records
DimProduct
Table
1,000 Records
Materialized View
(1000 Records)
See more by scaling
to petabytes
FactInventory
Table
mvw_ProductSales
1,000 Records
CREATE MATERIALIZED VIEW
mvw_ProductSales
WITH (DISTRIBUTION = HASH(ProductKey))
AS
SELECT
ProductName,
ProductKey,
SUM(Amount) AS TotalSales
FROM
FactSales fs
INNER JOIN DimProduct dp
ON fs.prodkey = dp.prodkey
GROUP BY
ProductName,
ProductKey
SELECT
<COLUMNS>
FROM FactSales fs
INNER JOIN (
SELECT
ProductName,
ProductKey,
SUM(Amount) AS TotalSales
FROM
FactSales fs
INNER JOIN DimProduct dp
GROUP BY
ProductName,
ProductKey ) ps
INNER JOIN FactInventory
GROUP BY …
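The idea behind the materialized view above is to compute the join and aggregation once and let later queries read the small result. A minimal Python sketch of that pre-aggregation, using hypothetical miniature tables in place of the slide's 10B-row FactSales:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical miniature tables; the slide's FactSales has 10B rows.
fact_sales = [
    {"prodkey": 1, "Amount": 100.0},
    {"prodkey": 1, "Amount": 50.0},
    {"prodkey": 2, "Amount": 25.0},
]
dim_product = {1: "Product A", 2: "Product B"}   # prodkey -> ProductName


def build_product_sales(facts: List[dict], products: Dict[int, str]) -> Dict[Tuple[str, int], float]:
    """Join facts to products and SUM(Amount) per (ProductName, ProductKey)."""
    totals: Dict[Tuple[str, int], float] = defaultdict(float)
    for row in facts:
        key = row["prodkey"]
        if key in products:                       # INNER JOIN: skip unmatched keys
            totals[(products[key], key)] += row["Amount"]
    return dict(totals)


# Computed once, then reused by later queries instead of rescanning fact_sales.
mvw_product_sales = build_product_sales(fact_sales, dim_product)
```

A real materialized view is also kept up to date by the engine as the base tables change; this sketch leaves that part out.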
97. Execution 2
Cache Hit
~.2 seconds
Execution 1
Cache Miss
Regular
Execution
SELECT
ProductName,
ProductKey,
SUM(Amount) AS TotalSales
FROM
FactSales
INNER JOIN DimProduct
GROUP BY
ProductName,
ProductKey
Build confidence in your
data with result set cache
Data
Warehouse
Resultset
Cache
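The cache-miss then cache-hit sequence on this slide can be sketched with a small in-memory cache keyed by query text. This is an illustration only, not how Synapse implements result set caching; `ResultSetCache` and `fake_warehouse` are hypothetical names.

```python
from typing import Callable, Dict, List


class ResultSetCache:
    """Cache query results keyed by the exact query text."""

    def __init__(self, execute: Callable[[str], List[tuple]]) -> None:
        self._execute = execute
        self._results: Dict[str, List[tuple]] = {}
        self.hits = 0
        self.misses = 0

    def run(self, query: str) -> List[tuple]:
        if query in self._results:
            self.hits += 1                 # Execution 2: served from the cache
            return self._results[query]
        self.misses += 1                   # Execution 1: regular execution
        rows = self._execute(query)
        self._results[query] = rows
        return rows

    def invalidate(self) -> None:
        """Drop cached results, for example after the underlying data changes."""
        self._results.clear()


def fake_warehouse(query: str) -> List[tuple]:
    """Hypothetical stand-in for an expensive warehouse query."""
    return [("Product A", 5453, 784943.00)]


cache = ResultSetCache(fake_warehouse)
first = cache.run("SELECT ProductName, ProductKey, SUM(Amount) ...")
second = cache.run("SELECT ProductName, ProductKey, SUM(Amount) ...")
```

The second call returns the stored rows without touching the warehouse, which is where the sub-second response on the slide comes from.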
98. Most secure data
warehouse in the cloud
Multiple levels of security between the
user and the data warehouse
...at no additional cost
Threat Protection
Network Security
Authentication
Access Control
Data Protection
Customer Data
99. Comprehensive security
Category: Feature
Data protection: Data in transit; Data encryption at rest; Data discovery and classification
Access control: Object level security (tables/views); Row level security; Column level security; Dynamic data masking
Authentication: SQL login; Azure Active Directory; Multi-factor authentication
Network security: Virtual networks; Firewall; Azure ExpressRoute
Threat protection: Threat detection; Auditing; Vulnerability assessment
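Two of the access-control features listed above, row level security and dynamic data masking, can be illustrated with a short sketch. This is plain Python showing the concepts, not the Synapse feature set; the table, the region-based rule, and the masking format are all assumptions.

```python
from typing import Dict, List

customers = [
    {"region": "East", "name": "Ana", "email": "ana@example.com"},
    {"region": "West", "name": "Ben", "email": "ben@example.com"},
]


def mask_email(email: str) -> str:
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"


def visible_rows(rows: List[Dict[str, str]], user_region: str, can_unmask: bool) -> List[Dict[str, str]]:
    """Row-level filter by region, then mask the email column unless permitted."""
    allowed = [r for r in rows if r["region"] == user_region]
    if can_unmask:
        return [dict(r) for r in allowed]
    return [{**r, "email": mask_email(r["email"])} for r in allowed]


east_view = visible_rows(customers, user_region="East", can_unmask=False)
```

The row filter decides which records a user sees at all, while masking decides how much of a sensitive column is revealed within those records.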
101. Discovery and
exploration
What’s in this file? How many rows are there? What’s the max value?
SQL serverless reduces data lake exploration to the right-click
Data
transformation
How to convert CSVs to Parquet quickly? How to transform the raw data?
Use the full power of T-SQL to transform the data in the data lake
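The exploration questions above (what is in this file, how many rows, what is the max value) can be sketched against an in-memory CSV. This uses the Python standard library as a stand-in for a serverless SQL query over the data lake; the file contents and the `explore` helper are hypothetical.

```python
import csv
import io

# Hypothetical file contents standing in for a CSV in the data lake.
raw_csv = "device_id,reading\na1,20.5\nb2,31.0\nc3,18.25\n"


def explore(csv_text: str, numeric_column: str) -> dict:
    """Answer: which columns, how many rows, and the max of one numeric column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    return {
        "columns": reader.fieldnames,
        "row_count": len(rows),
        "max_value": max(float(r[numeric_column]) for r in rows),
    }


summary = explore(raw_csv, "reading")
```

The appeal of the serverless model is that the same three answers come from a query over the file in place, with no load step first.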
102. Overview
An interactive query service that provides T-SQL
queries over high scale data in Azure Storage.
Benefits
Serverless
No infrastructure
Pay only for query execution
No ETL
Offers security
Data integration with Databricks, HDInsight
T-SQL syntax to query data
Supports data in various formats
(Parquet, CSV, JSON)
Support for BI ecosystem
Diagram labels: Azure Storage; SQL serverless (query); Power BI; Azure Data Studio; SSMS; DW; read and write data files; curate and transform data; sync table definitions
Azure Synapse Analytics > SQL > SQL serverless
104. Allows multiple languages in one notebook
%%<Name of language>
Offers use of temporary tables across languages
Supports syntax highlighting, syntax error detection, code
completion, smart indent, and code folding
Export results
Quickly create &
configure notebooks
105. As notebook cells run, the underlying
Apache Spark application status is
shown, providing immediate feedback
and progress tracking.
Quickly create &
configure notebooks
107. Overview
Linked services define the connection
information needed for pipelines to connect to
external resources
Benefits
Offers 85+ pre-built connectors
Allows easy cross platform data migration
Represents data store or compute resources
108. Prep and transform data
Mapping dataflow
Code free data transformation at scale
Wrangling dataflow
Code free data preparation at scale
109. Data flow capabilities
Handle upserts, updates, deletes on SQL sinks
Add new partition methods
Add schema drift support
Add file handling (move files after read, write files to file names described in rows, etc.)
New inventory of functions (e.g. hash functions for row comparison)
Commonly used ETL patterns (Sequence generator/Lookup transformation/SCD…)
Data lineage: capturing sink column lineage & impact analysis (invaluable if this is for enterprise deployment)
Implement commonly used ETL patterns as templates (SCD type1, type2, data vault)
111. Insights for all with
Power BI + Azure
Power up your BI with Azure Synapse
112. 2020 Gartner Magic Quadrant for Analytics and Business Intelligence Platforms
113. Where do you find yourself on the curve?
Hindsight Insight Foresight
Value
Difficulty
What happened?
Descriptive Analysis
Why did it happen?
Diagnostic Analysis
What will happen?
Predictive Analysis
How can we make it happen?
Prescriptive Analysis
114. Where do you find yourself on the curve?
Hindsight Insight Foresight
Value
Difficulty
What happened?
Descriptive Analysis
Why did it happen?
Diagnostic Analysis
What will happen?
Predictive Analysis
How can we make it happen?
Prescriptive Analysis
BI
115. BI + Analytics unlock the door to AI, machine learning, and
real-time insights
Hindsight Insight Foresight
Value
Difficulty
What happened?
Descriptive Analysis
Why did it happen?
Diagnostic Analysis
What will happen?
Predictive Analysis
How can we make it happen?
Prescriptive Analysis
Analytics
BI
116. BI
Bring together the best of both worlds with the market-
leading BI service and the industry-leading analytics platform
Power BI can analyze and visualize
massive volumes of data
Azure Synapse Analytics provides a
scalable platform to enable real-time BI
Analytics
117. Power BI can analyze and
visualize massive volumes of data
Azure Synapse Analytics
provides a scalable platform
to enable real-time BI
Azure Machine Learning natively
integrates with Azure Synapse &
Power BI to democratize AI across
your business
BI Analytics Machine learning
Bring together the best of both worlds with the market-
leading BI service and the industry-leading analytics platform
118. Accelerate business value with a powerful analytics platform
Business analysts IT professionals Data scientists
Frictionless
collaboration
Unified
analytics platform
Advanced analytics
and AI
Powerful visualization and
reporting
Unmatched
capabilities
Business value
Common Data Model on Azure Data Lake Storage
Unified data
Azure Synapse Analytics
Power BI
Powerful and
integrated
tooling
Azure Machine Learning
119. Visualize and
report
Power BI
Model &
serve
Azure Synapse
Analytics
CDM folders
Azure Data Lake
Storage
Respond instantly
Enable instant response times with
Power BI Aggregations on massive
datasets when querying at the
aggregated level
Get granular with your data
Queries at the granular level are
sent to Azure Synapse Analytics
with DirectQuery leveraging its
industry-leading performance
Save money with industry-
leading performance
Azure Synapse Analytics is up to
14x faster and 94% cheaper than
other cloud providers
View reports with a single pane
of glass
Skip the configuration when
connecting to Power BI with
integrated Power BI-authoring
directly in the Azure Synapse Studio
Accelerate business value with a powerful analytics platform
120. Customers using Azure Synapse & Power BI today
are transforming their business with purpose
27%
Faster time
to insights
271% Average ROI
26%
Lower total cost
of ownership
60%
Increased customer
satisfaction
* Forrester, October 2019, “The Total Economic Impact of Microsoft Azure Analytics with Power BI”
121. Build Power BI dashboards directly
from Azure Synapse
Azure Synapse + Power BI integration
126. Exercise 1 - Explore the data lake with Azure Synapse SQL On-demand and Azure Synapse Spark
Exercise 2 - Build a Modern Data Warehouse with Azure Synapse Pipelines
Exercise 3 - Power BI integration
Exercise 4 - High Performance Analysis with Azure Synapse SQL Pools
Exercise 5 - Data Science with Azure Synapse Spark
128. Analytics in a Day
Thank You!
James McAuliffe
https://www.linkedin.com/in/jamesmcauliffesql/
https://ccganalytics.com/
129. Get Started Today
Create a free Azure account and get started with Azure Synapse Analytics:
https://azure.microsoft.com/en-us/free/synapse-analytics/
Get in touch with us:
https://info.microsoft.com/ww-landing-contact-me-azure-analytics.html
Learn more:
https://aka.ms/synapse
Get the Azure Synapse Analytics Toolkit
130. Power BI COVID Crisis Response Resources
Power BI & COVID-19
Keeping citizens informed
Find out more at: https://aka.ms/pbicovid19
Crisis Communications App
https://aka.ms/crisis-communication-app-docs
Emergency Response Solution
https://aka.ms/emergency-response-doc
131. The Ignite Book of News
https://news.microsoft.com/wp-content/uploads/prod/sites/563/2019/11/Ignite-2019-Book-of-News-2.pdf
132. References and Links
Azure Synapse Analytics
Get the Azure Synapse Analytics Toolkit
Azure Synapse is Azure SQL Data Warehouse evolved
Analytics Primer in 60 minutes with Microsoft Azure
Accelerate Time to Analytics with Azure Synapse Analytics
Build 2020
Data Warehouse in the Cloud Benchmark
Overview of Microsoft Azure compliance
Microsoft Compliance Offerings
2020 Gartner Magic Quadrant for Analytics and Business Intelligence Platforms
The Digitization of the World from Edge to Core
The Total Economic Impact of Microsoft Azure Analytics with Power BI
Azure Data Factory Overview
Power BI Governance Admin
135. Industry-leading compliance
HIPAA / HITECH
IRS 1075
Section 508 VPAT
ISO 27001
PCI DSS Level 1
SOC 1 Type 2
SOC 2 Type 2
ISO 27018
Cloud Controls Matrix
Content Delivery and Security Association
Singapore MTCS Level 3
United Kingdom G-Cloud
China Multi Layer Protection Scheme
China CCCPPF
China GB 18030
European Union Model Clauses
EU Safe Harbor
ENISA IAF
Shared Assessments
ITAR-ready
Japan Financial Services
FedRAMP JAB P-ATO
FIPS 140-2
21 CFR Part 11
DISA Level 2
FERPA
CJIS
Australian Signals Directorate
New Zealand GCIO
136. Threat Protection - Business requirements
Security layers: Threat Protection, Network Security, Authentication, Access Control, Data Protection
How do we enumerate and track potential SQL vulnerabilities?
To mitigate any security misconfigurations before they become a serious issue.
How do we discover and alert on suspicious database activity?
To detect and resolve any data exfiltration or SQL injection attacks.
137. SQL Data Discovery & Classification
Discover, classify, protect and track access to sensitive data
✓ Automatic discovery of columns with sensitive data
✓ Add persistent sensitive data labels
✓ Audit and detect access to the sensitive data
✓ Manage labels for your entire Azure tenant using Azure Security Center
138. SQL Data Discovery & Classification - setup
Step 1: Enable Advanced Data Security
on the logical SQL Server
Step 2: Use recommendations and/or manual classification to
classify all the sensitive columns in your tables
139. SQL Data Discovery & Classification – audit sensitive data access
Step 1: Configure auditing for your target Data warehouse. This can be
configured for just a single data warehouse or all databases on a server.
Step 2: Navigate to audit logs in storage account and
download ‘xel’ log files to local machine.
Step 3: Open logs using extended events viewer in SSMS.
Configure viewer to include ‘data_sensitivity_information’ column
140. Single Sign-On
Implicit authentication - User provides
login credentials once to access Azure
Synapse Workspace
AAD authentication - Azure Synapse
Studio requests a token to access each
linked service as the user. A separate token is
acquired for each of the services below:
1. ADLS Gen2
2. Azure Synapse Analytics
3. Power BI
4. Spark – Spark Livy API
5. management.azure.com – resource
provisioning
6. Develop artifacts – dev.workspace.net
7. Graph endpoints
MSI authentication - Orchestration uses
MSI auth for automation
141. The data warehouse in the data-driven business
Azure Synapse
Analytics
Azure
Databricks
Azure Data
Lake Storage
Business
services
Power BI
Transform
and enrich
Prepare
Ingest
Azure
Data Factory
142. ADF’s execution engine
• Data movement
• Pipeline activity execution
• SSIS package execution
Azure
Integration runtime
Self-hosted
Integration runtime
Cloud services
Apps & Data
Pipeline SSIS package
Command
and control
LEGEND
Data
Integration Runtime (IR)
Azure Data Factory v2 Service Scheduling | Orchestration | Monitoring
UX & SDK Authoring | Monitoring/Management
143. Serverless, scalable, hybrid data integration service
Lift existing SQL Server ETL
to Azure
Use existing tools
(SSMS, SSDT)
Azure Data Factory
Cloud and hybrid w/
80+ connectors
Up to 2 GB/s ETL/ELT
in the cloud
Seamlessly span on-prem,
Azure, other clouds, SaaS
Run on-demand, scheduled,
or on-event data-availability
Programmability with
multi-language SDK
Visual tools
Data movement
and transformation
at scale
Hybrid
pipeline model
Author
and monitor
SSIS package
execution
144. No-code data transformation at scale
Focus on building business
logic and transforming data
• Data cleansing, transformation,
aggregation, conversion, etc.
• Cloud scale via Spark execution
• Resilient data flows with ease
146. Best-in-class monitoring and management
Monitor pipeline and
activity runs
Query runs with rich language
Operational lineage between
parent-child pipelines
Azure Monitor Integration
• Diagnostics logging
• Metrics and alerts
• Events
Restate pipeline and activities
147. Use templates to quickly get started
Quickly build data
integration solutions
Avoid rebuilding workflows—
instantiate a template
Improve developer productivity
and reduce development
time for repeat processes
148. Pipelines
Overview
Pipelines load data from a storage
account to the desired linked service. Data can be loaded by
running the pipeline manually or through
orchestration
Benefits
Supports common loading patterns
Fully parallel loading into data lake or SQL
tables
Graphical development experience
149. Triggers
Overview
Triggers represent a unit of processing that
determines when a pipeline execution needs to be
kicked off.
Data Integration offers three trigger types:
1. Schedule: fires on a schedule defined by a
start date, recurrence, and end date
2. Event: fires on a specified event
3. Tumbling window: fires at a periodic time
interval from a specified start date, while
retaining state
It also provides the ability to monitor pipeline runs and
control trigger execution.
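The tumbling window trigger is the least obvious of the three types, so here is a minimal sketch of its semantics: fixed-size, contiguous, non-overlapping windows counted from a start date, with state remembered so that completed windows are not run again. This is plain Python for illustration, not the service's API; the function names and the `completed` set are hypothetical.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, interval, until):
    """Yield (window_start, window_end) for every window fully elapsed by `until`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    window_start = start
    while window_start + interval <= until:
        yield window_start, window_start + interval
        # The next window begins exactly where this one ends (no gaps, no overlap).
        window_start += interval


def pending_windows(start, interval, until, completed):
    """Return elapsed windows whose start is not in `completed` (the retained state)."""
    return [
        window
        for window in tumbling_windows(start, interval, until)
        if window[0] not in completed
    ]
```

For example, with a start of `datetime(2020, 1, 1)`, an interval of one hour, and the current time at 03:30, three windows have elapsed; if the first is already recorded as completed, `pending_windows` returns the remaining two.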
150. Prep & Transform Data
Overview
It offers data cleansing, transformation,
aggregation, conversion, etc
Benefits
Cloud scale via Spark execution
Guided experience to easily build resilient data
flows
Flexibility to transform data per user’s comfort
Monitor and manage dataflows from a single
pane of glass
152. Coming Later This Summer
Synapse will collect query patterns in order to create materialized views
Composite Models
Microsoft Information Protection improvements
154. Power BI service
Cloud-based SaaS solutions
Get started quickly
Secure, live connection to your data sources,
on-premises and in the cloud
Auto insights and intuitive data exploration using
natural language query
Deliver insights through other services such as
SharePoint, PowerApps & Teams
Pre-built dashboards and reports for popular SaaS
solutions
Sharing and collaboration of dashboards, reports & datasets
Live, real-time dashboard updates
155. Deliver insights through other services
Collaborate and share insights with teams in your
organization using existing services
Fully interactive reports integrated into your service
156. Data Connectivity Modes in Power BI Desktop
Modes: Import, DirectQuery, Live/Exploration
Overview
• Import: ETL; data download
• DirectQuery: select specific tables; no data download; queries triggered from report visuals
• Live/Exploration: explore source objects from report surface; no data download; queries triggered from report visuals
Supported Data Sources
• Import: all sources (>80 sources)
• DirectQuery: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, SAP HANA, Oracle, Teradata
• Live/Exploration: SQL Server Analysis Services (Tabular & Multidimensional)
Max # of data sources per report
• Import: unlimited
• DirectQuery: one
• Live/Exploration: one
Data Transformations
• Import: all transformations (100’s)
• DirectQuery: partial support (varies by data source)
• Live/Exploration: none
Mashup Capabilities
• Import: merge (joins); append (union); parameterized queries
• DirectQuery: merge (joins); append (union)
• Live/Exploration: none
Modeling Capabilities
• Import: relationships; calculated columns & tables; measures; hierarchies
• DirectQuery: calculated columns; measures; change column types
• Live/Exploration: none
With Power BI Desktop,
you can connect to
your data in three ways:
• Import
• DirectQuery
• LiveConnect
157. Dedicated resources in the cloud
Flexibility to license by capacity
Greater scale and performance
Extending on-premises capabilities
Premium capacity – P3
Premium capacity – P2
Premium capacity – P1
My workspace
User 2
My workspace
User 3
App workspace
Marketing
App workspace
Sales
My workspace
User 1
APIs
Custom app
Power BI service – Contoso organization
Power BI Premium