Microsoft SQL Server Data
Warehouses for SQL DBAs

SQL Saturday Philly June 9, 2012
http://mssqldude.wordpress.com
http://www.sqlmag.com/blog/sql-server-bi-blog-17
mkromer@microsoft.com

http://joedantoni.wordpress.com
jdanton1@yahoo.com
Agenda
Microsoft Data Warehousing Offerings
Tier 1 Offerings

Enterprise
  • Scalable and reliable platform for Data Warehousing on any hardware
  • Ideal for data marts or small to mid-sized enterprise data warehouses (EDWs)
  • Software only
  • Scale up data warehousing
  • 10s of terabytes

Fast Track Data Warehouse
  • Reference Architectures offering best price performance for Data Warehousing
  • Ideal for data marts or small to mid-sized DWs with scan centric workloads
  • Reference Architectures (Software and Hardware)
  • Scale up data warehousing
  • 4–80 terabytes

HP Business DW Appliance
  • An affordable SMP solution for data warehousing on optimized hardware
  • Ideal for small data marts or DWs with scan centric workloads
  • Integrated Appliance (Software and Hardware)
  • Scale up data warehousing
  • Up to 5 terabytes

Parallel Data Warehouse
  • Appliance for high end Data Warehousing requiring highest scalability, performance or complexity
  • Offers flexibility in hardware and architecture
  • DW Appliance (Fully integrated Software and Hardware)
  • Scale out data warehousing with massively parallel processing (MPP)
  • 10s–100s of terabytes
Some Data Warehouses today

Big SAN
Big SMP Server
Connected together




       What’s wrong with this picture?
Answer: system out of balance

   This server can consume 12 GB/Sec of IO, but the
    SAN can only deliver 2 GB/Sec
       Even when the SAN is dedicated to the SQL Data
        Warehouse, which it often isn’t
   Queries are slow
       Despite significant investment in both Server and Storage




Result: significant investment, not delivering performance
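
The imbalance on this slide can be put into numbers. The following is a minimal sketch, assuming only the two figures quoted above (12 GB/sec consumable by the server, 2 GB/sec deliverable by the SAN); the function name and the returned fields are hypothetical and chosen for illustration.

```python
def io_balance(cpu_consume_gb_s: float, storage_deliver_gb_s: float) -> dict:
    """Compare what the CPUs can consume with what the storage can deliver."""
    if cpu_consume_gb_s <= 0:
        raise ValueError("CPU consumption rate must be positive")
    ratio = storage_deliver_gb_s / cpu_consume_gb_s
    return {
        "ratio": ratio,                      # fraction of CPU demand that storage can feed
        "balanced": ratio >= 1.0,            # True only if storage keeps the CPUs busy
        "shortfall_gb_s": max(0.0, cpu_consume_gb_s - storage_deliver_gb_s),
    }


# Figures from the slide: the server consumes 12 GB/sec, the SAN delivers 2 GB/sec.
result = io_balance(12.0, 2.0)
print(f"Storage feeds {result['ratio']:.0%} of CPU demand; "
      f"shortfall {result['shortfall_gb_s']:.0f} GB/sec")
```

With these inputs the storage covers about one sixth of what the CPUs could process, which is why queries stay slow despite the investment.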
Microsoft SQL Server Data Warehouses for SQL Server DBAs
The Alternative: A Balanced System

   Design a server + storage configuration that can
    deliver all the IO bandwidth that CPUs can
    consume when executing a SQL Relational DW
    workload
   Avoid sharing storage devices among servers
   Avoid overinvesting in disk drives
SQL Server Fast Track Data Warehouse
Solution to help customers and partners
accelerate their data warehouse deployments

   A method for designing a cost-effective,
    balanced system for Data Warehouse
    workloads
   Reference hardware configurations
    developed in conjunction with hardware
    partners using this method
   Best practices for data layout, loading and
    management
Software:
  • SQL Server 2008 R2
     Enterprise
  • Windows Server 2008 R2

Configuration guidelines:
  • Physical table structures
  • Indexes
  • Compression
  • SQL Server settings
  • Windows Server settings
  • Loading

Hardware:
  • Tight specifications for servers,
    storage and networking
  • ‘Per core’ building block
Core Fast Track Metrics

System Benchmarking - MCR

    − 200MB/s per core
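
A per-core rate turns into a storage target by simple multiplication. The sketch below assumes the 200MB/s per core figure from this slide is the MCR (the maximum consumption rate per core, in Fast Track terms); the core counts are hypothetical examples, not reference configurations.

```python
MCR_MB_PER_SEC_PER_CORE = 200  # per-core figure quoted on the slide


def required_io_mb_per_sec(cores: int, mcr: int = MCR_MB_PER_SEC_PER_CORE) -> int:
    """I/O throughput the storage must sustain so that no core waits on disk."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * mcr


# Hypothetical core counts, for illustration only.
for cores in (8, 16, 32):
    target = required_io_mb_per_sec(cores)
    print(f"{cores} cores -> {target} MB/s ({target / 1000:.1f} GB/s)")
```

This is the 'per core' building block idea: storage is sized from the CPU side rather than from raw capacity.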
Establishing Fast Track MCR

System Benchmarking - BCR

• Actual Miles Per Gallon
Establishing Fast Track BCR

Fast Track Reference Configurations

2 Processor Configurations (5 – 20 TB, 2-3.7 GB/s)
4 Processor Configurations (20 – 40 TB, 3.5-7.5 GB/s)
8 Processor Configurations (40 – 80 TB, 7.5-14 GB/s)
Data Warehouse Workload Characteristics


-- Scan-centric aggregate over the whole LINEITEM table (TPC-H Q1 style, no filter)
SELECT    L_RETURNFLAG,
          L_LINESTATUS,
          SUM(L_QUANTITY) AS SUM_QTY,
          SUM(L_EXTENDEDPRICE) AS SUM_BASE_PRICE,
          SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS SUM_DISC_PRICE,
          SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)) AS SUM_CHARGE,
          AVG(L_QUANTITY) AS AVG_QTY,
          AVG(L_EXTENDEDPRICE) AS AVG_PRICE,
          AVG(L_DISCOUNT) AS AVG_DISC,
          COUNT(*) AS COUNT_ORDER
FROM      LINEITEM
GROUP BY  L_RETURNFLAG, L_LINESTATUS
ORDER BY  L_RETURNFLAG, L_LINESTATUS
Software configuration
SQL Server Startup
Software configuration
Temp DB
Software configuration
Temp DB & TLOG
DW Server Baseline Configs

Fast Track Data Striping

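[Diagram: FT Storage Enclosure. RAID-1 arrays (the first labeled Disk 1 & 2) each expose two volumes for primary data, with one data file per volume; a separate volume holds the log.]

  Volume        File         Use
  ARY01D1v01    DB1-1.ndf    Primary Data
  ARY01D2v02    DB1-2.ndf    Primary Data
  ARY02D1v03    DB1-3.ndf    Primary Data
  ARY02D2v04    DB1-4.ndf    Primary Data
  ARY03D1v05    DB1-5.ndf    Primary Data
  ARY03D2v06    DB1-6.ndf    Primary Data
  ARY04D1v07    DB1-7.ndf    Primary Data
  ARY04D2v08    DB1-8.ndf    Primary Data
  ARY05v09      DB1.ldf      Log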
User Databases

Transaction Log


[Diagram: database file layout across LUN 1, LUN 2, LUN 3 ... LUN 16, one file of each filegroup per LUN]

  Permanent_DB (Permanent FG):   Permanent_1.ndf, Permanent_2.ndf, Permanent_3.ndf ... Permanent_16.ndf
  Stage Database (Stage FG):     Stage_1.ndf, Stage_2.ndf, Stage_3.ndf ... Stage_16.ndf
  TempDB (Local Drive 1):        TempDB.mdf (25GB), TempDB_02.ndf (25GB), TempDB_03.ndf (25GB) ... TempDB_16.ndf (25GB)
  Log LUN 1:                     Permanent DB Log, Stage DB Log
[Diagram: PDW appliance layout, control rack and data racks]

Control Rack
  • Control Nodes (Active / Passive, SQL)
  • Management Nodes
  • Landing Node
  • Backup Node

Data Rack
  • Compute Nodes (10 shown, each labeled SQL)
  • Storage Nodes
  • Spare Compute Node

Interconnect
  • Dual Infiniband
  • Dual Fiber Channel
  • Private Network
1 Data Rack

• 17 Servers
• 22 Procs
• 132 Cores




Expand to 4 data racks and quadruple your performance and capacity!
Query Speed in Seconds

  Query   Orig. Time   PDW Time   PDW times faster
  Q1      4200         16         263x
  Q2      1200         6          200x
  Q3      120          2          60x
  Q4      120          2          60x
  Q5      120          2          60x
  Q6      1200         4          300x

PDW times faster than original query speeds
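
The "times faster" figures follow directly from the two series in the chart. A short sketch, using only the timings shown on this slide (the dictionary layout and function name are illustrative choices):

```python
# (original seconds, PDW seconds) per query, as read from the chart.
timings = {
    "Q1": (4200, 16),
    "Q2": (1200, 6),
    "Q3": (120, 2),
    "Q4": (120, 2),
    "Q5": (120, 2),
    "Q6": (1200, 4),
}


def speedup(orig_seconds: float, pdw_seconds: float) -> float:
    """How many times faster the PDW run is than the original run."""
    return orig_seconds / pdw_seconds


speedups = {query: speedup(orig, pdw) for query, (orig, pdw) in timings.items()}
for query, factor in speedups.items():
    print(f"{query}: {factor:.1f}x")
```

Q1 works out to 262.5, which the chart rounds up to 263x; the other queries divide evenly.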
Parallel Data Warehouse Appliance
    Hardware Architecture

[Diagram: appliance nodes on the Private Network, with their connections to the Corporate Network]

Corporate Network                Private Network
  • Client Drivers               Control Nodes (Active/Passive, SQL)
  • Data Center Monitoring       Management Nodes
  • ETL Load Interface           Landing Node
  • Corporate Backup Solution    Backup Node

Compute Nodes (10 shown, each labeled SQL) and a Spare Compute Node
Storage Nodes
Interconnect: Dual Infiniband, Dual Fiber Channel
Parallel Data Warehouse benefits
   Massively Parallel Processing

[Diagram: the hardware architecture above, with Query 1 fanned out to every compute node]

  • Query 1 is submitted to SQL Server on Control Node
  • Query is executed on all 10 Nodes
  • Results are sent back to client
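
The three steps on this slide (submit to the control node, execute on all 10 nodes, return results to the client) follow a scatter-gather pattern. The sketch below illustrates that pattern only; it is not PDW's implementation. The hash distribution, the thread pool standing in for compute nodes, and every function name are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

NODE_COUNT = 10  # the slide shows 10 compute nodes


def distribute(rows, node_count=NODE_COUNT):
    """Assign each (key, amount) row to one node by hashing its key (assumed scheme)."""
    shards = [[] for _ in range(node_count)]
    for key, amount in rows:
        shards[hash(key) % node_count].append((key, amount))
    return shards


def node_partial_sum(shard):
    """Work done independently on one node: SUM(amount) GROUP BY key over local rows."""
    totals = {}
    for key, amount in shard:
        totals[key] = totals.get(key, 0) + amount
    return totals


def control_node_query(rows):
    """Fan the query out to all nodes, then merge the partial results for the client."""
    shards = distribute(rows)
    with ThreadPoolExecutor(max_workers=NODE_COUNT) as pool:
        partials = list(pool.map(node_partial_sum, shards))
    merged = {}
    for partial in partials:
        for key, total in partial.items():
            merged[key] = merged.get(key, 0) + total
    return merged


# Hypothetical data: 100 product keys, three rows of amount 10 each.
rows = [(product_key, 10) for product_key in range(100) for _ in range(3)]
result = control_node_query(rows)
print(len(result), "groups; total for key 0 =", result[0])
```

Because no node shares rows with another, each partial aggregate can run without coordination, and only the small per-node results travel back to the coordinator.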
Parallel Data Warehouse benefits
   Massively Parallel Processing

[Diagram: the same hardware architecture, with many queries in flight on every compute node at once]

  • Multiple queries are simultaneously executed across all nodes.
  • PDW supports querying while data is loading.

Blazing fast performance by parallelizing queries on highly optimized shared nothing nodes
Software Architecture

[Diagram: PDW software components by node]

Client tools
  • Query Tool
  • MS BI (AS, RS)
  • DWSQL
  • Other Third-Party Tools
  • Internet Explorer

Access layer
  • Data Access (OLEDB, ODBC, ADO.NET, JDBC)
  • IIS Admin Console

Control Node
  • MPP Engine Coordinator: SQL Parser, Core Engine Services, DMS Manager
  • Data Movement Service
  • SQL Server: DW Authentication, DW Configuration, DW Schema, TempDB

Compute Nodes
  • Data Movement Service
  • SQL Server
  • User Data

Backup Node
  • Data Movement Service

Landing Zone Node
  • Data Movement Service

MPP Engine Coordinator
  • Provides single system image
  • SQL compilation
  • Global metadata and appliance configuration
  • Global query optimization and plan generation
  • Global query execution coordination
  • Global transaction coordination
  • Authentication and authorization
  • Supportability (hardware and software status)

Data Movement Service
  • Data movement across the appliance
  • Distributed query execution operators
Blazing-Fast Performance



“400 percent improvement in performance”
First American Title Insurance Company

Now, up to 10x Faster³ with ColumnStore
¹Source: Microsoft customer evidence, Choice Hotels International
²Source: Microsoft customer evidence, KAS Bank
³Source: Microsoft customer testing; common data warehousing queries
[Diagram: row-oriented tables split into per-column segments. Columns shown: OrderDateKey, ProductKey, SalesAmount, StoreKey, RegionKey, Quantity.]
[Diagram: Batch object, made up of Column vectors and a List of qualifying rows]
In a standard scale-out server deployment, multiple report servers share a single
report server database. The report server database should be installed on a
remote SQL Server instance. The following diagram is an example of a standard
scale-out server deployment configuration with the report server database on a
remote SQL Server instance.

As another option, you might decide to host the report server database on a
SQL Server instance that is part of a failover cluster. The following diagram is
an example of a scale-out server deployment configuration where the report
server databases are on an instance that is part of a failover cluster.

In addition to the standard scale-out deployment, you might determine that your
reporting environment would benefit from a more advanced scale-out deployment
configuration. For example, you might decide to use the load-balanced report
servers for interactive report processing and add a separate report server
computer to process only scheduled reports. The following diagram is an example
of this advanced scale-out server deployment configuration.
Log                               Description

Report Server Execution Log       The report server execution log contains data about specific reports, including when a report was run,
                                  who ran it, where it was delivered, and which rendering format was used.
                                  The execution log is stored in the report server database.

Report Server Service Trace Log   The service trace log contains very detailed information that is useful if you are debugging an
                                  application or investigating an issue or event. The file is located at
                                  Microsoft SQL Server\<SQL Server Instance>\Reporting Services\LogFiles.

Report Server HTTP Log            The HTTP log file contains a record of all HTTP requests and responses handled by the Report Server
                                  Web service and Report Manager. HTTP logging is not enabled by default. You must modify the
                                  ReportingServicesService.exe configuration file to use this feature in your installation. The file is
                                  located at Microsoft SQL Server\<SQL Server Instance>\Reporting Services\LogFiles.
Under the properties of your data source, increasing the network packet size for SQL
Server reduces the protocol overhead required to build many small packets. The
default value for SQL Server 2008 is 4096 bytes. With a data warehouse load, a packet size of
32K (in SQL Server, this means assigning the value 32767) can benefit processing. Don’t
change the value in SQL Server using sp_configure; instead, override it in your data source.
This can be set whether you are using TCP/IP or Shared Memory.
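
To see why the larger packet helps, and what overriding it in the data source might look like, here is a rough sketch. It assumes an ADO.NET or OLE DB style connection string in which `Packet Size` is the keyword. The server and database names are hypothetical, and the packet count ignores per-packet header bytes, so it is an estimate only.

```python
DEFAULT_PACKET_SIZE = 4096   # SQL Server 2008 default, per the text
LARGE_PACKET_SIZE = 32767    # the "32K" value suggested in the text


def packets_needed(payload_bytes: int, packet_size: int) -> int:
    """Approximate packet count for a payload (integer ceiling, headers ignored)."""
    return -(-payload_bytes // packet_size)


def with_packet_size(connection_string: str, packet_size: int = LARGE_PACKET_SIZE) -> str:
    """Return the connection string with its Packet Size keyword set or replaced."""
    parts = [p.strip() for p in connection_string.split(";") if p.strip()]
    parts = [p for p in parts if not p.lower().startswith("packet size=")]
    parts.append(f"Packet Size={packet_size}")
    return ";".join(parts)


one_gb = 1024 ** 3
print("4096-byte packets for 1 GB: ", packets_needed(one_gb, DEFAULT_PACKET_SIZE))
print("32767-byte packets for 1 GB:", packets_needed(one_gb, LARGE_PACKET_SIZE))

# Hypothetical server and database names.
print(with_packet_size("Data Source=dwserver;Initial Catalog=StageDB"))
```

Setting the value per data source keeps the server-wide default untouched, which matches the advice above not to change it with sp_configure.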
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

More Related Content

What's hot (20)

PDF
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnikbiz
 
PPTX
Hadoop databases for oracle DBAs
Maxym Kharchenko
 
PPTX
Exadata 12c New Features RMOUG
Fuad Arshad
 
PPTX
Debunking the Myths of HDFS Erasure Coding Performance
DataWorks Summit/Hadoop Summit
 
PPTX
Experience sql server on l inux and docker
Bob Ward
 
PDF
Oracle GoldenGate for Oracle DBAs
Guatemala User Group
 
PDF
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
OReillyStrata
 
PPTX
Introduction to Apache Accumulo
Jared Winick
 
PPTX
Simplify Consolidation with Oracle Database 12c
Maris Elsins
 
PPT
Teradata vs-exadata
Louis liu
 
PDF
My First 100 days with a MySQL DBMS
Gustavo Rene Antunez
 
PDF
Oracle database high availability solutions
Kirill Loifman
 
PDF
PayPal Big Data and MySQL Cluster
Mat Keep
 
PDF
My First 100 days with a MySQL DBMS (WP)
Gustavo Rene Antunez
 
PDF
Rapid Cluster Computing with Apache Spark 2016
Zohar Elkayam
 
PDF
Running E-Business Suite Database on Oracle Database Appliance
Maris Elsins
 
PPTX
Oracle Goldengate training by Vipin Mishra
Vipin Mishra
 
ODP
Exadata
talek
 
PDF
Best Practices – Extreme Performance with Data Warehousing on Oracle Database
Edgar Alejandro Villegas
 
PDF
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...
EDB
 
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnikbiz
 
Hadoop databases for oracle DBAs
Maxym Kharchenko
 
Exadata 12c New Features RMOUG
Fuad Arshad
 
Debunking the Myths of HDFS Erasure Coding Performance
DataWorks Summit/Hadoop Summit
 
Experience sql server on l inux and docker
Bob Ward
 
Oracle GoldenGate for Oracle DBAs
Guatemala User Group
 
SQL on Hadoop: Defining the New Generation of Analytic SQL Databases
OReillyStrata
 
Introduction to Apache Accumulo
Jared Winick
 
Simplify Consolidation with Oracle Database 12c
Maris Elsins
 
Teradata vs-exadata
Louis liu
 
My First 100 days with a MySQL DBMS
Gustavo Rene Antunez
 
Oracle database high availability solutions
Kirill Loifman
 
PayPal Big Data and MySQL Cluster
Mat Keep
 
My First 100 days with a MySQL DBMS (WP)
Gustavo Rene Antunez
 
Rapid Cluster Computing with Apache Spark 2016
Zohar Elkayam
 
Running E-Business Suite Database on Oracle Database Appliance
Maris Elsins
 
Oracle Goldengate training by Vipin Mishra
Vipin Mishra
 
Exadata
talek
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Database
Edgar Alejandro Villegas
 
Overview of EnterpriseDB Postgres Plus Advanced Server 9.4 and Postgres Enter...
EDB
 

Viewers also liked (20)

PDF
Building Data Warehouse in SQL Server
Antonios Chatzipavlis
 
PPTX
PSSUG Nov 2012: Big Data with SQL Server
Mark Kromer
 
PPTX
Big Data in the Cloud with Azure Marketplace Images
Mark Kromer
 
PPTX
Microsoft Cloud BI Update 2012 for SQL Saturday Philly
Mark Kromer
 
DOCX
MEC Data sheet
Mark Kromer
 
PPTX
Microsoft SQL Server Data Warehouses for SQL Server DBAs

  • 1. Microsoft SQL Server Data Warehouses for SQL DBAs SQL Saturday Philly June 9, 2012
  • 3. Agenda
  • 4. Microsoft Data Warehousing Offerings (Tier 1 Offerings)
      − Enterprise: Scalable and reliable platform for Data Warehousing on any hardware. Ideal for data marts or small to mid-sized enterprise data warehouses (EDWs). Software only. Scale up data warehousing. 10s of terabytes.
      − Fast Track Data Warehouse: Reference Architectures offering best price performance for Data Warehousing. Ideal for data marts or small to mid-sized DWs with scan centric workloads. Reference Architectures (Software and Hardware). Scale up data warehousing. 4–80 terabytes.
      − HP Business DW Appliance: An affordable SMP solution for data warehousing on optimized hardware. Ideal for small data marts or DWs with scan centric workloads. Integrated Appliance (Software and Hardware). Scale up data warehousing. Up to 5 terabytes.
      − Parallel Data Warehouse: Appliance for high end Data Warehousing requiring highest scalability, performance or complexity. Offers flexibility in hardware and architecture. DW Appliance (Fully integrated Software and Hardware). Scale out data warehousing with massively parallel processing (MPP). 10s–100s of terabytes.
  • 5. Some Data Warehouses today: Big SAN and Big SMP Server, connected together. What’s wrong with this picture?
  • 6. Answer: system out of balance • This server can consume 12 GB/Sec of IO, but the SAN can only deliver 2 GB/Sec • Even when the SAN is dedicated to the SQL Data Warehouse, which it often isn’t • Queries are slow • Despite significant investment in both Server and Storage • Result: significant investment, not delivering performance
  • 8. The Alternative: A Balanced System • Design a server + storage configuration that can deliver all the IO bandwidth that CPUs can consume when executing a SQL Relational DW workload • Avoid sharing storage devices among servers • Avoid overinvesting in disk drives
  • 10. SQL Server Fast Track Data Warehouse: Solution to help customers and partners accelerate their data warehouse deployments • A method for designing a cost-effective, balanced system for Data Warehouse workloads • Reference hardware configurations developed in conjunction with hardware partners using this method • Best practices for data layout, loading and management
  • 11. Software: • SQL Server 2008 R2 Enterprise • Windows Server 2008 R2 Configuration guidelines: • Physical table structures • Indexes • Compression • SQL Server settings • Windows Server settings • Loading Hardware: • Tight specifications for servers, storage and networking • ‘Per core’ building block
  • 12. Core Fast Track Metrics
  • 13. System Benchmarking - MCR: 200MB/s per core
  • 14. Establishing Fast Track MCR
  • 15. System Benchmarking - BCR: Actual Miles Per Gallon
  • 16. Establishing Fast Track BCR
  • 18. Fast Track Reference Configurations • 2 Processor Configurations (5–20 TB, 2–3.7 GB/s) • 4 Processor Configurations (20–40 TB, 3.5–7.5 GB/s) • 8 Processor Configurations (40–80 TB, 7.5–14 GB/s)
  • 19. Data Warehouse Workload Characteristics
      SELECT L_RETURNFLAG, L_LINESTATUS,
             SUM(L_QUANTITY) AS SUM_QTY,
             SUM(L_EXTENDEDPRICE) AS SUM_BASE_PRICE,
             SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)) AS SUM_DISC_PRICE,
             SUM(L_EXTENDEDPRICE*(1-L_DISCOUNT)*(1+L_TAX)) AS SUM_CHARGE,
             AVG(L_QUANTITY) AS AVG_QTY,
             AVG(L_EXTENDEDPRICE) AS AVG_PRICE,
             AVG(L_DISCOUNT) AS AVG_DISC,
             COUNT(*) AS COUNT_ORDER
      FROM LINEITEM
      GROUP BY L_RETURNFLAG, L_LINESTATUS
      ORDER BY L_RETURNFLAG, L_LINESTATUS
  • 20. Software configuration: SQL Server Startup
  • 21. Software configuration: Temp DB
  • 22. Software configuration: Temp DB & TLOG
  • 23. DW Server Baseline Configs
  • 25. Fast Track Data Striping [Diagram: FT storage enclosure with RAID-1 disk pairs (Disk 1 & 2). Primary data volumes ARY01D1v01, ARY01D2v02, ARY02D1v03, ARY02D2v04, ARY03D1v05, ARY03D2v06, ARY04D1v07 and ARY04D2v08 hold the data files DB1-1.ndf through DB1-8.ndf; the log volume ARY05v09 holds DB1.ldf.]
  • 26. User Databases
  • 28. Database file layout across LUN 1, LUN 2, LUN 3 ... LUN 16
      − Permanent_DB, Permanent FG: Permanent_1.ndf, Permanent_2.ndf, Permanent_3.ndf ... Permanent_16.ndf
      − Stage database, Stage FG: Stage_1.ndf, Stage_2.ndf, Stage_3.ndf ... Stage_16.ndf
      − TempDB (Local Drive 1): TempDB.mdf (25GB), TempDB_02.ndf (25GB), TempDB_03.ndf (25GB) ... TempDB_16.ndf (25GB)
      − Log LUN 1: Permanent DB Log, Stage DB Log
  • 30. [Diagram: control rack and data racks. Components shown: control nodes (SQL, active/passive), management nodes, landing node, backup node, compute nodes (SQL), storage nodes and a spare compute node, connected by dual Fiber Channel and dual Infiniband on a private network.]
  • 31. 1 Data Rack • 17 Servers • 22 Procs • 132 Cores • Expand to 4 data racks and quadruple your performance and capacity!
  • 32. Query Speed in Seconds (original time vs. PDW time, with how many times faster PDW is than the original query speed)
      − Q1: 4200 vs. 16 (263x)
      − Q2: 1200 vs. 6 (200x)
      − Q3: 120 vs. 2 (60x)
      − Q4: 120 vs. 2 (60x)
      − Q5: 120 vs. 2 (60x)
      − Q6: 1200 vs. 4 (300x)
  • 33. Parallel Data Warehouse Appliance Hardware Architecture [Diagram: control nodes (SQL, active/passive) alongside client drivers; management nodes alongside data center monitoring; landing node alongside the ETL load interface; backup node alongside the corporate backup solution; compute nodes (SQL) with storage nodes and a spare compute node. Dual Fiber Channel and dual Infiniband connect the nodes on a private network; the corporate network is separate.]
  • 34. Parallel Data Warehouse benefits: Massively Parallel Processing • Query 1 is submitted to SQL Server on Control Node • Query is executed on all 10 Nodes • Results are sent back to client
  • 35. Parallel Data Warehouse benefits: Massively Parallel Processing • Multiple queries are simultaneously executed across all nodes. • PDW supports querying while data is loading. • Blazing fast performance by parallelizing queries on highly optimized shared nothing nodes
  • 37. Software Architecture
      − MPP Engine Coordinator: Provides single system image • SQL compilation • Global metadata and appliance configuration • Global query optimization and plan generation • Global query execution coordination • Global transaction coordination • Authentication and authorization • Supportability (hardware and software status)
      − Data Movement Service: Data movement across the appliance • Distributed query execution operators
      − [Diagram: clients (query tool, MS BI (AS, RS), DWSQL, Internet Explorer, other third-party tools) connect through Data Access (OLEDB, ODBC, ADO.NET, JDBC) and the Admin Console (IIS) to the Control Node, which runs the MPP Engine Coordinator, the Data Movement Service and SQL Server with the DW Authentication, DW Configuration, DW Schema and TempDB databases. The Compute Nodes (user data in SQL Server), the Landing Zone Node and the Backup Node each run the Data Movement Service.]
  • 39. Blazing-Fast Performance • “400 percent improvement in performance” (First American Title Insurance Company) • ColumnStore: Now, up to 10x Faster³ • ¹Source: Microsoft customer evidence, Choice Hotels International • ²Source: Microsoft customer evidence, KAS Bank • ³Source: Microsoft customer testing; common data warehousing queries
  • 40. [Figure: sample sales rows with the columns OrderDateKey, ProductKey, StoreKey, RegionKey, Quantity and SalesAmount.]
  • 41. • Batch object • Column vectors • List of qualifying rows
  • 43. In a standard scale-out server deployment, multiple report servers share a single report server database. The report server database should be installed on a remote SQL Server instance. The following diagram is an example of a standard scale-out server deployment configuration with the report server database on a remote SQL Server instance.
  • 44. As another option, you might decide to host the report server database on a SQL Server instance that is part of a failover cluster. The following diagram is an example of a scale-out server deployment configuration where the report server databases are on an instance that is part of a failover cluster.
  • 45. In addition to the standard scale-out deployment, you might determine that your reporting environment would benefit from a more advanced scale-out deployment configuration. For example, you might decide to use the load-balanced report servers for interactive report processing and add a separate report server computer to process only scheduled reports. The following diagram is an example of this advanced scale-out server deployment configuration.
  • 46. Log / Description
      − Report Server Execution Log: The report server execution log contains data about specific reports, including when a report was run, who ran it, where it was delivered, and which rendering format was used. The execution log is stored in the report server database.
      − Report Server Service Trace Log: The service trace log contains very detailed information that is useful if you are debugging an application or investigating an issue or event. The file is located at \Microsoft SQL Server\<SQL Server Instance>\Reporting Services\LogFiles.
      − Report Server HTTP Log: The HTTP log file contains a record of all HTTP requests and responses handled by the Report Server Web service and Report Manager. HTTP logging is not enabled by default. You must modify the ReportingServicesService.exe configuration file to use this feature in your installation. The file is located at \Microsoft SQL Server\<SQL Server Instance>\Reporting Services\LogFiles.
  • 51. Under the properties of your data source, increasing the network packet size for SQL Server minimizes the protocol overhead required to build many small packets. The default value for SQL Server 2008 is 4096. With a data warehouse load, a packet size of 32K (in SQL Server, this means assigning the value 32767) can benefit processing. Don’t change the value in SQL Server using sp_configure; instead, override it in your data source. This can be set whether you are using TCP/IP or Shared Memory.
  • 58. © 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
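The balance argument on slides 6 and 13 comes down to simple arithmetic. The following is a minimal sketch, assuming only the figures quoted on those slides (a server that can consume 12 GB/s, a SAN that delivers 2 GB/s, and 200 MB/s per core). The function names and the 16-core example are hypothetical illustrations, not part of any Fast Track tooling.

```python
def io_utilization(server_demand_gb_s: float, storage_supply_gb_s: float) -> float:
    """Fraction of the server's I/O appetite that the storage can actually feed."""
    return min(1.0, storage_supply_gb_s / server_demand_gb_s)


def required_throughput_mb_s(cores: int, mcr_mb_s_per_core: float = 200.0) -> float:
    """Storage throughput needed to keep every core busy at the given per-core rate."""
    return cores * mcr_mb_s_per_core


# Slide 6: the server could consume 12 GB/s, but the SAN delivers only 2 GB/s.
utilization = io_utilization(12.0, 2.0)
print(f"CPUs are fed at {utilization:.0%} of what they could consume")

# Slide 13: 200 MB/s per core. The 16-core server is a made-up example size.
needed = required_throughput_mb_s(16)
print(f"16 cores need about {needed / 1000:.1f} GB/s of read throughput")
```

Under these assumptions the out-of-balance system keeps its CPUs roughly one sixth fed, which is why a balanced design sizes the storage from the core count rather than from capacity alone.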

Editor's Notes

  • #4: This slide shows what we are going to talk about today. We will start off discussing Microsoft’s vision for data warehousing solutions. Then we will discuss the different offerings. Next, we will discuss how you can get support and services to help you get started with your data warehouse and to help accelerate the completion of your solution. Finally, we will end with a discussion of the quick start services to enable you to begin your data warehouse solution quickly.
  • #5: SQL Server 2008 R2 comes in several editions. In this presentation, we will look at 4 different SKUs, each of which has different features that are important for data warehousing. We will drill down to get more information about each edition and the features that are important.
  • #14: Remind them
  • #15: In order to ensure the query is cached you need to do the following: Ensure the results of the query will fit in memory. Run the query once. The 2nd and subsequent times you execute the query it should be cached from memory. You can tell this because the 2nd execution should be much faster than the initial one. Review: TPC BENCHMARK™ H http://www.tpc.org/tpch/spec/tpch2.8.0.pdf TPC-H Data Set http://www.tpc.org/tpch/spec/tpch_2_8_0.zip http://www.tpc.org/tpch/spec/reference2.8.0.zip
  • #16: Remind them “Your mileage may vary”
  • #21: -E is the primary way we help to ensure longer “runs” of contiguous, logically grouped pages. An extent is eight 8 KB pages, or 64 KB. With -E, 64 extents are allocated at a time: (64 × 64 KB) / 1024 = 4 MB. SQL Server will still allocate that 4 MB in groups of eight 8 KB pages at a time. This means that pages can still be interleaved (extent fragmentation) down to the extent level. TF1117 is specific to TempDB, as autogrow should be off for all other databases. A customer may have a database with a specific use case that requires autogrow; this is OK, it just needs to be managed. It should not be a major part of the overall workload, because this file will become fragmented. Using autogrow for TempDB is about practicality. It can be hard to pre-allocate TempDB. If they can pre-allocate it, go for it. Review: Using the SQL Server Service Startup Options http://msdn.microsoft.com/en-us/library/ms190737.aspx SAP with Microsoft SQL Server 2005: Best Practices for High Availability, Maximum Performance, and Scalability http://download.microsoft.com/download/d/9/4/d948f981-926e-40fa-a026-5bfcf076d9b9/SAP_SQL2005_Best%20Practices.doc
  • #22: Remember that additional space may be needed during initial migration of data if moving onto a Fast Track RA or during the initial load of a new Fast Track RA. Review: Working with tempdb in SQL Server 2005 http://technet.microsoft.com/en-us/library/cc966545.aspx Capacity Planning for tempdb http://msdn.microsoft.com/en-us/library/ms345368.aspx
  • #23: Remember that additional space may be needed during initial migration of data if moving onto a Fast Track RA or during the initial load of a new Fast Track RA. Review: Working with tempdb in SQL Server 2005 http://technet.microsoft.com/en-us/library/cc966545.aspx Capacity Planning for tempdb http://msdn.microsoft.com/en-us/library/ms345368.aspx
  • #24: Workloads often need large amounts of data pages to be in cache; in this case, add additional memory as needed. Hash joins and sorts can make use of additional memory to help prevent them from spilling to tempdb. Workloads with large amounts of queries and bulk loads performing hash joins and sorts will benefit from more memory. Review: Troubleshooting Performance Problems in SQL Server 2008 http://msdn.microsoft.com/en-us/library/dd672789.aspx How to: Enable the Lock Pages in Memory Option http://msdn.microsoft.com/en-us/library/ms190730.aspx Tuning options for SQL Server 2005 and SQL Server 2008 when running in high performance workloads
  • #31: 4 racks in V1. Orderable at the rack level. Required software. 13k price per TB. Pricing and licensing training in resources.
  • #37: Data layout options: Dimension tables are typically replicated. PDW maintains data integrity across all nodes. Fact tables are typically distributed. The data model, table sizes, and workloads must all be considered when choosing between replicated and distributed tables. The following join types are used to achieve Distribution Compatibility:
      − Shared Nothing join: Achieves Distribution Compatibility by using compatible Distribution Keys in the SQL join criteria.
      − Ultra Shared Nothing join: Achieves Distribution Compatibility through a replicated table; no data movement between nodes is required.
      − Redistribution join: Requires data to be dynamically distributed between Compute Nodes to achieve Distribution Compatibility.
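The difference between replicated and distributed tables in note #37 can be illustrated with a toy placement model. This is a sketch under stated assumptions, not how PDW is implemented: the four-node count, the modulo "hash", the table contents and every function name are hypothetical and chosen only to show when a join can run without data movement.

```python
from collections import defaultdict

NODES = 4  # hypothetical number of compute nodes


def distribute(rows, key, nodes=NODES):
    """Place each row on one node, chosen from its distribution key (toy modulo hash)."""
    placement = defaultdict(list)
    for row in rows:
        placement[row[key] % nodes].append(row)
    return placement


def replicate(rows, nodes=NODES):
    """Place a full copy of a small table on every node."""
    return {node: list(rows) for node in range(nodes)}


def local_join(fact_by_node, dim_by_node, key):
    """Join on each node using only the rows already on that node (no data movement)."""
    result = []
    for node, fact_rows in fact_by_node.items():
        dim_index = {d[key]: d for d in dim_by_node.get(node, [])}
        for fact in fact_rows:
            if fact[key] in dim_index:
                result.append({**fact, **dim_index[fact[key]]})
    return result


# Made-up sample data: a fact table and a small dimension table.
sales = [{"sale_id": i, "product_id": i % 3, "amount": 10 * i} for i in range(1, 9)]
products = [{"product_id": p, "name": f"product-{p}"} for p in range(3)]

sales_by_node = distribute(sales, "sale_id")

# Replicated dimension: every node can resolve every product locally.
complete = local_join(sales_by_node, replicate(products), "product_id")

# Dimension distributed on a key the fact table is not distributed on:
# most matches live on another node, so a local-only join loses rows.
incomplete = local_join(sales_by_node, distribute(products, "product_id"), "product_id")

print(len(sales), len(complete), len(incomplete))
```

In this model the replicated dimension corresponds to the case where no data movement is needed, while the incompatible layout is the situation in which rows would first have to be redistributed between nodes before the join could return every match.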