DB2 11 Query Optimizer Perspective
John Hornibrook
IBM Canada
© 2017 IBM Corporation
DB2 11.1 – The Highlights
• BLU acceleration (column-organized table support) for partitioned
database systems (MPP)
• BLU acceleration performance improvements
• Sorting using fast parallel radix sort on compressed and encoded data
• Nested-loop join
• OLAP functions
• Native support for more scalar functions
• Faster SQL MERGE
• Additional functional support for BLU acceleration
• Declared global temporary tables
• IDENTITY and EXPRESSION generated columns
• NOT LOGGED INITIALLY
• Codepage 819
• General DB2 improvements (relevant to query optimization)
• Advanced automatic query decorrelation
• User-defined aggregate functions
• Inline optimization guidelines
DB2 11.1.1.1 – Even More…
DB2 with BLU Acceleration
• Advanced technology for analytic queries in DB2 LUW
• Introduced in DB2 10.5
• DB2 column-organized tables add
columnar capabilities to DB2 databases
• Table data is stored column organized rather than row organized
• Using a vector processing engine
• Using this table format with star schema data
marts provides significant improvements
to storage, query performance, ease of
use, and time-to-value
• New unique runtime technology which
leverages the CPU architecture and is built
directly into the DB2 kernel
• New unique encoding for speed and compression
• This new capability is main-memory optimized, CPU optimized, and I/O optimized
Column-organized Runtime Processing
Column-organized Processing
• Column-organized operators execute in different subsections than row-organized
operators
• Column-organized subsections are processed by different sets of DB2 subagents
• Data is transferred between subsections using a column-organized table queue (CTQ)
• CTQ also performs row materialization
• Subsections run concurrently
• There can be multiple column or row processing subsections
• All subsections can be processed by multiple subagents
• (MCP – multi-core parallelism)
[Figure: Row processing (subsection 1): RETURN above a CTQ. Column processing (subsection 2): SORT above an HSJOIN of two SCANs.]
DB2 11.1 – BLU Acceleration in a Partitioned Database System
• Technology
• Value Proposition
• Improve Response Time
[Figure: Query #1 running across three partitions, each holding 1/3 of the data]
DPF Parallel join strategies
Collocated join
• Partitioning keys:
• CUSTOMER: CUSTKEY
• DAILY_SALES: CUSTKEY
• Join predicate:
• CUSTOMER.CUSTKEY = DAILY_SALES.CUSTKEY
• Equi-join predicate on each table's partitioning key
• Tables must be in same DB partition group
• Join column(s) data type must be partition compatible
• No table queues (TQs) necessary
[Figure: JOIN over Customer and Daily Sales]
Directed join
• Partitioning keys:
• CUSTOMER: CUST_NUMBER
• DAILY_SALES: CUSTKEY
• Join predicate:
• CUSTOMER.CUSTKEY = DAILY_SALES.CUSTKEY
• Equi-join predicate on one table's partitioning key
• Direct rows of one table to the partitioning of the other
[Figure: JOIN over Daily Sales and a SCAN of Customer]
[Figure: plan fragment with a BTQ, a SCAN of Daily Sales, and a SCAN of Store]
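The difference between the two strategies can be illustrated with a toy model. The following is a minimal Python sketch, not DB2's implementation: the partition count, the `hash_partition` and `local_join` helpers, and the sample rows are all hypothetical, and integer modulo stands in for the real partitioning hash.

```python
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count

def hash_partition(rows, key, n=NUM_PARTITIONS):
    """Place each row on the partition chosen by hashing its partitioning key."""
    parts = defaultdict(list)
    for row in rows:
        parts[row[key] % n].append(row)  # toy hash: integer key modulo n
    return parts

def local_join(left_parts, right_parts, key, n=NUM_PARTITIONS):
    """Equi-join each partition independently; no rows cross partitions here."""
    out = []
    for p in range(n):
        index = defaultdict(list)
        for r in right_parts.get(p, []):
            index[r[key]].append(r)
        for l in left_parts.get(p, []):
            for r in index.get(l[key], []):
                out.append({**l, **r})
    return out

customer = [{"custkey": k, "cust_number": 100 + k} for k in range(6)]
daily_sales = [{"custkey": k % 6, "amount": 10 * k} for k in range(12)]

# Collocated: both tables are already partitioned on the join column,
# so matching rows sit on the same partition and nothing has to move.
collocated = local_join(hash_partition(daily_sales, "custkey"),
                        hash_partition(customer, "custkey"),
                        "custkey")

# Directed: CUSTOMER is partitioned on cust_number, so its rows are
# re-hashed on custkey first (the role a directed table queue plays).
customer_by_number = hash_partition(customer, "cust_number")
moved = [r for rows in customer_by_number.values() for r in rows]
directed = local_join(hash_partition(daily_sales, "custkey"),
                      hash_partition(moved, "custkey"),
                      "custkey")
```

Both strategies produce the same join result; the directed variant pays for moving one table's rows before the local joins can run.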
BLU on DPF Architecture – Common Compression Encoding
• Each column of a column-organized table is encoded independently
• Multiple compression techniques used in combination
• Dictionary encoding
• Prefix encoding
• Offset coding
• Etc.
• BLU DPF exploits a common compression encoding across data slices
[Figure: DB2 BLU with 4 data partitions. MyTable is split into Slice 1 to Slice 4, each holding columns A, B, C, and D.]
In the table "MyTable", columns A and B can have very different encoding, but column A in one slice of the table will have the same encoding as column A in another slice.
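A toy dictionary encoder shows why a common encoding matters: codes from different slices can be compared directly, without decoding. This is a minimal Python sketch under stated assumptions; the `build_dictionary` and `encode` helpers and the sample values are hypothetical, and it models only simple dictionary encoding, not how DB2 actually builds or shares its dictionaries.

```python
def build_dictionary(values):
    """Map each distinct value to a small integer code."""
    return {v: code for code, v in enumerate(sorted(set(values)))}

def encode(column, dictionary):
    """Replace each value with its code."""
    return [dictionary[v] for v in column]

# Column A of "MyTable" on two data slices (hypothetical sample values).
slice1_a = ["CA", "NY", "CA", "TX"]
slice2_a = ["TX", "TX", "NY", "WA"]

# Common encoding: one dictionary for column A across all slices.
common = build_dictionary(slice1_a + slice2_a)
enc1 = encode(slice1_a, common)
enc2 = encode(slice2_a, common)

# Per-slice encoding, for contrast: the same value can get different codes.
local1 = build_dictionary(slice1_a)
local2 = build_dictionary(slice2_a)
```

With the common dictionary, "TX" has the same code on both slices, so encoded rows that move between partitions can still be compared as codes. With per-slice dictionaries, "TX" maps to different codes and would need decoding first.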
BLU and Database Partitioning Feature (DPF)
DB2 11.1 DISTRIBUTE BY RANDOM
• New clause on CREATE TABLE
• Ability to partition a table without choosing a partitioning key
• Can be used for row and column-organized tables
• If a primary or unique key is defined, it is used as the partitioning key
• Otherwise, an IMPLICITLY HIDDEN column is added to the table to serve as
the partitioning key
• Called RANDOM_DISTRIBUTION_KEY
• Excluded from SELECT *
• Otherwise, behaves like a regular column
• Simplifies schema design for some applications
• Provides syntactic compatibility with Netezza DDL
• Tradeoff: Can’t do collocated joins or complete aggregation/distincting
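The tradeoff in the last bullet can be seen in a small simulation. This is a minimal Python sketch with hypothetical names and data; integer modulo stands in for hash distribution on a key, and a seeded random generator stands in for random distribution.

```python
import random
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count

def partitions_per_key(keys, place):
    """Return {key: set of partitions that hold at least one row of that key}."""
    seen = defaultdict(set)
    for key in keys:
        seen[key].add(place(key))
    return seen

keys = [k % 4 for k in range(80)]   # 80 rows, 4 distinct key values
rng = random.Random(0)              # fixed seed so the run is repeatable

by_hash = partitions_per_key(keys, lambda k: k % NUM_PARTITIONS)
by_random = partitions_per_key(keys, lambda k: rng.randrange(NUM_PARTITIONS))
```

With hash distribution every key value lives on exactly one partition, so a GROUP BY or DISTINCT on that key can complete locally and equal keys from two tables meet on the same partition. With random placement the rows of one key value are scattered, so a second, merging step across partitions is needed.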
DB2 11.1 BLU Sort Support
DB2 11.1 BLU OLAP Function Support
BLU SORT and OLAP examples
  select
    ascending.rnk,
    ascending.item_sk worst_performing,
    descending.item_sk best_performing
  from
    (select item_sk,
            rank() over (order by rank_col asc) rnk
     from
       (select ss_item_sk item_sk,
               avg(ss_net_profit) rank_col
        from store_sales ss1
        group by ss_item_sk )) ascending,
    (select item_sk,
            rank() over (order by rank_col desc) rnk
     from
       (select ss_item_sk item_sk,
               avg(ss_net_profit) rank_col
        from store_sales ss1
        group by ss_item_sk )) descending
  where
    ascending.rnk = descending.rnk
  order by ascending.rnk
  fetch first 100 rows only;
Access plan (indentation shows each operator's inputs):
  RETURN (1)
    CTQ (2)
      TBSCAN (3)
        SORT (4)          Truncated SORT for ORDER BY and FF100RO (fetch first 100 rows only)
          HSJOIN (5)      OLAPs processed after MDTQs
            MDTQ (6)
              TBSCAN (7)
                SORT (8)        SORT for OLAP
                  TBSCAN (9)
                    TEMP (10)
            MDTQ (15)
              TBSCAN (16)
                SORT (17)       SORT for OLAP
                  TBSCAN (18)
                    TEMP (10)
Common sub-expression:
  TEMP (10)
    GRPBY (11)
      DTQ (12)
        GRPBY (13)
          TBSCAN (14)
            CO-TABLE: DB2USER.STORE_SALES
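The "truncated SORT" in this plan only needs the first 100 rows in order, not a full ordering. A minimal Python sketch of that idea follows; it is an illustration only, not the BLU sort algorithm, and the `top_n` helper and the sample rows are hypothetical.

```python
import heapq

def top_n(rows, n, key):
    """Keep only the n smallest rows while scanning, instead of sorting them all."""
    return heapq.nsmallest(n, rows, key=key)

# Hypothetical join output: 1000 rows with distinct rnk values in shuffled order.
rows = [{"rnk": (7 * i) % 1000, "item_sk": i} for i in range(1000)]

first_100 = top_n(rows, 100, key=lambda r: r["rnk"])
full_sort = sorted(rows, key=lambda r: r["rnk"])[:100]
```

Both expressions return the same 100 rows, but the truncated version never has to hold or order the remaining 900.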
Industry Leading Parallel Sort
• Leverages the latest sort innovations from IBM TJ Watson Research and
DB2 Development
• Enhancements can increase BLU Acceleration performance by as much as 13.9X
• BLU Sort+OLAP on SMP Environment
[Chart: query elapsed time (s), Row Sort vs. BLU Sort, for Sort+OLAP queries 1, 2, and 3; BLU Sort is labeled 4.6X, 5.4X, and 13.9X faster]
• Configuration Details
• On 4-socket Intel Xeon platform with 72 Cores and 742G RAM
• 1 TB TPC-DS database
• Query scenarios involving multiple sort and OLAP operations
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or
performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in
the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an
individual user will achieve results similar to those stated here.
BLU Nested-loop Join
BLU Nested-loop Join Example
Find the customer information for web sales that occurred when the back-to-school promotion was on.
  select
    c_customer_id,
    c_first_name,
    c_last_name,
    sum(ws_ext_sales_price)
  from
    customer c,
    promotion p,
    web_sales ws
  where
    p.p_promo_id = 'Back to school' and
    ws.ws_sold_date_sk > p_start_date_sk and
    ws.ws_sold_date_sk < p_end_date_sk and
    c_customer_sk = ws_bill_customer_sk
  group by
    c_customer_id,
    c_first_name,
    c_last_name
DB2 11.1 plan (estimated rows precede each operator; indentation shows inputs):
  1.79927e+06 CTQ (2)
    (...)
      299879 GRPBY (6)
        299879 HSJOIN (7)
          1.99914e+06 TBSCAN (8)
            1.99914e+06 CO-TABLE: DB2USER.CUSTOMER
          300470 DTQ (9)
            300470 NLJOIN (10)
              1 BTQ (11)
                1 TBSCAN (12)
                  1500 CO-TABLE: DB2USER.PROMOTION
              300470 TBSCAN (13)
                1.20188e+08 CO-TABLE: DB2USER.WEB_SALES
DB2 10.5 plan:
  299879 GRPBY (5)
    299879 TBSCAN (6)
      299879 SORT (7)
        299879 NLJOIN (8)
          1.20188e+08 CTQ (9)      120M rows returned by join and sent to row-processing
            1.20188e+08 ^HSJOIN (10)
              1.20188e+08 TBSCAN (11)
                1.20188e+08 CO-TABLE: DB2USER.WEB_SALES
              1.19948e+07 BTQ (12)
                1.99914e+06 TBSCAN (13)
                  1.99914e+06 CO-TABLE: DB2USER.CUSTOMER
          0.0025 TBSCAN (14)
            1 TEMP (15)
              1 BTQ (16)
                1 CTQ (17)
                  1 TBSCAN (18)
                    1500 CO-TABLE: DB2USER.PROMOTION
BLU Nested-loop Join Examples
• The NLJOIN inner can be stored in a BLU TEMP if very filtering local predicates are applied to the base table
  select
    distinct ss_item_sk,
    ws_item_sk
  from
    web_sales,
    store_sales
  where
    ss_quantity < 10 and
    ss_list_price < ws_list_price and
    ws_quantity > 99
Access plan (estimated rows precede each operator; indentation shows inputs):
  1.8e+06 CTQ (2)
    1.8e+06 DTQ (3)
      300000 UNIQUE (4)
        1.20798e+08 DTQ (5)
          1.20798e+08 NLJOIN (6)      SS_LIST_PRICE < WS_LIST_PRICE
            4.02673e+07 TBSCAN (7)
              4.58213e+08 CO-TABLE: DB2USER.STORE_SALES
            2.9999 TBSCAN (8)
              6 TEMP (9)
                6 BTQ (10)
                  1 TBSCAN (11)      WS_QUANTITY > 99
                    1.20188e+08 CO-TABLE: DB2USER.WEB_SALES
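The shape of this plan, a highly filtered inner side that is materialized once and then probed by every outer row with a non-equality predicate, can be sketched in a few lines. This is a minimal Python illustration with hypothetical helper names and sample rows; it does not model BLU TEMPs, table queues, or parallelism.

```python
def nljoin_with_temp(outer_rows, inner_rows, inner_filter, join_pred):
    """Nested-loop join whose inner side is filtered once into a temp."""
    temp = [r for r in inner_rows if inner_filter(r)]  # built a single time
    return [(o, i) for o in outer_rows for i in temp if join_pred(o, i)]

store_sales = [
    {"ss_item_sk": 1, "ss_quantity": 5, "ss_list_price": 20.0},
    {"ss_item_sk": 2, "ss_quantity": 50, "ss_list_price": 5.0},
    {"ss_item_sk": 3, "ss_quantity": 2, "ss_list_price": 90.0},
]
web_sales = [
    {"ws_item_sk": 7, "ws_quantity": 100, "ws_list_price": 30.0},
    {"ws_item_sk": 8, "ws_quantity": 10, "ws_list_price": 99.0},
]

# Local predicate on the outer table, then the join with a filtered inner.
outer = [s for s in store_sales if s["ss_quantity"] < 10]
pairs = nljoin_with_temp(
    outer, web_sales,
    inner_filter=lambda w: w["ws_quantity"] > 99,
    join_pred=lambda s, w: s["ss_list_price"] < w["ws_list_price"],
)

# SELECT DISTINCT ss_item_sk, ws_item_sk
result = sorted({(s["ss_item_sk"], w["ws_item_sk"]) for s, w in pairs})
```

Because the local predicate leaves very few inner rows, rescanning the temp for each outer row stays cheap even though the join predicate is an inequality.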
BLU Nested-loop Join (Not!)
• Some NLJOIN operators aren't really joins!
• They are a mechanism to pass scalar values
• i.e. scalar subqueries
  select
    sum(ss_quantity*ss_list_price) sales,
    count(*) number_sales
  from
    store_sales,
    date_dim
  where
    ss_sold_date_sk = d_date_sk and
    d_week_seq =
      (select
         d_week_seq
       from
         date_dim
       where
         d_year = 1998 and
         d_moy = 12 and
         d_dom = 16
      )
Access plan (estimated rows precede each operator; indentation shows inputs):
  CTQ (2)
    1 GRPBY (3)
      6 DTQ (4)
        1 GRPBY (5)
          43944.2 ^HSJOIN (6)
            4.58213e+08 TBSCAN (7)
              4.58213e+08 CO-TABLE: DB2USER.STORE_SALES
            6.99971 NLJOIN (8)      USAGE : (Usage of Join) SCALAR SUBQUERY
              1 BTQ (9)
                1.22836 TBSCAN (10)
                  73049 CO-TABLE: DB2USER.DATE_DIM
              6.99971 FILTER (11)
                73049 BTQ (12)
                  73049 TBSCAN (13)
                    73049 CO-TABLE: DB2USER.DATE_DIM
BLU Nested-loop Join (Not!)
• Some NLJOIN operators aren't really joins!
• They are a mechanism to pass scalar values
• Pass row-ids from a row-processing index scan to column-processing
• Extra columns retrieved, or predicates applied, by column processing
• Supported for single-row access only
• (Supported in 10.5 FP3)
• Supported in 11.1 MPP
• IXSCAN executed using row processing
• Unique index enforcing a primary or unique key constraint
• Reverse CTQ flows row-ids to column processing
  select
    c_first_name,
    c_last_name
  from
    customer
  where
    c_customer_sk = 1495578
Access plan (estimated rows precede each operator; indentation shows inputs):
  RETURN (1)
    1 DTQ (2)
      0.166667 PIPE (3)
        1 CMPEXP (4)
          1 CTQ (5)
            1 NLJOIN (6)
              1 RCTQ (7)
                1 IXSCAN (8)
                  1.99914e+06 INDEX: SYSIBM.SQL160309082316090
              0.166667 TBSCAN (9)
                1.99914e+06 CO-TABLE: DB2USER.CUSTOMER
SQL Functions Optimized for BLU Processing
• Allow more operations to execute in column-organized processing (below
the CTQ)
• String Functions
• LPAD, RPAD
• TO_CHAR
• INITCAP
• Numeric Functions
• POWER, EXP, LOG10, LN
• TO_NUMBER
• MOD
• SIN, COS, TAN, COT, ASIN, ACOS, ATAN
• TRUNCATE
• Date and Time Functions
• TO_DATE
• MONTHNAME, DAYNAME
• Miscellaneous
• COLLATION_KEY
DB2 11.1.1.1 Parallel Insert Enhancements
User-Defined Aggregate Functions
• Ability to create your own functions to be used with GROUP BY or OLAP aggregation
functions e.g.
SELECT WS_ITEM_SK,
MY_AVG(WS_LIST_PRICE) AVG_LISTP
FROM WEB_SALES
GROUP BY WS_ITEM_SK;
SELECT WS_ITEM_SK,
MY_AVG(WS_LIST_PRICE) OVER (PARTITION BY WS_ITEM_SK)
FROM WEB_SALES;
User-Defined Aggregate Functions
  SELECT WS_ITEM_SK,
         MY_AVG(WS_LIST_PRICE) AVG_LISTP
  FROM WEB_SALES
  GROUP BY WS_ITEM_SK;
Access plan (estimated rows precede each operator; indentation shows inputs):
  RETURN (1)
    300456 GRPBY (2)      AGGMODE=FINAL, PHASES: MERGE, FINALIZE
      300456 LMTQ (3)
        300456 GRPBY (4)      AGGMODE=PARTIAL, PHASES: INITIALIZE, ACCUMULATE
          7.21128e+08 FETCH (5)
            7.21128e+08 IXSCAN (6)
              7.21128e+08 INDEX: SYSIBM.SQL170221153923880
            7.21128e+08 TABLE: DB2USER.WEB_SALES
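The four phases named in the plan (INITIALIZE, ACCUMULATE, MERGE, FINALIZE) map onto a two-step aggregation: a partial GRPBY per subagent or partition, then a final GRPBY that merges the partial states. The following is a minimal Python sketch of an average aggregate in that style. It is not DB2's user-defined aggregate interface; the function names, the state layout, the NULL handling, and the sample data are assumptions made for illustration.

```python
def initialize():
    """INITIALIZE: create an empty aggregation state."""
    return {"sum": 0.0, "count": 0}

def accumulate(state, value):
    """ACCUMULATE: fold one input value into the state (NULLs skipped by assumption)."""
    if value is not None:
        state["sum"] += value
        state["count"] += 1
    return state

def merge(a, b):
    """MERGE: combine two partial states for the same group."""
    return {"sum": a["sum"] + b["sum"], "count": a["count"] + b["count"]}

def finalize(state):
    """FINALIZE: turn the merged state into the result value."""
    return state["sum"] / state["count"] if state["count"] else None

def partial_grpby(rows):
    """AGGMODE=PARTIAL: one state per group, built from local rows only."""
    groups = {}
    for key, value in rows:
        accumulate(groups.setdefault(key, initialize()), value)
    return groups

def final_grpby(partials):
    """AGGMODE=FINAL: merge the partial states, then finalize each group."""
    merged = {}
    for groups in partials:
        for key, state in groups.items():
            merged[key] = merge(merged[key], state) if key in merged else state
    return {key: finalize(state) for key, state in merged.items()}

# Hypothetical (ws_item_sk, ws_list_price) rows seen by three subagents.
inputs = [
    [("item1", 10.0), ("item2", 4.0)],
    [("item1", 20.0)],
    [("item1", 30.0), ("item2", 6.0)],
]
result = final_grpby([partial_grpby(rows) for rows in inputs])
```

Only the small partial states cross the table queue, which is why the plan can aggregate below the LMTQ and finish above it.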
Automatic Query Transformations
• UPDATE/DELETE statements with correlated SET clauses
• De-correlate when no index is available
• Correlation requires expensive looping and scanning
Original statement:
  UPDATE CUST C SET LOGIN =
    (SELECT C_LOGIN FROM CUST_STAGING CS
     WHERE C.CUST_ID = CS.CUST_ID)
  WHERE EXISTS(
    SELECT 1 FROM CUST_STAGING CS2
    WHERE C.CUST_ID = CS2.CUST_ID);
Transformed statement:
  UPDATE C SET C.C_LOGIN = Q2.LOGIN
  WHERE Q2.ROWID=C.ROWID
    (SELECT DISTINCT C.ROWID, Q1.LOGIN
     FROM CUST C, CUST_STAGING CS, Q1
     WHERE C.CUST_ID = CS.CUST_ID) AS Q2
UPDATE/DELETE statements with correlated SET clauses
• Replace correlated sub-select with a join
• Detect duplicates using COUNT(*) for each ROWID, raise -811 error if any found
• Typically, there are none
  UPDATE CUST SET LOGIN = Q3.LOGIN
  WHERE Q3.ROWID=ROWID
    (SELECT Q2.ROWID, Q2.LOGIN,
       (CASE WHEN CNT > 1 THEN RAISE_ERROR (-811) ),
     FROM Q2) AS Q3
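The transformation outlined above (join once, count matches per ROWID, raise an error on duplicates, then update) can be sketched over in-memory rows. This is a minimal Python sketch that follows the slide's outline, not the optimizer's actual rewrite; list positions stand in for ROWIDs, and `CardinalityError` and `decorrelated_update` are hypothetical names.

```python
from collections import Counter, defaultdict

class CardinalityError(Exception):
    """Stands in for SQLCODE -811 (a scalar subselect returned more than one row)."""

def decorrelated_update(cust, cust_staging):
    # Build a lookup on the staging table once (the build side of the join),
    # instead of rescanning it for every CUST row.
    logins_by_id = defaultdict(list)
    for cs in cust_staging:
        logins_by_id[cs["cust_id"]].append(cs["login"])

    # Q2: DISTINCT (ROWID, LOGIN) pairs produced by the join.
    q2 = {(rowid, login)
          for rowid, c in enumerate(cust)
          for login in logins_by_id.get(c["cust_id"], [])}

    # Q3: COUNT(*) per ROWID; more than one match is an error.
    counts = Counter(rowid for rowid, _ in q2)
    if any(n > 1 for n in counts.values()):
        raise CardinalityError("-811")

    # Apply the update only to rows that found a match (the EXISTS condition).
    for rowid, login in q2:
        cust[rowid]["login"] = login
    return cust

cust = [{"cust_id": 1, "login": "old1"}, {"cust_id": 2, "login": "old2"}]
staging = [{"cust_id": 1, "login": "new1"}]
updated = decorrelated_update(cust, staging)
```

The staging table is read once regardless of how many CUST rows there are; the price is the extra counting step, which in the typical case finds no duplicates.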
UPDATE/DELETE statements with correlated SET clauses
DB2 10.5
• 2 CTQs
• TBSCAN (12) is correlated to CTQ(7), executes 12M times!
• SORT not pushed down
DB2 10.5 plan (estimated rows precede each operator; indentation shows inputs):
  1.19948e+07 UPDATE (2)
    1.19948e+07 RCTQ (3)
      1.19948e+07 TBSCAN (4)
        1.19948e+07 SORT (5)
          1.19948e+07 NLJOIN (6)
            1.19948e+07 CTQ (7)
              1.19948e+07 HSJOIN (8)
                1.19948e+07 TBSCAN (9)
                  1.19948e+07 CO-TABLE: DB2USER.CUSTOMER
                1.19948e+07 TBSCAN (10)
                  1.19948e+07 CO-TABLE: DB2USER.CUSTOMER_STAGING
            1 CTQ (11)
              1 TBSCAN (12)
                1.19948e+07 CO-TABLE: DB2USER.CUSTOMER_STAGING
    1.19948e+07 CO-TABLE: DB2USER.CUSTOMER
DB2 11.1
• 1 CTQ
• No correlation
• Distincting pushed down
• But an extra GROUP BY
DB2 11.1 plan:
  1.19948e+07 UPDATE (2)
    1.19948e+07 RCTQ (3)
      1.19948e+07 CTQ (4)
        1.19948e+07 GRPBY (5)
          1.19948e+07 UNIQUE (6)
            1.50027e+08 HSJOIN (7)
              1.19948e+07 HSJOIN (8)
                1.19948e+07 TBSCAN (8)
                  1.19948e+07 CO-TABLE: DB2USER.CUSTOMER
                1.19948e+07 TBSCAN (10)
                  1.19948e+07 CO-TABLE: DB2USER.CUSTOMER_STAGING
              1.19948e+07 TBSCAN (11)
                1.19948e+07 CO-TABLE: DB2USER.CUSTOMER_STAGING
    1.19948e+07 CO-TABLE: DB2USER.CUSTOMER
UPDATE/DELETE statements with correlated SET clauses
Row-store tables
DB2 10.5
• The same problem can occur for row-store tables too
• TBSCAN (9) is scanned 12M times!
• An index could be added to CUSTOMER_STAGING_R
DB2 10.5 plan (estimated rows precede each operator; indentation shows inputs):
  1.19948e+07 UPDATE (2)
    1.19948e+07 FETCH (3)
      1.19948e+07 TBSCAN (4)
        1.19948e+07 SORT (5)
          1.19948e+07 HSJOIN (6)
            1.19948e+07 NLJOIN (7)
              1.19948e+07 TBSCAN (8)
                1.19948e+07 TABLE: DB2USER.CUSTOMER_R
              1 TBSCAN (9)
                1.19948e+07 TABLE: DB2USER.CUSTOMER_STAGING_R
            1.19948e+07 TBSCAN (10)
              1.19948e+07 TABLE: DB2USER.CUSTOMER_STAGING_R
      1.19948e+07 TABLE: DB2USER.CUSTOMER_R
    1.19948e+07 TABLE: DB2USER.CUSTOMER_R
DB2 11.1
• No correlation
• But an extra GROUP BY
DB2 11.1 plan:
  1.19948e+07 UPDATE (2)
    1.19948e+07 FETCH (3)
      1.19948e+07 GRPBY (4)
        1.19948e+07 TBSCAN (5)
          1.19948e+07 SORT (6)
            2.42356e+07 HSJOIN (7)
              1.19948e+07 HSJOIN (8)
                1.19948e+07 TBSCAN (9)
                  1.19948e+07 TABLE: BCULINUX.CUSTOMER_R
                1.19948e+07 TBSCAN (10)
                  1.19948e+07 TABLE: BCULINUX.CUSTOMER_STAGING_R
              1.93767e+06 pUNIQUE (11)
                1.19948e+07 TBSCAN (12)
                  1.19948e+07 TABLE: BCULINUX.CUSTOMER_STAGING_R
      1.19948e+07 TABLE: BCULINUX.CUSTOMER_R
    1.19948e+07 TABLE: BCULINUX.CUSTOMER_R
Inline Optimization Guidelines (Hints)
Appendix
ADDITIONAL MATERIAL
BLU Query Optimization - Terminology
• Late materialization
• Columns are retrieved as late as possible depending on
predicate filtering
• Occurs for TBSCANs and probe side of HSJOINs
• e.g. SELECT C1, C2, C3 FROM T1 WHERE C1=5 AND C2=10
1. SCAN C1, apply C1=5, return row IDs
2. Using row IDs from 1), SCAN C2, apply C2=10, return row IDs
3. Using row IDs from 2), SCAN C3 and return values
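The three steps above can be traced on a toy column-organized table. This is a minimal Python sketch for illustration only: one list per column, list positions used as row IDs, and a hypothetical `scan_filter` helper. It says nothing about BLU's actual storage format or vector processing.

```python
# Toy column-organized table T1: one list per column, aligned by row ID.
T1 = {
    "C1": [5, 5, 7, 5, 9],
    "C2": [10, 3, 10, 10, 10],
    "C3": ["a", "b", "c", "d", "e"],
}

def scan_filter(column, predicate, row_ids=None):
    """Scan one column (optionally only the given row IDs) and return the
    row IDs whose value satisfies the predicate."""
    candidates = range(len(column)) if row_ids is None else row_ids
    return [rid for rid in candidates if predicate(column[rid])]

# Step 1: scan C1, apply C1=5, return row IDs.
ids_1 = scan_filter(T1["C1"], lambda v: v == 5)

# Step 2: visit C2 only at the surviving row IDs, apply C2=10.
ids_2 = scan_filter(T1["C2"], lambda v: v == 10, ids_1)

# Step 3: fetch C3 (and the already-qualified C1, C2) for the final row IDs.
result = [(T1["C1"][r], T1["C2"][r], T1["C3"][r]) for r in ids_2]
```

Each later column is touched only at row IDs that survived the earlier predicates, so C3 is read for two rows here rather than five.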
Late materialization
  select
    c.first_name,
    c.last_name,
    ds.sales_price
  from
    customer c,
    date d,
    daily_sales ds
  where
    ds.perkey = d.perkey and
    ds.custkey = c.custkey and
    d.year = 2015
[Figure: processing order, bottom to top:
  SCAN Date and SCAN Daily Sales
  LOAD DS.PERKEY
  JOIN DS.PERKEY = D.PERKEY
  LOAD DS.CUSTKEY
  JOIN DS.CUSTKEY = C.CUSTKEY (with SCAN Customer)
  LOAD DS.SALES_PRICE]
BLU Query Optimization - Terminology
• Row materialization
• Column-organized data is reconstructed as row-organized data
• Performed by the column-organized table queue (CTQ)
• CTQ placement is determined by the optimizer
• Subsection degree is determined by the optimizer
• Dynamically readjusted at runtime with DEGREE=ANY
BLU Query Optimization - Explain example
• Subsection 1 (1 subagent): row processing
• Subsection 2 (32 subagents): column processing
Access plan (indentation shows each operator's inputs):
  RETURN (1)
    CTQ (2)
      MDTQ (3)
        TBSCAN (4)
          SORT (5)
            GRPBY (6)
              DTQ (7)
                GRPBY (8)      AGGMODE : (Aggregation Mode) HASHED COMPLETE
                  ^HSJOIN (9)
                    ^HSJOIN (10)
                      TBSCAN (11)
                        CO-TABLE: DB2USER.WEB_SALES
                      BTQ (12)
                        TBSCAN (13)
                          CO-TABLE: DB2USER.DATE_DIM
                    TBSCAN (14)
                      CO-TABLE: DB2USER.ITEM
Table queue properties shown in the explain output:
  SCANGRAN: (Intra-Partition Parallelism Scan Granularity) 200
  SCANTYPE: (Intra-Partition Parallelism Scan Type) LOCAL PARALLEL
  SCANUNIT: (Intra-Partition Parallelism Scan Unit) ROW
  TQDEGREE: (Degree of Intra-Partition parallelism) 32
  TQORIGIN: (Table Queue Origin type) COLUMN-ORGANIZED DATA
Automatic Query Transformations
[Figure: DB2 10.5 diagram showing CUST_ADDR (CA)]
Queries with correlated scalar sub-selects
[Figure: diagram showing CUST (C) and CUST_ADDR (CA)]
Queries with correlated scalar sub-selects
DB2 10.5
• 2 CTQs
• TBSCAN (6) is correlated to CTQ(3), executes 12M times!
DB2 10.5 plan (estimated rows precede each operator; indentation shows inputs):
  RETURN (1)
    1.19948e+07 NLJOIN (2)
      1.19948e+07 CTQ (3)
        1.19948e+07 TBSCAN (4)
          1.19948e+07 CO-TABLE: DB2USER.CUSTOMER
      1 CTQ (5)
        1 TBSCAN (6)
          5.98932e+06 CO-TABLE: DB2USER.CUSTOMER_ADDRESS
DB2 11.1
• 1 CTQ
• No correlation
• Join is pushed down
• Requires an extra GROUP BY
DB2 11.1 plan:
  RETURN (1)
    1.19948e+07 CTQ (2)
      1.19948e+07 GRPBY (3)
        1.19948e+07 HSJOIN< (4)
          5.98932e+06 TBSCAN (5)
            5.98932e+06 CO-TABLE: DB2USER.CUSTOMER_ADDRESS
          1.19948e+07 TBSCAN (6)
            1.19948e+07 CO-TABLE: DB2USER.CUSTOMER