Cap 5

This document discusses parallel processing concepts in 3 main points: 1. It explains the benefits of parallel processing such as reducing response times and maximizing hardware usage. It describes operations that can be parallelized like scans, joins, and DDL statements. 2. It introduces parallel execution terminology like degree of parallelism, granules, and parallel execution servers. It explains how these components work together for parallel operations. 3. It discusses features for controlling and enabling parallelism like the parallel execution server pool and adaptive multi-user settings. It provides examples of how parallel queries are executed in real-world systems.

Uploaded by

Philipe Rodrigo

Parallelism Concepts

Objectives

After completing this lesson, you should be able to do the following:
• Explain the benefit of using parallel operations
• List operations that can be parallelized and how they work
• Describe parallel execution terminology
• Control the parallel execution server pool
• Enable parallel execution
• Interpret a parallel execution plan
• Describe parallel execution features

5-2
Lesson Agenda

• Introduction to Parallel Execution
– Benefits of using parallel operations
– Operations that can be parallelized and how they work
• Parallel execution terminology
• Controlling the parallel execution server pool
• Enabling Parallel Execution
• Parallel execution plan
• Parallel Execution Features

5-3
Why Use Parallel Processing?

Do I have to count all of this myself?

5-4
Basic Theory Behind Parallel Processing

• Determine how many processes to use: PX servers / DOP
• Divide the data

5-5
Understanding Data Partitioning Architectures

• Shared nothing: each node owns its own slice of the data (A-E, F-K, L-S, T-Z). Static data partitioning is required.
• Shared everything: every node can access all of the data (A-Z). No data partitioning is required.

5-6
Benefits of Parallel Processing

• Server without parallelism: one CPU scans while the other three CPUs are idle.
• Server with parallelism: all four CPUs scan. Maximum gain with four processes.

5-7
When to Use Parallel Processing

Property           Candidates for Serial                  Candidates for PX
Response time      Subseconds to seconds                  Seconds to hours
Data organization  Application                            Subject, time
Activities         Processes                              Analysis
Nature of data     Generally 30–60 days, transactional    Snapshots over time, derived data/aggregates
Size               Small to large                         Large to very large
Data sources       Operational, internal                  Operational, internal, external
Duplicated data    Normalized RDBMS                       Denormalized RDBMS (ROLAP or MOLAP)

5-8
Operations That Can Be Parallelized

• Access methods:
– Table scans, fast full index scans
– Partitioned index range scans
– Various SQL operations
• Joins:
– Nested loop, sort merge
– Hash, star transformation, partitionwise join
• DDL statements:
– CTAS, CREATE INDEX, REBUILD INDEX [PARTITION]
– MOVE, SPLIT, COALESCE PARTITION
• DML statements:
– INSERT SELECT, UPDATE, DELETE, MERGE

5-9
Scanning a Table in Parallel

• With serial execution, only one process is used.
• With parallel execution:
– One parallel execution coordinator process is used
– Many parallel execution servers are used
– A table is dynamically divided into granules

[Figure: a single serial process runs SELECT COUNT(*) FROM sales against the SALES table; with parallel execution, the coordinator process (QC) runs the same statement through several parallel execution servers (PX, named Pnnn).]
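
A minimal sketch of how the parallel scan could be observed. It assumes a SALES table in the current schema and read access to V$PQ_SESSTAT, and the degree of 4 is an arbitrary illustration.

```sql
-- Assumes a SALES table in the current schema; the degree of 4 is illustrative.
SELECT /*+ PARALLEL(sales, 4) */ COUNT(*)
FROM   sales;

-- Session-level parallel execution statistics for the query just run.
SELECT statistic, last_query, session_total
FROM   v$pq_sesstat
WHERE  statistic IN ('Queries Parallelized', 'Server Threads');
```

If the scan ran in parallel, LAST_QUERY for 'Server Threads' should be greater than zero.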

5 - 10
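Parallel Execution with Real Application Clusters (RAC)
Parallel execution servers have node affinity for the execution coordinator: they are allocated on the coordinator's node first and expand to other nodes if needed.

[Figure: four nodes (Node 1 to Node 4) over shared disks. The execution coordinator (QC) runs on one node, and parallel execution servers (Pnnn, for example P001, P002, P025, P026, P030, P031, P040) run across the nodes.]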

5 - 11
Lesson Agenda

• Introduction to Parallel Execution
• Parallel execution terminology
• Controlling the parallel execution server pool
• Enabling Parallel Execution
• Parallel execution plan
• Parallel Execution Features

5 - 12
The Granule

• The basic unit of work in parallelism is called the granule. Types of granules:
– Block range granules are dynamically generated at
execution time.
– Partition granules are statically determined by the
number of partitions.
• The granule type used depends on the kind of parallel
operation that is performed.
• Each parallel execution server works on one granule at a time, and each granule is processed by exactly one server.
• Parallel execution servers progress from one granule
to the next if there is work left to be done.

5 - 13
Degree of Parallelism (DOP)

• Degree of parallelism is the number of parallel execution servers used by one parallel operation.
• If inter-operation parallelism is used, the number of parallel execution servers can be twice the DOP.
• No more than two sets of parallel execution servers can be
used for one parallelized statement.
• Partition granules are used for certain operations and may
limit the degree of parallelism (the optimizer determines
whether or not you get a partition wise join).

5 - 14
Default Degree of Parallelism

The default DOP:
• Is used for a parallel operation that does not specify a DOP
• Is dynamically calculated at run time
• Depends on:
– Total number of CPUs
– PARALLEL_THREADS_PER_CPU
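
As a rough sketch, the two inputs can be read from V$PARAMETER. For a single instance, the default DOP is commonly documented as CPU_COUNT multiplied by PARALLEL_THREADS_PER_CPU. The product below ignores RAC instance counts and any Resource Manager limits, so treat it as an estimate only.

```sql
-- Estimate of the single-instance default DOP (assumption: no RAC, no Resource Manager cap).
SELECT MAX(CASE WHEN name = 'cpu_count' THEN TO_NUMBER(value) END)
     * MAX(CASE WHEN name = 'parallel_threads_per_cpu' THEN TO_NUMBER(value) END)
       AS estimated_default_dop
FROM   v$parameter
WHERE  name IN ('cpu_count', 'parallel_threads_per_cpu');
```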

5 - 15
Parallel Operations

SELECT cust_last_name, cust_first_name
FROM customers
ORDER BY cust_last_name;

[Figure: with DOP=2, a producer set of two execution servers scans the table on disk, which is dynamically partitioned into granules; a consumer set of two execution servers sorts the ranges A-M and N-Z; the coordinator (QC) dispatches the results.]

5 - 16
How Parallel Execution Servers Communicate
Rows distribution:
• PARTITION
• HASH
• RANGE
• BROADCAST
• ROUND ROBIN
• QC(RANDOM)
• QC(ORDER)

[Figure: the QC connected to parallel execution server set 1 (P003, P004) and parallel execution server set 2 (P001, P002).]

DOP=2
Processes = 2*DOP
Connections = DOP^2 + 2*DOP
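
A worked instance of this arithmetic for DOP = 2, the value shown on the slide. The usual reading is that the DOP squared term counts connections between the two server sets and the 2*DOP term counts one connection from each server to the QC.

```sql
-- With DOP = 2: processes = 2*2 = 4, connections = 2*2 + 2*2 = 8.
SELECT 2 * dop             AS processes,
       dop * dop + 2 * dop AS connections
FROM   (SELECT 2 AS dop FROM dual);
```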

5 - 17
Lesson Agenda

• Introduction to Parallel Execution
• Parallel execution terminology
• Controlling the parallel execution server pool
• Enabling Parallel Execution
• Parallel execution plan
• Parallel Execution Features

5 - 18
Parallel Execution Server Pool

• A pool of servers is created at instance startup.
• Minimum pool size is determined by PARALLEL_MIN_SERVERS.
• Pool size can increase based on demand.
• Maximum pool size is determined by
PARALLEL_MAX_SERVERS.
• If a parallel execution server is idle for more than a
threshold period of time, it is terminated.
– Processes specified in the minimum set are never
terminated.
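
A small sketch for inspecting the pool, assuming the session can read V$PARAMETER and V$PX_PROCESS_SYSSTAT.

```sql
-- Configured bounds of the parallel execution server pool.
SELECT name, value
FROM   v$parameter
WHERE  name IN ('parallel_min_servers', 'parallel_max_servers');

-- Current pool activity (servers in use, available, started, shut down).
SELECT statistic, value
FROM   v$px_process_sysstat
WHERE  statistic LIKE 'Servers%';
```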

5 - 19
Adaptive Multiuser and DOP

• The adaptive multiuser feature adjusts the DOP on the basis of user load.
• PARALLEL_ADAPTIVE_MULTI_USER when set to:
– TRUE (default) improves performance in a multiuser environment
– FALSE is used for batch processing

5 - 20
Lesson Agenda

• Introduction to Parallel Execution
• Parallel execution terminology
• Controlling the parallel execution server pool
• Enabling Parallel Execution
• Parallel execution plan

5 - 21
Enabling Parallel DML, DDL, and QUERY

• The ALTER SESSION statement enables parallel mode:
ALTER SESSION { ENABLE | DISABLE | FORCE }
PARALLEL { DML | DDL | QUERY } [ PARALLEL n ];

• FORCE … PARALLEL n is used to override the default degree of parallelism.
• A degree supplied in a hint overrides a forced degree
of parallelism.
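
A few concrete forms of the statement, as a sketch. The degree of 4 is an illustrative value, not a recommendation.

```sql
-- Parallel DML is not enabled by default, so enable it explicitly for the session.
ALTER SESSION ENABLE PARALLEL DML;

-- Force queries in this session to a degree of 4 (illustrative value).
ALTER SESSION FORCE PARALLEL QUERY PARALLEL 4;

-- Return DDL to serial execution for this session.
ALTER SESSION DISABLE PARALLEL DDL;
```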

5 - 22
Enabling Parallel DML, DDL, and QUERY

• You can use V$SESSION to look at session status:
– PDML_STATUS
– PDDL_STATUS
– PQ_STATUS
• Values for the above columns can be:
– ENABLED
– DISABLED
– FORCED
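
A sketch of that check for the current session, assuming SELECT access to V$SESSION.

```sql
-- Parallel DML, DDL, and query status of the current session.
SELECT pdml_status, pddl_status, pq_status
FROM   v$session
WHERE  sid = TO_NUMBER(SYS_CONTEXT('USERENV', 'SID'));
```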

5 - 23
Enabling Parallelization and Determining DOP

• A SQL statement can be parallelized if:
– It includes a PARALLEL hint
– Parallelization is forced using the ALTER SESSION FORCE
command
– The object operated on is or was declared with a PARALLEL
clause (dictionary DOP greater than one)
– The resource manager allows DOP greater than one
• DOP is determined by looking at referenced objects:
– Parallel queries use the largest specified or dictionary DOP
– Parallel DDL sets the DOP to the one specified by the
PARALLEL clause
– The final DOP may be reduced based on availability of
resources and tools such as resource manager

5 - 24
Using Parallelization Hints

The following parallelization hints are used to override existing DOPs:
• PARALLEL (table_name, DOP_value)
SELECT /*+ PARALLEL(sales,8) */ *
FROM sales;

• NOPARALLEL (table_name)
• PARALLEL_INDEX (table_name, index, DOP_value)
SELECT /*+ PARALLEL_INDEX(c,ic,4) */ *
FROM customers c
WHERE cust_city = 'CLERMONT';
• NOPARALLEL_INDEX (table_name, index)

5 - 25
PARALLEL Clause: Examples

CREATE INDEX ord_customer_ix ON oe.orders (customer_id)
NOLOGGING PARALLEL;

ALTER TABLE customers PARALLEL 4;

ALTER TABLE sales SPLIT PARTITION sales_q4_2000
AT ('15-NOV-2000')
INTO (PARTITION sales_q4_1, PARTITION sales_q4_2)
PARALLEL 2;

5 - 26
Object’s PARALLEL Clause

• It can be specified for tables and indexes.
• View the degree of parallelism in the DEGREE column of DBA_TABLES and DBA_INDEXES (dictionary DOP).
• It is modified using the corresponding ALTER command:
ALTER TABLE sales NOPARALLEL;
ALTER TABLE sales PARALLEL 8;

• It is used to specify the DOP during the object’s DDL:
– CREATE INDEX
– CREATE TABLE … AS SELECT
– Partition maintenance commands
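
A sketch of reading the dictionary DOP. The owner SH and the table names are assumptions taken from Oracle's sample schemas, so substitute your own. DEGREE is a padded character column, hence the TRIM.

```sql
-- Dictionary DOP for tables (owner and names are illustrative).
SELECT table_name, TRIM(degree) AS degree
FROM   dba_tables
WHERE  owner = 'SH'
AND    table_name IN ('SALES', 'CUSTOMERS');

-- Dictionary DOP for the indexes of one table.
SELECT index_name, TRIM(degree) AS degree
FROM   dba_indexes
WHERE  owner = 'SH'
AND    table_name = 'SALES';
```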

5 - 27
Lesson Agenda

• Introduction to Parallel Execution
• Parallel execution terminology
• Controlling the parallel execution server pool
• Enabling Parallel Execution
• Parallel execution plan
• Parallel Execution Features

5 - 28
Parallel Execution Plan

• For the same statement, a parallel plan generally differs from the corresponding serial plan.
• To generate the execution plan, use the EXPLAIN
PLAN command, or execute the statement.
• To view the execution plan:
– Select directly from PLAN_TABLE
– Select directly from V$SQL_PLAN
– Run $ORACLE_HOME/rdbms/admin/utlxplp.sql
– Use the DBMS_XPLAN.DISPLAY table function
• Columns of interest:
– OBJECT_NODE
– OTHER_TAG
– DISTRIBUTION
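
A minimal sketch of generating and viewing a parallel plan. It assumes a SALES table and a PLAN_TABLE that holds only this statement, and the hint degree is illustrative.

```sql
EXPLAIN PLAN FOR
SELECT /*+ PARALLEL(s, 4) */ COUNT(*)
FROM   sales s;

-- Formatted plan; parallel plans add the TQ, IN-OUT, and PQ Distrib columns.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- The raw columns of interest.
SELECT id, operation, options, object_node, other_tag, distribution
FROM   plan_table
ORDER  BY id;
```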

5 - 29
OTHER_TAG Column

OTHER_TAG                        Interpretation
SERIAL                           Serial execution
SERIAL_FROM_REMOTE (S -> R)      Serial execution at a remote site
PARALLEL_FROM_SERIAL (S -> P)    Serial execution: Output partitioned or broadcast to PX
PARALLEL_TO_PARALLEL (P -> P)    Parallel execution: Output repartitioned to second set of PX
PARALLEL_TO_SERIAL (P -> S)      Parallel execution: Output returns to coordinator

5 - 30
Lesson Agenda

• Introduction to Parallel Execution
• Parallel execution terminology
• Controlling the parallel execution server pool
• Enabling Parallel Execution
• Parallel execution plan
• Parallel Execution Features
– Automatic degree of parallelism
– New and changed parameters
– Changes in using parallelization hints
– Parallel Statement Queuing
– In-Memory Parallel Query
– Explain Plan enhancements

5 - 31
Automatic Degree of Parallelism Determination

1. The SQL statement is hard parsed and the optimizer determines the execution plan.
2. If the estimated time is less than the threshold*, the statement executes serially.
3. If the estimated time is greater than the threshold*, the optimizer determines autoDOP based on all scan operations.
4. The statement executes in parallel with Actual DOP = MIN(PARALLEL_DEGREE_LIMIT, autoDOP).

* Threshold set in PARALLEL_MIN_TIME_THRESHOLD (default = 10s)

5 - 32
What Parameters to Use?

5 - 33
Enabling Auto Degree of Parallelism

Instance or session parameter PARALLEL_DEGREE_POLICY:
• MANUAL
– The DBA manually specifies all aspects of parallelism.
– No new features are enabled.
• LIMITED
– Auto DOP restricted for queries with tables decorated with
PARALLEL (if an explicit DOP is specified, use that one)
– No Statement Queuing, no In-Memory Parallel Execution
• AUTO
– All qualifying statements are evaluated for parallel execution.
– DOP is automatically computed.
– DOP set on tables is ignored.
– Statements can be queued.
– In-memory PX is available.
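
A sketch of setting the policy at each scope. Which value to use is a site decision, and the values below are only examples.

```sql
-- Session scope: try automatic DOP without affecting other sessions.
ALTER SESSION SET parallel_degree_policy = AUTO;

-- Instance scope (requires the ALTER SYSTEM privilege).
ALTER SYSTEM SET parallel_degree_policy = LIMITED;

-- Confirm the value in effect for the current session.
SELECT value
FROM   v$parameter
WHERE  name = 'parallel_degree_policy';
```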

5 - 34
New PARALLEL_DEGREE_POLICY Value

Instance or session parameter PARALLEL_DEGREE_POLICY:
• ADAPTIVE
– All qualifying statements are evaluated for parallel execution.
– DOP is automatically computed.
– DOP set on tables is ignored.
– Statements can be queued.
– In-memory PX is available.
– Performance feedback is enabled (New).

5 - 35
Auto DOP Requirements

1. Require optimizer statistics and the cost of scan operations
– No statistics:
— Cannot appropriately decide between parallelizing and not
parallelizing
– Can lead to:
— Incorrect decision based on incorrect costing based on incorrect
information
— Too high or too low a DOP
— Too much queuing if DOP overestimated
2. Require hardware characteristics including I/O calibration
– No I/O statistics:
— Automatic DOP uses default value of 200MB/s.
— Automatic DOP default value may be inaccurate.
– Solution: Run I/O calibration with Resource Manager.
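
A sketch of an I/O calibration run using DBMS_RESOURCE_MANAGER.CALIBRATE_IO from SQL*Plus. The disk count of 4 and the latency target of 20 ms are assumptions to replace with your own values. Calibration generates a heavy I/O load, so it is normally run during a quiet period by a suitably privileged user.

```sql
SET SERVEROUTPUT ON
DECLARE
  l_max_iops       PLS_INTEGER;
  l_max_mbps       PLS_INTEGER;
  l_actual_latency PLS_INTEGER;
BEGIN
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO(
    num_physical_disks => 4,    -- assumption: number of physical disks
    max_latency        => 20,   -- assumption: tolerated latency in milliseconds
    max_iops           => l_max_iops,
    max_mbps           => l_max_mbps,
    actual_latency     => l_actual_latency);
  DBMS_OUTPUT.PUT_LINE('max_iops=' || l_max_iops ||
                       ' max_mbps=' || l_max_mbps ||
                       ' actual_latency=' || l_actual_latency);
END;
/
```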

5 - 36
Using PARALLEL_MIN_TIME_THRESHOLD

• Automatic degree of parallelism determines the DOP on a per statement basis rather than using a one-size-fits-all policy.
• It determines that it is not beneficial for some statements to be executed in parallel.
– This improves the response time of the SQL statement while also making parallel resources available to other more demanding statements.
– A query may be downgraded to serial, based on a user specified cut-off time controlled by the PARALLEL_MIN_TIME_THRESHOLD parameter.

New Parameter                Allowable Values      Default
PARALLEL_MIN_TIME_THRESHOLD  Any number > 0, AUTO  AUTO

5 - 37
Using PARALLEL_DEGREE_LIMIT

• The maximum degree of parallelism for a statement is capped by the default DOP.
• In some cases, this DOP might be too high.
• PARALLEL_DEGREE_LIMIT enables you to specify a maximum DOP.

[Figure: four CPUs, each running a scan. Maximum DOP?]

New Parameter          Allowable Values               Default
PARALLEL_DEGREE_LIMIT  Any number > 1, CPU, IO, AUTO  CPU

5 - 38
Using PARALLEL_SERVERS_TARGET

• PARALLEL_SERVERS_TARGET specifies the number of parallel server processes allowed to run parallel statements before statement queuing is used.
– When PARALLEL_DEGREE_POLICY is set to AUTO, Oracle queues SQL statements that require parallel execution if the necessary parallel server processes are not available.
– By default, PARALLEL_SERVERS_TARGET is set lower than the maximum number of parallel server processes allowed on the system (PARALLEL_MAX_SERVERS).

New Parameter            Allowable Values           Default
PARALLEL_SERVERS_TARGET  0 to PARALLEL_MAX_SERVERS  4 * CPU_COUNT * PARALLEL_THREADS_PER_CPU * ACTIVE_INSTANCE_COUNT

5 - 39
For RAC: Using PARALLEL_FORCE_LOCAL

• When PARALLEL_FORCE_LOCAL is set to TRUE, it restricts the allocation of parallel server processes to the node to which the query coordinator is connected in a RAC environment.
– Parallel statements are executed within a particular instance to avoid any interconnect traffic with other instances.
– If a user connects to a RAC service that encompasses more than one RAC node, PARALLEL_FORCE_LOCAL restricts the allocation of parallel server processes to whichever node the initial connection was mapped to.

New Parameter         Allowable Values  Default
PARALLEL_FORCE_LOCAL  TRUE, FALSE       FALSE
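
A sketch of enabling the restriction for one session and then checking, while a parallel statement is running, which instances host its servers. It assumes read access to GV$PX_SESSION.

```sql
-- Keep this session's parallel execution servers on the local instance.
ALTER SESSION SET parallel_force_local = TRUE;

-- Run from another session while a parallel statement executes:
-- counts parallel execution sessions per instance.
SELECT inst_id, COUNT(*) AS px_sessions
FROM   gv$px_session
GROUP  BY inst_id
ORDER  BY inst_id;
```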

5 - 40
Summary of Important Parameters for Parallel Execution Features

Parameter                    Default Value                                                  Description
PARALLEL_DEGREE_LIMIT        CPU                                                            Max DOP that can be granted with Auto DOP
PARALLEL_DEGREE_POLICY       MANUAL                                                         Specifies if Auto DOP, Queuing, and In-memory PE are enabled
PARALLEL_MIN_TIME_THRESHOLD  AUTO                                                           Specifies minimum execution time a statement should have before Auto DOP will kick in (baseline default = 10 seconds)
PARALLEL_SERVERS_TARGET      4 * CPU_COUNT * PARALLEL_THREADS_PER_CPU * ACTIVE_INSTANCES    Specifies number of parallel processes allowed to run parallel statements before queuing will be used
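
A sketch for reviewing and adjusting these four parameters. The values 30 and 16 are illustrative assumptions only.

```sql
-- Current values, and whether each is still at its default.
SELECT name, value, isdefault
FROM   v$parameter
WHERE  name IN ('parallel_degree_limit',
                'parallel_degree_policy',
                'parallel_min_time_threshold',
                'parallel_servers_target')
ORDER  BY name;

-- Consider Auto DOP only for statements estimated to run 30 seconds or longer.
ALTER SESSION SET parallel_min_time_threshold = 30;

-- Cap the DOP that Auto DOP can grant at 16.
ALTER SYSTEM SET parallel_degree_limit = 16;
```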

5 - 41
Parallel Hints Are Now at the Statement Level

• The scope of parallel hints is now at the statement level, superseding parallelism specified at table and object level.
• The PARALLEL_DEGREE_POLICY initialization parameter
set to MANUAL (default): Hints beginning with PARALLEL
indicate the degree of parallelism for a specified object.
• PARALLEL_DEGREE_POLICY set to LIMITED: The scope
of the PARALLEL hint is the statement, not an object.

5 - 42
Implication of Statement-Level Parallel Hints

• Set parallelism on the EMPLOYEES table to 2 and disable parallelism on the DEPARTMENTS table, as follows:
ALTER TABLE employees PARALLEL 2;
ALTER TABLE departments NOPARALLEL;

• PARALLEL_DEGREE_POLICY is set to MANUAL, so the parallel hint applies to an object (the hint references the table alias e):
SELECT /*+ PARALLEL(e 3) */ e.last_name, d.department_name
FROM employees e, departments d
WHERE e.department_id=d.department_id;
• PARALLEL_DEGREE_POLICY is set to LIMITED, so the parallel hint applies to the statement:
SELECT /*+ PARALLEL(4) */ hr_emp.last_name, d.department_name
FROM employees hr_emp, departments d
WHERE hr_emp.department_id=d.department_id;

5 - 43
Parallel Statement Queuing

1. The statement is parsed and Oracle automatically determines the DOP.
2. If enough parallel servers are available, the statement executes immediately.
3. If not enough parallel servers are available, the statement is queued (FIFO queue).
4. When the required number of parallel servers becomes available, the first statement on the queue is dequeued and executed.

[Figure: SQL statements with DOPs such as 8, 16, 32, 64, and 128 entering and leaving the FIFO queue.]

5 - 44
Controlling Parallel Statement Queuing

• PARALLEL_SERVERS_TARGET indicates how many PX servers are available to run queries before queuing begins.

Example on an 8-CPU system:
– CPU count: 8
– Parallel Server Target: 64 (PX servers 1-64 are available to run queries before queuing)
– Parallel Max Servers: 160 (total PX servers available)
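
A sketch of watching the queue and bypassing it for a single statement. It assumes a SALES table and access to V$SQL_MONITOR, which may require the appropriate management pack license at your site.

```sql
-- Statements currently waiting in the parallel statement queue.
SELECT sql_id, status, sql_text
FROM   v$sql_monitor
WHERE  status = 'QUEUED';

-- Bypass the queue for one statement (SALES is an assumed table).
SELECT /*+ NO_STATEMENT_QUEUING */ COUNT(*)
FROM   sales;
```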

5 - 45
In-Memory Parallel Query

• New automatic parallelization capabilities enable the instancewide buffer cache to store and reuse the clusterwide data cache for a single parallel operation.
• Minimizes or even completely eliminates the physical I/O
for a parallel operation
• Parallel Data Cache enables the clusterwide usage of the
buffer cache for parallel operations, scaling out the
available memory for data caching with the number of
nodes in a cluster.
• Parallel Data Cache optimizes the physical I/O
requirements, speeding up the processing of large parallel
operations.

5 - 46
How In-Memory Parallel Execution Works
1. For a SQL statement, determine the size of the table being looked at.
2. Table is extremely small: read into the buffer cache on any node.
3. Table is a good candidate for In-Memory PX: fragments of the table are read into each node’s buffer cache, and only parallel servers on the same RAC node will access each fragment.
4. Table is extremely large: always use direct read from disk.

5 - 47
Explain Plan Example

[Figure: execution plan output with callouts marking the parallel hint and In-Memory Parallel Query.]

5 - 48
View the Degree Limit in Enterprise Manager

• PARALLEL_DEGREE_POLICY=AUTO
• PARALLEL_DEGREE_LIMIT=16
• Rather than 56 or 64, DOP is 16.
Note: There is no indication of downgrades on the statements.

5 - 49
Summary

In this lesson, you should have learned how to:
• Explain the benefit of using parallel operations
• List operations that can be parallelized and how they work
• Describe parallel execution terminology
• Control the parallel execution server pool
• Enable parallel execution
• Interpret a parallel execution plan
• Leverage 11gR2 enhancements for parallel execution

5 - 50
