
Mapping Datawarehouse Architecture

The document discusses mapping data warehouse architecture to multiprocessor architecture. There are two types of parallelism for improving performance: inter-query parallelism which handles multiple queries simultaneously, and intra-query parallelism which decomposes queries into parallelizable tasks like scans and joins. Data partitioning is key for effective parallel execution, and can be done randomly or intelligently through methods like hash, key range, schema, or user-defined partitioning.

Uploaded by

durai murugan
Copyright
© © All Rights Reserved

MAPPING THE DATA WAREHOUSE ARCHITECTURE TO MULTIPROCESSOR ARCHITECTURE

The functions of a data warehouse are based on relational database technology, which is
implemented in a parallel manner.

Parallel relational database technology offers two advantages for a data warehouse:

Linear speed-up: the ability to reduce response time proportionally by increasing the number
of processors.
Linear scale-up: the ability to provide the same performance on the same requests as the
database size increases.
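
The two properties above are commonly measured as simple ratios of elapsed times. The following is a minimal sketch of that arithmetic; the function names and the timings are hypothetical and chosen only for illustration, and the scale-up ratio assumes that processors are added in proportion to the growth in data.

```python
def speedup(time_one_processor: float, time_n_processors: float) -> float:
    """Ratio of the response time on 1 processor to the response time on n processors."""
    return time_one_processor / time_n_processors


def scaleup(time_small_system: float, time_large_system: float) -> float:
    """Ratio of the elapsed time for the original database on the original system
    to the elapsed time for the larger database on the proportionally larger system."""
    return time_small_system / time_large_system


# Hypothetical timings in seconds (assumed values, not measurements).
n_processors = 4
observed_speedup = speedup(120.0, 30.0)   # same query, 1 processor vs 4 processors
observed_scaleup = scaleup(60.0, 60.0)    # 1x data on 1 processor vs 4x data on 4 processors

# Speed-up is linear when it equals the number of processors;
# scale-up is linear when the ratio stays at 1.
is_linear_speedup = observed_speedup == n_processors
is_linear_scaleup = observed_scaleup == 1.0
```

In practice the ratios fall short of these ideal values because of start-up cost, interference between processes, and uneven distribution of work.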

Types of parallelism
There are two types of parallelism:
Inter-query parallelism: different server threads or processes handle multiple requests at
the same time.
Intra-query parallelism: the serial SQL query is decomposed into lower-level operations such
as scan, join and sort, and these lower-level operations are then executed concurrently.
Intra-query parallelism can be done in either of two ways:

Horizontal parallelism: the database is partitioned across multiple disks, and a single task
is performed concurrently on different processors against different sets of data.
Vertical parallelism: this occurs among different tasks. All query components such as scan,
join and sort are executed in parallel in a pipelined fashion. In other words, the output of one
task becomes the input of another task.
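
The two forms of intra-query parallelism can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular DBMS implements them: the table, the `customers` lookup and all function names are hypothetical. The thread pool stands in for multiple processors, and the generators only model the pipelined data flow of vertical parallelism (in a real system each stage would run on its own processor).

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table, already split into three partitions (one per disk).
partitions = [
    [{"id": 1, "amount": 50}, {"id": 2, "amount": 300}],
    [{"id": 3, "amount": 700}, {"id": 4, "amount": 20}],
    [{"id": 5, "amount": 450}],
]


def scan_partition(rows, min_amount=100):
    """One scan task, applied to a single partition only."""
    return [row for row in rows if row["amount"] >= min_amount]


def horizontal_scan(parts):
    """Horizontal parallelism: the same task runs concurrently on different data."""
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        results = list(pool.map(scan_partition, parts))
    # Merge the per-partition results into one result set.
    return [row for part in results for row in part]


def scan(rows):
    """Pipeline stage 1: emit rows one at a time."""
    for row in rows:
        yield row


def join(rows, lookup):
    """Pipeline stage 2: consume each row as it arrives and attach a customer name."""
    for row in rows:
        yield {**row, "customer": lookup[row["id"]]}


# Hypothetical lookup table used by the join stage.
customers = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}

selected = horizontal_scan(partitions)
# Vertical parallelism (modelled): the output of scan feeds join row by row.
joined = list(join(scan(selected), customers))
```

Here `selected` holds the rows that passed the scan on each partition, and `joined` is produced without waiting for the scan stage to finish the whole input first.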

DATA PARTITIONING:

Data partitioning is the key component for effective parallel execution of database
operations. Partitioning can be done randomly or intelligently.

Random partitioning includes random data striping across multiple disks on a single server.
Another option for random partitioning is round-robin partitioning, in which each record is
placed on the next disk assigned to the database.
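
Round-robin placement can be sketched as follows. This is a minimal illustration in which in-memory lists stand in for disks; the function name and the sample records are hypothetical.

```python
def round_robin_partition(records, num_disks):
    """Place each record on the next disk in turn, wrapping around at the end."""
    disks = [[] for _ in range(num_disks)]
    for position, record in enumerate(records):
        disks[position % num_disks].append(record)
    return disks


# Five hypothetical records spread over three disks.
disks = round_robin_partition(["r1", "r2", "r3", "r4", "r5"], 3)
# disks is [["r1", "r4"], ["r2", "r5"], ["r3"]]
```

The data ends up evenly spread, but the placement says nothing about a record's content, so finding one specific record still means looking on every disk.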

Intelligent partitioning assumes that the DBMS knows where a specific record is located and
does not waste time searching for it across all disks. The intelligent partitioning methods
include:

Hash partitioning: A hash algorithm is used to calculate the partition number based on the
value of the partitioning key for each row.

Key range partitioning: Rows are placed and located in the partitions according to the value
of the partitioning key. For example, all rows with key values from A to K are in partition 1,
L to T are in partition 2, and so on.

Schema partitioning: An entire table is placed on one disk, another table is placed on a
different disk, and so on. This is useful for small reference tables.

User-defined partitioning: A table is partitioned on the basis of a user-defined
expression.
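
Three of the intelligent methods above reduce to a function from a row (or its key) to a partition number. The sketch below is illustrative only. All names are hypothetical, `zlib.crc32` is an assumed stand-in for whatever hash algorithm a DBMS actually uses, and the key ranges extend the A to K / L to T example with an assumed third range, U to Z.

```python
import zlib
from bisect import bisect_left


def hash_partition(key: str, num_partitions: int) -> int:
    """Hash partitioning: a hash of the partitioning key selects the partition (0-based)."""
    # crc32 is used here only because it gives the same value on every run.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Inclusive upper bound of each key range: A-K, L-T, U-Z (the last range is assumed).
RANGE_UPPER_BOUNDS = ["K", "T", "Z"]


def key_range_partition(key: str) -> int:
    """Key range partitioning: the first letter of the key selects the partition (1-based).

    Assumes the key starts with a letter A-Z.
    """
    first_letter = key[0].upper()
    return bisect_left(RANGE_UPPER_BOUNDS, first_letter) + 1


def user_defined_partition(row: dict, expression) -> int:
    """User-defined partitioning: a caller-supplied expression selects the partition."""
    return expression(row)


def by_region(row: dict) -> int:
    """Hypothetical user-defined expression: EU rows go to partition 0, all others to 1."""
    return 0 if row["region"] == "EU" else 1


lopez_partition = key_range_partition("Lopez")                    # falls in the L-T range
order_partition = user_defined_partition({"region": "EU"}, by_region)
adams_bucket = hash_partition("Adams", 4)                         # some value in 0..3
```

Because each function is deterministic, the DBMS can recompute the partition from the key at query time and go straight to the right disk, which is the point of intelligent partitioning.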
