0% found this document useful (0 votes)
37 views

Presentation14 Physical Database Design

The document discusses physical database design which involves translating the logical data model into specifications for storing and retrieving data efficiently on disk. It describes the key steps as selecting appropriate file organizations (e.g. heap, ordered, hash), access methods, and indexes to physically implement the logical structure in a way that provides good performance. Primary considerations for file organization include fast data access, efficient storage utilization, and accommodating growth.

Uploaded by

Nyok Biliu Kuol
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
37 views

Presentation14 Physical Database Design

The document discusses physical database design which involves translating the logical data model into specifications for storing and retrieving data efficiently on disk. It describes the key steps as selecting appropriate file organizations (e.g. heap, ordered, hash), access methods, and indexes to physically implement the logical structure in a way that provides good performance. Primary considerations for file organization include fast data access, efficient storage utilization, and accommodating growth.

Uploaded by

Nyok Biliu Kuol
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21

Physical Database Design

• Purpose
– Translate the logical description of data into
the technical specifications for storing and
retrieving data
– How the logical structure is to be physically
implemented in the target DBMS
– Create a design for storing data that will
provide adequate performance

1
Physical Database Design
• The process of producing a description of
the implementation of the database on
secondary storage; it describes the base
relations, file Organizations, and indexes
used to achieve efficient access to the
data
Physical Database Design
• The steps of the physical database design
methodology are as follows:
– Step 3 Translate logical data model for
target DBMS
– Step 4 Design file organizations and indexes
Basic Concepts
• The database on secondary storage is organized
into one or more files, where each file consists of
one or more records and each record consists of
one or more fields
• Consider the relation given on the next slide,
when a user requests a tuple from the DBMS, for
example Staff tuple SG37, the DBMS maps this
logical record on to a physical record and
retrieves the physical record into the DBMS
buffers in primary storage using the operating
4
system file access routines
Figure: Reduced Staff relation
from DreamHome case study

5
Basic Concepts
• The physical record is the unit of transfer
between disk and primary storage, and vice versa
• Generally, a physical record consists of more
than one logical records, although depending on
size, a logical record can correspond to one
physical record
• It is even possible for a large logical record to
span more than one physical record

6
Basic Concepts
• The terms block and page are sometimes used in
place of physical record
• For example, the Staff tuples may be stored on
two pages as shown in Figure on next slide
• The order in which records are stored and
accessed in the file is dependent on the file
organization

7
Figure: Storage of Staff relation in pages

8
File Organization
• Technique for physically arranging records of a
file on secondary storage
• Factors for selecting file organization:
– Fast data retrieval and throughput
– Efficient storage space utilization
– Protection from failure and data loss
– Minimizing need for reorganization
– Accommodating growth
– Security from unauthorized use

9
File Organization
• The physical arrangement of data in a file into
pages on secondary storage
• The main types of file organization are:
– Heap (unordered) files Records are placed on disk
in no particular order
– Sequential (ordered) files Records are ordered by
the value of a specified field
– Hash files Records are placed on disk according to
a hash function
Access Method
• Along with a file organization, there is a set of
access methods
• Access method: The steps involved in storing
and retrieving records from a file.
• Since some access methods can be applied
only to certain file organizations (for example,
we cannot apply an indexed access method to
a file without an index), the terms file
organization and access method are used
interchangeably
Unordered (Heap)Files
• Simplest organization
• Records placed in same order inserted
• Linear search for retrieval
• Insertion efficient; retrieval not efficient
• Deletion process
– Relevant page identified
– Record marked as deleted
– Page rewritten to disk
– Deleted record space not reused → inefficient storage
• Best suited for bulk loading data; there is no overhead of
calculating what page the record should go on
• Suitable when all of the records are to be retrieved in a typical
scan
Ordered Files
• Ordered files
– Aka sequential files
– Sorted on field – ordering field
– If ordering field = key → ordering key
– Binary search for retrieval
– Insertion and deletion problematic
• Need to maintain order of records
– Rarely used for database storage
– Best if records are retrieved in some order
or a ‘range’ of records are retrieved
Ordered Files
• If the tuples are ordered on staffNo, under
certain conditions we can use a binary search
to execute queries that involve a search
condition based on staffNo
• For example, consider the following SQL query:
SELECT *
FROM Staff
WHERE staffNo = ‘SG37’;
Binary search on an ordered file

Assuming there is one record


per page

15
Hash Files
• Hash files
– Good for equality selections
– Aka random/direct files
– Hash function is used to determine the page address for storing
record
• Each address corresponds to a page/bucket
• Each bucket has slots for multiple records
• Hash function is chosen to provide most even distribution of
records – min. collisions
• Examples:
– Division-remainder – uses mod function to determine field
value
– Folding – applying arithmetic function to hash field e.g. + 7
• Collision
– Hash function calculates same address for 2 or more records
Hash Files
Indexes
• Data structure that allows DBMS to locate
particular records in file more quickly
• Similar to index in book
• Main types of indexes:
– Primary index
• Index a key field
– Clustering index
• File sequentially ordered on non-key field
i.e. more than one records can correspond
with index
Step 3 Translate logical data model for
target DBMS
• Extend the design of base relations (defined in
logical data model) by identifying data types,
lengths, and constraints that the target DBMS
supports
Step 4 Design file organizations and indexes

• The choice of file organizations should be fully


documented, along with the reasons for the
choice
• The choice of indexes should be fully
documented along with the reasons for the
choice

21

You might also like