GROUP 5 Physical Database Design and Performance
Database design basics
• Determining data to be stored
• Logically structuring data
• ER diagram
• A design process suggestion for Microsoft Access
• Normalization
• Physical design
What is a database?
A database is an organized collection of structured information, or
data, typically stored electronically in a computer system. A
database is usually controlled by a database
management system (DBMS). Together, the data and the DBMS,
along with the applications that are associated with them, are
referred to as a database system, often shortened to just database.
Database design basics
A properly designed database provides you
with access to up-to-date, accurate
information. Because a correct design is
essential to achieving your goals in working
with a database, investing the time required
to learn the principles of good design makes
sense. In the end, you are much more likely to
end up with a database that meets your
needs and can easily accommodate change.
Determining data to be stored
In a majority of cases, a person who is doing the
design of a database is a person with expertise in the
area of database design, rather than expertise in the
domain from which the data to be stored is
drawn. Therefore, the data to be stored in the database
must be determined in cooperation with a person who
does have expertise in that domain, and who is aware
of what data must be stored within the system.
Determining data relationships
Sometimes, changing one piece of data also changes other data that is
not visible.
A good database design is, therefore, one that:
Steps in the design process:
• Determine the purpose of your data
• Find and organize the information required
• Divide the information into tables
for example:
“The customer database keeps a list of customer information for the
purpose of producing mailings and reports”
• Security
• Replication
• High-availability
• Partitioning
• Backup and restore schemes.
DENORMALIZATION AND PARTITIONING DATA
NORMALIZATION VS DENORMALIZATION
• Denormalization is the opposite of normalization
• It joins tables and allows data to be repeated
• It is the process of adding redundant columns to the
database in order to improve performance
PURPOSE OF DENORMALIZATION
To improve the performance of the database infrastructure by:
• Reducing the number of tables
• Reducing the number of joins required during query execution
• Reducing the number of rows to be retrieved from the primary data
table
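As a minimal sketch of this trade-off, the code below answers the same question twice in an in-memory SQLite database: once from a normalized schema, which needs a join, and once from a denormalized table that repeats the customer name on every order row. The `customer`, `orders`, and `orders_denorm` tables and their columns are hypothetical and do not come from the slides.

```python
import sqlite3

def build_demo_db():
    """Create hypothetical normalized and denormalized tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Normalized: the customer name is stored once.
        CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                             cust_id INTEGER REFERENCES customer(cust_id),
                             amount REAL);
        -- Denormalized: cust_name is a redundant column repeated per order.
        CREATE TABLE orders_denorm (order_id INTEGER PRIMARY KEY,
                                    cust_id INTEGER,
                                    cust_name TEXT,
                                    amount REAL);
        INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben');
        INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 40.0);
        INSERT INTO orders_denorm
            SELECT o.order_id, o.cust_id, c.cust_name, o.amount
            FROM orders o JOIN customer c ON c.cust_id = o.cust_id;
    """)
    return conn

def totals_normalized(conn):
    # Needs a join to reach the customer name.
    return conn.execute("""
        SELECT c.cust_name, SUM(o.amount)
        FROM orders o JOIN customer c ON c.cust_id = o.cust_id
        GROUP BY c.cust_name ORDER BY c.cust_name
    """).fetchall()

def totals_denormalized(conn):
    # No join: the name is already present on each order row.
    return conn.execute("""
        SELECT cust_name, SUM(amount)
        FROM orders_denorm
        GROUP BY cust_name ORDER BY cust_name
    """).fetchall()
```

Both queries return the same totals. The price of the join-free version is that renaming a customer now means updating every matching row in `orders_denorm`.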
WHAT IS DATA PARTITIONING?
DATA PARTITIONING
1 HORIZONTAL PARTITIONING
2 VERTICAL PARTITIONING
HORIZONTAL PARTITIONING (often called sharding)
VERTICAL PARTITIONING
• Divides the table vertically (by columns), which means that each new
table has a different structure from the main table: it holds only a
subset of the original columns.
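The two kinds of partitioning can be sketched on plain Python rows. Horizontal partitioning splits a table by rows and keeps the same columns in every partition; vertical partitioning splits by columns. The `customers` data, the function names, and the choice to repeat the primary key in every vertical partition are illustrative assumptions, not something the slides specify.

```python
def partition_horizontally(rows, key, boundary):
    """Split rows into two partitions by the value of one column."""
    low = [r for r in rows if r[key] < boundary]
    high = [r for r in rows if r[key] >= boundary]
    return low, high

def partition_vertically(rows, pk, column_groups):
    """Split columns into groups; each new table repeats the primary key."""
    return [
        [{pk: r[pk], **{c: r[c] for c in cols}} for r in rows]
        for cols in column_groups
    ]

# Hypothetical sample table.
customers = [
    {"id": 1, "name": "Ana", "city": "Lima", "card_no": "1111"},
    {"id": 2, "name": "Ben", "city": "Oslo", "card_no": "2222"},
    {"id": 3, "name": "Cy", "city": "Rome", "card_no": "3333"},
]

# Rows with id < 3 go to one partition, the rest to another.
low_ids, high_ids = partition_horizontally(customers, "id", 3)

# Everyday columns in one table, the sensitive column in another.
public, sensitive = partition_vertically(
    customers, "id", [["name", "city"], ["card_no"]]
)
```

Repeating the key in each vertical partition is what lets the original row be rebuilt with a join when all columns are needed.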
ADVANTAGES AND DISADVANTAGES
ADVANTAGES OF PARTITIONING
• Organized
• Efficiency: Records used together are grouped
together.
• Each partition can be optimized for performance.
• Security and Recovery
DISADVANTAGES OF PARTITIONING
IMPROVE SECURITY
• In some cases, you can separate sensitive and nonsensitive data into different partitions
and apply different security controls to the sensitive data.
IMPROVE AVAILABILITY
• Separating data across multiple servers avoids a single point of failure. If one
instance fails, only the data in that partition is unavailable. Operations on other
partitions can continue.
DESIGNING PHYSICAL DATABASE FILE
What is physical design in a database?
Physical design is the process of producing a description of the
implementation of the database on secondary storage; it describes the
base relations, file organizations, and indexes used to achieve
efficient access to the data, and any associated integrity constraints
and security measures.
The conceptual and logical designs were independent of physical
considerations. Now we not only know that we want a relational model;
we have also selected a database management system (DBMS), such as
Oracle, and we focus on those physical considerations.
• Logical Database Design is concerned with what to store.
• Physical Database Design is concerned with how to store.
Underlying Concepts
Because physical design is related to how data are physically stored, we need
to consider a few underlying concepts about physical storage. One goal of
physical design is optimal performance and storage space utilization.
Physical design includes data structures and file organization, keeping in
mind that the database software will communicate with your computer’s
operating system. Typical concerns include:
• Storage allocations for data and indexes
• Record descriptions and stored sizes of the actual data
• Record placement
• Data compression, encryption
PHYSICAL DESIGN IN DBMS (Naming)
Note that the table design specified above applies to MS Access.
Access is very “forgiving” when it comes to things like names. You
can have spaces, and you can pretty much do what you want.
Other DBMSs, such as Oracle, are stricter. Most DBMSs, in fact,
have naming restrictions similar to those in Oracle. Attribute
names must begin with a letter, and you cannot have spaces
(although you CAN have underscores). You can’t use “reserved
words” such as the names of functions (max) or datatypes (date).
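The naming rules above can be sketched as a small check. This is only an illustration: `is_valid_attribute_name` is a hypothetical helper, the reserved-word set holds just the two examples mentioned in the text, and allowing digits after the first letter is an assumption. A real DBMS documents its own full rules and reserved-word list.

```python
import re

# Illustrative subset only: the two examples named in the text.
RESERVED_WORDS = {"max", "date"}

def is_valid_attribute_name(name):
    """Check a name against the restrictions listed above."""
    # Must begin with a letter; letters, digits (assumed) and
    # underscores may follow. Spaces are rejected by the pattern.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        return False
    # Reserved words are compared case-insensitively.
    return name.lower() not in RESERVED_WORDS
```

For example, `patient_id` passes, while `patient id`, `1st_visit`, and `Date` are rejected.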
PHYSICAL DESIGN IN DBMS SPECIFIC (data types)
Oracle example: when you create the table, you can list the check constraint in the
CREATE TABLE statement:
CREATE TABLE vitals (…[list of attributes and data types],
SBP NUMBER(3) CHECK (SBP BETWEEN 0 AND 350), …
In Access you would create the SBP column in table design view, then in the properties for
this column, create the validation rule:
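Some DBMSs provide more facilities than others for defining
enterprise constraints. An example of a complex constraint that
standard SQL permits but that would not work in Access (many DBMSs,
Oracle included, do not accept a subquery inside a CHECK constraint
and need a trigger instead):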
CONSTRAINT StaffNotHandlingTooMuch
CHECK (NOT EXISTS (SELECT staffNo
FROM PropertyForRent
GROUP BY staffNo
HAVING COUNT(*) > 100))
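Both kinds of rule can be tried out in SQLite, used here as a stand-in for Oracle or Access. This is a sketch under stated assumptions: the `patient_id` and `propertyNo` columns are hypothetical, the limit is a parameter so it can be tested with small numbers, and the staff rule is written as a trigger because SQLite does not accept subqueries inside CHECK constraints.

```python
import sqlite3

def make_db(max_properties=100):
    """Build an in-memory database with a CHECK rule and a trigger rule."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(f"""
        -- Simple range rule as a column-level CHECK constraint.
        CREATE TABLE vitals (
            patient_id INTEGER,              -- hypothetical column
            SBP INTEGER CHECK (SBP BETWEEN 0 AND 350)
        );
        CREATE TABLE PropertyForRent (
            propertyNo TEXT PRIMARY KEY,     -- hypothetical column
            staffNo TEXT
        );
        -- Trigger version of StaffNotHandlingTooMuch. It covers INSERT
        -- only; an UPDATE of staffNo would need a second trigger.
        CREATE TRIGGER StaffNotHandlingTooMuch
        BEFORE INSERT ON PropertyForRent
        WHEN (SELECT COUNT(*) FROM PropertyForRent
              WHERE staffNo = NEW.staffNo) >= {int(max_properties)}
        BEGIN
            SELECT RAISE(ABORT, 'staff member handles too many properties');
        END;
    """)
    return conn

def try_execute(conn, sql, params=()):
    """Return True if the statement succeeds, False if a rule blocks it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.DatabaseError:
        return False
```

With `make_db(max_properties=2)`, a third property for the same staff member is rejected, and so is an SBP value of 400.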
*radio buttons can only store NUMERIC data. Each choice is given a
number code, and the user is allowed to choose only ONE of the choices
– like the channel selection buttons on a car radio.
Derived data
• Multilevel Indexing
Clustered Indexing
• A clustering index is defined on an ordered data file. The data file
is ordered on a non-key field. In some cases, the index is created on
non-primary key columns, which may not be unique for each record. In
such cases, to identify the records faster, we group two or more
columns together to get unique values and create an index out of
them. This method is known as the clustering index. Basically,
records with similar characteristics are grouped together and indexes
are created for these groups.
Non-clustered or Secondary Indexing
• A non-clustered index just tells us where the
data lies, i.e. it gives us a list of virtual pointers
or references to the location where the data is
actually stored.
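The idea of an index as a list of pointers can be sketched with a dictionary that maps each value to the positions of the matching records. List positions stand in for disk locations here, and the `staff` records and function names are hypothetical.

```python
from collections import defaultdict

def build_secondary_index(records, field):
    """Map each field value to the positions of the matching records."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return dict(index)

def lookup(records, index, value):
    # Follow the stored positions instead of scanning every record.
    return [records[pos] for pos in index.get(value, [])]

# Hypothetical data file, stored in no particular order by branch.
staff = [
    {"staff_no": "S1", "branch": "B003"},
    {"staff_no": "S2", "branch": "B005"},
    {"staff_no": "S3", "branch": "B003"},
]
branch_index = build_secondary_index(staff, "branch")
```

The data itself is not reordered; only the index knows where each branch's records sit.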
Multilevel Indexing
• Multilevel indexing splits the main index into several smaller
blocks so that each one can be stored in a single block. The outer
blocks are divided into inner blocks, which in turn point to the data
blocks. The outer index can then be kept in main memory with little
overhead.
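A two-level version of this idea can be sketched as follows. The sorted keys are cut into fixed-size inner blocks, and a small outer index keeps only the first key of each block. The function names and the block size are illustrative assumptions; a real DBMS works with disk blocks and usually more than two levels.

```python
from bisect import bisect_right

def build_two_level_index(sorted_keys, block_size):
    """Split sorted keys into inner blocks; the outer index keeps each block's first key."""
    inner = [sorted_keys[i:i + block_size]
             for i in range(0, len(sorted_keys), block_size)]
    outer = [block[0] for block in inner]
    return outer, inner

def find(outer, inner, key):
    """Return (block_number, offset) of key, or None if it is absent."""
    # Step 1: search the small outer index for the candidate block.
    b = bisect_right(outer, key) - 1
    if b < 0:
        return None
    # Step 2: search only inside that one inner block.
    block = inner[b]
    o = bisect_right(block, key) - 1
    if o >= 0 and block[o] == key:
        return b, o
    return None
```

A lookup touches the outer index and a single inner block, never the whole key list.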
Designing Database for Optimal Query Performance
• Parallel query processing: possible when
working in a multiprocessor system.
• Overriding automatic query optimization:
allows query writers to preempt the
automated optimization.
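One concrete form of overriding the optimizer can be seen in SQLite, used here only as an example DBMS: the `NOT INDEXED` clause tells it to ignore available indexes for a table. The sketch below compares the plan chosen automatically with the overridden one. The `staff` table and `idx_staff_branch` index are hypothetical, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

def query_plans():
    """Return the automatic plan and the plan with index use overridden."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, branch TEXT);
        CREATE INDEX idx_staff_branch ON staff(branch);
    """)

    def plan(sql):
        rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
        # The last column of each row holds the readable plan step.
        return " | ".join(row[-1] for row in rows)

    # Left to itself, the optimizer can use the index on branch.
    automatic = plan("SELECT * FROM staff WHERE branch = 'B003'")
    # NOT INDEXED preempts that choice and forces a table scan.
    overridden = plan("SELECT * FROM staff NOT INDEXED WHERE branch = 'B003'")
    return automatic, overridden
```

Overrides like this are best kept for cases where measurement shows the automatic plan is worse, since they stay in force even after the data changes.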