5 RelationalDesignTheory

The document discusses relational database design and desirable properties for database normalization. It covers topics like functional dependencies, lossless join decomposition, and dependency preservation. The goal of normalization is to generate relation schemes without unnecessary redundancy while allowing easy information retrieval.

Uploaded by

adamshkolnik5

Relational Database Design

COMP 353/453
Database Programming
Prof. Silva

These slides were prepared by Prof. Silva, adapting the slides from Fundamentals of Database Systems (Elmasri and Navathe) and Understanding Relational Database Query Languages (Dietrich).
Relational Database Design

Goal: Generate a set of relation schemes that allow us to store information
without unnecessary redundancy, yet allow us to retrieve information easily

Undesirable Properties
• Repetition of information
• Inability to represent certain information
• Loss of information

Desirable Properties
• No Repetition of Information (Normal Forms)
• Dependency Preservation
• Lossless-join Decomposition

Functional Dependencies - Definition

Recall superkey definition:

Let R be a relation scheme. A subset K of R is a superkey of R if, in any legal
relation r(R), for all pairs of tuples t1 and t2 in r s.t. t1 ≠ t2,
t1[K] ≠ t2[K]
"No two tuples in any legal relation r(R) may have the same value on
attribute set K".

Let V ⊆ R and W ⊆ R. The Functional Dependency (FD) V → W holds on R if, in
any legal relation r(R), for all pairs of tuples t1 and t2 in r s.t. t1[V] = t2[V], it is
also the case that t1[W] = t2[W].

An FD X → Y is trivial if Y ⊆ X.
Note: if K is a superkey of R, then K → R.
Functional Dependencies - Examples

1. suppliers(SNAME, SADDR, ITEM, PRICE)

R = {SNAME, SADDR, ITEM, PRICE}
F = {f1: SNAME → SADDR,
     f2: SNAME,ITEM → PRICE}

2. addresses(NAME, CITY, ZIP)

F = {f1: NAME → ZIP,
     f2: ZIP → CITY}
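An FD is a statement about every legal relation, so sample data can refute it but never prove it. With that caveat, a minimal sketch of checking whether an FD holds on one instance might look as follows. The function name `fd_holds` and the sample rows are hypothetical and not taken from the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this relation instance.

    rows: list of dicts mapping attribute name -> value (one dict per tuple).
    lhs, rhs: lists of attribute names.
    """
    seen = {}  # value of the lhs attributes -> value of the rhs attributes
    for t in rows:
        key = tuple(t[a] for a in lhs)
        val = tuple(t[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample data for suppliers(SNAME, SADDR, ITEM, PRICE).
suppliers = [
    {"SNAME": "Acme", "SADDR": "12 Elm St", "ITEM": "bolt", "PRICE": 5},
    {"SNAME": "Acme", "SADDR": "12 Elm St", "ITEM": "nut", "PRICE": 3},
    {"SNAME": "Best", "SADDR": "9 Oak Ave", "ITEM": "bolt", "PRICE": 6},
]

print(fd_holds(suppliers, ["SNAME"], ["SADDR"]))          # True
print(fd_holds(suppliers, ["SNAME", "ITEM"], ["PRICE"]))  # True
print(fd_holds(suppliers, ["ITEM"], ["PRICE"]))           # False
```

The last call fails because the two "bolt" rows carry different prices, so ITEM alone does not determine PRICE in this sample.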

More on Functional Dependencies

• An FD is a property of the meaning or semantics of the attributes in a
  relation schema
• Legal relation instances r of R satisfy the functional dependency
  constraints

FDs are used to
• provide a set of constraints on the attributes of a relation schema that
  must hold at all times
• provide additional information to be used in the design process

Functional Dependencies: Company Training Courses
Enterprise

Most of the FDs for an enterprise are already included in the conceptual
model. Using the ER diagram and the semantics of the enterprise, give
the FDs for the company training courses example.

[ER diagram: entities emp, dept, and course; relationships mgr (1:1), works (N:1), and takes (M:N); attributes ID, NAME, SAL, DNUM, DNAME, DATE, CRSID, CNAME, LENGTH, INST]
Theory of Functional Dependencies

Example: F = {X → Y, Y → Z}

By definition of FDs, X → Z is "logically implied" by F.
Let F be a set of FDs. Let F+ denote the closure of F, which is the set of all FDs logically
implied by F.

Rules of Inference for FDs (FD rules)

1. Reflexivity          if Y ⊆ X, then X → Y
2. Augmentation         if X → Y, then WX → WY
3. Transitivity         if X → Y & Y → Z, then X → Z
4. Union                if X → Y & X → Z, then X → YZ
5. Decomposition        if X → YZ, then X → Y & X → Z
6. Pseudotransitivity   if X → Y & WY → Z, then XW → Z

• Rules 1, 2 & 3 are Armstrong's axioms (sound and complete)
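Because Armstrong's axioms are complete, rules 4 to 6 can be derived from rules 1 to 3. As an illustration (this derivation is not on the original slide), the Union rule follows from Augmentation and Transitivity:

```latex
\begin{align*}
&X \to Y   && \text{given} \\
&X \to XY  && \text{Augmentation of } X \to Y \text{ with } X \\
&X \to Z   && \text{given} \\
&XY \to YZ && \text{Augmentation of } X \to Z \text{ with } Y \\
&X \to YZ  && \text{Transitivity of } X \to XY \text{ and } XY \to YZ
\end{align*}
```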


Example Proof of Inference

Let F = {f1: Z → A, f2: B → X, f3: AX → Y}

Prove that ZB → Y is in F+.

ZB → AB    Augmentation(f1, B)
AB → AX    Augmentation(f2, A)
ZB → AX    Transitivity(ZB → AB, AB → AX)
ZB → Y     Transitivity(ZB → AX, f3)

What if we just wanted to know whether an FD is in F+?

Computing F+ is costly. Rather than computing F+, we can find the set of attributes
functionally determined by the left-hand side of the FD and check whether it
contains the right-hand side.
Attribute Closure

Let X be a set of attributes.

Let X+ be the closure of X under F, i.e., the set of all attributes functionally
determined by X under a set F of FDs.

Algorithm: compute X+
Input:  F - set of FDs
        X - set of attributes
Output: X+ - set of attributes
Result := X;                                          /* reflexivity */
while (changes to Result) do
    for each FD Y → Z in F do
        if Y ⊆ Result then Result := Result ∪ Z;      /* transitivity */
X+ := Result;
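The algorithm above translates almost line for line into Python. The sketch below assumes each FD is represented as a pair of attribute sets (a representation chosen here, not given in the slides) and adds a hypothetical helper `implies` that answers the membership question "is X → Y in F+?" without computing F+.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds.

    fds: list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)          # reflexivity: X is contained in X+
    changed = True
    while changed:               # repeat until a full pass adds nothing
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs    # lhs is determined, so rhs is too
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in F+, checked via attribute closure."""
    return set(rhs) <= closure(lhs, fds)


# F = {Z -> A, B -> X, AX -> Y}
F = [({"Z"}, {"A"}), ({"B"}, {"X"}), ({"A", "X"}, {"Y"})]
print(sorted(closure({"Z", "B"}, F)))   # ['A', 'B', 'X', 'Y', 'Z']
print(implies(F, {"Z", "B"}, {"Y"}))    # True
print(implies(F, {"Z"}, {"Y"}))         # False
```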

Example: Attribute Closure

F = {f1: Z → A,
     f2: B → X,
     f3: AX → Y}

Is ZB → Y in F+?
Equivalently, is Y in (ZB)+?

Result: ZB       Given
        ZBA      f1
        ZBAX     f2
        ZBAXY    f3

Yes!
Decomposition

Let U be a relation scheme.

A set of relation schemes {R1, R2, ..., Rn} is a decomposition of U if
R1 ∪ R2 ∪ ... ∪ Rn = U

Let u be a relation on scheme U.

Let ri = π_Ri(u), for 1 ≤ i ≤ n. Then
u ⊆ r1 ⋈ r2 ⋈ ... ⋈ rn
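Joining the projections never loses an original tuple, but it can create extra (spurious) ones. The sketch below illustrates this on a hypothetical instance of addresses(NAME, CITY, ZIP), split into {NAME, CITY} and {CITY, ZIP}. The helper names and the sample rows are assumptions made for this example.

```python
def relation(rows):
    """Build a relation (a set of tuples) from a list of dicts."""
    return {frozenset(r.items()) for r in rows}


def project(rel, attrs):
    """Projection onto attrs; duplicates vanish because a relation is a set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}


def natural_join(r, s):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r:
        d1 = dict(t1)
        for t2 in s:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out


# Hypothetical instance; it satisfies NAME -> ZIP and ZIP -> CITY.
u = relation([
    {"NAME": "Ann", "CITY": "Chicago", "ZIP": "60601"},
    {"NAME": "Bob", "CITY": "Chicago", "ZIP": "60602"},
])

r1 = project(u, {"NAME", "CITY"})
r2 = project(u, {"CITY", "ZIP"})
joined = natural_join(r1, r2)

print(u <= joined)   # True: every original tuple survives
print(len(joined))   # 4: two spurious tuples appear
```

The spurious tuples pair Ann with Bob's ZIP code and vice versa, which is the "loss of information" a poor decomposition causes.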
Desirable Properties: Lossless-join Decomposition

Let C be a set of constraints on the DB. A decomposition {R1, R2, ..., Rn} of a
relation scheme U is a lossless-join decomposition if for all relations u on
scheme U that are legal under C
u = π_R1(u) ⋈ π_R2(u) ⋈ ... ⋈ π_Rn(u)

Let R be a relation scheme, F be a set of FDs on R, and let R1 & R2 form a
decomposition of R. This decomposition is a lossless-join decomposition of
R if at least one of the following holds:
• (R1 ∩ R2) → R1 in F+
• (R1 ∩ R2) → R2 in F+
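The two-relation condition can be tested with one attribute closure: compute (R1 ∩ R2)+ and see whether it covers R1 or R2. The following is a minimal sketch under the assumption that FDs are stored as pairs of attribute sets; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 is in F+."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


suppliers_fds = [({"SNAME"}, {"SADDR"}), ({"SNAME", "ITEM"}, {"PRICE"})]
addresses_fds = [({"NAME"}, {"ZIP"}), ({"ZIP"}, {"CITY"})]

print(lossless_binary({"SNAME", "SADDR"}, {"SNAME", "ITEM", "PRICE"},
                      suppliers_fds))                                      # True
print(lossless_binary({"NAME", "CITY"}, {"CITY", "ZIP"}, addresses_fds))   # False
print(lossless_binary({"NAME", "ZIP"}, {"CITY", "ZIP"}, addresses_fds))    # True
```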

Lossless-join Decomposition – Supplier Example

1. suppliers(SNAME, SADDR, ITEM, PRICE)

R = {SNAME, SADDR, ITEM, PRICE}
F = {f1: SNAME → SADDR, f2: SNAME,ITEM → PRICE}

Consider the decomposition {R1, R2} where

R1 = {SNAME, SADDR}
R2 = {SNAME, ITEM, PRICE}
R1 ∩ R2 = {SNAME}
SNAME+ = {SNAME, SADDR} = R1

∴ lossless-join decomposition
r1 = suppaddr(SNAME, SADDR)
r2 = suppitem(SNAME, ITEM, PRICE)
Lossless-join Decomposition - Addresses Example
2. addresses(NAME, CITY, ZIP)   F = {f1, f2}
Assumption: a ZIP code covers only 1 city

(a) f1: NAME → ZIP
    f2: ZIP → CITY

    Consider the decomposition
    R1 = {NAME, CITY}
    R2 = {CITY, ZIP}
    R1 ∩ R2 = {CITY}
    (CITY)+ = {CITY}
    ∴ NOT a lossless-join decomposition

(b) f1: NAME → ZIP
    f2: ZIP → CITY

    Consider the decomposition
    R1 = {NAME, ZIP}
    R2 = {CITY, ZIP}
    R1 ∩ R2 = {ZIP}
    (ZIP)+ = {ZIP, CITY} = R2
    ∴ lossless-join decomposition

There is a generic algorithm to test for lossless-join decomposition
into any number of relations.
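The slide does not say which generic algorithm is meant. One standard choice is the matrix-based chase (tableau) test found in database textbooks, sketched below as an assumption rather than as the course's own method. Each schema gets a row; a cell holds the symbol "a" where the schema contains the attribute and a row-specific symbol otherwise. FDs are then applied to equate symbols until nothing changes, and the decomposition is lossless if some row becomes all "a". The three-schema example uses hypothetical attributes A, B, C, D.

```python
def chase_lossless(schemas, fds):
    """Chase test: True if the decomposition into schemas is lossless under fds."""
    attrs = sorted(set().union(*schemas))
    # One row per schema: "a" where the schema has the attribute,
    # otherwise a symbol unique to that row.
    rows = [{A: "a" if A in Ri else f"b{i}" for A in attrs}
            for i, Ri in enumerate(schemas)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r in rows:
                for s in rows:
                    # Only pairs of distinct rows that agree on lhs matter.
                    if r is s or any(r[A] != s[A] for A in lhs):
                        continue
                    for A in rhs:
                        if r[A] != s[A]:
                            # Equate the two symbols, preferring "a".
                            keep = "a" if "a" in (r[A], s[A]) else min(r[A], s[A])
                            drop = s[A] if keep == r[A] else r[A]
                            for t in rows:   # rename throughout column A
                                if t[A] == drop:
                                    t[A] = keep
                            changed = True
    return any(all(r[A] == "a" for A in attrs) for r in rows)


addresses_fds = [({"NAME"}, {"ZIP"}), ({"ZIP"}, {"CITY"})]
print(chase_lossless([{"NAME", "CITY"}, {"CITY", "ZIP"}], addresses_fds))  # False
print(chase_lossless([{"NAME", "ZIP"}, {"CITY", "ZIP"}], addresses_fds))   # True

# Hypothetical three-way decomposition of {A, B, C, D}.
chain_fds = [({"B"}, {"C"}), ({"C"}, {"D"})]
print(chase_lossless([{"A", "B"}, {"B", "C"}, {"C", "D"}], chain_fds))     # True
```

For two schemas this agrees with the closure-based test on (R1 ∩ R2)+, but it also handles three or more schemas.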
Intuition

• If a table contains a key for the universal relation scheme (all
  attributes), then the relational decomposition will have the
  lossless-join property.
• Finding the key:
  • union together the attributes on the left-hand side of all FDs
  • remove attributes that are logically implied by the remaining
    attributes
• Examples:
  • suppliers: SNAME ITEM
  • addresses: NAME
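The key-finding heuristic above can be sketched in a few lines. One extra step is added here as an assumption that the slide does not state: an attribute that appears in no FD cannot be derived from anything, so it is added to the starting set. The function name `find_key` is hypothetical, and the result is one candidate key, not necessarily all of them.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_key(all_attrs, fds):
    """Return one candidate key of the scheme all_attrs under fds."""
    # Step 1: union together the left-hand sides of all FDs.
    key = set().union(*(lhs for lhs, _ in fds))
    # Assumption (not on the slide): attributes that no FD can derive
    # must be part of every key, so add them to the starting set.
    key |= all_attrs - closure(key, fds)
    # Step 2: drop each attribute that the remaining ones already imply.
    for a in sorted(key):
        rest = key - {a}
        if closure(rest, fds) >= all_attrs:
            key = rest
    return key


suppliers_fds = [({"SNAME"}, {"SADDR"}), ({"SNAME", "ITEM"}, {"PRICE"})]
addresses_fds = [({"NAME"}, {"ZIP"}), ({"ZIP"}, {"CITY"})]

print(sorted(find_key({"SNAME", "SADDR", "ITEM", "PRICE"}, suppliers_fds)))
# ['ITEM', 'SNAME']
print(sorted(find_key({"NAME", "CITY", "ZIP"}, addresses_fds)))
# ['NAME']
```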

Desirable Properties: Dependency
Preservation
"Update validation without computation of joins"

Let F be a set of FDs on a scheme R, and R1, ..., Rn be a decomposition of R.

The restriction of F to Ri is the set Fi of all FDs in F+ that include only attrs
of Ri. The set of restrictions F1, F2, ..., Fn is the set of dependencies that can
be checked efficiently. Let
F' = F1 ∪ F2 ∪ ... ∪ Fn

In general, F' ≠ F.
A dependency preserving decomposition has the property that (F')+ = F+,
i.e., every dependency in F is logically implied by F'.
Testing Dependency Preservation

compute F+
for each scheme Ri in the decomposition D do
    Fi := the restriction of F+ to Ri;
F' := F1 ∪ F2 ∪ ... ∪ Fn
compute (F')+
return ((F')+ = F+)

... It is not always necessary to use this algorithm: we can check that F' = F or
that each FD in F - F' is logically implied by F'.
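Computing F+ explicitly is exponential in the worst case. A well-known alternative, used here instead of the algorithm above, tests each FD X → Y of F separately: grow a set Z from X by repeatedly adding (Z ∩ Ri)+ ∩ Ri for every schema Ri, and accept if Y ends up inside Z. The sketch below assumes FDs are pairs of attribute sets; the function name `preserves` is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves(schemas, fds):
    """True if the decomposition into schemas preserves every FD in fds."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in schemas:
                # What z determines using only dependencies checkable inside ri.
                gained = closure(z & ri, fds) & ri
                if not gained <= z:
                    z |= gained
                    changed = True
        if not rhs <= z:
            return False   # this FD cannot be enforced without a join
    return True


suppliers_fds = [({"SNAME"}, {"SADDR"}), ({"SNAME", "ITEM"}, {"PRICE"})]
addresses_fds = [({"NAME"}, {"ZIP"}), ({"ZIP"}, {"CITY"})]

print(preserves([{"SNAME", "SADDR"}, {"SNAME", "ITEM", "PRICE"}], suppliers_fds))  # True
print(preserves([{"NAME", "CITY"}, {"ZIP", "CITY"}], addresses_fds))               # False
print(preserves([{"NAME", "ZIP"}, {"CITY", "ZIP"}], addresses_fds))                # True
```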

Dependency Preservation - Suppliers
Example
1. suppliers(SNAME, SADDR, ITEM, PRICE)

R = {SNAME, SADDR, ITEM, PRICE}
F = {f1: SNAME → SADDR, f2: SNAME,ITEM → PRICE}

R1 = {SNAME, SADDR}          F1 = {f1}
R2 = {SNAME, ITEM, PRICE}    F2 = {f2}

F' = F,
∴ dependency preserving
Dependency Preservation - Addresses
Example
2. addresses(NAME, CITY, ZIP)
Assumption: a ZIP code covers only 1 city

(a) F = {f1: NAME → ZIP, f2: ZIP → CITY}
    R1 = {NAME, CITY}   F1 = ∅ (no FD of F lies within R1)
    R2 = {ZIP, CITY}    F2 = {f2}
    F' = {f2}           F - F' = {f1}
    Is f1 logically implied by F'?
    No, ∴ NOT dependency preserving
    (Strictly, the restriction of F+ to R1 also contains NAME → CITY;
    adding it to F' still does not imply f1.)
(b) F = {f1: NAME → ZIP, f2: ZIP → CITY}
    R1 = {NAME, ZIP}    F1 = {f1}
    R2 = {CITY, ZIP}    F2 = {f2}
    F' = F, ∴ dependency preserving
Note:
1. A decomposition may have a lossless-join w.r.t. F yet not preserve F.
2. A decomposition could preserve F yet not have a lossless-join.
Desirable Properties: No Repetition of
Information
Normal forms represent degrees of redundancy ...

UNNORMALIZED

PART   AVAILABLE
       WH   QTY   ADDR
P1     W1   30    A1
       W2   20    A2
P2     W1   15    A1
       W2   10    A2

1NF

PART   WH   QTY   ADDR
P1     W1   30    A1
P1     W2   20    A2
P2     W1   15    A1
P2     W2   10    A2

1NF: A relation scheme R is in 1NF if the domains of all attributes of R are
atomic. (Basic assumption of relational databases)
The 1NF design still exhibits redundancy of data.
Normal Forms: 2NF
A relation scheme R is in 2NF if each attr A in R either
• appears in a candidate key, or
• is not dependent on any proper subset of any candidate key
  (is dependent on the whole of every candidate key)

Example: R = {PART, WH, QTY, ADDR}

F = {PART,WH → QTY, WH → ADDR}
∴ R is not in 2NF (ADDR depends on WH, a proper subset of the candidate key {PART, WH})
2NF:
R1 = {PART, WH, QTY}   PART,WH → QTY
R2 = {WH, ADDR}        WH → ADDR
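The 2NF definition can be checked mechanically on small schemes. The sketch below finds all candidate keys by brute force, then looks for a non-key attribute that is determined by a proper subset of some candidate key. The brute-force search and the function names are choices made for this illustration and are only practical for a handful of attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole scheme."""
    keys = []
    for size in range(1, len(attrs) + 1):       # smallest sets first
        for combo in combinations(sorted(attrs), size):
            k = set(combo)
            # Skip supersets of keys already found: they are not minimal.
            if closure(k, fds) >= attrs and not any(key <= k for key in keys):
                keys.append(k)
    return keys


def is_2nf(attrs, fds):
    keys = candidate_keys(attrs, fds)
    prime = set().union(*keys)                  # attributes in some candidate key
    for a in attrs - prime:
        for key in keys:
            for size in range(1, len(key)):     # proper, non-empty subsets
                for sub in combinations(sorted(key), size):
                    if a in closure(set(sub), fds):
                        return False            # partial dependency found
    return True


R = {"PART", "WH", "QTY", "ADDR"}
F = [({"PART", "WH"}, {"QTY"}), ({"WH"}, {"ADDR"})]
print(is_2nf(R, F))                                                   # False
print(is_2nf({"PART", "WH", "QTY"}, [({"PART", "WH"}, {"QTY"})]))     # True
print(is_2nf({"WH", "ADDR"}, [({"WH"}, {"ADDR"})]))                   # True
```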

Normal Forms: 3NF
A relation scheme R is in 3NF if for all FDs that hold on R of the form X → Y,
where X ⊆ R and Y ⊆ R, at least one of the following holds:
• X → Y is a trivial FD
• X is a superkey
• Y is contained in a candidate key for R

Example: R = {EMP, DEPT, LOC}

F = {f1: EMP → DEPT, f2: DEPT → LOC}
R is in 2NF, but not in 3NF since DEPT → LOC violates 3NF.

3NF: R1 = {EMP, DEPT}   F1 = {EMP → DEPT}
     R2 = {DEPT, LOC}   F2 = {DEPT → LOC}
Normal Forms: Boyce-Codd Normal Form
(BCNF)
A relation scheme R is in BCNF if for all FDs that hold on R of the form X → Y, where X ⊆ R &
Y ⊆ R, at least one of the following holds:
• X → Y is a trivial FD
• X is a superkey for scheme R

Example: R = {EMP, PHONE, PROJECT, HRS}

F = {EMP → PHONE, EMP PROJECT → HRS,
     PHONE → EMP, PHONE PROJECT → HRS}
candidate keys: EMP PROJECT and PHONE PROJECT
3NF, but not BCNF: neither EMP nor PHONE is a superkey

BCNF: R1 = {EMP, PHONE}          F1 = {EMP → PHONE, PHONE → EMP}
      R2 = {EMP, PROJECT, HRS}   F2 = {EMP,PROJECT → HRS}
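The 3NF and BCNF conditions differ only in the third 3NF bullet, which a short check makes visible. The sketch below tests only the FDs listed in F. That is enough for the scheme F is defined on, but it is an assumption that does not carry over to a sub-scheme produced by decomposition, where FDs from F+ must be considered. Function names are hypothetical and candidate keys are found by brute force.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole scheme."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            k = set(combo)
            if closure(k, fds) >= attrs and not any(key <= k for key in keys):
                keys.append(k)
    return keys


def is_bcnf(attrs, fds):
    # Every FD must be trivial or have a superkey on its left-hand side.
    return all(rhs <= lhs or closure(lhs, fds) >= attrs for lhs, rhs in fds)


def is_3nf(attrs, fds):
    prime = set().union(*candidate_keys(attrs, fds))
    return all(
        rhs <= lhs                         # trivial FD
        or closure(lhs, fds) >= attrs      # lhs is a superkey
        or (rhs - lhs) <= prime            # rhs lies inside candidate keys
        for lhs, rhs in fds
    )


R = {"EMP", "PHONE", "PROJECT", "HRS"}
F = [({"EMP"}, {"PHONE"}), ({"EMP", "PROJECT"}, {"HRS"}),
     ({"PHONE"}, {"EMP"}), ({"PHONE", "PROJECT"}, {"HRS"})]
print(is_3nf(R, F), is_bcnf(R, F))     # True False

R2 = {"EMP", "DEPT", "LOC"}
F2 = [({"EMP"}, {"DEPT"}), ({"DEPT"}, {"LOC"})]
print(is_3nf(R2, F2), is_bcnf(R2, F2)) # False False
```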
Normal Forms: Additional Details
• There is a BCNF Decomposition Algorithm
• BCNF decomposition guarantees a lossless-join decomposition
• However, not every BCNF decomposition is dependency preserving
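The slides do not spell the algorithm out. A common textbook formulation, sketched here under that assumption, repeatedly picks a scheme Ri with a violating attribute set X (non-trivial, not a superkey of Ri) and splits Ri into X+ ∩ Ri and Ri − (X+ − X). Each split is lossless because the two parts share X and X determines the first part. The violation search enumerates subsets, which is a simplification chosen for small schemes; function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violation(ri, fds):
    """Return (X, X+ ∩ Ri) for some X violating BCNF in Ri, or None."""
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            x = set(combo)
            determined = closure(x, fds) & ri
            # Non-trivial (x < determined) and X is not a superkey of Ri.
            if x < determined < ri:
                return x, determined
    return None


def bcnf_decompose(attrs, fds):
    """Split attrs into BCNF schemes; the result depends on search order."""
    todo, done = [set(attrs)], []
    while todo:
        ri = todo.pop()
        hit = bcnf_violation(ri, fds)
        if hit is None:
            done.append(ri)
        else:
            x, determined = hit
            todo.append(determined)              # X plus what it determines
            todo.append(ri - (determined - x))   # the rest, keeping X to join on
    return done


R = {"EMP", "PHONE", "PROJECT", "HRS"}
F = [({"EMP"}, {"PHONE"}), ({"EMP", "PROJECT"}, {"HRS"}),
     ({"PHONE"}, {"EMP"}), ({"PHONE", "PROJECT"}, {"HRS"})]
print(sorted(sorted(s) for s in bcnf_decompose(R, F)))
# [['EMP', 'HRS', 'PROJECT'], ['EMP', 'PHONE']]
```

On this input the FD PHONE PROJECT → HRS is no longer checkable inside a single scheme, although it is still implied by the FDs that are, so dependency preservation has to be tested separately.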

Design Goals
• Goal for a relational database design is:
• BCNF (no redundant information)
• Lossless join
• Dependency preservation
• If we cannot achieve this, we accept:
• 3NF (possible repetition of information)
• Lossless join
• Dependency preservation
