
DB_Chapter_4

Chapter 4 discusses logical database design, focusing on structuring entity attributes into relational database tables while ensuring non-redundancy and proper relationships through foreign keys. It outlines the ER-to-relational mapping algorithm, detailing steps for mapping regular and weak entity types, binary relationships, and multivalued attributes. The chapter also emphasizes the importance of normalization, integrity constraints, and managing redundancy to avoid data anomalies in database design.

Uploaded by

guyoboru12345
Copyright
© All Rights Reserved

Chapter - 4

Logical database design


Logical Database Design
• Logical database design is the process of deciding how to arrange the attributes of the entities in a given business environment into database structures, such as the tables of a relational database.
• Logical design is the process of constructing a model of the data used in an enterprise based on a specific data model (in our case relational), but independent of a particular DBMS and other physical considerations.
Logical Database Design…

• The goal of logical database design is to create well-structured tables that properly reflect the company's business environment.
• The tables store data about the company's entities in a non-redundant manner, and foreign keys are placed in the tables so that all the relationships among the entities are supported.
• The purpose of logical design is to translate the conceptual representation into the logical structure of the database, which includes designing the relations.
Logical Database Design…

• Recall that a relational database is a database of two-dimensional tables called relations. The tables are composed of rows (tuples) and columns (attributes).
• In a relational database, all attributes must be atomic (simple), and primary key attributes must not be null.
• The process of converting an ER diagram into a relational database schema is called mapping.
ER-to-Relational Mapping Algorithm
– Step 1: Mapping of Regular Entity Types
– Step 2: Mapping of Weak Entity Types
– Step 3: Mapping of Binary 1:1 Relationship Types
– Step 4: Mapping of Binary 1:N Relationship Types
– Step 5: Mapping of Binary M:N Relationship Types
– Step 6: Mapping of Multivalued Attributes
– Step 7: Mapping of N-ary Relationship Types

Compiled by Daniel S.
FIGURE 9.1
The ER conceptual schema diagram for the COMPANY database.

ER-to-Relational Mapping Algorithm

• Step 1: Mapping of Regular Entity Types.
– For each regular (strong) entity type E in the ER schema, create a relation R that includes all the simple attributes of E.
– Choose one of the key attributes of E as the primary key for R.
– If the chosen key of E is composite, the set of simple attributes that form it will together form the primary key of R.
• Example: We create the relations EMPLOYEE, DEPARTMENT, and PROJECT in the relational schema corresponding to the regular entities in the ER diagram.
– SSN, DNUMBER, and PNUMBER are the primary keys for the relations EMPLOYEE, DEPARTMENT, and PROJECT as shown.
ER-to-Relational Mapping Algorithm (cont)
• Step 2: Mapping of Weak Entity Types
– For each weak entity type W in the ER schema with owner entity type E, create a relation R and include all simple attributes (or simple components of composite attributes) of W as attributes of R.
– Also, include as foreign key attributes of R the primary key attribute(s) of the relation(s) that correspond to the owner entity type(s).
– The primary key of R is the combination of the primary key(s) of the owner(s) and the partial key of the weak entity type W, if any.
• Example: Create the relation DEPENDENT in this step to correspond to the weak entity type DEPENDENT.
– Include the primary key SSN of the EMPLOYEE relation as a foreign key attribute of DEPENDENT (renamed to ESSN).
– The primary key of the DEPENDENT relation is the combination {ESSN, DEPENDENT_NAME} because DEPENDENT_NAME is the partial key of DEPENDENT.
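
Steps 1 and 2 can be sketched as table definitions. The sketch below is only an illustration: it assumes SQLite through Python's sqlite3 module, and the LNAME column and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
CREATE TABLE EMPLOYEE (
    SSN   TEXT PRIMARY KEY,   -- Step 1: chosen key attribute becomes the primary key
    LNAME TEXT                -- hypothetical simple attribute
);
CREATE TABLE DEPENDENT (
    ESSN           TEXT NOT NULL REFERENCES EMPLOYEE(SSN),  -- owner's key as foreign key
    DEPENDENT_NAME TEXT NOT NULL,                           -- partial key of the weak entity
    PRIMARY KEY (ESSN, DEPENDENT_NAME)                      -- Step 2: owner key + partial key
);
""")
conn.execute("INSERT INTO EMPLOYEE VALUES ('123456789', 'Smith')")
conn.execute("INSERT INTO DEPENDENT VALUES ('123456789', 'Alice')")

def try_insert(essn, name):
    """Return True if the DEPENDENT row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO DEPENDENT VALUES (?, ?)", (essn, name))
        return True
    except sqlite3.IntegrityError:
        return False

orphan_ok = try_insert("999999999", "Bob")       # no such owner employee
duplicate_ok = try_insert("123456789", "Alice")  # repeats the composite key
```

Both inserts are rejected: the first has no owner row, and the second repeats an existing {ESSN, DEPENDENT_NAME} combination.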

Example for step-2

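ER-to-Relational Mapping Algorithm (cont)
• Step 3: Mapping of Binary 1:1 Relationship Types
– For each binary 1:1 relationship type R in the ER schema, identify the relations S and T that correspond to the entity types participating in R. There are three possible options:
1. Foreign key option: Choose one of the relations, say S, and include as a foreign key in S the primary key of T. It is better to choose an entity type with total participation in R in the role of S.
Example: The 1:1 relationship type MANAGES is mapped by choosing the participating entity type DEPARTMENT to serve in the role of S, because its participation in the MANAGES relationship type is total (mandatory).
2. Merged relation option: An alternate mapping of a 1:1 relationship type is possible by merging the two entity types and the relationship into a single relation. This may be appropriate when both participations are total, i.e. participation is mandatory on both sides of the 1:1 relationship.
3. Cross-reference or relationship relation option: The third alternative is to set up a third relation R for the purpose of cross-referencing the primary keys of the two relations S and T representing the entity types.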
Example for step-3

ER-to-Relational Mapping Algorithm (cont)
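• Step 4: Mapping of Binary 1:N Relationship Types.
– For each regular binary 1:N relationship type R, identify the relation S that represents the participating entity type at the N-side of R, and include as a foreign key in S the primary key of the relation T that represents the entity type at the 1-side.
• Example: 1:N relationship types WORKS_FOR, CONTROLS, and SUPERVISION in the figure.
– For WORKS_FOR we include the primary key DNUMBER of the DEPARTMENT relation as a foreign key in the EMPLOYEE relation and call it DNO.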
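
A minimal sketch of the WORKS_FOR mapping, assuming SQLite through Python's sqlite3 module. The DNAME column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (
    DNUMBER INTEGER PRIMARY KEY,
    DNAME   TEXT                 -- hypothetical attribute
);
CREATE TABLE EMPLOYEE (
    SSN TEXT PRIMARY KEY,
    DNO INTEGER REFERENCES DEPARTMENT(DNUMBER)  -- WORKS_FOR, stored on the N side
);
INSERT INTO DEPARTMENT VALUES (5, 'Research'), (4, 'Administration');
INSERT INTO EMPLOYEE VALUES ('111', 5), ('222', 5), ('333', 4);
""")

# One DEPARTMENT row can be referenced by many EMPLOYEE rows.
staff_per_dept = conn.execute("""
    SELECT D.DNAME, COUNT(E.SSN)
    FROM DEPARTMENT AS D JOIN EMPLOYEE AS E ON E.DNO = D.DNUMBER
    GROUP BY D.DNUMBER
    ORDER BY D.DNUMBER
""").fetchall()
```

No extra table is needed for the relationship; the foreign key DNO alone records which department each employee works for.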

Example for step-4

Example for step-4
Finally step 4 becomes

ER-to-Relational Mapping Algorithm (cont)

• Step 5: Mapping of Binary M:N Relationship Types.
– For each regular binary M:N relationship type R, create a new relation S to represent R, i.e. a third relation containing the primary keys of both entity types and the descriptive attributes of R (if any).
– Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types; their combination will form the primary key of S.
– Also include any simple attributes of the M:N relationship type (or simple components of composite attributes) as attributes of S.
• Example: The M:N relationship type WORKS_ON from the ER diagram is mapped by creating a relation WORKS_ON in the relational database schema. The primary keys of the PROJECT and EMPLOYEE relations are included as foreign keys in WORKS_ON and renamed PNO and ESSN, respectively.
– Attribute HOURS in WORKS_ON represents the HOURS attribute of the relationship type. The primary key of the WORKS_ON relation is the combination of the foreign key attributes {ESSN, PNO}.
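
The WORKS_ON mapping can be sketched as follows, again assuming SQLite through Python's sqlite3 module. The sample rows are hypothetical, and EMPLOYEE and PROJECT are reduced to their keys for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY);
CREATE TABLE PROJECT  (PNUMBER INTEGER PRIMARY KEY);
CREATE TABLE WORKS_ON (
    ESSN  TEXT    NOT NULL REFERENCES EMPLOYEE(SSN),
    PNO   INTEGER NOT NULL REFERENCES PROJECT(PNUMBER),
    HOURS REAL,                 -- attribute of the relationship type
    PRIMARY KEY (ESSN, PNO)     -- combination of the two foreign keys
);
INSERT INTO EMPLOYEE VALUES ('111'), ('222');
INSERT INTO PROJECT VALUES (1), (2);
INSERT INTO WORKS_ON VALUES ('111', 1, 32.5), ('111', 2, 7.5), ('222', 1, 20.0);
""")

# The junction table can be read in both directions of the M:N relationship.
projects_of_111 = [r[0] for r in conn.execute(
    "SELECT PNO FROM WORKS_ON WHERE ESSN = '111' ORDER BY PNO")]
workers_on_1 = [r[0] for r in conn.execute(
    "SELECT ESSN FROM WORKS_ON WHERE PNO = 1 ORDER BY ESSN")]
```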
Example for step-5

ER-to-Relational Mapping Algorithm (cont)

• Step 6: Mapping of Multivalued Attributes.
– For each multivalued attribute A, create a new relation R.
– Add the primary key of the owning entity type to the new relation as a foreign key.
– The foreign key attribute and the multivalued attribute together form the composite primary key of R.
• Example: The relation DEPT_LOCATIONS is created. The attribute DLOCATION represents the multivalued attribute LOCATIONS of DEPARTMENT, while DNUMBER, as a foreign key, represents the primary key of the DEPARTMENT relation. The primary key of R is the combination of {DNUMBER, DLOCATION}.
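
As a sketch of this step, assuming SQLite through Python's sqlite3 module and hypothetical location values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY);
CREATE TABLE DEPT_LOCATIONS (
    DNUMBER   INTEGER NOT NULL REFERENCES DEPARTMENT(DNUMBER),
    DLOCATION TEXT    NOT NULL,
    PRIMARY KEY (DNUMBER, DLOCATION)   -- one row per (department, location) pair
);
INSERT INTO DEPARTMENT VALUES (1), (5);
INSERT INTO DEPT_LOCATIONS VALUES
    (1, 'Houston'), (5, 'Bellaire'), (5, 'Houston'), (5, 'Sugarland');
""")

# Each value of the multivalued attribute is stored as its own atomic row.
locations_of_5 = [r[0] for r in conn.execute(
    "SELECT DLOCATION FROM DEPT_LOCATIONS WHERE DNUMBER = 5 ORDER BY DLOCATION")]
```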

Example for step-6

ER-to-Relational Mapping Algorithm (cont)
• Step 7: Mapping of N-ary Relationship Types.
– For each n-ary relationship type R, where n>2, create a new relation S to represent R.
– Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types.
– Also include any simple attributes of the n-ary relationship type (or simple components of composite attributes) as attributes of S.
• Example: The relationship type SUPPLY in the ER diagram on the next slide.
– This can be mapped to the relation SUPPLY shown in the relational schema, whose primary key is the combination of the three foreign keys {SNAME, PARTNO, PROJNAME}.
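
A minimal sketch of the SUPPLY mapping, assuming SQLite through Python's sqlite3 module. The QUANTITY column and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE SUPPLIER (SNAME TEXT PRIMARY KEY);
CREATE TABLE PART     (PARTNO INTEGER PRIMARY KEY);
CREATE TABLE PROJECT  (PROJNAME TEXT PRIMARY KEY);
CREATE TABLE SUPPLY (
    SNAME    TEXT    NOT NULL REFERENCES SUPPLIER(SNAME),
    PARTNO   INTEGER NOT NULL REFERENCES PART(PARTNO),
    PROJNAME TEXT    NOT NULL REFERENCES PROJECT(PROJNAME),
    QUANTITY INTEGER,                      -- assumed descriptive attribute
    PRIMARY KEY (SNAME, PARTNO, PROJNAME)  -- the three foreign keys together
);
INSERT INTO SUPPLIER VALUES ('Acme'), ('Bolt');
INSERT INTO PART VALUES (1), (2);
INSERT INTO PROJECT VALUES ('Bridge'), ('Tower');
INSERT INTO SUPPLY VALUES
    ('Acme', 1, 'Bridge', 100), ('Acme', 1, 'Tower', 40), ('Bolt', 1, 'Bridge', 25);
""")

# Which suppliers deliver part 1 to the Bridge project, and how much?
bridge_part1 = conn.execute(
    "SELECT SNAME, QUANTITY FROM SUPPLY "
    "WHERE PARTNO = 1 AND PROJNAME = 'Bridge' ORDER BY SNAME"
).fetchall()
```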

FIGURE Ternary relationship types. (a) The SUPPLY relationship.

FIGURE Mapping the n-ary relationship type SUPPLY from the figure above.
Class work
• Create the relations that result from mapping the figure below.
Mapping Exercise

Summary of Mapping constructs and constraints
Mapping from Conceptual to Logical Model
The mapping process involves several systematic transformations:
1. Entity Mapping
• Each entity in the conceptual model becomes a table in the logical
model
• Composite attributes are decomposed into simple attributes
• Multi valued attributes become separate tables
2. Relationship Mapping
• One-to-one (1:1): Foreign key in either table or merged into one table
• One-to-many (1:M): Foreign key in the "many" side table
• Many-to-many (M:N): Create junction/intersection table with foreign
keys to both entities
• N-ary relationships: Create junction table with foreign keys to all participating entities
Summary of Mapping constructs and constraints….

3. Attribute Mapping
• Simple attributes become columns
• Composite attributes are broken down into component
columns
• Derived attributes are marked (may or may not be
physically implemented)
• Domain constraints are specified for each attribute
4. Constraint Mapping
• Primary keys are identified for all tables
• Foreign keys and referential integrity constraints are
defined
• Business rules are translated into check constraints
Validate relations using normalization

• The logical design should contain only properly normalized tables.
• Check the composition of each table using the rules of normalization, to avoid unnecessary duplication of data.
• For each identified table (old and new), you must ensure that all attributes are fully dependent on the identified primary key.
• Ensure each table is in at least 3NF.
Check key and Integrity Constraints

• Check that integrity constraints are represented in the logical data model. This includes identifying:
– Required data (e.g. NOT NULL): Some attributes must always contain a value.
– Attribute domain constraints: Every attribute has a set of legal values (its domain).
– Structural constraints: Handled by relationship cardinality and participation.
– Entity integrity: The primary key must be unique and not null.
– Referential integrity (see Referential Integrity constraints).
Check key and Integrity Constraints
– As stated previously, a foreign key is an attribute in one table whose values are copied from the primary key of another table. Foreign key values are used to link tables together.
– The attribute staffNo in the Client table is a foreign key, as it is the primary key of the Staff table.
Cascading Referential Integrity constraints
– We must ensure that cascading referential integrity
constraints are specified that define conditions under
which a candidate key or foreign key may be inserted,
deleted, or updated:
– Insert tuple into child relation: Value of foreign key must
match primary key value of parent table or else be null.
Inserting a Client record: staffNo must match an existing
staffNo from the Staff table or else be NULL.

– Delete tuple from child relation: No effect on the parent table. Deleting a Client record: No effect on the Staff table.
Cascading Referential Integrity constraints

• Update foreign key of child tuple: Value of foreign key must match primary key value of parent table or else be null. Updating a Client record: staffNo must match an existing staffNo from the Staff table or else be NULL.
• Insert tuple into parent relation: No effect on the child table. Inserting a Staff record: No effect on the Client table.
Cascading Referential Integrity constraints
• Delete tuple from parent relation: This has an effect on the child table. What if we deleted a Staff member that is associated with Clients? The corresponding Client records would refer to a staff member that no longer exists in the Staff table, which violates referential integrity. Choose the appropriate action based on your business logic:
– No Action: Prevent a deletion from the parent relation if there are any referencing child tuples.
– Cascade: When a parent tuple is deleted, automatically delete any referencing child tuples.
– Set Null: When a parent tuple is deleted, the foreign key values in all corresponding child tuples are automatically set to null.
– Set Default: When a parent tuple is deleted, the foreign key values in all corresponding child tuples are automatically set to their default values.
– No Check: When a parent tuple is deleted, do nothing to ensure that referential integrity is maintained.
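
Three of these actions can be compared side by side. The sketch below assumes SQLite through Python's sqlite3 module (other DBMSs may behave differently), and the helper name delete_staff_with, the clientNo column, and the sample rows are hypothetical.

```python
import sqlite3

def delete_staff_with(action):
    """Delete a referenced Staff row under the given ON DELETE action
    and return the Client rows that remain afterwards."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
    CREATE TABLE Staff (staffNo TEXT PRIMARY KEY);
    CREATE TABLE Client (
        clientNo TEXT PRIMARY KEY,
        staffNo  TEXT REFERENCES Staff(staffNo) ON DELETE {action}
    );
    INSERT INTO Staff VALUES ('s1234'), ('s5678');
    INSERT INTO Client VALUES ('c1', 's1234'), ('c2', 's5678');
    """)
    try:
        conn.execute("DELETE FROM Staff WHERE staffNo = 's1234'")
    except sqlite3.IntegrityError:
        pass  # the delete was rejected, so both tables are unchanged
    return conn.execute(
        "SELECT clientNo, staffNo FROM Client ORDER BY clientNo").fetchall()

results = {a: delete_staff_with(a) for a in ("NO ACTION", "CASCADE", "SET NULL")}
```

Under NO ACTION both Client rows survive because the delete is refused, under CASCADE client c1 is deleted with its staff member, and under SET NULL client c1 is kept with a NULL staffNo.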

Cascading Referential Integrity constraints

• Update primary key of parent tuple: Again, this has effects on the
child table. What would happen if all staff numbers in the Staff
table were updated from 5 to 6 characters? For example, s1234
became s01234? Staff numbers in the Client table would no longer
be valid.
– If the primary key value of a parent relation tuple is updated, referential integrity is lost if there exists a child tuple referencing the old primary key value. To ensure referential integrity, the same strategies as for (Delete tuple from parent relation) will suffice.
• Note that the options available in MySQL are: cascade, set null, no
action, and restrict. No Action is a keyword from standard SQL. In
MySQL, it is equivalent to RESTRICT.
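
The staff number change from s1234 to s01234 can be sketched with the cascade strategy. This assumes SQLite through Python's sqlite3 module rather than MySQL, and the clientNo column and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Staff (staffNo TEXT PRIMARY KEY);
CREATE TABLE Client (
    clientNo TEXT PRIMARY KEY,
    staffNo  TEXT REFERENCES Staff(staffNo) ON UPDATE CASCADE
);
INSERT INTO Staff VALUES ('s1234');
INSERT INTO Client VALUES ('c1', 's1234');
""")

# Changing the parent key is propagated to the referencing child row.
conn.execute("UPDATE Staff SET staffNo = 's01234' WHERE staffNo = 's1234'")
client_staff = conn.execute(
    "SELECT staffNo FROM Client WHERE clientNo = 'c1'").fetchone()[0]
```

After the update, the Client row refers to s01234, so referential integrity is preserved without touching the Client table by hand.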
Redundancy and Data Anomaly
• What is Redundancy
• Redundancy refers to the duplication of data within a database system. While some degree of redundancy is inevitable and even necessary for efficient data retrieval, excessive redundancy can lead to various issues, including
– increased storage requirements,
– data inconsistency, and
– decreased performance.
Redundancy and Data Anomaly
• Redundancy, in the context of a DBMS, occurs when the same
data is stored in multiple locations within a database.
• It can arise due to various reasons, such as
– denormalized database design,
– a lack of proper data modeling, and
– the replication of data for backup or distribution purposes.

• Redundancy can exist at the attribute level (repeating data values within a single record) or at the relation level (repeating entire records across multiple tables).
Redundancy and Data Anomaly

If SID is the primary key of each row, you can use it to remove the duplicate rows, as shown below.

Before:
SID  SName  Age
1    Jojo   20
2    Kit    25
1    Jojo   20

After:
SID  SName  Age
1    Jojo   20
2    Kit    25
Redundancy and Data Anomaly (Cont..)
• Column Level Redundancy:
• Here no two rows are identical, because Sid is the primary key, but the same values are repeated in the other columns.

Sid  Sname  Cid  Cname  Fid  Fname  Salary
1    AA     C1   DBMS   F1   Jojo   30000
2    BB     C2   JAVA   F2   KK     50000
3    CC     C1   DBMS   F1   Jojo   30000
4    DD     C1   DBMS   F1   Jojo   30000
Redundancy and Data Anomaly
• What is Data Anomaly
• Problems that can occur in poorly planned, unnormalized databases
where all the data is stored in one table (a flat-file database).
• Problems caused due to redundancy are anomalies.
• There are three types of anomalies that may occur when the
database is not normalized.
– Insertion Anomaly
– Update Anomaly
– Deletion Anomaly

Redundancy and Data Anomaly
• Insertion Anomaly
• An Insert Anomaly occurs when certain attributes cannot be inserted into the database without the presence of other attributes.
• Suppose a new faculty member joins the University, and the Database Administrator tries to insert the faculty data into the above table. The insert fails because Sid is the primary key and cannot be NULL, and no student is yet associated with the new faculty member. This type of anomaly is known as an insertion anomaly.
Redundancy and Data Anomaly
• Delete Anomaly
• A Delete Anomaly exists when certain attributes are lost because
of the deletion of other attributes.
• When the Database Administrator deletes the student details of Sid=2 from the above table, the faculty and course information stored in that row is deleted too and cannot be recovered.

SQL:
DELETE FROM University WHERE Sid = 2;
Redundancy and Data Anomaly
• Update Anomaly
• An Update Anomaly exists when one or more instances of duplicated data are updated, but not all.
• When the Database Administrator wants to change the salary of faculty F1 from 30000 to 40000 in the above University table, the salary must be updated in every row that stores F1, because of data redundancy. If any of these rows is missed, the table holds two different salaries for the same faculty member. This is an update anomaly.
SQL:
UPDATE University
SET Salary = 40000
WHERE Fid = 'F1';

To remove all these anomalies, we need to normalize the data in the database.
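
The three anomalies can be reproduced on the University table. The sketch assumes SQLite through Python's sqlite3 module; the new faculty member F3 and the partial update restricted to Sid = 1 are hypothetical and chosen only to trigger each anomaly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE University (
    Sid INT NOT NULL PRIMARY KEY,
    Sname TEXT, Cid TEXT, Cname TEXT, Fid TEXT, Fname TEXT, Salary INT
);
INSERT INTO University VALUES
    (1, 'AA', 'C1', 'DBMS', 'F1', 'Jojo', 30000),
    (2, 'BB', 'C2', 'JAVA', 'F2', 'KK',   50000),
    (3, 'CC', 'C1', 'DBMS', 'F1', 'Jojo', 30000),
    (4, 'DD', 'C1', 'DBMS', 'F1', 'Jojo', 30000);
""")

# Insertion anomaly: a faculty member without a student has no Sid to store.
try:
    conn.execute(
        "INSERT INTO University (Fid, Fname, Salary) VALUES ('F3', 'LL', 45000)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Update anomaly: changing only one of the F1 rows leaves two salaries for F1.
conn.execute("UPDATE University SET Salary = 40000 WHERE Fid = 'F1' AND Sid = 1")
f1_salaries = sorted(r[0] for r in conn.execute(
    "SELECT DISTINCT Salary FROM University WHERE Fid = 'F1'"))

# Deletion anomaly: removing student 2 also removes every trace of faculty F2.
conn.execute("DELETE FROM University WHERE Sid = 2")
f2_rows = conn.execute(
    "SELECT COUNT(*) FROM University WHERE Fid = 'F2'").fetchone()[0]
```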
Functional Dependency (FD)
• A functional dependency (FD) is a relationship between attributes, typically between the primary key and other non-key attributes within a table.
• A functional dependency, denoted by X → Y, is an association between two sets of attributes X and Y in which each value of X is associated with exactly one value of Y. Here, X is called the determinant, and Y is called the dependent.
• For example,
– SIN → Name, Address, Birthdate
• Here, SIN determines Name, Address and Birthdate. So, SIN is the determinant and Name, Address and Birthdate are the dependents.
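
Whether sample data is consistent with a functional dependency can be checked mechanically. The helper fd_holds and the sample rows below are hypothetical; note that sample data can only refute a dependency, since the dependency itself is a rule about all legal data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while disagreeing on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True

people = [
    {"SIN": 1, "Name": "Abel", "Address": "12 Main St", "Birthdate": "1990-01-01"},
    {"SIN": 2, "Name": "Sara", "Address": "12 Main St", "Birthdate": "1992-05-17"},
    {"SIN": 3, "Name": "Abel", "Address": "9 Hill Rd",  "Birthdate": "1985-11-30"},
]

sin_determines_all = fd_holds(people, ["SIN"], ["Name", "Address", "Birthdate"])
name_determines_address = fd_holds(people, ["Name"], ["Address"])
```

Here SIN → Name, Address, Birthdate is consistent with the rows, while Name → Address is refuted by the two people named Abel.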
Functional Dependency (FD)…

• Types of functional dependency
• The following are the types of functional dependency in DBMS:
1. Fully-Functional Dependency
2. Partial Dependency
3. Transitive Dependency
4. Trivial Dependency
5. Multivalued Dependency
Functional Dependency (FD)…
1. Full functional Dependency
• A functional dependency X → Y is said to be a full functional dependency if the dependency no longer holds after the removal of any attribute A from X, i.e. Y is fully functionally dependent on X if it is functionally dependent on X and not on any proper subset of X. For example,
– {Emp_num, Proj_num} → Hour is a full functional dependency. Here, Hour is the time worked by an employee on a project.
– {Student_id, Subject_id} → Marks
Functional Dependency (FD)…
2. Partial functional Dependency
• A functional dependency X → Y is said to be a partial functional dependency if the dependency still holds after some attribute A is removed from X, i.e. Y is dependent on a proper subset of X.
• It is the dependency where non-key attributes are functionally dependent on only part of the composite key.
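
The difference between full and partial dependency can be sketched by testing every proper subset of the determinant. The helpers fd_holds and dependency_kind, the Emp_name column, and the sample rows are hypothetical; the result describes the sample data only.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while disagreeing on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True

def dependency_kind(rows, lhs, rhs):
    """Classify lhs -> rhs as 'full', 'partial', or 'none' for the given rows."""
    if not fd_holds(rows, lhs, rhs):
        return "none"
    # A non-empty proper subset of lhs that still determines rhs makes it partial.
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return "partial"
    return "full"

work = [
    {"Emp_num": 1, "Proj_num": 10, "Hour": 5, "Emp_name": "Abel"},
    {"Emp_num": 1, "Proj_num": 20, "Hour": 8, "Emp_name": "Abel"},
    {"Emp_num": 2, "Proj_num": 10, "Hour": 3, "Emp_name": "Sara"},
    {"Emp_num": 2, "Proj_num": 20, "Hour": 8, "Emp_name": "Sara"},
]
key = ("Emp_num", "Proj_num")
hour_kind = dependency_kind(work, key, ["Hour"])
name_kind = dependency_kind(work, key, ["Emp_name"])
```

Hour needs both key attributes, so its dependency is full. Emp_name is already determined by Emp_num alone, so its dependency on the composite key is partial.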

Functional Dependency (FD)…
3. Transitive Dependency
• A functional dependency X → Z is said to be a transitive functional dependency if there exist functional dependencies X → Y and Y → Z, i.e. it is an indirect relationship.
• E.g. Emp_ID → Dept_ID → Dept_Name (where Dept_Name transitively depends on Emp_ID).
• For example,
– Staff_No → Branch_No and Branch_No → BAddress
Functional Dependency (FD)
4. Trivial Dependency
• A functional dependency X → Y is said to be a trivial functional dependency if Y is a subset of X.
• For example,
– {Emp_ID, Dept_ID} → Emp_ID (the dependent Emp_ID is already part of the determinant).

5. Multivalued Dependency
• A multivalued dependency occurs when there are multiple independent multivalued attributes in a single table.
• A multivalued dependency is a full constraint between two sets of attributes in a relation. It requires that certain tuples be present in the relation.
Functional Dependency (FD)
• Example: Consider the following table

The dependencies
car_model →→ manufr_year
car_model →→ colour
are multivalued dependencies, since manufr_year and colour are both multivalued attributes of car_model and are independent of each other.
Normalization
• Normalization is the process of decomposing relations into well-structured relations in order to remove data redundancy and the insertion, update and deletion anomalies it causes.
• The normal forms used for normalization are:
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
– Boyce-Codd normal form (BCNF)
– Fourth normal form (4NF)
– Fifth normal form (5NF)
Normalization – 1NF
• The first normal form is based on simple (atomic), single-valued attributes. A relation is said to be in 1NF if all the attributes of the relation are atomic and single valued.
• Example:

The relation employee is not in 1NF, because the employees with employee id 102 and 104 each have two phone numbers, i.e. the Emp_mobile attribute is a multivalued attribute.
Normalization – 1NF
• After normalization to 1NF the relation will be
like this:
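
As a minimal sketch of this conversion, the multivalued Emp_mobile attribute can be flattened into one row per phone number. The employee names and phone numbers below are hypothetical; only the fact that employees 102 and 104 have two numbers comes from the example.

```python
# Unnormalized form: Emp_mobile holds a list, so it is not atomic.
employees = [
    {"Emp_id": 101, "Emp_name": "A", "Emp_mobile": ["0911000001"]},
    {"Emp_id": 102, "Emp_name": "B", "Emp_mobile": ["0911000002", "0911000003"]},
    {"Emp_id": 103, "Emp_name": "C", "Emp_mobile": ["0911000004"]},
    {"Emp_id": 104, "Emp_name": "D", "Emp_mobile": ["0911000005", "0911000006"]},
]

# 1NF: repeat the employee once per phone number so every value is atomic.
first_nf = [
    {"Emp_id": e["Emp_id"], "Emp_name": e["Emp_name"], "Emp_mobile": mobile}
    for e in employees
    for mobile in e["Emp_mobile"]
]

# Emp_id alone no longer identifies a row; {Emp_id, Emp_mobile} does.
keys = {(r["Emp_id"], r["Emp_mobile"]) for r in first_nf}
```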

Normalization – 2NF
• The second normal form is based on full functional dependency. A relation is said to be in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key, i.e. no partial dependency exists.
• Example: Consider the following table:

This table has a composite primary key {Customer ID, Store ID}. The non-key attribute is
Purchase Location. In this case, Purchase Location only depends on Store ID, which is a
part of the primary key. Therefore, this table does not satisfy second normal form.

Normalization – 2NF
• To bring this table to second normal form, we have to break the table into two tables as shown below.
• Now, in the table TABLE_STORE, the column Purchase Location is fully dependent on the primary key of that table, which is Store ID.
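
The split can be sketched as follows, assuming SQLite through Python's sqlite3 module. The table names PURCHASE_FLAT and TABLE_PURCHASE and all sample rows are hypothetical; only TABLE_STORE and the column meanings come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Original table: Purchase_Location depends on Store_ID only (partial dependency).
CREATE TABLE PURCHASE_FLAT (
    Customer_ID INT, Store_ID INT, Purchase_Location TEXT,
    PRIMARY KEY (Customer_ID, Store_ID)
);
INSERT INTO PURCHASE_FLAT VALUES
    (1, 1, 'Los Angeles'), (1, 3, 'San Francisco'), (2, 1, 'Los Angeles'),
    (3, 2, 'New York'), (4, 3, 'San Francisco');

-- 2NF: the partially dependent column moves to a table keyed by Store_ID.
CREATE TABLE TABLE_STORE (Store_ID INT PRIMARY KEY, Purchase_Location TEXT);
INSERT INTO TABLE_STORE
    SELECT DISTINCT Store_ID, Purchase_Location FROM PURCHASE_FLAT;

CREATE TABLE TABLE_PURCHASE (
    Customer_ID INT,
    Store_ID INT REFERENCES TABLE_STORE(Store_ID),
    PRIMARY KEY (Customer_ID, Store_ID)
);
INSERT INTO TABLE_PURCHASE SELECT Customer_ID, Store_ID FROM PURCHASE_FLAT;
""")

original = conn.execute("SELECT * FROM PURCHASE_FLAT ORDER BY 1, 2").fetchall()
rejoined = conn.execute("""
    SELECT P.Customer_ID, P.Store_ID, S.Purchase_Location
    FROM TABLE_PURCHASE AS P JOIN TABLE_STORE AS S ON S.Store_ID = P.Store_ID
    ORDER BY 1, 2
""").fetchall()
store_rows = conn.execute("SELECT COUNT(*) FROM TABLE_STORE").fetchone()[0]
```

Joining the two tables gives back the original rows, while each store's location is now stored once instead of once per purchase.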

Example

Normalization – 3NF
• The third normal form is based on transitive dependency. A relation is said to be in 3NF if it is in 2NF and no transitive functional dependency exists.
• Example: Consider the following table:

In the above table, Book ID determines Genre ID, and Genre ID determines Genre Type. Therefore, Book ID determines Genre Type via Genre ID, so we have a transitive functional dependency, and this structure does not satisfy third normal form.
Normalization – 3NF
• To bring this table to 3NF, we split the table into two as follows:

Now all non-key attributes are fully functionally dependent only on the primary key. In TABLE_BOOK, both Genre ID and Price are only dependent on Book ID. In TABLE_GENRE, Genre Type is only dependent on Genre ID.
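
One benefit of this split can be sketched with SQLite through Python's sqlite3 module: a genre is renamed in a single row and every book sees the change. The sample rows and the new genre name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE TABLE_GENRE (
    Genre_ID   INTEGER PRIMARY KEY,
    Genre_Type TEXT NOT NULL
);
CREATE TABLE TABLE_BOOK (
    Book_ID  INTEGER PRIMARY KEY,
    Genre_ID INTEGER REFERENCES TABLE_GENRE(Genre_ID),
    Price    REAL
);
INSERT INTO TABLE_GENRE VALUES (1, 'Gardening'), (2, 'Sports');
INSERT INTO TABLE_BOOK VALUES (1, 1, 25.99), (2, 2, 14.99), (3, 1, 10.00);
""")

# Genre Type is stored once per genre, so a single-row update is enough.
conn.execute("UPDATE TABLE_GENRE SET Genre_Type = 'Horticulture' WHERE Genre_ID = 1")
book_genres = conn.execute("""
    SELECT B.Book_ID, G.Genre_Type
    FROM TABLE_BOOK AS B JOIN TABLE_GENRE AS G ON G.Genre_ID = B.Genre_ID
    ORDER BY B.Book_ID
""").fetchall()
```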

Normalization – BCNF
• Boyce-Codd normal form (BCNF) is an advanced version of 3NF. It is stricter than 3NF.
• A table is in BCNF if
1. it is in 3NF, and
2. for every nontrivial functional dependency X → Y, X (the left-hand side) is a super key of the table.
Normalization – BCNF
• Example: assume there is a company where employees work in more than one department.
• EMPLOYEE table:

EMP_ID  EMP_COUNTRY  EMP_DEPT    DEPT_TYPE  EMP_DEPT_NO
264     India        Designing   D394       283
264     India        Testing     D394       300
364     UK           Stores      D283       232
364     UK           Developing  D283       549
Normalization – BCNF
• In the above table, the functional dependencies are as follows:
– EMP_ID → EMP_COUNTRY
– EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
– Candidate key: {EMP_ID, EMP_DEPT}
• The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key, yet each appears as the determinant of a functional dependency.
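
The BCNF test can be sketched with an attribute closure computation over the two functional dependencies listed above. The helper names closure and bcnf_violations are hypothetical, and the check assumes all listed dependencies are nontrivial.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

ALL_ATTRS = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
FDS = [
    ({"EMP_ID"}, {"EMP_COUNTRY"}),
    ({"EMP_DEPT"}, {"DEPT_TYPE", "EMP_DEPT_NO"}),
]

def bcnf_violations(fds, all_attrs):
    """Return the FDs whose left side is not a super key of the table."""
    return [(lhs, rhs) for lhs, rhs in fds if closure(lhs, fds) != all_attrs]

violations = bcnf_violations(FDS, ALL_ATTRS)
key_closure = closure({"EMP_ID", "EMP_DEPT"}, FDS)
```

Both dependencies are reported as violations, while the closure of {EMP_ID, EMP_DEPT} covers every attribute, which confirms it as a key.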

Normalization – BCNF
• To convert the given table into BCNF, we decompose it into three tables:

EMP_COUNTRY table:
EMP_ID  EMP_COUNTRY
264     India
364     UK

EMP_DEPT table:
EMP_DEPT    DEPT_TYPE  EMP_DEPT_NO
Designing   D394       283
Testing     D394       300
Stores      D283       232
Developing  D283       549

EMP_DEPT_MAPPING table:
EMP_ID  EMP_DEPT
264     Designing
264     Testing
364     Stores
364     Developing
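
That the decomposition loses no information can be sketched by joining the three tables back together. The sketch assumes SQLite through Python's sqlite3 module and assumes that EMP_DEPT_MAPPING pairs each EMP_ID with the EMP_DEPT values it has in the EMPLOYEE table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMP_COUNTRY (EMP_ID INTEGER PRIMARY KEY, EMP_COUNTRY TEXT);
CREATE TABLE EMP_DEPT (
    EMP_DEPT TEXT PRIMARY KEY, DEPT_TYPE TEXT, EMP_DEPT_NO INTEGER
);
CREATE TABLE EMP_DEPT_MAPPING (
    EMP_ID   INTEGER REFERENCES EMP_COUNTRY(EMP_ID),
    EMP_DEPT TEXT    REFERENCES EMP_DEPT(EMP_DEPT),
    PRIMARY KEY (EMP_ID, EMP_DEPT)
);
INSERT INTO EMP_COUNTRY VALUES (264, 'India'), (364, 'UK');
INSERT INTO EMP_DEPT VALUES
    ('Designing', 'D394', 283), ('Testing', 'D394', 300),
    ('Stores', 'D283', 232), ('Developing', 'D283', 549);
INSERT INTO EMP_DEPT_MAPPING VALUES
    (264, 'Designing'), (264, 'Testing'), (364, 'Stores'), (364, 'Developing');
""")

# Joining on the keys rebuilds the rows of the original EMPLOYEE table.
employee_rows = conn.execute("""
    SELECT M.EMP_ID, C.EMP_COUNTRY, M.EMP_DEPT, D.DEPT_TYPE, D.EMP_DEPT_NO
    FROM EMP_DEPT_MAPPING AS M
    JOIN EMP_COUNTRY AS C ON C.EMP_ID = M.EMP_ID
    JOIN EMP_DEPT AS D ON D.EMP_DEPT = M.EMP_DEPT
    ORDER BY M.EMP_ID, D.EMP_DEPT_NO
""").fetchall()
```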

De-normalization
• De-normalization is a database optimization technique in which we add redundant data to the database to avoid complex join operations.
• It is done to speed up database access. De-normalization is applied after normalization to improve the performance of the database.
• The data from one table is included in another table to reduce the number of joins in a query, which helps speed up performance.