
Anomalies and Normalization Summary

This document discusses database normalization and anomalies. It defines three types of anomalies - update, insertion, and deletion anomalies - that can occur due to improperly normalized tables. It also defines three normal forms - 1NF, 2NF, and 3NF - with 3NF being fully normalized and free of anomalies. The document then provides steps for normalizing tables, including removing repeating groups, resolving partial dependencies, and eliminating transitive dependencies. The goal of normalization is to break tables into smaller, independent tables to reduce structural flaws and anomalies.

Uploaded by

Thomas_Godric
Copyright
© All Rights Reserved

Anomalies, Structural Dependencies, and Data Normalization

Database Anomalies - negative operational symptoms exhibited by improperly normalized
tables, which can cause DBMS processing problems that restrict, or even deny, users access
to the information they need.
Kinds of Anomaly
1. Update anomaly
2. Insertion anomaly
3. Deletion anomaly
Levels of Normalized Tables
1. First normal form (1NF) lowest level
2. Second normal form (2NF) low level
3. Third normal form (3NF) highest level and free of anomalies
Update anomaly - The unintentional updating of data in a table, resulting from data
redundancy. It can generate conflicting and obsolete database values.
Insertion anomaly - The unintentional insertion of data into a table. It can result in
unrecorded transactions and incomplete audit trails.
Deletion anomaly - The unintentional deletion of data from a table. It can cause the loss of
accounting records and the destruction of audit trails.
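The three anomalies can be seen in a small sketch. The rows, column names, and values below are hypothetical illustrations, not data from the source, and the insertion case follows the common textbook illustration (a fact that cannot be stored without an unrelated key value).

```python
# Hypothetical unnormalized table: customer data repeated on every invoice row.
rows = [
    {"invoice_num": 1, "cust_num": 10, "cust_address": "12 Elm St", "prod_num": "r234"},
    {"invoice_num": 2, "cust_num": 10, "cust_address": "12 Elm St", "prod_num": "x100"},
]

# Update anomaly: the address is changed on one row but not the other,
# leaving conflicting values for the same customer.
rows[0]["cust_address"] = "99 Oak Ave"
addresses = {r["cust_address"] for r in rows if r["cust_num"] == 10}

# Deletion anomaly: deleting the customer's invoices also destroys the
# only stored copy of the customer's data.
remaining = [r for r in rows if r["cust_num"] != 10]
known_customers = {r["cust_num"] for r in remaining}

# Insertion anomaly: a new customer with no invoice yet can only be
# stored with a missing invoice key.
new_customer_row = {"invoice_num": None, "cust_num": 11,
                    "cust_address": "5 Pine Rd", "prod_num": None}
```

After the update, `addresses` holds two different addresses for customer 10, and after the deletion `known_customers` is empty.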
Normalizing Tables
Dependencies - structural problems within tables that cause database anomalies
Types of Dependencies
1. Repeating groups
2. Partial dependencies
3. Transitive dependencies
Repeating Groups - the existence of multiple values for a particular attribute in a specific
record
Partial Dependencies - occur when one or more nonkey attributes are dependent on
(defined by) only part of the primary key, rather than the whole key
Transitive Dependencies - occur in a table where nonkey attributes are dependent on
another nonkey attribute and independent of the table's primary key
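A dependency of this kind can be probed in sample data. The sketch below is a hypothetical helper, `determines`, that checks whether one set of attributes defines another attribute in a given set of rows. Only product r234 (bolt cutter) comes from the source; the other values are made up. Sample data can refute a dependency but cannot prove that it holds in general.

```python
def determines(rows, lhs, rhs):
    """Return True if, within rows, the attributes in lhs define the attribute rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # A second, different value for the same key breaks the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

line_items = [
    {"invoice_num": 1, "prod_num": "r234", "description": "bolt cutter", "qunty": 2},
    {"invoice_num": 2, "prod_num": "r234", "description": "bolt cutter", "qunty": 5},
    {"invoice_num": 2, "prod_num": "x100", "description": "hand saw", "qunty": 1},
]

# Partial dependency: description is defined by prod_num alone.
partial = determines(line_items, ["prod_num"], "description")

# qunty is not defined by prod_num alone; it needs the whole composite key.
needs_whole_key = not determines(line_items, ["prod_num"], "qunty")
```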

What is NORMALIZATION?
The normalization process involves identifying and removing structural dependencies from
the table(s) under review.
For a table to be free from anomalies, these conditions should be met:
1. All nonkey attributes will be wholly and uniquely dependent on (defined by) the
primary key.
2. None of the nonkey attributes will be dependent on (defined by) other nonkey
attributes.
Steps in Table Normalization
1. Design the User View
2. Unnormalized Table (Represent View with a Single Table)
3. Table in First Normal Form 1NF (Repeating Groups removed)
4. Table in Second Normal Form 2NF (Partial Dependencies removed)
5. Table in Third Normal Form 3NF (Transitive Dependencies removed)

1. Design the User View


View - merely a pictorial representation of a set of data the user will eventually have
when the project is completed. A user view may take the form of:
1. output report
2. source document
3. input screen
Images of user views may be prepared using
1. word processor
2. graphics package
3. pencil and paper
2. Unnormalized Table (Represent View with a Single Table)
Represent the view as a single table that contains all of the view attributes with a logical
primary key.
Because the table contains customer invoices, invoice number (INVOICE NUM) will serve as
a logical primary key.
Notice the attributes Ex Price and Total Due have been grayed out in Figure 8.39. The
values for these attributes may be either stored or calculated.

Because Ex Price is the product of two other attributes (Quantity × Unit Price) and
Total Due is the sum of all Ex Price values,
they both can be calculated from existing stored attributes rather than storing them
directly in the database table.

To simplify this example, therefore, we will assume that the system will calculate these
attributes, and they will be excluded from further analysis.
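As a minimal sketch of why these two attributes need not be stored, both can be derived from Quantity and Unit Price on demand. The function names and the sample values are assumptions made for illustration only.

```python
# Hypothetical line items for one invoice (values are illustrative only).
line_items = [
    {"prod_num": "r234", "quantity": 2, "unit_price": 42.50},
    {"prod_num": "x100", "quantity": 1, "unit_price": 15.00},
]

def ex_price(item):
    """Extended price: quantity multiplied by unit price."""
    return item["quantity"] * item["unit_price"]

def total_due(items):
    """Total due: the sum of all extended prices on the invoice."""
    return sum(ex_price(item) for item in items)
```

Calling `total_due(line_items)` recomputes the invoice total each time, so no stored total can drift out of step with its line items.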
The next steps in the normalization process involve
identifying and, if necessary,
eliminating structural dependencies that exist, and
splitting the original single-table structure into two or more smaller and independent
3NF tables.

3. Table in First Normal Form 1NF (Repeating Groups removed)


a. Determine if the table under review contains repeating groups
b. The repeating group data need to be removed from the table and placed in a
separate table
Repeating group data is the existence of multiple values for a particular attribute in a
specific record.
For example, the sales invoice in Figure 8.38 contains multiple values for the attributes
PROD NUM,
DESCRIPTION,
QUANTITY,
UNIT PRICE (we ignore EX PRICE).
These repeating groups represent the transaction details of the invoice.
We see repeating group data in many business user views, such as
purchase orders,
receiving reports,
bills of lading, and so on.
Relational database theory prohibits the construction of a table in which a single record
(a row in the table) represents multiple values for an attribute (a column in the table).
a) Representing repeating group values in a single table requires multiple rows, as
illustrated in Figure 8.39.
b) The invoice attributes will also be represented multiple times.
c) Order Date, Shipped Date, Customer Name, Customer Address, and so on, are
recorded along with each unique occurrence of Prod Num, Description, Quantity, and
Unit Price.
d) To avoid such data redundancy, the repeating group data need to be removed from
the table and placed in a separate table.
e) Figure 8.40 shows the resulting tables. One is called Sales Invoice Table, with INVOICE
NUM as the primary key.
f) The second table contains the transaction details for the invoice and is called Line
Item Table.

a) The primary key of the Line Item Table is a composite key comprising two
attributes: Invoice Num and Prod Num.
b) Keep in mind that this table will contain the transaction details for our example
invoice as well as the transaction details for the invoices for all customers.
c) Relational database theory requires that a table's primary key uniquely identify each
record stored in the table.
d) PROD NUM alone cannot do this since a particular product, such as r234 (bolt cutter),
may well have been sold to many other customers whose transactions are also in the
table.
e) By combining PROD NUM with the INVOICE NUM, we can uniquely define each
transaction because the table will never contain two occurrences of the same invoice
number and product number together.
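The split into two tables and the role of the composite key can be sketched with Python's built-in `sqlite3` module. The table and column names are adapted from the attribute names above, the data values are hypothetical, and only a few of the invoice attributes are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_invoice (
    invoice_num INTEGER PRIMARY KEY,
    order_date  TEXT,
    cust_num    INTEGER,
    cust_name   TEXT
);
CREATE TABLE line_item (
    invoice_num INTEGER,
    prod_num    TEXT,
    description TEXT,
    quantity    INTEGER,
    unit_price  REAL,
    PRIMARY KEY (invoice_num, prod_num)  -- composite key
);
""")

conn.execute("INSERT INTO line_item VALUES (1, 'r234', 'bolt cutter', 2, 42.5)")
# The same product on a different invoice is allowed.
conn.execute("INSERT INTO line_item VALUES (2, 'r234', 'bolt cutter', 5, 42.5)")

def insert_duplicate():
    """Try to repeat an existing (invoice_num, prod_num) pair."""
    try:
        conn.execute("INSERT INTO line_item VALUES (1, 'r234', 'bolt cutter', 9, 42.5)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
```

Here `insert_duplicate()` returns "rejected", because the composite key already contains invoice 1 with product r234.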
4. Table in Second Normal Form 2NF (Partial Dependencies removed)
a. Check to see if the resulting tables contain partial dependencies. This can occur
only in tables that have composite (two or more attribute) primary keys.

b. The partial dependencies need to be removed from the table and placed in a
separate table.
a) A partial dependency occurs when one or more nonkey attributes are dependent
on (defined by) only part of the primary key, rather than the whole key.
b) This can occur only in tables that have composite (two or more attribute) primary
keys.
c) Because the Sales Invoice Table has a single attribute primary key, we can ignore it in
this step of the analysis. This table is already in 2NF.
d) The Line Item Table needs to be examined further.
e) Figure 8.41 illustrates the partial dependencies in it.
f) In the Line Item Table, INVOICE NUM and PROD NUM together define the quantity sold
attribute (Qunty).
g) If we assume, however, that the price charged for r234 is the same for all customers,
then the Unit Price attribute is common to all transactions involving product r234.
h) Similarly, the attribute Description is common to all such transactions.
i) These two attributes are not dependent on the Invoice Num component of the
composite key.
j) Instead, they are defined by Prod Num and are therefore only partially, rather than
wholly, dependent on the primary key.
k) We resolve this by splitting the table into two, as illustrated in Figure 8.41.
l) The resulting Line Item Table is now left with the single nonkey attribute Qunty.
m) Product description and unit price data are placed in a new table called Inventory.
n) Notice that the Inventory table contains additional attributes that do not pertain to
this user view.
o) A typical inventory table may contain attributes such as reorder point, quantity on
hand, supplier code, warehouse location, and more.
p) This demonstrates how a single table may be used to support many different user
views and reminds us that this normalization example pertains to only a small portion
of the entire database. We will return to this issue later.
q) At this point, both of the tables in Figure 8.41 are in 3NF.
r) The Line Item Table's primary key (INVOICE NUM, PROD NUM) wholly defines the
attribute QUNTY.
s) Similarly, in the Inventory Table, the attributes Description and Unit Price are wholly
defined by the primary key PROD NUM.
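The 2NF split described above can be sketched in plain Python. The rows are hypothetical apart from product r234 (bolt cutter), and the dictionary layout is only one possible way to model the two resulting tables.

```python
# Hypothetical 1NF line items: description and unit_price depend on prod_num only.
line_items_1nf = [
    {"invoice_num": 1, "prod_num": "r234", "description": "bolt cutter",
     "unit_price": 42.5, "qunty": 2},
    {"invoice_num": 2, "prod_num": "r234", "description": "bolt cutter",
     "unit_price": 42.5, "qunty": 5},
]

# Inventory table: one row per product, keyed by prod_num.
inventory = {
    r["prod_num"]: {"description": r["description"], "unit_price": r["unit_price"]}
    for r in line_items_1nf
}

# Line Item table: composite key (invoice_num, prod_num), with qunty as the
# only remaining nonkey attribute.
line_item = {
    (r["invoice_num"], r["prod_num"]): {"qunty": r["qunty"]}
    for r in line_items_1nf
}
```

The description and unit price for r234 are now stored once in `inventory`, however many invoices include that product.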
5. Table in Third Normal Form 3NF (Transitive Dependencies removed)
a. Review the tables further to determine whether they contain transitive dependencies
b. Resolve any transitive dependency by splitting the table
a) A transitive dependency occurs in a table where nonkey attributes are dependent
on another nonkey attribute and independent of the table's primary key.
b) The Sales Invoice Table in Figure 8.42 contains a transitive dependency. The primary
key INVOICE NUM uniquely and wholly defines the economic event that the attributes
Order Date,
Shpd Date, and
Shpd Via represent.
The key does not, however, uniquely define the customer attributes.
c) The attributes Cust Name, Street Address, and so on, define an entity (Customer)
that is independent of the specific transaction captured by a particular invoice record.
d) Assume that during the period the firm had sold to a particular customer on ten
different occasions.
e) This would result in ten different invoice records stored in the table.

f) Using the current table structure, each of these invoice records would capture the
data uniquely related to the respective transaction along with customer data that are
common to all ten transactions.
g) Therefore, the primary key does not uniquely define customer attributes in the table.
Indeed, they are independent of it.
h) We resolve this transitive dependency by splitting out the customer data and placing
them in a new table called Customer.
i) The logical key for this table is CUST NUM, which was the nonkey attribute in the
former table on which the other nonkey customer attributes were dependent.
j) With this dependency resolved, both the revised Sales Invoice Table and the new
Customer Table are in 3NF.
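Splitting out the customer data can be sketched the same way. The customer name, address, and dates below are hypothetical, and only two customer attributes are shown.

```python
# Hypothetical Sales Invoice rows: customer attributes repeat on every invoice.
sales_invoice_2nf = [
    {"invoice_num": 1, "order_date": "2024-01-05", "cust_num": 10,
     "cust_name": "Acme", "street_address": "12 Elm St"},
    {"invoice_num": 2, "order_date": "2024-02-11", "cust_num": 10,
     "cust_name": "Acme", "street_address": "12 Elm St"},
]

customer_attrs = ("cust_name", "street_address")

# Customer table: keyed by cust_num, the nonkey attribute on which the
# other customer attributes depended.
customer = {
    r["cust_num"]: {a: r[a] for a in customer_attrs}
    for r in sales_invoice_2nf
}

# Revised Sales Invoice table: the customer attributes are dropped, while
# cust_num stays so each invoice can still be related to its customer.
sales_invoice = [
    {k: v for k, v in r.items() if k not in customer_attrs}
    for r in sales_invoice_2nf
]
```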
6. Linking the Normalized Tables
Multiple 3NF tables need to be linked together so the data in them can be related and
made accessible to users.
a. These tables are linked via foreign keys
b. First determine the cardinality (degree of association) between the tables
c. Then assign foreign keys
Cardinality
a. 1:1 (one-to-one) association
b. 1:M (one-to-many)/(many-to-one) association
Business Rule 1. Each vendor supplies the firm with three (or fewer) different items of
inventory, but each item is supplied by only one vendor.
Business Rule 2. Each vendor supplies the firm with any number of inventory items,
but each item is supplied by only one vendor.
c. M:M (many-to-many) association
Business Rule 3. Each vendor supplies the firm with any number of inventory items,
and each item may be supplied by any number of vendors.
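As a sketch of how cardinality drives key assignment, the schema below uses Python's built-in `sqlite3` module. It applies the standard relational convention, which these notes do not spell out: in a 1:M association the key of the "one" side is embedded as a foreign key on the "many" side, and an M:M association gets a separate link table. All table names, column names, and data values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE vendor (vendor_num INTEGER PRIMARY KEY, name TEXT);

-- 1:M (Business Rule 2): each item has exactly one vendor, so the vendor
-- key is stored as a foreign key on the item row.
CREATE TABLE inventory (
    prod_num   TEXT PRIMARY KEY,
    vendor_num INTEGER NOT NULL REFERENCES vendor(vendor_num)
);

-- M:M (Business Rule 3): a link table holds one row per item/vendor pair.
CREATE TABLE item (prod_num TEXT PRIMARY KEY);
CREATE TABLE item_vendor (
    prod_num   TEXT REFERENCES item(prod_num),
    vendor_num INTEGER REFERENCES vendor(vendor_num),
    PRIMARY KEY (prod_num, vendor_num)
);
""")

conn.execute("INSERT INTO vendor VALUES (1, 'Hypothetical Tools Co')")
conn.execute("INSERT INTO inventory VALUES ('r234', 1)")

def insert_orphan():
    """Try to add an item that points at a vendor that does not exist."""
    try:
        conn.execute("INSERT INTO inventory VALUES ('x100', 99)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
```

The limit of three items per vendor in Business Rule 1 is not expressed by the foreign key itself; it would need an additional check, which this sketch omits.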
Auditors and Data Normalization
1. Database normalization is usually the responsibility of systems professionals.
2. However, the subject has implications for internal control that make it the concern of
auditors.
3. Although most auditors will not be responsible for normalizing databases, they should
have an understanding of the process and be able to determine whether a table is
properly normalized.
4. Furthermore, the auditor needs to know how the data are structured before he or she
can extract data from tables to perform audit procedures.
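To illustrate why knowledge of the structure matters for extraction, the sketch below rebuilds an invoice view from four normalized tables with a join. The schema is a trimmed, hypothetical version of the tables discussed above, the data are made up, and Ex Price is calculated rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_num INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE sales_invoice (invoice_num INTEGER PRIMARY KEY, cust_num INTEGER);
CREATE TABLE inventory (prod_num TEXT PRIMARY KEY, description TEXT, unit_price REAL);
CREATE TABLE line_item (invoice_num INTEGER, prod_num TEXT, qunty INTEGER,
                        PRIMARY KEY (invoice_num, prod_num));
INSERT INTO customer VALUES (10, 'Acme');
INSERT INTO sales_invoice VALUES (1, 10);
INSERT INTO inventory VALUES ('r234', 'bolt cutter', 42.5);
INSERT INTO line_item VALUES (1, 'r234', 2);
""")

# Follow the foreign keys from invoice to customer, line items, and inventory.
query = """
SELECT s.invoice_num, c.cust_name, i.description, l.qunty,
       l.qunty * i.unit_price AS ex_price
FROM sales_invoice AS s
JOIN customer  AS c ON c.cust_num = s.cust_num
JOIN line_item AS l ON l.invoice_num = s.invoice_num
JOIN inventory AS i ON i.prod_num = l.prod_num
ORDER BY s.invoice_num, l.prod_num
"""
invoice_view = conn.execute(query).fetchall()
```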
