Report Anomalies and Normalization Summary
What is NORMALIZATION?
The normalization process involves identifying and removing structural dependencies from
the table(s) under review.
For a table to be free from anomalies, these conditions should be met:
1. All nonkey attributes will be wholly and uniquely dependent on (defined by) the
primary key.
2. None of the nonkey attributes will be dependent on (defined by) other nonkey
attributes.
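The two conditions above can be probed against sample data. The following is a minimal sketch, assuming hypothetical rows and a hypothetical helper named determines (neither comes from the source). Sample data can only refute a dependency, never prove one, so a True result is a prompt for further review rather than a conclusion.

```python
def determines(rows, determinant, dependent):
    """Return True if, within these rows, every value of the determinant
    attributes maps to exactly one value of the dependent attribute."""
    seen = {}
    for row in rows:
        key = tuple(row[attr] for attr in determinant)
        value = row[dependent]
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows; the primary key is (invoice_num, prod_num).
rows = [
    {"invoice_num": 16459, "prod_num": "r234", "qunty": 2, "unit_price": 42.5},
    {"invoice_num": 16460, "prod_num": "r234", "qunty": 5, "unit_price": 42.5},
    {"invoice_num": 16460, "prod_num": "m456", "qunty": 1, "unit_price": 16.0},
]

# The whole key defines qunty, which satisfies condition 1.
whole_key_defines_qunty = determines(rows, ("invoice_num", "prod_num"), "qunty")
# Part of the key alone appears to define unit_price, which violates condition 1.
part_key_defines_price = determines(rows, ("prod_num",), "unit_price")
# Part of the key alone does not define qunty.
part_key_defines_qunty = determines(rows, ("prod_num",), "qunty")
```

The same helper can be pointed at a nonkey attribute as the determinant to look for violations of condition 2.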
Steps in Table Normalization
1. Design the User View
2. Unnormalized Table (Represent View with a Single Table)
3. Table in First Normal Form 1NF (Repeating Groups removed)
4. Table in Second Normal Form 2NF (Partial Dependencies removed)
5. Table in Third Normal Form 3NF (Transitive Dependencies removed)
Because Ex Price is the product of two other attributes (Quantity × Unit Price) and
Total Due is the sum of all Ex Price values,
both can be calculated from existing stored attributes rather than stored
directly in the database table.
To simplify this example, therefore, we will assume that the system calculates these
attributes, and they will be excluded from further analysis.
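As a minimal sketch of how such derived attributes might be computed on demand, the following assumes hypothetical line items and prices (only product r234 appears in the source) and hypothetical function names.

```python
from decimal import Decimal

# Hypothetical stored data for one invoice: (prod_num, quantity, unit_price).
line_items = [
    ("r234", 2, Decimal("20.00")),
    ("m456", 1, Decimal("15.50")),
]

def extended_price(quantity, unit_price):
    """Ex Price = Quantity x Unit Price, derived instead of stored."""
    return quantity * unit_price

def total_due(items):
    """Total Due = sum of the Ex Price values of all line items."""
    return sum((extended_price(q, p) for _, q, p in items), Decimal("0"))

ex_prices = [extended_price(q, p) for _, q, p in line_items]
invoice_total = total_due(line_items)
```

Decimal is used here so that currency arithmetic stays exact; that is a design choice of the sketch, not something the source prescribes.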
The next steps in the normalization process involve
identifying and, if necessary,
eliminating the structural dependencies that exist.
Eliminating them means splitting the original single-table structure into two or more
smaller and independent 3NF tables.
a) The primary key of the Line Item Table is a composite key comprising two
attributes: Invoice Num and Prod Num.
b) Keep in mind that this table will contain the transaction details for our example
invoice as well as the transaction details for the invoices for all customers.
c) Relational database theory requires that a table's primary key uniquely identify each
record stored in the table.
d) PROD NUM alone cannot do this since a particular product, such as r234 (bolt cutter),
may well have been sold to many other customers whose transactions are also in the
table.
e) By combining PROD NUM with the INVOICE NUM, we can uniquely define each
transaction because the table will never contain two occurrences of the same invoice
number and product number together.
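The composite key described above can be sketched with Python's built-in sqlite3 module. The table and column names, the invoice numbers, and the quantities are assumptions made for illustration, and the other nonkey attributes are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE line_item (
        invoice_num INTEGER NOT NULL,
        prod_num    TEXT    NOT NULL,
        qunty       INTEGER NOT NULL,
        PRIMARY KEY (invoice_num, prod_num)  -- composite primary key
    )
""")

# The same product may appear on different invoices.
conn.execute("INSERT INTO line_item VALUES (16459, 'r234', 2)")
conn.execute("INSERT INTO line_item VALUES (16460, 'r234', 5)")

# The same invoice number and product number together may not repeat.
try:
    conn.execute("INSERT INTO line_item VALUES (16459, 'r234', 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM line_item").fetchone()[0]
```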
4. Table in Second Normal Form 2NF (Partial Dependencies removed)
a. Check to see if the resulting tables contain partial dependencies. This can occur
only in tables that have composite (two or more attribute) primary keys.
b. The partial dependencies need to be removed from the table and placed in a
separate table.
a) A partial dependency occurs when one or more nonkey attributes are dependent
on (defined by) only part of the primary key, rather than the whole key.
b) This can occur only in tables that have composite (two or more attribute) primary
keys.
c) Because the Sales Invoice Table has a single attribute primary key, we can ignore it in
this step of the analysis. This table is already in 2NF.
d) The Line Item Table needs to be examined further.
e) Figure 8.41 illustrates the partial dependencies in it.
f) In the Line Item Table, INVOICE NUM and PROD NUM together define the quantity sold
attribute (Qunty).
g) If we assume, however, that the price charged for r234 is the same for all customers,
then the Unit Price attribute is common to all transactions involving product r234.
h) Similarly, the attribute Description is common to all such transactions.
i) These two attributes are not dependent on the Invoice Num component of the
composite key.
j) Instead, they are defined by Prod Num and, therefore, only partially rather than
wholly dependent on the primary key.
k) We resolve this by splitting the table into two, as illustrated in Figure 8.41.
l) The resulting Line Item Table is now left with the single nonkey attribute Qunty.
m) Product description and unit price data are placed in a new table called Inventory.
n) Notice that the Inventory table contains additional attributes that do not pertain to
this user view.
o) A typical inventory table may contain attributes such as reorder point, quantity on
hand, supplier code, warehouse location, and more.
p) This demonstrates how a single table may be used to support many different user
views and reminds us that this normalization example pertains to only a small portion
of the entire database. We will return to this issue later.
q) At this point, both of the tables in Figure 8.41 are in 3NF.
r) The Line Item Table's primary key (INVOICE NUM, PROD NUM) wholly defines the
attribute QUNTY.
s) Similarly, in the Inventory Table, the attributes Description and Unit Price are wholly
defined by the primary key PROD NUM.
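The split described in this step can be sketched in plain Python. The rows, invoice numbers, prices, and the second product are hypothetical; only product r234 (bolt cutter) and the attribute names come from the text.

```python
# Hypothetical Line Item rows before the split; the key is (invoice_num, prod_num).
line_item_1nf = [
    {"invoice_num": 16459, "prod_num": "r234", "description": "Bolt cutter",
     "qunty": 2, "unit_price": 42.5},
    {"invoice_num": 16459, "prod_num": "m456", "description": "Gear puller",
     "qunty": 1, "unit_price": 16.0},
    {"invoice_num": 16460, "prod_num": "r234", "description": "Bolt cutter",
     "qunty": 5, "unit_price": 42.5},
]

# Line Item keeps the key plus the one attribute defined by the whole key.
line_item = [
    {attr: row[attr] for attr in ("invoice_num", "prod_num", "qunty")}
    for row in line_item_1nf
]

# Inventory holds the attributes defined by prod_num alone, once per product.
inventory = {}
for row in line_item_1nf:
    inventory.setdefault(
        row["prod_num"],
        {"description": row["description"], "unit_price": row["unit_price"]},
    )
```

After the split, the description and unit price of r234 are stored once instead of once per transaction, so a price change touches a single Inventory record.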
5. Table in Third Normal Form 3NF (Transitive Dependencies removed)
a. Review the tables further to see whether they contain transitive dependencies.
b. Resolve any transitive dependency by splitting the dependent attributes out into a
separate table.
a) A transitive dependency occurs in a table where nonkey attributes are dependent
on another nonkey attribute and independent of the table's primary key.
b) Consider the Sales Invoice Table in Figure 8.42. The primary key INVOICE NUM uniquely
and wholly defines the economic event that the attributes Order Date, Shpd Date, and
Shpd Via represent.
The key does not, however, uniquely define the customer attributes.
c) The attributes Cust Name, Street Address, and so on, define an entity (Customer)
that is independent of the specific transaction captured by a particular invoice record.
d) Assume that during the period the firm had sold to a particular customer on ten
different occasions.
e) This would result in ten different invoice records stored in the table.
f) Using the current table structure, each of these invoice records would capture the
data uniquely related to the respective transaction along with customer data that are
common to all ten transactions.
g) Therefore, the primary key does not uniquely define customer attributes in the table.
Indeed, they are independent of it.
h) We resolve this transitive dependency by splitting out the customer data and placing
them in a new table called Customer.
i) The logical key for this table is CUST NUM, which was the nonkey attribute in the
former table on which the other nonkey customer attributes were dependent.
j) With this dependency resolved, both the revised Sales Invoice Table and the new
Customer Table are in 3NF.
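A minimal sketch of this split follows. The rows, dates, and customer details are hypothetical. The sketch also assumes that CUST NUM stays in the revised Sales Invoice Table so that each invoice can still be related to its customer.

```python
# Hypothetical Sales Invoice rows before the split; the key is invoice_num.
sales_invoice_before = [
    {"invoice_num": 16459, "order_date": "2024-09-22", "shpd_date": "2024-09-27",
     "shpd_via": "UPS", "cust_num": "c100", "cust_name": "ABC Hardware",
     "street_address": "1 Main St"},
    {"invoice_num": 16460, "order_date": "2024-09-23", "shpd_date": "2024-09-28",
     "shpd_via": "Truck", "cust_num": "c100", "cust_name": "ABC Hardware",
     "street_address": "1 Main St"},
    {"invoice_num": 16461, "order_date": "2024-09-23", "shpd_date": "2024-09-29",
     "shpd_via": "UPS", "cust_num": "c200", "cust_name": "XYZ Supply",
     "street_address": "9 Oak Ave"},
]

invoice_attrs = ("invoice_num", "order_date", "shpd_date", "shpd_via", "cust_num")
customer_attrs = ("cust_name", "street_address")

# Revised Sales Invoice: only attributes defined by invoice_num, plus cust_num.
sales_invoice = [
    {attr: row[attr] for attr in invoice_attrs} for row in sales_invoice_before
]

# New Customer table keyed by cust_num; each customer is stored once.
customer = {}
for row in sales_invoice_before:
    customer.setdefault(row["cust_num"], {attr: row[attr] for attr in customer_attrs})
```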
6. Linking the Normalized Tables
Multiple 3NF tables need to be linked together so that the data in them can be related
and made accessible to users.
a. The tables are linked via foreign keys.
b. Determine the cardinality (degree of association) between the tables.
c. Assign foreign keys according to that cardinality.
Cardinality
a. 1:1 (one-to-one) association
b. 1:M (one-to-many or many-to-one) association
Business Rule 1. Each vendor supplies the firm with three (or fewer) different items of
inventory, but each item is supplied by only one vendor.
Business Rule 2. Each vendor supplies the firm with any number of inventory items,
but each item is supplied by only one vendor.
c. M:M (many-to-many) association
Business Rule 3. Each vendor supplies the firm with any number of inventory items,
and each item may be supplied by any number of vendors.
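The effect of cardinality on key assignment can be sketched with sqlite3. This sketch assumes the common convention that a 1:M association embeds the key of the "one" side in the "many" side, while an M:M association uses a separate link table. All table names, column names, and data values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE vendor (vendor_num INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- 1:M (Business Rule 2): each item carries the key of its single vendor.
    CREATE TABLE inventory_1m (
        prod_num    TEXT PRIMARY KEY,
        description TEXT,
        vendor_num  INTEGER NOT NULL REFERENCES vendor(vendor_num)
    );

    -- M:M (Business Rule 3): a link table pairs vendors with items.
    CREATE TABLE inventory_mm (prod_num TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE vendor_item_link (
        vendor_num INTEGER NOT NULL REFERENCES vendor(vendor_num),
        prod_num   TEXT    NOT NULL REFERENCES inventory_mm(prod_num),
        PRIMARY KEY (vendor_num, prod_num)
    );
""")

conn.execute("INSERT INTO vendor VALUES (1, 'Vendor A')")
conn.execute("INSERT INTO vendor VALUES (2, 'Vendor B')")

# 1:M: an item that points at a nonexistent vendor is rejected.
conn.execute("INSERT INTO inventory_1m VALUES ('r234', 'Bolt cutter', 1)")
try:
    conn.execute("INSERT INTO inventory_1m VALUES ('m456', 'Gear puller', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# M:M: the same item can be linked to any number of vendors.
conn.execute("INSERT INTO inventory_mm VALUES ('r234', 'Bolt cutter')")
conn.executemany("INSERT INTO vendor_item_link VALUES (?, ?)", [(1, "r234"), (2, "r234")])
suppliers_of_r234 = [
    row[0]
    for row in conn.execute(
        "SELECT vendor_num FROM vendor_item_link "
        "WHERE prod_num = 'r234' ORDER BY vendor_num"
    )
]
```

The upper bound of three items in Business Rule 1 is not enforced by a foreign key alone; it would need an additional constraint or application check, which this sketch leaves out.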
Auditors and Data Normalization
1. Database normalization is usually the responsibility of systems professionals.
2. However, the subject has implications for internal control that make it the concern of
auditors.
3. Although most auditors will not be responsible for normalizing databases, they should
have an understanding of the process and be able to determine whether a table is
properly normalized.
4. Furthermore, the auditor needs to know how the data are structured before he or she
can extract data from tables to perform audit procedures.
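To illustrate why the table structure matters for data extraction, the following sketch rebuilds one invoice view from four normalized tables with a join. The schema follows the tables named in this summary, but the column names, data values, and query are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_num TEXT PRIMARY KEY, cust_name TEXT);
    CREATE TABLE sales_invoice (
        invoice_num INTEGER PRIMARY KEY,
        cust_num    TEXT REFERENCES customer(cust_num)
    );
    CREATE TABLE inventory (prod_num TEXT PRIMARY KEY, description TEXT, unit_price REAL);
    CREATE TABLE line_item (
        invoice_num INTEGER REFERENCES sales_invoice(invoice_num),
        prod_num    TEXT    REFERENCES inventory(prod_num),
        qunty       INTEGER,
        PRIMARY KEY (invoice_num, prod_num)
    );
    INSERT INTO customer VALUES ('c100', 'ABC Hardware');
    INSERT INTO sales_invoice VALUES (16459, 'c100');
    INSERT INTO inventory VALUES ('r234', 'Bolt cutter', 42.5);
    INSERT INTO inventory VALUES ('m456', 'Gear puller', 16.0);
    INSERT INTO line_item VALUES (16459, 'r234', 2);
    INSERT INTO line_item VALUES (16459, 'm456', 1);
""")

# Follow the foreign keys to reassemble the user view; Ex Price is derived, not stored.
rows = conn.execute("""
    SELECT s.invoice_num, c.cust_name, i.description, l.qunty,
           l.qunty * i.unit_price AS ex_price
    FROM sales_invoice AS s
    JOIN customer  AS c ON c.cust_num = s.cust_num
    JOIN line_item AS l ON l.invoice_num = s.invoice_num
    JOIN inventory AS i ON i.prod_num = l.prod_num
    WHERE s.invoice_num = 16459
    ORDER BY l.prod_num
""").fetchall()

total_due = sum(row[4] for row in rows)
```

Without knowing which keys link the tables, an auditor could not write this query or judge whether its result is complete.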