DBI202
Database System
(COURSE CODE: DBI202)
Course Objectives:
Contents
Assessment
1) On-going Assessment
- At least 2 progress tests: 10%
- Labs (5): 10%
- 1 assignment: 20%
- 1 practical exam: 30%
2) Final exam (60'): 30%
3) Final Result: 100%
Completion Criteria:
1) Every on-going assessment component >0
2) Final Exam Score >=4 & Final Result >=5
Materials
Contents
Database
- A collection of information that exists over a long period of
time.
- A collection of related data.
- Managed by a DBMS
Database Management System (DBMS)
- A software package/system to facilitate the creation and
maintenance of a computerized database
Database System
- The DBMS software together with the data itself.
Sometimes, the applications are also included
The DBMS is expected to
1) Allow users to create new databases and specify their schemas
2) Give users the ability to query the data
3) Support the storage of very large amounts of data
4) Enable durability
5) Control access to data from many users at once
Early DBMS
- 1960s, the first DBMS based on file system
- Responsibility: Yes/no
(1) Limited
(2) Not directly supported
(3) Yes
(4) Not always supported
(5) No
Information Integration
- Join the information contained in many related databases
into a whole
- Example: a large company has many divisions, and each
division has built its own database of products and
employees on different DBMSs with different structures
- How can we join these databases without problems?
- Need to build structures on top of existing databases, with
the goal of integrating the information distributed among
them
Database Users
- Database Administrators: authorize access to the database,
coordinate and monitor its use, acquire software and
hardware resources, …
- Database Designers: define the content, the structure, the
constraints, and the functions or transactions against the
database
- Database End users: use data for queries and reports;
some of them also update the database content
Contents
- Foreign key: an attribute, or a set of attributes, in one
table that is the primary key of another relation
● Set operations
R and S must be ‘type compatible’:
1. The same number of attributes
2. The domain of corresponding attributes must be
compatible
- Union
R ∪ S = { t | t ∈ R ∨ t ∈ S}
- Intersection
R ∩ S = { t | t ∈ R ∧ t ∈ S}
- Difference
R \ S = { t | t ∈ R ∧ t ∉ S}
- Intersection can be expressed in terms of set
difference
R∩S=R\(R\S)
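The three set operations can be sketched with Python sets of tuples. The relations R and S below are made-up, type-compatible sample data, not taken from the slides.

```python
# Relations modelled as sets of tuples (illustrative data only).
R = {(1, 2), (3, 4)}
S = {(1, 2), (5, 6)}

union = R | S            # R ∪ S
intersection = R & S     # R ∩ S
difference = R - S       # R \ S

# Intersection expressed through set difference: R ∩ S = R \ (R \ S)
via_difference = R - (R - S)
```

Because relations are sets, duplicates never appear in any of the results.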
[Figure: example relations R and S, with the results of R ∪ S, R ∩ S, and R \ S]
[Figure: relation Movies and the result of σ_{length>100}(Movies)]
● PROJECTION
- S := π_{A1,A2,…,An}(R)
- A1,A2,…,An are attributes of R
- S has the relation schema S(A1,A2,…,An)
[Figure: relation Movies and the results of π_{title,year,length}(Movies) and π_{genre}(Movies)]
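A minimal sketch of selection and projection over a Movies relation, assuming rows are stored as Python dictionaries. The three sample rows and the helper names select and project are hypothetical.

```python
# Hypothetical sample rows; only the attribute names come from the slides.
movies = [
    {"title": "Film A", "year": 1990, "length": 120, "genre": "drama"},
    {"title": "Film B", "year": 1995, "length": 95, "genre": "comedy"},
    {"title": "Film C", "year": 2001, "length": 130, "genre": "drama"},
]

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

long_movies = select(movies, lambda r: r["length"] > 100)   # selection with length > 100
genres = project(movies, ["genre"])                         # projection on genre
```

Note that the projection on genre returns two tuples, not three: duplicates are eliminated because the result is a set.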
● Cartesian product and Joins
Cartesian product R3 := R1 × R2
[Figure: example relations U and V]
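A sketch of the Cartesian product and of a natural join, assuming relations are lists of dictionaries. The contents of U and V are invented for illustration, and the prefixing of shared attribute names (U.B, V.B) is one common convention, assumed here.

```python
from itertools import product

# Hypothetical relations U(A, B) and V(B, C).
U = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
V = [{"B": 2, "C": 5}, {"B": 9, "C": 6}]

def cartesian(r1, r2, name1="U", name2="V"):
    """Pair every tuple of r1 with every tuple of r2.
    Attributes that occur in both are prefixed with the relation name."""
    result = []
    for t1, t2 in product(r1, r2):
        row = {}
        for k, v in t1.items():
            row[f"{name1}.{k}" if k in t2 else k] = v
        for k, v in t2.items():
            row[f"{name2}.{k}" if k in t1 else k] = v
        result.append(row)
    return result

def natural_join(r1, r2):
    """Keep only the pairs that agree on all shared attributes."""
    result = []
    for t1, t2 in product(r1, r2):
        shared = set(t1) & set(t2)
        if all(t1[a] == t2[a] for a in shared):
            result.append({**t1, **t2})
    return result
```

With this data the product has 2 × 2 = 4 tuples, while the natural join keeps only the one pair that agrees on B.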
Relation Expression
Exercise 2:
Product (ProductCode, Name, PurchasePrice, SellPrice, Type,
SupplierCode)
Supplier (SupplierCode, SupplierName, Address)
Employee (EmployeeID, FullName, Gender, BirthDate, Address)
Invoice (InvoiceID, SellDate, EmployeeID)
InvoiceLine (ProductCode, InvoiceID, Quantity)
- Super-key
1. A set of attributes that contains a key is called a super-key
2. Every super-key satisfies the first condition of a key: it
functionally determines all other attributes of the relation
3. If L is a super-key, then there is a key K such that K ⊆ L
4. A key is also a super key
Armstrong’s Axioms
- Fundamental Rules: Let X, Y, Z be sets of attributes
1. Reflexivity: if X is a subset of Y, then Y → X
2. Augmentation: if X → Y, then XZ → YZ for any Z
3. Transitivity: if X → Y and Y → Z, then X → Z
Example: R(A, B, C, D)
S = {A → B, B → C, C → D, D → A}
Compute {A}+ ? {B}+ ?
What are some of the keys of R?
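The closures in this example can be computed with the usual fixed-point procedure. The sketch below assumes each FD is written as a pair of strings (left side, right side); the function name closure is our own.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is already in the closure.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# R(A, B, C, D) with S = {A → B, B → C, C → D, D → A}
S = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
closure_A = closure("A", S)
closure_B = closure("B", S)
```

For this S, both closures come out as {A, B, C, D}, so each single attribute functionally determines all of R.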
Work only with FD’s that have singleton right sides
A minimal basis for FD’s S is a basis B that satisfies three
conditions:
- All the FD’s in B have singleton right sides
- If any FD is removed from B, the result is no longer a basis
- If for any FD in B we remove one or more attributes from
the left side, the result is no longer a basis
Example:
- R(A, B, C)
- S = {A → B, A → C, B → A, B → C, C → A, C → B, AB → C, BC → A,
AC → B, A → BC, B → AC, C → AB}
- R and its FD’s have several minimal bases
1. {A → B, B → A, B → C, C → B}, or
2. {A → B, B → C, C → A}
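The three conditions for a minimal basis can be checked mechanically. This is a sketch under the assumption that FDs are (lhs, rhs) string pairs; the helper names implies, equivalent and is_minimal are hypothetical.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, fd):
    """True if fd follows from fds (its right side is in the closure of its left side)."""
    lhs, rhs = fd
    return set(rhs) <= closure(lhs, fds)

def equivalent(f, g):
    """Two FD sets are equivalent if each implies every FD of the other."""
    return all(implies(f, fd) for fd in g) and all(implies(g, fd) for fd in f)

def is_minimal(basis):
    # Condition 1: singleton right sides.
    if any(len(rhs) != 1 for _, rhs in basis):
        return False
    # Condition 2: no FD can be removed.
    for i in range(len(basis)):
        if equivalent(basis[:i] + basis[i + 1:], basis):
            return False
    # Condition 3: no attribute can be dropped from a left side.
    # Testing single attributes is enough: if a smaller left side works,
    # so does every left side obtained by dropping just one of the extra attributes.
    for lhs, rhs in basis:
        for a in lhs:
            reduced = lhs.replace(a, "")
            if reduced and implies(basis, (reduced, rhs)):
                return False
    return True

S = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "B"),
     ("AB", "C"), ("BC", "A"), ("AC", "B"), ("A", "BC"), ("B", "AC"), ("C", "AB")]
basis1 = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")]
basis2 = [("A", "B"), ("B", "C"), ("C", "A")]
```

Both candidate bases are equivalent to S and pass is_minimal, while S itself fails the first condition.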
What happens to a set of FD’s S of R when we project R on some
attributes?
That is, suppose a relation R with set of FD’s S, and R1 = π_L(R).
What FD’s hold in R1?
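One way to answer this, sketched below, is to take the closure of every non-empty subset X of L under S and keep the FDs X → A whose attribute A lies in L. The example relation R(A, B, C) with S = {A → B, B → C} and L = {A, C} is an assumption made for illustration; the procedure is exponential in the size of L and the result is not reduced to a minimal basis.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of a set of attributes under FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def project_fds(fds, attrs):
    """Nontrivial FDs X → A that hold in the projection onto attrs."""
    attrs = sorted(attrs)
    result = []
    for size in range(1, len(attrs) + 1):
        for lhs in combinations(attrs, size):
            # Only attributes that survive the projection can appear on the right.
            for a in sorted(closure(lhs, fds) & set(attrs)):
                if a not in lhs:
                    result.append(("".join(lhs), a))
    return result

S = [("A", "B"), ("B", "C")]        # hypothetical FDs of R(A, B, C)
projected = project_fds(S, "AC")    # FDs that hold in R1 with L = {A, C}
```

Here A → C holds in R1 even though it is not listed in S: it follows by transitivity through B, which the projection removes.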
Anomalies introduction
- Careless selection of a relational database schema can lead
to redundancy and related anomalies
- So, in this session we shall tackle the problems of relational
database design
- Problems such as redundancy that occur when we try to
cram too much into a single relation are called “anomalies”
Decomposition
● The accepted way to eliminate anomalies is the
decomposition of relations
● Decomposition of a relation R involves splitting the
attributes of R to make the schemas of two new relations
- Definition: Given a relation R(A1,..,An), we say R is
decomposed into S(B1,..,Bm) and T(C1,..,Ck) if:
1. {A1,..,An} = {B1,..,Bm} ∪ {C1,..,Ck}
2. S = π_{B1,..,Bm}(R)
3. T = π_{C1,..,Ck}(R)
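The definition can be tried out on a small relation. The rows below are invented sample data, and the attribute split (length in S, star in T) is an assumption chosen to mirror a movie example; project and natural_join are hypothetical helpers.

```python
# Hypothetical relation R(title, year, length, star).
R = [
    {"title": "Film A", "year": 1990, "length": 120, "star": "Star X"},
    {"title": "Film A", "year": 1990, "length": 120, "star": "Star Y"},
    {"title": "Film B", "year": 1995, "length": 95, "star": "Star X"},
]

def project(rows, attrs):
    """Projection with duplicate elimination."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

def natural_join(r1, r2):
    """Join tuples that agree on all shared attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            shared = set(t1) & set(t2)
            if all(t1[a] == t2[a] for a in shared):
                joined.append({**t1, **t2})
    return joined

S = project(R, ["title", "year", "length"])   # the length is stored once per film
T = project(R, ["title", "year", "star"])     # one tuple per (film, star) pair
rejoined = natural_join(S, T)                 # gives back the tuples of R
```

In this sample the join of S and T returns exactly the tuples of R, so no information is lost by this particular split.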
Example:
[Figure: a movie relation and the two relations it is decomposed into]
Discuss:
- The redundancy is eliminated (the length of each film
appears only once)
- The risk of an update anomaly is gone (we only have to
change the length of Star Wars in one tuple)
- The risk of a deletion anomaly is gone (if we delete all the
stars for Gone With the Wind, the movie disappears from the
right-hand relation but can still be found in the left-hand
relation)
1NF
A relation R is in first normal form (1NF) if and only if all
underlying domains contain atomic values only
Take the following table:
2NF
A relation R is in second normal form (2NF) if and only if it is in
1NF and every non-key attribute is fully dependent on the
primary key
STEP 1
STEP 4 - cardinality
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
*NOTES:
1. Each student can only appear ONCE in the student table
2. Each subject can only appear ONCE in the subjects table
3. A subject can be listed MANY times in the results table (for
different students)
4. A student can be listed MANY times in the results table (for
different subjects)
A 2NF check:
STUDENT TABLE (key = StudentID)
3NF
A relation R is in third normal form (3NF) if and only if it is in
2NF and every non-key attribute is non-transitively dependent
on the primary key.
An attribute C is transitively dependent on attribute A if there
exists an attribute B such that: A → B and B → C
A 3NF check:
STUDENT TABLE (key = StudentID)
BCNF is the same, but the embedded table may involve key
attributes.
BCNF
A relation R is in BCNF if and only if: whenever there is a
nontrivial FD
A1A2..An → B1B2..Bm for R, it is the case that:
{A1,..,An} is a super-key for R
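The BCNF condition can be tested by computing the closure of each left side. The sketch assumes FDs are (lhs, rhs) string pairs and that checking the given FDs is sufficient for the relation itself; the function name bcnf_violations and the second example relation are hypothetical.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under FDs given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose left side is not a super-key."""
    all_attrs = set(attributes)
    violations = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) == all_attrs
        if nontrivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations

# R(A, B, C, D) with the cyclic FDs used earlier: every left side is a key.
cyclic = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
# Hypothetical R2(A, B, C) with A → B: the only key is AC, so A → B violates BCNF.
partial = [("A", "B")]
```

An empty result means the relation is in BCNF with respect to the given FDs.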
Summary 1
Decomposing a relation into BCNF is a solution for eliminating
anomalies
But a BCNF decomposition can lose dependencies, and a carelessly
chosen decomposition can also lose information
3NF is a relaxation of BCNF that keeps both the lossless-join and
dependency-preservation properties
Summary 2
Chapter 4: High-Level Database Model
Objectives
Contents
- 1-1
- 1-M/M-1
- M-M
- Degree Constraints
- Recursive relationship
- Unary, Binary, Ternary relationship
Key attribute
Multivalued attribute
Derived attribute
Composite attribute
Weak Entity Sets
Example
Representing Class Hierarchy
Two general approaches depending on disjointness and
completeness
- For a disjoint AND complete class hierarchy:
- DO NOT create a table for the superclass entity set
- Create a table for each subclass entity set that includes all
attributes of that subclass entity set and the attributes of the
superclass entity set
Combining Relations
UML Classes
Associations
Consider the associations between Movies, Stars, and Studios
in UML
Comparison with E/R Multiplicities
Self-Associations
An association can have both ends at the same class; such an
association is called a self-association
Example
Association Classes
Subclasses in UML
Consider Movies and its three subclasses.
Figure 4.40: Cartoons and murder mysteries as disjoint
subclasses of movies.
Aggregations and Compositions
UML-to-Relations Basics
Classes to Relations
- For each class, create a relation
1. name is the name of the class
2. attributes are the attributes of the class
Associations to Relations
- For each association, create a relation
1. name is the name of that association
2. attributes are the key attributes of the two connected
classes
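The two rules can be sketched with sqlite3 from the Python standard library. The attribute lists and the association name StarsIn are assumptions made for illustration; only the class names Movies and Stars come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One relation per class: same name, same attributes (attributes assumed).
CREATE TABLE Movies (
    title  TEXT,
    year   INTEGER,
    length INTEGER,
    PRIMARY KEY (title, year)
);
CREATE TABLE Stars (
    name    TEXT PRIMARY KEY,
    address TEXT
);
-- One relation per association: the key attributes of both connected classes.
CREATE TABLE StarsIn (
    title    TEXT,
    year     INTEGER,
    starName TEXT,
    PRIMARY KEY (title, year, starName),
    FOREIGN KEY (title, year) REFERENCES Movies (title, year),
    FOREIGN KEY (starName) REFERENCES Stars (name)
);
""")

# List the relations that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

The association relation holds nothing but keys, so each row records one (movie, star) link.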
Contents
- Integrity constraints
- Structured Query Language
- DDL
- DML
- DCL (self-study)
- Subquery
Review
Studied:
- ER diagram
- Relational model
- Convert ERD → Relational model
Now: we learn how to set up a relational database on a DBMS
COMPANY Database:
Example 5.1:
Find all employees named ‘Võ Việt Anh’
Example 5.2:
Find all employees whose name ends with ‘Anh’
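Both examples can be tried with sqlite3 from the Python standard library. The table name Employee, the column empName, and the two extra employee names are hypothetical, since the COMPANY schema is not shown here.

```python
import sqlite3

target = "Võ Việt Anh"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (empName TEXT)")   # hypothetical schema
conn.executemany(
    "INSERT INTO Employee VALUES (?)",
    [(target,), ("Nguyen Van Anh",), ("Tran Thi Hoa",)],
)

# Example 5.1: exact match on the full name.
exact = conn.execute(
    "SELECT empName FROM Employee WHERE empName = ?", (target,)
).fetchall()

# Example 5.2: % matches any sequence of characters, so '%Anh' means "ends with Anh".
suffix = conn.execute(
    "SELECT empName FROM Employee WHERE empName LIKE '%Anh'"
).fetchall()
```

The first query uses =, the second uses LIKE with the % wildcard in front of the fixed suffix.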
USING ESCAPE keyword
- SQL allows us to specify any one character we like as the
escape character for a single pattern
- Example
- WHERE s LIKE '%20!%%' ESCAPE '!'
- Or WHERE s LIKE '%20@%%' ESCAPE '@'
➡ Matches any string s that contains the substring 20%
- WHERE s LIKE 'x%%x%' ESCAPE 'x'
➡ Matches any string s that begins and ends with the
character %
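The escape patterns can be checked with sqlite3, which supports LIKE … ESCAPE. The helper like and the test strings are our own; other DBMSs may differ in details such as case sensitivity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def like(value, pattern, escape):
    """Evaluate: value LIKE pattern ESCAPE escape, and return a Python bool."""
    row = conn.execute(
        "SELECT ? LIKE ? ESCAPE ?", (value, pattern, escape)
    ).fetchone()
    return bool(row[0])

# '!%' is a literal percent sign, so the pattern means "contains 20%".
contains_20_percent = like("sale 20% off", "%20!%%", "!")
no_percent = like("2020", "%20!%%", "!")

# 'x%' is a literal percent sign: the string must begin and end with %.
begins_and_ends = like("%abc%", "x%%x%", "x")
ends_only = like("abc%", "x%%x%", "x")
```

Only the character right after the escape character is taken literally; every other % is still a wildcard.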
Null Values
Null value: special value in SQL
Some interpretations
- Value unknown: there is a value, but I do not know what it is
- Value inapplicable: there is no value that makes sense here
- Value withheld: we are not entitled to know the value that
belongs here
Null is not a constant
Two rules for operating upon a NULL value in WHERE clause
- Arithmetic operators on NULL values will return a NULL
value
- Comparisons with NULL values will return UNKNOWN
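Both rules can be observed with sqlite3, where SQL NULL and UNKNOWN are returned to Python as None. The table T and its contents are hypothetical; IS NULL is the standard SQL test for NULL and is added here for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Arithmetic on NULL yields NULL.
arithmetic = conn.execute("SELECT NULL + 1").fetchone()[0]

# A comparison with NULL yields UNKNOWN, even NULL = NULL.
comparison = conn.execute("SELECT NULL = NULL").fetchone()[0]

conn.execute("CREATE TABLE T (x INTEGER)")   # hypothetical table
conn.executemany("INSERT INTO T VALUES (?)", [(1,), (None,)])

# WHERE keeps only rows whose condition is TRUE, so UNKNOWN rows are dropped.
eq_null = conn.execute("SELECT COUNT(*) FROM T WHERE x = NULL").fetchone()[0]
is_null = conn.execute("SELECT COUNT(*) FROM T WHERE x IS NULL").fetchone()[0]
```

This is why x = NULL never selects a row, while x IS NULL does.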
- Add/remove constraints
ALTER TABLE tablename
ADD CONSTRAINT constraintName PRIMARY KEY (<attribute list>);
ALTER TABLE tablename
ADD CONSTRAINT constraintName FOREIGN KEY (<attribute list>)
REFERENCES parentTableName (<attribute list>);
ALTER TABLE tablename
ADD CONSTRAINT constraintName CHECK (expressionChecking);
ALTER TABLE tablename
DROP CONSTRAINT constraintName;