UNIT 3 PRE
Relational Model: Introduction, CODD Rules, relational data model, concept of key, relational
integrity, relational algebra, relational algebra operations, advantages of relational algebra.
The relational model was introduced in 1970 by E.F. Codd. The Relational Data Model is
implemented through a Relational Database Management System (RDBMS).
Examples of records are (1, divya, kkd, …..), (2, sailu, rjy, ….).
Field : A Field is defined as a character or a group of characters that has a specific meaning.
A field is also known as “Attribute”.
Not every database that has tables and constraints is a relational database system.
Codd's rules state the conditions a database must satisfy to be a perfect Relational Database
Management System (RDBMS).
The database must support set-level inserts, updates, and delete operations.
Changes to the physical level (how the data is stored) must not require a change in the
application.
Changes to the logical level (tables) must not require a change in the applications built
on that structure.
All the integrity constraints can be independently modified without the need for any
change in the application.
If the system provides a low-level interface, then that interface cannot be used to
subvert the system and bypass relational security or integrity constraints.
What is Relational Algebra? Explain the Relational
Algebra Operations (OR) Relational Set Operators.
A. Relational Algebra:
1. Select (σ)
2. Project (π)
3. Union (∪)
4. Intersect (∩)
5. Set Difference (-)
6. Cartesian Product (×)
7. Rename (ρ)
8. Joins (⋈)
Notation: σ p(R)
σ → Represents SELECTION.
R → Represents RELATION.
p → Represents the logical formula (the selection condition).
Example:
Suppose we want to retrieve rows from the STUDENT relation where "AGE" is 20.
The query will be:
σ AGE=20 (STUDENT)
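The selection above can be sketched in Python by treating a relation as a list of rows. This is only an illustration: the sample rows, the ROLL_NO column and the helper name select are assumptions, not part of the notes.

```python
# Hypothetical sample rows for the STUDENT relation (invented for illustration).
STUDENT = [
    {"ROLL_NO": 1, "NAME": "divya", "AGE": 20},
    {"ROLL_NO": 2, "NAME": "sailu", "AGE": 21},
    {"ROLL_NO": 3, "NAME": "ravi", "AGE": 20},
]


def select(relation, predicate):
    """sigma_p(R): keep only the rows of R for which the predicate p is true."""
    return [row for row in relation if predicate(row)]


# sigma AGE=20 (STUDENT)
age_20 = select(STUDENT, lambda row: row["AGE"] == 20)
print(age_20)
```

Selection never changes the columns; it only filters rows.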
Notation: ∏ a(R)
∏ → Represents PROJECTION.
R → Represents RELATION.
a → Represents the attribute list.
Example:
Suppose we want to retrieve the names of all students from the STUDENT relation.
The query will be:
∏ NAME(STUDENT)
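A minimal Python sketch of this projection follows. The sample rows and the helper name project are assumptions. Because a relation is a set, duplicate rows that appear after dropping columns are removed.

```python
# Hypothetical sample rows; two students share the same name on purpose.
STUDENT = [
    {"ROLL_NO": 1, "NAME": "divya", "AGE": 20},
    {"ROLL_NO": 2, "NAME": "sailu", "AGE": 21},
    {"ROLL_NO": 3, "NAME": "divya", "AGE": 22},
]


def project(relation, attributes):
    """pi_a(R): keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:  # relational algebra works on sets of tuples
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# pi NAME (STUDENT)
names = project(STUDENT, ["NAME"])
print(names)
```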
Notation: R ∪ S
Important Condition: Both relations must have the same set of attributes (they must be
union-compatible); otherwise the union operation is not defined.
Example:
Suppose we want to retrieve all the names from both STUDENT and EMPLOYEE
relations.
∏ NAME(STUDENT) ∪ ∏ NAME(EMPLOYEE)
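As a rough sketch, the union of the two projected name lists can be modelled with Python sets of one-column tuples. The names used here are invented sample data.

```python
# Hypothetical results of pi NAME (STUDENT) and pi NAME (EMPLOYEE).
STUDENT_NAMES = {("divya",), ("sailu",)}
EMPLOYEE_NAMES = {("sailu",), ("kiran",)}


def union(r, s):
    """R union S: every tuple that is in R, in S, or in both (no duplicates)."""
    return r | s


all_names = union(STUDENT_NAMES, EMPLOYEE_NAMES)
print(sorted(all_names))
```

The name that occurs in both relations appears only once in the result.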
Notation: R - S
Just like Union (∪), Set Difference also requires that both relations have the same
set of attributes.
Example:
Suppose we want to retrieve the names of students who are in the STUDENT
relation but not in the EMPLOYEE relation.
The query will be:
∏ NAME(STUDENT) - ∏ NAME(EMPLOYEE)
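The same idea can be sketched in Python with sets of one-column tuples. The sample names are assumptions made for the illustration.

```python
# Hypothetical results of pi NAME (STUDENT) and pi NAME (EMPLOYEE).
STUDENT_NAMES = {("divya",), ("sailu",), ("ravi",)}
EMPLOYEE_NAMES = {("sailu",), ("kiran",)}


def difference(r, s):
    """R - S: tuples that appear in R but not in S."""
    return {t for t in r if t not in s}


only_students = difference(STUDENT_NAMES, EMPLOYEE_NAMES)
only_employees = difference(EMPLOYEE_NAMES, STUDENT_NAMES)
print(sorted(only_students), sorted(only_employees))
```

Unlike union, set difference is not commutative: R - S and S - R generally give different results.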
Notation: R × S
Example:
Notation: ρ(R, S)
Example:
Suppose we are fetching the names of students from the STUDENT relation.
We would like to rename this relation as STUDENT_NAME.
The query will be:
ρ(STUDENT_NAME, ∏ NAME(STUDENT))
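A small Python sketch of this rename is given below. The Relation class, the ROLL_NO column and the sample rows are assumptions introduced only to show that rename changes the name of a relation and nothing else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Hypothetical container: a name, an attribute list and a set of tuples."""
    name: str
    attributes: tuple
    rows: frozenset


def project(relation, attributes):
    """pi_a(R): keep the listed attributes; the result keeps the operand's name."""
    positions = [relation.attributes.index(a) for a in attributes]
    rows = frozenset(tuple(row[i] for i in positions) for row in relation.rows)
    return Relation(relation.name, tuple(attributes), rows)


def rename(new_name, relation):
    """rho(new_name, R): same attributes and rows under a new relation name."""
    return Relation(new_name, relation.attributes, relation.rows)


STUDENT = Relation("STUDENT", ("ROLL_NO", "NAME"),
                   frozenset({(1, "divya"), (2, "sailu")}))

# rho(STUDENT_NAME, pi NAME (STUDENT))
STUDENT_NAME = rename("STUDENT_NAME", project(STUDENT, ["NAME"]))
print(STUDENT_NAME.name, sorted(STUDENT_NAME.rows))
```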
Join Operations
7. Inner Join
Inner Join returns only those tuples that satisfy a certain condition.
It is classified into three types:
1. Theta Join (θ)
2. Equi Join
3. Natural Join
R → First relation
S → Second relation
2. Equi Join
Equi Join is a special case of Theta Join, where the condition only uses equality (=).
If the condition uses any operator other than (=), it becomes a non-equijoin.
Notation: R ⋈ S
R → First relation
S → Second relation
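The difference between an equi join and a non-equijoin can be sketched in Python with one generic theta-join helper. The tables, the column names E_DEPT and D_NAME, and the helper name theta_join are assumptions for illustration.

```python
# Hypothetical sample data; column names are chosen so that they do not clash.
EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "divya", "E_DEPT": 10},
    {"E_ID": 2, "E_NAME": "sailu", "E_DEPT": 20},
]
DEPARTMENT = [
    {"D_ID": 10, "D_NAME": "CSE"},
    {"D_ID": 20, "D_NAME": "ECE"},
]


def theta_join(r, s, condition):
    """Pair every tuple of R with every tuple of S and keep the pairs
    that satisfy the join condition."""
    return [{**a, **b} for a in r for b in s if condition(a, b)]


# Equi join: the condition uses only equality (=).
equi = theta_join(EMPLOYEE, DEPARTMENT, lambda e, d: e["E_DEPT"] == d["D_ID"])

# Non-equijoin: the condition uses an operator other than (=).
non_equi = theta_join(EMPLOYEE, DEPARTMENT, lambda e, d: e["E_DEPT"] != d["D_ID"])

print([(row["E_NAME"], row["D_NAME"]) for row in equi])
print([(row["E_NAME"], row["D_NAME"]) for row in non_equi])
```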
Outer Join
Unlike Inner Join, which includes only matching tuples, Outer Join also includes tuples
that don’t satisfy the condition, with NULL in the attributes that have no match.
It is classified into three types:
1. Left Outer Join (⟕)
2. Right Outer Join (⟖)
3. Full Outer Join (⟗)
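As an illustration of the first type, a left outer join can be sketched in Python as follows. The sample tables, the column names and the helper name left_outer_join are assumptions. None stands in for NULL.

```python
# Hypothetical sample data; the employee "ravi" has no matching department.
EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "divya", "E_DEPT": 10},
    {"E_ID": 2, "E_NAME": "ravi", "E_DEPT": 30},
]
DEPARTMENT = [
    {"D_ID": 10, "D_NAME": "CSE"},
    {"D_ID": 20, "D_NAME": "ECE"},
]


def left_outer_join(r, s, condition, s_attributes):
    """Keep every tuple of R; where no tuple of S matches, fill the
    attributes of S with None (NULL)."""
    result = []
    for a in r:
        matches = [b for b in s if condition(a, b)]
        if matches:
            result.extend({**a, **b} for b in matches)
        else:
            result.append({**a, **{attr: None for attr in s_attributes}})
    return result


joined = left_outer_join(
    EMPLOYEE, DEPARTMENT,
    lambda e, d: e["E_DEPT"] == d["D_ID"],
    ["D_ID", "D_NAME"],
)
print(joined)
```

A right outer join is the same operation with the roles of the two relations swapped, and a full outer join keeps the unmatched tuples of both sides.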
Notation: R ∩ S
R → First relation
S → Second relation
Example:
Advantages:
Limitations:
Notation − {T | Condition}
For example −
Output − Returns tuples with 'name' from Author who has written article on 'database'.
TRC can be quantified. We can use Existential (∃) and Universal Quantifiers (∀).
For example −
Output − The above query will yield the same result as the previous one.
In DRC, the filtering variable uses the domain of attributes instead of entire tuple values (as
done in TRC, mentioned above).
Notation − { a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
Where a1, a2 are attributes and P stands for formulae built by inner attributes.
For example −
1. Entity Integrity
2. Referential Integrity
3. Domain Integrity
1. Entity Integrity:
It specifies that there should be no duplicate rows in a table (Primary Key and
Unique Constraint).
o Primary Key: A Primary Key uniquely identifies each record in a table. It must
have unique values and cannot contain NULLs.
o Unique Constraint: The UNIQUE constraint enforces a column or set of
columns to have unique values. If a column has a unique constraint, it means
that particular column cannot have duplicate values in a table.
2. Referential Integrity:
3. Domain Integrity:
It enforces valid entries for a given column (CHECK and NOT NULL Constraints).
o NOT NULL Constraint: Ensures that a column does not hold a NULL value.
o CHECK Constraint: Used to restrict the values allowed in a particular column
of a table, for example to a range.
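The entity and domain integrity constraints above can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the student table, its columns, the age range and the example.com addresses are all invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,                   -- entity integrity
        email   TEXT UNIQUE,                           -- entity integrity
        name    TEXT NOT NULL,                         -- domain integrity
        age     INTEGER CHECK (age BETWEEN 17 AND 60)  -- domain integrity
    )
""")
conn.execute("INSERT INTO student VALUES (1, 'divya@example.com', 'divya', 20)")


def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


duplicate_key = rejected("INSERT INTO student VALUES (1, 'a@example.com', 'a', 21)")
duplicate_email = rejected("INSERT INTO student VALUES (2, 'divya@example.com', 'b', 21)")
null_name = rejected("INSERT INTO student VALUES (3, 'c@example.com', NULL, 21)")
invalid_age = rejected("INSERT INTO student VALUES (4, 'd@example.com', 'd', 5)")
print(duplicate_key, duplicate_email, null_name, invalid_age)
```

Each of the four rejected inserts breaks exactly one constraint, so the table still holds only the first row.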
A database consists of tables, which contain records, which further contain fields
(attributes).
A key is a set of one or more columns whose values are unique in a given table.
A key is a relational means of specifying uniqueness.
Types of Keys:
1. Super Key
2. Candidate Key
3. Primary Key
4. Foreign Key
5. Alternative Key / Secondary Key
6. Composite Key
1. Super Key
A Super Key is a set of one or more attributes (columns) that can uniquely identify
each record (row) in a table.
It may contain extra attributes that are not necessary for uniqueness.
A table can have multiple super keys.
Example: Consider a table named EMPLOYEE with attributes E_ID (Employee ID), E_NAME
(Employee Name), and E_EMAIL (Employee Email).
{E_ID}, {E_ID, E_NAME}, and {E_ID, E_EMAIL} are all super keys because the E_ID
attribute can uniquely identify each record, even if extra attributes are added.
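A quick way to test a proposed super key against sample data is sketched below in Python. The rows and the helper name is_super_key are assumptions. Such a check can only show that the given rows do not violate the key; whether a set of attributes really is a key depends on the rules of the table, not on one sample.

```python
# Hypothetical EMPLOYEE rows; two employees share the same name on purpose.
EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "divya", "E_EMAIL": "divya@example.com"},
    {"E_ID": 2, "E_NAME": "sailu", "E_EMAIL": "sailu@example.com"},
    {"E_ID": 3, "E_NAME": "divya", "E_EMAIL": "divya2@example.com"},
]


def is_super_key(rows, attributes):
    """True if no two rows agree on all of the given attributes."""
    values = [tuple(row[a] for a in attributes) for row in rows]
    return len(values) == len(set(values))


print(is_super_key(EMPLOYEE, ["E_ID"]))            # E_ID alone
print(is_super_key(EMPLOYEE, ["E_ID", "E_NAME"]))  # E_ID plus an extra attribute
print(is_super_key(EMPLOYEE, ["E_NAME"]))          # E_NAME alone
```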
2. Candidate Key
A Candidate Key is a minimal super key. This means it is a set of attributes that
uniquely identifies a record in a table, and no attribute can be removed without
losing the uniqueness property.
There can be multiple candidate keys in a table, but each one is minimal.
Example: In the EMPLOYEE table, the following attributes are candidate keys:
Here, both E_ID and E_EMAIL are candidate keys, as each can uniquely identify a row in the
table, and neither has extra attributes.
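The idea of a minimal super key can be sketched in Python by trying attribute sets from smallest to largest and skipping any set that already contains a smaller key. The sample rows and helper names are assumptions, and the result holds only for this sample data.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows; E_NAME repeats, E_ID and E_EMAIL do not.
EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "divya", "E_EMAIL": "divya@example.com"},
    {"E_ID": 2, "E_NAME": "sailu", "E_EMAIL": "sailu@example.com"},
    {"E_ID": 3, "E_NAME": "divya", "E_EMAIL": "divya2@example.com"},
]


def is_super_key(rows, attributes):
    """True if no two rows agree on all of the given attributes."""
    values = [tuple(row[a] for a in attributes) for row in rows]
    return len(values) == len(set(values))


def candidate_keys(rows):
    """Return the minimal super keys of the sample rows."""
    attributes = list(rows[0].keys())
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, combo):
                keys.append(combo)
    return keys


keys = candidate_keys(EMPLOYEE)
print(keys)
```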
3. Primary Key
A Primary Key is a special candidate key that is chosen to uniquely identify records
in a table.
It cannot contain NULL values and must be unique across the table.
A table can have only one primary key, which can consist of one or more columns
(composite primary key).
Example: In the EMPLOYEE table, the E_ID attribute can be chosen as the primary key
because it uniquely identifies each employee.
4. Foreign Key
A Foreign Key is a column or a set of columns in a table that references the Primary
Key or Candidate Key in another table.
It is used to establish relationships between tables.
A foreign key can contain NULL values, and the values in the foreign key column must
match values in the referenced key or be NULL.
EMPLOYEE Table:
DEPARTMENT Table:
The D_ID in the EMPLOYEE table is a foreign key that references the D_ID in the
DEPARTMENT table.
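The foreign key behaviour described above can be tried with Python's built-in sqlite3 module. The table definitions and sample rows are assumptions, since the original tables are not reproduced here. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

conn.execute("CREATE TABLE department (d_id INTEGER PRIMARY KEY, d_name TEXT)")
conn.execute("""
    CREATE TABLE employee (
        e_id   INTEGER PRIMARY KEY,
        e_name TEXT,
        d_id   INTEGER REFERENCES department(d_id)  -- foreign key
    )
""")

conn.execute("INSERT INTO department VALUES (10, 'CSE')")
conn.execute("INSERT INTO employee VALUES (1, 'divya', 10)")    # matches a department
conn.execute("INSERT INTO employee VALUES (2, 'sailu', NULL)")  # NULL is allowed

try:
    # Department 99 does not exist, so this row would break referential integrity.
    conn.execute("INSERT INTO employee VALUES (3, 'ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(orphan_rejected, employee_count)
```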
5. Alternative Key / Secondary Key
An Alternative Key is a candidate key that was not selected as the primary key but
still has the ability to uniquely identify records.
A Secondary Key is another term used for Alternative Key, although in some cases it
may refer to a key used for searching or sorting rather than uniquely identifying
records.
Example: In the EMPLOYEE table, if E_EMAIL was not chosen as the primary key but can still
uniquely identify rows, it is considered an Alternative Key.
6. Composite Key
A Composite Key is a key that consists of two or more attributes (columns)
used together to uniquely identify records in a table.
Each individual column in a composite key may not be unique on its own, but when
combined, they provide uniqueness.
Example: Consider the ENROLLED table, which records which students are enrolled in which
courses.
STUDENT_ID    COURSE_ID
Student_1     DBMS
Student_2     DBMS
Student_1     OS
Student_3     OS
Here, {STUDENT_ID, COURSE_ID} forms a composite key because no single column (like
STUDENT_ID or COURSE_ID) is unique by itself, but together they uniquely identify each
record.
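This claim can be checked against the ENROLLED rows with a few lines of Python. The helper name is_unique is an assumption; the data is the table shown above.

```python
ENROLLED = [
    {"STUDENT_ID": "Student_1", "COURSE_ID": "DBMS"},
    {"STUDENT_ID": "Student_2", "COURSE_ID": "DBMS"},
    {"STUDENT_ID": "Student_1", "COURSE_ID": "OS"},
    {"STUDENT_ID": "Student_3", "COURSE_ID": "OS"},
]


def is_unique(rows, attributes):
    """True if no two rows share the same values for the given attributes."""
    values = [tuple(row[a] for a in attributes) for row in rows]
    return len(values) == len(set(values))


student_only = is_unique(ENROLLED, ["STUDENT_ID"])
course_only = is_unique(ENROLLED, ["COURSE_ID"])
both = is_unique(ENROLLED, ["STUDENT_ID", "COURSE_ID"])
print(student_only, course_only, both)
```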
Here, roll_no can determine the values of the fields name, dept_name and dept_building, hence it is a
valid functional dependency.
roll_no → dept_name
Since roll_no can determine the whole set {name, dept_name, dept_building}, it can
determine its subset dept_name also.
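Whether a functional dependency holds in a given set of rows can be checked mechanically. The Python sketch below uses invented sample rows and a hypothetical helper fd_holds: X → Y holds if rows that agree on X always agree on Y.

```python
# Hypothetical sample rows; only the column names come from the text above.
STUDENT = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 45, "name": "abc", "dept_name": "EC", "dept_building": "B2"},
]


def fd_holds(rows, lhs, rhs):
    """True if every value of lhs is paired with exactly one value of rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False  # same lhs value, different rhs value
    return True


print(fd_holds(STUDENT, ["roll_no"], ["dept_name"]))
print(fd_holds(STUDENT, ["dept_name"], ["roll_no"]))
print(fd_holds(STUDENT, ["name"], ["dept_name"]))
```

In this sample, roll_no → dept_name holds, while dept_name → roll_no and name → dept_name do not.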
Here:
Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of transitivity,
enrol_no → building_no is a valid functional dependency. This is an indirect functional
dependency, hence it is called a transitive functional dependency.
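The transitivity argument can be reproduced by computing an attribute closure, that is, the set of all attributes determined by a starting set. The sketch below is a minimal Python version; the helper name closure and the way the dependencies are written as pairs are assumptions.

```python
# The two dependencies from the text, written as (left side, right side) pairs.
FDS = [
    (("enrol_no",), ("dept",)),
    (("dept",), ("building_no",)),
]


def closure(attributes, fds):
    """Return every attribute that the given attributes determine under fds."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


enrol_closure = closure({"enrol_no"}, FDS)
print(sorted(enrol_closure))
```

Because building_no appears in the closure of enrol_no, the dependency enrol_no → building_no follows from the two given dependencies.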
What is Normalization?
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy from a relation or set of relations.
o It is also used to eliminate undesirable characteristics like Insertion, Update, and
Deletion Anomalies.
o Normalization divides a larger table into smaller tables and links them using relationships.
o The normal form is used to reduce redundancy from the database table.
The main reason for normalizing the relations is removing these anomalies.
o Insertion Anomaly: An insertion anomaly occurs when a new tuple cannot be inserted
into a relation due to lack of data.
o Deletion Anomaly: A deletion anomaly occurs when the deletion of
data results in the unintended loss of some other important data.
o Update Anomaly: An update anomaly occurs when an update of a single data value
requires multiple rows of data to be updated.
1. First Normal Form (1NF)
Definition: A table is in 1NF if all columns contain atomic (indivisible) values and each
record is unique.
Requirements:
1. Every column must have atomic values (no repeating groups or arrays).
2. Every record (row) should be unique.
Example:
2. Second Normal Form (2NF)
Definition: A table is in 2NF if it is in 1NF and all non-key columns are fully dependent
on the primary key (i.e., no partial dependency).
Requirements:
1. The table must be in 1NF.
2. All non-key attributes must depend on the entire primary key (no partial
dependencies).
Example:
Problem: The Professor_Room depends only on the Professor, not the whole primary key
(Student_ID, Course). This is a partial dependency.
Example:
5. Fourth Normal Form (4NF)
Example:
6. Fifth Normal Form (5NF)
Example: