
UNIT 3 PRE

The document provides an overview of the Relational Data Model, introduced by E.F. Codd, detailing its components, operations, and integrity constraints. It explains relational algebra and calculus, including various operations like selection, projection, and joins, along with their advantages and limitations. Additionally, it discusses Codd's rules for a perfect RDBMS and the importance of integrity constraints in maintaining data accuracy.

Uploaded by

Arun

UNIT-III

Relational Model: Introduction, Codd's rules, relational data model, concept of key, relational integrity, relational algebra, relational algebra operations, advantages of relational algebra, limitations of relational algebra, relational calculus, tuple relational calculus, domain relational calculus (DRC), functional dependencies and normal forms up to 3rd normal form.

Relational Data Model

The relational model was introduced in 1970 by E.F. Codd. It is implemented through a Relational Database Management System (RDBMS).

• In the Relational Data Model, data is represented in the form of “Tables” or “Relations”.
• A Table contains rows and columns.
o Each row is known as a “Tuple”.
o Each column is known as an “Attribute”.
• A relationship can be established between two tables only if a “common field” exists between them.
• SQL (Structured Query Language) is used to perform ad hoc queries.

Components of Relational Data Model

1. Set of Relations (Tables), Tuples, Attributes, and Domain
2. Keys in a Relation
3. Integrity Rules
4. Operations performed on Relations (Tables)

Relation or Table : A relation is a collection of records (tuples) that all have the same attributes. A relation is commonly used to represent an “Entity”.

Example: STUDENT Table

Student_ID   Name      Age   Course
101          Alice     20    CS
102          Bob       22    IT
103          Charlie   21    ECE

Record : A record is defined as a collection of related fields or attributes. A record is also known as a “Tuple”.

Examples of records are (1, divya, kkd, …..), (2, sailu, rjy, ….).

Field : A Field is defined as a character or a group of characters that has a specific meaning.
A field is also known as “Attribute”.

Examples of fields are sno, sname, addr, fname, dob etc.


Domain : A domain is the set of permitted values for an attribute.

Example 1: {Male, Female} is the domain of the attribute gender.

Example 2: 20, 24, 27 are values drawn from the domain of the attribute age.

Explain E. F. Codd’s Rules

Dr. Edgar Frank Codd was a computer scientist. While working for IBM, he invented the relational model for database management in 1970.

Not every database that has tables and constraints is a relational database system. There are certain rules a database system must follow to be a perfect Relational Database Management System (RDBMS).

• Codd proposed 13 rules (numbered 0 to 12) in 1985 to define a perfect RDBMS.
• If a DBMS satisfies these rules, it can be called an RDBMS.
• These rules are called Codd’s rules.

Rule 0: Foundation Rule

• Any system that claims to be an RDBMS must qualify as relational, as a database, and as a management system; that is, it must manage its databases entirely through its relational capabilities.
• The other 12 rules are derived from this rule.

Rule 1: The Information Rule

• A database holds both user data and metadata.
• Everything in the database must be stored in table format.
• Each cell should hold a single value.
• Most relational database systems satisfy this rule.

Rule 2: Guaranteed Access Rule

• This rule refers to the primary key.
• Every individual data value should be accessible through the combination of:
Table name + Primary Key (row) + Attribute (column).
• Using these three together should always locate exactly the intended value.

Rule 3: Systematic Treatment of Null

• This rule covers how Nulls are handled in the database.
• A Null can have several meanings: the data is missing, not applicable, or unknown.
• The DBMS must therefore handle Nulls in a systematic way.

Rule 4: Active Online Catalog

• This rule describes the data dictionary (catalog).
• Metadata must be maintained for all the data in the database, and it should be stored and queried in the same way as ordinary data.

Rule 5: Comprehensive Data Sublanguage Rule

• The database should not be accessed directly.
• It should always be accessed through a language that supports data definition, data manipulation, and transaction management operations.

Rule 6: View Updating Rule

• Views are virtual tables created by using queries.
• This rule states that every view that is theoretically updatable should also be updatable by the system, just like a table.

Rule 7: High-Level Insert, Update, Delete Rule

• The database must support set-level insert, update, and delete operations.

Rule 8: Physical Independence

• Changes to the physical level (how the data is stored) must not require a change in the application.

Rule 9: Logical Independence

• Changes to the logical level (tables) must not require a change in the applications built on that structure.

Rule 10: Integrity Independence

• Integrity constraints must be definable and modifiable independently, without the need for any change in the application.

Rule 11: Distribution Independence

• This rule lays the foundation for distributed database systems.
• Even if the data is distributed across different servers or locations, users and applications should be able to access it as if it were stored in one place.

Rule 12: Non-Subversive Rule

• If the system provides a low-level interface, then that interface cannot be used to subvert the system and bypass relational security or integrity constraints.

What is Relational Algebra? Explain about Relational Algebra Operations (OR) Relational Set Operators?

A. Relational Algebra:

• Relational algebra is a procedural query language.
• It contains a set of operations that take one or more relations (tables) as input and produce a new relation as output.

Relational Algebra Operations:

The fundamental operations of relational algebra, together with the commonly used derived operations (intersection and joins), are:

1. Select (σ)
2. Project (π)
3. Union (∪)
4. Intersect (∩)
5. Set Difference (-)
6. Cartesian Product (×)
7. Rename (ρ)
8. Joins (⋈)

1. Select Operation (σ)

• Selection Operator is represented by "sigma" (σ).
• It is used to retrieve tuples (rows) from the table where the given condition is satisfied.
• It is a unary operator, meaning it requires only one operand.

Notation: σ p(R)

• σ → Represents SELECTION.
• R → Represents RELATION.
• p → Represents the selection condition (a logical formula).

Example:

• Suppose we want to retrieve rows from the STUDENT relation where "AGE" is 20.
• The query will be:
σ AGE=20 (STUDENT)
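
As an illustration, the selection above can be sketched in Python. This is a minimal sketch rather than how an RDBMS implements selection: the relation is modeled as a list of dictionaries holding the STUDENT rows from the example table, and the helper name select is an assumption made for this example.

```python
# STUDENT relation from the example table, one dict per tuple.
STUDENT = [
    {"Student_ID": 101, "Name": "Alice", "Age": 20, "Course": "CS"},
    {"Student_ID": 102, "Name": "Bob", "Age": 22, "Course": "IT"},
    {"Student_ID": 103, "Name": "Charlie", "Age": 21, "Course": "ECE"},
]


def select(relation, predicate):
    """Selection: keep only the tuples for which predicate(tuple) is true."""
    return [row for row in relation if predicate(row)]


# Selection with the condition AGE = 20 on STUDENT.
result = select(STUDENT, lambda row: row["Age"] == 20)
print(result)
```

Only the row for Alice satisfies the condition, and every column of that row is kept, because selection filters rows and never columns.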

2. Project Operation (π)

• Projection Operator is represented by "pi" (π).
• It is used to retrieve certain attributes (columns) from the table.
• Also known as vertical partitioning, as it separates the table vertically.
• Because the result is a relation (a set), duplicate rows are removed from the output.
• It is a unary operator, meaning it requires only one operand.

Notation: π a(R)

• π → Represents PROJECTION.
• R → Represents RELATION.
• a → Represents the attribute list.

Example:

• Suppose we want to retrieve the names of all students from the STUDENT relation.
• The query will be:
π NAME(STUDENT)
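
A small Python sketch of projection, under the same modeling assumption (a relation as a list of dictionaries). The helper name project is hypothetical; the point it shows is that projection keeps only the listed columns and drops duplicate rows.

```python
# STUDENT relation from the example table, one dict per tuple.
STUDENT = [
    {"Student_ID": 101, "Name": "Alice", "Age": 20, "Course": "CS"},
    {"Student_ID": 102, "Name": "Bob", "Age": 22, "Course": "IT"},
    {"Student_ID": 103, "Name": "Charlie", "Age": 21, "Course": "ECE"},
]


def project(relation, attributes):
    """Projection: keep only the listed attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:  # a relation is a set, so duplicates are dropped
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Projection of STUDENT on the Name attribute.
names = project(STUDENT, ["Name"])
print(names)
```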

3. Union Operation (∪)

• Union Operator is represented by "union" (∪).
• It is similar to the union operator in set theory.
• It selects all tuples from both relations and removes duplicates, but with a condition:
o Both relations must have the same set of attributes.
• It is a binary operator, meaning it requires two operands.

Notation: R ∪ S

• R → Represents the first relation.
• S → Represents the second relation.

Important Condition: If the relations do not have the same set of attributes (that is, they are not union-compatible), the union operation is not defined and cannot be applied.

Example:

• Suppose we want to retrieve all the names from both STUDENT and EMPLOYEE relations.

π NAME(STUDENT) ∪ π NAME(EMPLOYEE)
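
The union-compatibility condition can be sketched in Python as follows. The Relation type and the union helper are assumptions made for this illustration, and the employee names are hypothetical sample data, since the notes do not list them.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    attributes: tuple  # ordered attribute names
    rows: frozenset    # set of value tuples, so duplicates cannot occur


def union(r, s):
    """Union: defined only when both relations have the same attributes."""
    if r.attributes != s.attributes:
        raise ValueError("relations are not union-compatible")
    return Relation(r.attributes, r.rows | s.rows)


student_names = Relation(("NAME",), frozenset({("Alice",), ("Bob",), ("Charlie",)}))
# Hypothetical EMPLOYEE names, chosen so that one name overlaps with STUDENT.
employee_names = Relation(("NAME",), frozenset({("Bob",), ("David",)}))

all_names = union(student_names, employee_names)
print(sorted(all_names.rows))
```

Bob appears in both inputs but only once in the result. Passing two relations with different attribute lists raises an error instead of returning a value.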

4. Set Difference (-)

• Set Difference represents the difference between two relations (R - S).
• It is denoted by a "Hyphen" (-).
• It returns all the tuples (rows) that are in relation R but not in relation S.
• It is a binary operator, meaning it requires two operands.

Notation: R - S

• R → Represents the first relation.
• S → Represents the second relation.

Important Condition:

• Just like Union (∪), Set Difference also requires that both relations have the same set of attributes.

Example:

• Suppose we want to retrieve the names of students who are in the STUDENT relation but not in the EMPLOYEE relation.
• The query will be:
π NAME(STUDENT) - π NAME(EMPLOYEE)
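
With single-attribute relations modeled as Python sets, set difference maps directly onto the built-in set operator. The employee names below are hypothetical sample data.

```python
# Names projected from STUDENT (from the example table) and EMPLOYEE (hypothetical).
student_names = {"Alice", "Bob", "Charlie"}
employee_names = {"Bob", "David"}

# R - S: tuples in R that are not in S.
only_students = student_names - employee_names
# S - R is a different relation: set difference is not commutative.
only_employees = employee_names - student_names

print(sorted(only_students), sorted(only_employees))
```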

5. Cartesian Product (×)

• Cartesian Product is denoted by the "X" (×) symbol.
• It combines every tuple (row) from relation R with all tuples from relation S.
• It is a binary operator, meaning it requires two operands.

Notation: R × S

• R → Represents the first relation.
• S → Represents the second relation.

Example:

• Suppose we want to combine the STUDENT and EMPLOYEE relations.
• The query will be:
STUDENT × EMPLOYEE
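
A short Python sketch of the Cartesian product, assuming the two relations share no attribute names. The rows are a subset of the STUDENT and EMPLOYEE columns used elsewhere in these notes, and the helper name cartesian_product is hypothetical.

```python
from itertools import product

STUDENT = [
    {"Student_ID": 101, "Name": "Alice"},
    {"Student_ID": 102, "Name": "Bob"},
]
EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "Ram"},
    {"E_ID": 2, "E_NAME": "Varun"},
    {"E_ID": 3, "E_NAME": "Ravi"},
]


def cartesian_product(r, s):
    """R x S: pair every tuple of r with every tuple of s.

    Assumes r and s have no attribute names in common.
    """
    return [{**a, **b} for a, b in product(r, s)]


pairs = cartesian_product(STUDENT, EMPLOYEE)
print(len(pairs))  # number of rows is len(STUDENT) * len(EMPLOYEE)
```

The result has 2 × 3 = 6 rows, which is why a Cartesian product is usually followed by a selection (that is, a join) in practice.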

6. Rename Operation (ρ)

• Rename Operator is denoted by "Rho" (ρ).
• It is used to rename the output relation.
• It is a unary operator: it operates on a single relation, and the new name is a parameter rather than a second relation.

Notation: ρ(R, S)

• R → Represents the new relation name.
• S → Represents the old relation (or relational expression) being renamed.

Example:

• Suppose we are fetching the names of students from the STUDENT relation.
• We would like to rename this relation as STUDENT_NAME.
• The query will be:
ρ(STUDENT_NAME, π NAME(STUDENT))

Join Operations

• Join Operation in DBMS allows us to combine two or more relations.
• Join is a binary operation, meaning each join combines two relations at a time.
• Joins are classified into two types:
1. Inner Join
2. Outer Join

7. Inner Join

• Inner Join returns only those tuples that satisfy a certain condition.
• It is classified into three types:
1. Theta Join (θ)
2. Equi Join
3. Natural Join

1. Theta Join (θ)

• Theta Join combines two relations using a condition.
• This condition is represented by the symbol "theta" (θ).
• The condition can use any comparison operator, such as =, ≠, >, <, >=, <=.

Notation: R ⋈θ S

• R → First relation
• S → Second relation

2. Equi Join

• Equi Join is a special case of Theta Join, where the condition only uses equality (=).
• If the condition uses any operator other than (=), it becomes a non-equijoin.
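
A theta join is equivalent to a Cartesian product followed by a selection on the join condition. The sketch below illustrates this with the EMPLOYEE and DEPARTMENT rows from the foreign key example in these notes. The helper name theta_join and the "R." and "S." attribute prefixes are assumptions made so that the two D_ID columns do not clash.

```python
from itertools import product

EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "Ram", "D_ID": 101},
    {"E_ID": 2, "E_NAME": "Varun", "D_ID": 102},
]
DEPARTMENT = [
    {"D_ID": 101, "D_NAME": "HR"},
    {"D_ID": 102, "D_NAME": "IT"},
]


def theta_join(r, s, condition):
    """Theta join: selection with `condition` over the Cartesian product of r and s."""
    joined = []
    for a, b in product(r, s):
        if condition(a, b):
            row = {f"R.{k}": v for k, v in a.items()}
            row.update({f"S.{k}": v for k, v in b.items()})
            joined.append(row)
    return joined


# Equi join: the condition uses only equality.
equi = theta_join(EMPLOYEE, DEPARTMENT, lambda e, d: e["D_ID"] == d["D_ID"])
# Non-equijoin: the same operation with a different comparison operator.
non_equi = theta_join(EMPLOYEE, DEPARTMENT, lambda e, d: e["D_ID"] < d["D_ID"])

print(len(equi), len(non_equi))
```

Note that the equi join keeps both copies of D_ID (R.D_ID and S.D_ID); removing the duplicate column is what distinguishes a natural join.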

3. Natural Join (⋈)

• Natural Join does not take an explicit condition; it implicitly compares the common attributes for equality.
• It automatically joins relations based on common attributes.
• The common attributes must have the same name and domain.
• Natural Join removes duplicate attributes in the output.

Notation: R ⋈ S

• R → First relation
• S → Second relation
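
A minimal Python sketch of a natural join, again using the EMPLOYEE and DEPARTMENT rows from the foreign key example in these notes. The helper name natural_join is hypothetical, and the sketch assumes both relations are non-empty so that the attribute names can be read from the first row.

```python
EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "Ram", "D_ID": 101},
    {"E_ID": 2, "E_NAME": "Varun", "D_ID": 102},
]
DEPARTMENT = [
    {"D_ID": 101, "D_NAME": "HR"},
    {"D_ID": 102, "D_NAME": "IT"},
]


def natural_join(r, s):
    """Join r and s on every attribute name they share, keeping one copy of each."""
    common = [name for name in r[0] if name in s[0]]
    result = []
    for a in r:
        for b in s:
            if all(a[name] == b[name] for name in common):
                # Merging the dicts keeps a single copy of the common attributes.
                result.append({**a, **b})
    return result


joined = natural_join(EMPLOYEE, DEPARTMENT)
print(joined)
```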

Outer Join

• Unlike Inner Join, which includes only matching tuples, Outer Join also includes tuples that don’t satisfy the condition.
• It is classified into three types:
1. Left Outer Join (⟕)
2. Right Outer Join (⟖)
3. Full Outer Join (⟗)

1. Left Outer Join (⟕)

• Returns all tuples from the left relation (R).
• If there is no matching tuple in the right relation (S), NULL values are assigned to its attributes.

2. Right Outer Join (⟖)

• Returns all tuples from the right relation (S).
• If there is no matching tuple in the left relation (R), NULL values are assigned to its attributes.

3. Full Outer Join (⟗)

• Returns all tuples from both relations.
• If there is no matching tuple, NULL values are assigned.
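
The NULL padding of a left outer join can be sketched in Python, with None standing in for NULL. The third EMPLOYEE row is hypothetical and was added only so that one tuple has no matching department. The helper name left_outer_join is also an assumption.

```python
EMPLOYEE = [
    {"E_ID": 1, "E_NAME": "Ram", "D_ID": 101},
    {"E_ID": 2, "E_NAME": "Varun", "D_ID": 102},
    {"E_ID": 3, "E_NAME": "Ravi", "D_ID": 103},  # hypothetical row with no matching department
]
DEPARTMENT = [
    {"D_ID": 101, "D_NAME": "HR"},
    {"D_ID": 102, "D_NAME": "IT"},
]


def left_outer_join(left, right, key):
    """Keep every tuple of `left`; pad with None where `right` has no match on `key`.

    Assumes `right` is non-empty so its attribute names can be read from the first row.
    """
    right_attrs = [a for a in right[0] if a != key]
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            result.extend({**l_row, **r_row} for r_row in matches)
        else:
            result.append({**l_row, **dict.fromkeys(right_attrs)})
    return result


joined = left_outer_join(EMPLOYEE, DEPARTMENT, "D_ID")
print(joined[2])
```

A right outer join is the same operation with the two relations swapped, and a full outer join combines the results of both.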

8. Intersection (∩)

• Intersection selects all tuples that are present in both relations.
• It removes duplicates and requires both relations to have the same attributes.
• It is a binary operator.

Notation: R ∩ S

• R → First relation
• S → Second relation

Example:

• Suppose we want names present in both STUDENT and EMPLOYEE relations.
• The query will be:
π NAME(STUDENT) ∩ π NAME(EMPLOYEE)
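
Intersection is a derived operation: it can be expressed with set difference alone, as R ∩ S = R - (R - S). The Python sketch below checks this identity on small sets of names; the employee names are hypothetical sample data.

```python
# Names projected from STUDENT (from the example table) and EMPLOYEE (hypothetical).
student_names = {"Alice", "Bob", "Charlie"}
employee_names = {"Bob", "David"}

# Direct intersection.
common = student_names & employee_names
# The same result using only set difference: R - (R - S).
via_difference = student_names - (student_names - employee_names)

print(common, via_difference)
```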

Write about Advantages and Limitations of Relational Algebra?

Relational Algebra:

• Relational algebra is a procedural query language.
• It consists of a set of operations that take one or more relations (tables) as input and produce a new relation as output.

Advantages:

• It provides a formal foundation for relational model operations.
• Relational algebra has a solid mathematical background.
• It is used as a basis for implementing and optimizing queries in an RDBMS.
• It is a high-level language that operates on whole sets of tuples.
• Some of its concepts are incorporated into SQL.
• Its precisely defined operators leave no ambiguity in how tables are related.
• It establishes links in a complex network-type database.
• Information that needs to be linked and extracted can be easily manipulated using operators such as project and join.

Limitations:

• Basic relational algebra cannot perform arithmetic operations.
• It cannot sort or print results in various formats.
• It cannot perform aggregate operations.
• The biggest limitation is that it is abstract: it is a formal notation rather than a language that can be run directly on a database server to get a result.
• It only deals with relations.
• Logical and physical modeling will never be completely separable.
• There are some simple and natural operations on relations that cannot be expressed using relational algebra (for example, transitive closure).

Explain About Relational Calculus?

Relational Calculus

In contrast to Relational Algebra, Relational Calculus is a non-procedural query language; that is, it states what result is wanted but not how to compute it.

Relational calculus exists in two forms: Tuple Relational Calculus (TRC) and Domain Relational Calculus (DRC).

1) Tuple Relational Calculus (TRC)

In TRC, the filtering variable ranges over tuples.

Notation − {T | Condition}

Returns all tuples T that satisfy the condition.

For example −

{ T.name | Author(T) AND T.article = 'database' }

Output − Returns the 'name' values of the tuples in Author whose article is 'database'.

TRC can be quantified. We can use Existential (∃) and Universal (∀) Quantifiers.

For example −

{ R | ∃T ∈ Author (T.article = 'database' AND R.name = T.name) }

Output − The above query yields the same result as the previous one.
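
TRC expressions read much like Python set comprehensions, which also state what is wanted rather than how to loop. The sketch below mirrors both queries. The rows of the Author relation are hypothetical, since the notes do not give any data for it.

```python
from collections import namedtuple

Author = namedtuple("Author", ["name", "article"])

# Hypothetical contents of the Author relation.
AUTHOR = [
    Author("Asha", "database"),
    Author("Ben", "networks"),
    Author("Chen", "database"),
]

# { T.name | Author(T) AND T.article = 'database' }
names = {t.name for t in AUTHOR if t.article == "database"}

# The quantified form: a name qualifies if there EXISTS a tuple T in Author
# with T.article = 'database' and the same name.
candidates = {t.name for t in AUTHOR}
names_exists = {
    n for n in candidates
    if any(t.article == "database" and t.name == n for t in AUTHOR)
}

print(sorted(names), sorted(names_exists))
```

Here any(...) plays the role of the existential quantifier ∃; Python's all(...) would correspond to the universal quantifier ∀.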

2) Domain Relational Calculus (DRC)

In DRC, the filtering variables range over the domains of attributes instead of over entire tuples (as in TRC, described above).

Notation − { <a1, a2, a3, ..., an> | P(a1, a2, a3, ..., an) }

where a1, a2, ..., an are attributes (domain variables) and P stands for a formula built from those attributes.

For example −

{ <article, page, subject> | <article, page, subject> ∈ TutorialsPoint ∧ subject = 'database' }

Output − Returns article, page, and subject from the relation TutorialsPoint, where subject is 'database'.
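
The difference from TRC shows up in a Python sketch as variables bound to individual attribute values rather than to a whole tuple. The rows of the TutorialsPoint relation below are hypothetical.

```python
# Hypothetical rows of the TutorialsPoint relation: (article, page, subject).
TUTORIALSPOINT = [
    ("Normalization", 12, "database"),
    ("Sorting", 30, "algorithms"),
    ("Joins", 18, "database"),
]

# Each of article, page and subject is a domain variable ranging over
# the values of one attribute, not over whole tuples.
result = {
    (article, page, subject)
    for (article, page, subject) in TUTORIALSPOINT
    if subject == "database"
}

print(sorted(result))
```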


What do you mean by Integrity Constraints? Or Relational Integrity

Integrity Constraints:

• These are used to maintain accuracy and correctness of data in a table.
• Integrity rules are enforced on tables to maintain database integrity.
• These rules prevent invalid data entry into a table.

There are three types of Integrity Rules:

1. Entity Integrity
2. Referential Integrity
3. Domain Integrity

1. Entity Integrity:

• It specifies that there should be no duplicate rows in a table (Primary Key and Unique Constraint).
o Primary Key: A Primary Key uniquely identifies each record in a table. It must have unique values and cannot contain NULLs.
o Unique Constraint: The UNIQUE constraint enforces a column or set of columns to have unique values. If a column has a unique constraint, that column cannot have duplicate values in the table.

2. Referential Integrity:

• It maintains relationships with other tables using a Foreign Key: every foreign key value must either match an existing value of the referenced key or be NULL.
o Foreign Key: Foreign keys are the columns of a table that point to the Primary Key of another table. They act as a cross-reference between tables.

3. Domain Integrity:

• It enforces valid entries for a given column (CHECK and NOT NULL Constraints).
o NOT NULL Constraint: Ensures that a column does not hold a NULL value.
o CHECK Constraint: Used to specify a condition (such as a range of values) that every value in a particular column must satisfy.
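
The three kinds of integrity rules can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table layout follows the EMPLOYEE and DEPARTMENT examples in these notes, but the age column, the CHECK condition, and the e-mail addresses are hypothetical. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (d_id INTEGER PRIMARY KEY, d_name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE employee (
        e_id    INTEGER PRIMARY KEY,                  -- entity integrity
        e_name  TEXT NOT NULL,                        -- domain integrity
        e_email TEXT UNIQUE,                          -- entity integrity
        age     INTEGER CHECK (age >= 18),            -- domain integrity (hypothetical rule)
        d_id    INTEGER REFERENCES department(d_id)   -- referential integrity
    )
""")
conn.execute("INSERT INTO department VALUES (101, 'HR')")
conn.execute("INSERT INTO employee VALUES (1, 'Ram', 'ram@example.com', 30, 101)")


def rejected(sql):
    """Return True if the statement is refused for violating a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


violations = {
    "duplicate primary key": rejected(
        "INSERT INTO employee VALUES (1, 'Varun', 'varun@example.com', 25, 101)"),
    "duplicate unique value": rejected(
        "INSERT INTO employee VALUES (2, 'Varun', 'ram@example.com', 25, 101)"),
    "NULL in NOT NULL column": rejected(
        "INSERT INTO employee VALUES (3, NULL, 'x@example.com', 25, 101)"),
    "CHECK condition fails": rejected(
        "INSERT INTO employee VALUES (4, 'Ravi', 'ravi@example.com', 10, 101)"),
    "unknown department": rejected(
        "INSERT INTO employee VALUES (5, 'Ravi', 'ravi@example.com', 25, 999)"),
}
print(violations)
```

Every invalid insert is refused, so the employee table still contains only the one valid row.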

Write about the different types of keys in the relational model?

Key:

• A database consists of tables, which contain records, which further contain fields (attributes).
• A key is a set of one or more columns whose values are unique in a given table.
• A key is a relational means of specifying uniqueness.

Types of Keys:

The different types of keys are:

1. Super Key
2. Candidate Key
3. Primary Key
4. Foreign Key
5. Alternative Key / Secondary Key
6. Composite Key

1. Super Key

• A Super Key is a set of one or more attributes (columns) that can uniquely identify each record (row) in a table.
• It may contain extra attributes that are not necessary for uniqueness.
• A table can have multiple super keys.

Example: Consider a table named EMPLOYEE with attributes E_ID (Employee ID), E_NAME (Employee Name), and E_EMAIL (Employee Email).

E_ID   E_NAME   E_EMAIL
1      Ram      [email protected]
2      Varun    [email protected]
3      Ravi     [email protected]

• {E_ID}, {E_ID, E_NAME}, and {E_ID, E_EMAIL} are all super keys because the E_ID attribute can uniquely identify each record, even if extra attributes are added.

2. Candidate Key

• A Candidate Key is a minimal super key. This means it is a set of attributes that uniquely identifies a record in a table, and no attribute can be removed without losing the uniqueness property.
• There can be multiple candidate keys in a table, but each one is minimal.

Example: In the EMPLOYEE table, the following attributes are candidate keys:

• {E_ID} (Employee ID is unique by itself)
• {E_EMAIL} (Email is unique by itself)

Here, both E_ID and E_EMAIL are candidate keys, as each can uniquely identify a row in the table, and neither has extra attributes.
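
The relationship between super keys and candidate keys can be illustrated with a brute-force Python sketch over sample rows. The e-mail addresses and the fourth row are hypothetical (the extra row is there so that E_NAME repeats). Strictly speaking, a key is a property of the schema, so checking one set of rows can only show which attribute sets are unique in that data.

```python
from itertools import combinations

ATTRS = ("E_ID", "E_NAME", "E_EMAIL")
EMPLOYEE = [
    (1, "Ram", "ram@example.com"),      # e-mail addresses are hypothetical
    (2, "Varun", "varun@example.com"),
    (3, "Ravi", "ravi@example.com"),
    (4, "Ram", "ram2@example.com"),     # hypothetical extra row: names can repeat
]


def is_unique(attrs):
    """True if no two rows share the same values for the given attributes."""
    idx = [ATTRS.index(a) for a in attrs]
    values = [tuple(row[i] for i in idx) for row in EMPLOYEE]
    return len(set(values)) == len(values)


# Super keys: every attribute set that identifies rows uniquely.
super_keys = [
    set(combo)
    for size in range(1, len(ATTRS) + 1)
    for combo in combinations(ATTRS, size)
    if is_unique(combo)
]

# Candidate keys: super keys with no smaller super key inside them.
candidate_keys = [
    key for key in super_keys
    if not any(other < key for other in super_keys)
]

print(candidate_keys)
```

Of the six super keys found, only {E_ID} and {E_EMAIL} are minimal, which matches the candidate keys listed above.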

3. Primary Key

• A Primary Key is a special candidate key that is chosen to uniquely identify records in a table.
• It cannot contain NULL values and must be unique across the table.
• A table can have only one primary key, which can consist of one or more columns (composite primary key).

Example: In the EMPLOYEE table, the E_ID attribute can be chosen as the primary key because it uniquely identifies each employee.

E_ID (Primary Key)   E_NAME   E_EMAIL
1                    Ram      [email protected]
2                    Varun    [email protected]
3                    Ravi     [email protected]

4. Foreign Key

• A Foreign Key is a column or a set of columns in a table that references the Primary Key or Candidate Key in another table.
• It is used to establish relationships between tables.
• A foreign key can contain NULL values, and the values in the foreign key column must match values in the referenced key or be NULL.

Example: Consider two tables: EMPLOYEE and DEPARTMENT.

EMPLOYEE Table:

E_ID (Primary Key)   E_NAME   D_ID (Foreign Key)
1                    Ram      101
2                    Varun    102

DEPARTMENT Table:

D_ID (Primary Key)   D_NAME
101                  HR
102                  IT

• The D_ID in the EMPLOYEE table is a foreign key that references the D_ID in the DEPARTMENT table.

5. Alternative Key / Secondary Key

• An Alternative Key (more commonly called an alternate key) is a candidate key that was not selected as the primary key but still has the ability to uniquely identify records.
• A Secondary Key is another term used for Alternative Key, although in some cases it may refer to a key used for searching or sorting rather than uniquely identifying records.

Example: In the EMPLOYEE table, if E_ID is chosen as the primary key, then E_EMAIL, which can still uniquely identify rows, is an Alternative Key.

6. Composite Key

• A Composite Key is a key that consists of two or more attributes (columns) used together to uniquely identify records in a table.
• Each individual column in a composite key may not be unique on its own, but when combined, they provide uniqueness.

Example: Consider the ENROLLED table, which records which students are enrolled in which courses.

STUDENT_ID   COURSE_ID
Student_1    DBMS
Student_2    DBMS
Student_1    OS
Student_3    OS

Here, {STUDENT_ID, COURSE_ID} forms a composite key because no single column (like STUDENT_ID or COURSE_ID) is unique by itself, but together they uniquely identify each record.

What is Functional Dependency?

• A functional dependency occurs when one attribute (or set of attributes) uniquely determines another attribute within a relation.
• It is denoted as X → Y, where the attribute set on the left side of the arrow, X, is called the Determinant, and Y is called the Dependent. X → Y means that any two tuples that agree on X must also agree on Y.

Consider a student table with the attributes roll_no, name, dept_name, and dept_building, in which roll_no is unique. Some valid functional dependencies are:

• roll_no → { name, dept_name, dept_building }

Here, roll_no determines the values of the fields name, dept_name, and dept_building, so this is a valid functional dependency.

• roll_no → dept_name

Since roll_no determines the whole set {name, dept_name, dept_building}, it also determines its subset dept_name.
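
Whether a functional dependency holds in a given set of rows can be checked mechanically: X → Y fails as soon as two rows agree on X but differ on Y. The Python sketch below shows one way to do this. The helper name holds and the sample rows are hypothetical; only the attribute names come from the text above.

```python
# Hypothetical rows for a student table with the attributes named above.
STUDENTS = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO", "dept_building": "A4"},
]


def holds(rows, determinant, dependent):
    """Check X -> Y: no two rows may agree on X but differ on Y."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        y = tuple(row[a] for a in dependent)
        # setdefault stores y the first time x is seen and returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True


print(holds(STUDENTS, ["roll_no"], ["name", "dept_name", "dept_building"]))
print(holds(STUDENTS, ["dept_name"], ["roll_no"]))
```

Such a check can only disprove a dependency from data. That a dependency holds for every possible state of the table is a design decision about the schema.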

Types of Functional Dependencies in DBMS

1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency

1. Trivial Functional Dependency

In a trivial functional dependency, the dependent is always a subset of the determinant, i.e., if X → Y and Y is a subset of X, then it is called a trivial functional dependency.
Symbolically: A → B is a trivial functional dependency if B is a subset of A.
The following dependencies are also trivial: A → A and B → B.
Example 1:
• ABC → AB
• ABC → A
• ABC → ABC

2. Non-trivial Functional Dependency

In a non-trivial functional dependency, the dependent is not a subset of the determinant, i.e., if X → Y and Y is not a subset of X, then it is called a non-trivial functional dependency.
Example 1:
• Id → Name
• Name → DOB

3. Multivalued Functional Dependency

In a multivalued dependency, one value of the determinant is associated with a set of values of the dependent, and the dependent attributes are independent of each other, i.e., if a →→ b and a →→ c hold and there is no dependency between b and c, then it is called a multivalued dependency. Unlike an ordinary functional dependency, it does not require a single dependent value for each determinant value.

For example, in a table of students, courses, and professors:

• A Student_ID determines multiple Courses (e.g., Student 1 is enrolled in Math and English).
• A Student_ID also determines multiple Professors (e.g., Student 1 is taught by Dr. Smith for Math and Dr. Johnson for English).

4. Transitive Functional Dependency

In a transitive functional dependency, the dependent is indirectly dependent on the determinant, i.e., if a → b and b → c, then according to the axiom of transitivity, a → c. This is a transitive functional dependency.

Example: Suppose enrol_no → dept and dept → building_no. Then, according to the axiom of transitivity, enrol_no → building_no is a valid functional dependency. Because it holds indirectly, through dept, it is called a transitive functional dependency.
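
Repeatedly applying the axiom of transitivity is how the closure of an attribute set is computed, that is, everything the set determines directly or indirectly. The Python sketch below uses the two dependencies from the example; the function name closure is an assumption, and this is the standard textbook procedure rather than anything specific to these notes.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependent in fds:
            # If we already have the whole determinant, we gain the dependent.
            if determinant <= result and not dependent <= result:
                result |= dependent
                changed = True
    return result


# enrol_no -> dept and dept -> building_no, as in the example.
FDS = [
    ({"enrol_no"}, {"dept"}),
    ({"dept"}, {"building_no"}),
]

print(sorted(closure({"enrol_no"}, FDS)))
```

Because building_no appears in the closure of enrol_no, the dependency enrol_no → building_no holds, even though it is not listed explicitly.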

What is Normalization?

o Normalization is the process of organizing the data in the database.
o It is used to minimize redundancy in a relation or set of relations.
o It is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies.
o Normalization divides a larger table into smaller tables and links them using relationships.
o Normal forms are the successive levels of normalization used to reduce redundancy in database tables.

The main reason for normalizing relations is to remove these anomalies.

Data modification anomalies can be categorized into three types:

o Insertion Anomaly: An insertion anomaly occurs when a new tuple cannot be inserted into a relation because some of the required data is missing.
o Deletion Anomaly: A deletion anomaly occurs when deleting some data results in the unintended loss of other important data.
o Update Anomaly: An update anomaly occurs when changing a single data value requires multiple rows to be updated; if some rows are missed, the data becomes inconsistent.

Types of Normal Forms:

1. First Normal Form (1NF)

• Definition: A table is in 1NF if all columns contain atomic (indivisible) values and each record is unique.
• Requirements:
1. Every column must have atomic values (no repeating groups or arrays).
2. Every record (row) should be unique.

2. Second Normal Form (2NF)

• Definition: A table is in 2NF if it is in 1NF and all non-key columns are fully dependent on the primary key (i.e., no partial dependency).
• Requirements:
1. The table must be in 1NF.
2. All non-key attributes must depend on the entire primary key (no partial dependencies).

Example:

Problem: The Professor_Room depends only on the Professor, not the whole primary key
(Student_ID, Course). This is a partial dependency.

3. Third Normal Form (3NF)

• Definition: A table is in 3NF if it is in 2NF and there are no transitive dependencies (non-key attributes depending on other non-key attributes).
• Requirements:
1. The table must be in 2NF.
2. There must be no transitive dependency (non-key attributes should not depend on other non-key attributes).
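
As a sketch of how a transitive dependency is removed, the Python code below splits a table with enrol_no → dept and dept → building_no into two tables and then rejoins them to confirm that no information is lost. The rows are hypothetical, and the helper name project is an assumption.

```python
# Hypothetical rows: enrol_no -> dept and dept -> building_no (a transitive dependency).
STUDENT = [
    {"enrol_no": 1, "dept": "CS", "building_no": 4},
    {"enrol_no": 2, "dept": "IT", "building_no": 3},
    {"enrol_no": 3, "dept": "CS", "building_no": 4},
]


def project(rows, attrs):
    """Keep only `attrs` and drop duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]


# 3NF decomposition: building_no moves to a table keyed by dept.
student = project(STUDENT, ["enrol_no", "dept"])
department = project(STUDENT, ["dept", "building_no"])

# Joining the two tables on dept gives back the original rows.
rejoined = [
    {**s, **d}
    for s in student
    for d in department
    if s["dept"] == d["dept"]
]

print(department)
```

The building number of the CS department is now stored once instead of once per student, so changing it requires updating a single row.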

4. Boyce-Codd Normal Form (BCNF)

• Definition: A table is in BCNF if it is in 3NF and every determinant is a candidate key.
• Requirements:
1. The table must be in 3NF.
2. For every functional dependency, the determinant (the column(s) determining other values) must be a candidate key.

5. Fourth Normal Form (4NF)

• Definition: A table is in 4NF if it is in BCNF and there are no non-trivial multivalued dependencies.
• Requirements:
1. The table must be in BCNF.
2. There must be no non-trivial multivalued dependencies (one attribute determining multiple independent sets of values) unless the determinant is a super key.

6. Fifth Normal Form (5NF)

• Definition: A table is in 5NF if it is in 4NF and cannot be further decomposed without losing information.
• Requirements:
1. The table must be in 4NF.
2. Every non-trivial join dependency must be implied by the candidate keys, so the table cannot be split into smaller tables and rejoined without loss.
