DBMS Unit III Notes
1. SELECT: The SELECT statement specifies the columns or fields that you want to retrieve
from the database table. You can select specific columns or use the asterisk (*) to select all
columns.
2. FROM: The FROM clause indicates the table or tables from which you want to retrieve the
data. It specifies the name of the table or tables separated by commas.
3. WHERE: The WHERE clause is used to specify conditions that filter the rows returned by
the query. It allows you to define criteria that the data must meet to be included in the result
set.
4. GROUP BY: The GROUP BY clause is used to group rows based on one or more
columns. It is often used in conjunction with aggregate functions such as SUM, COUNT, and
AVG.
6. ORDER BY: The ORDER BY clause is used to sort the result set based on one or more
columns. It can sort in ascending (ASC) or descending (DESC) order.
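The clauses above can be seen working together in one query. The sketch below runs the SQL through Python's built-in sqlite3 module; the employees table and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical table and rows, invented only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "Sales", 30000), ("Ravi", "Sales", 45000),
     ("Meena", "HR", 28000), ("John", "HR", 52000), ("Kiran", "IT", 60000)],
)

query = """
SELECT dept, COUNT(*) AS staff, AVG(salary) AS avg_salary  -- columns to return
FROM employees                                             -- source table
WHERE salary >= 30000                                      -- filter rows first
GROUP BY dept                                              -- one group per department
ORDER BY avg_salary DESC                                   -- sort the grouped result
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that WHERE removes Meena's row before grouping, so the HR group contains only one employee.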
The UNION operator is used to combine the result sets of two or more SELECT statements.
o Every SELECT statement within UNION must have the same number of columns.
o The columns must also have similar data types.
o The columns in every SELECT statement must also be in the same order.
UNION Syntax
SELECT column_name(s) FROM table1
UNION
SELECT column_name(s) FROM table2;
UNION ALL Syntax
The UNION operator selects only distinct values by default. To allow duplicate values,
use UNION ALL:
SELECT column_name(s) FROM table1
UNION ALL
SELECT column_name(s) FROM table2;
Note: The column names in the result set are usually equal to the column names in the
first SELECT statement.
The following SQL statement returns the cities (only distinct values) from both the
"Customers" and the "Suppliers" table:
Example:
SELECT City FROM Customers
UNION
SELECT City FROM Suppliers
ORDER BY City;
Output:
City
Aachen
Annecy
Bend
Berlin
Cork
Delhi
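To see the difference between UNION and UNION ALL, here is a minimal sketch using Python's sqlite3 module. Only a few assumed city rows are loaded; they are not the full Customers and Suppliers tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (City TEXT);
CREATE TABLE Suppliers (City TEXT);
-- A handful of assumed rows; 'Berlin' appears in both tables.
INSERT INTO Customers VALUES ('Berlin'), ('Cork'), ('Delhi');
INSERT INTO Suppliers VALUES ('Berlin'), ('Aachen');
""")

# UNION removes duplicates across the two result sets.
union_rows = [r[0] for r in conn.execute(
    "SELECT City FROM Customers UNION SELECT City FROM Suppliers ORDER BY City")]

# UNION ALL keeps every row, including the repeated 'Berlin'.
union_all_rows = [r[0] for r in conn.execute(
    "SELECT City FROM Customers UNION ALL SELECT City FROM Suppliers ORDER BY City")]

print(union_rows)      # ['Aachen', 'Berlin', 'Cork', 'Delhi']
print(union_all_rows)  # ['Aachen', 'Berlin', 'Berlin', 'Cork', 'Delhi']
```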
INTERSECT in SQL
The INTERSECT operator in SQL is used to retrieve the records that are common to the
result sets of two SELECT statements.
In real-world scenarios, a database contains a large number of tables, and the user may
find it challenging to gather the information that several tables have in common. The
INTERSECT operator retrieves this common data directly.
Syntax
To retrieve identical records from two different tables, we use the following syntax
SELECT column1, column2,…, columnN
FROM table1, table2,…, tableN
INTERSECT
SELECT column1, column2,…, columnN
FROM table1, table2,…, tableN
Example
First of all, let us create a table named “STUDENTS” using the following query −
SQL> CREATE TABLE STUDENTS(
ID INT NOT NULL,
NAME VARCHAR(20) NOT NULL,
HOBBY VARCHAR(20) NOT NULL,
AGE INT NOT NULL,
PRIMARY KEY(ID)
);
Once the table is created, let us insert some values to the table using the query below −
Let us verify whether the table “STUDENTS” is created or not using the following query −
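As a rough illustration of INTERSECT, the sketch below loads the six STUDENTS rows used in these notes and compares them with an ASSOCIATES table. The ASSOCIATES rows are assumptions made up for this sketch, and the SQL is run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENTS   (ID INT PRIMARY KEY, NAME TEXT, HOBBY TEXT, AGE INT);
CREATE TABLE ASSOCIATES (ID INT PRIMARY KEY, NAME TEXT, SUBJECT TEXT, AGE INT, HOBBY TEXT);
INSERT INTO STUDENTS VALUES
  (1, 'Vijay', 'Cricket', 18), (2, 'Varun', 'Football', 26), (3, 'Surya', 'Cricket', 19),
  (4, 'Karthik', 'Cricket', 25), (5, 'Sunny', 'Football', 26), (6, 'Dev', 'Cricket', 23);
-- Assumed rows, for illustration only.
INSERT INTO ASSOCIATES VALUES
  (1, 'Naina', 'Maths', 24, 'Cricket'),
  (2, 'Varun', 'Physics', 26, 'Football'),
  (3, 'Dev', 'Maths', 23, 'Cricket');
""")

# Only rows whose (NAME, HOBBY, AGE) appear in BOTH result sets are returned.
common = conn.execute("""
    SELECT NAME, HOBBY, AGE FROM STUDENTS
    INTERSECT
    SELECT NAME, HOBBY, AGE FROM ASSOCIATES
    ORDER BY NAME
""").fetchall()
print(common)  # [('Dev', 'Cricket', 23), ('Varun', 'Football', 26)]
```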
EXCEPT:
The EXCEPT operator in SQL is used to retrieve the distinct records that appear in the result
of the first SELECT statement but not in the result of the second. It corresponds to set
difference, whereas UNION corresponds to set union.
For better understanding consider two tables with records as shown in the following image −
If we perform the EXCEPT operator on the above two tables to retrieve the names, it will
display only the records from the first table that are not in common with the records of the
second table.
Here, “Dev” is common to both tables. So, the EXCEPT operator eliminates it and
retrieves only “Sara” and “Jay” as output.
Syntax
Following is the syntax of the EXCEPT operator in SQL −
SELECT column1, column2,…, columnN
FROM table1, table2,…, tableN
[Conditions] //optional
EXCEPT
SELECT column1, column2,…, columnN
FROM table1, table2,…, tableN
[Conditions] //optional
Note − The number and order of columns in both SELECT statements should be the same.
Example
First of all, let us create a table named “STUDENTS” using the following query −
SQL> CREATE TABLE STUDENTS(
ID INT NOT NULL,
NAME VARCHAR(20) NOT NULL,
HOBBY VARCHAR(20) NOT NULL,
AGE INT NOT NULL,
PRIMARY KEY(ID)
);
Once the table is created, let us insert some values to the table using the query below –
SQL> INSERT INTO STUDENTS(ID, NAME, HOBBY, AGE) VALUES(1, 'Vijay', 'Cricket', 18);
INSERT INTO STUDENTS(ID, NAME, HOBBY, AGE) VALUES(2, 'Varun', 'Football', 26);
INSERT INTO STUDENTS(ID, NAME, HOBBY, AGE) VALUES(3, 'Surya', 'Cricket', 19);
INSERT INTO STUDENTS(ID, NAME, HOBBY, AGE) VALUES(4, 'Karthik', 'Cricket', 25);
INSERT INTO STUDENTS(ID, NAME, HOBBY, AGE) VALUES(5, 'Sunny', 'Football', 26);
INSERT INTO STUDENTS(ID, NAME, HOBBY, AGE) VALUES(6, 'Dev', 'Cricket', 23);
Let us verify that the table “STUDENTS” has been created and populated using the following query −
SQL> SELECT * FROM STUDENTS;
As the output below shows, the table exists in the database and contains the six inserted rows.
+----+---------+----------+-----+
| ID | NAME    | HOBBY    | AGE |
+----+---------+----------+-----+
|  1 | Vijay   | Cricket  |  18 |
|  2 | Varun   | Football |  26 |
|  3 | Surya   | Cricket  |  19 |
|  4 | Karthik | Cricket  |  25 |
|  5 | Sunny   | Football |  26 |
|  6 | Dev     | Cricket  |  23 |
+----+---------+----------+-----+
Let us create another table named “ASSOCIATES” using the following query −
SQL> CREATE TABLE ASSOCIATES(
ID INT NOT NULL,
NAME VARCHAR(20) NOT NULL,
SUBJECT VARCHAR(20) NOT NULL,
AGE INT NOT NULL,
HOBBY VARCHAR(20) NOT NULL,
PRIMARY KEY(ID)
);
Once the table is created, let us insert some values to the table using the query below −
Let us retrieve the records that exist only in the first table using the query below −
SQL> SELECT NAME, HOBBY, AGE
FROM STUDENTS
EXCEPT
SELECT NAME, HOBBY, AGE
FROM ASSOCIATES;
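The “Dev”, “Sara” and “Jay” case described earlier can be reproduced with a small sketch. The table names first_table and second_table and the extra row 'Aarohi' are assumptions for illustration; the SQL is run through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_table  (NAME TEXT);   -- hypothetical table names
CREATE TABLE second_table (NAME TEXT);
INSERT INTO first_table  VALUES ('Dev'), ('Sara'), ('Jay');
INSERT INTO second_table VALUES ('Dev'), ('Aarohi');  -- 'Aarohi' is an assumed row
""")

# Rows of the first SELECT that do not occur in the second SELECT.
only_in_first = [r[0] for r in conn.execute("""
    SELECT NAME FROM first_table
    EXCEPT
    SELECT NAME FROM second_table
    ORDER BY NAME
""")]
print(only_in_first)  # ['Jay', 'Sara']
```

Swapping the two SELECT statements would return 'Aarohi' instead, so unlike UNION and INTERSECT the order of the operands matters.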
These are some of the most important types of nested queries, which we will look at with
examples in the next sections.
1. A SQL nested query is always enclosed in parentheses.
4. The user must use a multiple-row operator (IN, ANY) if the subquery can return more
than one row.
There are many business situations where the user needs nested subqueries to fetch the
exact data from two or more tables. A subquery placed in the FROM clause is also called an
inline view.
Syntax :
Select Column1,Column2… From Table_Name
;
The user can nest any number of inner queries to fetch the required output. However, deeply
nested queries can hurt performance, so they should be used with care.
Employee Table :
1 Rohan SQL
2 Rajiv PL SQL
3 Ram Java
Salary Table :
Employee No Salary
1 25000
2 35000
3 45000
Suppose the user wants to fetch all records of employees whose salary is greater than 25000.
Query :
Select * from Employee where Employee_No
The above query is a nested query that returns the data of the employees whose salary is
greater than 25000.
There are many real-life situations where the user needs nested queries to insert data
into a table, for example when a special set of data is needed for testing. Nested queries
combined with INSERT statements handle this situation.
Syntax :
Insert into Tablename
Query :
Insert into Employee_Bkp
Sometimes the user needs to update data according to client requirements. If the data
is not huge, the user can use nested queries to update it.
Syntax :
Update Table Set Column_name =
Sometimes the user needs to delete data that meets a specific condition. A nested query
inside the DELETE statement can express that condition.
Syntax :
Delete from tablename
We need to delete data from the Employee table where salary is greater than 25000.
Query :
Delete from Employee where Employee_No IN
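A possible way to write the nested SELECT, INSERT and DELETE queries described above is sketched here, run through Python's sqlite3 module. The column name Skill and the exact form of the subquery are assumptions, since the notes do not show them in full.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee     (Employee_No INT, Name TEXT, Skill TEXT);  -- 'Skill' is an assumed column name
CREATE TABLE Salary       (Employee_No INT, Salary INT);
CREATE TABLE Employee_Bkp (Employee_No INT, Name TEXT, Skill TEXT);
INSERT INTO Employee VALUES (1, 'Rohan', 'SQL'), (2, 'Rajiv', 'PL SQL'), (3, 'Ram', 'Java');
INSERT INTO Salary   VALUES (1, 25000), (2, 35000), (3, 45000);
""")

# Nested SELECT: the inner query supplies the Employee_No values for IN.
selected = conn.execute("""
    SELECT Name FROM Employee
    WHERE Employee_No IN (SELECT Employee_No FROM Salary WHERE Salary > 25000)
    ORDER BY Employee_No
""").fetchall()

# Nested INSERT: copy the same employees into the backup table.
conn.execute("""
    INSERT INTO Employee_Bkp
    SELECT * FROM Employee
    WHERE Employee_No IN (SELECT Employee_No FROM Salary WHERE Salary > 25000)
""")
backup_count = conn.execute("SELECT COUNT(*) FROM Employee_Bkp").fetchone()[0]

# Nested DELETE: remove those employees from the original table.
conn.execute("""
    DELETE FROM Employee
    WHERE Employee_No IN (SELECT Employee_No FROM Salary WHERE Salary > 25000)
""")
remaining = conn.execute("SELECT Name FROM Employee").fetchall()

print(selected, backup_count, remaining)
```

IN is used rather than = because the inner query returns more than one row.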
1. COUNT Function
o The COUNT function is used to count the number of rows in a database table. It works
on both numeric and non-numeric data types.
o COUNT(*) returns the count of all the rows in a specified table, including duplicates and
rows that contain NULL. COUNT(expression) counts only the rows where the expression is
not NULL.
Syntax
COUNT(*)
or
COUNT( [ALL|DISTINCT] expression )
Sample table:
PRODUCT_MAST
PRODUCT COMPANY QTY RATE COST
Item1 Com1 2 10 20
Item2 Com2 3 25 75
Item3 Com1 2 30 60
Item4 Com3 5 10 50
Item5 Com2 2 20 40
Item6 Com1 3 25 75
Item8 Com1 3 10 30
Item9 Com2 2 25 50
Example: COUNT()
SELECT COUNT(*)
FROM PRODUCT_MAST;
Output:
10
Output:
Com1 5
Com2 3
Com3 2
Output:
Com1 5
Com2 3
2. SUM Function
The SUM function is used to calculate the sum of the values in the selected column. It works
on numeric fields only.
Syntax
SUM()
or
SUM( [ALL|DISTINCT] expression )
Example: SUM()
SELECT SUM(COST)
FROM PRODUCT_MAST;
Output:
670
Example: SUM() with WHERE
SELECT SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3;
Output:
320
Example: SUM() with GROUP BY
SELECT COMPANY, SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3
GROUP BY COMPANY;
Output:
Com1 150
Com2 170
Com1 335
Com3 170
3. AVG Function
The AVG function is used to calculate the average value of a numeric column. It
returns the average of all non-NULL values.
Syntax
AVG()
or
AVG( [ALL|DISTINCT] expression )
Example:
SELECT AVG(COST)
FROM PRODUCT_MAST;
Output:
67.00
4. MAX Function
The MAX function is used to find the maximum value of a certain column. This function
determines the largest value of all selected values of a column.
Syntax
MAX()
or
MAX( [ALL|DISTINCT] expression )
Example:
SELECT MAX(RATE)
FROM PRODUCT_MAST;
Output:
30
5. MIN Function
The MIN function is used to find the minimum value of a certain column. This function
determines the smallest value of all selected values of a column.
Syntax
MIN()
or
MIN( [ALL|DISTINCT] expression )
Example:
SELECT MIN(RATE)
FROM PRODUCT_MAST;
Output:
10
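The five aggregate functions can be tried together on PRODUCT_MAST. In the sketch below (run through Python's sqlite3 module) the first column name PRODUCT and the rows for Item7 and Item10 are assumptions: they are not listed in the sample table and were chosen so that the results agree with the outputs quoted in these notes (10 rows, total cost 670).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT_MAST (PRODUCT TEXT, COMPANY TEXT, QTY INT, RATE INT, COST INT)")
conn.executemany("INSERT INTO PRODUCT_MAST VALUES (?, ?, ?, ?, ?)", [
    ("Item1", "Com1", 2, 10, 20), ("Item2", "Com2", 3, 25, 75),
    ("Item3", "Com1", 2, 30, 60), ("Item4", "Com3", 5, 10, 50),
    ("Item5", "Com2", 2, 20, 40), ("Item6", "Com1", 3, 25, 75),
    ("Item7", "Com1", 5, 30, 150),    # assumed row
    ("Item8", "Com1", 3, 10, 30), ("Item9", "Com2", 2, 25, 50),
    ("Item10", "Com3", 4, 30, 120),   # assumed row
])

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

total_rows = scalar("SELECT COUNT(*) FROM PRODUCT_MAST")
total_cost = scalar("SELECT SUM(COST) FROM PRODUCT_MAST")
cost_qty_gt3 = scalar("SELECT SUM(COST) FROM PRODUCT_MAST WHERE QTY > 3")
avg_cost = scalar("SELECT AVG(COST) FROM PRODUCT_MAST")
max_rate = scalar("SELECT MAX(RATE) FROM PRODUCT_MAST")
min_rate = scalar("SELECT MIN(RATE) FROM PRODUCT_MAST")

# Aggregates per group; HAVING filters the groups after aggregation.
big_companies = conn.execute("""
    SELECT COMPANY, COUNT(*) FROM PRODUCT_MAST
    GROUP BY COMPANY
    HAVING COUNT(*) > 2
    ORDER BY COMPANY
""").fetchall()

print(total_rows, total_cost, cost_qty_gt3, avg_cost, max_rate, min_rate)
print(big_companies)
```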
NULL Values:
In SQL, some records in a table may not have a value for every field. Such a field is said
to hold a NULL value.
NULL values can arise because the information was not available at the time of data entry.
SQL therefore supports a special value known as NULL, which is used to represent the values
of attributes that are unknown or do not apply to a tuple.
Importance of NULL Value
It is important to understand that a NULL value differs from a zero value.
A NULL value is used to represent a missing value, but it usually has one of three different
interpretations:
o Value unknown (the value exists but is not known)
o Value not available (the value exists but is purposely withheld)
o Attribute not applicable (undefined for this tuple)
It is often not possible to determine which of the meanings is intended. Hence, SQL does not
distinguish between the different meanings of NULL.
Principles of NULL values
o Setting a NULL value is appropriate when the actual value is unknown, or when a value is
not meaningful.
o A NULL value is not equivalent to a value of ZERO if the data type is a number and is not
equivalent to spaces if the data type is a character.
o A NULL value can be inserted into columns of any data type, unless the column is
declared NOT NULL.
o An arithmetic expression that contains a NULL value evaluates to NULL, and a comparison
with NULL evaluates to unknown rather than true or false.
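The principles above can be checked with a short sketch run through Python's sqlite3 module, where SQL NULL is returned as Python None. The staff table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (name TEXT, bonus INT);   -- hypothetical table
INSERT INTO staff VALUES ('Asha', 500), ('Ravi', 0), ('Meena', NULL);
""")

# Arithmetic with NULL yields NULL.
null_sum = conn.execute("SELECT 10 + NULL").fetchone()[0]

# '= NULL' is never true, so no row matches; IS NULL is the correct test.
eq_null = conn.execute("SELECT name FROM staff WHERE bonus = NULL").fetchall()
is_null = conn.execute("SELECT name FROM staff WHERE bonus IS NULL").fetchall()

# Zero is an ordinary value and is different from NULL.
zero_bonus = conn.execute("SELECT name FROM staff WHERE bonus = 0").fetchall()

# COUNT(*) counts every row; COUNT(bonus) and AVG(bonus) skip NULLs.
counts = conn.execute("SELECT COUNT(*), COUNT(bonus), AVG(bonus) FROM staff").fetchone()

print(null_sum, eq_null, is_null, zero_bonus, counts)
```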
Triggers
A trigger is a procedure which is automatically invoked by the DBMS in
response to specified changes to the database, and is typically specified by the
database administrator (DBA). A database with a set of associated triggers is
generally called an active database.
Parts of trigger
A trigger description contains three parts, which are as follows −
o Event − A change to the database which activates the trigger.
o Condition − A query or test that is run when the trigger is activated.
o Action − A procedure which is executed when the trigger is
activated and its condition is true.
Use of trigger
Triggers may be used for any of the following reasons −
o To implement a complex business rule that cannot be
implemented using integrity constraints.
o To audit a process, for example, to keep track of changes
made to a table.
o To perform an automatic action when another related
action takes place.
Types of triggers
The different types of triggers are explained below −
o Statement-level trigger − It is fired only once for a DML
statement, irrespective of the number of rows affected by the
statement. Statement-level triggers are the default type of
trigger.
o Before-triggers − At the time of defining a trigger we can
specify whether the trigger is to be fired before a command
like INSERT, DELETE, or UPDATE is executed or after the
command is executed. Before-triggers are typically used
to check the validity of data before the action is performed.
For instance, we can use a before-trigger to prevent deletion of
rows if deletion should not be allowed in a given case.
o After-triggers − It is fired after the triggering action is
completed. For example, if the trigger is associated with the
INSERT command then it is fired after the row is inserted into
the table.
o Row-level triggers − It is fired for each row that is affected
by a DML command. For example, if an UPDATE command
updates 150 rows then a row-level trigger is fired 150 times
whereas a statement-level trigger is fired only once.
Create database trigger
To create a database trigger, we use the CREATE TRIGGER command. The
details to be given at the time of creating a trigger are as follows −
o Name of the trigger.
o Table to be associated with.
o When the trigger is to be fired: before or after.
o Command that invokes the trigger: UPDATE, DELETE, or INSERT.
o Whether it is a row-level trigger or not.
o Condition to filter rows.
o PL/SQL block to be executed when the trigger is fired.
The syntax to create database trigger is as follows −
CREATE [OR REPLACE] TRIGGER triggername
{BEFORE|AFTER}
{DELETE|INSERT|UPDATE [OF COLUMNS]} ON table
[REFERENCING [OLD AS old] [NEW AS new]]
[FOR EACH ROW [WHEN (condition)]]
BEGIN
PL/SQL BLOCK
END;
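The event, condition and action parts can be seen in a small audit trigger. The sketch below uses SQLite through Python's sqlite3 module, so the trigger body is plain SQL rather than a PL/SQL block, and SQLite supports only row-level triggers. The emp and salary_audit tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (id INT PRIMARY KEY, salary INT);                       -- hypothetical tables
CREATE TABLE salary_audit (emp_id INT, old_salary INT, new_salary INT);

CREATE TRIGGER trg_salary_audit
AFTER UPDATE OF salary ON emp            -- event
FOR EACH ROW
WHEN NEW.salary <> OLD.salary            -- condition
BEGIN                                    -- action
    INSERT INTO salary_audit VALUES (OLD.id, OLD.salary, NEW.salary);
END;

INSERT INTO emp VALUES (1, 25000), (2, 35000), (3, 45000);
UPDATE emp SET salary = salary + 1000 WHERE salary > 25000;
""")

# The single UPDATE touched two rows, so the row-level trigger fired twice.
audit_rows = conn.execute("SELECT * FROM salary_audit ORDER BY emp_id").fetchall()
print(audit_rows)  # [(2, 35000, 36000), (3, 45000, 46000)]
```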
Data redundancy is sometimes introduced on purpose, for recovery or backup of data, faster
access to data, or easier updating. However, redundant data costs extra money, demands
higher storage capacity, and requires extra effort to keep all the copies up to date.
Unintentional duplication of data, on the other hand, can prevent the database from working
properly and can make it harder for the end user to access data. Identical copies occupy
space in the database unnecessarily, which leads to space constraints, one of the major
problems of redundancy.
We will understand these anomalies with the help of the following student table:
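1. Insertion Anomaly:
An insertion anomaly arises when some data cannot be inserted into the database unless
other data is also present.
Example: If you want to add the details of a student to the above table, you must know
the details of the department; otherwise, you will not be able to add them, because
student details are dependent on department details.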
2. Deletion Anomaly:
Deletion anomaly arises when you delete some data from the database, but some unrelated
data is also deleted; that is, there will be a loss of data due to deletion anomaly.
Example: If we want to delete the student detail, which has student_id 2, we will also lose the
unrelated data, i.e., department_id 102, from the above table.
3. Updating Anomaly:
An update anomaly arises when you update some data in the database, but the data is partially
updated, which causes data inconsistency.
Example: If we want to update the details of dept_head from Jaspreet Kaur to Ankit Goyal
for Dept_id 104, then we have to update it everywhere else; otherwise, the data will get
partially updated, which causes data inconsistency.
Database Normalization: We can reduce redundancy by normalizing the data. In this
method, a large table is divided into two or more smaller tables to remove redundancy.
Normalization removes insert, update, and delete anomalies.
Deleting Unused Data: Unused or obsolete copies of data add to the redundancy in the DBMS,
so it is good practice to remove unwanted data.
Master Data: The data administrator shares master data across multiple systems. Although
this does not remove data redundancy, it ensures that the redundant copies are updated
whenever the data is changed.
Conclusion:
You have read about data redundancy in database management systems. Data redundancy
refers to the repetition of the same data, which may be introduced intentionally or
accidentally.
You have studied the problems caused by data redundancy, such as delete anomaly, insert
anomaly, and update anomaly.
You have studied the advantages and disadvantages of data redundancy in DBMS.
You have studied some of the methods which reduce data redundancy in DBMS.
Relational Decomposition
o When a relation in the relational model is not in appropriate normal
form then the decomposition of a relation is required.
o In a database, it breaks the table into multiple tables.
o If the relation has no proper decomposition, then it may lead to
problems like loss of information.
o Decomposition is used to eliminate some of the problems of bad
design like anomalies, inconsistencies, and redundancy.
Types of Decomposition
Lossless Decomposition
Example:
EMPLOYEE_DEPARTMENT table:
EMPLOYEE table:
EMP_ID EMP_NAME EMP_AGE EMP_CITY
22 Denim 28 Mumbai
33 Alina 25 Delhi
46 Stephan 30 Bangalore
52 Katherine 36 Mumbai
60 Jack 40 Noida
DEPARTMENT table
DEPT_ID EMP_ID DEPT_NAME
827 22 Sales
438 33 Marketing
869 46 Finance
575 52 Production
678 60 Testing
Now, when these two relations are joined on the common column
"EMP_ID", then the resultant relation will look like:
Employee ⋈ Department
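Whether a decomposition is lossless can be checked by projecting a relation, joining the projections again, and comparing the result with the original. The sketch below does this in plain Python for the rows listed above; the combined relation is rebuilt by matching EMP_ID, and the column names other than EMP_ID are labels assumed for the sketch. It also shows a second, lossy split on EMP_CITY for contrast.

```python
# Combined relation: (EMP_ID, EMP_NAME, EMP_AGE, EMP_CITY, DEPT_ID, DEPT_NAME)
original = {
    (22, "Denim", 28, "Mumbai", 827, "Sales"),
    (33, "Alina", 25, "Delhi", 438, "Marketing"),
    (46, "Stephan", 30, "Bangalore", 869, "Finance"),
    (52, "Katherine", 36, "Mumbai", 575, "Production"),
    (60, "Jack", 40, "Noida", 678, "Testing"),
}

# Projections that share EMP_ID, as in the EMPLOYEE and DEPARTMENT tables.
employee = {(e, n, a, c) for (e, n, a, c, _, _) in original}
department = {(d, e, dn) for (e, _, _, _, d, dn) in original}

# Natural join on EMP_ID.
rejoined = {(e, n, a, c, d, dn)
            for (e, n, a, c) in employee
            for (d, e2, dn) in department if e == e2}
is_lossless = rejoined == original

# Contrast: sharing only EMP_CITY is lossy, because 'Mumbai' occurs twice
# and the join pairs each Mumbai employee with both Mumbai departments.
city_department = {(c, d, dn) for (_, _, _, c, d, dn) in original}
lossy_join = {(e, n, a, c, d, dn)
              for (e, n, a, c) in employee
              for (c2, d, dn) in city_department if c == c2}

print(is_lossless)                     # True
print(len(original), len(lossy_join))  # 5 7
```

The two extra tuples in the second join are spurious: they do not appear in the original relation, which is exactly the loss of information that a bad decomposition causes.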
Dependency Preserving
X → Y
For example:
Emp_Id → Emp_Name
Example:
ID → Name,
Name → DOB
Inference Rule (IR):
o The Armstrong's axioms are the basic inference rule.
o Armstrong's axioms are used to conclude functional dependencies on a
relational database.
o The inference rule is a type of assertion. It can apply to a set of
FD(functional dependency) to derive other FD.
o Using the inference rule, we can derive additional functional dependency
from the initial set.
1. Reflexive Rule (IR1)
In the reflexive rule, if Y is a subset of X, then X determines Y.
If X ⊇ Y then X → Y
Example:
X = {a, b, c, d, e}
Y = {a, b, c}
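2. Augmentation Rule (IR2)
In the augmentation rule, if X determines Y, then XZ determines YZ for any Z.
If X → Y then XZ → YZ
Example: for R(A, B, C, D), if A → B then AC → BC.
3. Transitive Rule (IR3)
In the transitive rule, if X determines Y and Y determines Z, then X determines Z.
If X → Y and Y → Z then X → Z
4. Union Rule (IR4)
In the union rule, if X determines Y and X determines Z, then X determines YZ.
If X → Y and X → Z then X → YZ
Proof:
1. X → Y (given)
2. X → Z (given)
3. X → XY (using IR2 on 1 by augmentation with X. Where XX = X)
4. XY → YZ (using IR2 on 2 by augmentation with Y)
5. X → YZ (using IR3 on 3 and 4)
5. Decomposition Rule (IR5)
In the decomposition rule, if X determines YZ, then X determines Y and X determines Z.
If X → YZ then X → Y and X → Z
Proof:
1. X → YZ (given)
2. YZ → Y (using IR1)
3. X → Y (using IR3 on 1 and 2)
4. YZ → Z (using IR1)
5. X → Z (using IR3 on 1 and 4)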
6. Pseudo transitive Rule (IR6)
In Pseudo transitive Rule, if X determines Y and YZ determines W, then XZ
determines W.
If X → Y and YZ → W then XZ → W
Proof:
1. X → Y (given)
2. YZ → W (given)
3. XZ → YZ (using IR2 on 1 by augmenting with Z)
4. XZ → W (using IR3 on 3 and 2)
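In practice, whether a functional dependency follows from a given set is usually checked by computing an attribute closure, which applies the inference rules repeatedly. The sketch below is one simple way to do this in Python; the function names closure and implies are chosen for this example, and each attribute is written as a single letter.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a string of single-letter attributes.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we may add rhs (transitivity).
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from fds."""
    return set(rhs) <= closure(lhs, fds)

# Pseudo transitive rule: X -> Y and YZ -> W give XZ -> W.
pseudo_transitive = implies([("X", "Y"), ("YZ", "W")], "XZ", "W")

# Union rule: X -> Y and X -> Z give X -> YZ.
union_rule = implies([("X", "Y"), ("X", "Z")], "X", "YZ")

# Not derivable: X alone does not determine W.
not_implied = implies([("X", "Y"), ("YZ", "W")], "X", "W")

print(pseudo_transitive, union_rule, not_implied)  # True True False
```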
Normalization
A large database defined as a single relation may result in data
duplication. This repetition of data may result in:
What is Normalization?
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy from a relation or set of
relations. It is also used to eliminate undesirable characteristics like
Insertion, Update, and Deletion Anomalies.
o Normalization divides the larger table into smaller and links them using
relationships.
o The normal form is used to reduce redundancy from the database table.
Normal Form   Description
5NF           A relation is in 5NF if it is in 4NF, does not contain any join
              dependency, and joining is lossless.
Advantages of Normalization
o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.
Disadvantages of Normalization
o You cannot start building the database before knowing what the user
needs.
o The performance degrades when normalizing the relations to higher
normal forms, i.e., 4NF, 5NF.
o It is very time-consuming and difficult to normalize relations of a higher
degree.
o Careless decomposition may lead to a bad database design, leading to
serious problems.
First Normal Form (1NF)
o A relation is in 1NF if every attribute contains only atomic values.
o An attribute of a table cannot hold multiple values. It must
hold only a single value.
o First normal form disallows multi-valued attributes, composite attributes,
and their combinations.
EMPLOYEE table:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385, 9064738238 UP
The decomposition of the EMPLOYEE table into 1NF is shown
below:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385 UP
14 John 9064738238 UP
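Example: Assume a school stores the data of teachers and the
subjects they teach. In a school, a teacher can teach more than one
subject.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
In this table, the non-prime attribute TEACHER_AGE depends on TEACHER_ID alone,
which is a proper subset of the candidate key {TEACHER_ID, SUBJECT}. This
partial dependency violates 2NF.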
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
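The partial dependency in the TEACHER table can be checked with a query. In the sketch below (run through Python's sqlite3 module) the helper fd_violations is a name made up for this example: it returns the left-hand values that are paired with more than one right-hand value, so an empty list means the functional dependency holds in the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TEACHER (TEACHER_ID INT, SUBJECT TEXT, TEACHER_AGE INT)")
conn.executemany("INSERT INTO TEACHER VALUES (?, ?, ?)", [
    (25, "Chemistry", 30), (25, "Biology", 30), (47, "English", 35),
    (83, "Math", 38), (83, "Computer", 38),
])

def fd_violations(lhs, rhs):
    """Values of `lhs` that map to more than one `rhs` value (empty if lhs -> rhs holds)."""
    sql = (f"SELECT {lhs} FROM TEACHER GROUP BY {lhs} "
           f"HAVING COUNT(DISTINCT {rhs}) > 1 ORDER BY {lhs}")
    return [row[0] for row in conn.execute(sql)]

# TEACHER_ID -> TEACHER_AGE holds, so age depends on only part of the key.
age_violations = fd_violations("TEACHER_ID", "TEACHER_AGE")

# TEACHER_ID -> SUBJECT does not hold, so SUBJECT must stay in the key.
subject_violations = fd_violations("TEACHER_ID", "SUBJECT")

# Projecting out the dependent attribute gives the TEACHER_DETAIL table.
teacher_detail = conn.execute(
    "SELECT DISTINCT TEACHER_ID, TEACHER_AGE FROM TEACHER ORDER BY TEACHER_ID").fetchall()

print(age_violations, subject_violations, teacher_detail)
```

A check like this only shows whether the current rows satisfy a dependency; the dependency itself is a rule about the data that the designer has to state.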
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate
key.
Example:
EMPLOYEE_DETAIL table:
EMPLOYEE table:
EMPLOYEE_ZIP table:
EMP_ZIP EMP_STATE EMP_CITY
201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
EMPLOYEE table:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a
key.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
264 India
EMP_DEPT table:
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
Functional dependencies:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
Now, this is in BCNF because the left-hand side of both functional
dependencies is a key.
Example
STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
The given STUDENT table is in 3NF, but COURSE and HOBBY are two
independent attributes. Hence, there is no relationship between COURSE and
HOBBY.
So to make the above table into 4NF, we can decompose it into two
tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
Fifth normal form (5NF)
o A relation is in 5NF if it is in 4NF, does not contain any join dependency, and
joining is lossless.
o 5NF is satisfied when all the tables are broken into as many tables as
possible in order to avoid redundancy.
o 5NF is also known as Project-join normal form (PJ/NF).
Example
In the above table, John takes both Computer and Math class for Semester
1 but he doesn't take Math class for Semester 2. In this case, the combination
of all these fields is required to identify valid data.
So to make the above table into 5NF, we can decompose it into three
relations P1, P2 & P3:
P1
SEMESTER SUBJECT
Semester 1 Computer
Semester 1 Math
Semester 1 Chemistry
Semester 2 Math
P2
SUBJECT LECTURER
Computer Anshika
Computer John
Math John
Math Akash
Chemistry Praveen
P3
SEMESTER LECTURER
Semester 1 Anshika
Semester 1 John
Semester 2 Akash
Semester 1 Praveen