DBMS Unit 3
Database System Concepts - 5th Edition 11.4 ©Silberschatz, Korth and Sudarshan
• File – A file is a named collection of related information that is
recorded on secondary storage such as magnetic disks, magnetic
tapes and optical disks.
Types of file organization
Index structure:
An index can be created on one or more columns of a database table.
• The first column of the index is the search key, which contains
a copy of the primary key or candidate key of the table. The
key values are stored in sorted order so that the
corresponding data can be accessed easily.
• The second column of the index is the data reference. It
contains a set of pointers holding the address of the disk block
where the record with that particular key value can be found.
Ordered indices
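How do we find the data record using the sparse index?
For example, to find the record with search key 40:
1. Look up the largest search key ≤ 40 in the sparse index file.
2. Follow the database address to the data block (or to the next-level index).
3. Search for the key in the data block.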
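The three lookup steps above can be sketched in Python. This is a minimal illustration, not a real storage engine: the blocks, the record values and the names `data_blocks`, `sparse_index` and `sparse_lookup` are all hypothetical, and the index is assumed to hold one entry per block (the first key of that block).

```python
from bisect import bisect_right

# Hypothetical data file: each block holds records sorted by search key.
data_blocks = [
    [(10, "rec10"), (20, "rec20"), (30, "rec30")],   # block 0
    [(40, "rec40"), (50, "rec50"), (60, "rec60")],   # block 1
    [(70, "rec70"), (80, "rec80"), (90, "rec90")],   # block 2
]

# Assumed sparse index: one (first key in block, block number) entry per block.
sparse_index = [(block[0][0], i) for i, block in enumerate(data_blocks)]
index_keys = [key for key, _ in sparse_index]


def sparse_lookup(search_key):
    """Return the record for search_key, or None if it is absent."""
    # Step 1: find the largest index entry whose key is <= search_key.
    pos = bisect_right(index_keys, search_key) - 1
    if pos < 0:
        return None  # search_key is smaller than every indexed key
    # Step 2: follow the pointer to the data block.
    _, block_no = sparse_index[pos]
    # Step 3: search for the key inside that block.
    for key, record in data_blocks[block_no]:
        if key == search_key:
            return record
    return None


print(sparse_lookup(40))
print(sparse_lookup(50))
```

Key 50 has no index entry of its own; it is found because the entry for 40 points to the block that contains it.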
B - Tree
B-Tree is a self-balancing search tree in which a node
can contain multiple keys and can have more than two
children.
B-Tree of Order m has the following properties...
Property #1 - All leaf nodes must be at same level.
Property #2 - All nodes except root must have at least ⌈m/2⌉-1 keys
and maximum of m-1 keys.
Property #3 - All non leaf nodes except root (i.e. all internal nodes)
must have at least ⌈m/2⌉ children.
Property #4 - If the root node is a non leaf node, then it must
have at least 2 children.
Property #5 - A non leaf node with n-1 keys must have n number of
children.
Property #6 - All the key values in a node must be in Ascending
Order.
Operations on a B-Tree
The following operations are performed on a B-Tree...
1.Search
2.Insertion
3.Deletion
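As an illustration of the search operation, here is a minimal Python sketch. The `BTreeNode` class, the `btree_search` function and the hand-built tree of order 3 are assumptions made for this example; insertion and deletion (which need node splitting and merging) are not shown.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                 # keys in ascending order (Property #6)
        self.children = children or []   # empty list for a leaf node


def btree_search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while node is not None:
        i = bisect_left(node.keys, key)  # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:            # reached a leaf without finding key
            return False
        node = node.children[i]          # descend into the i-th child
    return False


# Hypothetical B-Tree of order m = 3: every node has 1 or 2 keys,
# the root has 2 keys and therefore 3 children (Property #5),
# and all leaves are at the same level (Property #1).
root = BTreeNode(
    [20, 40],
    [BTreeNode([5, 10]), BTreeNode([30]), BTreeNode([50, 60])],
)

print(btree_search(root, 30))
print(btree_search(root, 35))
```

Each loop iteration moves one level down, so the number of nodes visited is bounded by the height of the tree.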
B+ Tree
• B+ Tree is an extension of B Tree which allows efficient
insertion, deletion and search operations.
Structure of B+ Tree
•In the B+ tree, every leaf node is at equal distance from the root
node.
•The B+ tree is of the order n where n is fixed for every B+ tree.
Database Catalog
Data Statistics about Data
Translating SQL Queries into Relational
Algebra
SELECT LNAME, FNAME
FROM   EMPLOYEE
WHERE  SALARY > ( SELECT MAX (SALARY)
                  FROM   EMPLOYEE
                  WHERE  DNO = 5 );
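One common way to translate such a query is to split it into query blocks: the inner block computes a single constant (the maximum salary in department 5), and the outer block is a selection followed by a projection that uses that constant. The Python sketch below mimics the two blocks on a hypothetical EMPLOYEE table; the rows and the variable names `employee`, `c` and `result` are made up for illustration.

```python
# Hypothetical EMPLOYEE rows (not taken from the slides).
employee = [
    {"LNAME": "Shah", "FNAME": "Raj", "SALARY": 30000, "DNO": 5},
    {"LNAME": "Patel", "FNAME": "Meet", "SALARY": 40000, "DNO": 5},
    {"LNAME": "Mehta", "FNAME": "Harsh", "SALARY": 45000, "DNO": 4},
    {"LNAME": "Joshi", "FNAME": "Punit", "SALARY": 25000, "DNO": 1},
]

# Inner block: MAX(SALARY) over the selection DNO = 5, giving a constant c.
c = max(row["SALARY"] for row in employee if row["DNO"] == 5)

# Outer block: projection on (LNAME, FNAME) of the selection SALARY > c.
result = [(row["LNAME"], row["FNAME"]) for row in employee if row["SALARY"] > c]

print(c)
print(result)
```

The inner block does not depend on the outer one, so it only needs to be evaluated once.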
σ Branch='CE' (Student)
RollNo  Name   Branch  SPI
101     Raj    CE      8
104     Punit  CE      9
Search algorithm for selection operation
1. Linear search (A1)
2. Binary search (A2)
Linear search (A1)
It scans each block and tests all records to see whether
they satisfy the selection condition.
• Cost of linear search (worst case) = br
br denotes number of blocks containing records from relation r
If the selection condition is on a (primary) key attribute,
then the system can stop searching as soon as the required record is found.
• Cost of linear search (average case) = (br /2)
Linear search can be applied regardless of
• selection condition or
• ordering of records in the file (relation)
This algorithm is generally slower than the binary search algorithm.
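The block-level cost of linear search can be illustrated with a short Python sketch. The relation, its split into four blocks and the names `linear_search` and `r_blocks` are hypothetical; the function simply counts how many blocks it reads.

```python
def linear_search(blocks, predicate, key_attribute=False):
    """Scan blocks in order; return (matching records, blocks read)."""
    matches, blocks_read = [], 0
    for block in blocks:
        blocks_read += 1                  # one block transfer
        for record in block:
            if predicate(record):
                matches.append(record)
                if key_attribute:         # a key matches at most one record
                    return matches, blocks_read
    return matches, blocks_read


# Hypothetical relation r stored in br = 4 blocks of (RollNo, Branch) records.
r_blocks = [
    [(101, "CE"), (102, "ME")],
    [(103, "EC"), (104, "CE")],
    [(105, "ME"), (106, "EC")],
    [(107, "CE"), (108, "ME")],
]

# Selection on a non-key attribute: every block must be read.
print(linear_search(r_blocks, lambda rec: rec[1] == "CE"))

# Selection on the key attribute: the scan stops at the first match.
print(linear_search(r_blocks, lambda rec: rec[0] == 104, key_attribute=True))
```

The first call reads all 4 blocks, while the second stops after 2 blocks because RollNo is treated as a key.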
Binary search (A2)
Generally, this algorithm is used if the selection is an equality
comparison on the (primary) key attribute and the file (relation) is
ordered (sorted) on that attribute.
• Cost of binary search = ⌈log2(br)⌉
• br denotes number of blocks containing records from relation r
If the selection is on a non (primary) key attribute, then multiple
blocks may contain the required records, and the cost of scanning
such blocks needs to be added to the cost estimate.
This algorithm is faster than linear search algorithm.
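A minimal Python sketch of binary search over sorted blocks is shown below. The data and the names `binary_search_blocks` and `sorted_blocks` are hypothetical; the function counts the blocks it reads so the count can be compared with the ⌈log2(br)⌉ estimate.

```python
from math import ceil, log2


def binary_search_blocks(blocks, key):
    """Blocks hold records sorted on the key; return (record or None, blocks read)."""
    lo, hi, blocks_read = 0, len(blocks) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        block = blocks[mid]
        blocks_read += 1                  # one block transfer
        if key < block[0][0]:             # key lies before this block
            hi = mid - 1
        elif key > block[-1][0]:          # key lies after this block
            lo = mid + 1
        else:                             # key can only be in this block
            for record in block:
                if record[0] == key:
                    return record, blocks_read
            return None, blocks_read
    return None, blocks_read


# Hypothetical relation sorted on RollNo and stored in br = 4 blocks.
sorted_blocks = [
    [(101, "Raj"), (102, "Meet")],
    [(103, "Harsh"), (104, "Punit")],
    [(105, "Amit"), (106, "Jay")],
    [(107, "Neha"), (108, "Riya")],
]

estimate = ceil(log2(len(sorted_blocks)))
print(estimate)
print(binary_search_blocks(sorted_blocks, 104))
print(binary_search_blocks(sorted_blocks, 108))
```

The estimate is an approximation: the exact worst case of binary search over n blocks is ⌊log2(n)⌋ + 1 reads, so the lookup of 108 here reads 3 blocks while the estimate is 2.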
Evaluation of expressions
An expression may contain more than one operation, and
evaluating it becomes harder when several operations are combined.
Π Cust_Name (σ Balance<2500 (account) ⋈ customer)
To evaluate such an expression, we need to evaluate each operation
one by one in an appropriate order.
Evaluation tree (executed bottom to top):
3. Π Cust_Name
2. ⋈ with customer
1. σ Balance<2500 (account)
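The bottom-to-top order can be traced with a small Python sketch. The `account` and `customer` relations below are hypothetical (the names and balances echo the sample data used elsewhere in these slides, but the split into two relations joined on `Ano` is an assumption).

```python
# Hypothetical relations, joined on the shared attribute Ano.
account = [
    {"Ano": 1, "Balance": 3000},
    {"Ano": 2, "Balance": 1000},
    {"Ano": 3, "Balance": 2000},
    {"Ano": 4, "Balance": 4000},
]
customer = [
    {"Cust_Name": "Raj", "Ano": 1},
    {"Cust_Name": "Meet", "Ano": 2},
    {"Cust_Name": "Harsh", "Ano": 3},
    {"Cust_Name": "Punit", "Ano": 4},
]

# Step 1: selection on account (Balance < 2500).
step1 = [a for a in account if a["Balance"] < 2500]

# Step 2: natural join of the selected accounts with customer on Ano.
step2 = [{**a, **c} for a in step1 for c in customer if a["Ano"] == c["Ano"]]

# Step 3: projection on Cust_Name (names are unique here, so no duplicates arise).
step3 = [row["Cust_Name"] for row in step2]

print(step1)
print(step2)
print(step3)
```

Each step consumes the result of the step below it in the tree, which is why the selection has to be evaluated first.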
Query optimization
It is a process of selecting the most efficient query evaluation
plan from the available possible plans.
Plan 1: Π Cust_Name (σ Balance<2500 (Account ⋈ Customer))
Plan 2 (efficient plan): Π Cust_Name (σ Balance<2500 (Account) ⋈ Customer)
[Figure: operator trees for the two plans, with intermediate results labelled "2 records" and "4 records".]
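To see why applying the selection before the join can be cheaper, the following Python sketch evaluates both plans on hypothetical `account` and `customer` relations and compares the size of the intermediate results. The data, the `join` helper and the variable names are assumptions for illustration only.

```python
# Hypothetical relations, joined on the shared attribute Ano.
account = [
    {"Ano": 1, "Balance": 3000},
    {"Ano": 2, "Balance": 1000},
    {"Ano": 3, "Balance": 2000},
    {"Ano": 4, "Balance": 4000},
]
customer = [
    {"Cust_Name": "Raj", "Ano": 1},
    {"Cust_Name": "Meet", "Ano": 2},
    {"Cust_Name": "Harsh", "Ano": 3},
    {"Cust_Name": "Punit", "Ano": 4},
]


def join(left, right):
    """Natural join on Ano (nested-loop, for illustration)."""
    return [{**l, **r} for l in left for r in right if l["Ano"] == r["Ano"]]


# Join first, select afterwards: the join produces every matching pair.
joined = join(account, customer)
plan_join_first = {row["Cust_Name"] for row in joined if row["Balance"] < 2500}

# Select first, join afterwards: only the selected accounts are joined.
selected = [a for a in account if a["Balance"] < 2500]
plan_select_first = {row["Cust_Name"] for row in join(selected, customer)}

print(len(joined), len(selected))
print(plan_join_first == plan_select_first)
```

Both plans return the same names, but the select-first plan feeds 2 rows into the join instead of 4.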
Transformation of relational expressions
1. Combined selection operation can be divided into sequence of
individual selections. This transformation is called cascade of σ.
σθ1∧θ2 (E) = σθ1(σθ2 (E))

Customer
Cid  Ano  Cust_name  Balance
C01  1    Raj        3000
C02  2    Meet       1000
C03  3    Harsh      2000
C04  4    Punit      4000

Output
Cid  Ano  Cust_name  Balance
C02  2    Meet       1000
2. Selection operations are commutative.
Customer
Cid  Ano  Cust_name  Balance
C01  1    Raj        3000
C02  2    Meet       1000
C03  3    Harsh      2000
C04  4    Punit      4000

Output
Cid  Ano  Cust_name  Balance
C02  2    Meet       1000

σ Ano<3 (σθ (Customer)) = σθ (σ Ano<3 (Customer)), where θ is a condition on Balance.
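Rules 1 and 2 can be checked on the Customer table with a few lines of Python. The condition `Balance < 2500` is an assumed threshold chosen so that the result matches the Output table; the `select` helper and the names `theta1` and `theta2` are hypothetical.

```python
customer = [
    {"Cid": "C01", "Ano": 1, "Cust_name": "Raj", "Balance": 3000},
    {"Cid": "C02", "Ano": 2, "Cust_name": "Meet", "Balance": 1000},
    {"Cid": "C03", "Ano": 3, "Cust_name": "Harsh", "Balance": 2000},
    {"Cid": "C04", "Ano": 4, "Cust_name": "Punit", "Balance": 4000},
]


def select(predicate, relation):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def theta1(row):
    return row["Ano"] < 3


def theta2(row):
    return row["Balance"] < 2500   # assumed threshold, not from the slides


# Rule 1 (cascade): one combined selection vs. two nested selections.
combined = select(lambda row: theta1(row) and theta2(row), customer)
cascaded = select(theta1, select(theta2, customer))

# Rule 2 (commutativity): the two selections applied in the other order.
swapped = select(theta2, select(theta1, customer))

print(combined)
print(combined == cascaded == swapped)
```

All three forms return only the row for C02, so an optimizer is free to pick whichever order filters out the most rows first.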
Selection can be combined with Cartesian product and theta join.
σθ (E1 × E2) = E1 ⋈θ E2
σθ1 (E1 ⋈θ2 E2) = E1 ⋈θ1∧θ2 E2
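The two identities for combining a selection with a Cartesian product or a theta join can be verified on small hypothetical relations. In this Python sketch the relations `e1` and `e2`, the helpers `cartesian` and `theta_join`, and the conditions `theta` and `theta1` are all made up for illustration.

```python
from itertools import product

# Hypothetical relations with disjoint attribute names.
e1 = [{"Cid": "C01", "Ano": 1}, {"Cid": "C02", "Ano": 2}]
e2 = [
    {"Acc": 1, "Balance": 3000},
    {"Acc": 2, "Balance": 1000},
    {"Acc": 3, "Balance": 2000},
]


def cartesian(r, s):
    """Cartesian product: every pair of rows, merged into one row."""
    return [{**a, **b} for a, b in product(r, s)]


def theta_join(r, s, theta):
    """Theta join: pairs of rows whose combination satisfies theta."""
    return [{**a, **b} for a in r for b in s if theta({**a, **b})]


def theta(row):
    return row["Ano"] == row["Acc"]


def theta1(row):
    return row["Balance"] < 2500


# Selection over a Cartesian product equals a theta join.
lhs = [row for row in cartesian(e1, e2) if theta(row)]
rhs = theta_join(e1, e2, theta)

# Selection over a theta join equals a theta join on the combined condition.
lhs2 = [row for row in theta_join(e1, e2, theta) if theta1(row)]
rhs2 = theta_join(e1, e2, lambda row: theta1(row) and theta(row))

print(len(cartesian(e1, e2)), len(rhs))
print(lhs == rhs, lhs2 == rhs2)
```

The product builds 6 rows before filtering, whereas the join form lets an implementation test the condition while pairing rows.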
5. Theta join operations are commutative.
E1 ⋈θ E2 = E2 ⋈θ E1
6. Natural join operations are associative.
(E1 ⋈ E2) ⋈ E3 = E1 ⋈ (E2 ⋈ E3)
7. Selection operation distributes over theta join operation
under the following condition
• When all the attributes in the selection condition θ0 involve only
the attributes of one of the expressions (say E1) being joined.
σθ0 (E1 ⋈θ E2) = (σθ0 (E1)) ⋈θ E2
8. Set operations union and intersection are commutative.
Customer ∪ Employee = Employee ∪ Customer
Customer ∩ Employee = Employee ∩ Customer
E1 ∪ E2 = E2 ∪ E1
E1 ∩ E2 = E2 ∩ E1
9. Set operations union and intersection are associative.
Customer     Employee    Student
Cust_name    Emp_name    Emp_name
Raj          Meet        Raj
Meet         Suresh      Meet

Union output (Name): Raj, Meet, Suresh
Intersect output (Name): Meet

(E1 ∪ E2) ∪ E3 = E1 ∪ (E2 ∪ E3)
(E1 ∩ E2) ∩ E3 = E1 ∩ (E2 ∩ E3)
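The commutative and associative laws for union and intersection can be checked directly with Python sets. The three sets below use the names from the Customer, Employee and Student columns above; treating each single-column relation as a set of names is an assumption made for this sketch.

```python
# Single-column relations modelled as sets of names.
customer = {"Raj", "Meet"}
employee = {"Meet", "Suresh"}
student = {"Raj", "Meet"}

# Commutativity of union and intersection.
print(customer | employee == employee | customer)
print(customer & employee == employee & customer)

# Associativity of union and intersection.
union_left = (customer | employee) | student
union_right = customer | (employee | student)
inter_left = (customer & employee) & student
inter_right = customer & (employee & student)

print(sorted(union_left), union_left == union_right)
print(sorted(inter_left), inter_left == inter_right)
```

The grouping does not change the result, so an optimizer may combine the two smallest inputs first.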
10. Selection operation distributes over ∪, ∩ and –.
σθ (E1 – E2) = σθ(E1) – σθ(E2)
Similarly, the selection operation distributes over ∪ and ∩.
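A quick Python check of rule 10 on two hypothetical relations is given below. The rows of `e1` and `e2`, the `select` helper and the condition `theta` (balance below 2500) are assumptions for illustration.

```python
# Hypothetical union-compatible relations of (name, balance) tuples.
e1 = {("Raj", 3000), ("Meet", 1000), ("Harsh", 2000)}
e2 = {("Meet", 1000), ("Punit", 4000)}


def select(predicate, relation):
    """Selection over a relation modelled as a set of tuples."""
    return {row for row in relation if predicate(row)}


def theta(row):
    return row[1] < 2500


# Selection distributes over set difference, union and intersection.
diff_ok = select(theta, e1 - e2) == select(theta, e1) - select(theta, e2)
union_ok = select(theta, e1 | e2) == select(theta, e1) | select(theta, e2)
inter_ok = select(theta, e1 & e2) == select(theta, e1) & select(theta, e2)

print(select(theta, e1 - e2))
print(diff_ok, union_ok, inter_ok)
```

Applying the selection to each input first shrinks the sets before the set operation is evaluated.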