DBMS Mod 2
Syllabus
Introduction to Database Systems, Advantages of Database Systems over Traditional
File Systems, Basic Concepts & Definitions, Database Users, Database Languages,
Database System Architecture, Schemas, Subschemas & Instances, Database
Constraints, 3-Level Database Architecture, Data Abstraction, Data Independence,
Mappings, Structure, Components & Functions of DBMS, Data Models.
Key Components:
DBMS Module 2 1
1. Entity:
2. Relationship:
3. Attribute:
4. Key Attribute:
5. Multivalued Attribute:
6. Derived Attribute:
7. Weak Entity:
An entity that does not have a key attribute of its own and relies on the
parent entity for identification.
8. Cardinality:
Describes the number of instances of one entity that can be related to the
number of instances in another entity.
9. Participation Constraints:
Specify whether the existence of an entity depends on its relationship with
another entity.
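The components listed above can be sketched in code. This is only an illustration under assumed names: the Employee and Dependent classes, their fields, and the sample values are hypothetical and do not come from the notes.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Employee:                                       # entity
    emp_id: int                                       # key attribute
    name: str                                         # simple attribute
    birth_date: date                                  # stored attribute
    phones: list[str] = field(default_factory=list)   # multivalued attribute

    def age(self, today: date) -> int:                # derived attribute
        # Derived from birth_date instead of being stored.
        had_birthday = (today.month, today.day) >= (
            self.birth_date.month,
            self.birth_date.day,
        )
        return today.year - self.birth_date.year - (0 if had_birthday else 1)


@dataclass
class Dependent:                                      # weak entity
    owner: Employee      # parent entity; required, so participation is total
    name: str            # partial key, unique only for a given owner

    def identity(self) -> tuple[int, str]:
        # A weak entity is identified by the parent's key plus its partial key.
        return (self.owner.emp_id, self.name)


alice = Employee(1, "Alice", date(1990, 6, 15), ["555-0100", "555-0101"])
kid = Dependent(alice, "Sam")   # one employee may own many dependents (1:N)
```

Here Dependent cannot be created without an Employee, which mirrors a total participation constraint, and one Employee may be referenced by many Dependent objects, which mirrors a one-to-many cardinality.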
Relational Algebra:
Definition:
Key Operations:
1. Selection (σ):
2. Projection (π):
3. Union (∪):
Combines tuples from two relations, removing duplicates.
Example: R ∪ S
4. Intersection (∩):
Example: R ∩ S
5. Difference (-):
Retrieves tuples from the first relation that are not in the second.
Example: R - S
6. Cartesian Product (×):
Example: R × S
7. Join (⨝):
Example: R ⨝_{R.A=S.A} S
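The operations above can be tried out on small in-memory relations. The following is a minimal sketch, assuming a relation is a Python set of rows and each row is a frozenset of (attribute, value) pairs. The helper names (row, select, project, product, join) and the sample data are illustrative assumptions, not part of any library.

```python
def row(attrs):
    """Build a hashable row from a dict of attribute -> value."""
    return frozenset(attrs.items())


def select(rel, pred):
    """Selection (σ): keep the rows that satisfy the predicate."""
    return {t for t in rel if pred(dict(t))}


def project(rel, attrs):
    """Projection (π): keep only the named attributes; the set drops duplicates."""
    return {frozenset((a, dict(t)[a]) for a in attrs) for t in rel}


def product(r, s):
    """Cartesian product (×): assumes the attribute names of r and s are disjoint."""
    return {t | u for t in r for u in s}


def join(r, s, pred):
    """Theta join (⨝): a product followed by a selection."""
    return select(product(r, s), pred)


# Union, intersection and difference need union-compatible relations,
# so R1 and R2 share the same single attribute A.
R1 = {row({"A": 1}), row({"A": 2})}
R2 = {row({"A": 2}), row({"A": 3})}
union = R1 | R2          # R1 ∪ R2
common = R1 & R2         # R1 ∩ R2
only_r1 = R1 - R2        # R1 - R2

# Qualified attribute names keep the product and join unambiguous.
R = {row({"R.A": 1, "R.B": "x"}), row({"R.A": 2, "R.B": "y"})}
S = {row({"S.A": 2, "S.C": "p"}), row({"S.A": 3, "S.C": "q"})}
joined = join(R, S, lambda t: t["R.A"] == t["S.A"])   # R ⨝_{R.A=S.A} S
```

Using sets means duplicate elimination happens automatically, which matches the set semantics of relational algebra rather than the bag semantics of SQL.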
Tuple Relational Calculus:
Syntax:
{t | P(t)}
t: Tuple variable
Example:
Domain Relational Calculus:
Syntax:
Example:
{ Name | ∃ EmpID, Age, Salary (Employee(EmpID, Name, Age, Salary) ∧ Age > 25 ∧
Salary > 50000) }
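A calculus expression states what the result must satisfy, not how to compute it. A Python set comprehension reads almost the same way, so the Employee query above can be sketched as follows. The sample rows and the field names are made up for illustration.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    emp_id: int
    name: str
    age: int
    salary: int


# Hypothetical contents of the Employee relation.
employee = {
    Employee(1, "Asha", 31, 62000),
    Employee(2, "Ben", 24, 58000),
    Employee(3, "Chen", 45, 48000),
    Employee(4, "Dina", 29, 75000),
}

# Domain-calculus style: the result contains only Name values.
names = {e.name for e in employee if e.age > 25 and e.salary > 50000}

# Tuple-calculus style {t | P(t)}: the result contains whole tuples t.
tuples = {t for t in employee if t.age > 25 and t.salary > 50000}
```

In both comprehensions the condition after "if" plays the role of the predicate P, and the expression before "for" decides whether whole tuples or single attribute values are returned.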
Key Differences:
1. Relational Algebra:
2. Relational Calculus:
Syntax: {t | ∃u, v, ... (P(t, u, v, ...))} with quantified variables and logical
conditions.
Query Rewriting:
Optimization Rules:
Plan Generation:
Generate different execution plans for the query based on the optimized
algebraic expressions.
Plan Selection:
Evaluate the candidate plans and select the one with the lowest estimated
cost.
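Plan generation and plan selection can be illustrated with two equivalent plans for the same query: filter after the join, or push the filter below the join. The sketch below is a toy, with made-up tables and a made-up cost measure (number of rows touched). It also runs both plans to count the work, whereas a real optimizer only estimates the cost before execution.

```python
# Hypothetical tables: emp(emp_id, dept_id) and dept(dept_id, dept_name).
emp = [(i, i % 4) for i in range(1000)]
dept = [(0, "Sales"), (1, "HR"), (2, "IT"), (3, "Ops")]


def plan_join_then_filter():
    """Join every employee with every department, then keep the Sales rows."""
    work = 0
    joined = []
    for e in emp:
        for d in dept:
            work += 1                      # one comparison per pair
            if e[1] == d[0]:
                joined.append(e + d)
    work += len(joined)                    # the filter scans the join result
    result = [t for t in joined if t[3] == "Sales"]
    return result, work


def plan_filter_then_join():
    """Keep the Sales department first, then join (selection pushdown)."""
    work = len(dept)                       # the filter scans dept once
    sales = [d for d in dept if d[1] == "Sales"]
    result = []
    for e in emp:
        for d in sales:
            work += 1
            if e[1] == d[0]:
                result.append(e + d)
    return result, work


# Plan generation: the candidate plans. Plan selection: the cheapest one.
plans = {
    "join_then_filter": plan_join_then_filter,
    "filter_then_join": plan_filter_then_join,
}
outcomes = {name: plan() for name, plan in plans.items()}
best = min(outcomes, key=lambda name: outcomes[name][1])
```

Both plans return the same rows, so the rewrite is safe, but the pushed-down filter touches far fewer rows, which is why it wins the selection step.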
2. Query Optimization:
Cost-Based Optimization:
Evaluate the cost of executing different query plans based on factors like
access methods, join strategies, and index usage.
Statistics Collection:
Collect and maintain statistics about the database, such as the number of
tuples in a relation, to aid in cost estimation.
Index Selection:
Decide whether to use indexes for accessing data and, if so, which indexes
to use.
Join Ordering:
Determine the most efficient order in which to perform joins, considering the
cost associated with different join orders.
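A small sketch of join ordering: enumerate every left-deep order for three relations and keep the cheapest. The cardinalities, the pairwise selectivities, and the cost measure (sum of intermediate result sizes) are assumptions chosen for illustration. Real optimizers use richer cost models and avoid enumerating every order when there are many tables.

```python
from itertools import permutations

# Hypothetical row counts for three relations.
card = {"R": 1000, "S": 100, "T": 10}

# Hypothetical join selectivities per pair; 1.0 means no join condition
# links the pair, so joining them directly is a cross product.
sel = {
    frozenset("RS"): 0.01,
    frozenset("ST"): 0.1,
    frozenset("RT"): 1.0,
}


def plan_cost(order):
    """Cost of a left-deep plan = sum of the estimated intermediate sizes."""
    joined = [order[0]]
    size = card[order[0]]
    cost = 0.0
    for rel in order[1:]:
        factor = 1.0
        for prev in joined:
            factor *= sel[frozenset((prev, rel))]
        size = size * card[rel] * factor    # estimated rows after this join
        cost += size
        joined.append(rel)
    return cost


best_order = min(permutations(card), key=plan_cost)
```

With these numbers, orders that start by pairing R with T are the most expensive, because that pair has no join condition and produces a large intermediate result.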
Materialized Views:
Cost Metrics:
Assign costs to different operations in the query plan, such as the cost of
reading a tuple, performing a join, or sorting data.
Cardinality Estimation:
Estimate the number of tuples that will be processed at each stage of the
query execution.
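Cardinality estimation is usually driven by the collected statistics. The sketch below uses two common textbook estimates, both assuming values are uniformly distributed: an equality selection returns about rows / distinct values, and an equijoin returns about rows_R × rows_S / max(distinct_R, distinct_S). The table names and numbers in stats are hypothetical.

```python
# Hypothetical statistics, as a catalog might store them.
stats = {
    "employee": {"rows": 10_000, "distinct_dept_id": 50},
    "department": {"rows": 50, "distinct_dept_id": 50},
}


def estimate_equality_selection(n_rows: int, n_distinct: int) -> float:
    """Estimated rows for a condition A = constant, assuming uniform values."""
    return n_rows / n_distinct


def estimate_equijoin(n_r: int, v_r: int, n_s: int, v_s: int) -> float:
    """Estimated rows for R joined with S on R.A = S.A, assuming uniform values."""
    return n_r * n_s / max(v_r, v_s)


e, d = stats["employee"], stats["department"]
selected = estimate_equality_selection(e["rows"], e["distinct_dept_id"])
joined = estimate_equijoin(
    e["rows"], e["distinct_dept_id"], d["rows"], d["distinct_dept_id"]
)
```

These estimates feed the cost metrics: the estimated row count at each stage determines how many tuples the next operator is expected to read and process.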
I/O and CPU Costs:
Consider the I/O and CPU costs associated with accessing and processing
data, factoring in disk I/O, CPU processing time, and network costs.
Optimization Techniques:
Feedback Mechanism:
Example Scenario:
Consider a complex SQL query involving multiple joins and aggregations. The
system goes through the following steps:
2. Query Rewriting:
3. Optimization Rules:
4. Plan Generation:
5. Plan Selection:
Choose the plan with the lowest estimated cost by considering factors like
join order, index usage, and access methods.
6. Cost-Based Optimization:
Evaluate the cost of different plans based on statistics about the database,
such as the number of tuples and distribution of data.
7. Join Ordering:
Determine the most efficient order for joining tables by estimating the cost of
different join orders.
8. Cost Metrics:
Assign costs to various operations in the query plan, considering I/O costs,
CPU costs, and cardinality estimation.
9. Feedback Mechanism:
If the actual execution deviates significantly from the estimated cost, adjust
the optimization strategy dynamically.
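The feedback step can be sketched as a simple check that compares the estimated row count with the row count observed at run time. The function name and the threshold of 10 are assumptions for illustration. Actual systems differ in how they detect and react to a bad estimate.

```python
def needs_reoptimization(
    estimated_rows: float, actual_rows: float, threshold: float = 10.0
) -> bool:
    """Return True when estimate and reality differ by more than `threshold` times.

    Both values are clamped to at least 1 so an estimate of 0 rows
    does not cause a division by zero.
    """
    low, high = sorted((max(estimated_rows, 1.0), max(actual_rows, 1.0)))
    return high / low > threshold


# Estimated 200 rows but saw 5000: 25 times off, so the plan is worth revisiting.
replan = needs_reoptimization(200, 5000)

# Estimated 200 rows and saw 250: close enough, so the plan is kept.
keep = not needs_reoptimization(200, 250)
```

When the check fires, a system could refresh its statistics or choose a new plan for the remaining work. The notes do not say which reaction is used.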