DBMS Complete Syllabus
Ensures data safety and integrity, while offering accessibility and concurrency control.
Supports functions like data querying, reporting, and analytics for informed decision-making.
● Physical Level
● Logical Level
● View Level
View Level: This is the highest level of data abstraction. It displays only the portion of the
entire database that is of interest to a given user. It can represent multiple views of the
same data, allowing users to access information from the database through various applications.
Data independence
Data independence is defined as the capacity to change the schema at one level of a
database system without having to change the schema at the next higher level.
Physical data independence: a. Physical data independence is the ability to modify the
internal schema without changing the conceptual schema. b. Modification at the physical
level is occasionally necessary in order to improve performance. c. It refers to the immunity
of the conceptual schema to changes in the internal schema. d. Examples of physical data
independence are reorganizing files, adding a new access path, or modifying indexes.
Logical data independence: a. Logical data independence is the ability to modify the
conceptual schema without having to change the external schemas or application programs.
b. It refers to the immunity of the external model to changes in the conceptual model. c.
Examples of logical data independence are addition/removal of entities.
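A minimal sketch of logical data independence, assuming SQLite through Python's standard sqlite3 module. The table student, the view student_names, and the sample row are hypothetical names chosen for illustration. The view plays the role of an external schema, and adding a column stands in for a change to the conceptual schema.

```python
import sqlite3

# In-memory database; every name below is illustrative, not from the notes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

# The view acts as an external schema over the conceptual schema (the table).
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")


def read_names():
    """Application query written only against the external schema."""
    return conn.execute("SELECT name FROM student_names ORDER BY id").fetchall()


before = read_names()

# Change the conceptual schema: add an attribute to the underlying table.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

after = read_names()
print(before == after)  # the application query is unaffected by the change
```

The application only ever queries the view, so the added attribute does not force any change to it.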
Database Schema:
The database schema refers to the overall design of the database, illustrating the logical
structure and organization of data. It defines how data is organized and how relationships
between data are handled, essentially serving as the blueprint for how the database is
constructed.
Multimedia Database: Stores data types such as images, audio, and video files,
facilitating the management and retrieval of multimedia content. Example: a digital asset
management system like Adobe Experience Manager, which facilitates the storage and retrieval
of multimedia content.
Deductive Database: Applies logical rules to derive new information from the
data stored in a database, allowing for more complex and analytical queries. Example: a
database using Datalog (a query language), which allows for complex logical queries and
information derivation.
Temporal Database: Keeps track of changing data over time, allowing for queries
concerning time-based data. Example: a historical trading database in the financial sector,
which keeps track of stock prices over time.
Geographic Information System (GIS): Stores, organizes, and analyzes geographical data,
aiding in spatial analysis and mapping projects. Example: a system like ArcGIS, which enables
the storage, analysis, and visualization of geographical data.
DBA (Database Administrator)
Database administrators hold authority over data and the programs facilitating data access.
Their roles/functions are:
● Schema Definition: Achieved by writing definitions that the DDL compiler translates
into permanent labels in the data dictionary.
● Storage Structure and Access Method Definition: Achieved by writing definitions that
are translated by the data storage and definition language compiler.
● Granting of Authorization for Data Access: DBA grants varied types of data
access authorization to different database users.
● Integrity Constraint Specification: DBA implements and maintains integrity
constraints to ensure data accuracy and consistency.
1. Query Processor: The component of a DBMS that interprets and executes user
queries. It comprises several sub-components including:
1. DML Compiler: Processes Data Manipulation Language (DML) statements into low-level
instructions that can be executed.
2. DDL Interpreter: Processes Data Definition Language (DDL) statements into metadata
tables.
4. Query Optimizer: Determines the most efficient way to execute a query by evaluating
different query plans.
2. Storage Manager: Also known as the Database Control System, it is responsible for
managing the data stored in the database, ensuring its consistency and integrity. It includes
the following subcomponents:
4. File Manager: Manages file space and data structures representing information in the
database.
5. Buffer Manager: Manages data cache and data transfer between main memory and
secondary storage.
3. Disk Storage: Represents the storage aspect of a DBMS, encompassing the
following components:
1. Internal Level: Concerns the physical storage of data in databases, overseeing data
storage on hardware devices, and managing low-level aspects like data compression and
indexing.
2. Conceptual Level: Represents the logical layout of the database, detailing the schema
with tables and attributes and their interrelations. It's independent of specific DBMS
implementations, focusing on organizing and connecting data elements.
3. External Level: Embodies the user interface of the database, facilitating data access and
interaction through user-friendly views and interfaces tailored to various user groups.
ER Diagram
An entity is a thing or an object in the real world that is distinguishable from other objects
based on the values of the attributes it possesses.
• Tangible - Entities which physically exist in the real world. E.g. - Car, Pen, Locker
• Intangible - Entities which exist logically. E.g. - Account, Video
Entity Set - A collection of entities of the same type that share the same properties or attributes.
• In a relational model, attributes are represented by independent columns, e.g. Instructor (ID,
name, salary, dept_name)
Types of Attributes
Single valued - Attributes having a single value at any instance of time for an entity. E.g. -
Aadhaar no., dob.
Multivalued - Attributes which can have more than one value for an entity at same time. E.g.
In the relational model, a separate table is created for each multivalued attribute, taking the
multivalued attribute (mva) together with the pk of the main table, as an fk, in the new table.
Simple - Attributes which cannot be divided further into sub parts. E.g. Age
Composite - Attributes which can be further divided into sub parts, as simple attributes. A
composite attribute is represented by an ellipse connected to an ellipse, and in a relational
model by a separate column for each of its simple attributes.
Stored - Main attributes whose value is permanently stored in the database. E.g. date_of_birth
Derived - The value of these types of attributes can be derived from the values of other
attributes. E.g. - The Age attribute can be derived from date_of_birth and the current date.
An attribute takes a null value when an entity does not have a value for it. The null value
may indicate "not applicable" - that is, that the value does not exist for the entity.
Relationship / Association
● Name
● Degree
An E-R enterprise schema may define certain constraints to which the contents of a
database must conform.
Mapping cardinalities express the number of entities to which another entity can be associated
via a relationship set. Four possible categories are -
One to One (1:1) Relationship - An entity in A is associated with at most one entity in B,
and an entity in B is associated with at most one entity in A.
E.g. - The directed lines from relationship set advisor to both entity sets indicate that 'an
instructor may advise at most one student, and a student may have at most one advisor'.
One to Many (1:M) Relationship - An entity in A is associated with any number (zero or
more) of entities in B. An entity in B, however, can be associated with at most one entity in A.
E.g. - This indicates that an instructor may advise many students, but a student may have at
most one advisor.
Many to One (M:1) Relationship - An entity in A is associated with at most one entity in B.
An entity in B, however, can be associated with any number (zero or more) of entities in A.
E.g. - This indicates that a student may have many advisors, but an instructor may advise at
most one student.
Many to Many (M:N) Relationship - An entity in A is associated with any number (zero or
more) of entities in B, and an entity in B is associated with any number (zero or more) of
entities in A.
E.g. - This indicates that a student may have many advisors and an instructor may advise many
students.
• Partial participation
• Total Participation.
• An entity set is called a strong entity set if it has a primary key, so that all the tuples in
the set are distinguishable by that key.
• An entity set that does not possess sufficient attributes to form a primary key is called a
weak entity set.
It contains discriminator attributes (a partial key) which hold partial information about the
entity set, but this is not sufficient to identify each tuple uniquely.
• Entity Set
● Convert every strong and weak entity set into a separate table. A weak entity set is
made dependent on one strong entity set (its identifying or owner entity set).
• Relationship
● If unary: No separate table is required; add a new column as an fk which refers to the pk
of the same table.
● If 1:1: No separate table is required; take the pk of one side and put it as an fk on the
other side. Priority must be given to the side having total participation.
● If M:N: A separate table is required; take the pk of both tables and declare their
combination as the pk of the new table.
● (3 or more): Take the pk of all participating entity sets as fk and declare their
combination as the pk of the new table.
• Attributes
● Multivalued - A separate table must be taken for each multivalued attribute, where we
take the pk of the main table as an fk and declare the combination of the fk and the
multivalued attribute as the pk of the new table.
● Composite - A separate column must be taken for each simple attribute of the
composite attribute.
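The attribute rules above can be sketched with SQLite through Python's sqlite3 module. The schema is hypothetical: a student entity with a composite attribute name (first_name, last_name) and a multivalued attribute phone. None of these table or column names come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces fks only when enabled

conn.executescript("""
CREATE TABLE student (
    roll_no    INTEGER PRIMARY KEY,
    first_name TEXT,   -- composite attribute 'name' is stored as
    last_name  TEXT    -- one column per simple component
);
CREATE TABLE student_phone (
    roll_no INTEGER NOT NULL REFERENCES student(roll_no),
    phone   TEXT NOT NULL,
    PRIMARY KEY (roll_no, phone)  -- fk + multivalued attribute form the pk
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Ravi', 'Kumar')")
conn.executemany(
    "INSERT INTO student_phone VALUES (?, ?)",
    [(1, "111-1111"), (1, "222-2222")],
)

phone_count = conn.execute(
    "SELECT COUNT(*) FROM student_phone WHERE roll_no = 1"
).fetchone()[0]

# The composite pk rejects a repeated (roll_no, phone) pair.
try:
    conn.execute("INSERT INTO student_phone VALUES (1, '111-1111')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(phone_count, duplicate_rejected)
```

One student can hold any number of phone values, while the combined key keeps each value from being stored twice for the same student.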
Generalization
Leads to a simplified, structured data representation, aiding in database design and querying
processes.
Specialization
A process where a higher-level entity is broken down into more specific, lower-level entities.
Aggregation
A concept wherein relationships are abstracted to form higher-level entities, enabling a more
organized representation of complex relationships.
● Constructs used in the ER diagram can easily be transformed into relational tables.
BASICS OF RDBMS
By atomic we mean that each value in the domain is indivisible as far as the formal relational
model is concerned.
A common method of specifying a domain is to specify a data type from which the data
values forming the domain are drawn.
E.g. Names: The set of character strings that represent names of persons.
• Student.
Update Anomalies - Anomalies that cause redundant work to be done during insertion into
and modification of a relation, and that may cause accidental loss of information during a
deletion from a relation.
● Insertion Anomalies
● Modification Anomalies
● Deletion Anomalies
• Redundancy
• Inconsistent Dependency
Without normalization, a database system may be inaccurate, slow, and inefficient, and it
might not produce the data we expect. Normalization applies a series of normal form tests
to individual relation schemas so that the relational database can be normalized to any
desired degree.
● 1NF >> 2NF >> 3NF >> BCNF
Conclusion
Just as every paragraph must have a single idea, every table must have a single idea.
If a table contains more than one idea, it must be decomposed until each table contains
only one idea.
FUNCTIONAL DEPENDENCY
• In a relation R, if α ⊆ R and β ⊆ R, then an attribute or set of attributes α
functionally determines an attribute or set of attributes β (written α → β) iff each α
value is associated with precisely one β value.
For all pairs of tuples t₁ and t₂ in R:
• If t₁[α] = t₂[α]
• Then t₁[β] = t₂[β]
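The tuple-pair definition can be checked mechanically on a relation instance. The sketch below is illustrative only: the helper fd_holds and the sample rows are assumptions, not part of the notes. It records the right-hand-side value seen for each left-hand-side value and reports a violation as soon as two tuples agree on the left side but differ on the right.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in this relation instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False  # two tuples agree on lhs but differ on rhs
    return True


# Hypothetical relation instance R(A, B, C)
rows = [
    {"A": 1, "B": 10, "C": 100},
    {"A": 1, "B": 10, "C": 200},
    {"A": 2, "B": 20, "C": 100},
]

a_to_b = fd_holds(rows, ["A"], ["B"])        # every A value maps to one B value
a_to_c = fd_holds(rows, ["A"], ["C"])        # A = 1 maps to both 100 and 200
ac_to_b = fd_holds(rows, ["A", "C"], ["B"])  # composite left-hand side
print(a_to_b, a_to_c, ac_to_b)
```

A check like this can only refute a dependency on the given data. Whether a dependency holds in general is a property of the schema, not of one instance.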
A) A → B
B) BC → A
C) B → C
D) AC → B
Trivial Functional dependency - If β is a subset of α, then the functional
dependency α → β will always hold.
• DENOTED BY F+
ARMSTRONG'S AXIOMS
Augmentation: If X → Y, then XZ → YZ
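In practice the axioms are rarely applied one at a time. A common shortcut is the attribute closure X+, the set of all attributes determined by X. The sketch below is a standard fixed-point computation, not something taken from these notes. The function name closure and the string encoding of attribute sets (one character per attribute) are assumptions made for brevity.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)  # reflexivity: X determines every attribute in X
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already derived, the right side follows
            # (augmentation plus transitivity).
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# R(A, B, C) with A -> B and B -> C
fds = [("A", "B"), ("B", "C")]
print(sorted(closure("A", fds)))
print(sorted(closure("B", fds)))
```

A dependency X → Y follows from the set exactly when Y is contained in the closure of X, so this one routine is enough to test membership in F+.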
F1+ = F2+
Or
One or more than one attributes may be redundant on right hand side of
a production.
One or more than one attributes may be redundant on Left hand side of
a production.
Q. R(ABCD)
A→B
C→B
D→ ABC
AC → D
R(A,B,C)
A→B
B→C
A→C
AB → B
AB→ C
AC → B
KEY
Super key
Set of attributes using which we can identify each tuple uniquely is called
Super key.
A minimal set of attributes using which we can identify each tuple uniquely
is called a Candidate key. A super key is called a candidate key if no proper
subset of it is a super key. Also called a MINIMAL SUPER KEY.
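A brute-force sketch of finding candidate keys: enumerate attribute subsets from smallest to largest, keep those whose closure covers the whole relation, and skip any superset of a key already found. The helper names closure and candidate_keys are hypothetical. The dependency set is the R(ABCD) exercise listed earlier in these notes (A → B, C → B, D → ABC, AC → D). Enumeration is exponential, so this is only suitable for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal super keys, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) == set(attributes):
                keys.append(subset)
    return keys


fds = [("A", "B"), ("C", "B"), ("D", "ABC"), ("AC", "D")]
keys = candidate_keys("ABCD", fds)
print([sorted(k) for k in keys])
```

Every other subset whose closure is the full schema is a super key but not a candidate key, because it contains D or AC.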
Primary key
Candidate keys which are not chosen as the primary key are called alternate keys.
Foreign Keys
Primary Key: A table in 1NF should have a primary key that uniquely
identifies each record.
E.g. R(ABCD)
AB → CD
Here the candidate key is AB, so A and B are prime attributes, and C and D are
non-prime attributes.
PARTIAL DEPENDENCY - When a non-prime attribute is dependent
only on a part (proper subset) of a candidate key, it is called a partial
dependency. (PRIME → NON-PRIME)
eg. R(ABCD)
AB → D
A→C
• R should be in 1NF.
A→B
● R should be in 2NF
R (A,B,C)
A→B
B→C
BCNF (BOYCE CODD NORMAL FORM)
• α → β
R (A,B,C)
AB → C
C→B
S_Name Club_Name
Kamesh Dance
Kamesh Guitar
• {Restaurant} →→ {Variety}
• It is in BCNF
• Common attribute must be a key for at least one relation (R₁ or R₂)
• Att(R₁) ∩ Att(R₂) → Att(R₁) or Att(R₁) ∩ Att(R₂) → Att(R₂)
Q R (A, B, C, D)
A→BC
C→DE
DE→E
• It is in 4NF
X = PQRS
F = {QR → S, R → P, S→ Q}
Y = (PR) and Z = (QRS)
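The common-attribute condition stated above can be applied to this decomposition with an attribute closure. The sketch below is an illustration under assumptions: the helper names closure and is_lossless are hypothetical, and attribute sets are encoded as strings with one character per attribute. The test only covers a decomposition into two relations.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Binary decomposition test: common attributes must determine r1 or r2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined


# X = PQRS, F = {QR -> S, R -> P, S -> Q}
fds = [("QR", "S"), ("R", "P"), ("S", "Q")]

lossless_pr_qrs = is_lossless("PR", "QRS", fds)  # common attribute is R
lossless_pq_rs = is_lossless("PQ", "RS", fds)    # no common attribute
print(lossless_pr_qrs, lossless_pq_rs)
```

For Y = (PR) and Z = (QRS) the common attribute is R, and R → P makes R a key of Y, so the join is lossless. A split with no common attribute, such as (PQ) and (RS), fails the test.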
Indexing
• All the records in the file are ordered on some search key field.
• Note that binary search is available only when we search on the key on which the
file is ordered; for any other key the file behaves like an unsorted file.
Records are typically added at the end of the file, without following any
specific order.
2. Each attribute can have a dedicated index file, meaning multiple index
files may exist for one main file.
3. Index files are always organized, allowing for the utilization of binary
search advantages, irrespective of the main file's order.
5. The correct block in the main file can be located with log2(number of
blocks in index file) + 1 accesses.
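A small worked calculation of the access count in point 5. The figures (3000 index entries, 15-byte entries, 1024-byte blocks) are made-up inputs, and rounding the logarithm up with a ceiling is an assumption, since the notes give only log2(number of blocks in index file) + 1.

```python
import math


def index_blocks(num_entries, entry_size, block_size):
    """Blocks needed for the index file, assuming entries do not span blocks."""
    entries_per_block = block_size // entry_size
    return math.ceil(num_entries / entries_per_block)


def lookup_accesses(num_index_blocks):
    """Binary search over the index blocks, plus one access to the main file."""
    return math.ceil(math.log2(num_index_blocks)) + 1


blocks = index_blocks(num_entries=3000, entry_size=15, block_size=1024)
accesses = lookup_accesses(blocks)
print(blocks, accesses)
```

With these inputs each block holds 68 entries, the index occupies 45 blocks, and a lookup costs 6 block accesses in the index plus 1 in the main file.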
Types of Indexing
In single-level indexing, an index file is created for the main file, marking
the end of the indexing process.
The index file has two columns: the first holds the primary key and the second the anchor
pointer (base address of the block).
• Here the first record (anchor record) of every block gets an entry in the
index file.
No. of entries in the index file = No. of blocks acquired by the main file.
CLUSTERED INDEXING
SECONDARY INDEXING
• No. of entries in the index file is the same as the number of entries in the
main file.
Dense Vs Sparse
Dense Index - In a dense index, there is an entry in the index file for every
search key value in the main file. This makes searching faster but
requires more space to store the index records themselves. Note that the entry
is per search key value, not per record.
Sometimes the number of records in the main file > the number of search keys in
the main file, for example when a search key value is repeated.
Sparse Index - If an index entry is created only for some records of the
main file, then it is called a sparse index. No. of index entries in the index
file < No. of records in the main file. Note: dense and sparse are not
complementary to each other; it is sometimes possible for an index to be
both dense and sparse.
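A minimal sketch contrasting the two definitions on a tiny ordered file. The block contents, the function names, and the choice of one sparse entry per block (the anchor record) are assumptions for illustration. The sample data is arranged so that no search key value spans two blocks.

```python
import bisect


def build_dense_index(blocks):
    """One entry per distinct search key value, pointing at its first block."""
    index = {}
    for block_no, block in enumerate(blocks):
        for key in block:
            index.setdefault(key, block_no)
    return index


def build_sparse_index(blocks):
    """One entry per block: (key of the anchor record, block number)."""
    return [(block[0], block_no) for block_no, block in enumerate(blocks)]


def sparse_lookup(index, key):
    """Find the block whose anchor key is the largest one <= key."""
    anchors = [anchor for anchor, _ in index]
    pos = bisect.bisect_right(anchors, key) - 1
    return index[pos][1] if pos >= 0 else None


# Main file: 9 records in 3 blocks, ordered on a search key with repeats.
blocks = [[10, 10, 20], [30, 30, 40], [50, 60, 60]]

dense = build_dense_index(blocks)
sparse = build_sparse_index(blocks)
print(len(dense), len(sparse), sparse_lookup(sparse, 40))
```

The dense index here has 6 entries for 9 records because key values repeat. It has an entry for every search key value yet fewer entries than records, which is the case where an index meets both definitions.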
B tree
The root has either no child nodes (when it is a leaf) or at least two, and at most m child nodes.
The internal nodes except the root have at least ceiling(m/2) child nodes
and at most m child nodes.
The number of keys in each internal node is one less than the number of
child nodes and these keys partition the subtrees of the nodes in a
manner similar to that of m-way search tree.
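The child and key bounds for non-root internal nodes follow directly from the order m. The helper below is a hypothetical convenience for tabulating them; it does not implement a B tree.

```python
import math


def btree_bounds(m):
    """Child and key bounds for a non-root internal node of a B tree of order m."""
    min_children = math.ceil(m / 2)
    return {
        "max_children": m,
        "min_children": min_children,
        "max_keys": m - 1,             # keys = children - 1
        "min_keys": min_children - 1,
    }


bounds_5 = btree_bounds(5)
bounds_4 = btree_bounds(4)
print(bounds_5)
print(bounds_4)
```

For m = 5 a non-root internal node holds between 2 and 4 keys, and for m = 4 between 1 and 3.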
The user specifies both what data is to be retrieved and how it is to be
retrieved, e.g. Relational Algebra.
Non-Procedural Query Language
The select, project, and rename operations are called unary operations,
because they operate on one relation.
• Π column_name (table_name)
It is a unary operator.
• σ condition (table_name)
Using the connectives and (∧), or (∨), and not (¬), we can combine
several predicates into a larger predicate.
• The relations r and s must be of the same arity. That is, they must have the
same number of attributes.
• The domains of the ith attribute of r and the ith attribute of s must be
the same, for all i.
Q. Write a RELATIONAL ALGEBRA query to find the names of all customers
who have a loan or an account or both?
(Πcustomer_name(depositor)) ∪ (Πcustomer_name(borrower))
(Πcustomer_name(borrower)) − (Πcustomer_name(depositor))
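Projection followed by union or set difference maps naturally onto Python sets. The sketch below uses made-up rows for depositor and borrower. The project helper is hypothetical, and building a set gives the duplicate elimination that relational algebra assumes.

```python
def project(relation, *attrs):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in relation}


# Hypothetical sample data
depositor = [
    {"customer_name": "Hayes", "account_number": "A-102"},
    {"customer_name": "Jones", "account_number": "A-217"},
]
borrower = [
    {"customer_name": "Jones", "loan_number": "L-17"},
    {"customer_name": "Smith", "loan_number": "L-23"},
]

depositors = project(depositor, "customer_name")
borrowers = project(borrower, "customer_name")

loan_or_account = depositors | borrowers   # union
loan_but_no_account = borrowers - depositors  # set difference
print(sorted(loan_or_account), sorted(loan_but_no_account))
```

Both operands are projected onto the same single attribute first, which is what makes them compatible for union and difference.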
Πcustomer_name, balance
(σ account.account_number=depositor.account_number (account × depositor))
• If R₁ has m tuples and R₂ has n tuples the result will be having = m*n
tuples.
• The same attribute name may appear in both R₁ and R₂, so we need to devise a
naming schema to distinguish between these attributes.
Rename Operation
• The results of relational algebra expressions are also relations, but without any
name. The rename operation does not change the name of the table in the original
database; it creates a new, renamed copy of the table.
ρLearner(Student)
(Πcustomer_name(depositor)) ∩ (Πcustomer_name(borrower))
The Natural-Join Operation (⋈)
Πcustomer_name, balance (account ⋈ depositor)
The condition account.account_number = depositor.account_number is applied
implicitly by the natural join.
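The sketch below compares the two formulations of this query on made-up data: a Cartesian product followed by a selection on account_number, and a natural join on the shared attribute. The helper names and the sample rows are assumptions. Attributes in the product are prefixed with the relation name to keep the two account_number columns apart.

```python
from itertools import product

# Hypothetical sample data
account = [
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-215", "balance": 700},
]
depositor = [
    {"customer_name": "Hayes", "account_number": "A-101"},
    {"customer_name": "Smith", "account_number": "A-215"},
    {"customer_name": "Jones", "account_number": "A-101"},
]


def cartesian(r, s, r_name, s_name):
    """Cartesian product with relation-qualified attribute names."""
    result = []
    for a, b in product(r, s):
        row = {f"{r_name}.{k}": v for k, v in a.items()}
        row.update({f"{s_name}.{k}": v for k, v in b.items()})
        result.append(row)
    return result


def natural_join(r, s):
    """Pair up tuples that agree on every attribute name they share."""
    result = []
    for a, b in product(r, s):
        if all(a[k] == b[k] for k in a.keys() & b.keys()):
            result.append({**a, **b})
    return result


pairs = cartesian(account, depositor, "account", "depositor")
selected = [
    t for t in pairs
    if t["account.account_number"] == t["depositor.account_number"]
]
via_product = {(t["depositor.customer_name"], t["account.balance"]) for t in selected}
via_join = {(t["customer_name"], t["balance"]) for t in natural_join(account, depositor)}
print(len(pairs), via_product == via_join)
```

The product holds 2 × 3 = 6 tuples before selection, and both routes end at the same three (customer_name, balance) pairs.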
2. The Sequel language has evolved since then, and its name has
changed to SQL (Structured Query Language) (some other company
has trademark on the word sequel). SQL has clearly established itself as
the standard relational database language.
);
FirstName VARCHAR(50),
LastName VARCHAR(50),
Age INT,
Email VARCHAR(100)
);
A list of some common data types supported by SQL, along with a brief
description of each:
4. `DECIMAL(p, s)`: For storing exact numerical values, where `p` is the
precision and `s` is the scale.