University Solution 2021-22 Dbms (1)
B. TECH.
(SEM V) THEORY EXAMINATION 2021-22
DATABASE MANAGEMENT SYSTEM
Time: 3 Hours Total Marks: 100
Note: 1. Attempt all Sections. If any required data is missing, choose it suitably.
2. Any special paper specific instruction.
SECTION A
Ans: It is the ability to change the internal schema definition without affecting
the conceptual or external schema. An internal schema may be changed due to
several reasons such as for creating additional access structure, changing the
storage structure, etc. The separation of internal schema from the conceptual
schema facilitates physical data independence.
b. List the four functions of DBA.
Ans:
1. Software installation and Maintenance
2. Data Extraction, Transformation, and Loading
3. Database Backup and Recovery
4. Security
The DBA is a technical person or technician responsible for implementing the data
administrator’s decisions.
SECTION B
[Figure: DD/D subsystem diagram; surviving labels: user query facility or “form”, DD/D subsystem]
Consider two employee relations P(p) and Q(q), holding the employees who are working
on two projects a1 and a2 respectively.
Different operations can be performed on these tables:
1. Intersection gives the persons who are working on both projects a1 and a2.
ID Name
03 z
04 a
When applying the union operation to two relations, we must remember that both
relations should be union compatible. If we want to apply the three set operators
(union, intersection and difference) to two relations that are not union compatible,
we can first use the rename operator to make them union compatible and then perform
the desired operation.
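The set operations described above can be sketched with Python sets of tuples. This is only an illustration: the rows of P and Q are hypothetical, chosen so that the intersection matches the result table shown above, and union compatibility is simplified here to "same attribute names in the same order".

```python
# Relations modelled as (schema, set of rows). The rows are hypothetical.
P = (("ID", "Name"), {("01", "x"), ("02", "y"), ("03", "z"), ("04", "a")})
Q = (("ID", "Name"), {("03", "z"), ("04", "a"), ("05", "b")})

def union_compatible(r, s):
    """Simplified test: both relations have the same attributes in the same order."""
    return r[0] == s[0]

def rename(r, new_schema):
    """Rename the attributes of r (same degree) without touching its rows."""
    if len(new_schema) != len(r[0]):
        raise ValueError("rename cannot change the number of attributes")
    return (tuple(new_schema), r[1])

def set_op(r, s, op):
    """Apply a set operator, refusing relations that are not union compatible."""
    if not union_compatible(r, s):
        raise ValueError("relations are not union compatible; rename first")
    return (r[0], op(r[1], s[1]))

both = set_op(P, Q, set.intersection)    # working on both a1 and a2
either = set_op(P, Q, set.union)         # working on a1 or a2
only_a1 = set_op(P, Q, set.difference)   # working on a1 but not on a2
```

A relation with different attribute names, for example a hypothetical `(("EmpNo", "EmpName"), rows)`, would first be passed through `rename(..., P[0])` before any of the three operators is applied.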
x -> y
y -> z; here z is dependent on y and, by the transitivity rule, on x as well (x -> z).
x is the only prime (key) attribute, while y and z are non-key attributes.
To achieve the normalization standard of Third Normal Form (3NF), you must
eliminate any transitive dependency.
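The transitivity rule can be checked mechanically by computing an attribute closure. The sketch below is a standard closure loop, not something taken from the original answer; the function name `closure` and the representation of dependencies as pairs of Python sets are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) pairs of attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If everything on the left is already derivable, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [({"x"}, {"y"}), ({"y"}, {"z"})]   # x -> y and y -> z
x_closure = closure({"x"}, fds)
# z is reachable from x only through the non-key attribute y,
# so x -> z is a transitive dependency.
```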
Example
<MovieListing>
The above table is not in 3NF because it has a transitive functional dependency,
so the relation <MovieListing> violates the Third Normal Form (3NF).
To remove the violation, you need to split the tables and remove the transitive
functional dependency.
<Movie>
<Listing>
Listing_ID Listing_Type
L09 Crime
L05 Drama
L09 Crime
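The split can be sketched as two projections followed by a join that rebuilds the original rows. The Movie_ID values below are hypothetical, since the original <MovieListing> table did not survive; only the Listing_ID and Listing_Type values come from the table above.

```python
# Hypothetical <MovieListing> rows: (Movie_ID, Listing_ID, Listing_Type).
movie_listing = [
    ("M1", "L09", "Crime"),
    ("M2", "L05", "Drama"),
    ("M3", "L09", "Crime"),
]

# Projection onto <Movie>(Movie_ID, Listing_ID).
movie = {(m, l) for m, l, _ in movie_listing}

# Projection onto <Listing>(Listing_ID, Listing_Type); the repeated
# (L09, Crime) pair is now stored only once.
listing = {(l, t) for _, l, t in movie_listing}

# Joining on Listing_ID rebuilds the original rows, so nothing is lost.
rejoined = {(m, l, t) for m, l in movie for l2, t in listing if l == l2}
```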
A timestamp is a tag that can be attached to any transaction or any data item, which
denotes a specific time at which the transaction or the data item was used in any
way. A timestamp can be implemented in 2 ways. One is to directly assign the current
value of the clock to the transaction or data item. The other is to attach the value of a
logical counter that keeps incrementing as new timestamps are required.
(i) W-timestamp(X):
This means the latest time when the data item X has been written into.
(ii) R-timestamp(X):
This means the latest time when the data item X has been read from. These 2 timestamps
are updated each time a successful read/write operation is performed on the data item
X.
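Both ways of generating timestamps, and the bookkeeping of the two per-item timestamps, can be sketched as follows. The class and function names are assumptions for illustration; the sketch only records timestamps and does not decide whether an operation should be allowed.

```python
import itertools
import time

_counter = itertools.count(1)

def clock_timestamp():
    """First way: take the current value of the system clock."""
    return time.time()

def logical_timestamp():
    """Second way: a logical counter that increments for every new timestamp."""
    return next(_counter)

class DataItem:
    def __init__(self, value):
        self.value = value
        self.r_ts = 0   # R-timestamp(X): latest timestamp that read X
        self.w_ts = 0   # W-timestamp(X): latest timestamp that wrote X

    def read(self, ts):
        self.r_ts = max(self.r_ts, ts)
        return self.value

    def write(self, ts, value):
        self.value = value
        self.w_ts = max(self.w_ts, ts)
```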
2. A manipulative part, defining the types of operation that are allowed on the data
(this includes the operations that are used for updating or retrieving data from the
database and for changing the structure of the database). (Selection, insertion,
deletion, update)
3. Possibly a set of integrity rules, which ensure that the data is accurate. (check,
null, not null, key etc.)
The purpose of a data model is to represent data and to make the data
understandable. There have been many data models proposed in the literature. They
fall into three broad categories:
The object based and record based data models are used to describe data at the
conceptual and external levels; the physical data model is used to describe data at
the internal level.
Physical data models describe how data is stored in the computer, representing
information such as record structures, record ordering, and access paths. There are
not as many physical data models as logical data models; the most common one
is the Unifying Model (so called because it promotes unity: it defines a data
architecture that is rooted in actual digital hardware and can encompass every data
relationship).
This unifying model may comprise NTFS (New Technology File System, the newer
version) and FAT32 (File Allocation Table, the older version). The actual hardware
circuitry and the functionality of components are discussed in models of this kind.
Record based logical models are used in describing data at the logical and view
levels. In contrast to object based data models, they are used to specify the overall
logical structure of the database and to provide a higher-level description of the
implementation. Record based models are so named because the database is
structured in fixed format records of several types. Each record type defines a fixed
number of fields, or attributes, and each field is usually of a fixed length. The three
most widely accepted record based data models are:
• Hierarchical Model
• Network Model
• Relational Model
The relational model has gained favor over the other two in recent years. The
network and hierarchical models are still used in a large number of older databases.
Hierarchical Data Model
It was used to define the file system arrangements within a computer. In this model
each entity has only one parent but can have several children. At the top of hierarchy
there is only one entity which is called Root.
A large number of computer systems have been written that use this structure.
Unlike families in real life, a parent in a hierarchical database may have more than
one child, but a child always has only one parent. To find a particular record, you
have to start at the top with a parent and trace it down the chart to that child. It is
used in some reservation systems. Accessing or updating data is very fast because
the relationships have been predefined. The problem is that there are no
relationships among the child records.
The hierarchical data model is the oldest type of data model, developed by IBM in
1968. This data model organizes the data in a tree-like structure (tree is a
connected graph having no loop), in which each child node (also known as
dependents) can have only one parent node. The database based on the
hierarchical data model comprises a set of records connected to one another through
links. The link is an association between two or more records. The top of the tree
structure consists of a single node that does not have any parent and is called the
root node.
The root may have any number of dependents; each of these dependents may have
any number of lower level dependents. Each child node can have only one parent
node and a parent node can have any number of (many) child nodes. It, therefore,
represents only one-to-one and one-to-many relationships. The collection of same
type of records is known as a record type.
The main advantage of the hierarchical data model is that the data access is quite
predictable in the structure and, therefore, both the retrieval and updates can be
highly optimized by the DBMS.
However, the main drawback of this model is that the links are ‘hard coded’ into
the data structure, that is, the link is permanently established and cannot be
modified. The hard coding makes the hierarchical model rigid. In addition, the
physical links make it difficult to expand or modify the database and the changes
require substantial redesigning efforts.
These are 1:N mappings between record types, implemented using the tree concept.
Each child segment is restricted to having only one parent segment.
Fig: Hierarchical Model
The figure shows the hierarchical model of a Doctor’s Patient database. It consists of two
record types, namely, Physician number and Physician Name. For simplicity,
only a few fields of each record type are shown. One complete record of each record
type represents a node.
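The single-parent rule and the top-down search described above can be sketched as a small tree. The record names are hypothetical and are not taken from the figure.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent     # a record has at most one parent
        self.children = []       # but may have any number of children
        if parent is not None:
            parent.children.append(self)

def find(root, name, path=()):
    """Search downward from the root; return the path of names to the record, or None."""
    path = path + (root.name,)
    if root.name == name:
        return path
    for child in root.children:
        found = find(child, name, path)
        if found:
            return found
    return None

# Hypothetical records: a root, two physicians, one patient each.
root = Node("Hospital")
p1 = Node("Physician P1", root)
p2 = Node("Physician P2", root)
Node("Patient A", p1)
Node("Patient B", p2)

path_to_b = find(root, "Patient B")
```

Because the two patient records are linked only to their own parents, there is no direct path between them, which mirrors the limitation noted above for child records.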
Network Data Model
It was used to define the network structure among computers. In the network
model, entities are organized in a graph, in which some entities can be accessed
through several paths. Data in this type of model is represented by a collection of
records, and relationships between the records of two tables are represented by links,
usually called pointers.
It is similar to the hierarchical model, but each child record can have more than one
parent record. Thus a child record, in network terminology called a member, may
be reached through more than one parent, called owners.
The first specification of network data model was presented by Conference on Data
Systems Languages (CODASYL) in 1969, followed by the second specification in
1971. It is powerful but complicated. In a network model the data is also represented
by a collection of records, and relationships among data are represented by links.
However, the link in a network data model represents an association between
precisely two records. Like hierarchical data model, each record of a particular
record type represents a node. However, unlike hierarchical data model, all the
nodes are linked to each other without any hierarchy. The main difference between
hierarchical and network data model is that in hierarchical data model, the data is
organized in the form of trees and in network data model, the data is organized in
the form of graphs.
The main advantage of network data model is that a parent node can have many
child nodes and a child can also have many parent nodes. Thus, the network model
permits the modeling of many-to-many relationships in data.
The main limitation of the network data model is that it can be quite complicated
to maintain all the links and a single broken link can lead to problems in the
database. In addition, since there are no restrictions on the number of relationships,
the database design can become complex. Figure below shows the network model
of Online Book database.
The popularity of the network data model coincided with the popularity of the
hierarchical data model. Some data were more naturally modelled with more than
one parent per child. So, the network model permitted the modelling of many-to-
many relationships in data. The basic data modelling construct in the network model
is the set construct. A set consists of an owner record type, a set name, and a member
record type.
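The set construct can be sketched as a named link from owner records to member records. This is a simplified illustration rather than the CODASYL specification; the set names, record types and record values are hypothetical.

```python
from collections import defaultdict

class SetType:
    """A set: a set name, an owner record type and a member record type."""
    def __init__(self, name, owner_type, member_type):
        self.name = name
        self.owner_type = owner_type
        self.member_type = member_type
        self.members_of = defaultdict(list)   # owner record -> member records

    def connect(self, owner, member):
        self.members_of[owner].append(member)

    def owners_of(self, member):
        return [o for o, members in self.members_of.items() if member in members]

# Hypothetical sets for an online book database.
writes = SetType("WRITES", "Author", "Book")
publishes = SetType("PUBLISHES", "Publisher", "Book")
writes.connect("A1", "B1")
publishes.connect("P1", "B1")

# The same member record is reached through two different owners.
parents_of_b1 = writes.owners_of("B1") + publishes.owners_of("B1")
```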
Relational Model
Object based data models use concepts such as entities, attributes, and relationships.
An entity is a distinct object (a person, place, concept, and event) in the organization
that is to be represented in the database. An attribute is a property that describes
some aspect of the object that we wish to record, and a relationship is an association
between entities. Some of the more common types of object based data model are:
The Entity-Relationship model has emerged as one of the main techniques for
modeling database design and forms the basis for the database design methodology.
The object oriented data model extends the definition of an entity to include, not
only the attributes that describe the state of the object but also the actions that are
associated with the object, that is, its behaviour.
The object is said to encapsulate both state and behavior. Entities in semantic
systems represent the equivalent of a record in a relational system or an object in an
OO system but they do not include behavior (methods). They are abstractions used
to represent real world (e.g. customer) or conceptual (e.g. bank account) objects.
The functional data model is now almost twenty years old. The original idea was to
view the database as a collection of extensionally defined functions and to use a
functional language for querying the database.
Object-oriented Model
Entity-Relationship Model
(b) State the procedural DML and nonprocedural DML with their differences.
Ans:
Database languages:
Language is the medium to express our view. Programming languages are based on
syntax (grammar which is the concern of compiler or interpreter itself) and
semantics (logic which is the concern of developer). The main objective of a
database management system is to allow its users to perform a number of operations
on the database such as insert, delete, and retrieve data in abstract terms without
knowing about the physical representations of data. To provide the various facilities
to different types of users, a DBMS normally provides one or more specialized
programming languages called Database (or DBMS) Languages.
The Data Definition Language (DDL) describes the database schema, or any table, in a
form that is acceptable to the data manipulation language or to an application
program written to perform operations on that table (the schema is included in the
application program as the schema portion of the database).
Or we can say that it is the language used to specify the database schema. The result
of compilation or collection of DDL statements is a set of tables that is stored in a
special file called data dictionary (it is a file that contains the metadata).
The DDL statements are also used to specify the integrity rules (constraints) in order
to maintain the integrity of the database. The various integrity constraints are
domain constraints, referential integrity, assertions and authorization. These
constraints are discussed in detail in subsequent chapters. Like any other
programming language, DDL also accepts input in the form of instructions
(statements) and generates the description of schema as output. The output is placed
in the data dictionary, which is a special type of table containing metadata. The
DBMS refers to the data dictionary before reading or modifying the data. Note that
database users cannot update the data dictionary; instead, it is modified only by the
database system itself.
The DML are of two types, namely, non-procedural DML and procedural DML.
The non-procedural or high-level DML requires the user to specify only what data is
required, without specifying how to access it; SQL is an example.
On the other hand, the procedural or low-level DML requires the user to specify what
data is required and how to access that data by providing a step-by-step procedure to
solve a problem. Examples include Pascal, PL/SQL and DL/1, as well as relational
algebra, which is a procedural query language consisting of a set of operations such
as select, project, union, etc., to manipulate the data in the database.
The major difference between these computational models is that the procedural
language is command-driven whereas non-procedural language is function oriented.
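The contrast can be sketched with Python's built-in sqlite3 module. The student table and its rows are hypothetical; the SQL statement stands for the non-procedural style, and the explicit loop stands for the procedural style.

```python
import sqlite3

# Hypothetical student records: (roll_no, name, branch).
rows = [(1, "Asha", "CSE"), (2, "Ravi", "ECE"), (3, "Meena", "CSE")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, branch TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)

# Non-procedural: state WHAT is wanted; the DBMS decides how to fetch it.
declarative = conn.execute(
    "SELECT roll_no, name FROM student WHERE branch = 'CSE' ORDER BY roll_no"
).fetchall()

# Procedural: spell out HOW, visiting the records one by one.
procedural = []
for roll_no, name, branch in rows:
    if branch == "CSE":
        procedural.append((roll_no, name))
```

Both variables end up holding the same rows; the difference lies in who decides the access path.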
I. List roll number and name of all students of the branch ‘CSE’.
II. Find the name of student who has issued a book published by ‘ABC’
publisher.
III. List title of all books and their authors issued to a student ‘RAM’.
IV. select Title from Book join Issue on Issue.ISBN = Book.ISBN where
"date-of-issue" >= '2020-12-01';
Create Trigger
These two keywords are used to specify that a trigger block is going to be declared.
Trigger_Name
It specifies the name of the trigger. Trigger name has to be unique and shouldn’t repeat.
( Before|After )
This specifies when the trigger will be executed. It tells us the time at which the trigger
is initiated, i.e, either before the ongoing event or after.
Before Triggers are used to update or validate record values before they’re saved to the
database.
After Triggers are used to access field values that are set by the system and to effect
changes in other records. The records that activate the after trigger are read-only. We
cannot use After trigger if we want to update a record because it will lead to read-only
error.
[ Insert|Update|Delete ]
These are the DML operations and we can use either of them in a given trigger.
on [ Table_Name ]
We need to mention the table name on which the trigger is being applied. Don’t forget
to use on keyword and also make sure the selected table is present in the database.
[ for each row | for each column ]
• A row-level trigger is executed before or after any column value of a row changes.
• A column-level trigger is executed before or after the specified column changes.
[ trigger_body]
It consists of queries that need to be executed when the trigger is called.
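The parts of the syntax above can be seen together in a small runnable sketch. It uses SQLite through Python's sqlite3 module, so the exact syntax is SQLite's dialect and may differ in other DBMSs; the table names, the trigger name and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE audit_log (account_id INTEGER, old_balance INTEGER, new_balance INTEGER);

-- Trigger name, timing (AFTER), event (UPDATE), table, row level, body.
CREATE TRIGGER log_balance_change
AFTER UPDATE ON account
FOR EACH ROW
BEGIN
    INSERT INTO audit_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;
""")

conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("UPDATE account SET balance = 150 WHERE id = 1")

# The trigger body ran once for the updated row.
log = conn.execute("SELECT * FROM audit_log").fetchall()
```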
(b) Describe the term MVD in the context of DBMS by giving an example. Discuss
4NF and 5NF also.
Ans: A multivalued dependency (MVD) occurs when a table contains more than
one independent multivalued attribute: two attributes (or columns)
in a table are independent of one another, but both depend on a third attribute.
If two or more independent relations are kept in a single relation (Cartesian
product), then a multivalued dependency is possible.
A relation must satisfy three conditions to have an MVD:
1. A->>B: for a single value of A, more than one value exists in B.
2. The table should have at least 3 columns.
3. For a table with columns A, B, C; B and C should be independent.
Multivalued dependencies can hold for more than one column too, for example
A->>B and A->>C. A->>B means A multidetermines B, or B is multidependent on A.
Fourth Normal Form: Any relation is in 4NF if it is in BCNF and it contains
no non-trivial multivalued dependency whose determinant is not a superkey.
Example:
Enrolment Table:
St_Id Course Hobby
1 Science Cricket
1 maths Hockey
2 C# Cricket
2 Php Hockey
The above arrangement could lead to a problem.
St_Id Course Hobby
1 Science Cricket
1 maths Hockey
These two rows of data will virtually give rise to two more additional rows, that
is, science with hockey and maths with cricket.
St_Id Course Hobby
1 Science Cricket
1 maths Hockey
1 Science Hockey
1 maths Cricket
There is no relationship between the course and the hobby of a student, which is
why the values can be interchanged between science and maths for st_id 1.
This situation arises from bad design practice. The problem can be
eliminated by keeping such attributes in separate, independent relations.
The table can be decomposed into course and hobby relations to remove the MVD and
satisfy 4NF.
Course:
St_Id Course
1 Science
1 maths
2 C#
2 Php
Hobby:
St_Id Hobby
1 Cricket
1 Hockey
2 Cricket
2 Hockey
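The decomposition can be checked by projecting the Enrolment rows onto the two relations and joining them back on St_Id. The sketch uses the rows from the tables above; the variable names are chosen for illustration.

```python
# Enrolment rows from the example: (St_Id, Course, Hobby).
enrolment = {
    (1, "Science", "Cricket"),
    (1, "maths", "Hockey"),
    (2, "C#", "Cricket"),
    (2, "Php", "Hockey"),
}

# Projections onto the Course and Hobby relations.
course = {(s, c) for s, c, _ in enrolment}
hobby = {(s, h) for s, _, h in enrolment}

# Joining on St_Id pairs every course of a student with every hobby.
joined = {(s, c, h) for s, c in course for s2, h in hobby if s == s2}

# The four stored rows are a proper subset of the join: the single table
# would need the extra combinations for St_Id ->> Course to hold in it.
extra_rows = joined - enrolment
```

The two small relations hold four rows each and carry the same information as the eight-row join, which is the redundancy that 4NF removes.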
Fifth normal form (5NF): A relation is in 5NF if it is in 4NF and contains no
join dependency that is not implied by its candidate keys; any further decomposition
must be lossless. 5NF is reached when the tables have been broken into as many
tables as possible in order to avoid redundancy.
5NF is also known as Project-join normal form (PJ/NF).
6. Attempt any one part of the following: 10 x 1 = 10
(a) Describe serializable schedule. Discuss conflict serializability with suitable
example.
(b) Discuss 2 phase commit protocol and time stamp based protocol with suitable
example. How the validation based protocols differ from 2PC?
Ans: Two-Phase Locking: This locking protocol divides the execution phase
of a transaction into three parts. In the first part, when the transaction starts
executing, it seeks permission for the locks it requires. The second part is where
the transaction acquires all the locks. As soon as the transaction releases its first
lock, the third phase starts. In this phase, the transaction cannot demand any
new locks; it only releases the acquired locks.
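The rule that no lock may be requested after the first release can be sketched as below. This is a minimal single-transaction illustration under stated assumptions: the class name is hypothetical, and conflicts between transactions, lock modes and waiting are left out.

```python
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True at the first unlock

    def lock(self, item):
        """Growing phase: allowed only while no lock has been released."""
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot acquire {item!r} after releasing a lock"
            )
        self.held.add(item)

    def unlock(self, item):
        """The first release starts the shrinking phase."""
        self.held.remove(item)
        self.shrinking = True

t = TwoPhaseTransaction("T1")
t.lock("A")
t.lock("B")
t.unlock("A")    # from here on, t.lock(...) raises RuntimeError
```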
Timestamp-based Protocols
The most commonly used concurrency protocol is the timestamp based
protocol. This protocol uses either system time or logical counter as a
timestamp.
Lock-based protocols manage the order between the conflicting pairs among
transactions at the time of execution, whereas timestamp-based protocols start
working as soon as a transaction is created.
Every transaction has a timestamp associated with it, and the ordering is
determined by the age of the transaction. A transaction created at 0002 clock
time would be older than all other transactions that come after it. For example,
any transaction 'y' entering the system at 0004 is two seconds younger and the
priority would be given to the older one.
In addition, every data item is given the latest read and write-timestamp. This
lets the system know when the last ‘read and write’ operation was performed on
the data item.
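The text above does not spell out how the timestamps are compared, so the sketch below uses the basic timestamp-ordering rules as they are usually stated in textbooks: an operation is rejected when a younger transaction has already performed a conflicting operation on the item. The names are hypothetical, and the timestamps 2 and 4 follow the 0002 and 0004 example.

```python
class Item:
    def __init__(self):
        self.r_ts = 0   # latest read timestamp
        self.w_ts = 0   # latest write timestamp

def read(item, ts):
    """Return True if the read is allowed, False if the transaction must roll back."""
    if ts < item.w_ts:                     # a younger transaction already wrote X
        return False
    item.r_ts = max(item.r_ts, ts)
    return True

def write(item, ts):
    """Return True if the write is allowed, False if the transaction must roll back."""
    if ts < item.r_ts or ts < item.w_ts:   # a younger one already read or wrote X
        return False
    item.w_ts = ts
    return True

x = Item()
young_read = read(x, 4)     # the transaction with timestamp 4 reads X first
old_write = write(x, 2)     # the older transaction then arrives too late to write
young_write = write(x, 4)
old_read = read(x, 2)
```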
Validation Based Protocol
The validation based protocol is also known as the optimistic concurrency control
technique. In the validation based protocol, the transaction is executed in the
following three phases:
Read phase: In this phase, the transaction T is read and executed. It reads the
values of various data items and stores them in temporary local variables. All
write operations are performed on the temporary variables, without updating the
actual database.
Validation phase: In this phase, the temporary variable values are validated
against the actual data to check whether serializability would be violated.
Write phase: If the transaction passes validation, the temporary results are
written to the database; otherwise the transaction is rolled back.
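The three phases can be sketched in a single-threaded toy model. The validation test used here (reject a transaction if another transaction that finished after it started has written an item it read) is one common choice and is an assumption, as are all class and method names.

```python
class Txn:
    def __init__(self, name, db, start):
        self.name, self.db, self.start = name, db, start
        self.read_set = set()
        self.local = {}                      # temporary local variables

    def read(self, key):                     # read phase
        self.read_set.add(key)
        return self.local[key] if key in self.local else self.db.data[key]

    def write(self, key, value):             # still read phase: buffered only
        self.local[key] = value

class Database:
    def __init__(self, data):
        self.data = dict(data)
        self.clock = 0
        self.committed = []                  # (finish_time, write_set) pairs

    def begin(self, name):
        self.clock += 1
        return Txn(name, self, self.clock)

    def commit(self, txn):
        # Validation phase: did anyone who finished after txn started
        # overwrite something that txn has read?
        for finish, write_set in self.committed:
            if finish > txn.start and write_set & txn.read_set:
                return False                 # rolled back, buffer discarded
        # Write phase: copy the temporary results into the database.
        self.data.update(txn.local)
        self.clock += 1
        self.committed.append((self.clock, set(txn.local)))
        return True

db = Database({"x": 10})
t1, t2 = db.begin("T1"), db.begin("T2")
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") * 2)
t1_ok = db.commit(t1)    # passes validation and writes x = 11
t2_ok = db.commit(t2)    # fails: T1 wrote x after T2 had started
```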
Here each phase has the following different timestamps:
Start(Ti): It contains the time when Ti started its execution.
Validation (Ti): It contains the time when Ti finishes its read phase and starts
its validation phase.