Assignment III

Uploaded by

Nayana Kulkarni
Copyright
© All Rights Reserved

Arka Educational & Cultural Trust (Regd.)
Jain Institute of Technology, Davangere
(A Unit of Jain Group of Institutions, Bengaluru)
# 323, Near Veereshwara Punyashrama, Avaragere, Davangere- 577003.

Department of Computer Science and Engineering


(Accredited by NBA, Valid up to 30.06.2026)

Subject: Database Management Systems (BCS403)

1. Explain the anomalies that occur due to interleaved execution
with examples.
2. a) Define Transaction. With a neat diagram explain the state
transition diagram of a transaction. https://youtu.be/WyiI6SiSYPk
b) Explain the Desirable Properties of a transaction (ACID
Properties). https://youtu.be/GS0OxFJsYQ?si=osIuv3uUCJlTnXEH
3. When do the deadlock and starvation problems occur? Explain the
different approaches to deal with them.
4. a. With an example, explore the basic Timestamp Ordering
algorithm for Concurrency Control. https://youtu.be/PYgebFyWZwI
b. Explain the two-phase locking protocol with an example.
5. Consider the following schema:
EMP(Fname, Lname, SSN, Bdate, Address, gender, salary,
superssn, DNO)
DEPT(Dname, Dnumber, Mgrssn, Mgrstartdate)
DEPT_LOC(Dnumber, Dlocation)
WORKS_ON(ESSN, PNO, Hours)
DEPENDANT(ESSN, Dependant_name,gender)
Design the schema diagram and Write the queries for the
following
1. List the female employees from Department number
20 earning more than 50000.
2. Retrieve the names of the employees whose name
starts with ‘M’ and third letter is ‘G’.
3. Retrieve the names of the employee who have no
dependants.
4. Retrieve the name of each employee who works on all
the projects controlled by department number 5.
5. Retrieve the last name of each employee and his or
her supervisor.
**Schema diagram to be written highlighting primary
keys and foreign keys**

Solution :
1. List the female employees from Department number 20
earning more than 50000.
SELECT Fname, Lname
FROM EMP
WHERE gender = 'F' AND DNO = 20 AND salary > 50000;

2. Retrieve the names of the employees whose name starts
with ‘M’ and third letter is ‘G’.
SELECT Fname, Lname
FROM EMP
WHERE Fname LIKE 'M_G%';

3. Retrieve the names of the employees who have no
dependants.
SELECT Fname, Lname
FROM EMP E
WHERE NOT EXISTS (
SELECT 1
FROM DEPENDANT D
WHERE E.SSN = D.ESSN);

4. Retrieve the name of each employee who works on all
the projects controlled by department number 5.
(The schema above does not list a PROJECT relation. This query
assumes PROJECT(PNO, DNO), where DNO is the controlling department.)
SELECT Fname, Lname
FROM EMP E
WHERE NOT EXISTS (
    SELECT 1
    FROM PROJECT P
    WHERE P.DNO = 5 AND NOT EXISTS (
        SELECT 1
        FROM WORKS_ON W
        WHERE W.ESSN = E.SSN AND W.PNO = P.PNO
    )
);

5. Retrieve the last name of each employee and his or her
supervisor.
SELECT E.Lname AS EmployeeLastName, S.Lname AS SupervisorLastName
FROM EMP E
LEFT JOIN EMP S ON E.superssn = S.SSN;
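
Query 4 (relational division with a double NOT EXISTS) is the hardest of the five to check by eye. The following is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows, the trimmed column lists, and the PROJECT(PNO, DNO) table are assumptions made for illustration only; they are not part of the assignment.

```python
import sqlite3

# Hypothetical sample data. PROJECT(PNO, DNO) is an assumed table, since the
# schema in the question does not list it; EMP is trimmed to the columns used.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP (Fname TEXT, Lname TEXT, SSN TEXT PRIMARY KEY, DNO INTEGER);
CREATE TABLE PROJECT (PNO INTEGER PRIMARY KEY, DNO INTEGER);
CREATE TABLE WORKS_ON (ESSN TEXT, PNO INTEGER, Hours REAL);
INSERT INTO EMP VALUES ('Asha', 'Rao', '111', 5), ('Ravi', 'Patil', '222', 5);
INSERT INTO PROJECT VALUES (1, 5), (2, 5), (3, 4);
INSERT INTO WORKS_ON VALUES ('111', 1, 10), ('111', 2, 20), ('222', 1, 15);
""")

# "Employees for whom no department-5 project exists that they do not work on."
ALL_DEPT5_PROJECTS = """
SELECT Fname, Lname
FROM EMP E
WHERE NOT EXISTS (
    SELECT 1 FROM PROJECT P
    WHERE P.DNO = 5 AND NOT EXISTS (
        SELECT 1 FROM WORKS_ON W
        WHERE W.ESSN = E.SSN AND W.PNO = P.PNO))
"""
result = conn.execute(ALL_DEPT5_PROJECTS).fetchall()
print(result)  # [('Asha', 'Rao')]
```

In this sample, Asha works on both department-5 projects (1 and 2) and is returned, while Ravi works only on project 1 and is excluded.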

6. Illustrate with precedence graph, which of the following
schedules are conflict serializable:
i) S1: R1(X); R2(X); W1(X); R2(X); W3(X);
ii) S2: R3(X); R2(X); W3(X); R1(X); W1(X);
iii) S1: R1(X); R2(Z); R1(Z); R3(X); R3(Y); W1(X); W3(Y);
R2(Y); W2(Z); W2(Y);
iv) S: R1(A); R2(B); W1(A); R3(A); W2(B); W3(A); R1(B);
W1(B)
v) S2: r1(X); r2(Z); r3(X); r1(Z); r2(Y); r3(Y); w1(X);
w2(Z); w3(Y); w2(Y);
https://youtu.be/zv0ba0Iok1Y?si=PvAcq7hTYg-L_xAC
7. Explain Boyce-Codd Normal Form (BCNF), 4NF and 5NF with examples.
8. a. Explain the CAP theorem.
b. Explain the MongoDB CRUD operations.
(Refer Lab Experiment 7)
9. a. Explain the characteristics of MongoDB distributed
systems.
b. Explain the features of the Voldemort Key-Value
Distributed Data Store.
10. What is a trigger? Create a row level trigger for the customers
table that would fire for INSERT or UPDATE or DELETE operations
performed on the CUSTOMERS table. This trigger will display the
salary difference between the old & new Salary.
CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY)
(Refer DBMS LAB PROGRAM NUMBER 4)
1) Explain the anomalies that occur due to interleaved
execution with examples.

• Concurrency control in a DBMS is a method used to
manage simultaneous operations on the database
without letting them interfere with each other.
• It ensures the consistency of the database in a
multi-user environment.
• With interleaved execution, the CPU is switched to
another process instead of remaining idle during I/O.
• Interleaving also prevents a long process from delaying
other processes.
• Without appropriate concurrency control mechanisms,
certain transaction schedules lead to the following
problems:
The Lost Update Problem
• This occurs when two transactions that access the
same database items have their operations
interleaved in a way that makes the value of some
database item incorrect.
• For example, T2 reads X before T1's update of X is written
to the database. When T2 later writes X, it overwrites
T1's value, so T1's update is lost.
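
The interleaving can be traced with a few lines of Python. This is only an illustrative sketch: the dictionary `db`, the starting value 100, and the amounts 10 and 50 are hypothetical numbers chosen to make the lost update visible.

```python
# Shared database item X with a hypothetical initial value of 100.
db = {"X": 100}

# Interleaved schedule: both transactions read X before either writes it.
t1_local = db["X"]        # T1: read_item(X)  -> 100
t2_local = db["X"]        # T2: read_item(X)  -> 100 (T1's update not yet written)
t1_local -= 10            # T1: X := X - 10
db["X"] = t1_local        # T1: write_item(X) -> 90
t2_local += 50            # T2: X := X + 50, computed from the stale value
db["X"] = t2_local        # T2: write_item(X) -> 150, overwriting T1's write

interleaved_result = db["X"]
serial_result = 100 - 10 + 50   # what either serial order would produce
print(interleaved_result, serial_result)  # 150 140
```

The final value 150 differs from the 140 that either serial order (T1 then T2, or T2 then T1) would give, because T1's subtraction has been overwritten.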

The Temporary Update (Dirty Read) Problem
This occurs when one transaction updates a database item and
then fails for some reason, while another transaction accesses
the updated item before it is changed back to its original value.
For example, T1 modifies a database object and then fails.
Before T1 is rolled back, T2 reads the modified object. T2 has
therefore read a value that was never committed and that, after
the rollback, no longer exists in the database.

The Incorrect Summary Problem


If one transaction is calculating an aggregate summary
function on a number of records while other transactions are
updating some of these records, the aggregate function may
calculate some values before they are updated and others
after they are updated.
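
A small trace makes the incorrect summary concrete. It is a sketch under assumed values: two hypothetical accounts A and B holding 100 each, a transfer transaction T1 moving 50 from A to B, and a summing transaction T2 running in between.

```python
accounts = {"A": 100, "B": 100}     # hypothetical balances

accounts["A"] -= 50                 # T1: debit A (transfer in progress)
total = accounts["A"]               # T2: reads A after the debit   -> 50
total += accounts["B"]              # T2: reads B before the credit -> 100
accounts["B"] += 50                 # T1: credit B (transfer complete)

actual_total = sum(accounts.values())
print(total, actual_total)  # 150 200
```

T2 reports 150 although the sum of the balances is 200 both before and after the transfer, because it saw A after the update and B before it.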
2a) Define Transaction. With a neat diagram explain
the state transition diagram of a transaction.
https://youtu.be/WyiI6SiSYPk
• A transaction is a group of tasks that together perform
one logical unit of work.
• A single task is the minimum processing unit, which cannot
be divided further.
• Transaction states are the states through which a
transaction passes during its lifetime.
• The current state determines how the transaction will be
processed next.
• The recovery manager maintains a system log, a file that
records all the activities of the transaction.
• After the transaction commits, its log entries are no
longer needed for rollback.
1. Active State –
The transaction is in the active state while its instructions
are executing. If all the read and write operations complete
without error, it moves to the “partially committed state”; if
any instruction fails, it moves to the “failed state”.

2. Partially Committed –
After all the read and write operations have completed, the
changes exist only in main memory or a local buffer. If the
changes are then made permanent in the database, the
state changes to the “committed state”; if a failure occurs
first, the transaction moves to the “failed state”.

3. Failed State –
A transaction enters the “failed state” when one of its
instructions fails, or when a failure occurs while its changes
are being made permanent in the database.
4. Aborted State –
After any type of failure, the transaction moves from the
“failed state” to the “aborted state”. Because the changes
made so far exist only in the local buffer or main memory,
they are discarded (rolled back).

5. Committed State –
The changes have been made permanent in the database. The
transaction is complete and moves to the “terminated state”.

6. Terminated State –
Once the transaction has either committed or been rolled
back, the system is consistent and ready for new
transactions, and the old transaction is terminated.
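
The state diagram can be summarised as a transition table. The sketch below encodes the six states described above; the edge from "aborted" to "terminated" follows the usual textbook diagram and is an assumption here, and the names `TRANSITIONS` and `is_valid_path` are illustrative.

```python
# Allowed transitions between the transaction states listed above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": {"terminated"},
    "aborted": {"terminated"},      # assumed, as in the standard diagram
    "terminated": set(),
}

def is_valid_path(states):
    """Return True if each consecutive pair of states is an allowed transition."""
    return all(b in TRANSITIONS[a] for a, b in zip(states, states[1:]))

print(is_valid_path(["active", "partially committed", "committed", "terminated"]))  # True
print(is_valid_path(["active", "committed"]))  # False
```

A transaction can never jump from "active" straight to "committed"; it must pass through "partially committed" first.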

2b) Desirable properties of transaction (ACID Properties)
3) When do the deadlock and starvation problems
occur? Explain the different approaches to deal with
the problem.
Deadlock occurs when two or more transactions are each
waiting for another to release a lock on a data item, so
none of them can proceed. Starvation (sometimes called
livelock) is the situation in which a transaction waits for an
indefinite period of time to acquire a lock while other
transactions continue to proceed.
Solutions to avoid Deadlock:
• To handle deadlock under lock-based techniques,
the DBMS has to detect and resolve any circular wait
among transactions.
• A circular wait is a situation where each transaction
in the set is waiting for a lock held by another
transaction. For instance, if Transaction A holds a
lock on Record X and waits for a lock on Record Y,
and Transaction B holds a lock on Record Y and waits
for a lock on Record X, then there is a circular wait
between A and B.
• To detect this, the DBMS can use a wait-for graph,
a directed graph with one node per transaction and
an edge from Ti to Tj whenever Ti is waiting for a
lock held by Tj.
• If the wait-for graph has a cycle, then there is a
deadlock.
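
Deadlock detection on a wait-for graph reduces to cycle detection in a directed graph. The following is a minimal sketch using depth-first search; the function name `has_deadlock` and the dictionary representation (each transaction mapped to the set of transactions it waits for) are assumptions for illustration, not a description of how any particular DBMS implements it.

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph given as {txn: set of txns it waits for}."""
    WHITE, GREY, BLACK = 0, 1, 2    # unvisited, on the current DFS path, finished
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for other in wait_for.get(txn, ()):
            state = colour.get(other, WHITE)
            if state == GREY:       # back edge to the current path: a cycle
                return True
            if state == WHITE and visit(other):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(wait_for))

# Transaction A waits for B (Record Y) and B waits for A (Record X).
print(has_deadlock({"A": {"B"}, "B": {"A"}}))   # True
print(has_deadlock({"A": {"B"}, "B": set()}))   # False
```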
To resolve it, the DBMS can adopt one of the following
methods:
• Aborting one or more transactions involved in the
deadlock.
• Pre-empting one or more locks held by them, or
• Converting one or more locks held by them to
reduce their level of granularity or mode. For
instance, it can convert an exclusive lock on a table
to a shared lock, or a shared lock on a page to a
shared lock on a record.
Solutions to starvation:
1. Increasing Priority: Starvation occurs when a
transaction has to wait for an indefinite time. In this
situation, the priority of the waiting transaction can be
increased the longer it waits. The drawback is that other
transactions may then have to wait longer while the
highest-priority transaction proceeds.
2. Modification of the Victim Selection algorithm: If a
transaction has repeatedly been selected as the victim, the
algorithm can give it a higher priority so that it is not
chosen as the victim again.
3. First Come First Serve approach: A fair scheduling
approach such as FCFS can be adopted, in which
transactions acquire a lock on an item in the order in
which they requested it.
4. Wait-die and wound wait scheme: These are the
schemes that use the timestamp ordering mechanism of
transactions.
5. Timeout Mechanism: A timeout mechanism can be
implemented in which a transaction is only allowed to wait
for a certain amount of time before it is aborted or
restarted. This ensures that no transaction waits
indefinitely, and prevents the possibility of starvation.
6. Resource Reservation: A resource reservation scheme
can be used to allocate resources to a transaction before it
starts execution. This ensures that the transaction has
access to the necessary resources and reduces the
chances of waiting for a resource indefinitely.
7. Preemption: Preemption involves forcibly taking a lock
away from the transaction that holds it, in favor of a
transaction that has a higher priority or has been waiting
for a long time. This limits how long a transaction can be
kept waiting and reduces the possibility of starvation.
8. Dynamic Lock Allocation: In this approach, locks are
allocated dynamically based on the current state of the
system. The system may analyze the current lock requests
and allocate locks in such a way that prevents deadlocks
and reduces the chances of starvation.
9. Parallelism: By allowing multiple transactions to execute
in parallel, the system can ensure that no transaction
waits indefinitely, and reduces the chances of starvation.
This approach requires careful consideration of the
potential for conflicts and race conditions between
transactions.
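
The wait-die and wound-wait schemes named in item 4 can be stated as two small decision rules. In this sketch a smaller timestamp means an older transaction; the function names and the returned strings are illustrative assumptions.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: an older requester waits; a younger requester dies (is rolled back)."""
    return "wait" if ts_requester < ts_holder else "abort requester"

def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds (aborts) the holder; a younger one waits."""
    return "abort holder" if ts_requester < ts_holder else "wait"

# T1 (timestamp 7) is older than T2 (timestamp 9).
print(wait_die(7, 9), "|", wait_die(9, 7))      # wait | abort requester
print(wound_wait(7, 9), "|", wound_wait(9, 7))  # abort holder | wait
```

In both schemes it is always the younger transaction that is aborted, and an aborted transaction restarts with its original timestamp, so it eventually becomes the oldest and cannot starve.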
4a) With an example, explore basic Timestamp Ordering
algorithm for Concurrency Control.
1. Time stamping protocol in concurrency control techniques transactions
(https://youtu.be/k0Tuf2weFyA)

2. Time stamping protocol in detail
(https://youtu.be/PYgebFyWZwI)

o The Timestamp Ordering Protocol orders
transactions based on their timestamps.
o The order of the transactions is the ascending
order of their creation times.
o An older transaction has higher priority, which is why it
executes first.
o To determine the timestamp of a transaction, the
protocol uses the system time or a logical counter.
o A lock-based protocol manages the order
between conflicting pairs of operations at
execution time.
o Timestamp-based protocols, in contrast, start working as
soon as a transaction is created.
o Let's assume there are two transactions T1 and T2.
o Suppose T1 entered the system at time 007
and T2 entered the system at time 009.
T1 has the higher priority, so it executes first because it
entered the system first.
o The protocol also maintains, for each data item, the
timestamps of its last 'read' and 'write' operations.
o The timestamp ordering protocol also maintains the
timestamp of last 'read' and 'write' operation on a data.

Basic Timestamp ordering protocol works as follows:
1. Check the following conditions whenever a transaction Ti
issues a Read(X) operation:
o If W_TS(X) > TS(Ti), the operation is rejected and Ti is
rolled back.
o If W_TS(X) <= TS(Ti), the operation is executed and
R_TS(X) is set to max(R_TS(X), TS(Ti)).
2. Check the following conditions whenever a transaction Ti
issues a Write(X) operation:
o If TS(Ti) < R_TS(X) or TS(Ti) < W_TS(X), the operation is
rejected and Ti is rolled back.
o Otherwise the operation is executed and W_TS(X) is set
to TS(Ti).
Where,
TS(Ti) denotes the timestamp of the transaction Ti.
R_TS(X) denotes the Read time-stamp of data-item X.
W_TS(X) denotes the Write time-stamp of data-item X.
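
The read and write rules can be sketched as a small scheduler. This is an illustrative model, not a production implementation: the class name `BasicTO`, the use of 0 as the initial timestamp of every item, and the string results are assumptions. The example reuses the timestamps 7 and 9 from the T1/T2 scenario above.

```python
class BasicTO:
    """Minimal model of basic timestamp ordering over named data items."""

    def __init__(self):
        self.r_ts = {}   # R_TS(X): largest timestamp that has read X
        self.w_ts = {}   # W_TS(X): timestamp of the last write to X

    def read(self, ts, item):
        if self.w_ts.get(item, 0) > ts:          # a younger txn already wrote X
            return "reject"
        self.r_ts[item] = max(self.r_ts.get(item, 0), ts)
        return "ok"

    def write(self, ts, item):
        if self.r_ts.get(item, 0) > ts or self.w_ts.get(item, 0) > ts:
            return "reject"                      # a younger txn already read/wrote X
        self.w_ts[item] = ts
        return "ok"

sched = BasicTO()
outcomes = [
    sched.read(9, "X"),    # T2 reads X: ok, R_TS(X) = 9
    sched.write(7, "X"),   # T1 writes X: rejected, since TS(T1) < R_TS(X)
    sched.write(9, "X"),   # T2 writes X: ok, W_TS(X) = 9
    sched.read(7, "X"),    # T1 reads X: rejected, since W_TS(X) > TS(T1)
]
print(outcomes)  # ['ok', 'reject', 'ok', 'reject']
```

A rejected operation means the issuing transaction is rolled back and, in the usual formulation, restarted with a new timestamp.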

Advantages and Disadvantages of TO protocol:
o The TO protocol ensures serializability, since every edge
in the precedence graph runs from an older transaction to a
younger one, so the graph cannot contain a cycle.
o The TO protocol ensures freedom from deadlock, because no
transaction ever waits.
o But the schedule may not be recoverable and may not
even be cascade-free.
4b) Explain two phase locking Protocol with example.
https://youtu.be/3fPFTbQuSdc

Two Phase Locking (2PL)
• Two Phase Locking is a technique used to control
concurrent access to shared resources in a database
management system.
• The basic idea behind 2PL is that a transaction
acquires all the locks it needs before it releases any of
them; once it has released a lock, it cannot acquire new ones.
• This guarantees conflict serializability. It does not by
itself prevent deadlocks, which can occur when two or
more transactions are waiting for each other to release a
lock.
A transaction is said to follow the Two-Phase Locking protocol if
its locking and unlocking operations are done in two phases.
• Growing Phase: New locks on data items may be
acquired but none of them can be released.
• Shrinking Phase: Existing locks may be released but no
new locks can be acquired.

During the growing phase the transaction reaches a point where
all the locks it may need have been acquired. This point is
called the LOCK POINT.
After the lock point has been reached, the transaction
enters the shrinking phase.
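
Whether a transaction's lock and unlock sequence obeys the two-phase rule can be checked mechanically. The sketch below assumes each operation is a hypothetical `("lock", item)` or `("unlock", item)` pair; the function name `is_two_phase` is illustrative.

```python
def is_two_phase(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True            # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False                # acquiring a lock after releasing one
    return True

# Follows 2PL: the lock point is reached after lock(Y).
ok = [("lock", "X"), ("lock", "Y"), ("unlock", "X"), ("unlock", "Y")]
# Violates 2PL: Y is locked after X has been released.
bad = [("lock", "X"), ("unlock", "X"), ("lock", "Y"), ("unlock", "Y")]
print(is_two_phase(ok), is_two_phase(bad))  # True False
```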
Two-phase locking has the following variants –
Strict two-phase locking protocol
• A transaction can release a shared lock after the lock
point, but it cannot release any exclusive lock until
the transaction commits or aborts.
• It guarantees serializability.
• But it can lead to decreased concurrency and
increased contention for resources, as transactions
cannot release exclusive locks until they
commit.
Rigorous two-phase locking protocol
• A transaction cannot release any lock, either shared
or exclusive, until it commits or aborts.
• Like every 2PL variant, it guarantees serializability, but
cannot guarantee that deadlock will not happen.
• It is simpler to implement than strict 2PL, since all
locks are released together at the end of the transaction.

Conservative Two-Phase Locking (Conservative 2PL)
• Conservative (static) 2PL requires a transaction to lock
all the items it will access before it begins execution,
by predeclaring its read-set and write-set.
• If any of the predeclared items cannot be locked, the
transaction locks none of them and waits until all of
them are available.
• The advantage of conservative 2PL is that it is
deadlock-free, since a transaction never waits for a
lock while it is holding other locks.
• Like every 2PL variant, it guarantees conflict
serializability.
• The disadvantage of conservative 2PL is that it is
difficult to use in practice, because the read-set and
write-set are usually not known in advance.
• Additionally, holding all locks from the start reduces
concurrency.

7) Explain Boyce-Codd Normal Form (BCNF), 4NF and 5NF with
example.
Fourth normal form
(https://youtu.be/OTCuykFHBeA)

Fifth normal form
(https://youtu.be/mbj3HSK28Kk)

Fifth normal form (techtud)
(https://youtu.be/-VdEyjLHRlQ)

• BCNF is the advanced version of 3NF.
• It is stricter than 3NF.
• A table is in BCNF if, for every non-trivial functional
dependency X → Y, X is a super key of the table (a
candidate key or a superset of one).
• Equivalently, the table is in 3NF and the LHS of
every FD is a candidate key or super key.

Example
• Consider a table

Emp_id  Emp_Country  Dno  D_name      D_Loc
1       India        100  Sales       India
1       India        101  Production  UK
2       UK           102  Testing     UK
2       UK           103  Sales       USA

For the above table, the functional dependencies are:
• EMP_ID → EMP_COUNTRY
• DNO → {D_NAME, D_LOC}
• The candidate key is {EMP_ID, DNO}. The table is not in
BCNF because neither DNO nor EMP_ID alone is a key.
• To convert the given table into BCNF, we decompose it
into three tables:
EMP_COUNTRY table

Emp_id  Emp_Country
1       India
2       UK

DEPT table

Dno  D_name      D_Loc
100  Sales       India
101  Production  UK
102  Testing     UK
103  Sales       USA

EMP_DEPT_MAPPING table

Emp_id  Dno
1       100
1       101
2       102
2       103
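
The claim that neither EMP_ID nor DNO alone is a key can be checked by computing attribute closures under the two functional dependencies. The following is a minimal sketch; the helper names `closure` and `is_superkey` are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is already derivable, the RHS is derivable too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

ALL_ATTRS = {"EMP_ID", "EMP_COUNTRY", "DNO", "D_NAME", "D_LOC"}
FDS = [
    (frozenset({"EMP_ID"}), frozenset({"EMP_COUNTRY"})),
    (frozenset({"DNO"}), frozenset({"D_NAME", "D_LOC"})),
]

def is_superkey(attrs):
    """A set of attributes is a super key if its closure covers every attribute."""
    return closure(attrs, FDS) == ALL_ATTRS

print(is_superkey({"EMP_ID"}))          # False
print(is_superkey({"DNO"}))             # False
print(is_superkey({"EMP_ID", "DNO"}))   # True
```

Both FDs therefore have a left-hand side that is not a super key of the original table, which is why it violates BCNF; in each decomposed table, the left-hand side of every applicable FD is that table's key.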
8a) CAP Theorem
MongoDB, a popular NoSQL database, is often discussed in terms
of the CAP theorem. In its default configuration all writes, and
by default all reads, go to the primary node of a replica set,
which favours consistency and partition tolerance. Reads
directed to secondary nodes are only eventually consistent, and
settings such as read preference, write concern and
transactions let an application tune this trade-off, making
MongoDB versatile for distributed environments.
What is the CAP Theorem?
The CAP theorem was formulated by computer scientist Eric
Brewer in the early 2000s. It is a fundamental principle in
distributed systems. It states that a distributed data store
can provide at most two of the following three
guarantees at the same time: consistency, availability, and
partition tolerance.
 Consistency: Every read receives the most recent write
or an error.
 Availability: Every request receives a response, without
guaranteeing that it contains the most recent write.
 Partition Tolerance: The system continues to operate
despite network partitions that may cause messages to be
delayed or lost.
Understanding these trade-offs is crucial when designing
distributed systems and selecting the
appropriate database technology.
How Distributed Systems Break Availability or Consistency?
In distributed systems, achieving both high availability and
strong consistency simultaneously can be challenging due to
the inherent trade-offs defined by the CAP theorem. Here’s
how:
 Availability Breakdown: Distributed systems may face
availability issues when network partitions occur, leading
to nodes being unreachable. If a system prioritizes
consistency over availability, it may reject requests during
partitioning to maintain data integrity, thus reducing
availability.
 Consistency Breakdown: Ensuring strong
consistency across distributed nodes requires
coordination, which can introduce latency and increase
the risk of failures if nodes cannot communicate
effectively. In scenarios prioritizing availability, the system
might sacrifice consistency to remain operational during
partitions, potentially leading to eventual consistency
models.

9a) Explain the characteristics of MongoDB distributed systems
MongoDB, a popular NoSQL database, is designed to handle
large volumes of data and distribute it across multiple
servers. Here are some key characteristics of MongoDB's
distributed systems:
1. Sharding
 Horizontal Scaling: MongoDB uses sharding to distribute
data across multiple servers. This allows the database to
scale horizontally by adding more machines to handle
increased load.
 Shard Key: Data is divided into chunks based on a shard
key. This key is crucial as it determines how data is
distributed across shards.
 Balanced Distribution: MongoDB ensures data is evenly
distributed across shards to prevent any single node from
becoming a bottleneck.
2. Replica Sets
 High Availability: MongoDB ensures data redundancy and
high availability through replica sets. A replica set consists
of a primary node and multiple secondary nodes.
 Automatic Failover: If the primary node fails, one of the
secondary nodes is automatically elected as the new
primary, ensuring continuous availability.
 Data Consistency: Writes are sent to the primary node and
then replicated to secondary nodes, ensuring data
consistency across the replica set.
3. Data Consistency
• Eventual Consistency: Reads from secondary nodes are
eventually consistent, because secondaries apply the
primary's writes asynchronously and will eventually reflect
the same data. This is common in distributed systems to
improve performance.
 Read Preferences: Clients can configure read preferences
to balance between consistency and performance. For
example, reads can be directed to secondary nodes for
load balancing.
4. Fault Tolerance
 Redundancy: Data is replicated across multiple nodes,
providing redundancy and protecting against data loss.
 Self-Healing: MongoDB's architecture can self-heal by
redistributing data and re-electing nodes in case of
failures.
5. Scalability
 Elastic Scalability: MongoDB can scale both horizontally
and vertically, accommodating growth in data volume and
user load.
 Dynamic Sharding: Shards can be added or removed from
the cluster dynamically, allowing the system to adapt to
changing demands.

6. Load Balancing
 Query Routing: MongoDB uses a query router (mongos) to
route queries to the appropriate shards, balancing the
load across the cluster.
 Chunk Migration: To maintain balanced data distribution,
MongoDB can move data chunks between shards as
needed.
7. Geographically Distributed Clusters
 Global Clusters: MongoDB supports geographically
distributed clusters, allowing data to be stored closer to
users for reduced latency and compliance with data
residency requirements.
 Multi-Region Replication: Data can be replicated across
different regions, ensuring global availability and disaster
recovery.
8. Consistency and Durability
 Write Concerns: MongoDB allows configuring write
concerns to specify the level of acknowledgment required
from the database for write operations, balancing between
performance and data durability.
 Journaling: MongoDB uses journaling to provide durability
by recording write operations before they are applied to
the database.
9. Operational Simplicity
 Automation: MongoDB provides tools for automation of
complex tasks such as provisioning, scaling, and backups.
 Monitoring and Management: Tools like MongoDB Atlas
offer integrated monitoring and management capabilities
for distributed clusters.

10) What is a trigger? Create a row level trigger for
the customers table that would fire for INSERT or
UPDATE or DELETE operations performed on the
CUSTOMERS table. This trigger will display the salary
difference between the old & new Salary.
CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY)
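
A trigger is a stored block of statements that the database runs automatically when a specified event (INSERT, UPDATE or DELETE) occurs on a table; a row-level trigger fires once for each affected row. The sketch below illustrates the idea with SQLite from Python and is not the lab program referred to in the question. It rests on several assumptions: SQLite needs one trigger per event rather than a single INSERT OR UPDATE OR DELETE trigger, it cannot print from a trigger, so a hypothetical SALARY_LOG table records the old salary, new salary and difference instead, and the sample row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER,
                        ADDRESS TEXT, SALARY REAL);
-- Hypothetical log table standing in for displayed output.
CREATE TABLE SALARY_LOG (ID INTEGER, OLD_SALARY REAL, NEW_SALARY REAL, DIFF REAL);

CREATE TRIGGER trg_customers_insert AFTER INSERT ON CUSTOMERS
FOR EACH ROW BEGIN
    INSERT INTO SALARY_LOG VALUES (NEW.ID, NULL, NEW.SALARY, NULL);
END;

CREATE TRIGGER trg_customers_update AFTER UPDATE OF SALARY ON CUSTOMERS
FOR EACH ROW BEGIN
    INSERT INTO SALARY_LOG
    VALUES (NEW.ID, OLD.SALARY, NEW.SALARY, NEW.SALARY - OLD.SALARY);
END;

CREATE TRIGGER trg_customers_delete AFTER DELETE ON CUSTOMERS
FOR EACH ROW BEGIN
    INSERT INTO SALARY_LOG VALUES (OLD.ID, OLD.SALARY, NULL, NULL);
END;
""")

conn.execute("INSERT INTO CUSTOMERS VALUES (1, 'Ramesh', 32, 'Davangere', 2000)")
conn.execute("UPDATE CUSTOMERS SET SALARY = SALARY + 500 WHERE ID = 1")
conn.execute("DELETE FROM CUSTOMERS WHERE ID = 1")

log = conn.execute(
    "SELECT OLD_SALARY, NEW_SALARY, DIFF FROM SALARY_LOG ORDER BY rowid"
).fetchall()
for old, new, diff in log:
    print("Old salary:", old, "New salary:", new, "Difference:", diff)
```

On INSERT there is no old row and on DELETE there is no new row, so the difference is only defined for the UPDATE, where it is 500.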
