
Transaction Processing in DBMS

transaction processing concepts in dbms

Transaction processing in Database Management Systems (DBMS) involves managing a sequence of operations (transactions) in a reliable, consistent, and secure manner. Below are the key concepts:

1. Transaction

A transaction is a sequence of operations performed as a single logical unit of work. It may involve
INSERT, UPDATE, DELETE, or SELECT operations.
Example: Transferring money from one account to another involves two operations: debiting one
account and crediting another.

2. Properties of Transactions (ACID)

To ensure data integrity, transactions must satisfy the following properties:

A. Atomicity

Ensures that either all operations of a transaction are completed or none are.
Example: If a money transfer fails midway, all changes made so far are rolled back.

B. Consistency

Ensures the database transitions from one consistent state to another.
Example: After a successful transaction, the total balance across all accounts remains unchanged.

C. Isolation

Transactions are executed independently of each other, preventing intermediate states from being
visible to other transactions.
Example: Two users transferring money simultaneously won't affect each other's transactions.

D. Durability

Once a transaction is committed, its changes are permanently stored, even in case of system
failure.
Example: After a successful payment, the database ensures the payment record persists.
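
To make these properties concrete, here is a minimal Python sketch of the money-transfer example using sqlite3, assuming a hypothetical `accounts(id, balance)` table. Either both updates take effect together or neither does:

```python
import sqlite3

def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> None:
    """Move `amount` from account `src` to `dst` as one logical unit of work."""
    try:
        cur = conn.cursor()
        cur.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                    (amount, src))
        cur.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                    (amount, dst))
        conn.commit()    # durability: once this returns, the transfer persists
    except Exception:
        conn.rollback()  # atomicity: a failure midway undoes the debit too
        raise
```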

3. Transaction States

Transactions go through the following states:

1. Active: Transaction is being executed.
2. Partially Committed: All operations are complete, but changes are not yet permanent.
3. Committed: Changes are saved to the database.
4. Failed: An error occurs, and the transaction cannot proceed.
5. Aborted: Changes are undone, and the database returns to its original state.

4. Concurrency Control

Concurrency issues arise when multiple transactions execute simultaneously. To handle this:

Locking Mechanisms:
Shared Lock (Read)
Exclusive Lock (Write)
Concurrency Control Protocols:
Two-Phase Locking (2PL)
Timestamp Ordering

5. Recovery

In case of system failure, the DBMS uses logs and recovery techniques to maintain consistency:

Undo: Reverse changes of incomplete transactions.
Redo: Reapply changes of committed transactions.

6. Schedules

The order of execution of transactions:

Serial Schedule: Transactions execute one after the other.
Concurrent Schedule: Operations of several transactions execute in overlapping time periods.
Serializable Schedule: A concurrent schedule whose outcome is equivalent to that of some serial schedule.

7. Deadlock

Occurs when two or more transactions wait for each other to release resources. Deadlock handling
techniques:

Deadlock Prevention: Avoiding circular waits.
Deadlock Detection and Resolution: Detecting and aborting transactions involved in the deadlock.

These concepts ensure efficient, secure, and reliable transaction processing in DBMS.

Serializability

Serializability in DBMS

Serializability is a fundamental concept in concurrency control within DBMS. It ensures that the outcome
of executing multiple transactions concurrently is equivalent to the outcome of executing them
serially (one after the other), preserving the consistency of the database.

Types of Serializability

There are two main types:

1. Conflict Serializability

Based on the concept of conflicting operations:
Two operations conflict if:
They belong to different transactions.
They operate on the same data item.
At least one of them is a write operation.
A schedule is conflict-serializable if it can be transformed into a serial schedule by swapping non-
conflicting operations.

Example:

Consider two transactions, T1 and T2:

T1: R(A), W(A)
T2: R(B), W(B)

In the interleaved schedule T1: R(A) → T2: R(B) → T1: W(A) → T2: W(B), no two operations conflict (the transactions access different data items), so the non-conflicting operations can be swapped to obtain the serial schedule T1 → T2; the schedule is conflict-serializable. By contrast, the interleaving T1: R(A) → T2: R(A) → T1: W(A) → T2: W(A) is not conflict-serializable: R(A) of T1 precedes W(A) of T2 (requiring T1 → T2), while R(A) of T2 precedes W(A) of T1 (requiring T2 → T1), and no serial order satisfies both.

2. View Serializability

Based on view equivalence. Two schedules are view-equivalent if:

Initial Reads: Each transaction reads the same initial values in both schedules.
Read-From (RF): Each transaction reads the value written by the same transaction in both schedules.
Final Writes: The final write on each data item is performed by the same transaction in both schedules.

A schedule is view-serializable if it is view-equivalent to some serial schedule, even if conflicting operations are reordered.

Testing Serializability

To determine whether a schedule is serializable:

1. Precedence Graph (Serialization Graph)

Nodes represent transactions.

Edges represent dependencies between transactions (e.g., T1 → T2 if a write of T1 is read or overwritten by T2).
If the graph is acyclic, the schedule is serializable.

Example:

For the schedule T1: R(A), W(A) followed by T2: R(A), W(A):

T1 → T2 (W(A) of T1 precedes R(A) and W(A) of T2)

The graph has a single edge and no cycle, so the schedule is serializable (equivalent to the serial order T1 → T2).
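
The acyclicity test is straightforward to automate. Below is a small illustrative Python sketch (the function name and schedule encoding are assumptions, not from any particular library): it builds the precedence graph from a schedule given as (transaction, action, item) triples and checks for a cycle with depth-first search:

```python
from collections import defaultdict

def is_conflict_serializable(schedule):
    # Precedence-graph edges: Ti -> Tj for each conflicting pair (different
    # transactions, same item, at least one write) in which Ti comes first.
    edges = defaultdict(set)
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges[ti].add(tj)

    # Conflict-serializable iff the precedence graph is acyclic (DFS check).
    visited, on_stack = set(), set()

    def has_cycle(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in edges[node]:
            if nxt in on_stack or (nxt not in visited and has_cycle(nxt)):
                return True
        on_stack.discard(node)
        return False

    txns = {t for t, _, _ in schedule}
    return not any(has_cycle(t) for t in txns if t not in visited)

# The interleaving R1(A), R2(A), W1(A), W2(A) discussed above:
s = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "W", "A"), ("T2", "W", "A")]
print(is_conflict_serializable(s))  # False: T1 -> T2 and T2 -> T1 form a cycle
```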

Serializability vs. Recoverability

Serializability ensures consistency during concurrency.
Recoverability ensures consistency after transaction failure.

Importance of Serializability

Maintains data consistency in concurrent transactions.
Prevents anomalies like:
Dirty Reads
Lost Updates
Non-Repeatable Reads
Guarantees correct results irrespective of transaction execution order.

Serializability is the gold standard for correctness in concurrent transaction scheduling.

conflict vs view serializable schedule

Conflict Serializability vs. View Serializability

Both conflict serializability and view serializability are methods for determining whether a schedule
is equivalent to a serial schedule. However, they differ in approach and strictness.

Conflict Serializability

Definition: A schedule is conflict-serializable if it can be transformed into a serial schedule by swapping non-conflicting operations without altering the order of conflicting operations.

Key Points:

1. Conflicting Operations:
Two operations conflict if:
They belong to different transactions.
They operate on the same data item.
At least one of them is a write operation.

in table

Here's a table comparing Conflict Serializability and View Serializability:

| Aspect | Conflict Serializability | View Serializability |
| --- | --- | --- |
| Definition | A schedule is conflict-serializable if it can be transformed into a serial schedule by swapping non-conflicting operations. | A schedule is view-serializable if it produces the same results as a serial schedule in terms of reads and writes. |
| Focus | Based on conflicting operations (read/write dependencies). | Based on read-from, initial-read, and final-write equivalence. |
| Complexity | Easier to check, using a precedence graph. | Harder to check; requires detailed analysis of read-write patterns. |
| Conditions | Checks only the order of conflicting operations between transactions. | Ensures transactions have the same view (reads-from relationships and final writes). |
| Strictness | Stricter than view serializability. | More general; all conflict-serializable schedules are view-serializable, but not vice versa. |
| Testing Method | Uses a precedence graph: the schedule is serializable if the graph is acyclic. | No straightforward graph-based test; involves deeper analysis of equivalence. |
| Relationship | Every conflict-serializable schedule is also view-serializable. | Not all view-serializable schedules are conflict-serializable. |
| Example of Difference | A schedule with blind writes that produces the same final state may fail conflict serializability. | Accepts some schedules that conflict serializability rejects, provided they produce the correct final state. |

Example:

Consider three transactions with blind writes:

1. T1: R(A), W(A)
2. T2: W(A)
3. T3: W(A)

Schedule: T1: R(A) → T2: W(A) → T1: W(A) → T3: W(A)

Conflict-Serializability: The precedence graph contains T1 → T2 (T1's R(A) precedes T2's W(A)) and T2 → T1 (T2's W(A) precedes T1's W(A)), so it is cyclic and the schedule is not conflict-serializable.
View-Serializability: The schedule is view-equivalent to the serial order T1 → T2 → T3: T1 reads the initial value of A in both, no transaction reads from another, and T3 performs the final write in both. The schedule is therefore view-serializable but not conflict-serializable.

concurrency control

Concurrency Control in DBMS

Concurrency control in DBMS ensures that multiple transactions can execute simultaneously without
interfering with each other, maintaining data consistency, isolation, and integrity. It resolves issues
like data conflicts, deadlocks, and anomalies while adhering to the ACID properties.

Goals of Concurrency Control

1. Ensure Data Consistency: Prevent inconsistent database states due to concurrent transactions.
2. Maintain Isolation: Ensure transactions do not affect one another’s intermediate states.
3. Increase Throughput: Maximize the number of transactions processed in a given time.
4. Prevent Anomalies:
Dirty Reads: Reading uncommitted changes.
Non-repeatable Reads: Different values read for the same data in a transaction.
Lost Updates: Overwriting updates from concurrent transactions.

Concurrency Problems

1. Dirty Reads: Reading data that is modified but not yet committed.
2. Lost Updates: Two transactions simultaneously update a data item, and one overwrites the other.
3. Non-repeatable Reads: A value read multiple times in a transaction differs due to another
transaction's updates.
4. Phantom Reads: A transaction reads a set of rows, and another transaction inserts or deletes
rows, causing a different result upon re-reading.

Concurrency Control Techniques

1. Lock-Based Protocols

Shared (S) Lock: Allows multiple transactions to read but not write a data item.
Exclusive (X) Lock: Allows only one transaction to read/write the data item.
Types:
Two-Phase Locking (2PL):
Growing Phase: Transaction acquires all locks.
Shrinking Phase: Transaction releases locks.
Ensures serializability but can lead to deadlocks.
Strict 2PL: Holds all exclusive (write) locks until the transaction commits or aborts.
Rigorous 2PL: Holds all locks, shared and exclusive, until the transaction commits or aborts.

2. Timestamp-Based Protocols

Assigns a unique timestamp to each transaction at its start.
Rules (basic timestamp ordering):
A transaction may read a data item only if no younger (later-timestamped) transaction has already written it; otherwise it is aborted and restarted.
A transaction may write a data item only if no younger transaction has already read or written it; otherwise it is aborted and restarted.
Conflicting operations therefore execute in timestamp order, prioritizing older transactions.
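
A minimal sketch of these rules, assuming each data item tracks the largest read and write timestamps it has seen (class and field names are illustrative):

```python
class BasicTimestampOrdering:
    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> largest timestamp that has written it

    def read(self, ts: int, item: str) -> None:
        # Reject the read if a younger transaction already wrote the item.
        if ts < self.write_ts.get(item, 0):
            raise RuntimeError(f"T{ts} aborted: {item} written by a younger txn")
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)

    def write(self, ts: int, item: str) -> None:
        # Reject the write if a younger transaction already read or wrote it.
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            raise RuntimeError(f"T{ts} aborted: {item} used by a younger txn")
        self.write_ts[item] = ts
```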

3. Optimistic Concurrency Control (OCC)

Assumes conflicts are rare and allows transactions to proceed without locking.

Phases:
1. Read Phase: Transactions read data and make changes locally.
2. Validation Phase: Checks for conflicts before committing.
3. Write Phase: Commits changes if validation succeeds.
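
The validation phase can be sketched in a few lines. Under backward validation (one common variant; all names here are illustrative assumptions), the committing transaction's read set is compared against the write sets of transactions that committed while it was running:

```python
def validate(read_set: set, overlapping_committed_write_sets: list) -> bool:
    """Backward validation: fail if an overlapping committed transaction
    wrote anything this transaction read."""
    return all(not (ws & read_set) for ws in overlapping_committed_write_sets)

# T read {A, B}; meanwhile one commit wrote {C} and another wrote {B}.
print(validate({"A", "B"}, [{"C"}]))         # True: no overlap, safe to commit
print(validate({"A", "B"}, [{"C"}, {"B"}]))  # False: abort and restart
```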

4. Multiversion Concurrency Control (MVCC)

Maintains multiple versions of data items to allow simultaneous reads and writes.
Transactions read the version of the data item that was committed before their start time.
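
A sketch of the read rule, keeping each item's committed versions sorted by commit timestamp (the data layout is an illustrative simplification):

```python
import bisect

# item -> list of (commit_ts, value), sorted by commit timestamp
versions = {"A": [(0, 100), (5, 120), (9, 150)]}

def mvcc_read(item: str, start_ts: int):
    """Return the newest version committed at or before the reader's start
    (assumes at least one such version exists)."""
    chain = versions[item]
    i = bisect.bisect_right(chain, (start_ts, float("inf"))) - 1
    return chain[i][1]

# A transaction that started at ts=7 sees the version committed at ts=5,
# even though a writer later committed a newer version at ts=9.
print(mvcc_read("A", 7))  # 120
```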

5. Deadlock Handling

Deadlock Prevention: Enforces rules to avoid circular wait conditions (e.g., timeout, wait-die,
wound-wait schemes).
Deadlock Detection: Periodically checks for cycles in the wait-for graph.
Deadlock Recovery: Aborts one or more transactions to resolve the deadlock.
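
As one example of a prevention rule, the wait-die scheme allows waiting only "old-for-young", so a cycle of waiting transactions can never form; wound-wait is the symmetric variant in which the older transaction preempts the younger. A tiny illustrative sketch:

```python
def wait_die(requester_ts: int, holder_ts: int) -> str:
    """Resolve a lock conflict under wait-die; smaller timestamp = older."""
    if requester_ts < holder_ts:
        return "wait"  # an older transaction may wait for a younger one
    return "die"       # a younger requester aborts (keeping its timestamp)

print(wait_die(3, 7))  # 'wait': T3 is older than T7
print(wait_die(7, 3))  # 'die':  T7 is younger, so it aborts instead of waiting
```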

Performance Trade-offs

Lock-Based Protocols: High consistency but may lead to deadlocks and reduced parallelism.
Optimistic Protocols: High parallelism but potential rollback overhead.
MVCC: Handles read-heavy workloads efficiently but requires more storage.

Key Concepts

Granularity of Locking: Level at which locks are applied (e.g., row, page, table).
Isolation Levels:
Read Uncommitted: Least strict; allows dirty reads.
Read Committed: Prevents dirty reads.
Repeatable Read: Prevents non-repeatable reads.
Serializable: Strictest; prevents all anomalies.

Concurrency control is critical for maintaining a balance between performance and consistency in a
multi-user environment.

recovery with concurrent transaction

Recovery with Concurrent Transactions in DBMS

Recovery in a concurrent transaction environment ensures that the database remains consistent after
failures, such as system crashes or transaction aborts, while allowing multiple transactions to execute
simultaneously. Recovery mechanisms focus on maintaining atomicity and durability, even when
transactions interact.

Key Concepts

1. Transaction States:
Active: Transaction is executing.
Partially Committed: All operations are complete, but changes are not yet durable.
Committed: Transaction is successfully completed, and changes are permanent.
Failed: Transaction cannot proceed due to an error.
Aborted: Transaction is rolled back, and all changes are undone.
2. Logs:
Recovery relies on logs, which record details about each transaction.
Write-Ahead Logging (WAL):
Log entries must be written to stable storage before the actual data is updated.
Types of log entries:
UNDO Log: Used to reverse changes made by aborted or incomplete transactions.
REDO Log: Used to reapply changes made by committed transactions.

Recovery Techniques

1. Checkpointing

Periodically saves the current state of the database and logs to reduce recovery overhead.
Steps:
1. Suspend new transactions momentarily.
2. Write all modified data and logs to disk.
3. Record a checkpoint entry in the log.

2. Recovery Phases

Analysis Phase:
Identifies active, committed, and aborted transactions at the time of failure.
Determines dirty pages (pages in memory but not yet written to disk).
Redo Phase:
Reapplies changes from the log for committed transactions to ensure durability.
Starts from the checkpoint.
Undo Phase:
Rolls back changes of uncommitted transactions to maintain atomicity.
Processes log entries in reverse order.

Concurrency Challenges in Recovery

1. Interleaving of Operations:
Operations from multiple transactions are interleaved, complicating recovery.
Logs must preserve the order of operations for correct recovery.
2. Dirty Reads and Cascading Aborts:
If a transaction reads uncommitted changes of another transaction, a failure may require
rolling back multiple transactions (cascading aborts).
Prevented by strict protocols like Strict Two-Phase Locking (2PL).
3. Concurrency Control During Recovery:
Recovery must handle concurrent transactions without introducing inconsistencies.
Transactions accessing the database during recovery may encounter delays.

Protocols for Recovery

1. Deferred Update Protocol:
Changes are not written to the database until the transaction commits.
Redo is needed for committed transactions; undo is not required.
2. Immediate Update Protocol:
Changes are written to the database immediately but must be logged.
Both undo and redo may be required.

Example of Recovery

Given Log:

```text
START T1
WRITE T1 A=50
START T2
WRITE T2 B=30
COMMIT T1
WRITE T2 C=40
CRASH
```

Recovery Process:

1. Analysis:
Transactions active at crash: T2.
Committed transactions: T1.
2. Redo:
Apply changes of T1 (committed).
3. Undo:
Revert changes made by T2 (active).
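
The three steps can be sketched in Python, modeling the log as a list of tuples and the database as a dict (both are illustrative simplifications; `pop` stands in for restoring the old value that an undo log would record):

```python
log = [
    ("START", "T1"), ("WRITE", "T1", "A", 50),
    ("START", "T2"), ("WRITE", "T2", "B", 30),
    ("COMMIT", "T1"), ("WRITE", "T2", "C", 40),
]  # the crash occurs after the last record

# Analysis: classify transactions as of the crash.
committed = {r[1] for r in log if r[0] == "COMMIT"}
losers = {r[1] for r in log if r[0] == "START"} - committed  # {'T2'}

# Under immediate update, the disk may hold any mix of these writes.
db = {"A": 50, "B": 30, "C": 40}

# Redo (forward scan): reapply committed writes to guarantee durability.
for r in log:
    if r[0] == "WRITE" and r[1] in committed:
        db[r[2]] = r[3]

# Undo (backward scan): roll back loser writes to guarantee atomicity.
for r in reversed(log):
    if r[0] == "WRITE" and r[1] in losers:
        db.pop(r[2], None)

print(db)  # {'A': 50}: T1's update survives, T2's work is rolled back
```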

Cascadeless Schedules for Recovery

To simplify recovery, use cascadeless schedules, where transactions are only allowed to read
committed data, preventing cascading aborts.

Recovery with concurrent transactions combines logging, checkpointing, and concurrency control to
ensure data consistency and durability, even in complex multi-transaction scenarios.

oodbms vs dbms

Here’s a comparison between Object-Oriented Database Management Systems (OODBMS) and Relational Database Management Systems (RDBMS/DBMS):

| Aspect | OODBMS | DBMS (RDBMS) |
| --- | --- | --- |
| Data Model | Object-oriented: data is stored as objects, as in object-oriented programming (OOP). | Relational: data is stored in tables with rows and columns. |
| Structure | Follows OOP principles with classes, objects, inheritance, and encapsulation. | Based on a relational schema with primary keys, foreign keys, and normalization. |
| Complex Data Support | Easily handles complex data types like multimedia, hierarchies, and user-defined types. | Less suited for complex data types; best for structured data. |
| Query Language | Uses OQL (Object Query Language) or programming languages like Java or C++. | Uses SQL (Structured Query Language) for data manipulation and retrieval. |
| Data Relationships | Relationships are represented using object references. | Relationships are represented using foreign keys. |
| Inheritance | Supports class inheritance, enabling object reuse and extension. | No direct support for inheritance; must be modeled manually. |
| Flexibility | Highly flexible for dynamic and complex applications. | Less flexible for hierarchical or non-tabular data. |
| Performance | Efficient for applications with complex data and frequent object navigation. | Performs better for structured data and transactional systems. |
| Schema Evolution | Supports easier schema evolution without requiring extensive restructuring. | Schema changes are more complex and often require data migration. |
| Standardization | Lacks a widely accepted standard; vendors use different implementations. | Highly standardized; SQL is universally supported. |
| Integration with OOP | Integrates seamlessly with object-oriented programming languages. | Requires object-relational mapping (ORM) tools for OOP language integration. |
| Applications | Used in CAD, CAM, multimedia applications, real-time systems, and scientific databases. | Widely used in business, finance, e-commerce, and general-purpose applications. |
| Maturity | Relatively newer and less mature compared to RDBMS. | Established and mature technology with extensive industry support. |

Key Takeaways:

1. Use OODBMS when:
You need to handle complex data types and relationships.
Your application heavily uses OOP.
Schema flexibility and real-time data navigation are critical.
2. Use DBMS (RDBMS) when:
Your data is structured and tabular.

You need robust query capabilities using SQL.
Industry standards, maturity, and interoperability are priorities.

fragmentation

Fragmentation in DBMS

Fragmentation in DBMS refers to dividing a database into smaller, logical pieces called fragments.
These fragments are distributed across different sites in a distributed database system to improve
performance, availability, and manageability. Each fragment can be stored and accessed
independently.

Types of Fragmentation

1. Horizontal Fragmentation:
Divides a table into subsets of rows (tuples) based on a condition.
Each fragment contains a subset of rows that satisfy a specific predicate.
All fragments share the same schema (column structure).
Example:
Table: `Employee(ID, Name, Dept, Salary)`
Condition: `Dept = "HR"` and `Dept = "Finance"`
Fragment 1: Employees in HR.
Fragment 2: Employees in Finance.
2. Vertical Fragmentation:
Divides a table into subsets of columns (attributes), along with a key attribute to ensure
reconstruction.
Each fragment contains some attributes of the table.
Example:
Table: `Employee(ID, Name, Dept, Salary)`
Fragment 1: `(ID, Name)`
Fragment 2: `(ID, Dept, Salary)`
Reconstruction: Join fragments on the key attribute (`ID`).
3. Hybrid (Mixed) Fragmentation:
Combines horizontal and vertical fragmentation.
A table is first horizontally fragmented, and each horizontal fragment is then vertically
fragmented (or vice versa).
Example:
Horizontal fragmentation by `Dept`.
Vertical fragmentation of HR department into `(ID, Name)` and `(ID, Salary)`.
4. Derived Fragmentation:
Fragments of one table are derived based on the fragmentation of another table.
Ensures referential integrity when there are relationships between tables.
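
To make the first two types concrete, here is a small Python sketch that fragments the `Employee` table from the examples above and then reconstructs it, with lists of dicts standing in for relations:

```python
employees = [
    {"ID": 1, "Name": "Asha", "Dept": "HR", "Salary": 50000},
    {"ID": 2, "Name": "Ravi", "Dept": "Finance", "Salary": 60000},
]

# Horizontal fragmentation: row subsets selected by a predicate on Dept.
hr = [r for r in employees if r["Dept"] == "HR"]
fin = [r for r in employees if r["Dept"] == "Finance"]
# Reconstruction is the union of the fragments.
assert sorted(hr + fin, key=lambda r: r["ID"]) == employees

# Vertical fragmentation: column subsets, each carrying the key ID.
frag1 = [{"ID": r["ID"], "Name": r["Name"]} for r in employees]
frag2 = [{"ID": r["ID"], "Dept": r["Dept"], "Salary": r["Salary"]} for r in employees]

# Reconstruction: join the vertical fragments on the key attribute ID.
by_id = {r["ID"]: r for r in frag2}
rebuilt = [{**r, **by_id[r["ID"]]} for r in frag1]
assert rebuilt == employees
```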

Advantages of Fragmentation

1. Improved Performance:
Queries are executed on relevant fragments, reducing data transfer and processing time.
2. Increased Parallelism:
Multiple fragments can be processed simultaneously across different sites.
3. Better Resource Utilization:
Allows data to be stored closer to where it is frequently accessed.
4. Enhanced Availability:
If one fragment or site fails, others remain accessible.
5. Scalability:
Fragments can be added or distributed across new nodes as the system grows.

Disadvantages of Fragmentation

1. Complexity:
Query processing becomes more complex, requiring reassembly of fragmented data.
2. Overhead:
Managing and maintaining fragments requires additional resources.
3. Data Redundancy:
Fragments may need to be replicated for fault tolerance, increasing storage needs.
4. Reconstruction:
Joining fragments to satisfy global queries may incur additional costs.

Fragmentation Design Principles

1. Completeness: Ensure that all data can be accessed through the fragments.
2. Reconstruction: Ensure that the original table can be reconstructed without loss of information.
3. Disjointness: Fragments should not overlap unless redundancy is necessary (e.g., replication).

Fragmentation plays a crucial role in distributed databases, enhancing system efficiency while posing
challenges in query optimization and data integrity.

multimedia

Multimedia in the Context of Computing and Databases

Multimedia refers to the integration of multiple forms of content, such as text, images, audio, video,
graphics, and animation, into a cohesive digital experience. It is widely used in various fields, including
entertainment, education, advertising, and communication.

Key Components of Multimedia

1. Text: The most basic element, used to convey information.

2. Images/Graphics: Visual representations, including photographs, drawings, and charts.
3. Audio: Sound, including music, voice, or sound effects.
4. Video: Moving visual media, often combined with audio.
5. Animation: Simulated motion or transitions between states.
6. Interactive Content: User-engaged elements like clickable links, forms, or virtual reality
environments.

Multimedia in Databases (Multimedia DBMS)

A Multimedia Database Management System (MMDBMS) is designed to store, retrieve, and manage
multimedia content effectively.

Key Features of MMDBMS:

1. Storage of Complex Data Types:
Supports various multimedia formats such as MP4, JPEG, MP3, and more.
Requires large storage capacities and efficient indexing mechanisms.
2. Search and Retrieval:
Content-Based Retrieval: Searching for images, videos, or audio based on features like color,
texture, or patterns.
Metadata-Based Retrieval: Using metadata like title, tags, or timestamps.
3. Compression and Decompression:
Uses algorithms to reduce storage requirements (e.g., JPEG, MPEG, MP3).
4. Streaming Support:
Enables real-time playback of audio or video content.
5. Integration with Other Systems:
Supports integration with web servers, content delivery networks (CDNs), and client
applications.

Applications of Multimedia

1. Entertainment:
Streaming services like Netflix, YouTube, or Spotify.
Gaming and virtual reality.
2. Education:
E-learning platforms with video tutorials, interactive graphics, and animations.
3. Marketing and Advertising:
Multimedia-rich advertisements, banners, and promotional videos.
4. Healthcare:
Storing medical imaging data like X-rays and MRIs.
Interactive tools for patient education.
5. Communication:
Video conferencing, podcasts, and multimedia messaging.

Challenges in Multimedia Systems

1. Large Storage Requirements:
Multimedia files are typically large and require efficient storage solutions.
2. High Bandwidth Demand:
Streaming and downloading multimedia content requires high-speed networks.
3. Data Integration:
Combining multimedia content with structured data in traditional databases.
4. Content Retrieval:
Advanced techniques are needed for effective search and retrieval.
5. Standardization:
Handling diverse file formats and codecs.

Technologies Used in Multimedia

1. File Formats:
Images: JPEG, PNG, GIF.
Audio: MP3, WAV, AAC.
Video: MP4, AVI, MKV.
2. Compression Standards:
Images: JPEG, PNG.
Video: MPEG, H.264, H.265.
3. Tools and Platforms:
Editing: Adobe Photoshop, Premiere Pro, Audacity.
Delivery: Content delivery networks (e.g., Akamai).

Conclusion

Multimedia integrates diverse forms of content to enhance user experience. In computing, effective
management of multimedia requires specialized tools and technologies to address challenges like large
data sizes, retrieval complexities, and high bandwidth needs.

log

Log in DBMS

A log in a Database Management System (DBMS) is a crucial component used for maintaining the
consistency, durability, and recoverability of a database. It records all the operations and transactions
performed on the database, ensuring that the system can recover from failures while adhering to ACID
properties.

Purpose of Logs

1. Durability: Ensures committed transactions are not lost in case of system failures.
2. Atomicity: Allows rollback of incomplete or failed transactions.
3. Recovery: Facilitates recovery of the database to a consistent state after a crash or error.

4. Concurrency: Helps manage concurrent transactions and resolve conflicts.

Types of Logging

1. Write-Ahead Logging (WAL):
Ensures that log entries are written to stable storage before actual database changes.
Guarantees recovery by maintaining a record of actions that can be undone or redone.
2. Undo Logging:
Logs old values before they are modified, allowing the system to roll back changes if needed.
Example:

```text
<T1, Start>
<T1, A, 50>    // Before change, A = 50
<T1, Commit>
```

3. Redo Logging:
Logs new values of the data items being updated, ensuring that committed transactions can be redone if required.
Example:

```text
<T1, Start>
<T1, A, 70>    // After change, A = 70
<T1, Commit>
```

4. Undo-Redo Logging:
Combines both undo and redo logs to handle partial or failed transactions and ensure
recovery.

Structure of a Log Record

A typical log record contains:

1. Transaction Identifier: Unique ID of the transaction.
2. Operation: Type of operation (e.g., READ, WRITE).
3. Data Item: The item being operated on.
4. Old Value: Value of the data before the operation (for undo).
5. New Value: Value of the data after the operation (for redo).
6. Timestamp: Time when the operation was logged.
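
A sketch of this layout as a Python dataclass, together with the write-ahead discipline (the record, carrying the old value, is appended before the data item changes); the in-memory `log` list stands in for stable storage:

```python
import time
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class LogRecord:
    txn_id: str                      # transaction identifier
    operation: str                   # e.g. "START", "WRITE", "COMMIT"
    data_item: Optional[str] = None  # None for START/COMMIT/ABORT records
    old_value: Any = None            # consulted by undo
    new_value: Any = None            # consulted by redo
    timestamp: float = field(default_factory=time.time)

log: list = []
db = {"A": 50}

def logged_write(txn_id: str, item: str, value: Any) -> None:
    # Write-ahead rule: append (and, in a real system, force to disk) the
    # log record before the data item itself is updated.
    log.append(LogRecord(txn_id, "WRITE", item, db.get(item), value))
    db[item] = value

logged_write("T1", "A", 70)
print(log[-1].old_value, log[-1].new_value)  # 50 70
```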

Logging Process

1. Start of Transaction:

Log entry records the transaction's beginning.
Example: `<T1, Start>`.
2. During Transaction:
Operations are logged as they are performed.
Example: `<T1, A, Old=50, New=70>`.
3. Commit/Abort:
A log entry indicates the transaction's outcome.
Example:
Commit: `<T1, Commit>`
Abort: `<T1, Abort>`
4. Checkpoint:
A checkpoint log periodically records the state of active transactions and dirty pages in
memory, reducing recovery time.
Example: `<Checkpoint, T1, T2>`.

Advantages of Logging

1. Crash Recovery:
Helps restore the database to a consistent state after a crash.
2. Low Overhead:
Log writes are sequential appends, which keeps their impact on normal processing modest.
3. Support for Concurrency:
Ensures correctness even in concurrent transaction execution.

Recovery Using Logs

1. Analysis Phase:
Identifies active, committed, and aborted transactions at the time of failure.
2. Redo Phase:
Reapplies committed changes using redo logs to ensure durability.
3. Undo Phase:
Rolls back uncommitted changes using undo logs to maintain atomicity.

Challenges in Logging

1. Storage Overhead:
Log files can grow significantly, requiring periodic management.
2. Complexity in Implementation:
Requires careful synchronization and error handling.
3. Performance Impact:
Frequent logging operations may affect throughput in high-transaction environments.

Logs are essential for ensuring the reliability and robustness of database systems, particularly in
environments where data integrity is critical.

2 phase locking and variants

Two-Phase Locking (2PL) in DBMS

Two-Phase Locking (2PL) is a concurrency control protocol that ensures serializability in transaction
management. The protocol governs the acquisition and release of locks in a transaction to avoid
conflicts between concurrent transactions.
In 2PL, each transaction goes through two distinct phases:

1. Growing Phase: A transaction can acquire locks but cannot release any.
2. Shrinking Phase: A transaction can release locks but cannot acquire any new locks.

The key principle is that once a transaction releases a lock, it can no longer acquire new locks, ensuring
that all its data access operations are completed before any locks are released.

Phases of Two-Phase Locking

1. Growing Phase:
A transaction can acquire any number of locks, but it cannot release any locks during this
phase.
2. Shrinking Phase:
Once the transaction releases a lock, it enters the shrinking phase.
During this phase, no new locks can be acquired, but it can continue releasing locks.

Types of 2PL

1. Strict Two-Phase Locking (Strict 2PL):
A transaction holds all its exclusive (write) locks until it commits or aborts.
Because no other transaction can read or overwrite uncommitted data, Strict 2PL guarantees serializability and prevents cascading aborts.
Example:
Transaction T1 acquires write locks on resources A, B, and C, and doesn't release any until it commits.
2. Rigorous Two-Phase Locking:
A stricter form of 2PL in which a transaction releases all of its locks, shared and exclusive, only when it commits or aborts.
Ensures serializability, prevents cascading aborts, and serializes transactions in the order they commit.
3. Basic Two-Phase Locking (Basic 2PL):
A transaction may release locks before it finishes, but only after it has acquired every lock it needs (the two-phase rule).
Basic 2PL ensures serializability but does not guarantee freedom from deadlocks or cascading aborts.
Example:
Transaction T1 might acquire locks on resources A, B, and C, then release A early while still holding B and C; after that first release it may not acquire any new lock.
4. Conservative Two-Phase Locking:
A more cautious version where a transaction locks all the data it needs at the start and holds
those locks until the end.
This version avoids deadlocks but might suffer from reduced concurrency.

Advantages of 2PL

1. Ensures Serializability: By keeping the precedence graph of any allowed schedule acyclic, 2PL guarantees that the execution of transactions is equivalent to some serial execution.
2. Prevents Inconsistent States: Locks ensure that only one transaction can modify a resource at a time, avoiding conflicts and inconsistent states.
3. Cascadeless with Strict 2PL: Because uncommitted changes are never visible to other transactions, Strict 2PL prevents cascading aborts (note that 2PL does not by itself prevent deadlocks; see below).

Disadvantages of 2PL

1. Deadlocks: Basic 2PL can lead to deadlocks, where two or more transactions are waiting for each
other to release resources. This situation requires deadlock detection and resolution techniques.
2. Reduced Concurrency: The protocol may reduce the system's overall concurrency, especially when
transactions lock multiple resources and hold locks for long durations.
3. Overhead: The management of locks introduces some system overhead, especially in high-
transaction environments.
4. Starvation: In the case of deadlock resolution, some transactions may be delayed indefinitely
(starvation), especially if the priority of transaction management isn't well-handled.

Example of Two-Phase Locking

Consider the following two transactions, T1 and T2, accessing resources A and B.

T1 starts by acquiring locks on A and B (growing phase).
T1 releases A (entering its shrinking phase).
T2 starts and requests a lock on B, but blocks because T1 still holds it.
T1 releases its lock on B after completing its operations.
T2 acquires B and can now proceed.

With Strict 2PL, T1 would not release any locks until it commits, making it easier to ensure consistency.
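
A minimal sketch of the two-phase rule itself: a toy lock manager (exclusive locks only, single-threaded, all names illustrative) that refuses any acquisition once a transaction has released its first lock:

```python
class TwoPhaseLockManager:
    def __init__(self):
        self.holder = {}        # item -> transaction currently holding it
        self.shrinking = set()  # transactions that have released a lock

    def acquire(self, txn: str, item: str) -> None:
        if txn in self.shrinking:
            raise RuntimeError(f"{txn} violates 2PL: acquire after a release")
        owner = self.holder.get(item)
        if owner is not None and owner != txn:
            raise RuntimeError(f"{txn} must wait: {item} is held by {owner}")
        self.holder[item] = txn

    def release(self, txn: str, item: str) -> None:
        if self.holder.get(item) == txn:
            del self.holder[item]
            self.shrinking.add(txn)  # the growing phase is over for txn

lm = TwoPhaseLockManager()
lm.acquire("T1", "A"); lm.acquire("T1", "B")  # growing phase
lm.release("T1", "A")                          # shrinking phase begins
# lm.acquire("T1", "C")  # would raise: forbidden by the two-phase rule
```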

Conclusion

Two-Phase Locking is a widely used concurrency control protocol to ensure serializability in database
systems. While it ensures correctness and consistency of data, it requires careful handling of deadlocks
and system performance, especially in highly concurrent environments.
