This chapter discusses concurrency in database systems, covering the major concurrency control techniques along with failure recovery.
Distributed deadlock occurs when processes in a distributed system block while waiting for resources held by one another, with no central coordinator present to detect the cycle. Four conditions must all hold for deadlock: mutual exclusion, hold and wait, no preemption, and circular wait. Deadlock can be addressed by ignoring it, detecting and resolving occurrences, preventing one of the conditions through constraints, or avoiding it through careful resource allocation. Detection methods include centralized maintenance of a global resource graph or distributed probe messages that trace resource-waiting cycles. Prevention strategies impose timestamp- or age-based priorities on resource requests to rule out cycles.
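The detection approach described above reduces to finding a cycle in a wait-for graph. A minimal sketch (transaction names and the graph representation are illustrative, not from the source):

```python
# Sketch: deadlock detection as cycle detection in a wait-for graph.
# Nodes are transactions; an edge T1 -> T2 means T1 waits for a
# resource held by T2.

def has_cycle(wait_for):
    """Return True if the wait-for graph contains a cycle (a deadlock)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in wait_for}

    def visit(node):
        color[node] = GRAY
        for neighbor in wait_for.get(node, []):
            if color.get(neighbor, WHITE) == GRAY:   # back edge: cycle found
                return True
            if color.get(neighbor, WHITE) == WHITE and visit(neighbor):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in wait_for)

# T1 waits for T2, T2 waits for T3, T3 waits for T1: a deadlock.
deadlocked = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
safe = {"T1": ["T2"], "T2": ["T3"], "T3": []}
```

A centralized detector would maintain such a graph globally; distributed schemes reconstruct the same cycle information with probe messages instead.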
The document discusses database recovery techniques, including:
- Recovery algorithms ensure transaction atomicity and durability despite failures by undoing uncommitted transactions and ensuring committed transactions survive failures.
- Main recovery techniques are log-based using write-ahead logging (WAL) and shadow paging. WAL protocol requires log records be forced to disk before related data updates.
- Recovery restores the database to the most recent consistent state before failure. This may involve restoring from a backup and reapplying log entries, or undoing and reapplying operations to restore consistency.
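The WAL protocol in the bullets above can be sketched in a few lines: the log record describing an update must reach stable storage before the corresponding data page does. This is an illustrative in-memory model, not a real storage engine:

```python
# Minimal write-ahead-logging sketch with an in-memory "disk".
# WAL rule: force log records to stable storage before flushing the
# data pages they describe.

class MiniWAL:
    def __init__(self):
        self.stable_log = []    # log records already forced to "disk"
        self.log_buffer = []    # log records still in memory
        self.disk_pages = {}    # the "database" on disk
        self.dirty_pages = {}   # updated pages not yet written out

    def update(self, txn, page, old, new):
        # 1. Append an undo/redo log record to the in-memory log buffer.
        self.log_buffer.append((txn, page, old, new))
        # 2. Apply the change in the buffer pool, not directly on disk.
        self.dirty_pages[page] = new

    def flush_page(self, page):
        # WAL: the log is forced before the data page may be written.
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()
        self.disk_pages[page] = self.dirty_pages.pop(page)

wal = MiniWAL()
wal.update("T1", "P1", old=10, new=42)
wal.flush_page("P1")
```

Because the old value reaches the stable log before the page is overwritten, recovery can always undo the update if T1 turns out to be uncommitted.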
The document outlines concepts related to distributed database reliability. It begins with definitions of key terms like reliability, availability, failure, and fault tolerance measures. It then discusses different types of faults and failures that can occur in distributed systems. The document focuses on techniques for ensuring transaction atomicity and durability in the face of failures, including logging, write-ahead logging, and various execution strategies. It also covers checkpointing and recovery protocols at both the local and distributed level, particularly two-phase commit.
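The two-phase commit protocol mentioned above has a simple core: the coordinator collects votes, and a single "no" forces a global abort. A sketch with participants modeled as vote callables (names are illustrative; real 2PC also logs decisions and handles timeouts):

```python
# Two-phase commit sketch: a coordinator polls participants (phase 1)
# and commits only on a unanimous "yes" vote (phase 2).

def two_phase_commit(participants):
    """participants: dict of name -> vote function returning True ('yes')."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = {name: vote() for name, vote in participants.items()}
    # Phase 2 (decision): commit only if all voted yes; otherwise abort.
    decision = "commit" if all(votes.values()) else "abort"
    return decision, votes

decision, _ = two_phase_commit({"siteA": lambda: True, "siteB": lambda: True})
# A single "no" vote forces a global abort:
aborted, _ = two_phase_commit({"siteA": lambda: True, "siteB": lambda: False})
```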
Deadlock in distributed systems, by Saeed Siddik
The document discusses deadlocks in distributed systems, outlining the four conditions required for a deadlock, strategies to handle deadlocks such as ignoring, detecting, preventing, and avoiding them, and algorithms for centralized deadlock detection and distributed deadlock detection and prevention. It provides examples of resource allocation graphs to illustrate deadlock conditions and explains how distributed deadlock detection and prevention algorithms work.
The document discusses various database recovery techniques including deferred update, immediate update, shadow paging, and ARIES. Deferred update requires no undo but may require redo after a crash, since the database is not physically updated until transaction commit. Immediate update, in its pure form, requires undo of uncommitted transactions but no redo, since updates are forced to disk during the transaction. Shadow paging maintains shadow pages and avoids both undo and redo by discarding dirty pages after a crash. ARIES first runs an analysis pass over the log, then a redo phase that repeats history to bring the database to its pre-crash state, followed by an undo phase that rolls back uncommitted transactions.
This document summarizes a student's research project on improving the performance of real-time distributed databases. It proposes a "user control distributed database model" to help manage overload transactions at runtime. The abstract introduces the topic and outlines the contents. The introduction provides background on distributed databases and the motivation for the student's work in developing an approach to reduce runtime errors during periods of high load. It summarizes some existing research on concurrency control in centralized databases.
Virtualization techniques emulate execution environments, storage, and networks. Execution environments are classified as either process-level, implemented on top of an existing OS, or system-level, implemented directly on hardware without needing an existing OS. Virtualization provides isolation and resource management for software through virtual machines, which are classified as either system VMs that mimic whole hardware systems allowing full OSes, or process VMs that support single processes and provide platform independence. The machine reference model defines interfaces between abstraction layers that virtualization replaces to intercept calls.
Serializability is the criterion for deciding whether a schedule of concurrent transactions is equivalent to some serial schedule. A serializable schedule always leaves the database in a consistent state. Non-serial schedules may cause inconsistencies, so serializability checks whether they can be converted to an equivalent serial schedule to maintain consistency. The two main forms are view serializability and conflict serializability. View serializability requires the schedule to be view equivalent to a serial schedule, with matching initial reads, final writes, and update reads. Conflict serializability transforms a schedule by swapping non-conflicting operations, where two operations conflict if they belong to different transactions, access the same data item, and at least one is a write.
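The conflict rule stated above is mechanical enough to write down directly. A sketch, modeling an operation as a (transaction, action, item) triple with actions "r" and "w" (this encoding is an assumption for illustration):

```python
# The conflict rule: two operations conflict iff they come from
# different transactions, touch the same data item, and at least
# one of them is a write.

def conflicts(op1, op2):
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return (
        t1 != t2                 # different transactions
        and x1 == x2             # same data item
        and "w" in (a1, a2)      # at least one operation is a write
    )

# A read and a write of the same item by different transactions
# conflict; two reads never do.
```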
The document discusses transaction concepts in database systems. It defines transactions as units of program execution that access and update database items. Transactions must satisfy the ACID properties of atomicity, consistency, isolation, and durability. Concurrent transaction execution allows for increased throughput but requires mechanisms to ensure serializability and recoverability. The document describes transaction states, schedule serializability testing using precedence graphs, and the goal of concurrency control protocols to enforce serializability without examining schedules after execution.
The document discusses distributed query processing and optimization in distributed database systems. It covers topics like query decomposition, distributed query optimization techniques including cost models, statistics collection and use, and algorithms for query optimization. Specifically, it describes the process of optimizing queries distributed across multiple database fragments or sites including generating the search space of possible query execution plans, using cost functions and statistics to pick the best plan, and examples of deterministic and randomized search strategies used.
Chapter 4 - Communication in distributed system.ppt, by AschalewAyele2
The document discusses various methods of communication in distributed systems. It outlines the Open Systems Interconnection Reference Model (OSI-RM) which divides communication into seven layers. It also describes protocols like remote procedure call (RPC) and remote object invocation which allow processes on different machines to communicate through procedure or method calls. RPC uses client and server stubs to transform local calls into messages that are sent across the network. Asynchronous RPC is also discussed as a way to avoid blocking the client process while waiting for the response.
Distributed operating systems allow applications to run across multiple connected computers. They extend traditional network operating systems to provide greater communication and integration between machines on the network. While appearing like a regular centralized OS to users, distributed OSs actually run across multiple independent CPUs. Early research in distributed systems began in the 1970s, with many prototypes introduced through the 1980s-90s, though few achieved commercial success. Design considerations for distributed OSs include transparency, inter-process communication, resource management, reliability, and flexibility.
A page table is a data structure used in virtual memory systems to map virtual addresses to physical addresses. Common techniques for structuring page tables include hierarchical paging, hashed page tables, and inverted page tables. Hierarchical paging breaks up the logical address space into multiple page tables, such as a two-level or three-level structure. Hashed page tables use hashing to map virtual page numbers to chained elements in a page table. Inverted page tables combine a page table and frame table into one structure with an entry for each virtual and physical page mapping.
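Hierarchical paging, as described above, splits a virtual address into index fields plus an offset. A sketch using the common two-level layout of a 32-bit address with 4 KiB pages (the 10/10/12 split is an illustrative assumption, not taken from the source):

```python
# Two-level paging address decomposition: 10-bit outer (page directory)
# index, 10-bit inner (page table) index, 12-bit offset within a 4 KiB page.

OUTER_BITS, INNER_BITS, OFFSET_BITS = 10, 10, 12

def split_address(vaddr):
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    inner = (vaddr >> OFFSET_BITS) & ((1 << INNER_BITS) - 1)
    outer = vaddr >> (OFFSET_BITS + INNER_BITS)
    return outer, inner, offset

# 0x00403004 -> directory slot 1, page-table slot 3, byte offset 4
```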
Locks are used in distributed systems to coordinate access to shared resources and ensure consistency. There are different types of locks like read/write locks that can be granted. A distributed lock manager implements locking and allows processes to acquire locks on resources in a hierarchy. This prevents issues like lost updates and deadlocks. Examples of distributed lock managers include Chubby, ZooKeeper and Redis.
The document discusses temporal databases, which store information about how data changes over time. It covers several key points:
- Temporal databases allow storage of past and future states of data, unlike traditional databases which only store the current state.
- Time can be represented in terms of valid time (when facts were true in the real world) and transaction time (when facts were current in the database). Temporal databases may track one or both dimensions.
- SQL supports temporal data types like DATE, TIME, TIMESTAMP, INTERVAL and PERIOD for representing time values and durations.
- Temporal information can describe point events or durations. Relational databases incorporate time by adding timestamp attributes.
Concurrency control mechanisms use various protocols like lock-based, timestamp-based, and validation-based to maintain database consistency when transactions execute concurrently. Lock-based protocols use locks on data items to control concurrent access, with two-phase locking being a common approach. Timestamp-based protocols order transactions based on timestamps to ensure serializability. Validation-based protocols validate that a transaction's writes do not violate serializability before committing its writes.
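The timestamp-based protocol above orders transactions by their start timestamps. A minimal sketch of the basic timestamp-ordering checks, where each item tracks the largest read and write timestamps seen so far (the dict representation is an assumption for illustration):

```python
# Basic timestamp ordering: an operation is rejected (forcing the
# transaction to restart) if it arrives "too late" relative to the
# timestamps already recorded on the item.

def try_read(item, ts):
    """item: dict with 'rts'/'wts'; ts: the reading txn's timestamp."""
    if ts < item["wts"]:          # item overwritten by a younger txn
        return False              # reject: the reader must restart
    item["rts"] = max(item["rts"], ts)
    return True

def try_write(item, ts):
    if ts < item["rts"] or ts < item["wts"]:
        return False              # too late: a younger txn already used it
    item["wts"] = ts
    return True

x = {"rts": 0, "wts": 0}
```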
A natural extension of the Random Access Machine (RAM) serial architecture is the Parallel Random Access Machine, or PRAM.
PRAMs consist of p processors and a global memory of unbounded size that is uniformly accessible to all processors.
Processors share a common clock but may execute different instructions in each cycle.
This document discusses distributed database and distributed query processing. It covers topics like distributed database, query processing, distributed query processing methodology including query decomposition, data localization, and global query optimization. Query decomposition involves normalizing, analyzing, eliminating redundancy, and rewriting queries. Data localization applies data distribution to algebraic operations to determine involved fragments. Global query optimization finds the best global schedule to minimize costs and uses techniques like join ordering and semi joins. Local query optimization applies centralized optimization techniques to the best global execution schedule.
Virtual memory allows processes to execute even if they are larger than physical memory by storing portions of processes on disk. When a process attempts to access memory that is not currently loaded, a page fault occurs which brings the required page into memory from disk. This is known as demand paging and allows the operating system to efficiently load only those portions of a process needed for execution, reducing memory usage and improving performance compared to loading the entire process at once.
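The page-fault behavior of demand paging is easy to simulate. A sketch counting faults for a reference string under FIFO replacement (FIFO is chosen here for simplicity; the source does not name a replacement policy):

```python
# Demand-paging sketch: count page faults for a reference string given
# a fixed number of frames and FIFO replacement.

from collections import deque

def count_page_faults(references, frames):
    resident = deque()              # pages currently in memory, FIFO order
    faults = 0
    for page in references:
        if page not in resident:
            faults += 1             # page fault: bring the page in from disk
            if len(resident) == frames:
                resident.popleft()  # evict the oldest resident page
            resident.append(page)
    return faults

# With 3 frames, references 1 2 3 1 4 2 fault on 1, 2, 3, and 4
# (4 evicts 1); the second 1 and 2 are hits -> 4 faults.
```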
The document discusses techniques used by a database management system (DBMS) to process, optimize, and execute high-level queries. It describes the phases of query processing which include syntax checking, translating the SQL query into an algebraic expression, optimization to choose an efficient execution plan, and running the optimized plan. Query optimization aims to minimize resources like disk I/O and CPU time by selecting the best execution strategy. Techniques for optimization include heuristic rules, cost-based methods, and semantic query optimization using constraints.
Database recovery techniques restore the database to its most recent consistent state before a failure. There are three states: pre-failure consistency, failure occurrence, and post-recovery consistency. Buffer management policies are classified as steal/no-steal and force/no-force, while update strategies are deferred or immediate. Shadow paging maintains current and shadow page tables to recover pre-transaction states. The ARIES algorithm analyzes the log to identify dirty pages and active transactions, redoes logged updates to repeat history, and undoes uncommitted transactions. Disk crash recovery relies on keeping the log separate from the database, or on backups.
The document discusses different methods for deadlock management in distributed database systems. It describes deadlock prevention, avoidance, and detection and resolution. For deadlock prevention, transactions declare all resource needs upfront and the system reserves them to prevent cycles in the wait-for graph. Deadlock avoidance methods order resources or sites and require transactions to request locks in that order. Deadlock detection identifies cycles in the global wait-for graph using centralized, hierarchical, or distributed detection across sites. The system then chooses victim transactions to abort to break cycles.
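Timestamp-based prevention schemes of the kind mentioned above decide, at lock-request time, whether the requester may wait. A sketch of the classic wait-die rule (wait-die itself is a standard scheme; the source only refers to prevention generally):

```python
# Wait-die deadlock prevention: when a transaction requests a lock held
# by another, only an older requester (smaller timestamp) may wait; a
# younger requester is aborted ("dies") and restarted with its original
# timestamp, so no wait-for cycle can ever form.

def wait_die(requester_ts, holder_ts):
    """Return 'wait' if the requester may block, else 'die' (abort)."""
    return "wait" if requester_ts < holder_ts else "die"

# An old transaction (ts=3) may wait for a young holder (ts=9);
# a young requester (ts=9) dies rather than wait for ts=3.
```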
The document discusses different distribution design alternatives for tables in a distributed database management system (DDBMS), including non-replicated and non-fragmented, fully replicated, partially replicated, fragmented, and mixed. It describes each alternative and discusses when each would be most suitable. The document also covers data replication, advantages and disadvantages of replication, and different replication techniques. Finally, it discusses fragmentation, the different types of fragmentation, and advantages and disadvantages of fragmentation.
There are three main approaches to handling deadlocks: prevention, avoidance, and detection with recovery. Prevention methods constrain how processes request resources so that at least one necessary condition for deadlock cannot occur. Avoidance requires advance knowledge of processes' resource needs in order to decide whether a request can be safely granted. Detection identifies when a deadlocked state has occurred and recovers by aborting a victim process or preempting its resources.
The document discusses different types of schedules for transactions in a database including serial, serializable, and equivalent schedules. A serial schedule requires transactions to execute consecutively without interleaving, while a serializable schedule allows interleaving as long as the schedule is equivalent to a serial schedule. Equivalence is determined based on conflicts, views, or results between the schedules. Conflict serializable schedules can be tested for cycles in a precedence graph to determine if interleaving introduces conflicts, while view serializable schedules must produce the same reads and writes as a serial schedule.
Transaction concept, ACID property, Objectives of transaction management, Types of transactions, Objectives of Distributed Concurrency Control, Concurrency Control anomalies, Methods of concurrency control, Serializability and recoverability, Distributed Serializability, Enhanced lock based and timestamp based protocols, Multiple granularity, Multi version schemes, Optimistic Concurrency Control techniques
The document discusses cost estimation in query optimization. It explains that the query optimizer should estimate the cost of different execution strategies and choose the strategy with the minimum estimated cost. The cost functions used are estimates and depend on factors like selectivity. The main cost components include access cost to storage, storage cost, computation cost, memory use cost, and communication cost. For different types and sizes of databases, the emphasis may be on minimizing different cost components, such as access cost for large databases. The document provides examples of cost functions for select and join operations that consider factors like index levels, block sizes, and selectivity.
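The cost functions described above typically count block accesses. A sketch in the usual textbook shape for a selection, comparing a full scan against index-based access (all parameter names are assumptions for illustration):

```python
# Illustrative selection cost functions, measured in block accesses.

def cost_linear_scan(num_blocks):
    # Full scan: read every block of the file.
    return num_blocks

def cost_primary_index_equality(index_levels):
    # Descend the index, then fetch the single matching block.
    return index_levels + 1

def cost_secondary_index_equality(index_levels, matching_records):
    # Descend the index, then (worst case) one block access per
    # matching record, since matches may lie on different blocks.
    return index_levels + matching_records
```

For a large file with few matching records, the index-based estimates are far smaller than the scan cost, which is exactly the comparison the optimizer makes when choosing a strategy.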
The document discusses communication costs in parallel machines. It summarizes models for estimating the time required to transfer messages between nodes in different network topologies. The models account for startup time, per-hop transfer time, and per-word transfer time. Cut-through routing aims to minimize overhead by ensuring all message parts follow the same path. The document also covers techniques for mapping different graph structures like meshes and hypercubes onto each other to facilitate communication in various parallel architectures.
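The message-time models above combine startup, per-hop, and per-word terms. A sketch of the two standard forms, store-and-forward versus cut-through routing (parameter names follow the common parallel-computing convention):

```python
# Message transfer time models: ts = startup time, th = per-hop time,
# tw = per-word transfer time, m = message length in words, l = hops.

def store_and_forward_time(ts, th, tw, m, l):
    # The entire message is received and retransmitted at every hop.
    return ts + (m * tw + th) * l

def cut_through_time(ts, th, tw, m, l):
    # The message is pipelined through the network, so hop latency and
    # transmission time add instead of multiplying.
    return ts + l * th + m * tw
```

For long messages over many hops, the cut-through term l*th + m*tw is much smaller than the store-and-forward term (m*tw + th)*l, which is why cut-through routing keeps all parts of a message on the same path.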
Transaction processing systems handle large databases and hundreds of concurrent users executing transactions. A transaction is a logical unit of database processing that includes database access operations like insertions, deletions, modifications, or retrievals. Transactions must be atomic, consistent, isolated, and durable (ACID properties). Concurrency control techniques like locking and timestamps are used to coordinate concurrent transactions and ensure serializability and isolation. The two-phase locking protocol enforces serializability by acquiring all locks in the growing phase before releasing any locks in the shrinking phase.
Transactions allow multiple users to access and update shared data concurrently in a database. They have four main properties: atomicity, consistency, isolation, and durability (ACID). Concurrency control schemes ensure transactions are isolated from each other to preserve consistency. A schedule is serializable if its outcome is equivalent to running transactions sequentially. Conflict serializability checks for conflicts between transactions' instructions and views the schedule as equivalent to a serial schedule after swapping non-conflicting instructions. Precedence graphs can test for conflict serializability by checking for cycles.
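The precedence-graph test mentioned above can be sketched end to end: build an edge Ti -> Tj for every conflicting pair of operations in schedule order, then check for a cycle. The schedule encoding as (transaction, action, item) triples is an assumption for illustration:

```python
# Conflict-serializability test via a precedence graph.

def is_conflict_serializable(schedule):
    # Add edge t1 -> t2 when an operation of t1 conflicts with a
    # later operation of t2 (same item, different txns, one write).
    edges = set()
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges.add((t1, t2))

    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)

    # Serializable iff the precedence graph is acyclic.
    def has_path(start, target, seen):
        for nxt in graph.get(start, []):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                if has_path(nxt, target, seen):
                    return True
        return False

    return not any(has_path(t, t, set()) for t in graph)

# r1(A) w2(A) r2(B) w1(B): edges T1->T2 and T2->T1 form a cycle.
bad = [("T1", "r", "A"), ("T2", "w", "A"), ("T2", "r", "B"), ("T1", "w", "B")]
ok = [("T1", "r", "A"), ("T1", "w", "B"), ("T2", "w", "A"), ("T2", "r", "B")]
```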
The document discusses transaction processing systems and transactions. A transaction processing system handles large databases and concurrent users executing transactions. A transaction is a logical unit of database processing that includes database access operations like insertions, deletions, modifications, or retrievals. Transactions must be atomic, consistent, isolated, and durable (ACID properties). Concurrency control mechanisms are needed to ensure transactions execute correctly when run concurrently. Schedules specify the order transactions execute and must be serializable to ensure consistency.
The document discusses transactions management and concurrency control in databases. It defines transactions as logical operations like bank transactions or airline reservations that consist of sets of read and write operations. Transactions must have ACID properties - atomicity, consistency, isolation, and durability. Concurrency control techniques like lock-based and timestamp-based protocols are used to coordinate concurrent execution of transactions and prevent conflicts. Schedules can be serial or non-serial, with non-serial schedules further classified as serializable or non-serializable. Recoverable and cascading schedules are discussed.
The document discusses transaction processing and ACID properties in databases. It defines a transaction as a group of tasks that must be atomic, consistent, isolated, and durable. It provides examples of transactions involving bank account transfers. It explains the four ACID properties - atomicity, consistency, isolation, and durability. It also discusses transaction states, recovery, concurrency control techniques like two-phase locking and timestamps to prevent deadlocks.
This presentation discusses database transactions. Key points:
1. A transaction must follow the properties of atomicity, consistency, isolation, and durability (ACID). It accesses and possibly updates data items while preserving a consistent database.
2. Transaction states include active, partially committed, failed, aborted, and committed. Atomicity and durability are implemented using a shadow database with a pointer to the current consistent copy.
3. Concurrent transactions are allowed for better throughput and response time. Concurrency control ensures transaction isolation to prevent inconsistent databases.
4. A schedule specifies the execution order of transaction instructions. A serializable schedule preserves consistency like a serial schedule. Conflict and view serializability are forms
Transaction Processing; Concurrency control; ACID properties; Schedule and Discoverability; Serialization; Concurrency control and Recovery; Two Phase locking; Deadlock Shadow Paging
Distributed Database Design and Relational Query LanguageAAKANKSHA JAIN
1) The document discusses topics related to distributed database design and relational query languages including transaction management, serializability, blocking, deadlocks, and query optimization.
2) A transaction begins with the first SQL statement and ends when committed or rolled back. It has ACID properties - atomicity, consistency, isolation, and durability.
3) Serializability ensures transactions are processed in a consistent order. Conflict serializability allows swapping non-conflicting operations while view serializability requires equivalent initial reads, write-read sequences, and final writers.
Transaction management and concurrency controlDhani Ahmad
The document discusses transaction management and concurrency control in database systems. It covers topics such as transactions and their properties, concurrency control methods like locking, time stamping and optimistic control, and database recovery management. The goal of these techniques is to coordinate simultaneous transaction execution while maintaining data consistency and integrity in multi-user database environments.
TRANSACTION MANAGEMENT AND TIME STAMP PROTOCOLS AND BACKUP RECOVERYRohit Kumar
The document discusses transactions and concurrency control in database systems. It defines transactions as logical units of work that ensure data integrity during concurrent operations. It describes four key properties of transactions - atomicity, consistency, isolation, and durability (ACID) - and explains how they maintain data correctness. The document also discusses serialization, schedules, locking protocols like two-phase locking, and isolation levels to coordinate concurrent transactions and avoid anomalies like dirty reads.
Unit 4 chapter - 8 Transaction processing Concepts (1).pptxKoteswari Kasireddy
The document provides an overview of transaction processing concepts. It defines a transaction as a sequence of operations that transforms a database from one consistent state to another. Transaction processing systems are characterized by large databases, high volumes of concurrent users, and need for reliability. The document discusses transaction states, logging, concurrency control techniques like locking, and properties like atomicity, consistency, isolation and durability (ACID) that ensure transaction integrity.
The document discusses transaction management in database systems. It covers the concepts of transactions, including the ACID properties of atomicity, consistency, isolation, and durability. Transactions must preserve data integrity and consistency when running concurrently. Concurrency control schemes like locking protocols are used to achieve isolation between transactions. Schedules of transaction operations are analyzed to test for serializability, which guarantees consistency. Both conflict serializability and weaker view serializability are discussed.
This presentation discusses about the following topics:
Transaction processing systems
Introduction to TRANSACTION
Need for TRANSACTION
Operations
Transaction Execution and Problems
Transaction States
Transaction Execution with SQL
Transaction Properties
Transaction Log
This document discusses transaction processing in databases. It covers topics like ACID properties, transaction states, concurrency control techniques like locking, and problems that can occur in transaction processing like dirty reads, lost updates, and phantoms reads. Transaction processing aims to ensure transactions are atomic, consistent, isolated, and durable despite concurrent execution through techniques like locking, logging, and multi-version concurrency control.
This chapter deals with the importance of normalization in database management systems. We learn about the necessary criterion needed for normalization. We discuss different types of normal forms along with some sample examples.
In this chapter, we talk about basic concepts of relational database design. We talk about the concept of functional dependency, Armstrong's axioms, closures, and minimal cover.
Here, we talk about various relational algebra operations like select, project, union, intersection, minus, cartesian product, and join in database management systems.
Chapter-2 Database System Concepts and ArchitectureKunal Anand
This document provides an overview of database management systems concepts and architecture. It discusses different data models including hierarchical, network, relational, entity-relationship, object-oriented, and object-relational models. It also describes the 3-schema architecture with external, conceptual, and internal schemas and explains components of a DBMS including users, storage and query managers. Finally, it covers database languages like DDL, DML, and interfaces like menu-based, form-based and graphical user interfaces.
Chapter-1 Introduction to Database Management SystemsKunal Anand
This chapter discusses the fundamental concepts of DBMS like limitations of the traditional file processing systems, characteristics of the database approach, different types of databases and users, advantages and disadvantages of DBMS.
Chapter-10 Transaction Processing and Error Recovery
1. Learning Resource
On
Database Management Systems
Chapter-10
Transaction Processing and Error Recovery
Prepared By:
Kunal Anand, Asst. Professor
SCE, KIIT, DU, Bhubaneswar-24
2. Lecture Outcome:
• After the completion of this chapter, the students
will be able to:
– Define Transaction
– Explain A C I D Properties
– Identify different transaction states
– Explain concurrent execution, Serializability and its
types.
– Explain concurrency control techniques
– List out error recovery techniques
16 March 2021 2
3. Organization of this Lecture:
• Introduction to Transaction Processing
• A C I D Properties
• Transaction States
• Concurrent Execution, Schedules, and
Serializability
• Concurrency Control and Locking Protocol
• Two Phase Locking & Time Stamp based Protocol
• Error Recovery and Logging
4. Introduction to Transaction Processing
• Transaction processing systems are systems with large
databases and hundreds of concurrent users executing database
transactions.
– Ex: Airline or rail ticket booking systems, banking systems,
e-commerce web portals, and stock markets.
• These systems require high availability, accurate results, and
fast response time for hundreds of concurrent users.
• A transaction can be considered as a logical unit of database
processing that must be completed in entirety to ensure
correctness.
• A transaction is typically implemented by a computer
program that includes database commands like retrieval,
insertion, deletion, and update.
5. contd..
• A transaction (a set of operations) may be specified
stand-alone in a high-level language like SQL and submitted
interactively, or may be embedded within a program.
• Transaction boundaries:
– Begin and End transaction.
• An application program may contain several transactions
separated by the Begin and End transaction boundaries.
• A transaction that changes the contents of the database must
alter the database from one consistent database state to another.
6. contd..
• Transactions access a data item X using the following two
operations:
• Read(X): It transfers the data item X from the database to a
local buffer belonging to the transaction that executed the read
operation.
• Write(X): It transfers the data item X from the local buffer of
the transaction that executed the write operation back to the
database.
• Let T1 be a transaction that transfers $100 from account A to
account B.
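The Read(X)/Write(X) model and the $100 transfer above can be sketched in Python. The dict `database` and the helper `run_t1` are illustrative names standing in for the DBMS buffer operations, not part of any real API.

```python
# A minimal sketch of transaction T1, assuming an in-memory "database"
# dict and a per-transaction local buffer (names are illustrative).
database = {"A": 500, "B": 200}

def run_t1(db):
    buffer = {}
    buffer["A"] = db["A"]   # Read(A): copy A into the local buffer
    buffer["A"] -= 100      # debit account A by $100
    db["A"] = buffer["A"]   # Write(A): copy the new value back
    buffer["B"] = db["B"]   # Read(B)
    buffer["B"] += 100      # credit account B by $100
    db["B"] = buffer["B"]   # Write(B)

run_t1(database)
print(database)  # {'A': 400, 'B': 300}
```

Note that the sum A + B is the same before and after T1, which is exactly the consistency requirement discussed in the next slide.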
7. A C I D Properties
• Atomicity:
– Either all operations of the transaction are reflected
properly in the database, or no operation is reflected.
– Atomicity requires that all operations of a transaction be
completed; if not, the transaction is aborted by rolling back
all the updates done during the transaction.
– The database system keeps track of the old values of any
data on which a transaction performs a write. If the
transaction does not complete its execution, the database
system restores the old values to make it appear as though
the transaction never executed.
– Ensuring atomicity is the responsibility of the database
system itself. It is handled by the transaction management
component.
8. • Consistency:
– Consistency means execution of a transaction should
preserve the consistency of the database, i.e. a transaction
must transform the database from one consistent state to
another consistent state.
– For example, Sum of A and B in transaction T1 must be
unchanged by the execution of the transaction.
– Ensuring consistency for an individual transaction is the
responsibility of the application programmer who codes
the transaction.
contd..
9. • Isolation:
– Though multiple transactions may execute concurrently, the
system guarantees that, for every pair of transactions Ti and
Tj , it appears to Ti that either Tj finished execution before Ti
started or Tj started execution after Ti finished. Thus, each
transaction is unaware of other transactions executing
concurrently in the system.
– A trivial way to ensure isolation is to execute transactions
serially. However, concurrent execution of transactions provides
significant performance benefits such as increased throughput.
– Ensuring the isolation property is the responsibility of the
concurrency control component of the database system.
10. contd..
• Durability:
– After a transaction completes successfully, the changes it
has made to the database persist, even if there are system
failures.
– Durability ensures that once transaction changes are done or
committed, they can’t be undone or lost, even in the event
of a system failure.
– Ensuring durability is the responsibility of recovery
management component of the database system
11. Transaction State
• Active state: This state is the initial state of a transaction. The
transaction stays in this state while it is executing.
• Partially committed state: A transaction is partially committed
after its final statement has been executed. A transaction stays in
this state until the commit completes.
• Failed state: A transaction enters the failed state after the
discovery that normal execution can no longer proceed.
• Aborted state: A transaction is aborted after it has been rolled
back and the database is restored to its prior state before the
transaction.
• Committed state: A transaction enters the committed state after
successful completion.
• Terminated: The transaction has either committed or aborted.
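The transitions between the states above can be summarized as a small table. The dictionary below is an illustrative sketch (state names chosen for readability), not part of any DBMS interface:

```python
# Transaction state machine from the slides: active ->
# partially committed or failed; partially committed -> committed
# or failed; failed -> aborted; committed/aborted are terminal.
TRANSITIONS = {
    "active": {"partially_committed", "failed"},
    "partially_committed": {"committed", "failed"},
    "failed": {"aborted"},
    "aborted": set(),      # terminal (may be restarted as a new txn)
    "committed": set(),    # terminal
}

def can_move(src, dst):
    """Return True if a transaction may move from state src to dst."""
    return dst in TRANSITIONS.get(src, set())

print(can_move("active", "partially_committed"))  # True
print(can_move("failed", "committed"))            # False
```

Durability is what makes "committed" terminal: once reached, the transaction's effects cannot be undone, only compensated by a new transaction.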
13. contd..
• What causes a transaction to fail:
– A computer failure (system crash)
– A transaction or system error
– Local errors or exception conditions detected by the transaction
– Concurrency control enforcement
– Disk failure
– Physical problems and catastrophes
• When a transaction enters the aborted state, the system has
two options:
– Restart the transaction: If the transaction was aborted as
a result of a hardware failure or some software error (other
than logical error), it can be restarted.
– Kill the transaction: If the application program that
initiated the transaction has some logical error.
14. Concurrent Execution
• In the serial execution, one transaction can start executing only
after the completion of the previous one.
• Concurrent execution of transactions means executing more
than one transaction at the same time.
• The advantages of using concurrent execution of transactions
are:
– Improved throughput and resource utilization
– Reduced waiting time
• The database system must control the interaction among the
concurrent transactions to prevent them from destroying the
consistency of the database. It does this through a variety of
mechanisms called concurrency control schemes.
16. Schedule
• A schedule is a sequence that indicates the chronological order
in which the instructions of concurrent transactions are
executed.
• A schedule for a set of transactions must consist of all
instructions of those transactions. We must preserve the order
in which the instructions appear in each individual transaction.
• Schedule can be of following two types:
– Serial Schedule
– Concurrent Schedule
17. Serial Schedule and Concurrent Schedule
• Serial schedule:
– A serial schedule is a schedule where all the instructions
belonging to each transaction appear together. There is no
interleaving of transaction operations.
– A serial schedule has no concurrency and therefore it does
not interleave the actions of different transactions.
– For n transactions, there are exactly n! different serial
schedules possible.
• Concurrent schedule:
– In concurrent schedule, operations from different
concurrent transactions are interleaved.
– The number of possible schedules for a set of n transactions
is much larger than n!
21. Serializability
• Serial schedules preserve consistency as we assume each
transaction individually preserves consistency. Some non-serial
schedules may lead to inconsistency of the database.
• Serializability is a concept that helps to identify which
non-serial schedules are correct and will maintain the
consistency of the database.
• A serializable schedule behaves exactly like a serial schedule.
A concurrent schedule is serializable if it is equivalent to a
serial schedule.
• The database system must control concurrent execution of the
transactions to ensure that the database state remains
consistent.
22. Conflict Serializability
• A concurrent schedule S is conflict serializable if it is
conflict equivalent to a serial schedule.
• If a given non-serial schedule can be converted into a serial
schedule by swapping its non-conflicting operations, then it
is called a conflict serializable schedule.
• Let us consider a schedule S in which there are two
consecutive instructions, Ii and Ij, of transactions Ti and Tj
respectively (i != j).
– If Ii and Ij access different data items, then we can swap Ii
and Ij without affecting the results of any transactions in
the schedule.
23. contd..
– However, if Ii and Ij access the same data item Q, then the
order of the two instructions may matter:
• Case-1: Ii =Read(Q) and Ij =Read(Q): Order of Ii and Ij
does not matter.
• Case-2: Ii =Read(Q) and Ij =Write(Q): Order of Ii and
Ij matters in a schedule
• Case-3: Ii =Write(Q) and Ij =Read(Q): Order of Ii and
Ij matters in a schedule
• Case-4: Ii =Write(Q) and Ij =Write(Q): Order of Ii and
Ij matters in a schedule.
• Thus, Ii and Ij conflict if:
– Both operations belong to different transactions
– Both operations are on the same data item
– At least one of the two operations is a write operation
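The four cases above reduce to the single conflict rule just stated, which can be sketched in Python. The `(txn, action, item)` tuple encoding of an operation is an assumption made for illustration:

```python
# Two schedule operations conflict iff they come from different
# transactions, touch the same data item, and at least one is a
# write. Operations are (txn, action, item) tuples (illustrative).
def conflicts(op1, op2):
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return t1 != t2 and x1 == x2 and "W" in (a1, a2)

print(conflicts(("T1", "R", "Q"), ("T2", "R", "Q")))  # False (Case-1)
print(conflicts(("T1", "R", "Q"), ("T2", "W", "Q")))  # True  (Case-2)
print(conflicts(("T1", "W", "Q"), ("T2", "W", "Q")))  # True  (Case-4)
print(conflicts(("T1", "W", "Q"), ("T1", "R", "Q")))  # False (same txn)
```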
24. contd..
• Let Ii and Ij be consecutive instructions of a schedule S.
– If Ii and Ij are instructions of different transactions and they
do not conflict, then we can swap the order of Ii and Ij to
produce a new schedule S’.
– Here, we expect S to be equivalent to S’
• If a schedule S can be transformed into a schedule S’ by a
series of swaps of non-conflicting instructions, we say that S
and S’ are conflict equivalent.
25. Testing of Conflict Serializability
• Perform the following steps to check whether a given
non-serial schedule is conflict serializable or not:
– Step-01: Find and list all the conflicting operations.
– Step-02: Start creating a precedence graph by drawing one
node for each transaction.
• Draw an edge for each conflict pair: if Xi(V)
and Yj(V) form a conflict pair, then draw an edge from
Ti to Tj.
• This edge ensures that Ti gets executed before Tj.
– Step-03: Check if there is any cycle formed in the graph.
• If there is no cycle found, then the schedule is conflict
serializable otherwise not.
• Note: conflict serializability alone does not guarantee that a
schedule is recoverable; recoverability is a separate property.
26. Solved Example
• Check whether the given schedule S is conflict serializable
or not-
S : R1(A) , R2(A) , R1(B) , R2(B) , R3(B) , W1(A) , W2(B)
Ans:
Step-01: List all the conflicting operations and
determine the dependency between the
transactions-
R2(A) , W1(A) (T2 → T1)
R1(B) , W2(B) (T1 → T2)
R3(B) , W2(B) (T3 → T2)
27. S-2: The precedence graph has one node per transaction and
the edges T2 → T1, T1 → T2, and T3 → T2.
S-3: Clearly, there exists a cycle in the precedence graph.
Therefore, the given schedule S is not conflict serializable.
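The three steps above can be sketched in Python; the schedule encoding and function names are illustrative. The example reproduces the solved schedule S and confirms the cycle:

```python
# Step-01/02: collect precedence edges (Ti -> Tj for each conflict
# pair in schedule order). Operations are (txn, action, item) tuples.
def precedence_edges(schedule):
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges.add((ti, tj))  # Ti must execute before Tj
    return edges

# Step-03: depth-first search for a cycle in the precedence graph.
def has_cycle(edges):
    graph, nodes = {}, set()
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        nodes.update((u, v))
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}
    def dfs(n):
        color[n] = GRAY
        for m in graph.get(n, []):
            if color[m] == GRAY or (color[m] == WHITE and dfs(m)):
                return True
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and dfs(n) for n in nodes)

# Solved example: S = R1(A), R2(A), R1(B), R2(B), R3(B), W1(A), W2(B)
S = [("T1", "R", "A"), ("T2", "R", "A"), ("T1", "R", "B"),
     ("T2", "R", "B"), ("T3", "R", "B"), ("T1", "W", "A"),
     ("T2", "W", "B")]
edges = precedence_edges(S)
print(sorted(edges))     # [('T1', 'T2'), ('T2', 'T1'), ('T3', 'T2')]
print(has_cycle(edges))  # True -> S is not conflict serializable
```

The same two functions can be used to check the exercises on the next slide.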
28. Exercises
• Q-1: Determine if the following schedule S with transactions
T1, T2, T3, and T4 is conflict serializable or not.
S: R2(X), W3(X), W1(X), W2(Y), R2(Z), R4(X), R4(Y)
Ans: Yes, it is conflict serializable.
• Q-2: Determine if the following schedule S with transactions
T1 and T2 is conflict serializable or not.
S: R1(X) R1(Y) R2(X) R2(Y) W2(Y) W1(X)
Ans: No, S is not conflict serializable.
• Q-3: Determine if the following schedule S with transactions
T1, T2, T3, and T4 is conflict serializable or not.
S: R4(A), R2(A), R3(A), W1(B), W2(A), R3(B), W2(B)
Ans: Yes, it is conflict serializable.
29. View Serializability
• If a given schedule is found to be view equivalent to some
serial schedule, then it is called a view serializable schedule.
• Consider two schedules S1 and S2 over the same set of
transactions. Schedules S1 and S2 are called view
equivalent if the following three conditions hold true for them:
– For each data item X, if transaction Ti reads X from the
database initially in schedule S1, then in schedule S2 also,
Ti must perform the initial read of X from the database.
– If transaction Ti reads a data item that has been updated by
the transaction Tj in schedule S1, then in schedule S2 also,
transaction Ti must read the same data item that has been
updated by the transaction Tj.
– For each data item X, if X has been updated at last by
transaction Ti in schedule S1, then in schedule S2 also, X
must be updated at last by transaction Ti.
30. contd..
• Thumb Rule for view serializability
– “Initial readers must be same for all the data items”
– “Write-read sequence must be same.”
– “Final writers must be same for all the data items”.
• Checking Whether a Schedule is View Serializable Or Not-
– Check conflict serializability
• All conflict serializable schedules are also view
serializable. However, the reverse is not always true.
– Check for blind write operations: writing a data item without
reading it first is called a blind write.
• If the schedule is not conflict serializable and there is
no blind write, then it is surely not view serializable.
• If a blind write does exist, then the schedule may or may
not be view serializable.
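The blind-write check above is easy to mechanize. This is an illustrative sketch using the same `(txn, action, item)` schedule encoding; the example schedule is Ex-2 from this chapter, R1(A), R2(A), W3(A), W1(A):

```python
# A write is "blind" if the transaction writes an item it has not
# previously read in the schedule.
def blind_writes(schedule):
    seen_reads = set()   # (txn, item) pairs read so far
    blind = []
    for txn, action, item in schedule:
        if action == "R":
            seen_reads.add((txn, item))
        elif action == "W" and (txn, item) not in seen_reads:
            blind.append((txn, item))
    return blind

# Ex-2: R1(A), R2(A), W3(A), W1(A)
S = [("T1", "R", "A"), ("T2", "R", "A"),
     ("T3", "W", "A"), ("T1", "W", "A")]
print(blind_writes(S))  # [('T3', 'A')] -> W3(A) is a blind write
```

Here W1(A) is not blind because T1 read A earlier; only T3 writes A without reading it.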
31. contd..
– In this method, try finding a view equivalent serial
schedule.
• By using the above three conditions, write all the
dependencies.
• Then, draw a graph using those dependencies.
• If there exists no cycle in the graph, then the schedule is
view serializable otherwise not.
• Ex-1: Check whether the given schedule S with four
transactions named T1, T2, T3, and T4 is view serializable
or not-
S= R1(A), R2(A), R3(A), R4(A), W1(B), W2(B), W3(B), W4(B)
Ans: We know, if a schedule is conflict serializable, then it is
surely view serializable. So, let us check whether the
given schedule is conflict serializable or not.
32. • Step-01: Checking conflict serializability
– List all the conflicting operations and determine the
dependency between the transactions-
• W1(B) , W2(B) (T1 → T2)
• W1(B) , W3(B) (T1 → T3)
• W1(B) , W4(B) (T1 → T4)
• W2(B) , W3(B) (T2 → T3)
• W2(B) , W4(B) (T2 → T4)
• W3(B) , W4(B) (T3 → T4)
– Draw the precedence graph-
33. • Clearly, there exists no cycle in the precedence graph. Hence,
S is conflict serializable. Thus, we conclude that S is also
view serializable.
• Ex-2: Check whether the given schedule S with
transactions named T1, T2, and T3 is view serializable or
not.
– S= R1(A), R2(A), W3(A), W1(A)
Ans:
Step-01: Check conflict serializability
List all the conflicting operations and determine the
dependency between the transactions-
R1(A) , W3(A) (T1 → T3)
R2(A) , W3(A) (T2 → T3)
R2(A) , W1(A) (T2 → T1)
W3(A) , W1(A) (T3 → T1)
34. – Draw the precedence graph-
– Clearly, there exists a cycle in the precedence graph.
– Therefore, the given schedule S is not conflict
serializable. So, it may or may not be view serializable.
• Step-2: Check for blind writes
– There exists a blind write W3 (A) in the given schedule S.
– Therefore, the given schedule S may or may not be view
serializable.
35. • Step-03: Let us derive the dependencies and then draw a
dependency graph.
– T1 first reads A and T3 first updates A. So, T1 must execute before
T3. Thus, we get the dependency T1 → T3.
– The final update on A is made by transaction T1. So, T1 must execute
after all other transactions. Thus, we get the dependency (T2, T3) →
T1.
– There exists no write-read sequence.
– Clearly, there exists a cycle in the dependency graph.
– Thus, we conclude that the given schedule S is not view serializable.
36. Exercises
• Q-1: Determine if the following schedule S with two
transactions T1 and T2, is view serializable or not.
S: R1(A), R2(A), W1(A), W2(A), R1(B), R2(B), W1(B),
W2(B)
Ans: Not View Serializable
• Q-2: Determine if the following schedule S with three
transactions T1, T2, and T3 is view serializable or not.
S : R1(A) , W2(A) , R3(A) , W1(A) , W3(A)
Ans: View Serializable
37. Concurrency Control
• Concurrency control is the procedure in DBMS for managing
simultaneous operations without them conflicting with one another.
• Concurrent access is quite easy if all users are just reading
data, as there is no way they can interfere with one another.
However, any practical database has a mix of READ and WRITE
operations, so concurrency is a challenge.
• Reasons for using concurrency control methods in DBMS:
– To apply Isolation through mutual exclusion between
conflicting transactions
– To resolve read-write and write-write conflict issues
– The system needs to control the interaction among the
concurrent transactions. This control is achieved using
concurrent-control schemes.
– Concurrency control helps to ensure serializability
38. Potential Problems due to Concurrency
• Lost Update Problem:
– This problem occurs when two transactions that access the same
database items have their operations interleaved in a way that makes
the value of some database item incorrect.
– Here, the update made by T1 on the shared data item A gets lost
because T2, which read A before T1's write, writes A after T1 does.
Hence, the update of T1 is lost.
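The lost-update interleaving described above can be traced with a small Python sketch; the account values and variable names are illustrative:

```python
# Both T1 and T2 read A = 100 before either writes, so T2's write
# overwrites T1's update and T1's withdrawal is lost.
db = {"A": 100}

a_t1 = db["A"]          # T1: Read(A)
a_t2 = db["A"]          # T2: Read(A), interleaved before T1 writes
a_t1 -= 50              # T1 withdraws 50
a_t2 += 30              # T2 deposits 30
db["A"] = a_t1          # T1: Write(A) -> A is now 50
db["A"] = a_t2          # T2: Write(A) -> A is now 130; T1's update lost
print(db["A"])          # 130, instead of the correct 80
```

In a serial execution (T1 then T2, or T2 then T1), A would end at 80; the interleaving silently drops T1's withdrawal.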
39. contd..
• Temporary Update or Dirty Read Problem:
– This problem occurs when one transaction updates a database item
and then the transaction fails due to some reason.
– The updated item is accessed by another transaction before it is
changed back to its original value.
– Here, if T1 fails, the system will roll back all the updates
done by T1 and the database will be restored to its previous
state. However, T2 reads the value of data item A before the
rollback happens, so T2 will get wrong results.
40. contd..
• Incorrect Summary Problem:
– If one transaction is calculating an aggregate summary function on a
number of records while other transactions are updating some of
these records, the aggregate function may calculate some values
before they are updated and others after they are updated.
– Here, T2 will get incorrect summary because T1 updates the data
item B at the same time when T2 calculates the sum.
41. Concurrency Control Protocols
• Lock Based Protocol:
– Locking is a procedure used to control concurrent access to
data.
– Locks enable a multi-user database system to maintain the
integrity of transactions by isolating a transaction from
others executing concurrently.
– Locking is one of the most widely used mechanisms to
ensure serializability.
• In this type of protocol, a transaction cannot read or write a
data item until it acquires an appropriate lock on it.
• Data items can be locked in two modes:
– Shared Lock or Read Lock
– Exclusive Lock or Write Lock
42. Lock Based Protocol (contd..)
• Shared lock or Read lock: If a transaction Ti has obtained a
shared mode lock(S) on data item Q, then Ti can only read the
data item Q, but can not write on Q.
• Exclusive lock or Write lock: If a transaction Ti has obtained
an exclusive mode lock(X) on data item Q, then Ti can both
read and write Q.
• A transaction must obtain a lock on a data item before it can
perform a read or write operation.
• Basic Rules for Locking:
– If a transaction has a shared or read lock on a data item, it
can only read the item; but can not update its value
– If a transaction has a shared or read lock on a data item,
other transactions can obtain read locks on the same data
item, but they can not obtain any write/update lock on it.
43. Lock Based Protocol (contd..)
– If a transaction has an exclusive or write lock on a data
item, then it can both read and update the value of that data
item.
– If a transaction has an exclusive or write lock on a data
item, then other transactions can not obtain either a read
lock or a write lock on that data item.
• A transaction requests a shared lock on data item Q by
executing the Lock-S(Q) instruction. Similarly, a transaction
can request an exclusive lock through the Lock-X(Q) instruction.
• A transaction can unlock a data item Q by the Unlock(Q)
instruction.
44. Lock Based Protocol: How It Works?
• All transactions that need to access a data item must first
acquire a read lock or write lock on the data item, depending
on whether the operation is read-only or not.
– If the data item for which the lock is requested is not
already locked, then the transaction is granted with the
requested lock immediately.
– If the data item is currently locked, the database system
determines what kind of lock is the current one. Also, it
finds out which type of lock is requested:
• If a read lock is requested on a data item that is already under
a read lock, then the request will be granted.
• If a write lock is requested on a data item that is already
under a read lock, then the request will be denied.
• Similarly; if a read lock or a write lock is requested on a data
item that is already under a write lock, then the request is
denied and the transaction must wait until the lock is released
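The grant/deny decisions described above reduce to a small lock-compatibility check: shared locks are compatible with each other, while any request involving an exclusive lock conflicts. The following Python sketch illustrates the idea (the LockManager class and its method names are illustrative, not from any particular DBMS):

```python
# Minimal sketch of the lock-compatibility rules described above.
# 'S' = shared/read lock, 'X' = exclusive/write lock.

class LockManager:
    def __init__(self):
        # data item -> list of (transaction id, mode) currently granted
        self.granted = {}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        holders = self.granted.setdefault(item, [])
        if not holders:                       # item not locked: grant at once
            holders.append((txn, mode))
            return True
        held_modes = {m for _, m in holders}
        # S on S is compatible; any request involving X conflicts
        if mode == 'S' and held_modes == {'S'}:
            holders.append((txn, mode))
            return True
        return False                          # request denied: must wait

    def unlock(self, txn, item):
        """Release txn's lock on item."""
        self.granted[item] = [(t, m) for t, m in self.granted.get(item, [])
                              if t != txn]

lm = LockManager()
assert lm.request('T1', 'Q', 'S')      # read lock granted
assert lm.request('T2', 'Q', 'S')      # second read lock also granted
assert not lm.request('T3', 'Q', 'X')  # write lock denied while read locks held
```

Note that this sketch only decides grant versus wait; queueing of denied requests is handled in the starvation discussion below.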
45. contd..
– A transaction continues to hold the lock until it explicitly
releases it either during the execution or when it terminates.
– The effects of a write operation will be visible to other
transactions only after the lock is released.
• The concurrency control manager grants a lock whenever it is
compatible with the locks currently held; it does not examine
the schedule as a whole, so basic locking alone does not
guarantee that a concurrent schedule is conflict serializable.
• For example, in the case of the Incorrect Summary Problem, all
the requested locks will be granted, yet the result may be
incorrect values.
49. Deadlock and Starvation
• To solve the previously discussed problem, different alternative
solutions are possible. One solution is to delay the unlocking
process, i.e. unlocking is delayed to the end of the
transaction.
• Unfortunately, this type of locking can lead to an undesirable
situation called Deadlock.
• Since T1 is holding an exclusive-lock on A and T2 is requesting a shared-
lock on A, the concurrency control manager will not grant the lock
permission to T2. Thus, T2 is waiting for T1 to unlock A.
50. • Similarly, T1 is waiting for T2 to unlock B. Thus, this is a
situation where neither of these transactions can ever proceed
with its normal execution. This type of situation is called a
deadlock.
51. contd..
• In a database, a deadlock is an unwanted situation in which
two or more transactions are waiting indefinitely for one another
to give up locks.
• Deadlock is one of the most feared complications in a DBMS,
as it can bring the whole system to a halt. If we do not use
locking, or if we unlock data items as soon as possible after
reading or writing them, we may get inconsistent states.
• On the other hand, if we do not unlock a data item before
requesting a lock on another data item, deadlocks may occur.
52. Starvation
• When a transaction requests a lock on a data item in a
particular mode, and no other transaction has put a lock on the
same data item in a conflicting mode, then the lock can be
granted by the concurrency control manager.
• However, we must take some precautionary measures to avoid
the following scenarios:
•Suppose a transaction T1 has a shared-mode lock on a data
item, and another transaction T2 requests an exclusive-mode
lock on that same data item. In this situation, T2 has to wait
for T1 to release the shared-mode lock.
•Suppose another transaction T3 requests a shared-mode
lock on the same data item while T1 is holding a shared
lock on it. As the lock request is compatible with the lock
granted to T1, T3 may be granted the shared-mode
lock. But T2 still has to wait for the lock on that data
item to be released.
53. – At this point, T1 may release the lock, but still T2 has to
wait for T3 to finish. There may be a new transaction T4
that requests a shared-mode lock on the same data item,
and is granted the lock before T3 releases it.
•In such a situation, T2 never gets the exclusive-mode lock
on the data item. Thus, T2 cannot progress at all and is said
to be starved. This is called the starvation problem.
• We can avoid starvation of transactions by granting locks in the
following manner;
– when a transaction Ti requests a lock on a data item Q in a
particular mode M, the concurrency-control manager grants
the lock provided that:
• There is no other transaction holding a lock on Q in a
mode that conflicts with M.
• There is no other transaction that is waiting for a lock on
Q and that made its lock request before Ti.
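The two granting conditions above amount to serving lock requests in first-come-first-served order per data item: a request is granted only if no conflicting lock is held and no earlier request is still waiting. A minimal sketch, with illustrative names:

```python
from collections import deque

def compatible(mode, held_modes):
    """S is compatible only with S; X conflicts with everything."""
    return mode == 'S' and all(m == 'S' for m in held_modes)

def try_grant(held, waiting):
    """Grant queued requests in arrival order; stop at the first conflict
    so an early X request is never overtaken by later S requests."""
    granted = []
    while waiting:
        txn, mode = waiting[0]
        if held and not compatible(mode, [m for _, m in held]):
            break      # head of the queue must wait, so everyone behind waits too
        waiting.popleft()
        held.append((txn, mode))
        granted.append(txn)
    return granted

held = [('T1', 'S')]
waiting = deque([('T2', 'X'), ('T3', 'S')])   # T2 asked before T3
assert try_grant(held, waiting) == []         # T2 blocks, so T3 is not granted either
held = []                                     # T1 releases its lock
assert try_grant(held, waiting) == ['T2']     # T2 gets its turn first: no starvation
```

Without the first-come-first-served rule, T3 (and later readers) would be granted ahead of T2, reproducing the starvation scenario on the slide.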
54. 2 Phase Locking (2-PL)
• The basic lock-based protocol is the simplest concurrency
protocol. Though it is simple to implement, it does not by
itself guarantee serializability.
• The two-phase locking protocol is a protocol which ensures
serializability.
• This protocol requires that each transaction issues lock and
unlock requests in two phases. The two phases are:
– Growing phase: Here, a transaction acquires all required
locks without unlocking any data, i.e. the transaction may
not release any lock.
– Shrinking phase: Here, a transaction releases all locks and
cannot obtain any new lock.
• The point in the schedule where the transaction has obtained
its final lock is called the lock point of the transaction
• Transactions can be ordered according to their lock points
55. • Two transactions cannot have conflicting locks
• No unlock operation can precede a lock operation in the same
transaction
• No data are affected until all locks are obtained, i.e. until
the transaction reaches its lock point
• Two-phase locking may limit the amount of concurrency that
can occur in a schedule
56. 2-PL (contd..)
• Transaction T1:
– Growing phase: from
step 0-2
– Shrinking phase: from
step 4-6
– Lock point: at 3
• Transaction T2:
– Growing phase: from
step 1-5
– Shrinking phase: from
step 7-8
– Lock point: at 6
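The two-phase rule (no lock may be requested after the first unlock) can be checked mechanically over a transaction's sequence of lock/unlock actions. A small illustrative sketch, with an assumed operation format:

```python
def is_two_phase(ops):
    """ops: sequence of ('lock', item) / ('unlock', item) actions of ONE
    transaction. Returns True iff every lock precedes every unlock."""
    shrinking = False
    for action, _item in ops:
        if action == 'unlock':
            shrinking = True      # the shrinking phase has begun
        elif shrinking:           # a lock after an unlock violates 2PL
            return False
    return True

# T1 locks A and B (growing phase), then releases both (shrinking phase).
assert is_two_phase([('lock', 'A'), ('lock', 'B'),
                     ('unlock', 'A'), ('unlock', 'B')])
# Releasing A before locking B breaks the two-phase rule.
assert not is_two_phase([('lock', 'A'), ('unlock', 'A'), ('lock', 'B')])
```

The lock point of a transaction is simply the position of its last `('lock', …)` action in such a sequence.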
57. Variations in 2-PL
• Static 2-PL (also called Conservative 2-PL):
– This protocol requires the transaction to lock all the items it
accesses before the transaction begins execution, by predeclaring
its read-set and write-set.
– If any of the predeclared items cannot be locked, the
transaction does not lock any of the items; instead, it waits
until all the items are available for locking.
– So the operations on data cannot start until all the required
items are locked.
• Strict 2-PL:
– A transaction T doesn’t release any of its exclusive locks until after it
commits or aborts
– No other transaction can read/write an item that is written by T unless T
has committed.
– Strict 2PL is not deadlock-free.
58. • Rigorous 2-PL:
– A transaction T doesn’t release any of its locks until after it
commits or aborts.
– Strict 2PL holds write-locks until it commits; whereas
Rigorous 2PL holds all locks.
– Conservative 2PL must lock all its data items before it
starts, so once the transaction starts it is in shrinking phase;
whereas Rigorous 2PL doesn’t unlock any of its data items
until after it terminates.
• Lock Conversions:
– Lock Upgrade: This is the process in which a shared lock
is upgraded to an exclusive lock.
– Lock Downgrade: This is the process in which an
exclusive lock is downgraded to a shared lock
59. • Lock upgrading can take place only in the growing phase,
whereas lock downgrading can take place only in the shrinking
phase.
• Thus, the two-phase locking protocol with lock conversions:
•First Phase:
•Can acquire a lock-S on item
•Can acquire a lock-X on item
•Can convert a lock-S to a lock-X (upgrade)
•Second Phase:
•Can release a lock-S
•Can release a lock-X
•Can convert a lock-X to a lock-S (downgrade)
• Like the basic two-phase locking protocol, two-phase locking
with lock conversions generates only conflict-serializable
schedules, and transactions can be serialized by their lock points.
• If the exclusive locks are held until the end of the transaction,
then the schedules become cascadeless.
60. Time Stamp based Protocol
• The timestamp method for concurrency control doesn't need
any locks, and therefore this method is free from deadlock.
• Locking methods generally prevent conflicts by making
transactions wait, whereas timestamp methods do not make
transactions wait. Rather, transactions involved in a
conflicting situation are simply rolled back and restarted.
• A timestamp is a unique identifier created by the Database
system that indicates the relative starting time of a transaction.
Timestamps are generated either using the system clocks or by
incrementing a logical counter every time a new transaction
starts.
• Timestamp protocol is a concurrency control protocol in which
the fundamental goal is to order the transactions globally in
such a way that older transactions get priority in the event of a
conflict.
61. contd..
• Timestamps: Timestamp TS(Ti) is assigned by the database
system before the transaction Ti starts its execution.
• The timestamps of the transactions determine the serializability
order. Thus, if TS(Ti ) < TS(Tj ), then the system must ensure
that the produced schedule is equivalent to a serial schedule in
which Ti appears before Tj.
• There are two timestamp values associated with each data item
Q:
– W-Timestamp(Q): It denotes the largest timestamp among
the transactions that executed write(Q) operation
successfully.
– R-Timestamp(Q): It denotes the largest timestamp among
the transactions that executed read(Q) operation
successfully.
62. Timestamp Ordering Protocols
• This ensures that any conflicting read and write operations are
executed in timestamp order.
• Suppose transaction Ti issues read(Q):
– If TS(Ti) < W_TS(Q), the operation is rejected and Ti is
rolled back.
– If TS(Ti) >= W_TS(Q), the operation is executed and
R_TS(Q) is set to max(R_TS(Q), TS(Ti)).
• Suppose transaction Ti issues write(Q):
– If TS(Ti) < R_TS(Q) or TS(Ti) < W_TS(Q), the operation is
rejected and Ti is rolled back.
– Otherwise, the operation is executed and W_TS(Q) is set to
TS(Ti).
• Advantage and Disadvantage:
– The TS protocol ensures serializability.
– The TS protocol ensures freedom from deadlock, since no
transaction ever waits.
– But the schedule may not be recoverable, and may not even be
cascade-free.
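The read and write rules above can be expressed directly in code. A minimal sketch of the timestamp-ordering checks, tracking a (R-timestamp, W-timestamp) pair per data item (function and variable names are illustrative):

```python
def read(ts, item, ts_table):
    """Timestamp-ordering rule for read(Q): reject if a younger transaction
    already wrote Q, otherwise execute and advance R-timestamp(Q)."""
    r_ts, w_ts = ts_table.setdefault(item, (0, 0))
    if ts < w_ts:
        return 'rollback'
    ts_table[item] = (max(r_ts, ts), w_ts)
    return 'ok'

def write(ts, item, ts_table):
    """write(Q): reject if a younger transaction already read or wrote Q,
    otherwise execute and set W-timestamp(Q) to TS(Ti)."""
    r_ts, w_ts = ts_table.setdefault(item, (0, 0))
    if ts < r_ts or ts < w_ts:
        return 'rollback'
    ts_table[item] = (r_ts, ts)
    return 'ok'

ts_table = {}
assert read(5, 'Q', ts_table) == 'ok'         # T with TS=5 reads Q
assert write(3, 'Q', ts_table) == 'rollback'  # older T (TS=3) writes too late
assert write(7, 'Q', ts_table) == 'ok'        # younger T (TS=7) may write
assert read(6, 'Q', ts_table) == 'rollback'   # TS=6 < W-timestamp(Q)=7
```

A rolled-back transaction would be restarted with a new, larger timestamp; that bookkeeping is omitted here.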
63. Error Recovery
• The recovery manager of a database system is responsible for
ensuring two important properties of transactions:
•atomicity
•durability
• It ensures the atomicity by undoing the actions of transactions
that do not commit.
• It ensures the durability by making sure that all actions of
committed transactions survive in case of system crashes and
media failures.
64. Transaction Failure
• Transaction failure:
– Logical error: The transaction cannot complete due to
some internal error condition, such as wrong input, data
not found, overflow, or a resource limit being exceeded
– System error: The transaction cannot complete because the
system has entered an undesirable state (e.g. deadlock).
However, the transaction can be re-executed at a later time.
• System crash: Power failure or other hardware or software
failure causes the system to crash. This causes the loss of
content of volatile storage and brings transaction processing to
a halt. But, the content of nonvolatile storage remains intact
and is not corrupted.
• Disk failure: A disk block loses its content as a result of either
a head crash or failure during a data transfer operation. Copies
of the data on other disks are used to recover from the failure
65. Database Recovery
• Database recovery is the process of restoring a database to the
correct state in the event of a failure.
• This service is provided by the database system to ensure that
the database is reliable and remains in consistent state in case
of a failure.
• The recovery algorithms, which ensure database consistency
and transaction atomicity, consist of two parts:
– Actions taken during normal transaction processing to
ensure that enough information exists to allow recovery
from failures.
– Actions taken after a failure to recover the database
contents to a state that ensures database consistency,
transaction atomicity and durability.
66. Shadow Copy Scheme
• This scheme is based on making copies of the database called
shadow copies and it assumes that only one transaction is active
at a time.
• This scheme assumes that the database is simply a file on
disk.
A pointer called db-pointer is maintained on disk; it points to
the current copy of the database.
• Unfortunately, this implementation is extremely inefficient in the context of
large databases, since executing a single transaction requires copying the
entire database. Also, the implementation does not allow transactions to
execute concurrently with one another. Thus, it cannot be used for
efficient recovery.
67. Recovery Facilities
• A database system should provide the following facilities to
assist with the recovery:
– Backup Mechanism: It makes periodic backup copies of
the database.
– Logging Facility: It keeps track of the current state of
transactions and the database modifications.
– Checkpoint Facility: It enables updates to the database that
are in progress to be made permanent.
– Recovery Management: It allows the system to restore the
database to a consistent state following a failure.
68. Log-Based Recovery
• Log is a sequence of log records, recording all the update
activities in the database. It is kept on stable storage.
• There are several types of log records. An update log record
describes a single database write. It has the following fields:
– Transaction Identifier : This is the unique identifier of the
transaction that performed the write operation.
– Data-item Identifier : This is the unique identifier of the
data item written. Typically, it is the location on disk of the
data item.
– Old Value: This is the value of the data item prior to the
write operation.
– New Value: This is the value that the data item will have
after the write operation.
69. • Various types of log records are:
– <Ti start>: Transaction Ti has started
– <Ti, xj, v1, v2>: Transaction Ti has performed a write operation
on the data item xj. The data item xj had value v1 before the
write, and will have value v2 after the write
– <Ti commit>: Transaction Ti has committed
– <Ti abort>: Transaction Ti has aborted
• When a transaction performs a write operation, it is essential
that the log record for that write be created before the database is
modified.
• For log records to be useful for recovery from system and disk
failures, the log must reside in stable storage.
• The log contains a complete record of all database activities
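The write-ahead requirement, that the log record for a write be created before the database itself is modified, can be sketched as follows (the in-memory list and dictionary stand in for the log on stable storage and the database on disk; all names are illustrative):

```python
log = []          # stands in for the log on stable storage
db = {'A': 100}   # stands in for the database on disk

def wal_write(txn, item, new_value):
    """Write-ahead logging: the update record <Ti, Xj, old, new> must reach
    the log BEFORE the database itself is modified."""
    old_value = db[item]
    log.append((txn, item, old_value, new_value))  # 1. log record first
    db[item] = new_value                           # 2. then the data update

log.append(('T1', 'start'))
wal_write('T1', 'A', 150)
log.append(('T1', 'commit'))

assert db['A'] == 150
assert log == [('T1', 'start'), ('T1', 'A', 100, 150), ('T1', 'commit')]
```

Because the old and new values are on stable storage before the data page changes, a crash at any point leaves enough information to undo or redo the write.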
70. Deferred Database Modification
• This scheme ensures transaction atomicity by recording all the
database modifications in the log, but deferring the execution
of all write operations of a transaction until the transaction
partially commits.
• The execution of transaction Ti proceeds as follows:
– Before Ti starts its execution, a record <Ti start> is written to
the log.
– A write(X) operation by Ti results in the writing of a new
record to the log as <Ti , X, V>, where V is the new value of
X. For this scheme, the old value is not required
– When Ti partially commits, a record <Ti commit> is written
to the log.
72. • Redo(Ti ): It sets the value of all data items updated by
transaction Ti to the new values. The set of data items updated by
Ti and their respective new values can be found in the log.
• The redo operation must be idempotent, i.e. executing it
several times must be equivalent to executing it once.
• Transaction Ti needs to be redone iff the log contains both the
record <Ti start> and the record <Ti commit>
• Thus, if the system crashes after a transaction completes its
execution, the recovery scheme uses the information in the log
to restore the database to the consistent state that existed
after the transaction completed.
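Under deferred modification, recovery is redo-only: a transaction is redone iff both its &lt;start&gt; and &lt;commit&gt; records appear in the log, and transactions without a commit record are simply ignored, since none of their writes ever reached the database. A sketch, assuming a simple tuple layout for log records:

```python
def recover_deferred(log, db):
    """Redo-only recovery for deferred modification: replay the new values
    of every transaction whose commit record appears in the log."""
    committed = {rec[0] for rec in log if rec[1] == 'commit'}
    for rec in log:
        if rec[1] == 'write' and rec[0] in committed:
            _txn, _op, item, new_value = rec
            db[item] = new_value          # redo: idempotent, safe to repeat

db = {'A': 100, 'B': 200}
log = [('T1', 'start'), ('T1', 'write', 'A', 150), ('T1', 'commit'),
       ('T2', 'start'), ('T2', 'write', 'B', 999)]   # T2 never committed
recover_deferred(log, db)
assert db == {'A': 150, 'B': 200}   # T1 redone, T2 ignored
```

Running `recover_deferred` twice on the same log leaves the database unchanged, which is exactly the idempotence property required of redo.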
75. Immediate Database Modification
• This scheme allows database modifications to be output to the
database while the transaction is still in the active state. Data
modifications written by active transactions are called
uncommitted modifications.
• In the event of a crash or a transaction failure, the system must
use the old-value field of the log records to restore the
modified data items to the value they had prior to the start of
the transaction. The undo operation accomplishes this
restoration
• The execution of transaction Ti proceeds as follows:
• Before Ti starts its execution, the system writes the record <Ti start>
to the log
• During its execution, any write(X) operation by Ti is preceded by the
writing of the appropriate new update record to the log.
• When Ti partially commits, the system writes the record <Ti commit> to
the log
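Under immediate modification, recovery needs both the old and the new values: uncommitted transactions are undone by scanning the log backward and restoring old values, and committed ones are redone by scanning forward and reapplying new values. A sketch with an assumed record layout:

```python
def recover_immediate(log, db):
    """undo(Ti) for transactions with <start> but no <commit>,
    redo(Ti) for transactions with both <start> and <commit>."""
    committed = {rec[0] for rec in log if rec[1] == 'commit'}
    # Undo: scan backward, restoring old values of uncommitted transactions.
    for rec in reversed(log):
        if rec[1] == 'write' and rec[0] not in committed:
            _txn, _op, item, old, _new = rec
            db[item] = old
    # Redo: scan forward, reapplying new values of committed transactions.
    for rec in log:
        if rec[1] == 'write' and rec[0] in committed:
            _txn, _op, item, _old, new = rec
            db[item] = new

db = {'A': 150, 'B': 999}   # the crash left T2's uncommitted write of B on disk
log = [('T1', 'start'), ('T1', 'write', 'A', 100, 150), ('T1', 'commit'),
       ('T2', 'start'), ('T2', 'write', 'B', 200, 999)]
recover_immediate(log, db)
assert db == {'A': 150, 'B': 200}   # T1 redone, T2 undone
```

The backward scan for undo matters when an uncommitted transaction wrote the same item several times: processing records in reverse restores the earliest old value last.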
79. Checkpoints
• When a system failure occurs, we must consult the log to
determine those transactions that need to be redone and those
that need to be undone. For this, we need to search the entire
log to determine this information.
• There are two major difficulties with this approach:
– The search process is time consuming
– Most of the transactions that need to be redone have
already written their updates into the database. Although
redoing them will cause no harm, it will nevertheless cause
recovery to take longer time
• To reduce this overhead, checkpoints can be used. Taking a
checkpoint involves the following steps:
•Output onto stable storage all log records currently
residing in main memory
•Output to the disk all modified buffer blocks
•Output onto stable storage a log record <checkpoint>
81. contd..
•Transaction T1 has to be ignored
•Transactions T2 and T3 have to be redone
•T4 has to be undone
• By taking checkpoints periodically, the DBMS can reduce the
amount of work to be done during restart in the event of a
subsequent crash
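The classification above (completed before the checkpoint: ignore; committed after it: redo; never committed: undo) can be sketched as follows, using illustrative log record shapes:

```python
def classify(log):
    """Sort transactions into (ignore, redo, undo) sets relative to the
    last <checkpoint> record in the log."""
    cp = max(i for i, rec in enumerate(log) if rec == ('checkpoint',))
    ignore, redo, undo = set(), set(), set()
    txns = {rec[0] for rec in log if rec != ('checkpoint',)}
    for txn in txns:
        commit_pos = [i for i, rec in enumerate(log) if rec == (txn, 'commit')]
        if not commit_pos:
            undo.add(txn)             # still active at the crash
        elif commit_pos[0] < cp:
            ignore.add(txn)           # finished before the checkpoint
        else:
            redo.add(txn)             # committed after the checkpoint
    return ignore, redo, undo

log = [('T1', 'start'), ('T1', 'commit'),
       ('T2', 'start'),
       ('checkpoint',),
       ('T2', 'commit'),
       ('T3', 'start'), ('T3', 'commit'),
       ('T4', 'start')]               # crash happens here
assert classify(log) == ({'T1'}, {'T2', 'T3'}, {'T4'})
```

This reproduces the example on the slide: T1 is ignored, T2 and T3 are redone, and T4 is undone; only the log after the last checkpoint needs to be examined in detail during restart.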