IT3306 - 05 - Consistency and Transaction Processing Concepts
IT3306 – Data Management
Level II - Semester 3
© e-Learning Centre, UCSC
5.1.1. Single-user systems, Multi-user systems and
Transactions
5.1.2. Transaction States
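[Figure: transaction state diagram. States: Active, Partially Committed, Committed, Failed, Terminated. Transitions: Begin Transaction enters Active; End Transaction moves Active to Partially Committed; Commit moves Partially Committed to Committed; Abort moves Active or Partially Committed to Failed; Committed and Failed both lead to Terminated.]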
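The state diagram can be sketched as a small transition table. This is only an illustration: the state names follow the diagram, while the event name "terminate" and the function name run are assumptions introduced here.

```python
# Allowed transitions between transaction states.
# Keys are (current_state, event); values are the next state.
TRANSITIONS = {
    ("ACTIVE", "end_transaction"): "PARTIALLY_COMMITTED",
    ("ACTIVE", "abort"): "FAILED",
    ("PARTIALLY_COMMITTED", "commit"): "COMMITTED",
    ("PARTIALLY_COMMITTED", "abort"): "FAILED",
    ("COMMITTED", "terminate"): "TERMINATED",   # "terminate" is a hypothetical event name
    ("FAILED", "terminate"): "TERMINATED",
}


def run(events, state="ACTIVE"):
    """Apply events in order; raise ValueError on a transition the diagram does not allow."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"illegal event {event!r} in state {state}")
        state = TRANSITIONS[key]
    return state
```

For example, run(["end_transaction", "commit", "terminate"]) walks the successful path, while run(["commit"]) is rejected because an active transaction must first reach the partially committed state.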
5.1.3. Problems in Concurrent Execution
Example Transaction
Account balance of A (X) is 1000;
Account balance of B (Y) is 2000;
Transaction T1 - Rs.50 is withdrawn from A and deposited in B.
Transaction T2 - Rs.100 deposited to account A.
T1 (N = 50)                          T2 (M = 100)
Operation        Value               Operation        Value
                 A = 1000                             A = 1000
read_item(X)                         read_item(X)
X := X - N;      A - 50 = 950        X := X + M;      A + M = 1100
write_item(X);   A = 950             write_item(X);   A = 1100
read_item(Y);    B = 2000
Y := Y + N;      B = 2000 + 50
write_item(Y);   B = 2050
5.1.3. Problems in Concurrent Transaction Processing
The Lost Update Problem -Example
T1           T2           T1           T2
READ(X)                   X=80
X=X-N                     X=80-8
             READ(X)                   X=80
WRITE(X)                  X=72
             X=X+M                     X=80+2
             WRITE(X)                  X=82
READ(Y)                   Y=100
Y=Y+N                     Y=100+8
WRITE(Y)                  Y=108

When T2 reads X, the value of X is still 80, because the change done by T1 is not yet written.
• But, after the execution, the results (X=82 and Y=108) do not
match the expected values (X=74 and Y=108).
• The resulting X value is incorrect because the update done by T1
to X is lost: T2 computed its result from the value of X it read from the DB before T1 wrote.
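The interleaving in this example can be replayed with a small simulation. This is a minimal sketch: the tuple format of a step and the name run_schedule are assumptions made for illustration, and the values (X=80, Y=100, N=8, M=2) are taken from the example.

```python
def run_schedule(schedule, db):
    """Execute (txn, op, arg) steps; each transaction keeps private local copies."""
    local = {}
    for txn, op, arg in schedule:
        workspace = local.setdefault(txn, {})
        if op == "read":            # copy the database value into the local workspace
            workspace[arg] = db[arg]
        elif op == "add":           # arg is (item, delta); changes the local copy only
            item, delta = arg
            workspace[item] += delta
        elif op == "write":         # copy the local value back to the database
            db[arg] = workspace[arg]
    return db


N, M = 8, 2
T1 = [("T1", "read", "X"), ("T1", "add", ("X", -N)), ("T1", "write", "X"),
      ("T1", "read", "Y"), ("T1", "add", ("Y", N)), ("T1", "write", "Y")]
T2 = [("T2", "read", "X"), ("T2", "add", ("X", M)), ("T2", "write", "X")]

# Interleaving from the example: T2 reads X before T1 writes it.
interleaved = [T1[0], T1[1], T2[0], T1[2], T2[1], T2[2], T1[3], T1[4], T1[5]]

lost = run_schedule(interleaved, {"X": 80, "Y": 100})
serial = run_schedule(T1 + T2, {"X": 80, "Y": 100})
```

Here lost ends with X=82 (T1's subtraction is overwritten), while the serial run ends with X=74; Y is 108 in both.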
The Temporary Update (or Dirty Read) Problem
T1                 T2
read_item(X);
X = X - N;
write_item(X);
                   read_item(X);
                   X = X + M;
                   write_item(X);
read_item(Y);
rollback;

• In the given table, transaction T1 has updated the value of X, and then transaction T2 has read the updated value of X.

T1           T2           T1       T2
READ(X);                  80
X=X-N;                    80-5
WRITE(X);                 75
             READ(X);              75
             X=X+M;                75+4
             WRITE(X);             79
READ(Y);                  100
ROLLBACK;                 ROLLBACK

X value should be 80 when T1 is rolled back. But T2 has read X from the temporary update done by T1.
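The arithmetic of this dirty read can be traced step by step. The sketch below is illustrative only: the function name dirty_read_demo and its parameter names are assumptions, and the defaults (X=80, N=5, M=4) come from the example.

```python
def dirty_read_demo(committed_x=80, n=5, m=4):
    """Return (value T2 read, value T2 wrote, value T2 should have written)."""
    x_in_db = committed_x - n      # T1: write_item(X), not yet committed
    t2_read = x_in_db              # T2: read_item(X) sees T1's uncommitted value
    t2_written = t2_read + m       # T2: write_item(X), based on the dirty value
    # T1 rolls back, so logically X never left its committed value.
    expected = committed_x + m     # what T2 would have written without the dirty read
    return t2_read, t2_written, expected
```

With the default values, T2 reads 75 and writes 79, whereas a T2 that had only seen committed data would have written 84.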
The Incorrect Summary Problem
X=80; Y=100; N=5; M=4; A=5;

T1           T3           T1       T3
             SUM=0;                0
             READ(A);              5
             SUM+=A;               5
READ(X);                  80
X=X-N;                    80-5
WRITE(X);                 75
             READ(X);              75
             SUM+=X;               5+75
             READ(Y);              100
             SUM+=Y;               80+100
READ(Y);                  100
Y=Y+N;                    100+5
WRITE(Y);                 105

T3 reads X after it is updated by T1, so the correct value of X is taken for the sum. But T3 reads Y before it is updated, and hence uses an outdated value of Y for the sum.
The correct sum after reading Y should be 80 + 105. Instead it gives 80 + 100, since Y is read as 100 instead of 105.
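The same interleaving can be replayed directly in code. This is a rough sketch with a hypothetical function name (incorrect_summary_demo); the values A=5, X=80, Y=100, N=5 are the ones given in the example.

```python
def incorrect_summary_demo():
    """Replay the interleaving of T1 (moves N from X to Y) and T3 (sums A, X, Y)."""
    db = {"A": 5, "X": 80, "Y": 100}
    n = 5
    consistent_total = sum(db.values())   # total of A + X + Y before T1 starts

    total = 0
    total += db["A"]          # T3: READ(A); SUM += A
    db["X"] = db["X"] - n     # T1: READ(X); X = X - N; WRITE(X)
    total += db["X"]          # T3: READ(X) sees the updated X
    total += db["Y"]          # T3: READ(Y) sees the old Y
    db["Y"] = db["Y"] + n     # T1: READ(Y); Y = Y + N; WRITE(Y)

    return total, consistent_total, sum(db.values())
```

Because T1 only moves N from X to Y, the total of A, X and Y is the same before and after T1; the sum computed by T3 in the middle of T1 differs from both.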
T1           T2           T1       T2
READ(X)                   80
             READ(X)               80
             X=X-5                 80-5
             WRITE(X)              75
READ(X)                   75
5.1.3. DBMS Failures
5.2. Properties of Transactions
ACID properties
The ACID properties are properties of transactions that are
enforced by the concurrency control and recovery methods of
the DBMS.
ACID stands for
i) A – Atomicity
ii) C – Consistency
iii) I – Isolation
iv) D – Durability
Each property is described in detail in the upcoming slides.
i) Atomicity
ii) Consistency
• A transaction should be executed completely from
beginning to end, without interference from other
transactions, to preserve consistency. A transaction
takes the database from one consistent state to another.
• A database state is a collection of all the data values in the
database at a given point in time.
• Preserving consistency is viewed as the
responsibility of the developers who write the programs
and of the DBMS module that enforces integrity constraints.
• A consistent state of the database fulfils the requirements
indicated in the schema and other constraints on the
database that should hold.
• If a database is in a consistent state before executing the
transaction, then it will be in a consistent state after the
complete execution of the transaction (assuming that no
interference occurs with other transactions).
iii) Isolation
• During the execution of a transaction, it should appear
as if it is isolated from other transactions even though
there are many transactions happening concurrently.
• The execution of a transaction should not interfere
with other transactions executing simultaneously.
• The isolation property is enforced by the
concurrency control subsystem of the DBMS.
• If each transaction does not make its
write updates visible to other transactions until it is
committed, one form of isolation is enforced that
solves the temporary update problem.
Levels of Isolation
Before talking about isolation levels, let’s discuss
database locks.
Database Locks
A database lock is used to "lock" data in a database table so that
only one transaction/user/session may edit it. Database locks are
used to prevent two or more transactions from changing the same
piece of data at the same time.
There have been attempts to define the level of isolation
of a transaction.
• Level 0 (zero) isolation (known as Read
Uncommitted) - A transaction does not overwrite the
dirty reads of higher-level transactions.
• Level 1 (one) isolation (known as Read Committed) -
A transaction has no lost updates.
• Level 2 isolation (known as Repeatable Read) - A
transaction has no lost updates and no dirty reads.
• Level 3 isolation / true isolation (known as
Serializable Read) - A transaction has no lost
updates and no dirty reads, and in addition has repeatable reads.
Example for Level 0 (zero) isolation
T1                              T2
update employee
set salary = salary - 100
where emp_number = 25
                                select sum(salary)
                                from employee
                                Commit;
Rollback;
Example for Level 1 isolation
T1                              T2
update employee
set salary = salary - 100
where emp_number = 25;
                                select sum(salary)
                                from employee
                                where emp_number < 50;
rollback
                                commit
T1                              T2
select sum(salary)
from employee
where emp_number < 25
                                update employee
                                set salary = salary - 100
                                where emp_number = 22
                                commit transaction
select sum(salary)
from employee
where emp_number < 25
commit transaction
Example for Level 2 isolation
In the example on the previous slide:
T1 queries to get the sum of salaries of employees whose
emp_number is less than 25. T2 updates the salary of the
employee whose emp_number is 22. Then T1 executes the
same query again.
If transaction T2 modifies and commits the changes to the
employee table after the first query in T1, but before the second
one, the same two queries in T1 would produce different
results. Isolation level 2 blocks transaction T2 from executing. It
would also block a transaction that attempted to delete the
selected row. Thus, lost updates and dirty reads are avoided.
Phantoms
• If a database table includes a record which was not
present at the start of a transaction but is present at the
end then it is called a phantom record.
• For example, if transaction T2 inserts a record into a table
that transaction T1 currently reads (the record also
satisfies the filtering conditions used in T1), then that
record is a phantom because it was not there when T1
started but is there when T1 ends.
• If the equivalent serial order is T1 followed by T2, then the
record should not be seen. But if it is T2 followed by T1,
then the phantom record should be in the result given to
T1.
Consider the following example on phantom reads
[Table: interleaved operations of T1 and T2; each transaction ends with commit transaction]
Example for Level 2 isolation (Phantom reads in Level 2
Isolation)
In the example given in the previous slide:
T1 retrieves the rows from the employee table where salaries are more than 45000.
Then T2 inserts a row that meets the criteria given in T1 (an employee whose
salary is greater than 45000) and commits. T1 issues the same query again.
The number of rows retrieved by the same select query in T1 is different when
the isolation level is 2.
Total no. of records retrieved by the second select statement = total no.
of records retrieved by the first select statement + 1.
This creates a phantom. Phantoms occur when one transaction reads a set of
rows that satisfy a search condition, and then a second transaction modifies
those data. If the first transaction repeats the read with the same search
conditions, it obtains a different set of rows.
In the above example, T1 sees a phantom row in the second select query.
Example for Level 3 isolation
Explanation of the example is given in the next slide.
[Table: interleaved operations of T1 and T2; each transaction ends with commit transaction]
In the table shown in previous slide;
Snapshot isolation
Example for Snapshot isolation
T1                          T2
SELECT *
FROM employee
ORDER BY empID;
                            INSERT INTO employee
                            (empID, empname)
                            VALUES(600, 'Anura');
                            COMMIT;
SELECT *
FROM employee
ORDER BY empID;
                            INSERT INTO employee
                            (empID, empname)
                            VALUES(700, 'Arjuna');
                            COMMIT;

empID   empname
100     Upul
200     Manjitha

This is the output of the first select query of T1. It only generates the data that is available in the current snapshot.
T1                          T2
SELECT *
FROM employee
ORDER BY empID;
                            INSERT INTO employee
                            (empID, empname)
                            VALUES(600, 'Anura');
                            COMMIT;
SELECT *
FROM employee
ORDER BY empID;
                            INSERT INTO employee
                            (empID, empname)
                            VALUES(700, 'Arjuna');
                            COMMIT;
SELECT * FROM employee
ORDER BY empID;

empID   empname
100     Upul
200     Manjitha

The second select statement of T1 produces the same result as the first select statement, because T1 has not committed yet. The snapshot taken by T1 remains unchanged until it commits.
iv) Durability
• Durability or permanency means, once the changes of a
transaction are committed to the database, those changes
must remain in the database and should not be lost.
Example for Durability
• Definition: Changes must never be lost because of
subsequent failures (e.g., power failure).
• In the transaction T1, suppose a failure occurs after
write(A) but before write(B).
To recover the database,
i. We must remove the changes of partially done transactions.
Therefore, the change done on A should be rolled back.
(Before the crash, A was 950, so it needs to be rolled
back to 1000.)
ii. We need to reconstruct completed transactions.
If the system fails after the commit operation of a
transaction, but before the data could be written on to
the disk, then that transaction needs to be
reconstructed.
The database should keep all its latest updates even if the
system fails. If a transaction commits after updating data, then
the database should have the modified data.
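The two recovery steps (undoing partially done transactions and reconstructing committed ones) can be sketched with a simple log of before and after values. This is an assumption-laden illustration, not how any particular DBMS implements recovery: the log record layout and the name recover are made up here, and the sketch assumes no two transactions wrote the same item concurrently.

```python
def recover(disk, log):
    """Bring `disk` to a state reflecting exactly the committed transactions.

    Log records (hypothetical layout):
      ("write", txn, item, old_value, new_value)
      ("commit", txn)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Redo: reapply the writes of committed transactions in log order.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            disk[rec[2]] = rec[4]

    # Undo: restore old values written by uncommitted transactions, newest first.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            disk[rec[2]] = rec[3]
    return disk


# Case i: T1 wrote A (1000 -> 950) and crashed before write(B) and commit.
undone = recover({"A": 950, "B": 2000},
                 [("write", "T1", "A", 1000, 950)])

# Case ii: T2 committed A (1000 -> 1100) but the new value never reached disk.
redone = recover({"A": 1000},
                 [("write", "T2", "A", 1000, 1100), ("commit", "T2")])
```

In the first case A is rolled back to 1000; in the second the committed value 1100 is reconstructed from the log.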
Activity
read (x)
x=x-n
read (x)
x=x+m
write (x)
read (y)
write (x)
y=y+n
write (y)
T1             T2
read (x)
x=x-n
write (x)
               read (x)
               x=x+m
               write (x)
               commit
read (y)
abort
Identify the problem that would result in the given

T1             T2
               sum = 0
               read (a)
read (x)
x=x-n
write (x)
               read (x)
               sum = sum + x
               read (y)
               sum = sum + y
read (y)
y=y+n
write (y)
Drag and drop the matching words for the sentence
1. Problems caused by hardware, software, or network error
that occurs in the computer system during transaction
execution. –
2. Occurs due to the errors in operation such as integer
overflow or division by zero. –
3. Occurs when an exception in the program causes
the cancellation of a transaction –
4. Occurs due to read or write malfunction and data in some
disk blocks may get lost –
5. Problems such as power loss, failure in air-conditioning,
natural disasters, theft, sabotage, mistakenly overwriting
disks or tape, and mounting of a wrong tape by the
operator –
item table=>
item_no 1 2 3 4 5 6 7
A list of item numbers and their prices are given in the above
table. After the two transactions T1, T2 were executed on the
above item table, the output was 14,500.
What can be the least possible isolation level used in T2?
T1 T2
rollback
commit
5.3 Schedules
Schedules of Transactions
• The order in which the operations of a set of
transactions are executed is called a schedule.
S = T1, T2, T3,.....,Tn
• In this slide set, we will be using a set of notations for
the operations included in a transaction and to identify
the transaction number we will be adding a subscript.
• Following are the notations and their descriptions, that
we use in this slide set.
b begin_transaction
r read_item
w write_item
e end_transaction
c commit
a abort
T1        T2
r(X)
          r(X)
w(X)
r(Y)
          w(X)
w(Y)
• Schedules of Transactions
Two operations in a schedule are said to conflict if they
satisfy all of the following conditions.
1. The operations belong to different transactions.
2. The operations access the same data item.
3. At least one of the two operations is a write (insert,
update, delete)
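The three conditions translate directly into a predicate. A minimal sketch, assuming an operation is represented by a hypothetical Op tuple of transaction number, kind ("r" or "w") and data item:

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: int    # transaction number, e.g. 1 for T1
    kind: str   # "r" for read_item, "w" for write_item
    item: str   # data item name, e.g. "X"


def conflicts(a: Op, b: Op) -> bool:
    """True if the two operations conflict under the three conditions above."""
    return (
        a.txn != b.txn                  # 1. different transactions
        and a.item == b.item            # 2. same data item
        and "w" in (a.kind, b.kind)     # 3. at least one is a write
    )
```

For example, r1(X) and w2(X) conflict, while r1(X) and r2(X) do not, and neither do w1(X) and w2(Y).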
S’’ = r1(X); w1(X); r2(X); r1(Y); w2(X); w1(Y); a1; a2;
In the above example, T2 must also be aborted because T1 aborted.
The reason is that T2 read the value of X written by T1.
5.4 Serializability
w(a)=87
Initial values of a=90 and b=90
r(b)=90
b=b+3
w(b)=93
What is the final value of a and b after
completion of T1 and T2?
c
r(a)=87 a= 89
a=a+2 b=93
w(a)=89
w(a)=92 a= 92
b=93
c
w(a)=87
Initial values of a=90 and b=90
r(a)=87
w(b)=93
Is this a correct schedule? Yes. The final
c
answers are correct.
S1 = r1(X); w2(X);
S2 = w2(X); r1(X);
S1 and S2 are not conflict equivalent since the order of the conflicting
operations is different.
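A conflict-equivalence check can be sketched by comparing the ordered pairs of conflicting operations in the two schedules. The representation (a list of (txn, kind, item) tuples) and the function names are assumptions for illustration, and the sketch assumes no transaction repeats the same operation on the same item.

```python
def conflict_pairs(schedule):
    """Return the set of ordered pairs (earlier_op, later_op) that conflict."""
    pairs = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            different_txn = a[0] != b[0]
            same_item = a[2] == b[2]
            has_write = "w" in (a[1], b[1])
            if different_txn and same_item and has_write:
                pairs.add((a, b))
    return pairs


def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair appears in the same order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)


S1 = [(1, "r", "X"), (2, "w", "X")]   # r1(X); w2(X);
S2 = [(2, "w", "X"), (1, "r", "X")]   # w2(X); r1(X);
```

For S1 and S2 above the single conflicting pair appears in opposite orders, so conflict_equivalent(S1, S2) is False.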
• A schedule S is serializable, if it is conflict equivalent to a serial schedule S’.

Schedule P              Schedule Q
T1        T2            T1        T2
r(a)                    r(a)
a=a-3                   a=a-3
w(a)                    w(a)
r(b)                              r(a)
b=b+3                             a=a+2
w(b)                              w(a)
c                                 c
          r(a)          r(b)
          a=a+2         b=b+3
          w(a)          w(b)
          c             c

Ex:
• Schedule P is a serial schedule.
• Schedule Q performs all the conflicting operations in the same order as schedule P. Therefore, P and Q schedules are conflict equivalent.
• Hence, Q is a serializable schedule.
Testing for Serializability of a Schedule

T1 T2 T3
1. r(Z)
2. r(Y)
7. w(X)
8. w(Y)
9. w(Z)
10. r(X)
13. r(X)

Line no. 3 and 4: T2->T3 (Y)
Line no. 1 and 9: T2->T3 (Z)
Line no. 7 and 10: T1->T2 (X)
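The test works by drawing an edge Ti->Tj for every pair of conflicting operations in which Ti's operation comes first, and then checking the resulting precedence graph for a cycle. A minimal sketch of that idea follows; the schedule representation (a list of (txn, kind, item) tuples), the function names and the two sample schedules are assumptions chosen for illustration, not the schedule from the slide.

```python
def precedence_edges(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later operation of Tj."""
    edges = set()
    for i, (ti, ki, xi) in enumerate(schedule):
        for tj, kj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "w" in (ki, kj):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by `edges`."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}   # node -> "visiting" or "done"

    def visit(u):
        state[u] = "visiting"
        for v in graph[u]:
            if state.get(v) == "visiting":
                return True            # back edge: cycle found
            if v not in state and visit(v):
                return True
        state[u] = "done"
        return False

    return any(visit(u) for u in graph if u not in state)


def conflict_serializable(schedule):
    return not has_cycle(precedence_edges(schedule))


# r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y): edges T1->T2 and T2->T1 form a cycle.
lost_update = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"),
               ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y")]

# T1 finishes with item a before T2 touches it: only the edge T1->T2 exists.
interleaved_ok = [("T1", "r", "a"), ("T1", "w", "a"), ("T2", "r", "a"),
                  ("T2", "w", "a"), ("T1", "r", "b"), ("T1", "w", "b")]
```

A schedule with an acyclic precedence graph is conflict serializable, and any topological order of the graph gives an equivalent serial schedule.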
View Equivalence and View Serializability
• The criteria for two schedules S and S′ to be view equivalent are
as follows.
Schedule S              Schedule P
T1        T2            T1        T2
r(a)                    r(a)
w(a)                    w(a)
          r(a)          r(b)
          w(a)          w(b)
r(b)                              r(a)
w(b)                              w(a)
          r(b)                    r(b)
          w(b)                    w(b)
5.5 Transaction Support in SQL
SET TRANSACTION
READ ONLY,
ISOLATION LEVEL READ UNCOMMITTED,
DIAGNOSTIC SIZE 6;
Read Uncommitted
Read Committed
Repeatable Read
Serializable
Read Uncommitted: Declares that a transaction can read rows
that have been modified by other transactions but not yet
committed. This can result in dirty reads, non-repeatable reads
and phantoms.
• Example - Consider the following transactions T1 and T2, which
operate on an account with an initial balance of Rs. 50,000.
Transaction (T1) →
Deduct Rs. 1000 from an account (Customer_ID=Cid_1105)
for an automated bill payment that happens every month.
Because an error occurred, T1 was rolled back
without committing.
Transaction (T2) →
While T1 is executing, the customer
(Customer_ID=Cid_1105) checks his account balance.
Read Committed: Declares that a transaction can only read
data that has been committed by other transactions. This
prevents dirty reads, but can still result in non-repeatable reads and
phantoms.
• Example - Consider the following transactions T1 and T2, which
operate on an account with an initial balance of Rs. 50,000.
Transaction (T1) →
Deduct Rs. 1000 from an account (Customer_ID=Cid_1105)
for an automated bill payment that happens every month.
This transaction was successfully completed and committed
to the database.
Transaction (T2) →
While T1 is executing, the customer
(Customer_ID=Cid_1105) checks his account balance twice
consecutively. T2 reads the account balance twice.
Transaction (T1) →
Deduct Rs. 1000 from an account (Customer_ID=Cid_1105)
for an automated bill payment that happens every month.
Then T1 commits.
Transaction (T2) →
While T1 is executing, the customer
(Customer_ID=Cid_1105) checks his account balance twice
consecutively.
• The first read statement of T2 will not get the balance, but the
second read statement in T2 will get the output
49,000.
• Explanation → Because the isolation level is set to
“REPEATABLE READ” in T1, the first read statement in
T2 is not allowed to read the balance: T1 has
updated the balance but has not committed yet.
• When T2 reads the balance again, T1 has
completed and committed to the database. Hence it gets
the output 49,000.
Transaction (T1) →
Reads the details of employees who are working in the "123"
department twice consecutively.
Transaction (T2) →
At the same time, a new record is inserted into the employee
table with name = "June", who is working in the "123"
department.
5.6 Consistency in NoSQL
Consistency
• As we discussed in previous slides, a transaction leads
the database from one consistent state to another.
• In other words, transactions must affect database only
in valid ways.
Consistency in NoSQL
• In NoSQL databases, eventual consistency is preferred
over immediate consistency. This will be discussed in
detail later.
Update Consistency
• Update consistency in NoSQL makes sure that write-write
conflicts do not occur.
• A write-write conflict occurs when two transactions
update the same data item at the same time. If the server
simply serializes the updates, one is applied and then
overwritten by the other, so a lost update occurs.
• There are 2 types of approaches for maintaining
consistency.
– Pessimistic approach: Prevents conflicts from
occurring.
– Optimistic approach: Lets conflicts occur, but
detects them and takes action to sort them out.
Samanali and Krishna read the record A which has the value
100. Samanali wants to add 50 to the A value. Just before
writing the value, she checks the value of A, to make sure it
has not changed since her last read and then does the
modification. Meanwhile, Krishna wants to subtract 20 from the
value of A. Just before the modification, he also checks the value
of A to make sure the value remains unchanged at 100. But as
Samanali has changed A to 150, Krishna's update
fails.
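This check-before-write behaviour is often called a conditional update. A minimal sketch of the scenario, assuming a plain dictionary stands in for the data store and using a hypothetical function name conditional_update; a real system would have to perform the check and the write as one atomic step.

```python
def conditional_update(store, key, expected, new_value):
    """Write new_value only if the stored value still equals what the client read."""
    if store[key] != expected:
        return False            # someone else changed the value: reject the update
    store[key] = new_value
    return True


store = {"A": 100}
samanali_read = store["A"]      # both clients read A = 100
krishna_read = store["A"]

samanali_ok = conditional_update(store, "A", samanali_read, samanali_read + 50)
krishna_ok = conditional_update(store, "A", krishna_read, krishna_read - 20)
```

Samanali's update succeeds and sets A to 150; Krishna's is rejected because A is no longer the 100 he read, so he has to re-read and retry.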
Samanali and Krishna read the record A, which has the value
100. Then Samanali adds 50 to this value and writes it.
Meanwhile, Krishna subtracts 20 from the value of A and writes it.
The DBMS will save both values, 150 (changed by Samanali)
and 80 (changed by Krishna), as possible values for A and
then mark them as conflicting.
Read Consistency
• Read consistency in NoSQL will guarantee that readers
will always get consistent responses to their requests.
• Read consistency will prevent “inconsistent read” or
“read-write conflict”.
• Read consistency preserves:
- Logical consistency (ensure that different
data items make sense together).
- Replication consistency (ensure that same
data item has the same value when read from
different replicas).
- Session consistency (within user’s session
there is read-your-writes consistency. It means once
you’ve made an update, you are guaranteed to
continue seeing that update).
Replication
• Creating multiple copies of data items over different
servers is known as replication.
• It can be implemented in the following two forms.
- Master-Slave : In master-slave replication, the
master processes the updates and then changes are
propagated to slaves.
- Peer-to-peer: In peer-to-peer replication, all the
nodes can process updates and then synchronize
their copies of data.
Master-Slave Replication
Peer-to-Peer Replication
• All the replicas have equal weight
• Every replica can process updates
• Even if one replica fails, system can operate normally.
• Pros
- Resistant to node failures
- Can easily add nodes to improve performance
• Cons
- Write-write inconsistencies can occur
- Read-write inconsistencies can occur due to slow
propagation
Relaxing Consistency
• Even though consistency is a good property, normally it is
impossible to achieve consistency without significant
sacrifices to other characteristics of the system such as
availability.
• Transactions will enforce consistency but it is possible to
relax isolation levels to enable individual transactions to
read data that has not been committed yet.
• Relaxing isolation level will improve the performance but
will reduce the consistency.
CAP Theorem
• In a database which has several connected nodes, given
the three properties of Consistency, Availability and
Partition tolerance, it is possible to enforce only two
properties at a time.
- Consistency: (We discussed earlier).
- Availability: Every request received by a non failing
node in the system must result in a response.
- Partition tolerance: The system continues to operate
despite communication breakages that separate the
cluster into multiple partitions which are unable to
communicate with each other.
• The resulting system designed using CAP theorem will not
be perfectly consistent or perfectly available but would
have a reasonable combination.
[Figure: CAP diagram with Partition tolerance (P) and Availability (A) labelled]
CP Category: Some data might become unavailable.
CA Category: Network problems might stop the system.
AP Category: Data inconsistencies may occur.
Durability
• Durability means that committed transactions will survive
permanently (even if the system crashes). This is
achieved by flushing the records to disk (Non-volatile
memory) before acknowledging the commit.
Relaxing Durability
• In relaxing durability, database can apply updates in
memory and periodically flush changes to the disk. If the
durability needs can be specified on a call-by-call basis,
more important updates can be flushed to disk.
• By relaxing durability, we can gain higher performance.
Relaxing Durability
• Another class of durability tradeoffs comes up with
replicated data.
• A failure of replication durability occurs when a node
processes an update but fails before that update is
replicated to the other nodes.
• For example, assume a peer-to-peer replicated system
with three nodes, R1, R2 and R3. If an update is
applied in the memory of R1, but R1 crashes before the
update is sent to R2 and R3, a failure of replication durability
occurs. This can be avoided by setting the durability level:
if the system does not acknowledge the commit until the
update has been propagated to a majority of the nodes, the above
scenario is avoided.
Quorums
• Answers the question, “How many nodes need to be
involved to get strong consistency?”
• The write quorum specifies the number of nodes that must
take part in a write to avoid conflicting writes.
• If W > N/2, the system is said to have strong
consistency.
• W - Number of nodes participating in the write
• N - Number of nodes involved in replication
• The number of replicas is known as the replication
factor.
• If the number of nodes that must be contacted for a read is R,
then when R + W > N you can have a strongly consistent
read.
Quorums Example
• Let’s consider a system with replication factor 3. How
many nodes are required to confirm a write?
For a system to have strong consistency, W should be
greater than N/2 (N is the replication factor).
Here, W needs to be greater than 3/2:
W > 1.5
Therefore we need at least 2 nodes to confirm a write.
• What is the number of nodes you need to contact for a
read?
R + W > N (according to the definition in the previous slide)
R > N - W
R > 3 - 2
R > 1
Therefore the number of nodes you need to contact for a read is 2.
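The two inequalities can be turned into a small calculation of the smallest integer quorums. This is a sketch under the definitions used here (W > N/2 and R + W > N); the function names are assumptions.

```python
def min_write_quorum(n):
    """Smallest integer W with W > N/2."""
    return n // 2 + 1


def min_read_quorum(n, w):
    """Smallest integer R with R + W > N."""
    return n - w + 1


n = 3                               # replication factor from the example
w = min_write_quorum(n)             # nodes that must confirm a write
r = min_read_quorum(n, w)           # nodes that must be contacted for a read
```

For a replication factor of 3 this gives W = 2 and R = 2, matching the worked example; with a replication factor of 5 and W = 3, a read needs to contact 3 nodes.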
Version Stamps
• We need human intervention to work with updates in a
transactional system as transactions have limitations.
• Holding locks for a long period of time affects the
performance of the system. A solution for this is version
stamps: a field that changes every time the underlying
data in the record changes.
• The system can note the version stamp when reading the
data and can check whether it has changed before writing
the data.
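One simple form of version stamp is a counter that is incremented on every write. The sketch below illustrates the read-then-check-then-write pattern with such a counter; the class name VersionedStore and its methods are hypothetical, and a real store would need to make the check and the write atomic.

```python
class VersionedStore:
    """In-memory store where every record carries a counter version stamp."""

    def __init__(self):
        self._records = {}      # key -> (value, version)

    def insert(self, key, value):
        self._records[key] = (value, 1)

    def read(self, key):
        """Return (value, version) so the caller can remember the version stamp."""
        return self._records[key]

    def write(self, key, new_value, expected_version):
        """Write only if the record's version stamp is unchanged since the read."""
        _, current_version = self._records[key]
        if current_version != expected_version:
            return False        # the record changed in the meantime
        self._records[key] = (new_value, current_version + 1)
        return True


store = VersionedStore()
store.insert("A", 100)
value1, version1 = store.read("A")   # first client reads
value2, version2 = store.read("A")   # second client reads the same version

first_ok = store.write("A", value1 + 50, version1)
second_ok = store.write("A", value2 - 20, version2)
```

The first write succeeds and bumps the version stamp to 2; the second is refused because it still carries version 1.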
ii.Create a GUID
• Pros
- Can be generated by any node
• Cons
- Large numbers
- Unable to compare and find the most recent
version directly.
Activity
Consider the T1 and T2 transactions given in tabular format. If T1 reads 65 rows and 66 rows respectively in the Read1 and Read2 operations, what is the minimum isolation level of transaction T1?
Read1: 65 rows
Read2: 66 rows
Consider the T1 and T2 transactions given in tabular format. If T1 reads 65 rows in both the Read1 and Read2 operations, what is the minimum isolation level of transaction T1?
Read1: 65 rows
Read2: 65 rows
Write whether the given statements are true or false considering the given schedule S.
1. S is conflict serializable and recoverable. (_______)
2. S is conflict serializable but not recoverable. (_______)
3. S includes blind writes. (_______)
4. S is recoverable but not conflict serializable. (_______)

Schedule S:
T1 T2 T3 T4
r(X)
w(X)
c
w(X)
c
w(Y)
r(Z)
c
r(X)
r(Y)
Property              Explanation
Consistency           System continues to operate even in the presence of node failure.
Availability          System continues to operate in spite of network failures.
Partition Tolerance   All the users can see the same data at the same time.
• Drag and drop the correct answer from the given list.
Summary
Properties of Transactions: ACID properties, levels of isolation
Schedules: Schedules of Transactions, Schedules Based on Recoverability