Unit V Concurrency Control
Foundation of Distributed Concurrency Control
- In order to analyze the correctness of a distributed concurrency control method, we need a formal model.
- Serializability in a Centralized Database
A transaction accesses the database by issuing read and write primitives.
Let Ri(X) and Wi(X) denote a read and a write operation issued by transaction Ti on data item X.
A schedule is a sequence of operations performed by transactions.
Ex: S1 : Ri(X) Rj(X) Wi(Y) Rk(Y) Wi(X)
- Two transactions Ti and Tj execute serially in a schedule S if the last operation of Ti precedes the first operation of Tj in S (or vice versa); otherwise they execute concurrently.
- A schedule is serial if no transactions execute concurrently in it.
Ex: S2 : Ri(X) Wi(X) Ri(Y) Rj(X) Wj(Y) Rk(Y) Wk(X)
Serial(S2) : Ti < Tj < Tk
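The definition of a serial schedule can be checked mechanically. The sketch below is only an illustration: the encoding of an operation as a tuple (operation, transaction, item) and the name is_serial are assumptions made here, not part of the original notes.

```python
def is_serial(schedule):
    """Return True if no two transactions execute concurrently.

    `schedule` is a list of (op, txn, item) tuples, e.g. ("R", "i", "X").
    """
    finished = set()   # transactions whose run of operations has ended
    current = None     # transaction that issued the previous operation
    for _op, txn, _item in schedule:
        if txn != current:
            if txn in finished:
                return False   # txn resumes after another one ran
            if current is not None:
                finished.add(current)
            current = txn
    return True


# Schedule S1 from the notes: Ri(X) Rj(X) Wi(Y) Rk(Y) Wi(X)
S1 = [("R", "i", "X"), ("R", "j", "X"), ("W", "i", "Y"),
      ("R", "k", "Y"), ("W", "i", "X")]
print(is_serial(S1))  # False: Ti resumes after Tj has started
```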
We do not want transactions to execute serially; they should execute concurrently, provided their execution is correct (serializable).
A schedule is serializable if it is computationally equivalent to a serial schedule.
- To reason about serializability, we need conditions for checking whether two schedules are equivalent.
- The following conditions ensure that two schedules are equivalent.
- – Each read operation reads data item values which are produced by the same write operations in both schedules.
- – The final write operation on each data item is the same in both schedules.
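The two equivalence conditions can be sketched as a small check. This is an illustration under stated assumptions: operations are encoded as (operation, transaction, item) tuples, both schedules contain the same operations, and a transaction writes a given item at most once. The function names are hypothetical.

```python
def reads_from_and_final_writes(schedule):
    """Return (reads_from, final_writes) for a list of (op, txn, item) tuples.

    reads_from maps each read, identified by (txn, position within txn, item),
    to the transaction whose write it reads (None = initial database value).
    """
    last_writer = {}   # item -> transaction of the most recent write
    position = {}      # txn -> number of its operations seen so far
    reads_from = {}
    for op, txn, item in schedule:
        idx = position.get(txn, 0)
        position[txn] = idx + 1
        if op == "R":
            reads_from[(txn, idx, item)] = last_writer.get(item)
        else:
            last_writer[item] = txn
    return reads_from, last_writer   # last_writer now holds the final writes


def equivalent(s1, s2):
    """Condition 1 (same reads-from) and condition 2 (same final writes)."""
    return reads_from_and_final_writes(s1) == reads_from_and_final_writes(s2)


serial = [("R", "i", "X"), ("W", "i", "X"), ("R", "j", "X"), ("W", "j", "X")]
interleaved = [("R", "i", "X"), ("R", "j", "X"), ("W", "i", "X"), ("W", "j", "X")]
print(equivalent(serial, interleaved))  # False: Rj(X) reads a different value
```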
- Serializability in a Distributed Database
In a distributed database, each transaction performs operations at several sites.
The sequence of operations performed by transactions at a site is a Local Schedule.
An execution of n distributed transactions T1, T2, …, Tn at m sites is modelled by a set of local schedules S1, S2, ..., Sm.
If we use a local concurrency control method at each site, we can ensure that all local schedules are serializable. But this is not sufficient for distributed transactions.
Ex: S1 (at Site 1) : Ri(X) Wi(X) Rj(X) Wj(X)
S2 (at Site 2) : Rj(X) Wj(X) Ri(X) Wi(X)
Both local schedules are serial, but Ti precedes Tj at site 1 while Tj precedes Ti at site 2, so no single serial order is consistent with both.
- To obtain serializability of distributed transactions, a stronger condition is required.
- The execution of transactions T1, T2, …, Tn is correct if
- – Each local schedule Sk is serializable
- – There exists a Total Ordering of T1, T2, …, Tn such that if Ti < Tj in the total ordering, then there is a serial schedule Sk’ such that Sk is equivalent to Sk’ and Ti < Tj in Serial(Sk’), for each site k where both transactions have executed some action.
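A simple way to look for such a total ordering is sketched below. Note the swapped-in technique: instead of the equivalence conditions of the notes, the sketch uses conflict precedence (two operations of different transactions on the same item, at least one of them a write), which gives a sufficient test. The tuple encoding and the function names are assumptions.

```python
from itertools import combinations


def precedence_edges(schedule):
    """Edges (a, b): an operation of a precedes a conflicting operation of b."""
    edges = set()
    for (op1, t1, x1), (op2, t2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "W" in (op1, op2):
            edges.add((t1, t2))
    return edges


def has_global_order(local_schedules):
    """True if the union of all local precedence edges has no cycle."""
    edges = set()
    for schedule in local_schedules:
        edges |= precedence_edges(schedule)
    remaining = {t for edge in edges for t in edge}
    while remaining:
        # Transactions with no predecessor left can be placed next in the order.
        free = {n for n in remaining
                if not any(b == n and a in remaining for a, b in edges)}
        if not free:
            return False   # every remaining transaction has a predecessor: cycle
        remaining -= free
    return True


# The example from the notes: each local schedule is serial on its own.
S1 = [("R", "i", "X"), ("W", "i", "X"), ("R", "j", "X"), ("W", "j", "X")]
S2 = [("R", "j", "X"), ("W", "j", "X"), ("R", "i", "X"), ("W", "i", "X")]
print(has_global_order([S1, S2]))  # False: Ti < Tj at site 1, Tj < Ti at site 2
```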
2-Phase Locking as a Distributed Concurrency Control Method
- If all distributed transactions are 2-phase locked, then all local schedules are serializable.
- If a distributed transaction is 2-phase locked, then its subtransactions at different sites, taken separately, are 2-phase locked.
- We will prove that 2-phase locking is also correct for distributed concurrency control.
- Assume, by contradiction, that no consistent total ordering exists, i.e. that we have n pairs of conflicting operations such that
O1(x) < O2(x)
O2(y) < O3(y)
….
On-1(v) < On(v)
On(z) < O1(z)
where Oi denotes an operation of transaction Ti.
- If transactions T1, T2, …, Tn are 2-phase locked, this situation cannot occur.
- Since the transactions are all 2-phase locked, none of them releases any lock before obtaining its other locks.
- The transactions in the cycle would therefore wait for each other's locks. This leads to a deadlock situation.
- One of the transactions will be aborted by the deadlock resolution algorithm.
- Two-phase locking ensures that all executions are serializable, but it does not allow all serializable executions to be produced, i.e. the 2-phase locking mechanism is more restrictive than necessary.
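The 2-phase property itself (no lock is acquired after any lock has been released) is easy to state in code. The following is a minimal sketch; the ("lock", item) / ("unlock", item) encoding and the name is_two_phase are assumptions made for illustration.

```python
def is_two_phase(actions):
    """actions: list of ("lock", item) / ("unlock", item) in issue order."""
    shrinking = False            # becomes True after the first unlock
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True
        elif kind == "lock" and shrinking:
            return False         # a lock is acquired after a release
    return True


# A transaction that releases x before locking y is not 2-phase locked.
print(is_two_phase([("lock", "x"), ("unlock", "x"), ("lock", "y"), ("unlock", "y")]))  # False
print(is_two_phase([("lock", "x"), ("lock", "y"), ("unlock", "x"), ("unlock", "y")]))  # True
```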
- Example: consider transactions Ti & Tj for a fund transfer from x (site 1) to y (site 2).
Ti : Ri(x) Wi(x) Ri(y) Wi(y)
Tj : Rj(x) Wj(x) Rj(y) Wj(y)
Consider Ti < Tj. The execution E may be like this:
S1 : Ri(x) Wi(x) Rj(x) Wj(x)
S2 : Ri(y) Wi(y) Rj(y) Wj(y)
- The execution E is not allowed by 2-phase locking, because Ti will not release the lock on x until it has obtained the lock on y.
- Also, as per the 2-phase commitment protocol, both write
Time & Timestamps in Distributed Databases
- In a distributed system, it is required to know whether event A at some site happened before event B at a different site.
- Many concurrency control & deadlock prevention algorithms need this kind of information.
- The determination of an ordering of events consists in assigning to each event A which occurs in the distributed system a timestamp TS(A) having the following properties.
- – TS(A) uniquely identifies A.
- – For any two events A & B, if A occurred before B then
TS(A) < TS(B)
- The "occurred before" relationship can be defined as follows.
- The relation "occurred before", denoted →, can be generalized to a distributed environment by using the following rules.
- – If A & B are two events at the same site & A occurred before B then A → B.
- – If the event A consists in sending a message and event B consists in receiving the same message then A → B.
- – If A → B and B → C then A → C.
- We call two events A & B pseudo-simultaneous if neither A → B nor B → A (Example: see fig).
[Figure: events A to F at Site 1 and Site 2, with messages (M1) exchanged between the sites]
We now consider the generation of timestamps. The first condition, uniqueness, can be easily satisfied in a distributed system. It is sufficient that each site adds its site ID, in the least significant position, to a locally unique timestamp.
The second requirement is more complex. First, we use at each site a counter which is steadily incremented, so that the transactions which receive a timestamp at the same site are correctly ordered among themselves.
However, synchronization between counters at different sites would be difficult.
To solve this, the counters of two sites can be kept approximately aligned by simply including in each message the value of the counter of the sending site. If a site receives a message with timestamp value TS which is greater than its current counter, it sets its counter to TS+1.
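The counter scheme described above can be sketched as follows. The class name SiteClock and its methods are hypothetical; the site ID is placed in the least significant position by using a (counter, site_id) tuple, which Python compares component by component.

```python
class SiteClock:
    """Per-site counter that produces globally unique timestamps."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.counter = 0

    def new_timestamp(self):
        # Steadily incremented counter; the site ID makes the value unique.
        self.counter += 1
        return (self.counter, self.site_id)

    def receive(self, sender_counter):
        # Realign with the sender if its counter is ahead of ours.
        if sender_counter > self.counter:
            self.counter = sender_counter + 1


site1, site2 = SiteClock(1), SiteClock(2)
site1.new_timestamp()
site1.new_timestamp()
sent = site1.new_timestamp()       # event A: sending a message at site 1
site2.receive(site1.counter)       # the message carries the sender's counter
later = site2.new_timestamp()      # a later event at site 2
print(sent < later)  # True
```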
Distributed Deadlocks
- The detection and resolution of deadlocks is an important activity in a DBMS.
- In distributed databases, a deadlock consisting of circular waiting may involve several sites.
- We use the Distributed wait-for graph (DWFG) and the Local wait-for graph (LWFG) for detecting deadlock situations.
[Figure: Distributed wait-for graph (DWFG) showing a distributed deadlock between Site 1 and Site 2]
The notation TiAj refers to agent Aj of transaction Ti. A directed edge from an agent TiAj to an agent TrAs means that TiAj is blocked and waiting for TrAs.
- There are two reasons for an agent to wait for another.
- – Agent TiAj waits for the agent TrAj to release a resource which it needs. In this case Ti & Tr are different and the agents are at the same site. This is indicated by a continuous edge.
- – Agent TiAj waits for the agent TiAs to perform some required function. In this case the two agents belong to the same transaction. This is indicated by a dashed edge.
[Figure: Local wait-for graph (LWFG) at Site 1, with input and output ports]
A local wait-for graph (LWFG) is the portion of the DWFG consisting only of the nodes & edges at a single site, extended with an indication of the nodes which represent remote agents having an edge connecting them to agents at this site.
- A deadlock is local if it is caused by a cycle in an LWFG.
- A deadlock is distributed if it is caused by a cycle in a DWFG which is not contained in any LWFG.
- Deadlock resolution involves the selection of one or more transactions to be aborted & restarted.
- The redundancy present in distributed databases increases the probability of deadlocks.
- The following methods deal with deadlocks in distributed databases:
- – Deadlock detection using centralized or hierarchical control
- – Distributed deadlock detection
- – Distributed Deadlock Prevention
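The distinction between local and distributed deadlocks can be illustrated with a small cycle search over a wait-for graph. This is a sketch under assumptions: an agent is encoded as a (transaction, site) tuple, the graph is a dictionary from an agent to the agents it waits for, and the example graph is hypothetical rather than the one in the figure.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:                      # back edge closes a cycle
                return stack[stack.index(nxt):]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def classify(cycle):
    """A cycle whose agents all sit at one site lies inside a single LWFG."""
    sites = {site for _txn, site in cycle}
    return "local" if len(sites) == 1 else "distributed"


# Hypothetical DWFG: an edge a -> b means agent a waits for agent b.
dwfg = {
    ("T1", 1): [("T1", 2)],   # same transaction, different sites (dashed edge)
    ("T1", 2): [("T2", 2)],   # waits for a resource held by T2 at site 2
    ("T2", 2): [("T2", 1)],   # same transaction, different sites (dashed edge)
    ("T2", 1): [("T1", 1)],   # waits for a resource held by T1 at site 1
}
print(classify(find_cycle(dwfg)))  # distributed
```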
Deadlock Detection using Centralized or Hierarchical Controllers
- In this method, a site is chosen at which a centralized deadlock detector is run.
- It builds the DWFG & checks it for cycles.
- The deadlock detector receives information from all other sites.
- At each site there is a local deadlock detector whose responsibility is to determine all potential global deadlocks at its site.
[Figure: Local wait-for graph (LWFG) at site 1 and the potential global deadlock cycle at site 1]
- The global deadlock detector collects the messages related to potential global deadlock cycles from each site, connects the partial information to build a DWFG, determines the cycles & selects the transaction to be aborted.
- Centralized deadlock detection is simple but has two main drawbacks.
- – The site at which the detector runs may fail.
- – The communication cost is high, as the detector may be located very far from other sites.
- It is very common that a deadlock involves only a few sites which are close to each other.
- In this case, these sites could discover the deadlock without communicating with the central site.
- We can use hierarchical controllers to reduce the communication cost.
- In this method a tree of deadlock detectors is built.
- The leaves of the tree consist of Local Deadlock Detectors (LDD).
- The nonleaf nodes consist of Nonlocal Deadlock Detectors (NLDD).
[Figure: A tree of deadlock detectors, with local deadlock detectors (LDD) at the sites as leaves and nonlocal deadlock detectors (NLDD) above them]
Distributed Deadlock Detection
- In distributed deadlock detection, there is no distinction between local & nonlocal deadlock detectors.
- Each site has the same responsibility. Sites exchange information about waiting transactions in order to determine global deadlocks.
- Potential deadlock cycles are detected by each site.
- All the output & input ports are collected into a single node called External (EX).
[Figure: LWFGs at Site 1 and Site 2 for transactions T1 and T2, each extended with the EX node. Message (EX, T2, T1) : from site 2 to site 1]
- In centralized deadlock detection all potential deadlock cycles are sent to one designated site, but in this case there is no such site.
- The idea used by the distributed deadlock detection algorithm consists in transmitting the potential deadlock information along the deadlock cycle itself.
- Ex: In the previous figure, the local deadlock detector at site 1 has detected a potential deadlock cycle consisting of T1, T2, EX.
- Site 1 can choose to send this cycle to
- – The site where there is an agent of T1 waiting for T1 at site 1 (backward along the cycle).
- – The site where there is an agent of T2 for which T2 at site 1 is waiting (forward along the cycle).
- It is not necessary to transmit in both directions. The forward direction alone is sufficient.
- But if all sites transmit their potential deadlock cycles forward along the cycle, then more information is transmitted than required.
- This may lead to discovering the same deadlock twice.
- To avoid this, the algorithm uses the following rule.
The potential deadlock cycle is transmitted only if the transaction ID of the transaction for which EX waits is greater than the transaction ID of the transaction waiting for EX.
Ex: In the previous example, Site 2 transmits its potential deadlock cycle.
- This algorithm works by successive iterations.
- At each iteration, the local deadlock detector at each site performs the following actions.
1. Build the LWFG using local information (including EX).
2. For each message which has been received, perform the following modifications of the LWFG:
   – For each transaction in the message, add it to the LWFG if it does not already exist.
   – For each transaction in the message, starting with EX, create an edge to the next transaction in the message.
3. Find the cycles not involving EX in the LWFG. Each such cycle indicates the existence of a deadlock.
4. Find the cycles involving EX. These cycles are potential deadlock cycles.
[Figure: LWFGs at Site 1 and Site 2 with the EX node. Message (EX, T2, T1) : from site 2 to site 1]
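One iteration of the algorithm, together with the transmission rule, can be sketched on the two-site example. The representation is an assumption: each LWFG is a dictionary from a node to the set of nodes it waits for, transactions are named "T1", "T2", and all function names are hypothetical. The cycle search is a plain enumeration that is adequate only for small graphs.

```python
EX = "EX"


def merge_message(lwfg, message):
    """Step 2: add the transactions of a message and chain them with edges."""
    for node in message:
        lwfg.setdefault(node, set())
    for waiting, waited_for in zip(message, message[1:]):
        lwfg[waiting].add(waited_for)


def simple_cycles(graph):
    """Return all simple cycles as tuples starting at their smallest node name."""
    cycles = set()

    def extend(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                cycles.add(tuple(path))
            elif nxt > start and nxt not in path:
                extend(start, nxt, path + [nxt])

    for start in graph:
        extend(start, start, [start])
    return cycles


def should_transmit(cycle):
    """cycle = (EX, first, ..., last): EX waits for first, last waits for EX."""
    txn_id = lambda name: int(name[1:])
    return txn_id(cycle[1]) > txn_id(cycle[-1])


site1 = {EX: {"T1"}, "T1": {"T2"}, "T2": {EX}}
site2 = {EX: {"T2"}, "T2": {"T1"}, "T1": {EX}}

# Step 4 at site 2: potential deadlock cycles are those involving EX.
for cycle in [c for c in simple_cycles(site2) if EX in c]:
    if should_transmit(cycle):               # T2 > T1, so site 2 transmits
        merge_message(site1, list(cycle))    # message (EX, T2, T1) to site 1

# Step 3 at site 1: cycles not involving EX are deadlocks.
deadlocks = [c for c in simple_cycles(site1) if EX not in c]
print(deadlocks)  # [('T1', 'T2')]
```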
Distributed Deadlock Prevention
- With this method, a transaction is aborted & restarted if there is a risk that a deadlock might occur.
- If transaction T1 requests a resource which is held by T2, then a "preventive test" is applied. If the test indicates a risk of deadlock,
- – then T1 is not allowed to enter the wait state. It is aborted & restarted. (Nonpreemptive Method)
- – or T2 is aborted & restarted. (Preemptive Method)
- Nonpreemptive Method: is based on timestamps as follows. If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait only if Ti is older than Tj. If Ti is younger than Tj, then Ti is aborted & restarted.
- Preemptive Method: is the opposite of the previous one. If Ti requests a lock on a data item which is already locked by Tj, then Ti is permitted to wait only if Ti is younger than Tj; otherwise Tj is aborted & the lock is granted to Ti.
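The two preventive tests can be written as small decision functions. This is a sketch with assumed names; a smaller timestamp is taken to mean an older transaction. In the literature these two schemes are commonly called wait-die and wound-wait.

```python
def nonpreemptive(ts_requester, ts_holder):
    """Ti (requester) may wait only if it is older than Tj (holder)."""
    return "wait" if ts_requester < ts_holder else "abort requester"


def preemptive(ts_requester, ts_holder):
    """Ti (requester) may wait only if it is younger than Tj (holder)."""
    return "wait" if ts_requester > ts_holder else "abort holder"


# Ti has timestamp 5, Tj has timestamp 9, so Ti is the older transaction.
print(nonpreemptive(5, 9))  # wait
print(nonpreemptive(9, 5))  # abort requester
print(preemptive(5, 9))     # abort holder
print(preemptive(9, 5))     # wait
```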
Concurrency Control Based on Timestamps
- This concurrency control mechanism allows a transaction to read or write a data item x only if x had been last written by an older transaction; otherwise it rejects the operation and restarts the transaction.
The Basic Timestamp Mechanism
- The basic timestamp mechanism applies the following rules.
1. Each transaction receives a timestamp when it is initiated at its site of origin.
2. Each read or write operation which is required by a transaction has the timestamp of the transaction.
3. For each data item x, the largest timestamp of a read operation and the largest timestamp of a write operation are recorded; they will be indicated as RTM(x) and WTM(x).
4. Let TS be the timestamp of a read operation on data item x. If TS < WTM(x), the read operation is rejected and the issuing transaction is restarted with a new timestamp; otherwise the read is executed, and RTM(x) is set to max(RTM(x), TS).
5. Let TS be the timestamp of a write operation on data item x. If TS < RTM(x) or TS < WTM(x), then the operation is rejected and the issuing transaction is restarted; otherwise, the write is executed, and WTM(x) is set to TS.
- Rules 4 and 5 ensure that conflicting operations are executed in timestamp order at all sites; hence the timestamp order is a total order satisfying the condition of proposition 8.1, and the executions produced by this mechanism are correct.
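Rules 4 and 5 can be sketched for a single data item as follows. The class and function names are assumptions, timestamps are plain integers, and restarting the rejected transaction is left to the caller.

```python
class Item:
    """Per-item bookkeeping of rule 3."""

    def __init__(self):
        self.rtm = 0   # RTM(x): largest timestamp of an executed read
        self.wtm = 0   # WTM(x): largest timestamp of an executed write


def read(item, ts):
    """Rule 4."""
    if ts < item.wtm:
        return "reject"
    item.rtm = max(item.rtm, ts)
    return "ok"


def write(item, ts):
    """Rule 5."""
    if ts < item.rtm or ts < item.wtm:
        return "reject"
    item.wtm = ts
    return "ok"


# TS(Ti) = 1 < TS(Tj) = 2 and the order Ri(x) Wi(x) Rj(x) Wj(x): all accepted.
x = Item()
print([read(x, 1), write(x, 1), read(x, 2), write(x, 2)])  # ['ok', 'ok', 'ok', 'ok']

# If Rj(x) is processed before Wi(x), the write of Ti is rejected.
x = Item()
print([read(x, 1), read(x, 2), write(x, 1)])  # ['ok', 'ok', 'reject']
```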
EXAMPLE: consider the concurrent execution E of example 8.1, which is repeated here for convenience:
S1 : Ri(x) Wi(x) Rj(x) Wj(x)
S2 : Ri(y) Wi(y) Rj(y) Wj(y)
- In the above example, when two operations are written one below the other, this means that the two operations start at the same time.
- This execution cannot be produced by the 2-phase locking mechanism.
- With the basic timestamp mechanism this execution is accepted if TS(Ti) < TS(Tj), because at site 1, after the execution of Wi(x), RTM(x) = WTM(x) = TS(Ti), and therefore Rj(x) and Wj(x) are not rejected.
- Similar considerations apply also to site 2.
- This appears to be an advantage of the timestamp mechanism.
- However, if Rj(x) were processed at site 1 before Wi(x), then Wi(x) would be rejected by rule 5 and Ti would be aborted and restarted.
- The same would happen if Rj(y) were processed at site 2 before Wi(y).
- An interesting feature of the timestamp mechanism is that it is deadlock free, because transactions are never blocked.
- If a transaction cannot execute an operation, it is restarted.
- However, deadlock freedom is obtained at the cost of restarting transactions, rather than making them wait.
- The basic rules which have been described above are sufficient to ensure the serializability of transactions;
- However, they need to be integrated with 2-phase commitment to ensure atomicity.
- With the timestamp mechanism we need a different solution: instead of exclusive locks we use PREWRITES.
- Prewrites are issued by transactions instead of write operations; they are buffered and not applied directly to the database.
- Only when the transaction is committed are the corresponding write operations applied to the database. Once a prewrite has been accepted, the corresponding write will not be rejected.
- Integration of the basic timestamp method and 2-phase commitment.
The above rules 4 and 5 are substituted by the following rules 4, 5, and 6.
4. Let TS be the timestamp of a prewrite operation Pi on data item x. If TS < RTM(x) or TS < WTM(x), then the operation is rejected and the issuing transaction is restarted; otherwise, the prewrite Pi and its timestamp TS are buffered.
5. Let TS be the timestamp of a read operation Ri on data item x.
   – If TS < WTM(x), the operation is rejected.
   – If TS > WTM(x), then Ri is executed only if there is no prewrite operation P(x) pending on data item x having a timestamp TS(P) < TS.
   – If there are one or more prewrite operations P(x) with TS(P) < TS, Ri is buffered until the transactions which have issued them commit.
6. Let TS be the timestamp of a write operation Wi on data item x. This operation is never rejected; however, it is possibly buffered if there is a prewrite operation with a timestamp TS(P) < TS, for the same reason which has been stated for buffering read operations. Wi will be executed and eliminated from the buffer when all prewrites with smaller timestamps have been eliminated from the buffer.
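The prewrite rules 4, 5 and 6 can be sketched for one data item as shown below. Several simplifications are assumptions of this sketch: names are hypothetical, a buffered operation is simply reported as "buffer" and must be retried by the caller, and the removal of prewrites of aborted transactions is not modelled.

```python
class Item:
    def __init__(self):
        self.rtm = 0          # RTM(x)
        self.wtm = 0          # WTM(x)
        self.pending = set()  # timestamps of buffered prewrites


def prewrite(item, ts):
    """Rule 4: reject if too old, otherwise buffer the prewrite."""
    if ts < item.rtm or ts < item.wtm:
        return "reject"
    item.pending.add(ts)
    return "buffered"


def read(item, ts):
    """Rule 5: reject, buffer behind older prewrites, or execute."""
    if ts < item.wtm:
        return "reject"
    if any(p < ts for p in item.pending):
        return "buffer"
    item.rtm = max(item.rtm, ts)
    return "execute"


def write(item, ts):
    """Rule 6: issued at commit time; never rejected, possibly buffered."""
    if any(p < ts for p in item.pending):
        return "buffer"
    item.pending.discard(ts)   # the prewrite leaves the buffer
    item.wtm = max(item.wtm, ts)
    return "execute"


x = Item()
print(prewrite(x, 1))  # buffered
print(read(x, 2))      # buffer: an older prewrite is still pending
print(write(x, 1))     # execute: the transaction with timestamp 1 commits
print(read(x, 2))      # execute
```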
The "ignore obsolete write" rule
Rule 5 of the basic timestamp mechanism can be modified in the following way:
- If the timestamp of a write operation Wi(x) is smaller than the write timestamp WTM(x) of the data item x, it is possible to ignore the operation, instead of rejecting the operation and restarting the transaction.
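The modified rule 5 can be sketched as a single function. One assumption goes beyond the text above: the check against RTM(x) from the original rule 5 is kept, so a write older than an executed read is still rejected. The function name and return convention are hypothetical.

```python
def write_with_ignore_rule(rtm, wtm, ts):
    """Return (action, new_wtm) for a write with timestamp ts on an item."""
    if ts < rtm:
        return "reject", wtm    # unchanged part of rule 5 (assumed to remain)
    if ts < wtm:
        return "ignore", wtm    # obsolete write: a younger value is already stored
    return "execute", ts


print(write_with_ignore_rule(rtm=1, wtm=5, ts=3))  # ('ignore', 5)
print(write_with_ignore_rule(rtm=4, wtm=5, ts=3))  # ('reject', 5)
print(write_with_ignore_rule(rtm=1, wtm=2, ts=3))  # ('execute', 3)
```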
The Conservative Timestamp Method
- The main disadvantage of the basic timestamp method is the great number of restarts which it causes.
- Conservative timestamping is a method which eliminates restarts by buffering younger operations until all older conflicting operations have been executed, so that operations are never rejected and transactions are never restarted.
- In order to execute a buffered operation, it is necessary to know when there are no more older conflicting operations.
- The conservative timestamp method is based on the following requirements and rules.
1. Each transaction is executed at one site only and does not activate remote programs. It can only issue read or write requests to remote sites.
2. A site i must receive all the read requests from a different site j in timestamp order. Similarly, a site i must receive all the write requests from a different site j in timestamp order. These requirements are not very simple to satisfy. A more attractive solution is to execute transactions by issuing all read requests before their main execution and all write requests after their main execution. If TS(Ti) < TS(Tj), it is sufficient to wait to send the Rj operations until all Ri operations have been sent, and to wait to send the Wj operations until all Wi operations have been sent.
3. Assume that a site i has at least one buffered read and one buffered write operation from each other site of the network. Because of requirement 2, site i knows that there are no older requests which can arrive from any site. The concurrency controller at site i behaves in the following way:
a. For a read operation R that arrives at site i: if there is some write operation W buffered at site i such that TS(R) > TS(W), then R is buffered until these writes are executed; else R is executed.
b. For a write operation W that arrives at site i: if there is some read operation R buffered at site i such that TS(W) > TS(R), or there is some write operation W' buffered at site i such that TS(W) > TS(W'), then W is buffered until these operations are executed; else W is executed.
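The decisions of rule 3 can be sketched as two small functions. They assume that the precondition of rule 3 already holds (at least one buffered read and one buffered write from every other site); the names and the use of plain integer timestamps are assumptions.

```python
def handle_read(ts_r, buffered_write_ts):
    """Rule 3a: buffer R while an older write is still buffered at this site."""
    if any(ts_r > ts_w for ts_w in buffered_write_ts):
        return "buffer"
    return "execute"


def handle_write(ts_w, buffered_read_ts, buffered_write_ts):
    """Rule 3b: buffer W while an older read or write is still buffered."""
    older_read = any(ts_w > ts for ts in buffered_read_ts)
    older_write = any(ts_w > ts for ts in buffered_write_ts)
    return "buffer" if older_read or older_write else "execute"


print(handle_read(7, buffered_write_ts=[5, 9]))                         # buffer
print(handle_read(4, buffered_write_ts=[5, 9]))                         # execute
print(handle_write(6, buffered_read_ts=[8], buffered_write_ts=[7]))     # execute
print(handle_write(6, buffered_read_ts=[3], buffered_write_ts=[7]))     # buffer
```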
- Conservative timestamping suffers from the following problems:
- If one site never sends an operation to some other site, then the assumption stated at the beginning of point 3 does not hold.
- This problem can be eliminated by requiring that each site periodically sends timestamped "null" operations to each other site.
- Caution must be taken in the implementation of conservative timestamping in order to avoid deadlocks. Consider an example of a possible deadlock situation:
Ti, executed at site 1: Ri(x), execute, Wi(y)
Tj, executed at site 2: Rj(y), execute, Wj(x)
(a) Operations requested by transactions Ti and Tj
Site 1 (stores item x): Ri(x) buffered, waiting for a write from site 2
Site 2 (stores item y): Rj(y) buffered, waiting for a write from site 1
(b) A possible deadlock situation
- This shows the situation which exists after both read operations have been issued: both sites have buffered these reads, because they are waiting for a write operation from the other site.
- Assume that no null operation is sent while there are still transactions pending, because it is expected that these transactions will issue some useful operation.
- Both transactions are blocked in this case and will never issue their writes. With this mechanism a deadlock has been created.
- The only way to avoid such deadlocks is to send a null operation anyway after a timeout; however, this seems equivalent to performing deadlock detection by timeout.
- The above example shows that the fact that Ti waits for a write from Tj and Tj waits for a write from Ti is not due to an explicit rule of the mechanism, but seems more to be an undesired side effect.
OPTIMISTIC METHODS FOR DISTRIBUTED CONCURRENCY CONTROL
- The basic idea of optimistic methods is the following: instead of suspending or rejecting conflicting operations, as in 2-phase locking and timestamping, always execute a transaction to completion.
- At the end of the transaction, a validation test is performed; only if the test is passed by the transaction are the writes applied to the database.
- If the validation test is not passed, the transaction is restarted.
- The validation test verifies whether the execution of the transaction is serializable.
- In order to perform the test, some information about the execution of the transaction must be retained until the validation is performed.
- The optimistic approach is based on the assumption