Chapter 7 Consistency And Replication (2)
Principles and Paradigms
Second Edition
ANDREW S. TANENBAUM
Plan
- Motivation
- Consistency models
  – Data-centric consistency
  – Client-centric consistency
- Consistency protocols
Consistency Protocols
- Describe algorithm for achieving a given consistency model
  – Data-centric
  – Client-centric
Continuous Consistency
- Bounding deviations so as to avoid
  – Numerical errors
    • E.g., that the deviation of replicas’ stock quotes is more than $1.00
  – Staleness errors
    • E.g., that the last stock quote is more than 1 hour old
  – Order errors
    • E.g., that more than 3 updates to the stock quote are outstanding
Bounding Numerical Deviation
- Assume
  – N servers, S_1 … S_N
  – Single data item x
  – A write, W(x), has a weight, weight(W), that is the amount by which x changes
    • E.g., x = 1; x := x * 3 <- weight is 2
    • weight(W) > 0
  – We have a limit, δ_i, on the numerical deviation of the value of x, v_i, at server S_i
[Figure: three servers holding replicas x1, x2, x3 with logs L1, L2, L3; a write W(x) is propagated from origin(W) to the other servers, e.g., by a multicast or epidemic protocol]
Bounding Numerical Deviation
- Define
$$TW[i,j] = \sum \{\, weight(W) \mid origin(W) = S_j \wedge W \in L_i \,\}$$
  – I.e., the sum of the weights of the writes executed by S_i that originate from S_j
  – Note, TW[i,i] covers all writes submitted by S_i
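The definition of TW can be made concrete with a small sketch. The `Write` record, the 0-based server indices, and the sample logs are assumptions made for illustration only; they do not come from the slides.

```python
from collections import namedtuple

# Hypothetical record type: a write knows its origin server index and its weight.
Write = namedtuple("Write", ["origin", "weight"])

def tw_matrix(logs):
    """Return TW, where TW[i][j] sums the weights of writes in log i that originate at server j."""
    n = len(logs)
    tw = [[0.0] * n for _ in range(n)]
    for i, log in enumerate(logs):
        for w in log:
            tw[i][w.origin] += w.weight
    return tw

# Three servers with 0-based indices (an assumption; the slides number them from 1).
w_a = Write(origin=0, weight=2.0)
w_b = Write(origin=1, weight=1.5)
w_c = Write(origin=2, weight=0.5)
logs = [
    [w_a, w_b],        # server 0 has not yet seen the write from server 2
    [w_a, w_b, w_c],   # server 1 has seen everything
    [w_a, w_c],        # server 2 has not yet seen the write from server 1
]
TW = tw_matrix(logs)
print(TW)
```

The diagonal entries TW[k][k] are the writes each server submitted itself; the off-diagonal entries show how far each server has caught up with the others.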
Bounding Numerical Deviation
- Define
  – Consistent value of x
$$v(t) = v(0) + \sum_{k=1}^{N} TW[k,k]$$
  – Current value of x at S_i
$$v_i(t) = v(0) + \sum_{k=1}^{N} TW[i,k]$$
- Goal
  – Ensure
$$\forall t \;\; \forall i \in \{1,\ldots,N\} : \; v(t) - v_i(t) \le \delta_i$$
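A minimal sketch of the two values and the goal, assuming a TW matrix is already available as a list of lists. The matrix, the initial value `v0`, and the bounds in `delta` are made-up illustration values, not data from the slides.

```python
def consistent_value(v0, tw):
    """v(t) = v(0) + sum over k of TW[k][k]: every write submitted anywhere is counted."""
    return v0 + sum(tw[k][k] for k in range(len(tw)))

def local_value(v0, tw, i):
    """v_i(t) = v(0) + sum over k of TW[i][k]: only writes that server i has executed."""
    return v0 + sum(tw[i])

def within_bound(v0, tw, delta):
    """Check v(t) - v_i(t) <= delta[i] for every server i."""
    v = consistent_value(v0, tw)
    return all(v - local_value(v0, tw, i) <= delta[i] for i in range(len(tw)))

# Illustrative state for three servers (0-based indices are an assumption).
tw = [
    [2.0, 1.5, 0.0],
    [2.0, 1.5, 0.5],
    [2.0, 0.0, 0.5],
]
v0 = 1.0
v = consistent_value(v0, tw)
deviations = [v - local_value(v0, tw, i) for i in range(len(tw))]
print(v, deviations)
print(within_bound(v0, tw, [1.0, 1.0, 1.0]))
```

In this state the last server is missing a write of weight 1.5, so a bound of 1.0 per server is violated while a bound of 2.0 for that server would hold.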
Bounding Numerical Deviation
- Assume
  – S_k forwards writes from its log, L_k (a write W(x) may travel S_k → S_i → S_j)
  – S_i also sends TW[i,k] to S_k
  – S_k maintains this as a view, TW_k[i,k], of what S_i has seen from S_k
- If TW[k,k] − TW_k[i,k] > δ_i/(N−1) then
  – S_k waits to commit new writes until these are committed at S_i
- Guarantees that deviation is bounded as δ_i
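The rule on this slide can be sketched from the point of view of the originating server S_k. This is an illustration under stated assumptions, not the protocol's definitive form: the class name `OriginServer`, the `report` callback, and the choice to test whether a new write would push the difference past δ_i/(N−1) are all hypothetical.

```python
class OriginServer:
    """Bookkeeping at S_k for the writes it originates (illustrative sketch)."""

    def __init__(self, k, deltas):
        self.k = k
        self.deltas = deltas            # deltas[i] is the bound δ_i of server S_i
        self.n = len(deltas)
        self.tw_kk = 0.0                # TW[k,k]: total weight committed at S_k
        self.view = [0.0] * self.n      # view[i] is TW_k[i,k]

    def lagging(self, new_weight):
        """Servers whose share δ_i/(N-1) would be exceeded by committing new_weight."""
        return [
            i for i in range(self.n)
            if i != self.k
            and self.tw_kk + new_weight - self.view[i] > self.deltas[i] / (self.n - 1)
        ]

    def submit(self, weight):
        """Commit a local write if allowed; otherwise return the servers to wait for."""
        blocked = self.lagging(weight)
        if not blocked:
            self.tw_kk += weight
        return blocked

    def report(self, i, tw_ik):
        """S_i reports its TW[i,k]; the view only ever moves forward."""
        self.view[i] = max(self.view[i], tw_ik)

# Three servers, each with bound 2.0, so each origin may run ahead by at most 1.0.
s = OriginServer(k=0, deltas=[2.0, 2.0, 2.0])
first = s.submit(1.0)     # allowed: difference is exactly 1.0, not above it
second = s.submit(0.5)    # blocked: would put S_0 ahead of S_1 and S_2 by 1.5
s.report(1, 1.0)
s.report(2, 1.0)
third = s.submit(0.5)     # allowed again once both servers have caught up
print(first, second, third, s.tw_kk)
```

A caller that gets a non-empty list back would forward the missing writes from its log to those servers and retry once they have reported.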
Bounding Numerical Deviation
$$
\begin{aligned}
v(t) - v_i(t) &= \Bigl(v(0) + \sum_{k=1}^{N} TW[k,k]\Bigr) - \Bigl(v(0) + \sum_{k=1}^{N} TW[i,k]\Bigr) \\
&= \sum_{k=1}^{N} \bigl(TW[k,k] - TW[i,k]\bigr) \\
&\le \sum_{k=1}^{N} \bigl(TW[k,k] - TW_k[i,k]\bigr) \\
&= \sum_{k=1}^{i-1} \bigl(TW[k,k] - TW_k[i,k]\bigr) + \bigl(TW[i,i] - TW[i,i]\bigr) + \sum_{k=i+1}^{N} \bigl(TW[k,k] - TW_k[i,k]\bigr) \\
&\le (N-1)\,\frac{\delta_i}{N-1} = \delta_i
\end{aligned}
$$
Bounding Staleness Deviations
- Goal
  – Ensure that there are no W(x) older than TO_i that S_i has not seen
- "Older than"?
  – Vector clocks?
  – If we know the drift rate of clocks and the communication delay, we can synchronize within a bound
Bounding Staleness Deviations
- In any case
  – Keep a real-time vector clock
    • VT_k[i] = local time of last write from S_i
  – If |VT_k[i] – T(k)| > TO_i, pull updates from S_i
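The staleness test can be sketched as a single pass over the real-time vector clock. It is assumed here that T(k) is the current local time at S_k and that the time-outs are given per origin server, as the condition on the slide suggests; the function name and the sample times are hypothetical.

```python
def stale_origins(vt_k, now_k, timeouts, k):
    """Return the servers i that S_k should pull from, i.e. |VT_k[i] - T(k)| > TO_i."""
    return [
        i for i in range(len(vt_k))
        if i != k and abs(vt_k[i] - now_k) > timeouts[i]
    ]

# S_0 checks its clock at local time 100 (seconds, illustrative values).
vt_0 = [100.0, 40.0, 95.0]      # local time of the last write seen from each server
timeouts = [30.0, 30.0, 30.0]   # TO_i for each server
to_pull = stale_origins(vt_0, now_k=100.0, timeouts=timeouts, k=0)
print(to_pull)
```

Here the last write seen from server 1 is 60 seconds old, so S_0 would pull updates from server 1 only.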
Bounding Ordering Deviations
- Check size of local queue of tentative updates
  – Updates may not have been committed due to ordering constraints
    • If too large, try to agree on ordering of updates before proceeding…
  – Ensuring consistent ordering
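A minimal sketch of the queue check, assuming a per-replica limit on tentative updates. How the replicas actually agree on an order is outside the sketch; `agreed_order` simply stands in for the result of that agreement, and the class and method names are hypothetical.

```python
class TentativeQueue:
    """Tracks updates applied locally but not yet committed in a global order."""

    def __init__(self, max_tentative):
        self.max_tentative = max_tentative
        self.tentative = []
        self.committed = []

    def submit(self, update):
        """Queue an update tentatively; return True if agreement on ordering is now needed."""
        self.tentative.append(update)
        return len(self.tentative) > self.max_tentative

    def commit(self, agreed_order):
        """Commit queued updates in the agreed order and drop them from the queue."""
        pending = set(self.tentative)
        ordered = set(agreed_order)
        self.committed.extend(u for u in agreed_order if u in pending)
        self.tentative = [u for u in self.tentative if u not in ordered]

# At most 3 outstanding updates, matching the stock-quote example earlier in the deck.
q = TentativeQueue(max_tentative=3)
flags = [q.submit(u) for u in ["u1", "u2", "u3", "u4"]]
q.commit(["u2", "u1", "u4", "u3"])
print(flags, q.committed, q.tentative)
```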
Ensuring Consistent Ordering
- Make a specific process responsible for specific data item
  – Primary-Based Protocols
- Enable multiple processes to carry out write operations
  – Replicated-Write Protocols
Remote-Write Protocols
Local-Write Protocols
Replicated-Write: Active Replication
- Multicast updates to all replicas
  – Need ordering of multicast
- May be inefficient
  – Combine primary-based ("token-site") and replicated-write ("symmetric") protocols
[Figure: plot against the largest inter-message transmission time]
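One common way to obtain the ordering that active replication needs is a central sequencer that stamps each update, with replicas holding back messages that arrive early. The sketch below assumes that approach; the slides do not say which ordering mechanism is meant, and all class names are hypothetical.

```python
import itertools

class Sequencer:
    """Assigns consecutive sequence numbers to updates (assumed central coordinator)."""

    def __init__(self):
        self._next = itertools.count()

    def stamp(self, update):
        return (next(self._next), update)

class OrderedReplica:
    """Applies updates strictly in sequence-number order."""

    def __init__(self):
        self.value = 0
        self.expected = 0
        self.held_back = {}   # seq -> update that arrived before its turn

    def deliver(self, seq, update):
        self.held_back[seq] = update
        # Apply every update whose turn has come.
        while self.expected in self.held_back:
            self.value = self.held_back.pop(self.expected)(self.value)
            self.expected += 1

seq = Sequencer()
m0 = seq.stamp(lambda x: x + 1)
m1 = seq.stamp(lambda x: x * 3)

a, b = OrderedReplica(), OrderedReplica()
a.deliver(*m0)
a.deliver(*m1)
b.deliver(*m1)   # arrives early and is held back
b.deliver(*m0)
print(a.value, b.value)
```

Both replicas end with the same value even though the network delivered the two updates in different orders.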
Replicated-Write: Quorum
- Multicast W(x), read x
  – Has the write been committed?
  – Is the value read up-to-date?
- Static process group
  – Basic idea
    • Write (version(x), value(x))
    • When writing new version, get more than half of the processes to approve
    • When reading, get response with latest version from more than half of the processes
  – Avoid write/write conflicts
  – Avoid write/read conflicts
Replicated-Write: Quorum
- Process group of size N
- Define
  – write quorum size, N_W
    • At least N_W processes need to agree on value and version for us to write
  – read quorum size, N_R
    • At least N_R processes need to agree on value and version for us to read
    • To make sure to read latest version
  – N_W + N_R > N
    • At least one process will have latest version
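The quorum conditions can be checked mechanically. The sketch below combines N_W + N_R > N with the "more than half" rule for writes (N_W > N/2) from the basic idea; the group size of 12 and the quorum sizes are illustrative values, and the function names are hypothetical.

```python
def valid_quorum(n, n_r, n_w):
    """True if every read quorum overlaps every write quorum and any two write quorums overlap."""
    return n_r + n_w > n and 2 * n_w > n

def read_latest(responses, n_r):
    """Pick the (version, value) pair with the highest version from at least n_r replies."""
    if len(responses) < n_r:
        raise ValueError("read quorum not reached")
    return max(responses, key=lambda r: r[0])

# Illustrative choices for a group of 12 processes.
print(valid_quorum(12, 3, 10))   # read and write quorums overlap, writes hold a majority
print(valid_quorum(12, 7, 6))    # two disjoint write quorums of 6 are possible
print(valid_quorum(12, 1, 12))   # read one, write all

latest = read_latest([(3, "a"), (5, "b"), (4, "c")], n_r=3)
print(latest)
```

Because any read quorum shares at least one process with the most recent write quorum, the highest version among the replies is the latest committed one.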
Quorum-Based Protocols
Figure 7-22. Three examples of the voting algorithm. (a) A correct choice of read and write set. (b) A choice that may lead to write-write conflicts. (c) A correct
Client-Centric Consistency
- Each client keeps track of
  – Read set: identifiers of the writes that are relevant to the read operations performed by the client
  – Write set: identifiers of writes performed by the client
- Monotonic-read consistency
  – Read performed
    • Client hands in read set to server
    • Server may need to communicate with other servers to become up-to-date
    • Server returns updated read set
  – Write performed
    • (Nothing particular)
Client-Centric Consistency
- Monotonic-write consistency
  – Write
    • Client hands in write set to server
    • Server may need to communicate with other servers to become up-to-date
    • Server returns updated write set
- Read-your-writes consistency
  – Read
    • Client hands in write set to server
    • Server may need to communicate with other servers to become up-to-date
    • Server returns updated write set
- Writes-follow-reads consistency
  – Client hands in write set and read set to server
  – Server may need to communicate with other servers to become up-to-date with read set
  – Add identifiers from the read set to the write set
  – Server returns updated write set
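The read-set and write-set bookkeeping can be sketched as follows. For brevity the sketch enforces all four guarantees at once, which is stricter than any single one of them, and it models server-to-server communication as a direct lookup in the peers' state. The class names, the single implicit data item, and the `peers` list are assumptions for illustration.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.applied = set()   # identifiers of writes this replica has applied
        self.peers = []

    def catch_up(self, required):
        """Fetch required writes that are missing here (stands in for server communication)."""
        missing = set(required) - self.applied
        for peer in self.peers:
            self.applied |= missing & peer.applied

    def read(self, read_set, write_set):
        # Monotonic reads depend on the read set, read-your-writes on the write set.
        self.catch_up(read_set | write_set)
        return read_set | self.applied           # updated read set

    def write(self, write_id, read_set, write_set):
        # Monotonic writes depend on the write set, writes-follow-reads on the read set.
        self.catch_up(read_set | write_set)
        self.applied.add(write_id)
        return write_set | read_set | {write_id}  # updated write set

class Client:
    def __init__(self):
        self.read_set = set()
        self.write_set = set()

    def read(self, server):
        self.read_set = server.read(self.read_set, self.write_set)

    def write(self, server, write_id):
        self.write_set = server.write(write_id, self.read_set, self.write_set)

s1, s2 = Server("s1"), Server("s2")
s1.peers, s2.peers = [s2], [s1]

c = Client()
c.write(s1, "w1")   # the write lands at s1 only
c.read(s2)          # s2 must first obtain w1 before serving the read
print(s2.applied, c.read_set, c.write_set)
```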
Efficiency?
- Bound size of write sets and read sets by grouping operations into sessions
  – E.g., clear sets when a client exits a specific application
- Use a vector timestamp…
  – WVC_i[j]: the timestamp of the latest write operation from S_j that S_i has processed
  – SVC_A[j]: the timestamp of the latest write operation in the session that originates from S_j
  – S_i uses WVC_i and SVC_A to check
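A sketch of how the two vector timestamps might be used. The slide does not spell out the check, so the component-wise comparison below is an assumption: S_i is treated as up-to-date for session A when WVC_i[j] >= SVC_A[j] for every j, and the session timestamp is then advanced to the component-wise maximum. Function names and sample vectors are hypothetical.

```python
def server_is_current(wvc_i, svc_a):
    """True if S_i has processed every write the session depends on."""
    return all(w >= s for w, s in zip(wvc_i, svc_a))

def merge(svc_a, wvc_i):
    """New session timestamp: component-wise maximum of SVC_A and WVC_i."""
    return [max(s, w) for s, w in zip(svc_a, wvc_i)]

svc_a = [2, 2, 0]          # session has seen writes up to these timestamps
wvc_i = [3, 1, 4]          # S_i is behind on writes from the second server
before = server_is_current(wvc_i, svc_a)

wvc_i = [3, 2, 4]          # after S_i has fetched the missing write
after = server_is_current(wvc_i, svc_a)
svc_a = merge(svc_a, wvc_i)
print(before, after, svc_a)
```

Compared with explicit read and write sets, the client carries one integer per server regardless of how many operations the session contains.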
Summary
- Various protocols for ensuring consistency
- Data-centric consistency
  – Primary-based protocols
  – Replicated-write
- Primary-based protocols most widely used due to simplicity
  – Exact selection of protocol type depends on rate at which requests are processed and on message delay
- Client-centric consistency may be efficiently implemented using vector timestamps