Chapter 7 Consistency And Replication (2)

  Principles and Paradigms

  

Second Edition

ANDREW S. TANENBAUM

  Plan

  • Motivation
  • Consistency models
    – Data-centric consistency
    – Client-centric consistency
  • Replica management
  • Consistency protocols

  Consistency Protocols

  • Describe algorithms for achieving a given consistency model
    – Data-centric
    – Client-centric

  Continuous Consistency

  • Bounding deviations so as to avoid
    – Numerical errors
      • E.g., that the deviation of the replicas' stock quotes is more than USD 1.00
    – Staleness errors
      • E.g., that the last stock quote is more than 1 hour old
    – Order errors
      • E.g., that more than 3 updates to the stock quote are outstanding

  Bounding Numerical Deviation

  • Assume

    – $N$ servers, $S_1, \dots, S_N$
    – Single data item $x$
    – A write, $W(x)$, has a weight, $weight(W)$, that is the amount by which $x$ changes
      • E.g., $x = 1$; $x := x * 3$ gives a weight of 2
      • $weight(W) > 0$
  • We have a limit, $\delta_i$, on the numerical deviation of the value of $x$, $v_i$, at server $S_i$

  [Figure: replicas x1, x2, x3 with logs L1, L2, L3; a write W(x) spreads from origin(W) to the other servers, e.g., by a multicast or epidemic protocol.]

  Bounding Numerical Deviation

  • Define

  $$TW[i,j] = \sum \{\, weight(W) \mid origin(W) = S_j \wedge W \in L_i \,\}$$

    – I.e., the sum of the weights of the writes executed by $S_i$ that originate from $S_j$
    – Note, $TW[i,i]$ is the total weight of all writes submitted by $S_i$

  [Figure: replicas x1, x2, x3 with logs L1, L2, L3; writes W(x) are propagated from origin(W).]

  Bounding Numerical Deviation

  • Define
    – Current value of $x$ at $S_i$

  $$v_i(t) = v(0) + \sum_{k=1}^{N} TW[i,k]$$

    – Consistent value of $x$

  $$v(t) = v(0) + \sum_{k=1}^{N} TW[k,k]$$

  • Goal
    – Ensure

  $$\forall t\ \forall i \in \{1,\dots,N\}: \quad v(t) - v_i(t) \le \delta_i$$

  [Figure: replicas x1, x2, x3 with logs L1, L2, L3; writes W(x) are propagated from origin(W).]
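The definitions of TW[i,j], the current value at each server, and the consistent value can be tried out in a few lines. The sketch below is illustrative only: it assumes each log L_i is a list of (origin, weight) pairs, that servers are indexed from 0, and that v(0) is the initial value of x. The names `Write`, `tw`, `local_value`, `consistent_value`, and `within_bounds` are hypothetical and do not come from the slides.

```python
from typing import NamedTuple


class Write(NamedTuple):
    origin: int    # index j of the server S_j that submitted the write
    weight: float  # amount by which the write changes x (assumed > 0)


def tw(logs, i, j):
    """TW[i, j]: total weight of the writes in log L_i that originated at S_j."""
    return sum(w.weight for w in logs[i] if w.origin == j)


def local_value(logs, i, v0):
    """v_i = v(0) + sum over k of TW[i, k]: the value of x as seen at S_i."""
    return v0 + sum(tw(logs, i, k) for k in range(len(logs)))


def consistent_value(logs, v0):
    """v = v(0) + sum over k of TW[k, k]: the value once all writes are applied."""
    return v0 + sum(tw(logs, k, k) for k in range(len(logs)))


def within_bounds(logs, v0, deltas):
    """Check v - v_i <= delta_i for every server S_i."""
    v = consistent_value(logs, v0)
    return all(v - local_value(logs, i, v0) <= deltas[i]
               for i in range(len(logs)))


# Hypothetical state: S_0 submitted a write of weight 2 that S_1 has also
# applied; S_2 submitted a write of weight 1 that no other server has seen.
logs = [
    [Write(0, 2.0)],
    [Write(0, 2.0)],
    [Write(2, 1.0)],
]
v0 = 1.0

print(consistent_value(logs, v0))                      # consistent value of x
print([local_value(logs, i, v0) for i in range(3)])    # per-server values
print(within_bounds(logs, v0, [1.0, 1.0, 2.0]))
```

In this state S_2 lags furthest behind, so tightening its limit below 2 makes `within_bounds` fail.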

  Bounding Numerical Deviation

  • Assume a write is propagated as $S_j \xrightarrow{W(x)} S_i \xrightarrow{W(x)} S_k$
    – $S_i$ also sends $TW[i,k]$ to $S_k$
    – $S_k$ maintains this as a view, $TW_k[i,k]$, of what $S_i$ has seen from $S_k$
  • If $TW[k,k] - TW_k[i,k] > \delta_i / (N - 1)$ then
    – $S_k$ forwards writes from its log, $L_k$
    – $S_k$ waits to commit new writes until these are committed at $S_i$
  • Guarantees that deviation is bounded as $\delta_i$
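The forwarding rule is a per-peer threshold test, which a short sketch can make explicit. It assumes that S_k keeps its views TW_k[i,k] in a list indexed by i and that servers are indexed from 0. The function name `peers_to_update` and the sample numbers are hypothetical.

```python
def peers_to_update(k, tw_kk, view, deltas):
    """Return the indices i != k for which TW[k,k] - TW_k[i,k] > delta_i / (N - 1).

    tw_kk  -- TW[k,k], total weight of the writes submitted at S_k
    view   -- view[i] is TW_k[i,k], what S_k believes S_i has seen from S_k
    deltas -- deltas[i] is the numerical deviation limit delta_i of S_i
    """
    n = len(deltas)
    return [i for i in range(n)
            if i != k and tw_kk - view[i] > deltas[i] / (n - 1)]


# Hypothetical state at S_0 with N = 3 and delta_i = 1 for every server,
# so each peer may lag behind S_0's own writes by at most 0.5.
lagging = peers_to_update(k=0, tw_kk=3.0, view=[3.0, 2.75, 1.0],
                          deltas=[1.0, 1.0, 1.0])
print(lagging)
```

S_k would forward writes from its log to every server in the returned list and hold back new commits until those writes are committed there.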

  

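Bounding Numerical Deviation

$$
\begin{aligned}
v(t) - v_i(t) &= \Big( v(0) + \sum_{k=1}^{N} TW[k,k] \Big) - \Big( v(0) + \sum_{k=1}^{N} TW[i,k] \Big) \\
&= \sum_{k=1}^{N} \big( TW[k,k] - TW[i,k] \big) \\
&\le \sum_{k=1}^{N} \big( TW[k,k] - TW_k[i,k] \big) \\
&= \sum_{k=1}^{i-1} \big( TW[k,k] - TW_k[i,k] \big) + \big( TW[i,i] - TW_i[i,i] \big) + \sum_{k=i+1}^{N} \big( TW[k,k] - TW_k[i,k] \big) \\
&\le (N-1)\,\frac{\delta_i}{N-1} = \delta_i
\end{aligned}
$$

  • The first inequality holds because a view never overestimates, $TW_k[i,k] \le TW[i,k]$; the middle term vanishes because $TW_i[i,i] = TW[i,i]$

  Bounding Staleness Deviations

  • Goal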

    – Ensure that there are no $W(x)$ older than $TO_i$ that $S_i$ has not seen
  • "Older than"?
    – Vector clocks?
    – If we know the drift rate of the clocks and the communication delay, we can synchronize within a bound

  [Figure: replicas x1, x2, x3 with logs L1, L2, L3; writes W(x) are propagated from origin(W).]

  Bounding Staleness Deviations

  • In any case
    – Keep a real-time vector clock
      • $VT_k[i]$ = local time of the last write from $S_i$
    – If $|VT_k[i] - T(k)| > TO_i$, pull updates from $S_i$

  [Figure: replicas x1, x2, x3 with logs L1, L2, L3; writes W(x) are propagated from origin(W).]
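The staleness test can be sketched as follows. The sketch assumes that T(k) denotes the current local time at S_k, that all times are in seconds on loosely synchronized clocks, and that servers are indexed from 0. The function name `servers_to_pull_from` and the sample values are hypothetical.

```python
def servers_to_pull_from(k, vt_k, now_k, timeouts):
    """Return the indices i != k with |VT_k[i] - T(k)| > TO_i.

    vt_k     -- vt_k[i] is VT_k[i], the time of the last write from S_i seen by S_k
    now_k    -- T(k), assumed here to be the current local time at S_k
    timeouts -- timeouts[i] is the staleness limit TO_i
    """
    return [i for i in range(len(vt_k))
            if i != k and abs(vt_k[i] - now_k) > timeouts[i]]


# Hypothetical state at S_0: the last write seen from S_1 is 5 s old,
# the last write seen from S_2 is 70 s old, and every limit is 60 s.
stale = servers_to_pull_from(k=0, vt_k=[100.0, 95.0, 30.0],
                             now_k=100.0, timeouts=[60.0, 60.0, 60.0])
print(stale)
```

S_k would then pull updates from each server in the returned list.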

  Bounding Ordering Deviations

  • Check size of local queue of tentative updates
    – Updates may not have been committed due to ordering constraints
    – If too large, try to agree on ordering of updates before proceeding…
  • Ensuring consistent ordering

  [Figure: replicas x1, x2, x3 with logs L1, L2, L3; writes W(x) are propagated from origin(W).]
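A minimal sketch of the queue check, under the assumption that each replica keeps its tentative updates in a local list and that some separate agreement step (not shown) yields a global order. The class `Replica`, its methods, and the limit of 3 are hypothetical.

```python
class Replica:
    def __init__(self, max_tentative):
        self.max_tentative = max_tentative  # bound on outstanding updates
        self.tentative = []                 # applied locally, not yet ordered
        self.committed = []                 # updates in the agreed order

    def submit(self, update):
        """Queue an update tentatively.

        Returns True when the queue exceeds the bound, i.e. when the replica
        should agree on an ordering before accepting further updates.
        """
        self.tentative.append(update)
        return len(self.tentative) > self.max_tentative

    def commit(self, agreed_order):
        """Move tentative updates to the committed log in the agreed order."""
        for update in agreed_order:
            self.tentative.remove(update)
            self.committed.append(update)


replica = Replica(max_tentative=3)
flags = [replica.submit(u) for u in ["u1", "u2", "u3", "u4"]]
print(flags)

# The agreed order is an input here; how servers reach it is out of scope.
replica.commit(["u2", "u1", "u4", "u3"])
print(replica.committed, replica.tentative)
```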

  Ensuring Consistent Ordering

  • Make a specific process responsible for a specific data item
    – Primary-Based Protocols
      • Remote-Write Protocols
      • Local-Write Protocols
  • Enable multiple processes to carry out write operations
    – Replicated-Write Protocols

  

Replicated-Write: Active Replication

  • Multicast updates to all replicas
    – Need ordering of multicast
  • May be inefficient
    – Combine primary-based ("token-site") and replicated-write ("symmetric") protocols

  [Figure: plot with the labels "Largest inter-message" and "transmission time".]

Replicated-Write: Quorum

  • Multicast W(x), read x
    – Has the write been committed?
    – Is the value read up-to-date?
  • Static process group
    – Basic idea
      • Write (version(x), value(x))
      • When writing a new version, get more than half of the processes to approve
      • When reading, get a response with the latest version from more than half of the processes
    – Avoid write/write conflicts
    – Avoid write/read conflicts

  

Replicated-Write: Quorum

  • Process group of size $N$
  • Define
    – write quorum size, $N_W$
      • At least $N_W$ processes need to agree on value and version for us to write
    – read quorum size, $N_R$
      • At least $N_R$ processes need to agree on value and version for us to read
  • To make sure to read latest version
    – $N_W + N_R > N$
      • At least one process will have latest version
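The quorum constraints are easy to check mechanically. The sketch below combines the read/write overlap condition with the "more than half" rule for writes. It assumes replies are (version, value) pairs, and the names `valid_quorums` and `quorum_read`, as well as the sample sizes for a group of 12, are illustrative.

```python
def valid_quorums(n, n_r, n_w):
    """Check the two quorum constraints for a group of n processes.

    n_r + n_w > n  -- every read quorum overlaps every write quorum,
                      which avoids write/read conflicts
    2 * n_w > n    -- any two write quorums overlap,
                      which avoids write/write conflicts
    """
    return n_r + n_w > n and 2 * n_w > n


def quorum_read(replies):
    """Pick the (version, value) pair with the highest version number."""
    return max(replies, key=lambda reply: reply[0])


print(valid_quorums(12, 3, 10))   # both constraints hold
print(valid_quorums(12, 7, 6))    # two disjoint write quorums are possible
print(valid_quorums(12, 1, 12))   # read one, write all

# Replies from a read quorum of three processes; one still holds an old version.
latest = quorum_read([(3, "a"), (5, "c"), (4, "b")])
print(latest)
```

Because the read quorum overlaps every write quorum, at least one reply carries the latest version, so taking the maximum version is sufficient.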

Quorum-Based Protocols

  Figure 7-22. Three examples of the voting algorithm. (a) A correct choice of read and write set. (b) A choice that may lead to write-write conflicts. (c) A correct

  Client-Centric Consistency

  • Each client keeps track of
    – Read set: identifiers of the writes that are relevant to the read operations performed by the client
    – Write set: identifiers of the writes performed by the client
  • Monotonic-read consistency
    – Read performed
      • Client hands in read set to server
      • Server may need to communicate with other servers to become up-to-date
      • Server returns updated read set
    – Write performed
      • (Nothing particular)

  Client-Centric Consistency

  • Monotonic-write consistency

    – Write
      • Client hands in write set to server
      • Server may need to communicate with other servers to become up-to-date
      • Server returns updated write set
  • Read-your-writes consistency
    – Read
      • Client hands in write set to server
      • Server may need to communicate with other servers to become up-to-date
      • Server returns updated write set
  • Writes-follow-reads consistency
    – Client hands in write set and read set to server
    – Server may need to communicate with other servers to become up-to-date with read set
    – Add identifiers from the read set to the write set
    – Server returns updated write set
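The hand-in/return pattern is the same for all four models; the sketch below illustrates it for monotonic reads only. It assumes a single data item, string write identifiers, and direct access to a peer's log standing in for server-to-server communication. The class `Server` and its methods are hypothetical names.

```python
class Server:
    def __init__(self):
        self.log = []  # (write_id, value) pairs in the order they were applied

    def write_ids(self):
        return {write_id for write_id, _ in self.log}

    def apply(self, write_id, value):
        """Apply a write unless it has already been applied here."""
        if write_id not in self.write_ids():
            self.log.append((write_id, value))

    def read(self, read_set, peers):
        """Serve a read only after every write in the client's read set is applied."""
        missing = set(read_set) - self.write_ids()
        for peer in peers:                      # fetch missing writes from peers
            for write_id, value in peer.log:
                if write_id in missing:
                    self.apply(write_id, value)
                    missing.discard(write_id)
        if missing:
            raise RuntimeError("cannot become up-to-date: %s" % sorted(missing))
        value = self.log[-1][1] if self.log else None
        # The updated read set also covers the writes relevant to this read.
        return value, set(read_set) | self.write_ids()


s1, s2 = Server(), Server()
s1.apply("w1", 10)
s2.apply("w1", 10)
s1.apply("w2", 20)                   # s2 has not seen w2 yet

value1, read_set = s1.read(set(), peers=[s2])
value2, read_set = s2.read(read_set, peers=[s1])   # s2 must first fetch w2
print(value1, value2, sorted(read_set))
```

Without the read set, the second read at s2 would have returned the older value 10, which is exactly what monotonic-read consistency rules out.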
Client-Centric Consistency

  • Efficiency?
  • Bound the size of the write and read sets by grouping operations into sessions
    – E.g., clear sets when a client exits a specific application
  • Use a vector timestamp…
    – $WVC_i[j]$: the timestamp of the latest write operation from $S_j$ that $S_i$ has processed
    – $SVC_A[j]$: the timestamp of the latest write operation in the session that originates from $S_j$
    – $S_i$ uses $WVC_i$ and $SVC_A$ to check
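One plausible reading of the check is a component-wise comparison of the two vectors, sketched below. The comparison and the component-wise maximum used to advance the session vector are assumptions about how the timestamps are used, and the names `is_up_to_date` and `merge` are hypothetical.

```python
def is_up_to_date(wvc_i, svc_a):
    """True if S_i has processed every write that session A depends on,
    i.e. WVC_i[j] >= SVC_A[j] for every server S_j."""
    return all(w >= s for w, s in zip(wvc_i, svc_a))


def merge(svc_a, wvc_i):
    """Advance the session vector to the component-wise maximum (assumed rule)."""
    return [max(s, w) for s, w in zip(svc_a, wvc_i)]


svc_a = [2, 1, 5]            # session A has seen write 5 from S_2
behind = [3, 1, 4]           # this server has only processed write 4 from S_2
caught_up = [3, 2, 5]

print(is_up_to_date(behind, svc_a))
print(is_up_to_date(caught_up, svc_a))
print(merge(svc_a, caught_up))
```

A server that fails the check would first fetch the missing writes from other servers before serving the session's operation.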

  Summary

  • Various protocols for ensuring consistency
  • Data-centric consistency
    – Primary-based protocols
    – Replicated-write
  • Primary-based protocols most widely used due to simplicity
    – Exact selection of protocol type depends on the rate at which requests are processed and on message delay
  • Client-centric consistency may be efficiently implemented using vector timestamps