Two signatures Selecting the strong signature

This would work, but it is not practical because of the computational cost of computing a reasonable signature on every possible block. It could be made computationally feasible by making the signature algorithm very cheap to compute, but this is hard to do without making the signature too weak. A weak signature would make the algorithm unusable. For example, the signature could be just the first 4 bytes of each block. This would be very easy to compute, but the algorithm would fail to produce the right result when two different blocks had their first 4 bytes in common.
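The failure mode of the 4-byte signature can be shown in a few lines. This is only an illustrative sketch: the function name `weak_signature` and the two sample blocks are hypothetical and do not come from the text.

```python
def weak_signature(block: bytes) -> bytes:
    """Hypothetical weak signature: just the first 4 bytes of the block."""
    return block[:4]

# Two different blocks that happen to share their first 4 bytes.
block_a = b"HEADER: old contents"
block_b = b"HEADER: new contents"

# The blocks differ, yet their signatures are identical, so an algorithm
# relying on this signature alone would wrongly treat them as a match.
collides = (block_a != block_b
            and weak_signature(block_a) == weak_signature(block_b))
print(collides)
```

Any data with a common prefix, such as files sharing a fixed header, would trigger this kind of false match.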

3.2.3 Two signatures

The solution and the key to the rsync algorithm is to use not one signature per block, but two. The first signature needs to be very cheap to compute for all byte offsets and the second signature needs to have a very low probability of collision. The second, expensive signature then only needs to be computed by A at byte offsets where the cheap signature matches one of the cheap signatures from B. If we call the two signatures R and H then the algorithm becomes [6]:

1. B divides b_i into N equally sized blocks b'_j and computes signatures R_j and H_j on each block. These signatures are sent to A.
2. For each byte offset i in a_i, A computes R'_i on the block starting at i.
3. A compares R'_i to each R_j received from B.
4. For each j where R'_i matches R_j, A computes H'_i and compares it to H_j.
5. If H'_i matches H_j then A sends a token to B indicating a block match and which block matches. Otherwise A sends a literal byte to B.
6. B receives literal bytes and tokens from A and uses these to construct a_i.

For this algorithm to be effective and efficient we need the following conditions:

• the signature R needs to be cheap to compute at every byte offset in a file;
• the signature H needs to have a very low probability of random collision; and
• A needs to perform the matches on all block signatures received from B very efficiently, as this needs to be done at all byte offsets.

Most of the rest of this chapter deals with the selection of the two signature algorithms, and the related problem of implementing the matching function efficiently.

[6] I call them R and H for rolling checksum and hash respectively. Hopefully those names will become clear shortly.
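The six steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the rsync implementation: the function names are hypothetical, R is stood in for by a plain byte sum recomputed at every offset (no rolling update is attempted here), H is stood in for by MD5 from the standard library, and the sketch assumes that A skips past a block once it has matched.

```python
import hashlib

def cheap_sig(block: bytes) -> int:
    # Stand-in for R: byte sum modulo 2**16 (an assumption, not rsync's checksum).
    return sum(block) & 0xFFFF

def strong_sig(block: bytes) -> bytes:
    # Stand-in for H: MD5 from the standard library.
    return hashlib.md5(block).digest()

def make_signatures(b: bytes, block_len: int):
    """Step 1 (on B): compute (R_j, H_j) for each block of b."""
    blocks = [b[k:k + block_len] for k in range(0, len(b), block_len)]
    return [(cheap_sig(blk), strong_sig(blk)) for blk in blocks]

def make_delta(a: bytes, sigs, block_len: int):
    """Steps 2-5 (on A): emit ('match', j) tokens or ('literal', byte)."""
    by_cheap = {}
    for j, (r, h) in enumerate(sigs):
        by_cheap.setdefault(r, []).append((j, h))
    delta, i = [], 0
    while i < len(a):
        window = a[i:i + block_len]
        match = None
        candidates = by_cheap.get(cheap_sig(window), [])
        if candidates:
            # The expensive signature is computed only after a cheap match.
            h_window = strong_sig(window)
            match = next((j for j, h in candidates if h == h_window), None)
        if match is not None:
            delta.append(("match", match))
            i += len(window)  # assumption: skip past the matched block
        else:
            delta.append(("literal", a[i:i + 1]))
            i += 1
    return delta

def apply_delta(b: bytes, delta, block_len: int) -> bytes:
    """Step 6 (on B): rebuild a from tokens and literal bytes."""
    out = bytearray()
    for kind, value in delta:
        if kind == "match":
            out += b[value * block_len:(value + 1) * block_len]
        else:
            out += value
    return bytes(out)

# Usage: B holds b, A holds a; only signatures and the delta are exchanged.
b = b"the quick brown fox jumps over the lazy dog"
a = b"the quick red fox jumps over the lazy dog!!"
delta = make_delta(a, make_signatures(b, 4), 4)
print(apply_delta(b, delta, 4) == a)
```

The dictionary keyed on the cheap signature is one simple way to meet the third condition, since it avoids comparing R'_i against every R_j in turn at each offset.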

3.2.4 Selecting the strong signature

The strong signature algorithm is the easier of the two. It doesn't need to be particularly fast as it is only computed on block boundaries by B and at byte boundaries on A only when the fast signature matches. The main property that the algorithm must have is that if two blocks are different they should have a very low probability of having the same signature. There are many well known algorithms that have this property, perhaps the best known being the message digest algorithms commonly used in cryptographic applications. These algorithms are believed to have the following properties, where b is the number of bits in the signature [Schneier 1996]:

• The probability that a randomly generated block has the same signature as a given block is O(2^-b).
• The computational difficulty of finding a second block that has the same signature as a given block is Ω(2^b).
• The individual bits in the signature are uncorrelated and have a uniform distribution.

These properties make a message digest algorithm ideal for rsync. The particular algorithm that is used for most of the results in this chapter is the 128 bit MD4 message digest [Rivest 1990]. This algorithm was chosen because of the ready availability of source code implementations and the high throughput compared to many other algorithms. Although MD4 is thought to be not as cryptographically strong as some later algorithms such as MD5 or IDEA, the difference is unimportant for rsync.

In reality, MD4 is "overkill" as the strong signature algorithm for rsync. It would be quite possible to use a cryptographically weaker but computationally less expensive algorithm such as a simple polynomial based algorithm. This wasn't done because testing showed that the MD4 computation does not provide a significant bottleneck on modern CPUs. For example, on a 200 MHz Pentium processor the MD4 implementation achieved 6 MB/sec throughput, which is far in excess of most local area networks. As rsync is aimed at low bandwidth networks the computational cost of MD4 is insignificant [7].
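A 128 bit strong signature of this kind can be computed with Python's standard `hashlib` module. The sketch below is an assumption-laden illustration rather than the code used for the results in this chapter: MD4 is requested by name, but many current OpenSSL builds disable it, so the hypothetical helper `strong_signature` falls back to MD5, which also produces 128 bits.

```python
import hashlib

def strong_signature(block: bytes) -> bytes:
    """Return a 128 bit digest of the block (MD4 if available, else MD5)."""
    try:
        h = hashlib.new("md4")   # may be unsupported in this build
    except ValueError:
        h = hashlib.md5()        # assumption: MD5 as a 128 bit stand-in
    h.update(block)
    return h.digest()

# Two blocks that differ in a single byte.
sig_a = strong_signature(b"block contents A")
sig_b = strong_signature(b"block contents B")

print(len(sig_a) * 8)    # signature width in bits
print(sig_a != sig_b)    # different blocks should give different signatures
```

With b = 128, the first property above puts the chance that a random block shares a given block's signature on the order of 2^-128.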
