Internal Arrays

Because the computer’s internal memory has room for only three blocks, the merging process must take place in stages. Let’s say there are three arrays, called arr1, arr2, and arr3, each of which can hold a block.

In the first merge, block 2-9-11-14 is read into arr1, and 4-12-13-16 is read into arr2. These two arrays are then mergesorted into arr3. However, because arr3 holds only one block, it becomes full before the sort is completed. When it becomes full, its contents are written to disk. The sort then continues, filling up arr3 again. This completes the sort, and arr3 is again written to disk. The following lists show the details of each of the three mergesorts.

Mergesort 1:

1. Read 2-9-11-14 into arr1.

2. Read 4-12-13-16 into arr2.

3. Merge 2, 4, 9, 11 into arr3; write to disk.

4. Merge 12, 13, 14, 16 into arr3; write to disk.

Mergesort 2:

1. Read 3-5-10-15 into arr1.

2. Read 1-6-7-8 into arr2.

3. Merge 1, 3, 5, 6 into arr3; write to disk.

4. Merge 7, 8, 10, 15 into arr3; write to disk.

Mergesort 3:

1. Read 2-4-9-11 into arr1.

2. Read 1-3-5-6 into arr2.

3. Merge 1, 2, 3, 4 into arr3; write to disk.

512 CHAPTER 10 2-3-4 Trees and External Storage

4. Merge 5, 6 into arr3 (arr2 is now empty).

5. Read 7-8-10-15 into arr2.

6. Merge 7, 8 into arr3; write to disk.

7. Merge 9, 10, 11 into arr3 (arr1 is now empty).

8. Read 12-13-14-16 into arr1.

9. Merge 12 into arr3; write to disk.

10. Merge 13, 14, 15, 16 into arr3; write to disk.

This last sequence of 10 steps is rather lengthy, so it may be helpful to examine the details of the array contents as the steps are completed. Figure 10.28 shows how these arrays look at various stages of mergesort 3.

FIGURE 10.28 Array contents during mergesort 3. (Panels: Steps 1, 2, and 3; Step 4; Steps 5 and 6; Steps 8 and 9; Step 10.)
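The ten steps of mergesort 3 can be traced with a short simulation. The following is a minimal Python sketch, not the book's code: the function name `merge_runs`, the `BLOCK_SIZE` constant, and the use of plain lists to stand in for disk blocks are all assumptions made for illustration. It keeps only three block-sized buffers in memory (named arr1, arr2, and arr3 after the arrays in the text), refills an input buffer when it runs empty, and writes arr3 out whenever it fills.

```python
BLOCK_SIZE = 4  # records per block, matching the example in the text (assumed name)


def merge_runs(run_a, run_b, block_size=BLOCK_SIZE):
    """Merge two sorted runs, each given as a list of sorted blocks.

    Only three block-sized buffers are held at once: arr1 and arr2 for
    input, arr3 for output. Returns the blocks "written to disk", in order.
    """
    blocks_a, blocks_b = iter(run_a), iter(run_b)
    arr1 = list(next(blocks_a, []))  # "read" the first block of each run
    arr2 = list(next(blocks_b, []))
    i = j = 0   # next unread position in arr1 and arr2
    arr3 = []   # output buffer
    disk = []   # simulated disk: blocks written so far

    while i < len(arr1) or j < len(arr2):
        # Take the smaller front item; if arr2 is exhausted, take from arr1.
        if j >= len(arr2) or (i < len(arr1) and arr1[i] <= arr2[j]):
            arr3.append(arr1[i])
            i += 1
            if i == len(arr1):  # arr1 is empty: read the next block of its run
                arr1, i = list(next(blocks_a, [])), 0
        else:
            arr3.append(arr2[j])
            j += 1
            if j == len(arr2):  # arr2 is empty: read the next block of its run
                arr2, j = list(next(blocks_b, [])), 0
        if len(arr3) == block_size:  # arr3 is full: write it to disk
            disk.append(arr3)
            arr3 = []

    if arr3:  # flush a final partial block, if any
        disk.append(arr3)
    return disk


# Mergesort 3 from the text: two 2-block runs merged into one 4-block run.
result = merge_runs([[2, 4, 9, 11], [12, 13, 14, 16]],
                    [[1, 3, 5, 6], [7, 8, 10, 15]])
print(result)
```

The same function also covers mergesorts 1 and 2, where each run is a single block: calling it with `[[2, 9, 11, 14]]` and `[[4, 12, 13, 16]]` yields the two blocks written in mergesort 1.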

Summary

• A multiway tree has more keys and children than a binary tree.

• A 2-3-4 tree is a multiway tree with up to three keys and four children per node.

• In a multiway tree, the keys in a node are arranged in ascending order.

• In a 2-3-4 tree, all insertions are made in leaf nodes, and all leaf nodes are on the same level.

• Three kinds of nodes are possible in a 2-3-4 tree: A 2-node has one key and two children, a 3-node has two keys and three children, and a 4-node has three keys and four children.

• There is no 1-node in a 2-3-4 tree.

• In a search in a 2-3-4 tree, at each node the keys are examined. If the search key is not found, the next node will be child 0 if the search key is less than key 0; child 1 if the search key is between key 0 and key 1; child 2 if the search key is between key 1 and key 2; and child 3 if the search key is greater than key 2.

• Insertion into a 2-3-4 tree requires that any full node be split on the way down the tree, during the search for the insertion point.

• Splitting the root creates two new nodes; splitting any other node creates one new node.

• The height of a 2-3-4 tree can increase only when the root is split.

• There is a one-to-one correspondence between a 2-3-4 tree and a red-black tree.

• To transform a 2-3-4 tree into a red-black tree, make each 2-node into a black node, make each 3-node into a black parent with a red child, and make each 4-node into a black parent with two red children.

• When a 3-node is transformed into a parent and child, either node can become the parent.

• Splitting a node in a 2-3-4 tree is the same as performing a color flip in a red-black tree.

• A rotation in a red-black tree corresponds to changing between the two possible orientations (slants) when transforming a 3-node.

• The height of a 2-3-4 tree is less than log₂ N.

• Search times are proportional to the height.

• The 2-3-4 tree wastes space because many nodes are not even half full.

• A 2-3 tree is similar to a 2-3-4 tree, except that a node can hold only one or two data items and can have at most three children.

• Insertion in a 2-3 tree involves finding the appropriate leaf and then performing splits from the leaf upward, until a non-full node is found.

• External storage means storing data outside of main memory, usually on a disk.

• External storage is larger, cheaper (per byte), and slower than main memory.

• Data in external storage is typically transferred to and from main memory a block at a time.

• Data can be arranged in external storage in sequential key order. This gives fast search times but slow insertion (and deletion) times.

• A B-tree is a multiway tree in which each node may have dozens or hundreds of keys and children.

• There is always one more child than there are keys in a B-tree node.

• For the best performance, a B-tree is typically organized so that a node holds one block of data.

• If the search criteria involve many keys, a sequential search of all the records in a file may be the most practical approach.
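The child-selection rule from the 2-3-4 search item in this summary can be written in a few lines. This is a minimal Python sketch under stated assumptions: the function name `next_child_index` and the representation of a node's keys as an ascending list of one to three values are hypothetical, chosen only for illustration.

```python
def next_child_index(keys, search_key):
    """Return the index of the child to descend into, or None if
    search_key is one of the node's keys.

    keys is the node's ascending list of one to three keys (assumed layout).
    """
    for index, key in enumerate(keys):
        if search_key == key:
            return None   # found in this node; no descent needed
        if search_key < key:
            return index  # smaller than keys[index]: go to child[index]
    return len(keys)      # larger than every key: go to the last child


# A 4-node with keys 30, 50, 70 has children 0 through 3.
for probe in (10, 40, 60, 80, 50):
    print(probe, next_child_index([30, 50, 70], probe))
```

Because the loop stops at the first key larger than the search key, the same function works for 2-nodes and 3-nodes, where the last child is child 1 or child 2 respectively.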