6.38.2. SEARCH Format 2 – Binary, or Half-interval Search (SEARCH ALL)
Figure 6-84 - Binary SEARCH (ALL) Syntax

    SEARCH ALL table-name
        [ AT END imperative-statement-1 ]
        WHEN key-data-item-1 ( index-name-1 ) { IS EQUAL TO | EQUALS } { literal-1 | identifier-1 }
        [ AND key-data-item-2 ( index-name-1 ) { IS EQUAL TO | EQUALS } { literal-2 | identifier-2 } ] …
            imperative-statement-2
    [ END-SEARCH ]

This format of the SEARCH statement performs a binary, or half-interval, search against a sorted table.
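As a minimal sketch of this syntax in use, assume a small table of state codes that is already stored in ascending key sequence. The program name, data names, and values below are hypothetical and chosen only for illustration; they do not come from this guide.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCHALL1.
      * Hypothetical demo program; all data names are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Five entries, already in ascending sequence by state code.
       01  STATE-DATA.
           05  FILLER          PIC X(12) VALUE "AKAlaska    ".
           05  FILLER          PIC X(12) VALUE "CACalifornia".
           05  FILLER          PIC X(12) VALUE "NYNew York  ".
           05  FILLER          PIC X(12) VALUE "TXTexas     ".
           05  FILLER          PIC X(12) VALUE "WAWashington".
      * The table needs OCCURS, a KEY clause and INDEXED BY.
       01  STATE-TABLE REDEFINES STATE-DATA.
           05  STATE-ENTRY     OCCURS 5 TIMES
                               ASCENDING KEY IS STATE-CODE
                               INDEXED BY STATE-IDX.
               10  STATE-CODE  PIC X(2).
               10  STATE-NAME  PIC X(10).
       01  WANTED-CODE         PIC X(2) VALUE "NY".
       PROCEDURE DIVISION.
       MAIN-PARA.
      * One WHEN, tested against the table's only KEY field.
           SEARCH ALL STATE-ENTRY
               AT END
                   DISPLAY WANTED-CODE " not found"
               WHEN STATE-CODE (STATE-IDX) IS EQUAL TO WANTED-CODE
                   DISPLAY WANTED-CODE " is " STATE-NAME (STATE-IDX)
           END-SEARCH
           STOP RUN.
```

Because this table declares a single KEY, only the one WHEN condition is coded. Note that nothing in the program checks that the FILLER values really are in STATE-CODE order; if they were not, the SEARCH ALL would still compile but could report "not found" for an entry that is present.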
1. The definition of table-name must include the OCCURS, ASCENDING (and/or DESCENDING) KEY and INDEXED BY
clauses.
2. In order for a table to be searchable via the SEARCH ALL statement, each of the following must be true:
a. The table meets the requirements of rule #1 above.
b. The data in the table must actually be in the sequence described by the KEY clause(s). Merely coding one or more KEY clauses does not put the data in that sequence (see footnote 26).
c. No two records in the table may have the same KEY field values. If the table has multiple KEY definitions, then no two records in the table may have the same combination of KEY field values.
If rule “a” is violated, the compiler will reject the SEARCH ALL. If rules “b” and/or “c” are violated, there will be no message issued by the compiler, but the run-time results of a SEARCH ALL against the table will probably be incorrect.
3. Key-data-item-1 and key-data-item-2 … (if any) must be defined as keys of table-name via ASCENDING KEY or
DESCENDING KEY clauses (see rule #1 above).
4. Index-name-1 is the first INDEXED BY data item for table-name.
5. The WHEN clause is mandatory, unlike format 1 of the SEARCH statement.
6. Only one WHEN clause may be specified. There may be any number of AND clauses, but the total number of WHEN and AND clauses cannot exceed the number of KEY fields defined for the table. Each WHEN/AND clause should reference a different KEY field.
7. The function of the WHEN, along with any ANDs, is to compare the key field(s) of the table, as indexed by the first INDEXED BY item, against the specified literal and/or identifier values in order to locate the desired entry in the table. The table’s index will be automatically varied by the SEARCH ALL statement in a manner designed to require the minimum number of tests.
26 Of course, if the data sequence doesn’t agree with the KEY clause(s), you can easily put it into that sequence using a table SORT (see section SORT Format 2 – Table Sort).
8. The internal processing of the SEARCH ALL statement begins by setting internal “first” and “last” pointers to the first and last entry locations of the table. Processing then proceeds as follows (see footnote 27):
a. The entry half-way between “first” and “last” is identified. We’ll call this the “current” entry; its table entry location is saved into index-name-1.
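b. The WHEN (along with any ANDs) is evaluated. This comparison of the keys against the target literal/identifier values will have one of three possible outcomes (described here for ASCENDING keys; for DESCENDING keys the directions in outcomes ii and iii are reversed):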
i. If the key(s) and value(s) match, imperative-statement-2 is executed, after which control falls thru into the next statement following the SEARCH ALL.
ii. If the key(s) are LESS THAN the value(s), then the table entry being searched for can only occur in the “current” to “last” range of the table, so a new “first” pointer value is set (it will be set to the “current” pointer).
iii. If the key(s) are GREATER THAN the value(s), then the table entry being searched for can only occur in the “first” to “current” range of the table, so a new “last” pointer value is set (it will be set to the “current” pointer).
c. If the new “first” and “last” pointers are different from the old “first” and “last” pointers, there’s more left to be searched, so return to step “a” and continue.
d. If the new “first” and “last” pointers are the same as the old “first” and “last” pointers, the table has been exhausted and the entry being searched for cannot be found; imperative-statement-1 is executed, after which control falls thru into the next statement following the SEARCH ALL.
The net effect of the above algorithm is that only a fraction of the number of elements in the table need ever be tested in order to decide whether or not a particular entry exists. This is because the SEARCH ALL discards half the remaining entries in the table each time it checks an entry.
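The steps of rule 8 can be sketched by hand in COBOL. This is an illustration of the half-interval idea only, not a description of how any compiler or run-time library actually implements SEARCH ALL. It also departs from the simplified steps in two ways that fall under the "picky little details" of footnote 27: the midpoint is truncated to a whole entry number, and the "first" and "last" pointers are moved one entry past "current" so that the loop can end when they cross. All names and values are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BINSRCH.
      * Hand-coded sketch of the half-interval idea; names are
      * hypothetical and this is not how the runtime is implemented.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Seven entries in ascending sequence, no duplicate values.
       01  NUM-DATA.
           05  FILLER          PIC 9(3) VALUE 011.
           05  FILLER          PIC 9(3) VALUE 023.
           05  FILLER          PIC 9(3) VALUE 035.
           05  FILLER          PIC 9(3) VALUE 047.
           05  FILLER          PIC 9(3) VALUE 058.
           05  FILLER          PIC 9(3) VALUE 069.
           05  FILLER          PIC 9(3) VALUE 072.
       01  NUM-TABLE REDEFINES NUM-DATA.
           05  NUM-ENTRY       PIC 9(3) OCCURS 7 TIMES.
       01  WS-FIRST            PIC 9(4) COMP.
       01  WS-LAST             PIC 9(4) COMP.
       01  WS-CURRENT          PIC 9(4) COMP.
       01  WS-TARGET           PIC 9(3) VALUE 058.
       01  WS-TESTS            PIC 9(4) VALUE 0.
       01  WS-FOUND            PIC X    VALUE "N".
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 1 TO WS-FIRST
           MOVE 7 TO WS-LAST
           PERFORM UNTIL WS-FOUND = "Y" OR WS-FIRST > WS-LAST
      *        Step a: midpoint, truncated to a whole entry number.
               COMPUTE WS-CURRENT = (WS-FIRST + WS-LAST) / 2
               ADD 1 TO WS-TESTS
      *        Step b: three-way comparison (ascending keys).
               EVALUATE TRUE
                   WHEN NUM-ENTRY (WS-CURRENT) = WS-TARGET
                       MOVE "Y" TO WS-FOUND
                   WHEN NUM-ENTRY (WS-CURRENT) < WS-TARGET
                       COMPUTE WS-FIRST = WS-CURRENT + 1
                   WHEN OTHER
                       COMPUTE WS-LAST = WS-CURRENT - 1
               END-EVALUATE
           END-PERFORM
           IF WS-FOUND = "Y"
               DISPLAY "Found at entry " WS-CURRENT
                       " after " WS-TESTS " test(s)"
           ELSE
               DISPLAY "Not found after " WS-TESTS " test(s)"
           END-IF
           STOP RUN.
```

Tracing this by hand for the target value 058, the loop tests entry 4, then entry 6, then entry 5, where it finds a match: three tests against a seven-entry table.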
Computer scientists will compare these two search techniques as follows:
A sequential search (format 1) will need an average of n/2 tests and a worst case of n tests in order to find an entry and n tests to identify that an entry doesn’t exist (n = the number of entries in the table).
A binary search (format 2) will need a worst case of about log₂ n tests (more precisely, log₂ n rounded down, plus 1) in order to find an entry, and the same number of tests to identify that an entry doesn’t exist (n = the number of entries in the table).
Here’s a more practical view of the difference. Let’s say that a table has 1,000 entries in it. With a sequential (format 1) search, on average, you’ll have to check 500 of them to find an entry, and you’ll have to look at all 1,000 of them to find that an entry doesn’t exist. With a binary search, express the number of entries as a binary number (1,000 in decimal is 1111101000 in binary) and count the number of digits in the result (10): THAT is the worst-case number of tests required to find an entry or to identify that it doesn’t exist. That’s quite an improvement.
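The digit-counting rule of thumb above can be checked with a few lines of COBOL. The sketch below repeatedly halves the entry count, discarding any remainder, and counts how many halvings it takes to reach zero; that count equals the number of binary digits. The program and data names are hypothetical, and the table size of 1,000 is simply the figure used in the paragraph above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TESTCNT.
      * Hypothetical sketch: worst-case test counts for a table.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-ENTRIES          PIC 9(9) VALUE 1000.
       01  WS-REMAINING        PIC 9(9).
       01  WS-BINARY-DIGITS    PIC 9(2) VALUE 0.
       01  WS-SEQ-AVERAGE      PIC 9(9).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Sequential search: n/2 tests on average, n at worst.
           COMPUTE WS-SEQ-AVERAGE = WS-ENTRIES / 2
      * Binary search: count the binary digits of n by halving
      * (integer division, remainder discarded) until zero.
           MOVE WS-ENTRIES TO WS-REMAINING
           PERFORM UNTIL WS-REMAINING = 0
               ADD 1 TO WS-BINARY-DIGITS
               DIVIDE 2 INTO WS-REMAINING
           END-PERFORM
           DISPLAY "Sequential: average " WS-SEQ-AVERAGE
                   ", worst case " WS-ENTRIES
           DISPLAY "Binary: worst case " WS-BINARY-DIGITS
           STOP RUN.
```

For 1,000 entries the halving sequence is 500, 250, 125, 62, 31, 15, 7, 3, 1, 0, which is ten steps and matches the ten binary digits counted in the text. Changing the VALUE of WS-ENTRIES shows how slowly the binary worst case grows as the table gets larger.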
27 This is a simplified view of the algorithm intended purely as a pedagogical tool. An actual implementation of it requires a few additional picky little details to make it work (such as what to do when step “a” identifies a “current” entry of 12.5!).