
11.7 KNOWLEDGE VERIFICATION AND VALIDATION

Knowledge acquired from experts needs to be assessed for quality through three related activities: evaluation, validation, and verification. These terms are often used interchangeably; we use the definitions provided by O'Keefe et al. (1987).

 Evaluation is a broad concept. Its objective is to assess an expert system's overall value. In addition to assessing acceptable performance levels, it analyzes whether the system would be usable, efficient, and cost-effective.

 Validation is the part of evaluation that deals with the performance of the system (e.g., as it compares to the expert's). Simply stated, validation is building the right system, that is, substantiating that a system performs with an acceptable level of accuracy.

 Verification is building the system right, that is, substantiating that the system is correctly implemented to its specifications.

In the realm of expert systems, these activities are dynamic because they must be repeated each time the prototype is changed. In terms of the knowledge base, it is necessary to ensure that we have the right knowledge base (i.e., that the knowledge is valid) and that the knowledge base was constructed correctly (verification). For each IF statement, more than 30 criteria can be used in verification (see PC AI, March/April 2002, p. 59).

In performing these quality-control tasks, we deal with several activities and concepts, as listed in Table 11.8. The process can be very difficult if one considers the many sociotechnical issues involved (Sharma and Conrath, 1992).

A method for validating ES, based on validation approaches from psychology, was developed by Sturman and Milkovich (1995). The approach tests the extent to which the system and the expert decisions agree, the inputs and processes used by an expert compared to the machine, and the difference between expert and novice decisions. Validation and verification techniques on specific ES are described by Ram and Ram (1996) for innovative management. Avritzer et al. (1996) provide an algorithm for reliability testing of expert systems designed to operate in industrial settings, particularly to monitor and control large real-time systems.
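The first of these tests, agreement between system and expert decisions, can be quantified very simply. The sketch below is a minimal illustration (the decisions and labels are invented, not taken from any of the cited studies): it computes the fraction of test cases on which the ES and the expert reach the same decision.

```python
# Hypothetical illustration of ES-expert decision agreement.
# The case decisions below are invented for this sketch.

def agreement_rate(es_decisions, expert_decisions):
    """Fraction of cases on which the ES and the expert agree."""
    assert len(es_decisions) == len(expert_decisions)
    matches = sum(1 for e, x in zip(es_decisions, expert_decisions) if e == x)
    return matches / len(es_decisions)

es     = ["approve", "reject", "approve", "refer", "approve"]
expert = ["approve", "reject", "refer",   "refer", "approve"]

print(agreement_rate(es, expert))  # 4 of 5 cases agree -> 0.8
```

A raw agreement rate like this is the simplest possible measure; a fuller validation along the lines Sturman and Milkovich describe would also compare the inputs and reasoning processes behind each decision, not just the final answers.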

Automated verification of knowledge is offered in the ACQUIRE product described earlier. Verification is conducted by measuring the system's performance and is limited to classification cases with probabilities. It works as follows: When an ES is presented with a new case to classify, it assigns a confidence factor to each selection. By comparing these confidence factors with those provided by an expert, one can measure the accuracy of the ES for each case. By performing comparisons on many cases, one can derive an overall measure of ES performance (O'Keefe and O'Leary, 1993).
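A minimal sketch of this confidence-factor comparison follows. All names and data are hypothetical; this is a generic illustration of the idea, not ACQUIRE's actual interface or scoring formula. For each case, the ES and the expert each assign a confidence factor (0 to 1) to every candidate classification; per-case accuracy is taken here as one minus the mean absolute difference, and overall performance is the average over all cases.

```python
# Sketch of confidence-factor comparison for ES verification.
# Class names and confidence values are invented for this example.

def case_accuracy(es_cf, expert_cf):
    """Agreement between ES and expert confidence factors for one case,
    averaged over all candidate classifications (1.0 = identical)."""
    classes = es_cf.keys() | expert_cf.keys()
    error = sum(abs(es_cf.get(c, 0.0) - expert_cf.get(c, 0.0)) for c in classes)
    return 1.0 - error / len(classes)

def overall_performance(cases):
    """Average per-case accuracy over many test cases."""
    return sum(case_accuracy(es, ex) for es, ex in cases) / len(cases)

cases = [
    # (ES confidence factors, expert confidence factors)
    ({"fungal": 0.8, "viral": 0.2}, {"fungal": 0.7, "viral": 0.3}),
    ({"fungal": 0.4, "viral": 0.6}, {"fungal": 0.5, "viral": 0.5}),
]
print(round(overall_performance(cases), 2))  # -> 0.9
```

The key point carried over from the text is that the comparison is done case by case and then aggregated, so a single lucky (or unlucky) case does not dominate the performance estimate.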

TABLE 11.8  Measure or Criterion: Description

Accuracy: How well the system reflects reality; how correct the knowledge in the knowledge base is
Adaptability: Possibilities for future development and changes
Adequacy (or completeness): Portion of the necessary knowledge included in the knowledge base
Appeal: How well the knowledge base matches intuition and stimulates thought and practicability
Breadth: How well the domain is covered
Depth: Degree of detailed knowledge
Face validity: Credibility of knowledge
Generality: Capability of a knowledge base to be used with a broad range of similar problems
Precision: Capability of the system to replicate particular system parameters; consistency of advice; coverage of variables in the knowledge base
Realism: Accounting for relevant variables and relations; similarity to reality
Reliability: Fraction of the ES predictions that are empirically correct
Robustness: Sensitivity of conclusions to model structure
Sensitivity: Impact of changes in the knowledge base on quality of outputs
Technical and operational validity: Quality of the assumptions, context, constraints, and conditions, and their impact on other measures
Turing test: Ability of a human evaluator to identify whether a given conclusion is made by an ES or by a human expert
Usefulness: How adequate the knowledge is (in terms of parameters and relationships) for solving problems correctly
Validity: Knowledge base's capability of producing empirically correct predictions

Source: Adapted from B. Marcot, "Testing Your Knowledge Base," AI Expert, Aug. 1987.


Rosenwald and Liu (1997) have developed a validation procedure that uses the rule base's knowledge and structure to generate test cases that efficiently cover the entire input space of the rule base. Thus, the entire set of cases need not be examined. A symbolic execution of a model of the ES is used to determine all conditions under which the fundamental knowledge can be used. For an extensive bibliography on validation and verification, see Grogono et al. (1991) and Juan et al. (1999). An easy approach to automating verification of a large rule base can be found in Goldstein (2002).
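The idea of covering the input space without enumerating it can be sketched as follows. This is an illustrative simplification, not Rosenwald and Liu's actual algorithm: the rule base, its conditions, and the default values are all invented. Each rule's own antecedent is used to construct one test case that makes that rule fire, so a handful of cases exercises every rule.

```python
# Illustrative sketch: derive a small covering set of test cases from the
# rules' own antecedents (hypothetical rule base, not a real algorithm
# from the literature).

rules = [
    # (antecedent conditions, conclusion)
    ({"pressure": "high", "temp": "high"}, "shutdown"),
    ({"pressure": "high", "temp": "low"},  "throttle"),
    ({"pressure": "low"},                  "normal"),
]
defaults = {"pressure": "low", "temp": "low"}  # filler for unconstrained inputs

def covering_cases(rules, defaults):
    """One test case per rule: defaults overridden by the rule's antecedent,
    so the rule is guaranteed to fire on its case."""
    cases = []
    for antecedent, _ in rules:
        case = {**defaults, **antecedent}
        if case not in cases:  # skip duplicates shared by several rules
            cases.append(case)
    return cases

for case in covering_cases(rules, defaults):
    print(case)
```

With three rules over two input variables, three test cases suffice here, whereas exhaustively testing every combination of input values would grow exponentially with the number of variables.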
