Validity Categories of Good Test

“Reliability refers to the consistency of measurement – that is, to how consistent test scores or other evaluation results are from one measurement to another.” 40 According to Desmond Allison, “the reliability of a test concerns the accuracy and trustworthiness of its results. Reliable test results will accurately reflect each student’s understanding of whatever is being tested.” 41 To sum up, a test is reliable if it consistently produces the same, or nearly the same, result or rank for the same individual who takes the test several times on different occasions.
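The idea of consistency across occasions can be illustrated with a small calculation. The sketch below assumes that the same students take the same test twice and uses the Pearson correlation between the two sets of scores as an index of consistency. This index is one common choice and is not prescribed by the sources quoted above. The function name and the score lists are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length score lists with at least 2 entries")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary on both occasions")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores of five students on two occasions (illustration only)
occasion_1 = [70, 82, 65, 90, 75]
occasion_2 = [72, 80, 66, 91, 74]

reliability = pearson_r(occasion_1, occasion_2)
print(f"test-retest correlation: {reliability:.2f}")
```

A value close to 1 suggests that students keep nearly the same rank on both occasions, which is the sense of reliability summarized above.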

3. Practicality

The last quality that a good test should have is practicality or usability. In selecting tests and other instruments, practical considerations cannot be neglected. The following factors are relevant to practicality when selecting tests: 42

a. Ease of Administration

“The administrability of evaluation devices refers to the ease and accuracy with which the directions to pupils and evaluator can be followed.” 43 In addition, ease of administration involves simple and clear directions, a minimum number of subtests, and easy timing.

b. Time Required for Administration

The length of a test is directly related to its reliability, so enough time should be made available. “A safe procedure is to allot as much time as is necessary to obtain valid and reliable results.” 44

c. Ease of Interpretation and Application

If the test is interpreted correctly and applied effectively, the teacher can make accurate educational decisions about students’ performance.

d. Availability of Equivalent or Comparable Forms

Equivalent forms of a test measure the same aspect and are alike in content, level of difficulty, and other characteristics. They are useful if the teacher wants to remove the factor of memory when retesting students on the same domain. Comparable forms are especially useful in measuring progress in the basic skills.

e. Cost of Testing

Cost is not a major factor in selecting a test, because testing is relatively inexpensive. Even so, a test should be as economical as possible.

40 Norman E. Gronlund, Measurement and Evaluation ..., p. 93.
41 Desmond Allison, Language Testing ..., p. 85.
42 Norman E. Gronlund and Robert L. Linn, Measurement and Evaluation in Teaching, New York: Macmillan Publishing Company, 1990, 6th Ed., pp. 102-103.
43 H. H. Remmers, et al., A Practical Introduction to Measurement and Evaluation, New York: Harper Brothers, 1960, p. 126.

C. Types of Test Item

An item is the basic unit of language testing. According to James Dean Brown, an item “is the smallest unit that produces distinctive and meaningful information on a test or rating scale.” 45 The items used in classroom tests are commonly divided into two broad categories: (1) the objective item and (2) the essay test.

1. Objective Test

In constructing an achievement test, the test maker may choose from a variety of item types. One of them is referred to as the objective item. This type of item can be scored objectively. Furthermore, “equally competent scorers can score them independently and obtain the same results.” 46 In addition, Rebecca M. Valette defines an objective test item as “any item for which there is a single predictable correct answer.” 47 Thus, when this test is scored, the scorer's subjective judgement plays no part, because every item has only one correct answer. Even if the test is scored at different times or by different scorers, it will yield the same result. Objective items can be classified into two types: selection-type test items and supply-type test items.

44 Wilmar Tinambunan, Evaluation of Students ..., p. 23.
45 James Dean Brown, Testing in Language ..., p. 49.
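The claim that independent scorers obtain the same result can be sketched as a scoring routine. The example below assumes that each item has exactly one keyed answer and that scoring means counting exact matches against that key. The function name, the answer key, and the responses are hypothetical and serve only as illustration.

```python
def score_objective(answer_key, responses):
    """Count the responses that match the single keyed answer for each item.

    Comparison ignores letter case and surrounding whitespace.
    Unanswered items count as incorrect.
    """
    return sum(
        1
        for item_id, correct in answer_key.items()
        if responses.get(item_id, "").strip().lower() == correct.strip().lower()
    )


# Hypothetical answer key and one student's answer sheet (illustration only)
answer_key = {"1": "b", "2": "d", "3": "a"}
student_sheet = {"1": "B", "2": "c", "3": "a"}

# Two scorers who apply the same key arrive at the same score.
score_by_scorer_a = score_objective(answer_key, student_sheet)
score_by_scorer_b = score_objective(answer_key, student_sheet)
print(score_by_scorer_a, score_by_scorer_b)
```

Because the routine depends only on the key and the answer sheet, nothing in it leaves room for a scorer's personal judgement.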

a. Selection-Type Test Item

1. Multiple Choice

According to Anthony J. Nitko, “a multiple choice item consists of one or more introductory sentences followed by a list of two or more suggested responses from which the examinee chooses one as the correct answer.” 48 The other responses, which are incorrect, serve to distract students who are uncertain of the correct answer. In line with this definition, Brown states that “multiple choice items are made up of an item stem, or the main part of the item at the top, a correct answer, which is obviously the choice that will be counted correct, and the distractors, which are those choices that will be counted as incorrect.” 49 For example:

Budi has been here ____________ half an hour.
a. during        c. while
b. for           d. since

46 Norman E. Gronlund, Constructing Achievement Test, New Jersey: Prentice-Hall, Inc., 1982, 3rd Ed., p. 36.
47 Rebecca M. Valette, Modern Language ..., p. 8.
48 Anthony J. Nitko, Educational Test ..., p. 190.
49 James Dean Brown, Testing in Language ..., p. 54.
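The three parts of a multiple choice item (stem, correct answer, and distractors) can be represented as a small data structure. The sketch below uses the example item above and assumes that option b ("for") is the keyed answer. The source does not state the key explicitly, so this is a reading of the example based on standard English usage. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MultipleChoiceItem:
    stem: str      # the main part of the item
    options: dict  # option label -> option text
    key: str       # label of the single correct answer

    def distractors(self):
        """Return every option that is not the keyed answer."""
        return {label: text for label, text in self.options.items() if label != self.key}

    def is_correct(self, chosen_label):
        """Check whether the examinee chose the keyed answer."""
        return chosen_label == self.key


# The example item from the text; key "b" is an assumption, as noted above.
item = MultipleChoiceItem(
    stem="Budi has been here ____________ half an hour.",
    options={"a": "during", "b": "for", "c": "while", "d": "since"},
    key="b",
)

print(item.distractors())
print(item.is_correct("b"), item.is_correct("d"))
```

Separating the key from the distractors in this way keeps the item's structure explicit: exactly one option is counted as correct, and all remaining options are counted as incorrect.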