material and with the objectives of instruction in particular class, his test is said to have curricular content validity”.
52
c. Construct Validity
A test is said to have construct validity if it can demonstrate that it measures just the ability which it is supposed to measure.
According to Heaton, “if a test has construct validity, it is capable of measuring certain specific characteristics in accordance with a theory
of language behavior and learning”.
53
d. Empirical Validity
A fourth type of validity is usually referred to as statistical or empirical validity. It is obtained by comparing
the results of the test with the results of some criterion measure.
54
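As a minimal illustration of such a comparison, the sketch below assumes that the agreement between the test and the criterion measure is expressed as a Pearson correlation coefficient. The source does not name a particular statistic, so this choice, the function name pearson_r, and the score lists are all hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: the same six students on the new test and on a criterion measure.
test_scores = [55, 60, 65, 70, 80, 85]
criterion_scores = [50, 62, 63, 72, 78, 88]

validity_coefficient = pearson_r(test_scores, criterion_scores)
print(round(validity_coefficient, 2))
```

A coefficient close to 1 would suggest that the test orders the students in much the same way as the criterion measure, while a value near 0 would suggest little agreement.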
2. Reliability
The second criterion of a good test is reliability. Reliability has to do with the accuracy and precision of a measurement procedure. Indices of
reliability give an indication of the extent to which a particular measurement is consistent and reproducible.
55
A test should be reliable as a measuring instrument.
According to Finocchiaro and Sako, “the reliability or stability of a language test is concerned with the degree to which it can be trusted to
produce the same result upon repeated administration to the same individual, or to give consistent information about the value of a learning
variable being measured”.
56
Similarly, J. Stanley Ahmann and Marvin D. Glock state that “Reliability means consistency of results. This is
equivalent to saying that a highly reliable instrument can be used repeatedly in an unchanging situation and produce constant or near constant results.”
57
52 Victor H. Noll, Introduction to Educational..., p. 79
53 J.B. Heaton, Writing English..., p. 161
54 J.B. Heaton, Writing English..., p. 161
55 Robert L. Thorndike and Elizabeth Hagen, Measurement and Evaluation in Psychology and Education, London: John Wiley and Sons, Inc., 1961, p. 127
56 Mary Finocchiaro and Sydney Sako, Foreign Language..., p. 28
Based on the statements above, a test is reliable if it consistently yields the same or nearly the same ranks over repeated administrations.
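The idea of "the same or nearly the same ranks" can be sketched as a rank correlation between two administrations of the same test. The example below is only an illustration under stated assumptions: the source does not prescribe a statistic, so the use of a Spearman-style rank correlation, the helper names average_ranks and rank_correlation, and the scores are all hypothetical.

```python
from math import sqrt


def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest; tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def rank_correlation(first, second):
    """Spearman-style coefficient: Pearson correlation computed on the ranks."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    rx, ry = average_ranks(first), average_ranks(second)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("ranks must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of five students on two administrations of the same test.
first_administration = [78, 65, 90, 55, 70]
second_administration = [80, 63, 88, 58, 72]

reliability_estimate = rank_correlation(first_administration, second_administration)
print(reliability_estimate)
```

In this invented data set the raw scores change slightly between administrations, but every student keeps the same rank, which is the situation the paragraph above describes as reliable.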
3. Practicality
Practicality is concerned with a wide range of factors (economy, convenience, and interpretability) that determine whether a test is practical
for widespread use.
58
A test may be a highly reliable and valid instrument but still be beyond our means or facilities. The teacher or whoever makes the test
should keep in mind a number of very practical considerations. The main factors of practicality are economy, scorability, and administrability.
Finocchiaro and Sako state that “the criteria for practicality normally will be based upon such factors as economy,
scorability, and administrability”.
59
Meanwhile, Harrison states that “tests should be as economical as possible in time (preparation, sitting, and
marking) and in cost (materials and hidden costs of time spent)”.
60
In short, the criteria of a good test are validity, reliability, and practicality. Besides these three criteria, however, a good test as a whole is also
determined by the quality of each item that makes up the test. If each item is of good quality, the scores obtained from the test are stronger and
more accurate. The quality of each item can be examined individually through item analysis. According to Robert Lado, “item analysis is the
study of validity, reliability, and difficulty of test item taken individually as if they were separate tests”.
61
Through this analysis, the evaluator can learn which items are good enough to be used again in the future.
57 J. Stanley Ahmann and Marvin D. Glock, Evaluating Pupil..., p. 311
58 Robert L. Thorndike and Elizabeth Hagen, Measurement and Evaluation..., p. 127
59 Mary Finocchiaro, Foreign Language Testing..., p. 30
60 Andrew Harrison, A Language Testing..., p. 13
D. Item Analysis