measures what it is intended to measure.²¹ Besides, validity also concerns the test-maker's construction or selection of classroom assessments.
Moreover, Gronlund gives further explanations of validity:
a. Validity refers to the results of a test or evaluation instrument for a given group of individuals, not to the instrument itself. Test-makers sometimes speak of the validity of the test for the sake of convenience, but it is more appropriate to speak of the validity of the test results or, more specifically, of the validity of the interpretation to be made from the results.
b. Validity is a matter of degree. It does not exist on an all-or-none basis. Consequently, test-makers should avoid thinking of evaluation results as valid or invalid. Validity is best considered in terms of categories that specify degree, such as high validity, moderate validity, and low validity.
c. Validity is always specific to some particular use. It should never be considered a general quality.²²
Here is an example of validity: if the test-maker wants to measure the students’ writing ability, he or she can ask the students to write as many words as they can in fifteen minutes and then simply count the words for the final score. However, if the test-maker wants to measure the students’ speaking ability and uses an objective test or an essay test, he or she does not measure what should be measured. The writer emphasizes that validity is an important criterion of a good test: a valid test measures what it is intended to measure.
2. Reliability
A test is reliable when it is administered at different times but yields consistent scores, that is, when the most recent score does not differ much from the previous one. According to Gronlund, reliability refers to the consistency of evaluation results. If the test-makers obtain quite similar scores when the same test is administered to the same group on different occasions, it can be concluded that the results have a high degree of reliability from one occasion to another. Similarly, if teachers independently rate the same students on the same instrument and obtain similar ratings, it can be concluded that the results have a high degree of reliability.²³
²¹ Brown, op. cit., p. 387.
²² Gronlund, op. cit., pp. 66-67.
This is similar to the definition given by David and Roger Johnson: “Reliability exists when a student’s performance remains the same on repeated measurements.”²⁴
According to Brown, a reliable test is consistent and dependable: if the test-makers give the same test to the same subject or matched subjects on two different occasions, the test should yield similar results.²⁵
Likewise, Allison claims, “The reliability of a test concerns the accuracy and trustworthiness of its results: if we could erase the test from students’ memories and then repeat it, how similar would the results be?”²⁶
As a result, a test is considered reliable if the test-makers get the same result repeatedly. Reliability does not imply validity, however: a reliable test measures the subjects or materials consistently, but not necessarily what it is supposed to measure.
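The idea of consistency across two occasions described above can be made concrete with a small calculation. The following is only a minimal sketch, assuming that a Pearson correlation between the two sets of scores is used as the test-retest coefficient; the sources cited here do not prescribe a particular statistic, and the function name `pearson_r` and the five students' scores are hypothetical illustrations.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of cross-products and sums of squares of the deviations
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores of the same five students on the same test,
# administered on two different occasions (illustrative data only).
first_occasion = [70, 82, 65, 90, 75]
second_occasion = [72, 80, 66, 91, 74]

r = pearson_r(first_occasion, second_occasion)
print(f"test-retest coefficient: {r:.2f}")
```

A coefficient close to 1 would suggest that the students keep roughly the same relative standing on both occasions. Such a figure says nothing about validity: the word-count test mentioned earlier could yield very consistent scores whether or not it measures what it is intended to measure.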
3. Practicality
The third characteristic of a good test is practicality or usability. The term practicality refers to the cost of copying, the time needed to administer the test, the ease of scoring, and other factors teachers have to consider before using a particular measurement.²⁷ Moreover, Popham refers to practicality as “usability”. He says:
²³ Gronlund, op. cit., pp. 65-66.
²⁴ W. Johnson and T. Johnson, op. cit., p. 54.
²⁵ Brown, op. cit., p. 386.
²⁶ Allison, op. cit., p. 85.
²⁷ W. Johnson and T. Johnson, loc. cit.