reliability. As Gronlund says “reliability refers to the consistency of evaluation results, if the test-makers obtain quite similar scores when the same test
administered to the same group on the different occasion. Then, it can conclude that the result has a high degree of reliability from one occasion to another.
Similarly, if the teacher independently rate the same student in the same instrument and obtain the similar ratings, it can conclude that the result has a
high degree of reliability”.
23
A similar definition is given by David and Roger Johnson. W. Johnson and T. Johnson state that “Reliability exists when a student’s performance remains the
same on repeated measurements”.
24
According to Brown, a reliable test is consistent and dependable; if the test-makers give the same test to the same subject or matched subjects on two
different occasions, the test should yield similar results.
25
Likewise, Allison claims, “The reliability of a test concerns the accuracy and trustworthiness of its results: if we could erase the test from students’
memories and then repeat it, how similar would the results be?”
26
In short, a test is considered reliable if the test-makers obtain the same result repeatedly. Reliability does not imply validity: a reliable test
measures the subjects or materials given consistently, but it does not necessarily measure what it is supposed to measure.
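The idea of obtaining similar scores on two occasions can be illustrated numerically. The following is only a minimal sketch: it assumes that the Pearson correlation between the two sets of scores is used as the estimate of consistency (the sources quoted above do not name a specific statistic), and the scores and the function name pearson_r are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary on both occasions")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of the same five students on two occasions.
first_occasion = [70, 82, 65, 90, 75]
second_occasion = [72, 80, 66, 91, 73]

reliability = pearson_r(first_occasion, second_occasion)
print(f"test-retest correlation: {reliability:.2f}")
```

A value close to 1 suggests that the students keep roughly the same relative standing on both occasions. It says nothing about whether the test measures what it is supposed to measure.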
3. Practicality
The third characteristic of a good test is practicality, or usability. The term practicality refers to the cost of copiers, the time needed to administer the
test, the ease of scoring, and other factors teachers have to decide on before using a particular measurement.
27
Moreover, Popham refers to practicality as “usability”. He says:
23 Gronlund, op.cit., pp. 65-66.
24 W. Johnson and T. Johnson, op.cit., p. 54.
25 Brown, op.cit., p. 386.
26 Allison, op.cit., p. 85.
27 W. Johnson and T. Johnson, loc.cit.
Evaluators must be sensitive to the ease with which the tests can be administered. How long will they take to administer? How readily can
tests be scored? How easily can test results be interpreted? How expensive are the tests? Are there equivalent forms available for
pretesting, post testing, and so forth? Is there evidence that the test is suitable for all the ethnic groups with whom it will be used?
28
Therefore, practicality is the third criterion of a good test; it concerns the cost, the time, and the ease of administering the test. If a test is expensive,
difficult to score, and time-consuming, it is impractical.
C. The Test Validity
1. Content Validity
Content validity is how well the test represents the subject matter that it should cover. It aims to measure what should be measured according to the
syllabus and curriculum. As one expert defines it, “Content validity is concerned with the extent to which the test is
representative of a defined body of content consisting of topics and processes”.
29
Moreover, in assessing the content validity of an achievement test, one asks, “To what extent does the test require demonstration by the student of the
achievement that constitutes the objectives of instruction in this area?” For a test to have high content validity, it should be a representative sample of
both the content (topics) and the cognitive processes (abilities) objectives of a given course or unit; the test should contain a representative sample of the
content and uses to which the content is to be applied.
30
Content validity is a characteristic of a measuring instrument that concerns how closely it matches certain types of situations or subject matter. The instrument can be
28 W. James Popham, Educational Evaluation, Boston: Allyn and Bacon, 1993, p. 126.
29 Wiersma and Jurs, op.cit., p. 184.
30 Kenneth D. Hopkins, Educational and Psychological Measurement and Evaluation, Boston: Allyn and Bacon, 1998, 8th edition, pp. 72-73.