$$D = \frac{U - L}{N}$$
where:
D = the index of discriminating power
U = the number of pupils in the upper group who answered the item correctly
L = the number of pupils in the lower group who answered the item correctly
N = the number of pupils in each group
The writer then classifies each item using the following criteria for discriminating power:⁵⁹
Table 3.2 The Classification of Discriminating Power
Discriminating Power    Remark
0.6 – 1.0               Very good
0.4 – 0.6               Good
0.1 – 0.3               OK
-1.0 – 0.0              Bad
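As an illustration of the formula and of Table 3.2, the following Python sketch computes D for a single item and looks up its remark. The function names are hypothetical and are not taken from the source. Two points are assumptions, because the table does not settle them: a value on a shared boundary such as 0.6 is assigned to the higher band, and a value that falls between the listed bands (for example 0.35) is reported as unclassified.

```python
def discrimination_index(upper_correct: int, lower_correct: int, group_size: int) -> float:
    """Return D = (U - L) / N for one test item."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (upper_correct - lower_correct) / group_size


def classify_discrimination(d: float) -> str:
    """Map D onto the remarks of Table 3.2.

    Assumptions (not stated in the table): a shared boundary value such
    as 0.6 goes to the higher band, and values in the gaps between the
    listed bands are returned as "unclassified".
    """
    if 0.6 <= d <= 1.0:
        return "Very good"
    if 0.4 <= d < 0.6:
        return "Good"
    if 0.1 <= d <= 0.3:
        return "OK"
    if -1.0 <= d <= 0.0:
        return "Bad"
    return "unclassified"


if __name__ == "__main__":
    # 10 pupils per group: 9 correct in the upper group, 3 in the lower group.
    d = discrimination_index(9, 3, 10)
    print(d, classify_discrimination(d))
```

With 10 pupils in each group, 9 correct answers in the upper group and 3 in the lower group give D = (9 - 3) / 10 = 0.6.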
3. Reliability of Test
Reliability refers to the consistency of evaluation results. If we obtain very similar scores when the same test is administered to the same group on two different occasions, we can conclude that our results have a high degree of reliability from one occasion to another. Reliability is closely related to the type of interpretation to be made. For some uses we may be interested in how reliable our evaluation results are over a given period of time, and for others in how reliable they are over different samples of the same behavior.
⁵⁹ J. B. Heaton, Classroom Testing, New York: Longman Inc, 1990, p. 174.
After analyzing the data, the writer checked the reliability of the test using the formula:⁶⁰
$$r_{tt} = \left(\frac{k}{k-1}\right)\left(\frac{S_t^2 - \sum p_i q_i}{S_t^2}\right)$$
where:
$r_{tt}$ = the reliability of the instrument
$k$ = the total number of items
$S_t^2$ = the total variance
$p_i$ = the proportion of pupils answering each item correctly
$q_i$ = the proportion of pupils answering each item incorrectly ($q = 1 - p$)
$\sum p_i q_i$ = the sum of $p_i q_i$ over all items
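The reliability formula above has the form commonly known as Kuder-Richardson formula 20. The following Python sketch shows how it could be computed from a matrix of scored answers. The function name `kr20` and the data layout (one row per pupil, 1 for a correct answer and 0 for an incorrect one) are assumptions made for illustration. The source does not say whether the total variance is the population or the sample variance; this sketch assumes the population variance.

```python
from statistics import pvariance


def kr20(responses: list[list[int]]) -> float:
    """Reliability r_tt from 0/1 item scores.

    responses[s][i] is 1 if pupil s answered item i correctly, else 0.
    Assumption: S_t^2 is the population variance of the total scores.
    """
    n_pupils = len(responses)
    if n_pupils < 2:
        raise ValueError("at least two pupils are required")
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")

    # Total score of each pupil and the variance of those totals (S_t^2).
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("total variance is zero; reliability is undefined")

    # Sum of p_i * q_i over all items, with q_i = 1 - p_i.
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n_pupils
        sum_pq += p * (1 - p)

    return (k / (k - 1)) * ((total_variance - sum_pq) / total_variance)


if __name__ == "__main__":
    # Invented data for illustration only: 4 pupils, 3 items.
    answers = [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
        [0, 0, 0],
    ]
    print(kr20(answers))
```

In this invented example the total scores are 3, 2, 1 and 0, so the total variance is 1.25 and the sum of the item products is 0.625, which gives a reliability of 1.5 × (1.25 − 0.625) / 1.25 = 0.75.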
Table 3.3 The Classification of Reliability Test⁶¹
Reliability Level    Remark
0.91 – 1.00          Very High
0.71 – 0.90          High
0.41 – 0.70          Enough
0.21 – 0.40          Low
< 0.21               Very Low
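The lookup in Table 3.3 can be sketched in Python as follows. The function name is hypothetical. The sketch assumes that the last row of the table means a coefficient below 0.21, and that the coefficient is rounded to two decimals before the lookup so that values such as 0.905 do not fall between bands.

```python
def classify_reliability(r_tt: float) -> str:
    """Map a reliability coefficient onto the remarks of Table 3.3.

    Assumptions: the coefficient is rounded to two decimals first, and
    "Very Low" covers every value below 0.21.
    """
    r = round(r_tt, 2)
    if r > 1.0:
        raise ValueError("reliability coefficient cannot exceed 1.00")
    if r >= 0.91:
        return "Very High"
    if r >= 0.71:
        return "High"
    if r >= 0.41:
        return "Enough"
    if r >= 0.21:
        return "Low"
    return "Very Low"


if __name__ == "__main__":
    print(classify_reliability(0.75))
```

A coefficient of 0.75, for instance, lies in the 0.71 – 0.90 band and is therefore classified as High.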
4. Validity of Test
⁶⁰ Mudjijo, Tes Hasil Belajar, Jakarta: Bumi Aksara, 1995, p. 58.
⁶¹ Suharsimi Arikunto, Dasar-Dasar Evaluasi Pendidikan, Jakarta: Bumi Aksara, 2006, p. 24.
Validity is always concerned with the specific use to be made of the results and with the soundness of the proposed interpretations. Validity is always specific to some particular use; it should never be considered a general quality. The essential point in evaluating the data of this research is to know whether the data are valid or not. Three main strategies have traditionally been used to investigate validity: content validity, construct validity, and criterion-related validity.⁶²
In this research, the writer examines the test based on content validity. Content validity is concerned with the material that the students have learned; that is, the test should cover the material that the students have learned.
N. The Criteria of the Action Success