This formula was used to validate each item score, and the result was compared with the critical value of the r-product moment table. When the obtained coefficient of correlation was higher than the critical value, the item was considered valid at the 5% level of significance ($\alpha = 0.05$). In other words, after consulting the r-product moment table, an item was valid if $r_{xy} \geq r_{table}$.
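As an illustration only (not part of the original instrument analysis), the sketch below shows how an item–total correlation could be checked against the critical r value in Python. The item and total scores are hypothetical, and instead of looking the critical value up in a printed r-product moment table, it is derived here from the t distribution.

```python
# Minimal sketch: item validity via the product-moment correlation.
# The score arrays below are hypothetical, used only for illustration.
import numpy as np
from scipy import stats

item_scores = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])      # one item, scored 0/1
total_scores = np.array([28, 15, 30, 26, 18, 27, 33, 14, 25, 29])

r_xy, _ = stats.pearsonr(item_scores, total_scores)

# Critical r for alpha = 0.05 (two-tailed) with df = N - 2,
# derived from the t distribution: r_crit = t / sqrt(t^2 + df).
n = len(item_scores)
df = n - 2
t_crit = stats.t.ppf(1 - 0.05 / 2, df)
r_table = t_crit / np.sqrt(t_crit ** 2 + df)

print(f"r_xy = {r_xy:.3f}, r_table = {r_table:.3f}")
print("valid" if r_xy >= r_table else "not valid")
```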
3.5.3 Reliability
Reliability refers to the stability of the test scores. Tuckman (1978:160) states that “test reliability meant that a test was consistent”. Reliability shows whether the instrument is dependable and can be used as a device for collecting the data. A test is reliable to the extent that it measures consistently from one time to another. To measure the reliability, I used the Kuder-Richardson formula. Tuckman (1978:160) states that “this formula, known as K-R formula 21, was shown below and was equivalent to the average of all possible split-half reliability coefficients”.
$$ r_{11} = \frac{k}{k-1}\left(1 - \frac{M\,(k - M)}{k\,V_t}\right) $$

Where:
r_11 : the Kuder-Richardson reliability coefficient
k    : the number of items in the test
M    : the mean score on the test
V_t  : the test variance (a measure of variability)
(Arikunto, 2006:189)
First, what I have to find is the variance of the test, symbolized $s^2$ ($V_t$). The formula used to find it is as follows:

$$ V_t = \frac{\sum y^2 - \dfrac{(\sum y)^2}{N}}{N} $$

Where:
V_t : the variance of the test
N   : the number of scores (students)
y   : the total score of each student
y^2 : the square of each student's total score
(Arikunto, 2006:184)
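For illustration only, the following sketch applies this variance formula to a hypothetical list of total scores; the scores are invented for the example.

```python
# Minimal sketch: test variance Vt = (sum(y^2) - (sum(y))^2 / N) / N.
# The total scores below are hypothetical.
scores = [28, 15, 30, 26, 18, 27, 33, 14, 25, 29]

n = len(scores)
sum_y = sum(scores)
sum_y2 = sum(y ** 2 for y in scores)

vt = (sum_y2 - sum_y ** 2 / n) / n
print(f"Vt = {vt:.2f}")   # 38.65 for these hypothetical scores
```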
The test is considered reliable if the obtained coefficient is higher than $r_{table}$ for $\alpha = 5\%$.
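A minimal sketch of the K-R 21 computation follows, assuming hypothetical values: 40 items, a mean score of 24.5, and the variance obtained in the previous sketch.

```python
# Minimal sketch: Kuder-Richardson formula 21 (K-R 21).
# k, mean, and vt below are hypothetical example values.
k = 40         # number of items in the test
mean = 24.5    # mean score on the test (M)
vt = 38.65     # test variance (Vt)

r11 = (k / (k - 1)) * (1 - (mean * (k - mean)) / (k * vt))
print(f"r11 = {r11:.3f}")
# The instrument is judged reliable when r11 exceeds r_table at alpha = 5%.
```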
3.5.4 Item Difficulty
A test item has a good difficulty level if it is neither too easy nor too difficult for the examinees, so that they are able to answer it. Therefore, every item should be analyzed first before it is used in the test. The formula is:
$$ P = \frac{B}{JS} $$

Where:
P  : the facility value (index of difficulty)
B  : the number of students who answered the item correctly
JS : the total number of students
The criteria used here are:

Difficulty level     Category
0.00 < P ≤ 0.30      Difficult
0.30 < P ≤ 0.70      Medium
0.70 < P ≤ 1.00      Easy
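As an illustration of this step (with an invented answer pattern), the sketch below computes P for one item and maps it to a category from the table above.

```python
# Minimal sketch: item difficulty P = B / JS for one item.
# The answer pattern is hypothetical (1 = correct, 0 = incorrect).
answers = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

b = sum(answers)      # students who answered the item correctly
js = len(answers)     # total number of students
p = b / js

if p <= 0.30:
    category = "Difficult"
elif p <= 0.70:
    category = "Medium"
else:
    category = "Easy"

print(f"P = {p:.2f} ({category})")
```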
3.5.5 Discriminating Power
The discriminating power measures the effectiveness of an item in discriminating between the high and low scorers on the whole test. The higher the value of the discriminating power, the more effective the item is. It is also essential to determine the discriminating power of the items since it distinguishes the more able from the less able students. The steps to calculate the index of discrimination were as follows:
1. The scores of the try-out test were arranged from the highest to the lowest.
2. All students who took the test were divided into two groups, the upper and the lower group, by taking 50% of the students with the highest scores as the upper group and 50% with the lowest scores as the lower group.
3. The index was computed with the following formula:
$$ D = \frac{BA}{JA} - \frac{BB}{JB} $$
(Arikunto, 2002:309)

Where:
D  : the discriminating index
BA : the number of students in the upper group who answered the item correctly
BB : the number of students in the lower group who answered the item correctly
JA : the number of all students in the upper group
JB : the number of all students in the lower group

The criteria used here were:
Interval             Criteria
0.00 < D ≤ 0.20      Poor
0.20 < D ≤ 0.40      Satisfactory
0.40 < D ≤ 0.70      Good
0.70 < D ≤ 1.00      Excellent
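For illustration (with hypothetical upper- and lower-group counts), the sketch below computes D for one item and classifies it against the criteria above.

```python
# Minimal sketch: discriminating power D = BA/JA - BB/JB for one item.
# The group counts are hypothetical example values.
ba, ja = 14, 16   # correct answers and group size, upper group
bb, jb = 6, 16    # correct answers and group size, lower group

d = ba / ja - bb / jb

if d <= 0.20:
    criteria = "Poor"
elif d <= 0.40:
    criteria = "Satisfactory"
elif d <= 0.70:
    criteria = "Good"
else:
    criteria = "Excellent"

print(f"D = {d:.2f} ({criteria})")
```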
3.5.6 Pre test